Comments (5)
yes, agree!
I have already shared my idea about a more general `.assert` / `.expect` here: #11064 (comment)
Would love to see:
`.assert(...)` or `.expect`
- assert that some conditions are met, such as: datatypes, max < x, min > 0, schema = ...
- with configuration options to turn on for "testing" (full evaluation) or "production" (ignore assertions)
df.select(...).assert(col(x)>0, pl.max(y) < 100).with_columns(...).assert(...).group_by...
`.inspect(...)`
- allow printing / debugging current df values
df.select(...).with_columns(...).inspect(lambda df: print(df.schema)).select(...).inspect(lambda df: print(df.describe())).group_by...
from polars.
Interesting.
But schemas are different from other properties. You can get the schema of a LazyFrame without collecting it.
The more general `assert` and `inspect` methods of LazyFrames would only add delayed actions. (Still useful.)
from polars.
Wrote #16311 specifically about dropping/forgetting/ignoring values that are used only for assertions.
from polars.
You can already do this using `pipe`:
import polars as pl

def assert_schema(
    lf: pl.LazyFrame, schema: dict[str, pl.PolarsDataType]
) -> pl.LazyFrame:
    if lf.schema != schema:
        msg = (
            "Wrong LazyFrame schema:\n"
            f"• expected: '{schema}',\n"
            f"• observed: '{dict(lf.schema)}'."
        )
        raise AssertionError(msg)
    return lf

lf = pl.LazyFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
lf.select("a").pipe(assert_schema, schema={"a": pl.Float64})
AssertionError: Wrong LazyFrame schema:
• expected: '{'a': Float64}',
• observed: '{'a': Int64}'.
I don't think this is worth adding to the Polars API.
For a more general 'assertion / raising' utility, please see #11064
from polars.
> You can already do this using `pipe`:
Nice. I somehow missed `pipe`.
Maybe, it deserves a more prominent place in the docs? The user guide mentions it only in passing (Pipe littering: "Don't").
from polars.