Skip to content

Release 0.19.0: Polars validation support

Compare
Choose a tag to compare
@cosmicBboy cosmicBboy released this 06 May 14:28
· 101 commits to main since this release
612d25c

✨ Highlights ✨

📣 Pandera now supports validation of polars.DataFrame and polars.LazyFrame 🐻‍❄️!

You can now do this:

import pandera.polars as pa
import polars as pl


class Schema(pa.DataFrameModel):
    state: str
    city: str
    price: int = pa.Field(in_range={"min_value": 5, "max_value": 20})


lf = pl.LazyFrame(
    {
        'state': ['FL','FL','FL','CA','CA','CA'],
        'city': [
            'Orlando',
            'Miami',
            'Tampa',
            'San Francisco',
            'Los Angeles',
            'San Diego',
        ],
        'price': [8, 12, 10, 16, 20, 18],
    }
)
Schema.validate(lf).collect()

And of course you can do functional validation with decorators like so:

from pandera.typing.polars import LazyFrame

@pa.check_types
def function(lf: LazyFrame[Schema]) -> LazyFrame[Schema]:
    return lf.filter(pl.col("state").eq("CA"))

function(lf).collect()

You can read more about the integration here. Not all pandera features are supported at this point, but depending on community demand/contributions we'll slowly add them. To learn more about what's currently supported, check out this table.

Special shoutout to @AndriiG13 and @FilipAisot for their contributions on the built-in checks and polars datatypes, respectively, and to @evanrasmussen9, @baldwinj30, @obiii, @Filimoa, @philiporlando, @r-bar, @alkment, @jjfantini, and @robertdj for their early feedback and bug reports during the 0.19.0 beta.

What's Changed

New Contributors

Full Changelog: v0.18.3...v0.19.0