Dataframe date field validation is not giving correct results #965

learnnk · 2022-10-15T00:14:02Z

When one of the values in the date field is invalid, every row of the date field is reported as a failure case.

Here is my code snippet

import pandera as pa
import pandas as pd
from pandera.typing import Index, DataFrame, Series

class CalendarYearSchema(pa.SchemaModel):
    year: Series[int] = pa.Field(gt=2000, coerce=True, nullable=True)
    month: Series[int] = pa.Field(ge=1, le=12, coerce=True)
    day: Series[int] = pa.Field(ge=0, le=365, coerce=True)
    date: Series[pa.DateTime] = pa.Field(coerce=True, nullable=False)

df = pd.DataFrame({
    "year": ["2001", "2002", "2003"],
    "month": ["1", "12", "1"],
    "day": ["231", "156", "365"],
    "date": ["16-31-2011", "12-12-2011", "12-12-2020"],
})

try:
    CalendarYearSchema.validate(df, lazy=True)
except pa.errors.SchemaErrors as err:
    print(err.failure_cases, '\n')

The snippet above produces this output:
schema_context column ... failure_case index
0 Column date ... 16-31-2011 0
1 Column date ... 12-12-2011 1
2 Column date ... 12-12-2020 2

[3 rows x 6 columns]

But I expect only one failure case, the row at index 0. The rows at index 1 and 2 contain valid dates and should not be reported:
schema_context column ... failure_case index
0 Column date ... 16-31-2011 0
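
For reference, the per-row result expected above can be reproduced outside pandera with plain pandas. This is only a sketch: it assumes the dates are meant as month-day-year (`%m-%d-%Y`), which the post does not state, and the names `parsed` and `invalid` are illustrative. It does not change what pandera itself reports.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["16-31-2011", "12-12-2011", "12-12-2020"],
})

# Assumption: month-day-year. With errors="coerce", values that do not
# match the format become NaT instead of raising for the whole column.
parsed = pd.to_datetime(df["date"], format="%m-%d-%Y", errors="coerce")

# Keep only rows that had a value but could not be parsed.
invalid = df.loc[parsed.isna() & df["date"].notna(), "date"]
print(invalid)
```

With the sample data, only the row at index 0 ("16-31-2011") ends up in `invalid`, since 16 is not a valid month.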

cosmicBboy · 2022-10-18T18:22:56Z

Maintainer

Hi @learnnk, would you mind formatting the code above as something copy-pasteable? You can use triple backticks (or the UI) to make your question easier to interact with.
