Cleaned and extended function that extracts datetimes from paths #2181

Merged
valeriupredoi merged 6 commits into main from clean_and_extend_date_finder on Oct 5, 2023

@schlunma schlunma commented Aug 24, 2023

Description

This PR

  1. cleans up the module local.py by removing the duplicated code in the function _get_start_end_year, which now uses _get_start_end_date instead.
  2. extends the regex used to derive datetimes from paths. The following strings can now be parsed correctly:
'tas_A1.20C3M_1.CCSM.atmm.1990-01_cat_1999-12.nc' -> '199001', '199912'
'E5sf00_1M_1940_032.grb' -> '1940', '1940'
'E5sf00_1D_1998-04_167.grb' -> '199804', '199804'
'E5sf00_1H_1986-04-11_167.grb' -> '19860411', '19860411'
'E5sf00_1M_1940-1941_032.grb' -> '1940', '1941'
'E5sf00_1D_1998-01_1999-12_167.grb' -> '199801', '199912'
'E5sf00_1H_2000-01-01_2001-12-31_167.grb' -> '20000101', '20011231'

The first format is used in CMIP3 data, the others in native ERA5 data in GRIB format. Parsing these dates speeds up many recipes considerably, since only the files that are actually relevant for the user are processed.
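
For illustration, here is a minimal sketch of how such file names could be parsed. It is not the regex or the code in esmvalcore/local.py: the pattern `DATE_TOKEN` and the functions `parse_start_end`, `parse_start_end_year` and `select_files` are hypothetical names made up for this example. The sketch only handles the hyphenated forms listed above (YYYY, YYYY-MM, YYYY-MM-DD) and ignores compact forms such as YYYYMMDD. It also shows the idea behind item 1 (the year helper delegates to the date helper instead of duplicating the parsing) and how the parsed dates allow irrelevant files to be skipped.

```python
import re
from pathlib import Path

# Hypothetical pattern (an assumption, not the regex used in ESMValCore):
# a date token is YYYY, YYYY-MM or YYYY-MM-DD that is not adjacent to
# other digits, so fragments like "20C3M", "032" or "167" are ignored.
DATE_TOKEN = re.compile(r"(?<!\d)\d{4}(?:-\d{2}(?:-\d{2})?)?(?!\d)")


def parse_start_end(path):
    """Return (start, end) date strings found in the file name of path."""
    tokens = DATE_TOKEN.findall(Path(path).name)
    if len(tokens) == 1:
        tokens = tokens * 2  # a single date is both start and end
    if len(tokens) != 2:
        raise ValueError(f"Cannot determine start and end date from '{path}'")
    start, end = (token.replace("-", "") for token in tokens)
    return start, end


def parse_start_end_year(path):
    """Return (start_year, end_year) by delegating to parse_start_end."""
    start, end = parse_start_end(path)
    return int(start[:4]), int(end[:4])


def select_files(paths, start_year, end_year):
    """Keep only files whose years overlap [start_year, end_year]."""
    selected = []
    for path in paths:
        file_start, file_end = parse_start_end_year(path)
        if file_start <= end_year and file_end >= start_year:
            selected.append(path)
    return selected
```

With this sketch, `parse_start_end('E5sf00_1M_1940-1941_032.grb')` returns `('1940', '1941')`, and `select_files(['E5sf00_1M_1940_032.grb', 'E5sf00_1D_1998-04_167.grb'], 1990, 2000)` keeps only the second file.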

Note that I did not change any existing test case, so this should be fully backwards-compatible.

Closes #2180
Addresses parts of #1991


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.



@schlunma schlunma added the enhancement New feature or request label Aug 24, 2023
@schlunma schlunma added this to the v2.10.0 milestone Aug 24, 2023
@schlunma schlunma self-assigned this Aug 24, 2023
codecov bot commented Aug 24, 2023

Codecov Report

Merging #2181 (23a7788) into main (13a444e) will increase coverage by 0.02%.
The diff coverage is 100.00%.

❗ Current head 23a7788 differs from pull request most recent head 738d7b4. Consider uploading reports for the commit 738d7b4 to get more accurate results

@@            Coverage Diff             @@
##             main    #2181      +/-   ##
==========================================
+ Coverage   93.19%   93.22%   +0.02%     
==========================================
  Files         238      238              
  Lines       12826    12836      +10     
==========================================
+ Hits        11953    11966      +13     
+ Misses        873      870       -3     
| Files | Coverage Δ |
|---|---|
| esmvalcore/dataset.py | 100.00% <100.00%> (ø) |
| esmvalcore/esgf/_search.py | 100.00% <100.00%> (ø) |
| esmvalcore/local.py | 98.37% <100.00%> (+0.54%) ⬆️ |

... and 8 files with indirect coverage changes

@bouweandela bouweandela left a comment

Nice work @schlunma! Just one minor question

esmvalcore/local.py (review thread resolved)

@valeriupredoi valeriupredoi left a comment

biutiful, fellas 🍻

@valeriupredoi valeriupredoi merged commit 9b323aa into main Oct 5, 2023
@valeriupredoi valeriupredoi deleted the clean_and_extend_date_finder branch October 5, 2023 13:47
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Unify _get_start_end_date and _get_start_end_year in local.py
3 participants