Add support for native ERA5 data in GRIB format #2178

Merged
merged 84 commits into from
Dec 6, 2024
Changes from 70 commits
f998ae3
First working prototype of ERA5 GRIB reader
schlunma Aug 18, 2023
8c48834
Extended list of supported variables for ERA5 GRIB support
schlunma Aug 21, 2023
9740cd3
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Aug 21, 2023
60488dc
Added public function to check for unstructured grids
schlunma Aug 22, 2023
f9a4ab7
Make regridding much faster
schlunma Aug 22, 2023
fc7384a
Add support for more variables and make regridding optional
schlunma Aug 22, 2023
e0c4da3
Add doc
schlunma Aug 22, 2023
3db12bb
Added first tests
schlunma Aug 23, 2023
09aabcb
Added test for loading grib files
schlunma Aug 23, 2023
8c73373
Added iris-grib to environment and setup.py
schlunma Aug 23, 2023
744c20b
Fixed environment
schlunma Aug 23, 2023
0c8ce64
Fixed eccodes dependency
schlunma Aug 23, 2023
39c6677
Next try to get environment working
schlunma Aug 23, 2023
b7a0c68
Temporarily remove GRIB loading test
schlunma Aug 23, 2023
e907ff1
Fixed tests
schlunma Aug 23, 2023
0b8fbfa
Added missing tests
schlunma Aug 23, 2023
bef5b5e
Fixed test
schlunma Aug 23, 2023
2693066
Improved test coverage of ERA5 CMORizer
schlunma Aug 24, 2023
e7e7285
Increased test coverage of regrid module
schlunma Aug 24, 2023
911ed28
Optimized doc
schlunma Aug 24, 2023
7afa551
More customizable automatic regriddind for ERA5 GRIB
schlunma Aug 25, 2023
7d09738
Merge branch 'main' into read_era5_grib
schlunma Aug 25, 2023
d0ae8d2
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Oct 10, 2023
b4972c9
Added iris-grib to setup.py
schlunma Oct 10, 2023
6be7059
turn on GA tests
valeriupredoi Oct 10, 2023
0a01d90
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Oct 12, 2023
908afc4
Removed unneeded changes
schlunma Apr 25, 2024
67d3494
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Apr 25, 2024
dd433c0
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma May 28, 2024
36f9bcf
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Jun 10, 2024
660f870
Remove unused test
schlunma Jun 10, 2024
a31332f
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Jun 11, 2024
0488179
Unrun GA tests
schlunma Jun 11, 2024
cac1dff
Remove unused extra facets
schlunma Jun 11, 2024
846930b
Update docs to latest changes
schlunma Jun 11, 2024
e6125ab
Do not fix time bounds for variables with no time dim coord
schlunma Jun 11, 2024
d6bbf21
Add ERA5-GRIB to Levante-specific options
schlunma Jun 11, 2024
7be8966
Merge branch 'main' into read_era5_grib
schlunma Jun 11, 2024
67d9d31
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Jul 4, 2024
cc632dc
Re-enable automatic regridding
schlunma Jul 4, 2024
4c3f1be
Rename extra facets file and add link to Levante doc
schlunma Jul 8, 2024
317cf22
Fix doc build
schlunma Jul 8, 2024
0fea4ab
Update version of ERA5 GRIB data in extra facets
schlunma Jul 24, 2024
5e6d149
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Aug 13, 2024
22cdf6c
Merge branch 'main' into read_era5_grib
schlunma Sep 4, 2024
bf4e2e4
Replace yapf and isort by ruff, drop docformatter
bouweandela Sep 12, 2024
272bd49
Replace flake8 by ruff and move to pre-commit.ci
bouweandela Sep 13, 2024
15f8054
Add yamllint configuration
bouweandela Sep 13, 2024
3657745
Disable pycodestyle rules that conflict with ruff and fix some issues…
bouweandela Sep 16, 2024
81a4347
Disable conflicting pycodestyle vs ruff settings also in prospector c…
bouweandela Sep 16, 2024
8f49aa3
Try renaming pycodestyle to pep8
bouweandela Sep 16, 2024
06a38cf
Another attempt
bouweandela Sep 18, 2024
c73cff8
And another one
bouweandela Sep 18, 2024
704f543
Update docs
bouweandela Sep 18, 2024
cb4e7c6
Fix link
bouweandela Sep 18, 2024
c0316ad
Replace flake8 by pre-commit linting in GitHub Actions
bouweandela Sep 18, 2024
10983ee
Add Python 3.10 codespell pre-commit hook dependency
bouweandela Sep 18, 2024
e4c887b
Remove pre-commit run from conda-lock files GitHub Actions
bouweandela Sep 18, 2024
63e1e42
Add links with explanations why certain rules are ignored
bouweandela Sep 25, 2024
167b8c0
Latest version of pre-commit
schlunma Sep 27, 2024
036b7c6
Ignore 'oce' for codespell
schlunma Sep 27, 2024
f30e19f
Update doc/conf.py
schlunma Sep 27, 2024
712b7a1
Merge commit 'f30e19f9761db8262f15667bc3a41fcdb420c925' into read_era…
schlunma Sep 27, 2024
7119884
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Sep 27, 2024
4d449b7
Apply autoformatting
schlunma Sep 27, 2024
7a59ea1
Restored original LICENSE file
schlunma Sep 27, 2024
872fc50
Pin iris-grib
schlunma Sep 27, 2024
e24d4b9
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Oct 7, 2024
f7fcc69
Remove superfluous spaces
schlunma Oct 7, 2024
7da3b26
Merge branch 'main' into read_era5_grib
schlunma Oct 7, 2024
a031ea9
Fix doc build
schlunma Oct 8, 2024
44040c3
Remove unnecessary unit conversions
schlunma Oct 8, 2024
bda16f4
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Oct 14, 2024
e4d642a
Add note about downloading GRIB data from the CDS
schlunma Oct 14, 2024
086422f
Fix units for lat and lon
schlunma Oct 14, 2024
57c478f
Fix GRIB_PARAM attribute in resulting netcdf file
schlunma Oct 14, 2024
2efd6f7
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Oct 16, 2024
e123244
Avoid double regridding for ERA5 GRIB data
schlunma Oct 17, 2024
e6389db
Add debug message
schlunma Oct 17, 2024
66ecd84
100% coverage
schlunma Oct 17, 2024
1845cdc
Fix test fixture
schlunma Oct 17, 2024
57369c9
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Oct 21, 2024
1da3d12
Merge remote-tracking branch 'origin/main' into read_era5_grib
schlunma Nov 12, 2024
9af867a
Apply suggestions from code review
schlunma Dec 6, 2024
2 changes: 1 addition & 1 deletion doc/quickstart/configure.rst
@@ -924,7 +924,7 @@ infrastructure. The following example illustrates the concept.
.. _extra-facets-example-1:

.. code-block:: yaml
:caption: Extra facet example file `native6-era5.yml`
:caption: Extra facet example file `native6-era5-example.yml`

ERA5:
Amon:
96 changes: 88 additions & 8 deletions doc/quickstart/find_data.rst
@@ -107,18 +107,27 @@ The following native reanalysis/observational datasets are supported under the
To use these datasets, put the files containing the data in the directory that
you have :ref:`configured <config_options>` for the ``rootpath`` of the
``native6`` project, in a subdirectory called
``Tier{tier}/{dataset}/{version}/{frequency}/{short_name}``.
``Tier{tier}/{dataset}/{version}/{frequency}/{short_name}`` (assuming you are
using the ``default`` DRS for ``native6``).
Replace the items in curly braces by the values used in the variable/dataset
definition in the :ref:`recipe <recipe_overview>`.
Below is a list of native reanalysis/observational datasets currently
supported.

.. _read_native_era5:
.. _read_native_era5_nc:

ERA5
^^^^
ERA5 (in netCDF format downloaded from the CDS)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERA5 data can be downloaded from the Copernicus Climate Data Store (CDS) using
the convenient tool `era5cli <https://era5cli.readthedocs.io>`__.
For example, for monthly data, place the files in the
``/Tier3/ERA5/version/mon/pr`` subdirectory of the ``rootpath`` that you have
configured for the ``native6`` project (assuming you are using the ``default``
DRS for ``native6``).

- Supported variables: ``cl``, ``clt``, ``evspsbl``, ``evspsblpot``, ``mrro``, ``pr``, ``prsn``, ``ps``, ``psl``, ``ptype``, ``rls``, ``rlds``, ``rsds``, ``rsdt``, ``rss``, ``uas``, ``vas``, ``tas``, ``tasmax``, ``tasmin``, ``tdps``, ``ts``, ``tsn`` (``E1hr``/``Amon``), ``orog`` (``fx``)
- Supported variables: ``cl``, ``clt``, ``evspsbl``, ``evspsblpot``, ``mrro``,
``pr``, ``prsn``, ``ps``, ``psl``, ``ptype``, ``rls``, ``rlds``, ``rsds``,
``rsdt``, ``rss``, ``uas``, ``vas``, ``tas``, ``tasmax``, ``tasmin``,
``tdps``, ``ts``, ``tsn`` (``E1hr``/``Amon``), ``orog`` (``fx``).
- Tier: 3

.. note:: According to the description of Evapotranspiration and potential Evapotranspiration on the Copernicus page
@@ -131,6 +140,74 @@ ERA5
of both liquid and solid phases to vapor (from underlying surface and vegetation)."
Therefore, the ERA5 (and ERA5-Land) CMORizer switches the signs of ``evspsbl`` and ``evspsblpot`` to be compatible with the CMOR standard used e.g. by the CMIP models.

.. _read_native_era5_grib:

ERA5 (in GRIB format available on DKRZ's Levante)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ERA5 data in monthly, daily, and hourly resolution is `available on Levante
<https://docs.dkrz.de/doc/dataservices/finding_and_accessing_data/era_data/index.html#era-data>`__
in its native GRIB format.
To read these data with ESMValCore, use the root path ``/pool/data/ERA5`` with
DRS ``DKRZ-ERA5-GRIB`` in your :ref:`user configuration file`, for example:

.. code-block:: yaml

   rootpath:
     ...
     native6:
       /pool/data/ERA5: DKRZ-ERA5-GRIB
     ...

The `naming conventions
<https://docs.dkrz.de/doc/dataservices/finding_and_accessing_data/era_data/index.html#file-and-directory-names>`__
for input directories and files for native ERA5 data in GRIB format on Levante
are

* input directories: ``{family}/{level}/{type}/{tres}/{grib_id}``
* input files: ``{family}{level}{typeid}_{tres}_*_{grib_id}.grb``

All of these facets have reasonable defaults preconfigured in the corresponding
:ref:`extra facets<extra_facets>` file, which is available here:
:download:`native6-era5.yml
</../esmvalcore/config/extra_facets/native6-era5.yml>`.
If necessary, these facets can be overwritten in the recipe.

Thus, example dataset entries could look like this:

.. code-block:: yaml

   datasets:
     - {project: native6, dataset: ERA5, timerange: '2000/2001',
        short_name: tas, mip: Amon}
     - {project: native6, dataset: ERA5, timerange: '2000/2001',
        short_name: cl, mip: Amon, tres: 1H, frequency: 1hr}
     - {project: native6, dataset: ERA5, timerange: '2000/2001',
        short_name: ta, mip: Amon, type: fc, typeid: '12'}
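
How the facets combine into directory and file patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ESMValCore's actual path-resolution code: the helper `expand` and the default facet values (`E5`, `sf`, `an`, `00`, `1M`, `167`) are hypothetical placeholders, and the real defaults are the ones in the extra facets file.

```python
# Templates copied from the naming conventions listed above.
DIR_TEMPLATE = "{family}/{level}/{type}/{tres}/{grib_id}"
FILE_TEMPLATE = "{family}{level}{typeid}_{tres}_*_{grib_id}.grb"

# Hypothetical default facets (placeholders, NOT read from native6-era5.yml).
DEFAULT_FACETS = {
    "family": "E5",
    "level": "sf",
    "type": "an",
    "typeid": "00",
    "tres": "1M",
    "grib_id": "167",
}


def expand(template, recipe_facets=None):
    """Fill a template; facets given in the recipe override the defaults."""
    facets = {**DEFAULT_FACETS, **(recipe_facets or {})}
    return template.format(**facets)


if __name__ == "__main__":
    # Defaults only.
    print(expand(DIR_TEMPLATE))
    print(expand(FILE_TEMPLATE))
    # Overrides as in the second and third dataset entries above.
    print(expand(DIR_TEMPLATE, {"tres": "1H"}))
    print(expand(FILE_TEMPLATE, {"type": "fc", "typeid": "12"}))
```

The point of the sketch is the merge order: anything set in the recipe replaces the preconfigured value, and everything else falls back to the default.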

The native ERA5 output in GRIB format is stored on a `reduced Gaussian grid
<https://confluence.ecmwf.int/display/CKB/ERA5:+data+documentation#ERA5:datadocumentation-SpatialgridSpatialGrid>`__.
By default, these data are regridded to a regular 0.25°x0.25° grid as
`recommended by the ECMWF
<https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference#heading-Interpolation>`__
using bilinear interpolation.

To disable this, you can use the facet ``regrid: false`` in the recipe:

.. code-block:: yaml

   datasets:
     - {project: native6, dataset: ERA5, timerange: '2000/2001',
        short_name: tas, mip: Amon, regrid: false}

It is recommended to disable the default regridding if regridding is set up in
the :ref:`preprocessor <Horizontal regridding>`.
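
To give a sense of the size of a regular 0.25°x0.25° target grid, here is a minimal sketch. It assumes a global grid with points at both poles and longitudes starting at 0°E, which is the layout of the 0.25° ERA5 files distributed via the CDS; the text above does not state that ESMValCore's default target grid uses exactly this layout, and `regular_grid` is a hypothetical helper.

```python
def regular_grid(step=0.25):
    """Return latitude and longitude points of a global regular grid.

    Assumption: latitudes run from 90 to -90 (poles included) and
    longitudes from 0 up to, but not including, 360.
    """
    n_lat = round(180 / step) + 1
    n_lon = round(360 / step)
    lats = [90.0 - i * step for i in range(n_lat)]
    lons = [j * step for j in range(n_lon)]
    return lats, lons


if __name__ == "__main__":
    lats, lons = regular_grid()
    print(len(lats), len(lons))   # number of grid points per axis
    print(lats[0], lats[-1])      # first and last latitude
    print(lons[0], lons[-1])      # first and last longitude
```

If the preprocessor regrids to a different target anyway, interpolating to this intermediate grid first only adds cost and a second interpolation step, which is why `regrid: false` is recommended in that case.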

- Supported variables: ``albsn``, ``cl``, ``cli``, ``clt``, ``clw``, ``hur``,
``hus``, ``o3``, ``prw``, ``ps``, ``psl``, ``rainmxrat27``, ``sftlf``,
``snd``, ``snowmxrat27``, ``ta``, ``tas``, ``tdps``, ``toz``, ``ts``, ``ua``,
``uas``, ``va``, ``vas``, ``wap``, ``zg``.

.. _read_native_mswep:

MSWEP
@@ -140,7 +217,10 @@ MSWEP
- Supported frequencies: ``mon``, ``day``, ``3hr``.
- Tier: 3

For example for monthly data, place the files in the ``/Tier3/MSWEP/version/mon/pr`` subdirectory of your ``native6`` project location.
For example, for monthly data, place the files in the
``/Tier3/MSWEP/version/mon/pr`` subdirectory of the ``rootpath`` that you have
configured for the ``native6`` project (assuming you are using the ``default``
DRS for ``native6``).

.. note::
For monthly data (``V220``), the data must be postfixed with the date, i.e. rename ``global_monthly_050deg.nc`` to ``global_monthly_050deg_197901-201710.nc``
8 changes: 5 additions & 3 deletions esmvalcore/_provenance.py
@@ -4,6 +4,7 @@
 import logging
 import os
 from functools import total_ordering
+from pathlib import Path

 from netCDF4 import Dataset
 from PIL import Image
@@ -209,9 +210,10 @@ def _initialize_entity(self):
         """Initialize the entity representing the file."""
         if self.attributes is None:
             self.attributes = {}
-            with Dataset(self.filename, "r") as dataset:
-                for attr in dataset.ncattrs():
-                    self.attributes[attr] = dataset.getncattr(attr)
+            if "nc" in Path(self.filename).suffix:
+                with Dataset(self.filename, "r") as dataset:
+                    for attr in dataset.ncattrs():
+                        self.attributes[attr] = dataset.getncattr(attr)

         attributes = {
             "attribute:" + str(k).replace(" ", "_"): str(v)
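
The effect of the new suffix check can be tried in isolation. The sketch below reproduces only the condition from the change above; `looks_like_netcdf` is a hypothetical name used for illustration, and the file names are made up.

```python
from pathlib import Path


def looks_like_netcdf(filename):
    """Mirror the check from the diff: is "nc" contained in the last suffix?"""
    return "nc" in Path(filename).suffix


if __name__ == "__main__":
    names = (
        "tas.nc",             # suffix ".nc"
        "tas.nc4",            # suffix ".nc4"
        "E5sf00_1M_167.grb",  # suffix ".grb"
        "tas.nc.gz",          # only the last suffix, ".gz", is inspected
        "script.ncl",         # suffix ".ncl" also contains "nc"
    )
    for name in names:
        print(name, looks_like_netcdf(name))
```

Because this is a substring test on the last suffix only, it accepts `.nc` and `.nc4`, skips `.grb` files so that `netCDF4.Dataset` is never opened on GRIB input, and would also accept unrelated suffixes such as `.ncl`.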
2 changes: 2 additions & 0 deletions esmvalcore/cmor/_fixes/fix.py
@@ -845,6 +845,8 @@ def _fix_time_bounds(self, cube: Cube, cube_coord: Coord) -> None:
         """Fix time bounds."""
         times = {"time", "time1", "time2", "time3"}
         key = times.intersection(self.vardef.coordinates)
+        if not key:  # cube has time, but CMOR variable does not
+            return
         cmor = self.vardef.coordinates[" ".join(key)]
         if cmor.must_have_bounds == "yes" and not cube_coord.has_bounds():
             cube_coord.bounds = get_time_bounds(cube_coord, self.frequency)
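
Why the early return is needed can be seen with plain Python containers. In this sketch a dictionary stands in for `self.vardef.coordinates`; `find_time_coord_key` is a hypothetical helper and not part of ESMValCore.

```python
TIME_NAMES = {"time", "time1", "time2", "time3"}


def find_time_coord_key(cmor_coordinates):
    """Return the name of the CMOR time coordinate, or None if there is none."""
    key = TIME_NAMES.intersection(cmor_coordinates)
    if not key:  # same guard as in the diff above
        return None
    return " ".join(key)


if __name__ == "__main__":
    with_time = {"time": "...", "latitude": "...", "longitude": "..."}
    without_time = {"latitude": "...", "longitude": "..."}

    print(find_time_coord_key(with_time))
    print(find_time_coord_key(without_time))

    # Without the guard, the empty intersection is joined into "" and the
    # subsequent dictionary lookup fails.
    try:
        without_time[" ".join(TIME_NAMES.intersection(without_time))]
    except KeyError as exc:
        print("KeyError:", repr(exc.args[0]))
```

For a variable whose CMOR definition has no time coordinate, such as a time-invariant field, the intersection is empty, so the unguarded lookup would raise `KeyError` on the empty string.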