diff --git a/diag_manager/README.md b/diag_manager/README.md index 53abdab39..fc0b8d10e 100644 --- a/diag_manager/README.md +++ b/diag_manager/README.md @@ -1,342 +1,113 @@ -## Diag Table Yaml Format: - -The purpose of this document is to explain the diag_table yaml format. +The purpose of this document is to document the differences between the old diag manager and the new (modern) diag manager. ## Contents -- [1. Converting from legacy ascii diag_table format](README.md#1-converting-from-legacy-ascii-diag_table-format) -- [2. Diag table yaml sections](README.md#2-diag-table-yaml-sections) -- [2.1 Global Section](README.md#21-global-section) -- [2.2 File Section](README.md#22-file-section) -- [2.2.1 Flexible output timings](README.md#221-flexible-output-timings) -- [2.2.2 Coupled Model Diag Files](README.md#222-coupled-model-diag-files) -- [2.3 Variable Section](README.md#23-variable-section) -- [2.4 Variable Metadata Section](README.md#24-variable-metadata-section) -- [2.5 Global Meta Data Section](README.md#25-global-meta-data-section) -- [2.6 Sub_region Section](README.md#26-sub_region-section) -- [3. More examples](README.md#3-more-examples) - -### 1. Converting from legacy ascii diag_table format - -To convert the legacy ascii diad_table format to this yaml format, the python script [**diag_table_to_yaml.py**](https://github.com/NOAA-GFDL/fms_yaml_tools/blob/aafc3293d45df2fc173d3c7afd8b8b0adc18fde4/fms_yaml_tools/diag_table/diag_table_to_yaml.py#L23-L26) can be used. To confirm that your diag_table.yaml was created correctly, the python script [**is_valid_diag_table_yaml.py**](https://github.com/NOAA-GFDL/fms_yaml_tools/blob/aafc3293d45df2fc173d3c7afd8b8b0adc18fde4/fms_yaml_tools/diag_table/is_valid_diag_table_yaml.py#L24-L27) can be used. - -### 2. Diag table yaml sections -The diag_table.yaml is organized by file. Each file has the required and optional key/value pairs for the file, an optional subsection defining any additional global metadata to add to the file, an optional subsection defining a subregion of the grid to output the data for and a required subsection for all of the variables in the file. Each variable has the required and optional key/value pairs for the variable and an optional subsection defining any additional variable attributes to add to the file. The hierarchical structure looks like this: - -```yaml -title: -base_date: -diag_files: -- file1 - - #key/value pairs for file1 - varlist: - - var1 - - #key/value pairs for var1 - attributes: - - #atributes for var1 - global_metadata: - - #global attributes for file1 - subregion: - - #subregion for file1 -``` - -### 2.1 Global Section -The diag_yaml requires “title” and the “baseDate”. -- The **title** is a string that labels the diag yaml. The equivalent in the legacy diag_table would be the experiment. It is recommended that each diag_yaml have a separate title label that is descriptive of the experiment that is using it. -- The **basedate** is an array of 6 integers indicating the base_date in the format [year month day hour minute second]. +- [1. Diag Table Format](README.md#1-diag-table-format) +- [2. Scalar Axis](README.md#2-scalar-axis) +- [3. Average Time Variables](README.md#3-average-time-variables) +- [4. Subregional Files](README.md#4-subregional-files) +- [5. Global attributes](README.md#5-global-attributes) +- [6. Real attributes from diag_field_add_attribute calls](README.md#6-real-attributes-from-diag_field_add_attribute-calls) +- [7. Answers "changes"](README.md#7-answers-changes) -**Example:** +### 1. Diag Table Format +The modern diag manager uses a YAML format instead of the legacy ascii table. A description of the YAML diag table can be found [here](diag_yaml_format.md). -In the YAML format: -```yaml -title: ESM4_piControl -base_date: 2022 5 26 12 3 1 -``` +### 2. Scalar Axis +The old diag manager was adding a `scalar_axis` dimension of size 1 for scalar variables -In the legacy ascii format: ``` -ESM4_piControl -2022 5 26 12 3 1 +dimensions: + scalar_axis = 1 ; +variables: + double p700(scalar_axis) ; + p700:_FillValue = 1.e+20 ; + p700:missing_value = 1.e+20 ; ``` +The new diag manager will no longer have a dummy scalar axis dimension. -### 2.2 File Section -The files are listed under the diagFiles section as a dashed array. - -Below are the **required** keys needed to define each file. -- **file_name** is a string that defines the name of the file. Do not add ".nc" and "tileX" to the filename as this will be handled by FMS. -- **freq** defines the frequency and the units that data will be written - - The acceptable values for freq are: - - =-1: output at the end of the run only - - =0: output every timestep - - \>0 units: output frequency and units (with a space between the frequency number and units e.g 24 hours) - - Values of -1 or 0 do not require units. - - The acceptable values for units are seconds, minutes, hours, days, months, years. -- **time_units** is a string that defines units for time. The acceptable values are seconds, minutes, hours, days, months, years. -- **unlimdim** is a string that defines the name of the unlimited dimension in the output netcdf file, usually “time”. -- **varlist** is a subsection that list all of the variable in the file - -**Example:** The following creates a file with data written every 6 hours. +### 3. Average Time Variables +The old diag manager includes time bounds metadata in a non-standard convention (i.e. `average_T1`, `average_T2`, and `average_DT`) +1. `average_T1` is the start time for the averaging period (in the same time units as time) +2. `average_T2` is the end time for the averaging period +3. `average_DT` is the length of the averaging period, in days -In the YAML format: -```yaml -diag_files: -- file_name: atmos_6hours - freq: 6 hours - time_units: hours - unlimdim: time - varlist: - - varinfo ``` - -In the legacy ascii format: + double average_T1(time) ; + average_T1:_FillValue = 1.e+20 ; + average_T1:missing_value = 1.e+20 ; + average_T1:units = "days since 1979-01-01 00:00:00" ; + average_T1:long_name = "Start time for average period" ; + double average_T2(time) ; + average_T2:_FillValue = 1.e+20 ; + average_T2:missing_value = 1.e+20 ; + average_T2:units = "days since 1979-01-01 00:00:00" ; + average_T2:long_name = "End time for average period" ; + double average_DT(time) ; + average_DT:_FillValue = 1.e+20 ; + average_DT:missing_value = 1.e+20 ; + average_DT:units = "days" ; + average_DT:long_name = "Length of average period" ; ``` -"atmos_6hours", 6, "hours", 1, "hours", "time" +These 3 variables are referenced as a variable attribute in each diagnostic. e.g. ``` - -**NOTE:** The fourth column (file_format) has been deprecated. Netcdf files will always be written. - -Below are some *optional* keys that may be added. -- **write_file** is a logical that indicates if you want the file to be created (default is true). This is a new feature that is not supported by the legacy ascii data_table. -- **new_file_freq** is a string that defines the frequency and the frequency units (with a space between the frequency number and units) for closing the existing file -- **start_time** is an array of 6 integer indicating when to start the file for the first time. It is in the format [year month day hour minute second]. Requires “new_file_freq” -- **filename_time** is the time used to set the name of new files when using new_file_freq. The acceptable values are begin (which will use the begining of the file's time bounds), middle (which will use the middle of the file's time bounds), and end (which will use the end of the file's time bounds). The default is middle - -**Example:** The following will create a new file every 6 hours starting at Jan 1 2020. Variable data will be written to the file every 6 hours. - -In the YAML format: -```yaml -- file_name: ocn%4yr%2mo%2dy%2hr - freq: 6 hours - freq_units: hours - unlimdim: time - new_file_freq: 6 hours - start_time: 2020 1 1 0 0 0 + dis_liq:time_avg_info = "average_T1,average_T2,average_DT" ; ``` -In the legacy ascii format: -``` -"ocn%4yr%2mo%2dy%2hr", 6, "hours", 1, "hours", "time", 6, "hours", "1901 1 1 0 0 0" -``` +The new diag manager will not be using these non standard variables. Instead the time bounds information will be specified in CF standards. One time_bounds variable, with an extra nv dimension, that species both average_T1 and average_T2. The average_DT information can be obtained from the time_bnds variable. -Because this is using the default `filename_time` (middle), this example will create the files: ``` -ocn_2020_01_01_03.nc for time_bnds [0,6] -ocn_2020_01_01_09.nc for time_bnds [6,12] -ocn_2020_01_01_15.nc for time_bnds [12,18] -ocn_2020_01_01_21.nc for time_bnds [18,24] + double nv(nv) ; + nv:long_name = "vertex number" ; + double time_bnds(time, nv) ; + time_bnds:units = "days since 1979-01-01 00:00:00" ; + time_bnds:long_name = "time axis boundaries" ; ``` - -**NOTE** If using the new_file_freq, there must be a way to distinguish each file, as it was done in the example above. - -- **file_duration** is a string that defines how long the file should receive data after start time in “file_duration_units”. This optional field can only be used if the start_time field is present. If this field is absent, then the file duration will be equal to the frequency for creating new files. -- **global_meta** is a subsection that lists any additional global metadata to add to the file. This is a new feature that is not supported by the legacy ascii data_table. -- **sub_region** is a subsection that defines the four corners of a subregional section to capture. - -### 2.2.1 Flexible output timings - -In order to provide more flexibility in output timings, the diag_table yaml format allows for different file frequencies for the same file by allowing the `freq`, `new_file_freq`, and `file_duration` keys to accept a comma seperated list. - -For example, -``` yaml -- file_name: flexible_timing%4yr%2mo%2dy%2hr - freq: 1 hours, 1 hours, 1 hours - time_units: hours - unlimdim: time - new_file_freq: 6 hours, 3 hours, 1 hours - start_time: 2 1 1 0 0 0 - file_duration: 12 hours, 3 hours, 9 hours - filename_time: begin - varlist: - - module: ocn_mod - var_name: var1 - reduction: average - kind: r4 -``` -This will create a file every 6 hours for 12 hours -``` -flexible_timing_0002_01_01_00.nc - using hourly averaged data from hour 0 to hour 6 -flexible_timing_0002_01_01_06.nc - using hourly averaged data from hour 6 to hour 12 -``` - -Then it will create a file every 3 hours for 3 hours +This time_bounds variable is refernced as a variable attribute of time: ``` -flexible_timing_0002_01_01_12.nc - using hourly averaged data from hour 12 to hour 15 + time:bounds = "time_bnds" ; ``` -Then it will create a file every 1 hour for 9 hours. -``` -flexible_timing_0002_01_01_15.nc - using data from hour 15 to hour 16 -flexible_timing_0002_01_01_16.nc - using data from hour 16 to hour 17 -flexible_timing_0002_01_01_17.nc - using data from hour 17 to hour 18 -flexible_timing_0002_01_01_18.nc - using data from hour 18 to hour 19 -flexible_timing_0002_01_01_19.nc - using data from hour 19 to hour 20 -flexible_timing_0002_01_01_20.nc - using data from hour 20 to hour 21 -flexible_timing_0002_01_01_21.nc - using data from hour 21 to hour 22 -flexible_timing_0002_01_01_22.nc - using data from hour 22 to hour 23 -flexible_timing_0002_01_01_23.nc - using data from hour 23 to hour 24 +### 4. Subregional Files -``` +#### A. `is_subregional` global attribute: +Subregional files will have a global NetCDF attribute `is_subregional = True` set for non-global history files. This attribute will be used in PP tools. -### 2.2.2 Coupled Model Diag Files -In the *legacy ascii diag_table*, when running a coupled model (ATM + OCN) in a seperate PE list: - - The ATM PEs ignored the files in the diag_table that contain "OCEAN" in the filename - - The OCN PEs ignored the files in the diag_table that did not contain "OCEAN" in the filename +#### B. Subregional dimension names: +In some cases, the old diag manager was adding `sub0X` to the dimension names where X is a number greater than 1. This was causing problems in PP tools that were expecting the dimension to have `sub01` in the name. The new diag manager will not have this problem. -In the *yaml diag_table*: - - The ATM PEs will ignore the files in the diag_table.yaml that contain the key/value pair `is_ocean: true` - - The OCN PEs will ignore the files in the diag_table.yaml that do not contain the key/value pair `is_ocean: true` +#### C. Corner and center diagnostics: +In the old diag manager, if mixing variables that are corner variables, such as velocities={uo,vo,umo,vmo} and center variables, such as tracers={thetao,so,volcello} you sometimes ended up with a different number of variables per file. The extra files had duplicate data for the corner velocities because the two PEs shared the point at the edge. This happened with some grid/layouts/masks/subregion combinations and it caused problems with the combiner. The new diag manager will not have this problem. -### 2.3 Variable Section -The variables in each file are listed under the varlist section as a dashed array. +### 5. Global attributes +#### A. Grid type and grid tile: +The old diag manager was adding the global attributes grid_type = "regular" and grid_tile = "N/A" for all files regardless of what the grid_type and the grid_title actually were. The new diag manager will no longer be doing this as they are not correct and don’t seem to be used. -- **var_name:** is a string that defines the variable name as it is defined in the register_diag_field call in the model -- **reduction:** is a string that describes the data reduction method to perform prior to writing data to disk. Acceptable values are average, diurnalXX (where XX is the number of diurnal samples), powXX (whre XX is the power level), min, max, none, rms, and sum. -- **module:** is a string that defines the module where the variable is registered in the model code -- **kind:** is a string that defines the type of variable as it will be written out in the file. Acceptable values are r4, r8, i4, and i8 +#### B. Associated_files global attribute: +We were unable to reproduce the exact order of the associated_files global attribute, so users may see differences like -**Example:** - -In the YAML format: -```yaml - varlist: - - module: moist - var_name: precip - reduction: average - kind: r4 ``` - -In the legacy ascii format: +lake_area: 19790101.land_static.nc soil_area: 19790101.land_static.nc land_area: 19790101.land_static.nc <> land_area: 19790101.land_static.nc soil_area: 19790101.land_static.nc lake_area: 19790101.land_static.nc ``` -"moist", "precip", "precip", "atmos_8xdaily", "all", .true., "none", 2 -``` -**NOTE:** The fifth column (time_sampling) has been deprecated. The reduction_method (`.true.`) has been replaced with `average`. The output name was not included in the yaml because it is the same as the var_name. - -which corresponds to the following model code -```F90 -id_precip = register_diag_field ( 'moist', 'precip', axes, Time) -``` -where: -- `moist` corresonds to the module key in the diag_table.yaml -- `precip` corresponds to the var_name key in the diag_table.yaml -- `axes` are the ids of the axes the variable is a function of -- `Time` is the model time -Below are some *optional* keys that may be added. -- **write_var:** is a logical that is set to false if the user doesn’t want the variable to be written to the file (default: true). -- **out_name:** is a string that defines the name of the variable that will be written to the file (default same as var_name) -- **long_name:** is a string defining the long_name attribute of the variable. It overwrites the long_name in the variable's register_diag_field call -- **attributes:** is a subsection with any additional metadata to add to the variable in the netcdf file. This is a new feature that is not supported by the legacy ascii data_table. -- **zbounds:** is a 2 member array of integers that define the bounds of the z axis (zmin, zmin), optional default is no limits. +### 6. Real attributes from diag_field_add_attribute calls +When real attributes were added to the file via a diag_field_add_attribute call, the old diag manager is always saving it as NF90_FLOAT regardless of the precision the data was [passed in](https://github.com/NOAA-GFDL/FMS/blob/ebb32649efa395ea14598f74c8d49e74d1408579/diag_manager/diag_manager.F90#L4532-L4543) -### 2.4 Variable Metadata Section -Any aditional variable attributes can be added for each variable can be listed under the attributes section as a dashed array. The key is attribute name and the value is the attribute value. +The new diag manager is going to write the attribute as it is passed in. This will cause differences when the model component was compiled with r8 as it will write the attribute as r8 instead of r4. -**Example:** +### 7. Answers "changes" +When the model run time is less than then the output frequency (i.e if the module run time is 2 days and you are writing monthly diagnostics), the old diag manager was writing 9.96921e+36. The new diag manager is not going to write anything for this cases, so if you ncdump the output from the new diag manager, you will get: -```yaml - attributes: - - attribute_name: attribute_value - attribute_name: attribute_value ``` - -Although this was not supported by the legacy ascii data_table, with the legacy diag_manager, a call to `diag_field_add_attribute` could have been used to do the same thing. - -```F90 -call diag_field_add_attribute(diag_field_id, attribute_name, attribute_value) + wa = + _, _, _, _, _, _, ... ``` -### 2.5 Global Meta Data Section -Any aditional global attributes can be added for each file can be listed under the global_meta section as a dashed array. The key is the attribute name and the value is the attribute value. +Similarly, when a variable was registered, but send_data was never called, the old diag manager was outputting the warning like -```yaml - global_meta: - - attribute_name: attribute_value - attribute_name: attribute_value +``` +WARNING from PE 0: diag_manager_mod::closing_file: module/output_field soil/soil_fgw, skip one time level, maybe send_data never called ``` -### 2.6 Sub_region Section -The sub region can be listed under the sub_region section as a dashed array. The legacy ascii diag_table only allows regions to be defined using the latitude and longitude, and it only allowed rectangular sub regions. With the yaml diag_table, you can use indices to defined the sub_region and you can define **any** four corner shape. Each file can only have 1 sub_region defined. These are keys that can be used: -- **grid_type:** is a **required** string defining the method used to define the fourth sub_region corners. The acceptable values are "latlon" if using latitude/longitude or "indices" if using the indices of the corners. -- **corner1:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the first corner of a sub_grid. -- **corner2:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the second corner of a sub_grid. -- **corner3:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the third corner of a sub_grid. -- **corner4:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the fourth corner of a sub_grid. -- **tile:** is an integer defining the tile number the sub_grid is on. It is **required** only if using (grid_type="indices"). - -**Exampe:** +And writing out `9.96921e+36` for the variable. The new diag manager will also be outputting the warning, but it will not write out anything. -```yaml - sub_region: - - grid_type: latlon - corner1: -80, 0 - corner2: -80, 75 - corner3: -60, 0 - corner4: -60, 75 -``` -### 3. More examples -Bellow is a complete example of diag_table.yaml: -```yaml -title: test_diag_manager -base_date: 2 1 1 0 0 0 -diag_files: -- file_name: wild_card_name%4yr%2mo%2dy%2hr - freq: 6 hours - time_units: hours - unlimdim: time - new_file_freq: 6 hours - start_time: 2 1 1 0 0 0 - file_duration: 12 hours - varlist: - - module: test_diag_manager_mod - var_name: sst - reduction: average - kind: r4 - global_meta: - - is_a_file: true -- file_name: normal - freq: 24 days - time_units: hours - unlimdim: records - varlist: - - module: test_diag_manager_mod - var_name: sst - reduction: average - kind: r4 - write_var: true - attributes: - - do_sst: .true. - sub_region: - - grid_type: latlon - corner1: -80, 0 - corner2: -80, 75 - corner3: -60, 0 - corner4: -60, 75 -- file_name: normal2 - freq: -1 days - time_units: hours - unlimdim: records - write_file: true - varlist: - - module: test_diag_manager_mod - var_name: sstt - reduction: average - kind: r4 - long_name: S S T - - module: test_diag_manager_mod - var_name: sstt2 - reduction: average - kind: r4 - write_var: false - sub_region: - - grid_type: index - tile: 1 - corner1: 10, 15 - corner2: 20, 15 - corner3: 10, 25 - corner4: 20, 25 -- file_name: normal3 - freq: -1 days - time_units: hours - unlimdim: records - write_file: false -``` diff --git a/diag_manager/diag_yaml_format.md b/diag_manager/diag_yaml_format.md new file mode 100644 index 000000000..63ed4630c --- /dev/null +++ b/diag_manager/diag_yaml_format.md @@ -0,0 +1,342 @@ +## Diag Table Yaml Format: + +The purpose of this document is to explain the diag_table yaml format. + +## Contents +- [1. Converting from legacy ascii diag_table format](diag_yaml_format.md#1-converting-from-legacy-ascii-diag_table-format) +- [2. Diag table yaml sections](diag_yaml_format.md#2-diag-table-yaml-sections) +- [2.1 Global Section](diag_yaml_format.md#21-global-section) +- [2.2 File Section](diag_yaml_format.md#22-file-section) +- [2.2.1 Flexible output timings](diag_yaml_format.md#221-flexible-output-timings) +- [2.2.2 Coupled Model Diag Files](diag_yaml_format.md#222-coupled-model-diag-files) +- [2.3 Variable Section](diag_yaml_format.md#23-variable-section) +- [2.4 Variable Metadata Section](diag_yaml_format.md#24-variable-metadata-section) +- [2.5 Global Meta Data Section](diag_yaml_format.md#25-global-meta-data-section) +- [2.6 Sub_region Section](diag_yaml_format.md#26-sub_region-section) +- [3. More examples](diag_yaml_format.md#3-more-examples) + +### 1. Converting from legacy ascii diag_table format + +To convert the legacy ascii diad_table format to this yaml format, the python script [**diag_table_to_yaml.py**](https://github.com/NOAA-GFDL/fms_yaml_tools/blob/aafc3293d45df2fc173d3c7afd8b8b0adc18fde4/fms_yaml_tools/diag_table/diag_table_to_yaml.py#L23-L26) can be used. To confirm that your diag_table.yaml was created correctly, the python script [**is_valid_diag_table_yaml.py**](https://github.com/NOAA-GFDL/fms_yaml_tools/blob/aafc3293d45df2fc173d3c7afd8b8b0adc18fde4/fms_yaml_tools/diag_table/is_valid_diag_table_yaml.py#L24-L27) can be used. + +### 2. Diag table yaml sections +The diag_table.yaml is organized by file. Each file has the required and optional key/value pairs for the file, an optional subsection defining any additional global metadata to add to the file, an optional subsection defining a subregion of the grid to output the data for and a required subsection for all of the variables in the file. Each variable has the required and optional key/value pairs for the variable and an optional subsection defining any additional variable attributes to add to the file. The hierarchical structure looks like this: + +```yaml +title: +base_date: +diag_files: +- file1 + - #key/value pairs for file1 + varlist: + - var1 + - #key/value pairs for var1 + attributes: + - #atributes for var1 + global_metadata: + - #global attributes for file1 + subregion: + - #subregion for file1 +``` + +### 2.1 Global Section +The diag_yaml requires “title” and the “baseDate”. +- The **title** is a string that labels the diag yaml. The equivalent in the legacy diag_table would be the experiment. It is recommended that each diag_yaml have a separate title label that is descriptive of the experiment that is using it. +- The **basedate** is an array of 6 integers indicating the base_date in the format [year month day hour minute second]. + +**Example:** + +In the YAML format: +```yaml +title: ESM4_piControl +base_date: 2022 5 26 12 3 1 +``` + +In the legacy ascii format: +``` +ESM4_piControl +2022 5 26 12 3 1 +``` + +### 2.2 File Section +The files are listed under the diagFiles section as a dashed array. + +Below are the **required** keys needed to define each file. +- **file_name** is a string that defines the name of the file. Do not add ".nc" and "tileX" to the filename as this will be handled by FMS. +- **freq** defines the frequency and the units that data will be written + - The acceptable values for freq are: + - =-1: output at the end of the run only + - =0: output every timestep + - \>0 units: output frequency and units (with a space between the frequency number and units e.g 24 hours) + - Values of -1 or 0 do not require units. + - The acceptable values for units are seconds, minutes, hours, days, months, years. +- **time_units** is a string that defines units for time. The acceptable values are seconds, minutes, hours, days, months, years. +- **unlimdim** is a string that defines the name of the unlimited dimension in the output netcdf file, usually “time”. +- **varlist** is a subsection that list all of the variable in the file + +**Example:** The following creates a file with data written every 6 hours. + +In the YAML format: +```yaml +diag_files: +- file_name: atmos_6hours + freq: 6 hours + time_units: hours + unlimdim: time + varlist: + - varinfo +``` + +In the legacy ascii format: +``` +"atmos_6hours", 6, "hours", 1, "hours", "time" +``` + +**NOTE:** The fourth column (file_format) has been deprecated. Netcdf files will always be written. + +Below are some *optional* keys that may be added. +- **write_file** is a logical that indicates if you want the file to be created (default is true). This is a new feature that is not supported by the legacy ascii data_table. +- **new_file_freq** is a string that defines the frequency and the frequency units (with a space between the frequency number and units) for closing the existing file +- **start_time** is an array of 6 integer indicating when to start the file for the first time. It is in the format [year month day hour minute second]. Requires “new_file_freq” +- **filename_time** is the time used to set the name of new files when using new_file_freq. The acceptable values are begin (which will use the begining of the file's time bounds), middle (which will use the middle of the file's time bounds), and end (which will use the end of the file's time bounds). The default is middle + +**Example:** The following will create a new file every 6 hours starting at Jan 1 2020. Variable data will be written to the file every 6 hours. + +In the YAML format: +```yaml +- file_name: ocn%4yr%2mo%2dy%2hr + freq: 6 hours + freq_units: hours + unlimdim: time + new_file_freq: 6 hours + start_time: 2020 1 1 0 0 0 +``` + +In the legacy ascii format: +``` +"ocn%4yr%2mo%2dy%2hr", 6, "hours", 1, "hours", "time", 6, "hours", "1901 1 1 0 0 0" +``` + +Because this is using the default `filename_time` (middle), this example will create the files: +``` +ocn_2020_01_01_03.nc for time_bnds [0,6] +ocn_2020_01_01_09.nc for time_bnds [6,12] +ocn_2020_01_01_15.nc for time_bnds [12,18] +ocn_2020_01_01_21.nc for time_bnds [18,24] +``` + +**NOTE** If using the new_file_freq, there must be a way to distinguish each file, as it was done in the example above. + +- **file_duration** is a string that defines how long the file should receive data after start time in “file_duration_units”. This optional field can only be used if the start_time field is present. If this field is absent, then the file duration will be equal to the frequency for creating new files. +- **global_meta** is a subsection that lists any additional global metadata to add to the file. This is a new feature that is not supported by the legacy ascii data_table. +- **sub_region** is a subsection that defines the four corners of a subregional section to capture. + +### 2.2.1 Flexible output timings + +In order to provide more flexibility in output timings, the diag_table yaml format allows for different file frequencies for the same file by allowing the `freq`, `new_file_freq`, and `file_duration` keys to accept a comma seperated list. + +For example, +``` yaml +- file_name: flexible_timing%4yr%2mo%2dy%2hr + freq: 1 hours, 1 hours, 1 hours + time_units: hours + unlimdim: time + new_file_freq: 6 hours, 3 hours, 1 hours + start_time: 2 1 1 0 0 0 + file_duration: 12 hours, 3 hours, 9 hours + filename_time: begin + varlist: + - module: ocn_mod + var_name: var1 + reduction: average + kind: r4 +``` +This will create a file every 6 hours for 12 hours +``` +flexible_timing_0002_01_01_00.nc - using hourly averaged data from hour 0 to hour 6 +flexible_timing_0002_01_01_06.nc - using hourly averaged data from hour 6 to hour 12 +``` + +Then it will create a file every 3 hours for 3 hours +``` +flexible_timing_0002_01_01_12.nc - using hourly averaged data from hour 12 to hour 15 +``` + +Then it will create a file every 1 hour for 9 hours. +``` +flexible_timing_0002_01_01_15.nc - using data from hour 15 to hour 16 +flexible_timing_0002_01_01_16.nc - using data from hour 16 to hour 17 +flexible_timing_0002_01_01_17.nc - using data from hour 17 to hour 18 +flexible_timing_0002_01_01_18.nc - using data from hour 18 to hour 19 +flexible_timing_0002_01_01_19.nc - using data from hour 19 to hour 20 +flexible_timing_0002_01_01_20.nc - using data from hour 20 to hour 21 +flexible_timing_0002_01_01_21.nc - using data from hour 21 to hour 22 +flexible_timing_0002_01_01_22.nc - using data from hour 22 to hour 23 +flexible_timing_0002_01_01_23.nc - using data from hour 23 to hour 24 + +``` + +### 2.2.2 Coupled Model Diag Files +In the *legacy ascii diag_table*, when running a coupled model (ATM + OCN) in a seperate PE list: + - The ATM PEs ignored the files in the diag_table that contain "OCEAN" in the filename + - The OCN PEs ignored the files in the diag_table that did not contain "OCEAN" in the filename + +In the *yaml diag_table*: + - The ATM PEs will ignore the files in the diag_table.yaml that contain the key/value pair `is_ocean: true` + - The OCN PEs will ignore the files in the diag_table.yaml that do not contain the key/value pair `is_ocean: true` + +### 2.3 Variable Section +The variables in each file are listed under the varlist section as a dashed array. + +- **var_name:** is a string that defines the variable name as it is defined in the register_diag_field call in the model +- **reduction:** is a string that describes the data reduction method to perform prior to writing data to disk. Acceptable values are average, diurnalXX (where XX is the number of diurnal samples), powXX (whre XX is the power level), min, max, none, rms, and sum. +- **module:** is a string that defines the module where the variable is registered in the model code +- **kind:** is a string that defines the type of variable as it will be written out in the file. Acceptable values are r4, r8, i4, and i8 + +**Example:** + +In the YAML format: +```yaml + varlist: + - module: moist + var_name: precip + reduction: average + kind: r4 +``` + +In the legacy ascii format: +``` +"moist", "precip", "precip", "atmos_8xdaily", "all", .true., "none", 2 +``` +**NOTE:** The fifth column (time_sampling) has been deprecated. The reduction_method (`.true.`) has been replaced with `average`. The output name was not included in the yaml because it is the same as the var_name. + +which corresponds to the following model code +```F90 +id_precip = register_diag_field ( 'moist', 'precip', axes, Time) +``` +where: +- `moist` corresonds to the module key in the diag_table.yaml +- `precip` corresponds to the var_name key in the diag_table.yaml +- `axes` are the ids of the axes the variable is a function of +- `Time` is the model time + +Below are some *optional* keys that may be added. +- **write_var:** is a logical that is set to false if the user doesn’t want the variable to be written to the file (default: true). +- **out_name:** is a string that defines the name of the variable that will be written to the file (default same as var_name) +- **long_name:** is a string defining the long_name attribute of the variable. It overwrites the long_name in the variable's register_diag_field call +- **attributes:** is a subsection with any additional metadata to add to the variable in the netcdf file. This is a new feature that is not supported by the legacy ascii data_table. +- **zbounds:** is a 2 member array of integers that define the bounds of the z axis (zmin, zmin), optional default is no limits. + +### 2.4 Variable Metadata Section +Any aditional variable attributes can be added for each variable can be listed under the attributes section as a dashed array. The key is attribute name and the value is the attribute value. + +**Example:** + +```yaml + attributes: + - attribute_name: attribute_value + attribute_name: attribute_value +``` + +Although this was not supported by the legacy ascii data_table, with the legacy diag_manager, a call to `diag_field_add_attribute` could have been used to do the same thing. + +```F90 +call diag_field_add_attribute(diag_field_id, attribute_name, attribute_value) +``` + +### 2.5 Global Meta Data Section +Any aditional global attributes can be added for each file can be listed under the global_meta section as a dashed array. The key is the attribute name and the value is the attribute value. + +```yaml + global_meta: + - attribute_name: attribute_value + attribute_name: attribute_value +``` + +### 2.6 Sub_region Section +The sub region can be listed under the sub_region section as a dashed array. The legacy ascii diag_table only allows regions to be defined using the latitude and longitude, and it only allowed rectangular sub regions. With the yaml diag_table, you can use indices to defined the sub_region and you can define **any** four corner shape. Each file can only have 1 sub_region defined. These are keys that can be used: +- **grid_type:** is a **required** string defining the method used to define the fourth sub_region corners. The acceptable values are "latlon" if using latitude/longitude or "indices" if using the indices of the corners. +- **corner1:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the first corner of a sub_grid. +- **corner2:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the second corner of a sub_grid. +- **corner3:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the third corner of a sub_grid. +- **corner4:** is a **required** 2 member array of reals if using (grid_type="latlon") or integers if using (grid_type="indices") defining the x and y points of the fourth corner of a sub_grid. +- **tile:** is an integer defining the tile number the sub_grid is on. It is **required** only if using (grid_type="indices"). + +**Exampe:** + +```yaml + sub_region: + - grid_type: latlon + corner1: -80, 0 + corner2: -80, 75 + corner3: -60, 0 + corner4: -60, 75 +``` + +### 3. More examples +Bellow is a complete example of diag_table.yaml: +```yaml +title: test_diag_manager +base_date: 2 1 1 0 0 0 +diag_files: +- file_name: wild_card_name%4yr%2mo%2dy%2hr + freq: 6 hours + time_units: hours + unlimdim: time + new_file_freq: 6 hours + start_time: 2 1 1 0 0 0 + file_duration: 12 hours + varlist: + - module: test_diag_manager_mod + var_name: sst + reduction: average + kind: r4 + global_meta: + - is_a_file: true +- file_name: normal + freq: 24 days + time_units: hours + unlimdim: records + varlist: + - module: test_diag_manager_mod + var_name: sst + reduction: average + kind: r4 + write_var: true + attributes: + - do_sst: .true. + sub_region: + - grid_type: latlon + corner1: -80, 0 + corner2: -80, 75 + corner3: -60, 0 + corner4: -60, 75 +- file_name: normal2 + freq: -1 days + time_units: hours + unlimdim: records + write_file: true + varlist: + - module: test_diag_manager_mod + var_name: sstt + reduction: average + kind: r4 + long_name: S S T + - module: test_diag_manager_mod + var_name: sstt2 + reduction: average + kind: r4 + write_var: false + sub_region: + - grid_type: index + tile: 1 + corner1: 10, 15 + corner2: 20, 15 + corner3: 10, 25 + corner4: 20, 25 +- file_name: normal3 + freq: -1 days + time_units: hours + unlimdim: records + write_file: false +```