IODA
04-VariablesAttributesAndDimensions.py
#
# (C) Copyright 2020 UCAR
#
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
#

# This example supplements the previous discussion on variables.
#
# Variables store data, but how should these data be interpreted? This is the
# purpose of attributes. Attributes are bits of metadata that can describe groups
# and variables. Good examples of attributes include tagging the units of a variable,
# describing a variable, listing a variable's valid range, or "coding" missing
# or invalid values. Other examples include tagging the source of your data,
# or recording when you ingested or converted data into the ioda format.
#
# Basic manipulation of attributes was already discussed in Tutorial 2. Now, we want to
# focus instead on good practices for tagging your data.
#
# Supplementing attributes, we introduce the concept of adding "dimension scales" to your
# data. Basically, your data have dimensions, but we want to attach a "meaning" to each
# axis of your data. Typically, the first axis corresponds to your data's Location.
# A possible second axis for brightness temperature data might be "instrument channel", or
# maybe "pressure level". This tutorial will show you how to create new dimension scales and
# attach them to new Variables.

import os
import sys

if os.environ.get('LIBDIR') is not None:
    sys.path.append(os.environ['LIBDIR'])

import ioda

# Create the HDF5 file that backs this example, replacing any existing file.
g = ioda.Engines.HH.createFile(
    name = "Example-04-python.hdf5",
    mode = ioda.Engines.BackendCreateModes.Truncate_If_Exists)

# Let's start with dimensions and Dimension Scales.
#
# ioda stores data using Variables, and you can view each variable as a
# multidimensional matrix of data. This matrix has dimensions.
# A dimension may be used to represent a real physical dimension, for example,
# time, latitude, longitude, or height. A dimension might also be used to
# index more abstract quantities, for example, color-table entry number,
# instrument number, station-time pair, or model-run ID.
#
# A dimension scale is simply another variable that provides context, or meaning,
# to a particular dimension. For example, you might have ATMS brightness
# temperature information that has dimensions of location by channel number. In ioda,
# we want every "axis" of a variable to be associated with an explanatory dimension.
#
# Let's create a few dimensions. Note: when working with an already-existing Obs Space
# (covered later), these dimensions may already be present.
#
# Create two dimensions, "nlocs" and "ATMS Channel". Set distinct values within
# these dimensions.

num_locs = 3000
num_channels = 23

dim_location = g.vars.create('nlocs', ioda.Types.int32, [ num_locs ])
dim_location.scales.setIsScale('nlocs')

dim_channel = g.vars.create('ATMS Channel', ioda.Types.int32, [num_channels])
dim_channel.scales.setIsScale('ATMS Channel')

# Now that we have created dimensions, we can create new variables and attach the
# dimensions to our data.
#
# But first, a note about attributes:
# Attributes provide metadata that describe our variables.
# In IODA, we must at least keep track of each variable's:
# - Units (in SI; we follow CF conventions)
# - Long name
# - Range of validity. Data outside of this range are automatically rejected for
#   future processing.
#
# Let's create variables for Latitude, Longitude and for
# ATMS Observed Brightness Temperature.
#
# There are two ways to define a variable that has attached dimensions.
# First, we can explicitly create a variable and set its dimensions.
#
# Longitude has dimensions of nlocs. It has units of degrees_east, and has
# a valid_range of (-180,180).

longitude = g.vars.create('MetaData/Longitude', ioda.Types.float, [num_locs])
longitude.scales.set([dim_location])
longitude.atts.create('valid_range', ioda.Types.float, [2]).writeVector.float([-180, 180])
longitude.atts.create('units', ioda.Types.str).writeVector.str(['degrees_east'])
longitude.atts.create('long_name', ioda.Types.str).writeVector.str(['Longitude'])

# The above method is a bit clunky because you have to make sure that the new variable's
# dimensions match the size of each dimension scale.
# Here, we do the same variable creation, but instead use dimension scales to determine sizes.
latitude = g.vars.create('MetaData/Latitude', ioda.Types.float, scales=[dim_location])
# Latitude has units of degrees_north, and a valid_range of (-90,90).
latitude.atts.create('valid_range', ioda.Types.float, [2]).writeVector.float([-90,90])
latitude.atts.create('units', ioda.Types.str).writeVector.str(['degrees_north'])
latitude.atts.create('long_name', ioda.Types.str).writeVector.str(['Latitude'])

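# Illustration only (plain Python, not an ioda call): a minimal sketch of the check
# that a valid_range attribute implies. The helper name within_valid_range is
# hypothetical, and inclusive endpoints are an assumption; ioda's own handling of
# out-of-range data may differ.
def within_valid_range(values, valid_range):
    """Return one boolean per value: True when low <= value <= high."""
    low, high = valid_range
    return [low <= v <= high for v in values]

# With latitude's valid_range of (-90, 90), the value 95.2 is flagged as invalid.
latitude_ok = within_valid_range([45.0, -90.0, 95.2], (-90, 90))
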
# The ATMS Brightness Temperature depends on both location and instrument channel number.
tb = g.vars.create('ObsValue/Brightness_Temperature', ioda.Types.float, scales=[dim_location, dim_channel])
tb.atts.create('valid_range', ioda.Types.float, [2]).writeVector.float([100,500])
tb.atts.create('units', ioda.Types.str).writeVector.str(['K'])
tb.atts.create('long_name', ioda.Types.str).writeVector.str(['ATMS Observed (Uncorrected) Brightness Temperature'])


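# Illustration only (plain Python, not an ioda call): when a variable is created from
# dimension scales, its shape follows from the scale sizes. The helper name
# shape_from_scales and the dictionary of sizes are hypothetical; the sizes repeat
# the values used in this tutorial for 'nlocs' (3000) and 'ATMS Channel' (23).
def shape_from_scales(scale_names, scale_sizes):
    """Look up the size of each named scale, in axis order."""
    return [scale_sizes[name] for name in scale_names]

example_scale_sizes = {'nlocs': 3000, 'ATMS Channel': 23}
# Brightness temperature is location by channel.
tb_shape = shape_from_scales(['nlocs', 'ATMS Channel'], example_scale_sizes)
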
# Variable Parameter Packs
# When creating variables, you can also provide an optional
# VariableCreationParameters structure. This struct lets you specify
# the variable's fill value (a default value that is a placeholder for unwritten data).
# It also lets you specify whether you want to compress the data stored in the variable,
# and how you want to store the variable (contiguously or in chunks).
p1 = ioda.VariableCreationParameters()

# Variable storage: contiguous or chunked
# See https://support.hdfgroup.org/HDF5/doc/Advanced/Chunking/ and
# https://www.unidata.ucar.edu/blogs/developer/en/entry/chunking_data_why_it_matters
# for detailed explanations.
# To tell ioda that you want to chunk a variable:
p1.chunk = True
p1.chunks = [100]

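# Illustration only (plain Python, not an ioda call): how a chunk size divides a
# variable. A 1-D variable of 3000 elements stored in chunks of 100 elements is split
# into ceil(3000 / 100) chunks. The helper name count_chunks is hypothetical.
import math

def count_chunks(dims, chunks):
    """Number of chunks along each dimension (the last chunk may be partial)."""
    return [math.ceil(d / c) for d, c in zip(dims, chunks)]

chunk_counts = count_chunks([3000], [100])
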
# Fill values:
# The "fill value" for a dataset is the specification of the default value assigned
# to data elements that have not yet been written.
# When you first create a variable, it is empty. If you read in a part of a variable that
# you have not filled with data, then ioda has to return fake, filler data.
# A fill value is a "bit pattern" that tells ioda what to return.
p1.setFillValue.float(-999)

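# Illustration only (plain Python, not an ioda call): after reading a variable back,
# elements equal to the fill value are typically treated as missing. The helper name
# mask_fill_values is hypothetical; -999 matches the fill value used in this tutorial.
def mask_fill_values(values, fill_value=-999):
    """Replace every element equal to fill_value with None."""
    return [None if v == fill_value else v for v in values]

masked_example = mask_fill_values([273.1, -999, 280.4])
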
# Compression
# If you are using chunked storage, you can tell ioda that you want to compress
# the data using ZLIB or SZIP.
# - ZLIB / GZIP:
p1.compressWithGZIP()


# Let's create one final variable, "Solar Zenith Angle", and let's use
# our new variable creation parameters.
sza = g.vars.create(name='ObsValue/Solar Zenith Angle', dtype=ioda.Types.float, scales=[dim_location], params=p1)