UFO
ufo::DataExtractor< ExtractedValue > Class Template Reference

This class makes it possible to extract and interpolate data loaded from a file.

#include <DataExtractor.h>

Classes

struct  Coordinate
 Coordinate used for data extraction from the payload array.
 

Public Member Functions

 DataExtractor (const std::string &filepath, const std::string &group)
 Create an object that can be used to extract data loaded from a file.
 
void scheduleSort (const std::string &varName, const InterpMethod &method)
 Update the instruction on how to sort the data for the provided variable name.
 
void sort ()
 Finalise the sort, sorting each of the coordinates indexing the axes of the array to be interpolated, as well as that array itself.
 
void extract (float obVal)
 Perform extract, given an observation value for the coordinate associated with this extract iteration.
 
void extract (int obVal)
 
void extract (const std::string &obVal)
 
template<typename T , typename R >
void extract (T obValDim0, R obValDim1)
 Perform extract, given the observation values associated with the current extract iteration and the next.
 
ExtractedValue getResult ()
 Fetch the final interpolated value.
 
float getResult ()
 

Private Types

typedef boost::variant< std::vector< int >, std::vector< float >, std::vector< std::string > > CoordinateValues
 

Private Member Functions

template<typename T >
void extractImpl (const T &obVal)
 Common implementation of the overloaded public function extract().
 
template<typename T >
void maybeExtractByLinearInterpolation (const T &obVal)
 Perform extraction using piecewise linear interpolation, if it's compatible with the ExtractedValue type in use; otherwise throw an exception.
 
template<typename T , typename R >
void maybeExtractByBiLinearInterpolation (const T &obValDim0, const R &obValDim1)
 
ExtractedValue getUniqueMatch () const
 Fetch the result produced by previous calls to extract(), none of which may have used linear interpolation.
 
void resetExtract ()
 Reset the extraction range for this object.
 
void load (const std::string &filepath, const std::string &interpolatedArrayGroup)
 Load all data from the input file.
 
void maybeExtractByLinearInterpolation (const T &obVal)
 
void maybeExtractByBiLinearInterpolation (const T &obValDim0, const R &obValDim1)
 

Static Private Member Functions

static std::unique_ptr< DataExtractorBackend< ExtractedValue > > createBackendFor (const std::string &filepath)
 Create a backend able to read file filepath.
 

Private Attributes

std::array< ConstrainedRange, 3 > constrainedRanges_
 
std::unordered_map< std::string, CoordinateValues > coordsVals_
 
DataExtractorPayload< ExtractedValue > interpolatedArray_
 
float result_
 
bool resultSet_
 
std::vector< ufo::RecursiveSplitter > splitter_
 
std::unordered_map< std::string, int > coord2DimMapping_
 Maps coordinate names to dimensions (0 or 1) of the payload array.
 
std::vector< std::vector< std::string > > dim2CoordMapping_
 Maps dimensions of the payload array (0 or 1) to coordinate names.
 
std::vector< Coordinate > coordsToExtractBy_
 Coordinates to use in successive calls to extract().
 
std::vector< Coordinate >::const_iterator nextCoordToExtractBy_
 

Detailed Description

template<typename ExtractedValue>
class ufo::DataExtractor< ExtractedValue >

This class makes it possible to extract and interpolate data loaded from a file.

Template Parameters
ExtractedValue  Type of the values to extract. Must be float, int or std::string.

Currently the following file formats are supported: NetCDF and CSV. See DataExtractorNetCDFBackend and DataExtractorCSVBackend for more information about these formats.

In both cases, the files will contain a payload array (the array from which data will be extracted) and one or more coordinate arrays indexing the payload array. This class makes it possible to rapidly extract a value from the payload array corresponding to particular values of the coordinates, or to interpolate multiple values from this array. Coordinate matching can be exact or approximate (looking for the nearest match). It is also possible to perform a piecewise linear interpolation of the data along one coordinate axis or bilinear interpolation along two axes.

Here's how to use this class:

  • Call the constructor to load data from an input file.
  • Call scheduleSort() one or more times, each time passing to it the name of a coordinate array and the matching method to be used for this coordinate. This will determine the order in which coordinates will be matched.
  • Call sort(). This will prepare internal data structures required for a rapid search for matching coordinates.
  • To extract a value from the payload array for a particular data point, pass the values of successive coordinates of that point to calls to extract() (in the order matching the order of the preceding calls to scheduleSort()). Then call getResult() to retrieve the extracted value.
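
The call order described in the steps above can be sketched as follows. This is only an illustration: `StandInExtractor` is a hypothetical stub that records calls so that the example compiles without the UFO headers, the coordinate names `channel` and `latitude` are invented, and the matching method is passed as a plain string rather than the real `InterpMethod` type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for ufo::DataExtractor<float>. It only records the
// order of calls; it does not load, sort or extract any data.
struct StandInExtractor {
  std::vector<std::string> calls;
  bool sorted = false;

  void scheduleSort(const std::string &varName, const std::string &method) {
    assert(!sorted);  // scheduling belongs before sort()
    calls.push_back("scheduleSort:" + varName + ":" + method);
  }
  void sort() { sorted = true; calls.push_back("sort"); }
  void extract(int) { assert(sorted); calls.push_back("extract(int)"); }
  void extract(float) { assert(sorted); calls.push_back("extract(float)"); }
  float getResult() { calls.push_back("getResult"); return 0.0f; }
};

// One-off setup: one scheduleSort() call per coordinate, then sort().
template <typename Extractor>
void prepare(Extractor &extractor) {
  extractor.scheduleSort("channel", "exact");     // hypothetical coordinate
  extractor.scheduleSort("latitude", "nearest");  // hypothetical coordinate
  extractor.sort();
}

// Per data point: pass coordinate values in the order they were scheduled,
// then fetch the result.
template <typename Extractor>
float lookUp(Extractor &extractor, int channel, float latitude) {
  extractor.extract(channel);
  extractor.extract(latitude);
  return extractor.getResult();
}
```

With the real class, `prepare()` would run once after construction and `lookUp()` once per data point.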

Some particulars of the available extraction/interpolation algorithms:

  • Nearest neighbour 'interpolation' chooses the first nearest value to be found in the case of equidistant neighbours of different values. The algorithm will then return the one or more locations corresponding to this nearest value. Let's illustrate by extracting the nearest neighbours to 1.5:

    [1, 1, 2, 3, 4, 5]

    Both 1 and 2 are equidistant from 1.5, but we take the first equidistant neighbour found (1), then return all indices matching this neighbour (indices 0 and 1 in this case).
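
The tie-breaking rule in the example above can be reproduced with a short standalone sketch. The function name `nearestMatchIndices` is hypothetical, and a plain linear scan is used for clarity; it is not taken from the class implementation, which may search sorted data differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Return every index whose coordinate equals the value nearest to `target`.
// When two distinct values are equidistant, the first one found wins.
std::vector<std::size_t> nearestMatchIndices(const std::vector<float> &coord,
                                             float target) {
  assert(!coord.empty());
  float nearest = coord.front();
  for (float value : coord) {
    // Strict '<' keeps the earlier value when distances are equal.
    if (std::fabs(value - target) < std::fabs(nearest - target)) nearest = value;
  }
  std::vector<std::size_t> indices;
  for (std::size_t i = 0; i < coord.size(); ++i) {
    if (coord[i] == nearest) indices.push_back(i);
  }
  return indices;
}
```

For the array `[1, 1, 2, 3, 4, 5]` and a target of 1.5, this yields indices 0 and 1, matching the description.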

Definition at line 426 of file src/ufo/utils/dataextractor/DataExtractor.h.

Member Typedef Documentation

◆ CoordinateValues

template<typename ExtractedValue >
typedef boost::variant<std::vector<int>, std::vector<float>, std::vector<std::string> > ufo::DataExtractor< ExtractedValue >::CoordinateValues
private

Definition at line 547 of file src/ufo/utils/dataextractor/DataExtractor.h.

Constructor & Destructor Documentation

◆ DataExtractor()

template<typename ExtractedValue >
ufo::DataExtractor< ExtractedValue >::DataExtractor ( const std::string &  filepath,
const std::string &  group 
)
explicit

Create an object that can be used to extract data loaded from a file.

This object is capable of sorting the data from this file, extracting the relevant values for a given observation, and performing linear interpolation to derive the final value.

Parameters
[in] filepath  Path to the input file.
[in] group  Group containing the payload variable.

Definition at line 396 of file DataExtractor.cc.

Member Function Documentation

◆ createBackendFor()

template<typename ExtractedValue >
std::unique_ptr< DataExtractorBackend< ExtractedValue > > ufo::DataExtractor< ExtractedValue >::createBackendFor ( const std::string &  filepath)
static private

Create a backend able to read file filepath.

Definition at line 429 of file DataExtractor.cc.

◆ extract() [1/4]

template<typename ExtractedValue >
void ufo::DataExtractor< ExtractedValue >::extract ( const std::string &  obVal)

Definition at line 537 of file DataExtractor.cc.

◆ extract() [2/4]

template<typename ExtractedValue >
void ufo::DataExtractor< ExtractedValue >::extract ( float  obVal)

Perform extract, given an observation value for the coordinate associated with this extract iteration.

Calls the relevant extract method (linear, nearest or exact), corresponding to the coordinate associated with this extract iteration (along with the associated interpolation method). The extract then utilises the observation value ('obVal') to perform this extraction.

Parameters
[in] obVal  is the observation value used for the extract operation.

Definition at line 525 of file DataExtractor.cc.

◆ extract() [3/4]

template<typename ExtractedValue >
void ufo::DataExtractor< ExtractedValue >::extract ( int  obVal)

Definition at line 531 of file DataExtractor.cc.

◆ extract() [4/4]

template<typename ExtractedValue >
template<typename T , typename R >
void ufo::DataExtractor< ExtractedValue >::extract ( T  obValDim0,
R  obValDim1 
)
inline

Perform extract, given the observation values associated with the current extract iteration and the next.

Calls the relevant extract method (linear), corresponding to the coordinate associated with this extract iteration and the next (along with the associated interpolation method). This method actually functions as two iterations, passing the current iteration coordinate and the next iteration coordinate. These are passed to the underlying binary operation (for example bilinear interpolation).

Parameters
[in] obValDim0  is the observation value used for the extract operation corresponding to the first coordinate utilised by the underlying method.
[in] obValDim1  is the observation value used for the extract operation corresponding to the second coordinate utilised by the underlying method.

Definition at line 482 of file src/ufo/utils/dataextractor/DataExtractor.h.
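
As an illustration of the kind of binary operation this overload feeds, here is a minimal standalone sketch of bilinear interpolation on a row-major grid. The names `lowerBracket` and `interpolateBilinear` are hypothetical. The sketch assumes strictly increasing axes, throws when a value lies outside the axis range, and does no missing-value handling; none of this is taken from the class implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Index lo such that axis[lo] <= v <= axis[lo + 1].
std::size_t lowerBracket(const std::vector<float> &axis, float v) {
  if (axis.size() < 2 || v < axis.front() || v > axis.back())
    throw std::out_of_range("value outside coordinate range");
  // Search all but the last element so that v == axis.back() still brackets.
  const auto it = std::upper_bound(axis.begin(), axis.end() - 1, v);
  return static_cast<std::size_t>(it - axis.begin()) - 1;
}

// `payload` is row-major with shape dim0.size() x dim1.size().
float interpolateBilinear(const std::vector<float> &dim0,
                          const std::vector<float> &dim1,
                          const std::vector<float> &payload,
                          float obValDim0, float obValDim1) {
  assert(payload.size() == dim0.size() * dim1.size());
  const std::size_t i = lowerBracket(dim0, obValDim0);
  const std::size_t j = lowerBracket(dim1, obValDim1);
  const float t0 = (obValDim0 - dim0[i]) / (dim0[i + 1] - dim0[i]);
  const float t1 = (obValDim1 - dim1[j]) / (dim1[j + 1] - dim1[j]);
  const auto at = [&](std::size_t r, std::size_t c) {
    return payload[r * dim1.size() + c];
  };
  // Interpolate along dimension 1 on both bracketing rows, then along 0.
  const float low = at(i, j) + t1 * (at(i, j + 1) - at(i, j));
  const float high = at(i + 1, j) + t1 * (at(i + 1, j + 1) - at(i + 1, j));
  return low + t0 * (high - low);
}
```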

◆ extractImpl()

template<typename ExtractedValue >
template<typename T >
void ufo::DataExtractor< ExtractedValue >::extractImpl ( const T &  obVal)
private

Common implementation of the overloaded public function extract().

Definition at line 544 of file DataExtractor.cc.

◆ getResult() [1/2]

float ufo::DataExtractor< float >::getResult ( )

Definition at line 600 of file DataExtractor.cc.

◆ getResult() [2/2]

template<typename ExtractedValue >
ExtractedValue ufo::DataExtractor< ExtractedValue >::getResult

Fetch the final interpolated value.

This will only be successful if previous calls to extract() have produced a single value to return.

Definition at line 590 of file DataExtractor.cc.

◆ getUniqueMatch()

template<typename ExtractedValue >
ExtractedValue ufo::DataExtractor< ExtractedValue >::getUniqueMatch
private

Fetch the result produced by previous calls to extract(), none of which may have used linear interpolation.

An exception is thrown if these calls haven't produced a unique match of the extraction criteria.

Definition at line 615 of file DataExtractor.cc.

◆ load()

template<typename ExtractedValue >
void ufo::DataExtractor< ExtractedValue >::load ( const std::string &  filepath,
const std::string &  interpolatedArrayGroup 
)
private

Load all data from the input file.

Definition at line 410 of file DataExtractor.cc.

◆ maybeExtractByBiLinearInterpolation() [1/2]

template<typename ExtractedValue >
template<typename T , typename R >
void ufo::DataExtractor< ExtractedValue >::maybeExtractByBiLinearInterpolation ( const T &  obValDim0,
const R &  obValDim1 
)
inline private

Definition at line 511 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ maybeExtractByBiLinearInterpolation() [2/2]

void ufo::DataExtractor< float >::maybeExtractByBiLinearInterpolation ( const T &  obValDim0,
const R &  obValDim1 
)
private

Definition at line 584 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ maybeExtractByLinearInterpolation() [1/2]

void ufo::DataExtractor< float >::maybeExtractByLinearInterpolation ( const T &  obVal)
private

Definition at line 576 of file DataExtractor.cc.

◆ maybeExtractByLinearInterpolation() [2/2]

template<typename ExtractedValue >
template<typename T >
void ufo::DataExtractor< ExtractedValue >::maybeExtractByLinearInterpolation ( const T &  obVal)
private

Perform extraction using piecewise linear interpolation, if it's compatible with the ExtractedValue type in use; otherwise throw an exception.

Definition at line 566 of file DataExtractor.cc.
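
For readers unfamiliar with the technique, piecewise linear interpolation along one sorted coordinate can be sketched as below. The function name `interpolateLinear` is hypothetical, and throwing `std::out_of_range` for values outside the coordinate range is an assumption of this sketch rather than documented behaviour of the class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Interpolate `payload` at `obVal`, given a coordinate sorted in ascending
// order with one payload entry per coordinate entry.
float interpolateLinear(const std::vector<float> &coord,
                        const std::vector<float> &payload, float obVal) {
  assert(!coord.empty() && coord.size() == payload.size());
  if (obVal < coord.front() || obVal > coord.back())
    throw std::out_of_range("value outside coordinate range");
  // First coordinate entry that is not less than obVal.
  const auto upper = std::lower_bound(coord.begin(), coord.end(), obVal);
  const std::size_t hi = static_cast<std::size_t>(upper - coord.begin());
  if (coord[hi] == obVal) return payload[hi];  // exact hit, nothing to blend
  const std::size_t lo = hi - 1;
  const float t = (obVal - coord[lo]) / (coord[hi] - coord[lo]);
  return payload[lo] + t * (payload[hi] - payload[lo]);
}
```

Interpolation of this kind only makes sense for numeric payloads, which is consistent with the exception described above for incompatible ExtractedValue types.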

◆ resetExtract()

template<typename ExtractedValue >
void ufo::DataExtractor< ExtractedValue >::resetExtract
private

Reset the extraction range for this object.

Each time an exactMatch, nearestMatch, leastUpperBoundMatch or greatestLowerBoundMatch call is made for one or more variables, the extraction range is further constrained to match the updated match conditions. After the final 'extract' is made (i.e. an interpolated value is derived), the extraction range should be reset by calling this method.

Definition at line 631 of file DataExtractor.cc.
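
The idea of a progressively constrained range that is reset between data points can be sketched as follows. `Range`, `constrainToExactMatch` and `resetRange` are hypothetical names for this illustration, which handles exact matching on integer coordinates only and assumes the arrays have already been sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index range [begin, end) into the sorted arrays.
struct Range {
  std::size_t begin;
  std::size_t end;
};

// Narrow `range` to the entries of `coord` equal to `value`.
// Assumes `coord` is sorted in ascending order within `range`.
Range constrainToExactMatch(const std::vector<int> &coord, Range range,
                            int value) {
  assert(range.begin <= range.end && range.end <= coord.size());
  const auto first = coord.begin() + static_cast<std::ptrdiff_t>(range.begin);
  const auto last = coord.begin() + static_cast<std::ptrdiff_t>(range.end);
  const auto match = std::equal_range(first, last, value);
  return {static_cast<std::size_t>(match.first - coord.begin()),
          static_cast<std::size_t>(match.second - coord.begin())};
}

// Restore the full range, ready for the next data point.
Range resetRange(const std::vector<int> &coord) { return {0, coord.size()}; }
```

Each successive match operates only inside the range left by the previous one, which is why the range has to be restored before the next data point is processed.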

◆ scheduleSort()

template<typename ExtractedValue >
void ufo::DataExtractor< ExtractedValue >::scheduleSort ( const std::string &  varName,
const InterpMethod &  method 
)

Update the instruction on how to sort the data for the provided variable name.

This works iteratively by further splitting the RecursiveSplitter sub-groups according to the variable name provided. In this way, the data can be sorted so that extraction always results in one contiguous chunk. No data or coordinates are physically sorted at this stage. Float-type variables are treated specially: for these, this is used to sort each of the sub-groups.

Parameters
[in] varName  is the name of the coordinate axis to sort.
[in] method  is the interpolation/extraction method to use for this coordinate.

Definition at line 498 of file DataExtractor.cc.

◆ sort()

template<typename ExtractedValue >
void ufo::DataExtractor< ExtractedValue >::sort

Finalise the sort, sorting each of the coordinates indexing the axes of the array to be interpolated, as well as that array itself.

Utilising the instructions provided by the user calling the scheduleSort() member function, we now physically sort the array itself along with all coordinates which describe it.

Definition at line 443 of file DataExtractor.cc.
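
The two-stage approach (decide the ordering first, physically reorder afterwards) can be illustrated with a standalone sketch. It builds an index permutation with `std::stable_sort` instead of using RecursiveSplitter, so it shows the general idea only; `sortingPermutation` and `applyPermutation` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <tuple>
#include <vector>

// Stage 1: work out the ordering from the scheduled coordinates
// (first key, then second key) without moving any data.
std::vector<std::size_t> sortingPermutation(
    const std::vector<int> &firstKey, const std::vector<std::string> &secondKey) {
  assert(firstKey.size() == secondKey.size());
  std::vector<std::size_t> order(firstKey.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return std::tie(firstKey[a], secondKey[a]) <
                            std::tie(firstKey[b], secondKey[b]);
                   });
  return order;
}

// Stage 2: physically reorder one array (a coordinate or the payload).
template <typename T>
std::vector<T> applyPermutation(const std::vector<T> &values,
                                const std::vector<std::size_t> &order) {
  assert(values.size() == order.size());
  std::vector<T> sorted;
  sorted.reserve(values.size());
  for (std::size_t index : order) sorted.push_back(values[index]);
  return sorted;
}
```

Applying the same permutation to every coordinate and to the payload keeps the arrays aligned, so entries sharing a value of the first key end up in one contiguous run.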

Member Data Documentation

◆ constrainedRanges_

template<typename ExtractedValue >
std::array<ConstrainedRange, 3> ufo::DataExtractor< ExtractedValue >::constrainedRanges_
private

Definition at line 542 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ coord2DimMapping_

template<typename ExtractedValue >
std::unordered_map<std::string, int> ufo::DataExtractor< ExtractedValue >::coord2DimMapping_
private

Maps coordinate names to dimensions (0 or 1) of the payload array.

Definition at line 559 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ coordsToExtractBy_

template<typename ExtractedValue >
std::vector<Coordinate> ufo::DataExtractor< ExtractedValue >::coordsToExtractBy_
private

Coordinates to use in successive calls to extract().

Definition at line 576 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ coordsVals_

template<typename ExtractedValue >
std::unordered_map<std::string, CoordinateValues> ufo::DataExtractor< ExtractedValue >::coordsVals_
private

Definition at line 548 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ dim2CoordMapping_

template<typename ExtractedValue >
std::vector<std::vector<std::string> > ufo::DataExtractor< ExtractedValue >::dim2CoordMapping_
private

Maps dimensions of the payload array (0 or 1) to coordinate names.

Definition at line 561 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ interpolatedArray_

template<typename ExtractedValue >
DataExtractorPayload<ExtractedValue> ufo::DataExtractor< ExtractedValue >::interpolatedArray_
private

Definition at line 550 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ nextCoordToExtractBy_

template<typename ExtractedValue >
std::vector<Coordinate>::const_iterator ufo::DataExtractor< ExtractedValue >::nextCoordToExtractBy_
private

Definition at line 577 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ result_

template<typename ExtractedValue >
float ufo::DataExtractor< ExtractedValue >::result_
private

Definition at line 552 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ resultSet_

template<typename ExtractedValue >
bool ufo::DataExtractor< ExtractedValue >::resultSet_
private

Definition at line 554 of file src/ufo/utils/dataextractor/DataExtractor.h.

◆ splitter_

template<typename ExtractedValue >
std::vector<ufo::RecursiveSplitter> ufo::DataExtractor< ExtractedValue >::splitter_
private

Definition at line 556 of file src/ufo/utils/dataextractor/DataExtractor.h.


The documentation for this class was generated from the following files: