SAHARA

We have developed a software tool, the SOM-Assisted Hazard Area Risk Analysis (SAHARA), to reduce large climate datasets to more manageable, yet statistically similar, sizes, which are then used to produce ensembles of potential hazard outcomes. The Self-Organizing Map (SOM) is a machine-learning / data-clustering algorithm that is well suited to data with strong topological properties. By employing the SOM algorithm to analyze topological patterns of climatological fields over a regional domain for a 30-year span, we can find a close statistical equivalent with far fewer, non-contiguous input days. When using SOMs to cluster monthly climate data in this way, we find that sampling only 150 days reduces computational time by more than a factor of 6 compared to using the entire climate dataset. (See Figure 1)
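The day-sampling idea above can be sketched in a few lines. The following is a minimal, illustrative 1-D SOM over synthetic "daily fields" that picks the days closest to each trained node as a small representative subset; the function names and all parameters are our own simplifications, not SAHARA's actual implementation.

```python
import math
import random

def train_som(days, n_nodes=4, epochs=50, lr0=0.5):
    """Train a tiny 1-D SOM over flattened daily fields (lists of floats)."""
    random.seed(0)
    dim = len(days[0])
    nodes = [[random.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                 # decaying learning rate
        radius = max(1.0, (n_nodes / 2) * (1 - epoch / epochs))
        for day in days:
            # best-matching unit: node closest to this day in Euclidean distance
            bmu = min(range(n_nodes),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], day)))
            for i in range(n_nodes):
                # Gaussian neighborhood: nodes near the BMU move most
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                nodes[i] = [w + lr * h * (d - w) for w, d in zip(nodes[i], day)]
    return nodes

def sample_representatives(days, nodes, per_node=2):
    """Keep only the days closest to each node: a small, statistically
    similar stand-in for the full record."""
    picks = []
    for node in nodes:
        ranked = sorted(range(len(days)),
                        key=lambda j: sum((a - b) ** 2 for a, b in zip(node, days[j])))
        picks.extend(ranked[:per_node])
    return sorted(set(picks))

# synthetic "daily climate fields": two regimes the SOM should separate
random.seed(1)
days = [[random.gauss(m, 0.1) for _ in range(6)]
        for m in [0.0] * 15 + [1.0] * 15]
nodes = train_som(days)
subset = sample_representatives(days, nodes)
print(len(subset), "of", len(days), "days retained")
```

Downstream hazard simulations would then run only over `subset` rather than the full record, which is where the computational savings come from.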

Figure 1
Figure 2

The SAHARA software can scale from a laptop to workstations to many-core, many-node clusters by using a modern microservice architecture to distribute the Climate Database (currently CFSR), the SOM Engine, atmospheric model ensembles (such as the SCIPUFF transport and dispersion model), and pre- and post-processing across available computing resources, either locally or remotely. (See Figure 2)
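As a local stand-in for that distributed fan-out, the pattern can be sketched with a thread pool: each sampled day becomes an independent task that, in the real system, would be a call to a remote ensemble-member microservice. The function name, fields, and values here are hypothetical placeholders, not SAHARA's actual interfaces.

```python
from concurrent.futures import ThreadPoolExecutor

def run_ensemble_member(day_index):
    # stand-in for launching one dispersion-model run for a sampled day;
    # in the real architecture this would be a remote service call
    return {"day": day_index, "hazard_area_km2": 10.0 + day_index}

# indices of the representative days chosen by the SOM step (illustrative)
sampled_days = [3, 17, 42, 88]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_ensemble_member, sampled_days))
print(results)
```

Because the tasks are independent, the same pattern scales from a laptop's thread pool to a cluster scheduler without changing the per-member work.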

We are currently adding the Weather Research and Forecasting (WRF) model as an additional workflow component to provide on-demand dynamical downscaling, increasing the fidelity of simulations with high spatial and temporal resolution requirements, such as urban-scale events.

Additional planned features and capabilities include:

  • Output compatibility with GIS (Geographic Information System) tools for interactive analysis and post-processing
  • Integration with large HPC systems, such as those at the DOD HPCMP centers
  • Seasonal-based forecasts of hazard areas, with user-configurable time periods

Funding

DTRA

ROMIO

The Remote Oceanic Meteorology Information Operational (ROMIO) Demonstration is a project sponsored by the FAA’s Weather Technology in the Cockpit (WTIC) program. It focuses on analyzing oceanic aviation inefficiencies in current and future NextGen operations caused by gaps either in the available meteorological information or in the technology used in the cockpit. By uplinking convective weather products into the cockpits of domestic airlines during an operational demonstration, this effort helps identify and analyze those operational gaps.

In 2018, the WTIC ROMIO team began the operational demonstration with Delta Air Lines, United Airlines, and American Airlines. Following the ROMIO Operational Plan, all aspects of the demonstration were carefully planned, including the availability and ingest of meteorological datasets, the creation of weather products, and their dissemination to and display in the aircraft.



WRFDA

The WRF Variational Data Assimilation (WRFDA) system is in the public domain and is freely available for community use. It is designed to be a flexible, state-of-the-art atmospheric data assimilation system that is portable and efficient on available parallel computing platforms. WRFDA is suitable for use in a broad range of applications, across scales ranging from kilometers for regional and mesoscale modeling to thousands of kilometers for global scale modeling.
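To illustrate the variational idea behind a system like WRFDA (without in any way reproducing its implementation), consider the simplest possible case: a scalar state with one background value and one direct observation. Minimizing the standard cost function J(x) = (x − x_b)²/(2σ_b²) + (y − x)²/(2σ_o²) gives the familiar variance-weighted analysis. This is a textbook sketch, not WRFDA code.

```python
def scalar_var_analysis(xb, y, var_b, var_o):
    """Minimize J(x) = (x - xb)^2 / (2*var_b) + (y - x)^2 / (2*var_o).
    Setting dJ/dx = 0 yields the variance-weighted analysis below."""
    gain = var_b / (var_b + var_o)  # how much to trust the observation
    return xb + gain * (y - xb)

# background 280 K, observation 284 K, equal error variances:
# the analysis lands exactly midway between them
xa = scalar_var_analysis(280.0, 284.0, 1.0, 1.0)
print(xa)  # 282.0
```

Real variational systems solve the same minimization over millions of model variables with full background-error covariances and nonlinear observation operators, but the balancing of background and observation uncertainty is the same.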

Ensemble Generalized Analog Regression Downscaling (En-GARD)

This code implements a hybrid analog/regression multivariate downscaling procedure. The program reads a namelist file for its configuration. The downscaling process is performed on a grid-cell by grid-cell basis and permits multiple approaches to downscaling.

The standard hybrid analog-regression approach uses the input predictor variables to select a group of analog days (e.g., 300) from the training period for each day to be predicted. These analog days are then used to compute a multivariable regression between the training data (e.g., wind, humidity, and stability) and the variable to be predicted (e.g., precipitation). The regression coefficients are applied to the predictor variables to compute the expected downscaled value, and to the training data to compute the error in the regression. Optionally, a logistic regression can be used to compute, for example, the probability of precipitation occurrence on a given day, or the probability of exceeding any other threshold; the logistic regression coefficients are then applied to the predictors and output, or the analog exceedance probabilities can be output directly.

Alternatively, the code can compute the regressions over the entire supplied time series (a pure regression approach), or the analogs themselves can be used as the result (a pure analog approach). The pure analog approach can compute the mean of the selected analogs, randomly sample the analogs, or compute a weighted mean based on the distance from the current predictor.
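A minimal sketch of the hybrid analog-regression step, assuming a single predictor and synthetic data; the function name, k value, and one-predictor simplification are ours, not En-GARD's actual code.

```python
def analog_regression(train_x, train_y, x_new, k=5):
    """Hybrid analog-regression for one grid cell and one predictor:
    pick the k training days whose predictor is closest to x_new,
    fit y = a + b*x over those analogs by least squares, apply it to x_new."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x_new))
    analogs = order[:k]
    xs = [train_x[i] for i in analogs]
    ys = [train_y[i] for i in analogs]
    mx = sum(xs) / k
    my = sum(ys) / k
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return a + b * x_new

# synthetic training record with a known linear relationship y = 2x + 1,
# so the analog regression should recover it exactly
train_x = [float(i) for i in range(20)]
train_y = [2.0 * x + 1.0 for x in train_x]
pred = analog_regression(train_x, train_y, 7.3)
print(pred)  # ≈ 15.6
```

In the full procedure this fit is multivariate, repeated for every grid cell and target day, and the regression residuals over the analogs provide the error estimate.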

The code requires both training and predictor data for the same variables as well as a variable to be predicted. The training and prediction data can include as many variables as desired (e.g. wind, humidity, precipitation, CAPE). All data must have geographic and time coordinate variables associated with them.
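The analog exceedance-probability option described above can be sketched similarly: the probability of crossing a threshold is estimated as the fraction of analog days that exceed it. Again, synthetic data and a hypothetical function name, not En-GARD's implementation.

```python
def analog_exceedance_prob(train_x, train_y, x_new, threshold, k=10):
    """Estimate P(predictand > threshold) as the fraction of the k nearest
    analog days (by predictor distance) whose predictand exceeds it."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x_new))
    hits = sum(1 for i in order[:k] if train_y[i] > threshold)
    return hits / k

# synthetic record: days with high predictor values are wet, the rest dry
train_x = [float(i) for i in range(30)]
train_y = [0.0 if x < 15 else 5.0 for x in train_x]
prob = analog_exceedance_prob(train_x, train_y, 25.0, 1.0)
print(prob)  # 1.0: all analogs of a high-predictor day are wet
```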

While this was developed for downscaling climate data, it is general purpose and could be applied to a wide variety of problems in which both analogs and regression make sense.