1 Background

For most of the postprocessing, there is a need to create spatial maps, aggregate over spatial units and also produce georeference raster and shapefiles. Many of the existing functions in available spatial libraries such as SP, RGDAL, RGEOS and Raster has been wrapped in rwrfhydro to serve our purpose. Here we explain these spatial functions, their applications as well as some examples.

1.1 List of the available functions

  • GetProj
  • GetGeogridSpatialInfo
  • ExportGeogrid
  • GetGeogridIndex
  • GetTimeZone
  • GetRfc
  • GetPoly
  • PolyToRaster

1.2 General Info

The path to the files have been hard coded here, please change them as you see fit. This vignette will wrok with the wrf-hydro-training material provided at Oct 2018.

Load the rwrfhydro package.

library(rwrfhydro)
## To check rwrfhydro updates run: CheckForUpdates()

Geogrid file is the main file containing all the base geographic information on the model domain such as the geographic coordinate system, latitude, longitude of each pixel and so on. We use this file frequently. Set a path to geogrid file.

geoFile <- '~/wrf-hydro-training/example_case/Gridded/DOMAIN/geo_em.d01.nc'

2 GetProj

To be able to use any of the spatial tools in R, projection information of the model domain is required. Some of the model input and output files are based on geogrid file domain. GetProj pull projection information of WRF-Hydro modeling domain from geogrid file. It takes only the path to the geogrid file and return the projection information as a character.

proj4 <- GetProj(geoFile)
proj4
## [1] "+proj=lcc +lat_1=30 +lat_2=60 +lat_0=40.0000076293945 +lon_0=-97 +x_0=0 +y_0=0 +a=6370000 +b=6370000 +units=m +no_defs"

3 GetGeogridSpatialInfo

It pulls necessary geospatial information about WRF-Hydro modeling domain from geogrid file used for regridding and deprojection. It only requires the address to the geogrid file and return a data frame containing geospatial information such as the projection information, number of rows and columns and size of the grids.

geoInfo <- GetGeogridSpatialInfo(geoFile)
geoInfo
##     DX   DY GRID_TYPE     LAT1      LON1  REF_LAT REF_LON SPLAT1 SPLAT2
## 1 1000 1000         C 41.42281 -73.85333 40.00001     -97     30     60
##   POLE_LAT POLE_LON MAP_PROJ NCOL NROW
## 1       90        0        1   15   16

4 ExportGeogrid

If you need to create a georeferenced TIFF file from any variable in a netcdf file having the same domain and pixel size as the geogrid file, then you need to use ExportGeogrid function. It takes the Geogrid file having lat/lon information and converts the specified variable into a georeferenced TIFF file for use in standard GIS tools. Let’s export one variable from the geogrid file. You can get a list of all available variables in the geoFile using ncdump function in rwrfhydro.

ncdump(geoFile)

We create a georeferenced TIFF file from HGT_M field. You only need to provide the address to geogrid file geoFile, the name of the variable HGT_M and the name of the output file geogrid_hgt.tif.

ExportGeogrid(geoFile,"HGT_M", "geogrid_hgt.tif")

You can use the provided file in any standard GIS platform. Let’s read it into memory as a raster and dispaly it.

# read the saved tiff file
r <- raster::raster("geogrid_hgt.tif")

# plot the imported raster from tif file
raster::plot(r, main = "HGT_M")

# check the raster information. Notice that geographic coordinate information has been added.
r
## class       : RasterLayer 
## dimensions  : 16, 15, 240  (nrow, ncol, ncell)
## resolution  : 1000, 1000  (x, y)
## extent      : 1842000, 1857000, 420998.6, 436998.6  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=30 +lat_2=60 +lat_0=40.0000076293945 +lon_0=-97 +x_0=0 +y_0=0 +a=6370000 +b=6370000 +units=m +no_defs 
## data source : C:\Users\arezoo\Documents\GitHub\wrf_hydro_training\rwrfhydro\geogrid_hgt.tif 
## names       : geogrid_hgt

Many of the input and output files such as LDASOUT output file does not contain lat/lon coordinates but matches the spatial coordinate system of the geogrid input file. In that case, you could feed the geogrid file (geoFile) from which the lat/lon information will be taken.

file = "~/wrf-hydro-training/example_case/FORCING/2011082800.LDASIN_DOMAIN1"
# ncdump(file) # check if the SOIL_T exist in the file

# we will read the third layer of soil temperature
ExportGeogrid(file,
              inVar="RAINRATE",
              outFile="RAINRATE.tif",
              inCoordFile=geoFile)

# read the  created tiff file
r <- raster::raster("RAINRATE.tif")

# plot the imported raster from tiff file
raster::plot(r*3600, main = "RAINRATE (mm/hr)")

# check the raster information and notice geographic coordinate information has been added
r
## class       : RasterLayer 
## dimensions  : 16, 15, 240  (nrow, ncol, ncell)
## resolution  : 1000, 1000  (x, y)
## extent      : 1842000, 1857000, 420998.6, 436998.6  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=30 +lat_2=60 +lat_0=40.0000076293945 +lon_0=-97 +x_0=0 +y_0=0 +a=6370000 +b=6370000 +units=m +no_defs 
## data source : C:\Users\arezoo\Documents\GitHub\wrf_hydro_training\rwrfhydro\RAINRATE.tif 
## names       : RAINRATE

5 GetGeogridIndex

To be able to use a bunch of tools such as GetMultiNcdf, one needs to have the indices (x,y) or the location of each cell within the domain. GetGeogridIndex get geogrid cell indices from lat/lon (or other) coordinates. GetGeogridIndex reads in a set of lat/lon (or other) coordinates and generates a corresponding set of geogrid index pairs. You can assign a projection to the points using proj4 argument which will be used to transform the point to the geoFile coordinate system. Chkeck the vignette on precipitation for usage. If the point falls outside the domain, it will return NA value.

sg <- data.frame(lon = seq(-73.6, -73.8, length.out = 10),
                 lat = seq(41.4, 41.5, length.out = 10))
GetGeogridIndex(sg, geoFile)
##    we sn
## 1  NA NA
## 2  NA NA
## 3  NA NA
## 4  15  6
## 5  13  7
## 6  11  8
## 7   9  8
## 8   7  9
## 9   5 10
## 10  3 10

6 GetTimeZone

Many of the point observation are reported in local time and needs to be converted to UTC time to be comparable with WRF-Hydro input and outputs. GetTimeZone return the time zone for any point having longitude and latitude. It takes a dataframe containing at least two fields of latitude and longitude, overlays the points with a timezone shapefile (can be downloded from http://efele.net/maps/tz/world/). The shapefile is provided in rwrfhydro data and it is called timeZone.

# timeZone has been provided by rwrfhydro as a SpatialPolygonDataFrame
class(timeZone)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
# Shows the available timezone (TZID column in timeZone@data)
head(timeZone@data)
##                  TZID
## 0    America/Dominica
## 1  America/St_Vincent
## 2 Australia/Lord_Howe
## 3        Asia/Kashgar
## 4      Pacific/Wallis
## 5      Asia/Jerusalem

Function has three arguments.

  • points: A dataframe of the points. The dataframe should contain at least two fields called latitude and longitude.
  • proj4: Projection of the points to be used in transforming the points projection to timeZone projection. Default is +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 which is the same as the timezone projection.
  • parallel: If the number of points are high you can parallelize the process.

It will return the points dataframe with an added column called timeZone. It will return NA in case the point is not in any polygon. Now let’s generate some random points and find their time zone information.

# Provide a dataframe of 10 points having longitude and latitude as column name.
sg <- data.frame(longitude = seq(-110, -80, length.out = 10),
latitude = seq(30, 50, length.out = 10))

# Find the time zone for each point
sg <- GetTimeZone(sg)
sg
##              timeZone  longitude latitude
## 1  America/Hermosillo -110.00000 30.00000
## 2      America/Denver -106.66667 32.22222
## 3      America/Denver -103.33333 34.44444
## 4     America/Chicago -100.00000 36.66667
## 5     America/Chicago  -96.66667 38.88889
## 6     America/Chicago  -93.33333 41.11111
## 7     America/Chicago  -90.00000 43.33333
## 8                <NA>  -86.66667 45.55556
## 9     America/Toronto  -83.33333 47.77778
## 10    America/Toronto  -80.00000 50.00000

7 GetRfc

US has 13 River Forecast Centers (RFC) and many of the statistics are desired to be grouped into River Forecast Center level so it would be easier to compare with the performance of the RFC models in the past. The RFC boundary shapefile is provided in rwrfhydro data and is called rfc.

class(rfc)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
# Shows the available rfc, name of the column is BASIN_ID
head(rfc@data)
##   SITE_ID STATE           RFC_NAME                  RFC_CITY BASIN_ID
## 0     ACR    AK             Alaska                 Anchorage    AKRFC
## 1     KRF    MO     Missouri Basin Kansas City/Pleasant Hill    MBRFC
## 2     STR    UT     Colorado Basin            Salt Lake City    CBRFC
## 3     TUA    OK Arkansas-Red Basin                     Tulsa    ABRFC
## 4     RSA    CA  California-Nevada                Sacramento    CNRFC
## 5     ORN    LA  Lower Mississippi   New Orleans/Baton Rouge    LMRFC

GetRfc return the RFC name for any point having longitude and latitude. It takes a dataframe containing at least two fields of latitude and longitude, overlays the points with a rfc SpatialPolygonDataFrame and return the rfc’s BASIN_ID. Function has three arguments.

  • points: A dataframe of the points. The dataframe should contain at least two fields called “latitude” and “longitude”.
  • proj4: Projection of the points to be used in transforming the points projection to rfc projection. Default is +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0.
  • parallel: If the number of points are high you can parallelize the process.

It will return the points dataframe with an added column called rfc. It will return NA in case the point is not within RFCs.

# Provide a dataframe of 10 points having longitude and latitude as column name.
sg <- data.frame(longitude = seq(-110, -80, length.out = 10),
                 latitude = seq(30, 50, length.out = 10))

# Find the rfc for each point
sg <- GetRfc(sg)
sg
##      rfc  longitude latitude
## 1   <NA> -110.00000 30.00000
## 2  WGRFC -106.66667 32.22222
## 3  WGRFC -103.33333 34.44444
## 4  ABRFC -100.00000 36.66667
## 5  MBRFC  -96.66667 38.88889
## 6  NCRFC  -93.33333 41.11111
## 7  NCRFC  -90.00000 43.33333
## 8   <NA>  -86.66667 45.55556
## 9   <NA>  -83.33333 47.77778
## 10  <NA>  -80.00000 50.00000

8 GetPoly

Getpoly is similar to GetRfc, it is a wrapper for function sp::over. It takes a dataframe containing at least two fields of latitude and longitude, overlays the points with a SpatialPolygonDataFrame and return the requested attribute from the polygon. One could use the available SpatialPolygon* loaded into memory or provide the address to the location of a polygon shapefile and the name of the shapefile and it will read the polygon using rgdal::readOGR function.

Let’s get the RFC information from GetPoly instead of GetRfc. Here we provide the name of the SpatialPolygon* and using argument join request one of the attributes of the polygon. For example, here we have requested the BASIN_ID, RFC_NAME and RFC_CITY.

# Provide a dataframe of 10 points having longitude and latitude
sg <- data.frame(longitude = seq(-110, -80, length.out = 10),
latitude = seq(30, 50, length.out = 10))

# Find the ID of RFC for each point
sg <- GetPoly(points = sg, polygon = rfc, join = "BASIN_ID")

# Find the full name of RFC for each point
sg <- GetPoly(points = sg, polygon = rfc, join = "RFC_NAME")

# Find the location/city of RFC for each point
sg <- GetPoly(points = sg, polygon = rfc, join = "RFC_CITY")
sg
##    BASIN_ID           RFC_NAME                  RFC_CITY  longitude
## 1      <NA>               <NA>                      <NA> -110.00000
## 2     WGRFC          West Gulf         Dallas/Fort Worth -106.66667
## 3     WGRFC          West Gulf         Dallas/Fort Worth -103.33333
## 4     ABRFC Arkansas-Red Basin                     Tulsa -100.00000
## 5     MBRFC     Missouri Basin Kansas City/Pleasant Hill  -96.66667
## 6     NCRFC      North Central               Minneapolis  -93.33333
## 7     NCRFC      North Central               Minneapolis  -90.00000
## 8      <NA>               <NA>                      <NA>  -86.66667
## 9      <NA>               <NA>                      <NA>  -83.33333
## 10     <NA>               <NA>                      <NA>  -80.00000
##    latitude
## 1  30.00000
## 2  32.22222
## 3  34.44444
## 4  36.66667
## 5  38.88889
## 6  41.11111
## 7  43.33333
## 8  45.55556
## 9  47.77778
## 10 50.00000

9 PolyToRaster

If one wants to create a mask in the model domain (geogrid file), then needs to use PolyToRaster. It first picks up the required geographic information (like proj4) from the geogrid file (geoFile) and then use raster::rasterize function to grab the mask or attribute values from the SpatialPolygonDataFrame. This function is basically wrapping the raster::rasterize function to serve our purpose. Below is a few different ways one could use this function.

Let’s get the RFC’s ID for each pixel within the Fourmile Creek domain. This is equivalent to rasterizing the rfc SpatialPolygonDataFrame based on the BASIN_ID.

r <- PolyToRaster(geoFile = geoFile,
                  useRfc = TRUE,
                  field ="BASIN_ID")
## Loading required namespace: rgeos

To know what are the corresponding values to the integer values used in rasterized output, use the following command.

r@data@attributes
## [[1]]
##   ID BASIN_ID
## 1  9    NERFC

As the result shows the rectangle domain is in NERFC.