1 Background

For most of the postprocessing, there is a need to create spatial maps, aggregate over spatial units and also produce georeference raster and shapefiles. Many of the existing functions in available spatial libraries such as SP, RGDAL, RGEOS and Raster has been wrapped in rwrfhydro to serve our purpose. Here we explain these spatial functions, their applications as well as some examples.

1.1 List of the available functions

  • GetProj
  • GetGeogridSpatialInfo
  • ExportGeogrid
  • GetGeogridIndex
  • GetTimeZone
  • GetRfc
  • GetPoly
  • PolygonToRaster

1.2 General Info

The path to the files have been hard coded here, please change them as you see fit. This vignette will wrok with the wrf-hydro-training material provided at June 2018.

Load the rwrfhydro package.

> library(rwrfhydro)
To check rwrfhydro updates run: CheckForUpdates()

Geogrid file is the main file containing all the base geographic information on the model domain such as the geographic coordinate system, latitude, longitude of each pixel and so on. We use this file frequently. Set a path to geogrid file.

> geoFile <- '~/wrf-hydro-training/example_case/Gridded/DOMAIN/geo_em.d01.nc'

2 GetProj

To be able to use any of the spatial tools in R, projection information of the model domain is required. Some of the model input and output files are based on geogrid file domain. GetProj pull projection information of WRF-Hydro modeling domain from geogrid file. It takes only the path to the geogrid file and return the projection information as a character.

> proj4 <- GetProj(geoFile)
> proj4
[1] "+proj=lcc +lat_1=30 +lat_2=60 +lat_0=40.0000076293945 +lon_0=-97 +x_0=0 +y_0=0 +a=6370000 +b=6370000 +units=m +no_defs"

3 GetGeogridSpatialInfo

It pulls necessary geospatial information about WRF-Hydro modeling domain from geogrid file used for regridding and deprojection. It only requires the address to the geogrid file and return a data frame containing geospatial information such as the projection information, number of rows and columns and size of the grids.

> geoInfo <- GetGeogridSpatialInfo(geoFile)
> geoInfo
    DX   DY GRID_TYPE     LAT1      LON1  REF_LAT REF_LON SPLAT1 SPLAT2
1 1000 1000         C 41.42281 -73.85333 40.00001     -97     30     60
  POLE_LAT POLE_LON MAP_PROJ NCOL NROW
1       90        0        1   15   16

4 ExportGeogrid

If you need to create a georeferenced TIFF file from any variable in a netcdf file having the same domain and pixel size as the geogrid file, then you need to use ExportGeogrid function. It takes the Geogrid file having lat/lon information and converts the specified variable into a georeferenced TIFF file for use in standard GIS tools. Let’s export one variable from the geogrid file. You can get a list of all available variables in the geoFile using ncdump function in rwrfhydro.

> ncdump(geoFile)

We create a georeferenced TIFF file from HGT_M field. You only need to provide the address to geogrid file (geoFile), the name of the variable (HGT_M) and the name of the output file (geogrid_hgt.tif).

> ExportGeogrid(geoFile,"HGT_M", "geogrid_hgt.tif")

You can use the provided file in any standard GIS platform. Let’s read it into memory as a raster and dispaly it.

> # read the saved tiff file
> r <- raster::raster("geogrid_hgt.tif")
> 
> # plot the imported raster from tif file
> raster::plot(r, main = "HGT_M")
> 
> # check the raster information. Notice that geographic coordinate information has been added.
> r
class       : RasterLayer 
dimensions  : 16, 15, 240  (nrow, ncol, ncell)
resolution  : 1000, 1000  (x, y)
extent      : 1842000, 1857000, 420998.6, 436998.6  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +lat_1=30 +lat_2=60 +lat_0=40.0000076293945 +lon_0=-97 +x_0=0 +y_0=0 +a=6370000 +b=6370000 +units=m +no_defs 
data source : C:\Users\arezoo\Downloads\geogrid_hgt.tif 
names       : geogrid_hgt 

Many of the input and output files such as LDASOUT output file does not contain lat/lon coordinates but matches the spatial coordinate system of the geogrid input file. In that case, you could feed the geogrid file (geoFile) from which the lat/lon information will be taken.

> file = "~/wrf-hydro-training/example_case/FORCING/2011082800.LDASIN_DOMAIN1"
> # ncdump(file) # check if the SOIL_T exist in the file
> 
> # we will read the third layer of soil temperature
> ExportGeogrid(file,
+               inVar="RAINRATE",
+               outFile="RAINRATE.tif",
+               inCoordFile=geoFile)
> 
> # read the  created tiff file
> r <- raster::raster("RAINRATE.tif")
> 
> # plot the imported raster from tiff file
> raster::plot(r*3600, main = "RAINRATE (mm/hr)")

> # check the raster information and notice geographic coordinate information has been added
> r
class       : RasterLayer 
dimensions  : 16, 15, 240  (nrow, ncol, ncell)
resolution  : 1000, 1000  (x, y)
extent      : 1842000, 1857000, 420998.6, 436998.6  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +lat_1=30 +lat_2=60 +lat_0=40.0000076293945 +lon_0=-97 +x_0=0 +y_0=0 +a=6370000 +b=6370000 +units=m +no_defs 
data source : C:\Users\arezoo\Downloads\RAINRATE.tif 
names       : RAINRATE 

5 GetGeogridIndex

To be able to use a bunch of tools such as GetMultiNcdf, one needs to have the indices (x,y) or the location of each cell within the domain. GetGeogridIndex get geogrid cell indices from lat/lon (or other) coordinates. GetGeogridIndex reads in a set of lat/lon (or other) coordinates and generates a corresponding set of geogrid index pairs. You can assign a projection to the points using proj4 argument which will be used to transform the point to the geoFile coordinate system. Chkeck the vignette on precipitation for usage. If the point falls outside the domain, it will return NA value.

> sg <- data.frame(lon = seq(-73.6, -73.8, length.out = 10),
+                  lat = seq(41.4, 41.5, length.out = 10))
> GetGeogridIndex(sg, geoFile)
   we sn
1  NA NA
2  NA NA
3  NA NA
4  15  6
5  13  7
6  11  8
7   9  8
8   7  9
9   5 10
10  3 10

6 GetTimeZone

Many of the point observation are reported in local time and needs to be converted to UTC time to be comparable with WRF-Hydro input and outputs. GetTimeZone return the time zone for any point having longitude and latitude. It takes a dataframe containing at least two fields of latitude and longitude, overlays the points with a timezone shapefile (can be downloded from http://efele.net/maps/tz/world/). The shapefile is provided in rwrfhydro data and it is called timeZone.

> # timeZone has been provided by rwrfhydro as a SpatialPolygonDataFrame
> class(timeZone)
[1] "SpatialPolygonsDataFrame"
attr(,"package")
[1] "sp"
> # Shows the available timezone (TZID column in timeZone@data)
> head(timeZone@data)
                 TZID
0    America/Dominica
1  America/St_Vincent
2 Australia/Lord_Howe
3        Asia/Kashgar
4      Pacific/Wallis
5      Asia/Jerusalem

Function has three arguments.

  • points: A dataframe of the points. The dataframe should contain at least two fields called latitude and longitude.
  • proj4: Projection of the points to be used in transforming the points projection to timeZone projection. Default is +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 which is the same as the timezone projection.
  • parallel: If the number of points are high you can parallelize the process.

It will return the points dataframe with an added column called timeZone. It will return NA in case the point is not in any polygon. Now let’s generate some random points and find their time zone information.

> # Provide a dataframe of 10 points having longitude and latitude as column name.
> sg <- data.frame(longitude = seq(-110, -80, length.out = 10),
+ latitude = seq(30, 50, length.out = 10))
> 
> # Find the time zone for each point
> sg <- GetTimeZone(sg)
> sg
             timeZone  longitude latitude
1  America/Hermosillo -110.00000 30.00000
2      America/Denver -106.66667 32.22222
3      America/Denver -103.33333 34.44444
4     America/Chicago -100.00000 36.66667
5     America/Chicago  -96.66667 38.88889
6     America/Chicago  -93.33333 41.11111
7     America/Chicago  -90.00000 43.33333
8                <NA>  -86.66667 45.55556
9     America/Toronto  -83.33333 47.77778
10    America/Toronto  -80.00000 50.00000

7 GetRfc

US has 13 River Forecast Centers (RFC) and many of the statistics are desired to be grouped into River Forecast Center level so it would be easier to compare with the performance of the RFC models in the past. The RFC boundary shapefile is provided in rwrfhydro data and is called rfc.

> class(rfc)
[1] "SpatialPolygonsDataFrame"
attr(,"package")
[1] "sp"
> # Shows the available rfc, name of the column is BASIN_ID
> head(rfc@data)
  SITE_ID STATE           RFC_NAME                  RFC_CITY BASIN_ID
0     ACR    AK             Alaska                 Anchorage    AKRFC
1     KRF    MO     Missouri Basin Kansas City/Pleasant Hill    MBRFC
2     STR    UT     Colorado Basin            Salt Lake City    CBRFC
3     TUA    OK Arkansas-Red Basin                     Tulsa    ABRFC
4     RSA    CA  California-Nevada                Sacramento    CNRFC
5     ORN    LA  Lower Mississippi   New Orleans/Baton Rouge    LMRFC

GetRfc return the RFC name for any point having longitude and latitude. It takes a dataframe containing at least two fields of latitude and longitude, overlays the points with a rfc SpatialPolygonDataFrame and return the rfc’s BASIN_ID. Function has three arguments.

  • points: A dataframe of the points. The dataframe should contain at least two fields called “latitude” and “longitude”.
  • proj4: Projection of the points to be used in transforming the points projection to rfc projection. Default is +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0.
  • parallel: If the number of points are high you can parallelize the process.

It will return the points dataframe with an added column called rfc. It will return NA in case the point is not within RFCs.

> # Provide a dataframe of 10 points having longitude and latitude as column name.
> sg <- data.frame(longitude = seq(-110, -80, length.out = 10),
+                  latitude = seq(30, 50, length.out = 10))
> 
> # Find the rfc for each point
> sg <- GetRfc(sg)
> sg
     rfc  longitude latitude
1   <NA> -110.00000 30.00000
2  WGRFC -106.66667 32.22222
3  WGRFC -103.33333 34.44444
4  ABRFC -100.00000 36.66667
5  MBRFC  -96.66667 38.88889
6  NCRFC  -93.33333 41.11111
7  NCRFC  -90.00000 43.33333
8   <NA>  -86.66667 45.55556
9   <NA>  -83.33333 47.77778
10  <NA>  -80.00000 50.00000

7.1 GetPoly

Getpoly is similar to GetRfc, it is a wrapper for function sp::over. It takes a dataframe containing at least two fields of latitude and longitude, overlays the points with a SpatialPolygonDataFrame and return the requested attribute from the polygon. One could use the available SpatialPolygon* loaded into memory or provide the address to the location of a polygon shapefile and the name of the shapefile and it will read the polygon using rgdal::readOGR function.

Let’s get the RFC information from GetPoly instead of GetRfc. Here we provide the name of the SpatialPolygon* and using argument join request one of the attributes of the polygon. For example, here we have requested the BASIN_ID, RFC_NAME and RFC_CITY.

> # Provide a dataframe of 10 points having longitude and latitude
> sg <- data.frame(longitude = seq(-110, -80, length.out = 10),
+ latitude = seq(30, 50, length.out = 10))
> 
> # Find the ID of RFC for each point
> sg <- GetPoly(points = sg, polygon = rfc, join = "BASIN_ID")
> 
> # Find the full name of RFC for each point
> sg <- GetPoly(points = sg, polygon = rfc, join = "RFC_NAME")
> 
> # Find the location/city of RFC for each point
> sg <- GetPoly(points = sg, polygon = rfc, join = "RFC_CITY")
> sg
   BASIN_ID           RFC_NAME                  RFC_CITY  longitude
1      <NA>               <NA>                      <NA> -110.00000
2     WGRFC          West Gulf         Dallas/Fort Worth -106.66667
3     WGRFC          West Gulf         Dallas/Fort Worth -103.33333
4     ABRFC Arkansas-Red Basin                     Tulsa -100.00000
5     MBRFC     Missouri Basin Kansas City/Pleasant Hill  -96.66667
6     NCRFC      North Central               Minneapolis  -93.33333
7     NCRFC      North Central               Minneapolis  -90.00000
8      <NA>               <NA>                      <NA>  -86.66667
9      <NA>               <NA>                      <NA>  -83.33333
10     <NA>               <NA>                      <NA>  -80.00000
   latitude
1  30.00000
2  32.22222
3  34.44444
4  36.66667
5  38.88889
6  41.11111
7  43.33333
8  45.55556
9  47.77778
10 50.00000

8 PolyToRaster

If one wants to create a mask in the model domain (geogrid file), then needs to use PolyToRaster. It first picks up the required geographic information (like proj4) from the geogrid file (geoFile) and then use raster::rasterize function to grab the mask or attribute values from the SpatialPolygonDataFrame. This function is basically wrapping the raster::rasterize function to serve our purpose. Below is a few different ways one could use this function.

Example 1 : Let’s get the RFC’s ID for each pixel within the Fourmile Creek domain. This is equivalent to rasterizing the rfc SpatialPolygonDataFrame based on the BASIN_ID.

> r <- PolyToRaster(geoFile = geoFile,
+                   useRfc = TRUE,
+                   field ="BASIN_ID")
Loading required namespace: rgeos

To know what are the corresponding values to the integer values used in rasterized output, use the following command.

> r@data@attributes
[[1]]
  ID BASIN_ID
1  9    NERFC

As the result shows the rectangle domain is in CBRFC and MBRFC.

Example 2 : Rasterize the HUC12 SpatialPolygonDataFrame based on the HUC12 field. The Clipped HUC12 shapefile is provided with the test case which is slightly bigger than the model domain. You could read the shapefile and plot it as below.

> polyg <- rgdal::readOGR("~/wrf-hydro-training/example_case/supplemental/polygons", "Clipped_huc12")
> raster::plot(polyg, main = "Clipped HUC12")

Our study domain covers some of the HUCs.

> polygonAddress <- "~/wrf-hydro-training/example_case/supplemental/polygons"
> r <- PolyToRaster(geoFile = geoFile,
+                   polygonAddress = polygonAddress,
+                   polygonShapeFile = "Clipped_huc12",
+                   field ="HUC12",
+                   plot = FALSE)

To get the HUC12 actual values:

> r@data@attributes
[[1]]
  ID        HUC12
1  1 020200080303
2  2 020200080306
3  3 020301010201
4  4 020301010205
5  5 020301010206

Example 3: You can get a unified mask over the study domain as follows:

> r <- PolyToRaster(geoFile = geoFile,
+                   polygonAddress = polygonAddress,
+                   polygonShapeFile = "Clipped_huc12",
+                   mask =TRUE)

Example 4: You could also get a separate mask for each subbasin (HUC12 in this case) with the fraction of each grid cell that is covered by each polygon. The fraction covered is estimated by dividing each cell into 100 subcells and determining presence/absence of the polygon in the center of each subcell.

> r <- PolyToRaster(geoFile = geoFile,
+                   polygonAddress = polygonAddress,
+                   polygonShapeFile = "Clipped_huc12",
+                   field = "HUC12",
+                   getCover = TRUE)
> raster::plot(r)