#' ---
#' title: rwrfhydro introduction
#' author: James & Aubrey
#' ---
#'
#' #### What is rwrfhydro?
#' A community-contributed toolbox for managing, analyzing, and visualizing WRF Hydro (and HydroDART) input and output files in R.
#'
#' Intentionally, “rwrfhydro” can be read as “our wrf hydro”. The purpose of this R package is to focus *community development* of tools for working with and analyzing data related to the WRF Hydro model. These tools are both free and open-source, just like R, which should help make them accessible and popular.
#'
#' The GitHub repository is found here: [https://github.com/mccreigh/rwrfhydro/](https://github.com/mccreigh/rwrfhydro/).
#'
#' The [README.Rmd](https://github.com/mccreigh/rwrfhydro/blob/master/README.Rmd) provides instructions on installing, using, and contributing. Additional resources for learning R are listed there as well.
#'
#'
#' # Lecture Outline
#' ------------------------- --------------------------------------------------------------------------------------------------------------
#' * [Install](#Install)
#' * R Basics                [ [html](#RBasics) - [Rmd](rwrhydro_intro.Rmd) - [R](rwrhydro_intro.R) ]
#' * rwrfhydro overview      [ [html](overview.html) - [Rmd](overview.Rmd) - [R](overview.R) ]
#' * Domain visualization    [ [html](domainChannelVis.html) - [Rmd](domainChannelVis.Rmd) - [R](domainChannelVis.R) ]
#' * USGS historical data    [ [html](usgsObsDb.html) - [Rmd](usgsObsDb.Rmd) - [R](usgsObsDb.R) ]
#' * Multi-ncdf grabs        [ [html](getMultiNetcdf.html) - [Rmd](getMultiNetcdf.Rmd) - [R](getMultiNetcdf.R) ]
#' * Streamflow evaluation   [ [html](streamflowEval.html) - [Rmd](streamflowEval.Rmd) - [R](streamflowEval.R) ]
#' * Water balance           [ [html](waterBudget.html) - [Rmd](waterBudget.Rmd) - [R](waterBudget.R) ]
#' * ET                      [ [html](evapotranspirationEval.html) - [Rmd](evapotranspirationEval.Rmd) - [R](evapotranspirationEval.R) ]
#' * Free-for-all
#' -------------------------
#' --------------------------------------------------------------------------------------------------------------
#' ***
#'
#'
#' # Install
#' The [installation instructions](https://github.com/mccreigh/rwrfhydro/blob/master/README.Rmd#installing) require a minor detour for two reasons: 1) the netCDF library that `ncdf4` builds against is in a non-standard location, and 2) there is no local R library yet. First we'll install `devtools`.
#'
#'
#' In the terminal:
## ---- eval=FALSE, engine="bash"------------------------------------------
## R
## install.packages("devtools", repos="http://cran.at.r-project.org/")
## q()
#' On a clean R install, the above will ask the user to create a new, writable library path. This will be the first entry in the R library path (`.libPaths()`).
#'
#'
#' Next, we'll return to the shell and install the `ncdf4` package against the correct-for-R (gcc) build on the system. Again, in the terminal:
## ---- eval=FALSE, engine="bash"------------------------------------------
## wget http://cran.r-project.org/src/contrib/ncdf4_1.13.tar.gz
## R CMD INSTALL \
##   --configure-args="--with-nc-config=/usr/local/netcdf-4.3.2-gcc/bin/nc-config" ncdf4_1.13.tar.gz
#'
#'
#' Now we will install `rwrfhydro`. The remainder of its dependencies install without issue. Again in the terminal:
## ---- eval=FALSE, engine="bash"------------------------------------------
## R
## library(ncdf4)
## devtools::install_github('mccreigh/rwrfhydro')
## library(rwrfhydro)
#'
#' ***
#'
#'
#' # R Basics
#' The goal is to *demystify* R and leave you less confused when using it. The examples here are not deep but give a view into some basic ways that R might be different from whatever language you are used to. Once you absorb this, you can graduate to more sophisticated resources like [Advanced R](http://adv-r.had.co.nz/) or [R Packages](http://r-pkgs.had.co.nz/) with a solid foundation. Please see the rwrfhydro home page for more references.
#'
#' One note is that R was developed as an open source project which apparently did not impose much in the way of coding standards at the outset. Hence, there is really no standard code style (e.g. `function.name` or `function_name` or `functionName` or `FunctionName` or `functionname` might all be used for different functions).
#'
#' If you are used to including a library of functions in your path in another language, this notion is replaced by the "package" in R. Writing packages is a somewhat advanced way to use R, but it is well worth learning the fundamentals. R also possesses advanced and powerful documentation and markup features which are worth learning. Note that these documents are provided in three formats generated from the rmarkdown (.Rmd) documents: the .html files are generated by running the code, and the .R files turn the text into comments, leaving the code.
#'
#' None of this is to mention the availability of packages and interfaces to other open-source software which extend the power of the R language.
#'
#' ## Table of Contents
#' * [Setup](#Setup)
#' * [Functions](#Functions)
#' * [Lists](#Lists)
#' * [Data Frames](#DataFrames)
#' * [Serious List Example](#SeriousListEx)
#' * [Methods & Classes](#MethodsClasses)
#' * [Scoping](#Scoping)
#' * [Getting Help](#Help)
#' * [Packages & Namespaces](#PackagesNamespaces)
#' * [Bonus: Computing on the language](#COL)
#'
#'
#'
#' ## Setup
#'
#' While interfacing with R on the command line is an option, using a more integrated editor will help you be more efficient. RStudio provides a very good and free IDE (Integrated Development Environment) for R. The only competitor is ESS for Emacs. Both are highly recommended, with RStudio having several advantages.
#'
#'
#' When running R, you might benefit from setting `options(warn=2)`, which turns all warnings into errors. It's useful for stopping on, or discovering, unintended misuse of R as soon as it occurs. A debugger in R can be turned on by `options(error=utils::recover)`.
#' Debugging is more advanced but good to be aware of; see [this resource](http://www.biostat.jhsph.edu/~rpeng/docs/R-debug-tools.pdf) for more information.
#'
#' The above also illustrates the two things which have remained in my "startup" file to date. Here's how that startup file is configured by use of your ~/.Renviron file.
## ---- eval=FALSE, engine="bash"------------------------------------------
## james@orographic:~> more .Renviron
## R_LIBS=~/R/Libraries/R3.2/
## R_PROFILE=/Users/james/R/startup_jlm.r
##
## james@orographic:~> more /Users/james/R/startup_jlm.r
## options(warn=1)
## options(error=utils::recover)
#'
#' The first line of .Renviron specifies the R libraries path (`R_LIBS`), which is unix-like in searching for read and write. The second line specifies the `R_PROFILE` or startup file location. Below that, the contents of the startup file are shown to be the options mentioned previously (here with the milder `warn=1`, which prints each warning as it occurs rather than turning it into an error).
#'
#'
#'
#' ## Functions
#' A basic R function looks like this:
## ------------------------------------------------------------------------
BasicFunc <- function(x) {  ## define the function
  y=x^2                     ## square the argument
  y                         ## return the square
}
xx <- 3
yy <- BasicFunc(xx)
print(yy)
yy
#' Note that the value of the last expression in the function is the return value. Also, the print method is invoked on an object when that object is entered by itself, as in the very last line.
#'
#' Remember that nearly everything in R is a function object. Indeed, functions are "first-class". A more complicated example illustrates passing a named function to another function. (In the [scoping](#Scoping) section below, we illustrate a function returning a function, which is then called a closure.)
## ------------------------------------------------------------------------
## Note named arguments. Args can be passed by name or position.
Function1 <- function(x, f) list(arg=x, result=f(x), func=f)
result1 <- Function1(xx, BasicFunc)
result2 <- Function1(f=BasicFunc, x=xx)
identical(result1,result2)
result1
str(result1)
#' The result of this function is not a scalar or vector; it is a `list` object, which can contain an arbitrary collection of things. The `str()` function reveals the structure of objects. It helps make sense of complicated, hierarchical items as we'll see later. It also provides object type and class information.
#'
#'
#'
#' ## Lists (how to get more stuff out of a function)
#' Above we saw that a function can return multiple items in a list. Then how do we get the contents out of a returned list? For simple lists, this can be done by hand. Note that `str()` is wrapping many of the commands below to give a more informative view of what the output actually is.
## ------------------------------------------------------------------------
names(result1)         ## returns a character vector, hence the [1]
str(result1['arg'])    ## returns a sub-list
str(result1[['arg']])  ## returns the item requested, a numeric scalar
str(result1$arg)       ## dollar acts like the "[[" function.
str(result1$func(2))   ## function evaluation.
result1[['func']](2)   ## function evaluation, same as previous.
## The easiest way to rename is by name using plyr::rename
renames <- paste0('names.',c("b","(c)"))  ## the new names are the vector entries
names(renames) <- c('result','func')      ## the old names are the names of the new names
## This kind of translation vector is very handy, note that
## value=renames[name] and name=names(renames)[which(renames == value)]
str(renames)
print(renames)  ## the top line is the names
result1 <- plyr::rename(result1, renames)
str(result1)
result1$names.b
result1$`names.(c)`(2)  ## backquote can handle illegal names
#'
#' This demonstrates a basic principle in R that you can forget about indices and **call it by name**. NO indices were used above; the number two is a value passed to a function.
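#'
#' As a minimal sketch of the same call-it-by-name idea using only base R (no `plyr`), the chunk below contrasts `[` with `[[` and renames list entries through a translation vector. The objects `demoList`, `subList`, `item`, and `translate` are hypothetical stand-ins invented for this illustration; they are not part of rwrfhydro or of the example above.
## ------------------------------------------------------------------------
## A small stand-in list (hypothetical, unrelated to result1 above).
demoList <- list(arg=3, result=9, func=function(x) x^2)
subList <- demoList['arg']    ## single bracket keeps the list wrapper
item    <- demoList[['arg']]  ## double bracket extracts the element itself
is.list(subList)
is.numeric(item)
## Translation vector: the entries are the new names, the names are the old names.
translate <- c(result='squared', func='squarer')
hit <- names(demoList) %in% names(translate)  ## which entries get renamed
names(demoList)[hit] <- translate[names(demoList)[hit]]
names(demoList)
demoList$squarer(2)  ## the renamed function is still callable by name
#' Entries not mentioned in the translation vector (here `arg`) keep their names, which is the same behavior the `plyr::rename` call relies on.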
#'
#'
#'
#' ## Data frames (regular lists)
#' Before showing complicated lists, let's look at data frames. Data frames are special kinds of lists that are "regular": every column has the same length.
## ------------------------------------------------------------------------
mtcars  ## data packaged with R typically comes in data frames.
str(mtcars)
rownames(mtcars)
colnames(mtcars)
?mtcars
## subset is on rows
ecoCars<-subset(mtcars, mpg > 25)  ## non-standard evaluation of the column name
ecoCars
ecoCars$hp  ## $ gives columns by name
mtcars[mtcars$wt<3,c('wt','mpg','cyl','disp')]  ## rows and cols can be referenced
mtcars$names <- rownames(mtcars)  ## new col with names, mixed types in df
mtcars$names <- NULL  ## remove a column
#' The regular collation of data allows many special operations to be performed on data frames to summarize, subset, and otherwise manipulate the data. We don't have time to cover this important aspect directly.
#'
#'
#'
#' ## Serious list example
#' One example of a very complicated list is all the netcdf metadata.
## ------------------------------------------------------------------------
ncFile <- '~/wrfHydroTestCases/Fourmile_Creek/RUN.RTTESTS/OUTPUT_CHRT_DAILY/201305160000.LSMOUT_DOMAIN1'
library(ncdf4)
ncid <- nc_open(ncFile)
ncid
rwrfhydro::ncdump(ncFile)  ## something more visually akin to unix ncdump
#'
#' Detailed inspection of the ncid object makes it obvious that automating information extraction is going to be key.
## ------------------------------------------------------------------------
str(ncid)
names(ncid)  ## drill down by name into the object
names(ncid$var)  ## var looks like it describes the variables
names(ncid$var$stc)  ## indeed, and each variable has these names
varList <- rwrfhydro::NamedList(names(ncid$var))  ## for all variables
varList  ## NamedList simply names a list by its entries
## use an anonymous function to return "size" information. note that the
## output references the input. indices are not involved.
str(plyr::llply(varList, function(vv) ncid$var[[vv]]$size ))
nc_close(ncid)
#' In the above, the `llply` function from the `plyr` package applies a function (anonymously specified in-line) to a list (`varList`) and returns a list. Note that the names on the returned list are the names which were iterated over.
#'
#'
#'
#'
#' ## Methods and classes
#' It's also good to have a basic awareness of methods and classes in R, as this can be particularly mystifying to new users. This is how the same generic function can appear to give a variety of different behaviors: these are methods conditioned on the class of the input. It's also worth noting that the common S3 object system used in R is very "lightweight" (e.g. see setting the class below) and easy to use. For those who shudder at such an informal system there are [other object approaches in R](http://adv-r.had.co.nz/OO-essentials.html).
#'
## ------------------------------------------------------------------------
class(mtcars)
print(head(mtcars))
print.data.frame(head(mtcars))
print.default(head(mtcars))
print.foo <- function(x, ...){  ## S3 print methods conventionally accept ...
  print(names(x))
  print('foo!!!!!!!!!!!')  ## really not helpful!
}
class(mtcars) <- append('foo',class(mtcars))
print(mtcars)
#'
#'
#' ## Scoping
#' Scoping: what variables are available where? R has *lexical scoping*: variables in enclosing environments/functions are available to a given environment/function. For super-assignment (`<<-`): the name is searched for in the enclosing environments until it is found; if it has not been found by the time the global environment is reached, it is created there. Lastly, functions can be returned with their own environment; these are called closures. Here are some examples.
#'
#' The variable `a` is in the global environment and is found by the function `f`
#' because `f` is enclosed in the global environment.
## ------------------------------------------------------------------------
a <- 10
f <- function(x) a*x
f(2)
#'
#' Now `g` contains a variable `b` which `f2` cannot find because `g` does not enclose `f2`. I wrap failing evaluations in `print(try())` so that this document compiles. Note that `try()` is a very valuable function for fault tolerance. Also, examination of `try()` shows that on failure it returns the error wrapped in `invisible()`; to see that object, `print()` has to be explicitly called on `try()`.
## ------------------------------------------------------------------------
options(warn=1)
g <- function(x) {b <- 10; 1/x}
f2 <- function(x) b*x
print(try(f2(.1)))
## the following super-assignment (<<-) is frowned upon by most guRus.
g2 <- function(x) {b <<- 100; 1/x}
print(try(b))  # whoa, isnt it supposed to be assigned?
g2(2)
b  # the body of g2 only runs when g2 is called, so b isnt assigned until then.
f2(.1)
#'
#' A closure is a function with data; that data is arbitrary.
## ------------------------------------------------------------------------
fOuter <- function(x) {qqq <- 2*x; function() qqq }
fInner <- fOuter(4)
fInner()
print(try(qqq))
get('qqq',envir = environment(fInner))
#'
#' Or maybe more informatively:
## ------------------------------------------------------------------------
vvv <- 123
gOuter <- function(x) {
  ttt <- 2*x
  junk <- rwrfhydro::NamedList(letters[1:4])
  function(getVar) get(getVar)
}
gInner <- gOuter(4)
print(try(ttt))
gInner('junk')
gOuter(c(1,2,4))('ttt')
gInner('vvv')
#'
#' The caveat emptor with scoping is that, while functions which reference variables in enclosing environments are plug-and-play, unintended changes to variables in enclosing environments are vulnerabilities/liabilities to function accuracy! Be careful and declare all variables in a function when safety is needed.
#'
#'
#'
#' ## Getting help
#' Getting information on functions is key.
#' Getting the source code of a function is as easy as typing the function's name without parentheses (i.e., invoking the print method on the function!) and reading functions is a great way to pick up R programming tips:
## ------------------------------------------------------------------------
lm
#' Beyond reading the body of the function, note that the arguments are listed at the top with default values following `=`. The `...` argument is for passing arguments to "low-level" regression functions (described in `?lm`). We also see that the function belongs to the `stats` namespace.
#'
#' See also (not run here):
## ---- eval=FALSE---------------------------------------------------------
## formals(lm)
## ?lm
## ?'%in%'
## ?'['
## ## the following are help on help
## ?help
## ?`?`
## ?`??`
#'
#'
#'
#' ## Packages and namespaces
#' R packages are like toolboxes for specific purposes. Namespaces for packages make their functions available, and managing namespaces is how one can open an entire toolbox or just pull a specific tool out of a specific toolbox. Namespaces can be made entirely available by attaching a package using `library()`, i.e. the whole toolbox is available in the global environment:
## ------------------------------------------------------------------------
sessionInfo()
library(rwrfhydro)  ## this adds rwrfhydro to the 'other attached packages' list
sessionInfo()
PlotFdc  ## this function is available in the global environment.
#' Now all the exported functions in the rwrfhydro namespace are available because it is attached.
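#'
#' The effect of attaching can also be sketched with a package that ships with R, so that nothing depends on rwrfhydro being installed. This is only an illustration: `tools` and its `file_ext` function are chosen arbitrarily, `before` is a hypothetical helper variable, and the sketch assumes `tools` has not already been attached in the session.
## ------------------------------------------------------------------------
before <- search()         ## the search path before attaching
library(tools)             ## attach a package that ships with R
setdiff(search(), before)  ## what attaching added to the search path
exists("file_ext")         ## exported functions are now found by bare name
environmentName(environment(file_ext))  ## the function still belongs to its namespace
#' Comparing `search()` before and after is a lighter-weight check than reading the full `sessionInfo()` output.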
#'
#' Now we grab a specific tool (`melt`) from the "reshape2" package, but this doesn't make the whole toolbox available, though we see its namespace is loaded.
## ------------------------------------------------------------------------
mtcars$model <- rownames(mtcars)
head(reshape2::melt(mtcars[,c("mpg","cyl","disp","model")], id='model'))  ## use a function in reshape2 without attaching it
sessionInfo()  ## loads (without attaching) plyr, reshape2, Rcpp, & stringr (some already loaded.)
print(try(melt))  ## since reshape2 is not attached, the melt function is not available.
#'
#' The namespace of a package is the set of exported objects. Internal (non-exported) objects in a package can be accessed via `:::` though this is often frowned upon.
#'
#'
#'
#' ## Bonus: Computing on the language
#' R is an extremely flexible language. The non-standard evaluation mechanisms allow for a variety of powerful behaviours. These can be confusing to new users. Here's one simple example showing a few useful functions.
## ------------------------------------------------------------------------
assign('one',1)
two <- eval(parse(text='one+one'))
get('two')
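#'
#' A slightly different sketch of the same idea works on expressions rather than on text. It uses only base R (`quote`, `eval`, `substitute`, `deparse`); the names `expr`, `twenty`, `ShowArg`, and `typed` are hypothetical and exist only for this illustration.
## ------------------------------------------------------------------------
expr <- quote(one + one)  ## capture the expression without evaluating it
class(expr)               ## the captured expression is a "call" object
twenty <- eval(expr, list(one=10))  ## evaluate with `one` looked up in a list
twenty
## substitute() captures what the caller typed; deparse() turns it into text.
ShowArg <- function(x) deparse(substitute(x))  ## hypothetical helper
typed <- ShowArg(one + one)
typed
#' Capturing the argument as typed is the same kind of mechanism that lets `subset(mtcars, mpg > 25)` refer to a column by its bare name.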