Irucka Embry, EIT [Cherokee Nation Technology Solutions (CNTS) United States Geological Survey (USGS) Contractor] gave this tutorial to his USGS colleagues on Friday, 28 August 2015, as part of a group R tutorial. This tutorial has been modified from its original presentation.
R Resource Web page: https://www.ecoccs.com/rtraining.html {R Trainings and Resources provided by EcoC2S (Irucka Embry, EIT)}
That Web page contains information related to this series of R Tutorials as well as many other useful resources for the USGS.
# Install and load the necessary R packages prior to beginning this tutorial
# If you are unsure whether you have all of the packages already installed
install.packages("install.load")
# install the install.load package maintained by Irucka Embry
install.load::install_load("openxlsx", "readxl", "lubridate", "ggplot2", "DT", "pander")
# install and/or load the following packages and dependencies
This Tutorial version was created with rmarkdown using the following:
R, install.load, DT, pander, ggplot2, lubridate, readxl, openxlsx
Download this file (https://www.ecoccs.com/R_Tutorial/28_August_2015/WForkStonesRiver03428200.xlsx) to your working directory on your computer
# If you are sure that you have all of the packages installed
# install.packages("install.load") # install the install.load package maintained
# by Irucka Embry
install.load::load_package("openxlsx", "readxl", "lubridate", "ggplot2", "DT", "pander")
# load the packages and dependencies
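If you would rather check for missing packages before installing anything, the following is a minimal base R sketch. The helper `missing_packages()` is a hypothetical name introduced here for illustration; it is not part of install.load, and the install call is left commented out so that nothing is installed by accident.

```r
# Hypothetical helper (not part of install.load): return the names of the
# packages in pkgs that cannot be found on this computer
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

needed <- c("openxlsx", "readxl", "lubridate", "ggplot2", "DT", "pander")
to_install <- missing_packages(needed)
to_install # character(0) means every package is already installed

# Uncomment the next line to install only the packages that are missing
# install.packages(to_install)
```

After any missing packages are installed, the `install.load::load_package()` call shown above loads them as before.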
file <- "WForkStonesRiver03428200.xlsx" # tk_choose.files() # obtain the file using the file dialog
# the function tk_choose.files() was used in the original tutorial, but won't
# be used here
file # this will print the filename
## [1] "WForkStonesRiver03428200.xlsx"
# Importing Data
stone <- read.xlsx(file, startRow = 2) # read in the spreadsheet named in file, starting at Row 2
stone <- stone[-1, ] # remove Row 1
stone[, 3] <- ymd(stone[, 3]) # change Column 3 (datetime) from character class to Date class
stone[, 4] <- as.numeric(stone[, 4]) # change Column 4 [Discharge, cubic feet per second (Mean)] from character class to numeric class
pander(summary(stone[, 4], na.rm = TRUE), header = c("Min.", "1st Qu.", "Median",
    "Mean", "3rd Qu.", "Max.", "NA's"))
# print out a statistical summary for Column 4

| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | NA's |
|---|---|---|---|---|---|---|
| 4.7 | 37 | 109 | 298.1 | 295 | 21200 | 1432 |
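The conversion and summary steps above can be tried on a small example that needs no spreadsheet. This is only a sketch: the `toy` data frame and its values are made up, and base R's `as.Date()` stands in for `ymd()` so that the example runs without lubridate. It shows one possible way that NA's can end up in a summary: `as.numeric()` turns any entry it cannot parse into NA.

```r
# Made-up values standing in for Columns 3 and 4 of the spreadsheet
toy <- data.frame(
  datetime = c("2015-08-26", "2015-08-27", "2015-08-28"),
  discharge = c("12.5", "Ice", "40"),
  stringsAsFactors = FALSE
)

toy$datetime <- as.Date(toy$datetime) # character class to Date class
toy$discharge <- suppressWarnings(as.numeric(toy$discharge)) # "Ice" becomes NA

class(toy$datetime) # check the class of the converted date column
summary(toy$discharge) # the NA is counted in its own NA's column
mean(toy$discharge, na.rm = TRUE) # mean of the two remaining values
```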
# select only the Rows where the data are Approved (A)
Agrepl <- grepl("A", stone[, 5]) # find all 'A's in Column 5 of stone
Agreplwhich <- which(grepl("A", stone[, 5])) # what are the Row numbers where there is an 'A' in Column 5 of stone
stoneA <- stone[which(grepl("A", stone[, 5])), 1:5] # select only the Rows in Columns 1:5 where the data are Approved
datatable(head(stoneA, 30)) # only print the first 30 rows
# select only the Rows where the data are Provisional (P)
Pgrepl <- grepl("P", stone[, 5]) # find all 'P's in Column 5 of stone
Pgreplwhich <- which(grepl("P", stone[, 5])) # what are the Row numbers where there is a 'P' in Column 5 of stone
stoneP <- stone[which(grepl("P", stone[, 5])), 1:5] # select only the Rows in Columns 1:5 where the data are Provisional
datatable(head(stoneP, 30)) # only print the first 30 rows
# the name of column 3 is datetime
colnames(stone[3])
## [1] "datetime"
# or
names(stone[3])
## [1] "datetime"
# the name of column 4 is X01_00060_00003
colnames(stone[4])
## [1] "X01_00060_00003"
# or
names(stone[4])
## [1] "X01_00060_00003"
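The row filtering and column name lookups above can be combined in a small example. This is a sketch under stated assumptions: the `toy` data frame is made up, and the column name `code` is a hypothetical stand-in for Column 5 of stone, whose real name is not shown in this tutorial. It illustrates that `grepl()` matches a pattern anywhere in the text, so a code such as "A:e" also counts as Approved, and that a column can be selected by name instead of by position.

```r
# Made-up data; "code" is a hypothetical name for the approval column
toy <- data.frame(
  datetime = as.Date(c("2015-08-26", "2015-08-27", "2015-08-28")),
  X01_00060_00003 = c(12.5, 18, 40),
  code = c("A", "A:e", "P"),
  stringsAsFactors = FALSE
)

is_approved <- grepl("A", toy$code, fixed = TRUE) # one TRUE or FALSE per row
which(is_approved) # the Row numbers where the code contains an 'A'

toyA <- toy[is_approved, ] # a logical vector can index the Rows directly
toyP <- toy[toy$code == "P", ] # exact match instead of a pattern match

toyA[["X01_00060_00003"]] # select a column by name rather than by position
```

Selecting by name keeps the code working if the column order in the spreadsheet ever changes.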
# plot the time series using base R plot
plot(stone[, 3], stone[, 4], type = "l", main = "Time Series of West Fork Stones River",
xlab = "Year", ylab =