Irucka Embry, EIT [Cherokee Nation Technology Solutions (CNTS) United States Geological Survey (USGS) Contractor] gave this tutorial to his USGS colleagues on Friday, 28 August 2015, as part of a group R tutorial. This tutorial has been modified from its original presentation.



Useful Resources

R Resource Web page: https://www.ecoccs.com/rtraining.html {R Trainings and Resources provided by EcoC2S (Irucka Embry, EIT)}

That Web page contains information related to this series of R tutorials, as well as many other useful resources for the USGS.



Tutorial

# Install and load the necessary R packages prior to beginning this tutorial

# If you are unsure whether all of the packages are already installed,
# install the install.load package maintained by Irucka Embry
install.packages("install.load")

# install and/or load the following packages and their dependencies
install.load::install_load("openxlsx", "readxl", "lubridate", "ggplot2", "DT", "pander")
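
If you would rather see which packages are missing before anything is installed, the check can be sketched in base R. This is only an illustration: find_missing_packages() is a hypothetical helper written for this example and is not part of install.load.

```r
# Hypothetical helper (not part of install.load): return the packages in
# `pkgs` that cannot be found in the current R library
find_missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

pkgs <- c("openxlsx", "readxl", "lubridate", "ggplot2", "DT", "pander")
to_install <- find_missing_packages(pkgs)
to_install  # an empty character vector when everything is already installed

# Uncomment to install whatever is missing and then attach every package
# if (length(to_install) > 0) install.packages(to_install)
# invisible(lapply(pkgs, library, character.only = TRUE))
```

The install and attach steps are left commented out so that running the sketch only reports what is missing.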


This Tutorial version was created with rmarkdown using the following:

  • R
  • install.load
  • DT
  • pander
  • ggplot2
  • lubridate
  • readxl
  • openxlsx



Download this file (https://www.ecoccs.com/R_Tutorial/28_August_2015/WForkStonesRiver03428200.xlsx) to the working directory on your computer.



# If you are sure that you have all of the packages installed
# install.packages("install.load")  # install the install.load package maintained
# by Irucka Embry

# load the packages and their dependencies
install.load::load_package("openxlsx", "readxl", "lubridate", "ggplot2", "DT", "pander")


file <- "WForkStonesRiver03428200.xlsx"  # the name of the downloaded spreadsheet
# the original tutorial used tk_choose.files() to obtain the file with a file
# dialog, but that function won't be used here
file  # this will print the filename
## [1] "WForkStonesRiver03428200.xlsx"
# Importing Data
stone <- read.xlsx(file, startRow = 2)  # read in the spreadsheet, starting at Row 2
stone <- stone[-1, ]  # remove the first row of the data frame
stone[, 3] <- ymd(stone[, 3])  # change Column 3 (datetime) from character class to Date class
stone[, 4] <- as.numeric(stone[, 4])  # change Column 4 [Discharge, cubic feet per second (Mean)] from character class to numeric class
# print out a statistical summary for Column 4
pander(summary(stone[, 4], na.rm = TRUE), header = c("Min.", "1st Qu.", "Median",
    "Mean", "3rd Qu.", "Max.", "NA's"))

  Min.   1st Qu.   Median   Mean    3rd Qu.   Max.    NA's
  4.7    37        109      298.1   295       21200   1432
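
The class conversions and the NA's count in the summary can be tried out on a small, made-up data frame before running them on the full spreadsheet. The sketch below is an illustration only: the toy values are invented, and base R's as.Date() is used as a stand-in for lubridate's ymd() so that the example runs without any extra packages.

```r
# Made-up rows that mimic the character columns read from the spreadsheet
toy <- data.frame(
  datetime = c("2015-08-26", "2015-08-27", "2015-08-28"),
  discharge = c("109", "", "37"),  # the empty string stands in for a missing value
  stringsAsFactors = FALSE
)

toy$datetime <- as.Date(toy$datetime)       # base R stand-in for ymd()
toy$discharge <- as.numeric(toy$discharge)  # the empty string becomes NA

class(toy$datetime)        # the class of the converted date column
summary(toy$discharge)     # the NA's entry counts the missing value
sum(is.na(toy$discharge))  # the number of missing discharge values
```

Checking class() after each conversion is a quick way to confirm that a column really changed from character to the class you expect.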


# select only the Rows where the data are Approved (A)
Agrepl <- grepl("A", stone[, 5])  # logical vector that is TRUE wherever Column 5 of stone contains an 'A'
Agreplwhich <- which(Agrepl)  # what are the Row numbers where there is an 'A' in Column 5 of stone
stoneA <- stone[Agreplwhich, 1:5]  # select only the Rows in Columns 1:5 where the data are Approved
datatable(head(stoneA, 30))  # only print the first 30 rows
# select only the Rows where the data are Provisional (P)
Pgrepl <- grepl("P", stone[, 5])  # logical vector that is TRUE wherever Column 5 of stone contains a 'P'
Pgreplwhich <- which(Pgrepl)  # what are the Row numbers where there is a 'P' in Column 5 of stone
stoneP <- stone[Pgreplwhich, 1:5]  # select only the Rows in Columns 1:5 where the data are Provisional
datatable(head(stoneP, 30))  # only print the first 30 rows
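
How grepl() and which() work together when selecting rows can be seen on a small, made-up data frame. This is a sketch under stated assumptions: the toy values and the qualifier strings (including "A:e") are invented for illustration and are not taken from the spreadsheet.

```r
# Made-up discharge values and qualification codes
toy <- data.frame(
  discharge = c(12.5, 40, 7.1, 300, 55),
  code = c("A", "A:e", "P", "P", NA),
  stringsAsFactors = FALSE
)

is_approved <- grepl("A", toy$code)  # TRUE wherever the code contains an 'A'
approved_rows <- which(is_approved)  # the row numbers of the TRUE values

toy[is_approved, ]    # select rows with the logical vector
toy[approved_rows, ]  # select the same rows with the row numbers

# grepl() matches any code that contains an 'A', so "A:e" is kept as well;
# an exact comparison keeps only the rows whose code is exactly "A"
exact_A <- toy[which(toy$code == "A"), ]
exact_A
```

Because grepl() returns FALSE for a missing value, the logical vector can be used for row selection directly; which() is mainly useful when the row numbers themselves are of interest.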
# the name of column 3 is datetime
colnames(stone[3])
## [1] "datetime"
# or
names(stone[3])
## [1] "datetime"
# the name of column 4 is X01_00060_00003
colnames(stone[4])
## [1] "X01_00060_00003"
# or
names(stone[4])
## [1] "X01_00060_00003"
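
A column name such as X01_00060_00003 is not easy to read in later code, so one option is to rename the column once its name has been looked up. The sketch below uses a made-up, one-row data frame, and the new name discharge_cfs is a hypothetical choice for this example rather than a name used in the original tutorial.

```r
# Made-up data frame with the same column names as Columns 3 and 4 of stone
toy <- data.frame(datetime = as.Date("2015-08-28"), X01_00060_00003 = 109)

colnames(toy[2])  # name of column 2, looked up from a one-column data frame
colnames(toy)[2]  # the same name, looked up from the vector of all names

names(toy)[2] <- "discharge_cfs"  # hypothetical, more descriptive name
names(toy)
```

Indexing the vector of names, as in names(toy)[2], is the form that also allows assignment, which is why it is used for the renaming step.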
# plot the time series using base R plot

plot(stone[, 3], stone[, 4], type = "l", main = "Time Series of West Fork Stones River",
    xlab = "Year", ylab =