# A Practical Introduction to the R Programming Language Part 4
# by Irucka Embry, EIT [Cherokee Nation Technology Solutions (CNTS) United States Geological Survey (USGS) Contractor]
# E-mail: iruckaE@mail2world.com
# Presented on Thursday, 12 March 2015
## Pre-Tutorial
# We will start by checking out http://www.ecoccs.com/RandUSGS.html (R Resources provided by Irucka Embry)
# This Web page contains information related to the R Tutorial as well as many other useful resources
# Everyone should have a copy of R for MATLAB users (http://mathesaurus.sourceforge.net/octave-r.html). This document is useful for showing R commands and a description of the commands.
## Tutorial
# This tutorial begins with an introduction to the command line and how it is used in R and then we will proceed on to practical applications of R
# This R Tutorial is based around the concept of
# get -> clean -> explore -> visualize -> analyze / Source 18
# and also showing how to do various operations in R that you may perform in Microsoft Excel / Source 19
## Table of Contents
# How to Find and Get Into R
# Introduction to RStudio and moving around RStudio
# Creating an RStudio Project for the R Tutorial
# Getting Around the Command Line
# Working Directory
# Use of Arrow Keys
# To Execute Commands in R
# R Workspace
# Installing required R packages (and their dependencies) for the R Tutorial
# Installing and Updating Packages
# Installing Packages in the R Library
# Updating Packages in the R Library
# Basic Operations in R
# Valid Variable Names in R
# R Classification
# get -> clean -> explore -> visualize -> analyze
## NOTE: Prior to beginning this R Tutorial, it is advised that you have already downloaded and installed R 3.1.x (if using Microsoft Windows, then download and install the R binary file from https://cloud.r-project.org/bin/windows/base), RStudio, and the appropriate Java >= 1.6 libraries for your operating system.
## NOTE: Download the following files:
# http://www.ecoccs.com/R_Tutorial/12_March_2015/West_Fork_Stones_River_03428200.gz
# http://www.ecoccs.com/R_Tutorial/12_March_2015/West_Fork_Stones_River_03428200.rdb
## How to Find and Get Into R
# On Windows, you can click on the Applications Menu
# Locate the 32-bit or 64-bit R GUI, whichever matches your system architecture (if you have a 64-bit system, then use the 64-bit R GUI).
# Click on the R GUI icon.
# Or alternatively, locate RStudio (GUI) and click on the icon.
## Introduction to RStudio and moving around RStudio
# Click on Tools -> Global Options -> Packages (Change the CRAN Repository Mirror) -> Browse -> Scroll down to "Global (CDN) - RStudio" -> Click on OK
# You will return to the Options window
# When finished, click on "Apply", then "OK"
# What other options do you wish to change from the default settings?
# The RStudio Console functions similarly to the R Console [GUI or command line interface (CLI) versions]
## Creating an RStudio Project for the R Tutorial
# Click on File -> New Project... -> Create project from:
# We'll perform the next steps together based on your preferences.
## Getting Around the Command Line
# Working Directory
getwd() # returns the current working directory, which is where files are read from and saved to unless you give a full path or change the working directory
?getwd # retrieve R help on the command
setwd("C:/Documents") # sets the working directory to the folder named in quotes ("C:/Documents" is only an example path; replace it with a folder that exists on your computer)
?setwd # retrieve R help on the command
library() # with no arguments, lists the installed R packages [packages are collections of function(s)], see ?library
?library # retrieve R help on the command
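# A minimal sketch of changing the working directory and then restoring it, using only base R. The variable name old_wd is just an illustration, and tempdir() (R's per-session temporary folder) stands in for whatever folder you would really use.
old_wd <- getwd() # remember the current working directory
setwd(tempdir()) # switch to the temporary folder
getwd() # now reports the temporary folder
setwd(old_wd) # switch back to where you started
getwd() # reports the original working directory again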
# Use of Arrow Keys
# Upward arrow = up in the stored history of the command line
# Downward arrow = down in the stored history of the command line
# Leftward arrow = to the left in the command line
# Rightward arrow = to the right in the command line
# To Execute Commands in R
# Hit ENTER to execute commands in R
## R Workspace
# source 5
# Head to http://www.statmethods.net/interface/workspace.html (Quick-R: The Workspace)
# Let's type in the various commands to better understand the R Workspace
# Type in the following commands to see what happens:
demo()
?demo # retrieve R help on the command
help()
?help # retrieve R help on the command
?plot # retrieve R help on the command
help(plot)
help.start()
?help.start # retrieve R help on the command
rm(list=ls()) # this clears your R workspace (removes all R objects)
rm(x) # removes a specific variable/object named "x"
?rm # retrieve R help on the command
list.files() # list all files, including folders, in current working directory
dir() # list all files, including folders, in current working directory
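list.files(pattern = "\\.txt$") # list all files in current working directory that end with the file extension ".txt". The pattern is a regular expression: the "\\." matches a literal period (an unescaped "." would match any character) and the "$" (dollar sign) has to follow the extension to indicate that the preceding characters are the last characters.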
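# A small sketch of how the pattern behaves, using grepl() on made-up file names instead of a real folder (example_files is hypothetical). It compares a pattern with an escaped period against one with an unescaped period.
example_files <- c("notes.txt", "data.txt.bak", "mytxt", "report.csv") # made-up file names
grepl("\\.txt$", example_files) # TRUE FALSE FALSE FALSE: only names that end in a literal ".txt"
grepl(".txt$", example_files) # TRUE FALSE TRUE FALSE: the unescaped "." also matches the "y" in "mytxt"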
## Installing required R packages (and their dependencies) for the R Tutorial
# Package list for systems without Java >= 1.6
# (this list enables reading/writing of .xlsx files, but not of .xls files)
# install & load the packages
install.packages("install.load")
library(install.load)
install_load("devtools", "lubridate", "stringr", "waterData", "psych", "prob", "combinat", "data.table", "ggplot2", "directlabels", "ggthemes", "scales", "GGally", "reader", "dplyr", "qdap", "qdapDictionaries", "qdapTools", "sos", "cwhmisc", "pracma", "matrixcalc", "matlab", "matpow", "openxlsx")
# Please note that many package dependencies will also be installed in the process of installing the packages in this list
?source # retrieve R help on the command
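# If you are sure that you have all of the packages installed, then you only need to load them
install.packages("install.load") # install the install.load package maintained by Irucka Embry (the package name must be in quotes; skip this line if install.load is already installed)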
library(install.load) # load the installed package install.load
load_package("devtools", "lubridate", "stringr", "waterData", "psych", "prob", "combinat", "data.table", "ggplot2", "directlabels", "ggthemes", "scales", "GGally", "reader", "dplyr", "qdap", "qdapDictionaries", "qdapTools", "sos", "cwhmisc", "pracma", "matrixcalc", "matlab", "matpow", "openxlsx") # load the packages and dependencies
## Installing and Updating Packages
# Installing Packages in the R Library
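install.packages() # with no arguments in an interactive session, R presents a list of available packages to choose from where possible; otherwise, give the package names in quotes, for example "name_package"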
# For example, to install the packages from the list above, this is the code:
# install.packages(c("stringr", "waterData", "psych", "prob", "combinat", "data.table", "pracma", "ggplot2", "reader", "dplyr", "qdap", "qdapDictionaries", "qdapTools", "openxlsx", "sos", "cwhmisc"), repos = "https://cloud.r-project.org/", dependencies = TRUE)
?install.packages # retrieve R help on the command
# Updating Packages in the R Library
update.packages(ask = FALSE) # updates all installed packages that have newer versions available, without asking for confirmation on each one
?update.packages # retrieve R help on the command
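# A hedged sketch of checking which packages are already installed before calling install.packages(). It uses only base R; "aPackageThatDoesNotExist" is a made-up name included to show the FALSE case, and wanted, is_installed, and missing_pkgs are illustrative variable names.
wanted <- c("stats", "utils", "aPackageThatDoesNotExist") # the last name is hypothetical
is_installed <- vapply(wanted, requireNamespace, logical(1), quietly = TRUE) # TRUE where the package can be loaded
is_installed
missing_pkgs <- wanted[!is_installed] # names of the packages that still need to be installed
missing_pkgs
# install.packages(missing_pkgs) # uncomment to install whatever is missing (this would fail for the made-up name)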
## Basic Operations in R
# source 1
7 - 2 # subtraction
7 + 2 # addition
7 * 2 # multiplication
7 / 2 # division
7 ^ 2 # exponentiation
x <- c(0, 30, 60, 90, 180, 270, 360)
exp(x) # e^x (exponential function of x)
log10(x) # log base 10 of x
log(x) # natural logarithm of x
log2(x) # log base 2 of x
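cos(x) # cosine of x, with x interpreted as radians
sin(x) # sine of x, with x interpreted as radians
cos(x*pi/180) # cosine of x, with x interpreted as degrees (x is converted to radians first)
sin(x*pi/180) # sine of x, with x interpreted as degrees (x is converted to radians first)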
sqrt(x) # square root of x
.Last.value # value of the most recently evaluated expression
.Last.value * log10(x) # value of the most recently evaluated expression * log10(x)
objects() # list variables loaded into memory
print(x) # print the specific variable/object named "x"
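# A short sketch that wraps the degrees-to-radians conversion used above in a helper function. The name deg2rad is hypothetical (it is not a base R function), and angles_deg is an illustrative vector.
deg2rad <- function(deg) deg * pi / 180 # hypothetical helper: convert degrees to radians
angles_deg <- c(0, 60, 90, 180)
round(cos(deg2rad(angles_deg)), 10) # 1.0 0.5 0.0 -1.0 (rounding hides tiny floating-point error, e.g. at 90 degrees)
all.equal(cos(deg2rad(60)), 0.5) # TRUE; all.equal() compares numbers within a small tolerance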
## Valid Variable Names in R
# Which of the following variable names are valid in R?
# source 1
a <- c(1, 0.34, -0.981)
a
B <- c(1, 0.34, -0.981, 1e16, sin(40))
B
ecky_ecky_ecky_ecky_ptang_zoo_boing <- c(1, 0.34, -0.981, pi, sin(x * pi/180))
print(ecky_ecky_ecky_ecky_ptang_zoo_boing)
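# The next three assignments are NOT valid and are commented out so that the script can still run; remove the leading "# " to see the error that each one produces
# ecky ecky ecky ecky ptang zoo boing <- 1 # invalid: a variable name cannot contain spaces
# 2nd <- 1 # invalid: a variable name cannot begin with a number
# John-Bigboote <- 1 # invalid: the hyphen is read as the subtraction operator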
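# One possible way to test candidate names without triggering errors is base R's make.names(), which turns any string into a syntactically valid name. A candidate is already valid when make.names() leaves it unchanged. The vectors candidates and fixed are illustrative.
candidates <- c("a", "B", "ecky ecky", "2nd", "John-Bigboote")
fixed <- make.names(candidates)
fixed # "a" "B" "ecky.ecky" "X2nd" "John.Bigboote"
candidates == fixed # TRUE TRUE FALSE FALSE FALSE: only "a" and "B" were valid as written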
## R Classification
# source 3 and 6
# character vectors
HiWorld <- "Hello, World"
HiWorld
class(HiWorld) # character vector
?class # retrieve R help on the command
?character # retrieve R help on the command
A <- c("Dog", "Cat", "Pig")
A
print(A)
class(A) # character vector
# numeric vectors
C <- c(1, 4, 4, 9)
C
class(C) # numeric vector
?numeric # retrieve R help on the command
# matrix
M <- matrix(c(1, 4, 4, 9, 1.23, 4.32, 0.009, -1.0013), nrow = 2, ncol = 4, byrow = TRUE) # byrow = TRUE means that the matrix will be filled by rows
M
class(M) # matrix
?matrix # retrieve R help on the command
M1 <- matrix(c(1, 4, 4, 9, 1.23, 4.32, 0.009, -1.0013), nrow = 2, ncol = 4, byrow = TRUE, dimnames = list(c("row1", "row2"), c("M1", "M2", "M3", "M4"))) # byrow = TRUE means that the matrix will be filled by rows / dimnames provides either or both the row names and column names
M1
class(M1) # matrix
yy <- matrix(c(1, 3, 5, 7, 9, 2, 4, 6, 8, 10), nrow = 2, ncol = 5, byrow = TRUE)
yy
class(yy) # matrix
xx <- matrix(c(1, 4, 2, 5, 3, 6), nrow = 3, ncol = 2, byrow = TRUE)
xx
class(xx) # matrix
xxyy <- xx %*% yy # matrix multiplication of xx and yy / source 20
class(xxyy) # matrix
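# A small check of the dimension rule behind %*%: the number of columns of the left matrix must equal the number of rows of the right matrix, and the result has the rows of the left and the columns of the right. The matrices left_m and right_m are illustrative and hold the same values as xx and yy above.
left_m <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE) # 3 x 2
right_m <- matrix(c(1, 3, 5, 7, 9, 2, 4, 6, 8, 10), nrow = 2, ncol = 5, byrow = TRUE) # 2 x 5
ncol(left_m) == nrow(right_m) # TRUE, so the product is defined
prod_m <- left_m %*% right_m
dim(prod_m) # 3 5
prod_m[1, 1] # 5, which is row 1 of left_m times column 1 of right_m: 1 * 1 + 2 * 2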
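# Teach Yourself R
matrix(10, 3, 2) # 3 x 2 matrix with every element equal to 10
matrix(seq(1, 6), 3, 2) # filled by columns (the default); holds the same values as matrix(c(1, 2, 3, 4, 5, 6), 3, 2)
matrix(c(1, 2, 3), 3, 2) # the 3 values are recycled to fill both columns
matrix(seq(1, 6), 3, 2, byrow = TRUE) # filled by rows; holds the same values as matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE)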
# array
Ar <- array(-1:10, c(3,4))
Ar
class(Ar)
?array # retrieve R help on the command
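# The array above has only two dimensions, so it behaves like a matrix. As a hedged illustration, here is a three-dimensional array; the name arr3 and its values are made up for this sketch.
arr3 <- array(1:24, dim = c(2, 3, 4)) # 2 rows, 3 columns, 4 layers, filled column by column within each layer
dim(arr3) # 2 3 4
arr3[, , 1] # the first layer, which is a 2 x 3 matrix holding 1 to 6
arr3[1, 1, 2] # 7, the first element of the second layer
arr3[2, 3, 4] # 24, the last element
class(arr3) # array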
# Searching for functions in packages through the sos R package
# Note: this requires access to an Internet connection
# library(sos)
???pde # search for pde in all of the R packages
findFn("{partial differential equations}")
?findFn # retrieve R help on the command
## get -> clean -> explore -> visualize -> analyze
# If you are not in your working directory and you would like to either import or export a file, then you will need to make sure that the pathname can be read by R
# For example, if you want to read "mammals.exp" from "C:\Documents\mammals.exp", then in R you would change the \ to / so "C:/Documents/mammals.exp" is CORRECT in R
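# Two base R ways to arrive at a path that R can read, sketched with the example path from above. Inside an R string a single backslash must be typed as "\\", so win_path below holds C:\Documents\mammals.exp; the variable names are illustrative.
win_path <- "C:\\Documents\\mammals.exp" # the Windows-style path, with each backslash escaped
r_path <- gsub("\\", "/", win_path, fixed = TRUE) # replace every backslash with a forward slash
r_path # "C:/Documents/mammals.exp"
built_path <- file.path("C:", "Documents", "mammals.exp") # file.path() joins the pieces with "/"
built_path # "C:/Documents/mammals.exp"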
## Get and Clean
# http://waterdata.usgs.gov/nwis/dv?cb_all_00060_00010_00095_00300_00400=on&cb_00060=on&cb_00010=on&cb_00095=on&cb_00300=on&cb_00400=on&format=rdb&site_no=03428200&referred_module=sw&period=&begin_date=1972-07-20&end_date=2015-03-10
# http://waterdata.usgs.gov/nwis/inventory/?site_no=03428200&agency_cd=USGS
# USGS 03428200 WEST FORK STONES RIVER AT MURFREESBORO, TN
# 00060 Discharge (Mean) 1972-07-20 2015-03-10
# 00010 Temperature, water (Max.,Min.,Mean) 1986-02-06 2014-01-21
# 00095 Specific cond at 25C (Max.,Min.,Mean) 1986-02-06 2014-01-21
# 00300 Dissolved Oxygen (Max.,Min.,Mean) 1986-02-06 2014-01-21
# 00400 pH (Max.,Min.) 1986-02-06 2014-01-21
# read in the online NWIS table for 03428200 WEST FORK STONES RIVER AT MURFREESBORO, TN
WForkStonesRiver03428200 <- read.table("http://waterdata.usgs.gov/nwis/dv?cb_all_00060_00010_00095_00300_00400=on&cb_00060=on&cb_00010=on&cb_00095=on&cb_00300=on&cb_00400=on&format=rdb&site_no=03428200&referred_module=sw&period=&begin_date=1972-07-20&end_date=2015-03-10", header = TRUE, sep = "\t", stringsAsFactors = FALSE) # reads the large table from NWIS Web (> 1 MB)
WForkStonesRiver03428200 <- read.table("http://www.ecoccs.com/R_Tutorial/12_March_2015/West_Fork_Stones_River_03428200.rdb", header = TRUE, sep = "\t", stringsAsFactors = FALSE) # reads the same table stored online as a .rdb file (> 1 MB)
WForkStonesRiver03428200 <- read.table("http://www.ecoccs.com/R_Tutorial/12_March_2015/West_Fork_Stones_River_03428200.gz", header = TRUE, sep = "\t", stringsAsFactors = FALSE) # reads the same table stored online as a .gz file (compressed .rdb file)
# read the file in from your current working directory
WForkStonesRiver03428200 <- read.table("West_Fork_Stones_River_03428200.rdb", header = TRUE, sep = "\t", stringsAsFactors = FALSE) # read in the .rdb file saved in your working directory
WForkStonesRiver03428200 <- read.table("West_Fork_Stones_River_03428200.gz", header = TRUE, sep = "\t", stringsAsFactors = FALSE) # read in the .gz file saved in your working directory
# Either open http://waterdata.usgs.gov/nwis/dv?cb_all_00060_00010_00095_00300_00400=on&cb_00060=on&cb_00010=on&cb_00095=on&cb_00300=on&cb_00400=on&format=rdb&site_no=03428200&referred_module=sw&period=&begin_date=1972-07-20&end_date=2015-03-10
#
# or West_Fork_Stones_River_03428200.rdb
#
# to view the station metadata which will be useful in changing the column names
WForkStonesRiver03428200 <- WForkStonesRiver03428200[-1, ] # remove row 1
WForkStonesRiver03428200 <- WForkStonesRiver03428200[, -c(1:2)] # remove columns 1 and 2
names(WForkStonesRiver03428200)[1:3] <- c("Date", "Mean Discharge (cfs)", "Discharge Quality Control") # change the names of columns 1 - 3
names(WForkStonesRiver03428200) <- c("Date", "Mean Discharge (cfs)", "Discharge Quality Control", "Maximum Water Temp (°C)", "Maximum Water Temp Quality Control", "Minimum Water Temp (°C)", "Minimum Water Temp Quality Control", "Mean Water Temp (°C)", "Mean Water Temp Quality Control", "Maximum Water Specific Conductance (µS/cm at 25°C)", "Maximum Water Specific Conductance Quality Control", "Minimum Water Specific Conductance (µS/cm at 25°C)", "Minimum Water Specific Conductance Quality Control", "Mean Water Specific Conductance (µS/cm at 25°C)", "Mean Water Specific Conductance Quality Control",
"Maximum Water Dissolved Oxygen (mg/L)", "Maximum Water Dissolved Oxygen Quality Control", "Minimum Water Dissolved Oxygen (mg/L)", "Minimum Water Dissolved Oxygen Quality Control", "Mean Water Dissolved Oxygen (mg/L)", "Mean Water Dissolved Oxygen Quality Control",
"Maximum Water pH", "Maximum Water pH Quality Control", "Minimum Water pH", "Minimum Water pH Quality Control") # change the names of all columns
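# The cleaning steps above can be sketched on a tiny table that needs no Internet connection. Everything in rdb_text is made up for illustration (it only imitates the layout of an .rdb file: "#" comment lines, a header row, a column-format row, then data), and toy is a hypothetical object name.
rdb_text <- c(
  "# made-up comment line (skipped by read.table because it starts with #)",
  "agency_cd\tsite_no\tdatetime\tflow\tflow_cd",
  "5s\t15s\t20d\t14n\t10s",
  "USGS\t00000000\t2000-01-01\t10.5\tA",
  "USGS\t00000000\t2000-01-02\t12.0\tA"
)
toy <- read.table(text = rdb_text, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
toy <- toy[-1, ] # remove row 1 (the column-format row)
toy <- toy[, -c(1:2)] # remove columns 1 and 2
names(toy) <- c("Date", "Mean Discharge (cfs)", "Discharge Quality Control") # change the names of all columns
toy
# Because of the column-format row, every column is read in as character, which is why the conversions in the Explore section are needed.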
## Explore
str(WForkStonesRiver03428200) # Compactly Display the Structure of WForkStonesRiver03428200
class(WForkStonesRiver03428200[, 1]) # class of column 1 in WForkStonesRiver03428200 is character
WForkStonesRiver03428200[, 1] <- as.Date(WForkStonesRiver03428200[, 1]) # change from character to Date class
?as.Date # retrieve R help on the command
class(WForkStonesRiver03428200[, 1]) # Date class
head(WForkStonesRiver03428200[, 1]) # 1st 6 rows of column 1 in WForkStonesRiver03428200
class(WForkStonesRiver03428200[, 2]) # class of column 2 in WForkStonesRiver03428200 is character
WForkStonesRiver03428200[, 2] <- as.numeric(WForkStonesRiver03428200[, 2]) # change from character to numeric class
?as.numeric # retrieve R help on the command
head(WForkStonesRiver03428200[, 2]) # 1st 6 rows of column 2 in WForkStonesRiver03428200
class(WForkStonesRiver03428200[, 2]) # Numeric class
# view the first 6 rows
head(WForkStonesRiver03428200)[1:2] # view the first 6 rows of columns 1 and 2
# view the last 6 rows
tail(WForkStonesRiver03428200)[1:2] # view the last 6 rows of columns 1 and 2
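# A minimal sketch of what the class conversions above do, using made-up character values (raw_values and raw_dates are hypothetical). Text that cannot be read as a number becomes NA, and as.numeric() normally warns about it; suppressWarnings() is used here only to keep the output quiet.
raw_values <- c("12.5", "Ice", "") # made-up entries: one number, one word, one empty string
clean_values <- suppressWarnings(as.numeric(raw_values))
clean_values # 12.5 NA NA
sum(is.na(clean_values)) # 2 values could not be converted
raw_dates <- c("1972-07-20", "2015-03-10") # dates written as year-month-day text
clean_dates <- as.Date(raw_dates)
class(clean_dates) # Date
clean_dates[2] > clean_dates[1] # TRUE; once converted, dates can be compared and sorted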
## Visualize
# using base plot
plot(WForkStonesRiver03428200[, 1], WForkStonesRiver03428200[, 2], main = "Mean Discharge for West Fork Stones River", xlab = "Date", ylab = "Mean Discharge (cfs)")
?plot # retrieve R help on the command
# using quick plot from the ggplot2 package
# library(ggplot2)
qplot(WForkStonesRiver03428200[, 1], WForkStonesRiver03428200[, 2], main = "Mean Discharge for West Fork Stones River", xlab = "Date", ylab = "Mean Discharge (cfs)")
?qplot # retrieve R help on the command
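# A hedged variation on the base plot, sketched with made-up values so that it runs without the downloaded data: a line plot with a logarithmic y-axis, saved to a PDF file instead of shown on screen. The names toy_dates, toy_flow, and plot_file are hypothetical, and the file is written to R's temporary folder.
toy_dates <- seq(as.Date("2000-01-01"), by = "day", length.out = 5)
toy_flow <- c(10.5, 12.0, 250.0, 90.2, 30.1) # made-up discharge values
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file) # open a PDF graphics device
plot(toy_dates, toy_flow, type = "l", log = "y", main = "Toy example: made-up discharge values", xlab = "Date", ylab = "Mean Discharge (cfs)") # type = "l" draws a line; log = "y" uses a logarithmic y-axis
dev.off() # close the device so that the file is written
file.exists(plot_file) # TRUE once the plot has been saved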
## Export the cleaned up .rdb file as a spreadsheet with 2 sheets (sheet 1 is the cleaned up data set and sheet 2 contains the comments)
WForkStonesRiver03428200comments <- readLines(con = "http://waterdata.usgs.gov/nwis/dv?cb_all_00060_00010_00095_00300_00400=on&cb_00060=on&cb_00010=on&cb_00095=on&cb_00300=on&cb_00400=on&format=rdb&site_no=03428200&referred_module=sw&period=&begin_date=1972-07-20&end_date=2015-03-10") # reads the large table from NWIS Web (> 1 MB)
?readLines # retrieve R help on the command
WForkStonesRiver03428200comments <- readLines(con = "http://www.ecoccs.com/R_Tutorial/12_March_2015/West_Fork_Stones_River_03428200.rdb") # reads the large table stored online (> 1 MB)
# read the file in from your current working directory
WForkStonesRiver03428200comments <- readLines(con = "West_Fork_Stones_River_03428200.rdb") # reads the large table stored in your working directory
# Source for the following code: How to convert code to more readable form in R - Stack Overflow, answer by flodel, Sep 27 2014
# http://stackoverflow.com/questions/26070388/how-to-convert-code-to-more-readable-form-in-r/26071017?s=2|0.8541#26071017
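idx.comments <- grep("^[#]", WForkStonesRiver03428200comments) # positions of the lines that begin with "#"
WForkStonesRiver03428200comments <- WForkStonesRiver03428200comments[idx.comments] # keep only the comment lines
WForkStonesRiver03428200comments <- stringi::stri_replace_all_fixed(WForkStonesRiver03428200comments, "#", "") # remove the "#" characters; stri_replace_all_fixed() comes from the stringi package, which is not loaded by the package list above, so it is called with its namespace (stringi must be installed)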
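# The same idea can be sketched with base R only, on a few made-up lines (toy_lines is hypothetical). This is an alternative to the stringi call, not the method used above: it removes only the leading "#" and any spaces after it, instead of every "#" in the line.
toy_lines <- c("# Data provided by a made-up source", "#", "agency_cd\tsite_no", "USGS\t00000000")
toy_idx <- grep("^#", toy_lines) # positions of the lines that begin with "#"
toy_idx # 1 2
toy_meta <- sub("^#[[:space:]]*", "", toy_lines[toy_idx]) # strip the leading "#" and the spaces that follow it
toy_meta # "Data provided by a made-up source" ""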
# library(openxlsx)
input1 <- "USGS 03428200 West Fork Stones River at Murfreesboro, TN" # create the header row for sheet 1 of the spreadsheet
wb <- createWorkbook() # use R package openxlsx to create the .xlsx spreadsheet
addWorksheet(wb, "W Fork Stones River") # adds the worksheet with the name of W Fork Stones River
writeData(wb, "W Fork Stones River", input1, xy = c(1,1)) # writes the data to the workbook beginning in column 1, row 1
writeDataTable(wb, "W Fork Stones River", WForkStonesRiver03428200, xy = c(1,2)) # writes the data to the workbook beginning in column 1, row 2
setColWidths(wb, sheet = 1, cols = 1:ncol(WForkStonesRiver03428200), widths = "auto") # sets the column widths to auto for sheet 1
addWorksheet(wb, "meta") # adds the worksheet with the name of meta
writeData(wb, "meta", WForkStonesRiver03428200comments) # writes the data to the workbook
saveWorkbook(wb, "WForkStonesRiver03428200.xlsx", overwrite = TRUE)
## R Resources pages
# http://www.ecoccs.com/RandUSGS.html
# Created by Irucka Embry for the USGS (wealth of useful links)
# http://www.ecoccs.com/tsuresearch.html
# Created by Irucka Embry for TSU (wealth of useful links)
# Resources used for this Tutorial
# source 1
# http://cnx.org/content/col10325/1.18/
# Freshman Engineering Problem Solving with MATLAB
# Darryl Morrell, Arizona State University at the Polytechnic Campus, EGR 294, Apr 23, 2007
# pages 9-11
# source 2
# http://www.lisa.stat.vt.edu/sites/default/files/Using_R_for_Your_Basic_Statistical_Needs.r
# Using R for Your Basic Statistical Needs
# LISA Short-Course
# Nels Johnson, LISA Collaborator, Department of Statistics, November 15 and 16, 2010
# source 3
# http://science.nature.nps.gov/im/datamgmt/statistics/r/rcourse/index.cfm
# R For Natural Resources Course (Spring 2013)
# source 4
# http://en.wikibooks.org/wiki/R_Programming/Working_with_data_frames
# R Programming/Working with data frames - Wikibooks, open books for an open world
# source 5
# http://www.statmethods.net/interface/workspace.html
# Quick-R: The Workspace
# source 6
# http://scc.stat.ucla.edu/page_attachments/0000/0139/reg_1.pdf
# Regression in R: Part I: Simple Linear Regression by Denise Ferrari & Tiffany Head, UCLA Department of Statistics Statistical Consulting Center, Feb 10, 2010
# source 7
# https://support.rstudio.com/hc/en-us/articles/200711843-Working-Directories-and-Workspaces
# Working Directories and Workspaces by Josh Paulson, October 12, 2013 (RStudio Support)
# source 8
# https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects
# Using Projects by Josh Paulson, October 11, 2013 (RStudio Support)
# source 9
# http://www.r-bloggers.com/select-operations-on-r-data-frames/
# R-bloggers: Select operations on R data frames By Chris, July 26, 2009
# source 10
# http://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right
# How to join data frames in R (inner, outer, left, right)? - Stack Overflow
# source 11
# http://statmethods.net/management/sorting.html
# Quick-R: Sorting Data
# source 12
# http://stackoverflow.com/questions/1296646/how-to-sort-a-dataframe-by-columns-in-r
# sorting - How to sort a dataframe by column(s) in R - Stack Overflow
# source 13
# http://stackoverflow.com/questions/13438556/how-do-i-copy-and-paste-data-into-r
# How do I copy and paste data into R - Stack Overflow
# source 14
# http://r.789695.n4.nabble.com/quantiles-and-dataframe-td834739.html
# R help - quantiles and dataframe
# source 15
# http://www.statmethods.net/graphs/scatterplot.html
# Scatterplots
# source 16
# http://stackoverflow.com/questions/18382883/what-is-the-right-way-to-multiply-data-frame-by-vector
# r - What is the right way to multiply data.frame by vector? - Stack Overflow
# source 17
# http://stackoverflow.com/questions/10324515/excel-like-column-operations-in-r-dataframe
# data.frame - Excel like column operations in R dataframe - Stack Overflow
# source 18
# http://usepa.github.io/introR/
# USEPA Introduction To R
# Jeff Hollister
# source 19
# https://districtdatalabs.silvrback.com/intro-to-r-for-microsoft-excel-users
# How to Transition from Excel to R: An Intro to R for Microsoft Excel Users
# Tony Ojeda
# source 20
# http://www.math.umaine.edu/~hiebeler/comp/matlabR.html
# MATLAB® / R Reference
# David Hiebeler, June 24, 2014