Common Microsoft Excel Operations Performed in R

Author

Irucka Embry, EIT (EcoC²S) [https://www.ecoccs.com]

Published

28 November 2025



Introduction

The following document is modeled after the article by Sunil Ray (Refer to the Works Cited section). The following are the 10 Microsoft Excel operations that Ray performed:

  1. Vlookup()

  2. CONCATENATE()

  3. LEN()

  4. LOWER(), UPPER() and PROPER()

  5. TRIM()

  6. If()

  7. Pivot Table

  8. Creating Charts (Data Visualization)

  9. Remove duplicate values

  10. Text to Columns



Replicate the R code

Install the packages

Note: If you wish to replicate the R code below, then you will need to copy and paste the following command in R first to make sure you have all the packages and their dependencies installed:

install.packages(c("install.load", "rando", "randomNames", "generator", "randtoolbox",
    "stringi", "ggplot2", "data.table", "maps", "pivottabler", "openxlsx", "round",
    "pander", "basictabler", "magrittr", "DT", "sf", "USA.state.boundaries", "flextable",
    "strex"))
# install the packages and their dependencies



Load the packages

Note: If you wish to replicate the R code below, then you will need to copy and paste the following commands in R first to make sure you have all the packages loaded:

install.load::load_package("randomNames", "generator", "randtoolbox", "stringi",
    "ggplot2", "data.table", "maps", "stats", "pivottabler", "openxlsx", "round",
    "pander", "basictabler", "magrittr", "DT", "sf", "USA.state.boundaries", "flextable",
    "strex")
# load needed packages using the load_package function from the install.load
# package (it is assumed that you have already installed these packages)



# set the pander options
panderOptions("table.continues", "")
panderOptions("table.caption.prefix", "")
panderOptions("missing", "")
# set pander options to have missing values in the tables as blank cells



Create the Data Set with 1000 Fake Policies

set.seed(1000)
# makes the example reproducible


# Create the fake data sets to be used in these examples

# create a data.table for the Policy data set
policy <- data.table(Customer = 1:1000, Policy = paste0(as.numeric(gsub("-", "",
    (r_national_identification_numbers(1000)))), stri_rand_strings(1000, 3, pattern = "[A-Z]")),
    Product = (as.numeric(r_phone_numbers(1000))), Premium = round_r3(sample(c(congruRand(1000) *
        15073, congruRand(1000) * 17158, congruRand(1000) * 19710), 1000, replace = TRUE),
        d = 2), City = sample(us.cities$name, 1000))


# change the column names
setnames(policy, c("Customer ID", "Policy ID", "Product ID", "Premium ($)", "City"))


# correct spelling of City name
change_column <- "City"
for (col in change_column) set(policy, i = which(grepl("Cincinati", policy[[col]])),
    j = col, value = "Cincinnati")  # Source 1


# display the first 5 customers
pander(head(policy))
Customer ID Policy ID Product ID Premium ($) City
1 679259874WJO 3.693e+09 17242 Fullerton CA
2 398734972ZAT 5.471e+09 12097 Pawtucket RI
3 313512130KKG 8.726e+09 6753 Victorville CA
4 150571626VHB 4.525e+09 2202 Quincy MA
5 443391456VOC 7.149e+09 16148 Richmond VA
6 384885628ZSK 2.312e+09 7847 Huntington WV
# display the last 5 customers
pander(tail(policy))
Customer ID Policy ID Product ID Premium ($) City
995 249325153DDN 2.434e+09 19199 Coeur d’Alene ID
996 565883197KXH 9.242e+09 12415 Cleveland OH
997 649838447HGH 8.313e+09 9388 Seattle WA
998 445305026QNH 3.968e+09 9250 DeKalb IL
999 838201152HIZ 5.376e+09 5548 Surprise AZ
1000 292873799CKE 3.847e+09 7238 Poway CA
# Customer data set create genders character vector
genders <- c("Female", "Male")


# create Gender character vector sampled from genders
Gender <- sample(genders, 1000, replace = TRUE)


# create random names based on the gender giving only the first names
Name <- randomNames(gender = Gender, which.names = "first")


# create a data.table for the Customer data set
customer <- data.table(Customer = 1:1000, Name = Name, Gender = Gender, Age = sample(1:120,
    1000, replace = TRUE), City = policy$City, Vintage = sample(1:720, 1000, replace = TRUE))


# change the column names
setnames(customer, c("Customer ID", "Name", "Gender", "Age", "City", "Vintage (in months)"))


# display the first 5 customers
pander(head(customer))
Customer ID Name Gender Age City Vintage (in months)
1 Devin Male 19 Fullerton CA 346
2 Shaahida Female 75 Pawtucket RI 702
3 Jordan Male 67 Victorville CA 627
4 Krista Female 102 Quincy MA 317
5 Abigail Female 14 Richmond VA 713
6 Joan Female 99 Huntington WV 645
# display the last 5 customers
pander(tail(customer))
Customer ID Name Gender Age City Vintage (in months)
995 Rachel Female 30 Coeur d’Alene ID 650
996 Latasha Female 76 Cleveland OH 622
997 Leah Frances Female 97 Seattle WA 589
998 Ju-Young Female 72 DeKalb IL 310
999 Haafil Male 22 Surprise AZ 96
1000 Anerae Female 8 Poway CA 615
# write the policy and customer data.tables to a .xlsx workbook to be read back
# in use the R package openxlsx to create the .xlsx spreadsheet

## Lists elements are written to individual worksheets, using list names as
## sheet names if available
l <- list(Policy = policy, Customer = customer)

## different sheets can be given different parameters
write.xlsx(l, "Fake-User-Data.xlsx")



# read back in the worksheets from the workbook and test for identicalness read
# in the policy worksheet (sheet 1)
policy1 <- read.xlsx("Fake-User-Data.xlsx", sheet = 1)

# make policy1 a data.table
policy1 <- setDT(policy1)


# change the column names of policy1
setnames(policy1, names(policy1), names(policy))


# changing column to integer class
change_class1 <- "Customer ID"
for (col in change_class1) set(policy1, j = col, value = as.integer(policy1[[col]]))  # Source 1


# Are policy and policy1 identical?
all(policy == policy1)
[1] TRUE
# read in the customer worksheet (sheet 2)
customer1 <- read.xlsx("Fake-User-Data.xlsx", sheet = 2)


# make customer1 a data.table
customer1 <- setDT(customer1)


# change the column names of customer1
setnames(customer1, names(customer1), names(customer))


# changing column to integer class
change_class2 <- c("Customer ID", "Age", "Vintage (in months)")
for (col in change_class2) set(customer1, j = col, value = as.integer(customer1[[col]]))  # Source 1


# Are customer and customer1 identical?
all(customer == customer1)
[1] TRUE



Perform the data analysis

1) Vlookup() Example 1

# make a new column named Key which contains the first digit of the Product ID
policy[, Key := as.integer(stri_extract_first_regex(policy$`Product ID`, "[1-9]"))]

# set the key of policy
setkey(policy, Key)

# display the first 5 Policies
pander(head(policy))
  Customer ID Policy ID Product ID Premium ($) City
1 7 454536322GIG 1.869e+09 5709 West Babylon NY
1 19 538684138QQD 1.268e+09 9522 Columbia MD
1 24 406299389SPF 1.835e+09 11192 Odessa TX
1 45 627835714CWY 1.799e+09 9166 Lincoln NE
1 53 756277121DFH 1.878e+09 1524 Country Club FL
1 64 702156477GIV 1.362e+09 15446 Chino Hills CA
# create the key_table to be used in the lookup
key_table <- data.table(Key = 1:9, Category = LETTERS[1:9])


# display the key_table
pander(key_table)
Key Category
1 A
2 B
3 C
4 D
5 E
6 F
7 G
8 H
9 I
# set the key of key_table
setkey(key_table, Key)




# merge the policy and key_table data.tables thus doing a VLOOKUP()
policy <- policy[key_table]


# remove the unneeded Key column
policy[, Key := NULL]


# set the key of key_table
setkeyv(policy, "Customer ID")


# change the column name
setnames(policy, ncol(policy), "Product Category")


# change the column order
setcolorder(policy, c(1:3, 6, 4:5))


# display the first 5 Policies
pander(head(policy))
Policy ID Product ID Product Category Premium ($) City
679259874WJO 3.693e+09 C 17242 Fullerton CA
398734972ZAT 5.471e+09 E 12097 Pawtucket RI
313512130KKG 8.726e+09 H 6753 Victorville CA
150571626VHB 4.525e+09 D 2202 Quincy MA
443391456VOC 7.149e+09 G 16148 Richmond VA
384885628ZSK 2.312e+09 B 7847 Huntington WV


1) Vlookup() Example 2

# add in the Latitude, Longitude, State Abbreviation, and State Region for each
# City in the policy data.table

# create the City data.table with the US city names and respective Latitude and
# Longitude
City <- data.table(City = us.cities$name, Latitude = us.cities$lat, Longitude = us.cities$long)



# get the State_names as a character vector
State_names <- str_after_last(getElement(City, "City"), " ")


# add the State abbreviation to the City data.table
City[, State := State_names]


# display the first 5 Cities
pander(head(City))
City Latitude Longitude State
Abilene TX 32.45 -99.74 TX
Akron OH 41.08 -81.52 OH
Alameda CA 37.77 -122.3 CA
Albany GA 31.58 -84.18 GA
Albany NY 42.67 -73.8 NY
Albany OR 44.62 -123.1 OR
# set the key of City
setkey(City, City)



# set the key of policy
setkey(policy, City)


# merge the policy and city data.tables thus doing a VLOOKUP()
policy <- City[policy]


# display the first 5 Policies
pander(head(policy))
  Latitude Longitude State Customer ID Policy ID
Abilene TX 32.45 -99.74 TX 900 882492748NOG
Akron OH 41.08 -81.52 OH 766 883527884MYY
Alameda CA 37.77 -122.3 CA 164 727715324KFW
Albany GA 31.58 -84.18 GA 713 259651667ONC
Albany NY 42.67 -73.8 NY 189 949624664QCF
Albany OR 44.62 -123.1 OR 632 308656677PPI
  Product ID Product Category Premium ($)
Abilene TX 3.743e+09 C 1971
Akron OH 8.696e+09 H 6363
Alameda CA 8.43e+09 H 4275
Albany GA 7.533e+09 G 10324
Albany NY 3.787e+09 C 6196
Albany OR 9.22e+09 I 16547
# change the column order
setcolorder(policy, c(1:6, 9, 7:8))


# display the first 5 Policies
pander(head(policy))
  Latitude Longitude State Customer ID Policy ID
Abilene TX 32.45 -99.74 TX 900 882492748NOG
Akron OH 41.08 -81.52 OH 766 883527884MYY
Alameda CA 37.77 -122.3 CA 164 727715324KFW
Albany GA 31.58 -84.18 GA 713 259651667ONC
Albany NY 42.67 -73.8 NY 189 949624664QCF
Albany OR 44.62 -123.1 OR 632 308656677PPI
  Premium ($) Product ID Product Category
Abilene TX 1971 3.743e+09 C
Akron OH 6363 8.696e+09 H
Alameda CA 4275 8.43e+09 H
Albany GA 10324 7.533e+09 G
Albany NY 6196 3.787e+09 C
Albany OR 16547 9.22e+09 I
# create region to be merged with the policy table
region <- data.table(State = state.fips$abb, Region = state.fips$region)

# display the first 5 Regions
pander(head(region))
State Region
AL 3
AZ 4
AR 3
CA 4
CO 4
CT 1
# create addregion to include Alaska and Hawai'i as region only includes 48
# states
addregion <- data.table(State = c("AK", "HI"), Region = 5:6)

# display addregion
pander(addregion)
State Region
AK 5
HI 6
# create a list of region and addregion to effectively combine the data.tables
# by row
l <- list(region, addregion)


# the new region now has 2 additional rows
region <- rbindlist(l)


# remove duplicate State abbreviations
region <- unique(region)

# display the first 5 Regions
pander(head(region))
State Region
AL 3
AZ 4
AR 3
CA 4
CO 4
CT 1
# set the key of Region
setkey(region, Region)


# create region_code for the VLOOKUP() with region
region_code <- data.table(Region = 1:6, Region1 = c("East", "Central", "Mid West",
    "West", "Alaska", "Hawai'i"))

# display region_code
pander(region_code)
Region Region1
1 East
2 Central
3 Mid West
4 West
5 Alaska
6 Hawai’i
# set the key of region_code
setkey(region_code, Region)


# merge the region and region_code data.tables thus doing a VLOOKUP()
region <- region[region_code]


# set the key of Region
setkey(region, State)


# set the key of policy
setkey(policy, State)


# merge the policy and city data.tables thus doing a VLOOKUP()
policy <- policy[region]


# display the first 5 Policies
pander(head(policy))
  City Latitude Longitude Customer ID Policy ID
AK Anchorage AK 61.18 -149.2 532 384475938AMY
AK Juneau AK 58.3 -134.4 795 837561119EAL
AL Auburn AL 32.59 -85.48 803 980378167CVE
AL Birmingham AL 33.53 -86.8 752 473542814SEE
AL Decatur AL 34.57 -86.99 919 580577016YOF
AL Dothan AL 31.24 -85.41 347 286944266ECV
  Premium ($) Product ID Product Category Region Region1
AK 9017 6.287e+09 F 5 Alaska
AK 8370 2.632e+09 B 5 Alaska
AL 7576 8.526e+09 H 3 Mid West
AL 18330 7.461e+09 G 3 Mid West
AL 13279 3.649e+09 C 3 Mid West
AL 936.2 8.434e+09 H 3 Mid West
# remove the Region column
policy[, Region := NULL]


# change the column name
setnames(policy, "Region1", "Region")


# change the column order
setcolorder(policy, c(5:9, 1, 4, 10, 2:3))


# display the first 5 Policies
pander(head(policy))
  Customer ID Policy ID Premium ($) Product ID
AK 532 384475938AMY 9017 6.287e+09
AK 795 837561119EAL 8370 2.632e+09
AL 803 980378167CVE 7576 8.526e+09
AL 752 473542814SEE 18330 7.461e+09
AL 919 580577016YOF 13279 3.649e+09
AL 347 286944266ECV 936.2 8.434e+09
  Product Category City Region Latitude Longitude
AK F Anchorage AK Alaska 61.18 -149.2
AK B Juneau AK Alaska 58.3 -134.4
AL H Auburn AL Mid West 32.59 -85.48
AL G Birmingham AL Mid West 33.53 -86.8
AL C Decatur AL Mid West 34.57 -86.99
AL H Dothan AL Mid West 31.24 -85.41



2) CONCATENATE()

# create 2 new columns
policy[, `:=`(Host = rep("https://www.customerunknown.com", nrow(policy)), Request = policy$"Policy ID")]


# concatenate the Host and Request columns into the URL column with '/'
# separators and extra space padding using the paste0 function
policy[, `:=`(URL, paste0("       ", Host, "/", Request, "/", "    "))]


# remove the Host and Request columns
policy[, c("Host", "Request") := NULL]


# display the first 5 Policies
pander(head(policy))
  Customer ID Policy ID Premium ($) Product ID
AK 532 384475938AMY 9017 6.287e+09
AK 795 837561119EAL 8370 2.632e+09
AL 803 980378167CVE 7576 8.526e+09
AL 752 473542814SEE 18330 7.461e+09
AL 919 580577016YOF 13279 3.649e+09
AL 347 286944266ECV 936.2 8.434e+09
  Product Category City Region Latitude Longitude
AK F Anchorage AK Alaska 61.18 -149.2
AK B Juneau AK Alaska 58.3 -134.4
AL H Auburn AL Mid West 32.59 -85.48
AL G Birmingham AL Mid West 33.53 -86.8
AL C Decatur AL Mid West 34.57 -86.99
AL H Dothan AL Mid West 31.24 -85.41
  URL
AK https://www.customerunknown.com/384475938AMY/
AK https://www.customerunknown.com/837561119EAL/
AL https://www.customerunknown.com/980378167CVE/
AL https://www.customerunknown.com/473542814SEE/
AL https://www.customerunknown.com/580577016YOF/
AL https://www.customerunknown.com/286944266ECV/



3) LEN()

# gives the number of characters of the first 5 URLs
head(nchar(policy$URL))
[1] 56 56 56 56 56 56



4) LOWER(), UPPER() and PROPER()

# transform the first 5 Customer Names to lowercase
head(stri_trans_tolower(customer$Name))
[1] "devin"    "shaahida" "jordan"   "krista"   "abigail"  "joan"    
# transform the first 5 City Names to uppercase
head(stri_trans_toupper(customer$City))
[1] "FULLERTON CA"   "PAWTUCKET RI"   "VICTORVILLE CA" "QUINCY MA"     
[5] "RICHMOND VA"    "HUNTINGTON WV" 
# transform the first 5 URLs to Title Case
head(stri_trans_totitle(policy$URL))
[1] "       Https://Www.customerunknown.com/384475938amy/    "
[2] "       Https://Www.customerunknown.com/837561119eal/    "
[3] "       Https://Www.customerunknown.com/980378167cve/    "
[4] "       Https://Www.customerunknown.com/473542814see/    "
[5] "       Https://Www.customerunknown.com/580577016yof/    "
[6] "       Https://Www.customerunknown.com/286944266ecv/    "



5) TRIM()

# the URL column in Policy has extra spaces on both sides (left and right).
# Remove the spacing from both sides.
policy[, URL := stri_trim_both(policy$URL, pattern = "\\P{Wspace}")]


# display the first 5 Policies
pander(head(policy))
  Customer ID Policy ID Premium ($) Product ID
AK 532 384475938AMY 9017 6.287e+09
AK 795 837561119EAL 8370 2.632e+09
AL 803 980378167CVE 7576 8.526e+09
AL 752 473542814SEE 18330 7.461e+09
AL 919 580577016YOF 13279 3.649e+09
AL 347 286944266ECV 936.2 8.434e+09
  Product Category City Region Latitude Longitude
AK F Anchorage AK Alaska 61.18 -149.2
AK B Juneau AK Alaska 58.3 -134.4
AL H Auburn AL Mid West 32.59 -85.48
AL G Birmingham AL Mid West 33.53 -86.8
AL C Decatur AL Mid West 34.57 -86.99
AL H Dothan AL Mid West 31.24 -85.41
  URL
AK https://www.customerunknown.com/384475938AMY/
AK https://www.customerunknown.com/837561119EAL/
AL https://www.customerunknown.com/980378167CVE/
AL https://www.customerunknown.com/473542814SEE/
AL https://www.customerunknown.com/580577016YOF/
AL https://www.customerunknown.com/286944266ECV/



6) If()

# if the Premium > $5000, then it's High and if not, then it's Low
policy[, Category := ifelse(policy$"Premium ($)" > 5000, "High", "Low")]


# change the column order
setcolorder(policy, c(1:5, 12, 6:11))


# display the first 5 Policies
pander(head(policy))
  Customer ID Policy ID Premium ($) Product ID
AK 532 384475938AMY 9017 6.287e+09
AK 795 837561119EAL 8370 2.632e+09
AL 803 980378167CVE 7576 8.526e+09
AL 752 473542814SEE 18330 7.461e+09
AL 919 580577016YOF 13279 3.649e+09
AL 347 286944266ECV 936.2 8.434e+09
  Product Category Category City Region Latitude
AK F High Anchorage AK Alaska 61.18
AK B High Juneau AK Alaska 58.3
AL H High Auburn AL Mid West 32.59
AL G High Birmingham AL Mid West 33.53
AL C High Decatur AL Mid West 34.57
AL H Low Dothan AL Mid West 31.24
  Longitude URL
AK -149.2 https://www.customerunknown.com/384475938AMY/
AK -134.4 https://www.customerunknown.com/837561119EAL/
AL -85.48 https://www.customerunknown.com/980378167CVE/
AL -86.8 https://www.customerunknown.com/473542814SEE/
AL -86.99 https://www.customerunknown.com/580577016YOF/
AL -85.41 https://www.customerunknown.com/286944266ECV/



7) Pivot Table

# melt policy using reshape
policy_pt <- melt(policy, id.vars = c(9, 5), measure.vars = 3)


# cast policy_pt using reshape
policy_pt <- dcast(policy_pt, Region ~ `Product Category`, sum, fill = NA)  # Sources 4 and 5


# obtain the row sums
policy_pt[, "Grand Total" := rowSums(.SD, na.rm = TRUE), .SDcols = 2:ncol(policy_pt)]


# compute the column sums
addpt <- policy_pt[, lapply(.SD, sum, na.rm = TRUE), .SDcols = 2:ncol(policy_pt)]


# create a new column with the name of 'Grand Total'
addpt[, `Grand Total2` := "Grand Total"]


# change the column order
setcolorder(addpt, c(ncol(addpt), 1:(ncol(addpt) - 1L)))


# create a list of policy_pt and addpt in order to combine them by row
l1 <- list(policy_pt, addpt)


# combine the list
policy_pt <- rbindlist(l1, fill = TRUE)


# display the pivot table created using data.table
pander(policy_pt)
Region A B C D E F G
Alaska 8370 9017
Central 159975 116100 286302 197884 228935 153537 224985
East 173858 161561 231200 118963 124928 99298 128164
Hawai’i 9613 4632
Mid West 218196 313187 380994 342939 294637 372498 267178
West 314655 303071 250713 271445 310124 303904 421518
866683 902290 1158821 931231 958624 938254 1046476
H I Grand Total Grand Total2
17387
205900 180277 1753894
113953 128149 1280072
14246
251895 209061 2650585
305813 271242 2752485
877561 788728 8468669 Grand Total
# or using pivottabler

ptt <- PivotTable$new()
ptt$addData(policy)
ptt$addColumnDataGroups("Product Category")
ptt$addRowDataGroups("Region")
ptt$defineCalculation(calculationName = "Premium ($)", summariseExpression = "sum(`Premium ($)`)")
ptt$renderPivot()
# convert to flextable to modify the formatting of the numeric data convert to
# a basictabler table first
tbll <- ptt$asBasicTable()

# now convert to a flextable
ftt <- tbll$asFlexTable()

# change the numeric formatting
ftt <- flextable::colformat_double(ftt, big.mark = ",", digits = 2, na_str = "N/A")

# autofit the table
ftt <- flextable::autofit(ftt)

# display the pivot table created using flextable
ftt

A

B

C

D

E

F

G

H

I

Total

Alaska

8370.4

9016.87

17387.27

Central

159974.75

116100.09

286301.5

197883.55

228935.48

153537

224984.84

205900.2

180276.67

1753894.08

East

173857.9

161561.16

231199.79

118963.03

124927.69

99297.65

128163.66

113952.73

128148.62

1280072.23

Hawai'i

9613.39

4632.22

14245.61

Mid West

218196.15

313186.58

380993.97

342939.16

294636.58

372498.39

267177.58

251895.44

209060.84

2650584.69

West

314654.69

303071.32

250712.51

271445.17

310124.31

303904.03

421518.06

305812.78

271242.16

2752485.03

Total

866683.49

902289.55

1158821.16

931230.91

958624.06

938253.94

1046476.36

877561.15

788728.29

8468668.91

# the object ptt can be exported as a Microsoft Excel pivot table as well
wbpt <- createWorkbook()
addWorksheet(wbpt, "Pivot Table")
ptt$writeToExcelWorksheet(wb = wbpt, wsName = "Pivot Table", topRowNumber = 1, leftMostColumnNumber = 1,
    outputValuesAs = "rawValue", applyStyles = TRUE, mapStylesFromCSS = TRUE)
saveWorkbook(wbpt, file = "Policy-pivot-table.xlsx", overwrite = TRUE)



8) A) Creating Charts (Data Visualization)

# create a bar chart based on the pivot table

# remove the last row
policy_ptuse <- policy_pt[-nrow(policy_pt)]


# remove the last column
policy_ptuse[, ncol(policy_pt) := NULL]


# melt to make it easier to plot with ggplot2
policy_ptmelt <- melt(policy_ptuse, id.vars = "Region", measure.vars = LETTERS[1:9])
setnames(policy_ptmelt, 2, "Legend")


# plot with ggplot2
ggplot(policy_ptmelt, aes(x = Region, y = value, fill = Legend)) + geom_bar(stat = "identity",
    width = 1, position = position_dodge(width = 0.9)) + labs(title = "Total Premium Cost by Region",
    x = "Region", y = "Premium Cost ($)") + scale_fill_brewer(palette = "Spectral") +
    theme_dark()  # Source 5



8) B) Creating a Customer Location Map



9) Remove duplicate values

# The quickest way to remove duplicate values in R is to use unique

# the dimensions of policy (number of rows by number of columns)
dim(policy)
[1] 1000   12
# display the duplicated rows in policy
head(duplicated(policy))
[1] FALSE FALSE FALSE FALSE FALSE FALSE
# the dimensions of the unique values of policy (with duplicated values
# removed)
dim(unique(policy))
[1] 1000   12



10) Text to Columns

# the text used in the article can be converted from text to columns directly
# using the fread function from the data.table package


pander(fread("A001; 2345; North; A
A002; 3423; North; B
A003; 2124; South; C
A004; 5134; West; D
A004; 4321; East; D
A005; 7653; South; C",
    header = FALSE, sep = ";"))
V1 V2 V3 V4
A001 2345 North A
A002 3423 North B
A003 2124 South C
A004 5134 West D
A004 4321 East D
A005 7653 South C



Sources used in the R code

Source 1
r - Convert column classes in data.table - Stack Overflow answered by Matt Dowle on Dec 27 2013. https://stackoverflow.com//questions/7813578/convert-column-classes-in-data-table

Source 2
r - data.table operations by column name with spaces fails - Stack Overflow answered by grrgrrbla on Jun 10 2015 and edited by Arun on Jun 10 2015. https://stackoverflow.com/questions/30757316/data-table-operations-by-column-name-with-spaces-fails

Source 3
2657 Productions. Using the reshape package in R for pivot-table-like functionality, 8 Aug 2010, http://news.mrdwab.com/post/reshape-package/

Source 4
data.table vignette: Efficient reshaping using data.tables, See https://rdatatable.gitlab.io/data.table/articles/datatable-reshape.html

Source 5
r - ggplot side by side geom_bar() - Stack Overflow answered by David Arenburg on Jul 31 2014 & edited by David Arenburg on Aug 1 2014. See https://stackoverflow.com/questions/25070547/ggplot-side-by-side-geom-bar.

Source 6
How to map data with R: A hands-on tutorial to get you to start creating maps with R. By Abhinav Malasi, Jun 29, 2021. See https://medium.com/geekculture/how-to-map-data-with-r-8333110dff5b

Source 7
R Error in x$ed : $ operator is invalid for atomic vectors - Stack Overflow answered & edited by Jilber Urbina on Apr 25, 2014. See https://stackoverflow.com/questions/23299684/r-error-in-xed-operator-is-invalid-for-atomic-vectors



Works Cited

Sunil Ray, Analytics Vidhya, “Simple Yet Powerful Excel Tricks for Analyzing Data,” November 3, 2015, https://www.analyticsvidhya.com/blog/2015/11/excel-tips-tricks-data-analysis/, Accessed: November 11, 2016.



Donations accepted with Liberapay

If you would like to contribute to the continued development of Irucka Embry’s R packages and/or Irucka Embry’s R Examples, please feel free to donate via the link below:

Please feel free to review Irucka Embry (iaembry)’s profile on Liberapay.