4.8 Additional resources
rio alternatives. While rio is a great Swiss Army knife of file handling, there may be times when you want a bit more control over how your data is pulled into or saved out of R. In addition, there have been times when I’ve had a challenging data file that rio choked on but another package could handle it. Some other functions and packages you may want to explore:
Base R’s read.csv() and read.table() to import text files (use ?read.table to get more information). In versions of R before 4.0.0, stringsAsFactors = FALSE is needed with these if you want to keep your character strings as character strings; from R 4.0.0 on, that is the default. write.csv() will save to CSV.
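As a minimal sketch of the base R round trip, the example below writes a small data frame to a temporary CSV file and reads it back. The data frame, its column names, and the temporary file are made up for illustration; stringsAsFactors = FALSE is set explicitly so the result is the same on any R version.

```r
# Hypothetical sample data, used only for illustration
original <- data.frame(
  name  = c("Ann", "Bob"),
  score = c(90.5, 82),
  stringsAsFactors = FALSE
)

# Save to a temporary CSV; row.names = FALSE leaves out the row-number column
tmp <- tempfile(fileext = ".csv")
write.csv(original, tmp, row.names = FALSE)

# Read it back, keeping character columns as character strings
imported <- read.csv(tmp, stringsAsFactors = FALSE)
str(imported)
```

Without row.names = FALSE, write.csv() adds an extra first column of row names, which then comes back as an unwanted column when the file is re-imported.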
rio uses Hadley Wickham’s readxl package for reading Excel files. Another alternative for Excel is openxlsx, which can write to an Excel file as well as read one. Look at the openxlsx package vignettes for information about formatting your spreadsheets as you export.
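A short sketch of reading and writing with openxlsx, assuming the package is installed. The data frame and the temporary file path are hypothetical; write.xlsx() and read.xlsx() are the package's basic export and import functions, and the formatting options are covered in its vignettes rather than here.

```r
library(openxlsx)

# Hypothetical sample data, used only for illustration
original <- data.frame(
  item  = c("pens", "paper"),
  count = c(12, 5),
  stringsAsFactors = FALSE
)

# Write the data frame to a temporary Excel workbook
tmp <- tempfile(fileext = ".xlsx")
write.xlsx(original, tmp)

# Read the first worksheet back into a data frame
imported <- read.xlsx(tmp, sheet = 1)
str(imported)
```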
Wickham’s readr package is also worth a look as part of the “tidyverse.” readr includes functions to read CSV, tab-separated, fixed-width, Web logs, and several other types of files. readr prints out the type of data it has determined for each column – integer, character, double (non-whole numbers), etc. It creates tibbles.
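To see the column-type reporting in practice, here is a small sketch that assumes readr is installed. The CSV contents and column names are invented for the example. read_csv() reports the column types it guessed when it runs, and spec() shows them again afterward.

```r
library(readr)

# Create a tiny CSV to import (made-up rows, for illustration only)
tmp <- tempfile(fileext = ".csv")
writeLines(c("label,value", "a,1.5", "b,2.25"), tmp)

# read_csv() returns a tibble and reports the type chosen for each column
samples <- read_csv(tmp)

# Show the column specification again, then confirm the object is a tibble
spec(samples)
class(samples)
```

If a guess is wrong, read_csv() accepts a col_types argument for setting column types explicitly.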
Import directly from a Google spreadsheet. The googlesheets package lets you import data from a Google Sheet, even if it’s private, by authenticating your Google account. The package is available on CRAN; install it with install.packages("googlesheets"). After loading it with library("googlesheets"), read the excellent introductory vignette. At the time of this writing, the intro vignette was available within R at vignette("basic-usage", package="googlesheets"). If you don’t see it, try help(package="googlesheets") and click on the “User guides, package vignettes and other documentation” link for available vignettes, or look at the package information on GitHub at https://github.com/jennybc/googlesheets.
Alternatives to base R’s save and read functions. If you are working with large data sets, speed may become important to you when saving and loading files. The data.table package has a speedy fread() function, but beware that the resulting objects are data.tables and not plain data frames; some behaviors are different. If you want a conventional data frame, you can get one with the as.data.frame(mydatatable) syntax. The data.table package’s fwrite() function is designed to write to a CSV file considerably faster than base R’s write.csv().
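The following sketch shows the fwrite(), fread(), and as.data.frame() steps together, assuming data.table is installed. The sample data and the name mydatatable are placeholders.

```r
library(data.table)

# Write a small, made-up data frame to a temporary CSV with fwrite()
tmp <- tempfile(fileext = ".csv")
fwrite(data.frame(id = 1:3, label = c("a", "b", "c")), tmp)

# fread() returns a data.table, which is also a data frame
mydatatable <- fread(tmp)
class(mydatatable)   # "data.table" "data.frame"

# Convert to a conventional data frame when plain data frame behavior is needed
mydf <- as.data.frame(mydatatable)
class(mydf)          # "data.frame"
```

The speed difference over read.csv() and write.csv() will not be noticeable on a file this small; it matters for files with many rows.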
Two other packages might be of interest for storing and retrieving data. The feather package saves in a binary format that can be read into either R or Python. The fst package’s read_fst() and write_fst() functions (named read.fst() and write.fst() in older versions) offer fast saving and loading of R data frame objects, plus the option of file compression.
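Here is a minimal fst round trip, assuming the fst package is installed. It uses the underscore function names, read_fst() and write_fst(), found in recent fst releases; the dotted names are the older spelling. The data frame and file path are made up, and the compress value of 100 is simply the maximum setting, chosen here for illustration.

```r
library(fst)

# Hypothetical sample data, used only for illustration
original <- data.frame(
  id    = 1:3,
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Save with maximum compression (compress runs from 0 to 100)
tmp <- tempfile(fileext = ".fst")
write_fst(original, tmp, compress = 100)

# Load the file back as a data frame
restored <- read_fst(tmp)
str(restored)
```

Higher compress values generally trade some write speed for smaller files, so the right setting depends on whether disk space or speed matters more.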