What is R? The data analysis tool you must learn
If your research involves manipulating data sets, then you may have heard your peers talking about “R”. This free programming language is robust, easy to learn and versatile. Your time is precious, and learning a new tool takes time. Still, it can be a fun and valuable experience if you know that it will look great on your CV, help you work faster, and produce more interesting results. Learn about the research capabilities of R to decide whether it’s worth your time.
The Benefits of R: For Any Discipline
You can use R to code just as you would with Matlab, Mathematica, Python, Octave, Fortran or even C. Similar to Python, it is dynamic and extensible thanks to a wide range of available packages (libraries). Mostly, R is used for analyzing and visualizing data, writing documents (articles, books, webpages, presentations, etc.) and creating dashboards (interactive web applications) to make sense of data. No matter what domain you are in, being able to do all of these is vital for you as a graduate student. My version of “The benefits of R” goes as follows:
- Analyzing and visualizing your data: You can use advanced data analysis packages (for example dplyr, tidyr and rlang) to write R scripts that automate the data analysis process. These packages will help you code in a clean and understandable way (e.g., check some of the functions in dplyr such as filter, select and mutate). You can also use data visualization packages such as ggplot2, ggmap and geofacet to create high-quality graphs. If you learn the gganimate and plotly packages, you can easily generate animated or interactive graphs!
- Automating your documents: A set of R packages (knitr, rmarkdown, bookdown, blogdown and xaringan) is designed to provide an authoring framework for data science. If you are wondering what to do with an authoring framework, I could say, “Anything you want!” You can generate reports, articles, resumes, books, presentations, websites and interactive web applications. In an rmarkdown document, you can add your data, your code and your graphs; explain your analysis in detail; and change the output format of your document (PDF, PowerPoint, HTML or Word) by changing one keyword in the document. This way, you can turn your analysis into a high-quality document, report or presentation and make your research easy to reproduce. If you discover an error in your analysis, or you need to update your data and rerun the analysis, you can just re-compile your rmarkdown document to generate the new or corrected version. The alternative is running R scripts to plot the graphs, manually adding them to the document and manually updating the numerical values in your discussion. Imagine the time you’d save!
- Showcasing your research: You can publish your documents, including your academic resume, on a personal webpage created with R. This way, you can use R to increase your visibility as a researcher and to develop transferable computer skills. You can use a version control system (such as Git) and create an account on a web-based hosting service (such as Bitbucket or GitHub) to showcase your project management and coding skills. Increasingly, companies are asking for GitHub profiles on job applications because they are a great portfolio of a candidate’s work.
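To make the first benefit concrete, here is a minimal sketch of the dplyr functions mentioned above (filter, select and mutate). It is an illustration only, not a recommended analysis: it assumes the dplyr package is installed, it uses mtcars (a small example data set that ships with R) in place of your own data, and the names efficient_cars and km_per_litre are made up for this example.

```r
# Assumption: dplyr is already installed, e.g. via install.packages("dplyr")
library(dplyr)

# mtcars is a built-in example data set (fuel consumption of 32 cars);
# it stands in here for your own research data.
efficient_cars <- mtcars %>%
  filter(cyl == 4) %>%                    # keep only the four-cylinder cars
  select(mpg, wt, hp) %>%                 # keep three columns of interest
  mutate(km_per_litre = mpg * 0.425144)   # new column: miles per US gallon converted to km per litre

# Look at the first few rows of the result
head(efficient_cars)
```

Each step reads almost like a sentence, which is what makes scripts written this way easy to revisit and to share with collaborators.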
R, as a central tool, can help you become a data scientist inside or outside of academia. Knowing how to code, analyze and visualize data, create a blog or webpage and generate reproducible reports are skills in high demand. After all, in Canada, data scientist is one of the best-paying jobs for new graduates.
Next Steps Toward Learning R
If I’ve at least convinced you to try learning R, the next step is to decide how you would like to learn. There are a lot of excellent and free resources you can start with. Everyone has their own learning preferences, so it’s important to find what works for you. I started by learning the basics and quickly became intrigued by R’s capabilities. First, I was surprised by how easy it was to plot publication-quality graphs. So, I continued to learn more with online courses, e-books and tutorials (listed below).
What helped me the most was applying what I learned to my own research problems: I started analyzing my data with R and ended up rewriting my old Python scripts in R. Most importantly, I did not give up. If you know how to code in any other programming language, you already know what learning method works best for you. For those who are unsure, keep reading for your R-learning resources!
Take a workshop:
Sign up for either of GradProSkills’ two workshops for a comprehensive introduction to R:
GPDI 515 is a hands-on, introductory level R workshop that covers the basics of the R programming language.
GPDI 517 focuses on how to analyze and visualize data with R in a reproducible way.
External Resources: To continue learning and keep yourself up-to-date with new features, you need to be familiar with R’s two major strengths: a dynamic community and a variety of packages for specific purposes. The R community is made up of people who love to learn, teach, talk and share, and who generate a large number of learning resources. Here are some of my favorites:
For online courses: Roger Peng’s courses (R programming, Advanced R programming, Reproducible research and The R Programming Environment) at coursera.org, Hadley Wickham’s course (Writing functions in R) at datacamp.com and Zico Kolter’s Practical Data Science course at datasciencecourse.org.
From the source: RStudio also has a lot of tutorials. Visit https://www.rstudio.com/online-learning/ to see their list of online resources and books.
Learn from workshops and group studies: The R community, through R-Ladies (a worldwide organization that aims to promote gender diversity in the R community), uses Slack and Meetup to communicate and organize local meetings. For more information, look up the R-Ladies Montreal chapter.
Different packages exist for data wrangling and analysis, data visualization, writing and documentation, and text mining. Visit the following websites: https://support.rstudio.com/hc/en-us/articles/201057987-Quick-list-of-useful-R-packages or https://cran.r-project.org/web/packages/available_packages_by_name.html to learn about more packages.
Lastly, stay in the loop! Since R evolves through its packages every day, it’s challenging for everyone to keep up. Here are a few inspirational R community members to follow on social media: Hadley Wickham (@hadleywickham), Jenny Bryan (@JennyBryan), Mara Averick (@dataandme), Julia Silge (@juliasilge) and Kristoffer Magnusson (@krstoffr). You can count on them to show you new, cool, free R packages and the resources to learn them. As your interests in R become more specific, you can find more people who can help you on your R journey!