Summary and Setup

An Introduction to R Using Gapminder Data

The goal of this lesson is to teach novice programmers to write modular code and to follow best practices when using R for data analysis. R is commonly used in many scientific disciplines for statistical analysis and is valued for its array of third-party packages. These materials aim to give attendees a strong foundation in the fundamentals of R and to teach best practices for scientific computing: breaking down analyses into modular units, task automation, and encapsulation.

Note that this workshop will focus on teaching the fundamentals of the programming language R, and will not teach statistical analysis.

This workshop is based on the (much longer!) R for Reproducible Scientific Analysis lesson developed as part of the Software Carpentry curriculum. For more information on the Carpentries, see their homepage.

As you enter, please follow the setup instructions so we can get right into the action!

Get started with RStudio on the HPCC

To use RStudio on the HPCC, we will use OnDemand. Log in here: https://ondemand.hpcc.msu.edu/pun/sys/dashboard/.

After logging in, from the Interactive Apps pull-down tab, choose RStudio Server.

OnDemand with interactive apps

In the settings for the interactive job, set “Number of hours” to 3 and “Number of cores per task” to 4. Leave the remaining entries blank and click the Launch button.

OnDemand with job submission options set as described above

You will be taken to a list of your interactive jobs. After queuing, your job will start running, and you can access RStudio by clicking the “Connect to RStudio Server” button.

A card showing a running RStudio job with a button to connect to the server.
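
Once RStudio opens, you may want to confirm that the session looks the way you expect. The snippet below is a minimal sketch you can paste into the RStudio console. It assumes the interactive job is managed by the SLURM scheduler, which exports the `SLURM_CPUS_PER_TASK` environment variable; if that assumption does not hold on your system, the second check simply reports that the variable is not set.

```r
# Print the version of R that this RStudio session is running.
print(R.version.string)

# Assumption: the job was started through SLURM, which sets
# SLURM_CPUS_PER_TASK to the number of cores requested per task.
# With unset = NA, Sys.getenv() returns NA when the variable is missing.
cores <- Sys.getenv("SLURM_CPUS_PER_TASK", unset = NA)

if (is.na(cores)) {
  message("SLURM_CPUS_PER_TASK is not set; the core count cannot be confirmed this way.")
} else {
  message("Cores requested for this session: ", cores)
}
```

If the variable is set, the reported value should match the “Number of cores per task” you entered when launching the job.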

If you would like to install R and RStudio on your computer in the future, check out the following links: