Introduction to Reproducible Coding Environments

Welcome

About Code Club

SORTEE Code Club is an online meeting where people come together to learn, share, and collaborate on coding-related topics in an informal and supportive environment.

SORTEE Resources

Do you know about SORTEE Resources?

Today’s Agenda


We’ll cover:

  • What reproducibility actually means
  • Why reproducibility matters
  • Using R Projects
  • Managing package dependencies with renv
  • Handling different R versions
  • Some limitations and tips

Disclaimer: today’s Code Club will focus on reproducibility in R.

Introduction to Reproducibility in R

A reproducible coding environment is a computational setup in which code, data, software versions, and methods are clearly documented, so that others (or future you) can replicate the results exactly.

Original comic from xkcd

The Why

Reproducible coding environments:

  • Ensure reliability and credibility of research findings
  • Facilitate collaborative research and transparent scientific communication
  • Enable efficient troubleshooting and debugging
  • Save time by reducing redundant efforts to recreate computational setups
  • Promote best practices in scientific computing and data analysis

The How

There is no single solution or perfect approach to reproducibility.

Instead, there is a collection of good strategies that you can use!

Example

For example, consider different ways of referencing file paths in your R scripts:

# Bad: absolute path that only exists on one machine
setwd("C:/Users/MyName/Documents/Project/Data")
data <- read.csv("data.csv")

# OK: home-relative path, extensively documented in README (but fragile!)
setwd("~/Downloads/Data")
data <- read.csv("data.csv")

# Better: using the here package, but without an R Project the project root is unclear
library(here)
data <- read.csv(here("Data", "data.csv"))

# Best: explicit use of an R Project (.Rproj file) ensures here() always resolves correctly
library(here)
data <- read.csv(here("Data", "data.csv"))

The How: R Projects (.Rproj)

An R Project is a way to organize all your analysis files in one easy-to-use place.

R Projects make your life easier because:

  • Paths are simple(r)
  • Easy collaboration
  • Less confusion

How to make an R Project

Spoiler alert!
Check the Code Club calendar for our Git & GitHub session!

Example: R Project

For example:

An R Project alone is not enough

Even though an R Project helps organize your analysis into one coherent location…

… You still need a structured directory layout to organize your files inside this project to get the most out of it
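
One way to set up such a layout is to scaffold it from the R console. The sketch below is only an illustration: the helper `create_project_skeleton()` is hypothetical, and the folder names are assumptions rather than a required standard (only `Data` echoes the earlier path example).

```r
# Hypothetical helper: create a simple folder skeleton inside a project root.
# Folder names are illustrative; adapt them to your own conventions.
create_project_skeleton <- function(root = ".") {
  folders <- c("Data/raw", "Data/processed", "scripts", "outputs", "docs")
  paths <- file.path(root, folders)
  for (p in paths) {
    # recursive = TRUE also creates parent folders such as "Data"
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Try it in a temporary directory so nothing in a real project changes
demo_root <- file.path(tempdir(), "my-project")
create_project_skeleton(demo_root)
list.dirs(demo_root, full.names = FALSE, recursive = TRUE)
```

Keeping raw and processed data in separate folders is a common convention because it makes it harder to overwrite the original files by accident.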

Organize your Files!

Not this!

But this! (Source: NCEAS Learning Hub’s coreR Course)

Understanding Packages📦 and Libraries📘

Packages📦 are collections of functions and compiled code that extend R’s functionality


Packages📦 reside within a library📘, a directory on your computer where R stores installed packages
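
To see the difference between packages and libraries on your own machine, a few base R calls are enough. This is a small sketch using only functions that ship with R; the variable name `pkgs` is just for illustration.

```r
# Where are my libraries? The first entry is where new packages get installed.
.libPaths()

# Which packages are installed, in which library, and at which version?
pkgs <- installed.packages()[, c("Package", "LibPath", "Version")]
head(pkgs)

# Version of a single package ("stats" ships with R, so it is always available)
packageVersion("stats")
```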


Managing package versions matters for:

  • Version control
  • Transparency
  • Open projects

Source: R Packages (2e)

The How: Managing Packages with renv

renv helps make your R projects reproducible by managing your package dependencies

  • It records which packages (and their versions) you’re using in your project.
  • It helps you avoid package version conflicts across different projects.
  • It ensures that your project setup can be easily replicated later.

Using renv: you, starting the project

  • renv::init(): create an isolated library for your project
  • renv::snapshot(): save exact versions of packages you’re using into a renv.lock file

This lockfile acts like a recipe that describes the environment your project needs to run.
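
In practice, starting a project with renv might look like the sketch below. It assumes renv is already installed (`install.packages("renv")`) and that R was started in the project root. The `use_renv` flag is a hypothetical guard so the script does nothing when sourced non-interactively or without renv.

```r
# Guard (hypothetical): only run in an interactive session with renv installed.
use_renv <- interactive() && requireNamespace("renv", quietly = TRUE)

# Step 1, once per project: set up the project library and an initial renv.lock.
# In some editors this restarts the R session, so run the steps one at a time.
if (use_renv) renv::init()

# ... install and use packages as usual while you work ...

# Step 2: check whether the project library and the lockfile are in sync.
if (use_renv) renv::status()

# Step 3: record the package versions currently in use in renv.lock.
if (use_renv) renv::snapshot()
```

Re-running `renv::snapshot()` after adding or updating packages keeps the lockfile current; committing `renv.lock` alongside the code is what lets others rebuild the same library.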

Using renv: someone else, with your project

  • renv::restore()

Yes, that’s it.
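
As a sketch, a collaborator who has downloaded the project and started R in its root folder could run the following. The `can_restore` guard is hypothetical; it only checks that the session is interactive, renv is available, and a lockfile is present.

```r
# Guard (hypothetical): skip quietly if there is nothing to restore from.
can_restore <- interactive() &&
  requireNamespace("renv", quietly = TRUE) &&
  file.exists("renv.lock")

if (can_restore) {
  renv::restore()  # install the package versions recorded in renv.lock
  renv::status()   # confirm the project library now matches the lockfile
}
```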

Limitations

renv doesn’t handle:

  • R itself (renv.lock records the R version, but renv does not install or switch R versions)
  • System dependencies (compilers, external libraries)
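
Because the R version falls outside what renv installs, it can help to check it explicitly at the top of a script. The sketch below uses base R only; the `expected` version number is a made-up example, not a recommendation.

```r
# Which R is running right now?
R.version.string
current <- getRversion()

# Hypothetical: the R version this analysis was developed with
expected <- "4.3.2"

# Version objects can be compared directly with version strings
if (current != expected) {
  message("Developed with R ", expected,
          " but running R ", as.character(current))
}
```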

Managing R Versions

Sometimes it’s not just about the packages: the version of R itself can matter too.

That’s where Rig comes in.

What is Rig?

Rig is a lightweight tool that helps you manage multiple versions of R on the same machine.

  • It works on Windows, macOS, and Linux.
  • You can easily install new R versions.
  • You can switch between R versions with a simple command.
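
Rig is a command-line program, so its commands are normally typed in a terminal. To stay in R, the sketch below calls it through `system2()` and shows the terminal equivalents in comments. The version number is a hypothetical example, and the commands that install or switch R versions are left commented out because they change your system.

```r
# Look for the rig executable; Sys.which() returns "" if it is not on the PATH.
rig_path <- Sys.which("rig")

if (nzchar(rig_path)) {
  # terminal: rig list  (show the installed R versions)
  system2(rig_path, "list")

  # terminal: rig add release  (install the current R release)
  # system2(rig_path, c("add", "release"))

  # terminal: rig default 4.3.2  (hypothetical version; sets the default R)
  # system2(rig_path, c("default", "4.3.2"))
} else {
  message("rig was not found on the PATH; see the rig documentation to install it.")
}
```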

Why Use Rig?

  • Makes it easy to test code across R versions.
  • Simplifies keeping old projects running on older setups.
  • Plays well with tools like renv, completing the reproducibility setup.

🧩 renv manages your packages
🛠️ Rig manages your R versions

Wrapping Up

Improving reproducibility:

  • Organize projects clearly
  • Manage dependencies carefully (renv)
  • Use version management (Rig)
  • Document everything!
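
A low-effort way to document the environment is to save the session details next to your results. This is a minimal sketch with base R; the file name and the use of `tempdir()` are assumptions for the demo, and in a real project you would write into a project folder instead.

```r
# Hypothetical location for the demo; use a folder inside your project in practice
info_file <- file.path(tempdir(), "session-info.txt")

# sessionInfo() lists the R version, operating system, and loaded packages
writeLines(capture.output(sessionInfo()), info_file)

# Peek at the first few lines of the saved record
readLines(info_file, n = 3)
```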

Resources

R-Ladies - Personal R Administration
Introduction to renv
What they forgot to teach you about R
renv challenges
Reproducible Environments - Posit
The Turing Way - Reproducible Environments
Rig, the R installation Manager
Groundhog, an alternative to renv
Enough targets to Write a Thesis