Authors: Nicole Bonge, Prof. Jihong Zhang

Affiliation: University of Arkansas

Workshop Date: Thursday, March 20, 2025

1 Agenda

  1. Intro to parallelization
  2. Pinnacle Portal
  3. Simulation 1 Demonstration
  4. HPC in Terminal
  5. Simulation 2 Activity

2 Getting Started

  1. Make sure the Workshop Materials (Sim1-AHPC.R file, Sim2-AHPC.R file, and Sim2-Datasets folder) are easily accessible on your personal computer.

  2. If you use Projects in R (which I recommend), use the R-in-AHPC folder containing (1) the R-in-AHPC.Rproj Project, (2) the “Code” folder, with the Sim1-AHPC.R and Sim2-AHPC.R files, and (3) the “Sim2-Datasets” folder.

  3. If you have not yet made an AHPCC account, request one at hpc.uark.edu/hpc-support/user-account-requests/internal.php

3 Intro to Parallelization¹

3.1 Serial vs. Parallel Processing

  • Suppose we have a series of functions to run, \(f_1\), \(f_2\), and \(f_3\).

  • Serial processing: Run \(f_1\) until it completes; nothing else can run while it is working. Once \(f_1\) finishes, \(f_2\) begins, and the process repeats.

  • Parallel processing: All \(f_i\) functions start simultaneously and run to completion.

3.2 The Serial-Parallel Scale

  • A problem can range from “inherently serial” to “perfectly parallel” (or “embarrassingly parallel”).

  • Inherently serial: A problem that cannot be parallelized at all.

    • Example: \(f_2\) depends on the output of \(f_1\) before it can begin. In this case, parallel processing wouldn’t help and might take longer than on a single core.
  • Perfectly parallel: There is absolutely no dependency between iterations, and all functions can start simultaneously.

    • Monte Carlo and statistical simulations usually fall into this category.
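The distinction is easy to see in code; a minimal sketch:

```r
# Inherently serial: each iteration needs the previous result,
# so iteration i cannot start before iteration i - 1 finishes
x <- numeric(10)
x[1] <- 1
for (i in 2:10) x[i] <- x[i - 1] * 2

# Perfectly parallel: iterations are independent,
# so they could all run at the same time
y <- sapply(1:10, function(i) i^2)
```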

3.3 Vocabulary

  • HPC: High performance computing. Implies a program that is too large, or takes too long, to reasonably run on a desktop computer.

  • Core: A general term for either a single processor on your own computer or a single machine in a cluster.

  • Cluster: A collection of cores working together, either across a networked group of machines or within your personal computer.

  • Process: A single version of R (or any program). Each core runs a single process, and a process typically runs a single function.

3.4 Parallelization Analogy

  • In AHPC, we can run up to 32 cores per session.

  • Imagine having 15,000 jobs. Distributing these jobs among 32 friends will take much less time than doing 15,000 jobs alone.

  • Once one friend (core) finishes the job they’re working on, that friend begins the next job in the list.

  • There are diminishing returns for adding cores. Giving each friend (core) instructions takes time, and the friend telling you the results takes time, too.

3.5 The parallel & doParallel Packages

  • Step 1: Load the parallel package in R: library(parallel)

    library(parallel)
  • Step 2: Check the number of cores you have access to with detectCores().

    detectCores()
    [1] 8
    Tip

    Leave one core free when you’re running a simulation.

    n_cores <- detectCores() - 1
    n_cores
    [1] 7
  • Step 3: Use foreach loops

    The doParallel package allows foreach “loops”, similar to for loops.

    library(doParallel)

    Before using a foreach loop, you must make a cluster via the makeCluster function, then use the registerDoParallel() function:

    cl <- makeCluster(n_cores)
    registerDoParallel(cl)
    Example: foreach loop
    foreach(i = 1:20) %dopar% {
      print(paste0("sqrt(", i, ") = ", round(sqrt(i), digits = 4)))
    }
    [[1]]
    [1] "sqrt(1) = 1"
    
    [[2]]
    [1] "sqrt(2) = 1.4142"
    
    [[3]]
    [1] "sqrt(3) = 1.7321"
    
    [[4]]
    [1] "sqrt(4) = 2"
    
    [[5]]
    [1] "sqrt(5) = 2.2361"
    
    [[6]]
    [1] "sqrt(6) = 2.4495"
    
    [[7]]
    [1] "sqrt(7) = 2.6458"
    
    [[8]]
    [1] "sqrt(8) = 2.8284"
    
    [[9]]
    [1] "sqrt(9) = 3"
    
    [[10]]
    [1] "sqrt(10) = 3.1623"
    
    [[11]]
    [1] "sqrt(11) = 3.3166"
    
    [[12]]
    [1] "sqrt(12) = 3.4641"
    
    [[13]]
    [1] "sqrt(13) = 3.6056"
    
    [[14]]
    [1] "sqrt(14) = 3.7417"
    
    [[15]]
    [1] "sqrt(15) = 3.873"
    
    [[16]]
    [1] "sqrt(16) = 4"
    
    [[17]]
    [1] "sqrt(17) = 4.1231"
    
    [[18]]
    [1] "sqrt(18) = 4.2426"
    
    [[19]]
    [1] "sqrt(19) = 4.3589"
    
    [[20]]
    [1] "sqrt(20) = 4.4721"
    Tip

    Always stop the cluster at the end of your parallelization using the stopCluster function.

    stopCluster(cl)
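By default, foreach returns a list with one element per iteration (as in the output above). The .combine argument tells foreach how to merge results; a brief sketch using .combine = c to get a vector instead:

```r
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

# c() concatenates each iteration's result into a single numeric vector
squares <- foreach(i = 1:5, .combine = c) %dopar% i^2
squares
# [1]  1  4  9 16 25

stopCluster(cl)
```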

3.6 Demonstration: foreach Efficiency²

How quickly can we square the first 1,000, 10,000, or 100,000 integers? It might take a while using serial processing, but we can speed up the process with multiple cores. Let’s see how using more cores shortens computation time.

3.6.1 Demonstration Overview

The test() function does the following:

  • Creates and registers a new cluster with n_cores CPU cores (specified by user), and stops the cluster after computation.

  • Uses foreach to square the first n_iter integers (specified by user).

  • Keeps track of the time needed in total, the time needed for the squaring computations, and the time spent communicating with the cores.

Note: test() does not store the squared integers. We’re only concerned about timing for this demonstration.

test <- function(n_cores, n_iter){
  # Record start time
  time_start <- Sys.time()

  # Create and register cluster
  cl <- makeCluster(n_cores)
  registerDoParallel(cl)

  # Record this run's computation start time
  time_start_processing <- Sys.time()

  # Do the processing
  results <- foreach(i = 1:n_iter) %dopar% {
    i^2
  }

  # Record this run's computation stop time
  time_finish_processing <- Sys.time()

  # Stop the cluster
  stopCluster(cl)

  # Keep track of the end time
  time_end <- Sys.time()

  # Create report 
  total_time <- round(difftime(time_end, time_start, units = "secs"), digits = 5)
  
  compute_time <- round(difftime(time_finish_processing, time_start_processing, units = "secs"), digits = 5)
  
  overhead_time <- round((total_time - compute_time), digits = 5)
  
  out <- data.frame(
    Cores = n_cores,
    Iterations = as.integer(n_iter),
    Total.Time = total_time,
    Compute.Time = compute_time,
    Overhead.Time = overhead_time)
  
  # Return the report
  return(out)
}

3.6.2 Demonstration Execution

# Using 1, 4, and (almost) all cores
cores <- c(1, 4, detectCores()-1)

# Varying the number of replications
replications <- c(1000, 10000, 100000)

# Initializing data frame to store results
results <- data.frame()

for(n in 1:length(cores)){
  for(r in 1:length(replications)){
    # running test with specified number of cores & replications
    out1 <- test(cores[n], replications[r])
    
    # appending out1 to results data frame
    results <- rbind(results, out1)
  }
}

# view results
results

3.6.3 Demonstration Results

library(ggplot2)
library(tidyverse)
library(patchwork)
library(reshape2)

res.vis <- results |>
  # results already contains Overhead.Time, so no extra mutate is needed
  dplyr::select(Cores, Iterations, Total.Time, Compute.Time, Overhead.Time) |>
  melt(id.vars = c("Cores", "Iterations"), value.name = "Time") |>
  filter(variable != "Total.Time")

res.vis <- res.vis |>
  mutate(variable = factor(res.vis$variable, 
                           levels = c("Compute.Time", "Overhead.Time")), 
         Cores = as.factor(res.vis$Cores),
         Time = as.numeric(res.vis$Time))

results.vis1000 <- res.vis |> 
  filter(Iterations == 1000) 

runtime1000 <- results.vis1000 |>
  ggplot(aes(x = Cores, y = Time)) +
  geom_bar(position = "stack", stat = "identity", aes(fill = variable)) +
  scale_fill_manual(labels = c("Compute.Time" = "Computation Time", 
                               "Overhead.Time" = "Overhead Time"),
                    values = c("cyan3", "coral")) +
  labs(title = "Run Time with \n 1,000 Iterations",
       x = "Number of Cores", 
       y = "Time (Seconds)",
       fill = "Time Type") 


results.vis10000 <- res.vis |> 
  filter(Iterations == 10000) 

runtime10000 <- results.vis10000 |>
  ggplot(aes(x = Cores, y = Time)) +
  geom_bar(position = "stack", stat = "identity", aes(fill = variable)) +
  scale_fill_manual(labels = c("Compute.Time" = "Computation Time", 
                               "Overhead.Time" = "Overhead Time"),
                    values = c("cyan3", "coral")) +
  labs(title = "Run Time with \n 10,000 Iterations",
       x = "Number of Cores", 
       y = "Time (Seconds)",
       fill = "Time Type") 



results.vis100000 <- res.vis |> 
  filter(Iterations == 100000) 

runtime100000 <- results.vis100000 |>
  ggplot(aes(x = Cores, y = Time)) +
  geom_bar(position = "stack", stat = "identity", aes(fill = variable)) +
  scale_fill_manual(labels = c("Compute.Time" = "Computation Time", 
                               "Overhead.Time" = "Overhead Time"),
                    values = c("cyan3", "coral")) +
  labs(title = "Run Time with \n 100,000 Iterations",
       x = "Number of Cores", 
       y = "Time (Seconds)",
       fill = "Time Type") 

runtime1000 + runtime10000 + runtime100000 +
  plot_layout(guides = "collect",
              ncol = 3) &
  theme(legend.position = "bottom")

Observations:

  • More iterations require more computation & overhead time.

  • With more cores, iterations take less computation time but more overhead time.

4 Interactive Sessions in the Pinnacle Portal

4.1 Starting an Interactive Session

4.2 Launch R-Studio

  • Once your job is active, the “My Interactive Sessions” page looks like this:

  • Clicking “Launch R-Studio” takes you to the interactive session.

4.3 Checking Active Jobs Queue

  • In Pinnacle Portal, Menu > Jobs > Active Jobs

  • Active Jobs are sorted alphabetically by queue.

4.4 Tips for Interactive Sessions in R

  • Tip 1: Closing RStudio ends your Interactive Session.

  • Tip 2: The read_csv function (from the readr package) can cause RStudio to terminate the Interactive Session. Use base R’s read.csv instead.
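For example, loading one of the workshop datasets inside the Interactive Session (the file name below is a placeholder; use the actual names in your Sim2-Datasets folder):

```r
# Base R's read.csv is safe to use in the Interactive Session
dat <- read.csv("Sim2-Datasets/dataset1.csv")  # placeholder file name
head(dat)
```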

4.5 Uploading Data

  • Menu > Files > Home Directory

  • I put my files in the Desktop for quick access.

  • Make new folders using the “New Dir” button.

  • Upload file(s) using the “Upload” button.

    Tip

    To upload multiple files, compress the files/folder on your computer and upload a single zipped (.zip) file.

4.6 Downloading Files

  • Tip: For multiple files, compress the files/folder in the Interactive Session (or in Terminal).

  • In the Files App, click on the file, then click the “Download” button.

5 Simulation 1 Demonstration: Type I Error

5.1 Simulation Rationale

  • In this study, we will demonstrate the Type I Error rates for One-Way ANOVA.

  • Recall, we use One-Way ANOVA to detect group mean differences.

  • For three groups, the null hypothesis is \(H_0:\mu_1=\mu_2=\mu_3\).

  • Type I Error is a false positive (rejecting the null hypothesis when the null hypothesis is true).

  • We (the researchers) determine \(\alpha\), the probability of committing a Type I error (usually .05, sometimes .01 or .001).

5.2 Simulation Overview

  • In this simulation, we will generate (simulate) 10,000 datasets of three groups with equal means.

  • We will perform an ANOVA test on each dataset, and record whether the resulting \(p\)-value is significant.

  • When \(\alpha=.05\), we expect ~500/10,000 significance tests to reject \(H_0\), even though \(H_0\) is true.

  • When \(\alpha=.01\), we expect ~100/10,000 significance tests to reject \(H_0\).

  • When \(\alpha=.001\), we expect ~10/10,000 significance tests to reject \(H_0\).
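The full simulation lives in Sim1-AHPC.R; the core idea can be sketched as below. The group size (30 per group) and the standard normal distribution are assumptions for illustration, and the actual workshop code may differ:

```r
library(doParallel)

n_cores <- detectCores() - 1
cl <- makeCluster(n_cores)
registerDoParallel(cl)

n_reps <- 10000
p_values <- foreach(i = 1:n_reps, .combine = c) %dopar% {
  # Three groups with identical means, so H0 is true by construction
  dat <- data.frame(
    y     = rnorm(90, mean = 0, sd = 1),
    group = factor(rep(1:3, each = 30))
  )
  # Extract the one-way ANOVA p-value
  summary(aov(y ~ group, data = dat))[[1]][["Pr(>F)"]][1]
}

stopCluster(cl)

# Empirical Type I error rate; should be close to alpha = .05
mean(p_values < .05)
```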

5.3 Simulation 1 in AHPC Interactive Session: Steps

  1. Upload Sim1-AHPC.R code file from your computer using the Files App in the Pinnacle Portal.

    • If you are using the R-in-AHPC folder, upload the .zip file using the Files App.
  2. Begin Interactive Session.

  3. Run R Script in the Interactive Session.

  4. Compress Results file in the Interactive Session.

  5. End Interactive Session.

  6. Download (zipped) results from Files App to your personal computer, then analyze!

5.4 Video: Simulation 1 in AHPC

6 Submitting Jobs in Terminal

6.1 Connect to HPCC from the Terminal

Type the following command into your terminal on Mac (or PowerShell on Windows). It will prompt for your university password (“[username]@hpc-portal2.hpc.uark.edu’s password:”). After you enter the password, you should be connected to the Pinnacle login node.

Code
ssh [username]@hpc-portal2.hpc.uark.edu

6.2 Create a new folder and Upload the file

  • Use the following command to create a new folder in your AHPCC home directory.
Code
mkdir R-in-HPCC

Open a new terminal on your local machine and type the following commands to upload the R file and job file to the folder R-in-HPCC:

Code
scp Sim1-HPCC.R [username]@hpc-portal2.hpc.uark.edu:/home/[username]/R-in-HPCC
scp job.sh [username]@hpc-portal2.hpc.uark.edu:/home/[username]/R-in-HPCC
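The job.sh file referenced above is a SLURM batch script; its contents are not shown in this handout. A minimal sketch might look like the following, where the partition name, time limit, and module name are assumptions — check the AHPCC documentation for the correct values:

```bash
#!/bin/bash
#SBATCH --job-name=sim1            # job name shown in the queue
#SBATCH --nodes=1                  # run on a single node
#SBATCH --ntasks-per-node=32       # request 32 cores (assumption)
#SBATCH --time=01:00:00            # wall-clock limit (assumption)
#SBATCH --partition=comp01         # partition name (assumption)

module load R                      # module name may differ on AHPCC

Rscript Sim1-HPCC.R
```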

6.3 Submit the job task

Code
pinnacle-l3:[username]:~/R-in-HPCC$ sbatch job.sh
Submitted batch job 637189

6.4 Check Results

Code
pinnacle-l3:[username]:~/R-in-HPCC$ ls Type1-Results/

7 Simulation 2 Exercise: Power

  • Your turn! Use the Sim2-AHPC.R file to run Simulation 2.
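Simulation 2 uses the pre-generated datasets in the Sim2-Datasets folder. Conceptually, power is the rejection rate when \(H_0\) is false; a minimal sketch (the group means, sample sizes, and number of replications below are illustrative assumptions, not the actual Sim2 design):

```r
library(doParallel)

cl <- makeCluster(detectCores() - 1)
registerDoParallel(cl)

# Groups have unequal means, so H0 is false; the rejection
# rate now estimates statistical power rather than Type I error
p_values <- foreach(i = 1:5000, .combine = c) %dopar% {
  dat <- data.frame(
    y     = c(rnorm(30, 0), rnorm(30, 0.5), rnorm(30, 1)),
    group = factor(rep(1:3, each = 30))
  )
  summary(aov(y ~ group, data = dat))[[1]][["Pr(>F)"]][1]
}

stopCluster(cl)

mean(p_values < .05)  # estimated power at alpha = .05
```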

8 Wrapping up

9 Thank you!

Footnotes

  1. Information in this section is adapted from Dr. Josh Errickson’s Notes on Parallel Processing in R.

  2. Demonstration adapted from https://www.appsilon.com/post/r-doparallel.