Published January 15, 2026 | Version v1
Thesis Open

Modelling GPU Parallelism in ROOT RDataFrame Histograms

Authors/Creators

  • 1. VU Amsterdam
  • 2. University of Amsterdam
  • 1. University of Twente
  • 2. University of Amsterdam
  • 3. European Organization for Nuclear Research

Description

Context CERN, home to the Large Hadron Collider (LHC), generates petabytes of collision event data for particle physicists to analyse. ROOT, the standard framework for high-energy physics (HEP) data analysis, offers a declarative interface through RDataFrame, in which histogramming is a core operation.
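To make the histogramming operation concrete, the following is a minimal sketch of filling a fixed-width one-dimensional histogram in plain C++. It is not ROOT's or the thesis's implementation: the type `Hist1D` and its members `find_bin` and `fill` are hypothetical names chosen for illustration. The sketch assumes equal-width bins and reserves one underflow and one overflow slot, a layout similar to the bin numbering used by ROOT histograms.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-width 1D histogram (illustration only, not ROOT code).
struct Hist1D {
    double lo, hi;             // axis range [lo, hi)
    std::vector<double> bins;  // nbins + 2 slots: [0] underflow, [nbins + 1] overflow

    Hist1D(std::size_t nbins, double lo_, double hi_)
        : lo(lo_), hi(hi_), bins(nbins + 2, 0.0) {
        assert(nbins > 0 && lo_ < hi_);
    }

    // Map a value to its slot index.
    std::size_t find_bin(double x) const {
        const std::size_t n = bins.size() - 2;
        if (x < lo) return 0;            // underflow
        if (!(x < hi)) return n + 1;     // overflow (also catches NaN)
        const auto idx = static_cast<std::size_t>((x - lo) / (hi - lo) * n);
        return 1 + std::min(idx, n - 1); // clamp against rounding at the upper edge
    }

    // Add weight w to the slot that contains x.
    void fill(double x, double w = 1.0) { bins[find_bin(x)] += w; }
};
```

Each fill is a bin lookup followed by a single increment, so the work per event is small and the cost of moving data tends to matter as much as the arithmetic. On a GPU, many threads may also increment the same bin concurrently, which a parallel version has to handle, for example with atomic updates.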

Goal We aim to predict the runtime performance benefits of offloading computations to the GPU, using histogramming in RDataFrame as a case study.

Method We propose an analytical modelling approach to predict the runtime of CPU and GPU implementations, employing a systematic strategy that includes microbenchmarking and code analysis. We validate our models using benchmarked runtimes on an AMD EPYC 7402P CPU and NVIDIA A4000 GPU.

Results Our findings indicate that the models accurately predict trends and performance rankings, although the CPU model tends to overestimate runtime, while the GPU model underestimates it. Consequently, our models are overly optimistic in speedup predictions. Nonetheless, the process of designing and using the model enhances our understanding of the performance bottlenecks.
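The link between the two model errors and the optimistic speedup follows from the definition of speedup as CPU runtime divided by GPU runtime: a numerator that is too large and a denominator that is too small both push the ratio up. The sketch below illustrates this with hypothetical helper functions `speedup` and `speedup_bias`; the names and any runtimes passed to them are illustrative assumptions, not code or results from the thesis.

```cpp
#include <cassert>

// Speedup of a GPU implementation over a CPU one: CPU runtime / GPU runtime.
double speedup(double cpu_seconds, double gpu_seconds) {
    assert(cpu_seconds > 0.0 && gpu_seconds > 0.0);
    return cpu_seconds / gpu_seconds;
}

// Ratio of predicted to measured speedup (hypothetical helper).
// A value above 1 means the prediction is more optimistic than the measurement.
double speedup_bias(double cpu_predicted, double gpu_predicted,
                    double cpu_measured, double gpu_measured) {
    return speedup(cpu_predicted, gpu_predicted) /
           speedup(cpu_measured, gpu_measured);
}
```

For example, with made-up measured runtimes of 10 s (CPU) and 2 s (GPU) and made-up predictions of 12 s and 1.5 s, the measured speedup is 5 while the predicted speedup is 8, so `speedup_bias` is greater than 1.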

Conclusions Designing an accurate performance model for runtime prediction entails a trade-off between accuracy and effort. Although the accuracy of our models is suboptimal, the modelling process itself yields a deeper understanding of performance characteristics.

Files

MSc_Thesis_Jolly_Chen.pdf (2.2 MB)
md5:41b6df0a580abcc98eaed578b6a683c5

Additional details

CERN

Department: EP
Administrative Unit: SFT
Programme: CERN Technical Student Program
Projects: ROOT