Published August 5, 2021 | Version v1
Technical note Open

Analysis and improvement of the DIRAC CPU Performance Benchmarking tool

Authors/Creators

  • 1. European Organization for Nuclear Research

Description

DB12, short for DIRAC Benchmark 12, is a fast benchmark tool designed to run within DIRAC and used to evaluate the performance of worker nodes at run time. DB12 gives a faster, better estimate of the CPU power available to LHCb applications. It is run from inside the DIRAC pilot before a job is fetched, so combining its score with the CPU time left, which we can get by interrogating the batch system, gives an idea of which jobs we can run.

DB12, however, has some issues:

  • It does not support Python 3, as it is written entirely in Python 2.
  • It has no CI/CD.
  • Its code is copy-pasted into DIRAC instead of being imported.
  • It does not run well in multi-core environments. DB12 includes a function to run it on multiple cores in parallel, but that function takes a long time to run and the accuracy of the resulting scores is uncertain. On the Santos Dumont supercomputer, for instance, about 20% of the jobs fail because they run out of time (see Figure 1). Currently, on multi-core nodes, DB12 is run on only one of the cores. This is not enough: DB12 should be run in parallel on every core, and the lowest value should be used as the reference, to make sure that jobs will not run out of time.

To fix these issues, my work consisted of two parts: a coding part and an analysis part, both of which I discuss in further detail in this report. In the first section, I give some context about CERN and the LHCb experiment. In section 2, I describe the project in detail. I then discuss my work in section 3 and finally conclude the report.
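The multi-core idea described above (one benchmark copy per core, the lowest score taken as the reference, and that score combined with the CPU time left to decide whether a job fits) can be illustrated with a minimal Python 3 sketch. This is not the DB12 code: the workload, the score units, and the names `toy_benchmark`, `reference_score`, `job_fits`, and `benchmark_all_cores` are all hypothetical and chosen only for illustration.

```python
import os
import random
import time
from multiprocessing import Pool


def toy_benchmark(iterations):
    """Hypothetical stand-in workload; NOT the real DB12 loop or calibration.

    Returns a score in arbitrary units of work per CPU second.
    """
    start = time.process_time()
    total = 0.0
    for _ in range(iterations):
        total += random.normalvariate(10, 1)
    elapsed = time.process_time() - start
    # Guard against a zero reading on very short runs.
    return iterations / max(elapsed, 1e-9)


def reference_score(scores):
    """Take the lowest per-core score as the reference (the slowest core)."""
    if not scores:
        raise ValueError("need at least one score")
    return min(scores)


def job_fits(job_work, score, cpu_seconds_left):
    """True if a job of size job_work (in score units times seconds)
    can finish within the CPU time left at the given score."""
    return job_work <= score * cpu_seconds_left


def benchmark_all_cores(iterations=200_000, copies=None):
    """Run one benchmark copy per core in parallel; return all scores."""
    copies = copies or os.cpu_count() or 1
    with Pool(processes=copies) as pool:
        return pool.map(toy_benchmark, [iterations] * copies)
```

In this sketch, a pilot would call `benchmark_all_cores()`, pass the result to `reference_score`, and then use `job_fits` with the CPU time left reported by the batch system. Taking the minimum is the conservative choice: a job sized against the slowest core should not run out of time on any other core.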

Files

DB12_Report.pdf

Files (241.4 kB)

Name: DB12_Report.pdf
Size: 241.4 kB
md5:a045ed774ef09a69072b888b9885d400

Additional details

Identifiers

CDS Report Number
CERN-STUDENTS-Note-2021-016

CERN

Experiment
LHCb

Linked records