Bachelor Thesis Open Access

Performance Study of NN-Training with Data Streaming from Remote Storage

Annalena Garleff

Thesis supervisor(s)

Ferber, Torben; Schnepf, Matthias

This work focuses on the computing infrastructure at KIT as used by the Belle II group. When larger amounts of data are processed, they need to be transferred from the portal machines to TOpAS. In this thesis, two ways of transferring the data have been tested: copying the data to the TOpAS worker node before the job starts, and streaming the data with XRootD directly from /ceph during the job. The performance of the two transfer methods has been compared for different dataloaders. Every test used the same program, which contains an NN training, with a different dataloader attached to the training in each case. The training ran on data from the Belle II experiment.
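The two transfer strategies compared in the thesis can be illustrated with a minimal timing sketch. This is not the thesis code: it uses only the Python standard library, an ordinary local file stands in for the data on /ceph, and plain file reads stand in for XRootD streaming. The names `run_copy_first`, `run_streaming`, `iter_chunks`, `consume`, and `CHUNK_SIZE` are hypothetical and chosen for illustration only.

```python
import shutil
import tempfile
import time
from pathlib import Path

# Bytes read per "batch". Hypothetical value, not taken from the thesis.
CHUNK_SIZE = 64 * 1024


def iter_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield successive chunks of a file, standing in for a dataloader's batches."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def consume(chunks):
    """Stand-in for the training loop: touch every batch and count the bytes seen."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
    return total


def run_copy_first(remote_path, scratch_dir):
    """Strategy 1: copy the whole file to local scratch, then read the local copy."""
    start = time.perf_counter()
    local_path = Path(scratch_dir) / Path(remote_path).name
    shutil.copyfile(remote_path, local_path)
    n_bytes = consume(iter_chunks(local_path))
    return n_bytes, time.perf_counter() - start


def run_streaming(remote_path):
    """Strategy 2: read the file chunk by chunk from its original location."""
    start = time.perf_counter()
    n_bytes = consume(iter_chunks(remote_path))
    return n_bytes, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as remote_dir, \
            tempfile.TemporaryDirectory() as scratch_dir:
        # A 1 MiB dummy file plays the role of the remote dataset.
        remote_file = Path(remote_dir) / "events.bin"
        remote_file.write_bytes(b"\0" * (1024 * 1024))

        n_copy, t_copy = run_copy_first(remote_file, scratch_dir)
        n_stream, t_stream = run_streaming(remote_file)

        print(f"copy first: {n_copy} bytes in {t_copy:.4f} s")
        print(f"streaming:  {n_stream} bytes in {t_stream:.4f} s")
```

On a single machine the two timings say little, since both paths hit the same local disk. The sketch only shows where the transfer cost falls in each strategy: up front for the copy, and interleaved with the batch loop for streaming.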

Files (1.5 MB)
Name: Annalena_Garleff_BA.pdf
Size: 1.5 MB
md5: 34452ea339dbdd504cd4c9aa9e704b91
