Optimisation of MongoDB Data Structures for KASCADE Cosmic-ray Data Centre
The KASCADE Cosmic-ray Data Centre (KCDC, https://kcdc.ikp.kit.edu/) makes data from the astroparticle-physics experiment KASCADE publicly available. The system will eventually hold over 20 TB of data, or nearly half a billion events. Since 2015 it has used the NoSQL database MongoDB as its storage back-end.
Your goal in this project would be to tune the MongoDB data structures used to hold KASCADE data, as well as the corresponding indices, to achieve optimal performance under typical KCDC operating conditions. The tasks are:
- analyse the existing structure
- identify bottlenecks
- design and implement improvements
- benchmark the results
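As a rough illustration of the benchmarking step, the sketch below times a query callable several times and compares median run times before and after a change such as adding an index. It is a minimal sketch and not the KCDC tooling; the function names `benchmark` and `speedup` are hypothetical, and nothing here depends on MongoDB itself.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(query: Callable[[], object], repeats: int = 5, warmup: int = 1) -> Dict[str, float]:
    """Time a zero-argument callable; return min/median/max in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    # Untimed warm-up runs, so that cold caches do not skew the first sample.
    for _ in range(warmup):
        query()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        query()
        timings.append(time.perf_counter() - start)
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def speedup(before: Dict[str, float], after: Dict[str, float]) -> float:
    """Ratio of median times; a value above 1 means the change helped."""
    return before["median"] / after["median"]
```

With PyMongo, the callable could wrap a real query, for example `lambda: list(events.find({"energy": {"$gt": 1e15}}))`, run once before and once after `events.create_index("energy")`. The collection name `events` and the field `energy` are assumptions made for illustration, not the actual KCDC schema.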
Requirements:
- basic user-level knowledge of Linux
- familiarity with benchmarking and NoSQL databases would be an advantage
Marek.Szuba@kit.edu - 29178
Doris.Wochele@kit.edu - 22418