Optimisation of MongoDB Data Structures for KASCADE Cosmic-ray Data Centre

Revision as of 15:19, 9 February 2016 by M Szuba (Description)


Description

The KASCADE Cosmic-ray Data Centre (KCDC) makes data from the astroparticle-physics experiment KASCADE publicly available. The system will eventually hold over 20 TB of data, or nearly half a billion events. Since 2015 it has used the NoSQL database MongoDB as its storage back-end.

Your goal in this project is to tune the MongoDB data structures that hold KASCADE data, together with the corresponding indices, so that KCDC performs optimally under its typical operating conditions.
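To illustrate why index design matters for this kind of work, the following is a minimal, database-free sketch in Python. It compares a range query answered by scanning every event with the same query answered through a sorted index. The field names `event_id` and `energy`, the value range and all function names are hypothetical and are not taken from the actual KCDC schema; a plain sorted list merely stands in for the ordered index structure a database maintains.

```python
import bisect
import random


def make_events(n, seed=0):
    """Create n synthetic events; the fields are hypothetical, not the KCDC schema."""
    rng = random.Random(seed)
    return [{"event_id": i, "energy": rng.uniform(14.0, 18.0)} for i in range(n)]


def range_scan(events, lo, hi):
    """Answer the range query by inspecting every event (no index)."""
    return sorted(e["event_id"] for e in events if lo <= e["energy"] <= hi)


def build_index(events, field):
    """Build a sorted (keys, ids) pair that acts as a simple ordered index."""
    pairs = sorted((e[field], e["event_id"]) for e in events)
    keys = [key for key, _ in pairs]
    ids = [event_id for _, event_id in pairs]
    return keys, ids


def range_indexed(index, lo, hi):
    """Answer the same range query with two binary searches on the index."""
    keys, ids = index
    start = bisect.bisect_left(keys, lo)
    stop = bisect.bisect_right(keys, hi)
    return sorted(ids[start:stop])


if __name__ == "__main__":
    events = make_events(100_000)
    index = build_index(events, "energy")
    matches = range_indexed(index, 15.0, 16.0)
    print(len(matches), "events in range")
```

The scan touches every event on each query, whereas the indexed lookup only locates the two ends of the range. The price is the extra memory and the cost of keeping the index sorted as data arrives, which is the kind of trade-off the project would have to weigh.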

This is a joint project between the Steinbuch Centre for Computing (SCC) and the Institute for Nuclear Physics (IKP).

Tasks

  • analyse existing structure
  • identify bottlenecks
  • design and implement improvements
  • benchmark results
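The benchmarking task above could start from a small timing harness such as the sketch below. It is only an illustration under stated assumptions: the function names `benchmark` and `sample_workload` are hypothetical, and the workload is a stand-in computation rather than a real KCDC query.

```python
import statistics
import time


def benchmark(func, repeats=5, warmup=1):
    """Time func() several times and summarise the wall-clock results in seconds."""
    # Warm-up runs are executed but not recorded, so one-off start-up
    # costs do not distort the measurements.
    for _ in range(warmup):
        func()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)

    return {
        "runs": len(timings),
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def sample_workload():
    """Stand-in workload (hypothetical); replace with the operation under test."""
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(benchmark(sample_workload, repeats=10))
```

In practice the callable passed to `benchmark` would wrap a representative database query, and the same query would be timed before and after each change to the data structures or indices. Reporting the median alongside the minimum and maximum makes a single slow outlier less likely to dominate the comparison.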

Requirements

  • basic user-level knowledge of Linux
  • programming in Python, JavaScript (Node.js) and/or other cross-platform scripting languages
  • familiarity with benchmarking and NoSQL databases would be an advantage

Contact

Marek.Szuba@kit.edu - 29178

Doris.Wochele@kit.edu - 22418