Difference between revisions of "Design and Deployment of a Sharded Cluster for the KASCADE Cosmic-ray Data Centre"
From Lsdf
(Created page with "Back to the list of topics = Description = [https://kcdc.ikp.kit.edu/ KASCADE Cosmic-ray Data Centre] (KCDC) makes publicly available the data f...") |
(Replaced content with "{{db|1=topic exists no longer}}") |
||
Line 1: | Line 1: | ||
+ | {{db|1=topic exists no longer}} |
||
− | [[Studentische_Arbeiten_am_SCC|Back to the list of topics]] |
||
− | |||
− | = Description = |
||
− | [https://kcdc.ikp.kit.edu/ KASCADE Cosmic-ray Data Centre] (KCDC) makes publicly available the data from the astroparticle-physics experiment KASCADE. The system will eventually hold over 20 TB of data, or nearly half a billion events. Since 2015 it has been using the NoSQL database MongoDB as its storage back-end. |
||
− | |||
− | The goal of this project is to assist in migrating the KCDC database from a single server to a sharded (partitioned) cluster. In particular, you will be required to select and evaluate suitable shard keys for the partitioned collections. |
||
− | |||
− | This is a joint project between the Steinbuch Centre for Computing (SCC) and the Institute for Nuclear Physics (IKP). |
||
− | |||
− | = Tasks = |
||
− | * analyse common KCDC workflows from the point of view of database operations |
||
− | * identify candidates for shard keys |
||
− | * deploy a MongoDB cluster and activate sharding |
||
− | * evaluate performance |
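The shard-key tasks above can be illustrated with a minimal sketch that compares candidate keys offline, before any cluster is deployed. It assumes a sample of event documents is available as Python dictionaries; the field names `event_id` and `run`, the four-shard layout, and the MD5-based placement are all hypothetical stand-ins and do not reproduce MongoDB's own hashed-index behaviour or the actual KCDC schema.

```python
import hashlib
from collections import Counter


def shard_for(value, n_shards):
    """Map a key value to a shard index with a stable hash.

    This only imitates the idea of a hashed shard key; it is not
    MongoDB's hashing function.
    """
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def evaluate_key(docs, field, n_shards=4):
    """Report cardinality and load imbalance for one candidate key."""
    values = [doc[field] for doc in docs]
    per_shard = Counter(shard_for(v, n_shards) for v in values)
    ideal = len(values) / n_shards  # documents per shard if perfectly even
    return {
        "field": field,
        "cardinality": len(set(values)),
        # 1.0 means perfectly balanced; n_shards means everything on one shard
        "imbalance": max(per_shard.values()) / ideal,
    }


# Hypothetical sample: 1000 events grouped into 4 runs of 250 events each.
events = [{"event_id": i, "run": i // 250} for i in range(1000)]

for candidate in ("event_id", "run"):
    print(evaluate_key(events, candidate))
```

A low-cardinality field such as the hypothetical `run` can place documents on at most as many shards as it has distinct values, which is why cardinality is reported alongside the imbalance figure. A real evaluation would also have to account for the query patterns found in the workflow analysis.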
||
− | The project can be extended with additional goals to meet the requirements of a Master's thesis. |
||
− | |||
− | = Requirements = |
||
− | * familiarity with MongoDB and sharding |
||
− | * basic administrator-level knowledge of Linux |
||
− | * knowledge of Python, JavaScript (Node.js) or another cross-platform scripting language would be an asset |
||
− | |||
− | = Contact = |
||
− | Marek.Szuba@kit.edu - 29178 |
||
− | |||
− | Doris.Wochele@kit.edu - 22418 |