MongoDB as an In-memory Sharded Database

{{db|1=topic exists no longer}}
[[Studentische_Arbeiten_am_SCC|Back to the list of topics]]

== Description ==

The Data Life-Cycle Lab Earth and Environment at KIT manages data from Earth-observing climate satellites such as Envisat MIPAS. The data is stored in the NoSQL database MongoDB. Because of its size, it has to be distributed across a large database cluster using sharding, the horizontal-partitioning solution available in MongoDB.
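As a rough illustration of what sharding a collection involves, the sketch below builds the two administrative commands (<code>enableSharding</code> and <code>shardCollection</code>) that are normally sent to a <code>mongos</code> router. The database name <code>mipas</code>, the collection name <code>measurements</code>, and the shard-key field <code>orbit</code> are hypothetical placeholders, not the names used on our cluster, and the choice of a hashed shard key is an assumption made only for this example.

```python
def sharding_commands(database, collection, key_field):
    """Build the admin commands that shard one collection on a hashed key.

    All three arguments are illustrative; a real deployment would pick the
    shard key after studying its query patterns.
    """
    namespace = f"{database}.{collection}"
    return [
        # Allow collections of this database to be sharded.
        {"enableSharding": database},
        # Partition the collection horizontally on a hashed shard key.
        {"shardCollection": namespace, "key": {key_field: "hashed"}},
    ]


def apply_commands(client, commands):
    """Send each command to the admin database of a mongos router.

    `client` is assumed to behave like pymongo.MongoClient, i.e. to expose
    `client.admin.command(...)`.
    """
    return [client.admin.command(cmd) for cmd in commands]


if __name__ == "__main__":
    # Hypothetical names, printed only to show the command documents.
    for cmd in sharding_commands("mipas", "measurements", "orbit"):
        print(cmd)
```

Keeping the command construction separate from the call that sends it makes it possible to inspect the commands without a running cluster.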

A typical MongoDB instance makes heavy use of available system memory but ultimately relies on underlying persistent storage. Our database cluster, however, offers only low-performance persistent storage, which is unsuitable for sustained load. MongoDB on our cluster must therefore operate as an ''in-memory database''. Your task will be to research the optimal configuration of MongoDB for in-memory operation, to develop tools for the initial population of cluster nodes and for periodic commits of their data to persistent storage, and finally to evaluate the performance of the system.
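One possible shape for the population and commit tools is a thin wrapper around the standard <code>mongorestore</code> and <code>mongodump</code> command-line utilities. The sketch below is only a starting point under several assumptions: the host name, port, and directory layout are hypothetical, each node is dumped independently, and nothing is done to make snapshots consistent across shards.

```python
import subprocess
import time
from datetime import datetime, timezone
from pathlib import Path


def snapshot_path(base_dir, when):
    """Return a timestamped snapshot directory below base_dir."""
    return Path(base_dir) / when.strftime("%Y%m%dT%H%M%SZ")


def dump_command(host, port, out_dir):
    """Argument list for committing one node's data to persistent storage."""
    return ["mongodump", "--host", host, "--port", str(port),
            "--out", str(out_dir)]


def restore_command(host, port, dump_dir):
    """Argument list for the initial population of a node from a snapshot."""
    return ["mongorestore", "--host", host, "--port", str(port),
            str(dump_dir)]


def commit_periodically(host, port, base_dir, interval_s, rounds):
    """Dump the node `rounds` times, waiting interval_s seconds in between."""
    for i in range(rounds):
        target = snapshot_path(base_dir, datetime.now(timezone.utc))
        # check=True raises CalledProcessError if mongodump fails.
        subprocess.run(dump_command(host, port, target), check=True)
        if i < rounds - 1:
            time.sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical node and path; prints the command instead of running it.
    when = datetime.now(timezone.utc)
    print(dump_command("node01", 27018, snapshot_path("/persistent/dumps", when)))
```

Whether full dumps at a fixed interval are adequate, or whether incremental commits are needed, is one of the questions the evaluation would have to answer.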

== Requirements ==
* basic administration of Linux/Unix systems
* good working knowledge of Python, JavaScript (Node.js), or another scripting language capable of interfacing with MongoDB
* familiarity with MongoDB and/or sharding would be a plus

== Contact ==
For more information, please contact [mailto:Marek.Szuba@kit.edu Marek Szuba].

Latest revision as of 09:57, 6 February 2017