Real-time Data Analytics

From Lsdf

Latest revision as of 18:56, 13 September 2016

Description

This topic plays an important role in real-world machine learning applications such as recommender systems, stock market analysis, anomaly detection, and Internet of Things sensor data [0]. The goal of the project is to create basic models using Spark [1,2] and Streaming Random Forest (Mondrian Forest) [3,4], and to apply the resulting algorithms to streaming data such as meeting calls [5] or financial data [6].
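To illustrate what a basic streaming model looks like, here is a minimal sketch of online linear regression trained by stochastic gradient descent on mini-batches, which is the general idea behind the streaming linear regression described in [1]. It is plain Python and does not use the Spark API. The function names `sgd_update` and `synthetic_stream`, the step size, and the synthetic data (a line with slope 2 and intercept 1 plus small noise) are assumptions made only for this example.

```python
import random


def sgd_update(weights, batch, step_size=0.05):
    """Update the model in place with one SGD pass over a mini-batch.

    weights: list of floats; the intercept is stored as the last entry.
    batch: list of (features, label) pairs, features being a list of floats.
    """
    for features, label in batch:
        # Linear prediction: dot product of coefficients and features plus intercept.
        prediction = sum(w * x for w, x in zip(weights, features)) + weights[-1]
        error = prediction - label
        # Gradient step for the squared-error loss.
        for i, x in enumerate(features):
            weights[i] -= step_size * error * x
        weights[-1] -= step_size * error
    return weights


def synthetic_stream(n_batches, batch_size, seed=0):
    """Yield mini-batches drawn from y = 2x + 1 + noise (hypothetical data)."""
    rng = random.Random(seed)
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            x = rng.uniform(-1.0, 1.0)
            y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)
            batch.append(([x], y))
        yield batch


# The model is updated as each mini-batch arrives; no batch is stored.
weights = [0.0, 0.0]  # [slope, intercept]
for batch in synthetic_stream(n_batches=200, batch_size=10):
    weights = sgd_update(weights, batch)

print("slope:", round(weights[0], 2), "intercept:", round(weights[1], 2))
```

The model only ever sees one mini-batch at a time, so memory use stays constant regardless of how long the stream runs.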

The analysis will include an investigation of how the time window width (the length of the analyzed data) affects the accuracy of the resulting predictions.
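The window-width investigation could be set up roughly as follows. This is a sketch under stated assumptions, not the project's actual method: the predictor is a simple sliding-window mean, the stream is a synthetic series with a slow upward drift plus Gaussian noise, accuracy is measured as mean absolute error (MAE), and the names `windowed_mae` and `synthetic_series` are hypothetical.

```python
import random
from collections import deque


def windowed_mae(stream, width):
    """Predict each value as the mean of the previous `width` values.

    Returns the mean absolute error over all points for which a full
    window was available.
    """
    window = deque(maxlen=width)  # oldest value is dropped automatically
    total_error = 0.0
    count = 0
    for value in stream:
        if len(window) == width:
            prediction = sum(window) / width
            total_error += abs(prediction - value)
            count += 1
        window.append(value)
    return total_error / count if count else float("nan")


def synthetic_series(n, seed=0):
    """Slowly drifting level plus unit Gaussian noise (hypothetical data)."""
    rng = random.Random(seed)
    level = 0.0
    series = []
    for _ in range(n):
        level += 0.05
        series.append(level + rng.gauss(0.0, 1.0))
    return series


series = synthetic_series(2000)
results = {width: windowed_mae(series, width) for width in (1, 5, 20, 100)}
for width, mae in results.items():
    print(f"width={width:4d}  MAE={mae:.3f}")
```

On this synthetic series, a very narrow window mostly tracks noise, while a very wide window lags behind the drift, so the error is expected to be lowest at an intermediate width. Real streams such as [5] or [6] may behave differently.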

References

[0] Introduction to stream: An Extensible Framework for Data Stream Clustering Research with R
[1] https://spark.apache.org/docs/latest/mllib-linear-methods.html#streaming-linear-regression
[2] https://spark.apache.org/docs/latest/streaming-programming-guide.html
[3] http://www.ment.at/blog-old/streaming-random-forest
[4] http://research.cs.queensu.ca/home/cords2/ideas07.pdf
[5] http://meetup.github.io/stream/rsvpTicker/
[6] http://finance.google.com/finance/info?client=ig&q=NASDAQ%3AGOOG

Contact

Bogdan.Lobodzinski@kit.edu