The basic idea is to buffer the data of a continuously arriving stream for a fixed interval of time. Each resulting segment (a micro-batch) is then processed by Spark as an ordinary batch job, and the intermediate per-batch results are combined by applying operations such as union.
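The micro-batch idea can be illustrated without Spark itself. The sketch below is a minimal plain-Python analogy, not Spark Streaming's actual API: timestamped records are bucketed into one-second batches (the buffering step), each batch is processed independently (the per-batch Spark job), and the intermediate results are merged (the union/combine step). All names here (`micro_batches`, `process_batch`, `combine`) are illustrative, not part of any Spark API.

```python
from collections import Counter
from itertools import groupby

def micro_batches(events, interval):
    """Group (timestamp, record) pairs into batches of `interval` seconds.

    Stand-in for the buffering step: Spark Streaming chops the live
    stream into small segments, one per batch interval.
    """
    key = lambda ev: int(ev[0] // interval)
    return [[rec for _, rec in group]
            for _, group in groupby(sorted(events), key=key)]

def process_batch(batch):
    # Each segment is processed like an ordinary batch job
    # (here: a simple word count).
    return Counter(word for line in batch for word in line.split())

def combine(results):
    # Intermediate per-batch results are merged, analogous to
    # applying a union over the partial results.
    total = Counter()
    for r in results:
        total.update(r)
    return total

events = [(0.1, "a b"), (0.9, "a"), (1.2, "b c"), (2.5, "c")]
batches = micro_batches(events, interval=1.0)   # three 1-second batches
per_batch = [process_batch(b) for b in batches]
total = combine(per_batch)
# total == Counter({'a': 2, 'b': 2, 'c': 2})
```

In Spark Streaming proper, the sequence of batches is represented as a DStream, and the per-batch processing reuses the same operators as batch Spark jobs.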
- Apache Spark Streaming - official web site
- Spark Streaming Programming Guide
- M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica, "Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters," Proceedings of the 4th USENIX Conference on Hot Topics in Cloud Computing (HotCloud'12), USENIX Association, Berkeley, CA, USA, 2012.