A Big Data Streaming Recipe
I am Konstantin Gregor, a developer and software consultant with a background in mathematics and machine learning. For the past two years, I have worked for TNG Technology Consulting in Munich, Germany, where I help our clients develop big data applications with a main focus on real-time streaming. I always enjoy sharing my knowledge of this awesome field of IT with other developers.
There is a lot to consider when setting up a big data streaming application: How much data do we need to handle? How important are “real-time” results? What constraints are there on data quality? And how can we deal with various failure scenarios? The open-source world offers numerous big data frameworks for processing unbounded data, each with its own mechanisms to tackle these problems. I want to introduce these frameworks and explain their mechanisms in order to give you some insight into which ingredients you should combine to build a big data streaming application that suits your needs.
Erratum (video, 23:49): Spilling does not happen in Flink's DataStream API. However, state that does not fit in memory can be handled with the RocksDB state backend.
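As a minimal sketch of the correction above: the RocksDB state backend mentioned in the erratum can be enabled through Flink's configuration. The keys below come from the Flink documentation, but the checkpoint directory path is a placeholder, and the exact key names can vary between Flink versions (newer releases use `state.backend.type`), so treat this as an illustrative fragment rather than a drop-in config:

```yaml
# flink-conf.yaml (illustrative fragment)
# Use RocksDB so keyed state is kept on disk and is not limited by heap memory
state.backend: rocksdb
# Only checkpoint the changes since the last checkpoint (useful for large state)
state.backend.incremental: true
# Placeholder path — in production this would typically be a durable store such as HDFS or S3
state.checkpoints.dir: file:///tmp/flink-checkpoints
```

The same choice can also be made programmatically on the `StreamExecutionEnvironment`, but the configuration-file route keeps the application code backend-agnostic.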