The Future is Real-Time Data
We now live in a world where data is constantly generated and collected, from transaction logs and website clicks to IoT sensors and more. The ability to act on information as it arrives has become crucial for staying competitive in many industries. Stream processing enables organizations to harness data in real time, driving fast, informed decisions that impact everything from customer experience to operational efficiency.
This presentation delves into the fundamentals of stream processing and the open-source technologies that support it. We'll explore how it is reshaping industries such as finance, e-commerce, and IoT, both by improving existing operations and by enabling entirely new use cases. To close, we'll show how to quickly set up a data pipeline that handles incoming streams of data.
We'll start with the core concepts behind stream processing, including how it differs from traditional batch processing, the approach most organizations rely on today. Next, we'll look at where stream processing delivers the most value, whether that's detecting fraud as it happens, tailoring online experiences to user behavior, or monitoring critical infrastructure. Stream processing is not a one-size-fits-all solution, however, so we'll also discuss when it is the right choice and when other data processing techniques are a better fit.
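To make the batch-versus-stream contrast concrete, here is a minimal, purely illustrative Python sketch (not part of the talk material): a batch job recomputes an aggregate after all data for a period has been collected, while a stream processor keeps running state and updates it the moment each event arrives. The event shape and function names are hypothetical.

```python
# Illustrative contrast between batch and stream processing.
# The "events" stand in for any real-time feed (clicks, transactions, sensor readings).
from collections import defaultdict

events = [
    {"user": "alice", "amount": 42.0},
    {"user": "bob", "amount": 13.5},
    {"user": "alice", "amount": 7.25},
]

# Batch style: wait until all data for the period is collected, then compute once.
def batch_totals(all_events):
    totals = defaultdict(float)
    for e in all_events:
        totals[e["user"]] += e["amount"]
    return dict(totals)

# Streaming style: maintain running state and update it per event,
# so the result is available as soon as each event arrives.
running_totals = defaultdict(float)

def on_event(e):
    running_totals[e["user"]] += e["amount"]
    return dict(running_totals)  # current view of the aggregate

print(batch_totals(events))   # one result, at the end of the batch window
for e in events:
    print(on_event(e))        # a fresh result after every event
```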
Moving from concepts to application, we'll dive into what a typical stream processing pipeline looks like, exploring each stage from data ingestion and processing to real-time analytics and storage. To provide a broad overview, we'll survey some of the most popular open-source tools and platforms, comparing commonly used messaging systems (Apache Kafka, Apache Pulsar, etc.) with stream processing engines (Apache Flink, RisingWave, etc.). Each of these tools plays a distinct role within a real-time architecture, from handling data ingestion and ensuring data reliability to enabling complex transformations and aggregations. We'll outline the specific strengths of each, with guidance on when and where they shine within a modern data stack.
Finally, we'll bring these concepts and tools together with a live demonstration of a simple yet powerful data pipeline. This hands-on example walks attendees through building a pipeline that ingests data, processes it, and surfaces actionable insights in real time.
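As a rough preview of the kind of pipeline the demo builds, here is a minimal sketch that uses Kafka for ingestion via the kafka-python client; the broker address, topic name, and event fields are assumptions for illustration, and the actual demo may use different components (for example RisingWave or Flink for the processing step).

```python
# Minimal ingest -> process -> output sketch, assuming a local Kafka broker
# and the kafka-python client (pip install kafka-python). Topic and fields are hypothetical.
import json
from collections import defaultdict
from kafka import KafkaConsumer, KafkaProducer

# Ingestion: a producer writes click events to a Kafka topic.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("clicks", {"user": "alice", "page": "/pricing"})
producer.flush()

# Processing: a consumer reads the stream and maintains a running count per page.
consumer = KafkaConsumer(
    "clicks",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

page_counts = defaultdict(int)
for message in consumer:
    event = message.value
    page_counts[event["page"]] += 1
    # Output: emit the updated aggregate as soon as each event is processed.
    print(f"{event['page']}: {page_counts[event['page']]} views so far")
```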
By the end of the session, attendees will have a clear understanding of how stream processing works, where it fits into a modern data architecture, and how they can start applying these techniques to deliver value in their projects.