Database Systems and Information Management

SpaDa - Adaptive Query Compilation for Stream Processing

Adaptive Query Compilation for Stream Processing is part of the DFG priority programme SPP 2037: Scalable Data Management on Future Hardware.

Over the last decades, the requirements of data processing workloads have changed significantly. Nowadays, real-time analytics requires the execution of long-running queries over unbounded, continuously changing, high-velocity data streams. Common stream processing engines (SPEs) such as Flink and Storm scale out execution to achieve high throughput and low latency. However, recent research revealed that these SPEs cannot fully utilize available hardware resources. First, they do not take the particular hardware resources into account for optimization. Second, they do not take changing data characteristics into account, which hinders a variety of adaptive optimizations. Third, they rely heavily on user-defined functions, which introduce a high processing overhead due to data serialization and transformation. In this project, we address these challenges to enable efficient processing of complex stream processing pipelines on modern hardware. To this end, we propose a novel adaptive query compiler for stream processing that optimizes code with regard to the hardware resources and changing data characteristics. Furthermore, we study ways to embed complex user-defined functions into compiled pipelines efficiently, in order to support a wide range of advanced analytical data processing workloads.

For more information, please visit https://gepris.dfg.de/gepris/projekt/447268056?language=en.

Project Duration: 11/08/2022 - 31/07/2023

Funding Agency: German Research Foundation (DFG)