Researchers in the Database Systems and Information Management (DIMA) Group at TU Berlin presented three full research papers and three demo papers at the 47th International Conference on Very Large Data Bases (VLDB 2021), which took place from August 16 to 20, 2021. In conjunction with VLDB, DIMA researchers also co-organized the BOSS 2021 workshop on open source big data systems.
DIMA researchers contributed three full research papers and three demos of their latest database systems research to VLDB, the leading international conference on the management and analysis of very large datasets. The paper “Automated Feature Engineering for Algorithmic Fairness,” authored by Ricardo Salazar Diaz, Ziawasch Abedjan, and then-DIMA member Felix Neutatz, proposes a highly accurate, fairness-aware approach for machine learning. Condor, a high-performing dataflow system that integrates approximate summarizations, was presented in the second paper, “In the Land of Data Streams where Synopses are Missing, One Framework to Bring Them All,” by Rudi Poepsel-Lemaitre, Martin Kiefer, Joscha von Hein, Jorge-Arnulfo Quiané-Ruiz, and Volker Markl. In the third paper, “Scotch: Generating FPGA-Accelerators for Sketching at Line Rate,” Martin Kiefer, Ilias Poulakis, Sebastian Breß, and Volker Markl presented Scotch, a novel system that accelerates sketch maintenance using FPGAs and enables faster processing of compressed data.
Additionally, two papers from the NebulaStream research program were presented at VLDB’s VLIoT workshop.
BOSS 2021 Workshop
Dr. Quiané-Ruiz, Principal Researcher in the DIMA Group, also co-organized the Big Data Open Source Systems (BOSS) workshop, which was held in conjunction with VLDB. On August 16, BOSS 2021 featured tutorials on open source big data systems such as Apache Calcite, Apache Arrow, and Apache AsterixDB, as well as a presentation on Apache Wayang by DIMA researcher Dr. Zoi Kaoudi.
The highlight of the workshop was the keynote “Lessons learned from building and growing Apache Spark” by Reynold Xin, co-founder of Databricks and one of the main developers of Apache Spark, one of the most important open source massive data analytics engines currently in use.
The Publications in Detail:
Full Research Papers:
Ricardo Salazar, Felix Neutatz, Ziawasch Abedjan: Automated Feature Engineering for Algorithmic Fairness. Proc. VLDB Endow. 14(9): 1694-1702 (2021) [PDF]
Rudi Poepsel Lemaitre, Martin Kiefer, Joscha von Hein, Jorge-Arnulfo Quiané-Ruiz, Volker Markl: In the Land of Data Streams where Synopses are Missing, One Framework to Bring Them All. Proc. VLDB Endow. 14(10): 1818-1831 (2021) [PDF]
Martin Kiefer, Ilias Poulakis, Sebastian Breß, Volker Markl: Scotch: Generating FPGA-Accelerators for Sketching at Line Rate. Proc. VLDB Endow. 14(3): 281-293 (2020) [PDF]
Demo Papers:
Kaustubh Beedkar, David Brekardin, Jorge-Arnulfo Quiané-Ruiz, Volker Markl: Compliant Geo-distributed Data Processing in Action. Proc. VLDB Endow. 14(12): 2843-2846 (2021) [PDF]
Alexander Renz-Wieland, Tobias Drobisch, Zoi Kaoudi, Rainer Gemulla, Volker Markl: Just Move It! Dynamic Parameter Allocation in Action. Proc. VLDB Endow. 14(12): 2707-2710 (2021) [PDF]
Zihao Chen, Zhizhen Xu, Chen Xu, Juan Soto, Volker Markl, Weining Qian, Aoying Zhou: HyMAC: A Hybrid Matrix Computation System. Proc. VLDB Endow. 14(12): 2699-2702 (2021) [PDF]
VLIoT Workshop Papers:
Dimitrios Giouroukis, Johannes Jestram, Steffen Zeuch, Volker Markl: Streaming Data through the IoT via Actor-Based Semantic Routing Trees. Open J. Internet Things 7(1): 59-70 (2021) [PDF]
Xenofon Chatziliadis, Eleni Tzirita Zacharatou, Steffen Zeuch, Volker Markl: Monitoring of Stream Processing Engines Beyond the Cloud: An Overview. Open J. Internet Things 7(1): 71-82 (2021) [PDF]