Database Systems and Information Management

An Interview with Dr. Jonas Traub

"I first learned about DIMA in 2012 when I met Volker Markl at the IBM Almaden Research Center. During my Master’s studies, I focused on data stream processing in Apache Flink, given my passion for database technologies, and contributed to Flink's Streaming API. Subsequently, I was hired as a research associate."

Jonas Traub was a research associate at the DIMA Group from 2015 to 2020. Afterwards, he joined SAP as a Senior Developer.

How did you first learn about the DIMA Group and when did you join?

I first learned about DIMA in 2012 when I met Volker Markl at the IBM Almaden Research Center (ARC) in San Jose, California. Back then, I was a Bachelor’s student [1] conducting an internship at IBM ARC. During my Master’s studies, I focused on data stream processing in Apache Flink, given my passion for database technologies. I later went on to contribute to the first versions of Flink’s Streaming API [2]. Subsequently, in 2015, I was hired as a full-time staff member, i.e., a Research Associate/Doctoral Student, which has enabled me to pursue research in stream processing, database technologies, and sensor data acquisition.

Can you describe your research?

My major interests are in sensor data acquisition and real-time processing. As the Internet of Things (IoT) progresses, millions of sensor nodes will yield massive amounts of streaming data, which will challenge today’s processing systems, particularly under low-latency constraints and at a reasonable operating cost. To address this challenge, I developed an on-demand streaming technique, which drastically reduces the amount of data that needs to be transferred between sensor nodes and data analysis applications. This technique was presented at the 2017 ACM Symposium on Cloud Computing (SoCC) and published in its proceedings [3]. In addition, I investigated optimizations to reduce application analysis times. For example, streaming window aggregations are a bottleneck in many applications. Consequently, two of my publications [2, 4] address this particular type of bottleneck.

Since joining the group, how have you grown professionally, particularly as a researcher?

Working at DIMA requires diverse skills, which makes the job rich in variety. The core research work demands a combination of technical skills, writing skills, and broad knowledge of state-of-the-art database technologies. I have benefited a lot from my meetings with DIMA team members, visiting researchers, and cooperating partners. I have also had the opportunity to travel and present my research to fellow researchers at many technical conferences, which has enabled me to grow my professional network.

Given your teaching experiences while in the group, how would you describe your teaching philosophy? What have you learned about the educational process?

I consider teaching to be one of the most impactful tasks at DIMA. Every semester, many students learn about database systems and information management in our courses. Upon completing their coursework, students are well prepared to demonstrate their competencies in database technologies in the workforce. Personally, I enjoy working closely with motivated students on current research challenges, either in projects or in Master’s theses under my supervision, which under ideal conditions yield research publications (e.g., [5, 6]).

What advice would you give to future students who are interested in pursuing a PhD in the DIMA Group?

Take the first step and reach out to DIMA representatives. Request a face-to-face meeting at TU Berlin or arrange a Skype call at a mutually convenient time. Prepare a list of questions and feel free to ask them. We will be happy to discuss our employment opportunities with you.

Biography

You can learn more about Jonas at https://www.user.tu-berlin.de/powibol/.

References

[1] Adaptive Fragment Assignment for Processing File Data in a Database. Andrey Balmin, Romulo Antonio Pereira Goncalves, Fatma Ozcan, and Jonas Traub. US Patent 9,576,000 B2, 2014.

[2] Cutty: Aggregate Sharing for User-Defined Windows. Paris Carbone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, and Volker Markl. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM '16), October 24 - 28, 2016, Indianapolis, Indiana, USA.

[3] Optimized On-Demand Data Streaming from Sensor Nodes. Jonas Traub, Sebastian Breß, Tilmann Rabl, Asterios Katsifodimos, and Volker Markl. In ACM Symposium on Cloud Computing 2017 (SoCC '17), Sep 25 - 27, 2017, Santa Clara, CA, USA.

[4] Scotty: Efficient Window Aggregation for out-of-order Stream Processing. Jonas Traub, Philipp M. Grulich, Alejandro Rodríguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, and Volker Markl. In IEEE International Conference on Data Engineering (ICDE), 2018.

[5] Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive Windowing. Philipp Marian Grulich, René Saitenmacher, Jonas Traub, Sebastian Breß, Tilmann Rabl, Volker Markl. In International Conference on Extending Database Technology (EDBT), 2018.

[6] Efficient SIMD Vectorization for Hashing in OpenCL. Tobias Behrens, Viktor Rosenfeld, Jonas Traub, Sebastian Breß, Volker Markl. In International Conference on Extending Database Technology (EDBT), 2018.