Short description of the event:
Our demo paper “Farming Your ML-based Query Optimizer’s Food”, co-authored by Robin van de Water, Francesco Ventura, Zoi Kaoudi, Jorge-Arnulfo Quiane-Ruiz, and Volker Markl, was presented yesterday and today at the virtual conference ICDE 2022 and has won the Best Demonstration Award. The award committee members unanimously chose our demonstration based on the relevance of the problem, the high potential of the proposed approach, and the excellent presentation.
Short description of the work:
As machine learning becomes a core component of query optimizers, e.g., to estimate costs or cardinalities, it is critical to collect a large amount of labeled training data to build these machine learning models. The training data should consist of diverse query plans with their labels (execution time or cardinality). However, collecting such a training dataset is a tedious and time-consuming task: it requires both developing numerous plans and executing them to acquire ground-truth labels. The latter can take days, if not months, depending on the size of the data. In a research paper presented last year at SIGMOD 2021, we introduced DataFarm, a framework for efficiently creating training data for optimizers with learning-based components. This demo paper extends DataFarm with an intuitive graphical user interface that gives users informative details about the generated plans and guides them through the generation process step by step. As output, DataFarm lets users download both the generated plans, for use as a benchmark, and the training data (jobs with their labels).
Link to the video "GUI DataFarm - Farming Your ML-based Query Optimizer’s Food"