Approximate data analysis trades off accuracy for faster or cheaper results. Instead of performing data analyses on the entire dataset, space-efficient summaries are constructed and evaluated. Applications from various domains, such as databases, computational biology, and machine learning, must accept a loss in accuracy to reach the desired throughput and response times with limited resources. Specialized parallel processors, such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs), promise substantial gains in the efficiency of approximate analyses. However, exploiting these architectures is challenging: it requires accounting for an architecture's properties and explicitly designing and implementing for parallelism.
In this thesis, we investigate novel approaches to maintaining and evaluating data summaries on parallel processors:
First, we propose kernel density models for GPU-accelerated join selectivity estimation in relational databases. Our estimators avoid common error-prone assumptions and typically provide higher accuracy than the state of the art. Furthermore, the estimators map well to the massively parallel execution model of modern GPUs. In contrast to central processing units (CPUs), our approach allows for larger model sizes and more accurate estimates within the same time budget.
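To illustrate the general idea behind kernel-density selectivity estimation (not the thesis's actual GPU estimator), the following hypothetical NumPy sketch scores an equi-join by averaging a Gaussian kernel over all pairs of sampled attribute values from the two relations. Each pair's kernel evaluation is independent, which is what makes the computation amenable to massively parallel hardware; the function name, sampling, and bandwidth choice here are illustrative assumptions.

```python
import numpy as np

def kde_join_selectivity(sample_r, sample_s, bandwidth):
    """Toy kernel-density score for an equi-join on a numeric attribute.

    Averages a Gaussian kernel over all (r, s) sample pairs: pairs whose
    values are close (relative to the bandwidth) contribute strongly,
    distant pairs contribute almost nothing. Every pair is independent,
    so on a GPU each pair could be evaluated by its own thread.
    """
    # Pairwise differences via broadcasting: shape (len(r), len(s)).
    diffs = sample_r[:, None] - sample_s[None, :]
    # Gaussian kernel contribution for each pair.
    contrib = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    contrib /= bandwidth * np.sqrt(2.0 * np.pi)
    return contrib.mean()
```

A larger sample (model size) tightens the estimate at the cost of more kernel evaluations, which is the accuracy/time trade-off the GPU shifts in the estimator's favor.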
Second, we propose Scotch, a holistic approach to implementing FPGA-accelerated sketching at the full rate of the interconnect.
The framework generates hardware descriptions for a broad class of sketches based on a common abstraction and domain-specific language. Furthermore, Scotch automatically maximizes the sketch size to boost accuracy. As these tasks commonly require an FPGA expert, Scotch makes FPGA-accelerated sketching more accessible to software engineers.
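As a software illustration of the class of summaries such a framework targets (not Scotch's hardware output or its DSL), the following is a minimal count-min sketch, a standard sketch whose fixed per-item work of hash, index, and update is what makes line-rate hardware implementations feasible. The class and hashing scheme here are assumptions for the example.

```python
import hashlib

class CountMinSketch:
    """Minimal count-min sketch: approximate frequency counts.

    A depth x width table of counters; each of `depth` hash functions
    maps an item to one counter per row. Queries take the minimum over
    the rows, so estimates only ever overcount. A larger table lowers
    the error, which is why maximizing sketch size boosts accuracy.
    """

    def __init__(self, width, depth):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # Derive one independent-looking hash per row by salting blake2b.
        for row in range(self.depth):
            h = hashlib.blake2b(item.encode(),
                                salt=row.to_bytes(8, "little")).digest()
            yield row, int.from_bytes(h[:8], "little") % self.width

    def update(self, item):
        for row, col in self._buckets(item):
            self.table[row][col] += 1

    def query(self, item):
        return min(self.table[row][col] for row, col in self._buckets(item))
```

Because each update touches a bounded number of counters with no data-dependent control flow, the same structure maps naturally to a fixed hardware pipeline.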
Third, we propose an optimistic data-parallel architecture for FPGA-accelerated sketch summary maintenance. We share resources among inputs, instead of pessimistically replicating all resources for every input. Thus, our optimistic architecture reduces the resources required to achieve a given throughput for applications that tolerate stalls in processing due to resource congestion.
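The trade-off can be seen in a toy software model (a deliberate simplification, not the thesis's architecture): several inputs per cycle contend for a smaller pool of shared processing units, and any surplus stalls until the next cycle. A pessimistic design would instead provision one unit per possible input.

```python
def simulate_shared_pipeline(arrivals_per_cycle, shared_units):
    """Count total stall-cycles when inputs share processing units.

    Each cycle, newly arriving inputs plus any backlog contend for
    `shared_units` pipelines; inputs beyond that capacity wait.
    Returns the total number of input-cycles spent stalled.
    """
    backlog, stalls = 0, 0
    for arriving in arrivals_per_cycle:
        pending = backlog + arriving
        processed = min(pending, shared_units)
        backlog = pending - processed  # surplus inputs stall
        stalls += backlog
    return stalls
```

With bursty arrivals and few shared units, stalls accumulate; provisioning for the peak eliminates them at a proportionally higher resource cost, which is precisely the congestion-tolerance trade-off the optimistic architecture exploits.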
Overall, this thesis shows that specialized parallel processors can substantially increase the efficiency of approximate data analysis, and that suitable systems and abstractions can greatly lower the entry barrier to using them.