Embedded Systems Architecture (Architektur eingebetteter Systeme)

Description

Reconfigurable computing devices such as Field-Programmable Gate Arrays (FPGAs) play an increasingly important role in both embedded systems and High-Performance Computing (HPC). Major industry players are shifting their focus towards FPGAs, most notably Intel, which in 2015 acquired Altera, then the second-largest FPGA manufacturer in the world. The high throughput of FPGAs, a result of their inherent parallelism, combined with their ability to be reconfigured for almost any application, makes them a strong choice for a wide range of use cases. In this project line, we research the use of FPGAs for different applications, ranging from machine intelligence to signal processing, as well as operating system support and CAD tools for FPGAs.

Deep Learning on FPGAs
Deep convolutional neural networks (DCNNs) are widely used, e.g. for image processing applications. Implementing DCNNs is difficult because they demand high computational throughput and high memory bandwidth, often in a power-constrained environment. FPGAs can address these problems and offer a way to significantly accelerate the computation of DCNNs. They have been rapidly adopted as accelerators, with improved latency and energy efficiency compared to CPU- and GPU-based implementations. This project focuses on the investigation of DCNNs on FPGAs.
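
To give a feel for the computational and memory demands mentioned above, the following minimal sketch counts the multiply-accumulate operations (MACs) and the weight and output storage of a single convolutional layer. The function name `conv_layer_cost`, its parameters, and the example layer dimensions are illustrative assumptions and do not describe any specific network or accelerator from this project.

```python
def conv_layer_cost(h_out, w_out, c_in, c_out, k, bytes_per_value=1):
    """Rough cost of one convolutional layer (hypothetical helper).

    h_out, w_out    -- height and width of the output feature map
    c_in, c_out     -- number of input and output channels
    k               -- side length of the square kernel
    bytes_per_value -- assumed storage size of one weight or activation
    """
    # One multiply-accumulate per output pixel, output channel,
    # input channel, and kernel position.
    macs = h_out * w_out * c_out * c_in * k * k
    # Every output channel has its own k x k kernel per input channel.
    weight_bytes = c_out * c_in * k * k * bytes_per_value
    # The produced feature map must be written back to memory.
    output_bytes = h_out * w_out * c_out * bytes_per_value
    return {
        "macs": macs,
        "weight_bytes": weight_bytes,
        "output_bytes": output_bytes,
    }


if __name__ == "__main__":
    # Assumed example layer: 56x56 output, 64 -> 64 channels, 3x3 kernel.
    print(conv_layer_cost(h_out=56, w_out=56, c_in=64, c_out=64, k=3))
```

Even this single assumed layer requires more than 100 million MACs per image, which illustrates why parallel hardware and careful memory planning matter.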

Memory Architectures and Accesses on FPGA-SoCs
FPGAs can be used to accelerate many applications due to their inherent parallelism. However, the required memory accesses can become a major bottleneck and significantly reduce performance. As part of our Reconfigurable Architectures cluster, we analyze the memory subsystems of FPGAs and FPGA-SoCs such as Xilinx’s Zynq-7000 to optimize memory access behavior. We focus both on benchmarking and on developing tools and IP cores that ease the implementation of HW/SW codesign approaches on FPGA-SoCs.
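
Why memory accesses can limit performance is easiest to see with a toy model. The sketch below assumes that every burst transfer pays a fixed number of overhead cycles (for example for address setup and arbitration) on top of its data beats. The function `effective_bandwidth`, all parameter names, and the example numbers are hypothetical; this is not a model of the Zynq-7000 or of any measured system.

```python
def effective_bandwidth(bus_bytes, clock_hz, burst_beats, overhead_cycles):
    """Sustained bytes per second under a simple fixed-overhead burst model.

    bus_bytes       -- bytes transferred per data beat (bus width)
    clock_hz        -- bus clock frequency in Hz
    burst_beats     -- data beats per burst
    overhead_cycles -- assumed idle cycles paid once per burst
    """
    if burst_beats < 1 or overhead_cycles < 0:
        raise ValueError("burst_beats must be >= 1 and overhead_cycles >= 0")
    # Peak bandwidth: one beat on every clock cycle.
    peak = bus_bytes * clock_hz
    # Fraction of cycles that actually carry data.
    utilisation = burst_beats / (burst_beats + overhead_cycles)
    return peak * utilisation


if __name__ == "__main__":
    # Assumed 64-bit bus at 100 MHz with 16 overhead cycles per burst.
    for beats in (1, 4, 16, 64, 256):
        bw = effective_bandwidth(8, 100e6, beats, 16)
        print(f"{beats:3d} beats per burst: {bw / 1e6:7.1f} MB/s")
```

Under these assumptions, short bursts leave most of the peak bandwidth unused, which is one reason why access patterns deserve quantitative analysis.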

Video Coding on FPGAs
Video encoding and decoding are highly computationally intensive tasks, which often require hardware implementations to meet real-time requirements. FPGAs are well suited to efficient implementations because their large number of parallel computing resources can exploit the abundant thread- and data-level parallelism in these workloads. Additionally, fully customized hardware accelerators can speed up critical parts that offer few parallelization opportunities, e.g. entropy coding.
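
The importance of the hard-to-parallelize parts can be illustrated with Amdahl's law, which bounds the overall speedup by the fraction of work that remains sequential. The sketch below is a generic illustration; the function name `amdahl_speedup` and the fractions used are assumptions and not measurements of any codec.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Overall speedup according to Amdahl's law.

    parallel_fraction -- share of the runtime that can be parallelized (0..1)
    n_units           -- number of parallel processing units
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not accelerated; the parallel part is divided
    # evenly among the available units.
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # Assumed workload in which 90% of the runtime is parallelizable.
    for units in (2, 8, 64, 1024):
        print(f"{units:4d} units: speedup {amdahl_speedup(0.9, units):.2f}")
```

With an assumed 10% sequential share, the speedup stays below 10 no matter how many parallel units are added, so a dedicated accelerator for the sequential part is the only way to go further.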

Operating System Support for Stream Processing on SoC/FPGA Hybrid ICs
The heterogeneous computing resources available on modern SoC ICs present a challenge to operating system designers. SoC/FPGA hybrid ICs additionally include programmable logic fabric, which is commonly used to implement physical data streams. Focussing on Unix-like operating systems, we research possibilities to unify the management of shared-memory-based and physical data streams. In this context, we also explore fine-grained, autonomous processing of interrupt requests, in which mundane tasks such as DMA descriptor management are handled by various computational resources, e.g. real-time processors.

Involved personnel

Prof. Dr. Ben Juurlink

Anastasiia Dolinina

Robert Drehmel

Matthias Goebel

Philipp Habermann

Awards

Best Paper Award at ARC 2017

Best Student Paper Award at ISM 2015

HiPEAC Paper Award 2015

Publications

A High-Performance Hardware Accelerator for HEVC Motion Compensation

High Performance Memory Accesses on FPGA-SoCs: A Quantitative Analysis (HiPEAC Paper Award)

A Trace-based Workflow for Evaluating Application-specific Memory Bandwidth for FPGA-SoCs

A Quantitative Analysis of the Memory Architecture of FPGA-SoCs (Best Paper Award at the 13th ARC 2017)

A Methodology for Predicting Application-specific Achievable Memory Bandwidth for HW/SW-Codesign

An Application-Specific Memory Management Unit for FPGA-SoCs

Design and Implementation of a High-Throughput CABAC Hardware Accelerator for the HEVC Decoder

Optimizing HEVC CABAC Decoding with a Context Model Cache and Application-specific Prefetching

Application-specific Cache and Prefetching for HEVC CABAC Decoding