Daniel Maier was awarded the Outstanding Paper Runner Up at the 2019 International Conference on High Performance Computing & Simulation (HPCS 2019), held in Dublin, Ireland, from July 15 to 19. The paper, titled "Approximating Memory-bound Applications on Mobile GPUs", was co-authored by Nadjib Mammeri, Biagio Cosenza and Ben Juurlink. In their work, the authors investigate the approximation of applications on mobile GPUs depending on the availability of fast local memory. Under the theme of “HPC and Modeling & Simulation for the 21st Century”, HPCS 2019 focussed on a wide range of state-of-the-art as well as emerging topics pertaining to high-performance and large-scale computing systems at both the client and backend levels.
AES member Farzaneh Salehiminapour received the best student award (Nachwuchspreis) at the 28th Workshop on Parallel Algorithms, Parallel Computer Structures and Parallel System Software (PARS 2019). The organizers honored Mrs. Salehiminapour for the submission and presentation of her paper “Reducing DRAM Accesses through Pseudo-Channel Mode”, co-authored by Jan Lucas, Matthias Goebel and Ben Juurlink. In this paper, the authors present and evaluate a technique to use the pseudo-channel mode feature of GDDR5X for merging memory requests and thus reducing the number of memory accesses. The PARS workshop is organized by the special interest group on parallel algorithms, parallel computer structures and parallel system software within the German Informatics Societies (GI/ITG). Its 28th edition was held at TU Berlin in Berlin, Germany.
AES member Angela Pohl received the Best Presentation Award at the 21st International Workshop on Software and Compilers for Embedded Systems (SCOPES '18). She presented the full paper “Control Flow Vectorization for ARM NEON”, which was co-authored by Nicolás Morini, Biagio Cosenza, and Ben Juurlink. In this work, the authors discuss the capabilities of compilers’ auto-vectorization passes and present strategies to compensate for the lack of masked instructions on ARM NEON platforms, which are critical for vectorizing loops with control flow. The work was selected for the award by attendee vote. The 21st edition of SCOPES was held in St. Goar, Germany, and showcased more than 20 presentations from the field of embedded systems.
AES member Matthias Göbel has received a free pass for the HiPEAC 2018 conference in Manchester, UK. With this pass, the HiPEAC network of excellence honors the AES group, led by Prof. Ben Juurlink, for helping to advertise the HiPEAC Jobs Portal. Matthias will use this opportunity to present his latest research results at the co-located Sixth International Workshop on Power-Efficient GPU and Many-core Computing (PEGPUM).
The AES group is an active member of the HiPEAC network that provides a platform for cross-disciplinary research collaboration, promotes the transformation of research results into products and services, and is an incubator for the next generation of world-class computer scientists.
Matthias Göbel, Ahmed Elhossini and Ben Juurlink received the Best Paper Award at the 13th International Symposium on Applied Reconfigurable Computing (ARC 2017) in Delft, NL for their paper "A Quantitative Analysis of the Memory Architecture of FPGA-SoCs". The work is co-authored by Chi Ching Chi and Mauricio Alvarez-Mesa of Spin Digital, a spin-off of AES.
In this paper, we analyze the various memory and communication interconnects found in FPGA-SoCs, particularly the Zynq-7020 and Zynq-7045 from Xilinx and the Cyclone V SE SoC from Intel. Issues such as different access patterns, cache coherence and full-duplex communication are analyzed, both for generic accesses and for a real workload from the field of video coding. Furthermore, the paper shows that by carefully choosing the memory interconnect networks as well as the software interface, high-speed memory access can be achieved for various scenarios.
Matthias Göbel, Chi Ching Chi, Mauricio Alvarez-Mesa and Ben Juurlink received a HiPEAC Paper Award for the paper "High Performance Memory Accesses on FPGA-SoCs: A Quantitative Analysis" which was published at the 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM 2015).
The authors analyzed the memory bandwidth of an FPGA-SoC, namely Xilinx's Zynq-7000. Their main focus was on two-dimensional memory accesses, which are often found in video coding and image processing applications. They implemented various hardware and software components that perform synthetic accesses with a given width and height. Scenarios such as combining multiple ports or using cache coherency were evaluated. Furthermore, a memory trace of an HEVC motion compensation unit was used to simulate a real workload. In contrast to other papers, the results showed that the full bandwidth of the memory controller and the DDR chips can be used. Therefore, the FPGA and the memory ports themselves cannot be considered bottlenecks. In addition, the results showed that Full-HD HEVC decoding in real time on a Zynq-7000 is possible, while 4K decoding is too ambitious without caching or memory compression techniques.
My PhD student Philipp Habermann presented the paper "Optimizing HEVC CABAC Decoding with a Context Model Cache and Application-specific Prefetching" at the IEEE International Symposium on Multimedia (ISM 2015) in Miami, FL.
He received the Best Student Paper Award for his work, which was co-authored by Chi Ching Chi, Mauricio Alvarez-Mesa and Ben Juurlink.
The authors provide a design space exploration of different cache configurations for HEVC CABAC hardware decoding. It is demonstrated that the decoder throughput can be significantly increased when a cache replaces a bigger context model memory in the critical data path. Furthermore, it is shown that the cache miss rate can be effectively reduced with an application-specific prefetching algorithm and the corresponding optimized memory layout, up to the point where it is not noticeable anymore.
HiPEAC Technology Transfer Award (TTA) for transferring some of the proprietary video coding technology to a Greek SME.
“Best Poster Award” for our joint poster “Nexus++: A hardware Task Manager for the StarSs Programming Model” at the 8th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC 2013), January 2013, Berlin, Germany.
Recently, several programming models have been proposed that try to simplify parallel programming. One of these programming models is StarSs. In StarSs, the programmer has to identify pieces of code that can be executed as tasks, as well as their inputs and outputs. The runtime system (RTS) then determines the dependencies between tasks and schedules ready tasks onto worker cores. Previous work has shown, however, that the StarSs RTS may constitute a bottleneck that limits the scalability of the system, and proposed a hardware task manager called Nexus to eliminate this bottleneck. Nexus has several limitations, however. For example, the number of inputs and outputs of each task is limited to a fixed constant, and Nexus does not support double buffering. Here we present Nexus++, which addresses these as well as other limitations. Experimental results show that double buffering achieves a speedup of $54\times$, and that Nexus++ significantly enhances the scalability of applications parallelized using StarSs.
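The dependency step described above can be illustrated with a minimal Python sketch. It is not the StarSs runtime or the Nexus++ hardware design: the function names `build_dependencies` and `schedule_in_waves`, the task tuples and the example data are all hypothetical, and the sketch only assumes that each task declares the data items it reads and writes, as the abstract states.

```python
from collections import defaultdict


def build_dependencies(tasks):
    """Derive task dependencies from declared inputs and outputs.

    tasks: list of (name, inputs, outputs) tuples in program order.
    Returns a dict mapping each task name to the set of earlier tasks
    it must wait for. (Hypothetical helper, for illustration only.)
    """
    last_writer = {}                          # data item -> last task that wrote it
    readers_since_write = defaultdict(list)   # data item -> readers since that write
    deps = {name: set() for name, _, _ in tasks}

    for name, inputs, outputs in tasks:
        for item in inputs:
            if item in last_writer:
                deps[name].add(last_writer[item])         # read-after-write
        for item in outputs:
            if item in last_writer:
                deps[name].add(last_writer[item])         # write-after-write
            deps[name].update(readers_since_write[item])  # write-after-read
        # Record this task's accesses only after its own dependencies are known.
        for item in inputs:
            readers_since_write[item].append(name)
        for item in outputs:
            last_writer[item] = name
            readers_since_write[item] = []
    return deps


def schedule_in_waves(deps):
    """Group tasks into waves; all tasks in one wave are ready at the same time."""
    remaining = {name: set(waiting) for name, waiting in deps.items()}
    waves = []
    while remaining:
        ready = sorted(name for name, waiting in remaining.items() if not waiting)
        if not ready:
            raise ValueError("cyclic dependencies")
        waves.append(ready)
        for name in ready:
            del remaining[name]
        for waiting in remaining.values():
            waiting.difference_update(ready)
    return waves


# Hypothetical example: five tasks operating on data items a, b, c and d.
example_tasks = [
    ("t1", [], ["a"]),
    ("t2", [], ["b"]),
    ("t3", ["a", "b"], ["c"]),
    ("t4", ["a"], ["d"]),
    ("t5", ["c", "d"], ["a"]),
]
example_waves = schedule_in_waves(build_dependencies(example_tasks))
```

In this example, `t1` and `t2` are ready immediately, `t3` and `t4` become ready once both have finished, and `t5` has to wait for `t3` and `t4` because it overwrites `a`, which they still read. A software runtime performs this bookkeeping for every task it receives, which is the kind of overhead the abstract identifies as a potential scalability bottleneck.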
Awarded best paper in the area of processor architecture at the IASTED International Conference on Parallel and Distributed Computing and Systems (2002).
D. Cheresiz, B.H.H. Juurlink, S. Vassiliadis, H.A.G. Wijshoff, Architectural support for 3D graphics in the complex streamed instruction set (November 2002), 14th International Conference on Parallel and Distributed Computing Systems (PDCS 2002), 4-6 November 2002, Cambridge, USA, Best paper award in the area of processor architecture.
Spin Digital is a spin-off of the Technische Universität Berlin. We are specialists in video codecs. In particular, we have developed a highly efficient software implementation of the new HEVC/H.265 video coding standard, capable of 4K and 8K decoding and encoding on standard computing platforms. Our main focus is on high-performance and high-quality video: our software has been extensively optimized for ultra-high-quality video processing on multicore computing platforms. We build on the extensive research done by the AES group of TU Berlin on how to map video codecs to parallel computer architectures. Based on that research, Spin Digital has been able to produce one of the fastest video codecs currently available.