Communication Systems Group (Nachrichtenübertragung)

Multimedia Analysis and Processing

The Communication Systems Group of TU Berlin, led by Prof. Dr. Thomas Sikora, has a strong track record in multimedia analysis and processing, with more than 50 publications in this field, and has been involved in many nationally and internationally funded projects on the topic. The following list gives an overview of our research areas:

  • Video surveillance, people safety and privacy protection
  • Multi-object tracking
  • Optical flow estimation
  • Crowd analysis and people counting
  • Lost luggage detection
  • Violent behaviour detection
  • Genre and commercial detection in TV broadcast
  • Geo-tagging
  • Face detection and recognition
  • Image and video segmentation
  • MPEG-7
  • Analysis and classification of speech, audio, and video data
  • etc.

Research Activities

IOU Tracker

Tracking-by-detection is a common approach to multi-object tracking. As the performance of object detectors keeps increasing, the basis for a tracker becomes much more reliable. Combined with the higher frame rates that are now common, this shifts the challenges a successful tracker has to meet. We propose a very simple tracking algorithm that can compete with more sophisticated approaches at a fraction of the computational cost. Thorough experiments with a wide range of object detectors show its potential. The proposed method can easily run at thousands of frames per second (fps) while outperforming the state of the art on the DETRAC vehicle tracking dataset and achieving competitive results on the MOT17 benchmark.
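
The idea of associating detections purely by overlap can be illustrated with a short Python sketch. This is not the group's released implementation: the function names, the box format (x1, y1, x2, y2), and the thresholds `sigma_iou` and `t_min` are assumptions chosen for illustration, and detection scores are ignored for brevity.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def track_by_iou(frames, sigma_iou=0.5, t_min=2):
    """Greedy IoU association (illustrative thresholds).

    frames: list of per-frame lists of boxes.
    Returns tracks as lists of (frame_index, box) with at least t_min entries.
    """
    active, finished = [], []
    for t, detections in enumerate(frames):
        remaining = list(detections)
        still_active = []
        for track in active:
            last_box = track[-1][1]
            # Pick the unassigned detection that overlaps most with the track's last box.
            best = max(remaining, key=lambda d: iou(last_box, d), default=None)
            if best is not None and iou(last_box, best) >= sigma_iou:
                track.append((t, best))
                remaining.remove(best)
                still_active.append(track)
            elif len(track) >= t_min:
                # No sufficiently overlapping detection: the track ends here.
                finished.append(track)
        # Every detection that was not matched starts a new track.
        still_active.extend([(t, d)] for d in remaining)
        active = still_active
    finished.extend(track for track in active if len(track) >= t_min)
    return finished
```

Because each frame only requires IoU computations between the active tracks and the current detections, and no image data is touched, the per-frame cost of such a scheme stays very low.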

Robust Local Optical Flow Estimation

The Robust Local Optical Flow (RLOF) is a sparse optical flow and feature tracking method. Its main objective is to provide fast and accurate motion estimation. The main advantage of the RLOF approach is its adjustable runtime and computational complexity, which, in contrast to most common optical flow methods, depends linearly on the number of motion vectors (features) to be estimated. RLOF is a local optical flow method and is most closely related to the PLK method (better known as the KLT tracker) and thus to the well-known Lucas-Kanade method. A sparse-to-dense interpolation scheme allows fast computation of dense optical flow fields.

Lagrangian-based Video Analytics

We aim for innovative ways to process and use dynamic patterns in video motion to quantify salient motion features and thus improve computer vision performance for tasks such as identification, segmentation, and classification. The proposed methodology provides a powerful set of data-driven descriptors for continuous and integral motion analysis on variable temporal scales (i.e., for short-term as well as long-term motion features).

Probability Hypothesis Density (PHD) Multi-Object-Tracking Filter

The Probability Hypothesis Density (PHD) filter is a multi-object Bayes filter that has recently attracted a lot of interest in the tracking community, mainly for its linear complexity and its ability to deal with high clutter, especially in radar and sonar scenarios. In computer vision, however, the underlying constraints differ from those in radar scenarios and have to be taken into account when using the PHD filter.

Hyper-Parameter Optimization for Convolutional Neural Network Committees based on Evolutionary Algorithms

We propose a framework based on evolutionary algorithms that automatically optimizes the CNN structure by means of its hyper-parameters. Further, we extend the framework towards a joint optimization of a committee of CNNs to leverage specialization and cooperation among the individual networks. Experimental results show a significant improvement over the state of the art on the well-established MNIST dataset for handwritten digit recognition.
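
As a rough illustration of evolutionary hyper-parameter search, the following Python sketch runs a simple elitist loop over a small discrete search space. Everything in it is an assumption made for illustration: the search space, the mutation operator, and especially `toy_fitness`, which stands in for training a CNN and measuring its validation accuracy. It covers a single network only, not the joint committee optimization described above.

```python
import random

# Hypothetical search space; real CNN hyper-parameters would be more varied.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3, 4],
    "filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
}


def toy_fitness(cfg):
    """Stand-in for validation accuracy: higher is better, 0 is the optimum."""
    target = {"num_layers": 3, "filters": 64, "kernel_size": 5}
    return -sum(abs(cfg[k] - target[k]) / max(SEARCH_SPACE[k]) for k in cfg)


def mutate(cfg, rng):
    """Return a copy of cfg with one randomly chosen hyper-parameter resampled."""
    child = dict(cfg)
    key = rng.choice(sorted(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child


def evolve(fitness, generations=30, population_size=8, seed=0):
    """Elitist evolutionary search: parents and children compete for survival."""
    rng = random.Random(seed)
    population = [
        {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        for _ in range(population_size)
    ]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(population_size)]
        # Keep the best individuals among parents and children.
        population = sorted(population + children, key=fitness, reverse=True)[:population_size]
    return population[0]
```

Since parents survive alongside their children, the best fitness in the population never decreases from one generation to the next. In a real setting each fitness evaluation is a full training run, so the population size and number of generations dominate the cost.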

Background Subtraction / Foreground Detection / Static-Object Detection

Gaussian mixture models have been extensively used and enhanced in the surveillance domain because of their ability to adaptively describe multimodal distributions in real time with low memory requirements. Nevertheless, they still often suffer from the problem of converging to poor solutions if the main mode stretches and thus over-dominates weaker distributions. We propose complementary background models for background modelling and for detecting static and moving objects in crowded video sequences.

People Carrying Object Detection and Classification

Detecting and classifying people carrying objects is a problem known from surveillance scenarios. It can serve as a first step towards monitoring interactions between people and objects, such as depositing or removing an object. Our research focuses on new machine learning approaches for pedestrian detection and on new ways of feature representation, behavior analysis, and machine learning techniques for classification.

Multimodal Geo-Tagging

We present a hierarchical, multi-modal approach for placing Flickr videos on the map. Our approach makes use of external resources to identify toponyms in the metadata and of visual and textual features to identify similar content. First, the geographical boundaries extraction method identifies the country and its dimensions. We use a database of more than 3.6 million Flickr images, group them into geographical regions, and build a hierarchical model. A fusion of visual and textual methods is used to classify the video's location into possible regions. Next, the visually nearest neighbour method finds correspondences with the training images within the pre-classified regions. The video sequences are represented using low-level feature vectors from multiple key frames. Each test video is tagged with the geo-information of the visually most similar training item within the regions selected by the pre-classification step. The results show that we are able to tag one third of our videos correctly within an error of 1 km.
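
The final nearest-neighbour step and the 1 km error criterion can be sketched in Python as follows. This is only an illustration under stated assumptions: the feature vectors are plain lists of numbers, squared Euclidean distance stands in for whatever visual similarity measure is actually used, the function names are hypothetical, and the region pre-classification is assumed to have already produced the list of candidate training items.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_neighbour_tag(query_feature, training_items):
    """Return the (lat, lon) of the training item whose feature is closest to the query.

    training_items: list of (feature_vector, (lat, lon)) pairs from the
    pre-classified regions (assumed to be given).
    """
    def sq_dist(item):
        return sum((q - f) ** 2 for q, f in zip(query_feature, item[0]))

    return min(training_items, key=sq_dist)[1]


def within_error(predicted, ground_truth, max_km=1.0):
    """True if the predicted location lies within max_km of the ground truth."""
    return haversine_km(*predicted, *ground_truth) <= max_km
```

A video counts as correctly tagged at the 1 km level when `within_error` holds for its predicted and true coordinates.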

Consistent Two-Level Metric

Since the commonly used benchmarks for abandoned object detection (AOD) contain only a few abandoned objects and lack a standardized evaluation procedure, an objective performance comparison between different methods is difficult. Therefore, we propose a new evaluation metric that focuses on an end-user application case, together with an evaluation protocol that eliminates uncertainties in previous performance assessments.

Motion-based Object Segmentation

We present an unsupervised motion-based object segmentation algorithm for video sequences recorded with a moving camera, employing bidirectional inter-frame change detection. For every frame, two error frames are generated using motion compensation. They are combined, and a segmentation algorithm based on thresholding is applied. We employ a simple and effective error fusion scheme and consider spatial error localization in the thresholding step. We find the optimal weights for the weighted mean thresholding algorithm, which enable unsupervised, robust moving object segmentation.
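
The pipeline of two error frames, fusion, and weighted mean thresholding can be sketched in Python as below. The sketch makes several assumptions that are not taken from the source: frames are small grayscale images stored as nested lists, the motion-compensated previous and next frames are already available, the two error frames are fused with a pixel-wise minimum, and the `weight` value is arbitrary. Spatial error localization is left out.

```python
def abs_error(frame, compensated):
    """Pixel-wise absolute difference between a frame and a motion-compensated frame."""
    return [[abs(p - q) for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(frame, compensated)]


def segment_moving_object(current, prev_compensated, next_compensated, weight=2.0):
    """Return a binary foreground mask for the current frame.

    weight scales the mean of the fused error to obtain the threshold
    (illustrative value, not an optimized one).
    """
    err_prev = abs_error(current, prev_compensated)
    err_next = abs_error(current, next_compensated)
    # Assumed fusion rule: keep only errors that appear in both directions.
    fused = [[min(a, b) for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(err_prev, err_next)]
    pixels = [v for row in fused for v in row]
    threshold = weight * sum(pixels) / len(pixels)
    return [[1 if v > threshold else 0 for v in row] for row in fused]
```

With a minimum-based fusion, an error that shows up in only one temporal direction, for example in a region uncovered by the moving object, is suppressed, while the object itself produces an error in both directions and survives the threshold.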

Short-term Motion-based Object Segmentation

Motion-based segmentation approaches either employ long-term motion information or suffer from a lack of accuracy and robustness. We present an automatic motion-based object segmentation algorithm for video sequences recorded with a moving camera that employs short-term motion information only. For every frame, two error frames are generated using motion compensation. They are combined, and a thresholding segmentation algorithm is applied. Recent advances in the field of global motion estimation enable outlier elimination in the background area, and thus a more precise definition of the foreground. We propose a simple and effective error frame generation scheme and consider spatial error localization. We thereby achieve improved performance compared with a previously proposed short-term motion-based method, and we provide both subjective and objective evaluations.

Datasets

TUB CrowdFlow Dataset

Optical Flow Dataset and Benchmark for Visual Crowd Analysis. This new optical flow dataset exploits the possibilities of a recent video engine to generate sequences with ground-truth optical flow for large crowds in different scenarios. We break with the trend of the last decade of introducing ever larger displacements to pose new difficulties. Instead, we focus on real-world surveillance scenarios in which numerous small, partly independent, non-rigidly moving objects observed over a long temporal range pose a challenge.

Multi-Object and Multi-Camera Tracking Dataset (MOCAT)

The TU Berlin Multi-Object and Multi-Camera Tracking Dataset (MOCAT) is a synthetic dataset for training and testing tracking and detection systems in a virtual world. One of its key advantages is that a complete and accurate ground truth, including pixel-accurate object masks, is available. All sequences are rendered three times, each with different illumination settings. This makes it possible to measure directly how illumination influences the algorithm under test. For each sequence there are 8 to 10 different camera views (including camera calibration information) with partly overlapping fields of view (FOVs). The ground truth contains the world position of each object, so multi-camera tracking performance can be evaluated as well. All sequences contain vehicles, animals, and pedestrians as objects to detect and track.

Software

IOU Tracker @ GitHub

Tracking-by-detection is a common approach to multi-object tracking. As the performance of object detectors keeps increasing, the basis for a tracker becomes much more reliable. Combined with the higher frame rates that are now common, this shifts the challenges a successful tracker has to meet. We propose a very simple tracking algorithm that can compete with more sophisticated approaches at a fraction of the computational cost. This Git repository provides the Python implementation of the IOU Tracker.

Robust Local Optical Flow Library (RLOF) @ OpenCV contrib (4.1)

The Robust Local Optical Flow (RLOF) is a sparse optical flow and feature tracking method. We are delighted that it is now part of the OpenCV contrib library (4.1.0). The RLOF methods are motivated by the problem of local motion estimation via robust regression with linear models. The main objective is to provide a real-time capable, accurate, and scalable motion estimation solution. The software implements several versions of the RLOF algorithms for sparse and dense optical flow estimation.

Background Subtraction Library (SGMM-SOD)

We provide binaries of the SGMM-SOD library to help other researchers compare their results or use our work as a module in their own research. The files contain a binary package for the Windows operating system and a minimal example of how to use the library. We have tried to keep the interface as simple as possible.

Evaluation framework for abandoned object detection

Since the commonly used benchmarks for abandoned object detection (AOD) contain only a few abandoned objects and lack a standardized evaluation procedure, an objective performance comparison between different methods is difficult. Therefore, we propose a new evaluation metric that focuses on an end-user application case, together with an evaluation protocol that eliminates uncertainties in previous performance assessments.

Awards

Challenge Winner IWOT4S @ AVSS 2018

We are delighted to announce that our IOU tracker won the IWOT4S Challenge for multi-object tracking for the second year in a row at the International Workshop on Traffic and Street Surveillance for Safety and Security at IEEE AVSS 2018, Auckland, New Zealand, 27.11.2018.

We won the VisDrone 2018 Challenge @ ECCV!

We are delighted to announce that our V-IOU tracker won the VisDrone 2018 Challenge for multi-object tracking at the ECCV 2018 workshop "Vision Meets Drone: A Challenge" (VisDrone2018 for short) on September 8, 2018, in Munich, Germany.

Challenge Winner IWOT4S @ AVSS 2017

We are delighted to announce that our IOU tracker won the IWOT4S Challenge for multi-object tracking at the International Workshop on Traffic and Street Surveillance for Safety and Security at IEEE AVSS 2017, Lecce, Italy, 29.08.2017.

Best Paper Award @ IET ICDP 2015

We are delighted to announce that our paper "A Local Feature based on Lagrangian Measures for Violent Video Classification" won the Best Paper Award at the IET International Conference on Imaging for Crime Detection and Prevention, 15.07.2015 - 17.07.2015. Congratulations to Tobias Senst and the co-authors.
