Addressing computer-science research challenges needed for breakthroughs in data-driven research
This project addresses several computer-science research challenges that must be solved to enable breakthroughs in data-driven research. It focuses on the data-explosion problem, one of the most important challenges across almost all areas of science. The sheer volume and distributed nature of many data sets create hard technical problems in data transport and data processing. The data are also becoming more complex and heterogeneous, especially when data sets from different instruments and databases are combined, as is typical in system-level sciences. This complexity in particular makes it challenging to extract semantically useful information from the data.
The parallel and distributed computing environments on which applications must run are also changing drastically at every level, from processor architectures (many-core chips such as GPUs) to networking (hybrid networks, sensor networks), storage architectures, and middleware (virtualization and clouds). The project divides its focus between the volume aspect of the data explosion (big data) and the complexity aspect (heterogeneous data). One postdoc works on infrastructure innovation and distributed data processing, especially on many-core processors such as GPUs; a second focuses on information management, in particular complexity analysis of scientific data sets across disciplines.