Four pioneering projects: 2020 eTEC winners revealed

13 Apr 2021 - 7 min

Photo by Shahadat Rahman

The Netherlands eScience Center is excited to announce the four winning projects of the eTEC 2020 Call. The eTEC 2020 Call supports the research and development of innovative eScience technologies and software associated with optimized data handling, data analytics and efficient computing, driven by demand from any specified research discipline.

Each proposal is classified into one of four technological research directions:

1. Tools for data quality, integration and cleaning
2. Virtual Research Environments: Infrastructure for ‘1000’ Experiments
3. Platforms for Big & Complex Spatial Data 
4. Green Computing: Energy Efficiency in Research Computing

Each of the winning projects receives a grant consisting of funds and in-kind support by Research Software Engineers from the eScience Center.

1. Tools for Data Quality, Integration and Cleaning

The eye of the beholder: Transparent pipelines for assessing online information quality
Dr. Jacco R. van Ossenbruggen (CWI – Centrum Wiskunde & Informatica)

The sheer amount of online information, and the impact it has on society, raises one of the key questions of our time. It also points to a gap, and an opportunity, for automation in assessing the quality of online information. As corporate social networks rarely operate transparently, quality assessment can be perceived as subjective or biased. Any technology that supports such assessments should therefore be transparent, and flexible enough to support specific scholarly requirements.

To achieve these goals, this project will build on the existing QuPiD and NEWSGAC platforms. QuPiD is a proof-of-concept pipeline for information quality assessment that involves crowdsourcing, machine learning, and symbolic reasoning. To allow scholars to benefit from the platform, the team will empower users to ‘tune’ these pipelines. For example, a user might decide to collect training data manually or from a crowdsourcing platform. The pipeline might then use supervised or unsupervised machine learning methods to analyse the quality of documents, and each choice has implications that the user must be aware of. The project will also build on the NEWSGAC framework, as it allows domain specialists to investigate and tune their machine learning pipelines. In all, the project aims to extend NEWSGAC's transparency-enabling architecture to fulfil the above requirements, and in doing so will create hybrid pipelines that combine crowdsourcing, symbolic reasoning, and machine learning.

2. Virtual Research Environments: Infrastructure for ‘1000’ Experiments 

Virtual Research Environment for Integrative Modelling of Biomolecular Complexes
Prof. dr. Alexandre M.J.J. Bonvin (Utrecht University)

Computational structural biology has provided valuable insights in many research fields, in particular into the complex and intricate network of interactions between macromolecules. The HADDOCK software, used by more than 19,000 users worldwide, including pharma companies, provides a greater understanding of those interactions through command-line and web server interfaces. However, two main limitations of HADDOCK hamper its further development and efficient use. First, its core protocol is a static workflow, which hinders its expansion and incorporation into custom pipelines. Second, its operation mode is geared towards HTC (grid/local) resources, which prevents scaling to thousands of runs and the efficient use of Cloud resources.

This project will address these limitations by developing a customizable, interactive, HTC/Cloud (and HPC)-optimized and reusable Virtual Research Environment for Integrative Modelling of Biomolecular Complexes, which will consist of three main layers: a workflow builder/manager, an execution middleware, and an analysis, storage and sharing infrastructure. Together, these layers will stimulate the integration of third-party software. Wherever possible, the project will reuse existing solutions from the eScience Research Software Directory. By integrating all steps involved in studying biomolecular interactions, this VRE will lower the steep learning curve for researchers and students from different fields, and contribute to reproducible research and FAIR sharing of data.

3. Platforms for Big & Complex Spatial Data 

nD-PointCloud continuous level representation for spatio-temporal phenomena in Open Point Cloud Maps
Prof. dr. ir. Peter J.M. van Oosterom (Delft University of Technology)

This innovative eScience technology proposal aims to make point clouds the primary representation of spatio-temporal features throughout the entire processing chain: data acquisition, storage, analysis, visualization and dissemination. Today, point clouds are mainly used in the data acquisition phase, while gridded (raster) or object (vector) models are used in the other phases. Handling the resulting extract-transform-load actions is a growing problem when working with big data. Based on a novel use of high-resolution nD space-filling curves, this proposal will realize a deep integration of space, time and scale as the basis for data organization, enabling High Performance/Throughput Computing for enormous point clouds. By enabling operations directly on the raw point cloud data, nD-PointCloud represents a major advance in domains requiring lossless spatio-temporal data of extremely high accuracy. A distributed Open Point Cloud Map (OPCM) infrastructure will be developed that supports the sharing of big nD-PointCloud data and enables interactive real-time visualization, with perspective views free of data density shocks, continuous zoom-in/out, and progressive data streaming between the server and client. Applications from the water management domain will be used as Proof-of-Principle. If successful, nD-PointCloud will become the preferred model enabling progress in (research) fields like cultural heritage, land administration, vegetation monitoring, building modelling, transportation and mobility.
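To give a rough idea of how a space-filling curve can fold space, time and scale into a single ordering, here is a minimal sketch using a Morton (Z-order) key. The announcement does not say which curve the project uses, so the choice of Morton order, the function names `morton_encode` and `morton_decode`, and the sample coordinates are all illustrative assumptions rather than the project's actual method.

```python
def morton_encode(coords, bits):
    """Interleave the lowest `bits` bits of each integer coordinate into one key.

    `coords` is a tuple of non-negative integers, for example
    (x, y, time, scale). Points that are close in all dimensions tend to
    receive nearby keys, which is what makes the key useful for ordering data.
    """
    ndim = len(coords)
    for value in coords:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"coordinate {value} does not fit in {bits} bits")
    key = 0
    for bit in range(bits):
        for dim, value in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * ndim + dim.
            key |= ((value >> bit) & 1) << (bit * ndim + dim)
    return key


def morton_decode(key, ndim, bits):
    """Invert morton_encode: recover the tuple of coordinates from a key."""
    coords = [0] * ndim
    for bit in range(bits):
        for dim in range(ndim):
            coords[dim] |= ((key >> (bit * ndim + dim)) & 1) << bit
    return tuple(coords)


# Hypothetical sample points as (x, y, time, scale), each on an 8-bit grid.
points = [(200, 17, 3, 1), (12, 14, 3, 0), (13, 14, 3, 0)]

# Sorting by the key gives one linear order over all four dimensions.
ordered = sorted(points, key=lambda p: morton_encode(p, bits=8))
```

Because the key is an ordinary integer, it can serve as the sort key of a file or the primary key of a database table, so one index covers space, time and scale together. Real systems typically use more bits per dimension and may prefer other curves, such as the Hilbert curve, for better locality.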

4. Green Computing: Energy Efficiency in Research Computing 

Reducing Energy Consumption in Radio-astronomical and Ultrasound Imaging Tools (RECRUIT)
Dr. John W. Romein (ASTRON – Netherlands Institute for Radio Astronomy)

When it comes to algorithms, technologies, and energy constraints, imaging efforts in radio astronomy and medical ultrasound share fundamentally similar challenges, both near the edge and further downstream in processing pipelines. Although their time and space scales are orders of magnitude apart, the data processing and enabling hardware needed to image a galaxy or a brain share common requirements: the data must be processed in a local, real-time, and energy-efficient way. In this project, ASTRON (the Netherlands Institute for Radio Astronomy) and CUBE (the Center for Ultrasound and Brain imaging at Erasmus MC) join forces to tackle HPC and energy-efficiency challenges by utilizing new technologies and algorithmic improvements. The Adaptive Compute Acceleration Platform by Xilinx and Tensor Cores in NVIDIA GPUs are top examples of such enabling technologies. This project will unlock the potential of these highly efficient technologies for use in radio astronomy and ultrasound brain imaging, delivering open-source libraries, innovations in limited-precision algorithms, and a new tool to analyze energy efficiency. Ultimately, this will allow more (energy) efficient instruments to be built.

Our team at the eScience Center congratulates all winners, and thanks all participants for the variety of strong submissions. Together with our extended research community, we look forward to the results of these projects in due course.

Looking for new opportunities? Go to our current Open Call 2021 and our Technology Call 2021.