In some projects, data needs to be processed as it comes in rather than stored first. Astronomy is an example of a field where this is the case: observations arrive at terabytes or even petabytes per second and need significant processing before they can be used in research. Moving or storing such volumes of data is infeasible with current technology, so much of the processing has to happen at the sensors.
At the eScience Center we work on such real-time solutions, for example for RFI mitigation: removing radio-frequency interference caused by Earth-based transmitters such as mobile phones.
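To give a flavor of what RFI mitigation involves, here is a minimal sketch of one baseline approach: flagging samples whose power deviates strongly from a robust estimate of the background level. Production pipelines use far more sophisticated methods; the function name, the 5-sigma cutoff, and the sample values are all illustrative assumptions.

```python
# Threshold-based RFI flagging on a 1-D stream of power samples.
import statistics

def flag_rfi(samples, n_sigma=5.0):
    """Flag samples that deviate strongly from the median power level."""
    med = statistics.median(samples)
    # Median absolute deviation: a robust spread estimate that is not
    # itself inflated by the interference we are trying to detect.
    mad = statistics.median(abs(s - med) for s in samples)
    cutoff = n_sigma * 1.4826 * mad  # 1.4826 scales MAD to a std. dev.
    return [abs(s - med) > cutoff for s in samples]

# A quiet band with one strong interferer in the middle:
samples = [1.0, 1.1, 0.9, 1.05, 0.95, 50.0, 1.0, 1.02]
flags = flag_rfi(samples)
```

Because the median and MAD ignore outliers, the interferer is flagged without distorting the background estimate, which is why robust statistics are a common starting point for this kind of streaming filter.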
The World Wide Web has changed the way people communicate and collaborate by decentralizing publication. A similar shift is happening to complex data sets as Web technology is applied to data representation. By using Linked Data principles, it has become possible to combine diverse, distributed data sets. This makes a big difference for fields of research that must combine many types of data, hosted by many different organizations, to reach conclusions. At the eScience Center we apply Linked Data in bioinformatics (e.g. metabolomics or proteomics), but also in cultural heritage and the legal domain.
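The core idea that makes such combinations possible can be shown in a toy example: two independently hosted data sets expressed as subject-predicate-object triples can be joined because they identify the same entity by the same URI. The URIs, predicates, and triples below are invented for illustration (real systems would use RDF stores and SPARQL).

```python
# Two "data sets" from different (imaginary) organizations, each a set
# of (subject, predicate, object) triples about chemical compounds.
GLUCOSE = "http://example.org/compound/glucose"

chemistry_db = {
    (GLUCOSE, "hasFormula", "C6H12O6"),
    (GLUCOSE, "hasMass", "180.16"),
}
pathway_db = {
    (GLUCOSE, "occursIn", "glycolysis"),
}

# Because both sets use the same URI for glucose, merging them and
# selecting on that URI joins knowledge from both sources:
merged = chemistry_db | pathway_db
about_glucose = {(p, o) for (s, p, o) in merged if s == GLUCOSE}
```

The join happens "for free" through the shared identifier, which is exactly the property that Linked Data principles (globally unique URIs for entities) are designed to provide.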
At the eScience Center we work with sensor data of many kinds, from wearable sensors such as Fitbit to LOFAR telescope sensors and Copernicus satellites.
Common issues are calibrating the sensors and stitching overview images together from many separate observations. In some cases this involves edge computing to curtail the data explosion; in other cases it involves complex modeling of the sensors themselves.
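As a small, hedged illustration of the calibration problem: a common first step is fitting a linear gain-and-offset correction by comparing raw sensor readings against trusted reference values. The closed-form least-squares fit below is a sketch; real sensor models are usually nonlinear and the data here is made up.

```python
def fit_linear_calibration(raw, reference):
    """Least-squares fit of reference ~= gain * raw + offset."""
    n = len(raw)
    mean_x = sum(raw) / n
    mean_y = sum(reference) / n
    # Standard closed-form simple linear regression.
    gain = (sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
            / sum((x - mean_x) ** 2 for x in raw))
    offset = mean_y - gain * mean_x
    return gain, offset

raw = [0.0, 1.0, 2.0, 3.0]
reference = [0.5, 2.5, 4.5, 6.5]   # generated as 2 * raw + 0.5
gain, offset = fit_linear_calibration(raw, reference)
calibrated = [gain * x + offset for x in raw]
```

Once fitted, the same `gain` and `offset` are applied to all subsequent readings from that sensor, so the fit only has to be done when the instrument is (re)characterized.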
Databases are at the core of modern data analysis. Efficient computation often involves moving the computation into the database. At the eScience Center we help scientists to pick the best data storage and processing methods for the task at hand. We work on new ways to index data for efficient retrieval and on improved performance of existing database systems.
The types of databases we deal with at the eScience Center are as varied as the types of research being done with them and include relational databases, graph databases, (distributed) search engines, geographical information systems and memory stores.
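The idea of moving computation into the database can be made concrete with a small sketch using Python's built-in SQLite bindings: instead of pulling every row into application code and aggregating there, the database engine does the aggregation and only small result rows cross the boundary. The table and column names are invented for the example.

```python
import sqlite3

# An in-memory database standing in for a real sensor-readings store.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
con.executemany("INSERT INTO readings VALUES (?, ?)",
                [("a", 1.0), ("a", 3.0), ("b", 10.0)])

# The GROUP BY aggregation runs inside the database engine; Python only
# receives one small row per sensor instead of the full table.
rows = con.execute(
    "SELECT sensor, AVG(value) FROM readings "
    "GROUP BY sensor ORDER BY sensor").fetchall()
```

With millions of rows per sensor, this difference (shipping aggregates versus shipping raw data) is often what makes an analysis feasible at all.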
Many projects at the eScience Center revolve around continuously improving the model we have of a certain natural phenomenon, such as a model of the weather or a model of crowd behavior. Models are abstractions and therefore by definition imperfect. To reduce the error propagation in a model, a method we commonly use is data assimilation, which periodically tunes a model to actual data to keep it from diverging from reality. This is usually applied to weather, ocean or climate models, but is also used to accelerate the convergence of machine learning.
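One of the simplest data assimilation schemes, nudging (Newtonian relaxation), illustrates the principle: after each forecast step, the model state is pulled a fraction of the way toward the latest observation, so an imperfect model cannot drift arbitrarily far from reality. The toy dynamics, gain value, and observations below are all invented for illustration; operational systems use methods such as ensemble Kalman filters.

```python
def forecast(x):
    """A deliberately biased model: it relaxes toward 10, while the
    real system being observed relaxes toward a higher value."""
    return x + 0.1 * (10.0 - x)

def assimilate(x, observation, gain=0.5):
    """Nudge the model state a fraction `gain` toward the observation."""
    return x + gain * (observation - x)

x = 0.0
observations = [1.2, 2.3, 3.3, 4.2, 5.0]  # noisy samples of the truth
for obs in observations:
    x = forecast(x)        # advance the (imperfect) model
    x = assimilate(x, obs)  # correct it with the latest observation
```

Without the assimilation step the biased model would converge to its own attractor; with it, the state tracks the observations, which is exactly the "tuning to actual data" described above.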