The proposal on Technologies and Documentation of African languages is the winner of the Lorentz-eScience competition 2020-2021
Earlier this year the eScience Center and the Lorentz Center invited researchers to join our eScience competition and to apply for a leading-edge workshop on digitally enhanced research.
The winning proposal is:
‘Creating synergies: Technologies and Documentation of African languages’
Sara Petrollino, Felix Ameka, Daan van Esch, Mmasibidi Setaka and Emmanuel Ngue Um
The aim of this workshop is to bring together scholars and practitioners and move forward the agenda for both linguists working on the documentation of African languages and computer scientists and software developers involved in language technologies.
Important problem and excellent diversity of participants
The evaluation committee is enthusiastic about the high quality of the proposal:
‘Convincing scientific case; important problem, for which it seems timely to address it; concrete challenges and proposed solutions’
‘Direct involvement of key participants from Africa’
‘Excellent diversity of participants, very good balance of senior and junior participants as well as gender’
Venue and contact
We look forward to hosting this interesting and challenging workshop next year at the Lorentz Center. The workshop will be held from 31 May to 4 June 2021 at our Snellius venue in Leiden, The Netherlands.
Procedure for Small Projects in context of NWA route Big Data
NWO has opened a call for a single proposal containing three small projects (50 kEuro each) within the context of each NWA route. The official call can be found here:
For the route ‘Value Creation through Responsible Access and Use of Big Data (aka the route Big Data)’, the route’s figurehead (boegbeeld), prof.dr.ir. Inald Lagendijk, is invited to submit the eligible proposal. The management of the route has formulated an open and transparent process to select the three projects to be included in the proposal. If you wish to submit a ‘Big Data’ proposal for a 50 kEuro project plan in the context of this NWA small projects call, please follow the procedure outlined on the page https://www.esciencecenter.nl/mini-call-nwa-route/.
Deadline for your submission is September 18, 2020.
Matchmaking event of ‘Initiatives’ for the NWA-ORC 2020/21 call
Recently the NWA-ORC 2020/21 has been opened, see
Submitting an Initiative to NWO is required to be eligible for submitting an NWA pre-proposal before October 1, 2020. After submission of the Initiative, the main applicant – or another participant in the consortium – has to take part in a Matchmaking Event of the indicated primary NWA-route. The matchmaking event for the NWA route Big Data will take place on October 28, 2020 (10:00 – 13:00 hr). If you plan to submit an Initiative, please mark this date in your agenda.
If you are not planning to submit an Initiative, you are still welcome to attend the Matchmaking event. In this case, registration is required via the NWO website.
We are looking forward to your proposals for the Small Projects call and the Initiatives for the NWA-ORC call.
In October 2014, a two-day heavy rainfall event on the western coast of Norway caused major flooding. Hundreds of residents in the valleys around Flåm had to be evacuated, and houses, buildings and infrastructure were severely damaged.
Research project TWEX was established to investigate such events in a future warming climate. The research answers questions such as: will these events become more intense, and what impact can we expect?
To address these questions, research software engineers of the eScience Center performed simulations of future high-precipitation situations with the EC-Earth climate model. These simulations were used to drive hydrological models simulating the overflowing rivers along the Norwegian coast.
As a result, realistic future flooding scenarios for various river catchments were created, which may serve as a useful tool in guiding local authorities with their measures to adapt to a changing climate.
Schaller, Sillmann, Müller, Haarsma, Hazeleger, Hegdahle, Kelder, van den Oord, Weert, Whan, ‘The role of spatial and temporal model resolution in a flood event storyline approach in western Norway’ in Weather and Climate Extremes (September 2020). https://doi.org/10.1016/j.wace.2020.100259
The ongoing corona crisis is impacting the research community in many ways. The eScience Center is also affected directly as the crisis is impeding some of the projects that are executed together with our research partners.
To ensure a proper continuation and conclusion of eScience projects delayed or impeded by the crisis, the eScience Center has adopted a flexible mitigation policy, as follows:
- For all ongoing eScience projects awarded in joint calls with the Netherlands Organisation for Scientific Research (NWO), the “Guidelines for ongoing projects as result of the corona crisis” (PDF) published by NWO on June 4, 2020 apply.
- For all ongoing projects awarded in eScience Center-only calls, the following guidelines apply:
a. The starting date for projects for which the formal starting date expires before 31-08-2020 may be postponed by up to 4 months. To be granted a postponement, the principal investigator (PI) must submit a request to the eScience Center.
b. The submission of interim and final reports may be postponed by up to 4 months. To be granted a postponement, the PI must submit a request to the eScience Center.
c. Adaptations to the research plan, budget shifts or project extensions can be considered only if these alterations are budget neutral and do not change the primary research aims of the awarded project. The PI must submit a request for such alterations to the eScience Center.
- For all ongoing projects, also including eScience projects awarded in joint calls with NWO, any problems with the execution of the project, or the delivery of results, should be reported to the eScience Center.
The eScience Center will consider incoming requests as soon as possible, typically within two weeks of submission. The aim is to respond to all requests under the above guidelines as flexibly and accommodatingly as possible.
Formal requests and further questions
Formal requests for project alterations, as well as all further questions, must be sent directly to the eScience Coordinator associated with the project. The same message must also be sent to the Operations Department of the eScience Center (email@example.com).
On 1 January 2020, cultural historian Joris van Eijnatten took office as the new director of the Netherlands eScience Center. This autumn he will present a new strategic plan that sets out the priorities and working methods of the eScience Center in relation to the development and application of research software.
Steven Claeyssens interviewed van Eijnatten for e-data&research about his vision on knowledge transfer, DCCs, the FAIR principles and the sustainability of software. You can read the full interview here: https://www.edata.nl/1403/pdf/1403_5.pdf
e-data&research, Volume 14, issue 3 / June 2020
Interview by Steven Claeyssens
Graphics processing units (GPUs) have emerged as a powerful platform because they offer high performance and energy efficiency at relatively low cost. They have been successfully used to accelerate many scientific workloads. Today, many of the TOP500 supercomputers are equipped with GPUs, and GPUs are the driving force behind the recent surge in machine learning.
However, developing GPU applications can be challenging, in particular with regard to software engineering best practices and the quantitative and qualitative evaluation of output results. While some of these challenges, such as managing different programming languages within a project or having to deal with different memory spaces, are common to all software projects involving GPUs, others are more typical of scientific software projects.
In their paper, Lessons learned in a decade of research software engineering GPU applications, the research software engineers (RSEs) address the challenges that they have encountered and the lessons learned from using GPUs to accelerate research software in a wide range of scientific applications.
“Many of the GPU applications used as case studies in the paper were developed as part of eScience projects at the Netherlands eScience Center”, says Ben van Werkhoven, research software engineer at the Netherlands eScience Center. “Programming GPU applications is a specialized field and while many scientists develop their own code, GPU research software is often developed by research software engineers (RSEs) that have specialized in this field. The goal of the paper is really to share our experiences, hoping that others can learn from our mistakes as well as our insights.”
The authors of the paper recommend carefully selecting and, if needed, rewriting the original application to ensure that the starting point is of sufficient code quality and is capable of solving the problem at the scale the GPU application is targeting. When performance comparisons of different applications are of interest to the broader scientific community, it is important that RSEs can publish those results, both for the community to take notice of them and for the RSEs to advance in their academic careers.
According to van Werkhoven, “The reason to move code to the GPU is often to target larger, more complex problems, which may require the development of new methods to operate at higher resolutions or unprecedented problem scales. GPU code can be implemented in many different ways, resulting in large design spaces with, for example, different ways to map computations to the GPU threads. As such, auto-tuning, with tools such as Kernel Tuner, is often necessary to achieve optimal and portable performance.
Evaluating the results of GPU applications often requires carefully constructed test cases and expert knowledge from the scientists who developed the original code. In eScience projects, these are often the project partners with whom we are collaborating.”
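As a rough illustration of the auto-tuning idea mentioned above, the sketch below searches a small design space exhaustively and ranks the configurations by cost. It does not use Kernel Tuner or its API. The function names tune and toy_cost, the parameter names block_size_x and block_size_y, and the cost formula are all assumptions made for this example; in practice the cost would be the measured run time of a GPU kernel compiled with each configuration.

```python
import itertools


def tune(benchmark, tune_params, restriction=None):
    """Benchmark every valid configuration; return (cost, config) pairs, best first.

    benchmark:   callable taking a config dict and returning a cost (e.g. run time)
    tune_params: dict mapping a parameter name to the list of values to try
    restriction: optional callable returning False for configurations to skip
    """
    names = list(tune_params)
    results = []
    # The design space is the Cartesian product of all parameter values.
    for values in itertools.product(*(tune_params[name] for name in names)):
        config = dict(zip(names, values))
        if restriction is not None and not restriction(config):
            continue  # skip configurations that are not allowed
        results.append((benchmark(config), config))
    results.sort(key=lambda pair: pair[0])
    return results


def toy_cost(config):
    """Made-up stand-in for timing a GPU kernel; NOT a real measurement."""
    threads = config["block_size_x"] * config["block_size_y"]
    return abs(threads - 256) + config["block_size_y"]


if __name__ == "__main__":
    space = {"block_size_x": [16, 32, 64], "block_size_y": [1, 2, 4, 8]}
    best_cost, best_config = tune(toy_cost, space)[0]
    print(best_config, best_cost)
```

Exhaustive search is only practical for small design spaces; for larger ones, a search strategy that samples the space would replace the plain loop.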
The software sustainability of GPU research software remains an open challenge as GPU programming remains a specialized field and RSEs are often only involved during short-lived collaborative projects.
According to van Werkhoven, the team expects to keep using GPUs in scientific projects for a long time. Recently, several new supercomputers were announced, and all the big machines include GPUs because of their high performance and energy efficiency. In addition, the team plans to further advance and apply GPU auto-tuning technology.
Watch the short video presentation about the paper “Lessons learned in a decade of research software engineering GPU applications”, which is part of the SE4Science 2020 workshop.
Join the discussion until June 12, 2020
If you would like to participate in the discussion after watching the video, add your question here.
The authors will respond to your comment as soon as possible.
Ben van Werkhoven, Willem Jan Palenstijn, Alessio Sclocco, Lessons learned in a decade of research software engineering GPU applications, International Workshop on Software Engineering for Computational Science (SE4Science 2020), ICCS 2020, Part VII, LNCS 12143. (preprint: arXiv:2005.13227).
The Netherlands eScience Center has changed the format of the scheduled information event for the eTEC 2020 and ASDI 2020 calls. Instead of a live online video streaming event, all the relevant information for both calls has now been made available on the eScience Center website.
This decision was made in order to give all participants equal access to all further information associated with the eTEC and ASDI calls. ‘Despite their immense value, video calls are often accompanied by technical issues that prevent some participants from receiving all the relevant information. We want to avoid such a situation and ensure an equal playing field for all potential applicants’, says Dr. Frank Seinstra, the eScience Center’s program director.
The information slide decks and Q&As can be found here: esciencecenter.nl/asdi-etec2020
Participants who have a question that is not listed on the Q&A page are advised to contact Tom van Rens (NWO, 070 344 0509) or Dr Frank Seinstra (eScience Center, 020 460 4770) or send an email to firstname.lastname@example.org or email@example.com
Please note: the Q&A document will be updated regularly so please make sure to visit the page often.
This is one of the findings of a new semi-automated literature study on the state of exascale computing. The study, carried out by a team of research engineers from the Netherlands eScience Center with reusable software specifically developed for this purpose, was recently published in the journal ACM Computing Surveys.
The next generation of supercomputers will soon break the exascale barrier. These supercomputers will be capable of performing at least one quintillion (billion billion) floating-point operations per second (10^18 FLOPS) and are expected to accelerate advancements in many scientific disciplines. At the moment, the race towards these exascale systems is reaching its conclusion with the United States, the European Union and China all planning to build and launch their own exascale supercomputers within the next five years.
The state of current research
Mirroring the rapid rate of development in exascale computing, much has been written over the past decade about these systems: their tremendous computing power, the technical challenges of building and programming them, and possible solutions for overcoming these challenges.
In their paper, the researchers provide an overview of the current body of knowledge on exascale computing and provide insights into the most important trends and research opportunities within this field. To do so, they use a three-stage approach in which they discuss various exascale landmark studies, use data-driven techniques to analyse the large collection of related literature, and discuss eight research areas in depth based on influential articles.
‘The quantitative analysis was done with open-source software developed by the eScience Center’, says Stijn Heldens, PhD candidate and lead author of the paper. ‘We formulated a search query and collected all the exascale-related papers we could find. We then used data mining and natural language processing techniques to automatically process all the material. In addition, we selected the most prominent articles and studied these to determine the most important trends that will help us to reach exascale.’
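To give a feel for what such a query-then-process pipeline can look like, here is a minimal sketch that selects texts matching a search term and counts in how many of them each word occurs. It is not the software used for the paper. The function names, the tiny stopword list and the one-line example "abstracts" are all hypothetical and serve only as illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "and", "for", "with", "are", "need"}


def matches_query(text, query_terms):
    """Return True if any query term occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in query_terms)


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOPWORDS]


def document_frequencies(documents):
    """Count in how many documents each term appears."""
    counts = Counter()
    for doc in documents:
        counts.update(set(tokenize(doc)))  # set(): count a term once per document
    return counts


if __name__ == "__main__":
    # Made-up one-line "abstracts", for illustration only.
    corpus = [
        "Exascale systems need energy efficient hardware",
        "Fault tolerance for exascale applications",
        "A survey of deep learning",
    ]
    selected = [doc for doc in corpus if matches_query(doc, ["exascale"])]
    for term, n in document_frequencies(selected).most_common(5):
        print(term, n)
```

Counting document frequency rather than raw word counts keeps a single long text from dominating the ranking of terms.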
Progress and challenges
The team’s research shows that great progress has been made in tackling two of the major exascale barriers: energy efficiency and fault tolerance. ‘Energy efficiency has improved dramatically with an increase of ~35x over the past decade, meaning we are nearing the point where an exascale system would be feasible in terms of energy consumption’, says Heldens. ‘Fault tolerance is another topic that has received substantial attention with major developments in checkpoint-restart protocols, data corruption detection and fault understanding.’
Nevertheless, Heldens and his fellow researchers also foresee these two barriers slowly being overshadowed by two other challenges: software complexity and data volume. ‘With respect to software complexity, we see a clear lack of suitable programming models that simplify the development of scalable scientific applications. So even though we’ll soon have all this powerful hardware, the research community might not be able to fully harness it because of a lack of suitable software. Added to this is the problem of data volume, in which we see a growing gap between computing power of processors and data bandwidth. While we expect that computation will become cheaper over time, data movement (i.e. bytes per second) might not grow at the same pace and become relatively more expensive.’ According to the authors, novel hardware technologies offer promising solutions to the data movement challenge, but these conflict with the software complexity challenge since they introduce additional complications when programming these systems.
Heldens: ‘Exascale computing promises to radically open up new avenues for research and development. It is an enthralling prospect. But to harness its awesome potential, we’ll need to address these issues as soon as possible. I hope our paper provides the shot in the arm to start this process.’
Stijn Heldens, Pieter Hijma, Ben van Werkhoven, Jason Maassen, Adam Belloum and Rob van Nieuwpoort, ‘The Landscape of Exascale Research: A Data-Driven Literature Analysis’ in ACM Computing Surveys (March 2020). DOI: https://doi.org/10.1145/3372390
* This paper was made possible by funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 777533 (PROCESS) and No 823988 (ESiWACE2), and the Netherlands eScience Center under file number 027.016.G06.
A team of researchers, including research engineers from the Netherlands eScience Center, has developed a new open source software package that ranks and scores protein-protein interfaces (PPIs). Called iScore, the software package competes with or even outperforms state-of-the-art protein scoring functions and could be generalized for a broad range of applications that involve the ranking of graphs. The software was announced in a recent paper in the journal SoftwareX.
Interactions between proteins that lead to the formation of a three-dimensional (3D) complex are a crucial mechanism underlying major biological activities in organisms, ranging from immune defense to enzyme catalysis. The 3D structure of such complexes provides fundamental insights into protein recognition mechanisms and protein functions.
The scoring problem
One way researchers in the field of molecular modeling try to predict the 3D structures of such complexes is by using computational docking, a tool that predicts the preferred orientation of one molecule relative to a second when they are bound to each other to form a stable complex. Despite its potential, however, a major drawback of computational docking is the scoring problem – the question of how to single out the models that are likely to occur in real-life experiments from the huge pool of generated docking models. In other words: how to find a needle in a haystack.
‘The scoring problem has been a highly challenging task for decades’, says Dr Nicolas Renaud, eScience Research Coordinator and member of the project team. ‘Over the years, many methods have been developed to overcome this problem. These can largely be grouped into five types: shape complementarity-based methods, physical energy-based methods, statistical potential-based methods, machine learning-based methods and coevolution-based methods. These different scoring approaches are regularly benchmarked against each other during a community-wide challenge: the Critical Assessment of Prediction of Interactions (CAPRI).’
To address the problem, the research team developed iScore. This novel kernel-based machine learning approach represents the interface of a protein complex as an interface graph, with the nodes being the interface residues and the edges connecting the residues in contact. By comparing the graph similarity between the query graph and the training graphs, iScore predicts how close the query graph is to the near-native model.
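The graph comparison described above can be sketched in a few lines. The example below builds two small labelled graphs and compares them with a basic random walk graph kernel, the kernel family named in the title of the iScore paper. This is not iScore's implementation: the function names, the use of residue names as node labels, and the max_len and decay parameters are assumptions chosen for illustration, and the support vector machine step is left out.

```python
from itertools import product


def interface_graph(residues, contacts):
    """Build a graph from {residue_id: label} and (id, id) contact pairs."""
    adjacency = {r: set() for r in residues}
    for a, b in contacts:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return residues, adjacency


def random_walk_kernel(g1, g2, max_len=3, decay=0.5):
    """Count label-matching walks shared by two graphs, down-weighting long walks."""
    labels1, adj1 = g1
    labels2, adj2 = g2
    # Nodes of the direct product graph: residue pairs with equal labels.
    nodes = [(u, v) for u, v in product(labels1, labels2)
             if labels1[u] == labels2[v]]
    node_set = set(nodes)
    # counts[n] = number of common walks of the current length ending at n
    counts = {n: 1 for n in nodes}
    kernel = float(len(nodes))  # walks of length 0
    for length in range(1, max_len + 1):
        new_counts = {n: 0 for n in nodes}
        for (u, v), c in counts.items():
            for u2 in adj1[u]:
                for v2 in adj2[v]:
                    if (u2, v2) in node_set:
                        new_counts[(u2, v2)] += c
        counts = new_counts
        kernel += decay ** length * sum(counts.values())
    return kernel


if __name__ == "__main__":
    # Toy graphs with made-up residues; real interfaces are far larger.
    query = interface_graph({"A1": "ALA", "B7": "GLY"}, [("A1", "B7")])
    train = interface_graph({"A3": "ALA", "B2": "GLY"}, [("A3", "B2")])
    print(random_walk_kernel(query, train))
```

A higher kernel value means the two graphs share more label-matching walks; in a kernel-based method, these pairwise similarities between the query graph and the training graphs are what the classifier works with.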
‘In a recent paper we demonstrated how iScore competes with, or even outperforms, various state-of-the-art approaches on two independent test sets: the Docking Benchmark 5.0 set and the CAPRI score set’, says Renaud. ‘Using only a small number of features, iScore performs well compared with IRaPPA, the latest machine learning based scoring function, which exploits 91 features. This demonstrates the advantage of representing protein interfaces as graphs as compared to fixed-length feature vectors which discard information about the interaction topology.’
According to the researchers, iScore offers a user-friendly experience thanks to dedicated workflows that fully automate the process of ranking PPIs. The software can also exploit large-scale computing architectures by distributing the calculation across a large number of CPUs and GPUs. What’s more, although iScore has been developed specifically for ranking PPIs, the method is generic and could be used for a broad range of applications that involve the ranking of graphs.
Renaud: ‘I am unbelievably proud of what the team has produced. iScore offers a user-friendly solution to ranking PPIs more efficiently and more accurately than several similar scoring functions. In addition, the software is open-source and freely available to use. I encourage researchers everywhere to give iScore a try and experience the benefits for themselves.’
N. Renaud, Y. Jung, V. Honavar, C. Geng, A. Bonvin, L. Xue, ‘iScore: An MPI supported software for ranking protein-protein docking models based on random walk graph kernel and support vector machines’ in SoftwareX (January-June 2020). DOI: 10.1016/j.softx.2020.100462
The eTEC 2020 call is aimed at domain researchers and ICT researchers working in the Netherlands who would like to apply for funding to address innovative compute-intensive and/or data-driven research problems. Its aim is to support the research and development of innovative eScience technologies and software associated with optimized data handling, data analytics and efficient computing, driven by a demand from any specified research discipline (a scientific or scholarly domain selected by the research team itself).
See the eTEC 2020 page for more information on the call, requirements, application process and deadlines.
The ASDI 2020 call is aimed at researchers who would like to carry out research projects focused on innovative domain research questions that are very hard or even impossible to investigate without the use of (advanced) eScience technologies and software. With ASDI 2020, the eScience Center intends to provide an impulse to all research endeavours in which the application of eScience tools and methodologies is relatively underdeveloped.
See the ASDI 2020 page for more information on the call, requirements, application process and deadlines.
Funding and contact
Both eTEC 2020 and ASDI 2020 are funded by the Netherlands eScience Center and supported by the Netherlands Organisation for Scientific Research’s (NWO) Science Domain.
- Dr. Frank Seinstra (eScience Center): firstname.lastname@example.org or 020 460 4770
- Tom van Rens (NWO): email@example.com or 070 344 0509