Training introduces postgraduates from domain disciplines to data-intensive research

2 Feb 2016 - 2 min

From 25 to 29 January the first edition of the “Essential Skills in Data-Intensive Research” workshop took place in Utrecht. This training is a collaborative effort of the Netherlands eScience Center, SURFsara, the Dutch Techcenter for Life Science (DTL), VU University Amsterdam and the Software and Data Carpentry foundation. The training is primarily targeted at early-stage researchers and PhD students, but may also be very relevant for scientists at later stages of their careers.

An introduction to data-intensive research

The five-day workshop, which included both hands-on and taught components, was designed to introduce postgraduates from domain disciplines to the use of data-driven and compute-intensive approaches and the potential applications of e-infrastructure.

The heavily oversubscribed course attracted around 30 participants from across the Netherlands (Groningen, Wageningen, Twente, Amsterdam, Leiden and Utrecht) and from diverse domains (life science, climate research and physics). The course was given at the excellent facilities of SURFacademy in the welcoming offices of SURF in Hoog Overborch, Utrecht.

Well placed to further develop skills in the future

This course gave participants the basic experience and knowledge to make better use of data and to develop research software. It also equipped them with fundamental skills required to optimize their research now and in the coming years. Students completing the course now have the basic skills and knowledge required to manage large datasets, utilise databases, produce metadata and ensure the long-term stewardship of their data. The course also provided many students with their first introduction to programming. The five-day program covered best practices in organizing data in spreadsheets, data cleaning, dealing with tabular data in Python, the basics of the Unix shell, FAIR Data, persistent identifiers and using the SURFsara HPC infrastructure. Having received this broad introduction to good data practices and professional software development techniques, the students are now well placed to further develop these skills in the future.

Similar courses in the near future

This training was the first of its kind, and attendees provided extensive feedback to help refine future editions. The team plan to run similar courses in the near future.