Machine learning in Python with scikit-learn – Online

Start Date: 30 Jan 2023

Start Time: 09:00 Europe/Amsterdam

End Date: 2 Feb 2023

End Time: 13:00 Europe/Amsterdam


eScience Center Digital Skills Programme

This hands-on workshop will provide you with the basics of machine learning using Python.

Machine learning is the field devoted to methods and algorithms that ‘learn’ from data. It can be applied to a vast range of different domains, from linguistics to physics and from medical imaging to history.

This workshop covers the basics of machine learning in a practical and hands-on manner, so that upon completion, you will be able to train your first machine learning models and understand what next steps to take to improve them.

We start by preparing data so that it is suitable for machine learning, then fit a model to it using scikit-learn. From there we cover how to select the best model, build intuition for different kinds of machine learning models, and discuss best practices for starting your own machine learning project.
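To give a flavour of that workflow, here is a minimal sketch and not the actual workshop material. It assumes a synthetic dataset from `make_classification` as a stand-in for real research data, and the choice of `StandardScaler` with `LogisticRegression` is only illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real dataset (an assumption for this sketch).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out a test set that is only used for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Preprocessing and model are chained so that scaling is learned
# from the training data only.
model = make_pipeline(StandardScaler(), LogisticRegression())

# Cross-validation on the training set gives a first estimate of performance.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Fit on the full training set and evaluate once on the held-out test set.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print(f"Mean cross-validation accuracy: {cv_scores.mean():.3f}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one pipeline keeps the preprocessing inside each cross-validation fold, which avoids leaking information from the validation data.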

The workshop follows the teaching style of the Carpentries: learners follow along while the instructors write the code on screen. More information can be found on the workshop website (the link will be activated once registration is live).


The workshop is free and open to all researchers in the Netherlands at PhD candidate level and higher, including research software engineers. We do not accept registrations from Master's students.

Prerequisite knowledge:

The course aims to be accessible without a strong technical background. The requirements for this course are:

  • basic knowledge of Python programming: defining variables, writing functions, importing modules
  • some prior experience with the NumPy, pandas and Matplotlib libraries is recommended but not required


Machine learning concepts

  • What is machine learning?
  • Different types of machine learning
  • General pipeline

The predictive modeling pipeline

  • Tabular data exploration
  • Fitting a scikit-learn model on numerical data
  • Handling categorical data
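
As an illustration of the topics in this block, the sketch below combines numerical and categorical columns in one scikit-learn model. The table, its column names and the target are hypothetical and invented for this example; they are not taken from the workshop material.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy table, made up for illustration only.
data = pd.DataFrame(
    {
        "age": [25, 32, 47, 51, 38, 29, 60, 44],
        "hours_per_week": [40, 35, 50, 45, 20, 40, 30, 55],
        "education": [
            "bachelor", "master", "phd", "bachelor",
            "master", "bachelor", "phd", "master",
        ],
    }
)
target = pd.Series([0, 0, 1, 1, 0, 0, 1, 1], name="label")

# Scale the numerical columns and one-hot encode the categorical column.
preprocessor = ColumnTransformer(
    [
        ("numerical", StandardScaler(), ["age", "hours_per_week"]),
        ("categorical", OneHotEncoder(handle_unknown="ignore"), ["education"]),
    ]
)

# Two scaled columns plus one column per education category.
encoded = preprocessor.fit_transform(data)
print("Shape after preprocessing:", encoded.shape)

# Chain preprocessing and model so both are fitted together.
model = make_pipeline(preprocessor, LogisticRegression())
model.fit(data, target)
predictions = model.predict(data)
print("Predictions on the training rows:", list(predictions))
```

Setting `handle_unknown="ignore"` lets the encoder cope with categories that appear at prediction time but were absent during training.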

Selecting the best model

  • Overfitting and underfitting
  • Validation and learning curves
  • Bias versus variance trade-off
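
A validation curve is one way to see overfitting and underfitting in practice. The following is a minimal sketch under stated assumptions: the data are synthetic (`make_regression`), and a decision tree with varying `max_depth` is just one convenient example of a model whose complexity can be tuned.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (an assumption for this sketch).
X, y = make_regression(n_samples=300, n_features=5, noise=20.0, random_state=0)

# Model complexity is controlled here by the maximum depth of the tree.
depths = np.array([1, 2, 4, 8, 16])

# For each depth, compute R^2 scores on the training and validation folds.
train_scores, test_scores = validation_curve(
    DecisionTreeRegressor(random_state=0),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    cv=5,
)

train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)

for depth, train, test in zip(depths, train_mean, test_mean):
    print(f"max_depth={depth:2d}  train R^2={train:.2f}  validation R^2={test:.2f}")
```

A training score that keeps rising while the validation score stalls or drops suggests overfitting; low scores on both suggest underfitting.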

Intuition on various models

  • Linear models
  • Tree-based models

Machine learning best practices

  • Data hygiene
  • Correct evaluation
  • How to keep your machine learning project organised


This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.