Astronomical Data Science with Python


Date start: 7 Mar 2022

Date end: 9 Mar 2022



This workshop teaches the basics of working with large online datasets in astronomy.

Astronomical Data Science with Python covers a range of core concepts needed to study the ever-growing datasets of modern astronomy efficiently. In particular, this workshop teaches learners to perform database operations (SQL queries, joins, filtering) and to create publication-quality data visualisations. Learners will use software packages common to the general and astronomy-specific data science communities (Pandas, Astropy, Astroquery) combined with two astronomical datasets: the large, all-sky, multi-dimensional dataset from the Gaia satellite, which measures the positions, motions, and distances of approximately a billion stars in our Milky Way galaxy with unprecedented accuracy and precision; and the Pan-STARRS photometric survey, which precisely measures light output and distribution from many stars. Together, the software and datasets are used to reproduce part of the analysis from the article “Off the beaten path: Gaia reveals GD-1 stars outside of the main stream” by Drs. Adrian M. Price-Whelan and Ana Bonaca. This lesson shows how to identify and visualise the GD-1 stellar stream, which is a globular cluster that has been tidally stretched by the Milky Way.

The workshop is based on the teaching style of the Carpentries, and learners will follow along while the instructors write the code on screen. More information can be found on the workshop website.

Who: The workshop is open and free to all researchers in the Netherlands. It is aimed at PhD candidates, other researchers, and research software engineers.

Prerequisite knowledge:

Participants should

  • have working knowledge of Python
  • have had exposure to the Bash shell

A detailed list of functions that participants should know can be found here.

Where: This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.


Topics covered:

  • Incremental creation of complex ADQL and SQL queries.
  • Using Astroquery to query a remote server in Python.
  • Transforming coordinates between common coordinate systems using Astropy units and coordinates.
  • Working with common astronomical file formats, including FITS, HDF5, and CSV.
  • Managing your data with Pandas DataFrames and Astropy Tables.
  • Writing functions to make your work less error-prone and more reproducible.
  • Creating a reproducible workflow that brings the computation to the data.
  • Customising all elements of a plot and creating complex, multi-panel, publication-quality graphics.
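
As a flavour of the first topic, the sketch below builds an ADQL query in small steps using only the Python standard library. It is an illustration rather than workshop material: the table and column names follow the public Gaia DR2 schema, while the helper `build_query` and the numeric cuts are hypothetical choices made for this example.

```python
# Illustrative sketch: build an ADQL query incrementally.
# Table/column names follow the public Gaia DR2 schema; the numeric cuts
# and the helper name `build_query` are assumptions made for this example.

columns = ["source_id", "ra", "dec", "pmra", "pmdec", "parallax"]

# Step 1: a small query that only previews a few rows.
base_query = "SELECT TOP 10 {columns}\nFROM gaiadr2.gaia_source"

# Step 2: extend the working query with filters, one piece at a time.
filtered_query = (
    base_query
    + "\nWHERE parallax < {max_parallax}"
    + "\n  AND bp_rp BETWEEN {bp_rp_min} AND {bp_rp_max}"
)


def build_query(template, columns, **params):
    """Fill a query template with a column list and named parameters."""
    return template.format(columns=", ".join(columns), **params)


# Each stage can be rendered and inspected before moving on.
preview = build_query(base_query, columns)
query = build_query(
    filtered_query, columns, max_parallax=1, bp_rp_min=-0.75, bp_rp_max=2
)
print(query)
```

Keeping the column list and the cuts as parameters means each stage of the query can be checked on a handful of rows before it is extended. A string built this way could then be submitted to the Gaia archive with Astroquery, for example through `Gaia.launch_job` in `astroquery.gaia`.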