One of the best-known benchmarks for algorithm performance is ImageNet. Many challenges have been organized around this database, with the latest running on Kaggle. Across scientific disciplines there is growing interest in benchmarking algorithm performance on research data: many algorithms are proposed in the literature, but they need to be compared on the same data, using the same metrics and ground truth, to assess their performance on a specific task. Organizing such open online benchmarks not only increases insight into which algorithms perform best for a given task, but also opens these tasks up to a wider audience, which could lead to new breakthroughs in the field.

In this project, the Netherlands eScience Center and SURF join forces to develop the EYRA Benchmark Platform, which supports researchers in easily setting up benchmarks and applying their algorithms to benchmarks from various scientific disciplines.
The EYRA benchmark platform aims to facilitate:
- An easy way to set up a research benchmark
- Cross-fertilization between scientific disciplines
- An overview of benchmarks per scientific discipline
- Infrastructure to run algorithms on test data in the cloud
- Insight into algorithm performance for a research problem, beyond the benchmark leaderboard
The EYRA benchmark platform will be an extension of the COMIC platform, developed in the group of Professor Bram van Ginneken (Radboud UMC): https://github.com/comic/grand-challenge.org
- Tutorial on Benchmarking Algorithm Performance, 29 October 2018, at the 14th IEEE International Conference on eScience, Amsterdam, the Netherlands: https://nlesc.github.io/IEEE-eScience-Tutorial-Designing-Benchmarks/
- Adrienne M. Mendrik, Stephen R. Aylward, "Beyond the Leaderboard: Insight and Deployment Challenges to Address Research Problems", Machine Learning Challenges "in the wild", NIPS 2018 workshops, Palais des congrès de Montréal, Canada, 2018: https://arxiv.org/abs/1811.03014