The Netherlands eScience Center has released a new version of McFly, its highly popular software package that helps researchers to find a suitable neural network configuration to carry out deep learning on time series data. The latest version includes several new features and is now freely available to download.
Deep learning is a popular machine learning method that trains a computer to perform human-like tasks such as recognising speech or classifying images. Unlike other machine learning methods that organise data to run through predefined equations, deep learning sets up basic parameters about the data and trains the computer to learn on its own by recognising patterns in data using many layers of processing.
Its popularity notwithstanding, designing a deep learning network can be difficult as it requires users to choose, for example, the number of layers in the network, the number of nodes in each layer and the type of each layer. Moreover, each network must be calibrated or trained before it can be used to automatically classify data.
The Netherlands eScience Center started developing McFly in 2016 to aid researchers working with time series data. A time series is a series of data points indexed in time order, such as activity logs, heights of ocean tides or the daily closing value of the Amsterdam Exchange Index. Time series are used, for example, in statistics, weather forecasting, mathematical finance and astronomy.
‘Although there are tools that provide pretrained deep learning models for computer vision tasks, no such model existed for time series data, which are widely used by researchers’, says Dafne van Kuppevelt, eScience research engineer and part of the McFly development team. ‘We realised that many researchers were being hampered from using deep learning by the considerable knowledge required to train a deep network – the exact same knowledge available at the eScience Center.’
McFly simplifies the process by making explicit the steps that are required to train a model while offering useful default values at each step. It then tries out different network configurations, training each one on the data provided by the user before listing the performance of each network along with a visualisation that helps the user judge its tendency to overfit or underfit the data.
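The search loop described above can be illustrated with a minimal Python sketch. It does not use McFly's API and does not train neural networks: the toy data, the nearest-centroid stand-in model and every name in it (make_series, features, fit_centroids, search_configurations, n_segments) are assumptions made purely for illustration. It shows only the general pattern of trying several configurations, training each on the same data and listing training and validation scores side by side.

```python
import random


def make_series(label, rng, length=32):
    """Toy series (hypothetical data): class 1 gets an upward step in its second half."""
    return [
        rng.gauss(0.0, 1.0) + (1.5 if label == 1 and t >= length // 2 else 0.0)
        for t in range(length)
    ]


def features(series, n_segments):
    """Summarise a series as the means of n_segments equal-length chunks."""
    size = len(series) // n_segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(n_segments)]


def fit_centroids(X, y):
    """'Train' the stand-in model: one mean feature vector per class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def accuracy(centroids, X, y):
    """Fraction of series assigned to the correct (nearest) class centroid."""
    def predict(x):
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )
    return sum(predict(x) == lab for x, lab in zip(X, y)) / len(y)


def search_configurations(train, val, n_candidates=4, seed=0):
    """Try randomly chosen configurations and rank them by validation accuracy."""
    rng = random.Random(seed)
    y_train = [lab for _, lab in train]
    y_val = [lab for _, lab in val]
    results = []
    # The only "hyperparameter" here is how finely each series is summarised.
    for n_segments in rng.sample([1, 2, 4, 8, 16, 32], n_candidates):
        X_train = [features(s, n_segments) for s, _ in train]
        X_val = [features(s, n_segments) for s, _ in val]
        model = fit_centroids(X_train, y_train)
        results.append({
            "n_segments": n_segments,
            "train_acc": accuracy(model, X_train, y_train),
            "val_acc": accuracy(model, X_val, y_val),
        })
    return sorted(results, key=lambda r: r["val_acc"], reverse=True)


data_rng = random.Random(42)
train = [(make_series(lab, data_rng), lab) for lab in [0, 1] * 20]
val = [(make_series(lab, data_rng), lab) for lab in [0, 1] * 10]
results = search_configurations(train, val)
for r in results:
    # A training score far above the validation score hints at overfitting.
    print(r)
```

Comparing the training and validation scores in each row plays the same role as the visualisation mentioned above: a large gap suggests overfitting, while two low scores suggest underfitting.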
Fellow development team member Christiaan Meijer: ‘We wanted to help researchers who have never trained deep networks as well as those who know enough to train a neural network but would like to find a suitable network and hyperparameters, something that is often repeated for every new dataset or research question and is automated in McFly.’
Faster and more flexible
The latest version of McFly can automatically generate a number of new network architectures, which makes it more likely to find a suitable type of deep learning model for a given data set. It is also built on a newer version of TensorFlow, an open source machine learning platform that can be used from Python.
‘McFly is, to the best of our knowledge, the only open source tool for deep learning on time series data aimed at novices in machine learning’, Meijer adds. ‘Its main value is to provide an environment to apply deep learning classification of time series data quickly, thereby relieving the user from the technicalities of deep neural network architectures and training. We recently evaluated the new version in two user workshops and the response was extremely positive. I am really proud of what our team has produced.’
The members of the McFly development team are (in alphabetical order): Sonja Georgievska, Vincent van Hees, Florian Huber, Dafne van Kuppevelt, Christiaan Meijer and Atze van der Ploeg.