What is AutoML?
Automated Machine Learning (AutoML) provides methods and processes that make machine learning accessible to non-experts, improve the efficiency of machine learning, and accelerate machine learning research.
Traditional Approach:
Machine learning (ML) has achieved considerable successes in recent years and an ever-growing number of disciplines rely on it. However, this success crucially relies on human machine learning experts to perform the following tasks:
- Preprocess and clean the data.
- Select and construct appropriate features.
- Select an appropriate model family.
- Optimize model hyperparameters.
- Postprocess machine learning models.
- Critically analyze the results obtained.
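To make these manual steps concrete, here is a minimal sketch of the traditional workflow in plain Python. Everything in it is an assumption chosen for illustration: the toy data in RAW_ROWS, the choice of a k-nearest-neighbour classifier, and the helper names. A real project would normally use a library such as scikit-learn instead of hand-written helpers.

```python
import math
import statistics

# Toy data (hypothetical): (feature_1, feature_2, label); None marks a missing value.
RAW_ROWS = [
    (1.0, 2.0, 0), (1.2, None, 0), (0.8, 2.2, 0), (1.1, 1.9, 0),
    (3.0, 4.1, 1), (3.2, 3.9, 1), (None, 4.0, 1), (2.9, 4.2, 1),
]


def impute_and_scale(rows):
    """Preprocess: fill missing values with the column mean, then standardize."""
    n_features = len(rows[0]) - 1
    columns = []
    for j in range(n_features):
        observed = [r[j] for r in rows if r[j] is not None]
        mean = statistics.fmean(observed)
        filled = [mean if r[j] is None else r[j] for r in rows]
        std = statistics.pstdev(filled) or 1.0  # guard against a constant column
        columns.append([(v - mean) / std for v in filled])
    features = list(zip(*columns))
    labels = [r[-1] for r in rows]
    return features, labels


def knn_predict(train_x, train_y, query, k):
    """Model family: majority vote among the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def leave_one_out_accuracy(features, labels, k):
    """Evaluate: hold out each row once and predict it from the remaining rows."""
    hits = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        rest_x = features[:i] + features[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest_x, rest_y, x, k) == y
    return hits / len(labels)


# For brevity the scaling statistics use all rows; a careful workflow would
# compute them on the training split only to avoid leaking held-out data.
features, labels = impute_and_scale(RAW_ROWS)

# Hyperparameter optimization: try a few values of k and keep the best one.
scores = {k: leave_one_out_accuracy(features, labels, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(f"validation accuracy per k: {scores}; selected k = {best_k}")
```

Each function corresponds to one of the tasks listed above, and every one of them involves a decision (how to impute, which model, which values of k to try) that an expert would otherwise make by hand.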
Power of AutoML:
Because these tasks are often too complex for non-ML experts, the rapid growth of machine learning applications has created demand for off-the-shelf machine learning methods that can be used easily and without expert knowledge. The research area that targets the progressive automation of machine learning is called AutoML. The AutoML space is growing fast. At the moment there are more than 30 vendors offering systems promising a “one-click-data-in-model-out” solution to practical data-driven business problems. Some AutoML service providers are:
- BigML
- H2O.ai
- XpanseAI
- GoogleAI
- Ople.ai
- Neuralstudio.ai
Levels of Automation:
The platforms available in the market today differ in the features they offer, but they all automate, to a greater or lesser extent, the following steps in the data pipeline:
- Data connectivity
- Exploratory data analysis
- Model building (training, validation, evaluation, comparison, and scoring)
- Deployment and communication
Key Features of AutoML Services:
- The ability to read flat comma-separated (CSV) files. Many platforms can also read Excel spreadsheets.
- Once the data has been uploaded to the system, several summary statistics such as the mean, standard deviation, and number of missing values are generated. Some platforms also offer support for automating the cleaning process.
- Once the data has been preprocessed, starting the model building process can be as simple as hitting a start button.
- Depending on the platform, models are built for any, some, or all of these categories: regression, classification, clustering, and time series.
- The final output is a summary consisting of a ranking of the different models trained, along with numerical and graphical performance reports such as ROC curves.
- The final scores of the best models are displayed.
- The platform provides a quick step to deploy the model to the cloud.
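The summary statistics and model ranking described above can be illustrated with a small sketch in plain Python. It is only an assumption-laden toy, not how any particular platform works: the CSV content, the column names, the two candidate models, and the helper names are all hypothetical.

```python
import csv
import io
import statistics

# Hypothetical upload; the empty cell stands for a missing value.
CSV_TEXT = """size,age,price
50,10,150
60,,175
70,5,210
80,20,230
90,8,270
100,15,290
110,3,335
120,12,350
"""


def load_rows(text):
    """Parse CSV text into dicts of floats, using None for empty cells."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k: float(v) if v else None for k, v in row.items()} for row in reader]


def summarize(rows):
    """Per-column mean, sample standard deviation, and missing-value count."""
    summary = {}
    for column in rows[0]:
        values = [r[column] for r in rows if r[column] is not None]
        summary[column] = {
            "mean": statistics.fmean(values),
            "std": statistics.stdev(values),
            "missing": len(rows) - len(values),
        }
    return summary


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mean = statistics.fmean(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Second candidate: ordinary least squares with a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def leaderboard(rows, feature, target, candidates, n_test=2):
    """Train each candidate on the leading rows, rank by held-out mean absolute error.

    Assumes the chosen feature and target columns have no missing values.
    """
    xs = [r[feature] for r in rows]
    ys = [r[target] for r in rows]
    train_x, test_x = xs[:-n_test], xs[-n_test:]
    train_y, test_y = ys[:-n_test], ys[-n_test:]
    results = []
    for name, fit in candidates.items():
        predict = fit(train_x, train_y)
        mae = statistics.fmean(abs(predict(x) - y) for x, y in zip(test_x, test_y))
        results.append((name, mae))
    return sorted(results, key=lambda item: item[1])  # lowest error first


rows = load_rows(CSV_TEXT)
summary = summarize(rows)
for column, stats in summary.items():
    print(column, stats)

candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
ranking = leaderboard(rows, "size", "price", candidates)
for rank, (name, mae) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: MAE = {mae:.2f}")
```

A commercial platform would try far more model families, use proper cross-validation, and add graphical reports, but the overall shape is the same: summarize the data, train several candidates, and rank them on held-out data.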
Conclusion:
AutoML automates many of the traditional manual steps involved in machine learning. This makes the work considerably more efficient and lets people focus on innovation rather than on extensive coding and brainstorming. With the right approach and usage, the power of AutoML can be fully realized.
Sukesh Perla
Data Science Intern (EDP)