AutoML is not a threat to Data Scientists

In recent years, many automated machine learning tools have been introduced. They automate some of the tasks that a Data Scientist usually has to perform manually, and they have reached a remarkable level of sophistication and effectiveness. Are they a threat to the Data Scientist’s job, or are they an opportunity?


What is AutoML?

AutoML is a generic term for software that performs Machine Learning tasks automatically. These tools usually automate the entire pipeline: for example, data cleaning, encoding, feature and model selection, and hyperparameter tuning. They can be Python libraries like Auto-Sklearn or standalone platforms like DataRobot.

AutoML tools take over the tedious, time-consuming steps of a Data Scientist’s work. They try combinations of the parameters of a pipeline (e.g. the values used to fill blanks, the scaling algorithm, the model type, the model hyperparameters) and select the combination that gives the best value of some performance metric (like the lowest RMSE or the highest Area under the ROC Curve) in k-fold cross-validation. The combinations are explored with a search algorithm like Grid Search, which tries every combination, or Random Search, which samples a subset of them.
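To make the idea concrete, here is a minimal sketch of such a search written by hand with scikit-learn. It is only an illustration under stated assumptions, not how Auto-Sklearn or DataRobot work internally: the synthetic dataset, the two candidate models, the parameter values and the number of sampled combinations are all arbitrary choices made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic data with about 5% of the values blanked out (illustrative only).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# A pipeline whose steps are placeholders that the search can swap out.
pipeline = Pipeline([
    ("impute", SimpleImputer()),
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Each dict describes one family of candidate pipelines (assumed values).
search_space = [
    {
        "impute__strategy": ["mean", "median"],          # blank filling values
        "scale": [StandardScaler(), MinMaxScaler()],     # scaling algorithm
        "model": [LogisticRegression(max_iter=1000)],    # model type
        "model__C": [0.01, 0.1, 1.0, 10.0],              # model hyperparameter
    },
    {
        "impute__strategy": ["mean", "median"],
        "scale": ["passthrough"],                        # no scaling for trees
        "model": [RandomForestClassifier(random_state=0)],
        "model__n_estimators": [50, 100],
        "model__max_depth": [3, 5, None],
    },
]

# Random Search: sample 10 combinations and score each one with
# 5-fold cross-validation on Area under the ROC Curve.
search = RandomizedSearchCV(
    pipeline,
    param_distributions=search_space,
    n_iter=10,
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    random_state=0,
)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params)
print(round(best_score, 3))
```

Swapping `RandomizedSearchCV` for `GridSearchCV` (with `param_grid` instead of `param_distributions` and without `n_iter`) would try every combination instead of a sample. For a metric that should be minimized, such as RMSE, scikit-learn expects a negated scorer like `"neg_root_mean_squared_error"`, so that a higher score is always better.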

They can really simplify life for anyone who has to create a model from scratch, and sometimes they explore combinations and scenarios that a Data Scientist may not have thought of.


Will AutoML Software Replace Data Scientists? - Experfy Insights