When most people think about deep learning practitioners, they think of data scientists who whisper to machine learning models using special powers they learned during their PhDs.
While that may be true for some organizations, the reality of most practical deep learning applications is more banal. The biggest determinant of model performance is now the data, not the model code. And when data is supreme, **data operations** becomes the most important part of your ML team.
Fundamentally, data operations teams are responsible for the maintenance and improvement of the datasets that models train on. Some of their responsibilities include:
A data operations team member is often an expert in their domain. Think about a recycling specialist who can distinguish between plastic and glass containers on sight, or a translator who can convert Chinese to Portuguese, or a radiologist who can navigate an MRI and tell you whether a patient has cancer or not.
Data operations personnel can also come from consulting or business backgrounds. It helps to be organized and methodical when working on any operations task, but especially with data. Knowledge of the business goals and the technology’s capabilities can also inform how best to prioritize data curation in order to improve the ML system.
Within a data operations team, members can be assigned by the data or model type they are responsible for (for example, in a self-driving application, different teammates own the radar, lidar, and image detection systems) or by the customer or geography they serve (for example, one team member handles North American deployments and another handles APAC).