What is a Full Stack Data Scientist?

A full stack data scientist is a jack-of-all-trades who engineers and works on every stage of the data science lifecycle, from beginning to end.
The scope of a full stack data scientist covers every component of a data science business initiative, from identifying to training to deploying machine learning models that provide benefit to stakeholders.

Basic stages in the data science lifecycle
A full stack data scientist can own each of the following stages:
Business problem. Unless research-oriented, all data science projects should start with a problem that adds value to a business either through efficiency gains, automation, or new capabilities.
Data collection/identification. A quality machine learning model can only be built from quality data.
Data exploration and analysis. The data must be analyzed and understood before a model can be built.
Machine learning. Train a model to solve the business problem given the data.
Model analysis and acceptance. Analyze the model results and behavior. Share with stakeholders for approval.
Model deployment. Make the model accessible to the end-user.
Model monitoring. Ensure that the model behaves as expected in the future.
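As a rough illustration, the modeling stages above (training, acceptance, deployment, and monitoring) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a production recipe: the data is made up, the model is a simple least-squares line, and the function names, the acceptance threshold, and the monitoring tolerance are all hypothetical choices for this example.

```python
from statistics import mean


def train(xs, ys):
    """Machine learning: fit y = slope * x + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Model deployment: the function an end user would call."""
    slope, intercept = model
    return slope * x + intercept


def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and observed values."""
    return mean(abs(predict(model, x) - y) for x, y in zip(xs, ys))


def is_acceptable(model, xs, ys, max_error):
    """Model analysis and acceptance: check held-out error against a
    threshold agreed with stakeholders (hypothetical value here)."""
    return mean_abs_error(model, xs, ys) <= max_error


def needs_attention(model, new_xs, new_ys, baseline_error, tolerance=2.0):
    """Model monitoring: flag the model when its error on new data
    exceeds `tolerance` times the error measured at acceptance."""
    return mean_abs_error(model, new_xs, new_ys) > tolerance * baseline_error


# Made-up data, for illustration only.
train_x, train_y = [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]
holdout_x, holdout_y = [6, 7], [12.0, 13.9]

model = train(train_x, train_y)
baseline = mean_abs_error(model, holdout_x, holdout_y)
accepted = is_acceptable(model, holdout_x, holdout_y, max_error=0.5)

# Later, new observations arrive that no longer follow the old pattern.
drifted = needs_attention(model, [8, 9], [30.0, 35.0], baseline)
```

Here the model passes acceptance on the held-out data, and the monitoring check flags it once the new observations stop following the pattern it was trained on.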
A Jack of all Trades: the Skillset
The high-level skills listed below are also key to successful data science initiatives. The soft skills are worth highlighting, because without them data science technology may not provide value.
Business Acumen
A full stack data scientist must be able to identify and understand business problems (or opportunities) that can be solved using the data science toolkit.
To prioritize the projects and process flows with the most value to their organization, they must understand the organization's needs and goals.
Ultimately, the business doesn’t care how cool or accurate a model is if it provides no value.
Collaboration
Full stack data scientists do not work in a vacuum. They must collaborate with stakeholders to identify existing problems or inefficiencies that can be solved with data science. Once problems are identified, collaboration is essential to ensure that the result is acceptable and meets the stakeholders' needs. Further, collaboration with subject matter experts (SMEs) helps them work quickly, for example when locating data sources in the organization.
Communication
Effective oral and written communication with the business allows for better collaboration and for “selling” the model to the end-users. This means presenting data science ideas, results, and value in plain language tailored to non-technical audiences. In some cases, the end-user must understand and trust the model before they choose to use it.
Identifying Data Sources and ETL
Models cannot be trained if there is no data. Oftentimes data is not readily available; it needs to be found, extracted, transformed, and loaded to the right place.
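To make the extract, transform, and load steps concrete, here is a minimal sketch using only the Python standard library. Everything specific in it is an assumption made for illustration: the CSV content, the column names, the `customers` table, and the in-memory SQLite database stand in for whatever sources and destinations an organization actually has.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this might be a file, an API, or a database.
RAW_CSV = """customer_id,signup_date,monthly_spend
1,2023-01-15, 42.50
2,2023-02-01,
3,2023-02-20,17.00
"""


def extract(source):
    """Extract: read raw rows from a CSV source as dictionaries."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: drop rows with a missing spend value and convert types."""
    cleaned = []
    for row in rows:
        spend = row["monthly_spend"].strip()
        if not spend:
            continue  # skip incomplete records
        cleaned.append((int(row["customer_id"]), row["signup_date"], float(spend)))
    return cleaned


def load(records, connection):
    """Load: write the cleaned records to a table a model can be trained from."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER PRIMARY KEY, signup_date TEXT, monthly_spend REAL)"
    )
    connection.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(io.StringIO(RAW_CSV))), connection)

loaded = connection.execute(
    "SELECT customer_id, monthly_spend FROM customers ORDER BY customer_id"
).fetchall()
```

Keeping the three steps as separate functions means each can be swapped or tested on its own, for example by replacing `extract` when the data moves to a different source.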
Programming
A full stack data scientist must be able to write clean, efficient object-oriented code that works reliably in production. Ideally, such code will be modular and each function or class validated by unit tests.
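As a small illustration of modular code validated by unit tests, the sketch below pairs one function with a test case built on Python's standard `unittest` module. The `normalize` function and its tests are hypothetical examples chosen for brevity, not code from any particular project.

```python
import unittest


def normalize(values):
    """Scale values linearly to the range [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    low, high = min(values), max(values)
    if low == high:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


class NormalizeTests(unittest.TestCase):
    def test_scales_to_unit_range(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input_maps_to_zero(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            normalize([])


# Run the tests programmatically so the example works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real project the tests would more likely live in their own file and be run with `python -m unittest`, but the idea is the same: each small unit has checks that fail loudly when its behavior changes.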


towardsdatascience.com
