Data science is about much more than Jupyter notebooks, because data science problems are about more than machine learning.
What data should I collect? How good does my model need to be to be “good enough” to solve my problem? What form should my project take for it to be useful? Should it be a dashboard, a live app, or something else entirely? How do I deploy it? How do I make sure something awful and unexpected doesn’t happen when it’s deployed in production?
Data exploration is a critical step in the data science lifecycle, but its value is hard to quantify. How would you know whether someone failed to find interesting insights in a dataset because there weren’t any to be found, or because they weren’t skilled enough for the job? Companies tend to assess employees on the aspects of job performance that are easy to measure, and that bias means data exploration is often de-prioritized. A good way around this is for companies or teams to carve out time explicitly for open-ended exploration tasks, so that data scientists don’t shy away from doing them when they’re needed.

#tds-podcast #data-exploration #towards-data-science #podcast #data-science

Beyond the Jupyter Notebook: How to Build Data Science Products