Open-Source Data Science Projects You Can Contribute to Today

Tips from a fellow beginner on some of the best data science code bases you can start working on, and why you should

Contributing to open source is a great way to gain experience working with large teams and code bases, engage with the developer community, strengthen your resume and, most importantly, make real contributions to software you use or believe in.

For beginners, the world of working on open-source projects can be understandably daunting. But truthfully, there are very welcoming communities out there with members willing to guide you through your first PR (pull request). You just need to pick a repo and start!

Finding projects

I’d recommend contributing to software that you actually use, for two reasons. First, you’ll be motivated to add the features you want, and the next time you use it you’ll have the satisfaction of knowing you played a small role in building it. Second, you’ll already have some familiarity with the project and its code.

I found opensource.guides to be a helpful resource for tips on contributing and finding projects to work on. Additionally, this GitHub repo has a comprehensive list of beginner-friendly open-source projects you can peruse (although not all are data science-related). Whatever you choose, I believe it’s important to find a project whose community is patient and helpful with newcomers.

Photo by Marvin Meyer on Unsplash

Some great repos

If you’re into data science and ML, chances are you’ve used or at least come across these before, making them excellent starting points.

