That’s a question that has run through my head repeatedly over the years. The beloved sitcom has been my top “comfort show/background noise” choice for a long time.

It used to be a question I couldn’t answer, because the data Netflix allowed users to download about their activity was extremely limited.

Now, though, Netflix allows you to download a veritable treasure trove of data about your account. With just a little Python and pandas programming, we can now get a concrete answer to the question: how much time have I spent watching The Office?

Want to find out how much time you have spent watching The Office, or any other show on Netflix?

In this tutorial, we’ll walk you through exactly how to do it step by step!

Having a little Python and pandas experience will be helpful for this tutorial, but it’s not strictly necessary. You can sign up and try our interactive Python for beginners course for free.

But first, let’s answer a quick question . . .

Can’t I Just Use Excel? Why Do I Need to Write Code?

Depending on how much Netflix you watch and how long you’ve had the service, you might be able to use Excel or some other spreadsheet software to analyze your data.

But there’s a good chance that will be tough.

The dataset you’ll get from Netflix includes every time a video of any length played — that includes those trailers that auto-play as you’re browsing your list.

So, if you use Netflix often or have had the streaming service for a long time, the file you’re working with is likely to be pretty big. My own viewing activity data, for example, was over 27,000 rows long.

Opening a file that big in Excel is no problem. But to do our analysis, we’ll need to do a lot of filtering and calculating. With that much data, Excel can get seriously bogged down, especially if your computer isn’t particularly powerful.

Scrolling through such a huge dataset trying to find specific cells and formulas can also become confusing fast.

Python can handle large datasets and calculations like this much more smoothly because it doesn’t have to render everything visually. And since we can do everything with just a few lines of code, it’ll be really easy to see everything we’re doing, without having to scroll through a big spreadsheet looking for cells with formulas.
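
To give a rough idea of what "just a few lines of code" might look like, here's a minimal sketch. It is not the exact code from the full tutorial. The column names `Title` and `Duration`, the sample rows, and the one-minute cutoff for skipping auto-played trailers are all assumptions made for illustration, so check them against your own Netflix export before relying on them.

```python
import pandas as pd

# Hypothetical sample rows standing in for the Netflix export.
# The column names "Title" and "Duration" are assumptions; your file may differ.
df = pd.DataFrame({
    "Title": [
        "The Office (U.S.): Season 1: Pilot",
        "The Office (U.S.): Season 1: Diversity Day",
        "Trailer: Some Other Show",
        "The Office (U.S.): Season 2: The Dundies",
    ],
    "Duration": ["00:22:10", "00:21:45", "00:00:12", "00:00:30"],
})

# Convert "HH:MM:SS" strings into timedeltas so they can be compared and summed.
df["Duration"] = pd.to_timedelta(df["Duration"])

# Keep only rows whose title mentions the show (plain substring match, no regex).
office = df[df["Title"].str.contains("The Office", regex=False)]

# Drop very short plays such as auto-played previews.
# The one-minute threshold is an arbitrary assumption.
office = office[office["Duration"] > pd.Timedelta(minutes=1)]

# Add up whatever is left.
total_watch_time = office["Duration"].sum()
print(total_watch_time)
```

With a real export, you would replace the hand-built DataFrame with a `pd.read_csv(...)` call pointing at your own file. The filtering and summing steps stay the same no matter how many rows there are, which is the main advantage over hunting through a spreadsheet.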


Beginner Python Tutorial: Analyze Your Personal Netflix Data