In this blog post, I will describe the project I completed in terms suited to a non-technical audience. For the code, I would encourage you to look at my

Problem Statement

Executives of a new movie studio are after actionable insights to maximise their return on investment and ensure successful movies are produced.

Business value

“Avengers: Endgame’s $1.2 billion opening weekend is the biggest in movie history” — Vox, April 2019.

“Box office cats-tastrophe: Cats projected to lose $70m” — The Guardian, December 2019.

These two contrasting headlines show that entering the movie industry is a high-risk, high-reward venture for our stakeholders. The potential is there, but we need to ensure the “right” movie is made. Through data analysis, we will seek to provide recommendations that maximise the chance of success.

Data

The main data used for this project came from two sources.

Data from IMDb consisted of 146,144 entries with start year, runtime and genres as key features.

Data from the-numbers consisted of 5,782 entries with release_date, production_budget, domestic_gross and worldwide_gross as key features.

We also scraped data from Wikipedia relating to Netflix Original Movies.

Methodology

The first stage focussed on data preparation including:

  • Importing libraries

  • Reading and cleaning provided data

  • Dealing with missing values

  • Joining datasets

  • Scraping additional data and cleaning it
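For readers curious what the preparation steps above might look like in practice, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the sample rows, the `title`, `runtime_minutes` and `movie` column names, the dollar-string format, and the choice to join on the movie title are all assumptions made for illustration. A dataframe library would likely be more convenient at the scale of the real files.

```python
import csv
import io

# Hypothetical sample rows standing in for the two provided datasets.
IMDB_CSV = """title,start_year,runtime_minutes,genres
Movie A,2015,120,"Action,Adventure"
Movie B,2016,,Drama
Movie C,2017,95,Comedy
"""

BUDGETS_CSV = """movie,release_date,production_budget,domestic_gross,worldwide_gross
Movie A,"May 1, 2015","$100,000,000","$150,000,000","$400,000,000"
Movie C,"Jun 9, 2017","$20,000,000","$10,000,000","$15,000,000"
Movie D,"Jan 5, 2018","$5,000,000","$1,000,000","$2,000,000"
"""


def parse_money(text):
    """Turn a string such as '$1,000,000' into the integer 1000000."""
    return int(text.replace("$", "").replace(",", ""))


def read_rows(raw_csv):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def prepare(imdb_csv, budgets_csv):
    """Clean both tables and keep only movies present in both."""
    # Missing values: drop rows with no runtime (one possible policy).
    imdb_rows = [r for r in read_rows(imdb_csv) if r["runtime_minutes"]]
    # Index the financial rows by title so each lookup is fast.
    budgets = {r["movie"]: r for r in read_rows(budgets_csv)}

    merged = []
    for row in imdb_rows:
        match = budgets.get(row["title"])
        if match is None:
            continue  # inner join: skip movies without financial data
        merged.append({
            "title": row["title"],
            "start_year": int(row["start_year"]),
            "runtime_minutes": int(row["runtime_minutes"]),
            "genres": row["genres"].split(","),
            "production_budget": parse_money(match["production_budget"]),
            "domestic_gross": parse_money(match["domestic_gross"]),
            "worldwide_gross": parse_money(match["worldwide_gross"]),
        })
    return merged


movies = prepare(IMDB_CSV, BUDGETS_CSV)
```

Joining on title alone is a simplification. Different movies can share a title, so a real join would probably also match on release year.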

The second stage focussed on visualisations and insights including:

  • Conducting feature engineering where applicable

  • Creating visualisations

  • Drawing conclusions

  • Providing recommendations
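As an illustration of the feature engineering step, the sketch below derives a return on investment (ROI) figure for each movie and summarises it by genre. The ROI formula used here (worldwide gross minus production budget, divided by production budget) is an assumption rather than the project's confirmed definition, and the sample movies are made up.

```python
from collections import defaultdict
from statistics import median

# Hypothetical cleaned records, in the shape a joined dataset might take.
movies = [
    {"title": "Movie A", "genres": ["Action", "Adventure"],
     "production_budget": 100_000_000, "worldwide_gross": 400_000_000},
    {"title": "Movie C", "genres": ["Comedy"],
     "production_budget": 20_000_000, "worldwide_gross": 15_000_000},
    {"title": "Movie E", "genres": ["Action"],
     "production_budget": 50_000_000, "worldwide_gross": 100_000_000},
]


def roi(movie):
    """Assumed ROI definition: profit as a fraction of the budget."""
    profit = movie["worldwide_gross"] - movie["production_budget"]
    return profit / movie["production_budget"]


def median_roi_by_genre(records):
    """Group ROI values by genre and take the median of each group."""
    by_genre = defaultdict(list)
    for movie in records:
        # A movie with several genres counts towards each of them.
        for genre in movie["genres"]:
            by_genre[genre].append(roi(movie))
    return {genre: median(values) for genre, values in by_genre.items()}


summary = median_roi_by_genre(movies)
for genre, value in sorted(summary.items()):
    print(f"{genre}: {value:.2f}")
```

The median is used instead of the mean so that a single blockbuster does not dominate a genre's figure. This simple ROI also ignores marketing and distribution costs, so it flatters every movie to some degree.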
