It’s common nowadays to use computer technologies to predict stock returns, and many stock prediction algorithms rely on machine learning to search for patterns and insights in stock data. Before you can do that, however, you first need to obtain a data set with the necessary stock data and load it into a data structure in your program. This article covers those first steps required to get started with stock data analysis in Python.

Getting Stock Data from Yahoo Finance

Where can I get data for analysis? This is perhaps the first question you may have after you decide to perform stock analysis programmatically. If so, the Yahoo Finance API is one of the most relevant answers to your question, allowing you to obtain stock market data for free. With the yfinance library, built on top of the Yahoo Finance API, you can obtain the data for a certain stock within a specified time period in Python with just a few lines of code. The library can be installed via pip, as follows:

pip install yfinance

After the successful installation, you can start using the library. Suppose you want to obtain historical stock prices for Inovio Pharmaceuticals, Inc. (INO) over the past three months. This can be done as follows:

>>> import yfinance as yf
>>> tkr = yf.Ticker('INO')
>>> hist = tkr.history(period="3mo")

You can specify another period, depending on your needs. The valid periods are: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, max.
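
As a minimal sketch, the period check can be wrapped in a small helper so that a typo fails before any request is made. The names VALID_PERIODS and fetch_history are hypothetical and not part of yfinance; the period list is simply the one given above, and yfinance is imported inside the function so the validation itself runs without the library or a network connection.

```python
# Periods listed in this article; VALID_PERIODS is a hypothetical helper name.
VALID_PERIODS = ("1d", "5d", "1mo", "3mo", "6mo", "1y", "2y", "5y", "10y", "ytd", "max")


def fetch_history(symbol, period="3mo"):
    """Return price history for `symbol`, rejecting periods not listed above."""
    if period not in VALID_PERIODS:
        raise ValueError(
            f"Unsupported period {period!r}; expected one of {VALID_PERIODS}"
        )
    # Imported lazily; the call below needs yfinance installed and network access.
    import yfinance as yf

    return yf.Ticker(symbol).history(period=period)
```

Calling fetch_history('INO', '6mo') would then return the same kind of object as tkr.history(period="6mo"), while fetch_history('INO', '7w') raises a ValueError.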

Stock Data in a Pandas Dataframe

It is important to note that yfinance returns data in the pandas dataframe format:

>>> type(hist)
<class 'pandas.core.frame.DataFrame'>

So, you can immediately use the methods available for a pandas dataframe object:

>>> hist.head()

            Open  High   Low  Close    Volume  Dividends  Stock Splits
Date
2020-03-30  9.20  9.32  7.60   8.02  31303900          0             0
2020-03-31  7.94  8.05  7.25   7.44  13557300          0             0
2020-04-01  7.32  8.09  7.09   7.70  15973400          0             0
2020-04-02  7.63  7.65  7.10   7.52  10865400          0             0
2020-04-03  7.31  7.94  7.24   7.74  11052900          0             0
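
To illustrate a few more of those dataframe methods without a network connection, here is a short sketch that rebuilds the five rows shown above by hand. The variable names (sample, close_stats, daily_return, up_days) are assumptions made for this example; with real data you would call the same methods on hist directly.

```python
import pandas as pd

# The five rows shown above, rebuilt by hand so the example runs offline.
sample = pd.DataFrame(
    {
        "Open": [9.20, 7.94, 7.32, 7.63, 7.31],
        "High": [9.32, 8.05, 8.09, 7.65, 7.94],
        "Low": [7.60, 7.25, 7.09, 7.10, 7.24],
        "Close": [8.02, 7.44, 7.70, 7.52, 7.74],
        "Volume": [31303900, 13557300, 15973400, 10865400, 11052900],
    },
    index=pd.to_datetime(
        ["2020-03-30", "2020-03-31", "2020-04-01", "2020-04-02", "2020-04-03"]
    ).rename("Date"),
)

# Summary statistics (count, mean, std, min, quartiles, max) for the closing price.
close_stats = sample["Close"].describe()

# Day-over-day fractional change of the closing price; the first row is NaN.
daily_return = sample["Close"].pct_change()

# Rows where the stock closed above its opening price.
up_days = sample[sample["Close"] > sample["Open"]]

print(close_stats)
print(daily_return)
print(up_days)
```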

Suppose you’re interested in closing prices only. The following code shows how you can reformat the dataframe so that it contains only the date and the closing price, with the columns renamed to their integer positions:

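>>> df = hist[['Close']]  # keep only the Close column, selected by label
>>> df = df.reset_index()  # move the Date index into a regular column
>>> columns = dict(map(reversed, enumerate(df.columns)))  # map each column name to its position
>>> df = df.rename(columns=columns)
>>> df.head()
           0     1
0 2020-03-30  8.02
1 2020-03-31  7.44
2 2020-04-01  7.70
3 2020-04-02  7.52
4 2020-04-03  7.74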
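
If you need the same reshaping for more than one column, it can be wrapped in a small function. This is only a sketch: select_and_number is a hypothetical name, and the two-row sample frame is rebuilt by hand from the output shown earlier so that the code runs without yfinance.

```python
import pandas as pd


def select_and_number(frame, columns):
    """Keep `columns`, move the index into a column, and rename columns to 0, 1, 2, ..."""
    out = frame[list(columns)].reset_index()
    out.columns = range(len(out.columns))
    return out


# Two rows from the earlier output, rebuilt by hand so the sketch runs offline.
sample = pd.DataFrame(
    {"Open": [9.20, 7.94], "Close": [8.02, 7.44], "Volume": [31303900, 13557300]},
    index=pd.to_datetime(["2020-03-30", "2020-03-31"]).rename("Date"),
)

# Column 0 holds the dates, column 1 the closing prices, column 2 the volumes.
numbered = select_and_number(sample, ["Close", "Volume"])
print(numbered)
```

Selecting columns by label rather than by position keeps the code working even if the column order of the source dataframe changes.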

