Let’s uncover the practical details of pandas’ Series, DataFrame, and Panel.
_Note to the readers: Paying attention to the comments in the examples will be more helpful than going through the theory itself._
Pandas is a column-oriented data analysis API. It’s a great tool for handling and analyzing input data, and many ML frameworks support pandas data structures as inputs.
_Refer to [Intro to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro) in the pandas docs._
The primary data structures in pandas are implemented as two classes: [**`DataFrame`**](http://0.0.0.0:8000/concepts_machine_learning.html#dataframe-2d-data-structure-table) and [**`Series`**](http://0.0.0.0:8000/concepts_machine_learning.html#series-1d-data-structure-column-vector-of-datatable).
_Import `numpy` and `pandas` into your namespace:_
    import numpy as np
    import pandas as pd
    import matplotlib as mpl

    # Check the installed versions
    np.__version__
    pd.__version__
    mpl.__version__
A Series is a **one-dimensional labeled array** that can hold any data type, and its labels do not have to be unique. The _axis labels_ are collectively referred to as the **index**. The general way to create a Series is to call `pd.Series(data, index=index)`.
`data` can be a NumPy `ndarray`, a Python `dict`, or a scalar value (like `5`). The passed `index` is a list of axis labels.
_Note:_ pandas supports **non-unique index values**. If an operation that does not support duplicate index values is attempted, an exception will be raised at that time.
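To make the note concrete, here is a minimal sketch. The labels and values are made up for illustration, and `reindex` is just one example of an operation that requires unique labels.

```python
import pandas as pd

# A Series whose index contains the label "a" twice (illustrative data).
s = pd.Series([1, 2, 3], index=["a", "a", "b"])

# Duplicate labels are accepted when the Series is built.
has_unique_labels = s.index.is_unique

# Label-based selection returns every element carrying that label.
rows_for_a = s["a"]

# reindex() needs unique labels, so the error only surfaces at this point.
try:
    s.reindex(["a", "b"])
    reindex_failed = False
except ValueError:
    reindex_failed = True
```

Checking `s.index.is_unique` before such an operation is one way to avoid the exception.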
If **`data` is `list` or `ndarray` (preferred way):**
If `data` is an `ndarray` or `list`, then `index` must be of the same length as `data`. If no index is passed, one will be created having values `[0, 1, 2, ... len(data) - 1]`.
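The following sketch shows both cases, plus what happens on a length mismatch. The values and labels are arbitrary examples.

```python
import numpy as np
import pandas as pd

# From an ndarray with explicit labels; len(index) must equal len(data).
s_labeled = pd.Series(np.array([10.0, 20.0, 30.0]), index=["a", "b", "c"])

# From a list with no index: labels default to 0, 1, ..., len(data) - 1.
s_default = pd.Series([10.0, 20.0, 30.0])

# Three values but only two labels: pandas rejects the mismatch.
try:
    pd.Series([10.0, 20.0, 30.0], index=["a", "b"])
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```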
If **`data` is a scalar value:**
If `data` is a scalar value, an `index` must be provided. The value will be repeated to match the length of `index`.
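A short sketch of the scalar case, using the value `5` and three arbitrary labels:

```python
import pandas as pd

# The scalar 5 is repeated once for each label in the index.
s = pd.Series(5, index=["a", "b", "c"])

values = s.tolist()
labels = list(s.index)
```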
If **`data` is `dict`:**
If `data` is a `dict`:
- if `index` is passed, the values in `data` corresponding to the labels in `index` will be pulled out;
- otherwise, an `index` will be constructed from the keys of the `dict`. With Python 3.6+ and pandas 0.23+ the keys keep their insertion order; older versions use the sorted keys, if possible.
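Both `dict` cases can be sketched as follows. The dictionary contents are illustrative, and the example avoids relying on key order because that depends on the Python and pandas versions in use.

```python
import pandas as pd

data = {"b": 1, "a": 0, "c": 2}

# No index passed: the labels come from the dict keys.
s_all = pd.Series(data)

# Index passed: values are looked up by label, in the order requested.
# A label that is not a key of the dict ("z" here) gets NaN.
s_picked = pd.Series(data, index=["c", "a", "z"])
```

Because `NaN` is a float, `s_picked` ends up with a floating-point dtype even though the dict holds integers.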
**SERIES IS LIKE `NDARRAY` AND `DICT` COMBINED**
`Series` acts very similarly to an `ndarray` and can be passed to most NumPy functions that expect a 1D `ndarray`. However, operations such as slicing also slice the index.
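A small sketch of the `ndarray`-like behaviour, with made-up values. `np.exp` stands in for any NumPy ufunc, and `.iloc` is used here to make the positional slice explicit.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"])

# A NumPy ufunc accepts a Series and returns a Series with the same index.
exp_s = np.exp(s)

# Slicing by position keeps the matching slice of the index.
first_two = s.iloc[:2]

# Boolean masks work the same way as on an ndarray.
above_median = s[s > s.median()]
```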
A key difference between `Series` and `ndarray` is that operations between `Series` objects automatically align the data based on labels. Thus, you can write computations without giving consideration to whether the `Series` objects involved have the same labels.
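Label alignment can be illustrated with a minimal sketch that adds two overlapping slices of the same Series. The data is arbitrary.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"])

left = s.iloc[1:]    # labels b, c, d
right = s.iloc[:-1]  # labels a, b, c

# Values are matched by label, not by position. The result covers the
# union of both indexes, and labels found on only one side become NaN.
total = left + right
```

With plain `ndarray` objects the same addition would pair elements by position instead, so `2.0` would be added to `1.0` rather than to itself.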