Let’s uncover practical details of Pandas’s Series, DataFrame, and Panel.
_Note to the Readers: Paying attention to the comments in the examples will be more helpful than going through the theory itself._
· Series (1D data structure: Column-vector of DataTable)
· DataFrame (2D data structure: Table)
Pandas is a column-oriented data analysis API. It’s a great tool for handling and analyzing input data, and many ML frameworks support pandas data structures as inputs.
_Refer to [Intro to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro) in the pandas docs._
The primary data structures in pandas are implemented as two classes: [**DataFrame**](http://0.0.0.0:8000/concepts_machine_learning.html#dataframe-2d-data-structure-table) and [**Series**](http://0.0.0.0:8000/concepts_machine_learning.html#series-1d-data-structure-column-vector-of-datatable).
_Import `numpy` and `pandas` into your namespace:_
```python
import numpy as np
import pandas as pd
import matplotlib as mpl
np.__version__
pd.__version__
mpl.__version__
```
CREATING SERIES
A Series is a **one-dimensional labeled array** capable of holding any data type; its labels (the index) need not be unique. The _axis labels_ are collectively referred to as the **index**. The general way to create a Series is to call:
```python
pd.Series(data, index=index)
```
_Here, `data` can be a NumPy `ndarray`, a Python `dict`, or a scalar value (like `5`). The passed `index` is a list of axis labels._
_Note:_ pandas supports **non-unique index values**. If an operation that does not support duplicate index values is attempted, an exception will be raised at that time.
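To make the note on duplicate labels concrete, here is a minimal sketch. The variable names are illustrative, and `reindex` is used only as one example of an operation that rejects duplicate labels; the exact error message varies between pandas versions.

```python
import pandas as pd

# Two elements share the label "a"; pandas accepts this at construction time.
s_dup = pd.Series([1, 2, 3], index=["a", "a", "b"])

# Label lookup on a duplicated label returns every matching element.
print(s_dup["a"])
print(s_dup.index.is_unique)

# The error only appears when an operation needs unique labels.
raised = False
try:
    s_dup.reindex(["a", "b"])
except ValueError as err:
    raised = True
    print("ValueError:", err)
```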
If **`data` is `list` or `ndarray` (preferred way):**
_If `data` is an `ndarray` or `list`, then `index` must be of the same length as `data`. If no index is passed, one will be created having values `[0, 1, 2, ..., len(data) - 1]`._
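The sketch below illustrates both cases with made-up values: a Series built from an `ndarray` with an explicit index, and one built from a `list` with the default integer index. It also shows what happens when the lengths disagree.

```python
import numpy as np
import pandas as pd

values = np.array([0.5, 1.5, 2.5, 3.5])

# Explicit index: one label per element of the array.
s_labeled = pd.Series(values, index=["a", "b", "c", "d"])

# No index passed: pandas generates 0, 1, ..., len(data) - 1.
s_default = pd.Series([10, 20, 30])

print(s_labeled)
print(s_default.index.tolist())

# An index whose length differs from the data raises ValueError.
try:
    pd.Series(values, index=["a", "b"])
except ValueError as err:
    print("ValueError:", err)
```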
If **`data` is a scalar value:**
_If `data` is a scalar value, an `index` must be provided. The value will be repeated to match the length of `index`._
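A short sketch of the scalar case, using an arbitrary value and arbitrary labels:

```python
import pandas as pd

# The scalar 5 is repeated once per label in the index.
s_scalar = pd.Series(5, index=["a", "b", "c"])

print(s_scalar)
```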
If **`data` is `dict`:**
_If `data` is a `dict` and an `index` is passed, the values in `data` corresponding to the labels in the `index` will be pulled out. Otherwise, an `index` will be constructed from the keys of the `dict`: in insertion order on pandas 0.23+ with Python 3.6+, and from the sorted keys, if possible, on older versions._
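The following sketch shows both branches with a made-up dictionary. The label `"z"` is deliberately absent from the `dict`, to show that a requested label with no matching key ends up as `NaN`.

```python
import numpy as np
import pandas as pd

data = {"b": 1.0, "a": 0.0, "c": 2.0}

# No index passed: the index is built from the dict keys.
s_from_dict = pd.Series(data)

# Index passed: values are pulled out by label, in the order requested.
# "z" is not a key of the dict, so its value is NaN.
s_selected = pd.Series(data, index=["c", "a", "z"])

print(s_from_dict)
print(s_selected)
```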
**SERIES IS LIKE `NDARRAY` AND `DICT` COMBINED**
_A Series acts very similarly to an `ndarray` and is a valid argument to most NumPy functions. However, things like slicing also slice the index. A Series can be passed to most NumPy methods expecting a 1D `ndarray`._
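As a rough illustration of the `ndarray`-like behavior, the sketch below applies a NumPy function, a positional slice, and a boolean mask to a small made-up Series. In each case the result is still a Series and the labels travel with the values.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0, 16.0], index=["a", "b", "c", "d"])

# NumPy ufuncs accept a Series and return a Series with the same index.
roots = np.sqrt(s)

# Slicing by position slices the index along with the values.
first_two = s.iloc[:2]

# Boolean masks work as they do on an ndarray.
above_median = s[s > s.median()]

print(roots)
print(first_two)
print(above_median)
```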
_A key difference between `Series` and `ndarray` is the automatic alignment of data based on labels during `Series` operations. Thus, you can write computations without giving consideration to whether the `Series` objects involved have the same labels._
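Label alignment can be sketched as follows, with made-up values. The two operands cover different labels, so the sum is computed only where a label exists on both sides and is `NaN` everywhere else.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"])

left = s.iloc[1:]    # labels b, c, d
right = s.iloc[:-1]  # labels a, b, c

# Values are matched by label, not by position. The result's index is the
# union of both indexes; labels missing from either operand yield NaN.
total = left + right

print(total)
```

With a plain `ndarray`, the same two slices would be added position by position, pairing values that carry different labels.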