Fluent Pandas: Pandas Data Structures

Let’s uncover the practical details of Pandas’ Series, DataFrame, and Panel. Pandas is a column-oriented data analysis API and a great tool for handling and analyzing input data. Note that Panel was deprecated in pandas 0.20 and removed in 0.25.

_Note to the Readers: Paying attention to the comments in the examples will be more helpful than going through the theory itself._

· Series (1D data structure: Column-vector of DataTable)

· DataFrame (2D data structure: Table)

· Panel (3D data structure)

Handy References

Pandas Data Structures

_Refer to [Intro to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro) in the Pandas docs._

The primary data structures in pandas are implemented as two classes: **`DataFrame`** and **`Series`**.

Import `numpy` and `pandas` into your namespace:

import numpy as np
import pandas as pd
import matplotlib as mpl

Series (1D data structure: Column-vector of DataTable)


A Series is a **one-dimensional labeled array** capable of holding any data type; its labels need not be unique. The axis labels are collectively referred to as the **index**. The general way to create a Series is to call:

pd.Series(data, index=index)

Here, `data` can be a NumPy `ndarray`, a Python `list` or `dict`, or a scalar value (like `5`). The passed `index` is a list of axis labels.

_Note:_ pandas supports **non-unique index values**. If an operation that does not support duplicate index values is attempted, an exception will be raised at that time.
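To make the note above concrete, here is a minimal sketch. The labels and values are made up for illustration, and `reindex` is used only as one example of an operation that rejects duplicate labels.

```python
import pandas as pd

# A Series whose index contains the label "a" twice (illustrative data).
s = pd.Series([1, 2, 3], index=["a", "a", "b"])

# Selecting a duplicated label returns every matching row as a Series.
both_a = s["a"]

# reindex cannot work with duplicate labels, so the exception is raised
# only when the operation is attempted, not when the Series is created.
reindex_failed = False
try:
    s.reindex(["a", "b", "c"])
except ValueError:
    reindex_failed = True
```

Creating the Series succeeds; the failure only shows up at the `reindex` call.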

If **`data` is `list` or `ndarray` (preferred way):**

If `data` is an `ndarray` or `list`, then `index` must be the same length as `data`. If no index is passed, one will be created with the values `[0, 1, 2, ..., len(data) - 1]`.
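A short sketch of this case, using made-up values and labels: one Series gets an explicit index, one falls back to the default integer index, and a mismatched index length is rejected.

```python
import numpy as np
import pandas as pd

values = np.array([10.0, 20.0, 30.0])  # illustrative data

# Explicit index: must have exactly as many labels as there are values.
labeled = pd.Series(values, index=["a", "b", "c"])

# No index passed: pandas creates the labels 0, 1, 2.
default = pd.Series([10, 20, 30])

# An index of the wrong length raises a ValueError.
length_mismatch = False
try:
    pd.Series(values, index=["a", "b"])
except ValueError:
    length_mismatch = True
```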

If **`data` is a scalar value:**

If `data` is a scalar value, an `index` must be provided. The value will be repeated to match the length of `index`.
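For instance, assuming three illustrative labels, the scalar `5` is broadcast to every label:

```python
import pandas as pd

# The scalar is repeated once per index label.
fives = pd.Series(5, index=["a", "b", "c"])
```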

If **`data` is `dict`:**

If `data` is a `dict` and an `index` is passed, the values in `data` corresponding to the labels in the `index` will be pulled out. Otherwise, an `index` will be constructed from the keys of the `dict`: in insertion order on pandas 0.23+ with Python 3.6+, and from the sorted keys (if possible) on older versions.
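Both dict cases can be sketched as follows. The dict contents and the label `"z"` are invented for illustration; a requested label that is missing from the dict comes back as `NaN`.

```python
import pandas as pd

d = {"b": 1, "a": 0, "c": 2}  # illustrative data

# No index passed: the index is built from the dict keys.
from_keys = pd.Series(d)

# Index passed: values are looked up by label, in the order requested.
# "z" is not a key of d, so its value is NaN (and the dtype becomes float).
selected = pd.Series(d, index=["c", "a", "z"])
```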


`Series` acts very similarly to an `ndarray` and can be passed to most NumPy functions that expect a 1D `ndarray`. However, operations such as slicing also slice the index.
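A small sketch of this `ndarray`-like behaviour, with made-up data. `np.sqrt` stands in for any element-wise NumPy function, and `.iloc` is used here to make the positional slice explicit.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0, 16.0], index=["a", "b", "c", "d"])

# A NumPy function applied to a Series returns a Series with the same index.
roots = np.sqrt(s)

# Slicing by position keeps the matching labels alongside the values.
middle = s.iloc[1:3]
```

Unlike a plain `ndarray` slice, `middle` still carries the labels `"b"` and `"c"`.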

A key difference between `Series` and `ndarray` is that operations between `Series` automatically align the data based on labels. Thus, you can write computations without considering whether the `Series` objects involved have the same labels. For example,

