1602421200

This approach is useful anytime we want to know information about both the individual records **_and_** the groups they belong to.

For example, if we have customer-level transaction data, an approach like this can provide us with information about each individual transaction, as well as the total sales during the month in which it took place:
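As a minimal sketch of that idea (not the author's original code), the snippet below uses pandas `groupby` with `transform` to attach each month's total sales to every transaction row. The column names `customer_id`, `date`, `amount`, and `monthly_sales` and all values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical customer-level transactions (made-up values for illustration).
transactions = pd.DataFrame({
    "customer_id": [1, 2, 1, 3, 2],
    "date": pd.to_datetime(
        ["2020-01-05", "2020-01-20", "2020-02-03", "2020-02-14", "2020-02-28"]
    ),
    "amount": [100.0, 50.0, 75.0, 25.0, 60.0],
})

# Derive the calendar month of each transaction.
month = transactions["date"].dt.to_period("M")

# transform("sum") returns one value per original row, so every transaction
# keeps its own amount and also carries the total for its month.
transactions["monthly_sales"] = (
    transactions.groupby(month)["amount"].transform("sum")
)
```

Because `transform` preserves the original row count, the result can be assigned straight back as a new column, unlike an aggregation, which would collapse each month to a single row.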

In this post, I’ll walk through another type of window function: one where we perform our calculation based on the position of the rows, rather than the values of a categorical column.

For my examples below, I’ll work with some game-level basketball data from the NCAA ML competition on Kaggle.

If basketball is not your thing, fear not.

Here is a quick data dictionary that tells you all you need to know about the variables:

- **DayNum:** Our measure of time. It counts how many days into the season the game occurred.
- **Season:** The year the game took place.
- **Team1/Team2:** The IDs of the teams that played in the game.
- **Efficiency:** A measure of how well Team1 performed in the game.
- **Outcome:** Flag for whether or not Team1 won the game (1 is win, 0 is loss).
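To make the position-based window idea concrete, here is a small sketch that reuses the column names from the data dictionary. The rows are toy values, not the Kaggle data, and the window size of 2 and the `RollingEff` column name are assumptions chosen only for illustration.

```python
import pandas as pd

# Toy rows using the column names from the data dictionary; values are made up.
games = pd.DataFrame({
    "Season": [2019] * 6,
    "DayNum": [1, 3, 5, 2, 4, 6],
    "Team1": [101, 101, 101, 102, 102, 102],
    "Efficiency": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
})

# Row position only makes sense once the games are in chronological order.
games = games.sort_values(["Season", "Team1", "DayNum"])

# Mean efficiency over each team's current and previous game (2-row window).
# min_periods=1 lets the first game of a team use just itself.
games["RollingEff"] = (
    games.groupby(["Season", "Team1"])["Efficiency"]
    .transform(lambda s: s.rolling(window=2, min_periods=1).mean())
)
```

The window here is defined by how many rows back we look within each team and season, not by the value of any column.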

#programming #technology #python #machine-learning #data-science

1616818722

In my last post, I covered multiple selection and filtering in the Pandas library. In this post, I will cover time series basics with Pandas. Time series data is an important data structure in fields such as finance and economics: any values measured or observed over time form a time series. Pandas is very useful for time series analysis because it provides tools that make this kind of data easy to analyze.

In this article, I will explain the following topics.

- What is a time series?
- What are time series data structures?
- How to create a time series?
- What are the important methods used in time series?
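As a quick preview of the "how to create a time series" topic, the sketch below builds a Series on a `DatetimeIndex` with `pd.date_range`. The dates and values are arbitrary examples, not taken from the post.

```python
import pandas as pd

# Build a daily DatetimeIndex; the start date and values are arbitrary examples.
dates = pd.date_range(start="2021-01-01", periods=5, freq="D")
ts = pd.Series([10, 12, 11, 15, 14], index=dates)

# Date strings can be used for label-based slicing (both endpoints included).
first_three = ts.loc["2021-01-01":"2021-01-03"]

# A single observation can be looked up with a Timestamp.
stamp = pd.Timestamp("2021-01-04")
value_on_stamp = ts.loc[stamp]
```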

Before we start: our Medium page includes posts on data science, artificial intelligence, machine learning, and deep learning. Please don’t forget to follow us on **Medium** 🌱 to see these and our latest posts.

Let’s get started.

#what-is-time-series #pandas #time-series-python #timeseries #time-series-data

1616832900

In the last post, I talked about working with time series. In this post, I will talk about important time series methods. Time series analysis is used very frequently in finance, and Pandas is a very important library for it.

In summary, I will explain the following topics in this lesson:

- Resampling
- Shifting
- Moving Window Functions
- Time zone
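The four topics above can each be sketched in a line or two of pandas. This is only an illustrative preview with made-up daily values; the bin size, window size, and time zones are assumptions chosen for the example.

```python
import pandas as pd

idx = pd.date_range("2021-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

# Resampling: aggregate the daily values into 2-day bins.
two_day_sum = ts.resample("2D").sum()

# Shifting: move values one step forward in time (the first value becomes NaN).
shifted = ts.shift(1)

# Moving window: mean over the current and the two previous observations.
rolling_mean = ts.rolling(window=3).mean()

# Time zone: mark the naive timestamps as UTC, then convert to another zone.
ts_utc = ts.tz_localize("UTC")
ts_ny = ts_utc.tz_convert("America/New_York")
```

Note that `tz_localize` attaches a zone to naive timestamps, while `tz_convert` changes the zone of timestamps that already have one.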

Let’s get started.

#pandas-time-series #timeseries #time-series-python #time-series-analysis

1586702221

In this post, we will learn about pandas’ data structures/objects. Pandas provides two types of data structures: Series and DataFrame.

A Pandas Series is one-dimensional indexed data that can hold values such as integers, strings, booleans, floats, and Python objects. A Series has a single data type (dtype) at a time; mixed values are stored under the generic object dtype. The axis labels of the data are called the index of the Series. The labels need not be unique, but they must be of a hashable type. The index can consist of integers, strings, or even dates and times. In general, a Pandas Series is like a column of an Excel sheet, with the row labels being the index of the Series.
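A short sketch of these properties, using arbitrary example labels and values:

```python
import pandas as pd

# A Series with string labels as its index.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
value_b = s["b"]  # label-based lookup

# Index labels do not have to be unique: "x" appears twice here,
# so selecting "x" returns a Series with both matching rows.
dup = pd.Series([1, 2, 3], index=["x", "x", "y"])
both_x = dup["x"]

# Mixed Python values are stored under the generic object dtype.
mixed = pd.Series([1, "two", 3.0])
```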

The DataFrame is the primary data structure of pandas. It is a two-dimensional, size-mutable structure with flexible row indices and flexible column names. In general, it is just like an Excel sheet or an SQL table. It can also be seen as a dict-like container for Series objects.
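The "dict-like container of Series" view can be sketched as follows. The column names, row labels, and values are hypothetical.

```python
import pandas as pd

# Build a DataFrame from a dict of Series that share the same row labels.
df = pd.DataFrame({
    "name": pd.Series(["Ann", "Bob"], index=["r1", "r2"]),
    "age": pd.Series([30, 25], index=["r1", "r2"]),
})

# Selecting a column by name returns a Series, just like a dict lookup.
age_column = df["age"]

# Size-mutable: a new column can be added after construction.
df["senior"] = df["age"] > 28
```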

#python #python-pandas #pandas-dataframe #pandas-series #pandas-tutorial

1592498546

Learning to handle NaN in a Series is an essential first step in dealing with missing data in data analytics. Let’s explore…
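As a hedged preview of what handling NaN in a Series usually involves, the sketch below uses a few standard pandas methods on made-up values. It is not taken from the linked post.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan])

# Detect missing values: a boolean mask and a count.
missing_mask = s.isna()
n_missing = int(missing_mask.sum())

# Replace missing values with a chosen default (0.0 is an arbitrary choice).
filled = s.fillna(0.0)

# Or drop the missing entries entirely.
dropped = s.dropna()

# Most reductions skip NaN by default, so the sum ignores the missing rows.
total = s.sum()
```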

#python #pandas #programming #pandas-series #pandas.series #nan

1595685600

In this article, we will discuss time series forecasting, a technique that helps us analyze past trends and focus on what is likely to unfold next.

**What is Time Series Analysis?**

In this analysis, there is one key variable: time. A time series is a set of observations taken at specified times, usually at equal intervals. It is used to predict future values based on previously observed data points.

**Here are some examples of where time series analysis is used:**

- Business forecasting
- Understanding past behavior
- Planning for the future
- Evaluating current accomplishments

**Components of a time series:**

- **Trend:** Suppose someone opens a hardware store in an area under new construction. While construction is going on, people buy hardware, but once construction is complete, the number of buyers falls. Sales rise for a while and then decline; these movements are called an uptrend and a downtrend.
- **Seasonality:** Chocolate sales rise at the end of every year because of Christmas. The same pattern repeats every year, which is not the case for a trend. Seasonality is the same pattern repeating at the same intervals.
- **Irregularity:** Also called noise. It arises when something unusual disrupts the regular pattern. For example, a natural disaster such as a flood, which happens once in many years, may lead people to buy more medicine in that period. No one predicted the event, and you cannot know how many sales will result.
- **Cyclic:** Repeated up-and-down movements that can span more than one year. They have no fixed pattern, can occur at any time, and are much harder to predict.
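To make the components more tangible, here is a small synthetic sketch that adds a trend, a seasonal pattern, and noise, then smooths over one full season. The slope, the 4-step seasonal pattern, and the noise scale are all assumptions chosen for illustration, and the cyclic component is left out for simplicity.

```python
import numpy as np
import pandas as pd

n, period = 24, 4
t = np.arange(n)
idx = pd.date_range("2020-01-01", periods=n, freq="D")

# Trend: a steady upward drift.
trend = pd.Series(0.5 * t, index=idx)

# Seasonality: the same 4-step pattern (summing to zero) repeated.
seasonal = pd.Series(np.tile([1.0, -1.0, 2.0, -2.0], n // period), index=idx)

# Irregularity: small random noise (seeded so the example is reproducible).
rng = np.random.default_rng(0)
noise = pd.Series(rng.normal(scale=0.1, size=n), index=idx)

# The observed series is the sum of its components (an additive model).
y = trend + seasonal + noise

# Averaging over exactly one period cancels the seasonal pattern,
# leaving the trend plus smoothed noise.
smoothed = y.rolling(window=period).mean()
```

The first `period - 1` values of `smoothed` are NaN because a full window is not yet available there.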

**Stationarity of a time series:**

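A series is said to be “strictly stationary” if the joint distribution of any set of its observations is unchanged when they are shifted in time; in particular, the marginal distribution $p(Y_t)$ of $Y$ at time $t$ is the same as at any other point in time. This implies that the mean, variance, and covariance of the series $Y_t$, when they exist, are time-invariant.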

However, a series is said to be “weakly stationary” or “covariance stationary” if its mean and variance are constant and the covariance between two points satisfies $\mathrm{Cov}(Y_1, Y_{1+k}) = \mathrm{Cov}(Y_2, Y_{2+k}) = \text{const}$, which depends only on the lag $k$ and not explicitly on time.
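One informal way to build intuition for weak stationarity is to compare the mean and variance of the two halves of a series and to look at the sample autocovariance at a given lag. The helpers `half_stats` and `autocov` below are hypothetical names introduced for this sketch. This is an eyeball check on toy data, not a formal statistical test.

```python
import numpy as np
import pandas as pd


def half_stats(series: pd.Series) -> pd.DataFrame:
    """Mean and variance of the first and second half of a series."""
    mid = len(series) // 2
    halves = {"first_half": series.iloc[:mid], "second_half": series.iloc[mid:]}
    return pd.DataFrame(
        {name: {"mean": h.mean(), "var": h.var()} for name, h in halves.items()}
    )


def autocov(series: pd.Series, k: int) -> float:
    """Sample autocovariance at lag k (dividing by the series length n)."""
    x = series.to_numpy(dtype=float)
    d = x - x.mean()
    n = len(x)
    return float(np.dot(d[: n - k], d[k:]) / n)


# A series with a trend: its mean clearly changes over time.
trending = pd.Series(np.arange(100, dtype=float))

# An alternating series: mean and variance are the same in both halves.
alternating = pd.Series(np.tile([1.0, -1.0], 50))

trend_stats = half_stats(trending)
alt_stats = half_stats(alternating)
lag1 = autocov(alternating, 1)
```

For the trending series the two half-means differ substantially, which already rules out weak stationarity; for the alternating series the half-means and half-variances match.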

#machine-learning #time-series-model #machine-learning-ai #time-series-forecasting #time-series-analysis