Arvel Parker


How to Use MultiIndex in Pandas to Level Up Your Analysis

What if you could have more than one column in your DataFrame’s index?

The multi-level index feature in Pandas allows you to do just that. A regular Pandas DataFrame has a single column that acts as a unique row identifier, or in other words, an “index”. These index values can be numbers, from 0 upward. They can also be more descriptive, like having “Dish Name” as the index for a table of all the food at a McDonald’s franchise.

But what if you owned two McDonald’s franchises, and wanted to compare the sales of one dish across both franchises?

While the groupby() function in Pandas would work, this case is also an example of where a MultiIndex could come in handy.

A MultiIndex, also known as a multi-level index or hierarchical index, allows you to have multiple columns acting as a row identifier, with each index column related to another through a parent/child relationship.

At the end of this piece, we’ll have answered the following questions by creating and selecting from a DataFrame with a hierarchical index:

  • Which characters speak in the first chapter of “The Fellowship of the Ring”? (answered with .loc)
  • Who are the first three elves to speak in “The Fellowship of the Ring”? (answered with .loc)
  • How much do Gandalf and Saruman talk in “The Two Towers”? (answered with .loc)
  • How much does Isildur speak in all of the films? (answered with .xs)
  • Which hobbits speak the most in each film and across all three films? (answered with a pivot table and .loc)

You can find the data used in this article here. We will be using data from “The Lord of the Rings” films, specifically the “WordsByCharacter.csv” file in the data set. This file contains each character’s number of words spoken in each scene of every movie.
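Before loading the real file, it may help to see the two selection tools on a toy table. The sketch below is only an illustration: the column names (Film, Chapter, Race, Character, Words), the chapter labels, and the word counts are invented placeholders, and the actual “WordsByCharacter.csv” may be laid out differently.

```python
import pandas as pd

# Invented placeholder rows; the word counts are NOT taken from the real data set.
rows = [
    ("The Fellowship of the Ring", "01: Prologue", "Hobbit", "Bilbo", 10),
    ("The Fellowship of the Ring", "01: Prologue", "Elf", "Galadriel", 20),
    ("The Fellowship of the Ring", "02: Concerning Hobbits", "Hobbit", "Bilbo", 30),
    ("The Two Towers", "01: The Foundations of Stone", "Ainur", "Gandalf", 40),
    ("The Two Towers", "02: Elven Rope", "Hobbit", "Frodo", 50),
]
df = pd.DataFrame(rows, columns=["Film", "Chapter", "Race", "Character", "Words"])

# Move four columns into the index to build the hierarchy, then sort it
# so that label-based slicing is well defined.
multi = df.set_index(["Film", "Chapter", "Race", "Character"]).sort_index()

# .loc with a tuple selects on the outer levels, left to right.
first_chapter = multi.loc[("The Fellowship of the Ring", "01: Prologue"), :]
speakers = first_chapter.index.get_level_values("Character").tolist()

# .xs selects on an inner level without spelling out the outer ones.
bilbo = multi.xs("Bilbo", level="Character")
bilbo_total = int(bilbo["Words"].sum())

print(speakers)
print(bilbo_total)
```

The design point is that .loc reads the hierarchy from the outside in, while .xs can reach straight into an inner level such as Character.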

#programming #technology #data-science #python #coding

Ray Patel


Getting started with Time Series using Pandas

An introductory guide on getting started with the Time Series Analysis in Python

Time series analysis is the backbone for many companies, since most businesses work by analyzing their past data to inform their future decisions. Analyzing such data can be tricky, but Python, as a programming language, can help to deal with it. Python has both inbuilt tools and external libraries, making the whole analysis process both seamless and easy. Python’s pandas library is frequently used to import, manage, and analyze datasets in various formats. In this article, however, we’ll use it to analyze stock prices and perform some basic time-series operations.
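As a rough idea of what “basic time-series operations” can look like in pandas, here is a minimal sketch. The prices are invented for illustration and do not come from any real stock; the choice of a 3-day window and weekly resampling is likewise just an assumption.

```python
import pandas as pd

# Invented closing prices on ten consecutive business days.
dates = pd.date_range("2021-01-04", periods=10, freq="B")
close = pd.Series(
    [100.0, 101.5, 99.8, 102.2, 103.0, 104.1, 103.5, 105.0, 106.2, 105.8],
    index=dates,
    name="Close",
)

# Day-over-day percentage change (the first value has no predecessor).
daily_return = close.pct_change()

# 3-day moving average (the first two values lack a full window).
rolling_mean = close.rolling(window=3).mean()

# Last closing price of each calendar week.
weekly_last = close.resample("W").last()

print(weekly_last)
```

Because the index is a DatetimeIndex, pandas can group by calendar week without any manual date arithmetic.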

#data-analysis #time-series-analysis #exploratory-data-analysis #stock-market-analysis #financial-analysis #getting started with time series using pandas

Why Use WordPress? What Can You Do With WordPress?

Can you use WordPress for anything other than blogging? To your surprise, yes. WordPress is more than just a blogging tool, and it has helped thousands of websites and web applications to thrive. WordPress powers around 40% of online projects, and in today’s blog we look at some amazing uses of WordPress other than blogging.
What Is The Use Of WordPress?

WordPress is the most popular website platform in the world. It is the first choice of businesses that want to set up a feature-rich and dynamic Content Management System. So, if you ask what WordPress is used for, the answer is: everything. It is a super-flexible, feature-rich and secure platform that offers everything needed to build unique websites and applications. Let’s go through them:

1. Multiple Websites Under A Single Installation
WordPress Multisite allows you to develop multiple sites from a single WordPress installation. You can download WordPress and start building the websites you want to launch on a single server. You can handle hundreds of sites from one dashboard.
It is a highly efficient setup that lets you run several websites under the same login credentials. One of the best things about WordPress is the themes it has to offer. You can download themes and plugins once and use them across your sites, saving space without losing speed.

2. WordPress Social Network
WordPress can be used for high-end projects such as a social media network. If you don’t have the money and patience to hire a coder and invest months in building a feature-rich social media site, go for WordPress. It is one of the most amazing uses of WordPress, and its CMS makes the process a lot easier. You can build sites along the lines of Facebook or Reddit.
To set up a social media network, you would have to download a WordPress Plugin called BuddyPress. It would allow you to connect a community page with ease and would provide all the necessary features of a community or social media. It has direct messaging, activity stream, user groups, extended profiles, and so much more. You just have to download and configure it.
If BuddyPress doesn’t meet all your needs, don’t give up on your dreams. You can try out WP Symposium or PeepSo. There are also several themes you can use to build a social network.

3. Create A Forum For Your Brand’s Community
Communities are very important for your business. They help you stay in constant contact with your users and consumers and allow you to turn them into a loyal customer base. While there are many good technologies for building a community page, the good old WordPress is still the best.
If you want to build your online community, consider the features you get with WordPress. bbPress, for example, is an open-source, template-driven PHP/MySQL forum plugin. It is very simple and doesn’t hamper the experience of the website.
Other tools such as wpFoRo and Asgaros Forum are equally good for creating a community blog. They are lightweight tools that are easy to manage and integrate with your WordPress site easily. However, there is only one tiny problem; you need to have some technical knowledge to build a WordPress Community blog page.

4. Shortcodes
Since we raised a problem in the previous section, here is a solution for it. You might not know how to code, but you have shortcodes. Shortcodes let you execute functions without writing the code yourself. They are an easy way to build a website, add new features, and customize plugins. Because they are short snippets rather than multiple lines of code to memorize, you can start building a feature-rich website or application with very little technical knowledge.
There are also plugins like Shortcoder, Shortcodes Ultimate, and the Basics available on WordPress that can be used, and you would not even have to remember the shortcodes.

5. Build Online Stores
If you are still wondering why to use WordPress, use it to build an online store. You can start selling your goods online. It is an affordable technology that helps you build a feature-rich eCommerce store.
WooCommerce is an extension of WordPress and is one of the most used eCommerce solutions. WooCommerce holds a 28% share of the global market and is one of the best ways to set up an online store. It allows you to build user-friendly and professional online stores and has thousands of free and paid extensions. Moreover, it is an open-source platform, so you don’t have to pay for a license.
Apart from WooCommerce, there are Easy Digital Downloads, iThemes Exchange, Shopify eCommerce plugin, and so much more available.

6. Security Features
WordPress takes security very seriously. It offers tons of external solutions that help you safeguard your WordPress site. While there is no way to ensure 100% security, it provides regular updates with security patches and several plugins to help with backups, two-factor authentication, and more.
By choosing hosting providers like WP Engine, you can improve the security of the website. Such providers help with threat detection, managed patching and updates, internal security audits, and so much more.


#use of wordpress #use wordpress for business website #use wordpress for website #what is use of wordpress #why use wordpress #why use wordpress to build a website

Tyrique Littel


Static Code Analysis: What It Is? How to Use It?

Static code analysis refers to the technique of approximating the runtime behavior of a program. In other words, it is the process of predicting the output of a program without actually executing it.

Lately, however, the term “Static Code Analysis” is more commonly used to refer to one application of this technique rather than the technique itself: program comprehension, that is, understanding the program and detecting issues in it (anything from syntax errors to type mismatches, performance hogs, likely bugs, security loopholes, etc.). This is the usage we refer to throughout this post.

“The refinement of techniques for the prompt discovery of error serves as well as any other as a hallmark of what we mean by science.”

  • J. Robert Oppenheimer


We cover a lot of ground in this post. The aim is to build an understanding of static code analysis and to equip you with the basic theory, and the right tools so that you can write analyzers on your own.

We start our journey by laying down the essential parts of the pipeline which a compiler follows to understand what a piece of code does. We learn where to tap into this pipeline to plug in our analyzers and extract meaningful information. In the latter half, we get our feet wet and write four such static analyzers, completely from scratch, in Python.

Note that although the ideas here are discussed in light of Python, static code analyzers across all programming languages are carved out along similar lines. We chose Python because of the availability of an easy to use ast module, and wide adoption of the language itself.
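To give a feel for why the ast module makes this approachable, here is a tiny sketch that lists the functions a snippet calls without ever running it. It is not one of the four analyzers from this post, just an illustration; the source string and the variable names are made up for the example.

```python
import ast

# A made-up snippet to inspect; it is parsed, never executed.
source = "color = input('Enter your favourite color: ')\nprint(color)\n"

tree = ast.parse(source)

# Walk every node and collect the names of plain function calls,
# i.e. calls of the form name(...), skipping things like obj.method(...).
called_names = [
    node.func.id
    for node in ast.walk(tree)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
]

print(sorted(called_names))
```

The same walk-and-match pattern can be pointed at other node types, which is roughly how simple checks are usually built on top of a syntax tree.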

How does it all work?

Before a computer can finally “understand” and execute a piece of code, it goes through a series of complicated transformations:

static analysis workflow

As you can see in the diagram (go ahead, zoom it!), the static analyzers feed on the output of these stages. To be able to better understand the static analysis techniques, let’s look at each of these steps in some more detail:


The first thing that a compiler does when trying to understand a piece of code is to break it down into smaller chunks, also known as tokens. Tokens are akin to what words are in a language.

A token might consist of either a single character, like (, or literals (like integers, strings, e.g., 7, Bob, etc.), or reserved keywords of that language (e.g., def in Python). Characters which do not contribute towards the semantics of a program, like trailing whitespace, comments, etc. are often discarded by the scanner.

Python provides the tokenize module in its standard library to let you play around with tokens:

import io
import tokenize

code = b"color = input('Enter your favourite color: ')"

# tokenize.tokenize expects a readline callable that returns bytes
for token in tokenize.tokenize(io.BytesIO(code).readline):
    print(token)

TokenInfo(type=62 (ENCODING),  string='utf-8')
TokenInfo(type=1  (NAME),      string='color')
TokenInfo(type=54 (OP),        string='=')
TokenInfo(type=1  (NAME),      string='input')
TokenInfo(type=54 (OP),        string='(')
TokenInfo(type=3  (STRING),    string="'Enter your favourite color: '")
TokenInfo(type=54 (OP),        string=')')
TokenInfo(type=4  (NEWLINE),   string='')
TokenInfo(type=0  (ENDMARKER), string='')

(Note that for the sake of readability, I’ve omitted a few columns from the result above — metadata like starting index, ending index, a copy of the line on which a token occurs, etc.)
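If you only want the two columns shown above, a small variation keeps just the token type name and the token text. This is a sketch rather than part of the original walkthrough; it looks names up through tokenize.tok_name because the numeric type codes can differ between Python versions.

```python
import io
import tokenize

code = b"color = input('Enter your favourite color: ')"

# Pair each token's symbolic type name with its text.
summary = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.tokenize(io.BytesIO(code).readline)
]

for type_name, text in summary:
    print(f"{type_name:<10} {text!r}")
```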

#code quality #code review #static analysis #static code analysis #code analysis #static analysis tools #code review tips #static code analyzer #static code analysis tool #static analyzer

PANDAS: Most Used Functions in Data Science

Most useful functions for data preprocessing

When you are introduced to machine learning, the first step is to learn Python, and a basic step in learning Python is to learn the pandas library. We can install pandas with pip install pandas. After installing, we have to import pandas in each new session. The data used for the examples is from the UCI repository “

1. Read Data

2. Head and Tail

3. Shape, Size and Info

4. isna
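The four steps listed above can be sketched in a few lines. The CSV below is a made-up stand-in, not the UCI data set the article uses, and the column names age and income are assumptions for the example.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a real file; one income value is missing.
csv_text = "age,income\n25,50000\n32,\n47,82000\n51,61000\n"

# 1. Read Data (read_csv also accepts a file path instead of a buffer)
df = pd.read_csv(io.StringIO(csv_text))

# 2. Head and Tail
first_rows = df.head(2)
last_rows = df.tail(2)

# 3. Shape, Size and Info
n_rows, n_cols = df.shape   # (rows, columns)
n_cells = df.size           # rows * columns
df.info()                   # prints column dtypes and non-null counts

# 4. isna
missing_per_column = df.isna().sum()
print(missing_per_column)
```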

#pandas: most used functions in data science #pandas #data science #function #used python data #most used functions in data science

Tia Gottlieb


Exploratory Data Analysis — Passport Numbers in Pandas

According to the ICAO standard, the passport number should be up to 9 characters long and can contain numbers and letters. During your work as an analyst, you may come across a data set containing passport numbers and be asked to explore it.

I have recently worked with one such set and I’d like to share the steps of this analysis with you, including:

  • Number of records
  • Duplicated records
  • Length of the records
  • Analysis of the leading and trailing zeros
  • Appearance of the character at a specific position
  • Where do letters appear in the string using regular expressions (regexes)
  • Length of the sequence of letters
  • Is there a common prefix
  • Anonymize/Randomize the data while keeping the characteristics of the dataset

You can go through the steps with me. Get the (randomized) data from GitHub. It also contains the Jupyter notebook with all the steps.

Basic Dataset Exploration

First, let’s load the data. Since the dataset contains only one column, it’s quite straightforward.

# import the packages which will be used
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv(r"path\data.csv")

The .info() command will tell us that we have 10902 passports in the dataset and all are imported as “object” which means that the format is string.


An initial step of any analysis should be to check whether there are any duplicated values. In our case, there are, so we will remove them using the pandas .drop_duplicates() method.

print(len(df["PassportNumber"].unique()))  # if lower than 10902 there are duplicates

df.drop_duplicates(inplace=True) # or df = df.drop_duplicates()
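If you also want to know how many records are duplicated before dropping them, a sketch along these lines may help. The four passport numbers are invented for the example, not taken from the data set.

```python
import pandas as pd

# Invented sample with one repeated value.
df = pd.DataFrame({"PassportNumber": ["AB123456", "000123", "AB123456", "Z9"]})

n_total = len(df)
n_unique = df["PassportNumber"].nunique()

# duplicated() marks every repeat after the first occurrence as True.
n_duplicates = int(df["PassportNumber"].duplicated().sum())

# Keep the first occurrence of each passport number.
deduped = df.drop_duplicates(subset="PassportNumber").reset_index(drop=True)

print(n_total, n_unique, n_duplicates)
```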

Length of the passports

A usual next step is to check the longest and the shortest passport numbers.

[In]: df["PassportNumber"].agg(["min","max"])
   min 000000050
   max ZXD244549
   Name: PassportNumber, dtype: object

You might be happy that all the passports are 9 characters long, but you would be misled. The data are stored as strings, so min and max compare them character by character: the lowest value is the one that sorts first (here, one with many leading zeros) and the largest is the one that sorts last (here, one starting with Z).

# strings are ordered character by character, not by numeric value
"0" < "0001" < "001" < "1" < "123" < "AB" < "Z" < "Z123" < "ZZ123"
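You can check this ordering yourself with plain Python. The sketch below sorts the same example strings; strings are compared character by character, and digits sort before uppercase letters.

```python
# The example strings in scrambled order.
values = ["ZZ123", "1", "AB", "0001", "Z", "123", "0", "Z123", "001"]

# sorted() uses the same character-by-character comparison
# that min and max use on strings.
ordered = sorted(values)

print(ordered)
print(min(values), max(values))
```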

To see the actual lengths, let’s apply len to each passport number.

[In]: df["PassportNumber"].apply(len).agg(["min","max"])
   min 3
   max 17
   Name: PassportNumber, dtype: object

Contrary to our initial belief, the shortest passport contains only 3 characters while the longest is 17 characters long (way over the expected maximum of 9).
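As an alternative to .apply(len), pandas also offers the vectorized .str.len() accessor. The sketch below uses three invented values to show it, and a boolean mask to pick out the values above the expected maximum of 9.

```python
import pandas as pd

# Invented examples: a short, a regular and an overlong value.
passports = pd.Series(
    ["179", "AB1234567", "251093" + "0" * 11],
    name="PassportNumber",
)

# Length of each string, computed for the whole column at once.
lengths = passports.str.len()
shortest, longest = int(lengths.min()), int(lengths.max())

# Values longer than the expected maximum of 9 characters.
too_long = passports[lengths > 9]

print(shortest, longest)
print(too_long.tolist())
```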

Let’s expand our data frame with the 'len' column so that we can have a look at examples:

# Add the 'len' column
df['len'] = df["PassportNumber"].apply(len)

# look at the examples having the maximum length
[In]: df[df["len"]==df['len'].max()]
   PassportNumber    len
   25109300000000000 17
   27006100000000000 17
# look at the examples having the minimum length
[In]: df[df["len"]==df['len'].min()]
   PassportNumber    len
   179               3
   917               3
   237               3

The 3-digit passport numbers look suspicious, but they meet the ICAO criteria. The longest ones are obviously too long; however, they contain many trailing zeros. Maybe someone just added the zeros in order to meet some data storage requirements.

Let’s have a look at the overall length distribution of our data sample.

# calculate count of appearance of various lengths
counts_by_value = df["len"].value_counts().reset_index()
# a column of "|" characters used as a visual separator between the two tables
separator = pd.Series(["|"] * df["len"].value_counts().shape[0])
counts_by_index = df["len"].value_counts().sort_index().reset_index()
lenght_distribution_df = pd.concat([counts_by_value, separator, counts_by_index], axis=1)
# draw the chart
ax = df["len"].value_counts().sort_index().plot(kind="bar")
ax.set_ylabel("number of records")
for p in ax.patches:
    ax.annotate(str(p.get_height()), (p.get_x() * 1.005, p.get_height() * 1.05))

Distribution of the passport lengths of the data sample

We see that most passport numbers in our sample are 7, 8 or 9 characters long. Quite a few, however, are 10 or 12 characters long, which is unexpected.

Leading and trailing zeros

Maybe the long passport numbers have leading or trailing zeros, like our example with 17 characters.

In order to explore these zero-pads let’s add two more columns to our data set — ‘leading_zeros’ and ‘trailing_zeros’ to contain the number of leading and trailing zeros.

# number of leading zeros: total length minus the length after stripping leading zeros
df["leading_zeros"] = df["PassportNumber"].apply(lambda x: len(x) - len(x.lstrip("0")))
# number of trailing zeros: total length minus the length after stripping trailing zeros
df["trailing_zeros"] = df["PassportNumber"].apply(lambda x: len(x) - len(x.rstrip("0")))

Then we can easily display the passports which have more than 9 characters to check if they have any leading or trailing zeros:

[In]: df[df["len"]>9]
   PassportNumber  len  leading_zeros trailing_zeros
   73846290957     11   0             0
   N614226700      10   0             2
   WC76717593      10   0             0
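One possible follow-up, sketched here as an assumption rather than a step from the original analysis, is to check which overlong values would fit into 9 characters once the trailing zeros are stripped. The column name len_without_trailing_zeros is made up for this example, and keep in mind that a trailing zero can be a genuine part of a passport number.

```python
import pandas as pd

# The three overlong values shown above plus the 17-character example.
df = pd.DataFrame(
    {"PassportNumber": ["73846290957", "N614226700", "WC76717593", "251093" + "0" * 11]}
)

# Vectorized equivalent of the rstrip-based lambda.
stripped = df["PassportNumber"].str.rstrip("0")
df["trailing_zeros"] = df["PassportNumber"].str.len() - stripped.str.len()
df["len_without_trailing_zeros"] = stripped.str.len()

# Values that would meet the 9-character limit without their trailing zeros.
fits_after_strip = df[df["len_without_trailing_zeros"] <= 9]

print(fits_after_strip)
```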

#pandas #exploratory-data-analysis #data-analysis #dataset #data analysis