How to Use Python Pandas for Log Analysis

In this quick article, I'll walk you through how to use the Pandas library for Python in order to analyze a CSV log file for offload analysis.

Background

Python Pandas is a library that provides data science capabilities to Python. Using this library, you can use data structures like DataFrames. This data structure allows you to model the data like an in-memory database. By doing so, you will get query-like capabilities over the data set.
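To make the "query-like" idea concrete, here is a minimal sketch. The column names and values in it are invented for illustration and are not taken from the report used later in this article.

```python
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
sample_df = pd.DataFrame({
    "URL": ["/a.js", "/b.css", "/api/cart"],
    "Hits": [120, 80, 45],
})

# Roughly the same idea as: SELECT URL FROM sample_df WHERE Hits > 50
busy_urls = sample_df[sample_df["Hits"] > 50]["URL"].tolist()
print(busy_urls)
```

The expression inside the square brackets builds a boolean mask, and the DataFrame keeps only the rows where the mask is true.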

Use Case

Suppose we have a URL report taken from either the Akamai Edge server logs or the Akamai Portal report. In this case, I am using the Akamai Portal report. The goal of this workflow is to find the top URLs that have a volume offload of less than 50%. I've attached the full code at the end, and I am going to walk through it line by line. Here are the column names within the CSV file for reference.

Offloaded Hits,Origin Hits,Origin OK Volume (MB),Origin Error Volume (MB)

Initialize the Library

The first step is to initialize the Pandas library. In almost all the references, this library is imported as pd. We'll follow the same convention.

import pandas as pd

Read the CSV as a DataFrame

The next step is to read the whole CSV file into a DataFrame. Note that the read_csv function also has options to skip leading rows, skip trailing rows, handle missing values, and a lot more. I am not using these options for now.

urls_df = pd.read_csv('urls_report.csv')
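If your report does have banner lines, a footer, or placeholder values, the options mentioned above might be used roughly like this. This is only a sketch: the report text is invented, and it is read from an in-memory string so the example runs without a file.

```python
import io
import pandas as pd

# Hypothetical report text: two banner lines, a header, two data rows, a footer.
raw_report = io.StringIO(
    "URL Report\n"
    "Generated by portal\n"
    "URL,OK Volume,Origin OK Volume (MB)\n"
    "/a.js,10.5,2.0\n"
    "/b.css,-,1.5\n"
    "Total rows: 2\n"
)

report_df = pd.read_csv(
    raw_report,
    skiprows=2,        # skip the two banner lines before the header
    skipfooter=1,      # drop the trailing summary line
    na_values=["-"],   # treat "-" as a missing value
    engine="python",   # skipfooter is only supported by the Python engine
)
print(report_df)
```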

Pandas automatically infers a data type for each column. So the URL is treated as a string and all the other values are treated as numbers (integer or floating point, depending on the data).
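You can check what Pandas inferred by looking at the dtypes attribute. The sketch below uses a tiny made-up CSV, so the column values are assumptions rather than real report data.

```python
import io
import pandas as pd

# Hypothetical two-row CSV used only to show type inference.
typed_df = pd.read_csv(
    io.StringIO("URL,Origin Hits,OK Volume\n/a.js,12,10.5\n/b.css,7,3.25\n")
)

# One entry per column: a text dtype for URL, an integer dtype for
# Origin Hits, and a floating point dtype for OK Volume.
print(typed_df.dtypes)
```

If a numeric column comes back as text, it usually means the file contains a stray non-numeric value in that column.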

Compute Volume Offload

The default URL report does not have a column for Offload by Volume. So we need to compute this new column.

urls_df['Volume Offload'] = (urls_df['OK Volume']*100) / (urls_df['OK Volume'] + urls_df['Origin OK Volume (MB)'])

We are using the columns named OK Volume and Origin OK Volume (MB) to arrive at the offload percentage.
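Here is the same formula worked through on a few made-up rows, so you can check the arithmetic by hand. The numbers are assumptions chosen for easy math; the last row shows what happens when a URL has no traffic at all.

```python
import pandas as pd

# Hypothetical rows; only the column names match the article's report.
toy_df = pd.DataFrame({
    "URL": ["/a.js", "/b.css", "/idle"],
    "OK Volume": [30.0, 90.0, 0.0],
    "Origin OK Volume (MB)": [70.0, 10.0, 0.0],
})

total_volume = toy_df["OK Volume"] + toy_df["Origin OK Volume (MB)"]
toy_df["Volume Offload"] = toy_df["OK Volume"] * 100 / total_volume

# /a.js: 30 * 100 / (30 + 70) = 30.0
# /b.css: 90 * 100 / (90 + 10) = 90.0
# /idle: 0 / 0 gives NaN instead of raising an error
print(toy_df[["URL", "Volume Offload"]])
```

The NaN for zero-traffic rows is one reason the filter step also requires some traffic.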

Filter the Data

At this point, we have the entire data set with the offload percentage computed. Since we are interested in URLs that have a low offload, we add two filters:

  • Keep only the rows that have a volume offload of less than 50% and at least some traffic (we don't want rows that have zero traffic).

  • Remove URLs that match some known patterns. The patterns depend on the customer context, but they essentially identify URLs that can never be cached.
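The two filters above can be sketched on a small made-up table as follows. The rows are invented, and "stateful-apis" is used here only as an example of a never-cacheable pattern.

```python
import pandas as pd

# Hypothetical rows with the offload percentage already computed.
toy_df = pd.DataFrame({
    "URL": ["/img/a.png", "/stateful-apis/cart", "/js/app.js", "/idle"],
    "OK Volume": [30.0, 5.0, 90.0, 0.0],
    "Volume Offload": [30.0, 10.0, 90.0, 0.0],
})

has_traffic = toy_df["OK Volume"] > 0
low_offload = toy_df["Volume Offload"] < 50.0
# regex=False treats the pattern as plain text rather than a regular expression
never_cacheable = toy_df["URL"].str.contains("stateful-apis", regex=False)

# Combine the masks with & (and) and ~ (not)
low_offload_urls = toy_df[has_traffic & low_offload & ~never_cacheable]
print(low_offload_urls["URL"].tolist())
```

Naming each mask separately makes it easier to see which condition removed a given row.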

Sort Data

At this point, we have the right set of URLs but they are unsorted. We need the rows sorted so that the URLs with the most volume come first and, among URLs with equal volume, those with the least offload come first. We can achieve this multi-column sorting with the sort_values method.

low_offload_urls = low_offload_urls.sort_values(by=['OK Volume','Volume Offload'], ascending=[False, True])
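To see how a two-column sort behaves, here is a small sketch with made-up rows. Note that the ascending list takes real booleans, one per sort column.

```python
import pandas as pd

# Hypothetical rows; /b and /c tie on volume so the second key decides.
toy_df = pd.DataFrame({
    "URL": ["/a", "/b", "/c"],
    "OK Volume": [50.0, 80.0, 80.0],
    "Volume Offload": [10.0, 40.0, 20.0],
})

# Volume descending (most first), then offload ascending (least first)
ranked = toy_df.sort_values(
    by=["OK Volume", "Volume Offload"],
    ascending=[False, True],
)
print(ranked["URL"].tolist())
```

Assigning the result to a new name, instead of sorting in place, leaves the original DataFrame untouched.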

Print the Data

For simplicity, I am just listing the URLs. We can export the result to CSV or Excel as well.

First, we project the URL (i.e., extract just one column) from the DataFrame. We then list the URLs with a simple for loop, as the projection results in a Series that can be iterated like an array.

for each_url in low_offload_urls['URL']:
    print(each_url)
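The CSV export mentioned above could look something like this. It is a sketch with invented rows, and it writes to an in-memory buffer so that it runs without touching the disk; passing a file name to to_csv instead would write a file.

```python
import io
import pandas as pd

# Hypothetical result set standing in for the filtered, sorted URLs.
result_df = pd.DataFrame({
    "URL": ["/a", "/b"],
    "Volume Offload": [10.0, 40.0],
})

buffer = io.StringIO()
# index=False leaves out the row numbers that Pandas adds by default
result_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

The to_excel method works in a similar way, but it needs an Excel writer package such as openpyxl to be installed.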

I hope you found this useful and get inspired to pick up Pandas for your analytics as well!

References

I was able to pick up Pandas after going through an excellent course on Coursera titled Introduction to Data Science in Python. During this course, I realized that Pandas has excellent documentation.

  • Pandas Documentation: http://pandas.pydata.org/pandas-docs/stable/

Full Code

import pandas as pd

urls_df = pd.read_csv('urls_report.csv')

# compute the volume offload percentage
urls_df['Volume Offload'] = (urls_df['OK Volume']*100) / (urls_df['OK Volume'] + urls_df['Origin OK Volume (MB)'])

# keep rows with some traffic and less than 50% volume offload
low_offload_urls = urls_df[(urls_df['OK Volume'] > 0) & (urls_df['Volume Offload']<50.0)]
# drop URLs that can never be cached
low_offload_urls = low_offload_urls[(~low_offload_urls.URL.str.contains("some-pattern.net")) & (~low_offload_urls.URL.str.contains("stateful-apis")) ]

# most volume first; ties broken by least offload
low_offload_urls = low_offload_urls.sort_values(by=['OK Volume','Volume Offload'], ascending=[False, True])

for each_url in low_offload_urls['URL']:
    print(each_url)

Thanks for reading.
