How to parse JSON data with Python Pandas?

One-liner to read and normalize JSON data into a flat table using Pandas.

If you do anything related to data, whether it is Data Engineering, Data Analytics, or Data Science, you have surely come across JSON.

JSON (JavaScript Object Notation) is one of the most widely used formats for exchanging data over the web. NoSQL databases such as MongoDB store data as JSON-like documents. Although this format works well for storage, it usually needs to be converted into tabular form for further analysis.

In this story, we will see how easy it is to parse JSON data and convert it into tabular form. You can download the sample data from the GitHub repository mentioned below. Also, have a look at the notebook for more details about the data and the APIs used.

Data Details

I am going to use data that I generated while working on a Machine Learning clustering problem. No need to worry if the data doesn’t make sense, as it is used only for demo purposes. I will use two different JSON structures:

1. Simple JSON with no nested lists/dictionaries.

This is already flattened JSON and requires minimal processing.

Sample Record:
{
    "Scaler": "Standard",
    "family_min_samples_percentage": 5,
    "original_number_of_clusters": 4,
    "eps_value": 0.1,
    "min_samples": 5,
    "number_of_clusters": 9,
    "number_of_noise_samples": 72,
    "adjusted_rand_index": 0.001,
    "adjusted_mutual_info_score": 0.009,
    "homogeneity_score": 0.330,
    "completeness_score": 0.999,
    "v_measure_score": 0.497,
    "fowlkes_mallows_score": 0.0282,
    "silhouette_coefficient": 0.653,
    "calinski_harabasz_score": 10.81,
    "davies_bouldin_score": 1.70
}
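
As a minimal sketch of the "minimal processing" case: assuming the file holds a list of flat records like the one above, pandas can load it directly. The names `records`, `json_text`, and `df_simple` are made up for illustration, only a subset of the sample's keys is used, and an in-memory string stands in for the actual file.

```python
import io
import json

import pandas as pd

# One record shaped like the sample above (subset of keys, for brevity).
records = [
    {
        "Scaler": "Standard",
        "family_min_samples_percentage": 5,
        "original_number_of_clusters": 4,
        "eps_value": 0.1,
        "min_samples": 5,
        "number_of_clusters": 9,
        "number_of_noise_samples": 72,
        "silhouette_coefficient": 0.653,
    }
]

# Serialize to a JSON string; this stands in for the contents of a file.
json_text = json.dumps(records)

# A list of flat objects maps directly to rows (objects) and columns (keys).
df_simple = pd.read_json(io.StringIO(json_text))
print(df_simple.head())
```

With a real file, passing its path to `pd.read_json` should behave the same way, since each flat object already corresponds to one table row.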

2. JSON with nested lists/dictionaries.

This might seem a little complicated and, in general, would require you to write a script for flattening. Later, we will see how it can be converted into a DataFrame with just one line of code.

Sample Record:
{
  "Scaler": "Standard",
  "family_min_samples_percentage": 5,
  "original_number_of_clusters": 4,
  "Results":
  [
      {
        "eps_value": 0.1,
        "min_samples": 5,
        "number_of_clusters": 9,
        "number_of_noise_samples": 72,
        "scores":
            {
             "adjusted_rand_index": 0.001,
             "adjusted_mutual_info_score": 0.009,
             "homogeneity_score": 0.331,
             "completeness_score": 0.999,
             "v_measure_score": 0.497,
             "fowlkes_mallows_score": 0.028,
             "silhouette_coefficient": 0.653,
             "calinski_harabasz_score": 10.81,
             "davies_bouldin_score": 1.70
            }
      },
      {
        "eps_value": 0.1,
        "min_samples": 10,
        "number_of_clusters": 6,
        "number_of_noise_samples": 89,
        "scores":
            {
             "adjusted_rand_index": 0.001,
             "adjusted_mutual_info_score": 0.008,
             "homogeneity_score": 0.294,
             "completeness_score": 0.999,
             "v_measure_score": 0.455,
             "fowlkes_mallows_score": 0.026,
             "silhouette_coefficient": 0.561,
             "calinski_harabasz_score": 12.528,
             "davies_bouldin_score": 1.760
            }
      }
  ]
}
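
This section does not show the one-liner itself, so the following is only a sketch of one way to flatten a record like the nested sample above, assuming `pd.json_normalize` is the intended call. The variable names `nested` and `df_nested` are made up for illustration, and the `scores` dictionaries are trimmed to two keys to keep the example short.

```python
import pandas as pd

# Nested record shaped like the sample above (scores trimmed for brevity).
nested = {
    "Scaler": "Standard",
    "family_min_samples_percentage": 5,
    "original_number_of_clusters": 4,
    "Results": [
        {
            "eps_value": 0.1,
            "min_samples": 5,
            "number_of_clusters": 9,
            "number_of_noise_samples": 72,
            "scores": {"homogeneity_score": 0.331, "silhouette_coefficient": 0.653},
        },
        {
            "eps_value": 0.1,
            "min_samples": 10,
            "number_of_clusters": 6,
            "number_of_noise_samples": 89,
            "scores": {"homogeneity_score": 0.294, "silhouette_coefficient": 0.561},
        },
    ],
}

# record_path: each item in the "Results" list becomes one row.
# meta: top-level keys repeated on every row so the parent context is kept.
df_nested = pd.json_normalize(
    nested,
    record_path="Results",
    meta=["Scaler", "family_min_samples_percentage", "original_number_of_clusters"],
)
print(df_nested.columns.tolist())
```

Here each entry of `Results` becomes a row, the inner `scores` dictionary is expanded into dotted column names such as `scores.homogeneity_score`, and the top-level fields listed in `meta` are copied onto every row.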
