Jupyter Notebook: Forget CSV, Fetch Data With Python

Jupyter Notebook: Forget CSV, Fetch Data With Python

In this post, see how to call the Oracle DB from Jupyter notebook with Python code.

If you read a book, article, or blog about machine learning, chances are it will use training data from a CSV file. There’s nothing wrong with CSV, but let’s think about if it is really practical. Wouldn’t it be better to read data directly from the DB? Often, you can’t feed business data directly into ML training because it needs pre-processing — changing categorial data, calculating new data features, etc. Data preparation/transformation steps can be done quite easily with SQL while fetching original business data. Another advantage of reading data directly from DB is when data changes, it is easier to automate the ML model re-train process.

In this post, I describe how to call the Oracle DB from Jupyter notebook with Python code.

Step 1

Install cx_Oracle Python module:

python -m pip install cx_Oracle

This module helps to connect to the Oracle DB from Python.

Step 2

cx_Oracle enables us to execute SQL call from Python code, but to be able to call remote DB from Python script, we need to install and configure Oracle Instant Client on the machine where Python runs.

If you are using Ubuntu, install alien:

sudo apt-get update 
sudo apt-get install alien

Download RPM files for Oracle Instant Client and install with alien:

alien -i oracle-instantclient18.3-basiclite-18.3.0.0.0-1.x86_64.rpm 
alien -i oracle-instantclient18.3-sqlplus-18.3.0.0.0-1.x86_64.rpm 
alien -i oracle-instantclient18.3-devel-18.3.0.0.0-1.x86_64.rpm

Add environment variables:

export ORACLE_HOME=/usr/lib/oracle/18.3/client64 
export PATH=$PATH:$ORACLE_HOME/bin

Read more here.

Step 3

Install Magic SQL Python modules:

pip install jupyter-sql 
pip install ipython-sql

Installation and configuration complete.

For today’s sample, I’m using the Pima Indians Diabetes Database. CSV data can be downloaded from here. I uploaded CSV data into the database table and will be fetching it through SQL directly in Jupyter notebook.

First of all, the connection is established to the DB and then SQL query is executed. The query result set is stored in a variable called result. Do you see %%sql — this magic SQL:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import cx_Oracle\n",
    "\n",
    "%load_ext sql"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Connected: [email protected]'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "with open('credentials.json') as f:\n",
    "    data = json.load(f)\n",
    "    username = data['username']\n",
    "    password = data['password']\n",
    "\n",
    "%sql oracle+cx_oracle://$username:[email protected]:1521/?service_name=ORCLPDB1.localdomain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * oracle+cx_oracle://hr:***@dbhost:1521/?service_name=ORCLPDB1.localdomain\n",
      "0 rows affected.\n",
      "Returning data to local variable result\n"
     ]
    }
   ],
   "source": [
    "%%sql result <<\n",
    "select TIMES_PREGNANT \"TIMES_PREGNANT\", GLUCOSE \"GLUCOSE\", BLOOD_PRESSURE \"BLOOD_PRESSURE\",\n",
    "       SKIN_FOLD_THICK \"SKIN_FOLD_THICK\", SERUM_INSULIN \"SERUM_INSULING\",\n",
    "       MASS_INDEX \"MASS_INDEX\", DIABETES_PEDIGREE \"DIABETES_PEDIGREE\", AGE \"AGE\",\n",
    "       CLASS_VAR \"CLASS_VAR\" from PIMA_INDIANS_DIABETES"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

Username and password must be specified while establishing a connection. To avoid sharing a password, make sure to read password value from the external source (it could be simple JSON file as in this example or a more advanced encoded token from keyring).

The beauty of this approach is that data fetched through SQL query is out-of-the-box available in Data Frame. Machine Learning engineers can work with the data in the same way as it would be loaded through CSV:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>times_pregnant</th>\n",
       "      <th>glucose</th>\n",
       "      <th>blood_pressure</th>\n",
       "      <th>skin_fold_thick</th>\n",
       "      <th>serum_insuling</th>\n",
       "      <th>mass_index</th>\n",
       "      <th>diabetes_pedigree</th>\n",
       "      <th>age</th>\n",
       "      <th>class_var</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>106</td>\n",
       "      <td>70</td>\n",
       "      <td>28</td>\n",
       "      <td>135</td>\n",
       "      <td>34.2</td>\n",
       "      <td>0.142</td>\n",
       "      <td>22</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>155</td>\n",
       "      <td>52</td>\n",
       "      <td>27</td>\n",
       "      <td>540</td>\n",
       "      <td>38.7</td>\n",
       "      <td>0.24</td>\n",
       "      <td>25</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>101</td>\n",
       "      <td>58</td>\n",
       "      <td>35</td>\n",
       "      <td>90</td>\n",
       "      <td>21.8</td>\n",
       "      <td>0.155</td>\n",
       "      <td>22</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>120</td>\n",
       "      <td>80</td>\n",
       "      <td>48</td>\n",
       "      <td>200</td>\n",
       "      <td>38.9</td>\n",
       "      <td>1.162</td>\n",
       "      <td>41</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>11</td>\n",
       "      <td>127</td>\n",
       "      <td>106</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>39</td>\n",
       "      <td>0.19</td>\n",
       "      <td>51</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   times_pregnant  glucose  blood_pressure  skin_fold_thick  serum_insuling  \\\n",
       "0               1      106              70               28             135   \n",
       "1               2      155              52               27             540   \n",
       "2               2      101              58               35              90   \n",
       "3               1      120              80               48             200   \n",
       "4              11      127             106                0               0   \n",
       "\n",
       "  mass_index diabetes_pedigree  age  class_var  \n",
       "0       34.2             0.142   22          0  \n",
       "1       38.7              0.24   25          1  \n",
       "2       21.8             0.155   22          0  \n",
       "3       38.9             1.162   41          0  \n",
       "4         39              0.19   51          0  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = result.DataFrame()\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of rows in dataset: {df.shape[0]}\n",
      "0    500\n",
      "1    268\n",
      "Name: class_var, dtype: int64\n"
     ]
    }
   ],
   "source": [
    "print('Number of rows in dataset: {df.shape[0]}')\n",
    "print(df[df.columns[8]].value_counts())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "times_pregnant     0.221898\n",
       "glucose            0.466581\n",
       "blood_pressure     0.065068\n",
       "skin_fold_thick    0.074752\n",
       "serum_insuling     0.130548\n",
       "age                0.238356\n",
       "class_var          1.000000\n",
       "Name: class_var, dtype: float64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "corrs = df.corr()['class_var'].abs()\n",
    "columns = corrs[corrs > .01].index\n",
    "corrs = corrs.filter(columns)\n",
    "corrs"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

Sample Jupyter notebook available on GitHub. Sample credentials JSON file.

How to write a simple toy database in Python

How to write a simple toy database in Python

How to write a simple toy database in Python: MySQL, PostgreSQL, Oracle, Redis, and many more, you just name it — databases are a really important piece of technology in the progress of human civilization. Today we can see how valuable data are, and so keeping them safe and stable is where the database comes in!

How to write a simple toy database in Python: MySQL, PostgreSQL, Oracle, Redis, and many more, you just name it — databases are a really important piece of technology in the progress of human civilization. Today we can see how valuable data are, and so keeping them safe and stable is where the database comes in!

So we can see how important databases are as well. For a quite some time I was thinking of creating My Own Toy Database just to understand, play around, and experiment with it. As Richard Feynman said:

“What I cannot create, I do not understand.”
So without any further talking let’s jump into the fun part: coding.

Let’s Start Coding…

For this Toy Database, we’ll use Python (my favorite ❤️). I named this database FooBarDB (I couldn’t find any other name 😉), but you can call it whatever you want!

So first let’s import some necessary Python libraries which are already available in Python Standard Library:

import json
import os

Yes, we only need these two libraries! We need json as our database will be based on JSON, and os for some path related stuff.

Now let’s define the main class FoobarDB with some pretty basic functions, which I’ll explain below.

class FoobarDB(object):
    def __init__(self , location):
        self.location = os.path.expanduser(location)
        self.load(self.location)

    def load(self , location):
        if os.path.exists(location):
            self._load()
        else:
            self.db = {}
        return True

    def _load(self):
        self.db = json.load(open(self.location , "r"))

    def dumpdb(self):
        try:
            json.dump(self.db , open(self.location, "w+"))
            return True
        except:
            return False

Here we defined our main class with an __init__ function. Whenever creating a Foobar Database we only need to pass the location of the database. In the first __init__ function we take the location parameter and replace ~ or ~user with user’s home directory to make it work intended way. And finally, put it in self.location variable to access it later on the same class functions. In the end, we are calling the load function passing self.location as an argument.

. . . .
    def load(self , location):
        if os.path.exists(location):
            self._load()
        else:
            self.db = {}
        return True
. . . .

In the next load function we take the location of the database as a param. Then check if the database exists or not. If it exists, we load it with the _load() function (explained below). Otherwise, we create an empty in-memory JSON object. And finally, return true on success.

. . . . 

    def _load(self):
        self.db = json.load(open(self.location , "r"))
. . . .

In the _load function, we just simply open the database file from the location stored in self.location. Then we transform it into a JSON object and load it into self.db variable.

 def dumpdb(self):
        try:
            json.dump(self.db , open(self.location, "w+"))
            return True
        except:
            return False

And finally, the dumpdb function: its name says what it does. It takes the in-memory database (actually a JSON object) from the self.db variable and saves it in the database file! It returns True if saved successfully, otherwise returns False.

Make It a Little More Usable… 😉

Wait a minute! 😐 A database is useless if it can’t store and retrieve data, isn’t it? Let’s go and add them also…😎

def set(self , key , value):
        try:
            self.db[str(key)] = value
            self.dumpdb()
            return True
        except Exception as e:
            print("[X] Error Saving Values to Database : " + str(e))
            return False

    def get(self , key):
        try:
            return self.db[key]
        except KeyError:
            print("No Value Can Be Found for " + str(key))  
            return False

    def delete(self , key):
        if not key in self.db:
            return False
        del self.db[key]
        self.dumpdb()
        return True

The set function is to add data to the database. As our database is a simple key-value based database, we’ll only take a key and value as an argument.

First, we’ll try to add the key and value to the database and then save the database. If everything goes right it will return True. Otherwise, it will print an error message and return False. (We don’t want it to crash and erase our data every time an error occurs 😎).

def get(self, key):
        try:
            return self.db[key]
        except KeyError:
            return False

get is a simple function, we take key as an argument and try to return the value linked to the key from the database. Otherwise False is returned with a message.

		def delete(self , key):
        if not key in self.db:
            return False
        del self.db[key]
        self.dumpdb()
        return True

delete function is to delete a key as well as its value from the database. First, we make sure the key is present in the database. If not we return False. Otherwise, we delete the key with the built-in del which automatically deletes the value of the key. Next, we save the database and it returns false.

Now you might think, what if I’ve created a large database and want to reset it? In theory, we can use delete — but it’s not practical, and it’s also very time-consuming! ⏳ So we can create a function to do this task…

    def resetdb(self):
        self.db={}
        self.dumpdb()
        return True

Here’s the function to reset the database, resetdb! It’s so simple: first, what we do is re-assign our in-memory database with an empty JSON object and it just saves it! And that’s it! Our Database is now again clean shaven.

Finally… 🎉

That’s it friends! We have created our own Toy Database ! 🎉🎉 Actually, FoobarDB is just a simple demo of a database. It’s like a cheap DIY toy: you can improve it any way you want. You can also add many other functions according to your needs.

Full Source is Here 👉 bauripalash/foobardb

I hope, you enjoyed it! Let me know your suggestions, ideas or mistakes I’ve made in the comments below! 👇

Thank you! See you soon!

Python GUI Programming Projects using Tkinter and Python 3

Python GUI Programming Projects using Tkinter and Python 3

Python GUI Programming Projects using Tkinter and Python 3

Description
Learn Hands-On Python Programming By Creating Projects, GUIs and Graphics

Python is a dynamic modern object -oriented programming language
It is easy to learn and can be used to do a lot of things both big and small
Python is what is referred to as a high level language
Python is used in the industry for things like embedded software, web development, desktop applications, and even mobile apps!
SQL-Lite allows your applications to become even more powerful by storing, retrieving, and filtering through large data sets easily
If you want to learn to code, Python GUIs are the best way to start!

I designed this programming course to be easily understood by absolute beginners and young people. We start with basic Python programming concepts. Reinforce the same by developing Project and GUIs.

Why Python?

The Python coding language integrates well with other platforms – and runs on virtually all modern devices. If you’re new to coding, you can easily learn the basics in this fast and powerful coding environment. If you have experience with other computer languages, you’ll find Python simple and straightforward. This OSI-approved open-source language allows free use and distribution – even commercial distribution.

When and how do I start a career as a Python programmer?

In an independent third party survey, it has been revealed that the Python programming language is currently the most popular language for data scientists worldwide. This claim is substantiated by the Institute of Electrical and Electronic Engineers, which tracks programming languages by popularity. According to them, Python is the second most popular programming language this year for development on the web after Java.

Python Job Profiles
Software Engineer
Research Analyst
Data Analyst
Data Scientist
Software Developer
Python Salary

The median total pay for Python jobs in California, United States is $74,410, for a professional with one year of experience
Below are graphs depicting average Python salary by city
The first chart depicts average salary for a Python professional with one year of experience and the second chart depicts the average salaries by years of experience
Who Uses Python?

This course gives you a solid set of skills in one of today’s top programming languages. Today’s biggest companies (and smartest startups) use Python, including Google, Facebook, Instagram, Amazon, IBM, and NASA. Python is increasingly being used for scientific computations and data analysis
Take this course today and learn the skills you need to rub shoulders with today’s tech industry giants. Have fun, create and control intriguing and interactive Python GUIs, and enjoy a bright future! Best of Luck
Who is the target audience?

Anyone who wants to learn to code
For Complete Programming Beginners
For People New to Python
This course was designed for students with little to no programming experience
People interested in building Projects
Anyone looking to start with Python GUI development
Basic knowledge
Access to a computer
Download Python (FREE)
Should have an interest in programming
Interest in learning Python programming
Install Python 3.6 on your computer
What will you learn
Build Python Graphical User Interfaces(GUI) with Tkinter
Be able to use the in-built Python modules for their own projects
Use programming fundamentals to build a calculator
Use advanced Python concepts to code
Build Your GUI in Python programming
Use programming fundamentals to build a Project
Signup Login & Registration Programs
Quizzes
Assignments
Job Interview Preparation Questions
& Much More

Guide to Python Programming Language

Guide to Python Programming Language

Guide to Python Programming Language

Description
The course will lead you from beginning level to advance in Python Programming Language. You do not need any prior knowledge on Python or any programming language or even programming to join the course and become an expert on the topic.

The course is begin continuously developing by adding lectures regularly.

Please see the Promo and free sample video to get to know more.

Hope you will enjoy it.

Basic knowledge
An Enthusiast Mind
A Computer
Basic Knowledge To Use Computer
Internet Connection
What will you learn
Will Be Expert On Python Programming Language
Build Application On Python Programming Language