Gordon  Taylor

Gordon Taylor

1656265800

Plotly.js: Open-source JavaScript charting library behind Plotly, Dash

Plotly.js is a standalone Javascript data visualization library, and it also powers the Python and R modules named plotly in those respective ecosystems (referred to as Plotly.py and Plotly.R).

Plotly.js can be used to produce dozens of chart types and visualizations, including statistical charts, 3D graphs, scientific charts, SVG and tile maps, financial charts and more.

 

Contact us for Plotly.js consulting, dashboard development, application integration, and feature additions.


Load as a node module

Install a ready-to-use distributed bundle

npm i --save plotly.js-dist-min

and use import or require in node.js

// ES6 module
import Plotly from 'plotly.js-dist-min'

// CommonJS
var Plotly = require('plotly.js-dist-min')

You may also consider using plotly.js-dist if you prefer using an unminified package.


Load via script tag

The script HTML element

In the examples below Plotly object is added to the window scope by script. The newPlot method is then used to draw an interactive figure as described by data and layout into the desired div here named gd. As demonstrated in the example above basic knowledge of html and JSON syntax is enough to get started i.e. with/without JavaScript! To learn and build more with plotly.js please visit plotly.js documentation.

<head>
    <script src="https://cdn.plot.ly/plotly-2.12.1.min.js"></script>
</head>
<body>
    <div id="gd"></div>

    <script>
        Plotly.newPlot("gd", /* JSON object */ {
            "data": [{ "y": [1, 2, 3] }],
            "layout": { "width": 600, "height": 400}
        })
    </script>
</body>

Alternatively you may consider using native ES6 import in the script tag.

<script type="module">
    import "https://cdn.plot.ly/plotly-2.12.1.min.js"
    Plotly.newPlot("gd", [{ y: [1, 2, 3] }])
</script>

Fastly supports Plotly.js with free CDN service. Read more at https://www.fastly.com/open-source.

Un-minified versions are also available on CDN

While non-minified source files may contain characters outside UTF-8, it is recommended that you specify the charset when loading those bundles.

<script src="https://cdn.plot.ly/plotly-2.12.1.js" charset="utf-8"></script>

Please note that as of v2 the "plotly-latest" outputs (e.g. https://cdn.plot.ly/plotly-latest.min.js) will no longer be updated on the CDN, and will stay at the last v1 patch v1.58.5. Therefore, to use the CDN with plotly.js v2 and higher, you must specify an exact plotly.js version.

MathJax

You could load either version two or version three of MathJax files, for example:

<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-AMS-MML_SVG.js"></script>
<script src="https://cdn.jsdelivr.net/npm/mathjax@3.2.2/es5/tex-svg.js"></script>

When using MathJax version 3, it is also possible to use chtml output on the other parts of the page in addition to svg output for the plotly graph. Please refer to devtools/test_dashboard/index-mathjax3chtml.html to see an example.

Bundles

There are two kinds of plotly.js bundles:

  1. Complete and partial official bundles that are distributed to npm and the CDN, described in the dist README.
  2. Custom bundles you can create yourself to optimize the size of bundle depending on your needs. Please visit CUSTOM_BUNDLE for more information.

Alternative ways to load and build plotly.js

If your library needs to bundle or directly load plotly.js/lib/index.js or parts of its modules similar to index-basic in some other way than via an official or a custom bundle, or in case you want to tweak the default build configurations of browserify or webpack, etc. then please visit BUILDING.md.


Documentation

Official plotly.js documentation is hosted at https://plotly.com/javascript.

These pages are generated by the Plotly graphing-library-docs repo built with Jekyll and publicly hosted on GitHub Pages. For more info about contributing to Plotly documentation, please read through contributing guidelines.


Bugs and feature requests

Have a bug or a feature request? Please open a Github issue keeping in mind the issue guidelines. You may also want to read about how changes get made to Plotly.js


Contributing

Please read through our contributing guidelines. Included are directions for opening issues, using plotly.js in your project and notes on development.


Notable contributors

Plotly.js is at the core of a large and dynamic ecosystem with many contributors who file issues, reproduce bugs, suggest improvements, write code in this repo (and other upstream or downstream ones) and help users in the Plotly community forum. The following people deserve special recognition for their outsized contributions to this ecosystem:

 GitHubTwitterStatus
Alex C. Johnson@alexcjohnson Active, Maintainer
Mojtaba Samimi@archmoj@solarchvisionActive, Maintainer
Antoine Roy-Gobeil@antoinerg Active, Maintainer
Nicolas Kruchten@nicolaskruchten@nicolaskruchtenActive, Maintainer
Jon Mease@jonmmease@jonmmeaseActive
Étienne Tétreault-Pinard@etpinard@etpinardHall of Fame
Mikola Lysenko@mikolalysenko@MikolaLysenkoHall of Fame
Ricky Reusser@rreusser@rickyreusserHall of Fame
Dmitry Yv.@dy@DimaYvHall of Fame
Robert Monfera@monfera@monferaHall of Fame
Robert Möstl@rmoestl@rmoestlHall of Fame
Nicolas Riesco@n-riesco Hall of Fame
Miklós Tusz@mdtusz@mdtuszHall of Fame
Chelsea Douglas@cldougl Hall of Fame
Ben Postlethwaite@bpostlethwaite Hall of Fame
Chris Parmer@chriddyp Hall of Fame
Alex Vados@alexander-daniel Hall of Fame

Copyright and license

Code and documentation copyright 2021 Plotly, Inc.

Code released under the MIT license.

Versioning

This project is maintained under the Semantic Versioning guidelines.

See the Releases section of our GitHub project for changelogs for each release version of plotly.js.


Community

  • Follow @plotlygraphs on Twitter for the latest Plotly news.
  • Implementation help may be found on community.plot.com (tagged plotly-js) or on Stack Overflow (tagged plotly).
  • Developers should use the keyword plotly on packages which modify or add to the functionality of plotly.js when distributing through npm.

Author: plotly
Source Code: https://github.com/plotly/plotly.js/ 
License: MIT license

#javascript #d3 #plotly #visualization #datavisualization 

What is GEEK

Buddha Community

Plotly.js: Open-source JavaScript charting library behind Plotly, Dash

NBB: Ad-hoc CLJS Scripting on Node.js

Nbb

Not babashka. Node.js babashka!?

Ad-hoc CLJS scripting on Node.js.

Status

Experimental. Please report issues here.

Goals and features

Nbb's main goal is to make it easy to get started with ad hoc CLJS scripting on Node.js.

Additional goals and features are:

  • Fast startup without relying on a custom version of Node.js.
  • Small artifact (current size is around 1.2MB).
  • First class macros.
  • Support building small TUI apps using Reagent.
  • Complement babashka with libraries from the Node.js ecosystem.

Requirements

Nbb requires Node.js v12 or newer.

How does this tool work?

CLJS code is evaluated through SCI, the same interpreter that powers babashka. Because SCI works with advanced compilation, the bundle size, especially when combined with other dependencies, is smaller than what you get with self-hosted CLJS. That makes startup faster. The trade-off is that execution is less performant and that only a subset of CLJS is available (e.g. no deftype, yet).

Usage

Install nbb from NPM:

$ npm install nbb -g

Omit -g for a local install.

Try out an expression:

$ nbb -e '(+ 1 2 3)'
6

And then install some other NPM libraries to use in the script. E.g.:

$ npm install csv-parse shelljs zx

Create a script which uses the NPM libraries:

(ns script
  (:require ["csv-parse/lib/sync$default" :as csv-parse]
            ["fs" :as fs]
            ["path" :as path]
            ["shelljs$default" :as sh]
            ["term-size$default" :as term-size]
            ["zx$default" :as zx]
            ["zx$fs" :as zxfs]
            [nbb.core :refer [*file*]]))

(prn (path/resolve "."))

(prn (term-size))

(println (count (str (fs/readFileSync *file*))))

(prn (sh/ls "."))

(prn (csv-parse "foo,bar"))

(prn (zxfs/existsSync *file*))

(zx/$ #js ["ls"])

Call the script:

$ nbb script.cljs
"/private/tmp/test-script"
#js {:columns 216, :rows 47}
510
#js ["node_modules" "package-lock.json" "package.json" "script.cljs"]
#js [#js ["foo" "bar"]]
true
$ ls
node_modules
package-lock.json
package.json
script.cljs

Macros

Nbb has first class support for macros: you can define them right inside your .cljs file, like you are used to from JVM Clojure. Consider the plet macro to make working with promises more palatable:

(defmacro plet
  [bindings & body]
  (let [binding-pairs (reverse (partition 2 bindings))
        body (cons 'do body)]
    (reduce (fn [body [sym expr]]
              (let [expr (list '.resolve 'js/Promise expr)]
                (list '.then expr (list 'clojure.core/fn (vector sym)
                                        body))))
            body
            binding-pairs)))

Using this macro we can look async code more like sync code. Consider this puppeteer example:

(-> (.launch puppeteer)
      (.then (fn [browser]
               (-> (.newPage browser)
                   (.then (fn [page]
                            (-> (.goto page "https://clojure.org")
                                (.then #(.screenshot page #js{:path "screenshot.png"}))
                                (.catch #(js/console.log %))
                                (.then #(.close browser)))))))))

Using plet this becomes:

(plet [browser (.launch puppeteer)
       page (.newPage browser)
       _ (.goto page "https://clojure.org")
       _ (-> (.screenshot page #js{:path "screenshot.png"})
             (.catch #(js/console.log %)))]
      (.close browser))

See the puppeteer example for the full code.

Since v0.0.36, nbb includes promesa which is a library to deal with promises. The above plet macro is similar to promesa.core/let.

Startup time

$ time nbb -e '(+ 1 2 3)'
6
nbb -e '(+ 1 2 3)'   0.17s  user 0.02s system 109% cpu 0.168 total

The baseline startup time for a script is about 170ms seconds on my laptop. When invoked via npx this adds another 300ms or so, so for faster startup, either use a globally installed nbb or use $(npm bin)/nbb script.cljs to bypass npx.

Dependencies

NPM dependencies

Nbb does not depend on any NPM dependencies. All NPM libraries loaded by a script are resolved relative to that script. When using the Reagent module, React is resolved in the same way as any other NPM library.

Classpath

To load .cljs files from local paths or dependencies, you can use the --classpath argument. The current dir is added to the classpath automatically. So if there is a file foo/bar.cljs relative to your current dir, then you can load it via (:require [foo.bar :as fb]). Note that nbb uses the same naming conventions for namespaces and directories as other Clojure tools: foo-bar in the namespace name becomes foo_bar in the directory name.

To load dependencies from the Clojure ecosystem, you can use the Clojure CLI or babashka to download them and produce a classpath:

$ classpath="$(clojure -A:nbb -Spath -Sdeps '{:aliases {:nbb {:replace-deps {com.github.seancorfield/honeysql {:git/tag "v2.0.0-rc5" :git/sha "01c3a55"}}}}}')"

and then feed it to the --classpath argument:

$ nbb --classpath "$classpath" -e "(require '[honey.sql :as sql]) (sql/format {:select :foo :from :bar :where [:= :baz 2]})"
["SELECT foo FROM bar WHERE baz = ?" 2]

Currently nbb only reads from directories, not jar files, so you are encouraged to use git libs. Support for .jar files will be added later.

Current file

The name of the file that is currently being executed is available via nbb.core/*file* or on the metadata of vars:

(ns foo
  (:require [nbb.core :refer [*file*]]))

(prn *file*) ;; "/private/tmp/foo.cljs"

(defn f [])
(prn (:file (meta #'f))) ;; "/private/tmp/foo.cljs"

Reagent

Nbb includes reagent.core which will be lazily loaded when required. You can use this together with ink to create a TUI application:

$ npm install ink

ink-demo.cljs:

(ns ink-demo
  (:require ["ink" :refer [render Text]]
            [reagent.core :as r]))

(defonce state (r/atom 0))

(doseq [n (range 1 11)]
  (js/setTimeout #(swap! state inc) (* n 500)))

(defn hello []
  [:> Text {:color "green"} "Hello, world! " @state])

(render (r/as-element [hello]))

Promesa

Working with callbacks and promises can become tedious. Since nbb v0.0.36 the promesa.core namespace is included with the let and do! macros. An example:

(ns prom
  (:require [promesa.core :as p]))

(defn sleep [ms]
  (js/Promise.
   (fn [resolve _]
     (js/setTimeout resolve ms))))

(defn do-stuff
  []
  (p/do!
   (println "Doing stuff which takes a while")
   (sleep 1000)
   1))

(p/let [a (do-stuff)
        b (inc a)
        c (do-stuff)
        d (+ b c)]
  (prn d))
$ nbb prom.cljs
Doing stuff which takes a while
Doing stuff which takes a while
3

Also see API docs.

Js-interop

Since nbb v0.0.75 applied-science/js-interop is available:

(ns example
  (:require [applied-science.js-interop :as j]))

(def o (j/lit {:a 1 :b 2 :c {:d 1}}))

(prn (j/select-keys o [:a :b])) ;; #js {:a 1, :b 2}
(prn (j/get-in o [:c :d])) ;; 1

Most of this library is supported in nbb, except the following:

  • destructuring using :syms
  • property access using .-x notation. In nbb, you must use keywords.

See the example of what is currently supported.

Examples

See the examples directory for small examples.

Also check out these projects built with nbb:

API

See API documentation.

Migrating to shadow-cljs

See this gist on how to convert an nbb script or project to shadow-cljs.

Build

Prequisites:

  • babashka >= 0.4.0
  • Clojure CLI >= 1.10.3.933
  • Node.js 16.5.0 (lower version may work, but this is the one I used to build)

To build:

  • Clone and cd into this repo
  • bb release

Run bb tasks for more project-related tasks.

Download Details:
Author: borkdude
Download Link: Download The Source Code
Official Website: https://github.com/borkdude/nbb 
License: EPL-1.0

#node #javascript

Anil  Sakhiya

Anil Sakhiya

1652748716

Exploratory Data Analysis(EDA) with Python

Exploratory Data Analysis Tutorial | Basics of EDA with Python

Exploratory data analysis is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods. It helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions. EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis testing task and provides a better understanding of data set variables and the relationships between them. It can also help determine if the statistical techniques you are considering for data analysis are appropriate or not.

🔹 Topics Covered:
00:00:00 Basics of EDA with Python
01:40:10 Multiple Variate Analysis
02:30:26 Outlier Detection
03:44:48 Cricket World Cup Analysis using Exploratory Data Analysis


Learning the basics of Exploratory Data Analysis using Python with Numpy, Matplotlib, and Pandas.

What is Exploratory Data Analysis(EDA)?

If we want to explain EDA in simple terms, it means trying to understand the given data much better, so that we can make some sense out of it.

We can find a more formal definition in Wikipedia.

In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

EDA in Python uses data visualization to draw meaningful patterns and insights. It also involves the preparation of data sets for analysis by removing irregularities in the data.

Based on the results of EDA, companies also make business decisions, which can have repercussions later.

  • If EDA is not done properly then it can hamper the further steps in the machine learning model building process.
  • If done well, it may improve the efficacy of everything we do next.

In this article we’ll see about the following topics:

  1. Data Sourcing
  2. Data Cleaning
  3. Univariate analysis
  4. Bivariate analysis
  5. Multivariate analysis

1. Data Sourcing

Data Sourcing is the process of finding and loading the data into our system. Broadly there are two ways in which we can find data.

  1. Private Data
  2. Public Data

Private Data

As the name suggests, private data is given by private organizations. There are some security and privacy concerns attached to it. This type of data is used for mainly organizations internal analysis.

Public Data

This type of Data is available to everyone. We can find this in government websites and public organizations etc. Anyone can access this data, we do not need any special permissions or approval.

We can get public data on the following sites.

The very first step of EDA is Data Sourcing, we have seen how we can access data and load into our system. Now, the next step is how to clean the data.

2. Data Cleaning

After completing the Data Sourcing, the next step in the process of EDA is Data Cleaning. It is very important to get rid of the irregularities and clean the data after sourcing it into our system.

Irregularities are of different types of data.

  • Missing Values
  • Incorrect Format
  • Incorrect Headers
  • Anomalies/Outliers

To perform the data cleaning we are using a sample data set, which can be found here.

We are using Jupyter Notebook for analysis.

First, let’s import the necessary libraries and store the data in our system for analysis.

#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Read the data set of "Marketing Analysis" in data.
data= pd.read_csv("marketing_analysis.csv")

# Printing the data
data

Now, the data set looks like this,

If we observe the above dataset, there are some discrepancies in the Column header for the first 2 rows. The correct data is from the index number 1. So, we have to fix the first two rows.

This is called Fixing the Rows and Columns. Let’s ignore the first two rows and load the data again.

#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Read the file in data without first two rows as it is of no use.
data = pd.read_csv("marketing_analysis.csv",skiprows = 2)

#print the head of the data frame.
data.head()

Now, the dataset looks like this, and it makes more sense.

Dataset after fixing the rows and columns

Following are the steps to be taken while Fixing Rows and Columns:

  1. Delete Summary Rows and Columns in the Dataset.
  2. Delete Header and Footer Rows on every page.
  3. Delete Extra Rows like blank rows, page numbers, etc.
  4. We can merge different columns if it makes for better understanding of the data
  5. Similarly, we can also split one column into multiple columns based on our requirements or understanding.
  6. Add Column names, it is very important to have column names to the dataset.

Now if we observe the above dataset, the customerid column has of no importance to our analysis, and also the jobedu column has both the information of job and education in it.

So, what we’ll do is, we’ll drop the customerid column and we’ll split the jobedu column into two other columns job and education and after that, we’ll drop the jobedu column as well.

# Drop the customer id as it is of no use.
data.drop('customerid', axis = 1, inplace = True)

#Extract job  & Education in newly from "jobedu" column.
data['job']= data["jobedu"].apply(lambda x: x.split(",")[0])
data['education']= data["jobedu"].apply(lambda x: x.split(",")[1])

# Drop the "jobedu" column from the dataframe.
data.drop('jobedu', axis = 1, inplace = True)

# Printing the Dataset
data

Now, the dataset looks like this,

Dropping Customerid and jobedu columns and adding job and education columns

Missing Values

If there are missing values in the Dataset before doing any statistical analysis, we need to handle those missing values.

There are mainly three types of missing values.

  1. MCAR(Missing completely at random): These values do not depend on any other features.
  2. MAR(Missing at random): These values may be dependent on some other features.
  3. MNAR(Missing not at random): These missing values have some reason for why they are missing.

Let’s see which columns have missing values in the dataset.

# Checking the missing values
data.isnull().sum()

The output will be,

As we can see three columns contain missing values. Let’s see how to handle the missing values. We can handle missing values by dropping the missing records or by imputing the values.

Drop the missing Values

Let’s handle missing values in the age column.

# Dropping the records with age missing in data dataframe.
data = data[~data.age.isnull()].copy()

# Checking the missing values in the dataset.
data.isnull().sum()

Let’s check the missing values in the dataset now.

Let’s impute values to the missing values for the month column.

Since the month column is of an object type, let’s calculate the mode of that column and impute those values to the missing values.

# Find the mode of month in data
month_mode = data.month.mode()[0]

# Fill the missing values with mode value of month in data.
data.month.fillna(month_mode, inplace = True)

# Let's see the null values in the month column.
data.month.isnull().sum()

Now output is,

# Mode of month is
'may, 2017'
# Null values in month column after imputing with mode
0

Handling the missing values in the Response column. Since, our target column is Response Column, if we impute the values to this column it’ll affect our analysis. So, it is better to drop the missing values from Response Column.

#drop the records with response missing in data.
data = data[~data.response.isnull()].copy()
# Calculate the missing values in each column of data frame
data.isnull().sum()

Let’s check whether the missing values in the dataset have been handled or not,

All the missing values have been handled

We can also, fill the missing values as ‘NaN’ so that while doing any statistical analysis, it won’t affect the outcome.

Handling Outliers

We have seen how to fix missing values, now let’s see how to handle outliers in the dataset.

Outliers are the values that are far beyond the next nearest data points.

There are two types of outliers:

  1. Univariate outliers: Univariate outliers are the data points whose values lie beyond the range of expected values based on one variable.
  2. Multivariate outliers: While plotting data, some values of one variable may not lie beyond the expected range, but when you plot the data with some other variable, these values may lie far from the expected value.

So, after understanding the causes of these outliers, we can handle them by dropping those records or imputing with the values or leaving them as is, if it makes more sense.

Standardizing Values

To perform data analysis on a set of values, we have to make sure the values in the same column should be on the same scale. For example, if the data contains the values of the top speed of different companies’ cars, then the whole column should be either in meters/sec scale or miles/sec scale.

Now, that we are clear on how to source and clean the data, let’s see how we can analyze the data.

3. Univariate Analysis

If we analyze data over a single variable/column from a dataset, it is known as Univariate Analysis.

Categorical Unordered Univariate Analysis:

An unordered variable is a categorical variable that has no defined order. If we take our data as an example, the job column in the dataset is divided into many sub-categories like technician, blue-collar, services, management, etc. There is no weight or measure given to any value in the ‘job’ column.

Now, let’s analyze the job category by using plots. Since Job is a category, we will plot the bar plot.

# Let's calculate the percentage of each job status category.
data.job.value_counts(normalize=True)

#plot the bar graph of percentage job categories
data.job.value_counts(normalize=True).plot.barh()
plt.show()

The output looks like this,

By the above bar plot, we can infer that the data set contains more number of blue-collar workers compared to other categories.

Categorical Ordered Univariate Analysis:

Ordered variables are those variables that have a natural rank of order. Some examples of categorical ordered variables from our dataset are:

  • Month: Jan, Feb, March……
  • Education: Primary, Secondary,……

Now, let’s analyze the Education Variable from the dataset. Since we’ve already seen a bar plot, let’s see how a Pie Chart looks like.

#calculate the percentage of each education category.
data.education.value_counts(normalize=True)

#plot the pie chart of education categories
data.education.value_counts(normalize=True).plot.pie()
plt.show()

The output will be,

By the above analysis, we can infer that the data set has a large number of them belongs to secondary education after that tertiary and next primary. Also, a very small percentage of them have been unknown.

This is how we analyze univariate categorical analysis. If the column or variable is of numerical then we’ll analyze by calculating its mean, median, std, etc. We can get those values by using the describe function.

data.salary.describe()

The output will be,

4. Bivariate Analysis

If we analyze data by taking two variables/columns into consideration from a dataset, it is known as Bivariate Analysis.

a) Numeric-Numeric Analysis:

Analyzing the two numeric variables from a dataset is known as numeric-numeric analysis. We can analyze it in three different ways.

  • Scatter Plot
  • Pair Plot
  • Correlation Matrix

Scatter Plot

Let’s take three columns ‘Balance’, ‘Age’ and ‘Salary’ from our dataset and see what we can infer by plotting to scatter plot between salary balance and age balance

#plot the scatter plot of balance and salary variable in data
plt.scatter(data.salary,data.balance)
plt.show()

#plot the scatter plot of balance and age variable in data
data.plot.scatter(x="age",y="balance")
plt.show()

Now, the scatter plots looks like,

Pair Plot

Now, let’s plot Pair Plots for the three columns we used in plotting Scatter plots. We’ll use the seaborn library for plotting Pair Plots.

#plot the pair plot of salary, balance and age in data dataframe.
sns.pairplot(data = data, vars=['salary','balance','age'])
plt.show()

The Pair Plot looks like this,

Correlation Matrix

Since we cannot use more than two variables as x-axis and y-axis in Scatter and Pair Plots, it is difficult to see the relation between three numerical variables in a single graph. In those cases, we’ll use the correlation matrix.

# Creating a matrix using age, salry, balance as rows and columns
data[['age','salary','balance']].corr()

#plot the correlation matrix of salary, balance and age in data dataframe.
sns.heatmap(data[['age','salary','balance']].corr(), annot=True, cmap = 'Reds')
plt.show()

First, we created a matrix using age, salary, and balance. After that, we are plotting the heatmap using the seaborn library of the matrix.

b) Numeric - Categorical Analysis

Analyzing the one numeric variable and one categorical variable from a dataset is known as numeric-categorical analysis. We analyze them mainly using mean, median, and box plots.

Let’s take salary and response columns from our dataset.

First check for mean value using groupby

#groupby the response to find the mean of the salary with response no & yes separately.
data.groupby('response')['salary'].mean()

The output will be,

There is not much of a difference between the yes and no response based on the salary.

Let’s calculate the median,

#groupby the response to find the median of the salary with response no & yes separately.
data.groupby('response')['salary'].median()

The output will be,

By both mean and median we can say that the response of yes and no remains the same irrespective of the person’s salary. But, is it truly behaving like that, let’s plot the box plot for them and check the behavior.

#plot the box plot of salary for yes & no responses.
sns.boxplot(data.response, data.salary)
plt.show()

The box plot looks like this,

As we can see, when we plot the Box Plot, it paints a very different picture compared to mean and median. The IQR for customers who gave a positive response is on the higher salary side.

This is how we analyze Numeric-Categorical variables, we use mean, median, and Box Plots to draw some sort of conclusions.

c) Categorical — Categorical Analysis

Since our target variable/column is the Response rate, we’ll see how the different categories like Education, Marital Status, etc., are associated with the Response column. So instead of ‘Yes’ and ‘No’ we will convert them into ‘1’ and ‘0’, by doing that we’ll get the “Response Rate”.

#create response_rate of numerical data type where response "yes"= 1, "no"= 0
data['response_rate'] = np.where(data.response=='yes',1,0)
data.response_rate.value_counts()

The output looks like this,

Let’s see how the response rate varies for different categories in marital status.

#plot the bar graph of marital status with average value of response_rate
data.groupby('marital')['response_rate'].mean().plot.bar()
plt.show()

The graph looks like this,

By the above graph, we can infer that the positive response is more for Single status members in the data set. Similarly, we can plot the graphs for Loan vs Response rate, Housing Loans vs Response rate, etc.

5. Multivariate Analysis

If we analyze data by taking more than two variables/columns into consideration from a dataset, it is known as Multivariate Analysis.

Let’s see how ‘Education’, ‘Marital’, and ‘Response_rate’ vary with each other.

First, we’ll create a pivot table with the three columns and after that, we’ll create a heatmap.

result = pd.pivot_table(data=data, index='education', columns='marital',values='response_rate')
print(result)

#create heat map of education vs marital vs response_rate
sns.heatmap(result, annot=True, cmap = 'RdYlGn', center=0.117)
plt.show()

The Pivot table and heatmap looks like this,

Based on the Heatmap we can infer that the married people with primary education are less likely to respond positively for the survey and single people with tertiary education are most likely to respond positively to the survey.

Similarly, we can plot the graphs for Job vs marital vs response, Education vs poutcome vs response, etc.

Conclusion

This is how we’ll do Exploratory Data Analysis. Exploratory Data Analysis (EDA) helps us to look beyond the data. The more we explore the data, the more the insights we draw from it. As a data analyst, almost 80% of our time will be spent understanding data and solving various business problems through EDA.

Thank you for reading and Happy Coding!!!

#dataanalysis #python

Dylan  Iqbal

Dylan Iqbal

1561523460

Matplotlib Cheat Sheet: Plotting in Python

This Matplotlib cheat sheet introduces you to the basics that you need to plot your data with Python and includes code samples.

Data visualization and storytelling with your data are essential skills that every data scientist needs to communicate insights gained from analyses effectively to any audience out there. 

For most beginners, the first package that they use to get in touch with data visualization and storytelling is, naturally, Matplotlib: it is a Python 2D plotting library that enables users to make publication-quality figures. But, what might be even more convincing is the fact that other packages, such as Pandas, intend to build more plotting integration with Matplotlib as time goes on.

However, what might slow down beginners is the fact that this package is pretty extensive. There is so much that you can do with it and it might be hard to still keep a structure when you're learning how to work with Matplotlib.   

DataCamp has created a Matplotlib cheat sheet for those who might already know how to use the package to their advantage to make beautiful plots in Python, but that still want to keep a one-page reference handy. Of course, for those who don't know how to work with Matplotlib, this might be the extra push be convinced and to finally get started with data visualization in Python. 

You'll see that this cheat sheet presents you with the six basic steps that you can go through to make beautiful plots. 

Check out the infographic by clicking on the button below:

Python Matplotlib cheat sheet

With this handy reference, you'll familiarize yourself in no time with the basics of Matplotlib: you'll learn how you can prepare your data, create a new plot, use some basic plotting routines to your advantage, add customizations to your plots, and save, show and close the plots that you make.

What might have looked difficult before will definitely be more clear once you start using this cheat sheet! Use it in combination with the Matplotlib Gallery, the documentation.

Matplotlib 

Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.

Prepare the Data 

1D Data 

>>> import numpy as np
>>> x = np.linspace(0, 10, 100)
>>> y = np.cos(x)
>>> z = np.sin(x)

2D Data or Images 

>>> data = 2 * np.random.random((10, 10))
>>> data2 = 3 * np.random.random((10, 10))
>>> Y, X = np.mgrid[-3:3:100j, -3:3:100j]
>>> U = 1 X** 2 + Y
>>> V = 1 + X Y**2
>>> from matplotlib.cbook import get_sample_data
>>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy'))

Create Plot

>>> import matplotlib.pyplot as plt

Figure 

>>> fig = plt.figure()
>>> fig2 = plt.figure(figsize=plt.figaspect(2.0))

Axes 

>>> fig.add_axes()
>>> ax1 = fig.add_subplot(221) #row-col-num
>>> ax3 = fig.add_subplot(212)
>>> fig3, axes = plt.subplots(nrows=2,ncols=2)
>>> fig4, axes2 = plt.subplots(ncols=3)

Save Plot 

>>> plt.savefig('foo.png') #Save figures
>>> plt.savefig('foo.png',  transparent=True) #Save transparent figures

Show Plot

>>> plt.show()

Plotting Routines 

1D Data 

>>> fig, ax = plt.subplots()
>>> lines = ax.plot(x,y) #Draw points with lines or markers connecting them
>>> ax.scatter(x,y) #Draw unconnected points, scaled or colored
>>> axes[0,0].bar([1,2,3],[3,4,5]) #Plot vertical rectangles (constant width)
>>> axes[1,0].barh([0.5,1,2.5],[0,1,2]) #Plot horiontal rectangles (constant height)
>>> axes[1,1].axhline(0.45) #Draw a horizontal line across axes
>>> axes[0,1].axvline(0.65) #Draw a vertical line across axes
>>> ax.fill(x,y,color='blue') #Draw filled polygons
>>> ax.fill_between(x,y,color='yellow') #Fill between y values and 0

2D Data 

>>> fig, ax = plt.subplots()
>>> im = ax.imshow(img, #Colormapped or RGB arrays
      cmap= 'gist_earth', 
      interpolation= 'nearest',
      vmin=-2,
      vmax=2)
>>> axes2[0].pcolor(data2) #Pseudocolor plot of 2D array
>>> axes2[0].pcolormesh(data) #Pseudocolor plot of 2D array
>>> CS = plt.contour(Y,X,U) #Plot contours
>>> axes2[2].contourf(data1) #Plot filled contours
>>> axes2[2]= ax.clabel(CS) #Label a contour plot

Vector Fields 

>>> axes[0,1].arrow(0,0,0.5,0.5) #Add an arrow to the axes
>>> axes[1,1].quiver(y,z) #Plot a 2D field of arrows
>>> axes[0,1].streamplot(X,Y,U,V) #Plot a 2D field of arrows

Data Distributions 

>>> ax1.hist(y) #Plot a histogram
>>> ax3.boxplot(y) #Make a box and whisker plot
>>> ax3.violinplot(z)  #Make a violin plot

Plot Anatomy & Workflow 

Plot Anatomy 

 y-axis      

                           x-axis 

Workflow 

The basic steps to creating plots with matplotlib are:

1 Prepare Data
2 Create Plot
3 Plot
4 Customized Plot
5 Save Plot
6 Show Plot

>>> import matplotlib.pyplot as plt
>>> x = [1,2,3,4]  #Step 1
>>> y = [10,20,25,30] 
>>> fig = plt.figure() #Step 2
>>> ax = fig.add_subplot(111) #Step 3
>>> ax.plot(x, y, color= 'lightblue', linewidth=3)  #Step 3, 4
>>> ax.scatter([2,4,6],
          [5,15,25],
          color= 'darkgreen',
          marker= '^' )
>>> ax.set_xlim(1, 6.5)
>>> plt.savefig('foo.png' ) #Step 5
>>> plt.show() #Step 6

Close and Clear 

>>> plt.cla()  #Clear an axis
>>> plt.clf(). #Clear the entire figure
>>> plt.close(). #Close a window

Plotting Customize Plot 

Colors, Color Bars & Color Maps 

>>> plt.plot(x, x, x, x**2, x, x** 3)
>>> ax.plot(x, y, alpha = 0.4)
>>> ax.plot(x, y, c= 'k')
>>> fig.colorbar(im, orientation= 'horizontal')
>>> im = ax.imshow(img,
            cmap= 'seismic' )

Markers 

>>> fig, ax = plt.subplots()
>>> ax.scatter(x,y,marker= ".")
>>> ax.plot(x,y,marker= "o")

Linestyles 

>>> plt.plot(x,y,linewidth=4.0)
>>> plt.plot(x,y,ls= 'solid') 
>>> plt.plot(x,y,ls= '--') 
>>> plt.plot(x,y,'--' ,x**2,y**2,'-.' ) 
>>> plt.setp(lines,color= 'r',linewidth=4.0)

Text & Annotations 

>>> ax.text(1,
           -2.1, 
           'Example Graph', 
            style= 'italic' )
>>> ax.annotate("Sine", 
xy=(8, 0),
xycoords= 'data', 
xytext=(10.5, 0),
textcoords= 'data', 
arrowprops=dict(arrowstyle= "->", 
connectionstyle="arc3"),)

Mathtext 

>>> plt.title(r '$sigma_i=15$', fontsize=20)

Limits, Legends and Layouts 

Limits & Autoscaling 

>>> ax.margins(x=0.0,y=0.1) #Add padding to a plot
>>> ax.axis('equal')  #Set the aspect ratio of the plot to 1
>>> ax.set(xlim=[0,10.5],ylim=[-1.5,1.5])  #Set limits for x-and y-axis
>>> ax.set_xlim(0,10.5) #Set limits for x-axis

Legends 

>>> ax.set(title= 'An Example Axes',  #Set a title and x-and y-axis labels
            ylabel= 'Y-Axis', 
            xlabel= 'X-Axis')
>>> ax.legend(loc= 'best')  #No overlapping plot elements

Ticks 

>>> ax.xaxis.set(ticks=range(1,5),  #Manually set x-ticks
             ticklabels=[3,100, 12,"foo" ])
>>> ax.tick_params(axis= 'y', #Make y-ticks longer and go in and out
             direction= 'inout', 
              length=10)

Subplot Spacing 

>>> fig3.subplots_adjust(wspace=0.5,   #Adjust the spacing between subplots
             hspace=0.3,
             left=0.125,
             right=0.9,
             top=0.9,
             bottom=0.1)
>>> fig.tight_layout() #Fit subplot(s) in to the figure area

Axis Spines 

>>> ax1.spines[ 'top'].set_visible(False) #Make the top axis line for a plot invisible
>>> ax1.spines['bottom' ].set_position(( 'outward',10))  #Move the bottom axis line outward

Have this Cheat Sheet at your fingertips

Original article source at https://www.datacamp.com

#matplotlib #cheatsheet #python

Gordon  Taylor

Gordon Taylor

1656265800

Plotly.js: Open-source JavaScript charting library behind Plotly, Dash

Plotly.js is a standalone Javascript data visualization library, and it also powers the Python and R modules named plotly in those respective ecosystems (referred to as Plotly.py and Plotly.R).

Plotly.js can be used to produce dozens of chart types and visualizations, including statistical charts, 3D graphs, scientific charts, SVG and tile maps, financial charts and more.

 

Contact us for Plotly.js consulting, dashboard development, application integration, and feature additions.


Load as a node module

Install a ready-to-use distributed bundle

npm i --save plotly.js-dist-min

and use import or require in node.js

// ES6 module
import Plotly from 'plotly.js-dist-min'

// CommonJS
var Plotly = require('plotly.js-dist-min')

You may also consider using plotly.js-dist if you prefer using an unminified package.


Load via script tag

The script HTML element

In the examples below Plotly object is added to the window scope by script. The newPlot method is then used to draw an interactive figure as described by data and layout into the desired div here named gd. As demonstrated in the example above basic knowledge of html and JSON syntax is enough to get started i.e. with/without JavaScript! To learn and build more with plotly.js please visit plotly.js documentation.

<head>
    <script src="https://cdn.plot.ly/plotly-2.12.1.min.js"></script>
</head>
<body>
    <div id="gd"></div>

    <script>
        Plotly.newPlot("gd", /* JSON object */ {
            "data": [{ "y": [1, 2, 3] }],
            "layout": { "width": 600, "height": 400}
        })
    </script>
</body>

Alternatively you may consider using native ES6 import in the script tag.

<script type="module">
    import "https://cdn.plot.ly/plotly-2.12.1.min.js"
    Plotly.newPlot("gd", [{ y: [1, 2, 3] }])
</script>

Fastly supports Plotly.js with free CDN service. Read more at https://www.fastly.com/open-source.

Un-minified versions are also available on CDN

While non-minified source files may contain characters outside UTF-8, it is recommended that you specify the charset when loading those bundles.

<script src="https://cdn.plot.ly/plotly-2.12.1.js" charset="utf-8"></script>

Please note that as of v2 the "plotly-latest" outputs (e.g. https://cdn.plot.ly/plotly-latest.min.js) will no longer be updated on the CDN, and will stay at the last v1 patch v1.58.5. Therefore, to use the CDN with plotly.js v2 and higher, you must specify an exact plotly.js version.

MathJax

You could load either version two or version three of MathJax files, for example:

<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-AMS-MML_SVG.js"></script>
<script src="https://cdn.jsdelivr.net/npm/mathjax@3.2.2/es5/tex-svg.js"></script>

When using MathJax version 3, it is also possible to use chtml output on the other parts of the page in addition to svg output for the plotly graph. Please refer to devtools/test_dashboard/index-mathjax3chtml.html to see an example.

Bundles

There are two kinds of plotly.js bundles:

  1. Complete and partial official bundles that are distributed to npm and the CDN, described in the dist README.
  2. Custom bundles you can create yourself to optimize the size of bundle depending on your needs. Please visit CUSTOM_BUNDLE for more information.

Alternative ways to load and build plotly.js

If your library needs to bundle or directly load plotly.js/lib/index.js or parts of its modules similar to index-basic in some other way than via an official or a custom bundle, or in case you want to tweak the default build configurations of browserify or webpack, etc. then please visit BUILDING.md.


Documentation

Official plotly.js documentation is hosted at https://plotly.com/javascript.

These pages are generated by the Plotly graphing-library-docs repo built with Jekyll and publicly hosted on GitHub Pages. For more info about contributing to Plotly documentation, please read through contributing guidelines.


Bugs and feature requests

Have a bug or a feature request? Please open a Github issue keeping in mind the issue guidelines. You may also want to read about how changes get made to Plotly.js


Contributing

Please read through our contributing guidelines. Included are directions for opening issues, using plotly.js in your project and notes on development.


Notable contributors

Plotly.js is at the core of a large and dynamic ecosystem with many contributors who file issues, reproduce bugs, suggest improvements, write code in this repo (and other upstream or downstream ones) and help users in the Plotly community forum. The following people deserve special recognition for their outsized contributions to this ecosystem:

 GitHubTwitterStatus
Alex C. Johnson@alexcjohnson Active, Maintainer
Mojtaba Samimi@archmoj@solarchvisionActive, Maintainer
Antoine Roy-Gobeil@antoinerg Active, Maintainer
Nicolas Kruchten@nicolaskruchten@nicolaskruchtenActive, Maintainer
Jon Mease@jonmmease@jonmmeaseActive
Étienne Tétreault-Pinard@etpinard@etpinardHall of Fame
Mikola Lysenko@mikolalysenko@MikolaLysenkoHall of Fame
Ricky Reusser@rreusser@rickyreusserHall of Fame
Dmitry Yv.@dy@DimaYvHall of Fame
Robert Monfera@monfera@monferaHall of Fame
Robert Möstl@rmoestl@rmoestlHall of Fame
Nicolas Riesco@n-riesco Hall of Fame
Miklós Tusz@mdtusz@mdtuszHall of Fame
Chelsea Douglas@cldougl Hall of Fame
Ben Postlethwaite@bpostlethwaite Hall of Fame
Chris Parmer@chriddyp Hall of Fame
Alex Vados@alexander-daniel Hall of Fame

Copyright and license

Code and documentation copyright 2021 Plotly, Inc.

Code released under the MIT license.

Versioning

This project is maintained under the Semantic Versioning guidelines.

See the Releases section of our GitHub project for changelogs for each release version of plotly.js.


Community

  • Follow @plotlygraphs on Twitter for the latest Plotly news.
  • Implementation help may be found on community.plot.com (tagged plotly-js) or on Stack Overflow (tagged plotly).
  • Developers should use the keyword plotly on packages which modify or add to the functionality of plotly.js when distributing through npm.

Author: plotly
Source Code: https://github.com/plotly/plotly.js/ 
License: MIT license

#javascript #d3 #plotly #visualization #datavisualization 

Reid  Rohan

Reid Rohan

1658321700

Plotly.js: Open-source JavaScript Charting Library Behind Plotly, Dash

Plotly.js is a standalone Javascript data visualization library, and it also powers the Python and R modules named plotly in those respective ecosystems (referred to as Plotly.py and Plotly.R).

Plotly.js can be used to produce dozens of chart types and visualizations, including statistical charts, 3D graphs, scientific charts, SVG and tile maps, financial charts and more.

 

Contact us for Plotly.js consulting, dashboard development, application integration, and feature additions.


Load as a node module

Install a ready-to-use distributed bundle

npm i --save plotly.js-dist-min

and use import or require in node.js

// ES6 module
import Plotly from 'plotly.js-dist-min'

// CommonJS
var Plotly = require('plotly.js-dist-min')

You may also consider using plotly.js-dist if you prefer using an unminified package.


Load via script tag

The script HTML element

In the examples below Plotly object is added to the window scope by script. The newPlot method is then used to draw an interactive figure as described by data and layout into the desired div here named gd. As demonstrated in the example above basic knowledge of html and JSON syntax is enough to get started i.e. with/without JavaScript! To learn and build more with plotly.js please visit plotly.js documentation.

<head>
    <script src="https://cdn.plot.ly/plotly-2.13.1.min.js"></script>
</head>
<body>
    <div id="gd"></div>

    <script>
        Plotly.newPlot("gd", /* JSON object */ {
            "data": [{ "y": [1, 2, 3] }],
            "layout": { "width": 600, "height": 400}
        })
    </script>
</body>

Alternatively you may consider using native ES6 import in the script tag.

<script type="module">
    import "https://cdn.plot.ly/plotly-2.13.1.min.js"
    Plotly.newPlot("gd", [{ y: [1, 2, 3] }])
</script>

Fastly supports Plotly.js with free CDN service. Read more at https://www.fastly.com/open-source.

Un-minified versions are also available on CDN

While non-minified source files may contain characters outside UTF-8, it is recommended that you specify the charset when loading those bundles.

<script src="https://cdn.plot.ly/plotly-2.13.1.js" charset="utf-8"></script>

Please note that as of v2 the "plotly-latest" outputs (e.g. https://cdn.plot.ly/plotly-latest.min.js) will no longer be updated on the CDN, and will stay at the last v1 patch v1.58.5. Therefore, to use the CDN with plotly.js v2 and higher, you must specify an exact plotly.js version.

MathJax

You could load either version two or version three of MathJax files, for example:

<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-AMS-MML_SVG.js"></script>
<script src="https://cdn.jsdelivr.net/npm/mathjax@3.2.2/es5/tex-svg.js"></script>

When using MathJax version 3, it is also possible to use chtml output on the other parts of the page in addition to svg output for the plotly graph. Please refer to devtools/test_dashboard/index-mathjax3chtml.html to see an example.

Bundles

There are two kinds of plotly.js bundles:

  1. Complete and partial official bundles that are distributed to npm and the CDN, described in the dist README.
  2. Custom bundles you can create yourself to optimize the size of bundle depending on your needs. Please visit CUSTOM_BUNDLE for more information.

Alternative ways to load and build plotly.js

If your library needs to bundle or directly load plotly.js/lib/index.js or parts of its modules similar to index-basic in some other way than via an official or a custom bundle, or in case you want to tweak the default build configurations of browserify or webpack, etc. then please visit BUILDING.md.


Documentation

Official plotly.js documentation is hosted at https://plotly.com/javascript.

These pages are generated by the Plotly graphing-library-docs repo built with Jekyll and publicly hosted on GitHub Pages. For more info about contributing to Plotly documentation, please read through contributing guidelines.


Bugs and feature requests

Have a bug or a feature request? Please open a Github issue keeping in mind the issue guidelines. You may also want to read about how changes get made to Plotly.js


Contributing

Please read through our contributing guidelines. Included are directions for opening issues, using plotly.js in your project and notes on development.


Notable contributors

Plotly.js is at the core of a large and dynamic ecosystem with many contributors who file issues, reproduce bugs, suggest improvements, write code in this repo (and other upstream or downstream ones) and help users in the Plotly community forum. The following people deserve special recognition for their outsized contributions to this ecosystem:

 GitHubTwitterStatus
Alex C. Johnson@alexcjohnson Active, Maintainer
Mojtaba Samimi@archmoj@solarchvisionActive, Maintainer
Antoine Roy-Gobeil@antoinerg Active, Maintainer
Nicolas Kruchten@nicolaskruchten@nicolaskruchtenActive, Maintainer
Jon Mease@jonmmease@jonmmeaseActive
Étienne Tétreault-Pinard@etpinard@etpinardHall of Fame
Mikola Lysenko@mikolalysenko@MikolaLysenkoHall of Fame
Ricky Reusser@rreusser@rickyreusserHall of Fame
Dmitry Yv.@dy@DimaYvHall of Fame
Robert Monfera@monfera@monferaHall of Fame
Robert Möstl@rmoestl@rmoestlHall of Fame
Nicolas Riesco@n-riesco Hall of Fame
Miklós Tusz@mdtusz@mdtuszHall of Fame
Chelsea Douglas@cldougl Hall of Fame
Ben Postlethwaite@bpostlethwaite Hall of Fame
Chris Parmer@chriddyp Hall of Fame
Alex Vados@alexander-daniel Hall of Fame

Copyright and license

Code and documentation copyright 2021 Plotly, Inc.

Code released under the MIT license.

Versioning

This project is maintained under the Semantic Versioning guidelines.

See the Releases section of our GitHub project for changelogs for each release version of plotly.js.


Community

  • Follow @plotlygraphs on Twitter for the latest Plotly news.
  • Implementation help may be found on community.plot.com (tagged plotly-js) or on Stack Overflow (tagged plotly).
  • Developers should use the keyword plotly on packages which modify or add to the functionality of plotly.js when distributing through npm.

Author: Plotly
Source Code: https://github.com/plotly/plotly.js 
License: MIT license

#javascript #visualization #d3 #charts #webgl