Mark Mara

1607175000

A Plotly.js React Component From Plotly

react-plotly.js

A plotly.js React component from Plotly. The basis of Plotly’s React component suite.

Installation

$ npm install react-plotly.js plotly.js

Quick start

The easiest way to use this component is to import and pass data to a plot component:

import React from 'react';
import Plot from 'react-plotly.js';

class App extends React.Component {
  render() {
    return (
      <Plot
        data={[
          {
            x: [1, 2, 3],
            y: [2, 6, 3],
            type: 'scatter',
            mode: 'lines+markers',
            marker: {color: 'red'},
          },
          {type: 'bar', x: [1, 2, 3], y: [2, 5, 3]},
        ]}
        layout={{width: 320, height: 240, title: 'A Fancy Plot'}}
      />
    );
  }
}

You should see a plot like this:

[Figure: example plot]

For a full description of Plotly chart types and attributes, see the Plotly JavaScript reference: https://plot.ly/javascript/reference/

State management

This is a “dumb” component that doesn’t merge its internal state with any updates. This means that if a user interacts with the plot, by zooming or panning for example, any subsequent re-renders will lose this information unless it is captured and upstreamed via the onUpdate callback prop.

Here is a simple example of how to capture and store state in a parent object:

class App extends React.Component {
  constructor(props) {
    super(props);
    this.state = {data: [], layout: {}, frames: [], config: {}};
  }

  render() {
    return (
      <Plot
        data={this.state.data}
        layout={this.state.layout}
        frames={this.state.frames}
        config={this.state.config}
        onInitialized={(figure) => this.setState(figure)}
        onUpdate={(figure) => this.setState(figure)}
      />
    );
  }
}

Refreshing the Plot

This component will refresh the plot via Plotly.react if any of the following are true:

  • The revision prop is defined and has changed, OR;
  • One of data, layout or config has changed identity as checked via a shallow ===, OR;
  • The number of elements in frames has changed

Furthermore, when called, Plotly.react will only refresh the data being plotted if the identity of the data arrays (e.g. x, y, marker.color etc) has changed, or if layout.datarevision has changed.

In short, adding data points to a trace in data, or changing a value in layout, will not cause the plot to update if the change is made in place. Either make the change immutably (for example with immutability-helper, if performance considerations permit it), or use revision and/or layout.datarevision to force a rerender.
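The refresh rules above can be illustrated without React. The sketch below uses a hypothetical shouldRefresh helper (it is not the component's actual source) that only mirrors the three documented checks, so that the difference between an in-place and an immutable update is visible. The trace values are placeholders.

```javascript
// Hypothetical helper that mirrors the documented refresh checks.
// It is an illustration only, not the component's real implementation.
function shouldRefresh(prevProps, nextProps) {
  // 1. revision is defined and has changed
  if (nextProps.revision !== undefined && nextProps.revision !== prevProps.revision) {
    return true;
  }
  // 2. data, layout or config changed identity (shallow ===)
  if (
    prevProps.data !== nextProps.data ||
    prevProps.layout !== nextProps.layout ||
    prevProps.config !== nextProps.config
  ) {
    return true;
  }
  // 3. the number of elements in frames changed
  const prevCount = prevProps.frames ? prevProps.frames.length : 0;
  const nextCount = nextProps.frames ? nextProps.frames.length : 0;
  return prevCount !== nextCount;
}

const trace = {x: [1, 2, 3], y: [2, 6, 3], type: 'scatter'};
const prevProps = {data: [trace], layout: {title: 'A Fancy Plot'}, revision: 0};

// In-place mutation: the data array and trace object keep their identity,
// so none of the checks fire.
trace.x.push(4);
trace.y.push(5);
const mutatedProps = {...prevProps};
const mutatedRefresh = shouldRefresh(prevProps, mutatedProps);

// Immutable update: a new data array, trace object and x/y arrays are created,
// so the identity check fires.
const immutableProps = {
  ...prevProps,
  data: [{...trace, x: [...trace.x, 5], y: [...trace.y, 1]}],
};
const immutableRefresh = shouldRefresh(prevProps, immutableProps);

// Alternative: keep the in-place mutation but increment revision to force a refresh.
const revisedProps = {...prevProps, revision: prevProps.revision + 1};
const revisedRefresh = shouldRefresh(prevProps, revisedProps);
```

Here mutatedRefresh is false while immutableRefresh and revisedRefresh are both true, which matches the behaviour described above.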

API Reference

Basic Props

Warning: for the time being, this component may mutate its layout and data props in response to user input, going against React rules. This behaviour will change in the near future once https://github.com/plotly/plotly.js/issues/2389 is completed.

Prop | Type | Default | Description
data | Array | [] | list of trace objects (see https://plot.ly/javascript/reference/)
layout | Object | undefined | layout object (see https://plot.ly/javascript/reference/#layout)
frames | Array | undefined | list of frame objects (see https://plot.ly/javascript/reference/)
config | Object | undefined | config object (see https://plot.ly/javascript/configuration-options/)
revision | Number | undefined | When provided, causes the plot to update when the revision is incremented.
onInitialized | Function(figure, graphDiv) | undefined | Callback executed after plot is initialized. See below for parameter information.
onUpdate | Function(figure, graphDiv) | undefined | Callback executed when a plot is updated due to new data or layout, or when user interacts with a plot. See below for parameter information.
onPurge | Function(figure, graphDiv) | undefined | Callback executed when component unmounts, before Plotly.purge strips the graphDiv of all private attributes. See below for parameter information.
onError | Function(err) | undefined | Callback executed when a plotly.js API method rejects
divId | string | undefined | id assigned to the <div> into which the plot is rendered.
className | string | undefined | applied to the <div> into which the plot is rendered
style | Object | {position: 'relative', display: 'inline-block'} | used to style the <div> into which the plot is rendered
debug | Boolean | false | Assign the graph div to window.gd for debugging
useResizeHandler | Boolean | false | When true, adds a call to Plotly.Plots.resize() as a window.resize event handler

Note: To make a plot responsive, i.e. to fill its containing element and resize when the window is resized, use style or className to set the dimensions of the element (e.g. width: 100%; height: 100% or similar values), set useResizeHandler to true, set layout.autosize to true, and leave layout.height and layout.width undefined. This can be seen in action in this CodePen and implements the behaviour documented here: https://plot.ly/javascript/responsive-fluid-layout/
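As a minimal sketch of the note above, the props for a responsive plot could be collected in one object. The name responsivePlotProps and the trace values are placeholders chosen for illustration; the prop names themselves come from the table above.

```javascript
// Props for a responsive plot, following the note above.
// The object name and the trace values are placeholders for illustration.
const responsivePlotProps = {
  data: [{x: [1, 2, 3], y: [2, 6, 3], type: 'scatter'}],
  // autosize is on; width and height are deliberately left undefined
  layout: {autosize: true, title: 'A Fancy Plot'},
  // resize the plot when the window is resized
  useResizeHandler: true,
  // let the plot's <div> fill its containing element
  style: {width: '100%', height: '100%'},
};
```

These props can then be spread onto the component, e.g. <Plot {...responsivePlotProps} />, inside a container that has its own dimensions.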

Callback signature: Function(figure, graphDiv)

The onInitialized, onUpdate and onPurge props are all functions which will be called with two arguments: figure and graphDiv.

  • figure is a serializable object with three keys corresponding to input props: data, layout and frames.
    • As mentioned above, for the time being, this component may mutate its layout and data props in response to user input, going against React rules. This behaviour will change in the near future once https://github.com/plotly/plotly.js/issues/2389 is completed.
  • graphDiv is a reference to the (unserializable) DOM node into which the figure was rendered.
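Because figure is serializable while graphDiv is not, a callback can safely snapshot the former and ignore the latter. The sketch below assumes a hypothetical createFigureRecorder helper; the figure passed to it is a stand-in for what the component would supply to onUpdate.

```javascript
// Hypothetical helper: keeps a JSON snapshot of the most recent figure.
function createFigureRecorder() {
  let snapshot = null;
  return {
    // Matches the documented callback signature Function(figure, graphDiv).
    handleUpdate(figure, graphDiv) {
      // Only the serializable figure is stored; graphDiv is a DOM node and is ignored.
      const {data, layout, frames} = figure;
      snapshot = JSON.stringify({data, layout, frames});
    },
    // Returns a fresh copy of the last recorded figure, or null if none was recorded.
    restore() {
      return snapshot === null ? null : JSON.parse(snapshot);
    },
  };
}

const recorder = createFigureRecorder();
// Stand-in values for what the component would pass to onUpdate.
recorder.handleUpdate(
  {data: [{x: [1, 2, 3], y: [2, 6, 3]}], layout: {title: 'A Fancy Plot'}, frames: []},
  null
);
const restoredFigure = recorder.restore();
```

Taking a JSON copy also sidesteps the mutation caveat mentioned above, since the stored snapshot no longer shares objects with the props.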

Event handler props

Event handlers for specific plotly.js events may be attached through the following props:

Prop Type Plotly Event
onAfterExport Function plotly_afterexport
onAfterPlot Function plotly_afterplot
onAnimated Function plotly_animated
onAnimatingFrame Function plotly_animatingframe
onAnimationInterrupted Function plotly_animationinterrupted
onAutoSize Function plotly_autosize
onBeforeExport Function plotly_beforeexport
onBeforeHover Function plotly_beforehover
onButtonClicked Function plotly_buttonclicked
onClick Function plotly_click
onClickAnnotation Function plotly_clickannotation
onDeselect Function plotly_deselect
onDoubleClick Function plotly_doubleclick
onFramework Function plotly_framework
onHover Function plotly_hover
onLegendClick Function plotly_legendclick
onLegendDoubleClick Function plotly_legenddoubleclick
onRelayout Function plotly_relayout
onRelayouting Function plotly_relayouting
onRestyle Function plotly_restyle
onRedraw Function plotly_redraw
onSelected Function plotly_selected
onSelecting Function plotly_selecting
onSliderChange Function plotly_sliderchange
onSliderEnd Function plotly_sliderend
onSliderStart Function plotly_sliderstart
onSunburstClick Function plotly_sunburstclick
onTransitioning Function plotly_transitioning
onTransitionInterrupted Function plotly_transitioninterrupted
onUnhover Function plotly_unhover
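Every row in the table above follows the same naming pattern: drop the leading "on" from the prop name, lowercase the rest, and prefix it with "plotly_". The helper below is hypothetical and only illustrates that pattern; it does not apply to the basic props onInitialized, onUpdate, onPurge and onError, which are not plotly.js event handlers.

```javascript
// Hypothetical helper illustrating the naming pattern in the table above:
// drop the leading "on", lowercase the rest, and prefix "plotly_".
function plotlyEventNameForProp(propName) {
  if (!/^on[A-Z]/.test(propName)) {
    throw new Error(`Not an event handler prop: ${propName}`);
  }
  return 'plotly_' + propName.slice(2).toLowerCase();
}

const clickEvent = plotlyEventNameForProp('onClick');
const legendEvent = plotlyEventNameForProp('onLegendDoubleClick');
```

With the table as reference, clickEvent is 'plotly_click' and legendEvent is 'plotly_legenddoubleclick'.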

Customizing the plotly.js bundle

By default, the Plot component exported by this library loads a precompiled version of all of plotly.js, so plotly.js must be installed as a peer dependency. This bundle is around 6 MB unminified, and minifies to just over 2 MB.

If you do not wish to use this version of plotly.js, e.g. if you want to use a different precompiled bundle, assemble your own customized bundle, or load plotly.js from a CDN, you can skip the installation of plotly.js as a peer dependency (and ignore the resulting warning) and use the createPlotlyComponent method to get a Plot component, instead of importing it:

// simplest method: uses precompiled complete bundle from `plotly.js`
import Plot from 'react-plotly.js';

// customizable method (an alternative to the import above): use your own `Plotly` object
// `Plotly` must already be in scope, e.g. from your own bundle or a CDN <script> tag
import createPlotlyComponent from 'react-plotly.js/factory';
const Plot = createPlotlyComponent(Plotly);

Loading from a <script> tag

For quick one-off demos on CodePen or JSFiddle, you may wish to just load the component directly as a script tag. We don’t host the bundle directly, so you should never rely on this to work forever or in production, but you can use a third-party service to load the factory version of the component from, for example, https://unpkg.com/react-plotly.js@latest/dist/create-plotly-component.js.

You can load plotly.js and the component factory with:

<script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
<script src="https://unpkg.com/react-plotly.js@latest/dist/create-plotly-component.js"></script>

And instantiate the component with:

const Plot = createPlotlyComponent(Plotly);

ReactDOM.render(
  React.createElement(Plot, {
    data: [{x: [1, 2, 3], y: [2, 1, 3]}],
  }),
  document.getElementById('root')
);

You can see an example of this method in action here.

Development

To get started:

$ npm install

To transpile from ES2015 + JSX into the ES5 npm-distributed version:

$ npm run prepublishOnly

To run the tests:

$ npm run test

Download Details:

Author: plotly

Demo: http://react-plotly.js-demo.getforge.io/

Source Code: https://github.com/plotly/react-plotly.js

#react #reactjs #javascript

Autumn Blick

1598839687

How native is React Native? | React Native vs Native App Development

If you are undertaking mobile app development for your start-up or enterprise, you are likely wondering whether to use React Native. As a popular development framework, React Native helps you to develop near-native mobile apps. However, you are probably also wondering how close you can get to a native app by using React Native. How native is React Native?

In the article, we discuss the similarities between native mobile development and development using React Native. We also touch upon where they differ and how to bridge the gaps. Read on.

A brief introduction to React Native

Let’s set the context first by touching upon what React Native is and how it differs from earlier hybrid frameworks.

React Native is a popular JavaScript framework created by Facebook. You can use this open-source framework to build natively rendering Android and iOS mobile apps. You can use it to develop web apps too.

Facebook has developed React Native based on React, its JavaScript library. The first release of React Native came in March 2015. At the time of writing this article, the latest stable release of React Native is 0.62.0, and it was released in March 2020.

Although relatively new, React Native has acquired a high degree of popularity. The “Stack Overflow Developer Survey 2019” report identifies it as the 8th most loved framework. Facebook, Walmart, and Bloomberg are some of the top companies that use React Native.

The popularity of React Native comes from its advantages. Some of its advantages are as follows:

  • Performance: It delivers optimal performance.
  • Cross-platform development: You can develop both Android and iOS apps with it. The reuse of code expedites development and reduces costs.
  • UI design: React Native enables you to design simple and responsive UI for your mobile app.
  • 3rd party plugins: This framework supports 3rd party plugins.
  • Developer community: A vibrant community of developers supports React Native.

Why React Native is fundamentally different from earlier hybrid frameworks

Are you wondering whether React Native is just another of those hybrid frameworks like Ionic or Cordova? It’s not! React Native is fundamentally different from these earlier hybrid frameworks.

React Native is very close to native. Consider the following aspects as described on the React Native website:

  • Access to many native platform features: The primitives of React Native render to native platform UI. This means that your React Native app will use many native platform APIs as native apps would do.
  • Near-native user experience: React Native provides several native components, and these are platform agnostic.
  • The ease of accessing native APIs: React Native uses a declarative UI paradigm. This enables React Native to interact easily with native platform APIs since React Native wraps existing native code.

Due to these factors, React Native offers many more advantages compared to those earlier hybrid frameworks. We now review them.

#android app #frontend #ios app #mobile app development #benefits of react native #is react native good for mobile app development #native vs #pros and cons of react native #react mobile development #react native development #react native experience #react native framework #react native ios vs android #react native pros and cons #react native vs android #react native vs native #react native vs native performance #react vs native #why react native #why use react native

Royce Reinger

1677494820

Zoofs: A Feature Selection Library Based on Evolutionary Algorithms

🐾 zoofs ( Zoo Feature Selection )

zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm-intelligence to physics-based to evolutionary. It's an easy-to-use, flexible and powerful tool to reduce your feature size.

🔗 What's new in V0.1.24

  • pass kwargs through objective function
  • improved logger for results
  • added harris hawk algorithm
  • now you can pass timeout as a parameter to stop the operation after the given number of seconds, as an alternative to passing the number of iterations
  • Feature score hashing of visited feature sets to increase the overall performance

🛠 Installation

Using pip

Use the package manager to install zoofs.

pip install zoofs

📜 Available Algorithms

Algorithm Name | Class Name | Description | Reference (DOI)
Particle Swarm Algorithm | ParticleSwarmOptimization | Utilizes swarm behaviour | https://doi.org/10.1007/978-3-319-13563-2_51
Grey Wolf Algorithm | GreyWolfOptimization | Utilizes wolf hunting behaviour | https://doi.org/10.1016/j.neucom.2015.06.083
Dragon Fly Algorithm | DragonFlyOptimization | Utilizes dragonfly swarm behaviour | https://doi.org/10.1016/j.knosys.2020.106131
Harris Hawk Algorithm | HarrisHawkOptimization | Utilizes hawk hunting behaviour | https://link.springer.com/chapter/10.1007/978-981-32-9990-0_12
Genetic Algorithm | GeneticOptimization | Utilizes genetic mutation behaviour | https://doi.org/10.1109/ICDAR.2001.953980
Gravitational Algorithm | GravitationalOptimization | Utilizes Newton's gravitational behaviour | https://doi.org/10.1109/ICASSP.2011.5946916

More algos soon, stay tuned !

⚡️ Usage

Define your own objective function for optimization !

Classification Example

from sklearn.metrics import log_loss
# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=log_loss(y_valid,model.predict_proba(X_valid))
    return P

# import an algorithm !  
from zoofs import ParticleSwarmOptimization
# create object of algorithm
algo_object=ParticleSwarmOptimization(objective_function_topass,n_iteration=20,
                                       population_size=20,minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()                                       
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train, X_valid, y_valid,verbose=True)
#plot your results
algo_object.plot_history()

Regression Example

from sklearn.metrics import mean_squared_error
# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=mean_squared_error(y_valid,model.predict(X_valid))
    return P

# import an algorithm !  
from zoofs import ParticleSwarmOptimization
# create object of algorithm
algo_object=ParticleSwarmOptimization(objective_function_topass,n_iteration=20,
                                       population_size=20,minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMRegressor()                                       
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train, X_valid, y_valid,verbose=True)
#plot your results
algo_object.plot_history()

Suggestions for Usage

  • As the available algorithms are wrapper algorithms, it is better to use ML models that train quickly, e.g. lightgbm or catboost.
  • Choose a sufficiently large 'population_size', as this determines the extent of exploration and exploitation of the algorithm.
  • Ensure that your ML model has its hyperparameters optimized before passing it to the zoofs algorithms.

[Figure: objective score plot]

Algorithms

Particle Swarm Algorithm

In computational science, particle swarm optimization (PSO) is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. It solves a problem by having a population of candidate solutions, here dubbed particles, and moving these particles around in the search-space according to simple mathematical formula over the particle's position and velocity. Each particle's movement is influenced by its local best known position, but is also guided toward the best known positions in the search-space, which are updated as better positions are found by other particles. This is expected to move the swarm toward the best solutions.


class zoofs.ParticleSwarmOptimization(objective_function,n_iteration=50,population_size=50,minimize=True,c1=2,c2=2,w=0.9)


  
Parameters

objective_function : user-made function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'
    The function must return the value that is to be minimized or maximized.

n_iteration : int, default=1000
    Number of iterations the algorithm will run.

timeout : int, default=None
    Stop the operation after the given number of seconds. If this argument is set to None, the operation runs without a time limit and n_iteration is followed.

population_size : int, default=50
    Total size of the population.

minimize : bool, default=True
    Defines whether the objective value is to be maximized or minimized.

c1 : float, default=2.0
    First acceleration coefficient of the particle swarm.

c2 : float, default=2.0
    Second acceleration coefficient of the particle swarm.

w : float, default=0.9
    Weight parameter.

Attributes

best_feature_list : array-like
    Final best set of features.

Methods

Method | Description
fit | Run the algorithm
plot_history | Plot results achieved across iterations

fit(model, X_train, y_train, X_valid, y_valid, verbose=True)

Parameters

model
    Machine learning model object.

X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Training input samples to be used for the machine learning model.

y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The target values (class labels in classification, real numbers in regression).

X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Validation input samples.

y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The validation target values.

verbose : bool, default=True
    Print results for iterations.

Returns

best_feature_list : array-like
    Final best set of features.

plot_history()

Plot results across iterations

Example

from sklearn.metrics import log_loss
# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=log_loss(y_valid,model.predict_proba(X_valid))
    return P

# import an algorithm !  
from zoofs import ParticleSwarmOptimization
# create object of algorithm
algo_object=ParticleSwarmOptimization(objective_function_topass,n_iteration=20,
                                       population_size=20,minimize=True,c1=2,c2=2,w=0.9)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()                      
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train, X_valid, y_valid,verbose=True)
#plot your results
algo_object.plot_history()



Grey Wolf Algorithm

The Grey Wolf Optimizer (GWO) mimics the leadership hierarchy and hunting mechanism of grey wolves in nature. Four types of grey wolves (alpha, beta, delta, and omega) are employed for simulating the leadership hierarchy. In addition, the three main steps of hunting (searching for prey, encircling prey, and attacking prey) are implemented to perform optimization.


class zoofs.GreyWolfOptimization(objective_function,n_iteration=50,population_size=50,minimize=True)


  
Parameters

objective_function : user-made function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'
    The function must return the value that is to be minimized or maximized.

n_iteration : int, default=50
    Number of iterations the algorithm will run.

timeout : int, default=None
    Stop the operation after the given number of seconds. If this argument is set to None, the operation runs without a time limit and n_iteration is followed.

population_size : int, default=50
    Total size of the population.

method : {1, 2}, default=1
    Choose between the two methods of grey wolf optimization.

minimize : bool, default=True
    Defines whether the objective value is to be maximized or minimized.

Attributes

best_feature_list : array-like
    Final best set of features.

Methods

Method | Description
fit | Run the algorithm
plot_history | Plot results achieved across iterations

fit(model, X_train, y_train, X_valid, y_valid, method=1, verbose=True)

Parameters

model
    Machine learning model object.

X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Training input samples to be used for the machine learning model.

y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The target values (class labels in classification, real numbers in regression).

X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Validation input samples.

y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The validation target values.

verbose : bool, default=True
    Print results for iterations.

Returns

best_feature_list : array-like
    Final best set of features.

plot_history()

Plot results across iterations

Example

from sklearn.metrics import log_loss
# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=log_loss(y_valid,model.predict_proba(X_valid))
    return P

# import an algorithm !  
from zoofs import GreyWolfOptimization
# create object of algorithm
algo_object=GreyWolfOptimization(objective_function_topass,n_iteration=20,method=1,
                                    population_size=20,minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()                                       
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train, X_valid, y_valid,verbose=True)
#plot your results
algo_object.plot_history()



Dragon Fly Algorithm

The main inspiration of the Dragonfly Algorithm (DA) originates from static and dynamic swarming behaviours. These two swarming behaviours are very similar to the two main phases of optimization using meta-heuristics: exploration and exploitation. Dragonflies create sub-swarms and fly over different areas in a static swarm, which is the main objective of the exploration phase. In the dynamic swarm, however, dragonflies fly in bigger swarms and along one direction, which is favourable in the exploitation phase.


class zoofs.DragonFlyOptimization(objective_function,n_iteration=50,population_size=50,minimize=True)


  
Parameters

objective_function : user-made function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'
    The function must return the value that is to be minimized or maximized.

n_iteration : int, default=50
    Number of iterations the algorithm will run.

timeout : int, default=None
    Stop the operation after the given number of seconds. If this argument is set to None, the operation runs without a time limit and n_iteration is followed.

population_size : int, default=50
    Total size of the population.

method : {'linear','random','quadraic','sinusoidal'}, default='sinusoidal'
    Choose between the four methods of Dragon Fly optimization.

minimize : bool, default=True
    Defines whether the objective value is to be maximized or minimized.

Attributes

best_feature_list : array-like
    Final best set of features.

Methods

Method | Description
fit | Run the algorithm
plot_history | Plot results achieved across iterations

fit(model, X_train, y_train, X_valid, y_valid, method='sinusoidal', verbose=True)

Parameters

model
    Machine learning model object.

X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Training input samples to be used for the machine learning model.

y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The target values (class labels in classification, real numbers in regression).

X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Validation input samples.

y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The validation target values.

verbose : bool, default=True
    Print results for iterations.

Returns

best_feature_list : array-like
    Final best set of features.

plot_history()

Plot results across iterations

Example

from sklearn.metrics import log_loss
# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=log_loss(y_valid,model.predict_proba(X_valid))
    return P

# import an algorithm !  
from zoofs import DragonFlyOptimization
# create object of algorithm
algo_object=DragonFlyOptimization(objective_function_topass,n_iteration=20,method='sinusoidal',
                                    population_size=20,minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()                                     
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train, X_valid, y_valid,  verbose=True)
#plot your results
algo_object.plot_history()



Harris Hawk Optimization

HHO is a popular swarm-based, gradient-free optimization algorithm with several active and time-varying phases of exploration and exploitation. The algorithm was first published in the journal Future Generation Computer Systems (FGCS) in 2019, and it has since gained increasing attention among researchers due to its flexible structure, high performance, and high-quality results. The main logic of the HHO method is based on the cooperative behaviour and chasing style of Harris' hawks in nature, called the "surprise pounce". There are many suggestions on how to enhance the functionality of HHO, and several enhanced variants of HHO have appeared in Elsevier and IEEE Transactions journals.


class zoofs.HarrisHawkOptimization(objective_function,n_iteration=50,population_size=50,minimize=True,beta=0.5)


  
Parameters

objective_function : user-made function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'
    The function must return the value that is to be minimized or maximized.

n_iteration : int, default=1000
    Number of iterations the algorithm will run.

timeout : int, default=None
    Stop the operation after the given number of seconds. If this argument is set to None, the operation runs without a time limit and n_iteration is followed.

population_size : int, default=50
    Total size of the population.

minimize : bool, default=True
    Defines whether the objective value is to be maximized or minimized.

beta : float, default=0.5
    Value for the Levy random walk.

Attributes

best_feature_list : array-like
    Final best set of features.

Methods

Method | Description
fit | Run the algorithm
plot_history | Plot results achieved across iterations

fit(model, X_train, y_train, X_valid, y_valid, verbose=True)

Parameters

model
    Machine learning model object.

X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Training input samples to be used for the machine learning model.

y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The target values (class labels in classification, real numbers in regression).

X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
    Validation input samples.

y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
    The validation target values.

verbose : bool, default=True
    Print results for iterations.

Returns

best_feature_list : array-like
    Final best set of features.

plot_history()

Plot results across iterations

Example

from sklearn.metrics import log_loss
# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=log_loss(y_valid,model.predict_proba(X_valid))
    return P

# import an algorithm !  
from zoofs import HarrisHawkOptimization
# create object of algorithm
algo_object=HarrisHawkOptimization(objective_function_topass,n_iteration=20,
                                       population_size=20,minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()                      
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train, X_valid, y_valid,verbose=True)
#plot your results
algo_object.plot_history()



Genetic Algorithm

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection. Some examples of GA applications include optimizing decision trees for better performance, automatically solving sudoku puzzles, and hyperparameter optimization.


class zoofs.GeneticOptimization(objective_function,n_iteration=20,population_size=20,selective_pressure=2,elitism=2,mutation_rate=0.05,minimize=True)


  
Parameters

objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'. 
 

The function must return a value, that needs to be minimized/maximized.

n_iteration: int, default=50 
 

Number of time the algorithm will run

timeout: int = None 
 

Stop operation after the given number of second(s). If this argument is set to None, the operation is executed without time limitation and n_iteration is followed

population_size : int, default=50 
 

Total size of the population

selective_pressure: int, default=2 
 

measure of reproductive opportunities for each organism in the population

elitism: int, default=2 
 

number of top individuals to be considered as elites

mutation_rate: float, default=0.05 
 

rate of mutation in the population's gene

minimize: bool, default=True 
 

Defines if the objective value is to be maximized or minimized

Attributes

best_feature_list : array-like 
 

Final best set of features

Methods

Methods | Class Name
fit | Run the algorithm
plot_history | Plot results achieved across iterations

fit(model,X_train,y_train,X_valid,y_valid,verbose=True)

  
Parameters

model
 

machine learning model's object

X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features) 
 

Training input samples to be used for machine learning model

y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples) 
 

The target values (class labels in classification, real numbers in regression).

X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features) 
 

Validation input samples

y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples) 
 

The validation target values.

verbose : bool,default=True 
 

Print results for iterations

Returns

best_feature_list : array-like 
 

Final best set of features

plot_history()

Plot results across iterations

Example

from sklearn.metrics import log_loss
# define your own objective function, make sure the function receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fit your model and return the objective value !
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=log_loss(y_valid,model.predict_proba(X_valid))
    return P

# import an algorithm !  
from zoofs import GeneticOptimization
# create object of algorithm
algo_object=GeneticOptimization(objective_function_topass,n_iteration=20,
                            population_size=20,selective_pressure=2,elitism=2,
                            mutation_rate=0.05,minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()                            
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train,X_valid, y_valid, verbose=True)
#plot your results
algo_object.plot_history()

Gravitational Algorithm


The Gravitational Algorithm is based on the law of gravity and mass interactions. In the algorithm, the searcher agents are a collection of masses which interact with each other based on Newtonian gravity and the laws of motion.
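To make the idea of interacting masses concrete, here is a minimal sketch of a gravitational search on a continuous toy problem. It is not the zoofs implementation: the function `gravitational_search`, the decay constant `alpha`, the search range, and the objective `sphere` are assumptions chosen for illustration, while `g0` and `eps` play the same roles as the parameters documented below.

```python
import math
import random

def gravitational_search(objective, dim, n_iteration=50, population_size=20,
                         g0=100.0, eps=0.5, alpha=20.0, seed=0):
    """Toy gravitational search that minimises `objective`.

    Agents start at random points in [-5, 5]^dim; the best evaluated point is returned.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(population_size)]
    vel = [[0.0] * dim for _ in range(population_size)]
    best_x, best_f = None, math.inf
    for t in range(n_iteration):
        fit = [objective(p) for p in pos]
        for p, f in zip(pos, fit):
            if f < best_f:
                best_x, best_f = p[:], f
        # Lower objective value -> heavier mass; masses are normalised to sum to 1.
        worst, best = max(fit), min(fit)
        raw = [(worst - f) / (worst - best) if worst > best else 1.0 for f in fit]
        total = sum(raw)
        mass = [m / total for m in raw]
        g = g0 * math.exp(-alpha * t / n_iteration)  # gravity weakens over time
        new_pos = []
        for i in range(population_size):
            acc = [0.0] * dim
            for j in range(population_size):
                if i == j:
                    continue
                dist = math.dist(pos[i], pos[j])
                # randomly weighted pull of agent j on agent i; eps avoids division by zero
                pull = rng.random() * g * mass[j] / (dist + eps)
                for d in range(dim):
                    acc[d] += pull * (pos[j][d] - pos[i][d])
            # laws of motion: update velocity from acceleration, then position from velocity
            vel[i] = [rng.random() * v + a for v, a in zip(vel[i], acc)]
            new_pos.append([x + v for x, v in zip(pos[i], vel[i])])
        pos = new_pos
    return best_x, best_f

# Hypothetical objective: squared distance from the origin.
def sphere(x):
    return sum(v * v for v in x)

best_x, best_f = gravitational_search(sphere, dim=3)
print(best_x, best_f)
```

Heavier agents (better solutions) attract the others more strongly, so the population tends to drift toward promising regions as the gravitational constant decays.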


class zoofs.GravitationalOptimization(objective_function,n_iteration=50,population_size=50,g0=100,eps=0.5,minimize=True)


  
Parameters

objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'. 
 

The function must return a value, that needs to be minimized/maximized.

n_iteration: int, default=50 
 

Number of times the algorithm will run

timeout: int = None 
 

Stop operation after the given number of second(s). If this argument is set to None, the operation is executed without time limitation and n_iteration is followed

population_size : int, default=50 
 

Total size of the population

g0: float, default=100 
 

gravitational strength constant

eps: float, default=0.5 
 

distance constant

minimize: bool, default=True 
 

Defines if the objective value is to be maximized or minimized

Attributes

best_feature_list : array-like 
 

Final best set of features

Methods

Methods | Class Name
fit | Run the algorithm
plot_history | Plot results achieved across iterations

fit(model,X_train,y_train,X_valid,y_valid,verbose=True)

  
Parameters

model
 

machine learning model's object

X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features) 
 

Training input samples to be used for machine learning model

y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples) 
 

The target values (class labels in classification, real numbers in regression).

X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features) 
 

Validation input samples

y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples) 
 

The validation target values.

verbose : bool,default=True 
 

Print results for iterations

Returns

best_feature_list : array-like 
 

Final best set of features

plot_history()

Plot results across iterations

Example

from sklearn.metrics import log_loss
# define your own objective function, make sure the function receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fit your model and return the objective value !
def objective_function_topass(model,X_train, y_train, X_valid, y_valid):      
    model.fit(X_train,y_train)  
    P=log_loss(y_valid,model.predict_proba(X_valid))
    return P

# import an algorithm !  
from zoofs import GravitationalOptimization
# create object of algorithm
algo_object=GravitationalOptimization(objective_function_topass,n_iteration=50,
                                population_size=50,g0=100,eps=0.5,minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()                                
# fit the algorithm
algo_object.fit(lgb_model,X_train, y_train, X_valid, y_valid, verbose=True)
#plot your results
algo_object.plot_history()

Support zoofs

The development of zoofs relies completely on contributions.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

First roll out

18,08,2021


🌟 Like this Project? Give us a star !

📘 Documentation

https://jaswinder9051998.github.io/zoofs/


Download Details:

Author: jaswinder9051998
Source Code: https://github.com/jaswinder9051998/zoofs 
License: Apache-2.0 license

#machinelearning #python #optimization #algorithm 

Anil  Sakhiya


1652748716

Exploratory Data Analysis(EDA) with Python

Exploratory Data Analysis Tutorial | Basics of EDA with Python

Exploratory data analysis is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods. It helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions. EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis testing task and provides a better understanding of data set variables and the relationships between them. It can also help determine if the statistical techniques you are considering for data analysis are appropriate or not.

🔹 Topics Covered:
00:00:00 Basics of EDA with Python
01:40:10 Multiple Variate Analysis
02:30:26 Outlier Detection
03:44:48 Cricket World Cup Analysis using Exploratory Data Analysis


Learning the basics of Exploratory Data Analysis using Python with Numpy, Matplotlib, and Pandas.

What is Exploratory Data Analysis(EDA)?

If we want to explain EDA in simple terms, it means trying to understand the given data much better, so that we can make some sense out of it.

We can find a more formal definition in Wikipedia.

In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

EDA in Python uses data visualization to draw meaningful patterns and insights. It also involves the preparation of data sets for analysis by removing irregularities in the data.

Based on the results of EDA, companies also make business decisions, which can have repercussions later.

  • If EDA is not done properly then it can hamper the further steps in the machine learning model building process.
  • If done well, it may improve the efficacy of everything we do next.

In this article we’ll cover the following topics:

  1. Data Sourcing
  2. Data Cleaning
  3. Univariate analysis
  4. Bivariate analysis
  5. Multivariate analysis

1. Data Sourcing

Data Sourcing is the process of finding and loading the data into our system. Broadly there are two ways in which we can find data.

  1. Private Data
  2. Public Data

Private Data

As the name suggests, private data is given by private organizations. There are some security and privacy concerns attached to it. This type of data is used mainly for an organization's internal analysis.

Public Data

This type of data is available to everyone. We can find it on government websites, from public organizations, etc. Anyone can access this data; we do not need any special permissions or approval.

We can get public data on the following sites.

The very first step of EDA is Data Sourcing, and we have seen how we can access data and load it into our system. The next step is to clean the data.

2. Data Cleaning

After completing the Data Sourcing, the next step in the process of EDA is Data Cleaning. It is very important to get rid of the irregularities and clean the data after sourcing it into our system.

Irregularities in data are of different types.

  • Missing Values
  • Incorrect Format
  • Incorrect Headers
  • Anomalies/Outliers

To perform the data cleaning we are using a sample data set, which can be found here.

We are using Jupyter Notebook for analysis.

First, let’s import the necessary libraries and store the data in our system for analysis.

#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Read the data set of "Marketing Analysis" in data.
data= pd.read_csv("marketing_analysis.csv")

# Printing the data
data

Now, the data set looks like this,

If we observe the above dataset, there are some discrepancies in the Column header for the first 2 rows. The correct data is from the index number 1. So, we have to fix the first two rows.

This is called Fixing the Rows and Columns. Let’s ignore the first two rows and load the data again.

#import the useful libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Read the file in data without first two rows as it is of no use.
data = pd.read_csv("marketing_analysis.csv",skiprows = 2)

#print the head of the data frame.
data.head()

Now, the dataset looks like this, and it makes more sense.

Dataset after fixing the rows and columns

Following are the steps to be taken while Fixing Rows and Columns:

  1. Delete Summary Rows and Columns in the Dataset.
  2. Delete Header and Footer Rows on every page.
  3. Delete Extra Rows like blank rows, page numbers, etc.
  4. We can merge different columns if it makes for better understanding of the data
  5. Similarly, we can also split one column into multiple columns based on our requirements or understanding.
  6. Add Column names, it is very important to have column names to the dataset.

Now if we observe the above dataset, the customerid column is of no importance to our analysis, and the jobedu column holds both job and education information.

So, what we’ll do is, we’ll drop the customerid column and we’ll split the jobedu column into two other columns job and education and after that, we’ll drop the jobedu column as well.

# Drop the customer id as it is of no use.
data.drop('customerid', axis = 1, inplace = True)

# Extract job and education into new columns from the "jobedu" column.
data['job']= data["jobedu"].apply(lambda x: x.split(",")[0])
data['education']= data["jobedu"].apply(lambda x: x.split(",")[1])

# Drop the "jobedu" column from the dataframe.
data.drop('jobedu', axis = 1, inplace = True)

# Printing the Dataset
data

Now, the dataset looks like this,

Dropping Customerid and jobedu columns and adding job and education columns

Missing Values

If there are missing values in the Dataset before doing any statistical analysis, we need to handle those missing values.

There are mainly three types of missing values.

  1. MCAR(Missing completely at random): These values do not depend on any other features.
  2. MAR(Missing at random): These values may be dependent on some other features.
  3. MNAR(Missing not at random): These missing values have some reason for why they are missing.

Let’s see which columns have missing values in the dataset.

# Checking the missing values
data.isnull().sum()

The output will be,

As we can see three columns contain missing values. Let’s see how to handle the missing values. We can handle missing values by dropping the missing records or by imputing the values.

Drop the missing Values

Let’s handle missing values in the age column.

# Dropping the records with age missing in data dataframe.
data = data[~data.age.isnull()].copy()

# Checking the missing values in the dataset.
data.isnull().sum()

Let’s check the missing values in the dataset now.

Let’s impute values to the missing values for the month column.

Since the month column is of an object type, let’s calculate the mode of that column and impute those values to the missing values.

# Find the mode of month in data
month_mode = data.month.mode()[0]

# Fill the missing values with mode value of month in data.
data['month'] = data['month'].fillna(month_mode)

# Let's see the null values in the month column.
data.month.isnull().sum()

Now output is,

# Mode of month is
'may, 2017'
# Null values in month column after imputing with mode
0

Next, let’s handle the missing values in the Response column. Since Response is our target column, imputing values into it would affect our analysis. So, it is better to drop the records with missing Response values.

#drop the records with response missing in data.
data = data[~data.response.isnull()].copy()
# Calculate the missing values in each column of data frame
data.isnull().sum()

Let’s check whether the missing values in the dataset have been handled or not,

All the missing values have been handled

We can also fill the missing values with ‘NaN’ so that they do not affect the outcome of any statistical analysis.
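A small sketch of why NaN is a safer marker than an arbitrary placeholder: pandas skips NaN by default when computing statistics such as the mean, whereas a stand-in value like -1 is treated as real data. The ages below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical ages; the third value is missing.
ages_with_placeholder = pd.Series([30, 40, -1, 50])   # missing value stored as -1
ages_with_nan = pd.Series([30, 40, np.nan, 50])       # missing value stored as NaN

print(ages_with_placeholder.mean())  # the placeholder drags the mean down
print(ages_with_nan.mean())          # NaN is skipped by default (skipna=True)
print(ages_with_nan.count())         # counts only the non-missing values
```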

Handling Outliers

We have seen how to fix missing values, now let’s see how to handle outliers in the dataset.

Outliers are the values that are far beyond the next nearest data points.

There are two types of outliers:

  1. Univariate outliers: Univariate outliers are the data points whose values lie beyond the range of expected values based on one variable.
  2. Multivariate outliers: While plotting data, some values of one variable may not lie beyond the expected range, but when you plot the data with some other variable, these values may lie far from the expected value.

So, after understanding the causes of these outliers, we can handle them by dropping those records or imputing with the values or leaving them as is, if it makes more sense.
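One common way to flag univariate outliers before deciding what to do with them is the 1.5 × IQR rule. The article does not prescribe a specific rule, so treat this as one possible sketch; the helper `iqr_outliers` and the sample ages are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical ages with one implausible entry
ages = [31, 33, 32, 34, 33, 32, 150]
print(iqr_outliers(ages))
```

Whether a flagged record is dropped, imputed, or kept still depends on understanding why it is extreme.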

Standardizing Values

To perform data analysis on a set of values, we have to make sure the values in the same column are on the same scale. For example, if the data contains the top speeds of different companies’ cars, then the whole column should be in either meters/sec or miles/sec.
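As a rough illustration of bringing a column onto one scale, the sketch below converts speeds recorded in mixed units to meters per second. The records and the helper `standardize_speed` are hypothetical; only the conversion factors are fixed by definition.

```python
# Conversion factors to meters per second.
TO_METERS_PER_SECOND = {
    "m/s": 1.0,
    "km/h": 1000 / 3600,
    "mph": 1609.344 / 3600,
}

def standardize_speed(value, unit):
    """Convert a speed in the given unit to meters per second."""
    return value * TO_METERS_PER_SECOND[unit]

# Hypothetical top speeds recorded in mixed units
records = [(250, "km/h"), (155, "mph"), (70, "m/s")]
standardized = [round(standardize_speed(v, u), 2) for v, u in records]
print(standardized)
```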

Now, that we are clear on how to source and clean the data, let’s see how we can analyze the data.

3. Univariate Analysis

If we analyze data over a single variable/column from a dataset, it is known as Univariate Analysis.

Categorical Unordered Univariate Analysis:

An unordered variable is a categorical variable that has no defined order. If we take our data as an example, the job column in the dataset is divided into many sub-categories like technician, blue-collar, services, management, etc. There is no weight or measure given to any value in the ‘job’ column.

Now, let’s analyze the job category by using plots. Since Job is a category, we will plot the bar plot.

# Let's calculate the percentage of each job status category.
data.job.value_counts(normalize=True)

#plot the bar graph of percentage job categories
data.job.value_counts(normalize=True).plot.barh()
plt.show()

The output looks like this,

From the above bar plot, we can infer that the data set contains more blue-collar workers than any other category.

Categorical Ordered Univariate Analysis:

Ordered variables are those variables that have a natural rank of order. Some examples of categorical ordered variables from our dataset are:

  • Month: Jan, Feb, March……
  • Education: Primary, Secondary,……

Now, let’s analyze the Education variable from the dataset. Since we’ve already seen a bar plot, let’s see what a pie chart looks like.

#calculate the percentage of each education category.
data.education.value_counts(normalize=True)

#plot the pie chart of education categories
data.education.value_counts(normalize=True).plot.pie()
plt.show()

The output will be,

From the above analysis, we can infer that most customers in the data set have secondary education, followed by tertiary and then primary. A very small percentage have an unknown education level.

This is how we perform univariate analysis on categorical variables. If the column or variable is numerical, we analyze it by calculating its mean, median, standard deviation, etc. We can get those values by using the describe function.

data.salary.describe()

The output will be,

4. Bivariate Analysis

If we analyze data by taking two variables/columns into consideration from a dataset, it is known as Bivariate Analysis.

a) Numeric-Numeric Analysis:

Analyzing the two numeric variables from a dataset is known as numeric-numeric analysis. We can analyze it in three different ways.

  • Scatter Plot
  • Pair Plot
  • Correlation Matrix

Scatter Plot

Let’s take three columns, ‘Balance’, ‘Age’ and ‘Salary’, from our dataset and see what we can infer from scatter plots of salary against balance and age against balance.

#plot the scatter plot of balance and salary variable in data
plt.scatter(data.salary,data.balance)
plt.show()

#plot the scatter plot of balance and age variable in data
data.plot.scatter(x="age",y="balance")
plt.show()

Now, the scatter plots look like this,

Pair Plot

Now, let’s plot Pair Plots for the three columns we used in plotting Scatter plots. We’ll use the seaborn library for plotting Pair Plots.

#plot the pair plot of salary, balance and age in data dataframe.
sns.pairplot(data = data, vars=['salary','balance','age'])
plt.show()

The Pair Plot looks like this,

Correlation Matrix

Since we cannot use more than two variables as x-axis and y-axis in Scatter and Pair Plots, it is difficult to see the relation between three numerical variables in a single graph. In those cases, we’ll use the correlation matrix.

# Creating a matrix using age, salary, balance as rows and columns
data[['age','salary','balance']].corr()

#plot the correlation matrix of salary, balance and age in data dataframe.
sns.heatmap(data[['age','salary','balance']].corr(), annot=True, cmap = 'Reds')
plt.show()

First, we created a matrix using age, salary, and balance. After that, we are plotting the heatmap using the seaborn library of the matrix.
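Each cell of that matrix is a correlation coefficient between two columns; by default, pandas' `corr()` uses the Pearson coefficient. The sketch below computes it by hand for made-up values, so the helper `pearson` and the numbers are illustrative only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values
age = [25, 35, 45, 55]
salary = [20000, 40000, 60000, 80000]   # rises linearly with age
balance = [900, 100, 700, 300]          # no clear linear trend with age

print(round(pearson(age, salary), 3))
print(round(pearson(age, balance), 3))
```

Values near 1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one, which is what the heatmap colors encode.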

b) Numeric - Categorical Analysis

Analyzing the one numeric variable and one categorical variable from a dataset is known as numeric-categorical analysis. We analyze them mainly using mean, median, and box plots.

Let’s take salary and response columns from our dataset.

First check for mean value using groupby

#groupby the response to find the mean of the salary with response no & yes separately.
data.groupby('response')['salary'].mean()

The output will be,

There is not much of a difference between the yes and no response based on the salary.

Let’s calculate the median,

#groupby the response to find the median of the salary with response no & yes separately.
data.groupby('response')['salary'].median()

The output will be,

Judging by both the mean and the median, the yes and no responses look the same irrespective of a person’s salary. But is that truly the case? Let’s plot the box plot for them and check the behavior.

#plot the box plot of salary for yes & no responses.
sns.boxplot(x=data.response, y=data.salary)
plt.show()

The box plot looks like this,

As we can see, when we plot the Box Plot, it paints a very different picture compared to mean and median. The IQR for customers who gave a positive response is on the higher salary side.

This is how we analyze Numeric-Categorical variables, we use mean, median, and Box Plots to draw some sort of conclusions.
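To see how two groups can share the same mean and median yet differ in spread, which is the part a box plot makes visible, here is a small sketch with made-up salaries. The records and variable names are hypothetical and are not taken from the marketing data set.

```python
import statistics
from collections import defaultdict

# Hypothetical (response, salary) records
records = [("no", 20000), ("no", 60000), ("no", 100000), ("no", 60000),
           ("yes", 55000), ("yes", 60000), ("yes", 65000), ("yes", 60000)]

by_response = defaultdict(list)
for response, salary in records:
    by_response[response].append(salary)

summary = {}
for response, salaries in by_response.items():
    # quartiles are what the edges of the box in a box plot represent
    q1, median, q3 = statistics.quantiles(salaries, n=4, method="inclusive")
    summary[response] = {"mean": statistics.mean(salaries), "median": median,
                         "q1": q1, "q3": q3}
    print(response, summary[response])
```

Both groups have the same mean and median here, but the interquartile range (q3 minus q1) is much wider for the "no" group.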

c) Categorical — Categorical Analysis

Since our target variable/column is the Response rate, we’ll see how the different categories like Education, Marital Status, etc., are associated with the Response column. So instead of ‘Yes’ and ‘No’ we will convert them into ‘1’ and ‘0’, by doing that we’ll get the “Response Rate”.

#create response_rate of numerical data type where response "yes"= 1, "no"= 0
data['response_rate'] = np.where(data.response=='yes',1,0)
data.response_rate.value_counts()

The output looks like this,

Let’s see how the response rate varies for different categories in marital status.

#plot the bar graph of marital status with average value of response_rate
data.groupby('marital')['response_rate'].mean().plot.bar()
plt.show()

The graph looks like this,

By the above graph, we can infer that the positive response is more for Single status members in the data set. Similarly, we can plot the graphs for Loan vs Response rate, Housing Loans vs Response rate, etc.

5. Multivariate Analysis

If we analyze data by taking more than two variables/columns into consideration from a dataset, it is known as Multivariate Analysis.

Let’s see how ‘Education’, ‘Marital’, and ‘Response_rate’ vary with each other.

First, we’ll create a pivot table with the three columns and after that, we’ll create a heatmap.

result = pd.pivot_table(data=data, index='education', columns='marital',values='response_rate')
print(result)

#create heat map of education vs marital vs response_rate
sns.heatmap(result, annot=True, cmap = 'RdYlGn', center=0.117)
plt.show()

The pivot table and heatmap look like this,

Based on the heatmap, we can infer that married people with primary education are less likely to respond positively to the survey, and single people with tertiary education are most likely to respond positively.

Similarly, we can plot the graphs for Job vs marital vs response, Education vs poutcome vs response, etc.
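For readers unsure what the pivot table holds, the sketch below builds one from a tiny made-up frame and checks it against an equivalent groupby. `pivot_table` averages the values by default, so each cell is the mean of `response_rate` for that education and marital combination; the frame `toy` is hypothetical.

```python
import pandas as pd

# Hypothetical records; response_rate is 1 for "yes" and 0 for "no"
toy = pd.DataFrame({
    "education": ["primary", "primary", "tertiary", "tertiary", "tertiary"],
    "marital": ["married", "married", "single", "single", "married"],
    "response_rate": [0, 1, 1, 1, 0],
})

# aggfunc defaults to the mean, so each cell is a response rate
pivot = pd.pivot_table(data=toy, index="education", columns="marital",
                       values="response_rate")

# The same table built with groupby + unstack
grouped = toy.groupby(["education", "marital"])["response_rate"].mean().unstack()

print(pivot)
```

Combinations that never occur in the data, such as primary and single here, show up as NaN.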

Conclusion

This is how we’ll do Exploratory Data Analysis. Exploratory Data Analysis (EDA) helps us to look beyond the data. The more we explore the data, the more the insights we draw from it. As a data analyst, almost 80% of our time will be spent understanding data and solving various business problems through EDA.

Thank you for reading and Happy Coding!!!

#dataanalysis #python

sophia tondon


1621250665

Top React JS Development Company | React JS Development Services

Looking to hire dedicated top Reactjs developers at affordable prices? Our Reactjs developers, with 5+ years of experience on average, are proficient in delivering the most complex and challenging web apps.

Hire ReactJS developers online on a monthly, hourly, or full-time basis who are highly skilled and efficient in implementing new technologies and turning them into business-driven applications while saving you up to 60% in costs.

Planning to outsource React web development services from India using Reactjs? Or would you like to hire a team of Reactjs developers? Get in touch for a free quote!

#hire react js developer #react.js developer #react.js developers #hire reactjs development company #react js development india #react js developer

Dylan  Iqbal


1561523460

Matplotlib Cheat Sheet: Plotting in Python

This Matplotlib cheat sheet introduces you to the basics that you need to plot your data with Python and includes code samples.

Data visualization and storytelling with your data are essential skills that every data scientist needs to communicate insights gained from analyses effectively to any audience out there. 

For most beginners, the first package that they use to get in touch with data visualization and storytelling is, naturally, Matplotlib: it is a Python 2D plotting library that enables users to make publication-quality figures. But, what might be even more convincing is the fact that other packages, such as Pandas, intend to build more plotting integration with Matplotlib as time goes on.

However, what might slow down beginners is the fact that this package is pretty extensive. There is so much that you can do with it and it might be hard to still keep a structure when you're learning how to work with Matplotlib.   

DataCamp has created a Matplotlib cheat sheet for those who might already know how to use the package to their advantage to make beautiful plots in Python, but who still want to keep a one-page reference handy. Of course, for those who don't know how to work with Matplotlib, this might be the extra push to be convinced and finally get started with data visualization in Python.

You'll see that this cheat sheet presents you with the six basic steps that you can go through to make beautiful plots. 

Check out the infographic by clicking on the button below:

Python Matplotlib cheat sheet

With this handy reference, you'll familiarize yourself in no time with the basics of Matplotlib: you'll learn how you can prepare your data, create a new plot, use some basic plotting routines to your advantage, add customizations to your plots, and save, show and close the plots that you make.

What might have looked difficult before will definitely be clearer once you start using this cheat sheet! Use it in combination with the Matplotlib Gallery and the documentation.

Matplotlib 

Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.

Prepare the Data 

1D Data 

>>> import numpy as np
>>> x = np.linspace(0, 10, 100)
>>> y = np.cos(x)
>>> z = np.sin(x)

2D Data or Images 

>>> data = 2 * np.random.random((10, 10))
>>> data2 = 3 * np.random.random((10, 10))
>>> Y, X = np.mgrid[-3:3:100j, -3:3:100j]
>>> U = -1 - X**2 + Y
>>> V = 1 + X - Y**2
>>> from matplotlib.cbook import get_sample_data
>>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy'))

Create Plot

>>> import matplotlib.pyplot as plt

Figure 

>>> fig = plt.figure()
>>> fig2 = plt.figure(figsize=plt.figaspect(2.0))

Axes 

>>> fig.add_axes()
>>> ax1 = fig.add_subplot(221) #row-col-num
>>> ax3 = fig.add_subplot(212)
>>> fig3, axes = plt.subplots(nrows=2,ncols=2)
>>> fig4, axes2 = plt.subplots(ncols=3)
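The subplot codes and the arrays returned by `plt.subplots` are easy to mix up, so here is a short script-style sketch of how they are indexed. The `Agg` backend is chosen only so the example runs without a display; it is not part of the cheat sheet.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(221)   # 2x2 grid, slot 1: top-left
ax3 = fig.add_subplot(212)   # 2x1 grid, slot 2: the whole bottom half

fig3, axes = plt.subplots(nrows=2, ncols=2)
print(axes.shape)            # a 2-D array, indexed as axes[row, col]

fig4, axes2 = plt.subplots(ncols=3)
print(axes2.shape)           # a single row gives a 1-D array, indexed as axes2[col]
```

This is why later snippets address `axes[0,0]` with two indices but `axes2[0]` with one.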

Save Plot 

>>> plt.savefig('foo.png') #Save figures
>>> plt.savefig('foo.png',  transparent=True) #Save transparent figures

Show Plot

>>> plt.show()

Plotting Routines 

1D Data 

>>> fig, ax = plt.subplots()
>>> lines = ax.plot(x,y) #Draw points with lines or markers connecting them
>>> ax.scatter(x,y) #Draw unconnected points, scaled or colored
>>> axes[0,0].bar([1,2,3],[3,4,5]) #Plot vertical rectangles (constant width)
>>> axes[1,0].barh([0.5,1,2.5],[0,1,2]) #Plot horizontal rectangles (constant height)
>>> axes[1,1].axhline(0.45) #Draw a horizontal line across axes
>>> axes[0,1].axvline(0.65) #Draw a vertical line across axes
>>> ax.fill(x,y,color='blue') #Draw filled polygons
>>> ax.fill_between(x,y,color='yellow') #Fill between y values and 0

2D Data 

>>> fig, ax = plt.subplots()
>>> im = ax.imshow(img, #Colormapped or RGB arrays
      cmap= 'gist_earth', 
      interpolation= 'nearest',
      vmin=-2,
      vmax=2)
>>> axes2[0].pcolor(data2) #Pseudocolor plot of 2D array
>>> axes2[0].pcolormesh(data) #Pseudocolor plot of 2D array
>>> CS = plt.contour(Y,X,U) #Plot contours
>>> axes2[2].contourf(data) #Plot filled contours
>>> axes2[2]= ax.clabel(CS) #Label a contour plot

Vector Fields 

>>> axes[0,1].arrow(0,0,0.5,0.5) #Add an arrow to the axes
>>> axes[1,1].quiver(y,z) #Plot a 2D field of arrows
>>> axes[0,1].streamplot(X,Y,U,V) #Plot streamlines of a 2D vector field

Data Distributions 

>>> ax1.hist(y) #Plot a histogram
>>> ax3.boxplot(y) #Make a box and whisker plot
>>> ax3.violinplot(z)  #Make a violin plot

Plot Anatomy & Workflow 

Plot Anatomy 

[Figure: anatomy of a plot, with the x-axis and y-axis labeled]

Workflow 

The basic steps to creating plots with matplotlib are:

1 Prepare Data
2 Create Plot
3 Plot
4 Customize Plot
5 Save Plot
6 Show Plot

>>> import matplotlib.pyplot as plt
>>> x = [1,2,3,4]  #Step 1
>>> y = [10,20,25,30] 
>>> fig = plt.figure() #Step 2
>>> ax = fig.add_subplot(111) #Step 3
>>> ax.plot(x, y, color= 'lightblue', linewidth=3)  #Step 3, 4
>>> ax.scatter([2,4,6],
          [5,15,25],
          color= 'darkgreen',
          marker= '^' )
>>> ax.set_xlim(1, 6.5)
>>> plt.savefig('foo.png' ) #Step 5
>>> plt.show() #Step 6

Close and Clear 

>>> plt.cla()  #Clear an axis
>>> plt.clf() #Clear the entire figure
>>> plt.close() #Close a window

Plotting Customize Plot 

Colors, Color Bars & Color Maps 

>>> plt.plot(x, x, x, x**2, x, x** 3)
>>> ax.plot(x, y, alpha = 0.4)
>>> ax.plot(x, y, c= 'k')
>>> fig.colorbar(im, orientation= 'horizontal')
>>> im = ax.imshow(img,
            cmap= 'seismic' )

Markers 

>>> fig, ax = plt.subplots()
>>> ax.scatter(x,y,marker= ".")
>>> ax.plot(x,y,marker= "o")

Linestyles 

>>> plt.plot(x,y,linewidth=4.0)
>>> plt.plot(x,y,ls= 'solid') 
>>> plt.plot(x,y,ls= '--') 
>>> plt.plot(x,y,'--' ,x**2,y**2,'-.' ) 
>>> plt.setp(lines,color= 'r',linewidth=4.0)

Text & Annotations 

>>> ax.text(1,
           -2.1, 
           'Example Graph', 
            style= 'italic' )
>>> ax.annotate("Sine", 
xy=(8, 0),
xycoords= 'data', 
xytext=(10.5, 0),
textcoords= 'data', 
arrowprops=dict(arrowstyle= "->", 
connectionstyle="arc3"),)

Mathtext 

>>> plt.title(r'$\sigma_i=15$', fontsize=20)

Limits, Legends and Layouts 

Limits & Autoscaling 

>>> ax.margins(x=0.0,y=0.1) #Add padding to a plot
>>> ax.axis('equal')  #Set the aspect ratio of the plot to 1
>>> ax.set(xlim=[0,10.5],ylim=[-1.5,1.5])  #Set limits for x-and y-axis
>>> ax.set_xlim(0,10.5) #Set limits for x-axis

Legends 

>>> ax.set(title= 'An Example Axes',  #Set a title and x-and y-axis labels
            ylabel= 'Y-Axis', 
            xlabel= 'X-Axis')
>>> ax.legend(loc= 'best')  #No overlapping plot elements

Ticks 

>>> ax.xaxis.set(ticks=range(1,5),  #Manually set x-ticks
             ticklabels=[3,100, 12,"foo" ])
>>> ax.tick_params(axis= 'y', #Make y-ticks longer and go in and out
             direction= 'inout', 
              length=10)

Subplot Spacing 

>>> fig3.subplots_adjust(wspace=0.5,   #Adjust the spacing between subplots
             hspace=0.3,
             left=0.125,
             right=0.9,
             top=0.9,
             bottom=0.1)
>>> fig.tight_layout() #Fit subplot(s) in to the figure area

Axis Spines 

>>> ax1.spines[ 'top'].set_visible(False) #Make the top axis line for a plot invisible
>>> ax1.spines['bottom' ].set_position(( 'outward',10))  #Move the bottom axis line outward

Have this Cheat Sheet at your fingertips

Original article source at https://www.datacamp.com

#matplotlib #cheatsheet #python