Fern  Wisozk

Fern Wisozk

1628237700

10 Tips and Tricks That Will Make You a Better ReactJS Dev

Knowledge you can immediately use to improve your React game to help you become better React devs, write better code or excel at coding interviews.

  1. Use React Hooks In Functional Components
  2. Use The React Context API For Passing Props
  3. Styled-Components
  4. React Fragments
  5. Use Error Boundaries
  6. React and Typescript
  7. Jest and Enzyme for Testing
  8. Conditionals in JSX
  9. Higher-Order Components
  10. React DevTools
  11. Bonus: Must-have VS Code Extensions for React Devs


#reactjs 
 

What is GEEK

Buddha Community

10 Tips and Tricks That Will Make You a Better ReactJS Dev
Annie  Emard

Annie Emard

1653075360

HAML Lint: Tool For Writing Clean and Consistent HAML

HAML-Lint

haml-lint is a tool to help keep your HAML files clean and readable. In addition to HAML-specific style and lint checks, it integrates with RuboCop to bring its powerful static analysis tools to your HAML documents.

You can run haml-lint manually from the command line, or integrate it into your SCM hooks.

Requirements

  • Ruby 2.4+
  • HAML 4.0+

Installation

gem install haml_lint

If you'd rather install haml-lint using bundler, don't require it in your Gemfile:

gem 'haml_lint', require: false

Then you can still use haml-lint from the command line, but its source code won't be auto-loaded inside your application.

Usage

Run haml-lint from the command line by passing in a directory (or multiple directories) to recursively scan:

haml-lint app/views/

You can also specify a list of files explicitly:

haml-lint app/**/*.html.haml

haml-lint will output any problems with your HAML, including the offending filename and line number.

File Encoding

haml-lint assumes all files are encoded in UTF-8.

Command Line Flags

Command Line FlagDescription
--auto-gen-configGenerate a configuration file acting as a TODO list
--auto-gen-exclude-limitNumber of failures to allow in the TODO list before the entire rule is excluded
-c/--configSpecify which configuration file to use
-e/--excludeExclude one or more files from being linted
-i/--include-linterSpecify which linters you specifically want to run
-x/--exclude-linterSpecify which linters you don't want to run
-r/--reporterSpecify which reporter you want to use to generate the output
-p/--parallelRun linters in parallel using available CPUs
--fail-fastSpecify whether to fail after the first file with lint
--fail-levelSpecify the minimum severity (warning or error) for which the lint should fail
--[no-]colorWhether to output in color
--[no-]summaryWhether to output a summary in the default reporter
--show-lintersShow all registered linters
--show-reportersDisplay available reporters
-h/--helpShow command line flag documentation
-v/--versionShow haml-lint version
-V/--verbose-versionShow haml-lint, haml, and ruby version information

Configuration

haml-lint will automatically recognize and load any file with the name .haml-lint.yml as a configuration file. It loads the configuration based on the directory haml-lint is being run from, ascending until a configuration file is found. Any configuration loaded is automatically merged with the default configuration (see config/default.yml).

Here's an example configuration file:

linters:
  ImplicitDiv:
    enabled: false
    severity: error

  LineLength:
    max: 100

All linters have an enabled option which can be true or false, which controls whether the linter is run, along with linter-specific options. The defaults are defined in config/default.yml.

Linter Options

OptionDescription
enabledIf false, this linter will never be run. This takes precedence over any other option.
includeList of files or glob patterns to scope this linter to. This narrows down any files specified via the command line.
excludeList of files or glob patterns to exclude from this linter. This excludes any files specified via the command line or already filtered via the include option.
severityThe severity of the linter. External tools consuming haml-lint output can use this to determine whether to warn or error based on the lints reported.

Global File Exclusion

The exclude global configuration option allows you to specify a list of files or glob patterns to exclude from all linters. This is useful for ignoring third-party code that you don't maintain or care to lint. You can specify a single string or a list of strings for this option.

Skipping Frontmatter

Some static blog generators such as Jekyll include leading frontmatter to the template for their own tracking purposes. haml-lint allows you to ignore these headers by specifying the skip_frontmatter option in your .haml-lint.yml configuration:

skip_frontmatter: true

Inheriting from Other Configuration Files

The inherits_from global configuration option allows you to specify an inheritance chain for a configuration file. It accepts either a scalar value of a single file name or a vector of multiple files to inherit from. The inherited files are resolved in a first in, first out order and with "last one wins" precedence. For example:

inherits_from:
  - .shared_haml-lint.yml
  - .personal_haml-lint.yml

First, the default configuration is loaded. Then the .shared_haml-lint.yml configuration is loaded, followed by .personal_haml-lint.yml. Each of these overwrite each other in the event of a collision in configuration value. Once the inheritance chain is resolved, the base configuration is loaded and applies its rules to overwrite any in the intermediate configuration.

Lastly, in order to match your RuboCop configuration style, you can also use the inherit_from directive, which is an alias for inherits_from.

Linters

» Linters Documentation

haml-lint is an opinionated tool that helps you enforce a consistent style in your HAML files. As an opinionated tool, we've had to make calls about what we think are the "best" style conventions, even when there are often reasonable arguments for more than one possible style. While all of our choices have a rational basis, we think that the opinions themselves are less important than the fact that haml-lint provides us with an automated and low-cost means of enforcing consistency.

Custom Linters

Add the following to your configuration file:

require:
  - './relative/path/to/my_first_linter.rb'
  - 'absolute/path/to/my_second_linter.rb'

The files that are referenced by this config should have the following structure:

module HamlLint
  # MyFirstLinter is the name of the linter in this example, but it can be anything
  class Linter::MyFirstLinter < Linter
    include LinterRegistry

    def visit_tag
      return unless node.tag_name == 'div'
      record_lint(node, "You're not allowed divs!")
    end
  end
end

For more information on the different types on HAML node, please look through the HAML parser code: https://github.com/haml/haml/blob/master/lib/haml/parser.rb

Keep in mind that by default your linter will be disabled by default. So you will need to enable it in your configuration file to have it run.

Disabling Linters within Source Code

One or more individual linters can be disabled locally in a file by adding a directive comment. These comments look like the following:

-# haml-lint:disable AltText, LineLength
[...]
-# haml-lint:enable AltText, LineLength

You can disable all linters for a section with the following:

-# haml-lint:disable all

Directive Scope

A directive will disable the given linters for the scope of the block. This scope is inherited by child elements and sibling elements that come after the comment. For example:

-# haml-lint:disable AltText
#content
  %img#will-not-show-lint-1{ src: "will-not-show-lint-1.png" }
  -# haml-lint:enable AltText
  %img#will-show-lint-1{ src: "will-show-lint-1.png" }
  .sidebar
    %img#will-show-lint-2{ src: "will-show-lint-2.png" }
%img#will-not-show-lint-2{ src: "will-not-show-lint-2.png" }

The #will-not-show-lint-1 image on line 2 will not raise an AltText lint because of the directive on line 1. Since that directive is at the top level of the tree, it applies everywhere.

However, on line 4, the directive enables the AltText linter for the remainder of the #content element's content. This means that the #will-show-lint-1 image on line 5 will raise an AltText lint because it is a sibling of the enabling directive that appears later in the #content element. Likewise, the #will-show-lint-2 image on line 7 will raise an AltText lint because it is a child of a sibling of the enabling directive.

Lastly, the #will-not-show-lint-2 image on line 8 will not raise an AltText lint because the enabling directive on line 4 exists in a separate element and is not a sibling of the it.

Directive Precedence

If there are multiple directives for the same linter in an element, the last directive wins. For example:

-# haml-lint:enable AltText
%p Hello, world!
-# haml-lint:disable AltText
%img#will-not-show-lint{ src: "will-not-show-lint.png" }

There are two conflicting directives for the AltText linter. The first one enables it, but the second one disables it. Since the disable directive came later, the #will-not-show-lint element will not raise an AltText lint.

You can use this functionality to selectively enable directives within a file by first using the haml-lint:disable all directive to disable all linters in the file, then selectively using haml-lint:enable to enable linters one at a time.

Onboarding Onto a Preexisting Project

Adding a new linter into a project that wasn't previously using one can be a daunting task. To help ease the pain of starting to use Haml-Lint, you can generate a configuration file that will exclude all linters from reporting lint in files that currently have lint. This gives you something similar to a to-do list where the violations that you had when you started using Haml-Lint are listed for you to whittle away, but ensuring that any views you create going forward are properly linted.

To use this functionality, call Haml-Lint like:

haml-lint --auto-gen-config

This will generate a .haml-lint_todo.yml file that contains all existing lint as exclusions. You can then add inherits_from: .haml-lint_todo.yml to your .haml-lint.yml configuration file to ensure these exclusions are used whenever you call haml-lint.

By default, any rules with more than 15 violations will be disabled in the todo-file. You can increase this limit with the auto-gen-exclude-limit option:

haml-lint --auto-gen-config --auto-gen-exclude-limit 100

Editor Integration

Vim

If you use vim, you can have haml-lint automatically run against your HAML files after saving by using the Syntastic plugin. If you already have the plugin, just add let g:syntastic_haml_checkers = ['haml_lint'] to your .vimrc.

Vim 8 / Neovim

If you use vim 8+ or Neovim, you can have haml-lint automatically run against your HAML files as you type by using the Asynchronous Lint Engine (ALE) plugin. ALE will automatically lint your HAML files if it detects haml-lint in your PATH.

Sublime Text 3

If you use SublimeLinter 3 with Sublime Text 3 you can install the SublimeLinter-haml-lint plugin using Package Control.

Atom

If you use atom, you can install the linter-haml plugin.

TextMate 2

If you use TextMate 2, you can install the Haml-Lint.tmbundle bundle.

Visual Studio Code

If you use Visual Studio Code, you can install the Haml Lint extension

Git Integration

If you'd like to integrate haml-lint into your Git workflow, check out our Git hook manager, overcommit.

Rake Integration

To execute haml-lint via a Rake task, make sure you have rake included in your gem path (e.g. via Gemfile) add the following to your Rakefile:

require 'haml_lint/rake_task'

HamlLint::RakeTask.new

By default, when you execute rake haml_lint, the above configuration is equivalent to running haml-lint ., which will lint all .haml files in the current directory and its descendants.

You can customize your task by writing:

require 'haml_lint/rake_task'

HamlLint::RakeTask.new do |t|
  t.config = 'custom/config.yml'
  t.files = ['app/views', 'custom/*.haml']
  t.quiet = true # Don't display output from haml-lint to STDOUT
end

You can also use this custom configuration with a set of files specified via the command line:

# Single quotes prevent shell glob expansion
rake 'haml_lint[app/views, custom/*.haml]'

Files specified in this manner take precedence over the task's files attribute.

Documentation

Code documentation is generated with YARD and hosted by RubyDoc.info.

Contributing

We love getting feedback with or without pull requests. If you do add a new feature, please add tests so that we can avoid breaking it in the future.

Speaking of tests, we use Appraisal to test against both HAML 4 and 5. We use rspec to write our tests. To run the test suite, execute the following from the root directory of the repository:

appraisal bundle install
appraisal bundle exec rspec

Community

All major discussion surrounding HAML-Lint happens on the GitHub issues page.

Changelog

If you're interested in seeing the changes and bug fixes between each version of haml-lint, read the HAML-Lint Changelog.

Author: sds
Source Code: https://github.com/sds/haml-lint
License: MIT license

#haml #lint 

Fern  Wisozk

Fern Wisozk

1628237700

10 Tips and Tricks That Will Make You a Better ReactJS Dev

Knowledge you can immediately use to improve your React game to help you become better React devs, write better code or excel at coding interviews.

  1. Use React Hooks In Functional Components
  2. Use The React Context API For Passing Props
  3. Styled-Components
  4. React Fragments
  5. Use Error Boundaries
  6. React and Typescript
  7. Jest and Enzyme for Testing
  8. Conditionals in JSX
  9. Higher-Order Components
  10. React DevTools
  11. Bonus: Must-have VS Code Extensions for React Devs


#reactjs 
 

How to Create Arrays in Python

In this tutorial, you'll know the basics of how to create arrays in Python using the array module. Learn how to use Python arrays. You'll see how to define them and the different methods commonly used for performing operations on them.

This tutorialvideo on 'Arrays in Python' will help you establish a strong hold on all the fundamentals in python programming language. Below are the topics covered in this video:  
1:15 What is an array?
2:53 Is python list same as an array?
3:48  How to create arrays in python?
7:19 Accessing array elements
9:59 Basic array operations
        - 10:33  Finding the length of an array
        - 11:44  Adding Elements
        - 15:06  Removing elements
        - 18:32  Array concatenation
       - 20:59  Slicing
       - 23:26  Looping  


Python Array Tutorial – Define, Index, Methods

In this article, you'll learn how to use Python arrays. You'll see how to define them and the different methods commonly used for performing operations on them.

The artcile covers arrays that you create by importing the array module. We won't cover NumPy arrays here.

Table of Contents

  1. Introduction to Arrays
    1. The differences between Lists and Arrays
    2. When to use arrays
  2. How to use arrays
    1. Define arrays
    2. Find the length of arrays
    3. Array indexing
    4. Search through arrays
    5. Loop through arrays
    6. Slice an array
  3. Array methods for performing operations
    1. Change an existing value
    2. Add a new value
    3. Remove a value
  4. Conclusion

Let's get started!

What are Python Arrays?

Arrays are a fundamental data structure, and an important part of most programming languages. In Python, they are containers which are able to store more than one item at the same time.

Specifically, they are an ordered collection of elements with every value being of the same data type. That is the most important thing to remember about Python arrays - the fact that they can only hold a sequence of multiple items that are of the same type.

What's the Difference between Python Lists and Python Arrays?

Lists are one of the most common data structures in Python, and a core part of the language.

Lists and arrays behave similarly.

Just like arrays, lists are an ordered sequence of elements.

They are also mutable and not fixed in size, which means they can grow and shrink throughout the life of the program. Items can be added and removed, making them very flexible to work with.

However, lists and arrays are not the same thing.

Lists store items that are of various data types. This means that a list can contain integers, floating point numbers, strings, or any other Python data type, at the same time. That is not the case with arrays.

As mentioned in the section above, arrays store only items that are of the same single data type. There are arrays that contain only integers, or only floating point numbers, or only any other Python data type you want to use.

When to Use Python Arrays

Lists are built into the Python programming language, whereas arrays aren't. Arrays are not a built-in data structure, and therefore need to be imported via the array module in order to be used.

Arrays of the array module are a thin wrapper over C arrays, and are useful when you want to work with homogeneous data.

They are also more compact and take up less memory and space which makes them more size efficient compared to lists.

If you want to perform mathematical calculations, then you should use NumPy arrays by importing the NumPy package. Besides that, you should just use Python arrays when you really need to, as lists work in a similar way and are more flexible to work with.

How to Use Arrays in Python

In order to create Python arrays, you'll first have to import the array module which contains all the necassary functions.

There are three ways you can import the array module:

  • By using import array at the top of the file. This includes the module array. You would then go on to create an array using array.array().
import array

#how you would create an array
array.array()
  • Instead of having to type array.array() all the time, you could use import array as arr at the top of the file, instead of import array alone. You would then create an array by typing arr.array(). The arr acts as an alias name, with the array constructor then immediately following it.
import array as arr

#how you would create an array
arr.array()
  • Lastly, you could also use from array import *, with * importing all the functionalities available. You would then create an array by writing the array() constructor alone.
from array import *

#how you would create an array
array()

How to Define Arrays in Python

Once you've imported the array module, you can then go on to define a Python array.

The general syntax for creating an array looks like this:

variable_name = array(typecode,[elements])

Let's break it down:

  • variable_name would be the name of the array.
  • The typecode specifies what kind of elements would be stored in the array. Whether it would be an array of integers, an array of floats or an array of any other Python data type. Remember that all elements should be of the same data type.
  • Inside square brackets you mention the elements that would be stored in the array, with each element being separated by a comma. You can also create an empty array by just writing variable_name = array(typecode) alone, without any elements.

Below is a typecode table, with the different typecodes that can be used with the different data types when defining Python arrays:

TYPECODEC TYPEPYTHON TYPESIZE
'b'signed charint1
'B'unsigned charint1
'u'wchar_tUnicode character2
'h'signed shortint2
'H'unsigned shortint2
'i'signed intint2
'I'unsigned intint2
'l'signed longint4
'L'unsigned longint4
'q'signed long longint8
'Q'unsigned long longint8
'f'floatfloat4
'd'doublefloat8

Tying everything together, here is an example of how you would define an array in Python:

import array as arr 

numbers = arr.array('i',[10,20,30])


print(numbers)

#output

#array('i', [10, 20, 30])

Let's break it down:

  • First we included the array module, in this case with import array as arr .
  • Then, we created a numbers array.
  • We used arr.array() because of import array as arr .
  • Inside the array() constructor, we first included i, for signed integer. Signed integer means that the array can include positive and negative values. Unsigned integer, with H for example, would mean that no negative values are allowed.
  • Lastly, we included the values to be stored in the array in square brackets.

Keep in mind that if you tried to include values that were not of i typecode, meaning they were not integer values, you would get an error:

import array as arr 

numbers = arr.array('i',[10.0,20,30])


print(numbers)

#output

#Traceback (most recent call last):
# File "/Users/dionysialemonaki/python_articles/demo.py", line 14, in <module>
#   numbers = arr.array('i',[10.0,20,30])
#TypeError: 'float' object cannot be interpreted as an integer

In the example above, I tried to include a floating point number in the array. I got an error because this is meant to be an integer array only.

Another way to create an array is the following:

from array import *

#an array of floating point values
numbers = array('d',[10.0,20.0,30.0])

print(numbers)

#output

#array('d', [10.0, 20.0, 30.0])

The example above imported the array module via from array import * and created an array numbers of float data type. This means that it holds only floating point numbers, which is specified with the 'd' typecode.

How to Find the Length of an Array in Python

To find out the exact number of elements contained in an array, use the built-in len() method.

It will return the integer number that is equal to the total number of elements in the array you specify.

import array as arr 

numbers = arr.array('i',[10,20,30])


print(len(numbers))

#output
# 3

In the example above, the array contained three elements – 10, 20, 30 – so the length of numbers is 3.

Array Indexing and How to Access Individual Items in an Array in Python

Each item in an array has a specific address. Individual items are accessed by referencing their index number.

Indexing in Python, and in all programming languages and computing in general, starts at 0. It is important to remember that counting starts at 0 and not at 1.

To access an element, you first write the name of the array followed by square brackets. Inside the square brackets you include the item's index number.

The general syntax would look something like this:

array_name[index_value_of_item]

Here is how you would access each individual element in an array:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers[0]) # gets the 1st element
print(numbers[1]) # gets the 2nd element
print(numbers[2]) # gets the 3rd element

#output

#10
#20
#30

Remember that the index value of the last element of an array is always one less than the length of the array. Where n is the length of the array, n - 1 will be the index value of the last item.

Note that you can also access each individual element using negative indexing.

With negative indexing, the last element would have an index of -1, the second to last element would have an index of -2, and so on.

Here is how you would get each item in an array using that method:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers[-1]) #gets last item
print(numbers[-2]) #gets second to last item
print(numbers[-3]) #gets first item
 
#output

#30
#20
#10

How to Search Through an Array in Python

You can find out an element's index number by using the index() method.

You pass the value of the element being searched as the argument to the method, and the element's index number is returned.

import array as arr 

numbers = arr.array('i',[10,20,30])

#search for the index of the value 10
print(numbers.index(10))

#output

#0

If there is more than one element with the same value, the index of the first instance of the value will be returned:

import array as arr 


numbers = arr.array('i',[10,20,30,10,20,30])

#search for the index of the value 10
#will return the index number of the first instance of the value 10
print(numbers.index(10))

#output

#0

How to Loop through an Array in Python

You've seen how to access each individual element in an array and print it out on its own.

You've also seen how to print the array, using the print() method. That method gives the following result:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers)

#output

#array('i', [10, 20, 30])

What if you want to print each value one by one?

This is where a loop comes in handy. You can loop through the array and print out each value, one-by-one, with each loop iteration.

For this you can use a simple for loop:

import array as arr 

numbers = arr.array('i',[10,20,30])

for number in numbers:
    print(number)
    
#output
#10
#20
#30

You could also use the range() function, and pass the len() method as its parameter. This would give the same result as above:

import array as arr  

values = arr.array('i',[10,20,30])

#prints each individual value in the array
for value in range(len(values)):
    print(values[value])

#output

#10
#20
#30

How to Slice an Array in Python

To access a specific range of values inside the array, use the slicing operator, which is a colon :.

When using the slicing operator and you only include one value, the counting starts from 0 by default. It gets the first item, and goes up to but not including the index number you specify.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#get the values 10 and 20 only
print(numbers[:2])  #first to second position

#output

#array('i', [10, 20])

When you pass two numbers as arguments, you specify a range of numbers. In this case, the counting starts at the position of the first number in the range, and up to but not including the second one:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])


#get the values 20 and 30 only
print(numbers[1:3]) #second to third position

#output

#rray('i', [20, 30])

Methods For Performing Operations on Arrays in Python

Arrays are mutable, which means they are changeable. You can change the value of the different items, add new ones, or remove any you don't want in your program anymore.

Let's see some of the most commonly used methods which are used for performing operations on arrays.

How to Change the Value of an Item in an Array

You can change the value of a specific element by speficying its position and assigning it a new value:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#change the first element
#change it from having a value of 10 to having a value of 40
numbers[0] = 40

print(numbers)

#output

#array('i', [40, 20, 30])

How to Add a New Value to an Array

To add one single value at the end of an array, use the append() method:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 to the end of numbers
numbers.append(40)

print(numbers)

#output

#array('i', [10, 20, 30, 40])

Be aware that the new item you add needs to be the same data type as the rest of the items in the array.

Look what happens when I try to add a float to an array of integers:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 to the end of numbers
numbers.append(40.0)

print(numbers)

#output

#Traceback (most recent call last):
#  File "/Users/dionysialemonaki/python_articles/demo.py", line 19, in <module>
#   numbers.append(40.0)
#TypeError: 'float' object cannot be interpreted as an integer

But what if you want to add more than one value to the end an array?

Use the extend() method, which takes an iterable (such as a list of items) as an argument. Again, make sure that the new items are all the same data type.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integers 40,50,60 to the end of numbers
#The numbers need to be enclosed in square brackets

numbers.extend([40,50,60])

print(numbers)

#output

#array('i', [10, 20, 30, 40, 50, 60])

And what if you don't want to add an item to the end of an array? Use the insert() method, to add an item at a specific position.

The insert() function takes two arguments: the index number of the position the new element will be inserted, and the value of the new element.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 in the first position
#remember indexing starts at 0

numbers.insert(0,40)

print(numbers)

#output

#array('i', [40, 10, 20, 30])

How to Remove a Value from an Array

To remove an element from an array, use the remove() method and include the value as an argument to the method.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

numbers.remove(10)

print(numbers)

#output

#array('i', [20, 30])

With remove(), only the first instance of the value you pass as an argument will be removed.

See what happens when there are more than one identical values:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30,10,20])

numbers.remove(10)

print(numbers)

#output

#array('i', [20, 30, 10, 20])

Only the first occurence of 10 is removed.

You can also use the pop() method, and specify the position of the element to be removed:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30,10,20])

#remove the first instance of 10
numbers.pop(0)

print(numbers)

#output

#array('i', [20, 30, 10, 20])

Conclusion

And there you have it - you now know the basics of how to create arrays in Python using the array module. Hopefully you found this guide helpful.

Thanks for reading and happy coding!

#python #programming 

Why ReactJS is better for Web Application Development?

Web Application Development is essential for a business in today’s digital era. Finding the right platform for Web Application Development is important for building an effective Web Application that can enhance the overall customer engagement. Here’s what makes ReactJS a better option for building your next Web Application.

#Why ReactJS is better for Web Application Development #Benefits of ReactJS #What is ReactJS? #ReactJS vs AngularJS

Elian  Harber

Elian Harber

1641430440

Bokeh Plotting Backend for Pandas and GeoPandas

Pandas-Bokeh provides a Bokeh plotting backend for Pandas, GeoPandas and Pyspark DataFrames, similar to the already existing Visualization feature of Pandas. Importing the library adds a complementary plotting method plot_bokeh() on DataFrames and Series.

With Pandas-Bokeh, creating stunning, interactive, HTML-based visualization is as easy as calling:

df.plot_bokeh()

Pandas-Bokeh also provides native support as a Pandas Plotting backend for Pandas >= 0.25. When Pandas-Bokeh is installed, switchting the default Pandas plotting backend to Bokeh can be done via:

pd.set_option('plotting.backend', 'pandas_bokeh')

More details about the new Pandas backend can be found below.


Interactive Documentation

Please visit:

https://patrikhlobil.github.io/Pandas-Bokeh/

for an interactive version of the documentation below, where you can play with the dynamic Bokeh plots.


For more information have a look at the Examples below or at notebooks on the Github Repository of this project.

Startimage


 

Installation

You can install Pandas-Bokeh from PyPI via pip

pip install pandas-bokeh

or conda:

conda install -c patrikhlobil pandas-bokeh

With the current release 0.5.5, Pandas-Bokeh officially supports Python 3.6 and newer. For more details, see Release Notes.

How To Use

Classical Use

The Pandas-Bokeh library should be imported after Pandas, GeoPandas and/or Pyspark. After the import, one should define the plotting output, which can be:

pandas_bokeh.output_notebook(): Embeds the Plots in the cell outputs of the notebook. Ideal when working in Jupyter Notebooks.

pandas_bokeh.output_file(filename): Exports the plot to the provided filename as an HTML.

For more details about the plotting outputs, see the reference here or the Bokeh documentation.

Notebook output (see also bokeh.io.output_notebook)

import pandas as pd import pandas_bokeh pandas_bokeh.output_notebook()

File output to "Interactive Plot.html" (see also bokeh.io.output_file)

import pandas as pd import pandas_bokeh pandas_bokeh.output_file("Interactive Plot.html")

Pandas-Bokeh as native Pandas plotting backend

For pandas >= 0.25, a plotting backend switch is natively supported. It can be achievied by calling:

import pandas as pd
pd.set_option('plotting.backend', 'pandas_bokeh')

Now, the plotting API is accessible for a Pandas DataFrame via:

df.plot(...)

All additional functionalities of Pandas-Bokeh are then accessible at pd.plotting. So, setting the output to notebook is:

pd.plotting.output_notebook()

or calling the grid layout functionality:

pd.plotting.plot_grid(...)

Note: Backwards compatibility is kept since there will still be the df.plot_bokeh(...) methods for a DataFrame.


Plot types

Supported plottypes are at the moment:

Also, check out the complementary chapter Outputs, Formatting & Layouts about:


Lineplot

Basic Lineplot

This simple lineplot in Pandas-Bokeh already contains various interactive elements:

  • a pannable and zoomable (zoom in plotarea and zoom on axis) plot
  • by clicking on the legend elements, one can hide and show the individual lines
  • a Hovertool for the plotted lines

Consider the following simple example:

import numpy as np

np.random.seed(42)
df = pd.DataFrame({"Google": np.random.randn(1000)+0.2, 
                   "Apple": np.random.randn(1000)+0.17}, 
                   index=pd.date_range('1/1/2000', periods=1000))
df = df.cumsum()
df = df + 50
df.plot_bokeh(kind="line")       #equivalent to df.plot_bokeh.line()

ApplevsGoogle_1

Note, that similar to the regular pandas.DataFrame.plot method, there are also additional accessors to directly access the different plotting types like:

  • df.plot_bokeh(kind="line", ...)df.plot_bokeh.line(...)
  • df.plot_bokeh(kind="bar", ...)df.plot_bokeh.bar(...)
  • df.plot_bokeh(kind="hist", ...)df.plot_bokeh.hist(...)
  • ...

Advanced Lineplot

There are various optional parameters to tune the plots, for example:

kind: Which kind of plot should be produced. Currently supported are: "line", "point", "scatter", "bar" and "histogram". In the near future many more will be implemented as horizontal barplot, boxplots, pie-charts, etc.

x: Name of the column to use for the horizontal x-axis. If the x parameter is not specified, the index is used for the x-values of the plot. Alternative, also an array of values can be passed that has the same number of elements as the DataFrame.

y: Name of column or list of names of columns to use for the vertical y-axis.

figsize: Choose width & height of the plot

title: Sets title of the plot

xlim/ylim: Set visibler range of plot for x- and y-axis (also works for datetime x-axis)

xlabel/ylabel: Set x- and y-labels

logx/logy: Set log-scale on x-/y-axis

xticks/yticks: Explicitly set the ticks on the axes

color: Defines a single color for a plot.

colormap: Can be used to specify multiple colors to plot. Can be either a list of colors or the name of a Bokeh color palette

hovertool: If True a Hovertool is active, else if False no Hovertool is drawn.

hovertool_string: If specified, this string will be used for the hovertool (@{column} will be replaced by the value of the column for the element the mouse hovers over, see also Bokeh documentation and here)

toolbar_location: Specify the position of the toolbar location (None, "above", "below", "left" or "right"). Default: "right"

zooming: Enables/Disables zooming. Default: True

panning: Enables/Disables panning. Default: True

fontsize_label/fontsize_ticks/fontsize_title/fontsize_legend: Set fontsize of labels, ticks, title or legend (int or string of form "15pt")

rangetool Enables a range tool scroller. Default False

kwargs**: Optional keyword arguments of bokeh.plotting.figure.line

Try them out to get a feeling for the effects. Let us consider now:

df.plot_bokeh.line(
    figsize=(800, 450),
    y="Apple",
    title="Apple vs Google",
    xlabel="Date",
    ylabel="Stock price [$]",
    yticks=[0, 100, 200, 300, 400],
    ylim=(0, 400),
    toolbar_location=None,
    colormap=["red", "blue"],
    hovertool_string=r"""<img
                        src='https://upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Apple_logo_black.svg/170px-Apple_logo_black.svg.png' 
                        height="42" alt="@imgs" width="42"
                        style="float: left; margin: 0px 15px 15px 0px;"
                        border="2"></img> Apple 
                        
                        <h4> Stock Price: </h4> @{Apple}""",
    panning=False,
    zooming=False)

ApplevsGoogle_2

Lineplot with data points

For lineplots, as for many other plot-kinds, there are some special keyword arguments that only work for this plotting type. For lineplots, these are:

plot_data_points: Plot also the data points on the lines

plot_data_points_size: Determines the size of the data points

marker: Defines the point type (Default: "circle"). Possible values are: 'circle', 'square', 'triangle', 'asterisk', 'circle_x', 'square_x', 'inverted_triangle', 'x', 'circle_cross', 'square_cross', 'diamond', 'cross'

kwargs**: Optional keyword arguments of bokeh.plotting.figure.line```

Let us use this information to have another version of the same plot:

df.plot_bokeh.line(
    figsize=(800, 450),
    title="Apple vs Google",
    xlabel="Date",
    ylabel="Stock price [$]",
    yticks=[0, 100, 200, 300, 400],
    ylim=(100, 200),
    xlim=("2001-01-01", "2001-02-01"),
    colormap=["red", "blue"],
    plot_data_points=True,
    plot_data_points_size=10,
    marker="asterisk")

ApplevsGoogle_3

Lineplot with rangetool

ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index, columns=list('ABCD'))
df = df.cumsum()

df.plot_bokeh(rangetool=True)

rangetool

Pointplot

If you just wish to draw the date points for curves, the pointplot option is the right choice. It also accepts the kwargs of bokeh.plotting.figure.scatter like marker or size:

import numpy as np

x = np.arange(-3, 3, 0.1)
y2 = x**2
y3 = x**3
df = pd.DataFrame({"x": x, "Parabula": y2, "Cube": y3})
df.plot_bokeh.point(
    x="x",
    xticks=range(-3, 4),
    size=5,
    colormap=["#009933", "#ff3399"],
    title="Pointplot (Parabula vs. Cube)",
    marker="x")

Pointplot

Stepplot

With a similar API as the line- & pointplots, one can generate a stepplot. Additional keyword arguments for this plot type are passes to bokeh.plotting.figure.step, e.g. mode (before, after, center), see the following example

import numpy as np

x = np.arange(-3, 3, 1)
y2 = x**2
y3 = x**3
df = pd.DataFrame({"x": x, "Parabula": y2, "Cube": y3})
df.plot_bokeh.step(
    x="x",
    xticks=range(-1, 1),
    colormap=["#009933", "#ff3399"],
    title="Pointplot (Parabula vs. Cube)",
    figsize=(800,300),
    fontsize_title=30,
    fontsize_label=25,
    fontsize_ticks=15,
    fontsize_legend=5,
    )

df.plot_bokeh.step(
    x="x",
    xticks=range(-1, 1),
    colormap=["#009933", "#ff3399"],
    title="Pointplot (Parabula vs. Cube)",
    mode="after",
    figsize=(800,300)
    )

Stepplot

Note that the step-plot API of Bokeh does so far not support a hovertool functionality.

Scatterplot

A basic scatterplot can be created using the kind="scatter" option. For scatterplots, the x and y parameters have to be specified and the following optional keyword argument is allowed:

category: Determines the category column to use for coloring the scatter points

kwargs**: Optional keyword arguments of bokeh.plotting.figure.scatter

Note, that the pandas.DataFrame.plot_bokeh() method return per default a Bokeh figure, which can be embedded in Dashboard layouts with other figures and Bokeh objects (for more details about (sub)plot layouts and embedding the resulting Bokeh plots as HTML click here).

In the example below, we use the building grid layout support of Pandas-Bokeh to display both the DataFrame (using a Bokeh DataTable) and the resulting scatterplot:

# Load Iris Dataset:
df = pd.read_csv(
    r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/iris/iris.csv"
)
df = df.sample(frac=1)

# Create Bokeh-Table with DataFrame:
from bokeh.models.widgets import DataTable, TableColumn
from bokeh.models import ColumnDataSource

data_table = DataTable(
    columns=[TableColumn(field=Ci, title=Ci) for Ci in df.columns],
    source=ColumnDataSource(df),
    height=300,
)

# Create Scatterplot:
p_scatter = df.plot_bokeh.scatter(
    x="petal length (cm)",
    y="sepal width (cm)",
    category="species",
    title="Iris DataSet Visualization",
    show_figure=False,
)

# Combine Table and Scatterplot via grid layout:
pandas_bokeh.plot_grid([[data_table, p_scatter]], plot_width=400, plot_height=350)

 

Scatterplot

A possible optional keyword parameters that can be passed to bokeh.plotting.figure.scatter is size. Below, we use the sepal length of the Iris data as reference for the size:

#Change one value to clearly see the effect of the size keyword
df.loc[13, "sepal length (cm)"] = 15

#Make scatterplot:
p_scatter = df.plot_bokeh.scatter(
    x="petal length (cm)",
    y="sepal width (cm)",
    category="species",
    title="Iris DataSet Visualization with Size Keyword",
    size="sepal length (cm)")

Scatterplot2

In this example you can see, that the additional dimension sepal length cannot be used to clearly differentiate between the virginica and versicolor species.

Barplot

The barplot API has no special keyword arguments, but accepts optional kwargs of bokeh.plotting.figure.vbar like alpha. It uses per default the index for the bar categories (however, also columns can be used as x-axis category using the x argument).

data = {
    'fruits':
    ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries'],
    '2015': [2, 1, 4, 3, 2, 4],
    '2016': [5, 3, 3, 2, 4, 6],
    '2017': [3, 2, 4, 4, 5, 3]
}
df = pd.DataFrame(data).set_index("fruits")

p_bar = df.plot_bokeh.bar(
    ylabel="Price per Unit [€]", 
    title="Fruit prices per Year", 
    alpha=0.6)

Barplot

Using the stacked keyword argument you also maked stacked barplots:

p_stacked_bar = df.plot_bokeh.bar(
    ylabel="Price per Unit [€]",
    title="Fruit prices per Year",
    stacked=True,
    alpha=0.6)

Barplot2

Also horizontal versions of the above barplot are supported with the keyword kind="barh" or the accessor plot_bokeh.barh. You can still specify a column of the DataFrame as the bar category via the x argument if you do not wish to use the index.

#Reset index, such that "fruits" is now a column of the DataFrame:
df.reset_index(inplace=True)

#Create horizontal bar (via kind keyword):
p_hbar = df.plot_bokeh(
    kind="barh",
    x="fruits",
    xlabel="Price per Unit [€]",
    title="Fruit prices per Year",
    alpha=0.6,
    legend = "bottom_right",
    show_figure=False)

#Create stacked horizontal bar (via barh accessor):
p_stacked_hbar = df.plot_bokeh.barh(
    x="fruits",
    stacked=True,
    xlabel="Price per Unit [€]",
    title="Fruit prices per Year",
    alpha=0.6,
    legend = "bottom_right",
    show_figure=False)

#Plot all barplot examples in a grid:
pandas_bokeh.plot_grid([[p_bar, p_stacked_bar],
                        [p_hbar, p_stacked_hbar]], 
                       plot_width=450)

Barplot3

Histogram

For drawing histograms (kind="hist"), Pandas-Bokeh has a lot of customization features. Optional keyword arguments for histogram plots are:

bins: Determines bins to use for the histogram. If bins is an int, it defines the number of equal-width bins in the given range (10, by default). If bins is a sequence, it defines the bin edges, including the rightmost edge, allowing for non-uniform bin widths. If bins is a string, it defines the method used to calculate the optimal bin width, as defined by histogram_bin_edges.

histogram_type: Either "sidebyside", "topontop" or "stacked". Default: "topontop"

stacked: Boolean that overrides the histogram_type as "stacked" if given. Default: False

kwargs**: Optional keyword arguments of bokeh.plotting.figure.quad

Below examples of the different histogram types:

import numpy as np

df_hist = pd.DataFrame({
    'a': np.random.randn(1000) + 1,
    'b': np.random.randn(1000),
    'c': np.random.randn(1000) - 1
    },
    columns=['a', 'b', 'c'])

#Top-on-Top Histogram (Default):
df_hist.plot_bokeh.hist(
    bins=np.linspace(-5, 5, 41),
    vertical_xlabel=True,
    hovertool=False,
    title="Normal distributions (Top-on-Top)",
    line_color="black")

#Side-by-Side Histogram (multiple bars share bin side-by-side) also accessible via
#kind="hist":
df_hist.plot_bokeh(
    kind="hist",
    bins=np.linspace(-5, 5, 41),
    histogram_type="sidebyside",
    vertical_xlabel=True,
    hovertool=False,
    title="Normal distributions (Side-by-Side)",
    line_color="black")

#Stacked histogram:
df_hist.plot_bokeh.hist(
    bins=np.linspace(-5, 5, 41),
    histogram_type="stacked",
    vertical_xlabel=True,
    hovertool=False,
    title="Normal distributions (Stacked)",
    line_color="black")

Histogram

Further, advanced keyword arguments for histograms are:

  • weights: A column of the DataFrame that is used as weight for the histogramm aggregation (see also numpy.histogram)
  • normed: If True, histogram values are normed to 1 (sum of histogram values=1). It is also possible to pass an integer, e.g. normed=100 would result in a histogram with percentage y-axis (sum of histogram values=100). Default: False
  • cumulative: If True, a cumulative histogram is shown. Default: False
  • show_average: If True, the average of the histogram is also shown. Default: False

Their usage is shown in these examples:

p_hist = df_hist.plot_bokeh.hist(
    y=["a", "b"],
    bins=np.arange(-4, 6.5, 0.5),
    normed=100,
    vertical_xlabel=True,
    ylabel="Share[%]",
    title="Normal distributions (normed)",
    show_average=True,
    xlim=(-4, 6),
    ylim=(0, 30),
    show_figure=False)

p_hist_cum = df_hist.plot_bokeh.hist(
    y=["a", "b"],
    bins=np.arange(-4, 6.5, 0.5),
    normed=100,
    cumulative=True,
    vertical_xlabel=True,
    ylabel="Share[%]",
    title="Normal distributions (normed & cumulative)",
    show_figure=False)

pandas_bokeh.plot_grid([[p_hist, p_hist_cum]], plot_width=450, plot_height=300)

Histogram2


 

Areaplot

Areaplot (kind="area") can be either drawn on top of each other or stacked. The important parameters are:

stacked: If True, the areaplots are stacked. If False, plots are drawn on top of each other. Default: False

kwargs**: Optional keyword arguments of bokeh.plotting.figure.patch


Let us consider the energy consumption split by source that can be downloaded as DataFrame via:

df_energy = pd.read_csv(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/energy/energy.csv", 
parse_dates=["Year"])
df_energy.head()
YearOilGasCoalNuclear EnergyHydroelectricityOther Renewable
1970-01-012291.5826.71467.317.7265.85.8
1971-01-012427.7884.81459.224.9276.46.3
1972-01-012613.9933.71475.734.1288.96.8
1973-01-012818.1978.01519.645.9292.57.3
1974-01-012777.31001.91520.959.6321.17.7


Creating the Areaplot can be achieved via:

df_energy.plot_bokeh.area(
    x="Year",
    stacked=True,
    legend="top_left",
    colormap=["brown", "orange", "black", "grey", "blue", "green"],
    title="Worldwide energy consumption split by energy source",
    ylabel="Million tonnes oil equivalent",
    ylim=(0, 16000))

areaplot

Note that the energy consumption of fossile energy is still increasing and renewable energy sources are still small in comparison 😢!!! However, when we norm the plot using the normed keyword, there is a clear trend towards renewable energies in the last decade:

df_energy.plot_bokeh.area(
    x="Year",
    stacked=True,
    normed=100,
    legend="bottom_left",
    colormap=["brown", "orange", "black", "grey", "blue", "green"],
    title="Worldwide energy consumption split by energy source",
    ylabel="Million tonnes oil equivalent")

areaplot2

Pieplot

For Pieplots, let us consider a dataset showing the results of all Bundestags elections in Germany since 2002:

df_pie = pd.read_csv(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/Bundestagswahl/Bundestagswahl.csv")
df_pie
Partei20022005200920132017
CDU/CSU38.535.233.841.532.9
SPD38.534.223.025.720.5
FDP7.49.814.64.810.7
Grünen8.68.110.78.48.9
Linke/PDS4.08.711.98.69.2
AfD0.00.00.00.012.6
Sonstige3.04.06.011.05.0

We can create a Pieplot of the last election in 2017 by specifying the "Partei" (german for party) column as the x column and the "2017" column as the y column for values:

df_pie.plot_bokeh.pie(
    x="Partei",
    y="2017",
    colormap=["blue", "red", "yellow", "green", "purple", "orange", "grey"],
    title="Results of German Bundestag Election 2017",
    )

pieplot

When you pass several columns to the y parameter (not providing the y-parameter assumes you plot all columns), multiple nested pieplots will be shown in one plot:

df_pie.plot_bokeh.pie(
    x="Partei",
    colormap=["blue", "red", "yellow", "green", "purple", "orange", "grey"],
    title="Results of German Bundestag Elections [2002-2017]",
    line_color="grey")

pieplot2

Mapplot

The mapplot method of Pandas-Bokeh allows for plotting geographic points stored in a Pandas DataFrame on an interactive map. For more advanced Geoplots for line and polygon shapes have a look at the Geoplots examples for the GeoPandas API of Pandas-Bokeh.

For mapplots, only (latitude, longitude) pairs in geographic projection (WGS84) can be plotted on a map. The basic API has the following 2 base parameters:

  • x: name of the longitude column of the DataFrame
  • y: name of the latitude column of the DataFrame

The other optional keyword arguments are discussed in the section about the GeoPandas API, e.g. category for coloring the points.

Below an example of plotting all cities for more than 1 million inhabitants:

df_mapplot = pd.read_csv(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/populated%20places/populated_places.csv")
df_mapplot.head()
namepop_maxlatitudelongitudesize
Mesa108539433.423915-111.7360841.085394
Sharjah110302725.37138355.4064781.103027
Changwon108149935.219102128.5835621.081499
Sheffield129290053.366677-1.4999971.292900
Abbottabad118364734.14950373.1995011.183647
df_mapplot["size"] = df_mapplot["pop_max"] / 1000000
df_mapplot.plot_bokeh.map(
    x="longitude",
    y="latitude",
    hovertool_string="""<h2> @{name} </h2> 
    
                        <h3> Population: @{pop_max} </h3>""",
    tile_provider="STAMEN_TERRAIN_RETINA",
    size="size", 
    figsize=(900, 600),
    title="World cities with more than 1.000.000 inhabitants")

 

Mapplot

Geoplots

Pandas-Bokeh also allows for interactive plotting of Maps using GeoPandas by providing a geopandas.GeoDataFrame.plot_bokeh() method. It allows to plot the following geodata on a map :

  • Points/MultiPoints
  • Lines/MultiLines
  • Polygons/MultiPolygons

Note: t is not possible to mix up the objects types, i.e. a GeoDataFrame with Points and Lines is for example not allowed.

Les us start with a simple example using the "World Borders Dataset" . Let us first import all neccessary libraries and read the shapefile:

import geopandas as gpd
import pandas as pd
import pandas_bokeh
pandas_bokeh.output_notebook()

#Read in GeoJSON from URL:
df_states = gpd.read_file(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/states/states.geojson")
df_states.head()
STATE_NAMEREGIONPOPESTIMATE2010POPESTIMATE2011POPESTIMATE2012POPESTIMATE2013POPESTIMATE2014POPESTIMATE2015POPESTIMATE2016POPESTIMATE2017geometry
Hawaii413638171378323139277214080381417710142632014286831427538(POLYGON ((-160.0738033454681 22.0041773479577...
Washington467413866819155689089969634107046931715281872809347405743(POLYGON ((-122.4020153103835 48.2252163723779...
Montana4990507996866100352210119211019931102831710386561050493POLYGON ((-111.4754253002074 44.70216236909688...
Maine113275681327968132810113279751328903132778713302321335907(POLYGON ((-69.77727626137293 44.0741483685119...
North Dakota2674518684830701380722908738658754859755548755393POLYGON ((-98.73043728833767 45.93827137024809...

Plotting the data on a map is as simple as calling:

df_states.plot_bokeh(simplify_shapes=10000)

US_States_1

We also passed the optional parameter simplify_shapes (~meter) to improve plotting performance (for a reference see shapely.object.simplify). The above geolayer thus has an accuracy of about 10km.

Many keyword arguments like xlabel, ylabel, xlim, ylim, title, colormap, hovertool, zooming, panning, ... for costumizing the plot are also available for the geoplotting API and can be uses as in the examples shown above. There are however also many other options especially for plotting geodata:

  • geometry_column: Specify the column that stores the geometry-information (default: "geometry")
  • hovertool_columns: Specify column names, for which values should be shown in hovertool
  • hovertool_string: If specified, this string will be used for the hovertool (@{column} will be replaced by the value of the column for the element the mouse hovers over, see also Bokeh documentation)
  • colormap_uselog: If set True, the colormapper is using a logscale. Default: False
  • colormap_range: Specify the value range of the colormapper via (min, max) tuple
  • tile_provider: Define build-in tile provider for background maps. Possible values: None, 'CARTODBPOSITRON', 'CARTODBPOSITRON_RETINA', 'STAMEN_TERRAIN', 'STAMEN_TERRAIN_RETINA', 'STAMEN_TONER', 'STAMEN_TONER_BACKGROUND', 'STAMEN_TONER_LABELS'. Default: CARTODBPOSITRON_RETINA
  • tile_provider_url: An arbitraty tile_provider_url of the form '/{Z}/{X}/{Y}*.png' can be passed to be used as background map.
  • tile_attribution: String (also HTML accepted) for showing attribution for tile source in the lower right corner
  • tile_alpha: Sets the alpha value of the background tile between [0, 1]. Default: 1

One of the most common usage of map plots are choropleth maps, where the color of a the objects is determined by the property of the object itself. There are 3 ways of drawing choropleth maps using Pandas-Bokeh, which are described below.

Categories

This is the simplest way. Just provide the category keyword for the selection of the property column:

  • category: Specifies the column of the GeoDataFrame that should be used to draw a choropleth map
  • show_colorbar: Whether or not to show a colorbar for categorical plots. Default: True

Let us now draw the regions as a choropleth plot using the category keyword (at the moment, only numerical columns are supported for choropleth plots):

df_states.plot_bokeh(
    figsize=(900, 600),
    simplify_shapes=5000,
    category="REGION",
    show_colorbar=False,
    colormap=["blue", "yellow", "green", "red"],
    hovertool_columns=["STATE_NAME", "REGION"],
    tile_provider="STAMEN_TERRAIN_RETINA")

When hovering over the states, the state-name and the region are shown as specified in the hovertool_columns argument.

US_States_2

 

Dropdown

By passing a list of column names of the GeoDataFrame as the dropdown keyword argument, a dropdown menu is shown above the map. This dropdown menu can be used to select the choropleth layer by the user. :

df_states["STATE_NAME_SMALL"] = df_states["STATE_NAME"].str.lower()

df_states.plot_bokeh(
    figsize=(900, 600),
    simplify_shapes=5000,
    dropdown=["POPESTIMATE2010", "POPESTIMATE2017"],
    colormap="Viridis",
    hovertool_string="""
                        <img
                        src="https://www.states101.com/img/flags/gif/small/@STATE_NAME_SMALL.gif" 
                        height="42" alt="@imgs" width="42"
                        style="float: left; margin: 0px 15px 15px 0px;"
                        border="2"></img>
                
                        <h2>  @STATE_NAME </h2>
                        <h3> 2010: @POPESTIMATE2010 </h3>
                        <h3> 2017: @POPESTIMATE2017 </h3>""",
    tile_provider_url=r"http://c.tile.stamen.com/watercolor/{Z}/{X}/{Y}.jpg",
    tile_attribution='Map tiles by <a href="http://stamen.com">Stamen Design</a>, under <a href="http://creativecommons.org/licenses/by/3.0">CC BY 3.0</a>. Data by <a href="http://openstreetmap.org">OpenStreetMap</a>, under <a href="http://www.openstreetmap.org/copyright">ODbL</a>.'
    )

US_States_3

Using hovertool_string, one can pass a string that can contain arbitrary HTML elements (including divs, images, ...) that is shown when hovering over the geographies (@{column} will be replaced by the value of the column for the element the mouse hovers over, see also Bokeh documentation).

Here, we also used an OSM tile server with watercolor style via tile_provider_url and added the attribution via tile_attribution.

Sliders

Another option for interactive choropleth maps is the slider implementation of Pandas-Bokeh. The possible keyword arguments are here:

  • slider: By passing a list of column names of the GeoDataFrame, a slider can be used to . This dropdown menu can be used to select the choropleth layer by the user.
  • slider_range: Pass a range (or numpy.arange) of numbers object to relate the sliders values with the slider columns. By passing range(0,10), the slider will have values [0, 1, 2, ..., 9], when passing numpy.arange(3,5,0.5), the slider will have values [3, 3.5, 4, 4.5]. Default: range(0, len(slider))
  • slider_name: Specifies the title of the slider. Default is an empty string.

This can be used to display the change in population relative to the year 2010:


#Calculate change of population relative to 2010:
for i in range(8):
    df_states["Delta_Population_201%d"%i] = ((df_states["POPESTIMATE201%d"%i] / df_states["POPESTIMATE2010"]) -1 ) * 100

#Specify slider columns:
slider_columns = ["Delta_Population_201%d"%i for i in range(8)]

#Specify slider-range (Maps "Delta_Population_2010" -> 2010, 
#                           "Delta_Population_2011" -> 2011, ...):
slider_range = range(2010, 2018)

#Make slider plot:
df_states.plot_bokeh(
    figsize=(900, 600),
    simplify_shapes=5000,
    slider=slider_columns,
    slider_range=slider_range,
    slider_name="Year", 
    colormap="Inferno",
    hovertool_columns=["STATE_NAME"] + slider_columns,
    title="Change of Population [%]")

US_States_4



 

Plot multiple geolayers

If you wish to display multiple geolayers, you can pass the Bokeh figure of a Pandas-Bokeh plot via the figure keyword to the next plot_bokeh() call:

import geopandas as gpd
import pandas_bokeh
pandas_bokeh.output_notebook()

# Read in GeoJSONs from URL:
df_states = gpd.read_file(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/states/states.geojson")
df_cities = gpd.read_file(
    r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/populated%20places/ne_10m_populated_places_simple_bigcities.geojson"
)
df_cities["size"] = df_cities.pop_max / 400000

#Plot shapes of US states (pass figure options to this initial plot):
figure = df_states.plot_bokeh(
    figsize=(800, 450),
    simplify_shapes=10000,
    show_figure=False,
    xlim=[-170, -80],
    ylim=[10, 70],
    category="REGION",
    colormap="Dark2",
    legend="States",
    show_colorbar=False,
)

#Plot cities as points on top of the US states layer by passing the figure:
df_cities.plot_bokeh(
    figure=figure,         # <== pass figure here!
    category="pop_max",
    colormap="Viridis",
    colormap_uselog=True,
    size="size",
    hovertool_string="""<h1>@name</h1>
                        <h3>Population: @pop_max </h3>""",
    marker="inverted_triangle",
    legend="Cities",
)

Multiple Geolayers


Point & Line plots:

Below, you can see an example that use Pandas-Bokeh to plot point data on a map. The plot shows all cities with a population larger than 1.000.000. For point plots, you can select the marker as keyword argument (since it is passed to bokeh.plotting.figure.scatter). Here an overview of all available marker types:

gdf = gpd.read_file(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/populated%20places/ne_10m_populated_places_simple_bigcities.geojson")
gdf["size"] = gdf.pop_max / 400000

gdf.plot_bokeh(
    category="pop_max",
    colormap="Viridis",
    colormap_uselog=True,
    size="size",
    hovertool_string="""<h1>@name</h1>
                        <h3>Population: @pop_max </h3>""",
    xlim=[-15, 35],
    ylim=[30,60],
    marker="inverted_triangle");

Pointmap

In a similar way, also GeoDataFrames with (multi)line shapes can be drawn using Pandas-Bokeh.


 


Colorbar formatting:

If you want to display the numerical labels on your colorbar with an alternative to the scientific format, you can pass in a one of the bokeh number string formats or an instance of one of the bokeh.models.formatters to the colorbar_tick_format argument in the geoplot

An example of using the string format argument:

df_states = gpd.read_file(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/states/states.geojson")

df_states["STATE_NAME_SMALL"] = df_states["STATE_NAME"].str.lower()

# pass in a string format to colorbar_tick_format to display the ticks as 10m rather than 1e7
df_states.plot_bokeh(
    figsize=(900, 600),
    category="POPESTIMATE2017",
    simplify_shapes=5000,    
    colormap="Inferno",
    colormap_uselog=True,
    colorbar_tick_format="0.0a")

colorbar_tick_format with string argument

An example of using the bokeh PrintfTickFormatter:

df_states = gpd.read_file(r"https://raw.githubusercontent.com/PatrikHlobil/Pandas-Bokeh/master/docs/Testdata/states/states.geojson")

df_states["STATE_NAME_SMALL"] = df_states["STATE_NAME"].str.lower()

for i in range(8):
    df_states["Delta_Population_201%d"%i] = ((df_states["POPESTIMATE201%d"%i] / df_states["POPESTIMATE2010"]) -1 ) * 100

# pass in a PrintfTickFormatter instance colorbar_tick_format to display the ticks with 2 decimal places  
df_states.plot_bokeh(
    figsize=(900, 600),
    category="Delta_Population_2017",
    simplify_shapes=5000,    
    colormap="Inferno",
    colorbar_tick_format=PrintfTickFormatter(format="%4.2f"))

colorbar_tick_format with bokeh.models.formatter_instance


Outputs, Formatting & Layouts

Output options

The pandas.DataFrame.plot_bokeh API has the following additional keyword arguments:

  • show_figure: If True, the resulting figure is shown (either in the notebook or exported and shown as HTML file, see Basics. If False, None is returned. Default: True
  • return_html: If True, the method call returns an HTML string that contains all Bokeh CSS&JS resources and the figure embedded in a div. This HTML representation of the plot can be used for embedding the plot in an HTML document. Default: False

If you have a Bokeh figure or layout, you can also use the pandas_bokeh.embedded_html function to generate an embeddable HTML representation of the plot. This can be included into any valid HTML (note that this is not possible directly with the HTML generated by the pandas_bokeh.output_file output option, because it includes an HTML header). Let us consider the following simple example:

#Import Pandas and Pandas-Bokeh (if you do not specify an output option, the standard is
#output_file):
import pandas as pd
import pandas_bokeh

#Create DataFrame to Plot:
import numpy as np
x = np.arange(-10, 10, 0.1)
sin = np.sin(x)
cos = np.cos(x)
tan = np.tan(x)
df = pd.DataFrame({"x": x, "sin(x)": sin, "cos(x)": cos, "tan(x)": tan})

#Make Bokeh plot from DataFrame using Pandas-Bokeh. Do not show the plot, but export
#it to an embeddable HTML string:
html_plot = df.plot_bokeh(
    kind="line",
    x="x",
    y=["sin(x)", "cos(x)", "tan(x)"],
    xticks=range(-20, 20),
    title="Trigonometric functions",
    show_figure=False,
    return_html=True,
    ylim=(-1.5, 1.5))

#Write some HTML and embed the HTML plot below it. For production use, please use
#Templates and the awesome Jinja library.
html = r"""
<script type="text/x-mathjax-config">
  MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}});
</script>
<script type="text/javascript"
  src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>

<h1> Trigonometric functions </h1>

<p> The basic trigonometric functions are:</p>

<p>$ sin(x) $</p>
<p>$ cos(x) $</p>
<p>$ tan(x) = \frac{sin(x)}{cos(x)}$</p>

<p>Below is a plot that shows them</p>

""" + html_plot

#Export the HTML string to an external HTML file and show it:
with open("test.html" , "w") as f:
    f.write(html)
    
import webbrowser
webbrowser.open("test.html")

This code will open up a webbrowser and show the following page. As you can see, the interactive Bokeh plot is embedded nicely into the HTML layout. The return_html option is ideal for the use in a templating engine like Jinja.

Embedded HTML

Auto Scaling Plots

For single plots that have a number of x axis values or for larger monitors, you can auto scale the figure to the width of the entire jupyter cell by setting the sizing_mode parameter.

df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd']) df.plot_bokeh(kind="bar", figsize=(500, 200), sizing_mode="scale_width")

Scaled Plot

The figsize parameter can be used to change the height and width as well as act as a scaling multiplier against the axis that is not being scaled.

 

Number formats

To change the formats of numbers in the hovertool, use the number_format keyword argument. For a documentation about the format to pass, have a look at the Bokeh documentation.Let us consider some examples for the number 3.141592653589793:

FormatOutput
03
0.0003.141
0.00 $3.14 $

This number format will be applied to all numeric columns of the hovertool. If you want to make a very custom or complicated hovertool, you should probably use the hovertool_string keyword argument, see e.g. this example. Below, we use the number_format parameter to specify the "Stock Price" format to 2 decimal digits and an additional $ sign.

import numpy as np

#Lineplot:
np.random.seed(42)
df = pd.DataFrame({
    "Google": np.random.randn(1000) + 0.2,
    "Apple": np.random.randn(1000) + 0.17
},
                  index=pd.date_range('1/1/2000', periods=1000))
df = df.cumsum()
df = df + 50
df.plot_bokeh(
    kind="line",
    title="Apple vs Google",
    xlabel="Date",
    ylabel="Stock price [$]",
    yticks=[0, 100, 200, 300, 400],
    ylim=(0, 400),
    colormap=["red", "blue"],
    number_format="1.00 $")

Number format

Suppress scientific notation for axes

If you want to suppress the scientific notation for axes, you can use the disable_scientific_axes parameter, which accepts one of "x", "y", "xy":

df = pd.DataFrame({"Animal": ["Mouse", "Rabbit", "Dog", "Tiger", "Elefant", "Wale"],
                   "Weight [g]": [19, 3000, 40000, 200000, 6000000, 50000000]})
p_scientific = df.plot_bokeh(x="Animal", y="Weight [g]", show_figure=False)
p_non_scientific = df.plot_bokeh(x="Animal", y="Weight [g]", disable_scientific_axes="y", show_figure=False,)
pandas_bokeh.plot_grid([[p_scientific, p_non_scientific]], plot_width = 450)

Number format

 

Dashboard Layouts

As shown in the Scatterplot Example, combining plots with plots or other HTML elements is straighforward in Pandas-Bokeh due to the layout capabilities of Bokeh. The easiest way to generate a dashboard layout is using the pandas_bokeh.plot_grid method (which is an extension of bokeh.layouts.gridplot):

import pandas as pd
import numpy as np
import pandas_bokeh
pandas_bokeh.output_notebook()

#Barplot:
data = {
    'fruits':
    ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries'],
    '2015': [2, 1, 4, 3, 2, 4],
    '2016': [5, 3, 3, 2, 4, 6],
    '2017': [3, 2, 4, 4, 5, 3]
}
df = pd.DataFrame(data).set_index("fruits")
p_bar = df.plot_bokeh(
    kind="bar",
    ylabel="Price per Unit [€]",
    title="Fruit prices per Year",
    show_figure=False)

#Lineplot:
np.random.seed(42)
df = pd.DataFrame({
    "Google": np.random.randn(1000) + 0.2,
    "Apple": np.random.randn(1000) + 0.17
},
                  index=pd.date_range('1/1/2000', periods=1000))
df = df.cumsum()
df = df + 50
p_line = df.plot_bokeh(
    kind="line",
    title="Apple vs Google",
    xlabel="Date",
    ylabel="Stock price [$]",
    yticks=[0, 100, 200, 300, 400],
    ylim=(0, 400),
    colormap=["red", "blue"],
    show_figure=False)

#Scatterplot:
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(iris["data"])
df.columns = iris["feature_names"]
df["species"] = iris["target"]
df["species"] = df["species"].map(dict(zip(range(3), iris["target_names"])))
p_scatter = df.plot_bokeh(
    kind="scatter",
    x="petal length (cm)",
    y="sepal width (cm)",
    category="species",
    title="Iris DataSet Visualization",
    show_figure=False)

#Histogram:
df_hist = pd.DataFrame({
    'a': np.random.randn(1000) + 1,
    'b': np.random.randn(1000),
    'c': np.random.randn(1000) - 1
},
                       columns=['a', 'b', 'c'])

p_hist = df_hist.plot_bokeh(
    kind="hist",
    bins=np.arange(-6, 6.5, 0.5),
    vertical_xlabel=True,
    normed=100,
    hovertool=False,
    title="Normal distributions",
    show_figure=False)

#Make Dashboard with Grid Layout:
pandas_bokeh.plot_grid([[p_line, p_bar], 
                        [p_scatter, p_hist]], plot_width=450)

Dashboard Layout

Using a combination of row and column elements (see also Bokeh Layouts) allow for a very easy general arrangement of elements. An alternative layout to the one above is:

p_line.plot_width = 900
p_hist.plot_width = 900

layout = pandas_bokeh.column(p_line,
                pandas_bokeh.row(p_scatter, p_bar),
                p_hist)

pandas_bokeh.show(layout)

Alternative Dashboard Layout


 



 

 

Release Notes

Release Notes can be found here.

Contributing to Pandas-Bokeh

If you wish to contribute to the development of Pandas-Bokeh you can follow the instructions on the CONTRIBUTING.md.

 

Author: PatrikHlobil
Source Code: https://github.com/PatrikHlobil/Pandas-Bokeh 
License: MIT License

#machine-learning  #datavisualizations #python