Layne  Fadel

Layne Fadel

1663941240

10 Popular PostgreSQL Optimization Libraries

In this Postgres article, let's learn about Optimization: 10 Popular PostgreSQL Optimization Libraries

Table of contents:

  • pg_flame - A flamegraph generator for query plans.
  • PgHero - PostgreSQL insights made easy.
  • pgtune - PostgreSQL configuration wizard.
  • pgtune - Online version of PostgreSQL configuration wizard.
  • pgconfig.org - PostgreSQL Online Configuration Tool (also based on pgtune).
  • PoWA - PostgreSQL Workload Analyzer gathers performance stats and provides real-time charts and graphs to help monitor and tune your PostgreSQL servers.
  • pg_web_stats - Web UI to view pg_stat_statements.
  • TimescaleDB Tune - a program for tuning a TimescaleDB database to perform its best based on the host's resources such as memory and number of CPUs.

What is PostgreSQL?

PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley and has more than 30 years of active development on the core platform.

PostgreSQL has earned a strong reputation for its proven architecture, reliability, data integrity, robust feature set, extensibility, and the dedication of the open source community behind the software to consistently deliver performant and innovative solutions. PostgreSQL runs on all major operating systems, has been ACID-compliant since 2001, and has powerful add-ons such as the popular PostGIS geospatial database extender. It is no surprise that PostgreSQL has become the open source relational database of choice for many people and organisations.

What is PostgreSQL query optimization?

Just like any advanced relational database, PostgreSQL uses a cost-based query optimizer that tries to turn your SQL queries into something efficient that executes in as little time as possible


10 Popular PostgreSQL Optimization Libraries

  1. pg_flame

A flamegraph generator for Postgres EXPLAIN ANALYZE output.
Installation

Homebrew

You can install via Homebrew with the follow command:

$ brew install mgartner/tap/pg_flame

Download pre-compiled binary

Download one of the compiled binaries in the releases tab. Once downloaded, move pg_flame into your $PATH.

Docker

Alternatively, if you'd like to use Docker to build the program, you can.

$ docker pull mgartner/pg_flame

Build from source

If you'd like to build a binary from the source code, run the following commands. Note that compiling requires Go version 1.13+.

$ git clone https://github.com/mgartner/pg_flame.git
$ cd pg_flame
$ go build

A pg_flame binary will be created that you can place in your $PATH.

View on GitHub


2.  PgHero

A performance dashboard for Postgres


Documentation

PgHero is available as a Docker image, Linux package, and Rails engine.

View on GitHub


3.  pgtune

pgtune takes the wimpy default postgresql.conf and expands the database server to be as powerful as the hardware it's being deployed on.

Installation/Usage

Source installation

There is no need to build/compile pgtune, it is a Python script. Extracting the tarball to a convenient location is sufficient. Note that you will need the multiple pg_settings-<version>_<architecture> files included with the program too, pgtune can't work without those.

RPM Installation

The RPM package installs:

  • The pgtune binary under/usr/bin
  • Documents in /usr/share/doc/pgtune-$version
  • Setting files in /usr/share/pgtune

Using pgtune

pgtune works by taking an existing postgresql.conf file as an input, making changes to it based on the amount of RAM in your server and suggested workload, and output a new file.

Here's a sample usage:

pgtune -i $PGDATA/postgresql.conf -o $PGDATA/postgresql.conf.pgtune

pgtune --help will give you additional usage information. These are the current parameters:

  • -i or --input-config : Specifies the current postgresql.conf file.
  • -o or --output-config : Specifies the file name for the new postgresql.conf file.
  • -M or --memory: Use this parameter to specify total system memory. If not specified, pgtune will attempt to detect memory size.
  • -T or --type : Specifies database type. Valid options are: DW, OLTP, Web, Mixed, Desktop
  • -P or --platform : Specifies platform, defaults to the platform running the program. Valid options are Windows, Linux, and Darwin (Mac OS X).
  • -c or --connections: Specifies number of maximum connections expected. If not specified, it depends on database type.
  • -D or --debug : Enables debugging mode.
  • -S or --settings: Directory where settings data files are located at. Defaults to the directory where the script is being run from. The RPM package includes a patch to use the correct location these files were installed into.

View on GitHub


4.  PgTune

Pgtune - tuning PostgreSQL config by your hardware


Tuning PostgreSQL config by your hardware. Based on original pgtune. Illustration by Kate.

Development

Web app build on top of middleman. To start it in development mode, you need install ruby, node.js and run in terminal:

$ bundle # get all ruby deps
$ yarn # get all node.js deps
$ middleman server # start server on 4567 port

View on GitHub


5.  pgconfig

Web Based PostgreSQL configuration tool

View on GitHub


6.  Powa

You can try powa at demo-powa.anayrat.info. Just click "Login" and try its features! Note that in order to get interesting metrics, resources have been limited on this server (2 vCPU, 384MB of RAM and 150iops for the disks). Please be patient when using it.

Thanks to Adrien Nayrat for providing it.

PoWA (PostgreSQL Workload Analyzer) is a performance tool for PostgreSQL 9.4 and newer allowing to collect, aggregate and purge statistics on multiple PostgreSQL instances from various :ref:`stat_extensions`.

Depending on your needs, you can either use the provided background worker (requires a PostgreSQL restart, and more suited for single-instance setups), or the provided :ref:`powa_collector` daemon (does not require a PostgreSQL restart, can gather performance metrics from multiple instances, including standby).

Main components

  • PoWA-archivist is the PostgreSQL extension, collecting statistics.
  • PoWA-collector is the daemon that gather performance metrics from remote PostgreSQL instances (optional) on a dedicated repository server.
  • PoWA-web is the graphical user interface to powa-collected metrics.
  • Stat extensions are the actual source of data.
  • PoWA is the whole project.

View on GitHub


7.  Pg_web_stats

Web UI to view pg_stat_statements

Features

  • Sorting by any column from pg_stat_statements
  • Filtering by database or user
  • Highlighting important queries && hidding not important queries

Installation

  1. Prepare your PG setup: enable the pg_stat_statements extension and execute CREATE EXTENSION pg_stat_statements inside the database you want to inspect. Hint: there is an awesome article about pg_stat_statements in russian.
  2. Clone the repo
  3. Fill config.yml.example with your credentians and save it as config.yml
  4. Start the app: rake server (or run rake console to have command line)
  5. ???
  6. PROFIT

Mount inside a rails app

Add this line to your application's Gemfile:

gem 'pg_web_stats', require: 'pg_web_stats_app'

Or if gem is not released yet

gem 'pg_web_stats', git: 'https://github.com/shhavel/pg_web_stats', require: 'pg_web_stats_app'

And then execute:

$ bundle

Create file config/initializers/pg_web_stats.rb

# Configure database connection
config_hash = YAML.load_file(Rails.root.join('config', 'database.yml'))[Rails.env]
PG_WEB_STATS = PgWebStats.new(config_hash)

# Restrict access to pg_web_stats with Basic Authentication
# (or use any other authentication system).
PgWebStatsApp.use(Rack::Auth::Basic) do |user, password|
  password == "secret"
end

Add to routes.rb

mount PgWebStatsApp, at: '/pg_stats'

View on GitHub


8.  timescaledb-tune

timescaledb-tune is a program for tuning a TimescaleDB database to perform its best based on the host's resources such as memory and number of CPUs. It parses the existing postgresql.conf file to ensure that the TimescaleDB extension is appropriately installed and provides recommendations for memory, parallelism, WAL, and other settings.

Getting started

You need the Go runtime (1.12+) installed, then simply go install this repo:

$ go install github.com/timescale/timescaledb-tune/cmd/timescaledb-tune@main

It is also available as a binary package on a variety systems using Homebrew, yum, or apt. Search for timescaledb-tools.

Using timescaledb-tune

By default, timescaledb-tune attempts to locate your postgresql.conf file for parsing by using heuristics based on the operating system, so the simplest invocation would be:

$ timescaledb-tune

You'll then be given a series of prompts that require minimal user input to make sure your config file is up to date:

Using postgresql.conf at this path:
/usr/local/var/postgres/postgresql.conf

Is this correct? [(y)es/(n)o]: y
Writing backup to:
/var/folders/cr/zpgdkv194vz1g5smxl_5tggm0000gn/T/timescaledb_tune.backup201901071520

shared_preload_libraries needs to be updated
Current:
#shared_preload_libraries = 'timescaledb'
Recommended:
shared_preload_libraries = 'timescaledb'
Is this okay? [(y)es/(n)o]: y
success: shared_preload_libraries will be updated

Tune memory/parallelism/WAL and other settings? [(y)es/(n)o]: y
Recommendations based on 8.00 GB of available memory and 4 CPUs for PostgreSQL 11

Memory settings recommendations
Current:
shared_buffers = 128MB
#effective_cache_size = 4GB
#maintenance_work_mem = 64MB
#work_mem = 4MB
Recommended:
shared_buffers = 2GB
effective_cache_size = 6GB
maintenance_work_mem = 1GB
work_mem = 26214kB
Is this okay? [(y)es/(s)kip/(q)uit]:

View on GitHub


PostgreSQL Optimization FAQ

  • What is PostgreSQL query optimization?

Just like any advanced relational database, PostgreSQL uses a cost-based query optimizer that tries to turn your SQL queries into something efficient that executes in as little time as possible

  • How do I optimize a table in PostgreSQL?

4 Ways To Optimise PostgreSQL Database With Millions of Data

  1. speed up database operations.
  2. reduce the load.
  3. reduce the size of the database.
  4. take advantage of the out-of-the-box feature to help with overall database optimization.
  • What is database performance optimization?

The goal of database performance tuning is to minimize the response time of your queries by making the best use of your system resources. The best use of these resources involves minimizing network traffic, disk I/O, and CPU time.

  • How make PostgreSQL query run faster?

Some of the tricks we used to speed up SELECT-s in PostgreSQL: LEFT JOIN with redundant conditions, VALUES, extended statistics, primary key type conversion, CLUSTER, pg_hint_plan + bonus.

  • Can Postgres handle 1 billion rows?

As commercial database vendors are bragging about their capabilities we decided to push PostgreSQL to the next level and exceed 1 billion rows per second to show what we can do with Open Source. To those who need even more: 1 billion rows is by far not the limit - a lot more is possible.


Related videos:

PostgreSQL Query Optimization Techniques


Related posts:

#postgresql 

What is GEEK

Buddha Community

10 Popular PostgreSQL Optimization Libraries
Connor Mills

Connor Mills

1670560264

Understanding Arrays in Python

Learn how to use Python arrays. Create arrays in Python using the array module. You'll see how to define them and the different methods commonly used for performing operations on them.
 

The artcile covers arrays that you create by importing the array module. We won't cover NumPy arrays here.

Table of Contents

  1. Introduction to Arrays
    1. The differences between Lists and Arrays
    2. When to use arrays
  2. How to use arrays
    1. Define arrays
    2. Find the length of arrays
    3. Array indexing
    4. Search through arrays
    5. Loop through arrays
    6. Slice an array
  3. Array methods for performing operations
    1. Change an existing value
    2. Add a new value
    3. Remove a value
  4. Conclusion

Let's get started!


What are Python Arrays?

Arrays are a fundamental data structure, and an important part of most programming languages. In Python, they are containers which are able to store more than one item at the same time.

Specifically, they are an ordered collection of elements with every value being of the same data type. That is the most important thing to remember about Python arrays - the fact that they can only hold a sequence of multiple items that are of the same type.

What's the Difference between Python Lists and Python Arrays?

Lists are one of the most common data structures in Python, and a core part of the language.

Lists and arrays behave similarly.

Just like arrays, lists are an ordered sequence of elements.

They are also mutable and not fixed in size, which means they can grow and shrink throughout the life of the program. Items can be added and removed, making them very flexible to work with.

However, lists and arrays are not the same thing.

Lists store items that are of various data types. This means that a list can contain integers, floating point numbers, strings, or any other Python data type, at the same time. That is not the case with arrays.

As mentioned in the section above, arrays store only items that are of the same single data type. There are arrays that contain only integers, or only floating point numbers, or only any other Python data type you want to use.

When to Use Python Arrays

Lists are built into the Python programming language, whereas arrays aren't. Arrays are not a built-in data structure, and therefore need to be imported via the array module in order to be used.

Arrays of the array module are a thin wrapper over C arrays, and are useful when you want to work with homogeneous data.

They are also more compact and take up less memory and space which makes them more size efficient compared to lists.

If you want to perform mathematical calculations, then you should use NumPy arrays by importing the NumPy package. Besides that, you should just use Python arrays when you really need to, as lists work in a similar way and are more flexible to work with.

How to Use Arrays in Python

In order to create Python arrays, you'll first have to import the array module which contains all the necassary functions.

There are three ways you can import the array module:

  1. By using import array at the top of the file. This includes the module array. You would then go on to create an array using array.array().
import array

#how you would create an array
array.array()
  1. Instead of having to type array.array() all the time, you could use import array as arr at the top of the file, instead of import array alone. You would then create an array by typing arr.array(). The arr acts as an alias name, with the array constructor then immediately following it.
import array as arr

#how you would create an array
arr.array()
  1. Lastly, you could also use from array import *, with * importing all the functionalities available. You would then create an array by writing the array() constructor alone.
from array import *

#how you would create an array
array()

How to Define Arrays in Python

Once you've imported the array module, you can then go on to define a Python array.

The general syntax for creating an array looks like this:

variable_name = array(typecode,[elements])

Let's break it down:

  • variable_name would be the name of the array.
  • The typecode specifies what kind of elements would be stored in the array. Whether it would be an array of integers, an array of floats or an array of any other Python data type. Remember that all elements should be of the same data type.
  • Inside square brackets you mention the elements that would be stored in the array, with each element being separated by a comma. You can also create an empty array by just writing variable_name = array(typecode) alone, without any elements.

Below is a typecode table, with the different typecodes that can be used with the different data types when defining Python arrays:

TYPECODEC TYPEPYTHON TYPESIZE
'b'signed charint1
'B'unsigned charint1
'u'wchar_tUnicode character2
'h'signed shortint2
'H'unsigned shortint2
'i'signed intint2
'I'unsigned intint2
'l'signed longint4
'L'unsigned longint4
'q'signed long longint8
'Q'unsigned long longint8
'f'floatfloat4
'd'doublefloat8

Tying everything together, here is an example of how you would define an array in Python:

import array as arr 

numbers = arr.array('i',[10,20,30])


print(numbers)

#output

#array('i', [10, 20, 30])

Let's break it down:

  • First we included the array module, in this case with import array as arr .
  • Then, we created a numbers array.
  • We used arr.array() because of import array as arr .
  • Inside the array() constructor, we first included i, for signed integer. Signed integer means that the array can include positive and negative values. Unsigned integer, with H for example, would mean that no negative values are allowed.
  • Lastly, we included the values to be stored in the array in square brackets.

Keep in mind that if you tried to include values that were not of i typecode, meaning they were not integer values, you would get an error:

import array as arr 

numbers = arr.array('i',[10.0,20,30])


print(numbers)

#output

#Traceback (most recent call last):
# File "/Users/dionysialemonaki/python_articles/demo.py", line 14, in <module>
#   numbers = arr.array('i',[10.0,20,30])
#TypeError: 'float' object cannot be interpreted as an integer

In the example above, I tried to include a floating point number in the array. I got an error because this is meant to be an integer array only.

Another way to create an array is the following:

from array import *

#an array of floating point values
numbers = array('d',[10.0,20.0,30.0])

print(numbers)

#output

#array('d', [10.0, 20.0, 30.0])

The example above imported the array module via from array import * and created an array numbers of float data type. This means that it holds only floating point numbers, which is specified with the 'd' typecode.

How to Find the Length of an Array in Python

To find out the exact number of elements contained in an array, use the built-in len() method.

It will return the integer number that is equal to the total number of elements in the array you specify.

import array as arr 

numbers = arr.array('i',[10,20,30])


print(len(numbers))

#output
# 3

In the example above, the array contained three elements – 10, 20, 30 – so the length of numbers is 3.

Array Indexing and How to Access Individual Items in an Array in Python

Each item in an array has a specific address. Individual items are accessed by referencing their index number.

Indexing in Python, and in all programming languages and computing in general, starts at 0. It is important to remember that counting starts at 0 and not at 1.

To access an element, you first write the name of the array followed by square brackets. Inside the square brackets you include the item's index number.

The general syntax would look something like this:

array_name[index_value_of_item]

Here is how you would access each individual element in an array:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers[0]) # gets the 1st element
print(numbers[1]) # gets the 2nd element
print(numbers[2]) # gets the 3rd element

#output

#10
#20
#30

Remember that the index value of the last element of an array is always one less than the length of the array. Where n is the length of the array, n - 1 will be the index value of the last item.

Note that you can also access each individual element using negative indexing.

With negative indexing, the last element would have an index of -1, the second to last element would have an index of -2, and so on.

Here is how you would get each item in an array using that method:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers[-1]) #gets last item
print(numbers[-2]) #gets second to last item
print(numbers[-3]) #gets first item
 
#output

#30
#20
#10

How to Search Through an Array in Python

You can find out an element's index number by using the index() method.

You pass the value of the element being searched as the argument to the method, and the element's index number is returned.

import array as arr 

numbers = arr.array('i',[10,20,30])

#search for the index of the value 10
print(numbers.index(10))

#output

#0

If there is more than one element with the same value, the index of the first instance of the value will be returned:

import array as arr 


numbers = arr.array('i',[10,20,30,10,20,30])

#search for the index of the value 10
#will return the index number of the first instance of the value 10
print(numbers.index(10))

#output

#0

How to Loop through an Array in Python

You've seen how to access each individual element in an array and print it out on its own.

You've also seen how to print the array, using the print() method. That method gives the following result:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers)

#output

#array('i', [10, 20, 30])

What if you want to print each value one by one?

This is where a loop comes in handy. You can loop through the array and print out each value, one-by-one, with each loop iteration.

For this you can use a simple for loop:

import array as arr 

numbers = arr.array('i',[10,20,30])

for number in numbers:
    print(number)
    
#output
#10
#20
#30

You could also use the range() function, and pass the len() method as its parameter. This would give the same result as above:

import array as arr  

values = arr.array('i',[10,20,30])

#prints each individual value in the array
for value in range(len(values)):
    print(values[value])

#output

#10
#20
#30

How to Slice an Array in Python

To access a specific range of values inside the array, use the slicing operator, which is a colon :.

When using the slicing operator and you only include one value, the counting starts from 0 by default. It gets the first item, and goes up to but not including the index number you specify.


import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#get the values 10 and 20 only
print(numbers[:2])  #first to second position

#output

#array('i', [10, 20])

When you pass two numbers as arguments, you specify a range of numbers. In this case, the counting starts at the position of the first number in the range, and up to but not including the second one:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])


#get the values 20 and 30 only
print(numbers[1:3]) #second to third position

#output

#rray('i', [20, 30])

Methods For Performing Operations on Arrays in Python

Arrays are mutable, which means they are changeable. You can change the value of the different items, add new ones, or remove any you don't want in your program anymore.

Let's see some of the most commonly used methods which are used for performing operations on arrays.

How to Change the Value of an Item in an Array

You can change the value of a specific element by speficying its position and assigning it a new value:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#change the first element
#change it from having a value of 10 to having a value of 40
numbers[0] = 40

print(numbers)

#output

#array('i', [40, 20, 30])

How to Add a New Value to an Array

To add one single value at the end of an array, use the append() method:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 to the end of numbers
numbers.append(40)

print(numbers)

#output

#array('i', [10, 20, 30, 40])

Be aware that the new item you add needs to be the same data type as the rest of the items in the array.

Look what happens when I try to add a float to an array of integers:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 to the end of numbers
numbers.append(40.0)

print(numbers)

#output

#Traceback (most recent call last):
#  File "/Users/dionysialemonaki/python_articles/demo.py", line 19, in <module>
#   numbers.append(40.0)
#TypeError: 'float' object cannot be interpreted as an integer

But what if you want to add more than one value to the end an array?

Use the extend() method, which takes an iterable (such as a list of items) as an argument. Again, make sure that the new items are all the same data type.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integers 40,50,60 to the end of numbers
#The numbers need to be enclosed in square brackets

numbers.extend([40,50,60])

print(numbers)

#output

#array('i', [10, 20, 30, 40, 50, 60])

And what if you don't want to add an item to the end of an array? Use the insert() method, to add an item at a specific position.

The insert() function takes two arguments: the index number of the position the new element will be inserted, and the value of the new element.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 in the first position
#remember indexing starts at 0

numbers.insert(0,40)

print(numbers)

#output

#array('i', [40, 10, 20, 30])

How to Remove a Value from an Array

To remove an element from an array, use the remove() method and include the value as an argument to the method.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

numbers.remove(10)

print(numbers)

#output

#array('i', [20, 30])

With remove(), only the first instance of the value you pass as an argument will be removed.

See what happens when there are more than one identical values:


import array as arr 

#original array
numbers = arr.array('i',[10,20,30,10,20])

numbers.remove(10)

print(numbers)

#output

#array('i', [20, 30, 10, 20])

Only the first occurence of 10 is removed.

You can also use the pop() method, and specify the position of the element to be removed:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30,10,20])

#remove the first instance of 10
numbers.pop(0)

print(numbers)

#output

#array('i', [20, 30, 10, 20])

Conclusion

And there you have it - you now know the basics of how to create arrays in Python using the array module. Hopefully you found this guide helpful.

You'll start from the basics and learn in an interacitve and beginner-friendly way. You'll also build five projects at the end to put into practice and help reinforce what you learned.

Thanks for reading and happy coding!

Original article source at https://www.freecodecamp.org

#python 

How to Create Arrays in Python

In this tutorial, you'll know the basics of how to create arrays in Python using the array module. Learn how to use Python arrays. You'll see how to define them and the different methods commonly used for performing operations on them.

This tutorialvideo on 'Arrays in Python' will help you establish a strong hold on all the fundamentals in python programming language. Below are the topics covered in this video:  
1:15 What is an array?
2:53 Is python list same as an array?
3:48  How to create arrays in python?
7:19 Accessing array elements
9:59 Basic array operations
        - 10:33  Finding the length of an array
        - 11:44  Adding Elements
        - 15:06  Removing elements
        - 18:32  Array concatenation
       - 20:59  Slicing
       - 23:26  Looping  


Python Array Tutorial – Define, Index, Methods

In this article, you'll learn how to use Python arrays. You'll see how to define them and the different methods commonly used for performing operations on them.

The artcile covers arrays that you create by importing the array module. We won't cover NumPy arrays here.

Table of Contents

  1. Introduction to Arrays
    1. The differences between Lists and Arrays
    2. When to use arrays
  2. How to use arrays
    1. Define arrays
    2. Find the length of arrays
    3. Array indexing
    4. Search through arrays
    5. Loop through arrays
    6. Slice an array
  3. Array methods for performing operations
    1. Change an existing value
    2. Add a new value
    3. Remove a value
  4. Conclusion

Let's get started!

What are Python Arrays?

Arrays are a fundamental data structure, and an important part of most programming languages. In Python, they are containers which are able to store more than one item at the same time.

Specifically, they are an ordered collection of elements with every value being of the same data type. That is the most important thing to remember about Python arrays - the fact that they can only hold a sequence of multiple items that are of the same type.

What's the Difference between Python Lists and Python Arrays?

Lists are one of the most common data structures in Python, and a core part of the language.

Lists and arrays behave similarly.

Just like arrays, lists are an ordered sequence of elements.

They are also mutable and not fixed in size, which means they can grow and shrink throughout the life of the program. Items can be added and removed, making them very flexible to work with.

However, lists and arrays are not the same thing.

Lists store items that are of various data types. This means that a list can contain integers, floating point numbers, strings, or any other Python data type, at the same time. That is not the case with arrays.

As mentioned in the section above, arrays store only items that are of the same single data type. There are arrays that contain only integers, or only floating point numbers, or only any other Python data type you want to use.

When to Use Python Arrays

Lists are built into the Python programming language, whereas arrays aren't. Arrays are not a built-in data structure, and therefore need to be imported via the array module in order to be used.

Arrays of the array module are a thin wrapper over C arrays, and are useful when you want to work with homogeneous data.

They are also more compact and take up less memory and space which makes them more size efficient compared to lists.

If you want to perform mathematical calculations, then you should use NumPy arrays by importing the NumPy package. Besides that, you should just use Python arrays when you really need to, as lists work in a similar way and are more flexible to work with.

How to Use Arrays in Python

In order to create Python arrays, you'll first have to import the array module which contains all the necassary functions.

There are three ways you can import the array module:

  • By using import array at the top of the file. This includes the module array. You would then go on to create an array using array.array().
import array

#how you would create an array
array.array()
  • Instead of having to type array.array() all the time, you could use import array as arr at the top of the file, instead of import array alone. You would then create an array by typing arr.array(). The arr acts as an alias name, with the array constructor then immediately following it.
import array as arr

#how you would create an array
arr.array()
  • Lastly, you could also use from array import *, with * importing all the functionalities available. You would then create an array by writing the array() constructor alone.
from array import *

#how you would create an array
array()

How to Define Arrays in Python

Once you've imported the array module, you can then go on to define a Python array.

The general syntax for creating an array looks like this:

variable_name = array(typecode,[elements])

Let's break it down:

  • variable_name would be the name of the array.
  • The typecode specifies what kind of elements would be stored in the array. Whether it would be an array of integers, an array of floats or an array of any other Python data type. Remember that all elements should be of the same data type.
  • Inside square brackets you mention the elements that would be stored in the array, with each element being separated by a comma. You can also create an empty array by just writing variable_name = array(typecode) alone, without any elements.

Below is a typecode table, with the different typecodes that can be used with the different data types when defining Python arrays:

TYPECODEC TYPEPYTHON TYPESIZE
'b'signed charint1
'B'unsigned charint1
'u'wchar_tUnicode character2
'h'signed shortint2
'H'unsigned shortint2
'i'signed intint2
'I'unsigned intint2
'l'signed longint4
'L'unsigned longint4
'q'signed long longint8
'Q'unsigned long longint8
'f'floatfloat4
'd'doublefloat8

Tying everything together, here is an example of how you would define an array in Python:

import array as arr 

numbers = arr.array('i',[10,20,30])


print(numbers)

#output

#array('i', [10, 20, 30])

Let's break it down:

  • First we included the array module, in this case with import array as arr .
  • Then, we created a numbers array.
  • We used arr.array() because of import array as arr .
  • Inside the array() constructor, we first included i, for signed integer. Signed integer means that the array can include positive and negative values. Unsigned integer, with H for example, would mean that no negative values are allowed.
  • Lastly, we included the values to be stored in the array in square brackets.

Keep in mind that if you tried to include values that were not of i typecode, meaning they were not integer values, you would get an error:

import array as arr 

numbers = arr.array('i',[10.0,20,30])


print(numbers)

#output

#Traceback (most recent call last):
# File "/Users/dionysialemonaki/python_articles/demo.py", line 14, in <module>
#   numbers = arr.array('i',[10.0,20,30])
#TypeError: 'float' object cannot be interpreted as an integer

In the example above, I tried to include a floating point number in the array. I got an error because this is meant to be an integer array only.

Another way to create an array is the following:

from array import *

#an array of floating point values
numbers = array('d',[10.0,20.0,30.0])

print(numbers)

#output

#array('d', [10.0, 20.0, 30.0])

The example above imported the array module via from array import * and created an array numbers of float data type. This means that it holds only floating point numbers, which is specified with the 'd' typecode.

How to Find the Length of an Array in Python

To find out the exact number of elements contained in an array, use the built-in len() method.

It will return the integer number that is equal to the total number of elements in the array you specify.

import array as arr 

numbers = arr.array('i',[10,20,30])


print(len(numbers))

#output
# 3

In the example above, the array contained three elements – 10, 20, 30 – so the length of numbers is 3.

Array Indexing and How to Access Individual Items in an Array in Python

Each item in an array has a specific address. Individual items are accessed by referencing their index number.

Indexing in Python, and in all programming languages and computing in general, starts at 0. It is important to remember that counting starts at 0 and not at 1.

To access an element, you first write the name of the array followed by square brackets. Inside the square brackets you include the item's index number.

The general syntax would look something like this:

array_name[index_value_of_item]

Here is how you would access each individual element in an array:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers[0]) # gets the 1st element
print(numbers[1]) # gets the 2nd element
print(numbers[2]) # gets the 3rd element

#output

#10
#20
#30

Remember that the index value of the last element of an array is always one less than the length of the array. Where n is the length of the array, n - 1 will be the index value of the last item.

Note that you can also access each individual element using negative indexing.

With negative indexing, the last element would have an index of -1, the second to last element would have an index of -2, and so on.

Here is how you would get each item in an array using that method:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers[-1]) #gets last item
print(numbers[-2]) #gets second to last item
print(numbers[-3]) #gets first item
 
#output

#30
#20
#10

How to Search Through an Array in Python

You can find out an element's index number by using the index() method.

You pass the value of the element being searched as the argument to the method, and the element's index number is returned.

import array as arr 

numbers = arr.array('i',[10,20,30])

#search for the index of the value 10
print(numbers.index(10))

#output

#0

If there is more than one element with the same value, the index of the first instance of the value will be returned:

import array as arr 


numbers = arr.array('i',[10,20,30,10,20,30])

#search for the index of the value 10
#will return the index number of the first instance of the value 10
print(numbers.index(10))

#output

#0

How to Loop through an Array in Python

You've seen how to access each individual element in an array and print it out on its own.

You've also seen how to print the array, using the print() method. That method gives the following result:

import array as arr 

numbers = arr.array('i',[10,20,30])

print(numbers)

#output

#array('i', [10, 20, 30])

What if you want to print each value one by one?

This is where a loop comes in handy. You can loop through the array and print out each value, one-by-one, with each loop iteration.

For this you can use a simple for loop:

import array as arr 

numbers = arr.array('i',[10,20,30])

for number in numbers:
    print(number)
    
#output
#10
#20
#30

You could also use the range() function, and pass the len() method as its parameter. This would give the same result as above:

import array as arr  

values = arr.array('i',[10,20,30])

#prints each individual value in the array
for value in range(len(values)):
    print(values[value])

#output

#10
#20
#30

How to Slice an Array in Python

To access a specific range of values inside the array, use the slicing operator, which is a colon :.

When using the slicing operator and you only include one value, the counting starts from 0 by default. It gets the first item, and goes up to but not including the index number you specify.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#get the values 10 and 20 only
print(numbers[:2])  #first to second position

#output

#array('i', [10, 20])

When you pass two numbers as arguments, you specify a range of numbers. In this case, the counting starts at the position of the first number in the range, and up to but not including the second one:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])


#get the values 20 and 30 only
print(numbers[1:3]) #second to third position

#output

#rray('i', [20, 30])

Methods For Performing Operations on Arrays in Python

Arrays are mutable, which means they are changeable. You can change the value of the different items, add new ones, or remove any you don't want in your program anymore.

Let's see some of the most commonly used methods which are used for performing operations on arrays.

How to Change the Value of an Item in an Array

You can change the value of a specific element by speficying its position and assigning it a new value:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#change the first element
#change it from having a value of 10 to having a value of 40
numbers[0] = 40

print(numbers)

#output

#array('i', [40, 20, 30])

How to Add a New Value to an Array

To add one single value at the end of an array, use the append() method:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 to the end of numbers
numbers.append(40)

print(numbers)

#output

#array('i', [10, 20, 30, 40])

Be aware that the new item you add needs to be the same data type as the rest of the items in the array.

Look what happens when I try to add a float to an array of integers:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 to the end of numbers
numbers.append(40.0)

print(numbers)

#output

#Traceback (most recent call last):
#  File "/Users/dionysialemonaki/python_articles/demo.py", line 19, in <module>
#   numbers.append(40.0)
#TypeError: 'float' object cannot be interpreted as an integer

But what if you want to add more than one value to the end an array?

Use the extend() method, which takes an iterable (such as a list of items) as an argument. Again, make sure that the new items are all the same data type.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integers 40,50,60 to the end of numbers
#The numbers need to be enclosed in square brackets

numbers.extend([40,50,60])

print(numbers)

#output

#array('i', [10, 20, 30, 40, 50, 60])

And what if you don't want to add an item to the end of an array? Use the insert() method, to add an item at a specific position.

The insert() function takes two arguments: the index number of the position the new element will be inserted, and the value of the new element.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

#add the integer 40 in the first position
#remember indexing starts at 0

numbers.insert(0,40)

print(numbers)

#output

#array('i', [40, 10, 20, 30])

How to Remove a Value from an Array

To remove an element from an array, use the remove() method and include the value as an argument to the method.

import array as arr 

#original array
numbers = arr.array('i',[10,20,30])

numbers.remove(10)

print(numbers)

#output

#array('i', [20, 30])

With remove(), only the first instance of the value you pass as an argument will be removed.

See what happens when there are more than one identical values:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30,10,20])

numbers.remove(10)

print(numbers)

#output

#array('i', [20, 30, 10, 20])

Only the first occurence of 10 is removed.

You can also use the pop() method, and specify the position of the element to be removed:

import array as arr 

#original array
numbers = arr.array('i',[10,20,30,10,20])

#remove the first instance of 10
numbers.pop(0)

print(numbers)

#output

#array('i', [20, 30, 10, 20])

Conclusion

And there you have it - you now know the basics of how to create arrays in Python using the array module. Hopefully you found this guide helpful.

Thanks for reading and happy coding!

#python #programming 

Layne  Fadel

Layne Fadel

1663941240

10 Popular PostgreSQL Optimization Libraries

In this Postgres article, let's learn about Optimization: 10 Popular PostgreSQL Optimization Libraries

Table of contents:

  • pg_flame - A flamegraph generator for query plans.
  • PgHero - PostgreSQL insights made easy.
  • pgtune - PostgreSQL configuration wizard.
  • pgtune - Online version of PostgreSQL configuration wizard.
  • pgconfig.org - PostgreSQL Online Configuration Tool (also based on pgtune).
  • PoWA - PostgreSQL Workload Analyzer gathers performance stats and provides real-time charts and graphs to help monitor and tune your PostgreSQL servers.
  • pg_web_stats - Web UI to view pg_stat_statements.
  • TimescaleDB Tune - a program for tuning a TimescaleDB database to perform its best based on the host's resources such as memory and number of CPUs.

What is PostgreSQL?

PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley and has more than 30 years of active development on the core platform.

PostgreSQL has earned a strong reputation for its proven architecture, reliability, data integrity, robust feature set, extensibility, and the dedication of the open source community behind the software to consistently deliver performant and innovative solutions. PostgreSQL runs on all major operating systems, has been ACID-compliant since 2001, and has powerful add-ons such as the popular PostGIS geospatial database extender. It is no surprise that PostgreSQL has become the open source relational database of choice for many people and organisations.

What is PostgreSQL query optimization?

Just like any advanced relational database, PostgreSQL uses a cost-based query optimizer that tries to turn your SQL queries into something efficient that executes in as little time as possible


10 Popular PostgreSQL Optimization Libraries

  1. pg_flame

A flamegraph generator for Postgres EXPLAIN ANALYZE output.
Installation

Homebrew

You can install via Homebrew with the follow command:

$ brew install mgartner/tap/pg_flame

Download pre-compiled binary

Download one of the compiled binaries in the releases tab. Once downloaded, move pg_flame into your $PATH.

Docker

Alternatively, if you'd like to use Docker to build the program, you can.

$ docker pull mgartner/pg_flame

Build from source

If you'd like to build a binary from the source code, run the following commands. Note that compiling requires Go version 1.13+.

$ git clone https://github.com/mgartner/pg_flame.git
$ cd pg_flame
$ go build

A pg_flame binary will be created that you can place in your $PATH.

View on GitHub


2.  PgHero

A performance dashboard for Postgres


Documentation

PgHero is available as a Docker image, Linux package, and Rails engine.

View on GitHub


3.  pgtune

pgtune takes the wimpy default postgresql.conf and expands the database server to be as powerful as the hardware it's being deployed on.

Installation/Usage

Source installation

There is no need to build/compile pgtune, it is a Python script. Extracting the tarball to a convenient location is sufficient. Note that you will need the multiple pg_settings-<version>_<architecture> files included with the program too, pgtune can't work without those.

RPM Installation

The RPM package installs:

  • The pgtune binary under/usr/bin
  • Documents in /usr/share/doc/pgtune-$version
  • Setting files in /usr/share/pgtune

Using pgtune

pgtune works by taking an existing postgresql.conf file as an input, making changes to it based on the amount of RAM in your server and suggested workload, and output a new file.

Here's a sample usage:

pgtune -i $PGDATA/postgresql.conf -o $PGDATA/postgresql.conf.pgtune

pgtune --help will give you additional usage information. These are the current parameters:

  • -i or --input-config : Specifies the current postgresql.conf file.
  • -o or --output-config : Specifies the file name for the new postgresql.conf file.
  • -M or --memory: Use this parameter to specify total system memory. If not specified, pgtune will attempt to detect memory size.
  • -T or --type : Specifies database type. Valid options are: DW, OLTP, Web, Mixed, Desktop
  • -P or --platform : Specifies platform, defaults to the platform running the program. Valid options are Windows, Linux, and Darwin (Mac OS X).
  • -c or --connections: Specifies number of maximum connections expected. If not specified, it depends on database type.
  • -D or --debug : Enables debugging mode.
  • -S or --settings: Directory where settings data files are located at. Defaults to the directory where the script is being run from. The RPM package includes a patch to use the correct location these files were installed into.

View on GitHub


4.  PgTune

Pgtune - tuning PostgreSQL config by your hardware


Tuning PostgreSQL config by your hardware. Based on original pgtune. Illustration by Kate.

Development

Web app build on top of middleman. To start it in development mode, you need install ruby, node.js and run in terminal:

$ bundle # get all ruby deps
$ yarn # get all node.js deps
$ middleman server # start server on 4567 port

View on GitHub


5.  pgconfig

Web Based PostgreSQL configuration tool

View on GitHub


6.  Powa

You can try powa at demo-powa.anayrat.info. Just click "Login" and try its features! Note that in order to get interesting metrics, resources have been limited on this server (2 vCPU, 384MB of RAM and 150iops for the disks). Please be patient when using it.

Thanks to Adrien Nayrat for providing it.

PoWA (PostgreSQL Workload Analyzer) is a performance tool for PostgreSQL 9.4 and newer allowing to collect, aggregate and purge statistics on multiple PostgreSQL instances from various :ref:`stat_extensions`.

Depending on your needs, you can either use the provided background worker (requires a PostgreSQL restart, and more suited for single-instance setups), or the provided :ref:`powa_collector` daemon (does not require a PostgreSQL restart, can gather performance metrics from multiple instances, including standby).

Main components

  • PoWA-archivist is the PostgreSQL extension, collecting statistics.
  • PoWA-collector is the daemon that gather performance metrics from remote PostgreSQL instances (optional) on a dedicated repository server.
  • PoWA-web is the graphical user interface to powa-collected metrics.
  • Stat extensions are the actual source of data.
  • PoWA is the whole project.

View on GitHub


7.  Pg_web_stats

Web UI to view pg_stat_statements

Features

  • Sorting by any column from pg_stat_statements
  • Filtering by database or user
  • Highlighting important queries && hidding not important queries

Installation

  1. Prepare your PG setup: enable the pg_stat_statements extension and execute CREATE EXTENSION pg_stat_statements inside the database you want to inspect. Hint: there is an awesome article about pg_stat_statements in russian.
  2. Clone the repo
  3. Fill config.yml.example with your credentians and save it as config.yml
  4. Start the app: rake server (or run rake console to have command line)
  5. ???
  6. PROFIT

Mount inside a rails app

Add this line to your application's Gemfile:

gem 'pg_web_stats', require: 'pg_web_stats_app'

Or if gem is not released yet

gem 'pg_web_stats', git: 'https://github.com/shhavel/pg_web_stats', require: 'pg_web_stats_app'

And then execute:

$ bundle

Create file config/initializers/pg_web_stats.rb

# Configure database connection
config_hash = YAML.load_file(Rails.root.join('config', 'database.yml'))[Rails.env]
PG_WEB_STATS = PgWebStats.new(config_hash)

# Restrict access to pg_web_stats with Basic Authentication
# (or use any other authentication system).
PgWebStatsApp.use(Rack::Auth::Basic) do |user, password|
  password == "secret"
end

Add to routes.rb

mount PgWebStatsApp, at: '/pg_stats'

View on GitHub


8.  timescaledb-tune

timescaledb-tune is a program for tuning a TimescaleDB database to perform its best based on the host's resources such as memory and number of CPUs. It parses the existing postgresql.conf file to ensure that the TimescaleDB extension is appropriately installed and provides recommendations for memory, parallelism, WAL, and other settings.

Getting started

You need the Go runtime (1.12+) installed, then simply go install this repo:

$ go install github.com/timescale/timescaledb-tune/cmd/timescaledb-tune@main

It is also available as a binary package on a variety systems using Homebrew, yum, or apt. Search for timescaledb-tools.

Using timescaledb-tune

By default, timescaledb-tune attempts to locate your postgresql.conf file for parsing by using heuristics based on the operating system, so the simplest invocation would be:

$ timescaledb-tune

You'll then be given a series of prompts that require minimal user input to make sure your config file is up to date:

Using postgresql.conf at this path:
/usr/local/var/postgres/postgresql.conf

Is this correct? [(y)es/(n)o]: y
Writing backup to:
/var/folders/cr/zpgdkv194vz1g5smxl_5tggm0000gn/T/timescaledb_tune.backup201901071520

shared_preload_libraries needs to be updated
Current:
#shared_preload_libraries = 'timescaledb'
Recommended:
shared_preload_libraries = 'timescaledb'
Is this okay? [(y)es/(n)o]: y
success: shared_preload_libraries will be updated

Tune memory/parallelism/WAL and other settings? [(y)es/(n)o]: y
Recommendations based on 8.00 GB of available memory and 4 CPUs for PostgreSQL 11

Memory settings recommendations
Current:
shared_buffers = 128MB
#effective_cache_size = 4GB
#maintenance_work_mem = 64MB
#work_mem = 4MB
Recommended:
shared_buffers = 2GB
effective_cache_size = 6GB
maintenance_work_mem = 1GB
work_mem = 26214kB
Is this okay? [(y)es/(s)kip/(q)uit]:

View on GitHub


PostgreSQL Optimization FAQ

  • What is PostgreSQL query optimization?

Just like any advanced relational database, PostgreSQL uses a cost-based query optimizer that tries to turn your SQL queries into something efficient that executes in as little time as possible

  • How do I optimize a table in PostgreSQL?

4 Ways To Optimise PostgreSQL Database With Millions of Data

  1. speed up database operations.
  2. reduce the load.
  3. reduce the size of the database.
  4. take advantage of the out-of-the-box feature to help with overall database optimization.
  • What is database performance optimization?

The goal of database performance tuning is to minimize the response time of your queries by making the best use of your system resources. The best use of these resources involves minimizing network traffic, disk I/O, and CPU time.

  • How make PostgreSQL query run faster?

Some of the tricks we used to speed up SELECT-s in PostgreSQL: LEFT JOIN with redundant conditions, VALUES, extended statistics, primary key type conversion, CLUSTER, pg_hint_plan + bonus.

  • Can Postgres handle 1 billion rows?

As commercial database vendors are bragging about their capabilities we decided to push PostgreSQL to the next level and exceed 1 billion rows per second to show what we can do with Open Source. To those who need even more: 1 billion rows is by far not the limit - a lot more is possible.


Related videos:

PostgreSQL Query Optimization Techniques


Related posts:

#postgresql 

Samanta  Moore

Samanta Moore

1623834960

Top 10 Popular Java Frameworks Every Developer Should Know in 2021

Java frameworks are essentially blocks of pre-written code, to which a programmer may add his code to solve specific problems. Several Java frameworks exist, all of which have their pros and cons. All of them can be used to solve problems in a variety of fields and domains. Java frameworks reduce the amount of coding from scratch that programmers have to do to come up with a solution.

Table of Contents

#full stack development #frameworks #java #java frameworks #top 10 popular java frameworks every developer should know in 2021 #top 10 popular java frameworks

Awesome  Rust

Awesome Rust

1658878980

Git Branchless: Branchless Workflow for Git Built on Rust

Branchless workflow for Git

(This suite of tools is 100% compatible with branches. If you think this is confusing, you can suggest a new name here.)

About

git-branchless is a suite of tools which enhances Git in several ways:

It makes Git easier to use, both for novices and for power users. Examples:

It adds more flexibility for power users. Examples:

  • Patch-stack workflows: strong support for "patch-stack" workflows as used by the Linux and Git projects, as well as at many large tech companies. (This is how Git was "meant" to be used.)
  • Prototyping and experimenting workflows: strong support for prototyping and experimental work via "divergent" development.
  • git sync: to rebase all local commit stacks and branches without having to check them out first.
  • git move: The ability to move subtrees rather than "sticks" while cleaning up old branches, not touching the working copy, etc.
  • Anonymous branching: reduces the overhead of branching for experimental work.
  • In-memory operations: to modify the commit graph without having to check out the commits in question.
  • git next/prev: to quickly jump between commits and branches in a commit stack.
  • git co -i/--interactive: to interactively select a commit to check out.

It provides faster operations for large repositories and monorepos, particularly at large tech companies. Examples:

  • See the blog post Lightning-fast rebases with git-move.
  • Performance tested: benchmarked on torvalds/linux (1M+ commits) and mozilla/gecko-dev (700k+ commits).
  • Operates in-memory: avoids touching the working copy by default (which can slow down git status or invalidate build artifacts).
  • Sparse indexes: uses a custom implementation of sparse indexes for fast commit and merge operations.
  • Segmented changelog DAG: for efficient queries on the commit graph, such as merge-base calculation in O(log n) instead of O(n).
  • Ahead-of-time compiled: written in an ahead-of-time compiled language with good runtime performance (Rust).
  • Multithreading: distributes work across multiple CPU cores where appropriate.
  • To my knowledge, git-branchless provides the fastest implementation of rebase among Git tools and UIs, for the above reasons.

See also the User guide and Design goals.

Demos

Repair

Undo almost anything:

  • Commits.
  • Amended commits.
  • Merges and rebases (e.g. if you resolved a conflict wrongly).
  • Checkouts.
  • Branch creations, updates, and deletions.

Why not git reflog?

git reflog is a tool to view the previous position of a single reference (like HEAD), which can be used to undo operations. But since it only tracks the position of a single reference, complicated operations like rebases can be tedious to reverse-engineer. git undo operates at a higher level of abstraction: the entire state of your repository.

git reflog also fundamentally can't be used to undo some rare operations, such as certain branch creations, updates, and deletions. See the architecture document for more details.

What doesn't git undo handle?

git undo relies on features in recent versions of Git to work properly. See the compatibility chart.

Currently, git undo can't undo the following. You can find the design document to handle some of these cases in issue #10.

  • "Uncommitting" a commit by undoing the commit and restoring its changes to the working copy.
    • In stock Git, this can be accomplished with git reset HEAD^.
    • This scenario would be better implemented with a custom git uncommit command instead. See issue #3.
  • Undoing the staging or unstaging of files. This is tracked by issue #10 above.
  • Undoing back into the middle of a conflict, such that git status shows a message like path/to/file (both modified), so that you can resolve that specific conflict differently. This is tracked by issue #10 above.

Fundamentally, git undo is not intended to handle changes to untracked files.

Comparison to other Git undo tools

  • gitjk: Requires a shell alias. Only undoes most recent command. Only handles some Git operations (e.g. doesn't handle rebases).
  • git-extras/git-undo: Only undoes commits at current HEAD.
  • git-annex undo: Only undoes the most recent change to a given file or directory.
  • thefuck: Only undoes historical shell commands. Only handles some Git operations (e.g. doesn't handle rebases).

Visualize

Visualize your commit history with the smartlog (git sl):

Why not `git log --graph`?

git log --graph only shows commits which have branches attached with them. If you prefer to work without branches, then git log --graph won't work for you.

To support users who rewrite their commit graph extensively, git sl also points out commits which have been abandoned and need to be repaired (descendants of commits marked with rewritten as abcd1234). They can be automatically fixed up with git restack, or manually handled.

Manipulate

Edit your commit graph without fear:

Why not `git rebase -i`?

Interactive rebasing with git rebase -i is fully supported, but it has a couple of shortcomings:

  • git rebase -i can only repair linear series of commits, not trees. If you modify a commit with multiple children, then you have to be sure to rebase all of the other children commits appropriately.
  • You have to commit to a plan of action before starting the rebase. For some use-cases, it can be easier to operate on individual commits at a time, rather than an entire series of commits all at once.

When you use git rebase -i with git-branchless, you will be prompted to repair your commit graph if you abandon any commits.

Installation

See https://github.com/arxanas/git-branchless/wiki/Installation.

Short version: run cargo install --locked git-branchless, then run git branchless init in your repository.

Status

git-branchless is currently in alpha. Be prepared for breaking changes, as some of the workflows and architecture may change in the future. It's believed that there are no major bugs, but it has not yet been comprehensively battle-tested. You can see the known issues in the issue tracker.

git-branchless follows semantic versioning. New 0.x.y versions, and new major versions after reaching 1.0.0, may change the on-disk format in a backward-incompatible way.

To be notified about new versions, select Watch » Custom » Releases in Github's notifications menu at the top of the page. Or use GitPunch to deliver notifications by email.

Related tools

There's a lot of promising tooling developing in this space. See Related tools for more information.

Contributing

Thanks for your interest in contributing! If you'd like, I'm happy to set up a call to help you onboard.

For code contributions, check out the Runbook to understand how to set up a development workflow, and the Coding guidelines. You may also want to read the Architecture documentation.

For contributing documentation, see the Wiki style guide.

Contributors should abide by the Code of Conduct.

Download details:
Author: arxanas
Source code: https://github.com/arxanas/git-branchless
License: GPL-2.0 license

#rust #rustlang #git