To learn more about slicing and other sequential types, feel free to check out the full guide: Python for Machine Learning: Indexing and Slicing for Lists, Tuples, Strings, and other Sequential Types.

Python supports slice notation for any sequential data type, such as lists, strings, tuples, bytes, bytearrays, and ranges. Any new data structure can add support for it as well. Slice notation is used heavily (and sometimes abused) in the NumPy and Pandas libraries, which are popular in Machine Learning and Data Science. It’s a good example of “learn once, use everywhere.”
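As a rough illustration of how a new data structure can opt in, here is a minimal sketch. The Playlist class and its track names are hypothetical and exist only for this example; the only real mechanism used is Python's __getitem__ method, which receives an int for obj[i] and a slice object for obj[a:b].

```python
class Playlist:
    """Hypothetical container that supports indexing and slicing."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)

    def __getitem__(self, key):
        # Python passes an int for obj[i] and a slice object for obj[a:b].
        if isinstance(key, slice):
            # Return the same type so a slice of a Playlist is a Playlist.
            return Playlist(self._tracks[key])
        # Delegate plain (positive or negative) indexes to the inner list.
        return self._tracks[key]

    def __repr__(self):
        return f"Playlist({self._tracks!r})"


playlist = Playlist(["intro", "verse", "chorus", "outro"])
first = playlist[0]      # positive index
last = playlist[-1]      # negative index
middle = playlist[1:3]   # slice notation
print(first, last, middle)
```

Delegating to the inner list is a design choice made here for brevity; a real container might validate or transform the key first.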

Indexing

Before discussing slice notation, we need to have a good grasp of indexing for sequential types.

In Python, a list is akin to an array in other scripting languages (Ruby, JavaScript, PHP). It lets you store an ordered collection of items in one place and access an item by its position, or index.

Let’s take a simple example:
>>> colors = ['red', 'green', 'blue', 'yellow', 'white', 'black']

Here we defined a list of colors. Each item in the list has a value (the color name) and an index (its position in the list). Python uses zero-based indexing. That means the first element (value ‘red’) has index 0, the second (value ‘green’) has index 1, and so on.

To access an element by its index, use square brackets:

>>> colors = ['red', 'green', 'blue', 'yellow', 'white', 'black']
>>> colors[0]
'red'
>>> colors[1]
'green'
>>> colors[5]
'black'
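One consequence of zero-based indexing is that the last valid positive index is len(colors) - 1, not len(colors). The sketch below, which reuses the colors list from above, shows what happens when an index falls outside that range; the variable names last_valid and error_message are only illustrative.

```python
colors = ['red', 'green', 'blue', 'yellow', 'white', 'black']

# With six items, valid positive indexes run from 0 to 5.
last_valid = len(colors) - 1

try:
    colors[6]  # one past the end
except IndexError as exc:
    # Python raises IndexError instead of returning a placeholder value.
    error_message = str(exc)

print(last_valid, error_message)
```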

Negative indexes

Using indexing, we can easily get any element by its position. This is handy when we count positions from the head of a list. But what if we want to take the last element of a list? Or the penultimate element? In this case, we want to count elements from the tail of the list.

Negative indexing addresses this requirement. Instead of using indexes from zero and above, we can use indexes from -1 and below.

In the negative indexing system, -1 corresponds to the last element of the list (value ‘black’), -2 to the penultimate (value ‘white’), and so on.

>>> colors = ['red', 'green', 'blue', 'yellow', 'white', 'black']
>>> colors[-1]
'black'
>>> colors[-2]
'white'
>>> colors[-6]
'red'
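It can help to think of a negative index -k as shorthand for the positive index len(colors) - k. The following sketch checks that correspondence for every element of the colors list; the names n, pairs, and all_match are introduced here purely for illustration.

```python
colors = ['red', 'green', 'blue', 'yellow', 'white', 'black']
n = len(colors)

# Pair each negative index -k with its positive counterpart n - k.
pairs = [(-k, n - k) for k in range(1, n + 1)]

# Both indexes in every pair should address the same element.
all_match = all(colors[neg] == colors[pos] for neg, pos in pairs)

for neg, pos in pairs:
    print(neg, pos, colors[neg])
```

For a list of six items, the valid negative indexes therefore run from -1 down to -6.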

Assignment

So far, we have used indexing only to read the content of a list cell. It’s also possible to change a cell’s content using an assignment:

>>> basket = ['bread', 'butter', 'milk']
>>> basket[0] = 'cake'
>>> basket
['cake', 'butter', 'milk']
>>> basket[-1] = 'water'
>>> basket
['cake', 'butter', 'water']

We can freely use positive or negative indexing for assignment.
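Note that index assignment only replaces an existing cell; it does not grow the list. A small sketch of that behavior, reusing the basket list (the name assignment_error is illustrative only):

```python
basket = ['bread', 'butter', 'milk']

try:
    basket[3] = 'eggs'  # index 3 does not exist yet
except IndexError as exc:
    assignment_error = str(exc)

# To add a new item at the end, use append instead of index assignment.
basket.append('eggs')
print(assignment_error, basket)
```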

Deletion

We can also easily delete any element from the list by using indexing and the del statement:

>>> basket = ['bread', 'butter', 'milk']
>>> del basket[0]
>>> basket
['butter', 'milk']
>>> del basket[1]
>>> basket
['butter']
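Notice that in the example above the second del removed ‘milk’, not ‘butter’: after the first deletion, the remaining elements shifted one position to the left. When deleting several positions, one common way to sidestep this shift is to delete from the highest index down. The sketch below assumes a slightly longer basket and a hypothetical list of positions to remove.

```python
basket = ['bread', 'butter', 'milk', 'eggs']
to_remove = [0, 2]  # hypothetical positions: 'bread' and 'milk'

# Deleting from the highest index first keeps the lower indexes valid.
for index in sorted(to_remove, reverse=True):
    del basket[index]

print(basket)
```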

Indexing for other Sequential Types

Read-only indexing operations work for all sequential types. Assignment and deletion, however, are not applicable to immutable sequential types such as strings, tuples, bytes, and ranges.
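A short sketch of the difference, using sample values chosen only for this example: reading by index works on a tuple, a string, a bytes object, and a range, while assignment raises TypeError on the immutable ones. A bytearray, being mutable, accepts both assignment and deletion.

```python
point = (10, 20, 30)
word = "python"
data = b"abc"
numbers = range(10)

# Reading by index works on every sequential type.
first_coordinate = point[0]
last_letter = word[-1]
first_byte = data[0]        # indexing bytes yields an int
last_number = numbers[-1]

# Assignment is rejected by immutable types.
rejected = []
for seq in (point, word, data):
    try:
        seq[0] = seq[0]
    except TypeError:
        rejected.append(type(seq).__name__)

# bytearray is the mutable counterpart of bytes.
buffer = bytearray(b"abc")
buffer[0] = ord("x")
del buffer[-1]

print(rejected, buffer)
```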

The original post was written by my colleague Sergii Boiko, a Full Stack Engineer at Railsware.
