Iterators and Iterables in my Python data science workflow

For most of my career in data science I have not made optimal use of the Python programming language. To be honest, I never formally learned Python, so my early projects were more hacks than structured thinking. Over the last few days I decided to revisit some of my older Python projects and apply better programming ideas to them. One mistake I kept noticing is the misuse of iterables. This post discusses two fundamental Python concepts: iterators and iterables.
Sum of Factorials
As a running example, let’s think of a program that adds together the factorials of the first n integers. This is a slightly contrived example, but the pattern is quite common in the data science applications I have seen. If you are anything like me, you will have written a program that calls a factorial function inside a loop and adds up the results.
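The original listing is not available here, so the following is only a minimal sketch of what such a program might look like. The names factorial and fact_sum_naive are my own choices for illustration.

```python
def factorial(n):
    """Return n! computed from scratch on every call."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def fact_sum_naive(n):
    """Add up 1! + 2! + ... + n! by calling factorial() for each term."""
    total = 0
    for i in range(1, n + 1):
        total += factorial(i)  # every call redoes all the earlier multiplications
    return total
```

The factorial function is reusable on its own, which is the appeal of this version, but the loop does a quadratic number of multiplications.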

You will immediately notice a problem with this program: to calculate factorial(n), we recompute everything from factorial(1) to factorial(n-1), even though those values were already computed on earlier passes through the loop. So the smart reader will write a specialized function that is efficient.
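One way such a specialized function could look is sketched below. It assumes the name fact_sum and keeps a running product so that each factorial is derived from the previous one; treat it as an illustration rather than the exact code from the original post.

```python
def fact_sum(n):
    """Add up 1! + 2! + ... + n! using a running product."""
    total = 0
    running = 1  # holds i! at the end of each iteration
    for i in range(1, n + 1):
        running *= i      # i! = (i - 1)! * i, so nothing is recomputed
        total += running
    return total
```

This does one multiplication per term, but the factorial logic is now buried inside the summation and cannot be reused elsewhere.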

Specialized functions, however, should be avoided as far as possible in functional-style programming. A specialized fact_sum(n) function can’t be reused in any other program that needs a factorial. As a data scientist, most of the code I write is in a functional style, and I would like to avoid specializing functions. For a more systematic overview of functional programming, check out this how-to.
Iterables and Iterators
An alternative approach is to build an iterable and use an iterator to get the factorials. Most of us have used iterators in nearly every Python program we have written. The familiar for i in range(n): loop is one such example: range returns an iterable, and the for loop asks it for an iterator behind the scenes. For our factorial problem, the iterator pattern is a class that implements the __iter__ and __next__ methods, where __iter__ returns the object itself and __next__ produces the next factorial.
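As a hedged sketch of this pattern, the class below implements both methods and produces factorials one at a time. The names Factorials and fact_sum_iter are hypothetical, and the use of itertools.islice to take the first n values is my own choice, not necessarily what the original code did.

```python
import itertools


class Factorials:
    """Iterator that yields 1!, 2!, 3!, ... without end."""

    def __init__(self):
        self.i = 0
        self.value = 1

    def __iter__(self):
        # An iterator returns itself, which also makes it an iterable.
        return self

    def __next__(self):
        self.i += 1
        self.value *= self.i  # reuse the previous factorial
        return self.value


def fact_sum_iter(n):
    """Add up the first n factorials drawn from the iterator."""
    return sum(itertools.islice(Factorials(), n))
```

The factorial stream stays efficient and is no longer tied to the summation: the same Factorials object could feed any other computation that needs successive factorials.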
