Subclassing the Scikit-Learn Pipeline

If you visit the Scikit-Learn developer’s guide, you can easily find a breakdown of the objects that they expect you to customize. It includes the Estimator, Predictor, Transformer, and Model classes, and there’s a nice guide walking you through the ins and outs of their APIs.

But if for some (potentially misguided) reason you’ve decided to implement your own subclass of the sklearn.pipeline.Pipeline class, then you’ll be stepping off the marked trail, and you’re going to need your jungle gear: plenty of coffee, the built-in function dir, the pdb module, and a pillow to scream into occasionally.

The import

When it comes to naming your subclass, you can give it a custom name if you’re hoping to use it only in new code you’re writing. If, however, you’re hoping to integrate your new class into legacy code, it’s simplest to keep the name the same. Here’s how to do that:

from sklearn.pipeline import Pipeline as SKPipeline

class Pipeline(SKPipeline):
    pass

Aliasing the sklearn Pipeline as SKPipeline allows you to use the identifier Pipeline for your new class.
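As a quick sanity check, here is a minimal sketch of how legacy-style code might keep working with the aliased subclass. The step names ("scale", "clf"), the choice of estimators, and the toy data are assumptions made up for illustration; they are not from the original post.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline as SKPipeline
from sklearn.preprocessing import StandardScaler


class Pipeline(SKPipeline):
    """Drop-in subclass: inherits everything from sklearn's Pipeline."""
    pass


# Existing code that builds a Pipeline does not need to change;
# it now gets the subclass because the name is the same.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
pipe.fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])

# The subclass is still recognized as a scikit-learn Pipeline.
is_sk_pipeline = isinstance(pipe, SKPipeline)
step_names = [name for name, _ in pipe.steps]
```

Because the subclass adds nothing yet, it should behave like the original class; the point is only that the name Pipeline now refers to a class you control.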

What’s up with attribute and attribute_?

Suppose you want your special Pipeline class to interact with its steps; perhaps you’re interested in writing a Pipeline.stepinfo property. To do this, you’ll need to interact with the attributes of its constituent objects. Many Scikit-Learn objects have attributes with a trailing underscore, such as components_. For some objects, both the underscored and the non-underscored attributes exist. For example, ColumnTransformer has both a transformers and a transformers_ attribute. Which of these attributes do you want to access, and what’s the difference?

The trailing underscore indicates an attribute that exists in an object after it has been “fitted”. (Yes, “fitted” is how sklearn refers to this state — in error messages that I see in all my nightmares now.) This is an important distinction because fitting often involves a cloning process that creates new objects. Take ColumnTransformer for example. The objects in ColumnTransformer.transformers are actually different objects (they have a different id()) from those in ColumnTransformer.transformers_. If you’re hoping to access data that exists in a “fitted” (so weird) ColumnTransformer, then you need the underscored attribute.
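The difference can be seen in a small experiment. This is a sketch under assumptions: the toy array, the transformer name "scale", and the choice of StandardScaler are illustrative only.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

X = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])

ct = ColumnTransformer([("scale", StandardScaler(), [0, 1])])

# Before fitting, the underscored attribute does not exist yet.
has_fitted_attr_before = hasattr(ct, "transformers_")

ct.fit(X)

# Each entry is a (name, transformer, columns) tuple.
unfitted = ct.transformers[0][1]   # the object passed to the constructor
fitted = ct.transformers_[0][1]    # the clone that was actually fitted

same_object = unfitted is fitted
unfitted_has_mean = hasattr(unfitted, "mean_")
fitted_has_mean = hasattr(fitted, "mean_")

print(has_fitted_attr_before, same_object, unfitted_has_mean, fitted_has_mean)
```

The scaler handed to the constructor is left untouched, while the clone stored in transformers_ carries the learned state (here, mean_). Code that inspects fitted results should therefore read from the underscored attribute.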
