Riley Lambert

1560916679

Creating and Deploying a Python Machine Learning Service

Introduction

Imagine you’re the moderator of a message board or comment section. You don’t want to read everything your users write online, yet you want to be alerted in case a discussion turns sour or people start spewing racial slurs all over the place. So, you decide to build yourself an automated system for hate speech detection.

Text classification via machine learning is an obvious choice of technology. However, turning model prototypes into working services has proven to be a widespread challenge. To help bridge this gap, this four-step tutorial illustrates an exemplary deployment workflow for a hate speech detection app:

  1. Train and persist a prediction model with scikit-learn
  2. Create an API endpoint with firefly
  3. Create a Docker container for this service
  4. Deploy the container on Heroku

The code for this project is available here.


1. Create prediction model

Dataset

The approach is based on the paper Automated Hate Speech Detection and the Problem of Offensive Language by Davidson, Warmsley, Macy and Weber. Their results are based on more than 20,000 labelled tweets, which are available on the corresponding GitHub page.

The .csv file is loaded as a dataframe:

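import pandas as pd
import re

# Load only the label and text columns of the dataset
df = pd.read_csv('labeled_data.csv', usecols=['class', 'tweet'])

# Lowercase each tweet and replace every run of non-letters with a single space
df['tweet'] = df['tweet'].apply(lambda tweet: re.sub('[^A-Za-z]+', ' ', tweet.lower()))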

The last line cleans the tweet column by converting all text to lowercase and removing non-alphabetic characters.
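To see what the cleaning step does, the same regular expression can be applied to a single string. This is only a minimal sketch: the helper name clean_tweet and the sample tweet are made up for illustration and are not part of the original project.

```python
import re

def clean_tweet(tweet):
    # Lowercase the text, then replace every run of non-letters with one space
    return re.sub('[^A-Za-z]+', ' ', tweet.lower())

sample = "RT @user123: That's SO rude!!! #smh"
print(clean_tweet(sample))  # rt user that s so rude smh
```

Note that digits, punctuation and symbols such as @ or # are dropped, and that contractions like "that's" are split into two tokens.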

Result

The class attribute can assume three category values: 0 for hate speech, 1 for offensive language and 2 for neither.


Model training

We have to convert our predictors, i.e. the tweet text, into a numeric representation before we can train a machine learning classifier. We can use scikit-learn’s TfidfVectorizer for this task, which transforms texts into a matrix of term-frequency times inverse document-frequency (tf-idf) values, suitable for machine learning. Additionally, we can remove stop words (common words such as the, is, …) from the processing.
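To make the tf-idf weighting concrete, here is a small pure-Python sketch for a made-up three-document corpus. It assumes the smoothed formula idf(t) = ln((1 + n) / (1 + df(t))) + 1 followed by L2 normalization of each row, which corresponds to TfidfVectorizer's documented default settings. Tokenization is simplified to splitting on whitespace, and the function name tfidf is hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (smoothed idf, L2-normalized)."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {
            term: tf * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, tf in counts.items()
        }
        # Scale the row to unit Euclidean length (guard against empty documents)
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        rows.append({term: w / norm for term, w in weights.items()})
    return rows

corpus = ["you are nice", "you are rude", "rude rude people"]
for row in tfidf(corpus):
    print({term: round(weight, 2) for term, weight in row.items()})
```

Within the first document, "nice" occurs in fewer documents than "you" and therefore receives the larger weight.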

For text classification, support vector machines (SVMs) are a reliable choice. As they are binary classifiers, we will use a One-Vs-Rest strategy, where for each category an SVM is trained to separate this category from all others.
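The One-Vs-Rest idea can be illustrated without any classifier: for each category, the three-class labels are turned into a binary target. The sketch below uses made-up labels and a hypothetical helper name; OneVsRestClassifier takes care of this step internally.

```python
def one_vs_rest_targets(labels):
    """Map each class to a binary target list: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}

# 0 = hate speech, 1 = offensive language, 2 = neither
labels = [1, 2, 0, 1, 2]
for cls, target in one_vs_rest_targets(labels).items():
    print(cls, target)
```

Each binary target would be used to train one SVM, and at prediction time the per-class scores are compared to pick a category.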

Both text vectorization and SVM training can be performed in one command by using scikit-learn’s Pipeline feature and defining the respective steps:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from stop_words import get_stop_words

# Step 1: tf-idf vectorization without English stop words
# Step 2: one linear SVM per class; probability=True enables predict_proba
clf = make_pipeline(
    TfidfVectorizer(stop_words=get_stop_words('en')),
    OneVsRestClassifier(SVC(kernel='linear', probability=True))
)

clf = clf.fit(X=df['tweet'], y=df['class'])

Now, the performance of the model should be evaluated, e.g. using a cross-validation approach to calculate classification metrics. However, as this tutorial focusses on model deployment, we will skip this step (never do this in an actual project). The same goes for parameter tuning or additional techniques of natural language processing which are described in the original paper.


Test the model

We can now try a test text and have the model predict the probabilities:

text = "I hate you, please die!"
clf.predict_proba([text.lower()])

Output:

array([[0.64, 0.14, 0.22]])

The array holds one row per input text. The numbers in the row correspond to the probabilities for the three categories (hate speech, offensive language, neither).
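If only the most likely category is needed, the probabilities can be mapped back to a label. The following is a minimal sketch that assumes the category order given above (class 0, 1, 2); the names CATEGORIES and most_likely_category are made up for illustration.

```python
CATEGORIES = ['hate speech', 'offensive language', 'neither']

def most_likely_category(probabilities):
    """Return the (category, probability) pair with the highest probability."""
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CATEGORIES[index], probabilities[index]

# Probabilities taken from the example output above
print(most_likely_category([0.64, 0.14, 0.22]))  # ('hate speech', 0.64)
```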


Model persistence

Using the joblib module, we can save the model as a binary object to disk. This will allow us to load and use the model in an application.

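# sklearn.externals.joblib ships with the scikit-learn 0.20.2 used in this tutorial;
# scikit-learn 0.23 and later removed it, so use "import joblib" there instead
from sklearn import externals

model_filename = 'hatespeech.joblib.z'
externals.joblib.dump(clf, model_filename)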

2. Create REST API

Create API endpoint

The Python file app.py loads the model and defines a simple module-level function which wraps the call to the model’s predict_proba function:

from sklearn import externals

model_filename = 'hatespeech.joblib.z'
clf = externals.joblib.load(model_filename)

def predict(text):
    # predict_proba returns one row per input text; take the only row
    probas = clf.predict_proba([text.lower()])[0]
    return {'hate speech': probas[0],
            'offensive language': probas[1],
            'neither': probas[2]}

Now, we use firefly, a lightweight Python module for deploying functions as a service. For advanced configuration or use in a production environment, Flask or Falcon might be a better choice as they’re well established with a large community. For rapid prototyping, we’re fine with firefly.

We’ll use firefly on the command line to bind the predict function to port 5000 on localhost:

$ firefly app.predict --bind 127.0.0.1:5000

Test API locally

Via curl, we can make a POST request to the created endpoint and obtain a prediction:

$ curl -d '{"text": "Please respect each other."}' \
    http://127.0.0.1:5000/predict

Output:

{"hate speech": 0.04, "offensive language": 0.31, "neither": 0.65}
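The same request could also be sent from Python using only the standard library. The sketch below builds the POST request and leaves the actual sending commented out, since that requires the firefly server to be running. The helper name build_request is hypothetical, and setting a Content-Type header is an assumption that the curl call does not make.

```python
import json
import urllib.request

def build_request(url, text):
    """Build a POST request carrying {"text": ...} as a JSON body."""
    body = json.dumps({'text': text}).encode('utf-8')
    # The Content-Type header is an assumption; the curl call does not set it
    return urllib.request.Request(
        url,
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )

request = build_request('http://127.0.0.1:5000/predict', 'Please respect each other.')
print(request.get_method(), request.full_url)

# With the firefly server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read().decode('utf-8')))
```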

Of course, a full-fledged real application would have many more features (logging, input and output validation, exception handling, …) and work steps (documentation, versioning, testing, monitoring, …), but here we’re merely deploying a simple prototype.
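As an illustration of what input validation could look like, here is a minimal sketch of a check that might run before the text is handed to the model. The function name validate_text and the length limit of 1000 characters are assumptions made for this example, not part of the original app.

```python
def validate_text(payload, max_length=1000):
    """Return the text from a request payload or raise ValueError."""
    if not isinstance(payload, dict) or 'text' not in payload:
        raise ValueError('payload must be an object with a "text" field')
    text = payload['text']
    if not isinstance(text, str) or not text.strip():
        raise ValueError('"text" must be a non-empty string')
    if len(text) > max_length:
        raise ValueError('"text" is longer than %d characters' % max_length)
    return text

print(validate_text({'text': 'Please respect each other.'}))
```

Rejecting bad input early keeps malformed requests away from the model and makes error responses easier to understand.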

3. Create a Docker container

Why Docker? A Docker container runs an application in an isolated environment, with all dependencies included, and can be shipped as an image, thus simplifying service setup and scaling.


Build image

We have to configure the contents and start-actions of our container in a file named Dockerfile:

FROM python:3.6
RUN pip install scikit-learn==0.20.2 firefly-python==0.1.15
COPY app.py hatespeech.joblib.z ./

CMD firefly app.predict --bind 0.0.0.0:5000
EXPOSE 5000

The first three lines take python:3.6 as the base image, install scikit-learn and firefly (the same versions as in the development environment) and copy the app and model files into the image. The last two lines tell Docker which command to execute when a container is started and that port 5000 should be exposed.

The build process that creates the image hatespeechdetect is started via:

$ docker build . -t hatespeechdetect

Run container

The run command starts a container, derived from an image. Additionally, we’re binding the container’s port 5000 to the host’s port 3000 via the -p option:

$ docker run -p 3000:5000 -d hatespeechdetect

Use prediction service

Now, we can send a request and obtain a prediction:

$ curl -d '{"text": "You are fake news media! Crooked!"}' \
    http://127.0.0.1:3000/predict

Output:

{"hate speech": 0.08, "offensive language": 0.76, "neither": 0.16}

In this example, the container runs locally. Of course the actual purpose is to keep it running at a permanent location, and possibly scale the service by starting multiple containers in an enterprise cluster.

4. Deploy as a Heroku app

A way to make the app publicly available to others is using a platform as a service such as Heroku, which supports Docker and offers a free basic membership. To use it, we have to register an account and install the Heroku CLI.

Heroku’s application containers expose a dynamic port, which requires an edit in our Dockerfile: We have to change port 5000 to the environment variable PORT:

CMD firefly app.predict --bind 0.0.0.0:$PORT
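Because this CMD is written in shell form, the shell inside the container expands $PORT when the container starts. The same idea, reading the port from the environment and falling back to 5000 for local runs, can be sketched in Python. The helper bind_address is hypothetical and not part of firefly.

```python
import os

def bind_address(default_port=5000):
    """Return host:port, using the PORT environment variable when it is set."""
    port = int(os.environ.get('PORT', default_port))
    return '0.0.0.0:%d' % port

print(bind_address())
```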

After this change, we are ready for deployment. On the command line, we log in to Heroku (which will prompt us for credentials in the browser) and create an app named hate-speech-detector:

$ heroku login

$ heroku create hate-speech-detector

Then we log in to the container registry. heroku container:push will build an image based on the Dockerfile in the current directory and send it to the Heroku Container registry. After that, we can release the image to the app:

$ heroku container:login

$ heroku container:push web --app hate-speech-detector

$ heroku container:release web --app hate-speech-detector

As before, the API can be addressed via curl. However, this time, the service is not running locally, but is available to the world!

$ curl -d '{"text": "You dumb idiot!"}' https://hate-speech-detector.herokuapp.com/predict

Output:

{"hate speech": 0.26, "offensive language": 0.68, "neither": 0.06}

Now, scaling the app would be just a few clicks or commands away. Also, the service still needs to be connected to the message board, a trigger threshold needs to be set and alerting needs to be implemented.
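A trigger threshold could be as simple as the following sketch, which flags a text when either problematic category reaches a cut-off. The function name should_alert and the threshold of 0.5 are assumptions for illustration; a real value would have to be tuned on evaluated model output.

```python
def should_alert(prediction, threshold=0.5):
    """Alert when hate speech or offensive language reaches the threshold."""
    return (prediction['hate speech'] >= threshold
            or prediction['offensive language'] >= threshold)

# Predictions copied from the example outputs in this tutorial
print(should_alert({'hate speech': 0.26, 'offensive language': 0.68, 'neither': 0.06}))  # True
print(should_alert({'hate speech': 0.04, 'offensive language': 0.31, 'neither': 0.65}))  # False
```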

Ray Patel

1625843760

Python Packages in SQL Server – Get Started with SQL Server Machine Learning Services

Introduction

When Machine Learning Services is installed in SQL Server, a few Python packages are installed by default. In this article, we will look at how to get information about those installed Python packages.

Python Packages

When we choose Python for Machine Learning Services during installation, the following packages are installed in SQL Server:

  • revoscalepy – This Microsoft Python package is used for remote compute contexts, streaming, parallel execution of rx functions for data import and transformation, modeling, visualization, and analysis.
  • microsoftml – This is another Microsoft Python package which adds machine learning algorithms in Python.
  • Anaconda 4.2 – Anaconda is an open-source Python distribution

Easter Deckow

1655630160

PyTumblr: A Python Tumblr API v2 Client

PyTumblr

Installation

Install via pip:

$ pip install pytumblr

Install from source:

$ git clone https://github.com/tumblr/pytumblr.git
$ cd pytumblr
$ python setup.py install

Usage

Create a client

A pytumblr.TumblrRestClient is the object you'll make all of your calls to the Tumblr API through. Creating one is this easy:

import pytumblr

client = pytumblr.TumblrRestClient(
    '<consumer_key>',
    '<consumer_secret>',
    '<oauth_token>',
    '<oauth_secret>',
)

client.info() # Grabs the current user information

A few easy ways to get your credentials are:

  1. The built-in interactive_console.py tool (if you already have a consumer key & secret)
  2. The Tumblr API console at https://api.tumblr.com/console
  3. Get sample login code at https://api.tumblr.com/console/calls/user/info

Supported Methods

User Methods

client.info() # get information about the authenticating user
client.dashboard() # get the dashboard for the authenticating user
client.likes() # get the likes for the authenticating user
client.following() # get the blogs followed by the authenticating user

client.follow('codingjester.tumblr.com') # follow a blog
client.unfollow('codingjester.tumblr.com') # unfollow a blog

client.like(id, reblogkey) # like a post
client.unlike(id, reblogkey) # unlike a post

Blog Methods

client.blog_info(blogName) # get information about a blog
client.posts(blogName, **params) # get posts for a blog
client.avatar(blogName) # get the avatar for a blog
client.blog_likes(blogName) # get the likes on a blog
client.followers(blogName) # get the followers of a blog
client.blog_following(blogName) # get the publicly exposed blogs that [blogName] follows
client.queue(blogName) # get the queue for a given blog
client.submission(blogName) # get the submissions for a given blog

Post Methods

Creating posts

PyTumblr lets you create all of the post types that Tumblr supports. A few default options can be used with any post type.

The default options are described below.

  • state - a string, the state of the post. Supported states are published, draft, queue, private
  • tags - a list, a list of strings that you want tagged on the post. eg: ["testing", "magic", "1"]
  • tweet - a string, the string of the customized tweet you want. eg: "Man I love my mega awesome post!"
  • date - a string, the customized GMT date that you want
  • format - a string, the format that your post is in. Supported formats are html or markdown
  • slug - a string, the slug for the url of the post you want

We'll show examples of these default options throughout while showcasing the specific post types.

Creating a photo post

Creating a photo post supports a bunch of different options plus the described default options:

  • caption - a string, the user supplied caption
  • link - a string, the "click-through" url for the photo
  • source - a string, the url for the photo you want to use (use this or the data parameter)
  • data - a list or string, a list of filepaths or a single file path for multipart file upload

#Creates a photo post using a source URL
client.create_photo(blogName, state="published", tags=["testing", "ok"],
                    source="https://68.media.tumblr.com/b965fbb2e501610a29d80ffb6fb3e1ad/tumblr_n55vdeTse11rn1906o1_500.jpg")

#Creates a photo post using a local filepath
client.create_photo(blogName, state="queue", tags=["testing", "ok"],
                    tweet="Woah this is an incredible sweet post [URL]",
                    data="/Users/johnb/path/to/my/image.jpg")

#Creates a photoset post using several local filepaths
client.create_photo(blogName, state="draft", tags=["jb is cool"], format="markdown",
                    data=["/Users/johnb/path/to/my/image.jpg", "/Users/johnb/Pictures/kittens.jpg"],
                    caption="## Mega sweet kittens")

Creating a text post

Creating a text post supports the same options as default and just two other parameters:

  • title - a string, the optional title for the post. Supports markdown or html
  • body - a string, the body of the post. Supports markdown or html

#Creating a text post
client.create_text(blogName, state="published", slug="testing-text-posts", title="Testing", body="testing1 2 3 4")

Creating a quote post

Creating a quote post supports the same options as default and two other parameters:

  • quote - a string, the full text of the quote. Supports markdown or html
  • source - a string, the cited source. HTML supported

#Creating a quote post
client.create_quote(blogName, state="queue", quote="I am the Walrus", source="Ringo")

Creating a link post

  • title - a string, the title of the post that you want. Supports HTML entities.
  • url - a string, the url that you want to create a link post for.
  • description - a string, the description of the link that you have

#Create a link post
client.create_link(blogName, title="I like to search things, you should too.", url="https://duckduckgo.com",
                   description="Search is pretty cool when a duck does it.")

Creating a chat post

Creating a chat post supports the same options as default and two other parameters:

  • title - a string, the title of the chat post
  • conversation - a string, the text of the conversation/chat, with dialogue labels (no html)

#Create a chat post
chat = """John: Testing can be fun!
Renee: Testing is tedious and so are you.
John: Aw.
"""
client.create_chat(blogName, title="Renee just doesn't understand.", conversation=chat, tags=["renee", "testing"])

Creating an audio post

Creating an audio post allows for all default options and has three other parameters. The only thing to keep in mind while dealing with audio posts is to make sure that you use either the external_url parameter or data. You cannot use both at the same time.

  • caption - a string, the caption for your post
  • external_url - a string, the url of the site that hosts the audio file
  • data - a string, the filepath of the audio file you want to upload to Tumblr

#Creating an audio file
client.create_audio(blogName, caption="Rock out.", data="/Users/johnb/Music/my/new/sweet/album.mp3")

#lets use soundcloud!
client.create_audio(blogName, caption="Mega rock out.", external_url="https://soundcloud.com/skrillex/sets/recess")

Creating a video post

Creating a video post allows for all default options and has three other options. Like the other post types, it has some restrictions. You cannot use the embed and data parameters at the same time.

  • caption - a string, the caption for your post
  • embed - a string, the HTML embed code for the video
  • data - a string, the path of the file you want to upload

#Creating an upload from YouTube
client.create_video(blogName, caption="Jon Snow. Mega ridiculous sword.",
                    embed="http://www.youtube.com/watch?v=40pUYLacrj4")

#Creating a video post from local file
client.create_video(blogName, caption="testing", data="/Users/johnb/testing/ok/blah.mov")

Editing a post

Updating a post requires knowing what type of post you're updating. You can supply any of the options given above when updating the post.

client.edit_post(blogName, id=post_id, type="text", title="Updated")
client.edit_post(blogName, id=post_id, type="photo", data="/Users/johnb/mega/awesome.jpg")

Reblogging a Post

Reblogging a post just requires knowing the post id and the reblog key, which is supplied in the JSON of any post object.

client.reblog(blogName, id=125356, reblog_key="reblog_key")

Deleting a post

Deleting just requires that you own the post and have the post id

client.delete_post(blogName, 123456) # Deletes your post :(

A note on tags: When passing tags, as params, please pass them as a list (not a comma-separated string):

client.create_text(blogName, tags=['hello', 'world'], ...)

Getting notes for a post

In order to get the notes for a post, you need to have the post id and the blog that it is on.

data = client.notes(blogName, id='123456')

The results include a timestamp you can use to make future calls.

data = client.notes(blogName, id='123456', before_timestamp=data["_links"]["next"]["query_params"]["before_timestamp"])
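The two calls above can be combined into a loop that keeps following before_timestamp until no further page is offered. This is a sketch under stated assumptions: it assumes each response carries its notes under a "notes" key, iter_notes is a hypothetical helper, and FakeClient is a stand-in with canned responses so the example runs without credentials.

```python
def iter_notes(client, blog_name, post_id):
    """Yield all notes of a post, following before_timestamp page by page."""
    params = {}
    while True:
        data = client.notes(blog_name, id=post_id, **params)
        # Assumption: each response carries its notes under the "notes" key
        for note in data.get('notes', []):
            yield note
        next_link = data.get('_links', {}).get('next')
        if not next_link:
            break
        params = {'before_timestamp': next_link['query_params']['before_timestamp']}


class FakeClient:
    """Stand-in for pytumblr.TumblrRestClient that serves two canned pages."""
    def notes(self, blog_name, id, before_timestamp=None):
        if before_timestamp is None:
            return {'notes': [{'type': 'like'}],
                    '_links': {'next': {'query_params': {'before_timestamp': '100'}}}}
        return {'notes': [{'type': 'reblog'}]}

print([note['type'] for note in iter_notes(FakeClient(), 'example-blog', '123456')])
```

With a real client, the FakeClient instance would be replaced by the pytumblr.TumblrRestClient created earlier.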

Tagged Methods

# get posts with a given tag
client.tagged(tag, **params)

Using the interactive console

This client comes with a nice interactive console to run you through the OAuth process, grab your tokens (and store them for future use).

You'll need pyyaml installed to run it, but then it's just:

$ python interactive_console.py

and away you go! Tokens are stored in ~/.tumblr and are also shared by other Tumblr API clients like the Ruby client.

Running tests

The tests (and coverage reports) are run with nose, like this:

python setup.py test

Author: tumblr
Source Code: https://github.com/tumblr/pytumblr
License: Apache-2.0 license

Ray Patel

1619643600

Top Machine Learning Projects in Python For Beginners [2021]

If you want to become a machine learning professional, you’d have to gain experience using its technologies. The best way to do so is by completing projects. That’s why in this article, we’re sharing multiple machine learning projects in Python so you can quickly start testing your skills and gain valuable experience.

However, before you begin, make sure that you’re familiar with machine learning and its algorithms. If you haven’t worked on a project before, don’t worry because we have also shared a detailed tutorial on one project:

The Iris Dataset: For the Beginners

The Iris dataset is easily one of the most popular machine learning projects in Python. Its simplicity and compact size make it perfect for beginners. If you haven’t worked on any machine learning projects in Python, you should start with it. The Iris dataset is a collection of sepal and petal measurements of Iris flowers. It has three classes, with 50 instances in every one of them.

We’ve provided sample code in various places, but you should only use it to understand how it works. Implementing the code without understanding it would defeat the purpose of doing the project. So be sure to understand the code well before implementing it.
