1593546120

This post and the code here are part of a larger repo called RecoTour, where I normally explore and implement recommendation algorithms that I consider interesting and/or useful (see RecoTour and RecoTourII). In every directory I have included a `README` file and a series of explanatory notebooks that I hope help explain the code. I keep adding algorithms from time to time, so stay tuned if you are interested.

As always, let me first acknowledge the relevant people who did the hard work. This post and the companion repo are based on the papers “Variational Autoencoders for Collaborative Filtering” [1] and “Auto-Encoding Variational Bayes” [2]. The code here and in that repo is partially inspired by the implementation from Younggyo Seo. I have adapted the code to my coding preferences and added a number of options and more flexibility to run multiple experiments.

The reason to take a deep dive into variational autoencoders for collaborative filtering is that they seem to be one of the few Deep Learning based algorithms (if not the only one) that obtains better results than those using non-Deep Learning techniques [3].

Throughout this exercise I will use two datasets: the Amazon Movies and TV dataset [4] [5] and the Movielens dataset. The latter is used so I can make sure I am obtaining results consistent with those in the paper. The Amazon dataset is significantly more challenging than the Movielens dataset, as it is ∼13 times sparser.

All the experiments in this post were run using a p2.xlarge EC2 instance on AWS.

The more detailed, **original version** of this post is published on my blog. This post is intended as a summary of the content there; it focuses more on the implementation/code and the corresponding results, and less on the math.

I will assume in this section that the reader has some experience with Variational Autoencoders (VAEs). If this is not the case, I recommend reading Kingma and Welling’s paper, Liang et al.’s paper, or the original post. There, the reader will find a detailed derivation of the *Loss* function we will be using when implementing the Partially Regularised Multinomial Variational Autoencoder (Mult-VAE). Here I will only include the final expression and briefly introduce some additional pieces of information that I consider useful to understand the Mult-VAE implementation and the loss below in Eq (1).

Let me first describe the notational convention. Following Liang et al., 2018, I will use **u** ∈ {1, …, U} to index users and **i** ∈ {1, …, I} to index items.

With that notation, the Mult-VAE *Loss* function is defined as:

*Loss* = −(1/*M*) ∑ᵤ [ log pθ(**xᵤ** ∣ **zᵤ**) − *β* · KL(qϕ(**zᵤ** ∣ **xᵤ**) ‖ pθ(**zᵤ**)) ]   (1)

where *M* is the mini-batch size. The first element within the summation is simply the log-likelihood of the click history **xᵤ** conditioned on the latent representation **zᵤ**, i.e. log pθ(**xᵤ** ∣ **zᵤ**) = ∑ᵢ xᵤᵢ log πᵢ(**zᵤ**).

We just need a bit more detail before we can jump to the code. **xᵤ**, the click history of user *u*, is assumed to be drawn from a multinomial distribution:

**xᵤ** ∼ Mult(Nᵤ, π(**zᵤ**))

where Nᵤ = ∑ᵢ xᵤᵢ is the total number of clicks for user *u*. As I mentioned before, **zᵤ** is the latent representation of **xᵤ**, and is assumed to be drawn from a standard Gaussian prior pθ(**zᵤ**) ∼ N(0, I). During the implementation of the Mult-VAE, **zᵤ** needs to be sampled from an approximate posterior qϕ(**zᵤ** ∣ **xᵤ**) (which is also assumed to be Gaussian). Since computing gradients when sampling is involved is not directly possible, the so-called reparameterization trick is used: **zᵤ** = **μ** + **σ** ⊙ ε, with ε ∼ N(0, I).
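Computing gradients through a sampling step is handled with the reparameterization trick, z = μ + σ ⊙ ε. A minimal NumPy illustration, with toy posterior parameters of my own choosing rather than the repo’s actual code:

```python
import numpy as np

rng = np.random.default_rng(42)

def reparameterize(mu, log_var):
    # z = mu + sigma * eps, with eps ~ N(0, I): the randomness is isolated
    # in eps, so gradients can flow through mu and log_var
    std = np.exp(0.5 * log_var)
    eps = rng.standard_normal(mu.shape)
    return mu + std * eps

# toy posterior parameters for a 3-dimensional latent space
mu = np.zeros(3)
log_var = np.zeros(3)  # log_var = 0 means sigma = 1
z = reparameterize(mu, log_var)
```

Averaged over many draws, the samples recover the posterior mean and standard deviation, while each individual draw remains differentiable with respect to μ and log σ².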

**μ** and **σ**, the parameters of the approximate posterior, are the outputs of the encoder network.

At this stage we have almost all the information we need to implement the Mult-VAE and its loss function in Eq (1): we know what **xᵤ** is, **zᵤ**, **μ** and **σ** will be functions of our neural networks, and π is just the softmax of the decoder output.

Looking at the loss function in Eq (1) within the context of VAEs, we can see that the first term is the “reconstruction loss”, while the *KL* divergence acts as a regularizer. With that in mind, Liang et al. add a factor *β* to control the strength of the regularization, and propose *β* < 1. For a more in-depth reflection on the role of *β*, and in general a better explanation of the form of the loss function for the Mult-VAE, please read the original paper or the original post.

Without further ado, let’s move to the code:
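Before looking at the repo itself, here is a sketch of the loss in Eq (1) in plain NumPy. The real implementation uses PyTorch/MXNet, and the function and argument names below are my own illustrative choices:

```python
import numpy as np

def log_softmax(scores):
    # numerically stable log-softmax over the item dimension
    scores = scores - scores.max(axis=1, keepdims=True)
    return scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))

def mult_vae_loss(x, decoder_out, mu, log_var, beta=0.5):
    """Negative ELBO of Eq (1).

    x           : (M, I) click matrix for the mini-batch
    decoder_out : (M, I) unnormalised decoder scores
    mu, log_var : (M, K) parameters of the approximate posterior
    beta        : strength of the KL regularisation
    """
    # multinomial log-likelihood: sum_i x_ui * log pi_i(z_u)
    log_lik = (x * log_softmax(decoder_out)).sum(axis=1)
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior
    kl = -0.5 * (1.0 + log_var - mu ** 2 - np.exp(log_var)).sum(axis=1)
    return float((-log_lik + beta * kl).mean())
```

With *β* = 1 this is the standard VAE negative ELBO; as discussed above, Liang et al. propose *β* < 1.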

#pytorch #mxnet #variational-autoencoder #recommendation-system #python

1595365380

Collaborative filtering is a tool that companies are increasingly using. Netflix uses it to recommend shows for you to watch. Facebook uses it to recommend who you should be friends with. Spotify uses it to recommend playlists and songs. It’s incredibly useful in recommending products to customers.

In this post, I construct a collaborative filtering neural network with embeddings to understand how users would feel towards certain movies. From this, we can recommend movies for them to watch.

The dataset is taken from here. This code is loosely based on the fastai notebook.

First, let’s read in the ratings and movies data:

```
import pandas as pd
ratings = pd.read_csv('ratings.csv')
movies = pd.read_csv('movies.csv')
```

Next, let’s get rid of the annoyingly complex user ids. We can make do with plain old integers; they’re much easier to handle:

```
u_uniq = ratings.userId.unique()
user2idx = {o:i for i,o in enumerate(u_uniq)}
ratings.userId = ratings.userId.apply(lambda x: user2idx[x])
```
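The movie ids can be remapped in exactly the same way. A self-contained sketch, with a toy DataFrame standing in for the real `ratings`:

```python
import pandas as pd

# toy stand-in for the real ratings DataFrame
ratings = pd.DataFrame({'userId': [0, 0, 1],
                        'movieId': [318, 527, 318]})

# map each raw movie id to a contiguous integer index
m_uniq = ratings.movieId.unique()
movie2idx = {o: i for i, o in enumerate(m_uniq)}
ratings.movieId = ratings.movieId.apply(lambda x: movie2idx[x])
```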

We’ll need to get the number of users and the number of movies.

```
n_users=int(ratings.userId.nunique())
n_movies=int(ratings.movieId.nunique())
```

First, let’s create some random weights. We need to call `super().__init__()` inside our module’s constructor; using `super()` lets us avoid naming the base class explicitly, which makes the code more maintainable.

These weights will be uniformly distributed between 0 and 0.05. The `_` at the end of `uniform_` denotes an in-place operation.

```
import torch
from torch import nn

class EmbeddingDot(nn.Module):
    def __init__(self, n_users, n_movies, n_factors=50):
        super().__init__()
        self.u = nn.Embedding(n_users, n_factors)
        self.m = nn.Embedding(n_movies, n_factors)
        self.u.weight.data.uniform_(0, 0.05)
        self.m.weight.data.uniform_(0, 0.05)

    def forward(self, users, movies):
        # dot product of user and movie embeddings gives the predicted score
        return (self.u(users) * self.m(movies)).sum(1)
```

Next, we add our Embedding matrices and latent factors.

We’re creating an embedding matrix for our user ids and our movie ids. An embedding is basically an array lookup. When we multiply our one-hot encoded user ids by our weights, most calculations cancel to `0` `(0 * number = 0)`. All we’re left with is a particular row in the weight matrix. That’s basically just an array lookup.

So we don’t need the matrix multiply and we don’t need the one-hot encoded array. Instead, we can just do an array lookup. This reduces memory usage and speeds up the neural network. It also reveals the intrinsic properties of the categorical variables. This idea was applied in a recent Kaggle competition and achieved 3rd place.
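We can check the one-hot argument above directly with a toy weight matrix (NumPy, illustrative sizes):

```python
import numpy as np

n_users, n_factors = 5, 3
W = np.arange(n_users * n_factors, dtype=float).reshape(n_users, n_factors)

user_id = 2
one_hot = np.zeros(n_users)
one_hot[user_id] = 1.0

# multiplying the one-hot user vector by the weight matrix...
via_matmul = one_hot @ W
# ...picks out exactly one row, the same as a plain array lookup
via_lookup = W[user_id]
```

The two results are identical, which is why an embedding layer can skip the matrix multiply entirely.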

The size of these embedding matrices is determined by `n_factors`, the number of latent factors we use to represent users and movies.

#machine-learning #collaborative-filtering #deep-learning #pytorch

1623145380

**Item-based collaborative filtering** is a recommendation technique that uses the similarity between items, computed from the ratings given by users. In this article, I explain its basic concept and show how to build item-based collaborative filtering in Python.
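As a preview of the basic concept, item-to-item similarity can be sketched in a few lines of NumPy. The toy ratings matrix and names below are illustrative, not the article’s actual code:

```python
import numpy as np

# rows = users, columns = items; 0 means "not rated"
R = np.array([[5., 3., 0.],
              [4., 0., 4.],
              [1., 1., 5.]])

def item_cosine_similarity(R):
    # cosine similarity between every pair of item rating columns
    norms = np.linalg.norm(R, axis=0)
    return (R.T @ R) / np.outer(norms, norms)

S = item_cosine_similarity(R)
```

To recommend items for a user, one would then score the items they have not rated by a similarity-weighted average of the ratings they have already given.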

#item-based-cf #python #collaborative-filtering #movie-recommendation #recommender

1596619131

Teams spread over remote locations as well as the office are increasingly in vogue. The spread of the coronavirus pandemic in 2020 has also resulted in increased reliance on distributed teams. These teams provide a huge number of benefits and also pose a set of unique problems. Not surprisingly, businesses are continually updating how they manage distributed teams. They are also increasingly using team collaboration software to overcome the challenges that distributed teams pose. So continue reading to find the latest information on software that helps in coordinating and managing distributed teams.

**1. Communication is the key**

The cornerstone of managing a distributed team is communication. In a distributed team relying on email alone is just not an option. So there is an ever-present need to adopt the best team communication software tools available and provide feedback routinely.

**2. Management of productivity**

A distributed team’s productivity can be high at certain times and then fall off. To ensure that team productivity remains at optimum levels managers need to be able to monitor it. Managers also need the best team messaging collaboration solution to manage the productivity of individuals and the team.

**3. Solid tech infrastructure**

Cutting edge tech infrastructure is the backbone of a business communication solution. Using this infrastructure, managers can efficiently monitor employees and their work. Additionally, employees can use the tech infrastructure to ensure work is completed on time.

**4. Advanced security features**

Working in distributed teams often means having to deal with security issues that crop up when people work from home. Also within such teams steps have to be taken to ensure that data and customer information remains secure when transmitted.

**5. Elevating team spirit and morale**

Within an office, teams interact easily and managers can keep an eye on morale and team spirit. In distributed teams, the process of keeping team spirit and morale bubbling with energy is much more complex. The process requires the team and manager to put in extra effort and rely on the aid of group collaboration software.

#team-collaboration #collaboration #online-collaboration-tools #collaboration-tools #api

1594449179

This article is a continuation of my previous article, which is a complete guide to building CNNs using PyTorch and Keras.

Taking input from standard or custom datasets is already covered in the complete guide to CNNs using PyTorch and Keras, so we can start with the necessary introduction to autoencoders and then implement one.

An autoencoder is a neural network that learns to encode data with minimal loss of information.

There are many variants of the above network. Some of them are:

**Sparse autoencoder:** reduces overfitting by regularizing the activations of the hidden nodes.

**Denoising autoencoder:** trained by adding noise to the input, so that at evaluation time it can remove noise from the input.
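To make the encode/decode idea concrete, here is a minimal linear autoencoder with tied weights in NumPy, trained on toy data by gradient descent; this is a sketch of the principle, not the PyTorch implementation discussed here:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))           # toy data: 100 samples, 8 features
W = rng.normal(scale=0.1, size=(8, 3))  # encoder weights; decoder is W.T (tied)

def loss(X, W):
    # mean squared reconstruction error of decode(encode(X))
    X_hat = (X @ W) @ W.T
    return np.mean((X_hat - X) ** 2)

initial_loss = loss(X, W)
lr = 0.05
for _ in range(500):
    E = (X @ W) @ W.T - X             # reconstruction error
    G = 2.0 * E / E.size              # dL/dX_hat
    grad = X.T @ G @ W + G.T @ X @ W  # dL/dW with tied weights
    W -= lr * grad

final_loss = loss(X, W)
```

Since the 3-dimensional bottleneck cannot represent all 8 feature dimensions, the reconstruction error drops during training but stays above zero: the network learns a compressed encoding of the data.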

#keras #variational-autoencoder #pytorch