Guide To Lightly: Tool For Curating Your Vision Data

Lightly makes deep learning more efficient by popularizing the use of self-supervised methods to understand and filter raw image data.

Most Benchmarked Datasets for Question Answering in NLP with implementation in PyTorch

Question answering is a natural language processing task in which a system automatically answers questions that people pose in natural language.

Google Releases New Dataset For Advanced 3D Object Understanding

Google's Objectron dataset is a collection of object-centric video clips that capture a set of common objects from various angles.

12 Best Social Media Datasets for Machine Learning

We’re continuing our series of articles on open datasets for machine learning. This time, we at Lionbridge AI combed the web and put together the ultimate cheat sheet for social media datasets for machine learning.

How To Use UCF101, The Largest Dataset Of Human Actions

The UCF-101 dataset, first introduced by researchers in 2012, contains 13,320 clips of human actions collected from YouTube, covering 101 action categories.

COVID-19 Posts: A Public Dataset Containing 400+ COVID-19 Blog Posts

We are excited to share this dataset publicly to help bloggers who want to analyze COVID-19 data with R and the resources of its community, by making such posts easy to research.

Top 10 Regression Datasets for Machine Learning Projects

This listicle on datasets built for regression or linear regression tasks has been upvoted many times on Reddit and reshared dozens of times on various social media platforms.

Pseudo Labelling - A Guide To Semi-Supervised Learning

Generating pseudo labels with semi-supervised learning, a technique that combines supervised and unsupervised learning.
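
As a rough illustration of the idea, the sketch below trains on a small labeled set, predicts labels for unlabeled points, keeps only the confident predictions as pseudo labels, and retrains on the combined data. The nearest-centroid classifier, the function names, and the `min_margin` threshold are all assumptions chosen to keep the example dependency-free; they are not taken from the article.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Return {label: centroid} for points given as equal-length tuples."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(mean(dim) for dim in zip(*members))
        for label, members in groups.items()
    }


def predict_with_margin(centroids, point):
    """Return (label, margin) for the nearest centroid.

    The margin is how much closer the nearest centroid is than the
    runner-up; it serves here as a crude confidence score.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(point, centroid)) ** 0.5, label)
        for label, centroid in centroids.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def pseudo_label(labeled_x, labeled_y, unlabeled_x, min_margin=1.0):
    """Run one round of pseudo labelling; return (new_model, pseudo_labels)."""
    model = fit_centroids(labeled_x, labeled_y)
    train_x, train_y = list(labeled_x), list(labeled_y)
    for point in unlabeled_x:
        label, margin = predict_with_margin(model, point)
        if margin >= min_margin:  # keep only confident predictions
            train_x.append(point)
            train_y.append(label)
    # Retrain on the original labels plus the accepted pseudo labels.
    return fit_centroids(train_x, train_y), train_y[len(labeled_y):]


model, accepted = pseudo_label(
    [(0.0, 0.0), (10.0, 10.0)], ["a", "b"],
    [(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)],
)
```

The point `(5.0, 5.0)` is equally far from both centroids, so it gets no pseudo label. In practice the classifier would usually be a neural network or another probabilistic model, with the predicted probability as the confidence score.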

Read a CSV File from the Internet Directly Into Your Code

Use functions such as download.file(), read.csv() and pd.read_csv() to read a CSV file from the internet directly into your R or Python code. In this article, we shall focus on numerical data stored in a comma-separated values (CSV) file format.
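
On the Python side, `pd.read_csv()` accepts a URL directly. For readers without pandas installed, a minimal standard-library sketch of the same idea might look like the following; the helper names `read_csv_from_url` and `numeric_column` are hypothetical, and the CSV is assumed to be UTF-8 encoded with a header row.

```python
import csv
import io
import urllib.request


def read_csv_from_url(url, encoding="utf-8"):
    """Fetch a CSV file from a URL and return its rows as a list of dicts."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode(encoding)
    # DictReader uses the first line as column names.
    return list(csv.DictReader(io.StringIO(text)))


def numeric_column(rows, name):
    """Convert one column of the parsed rows to floats."""
    return [float(row[name]) for row in rows]
```

A call such as `rows = read_csv_from_url(url)` followed by `numeric_column(rows, "some_column")` would give a plain list of floats, where `url` and `"some_column"` stand in for a real address and column name.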

Uber Case Study: EDA

An analysis of the Uber request dataset with relevant illustrations using seaborn and matplotlib visualization libraries. I have given a link to my Kaggle notebook where I have performed a detailed analysis of this Uber dataset.

How to Build a Football Dataset with Web Scraping

Using Selenium to scrape JavaScript-rendered content. This article covers scraping JavaScript-rendered content with Selenium, using the Premier League website as an example and scraping the stats of every match in the 2019/20 season.

Complete Guide To Model Deployment Using Flask in Google Cloud Platform

This article discusses deploying a model using Flask. It then demonstrates how to deploy a machine learning model on Google Cloud.

How to Scrape Tweets and create Dataset using Twint without Twitter API

In this article, I’ll describe how I created a huge dataset of tweets scraped from an entire country.

50+ Object Detection Datasets from different industry domains

A list of object detection and image segmentation datasets (with Colab notebooks for training and inference) to explore and experiment with different algorithms on!

Data Maps: Datasets Can Be Distilled Too

TL;DR: This post is about the paper Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics, alongside implementation in TensorFlow 2. In this post, I’ll go over the paper, and finish up with a TensorFlow implementation.

How to Get Datasets for Data Science

Popular sources for data science datasets.

Complete Guide To Handling Categorical Data Using Scikit-Learn

Preprocessing categorical features before building machine learning models, with techniques for encoding categorical features as numeric values.
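
To make the two most common encodings concrete, here is a small dependency-free sketch of label encoding and one-hot encoding. The function names are hypothetical and the code is not from the article; scikit-learn ships its own encoders in `sklearn.preprocessing` (for example `LabelEncoder` and `OneHotEncoder`), which would normally be used instead.

```python
def label_encode(values):
    """Map each category to an integer code, using sorted category order."""
    categories = sorted(set(values))
    index = {category: code for code, category in enumerate(categories)}
    return [index[value] for value in values], categories


def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector, one column per category."""
    codes, categories = label_encode(values)
    width = len(categories)
    rows = [[1 if column == code else 0 for column in range(width)] for code in codes]
    return rows, categories


colors = ["red", "green", "blue", "green"]
codes, categories = label_encode(colors)
one_hot, _ = one_hot_encode(colors)
```

Label encoding is compact but implies an ordering between categories, so one-hot encoding is often the safer choice for nominal features fed to linear models.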

Top 10 Ready To Use Datasets For ML on TensorFlow

The machine learning community can access public research datasets as NumPy arrays.

Complete Guide To ShuffleNet V1 With Implementation In Multiclass Image Classification

This article demonstrates how to implement a deep learning model with the ShuffleNet architecture to classify images from the CIFAR-10 dataset.

Riiid Announces $100,000 Kaggle Competition Using EdNet, World’s Largest Education Dataset

Riiid Labs has announced the launch of the first-ever global Artificial Intelligence Education (AIEd) Challenge, created to accelerate innovation in education by building a better and more equitable learning model for students around the world. “AIEd today is just starting and it requires a practical approach to improve the quality of personalized remote learning,” said…