Social Network Analysis: From Graph Theory to Applications with Python

We cover the theory of social networks, with an introduction to graph theory and information spread, followed by a deep dive into Python code examples.

Automatic Canary Releases for Machine Learning Models

“MLOps” is about automatically managing the Machine Learning lifecycle. To do this, a Machine Learning Engineer can apply “DevOps” principles; the two differ mainly in scope.

Rick and Morty story generation with GPT2 using Transformers and Streamlit in 57 lines of code

This post will show you how to fine-tune a pre-trained GPT2 model on Rick and Morty transcripts using Hugging Face’s Transformers library, build a demo application, and deploy it using Streamlit Sharing.

5-Minute Guide to Calling Functions from R Scripts

R code tutorial using Google Trends data. You've likely heard the popular guideline that if you find yourself copying and pasting code more than 3 times, you should write it as a function.

Machine learning pitfalls

Machine learning (ML) systems are complex, and the more complex a system is, the more failure modes there are. Knowing what can go wrong is essential for building robust ML systems. Together, we will explore possible pitfalls that can occur at 5 different maturity levels, using concrete examples.

Creating a Chess Engine with Deep Learning

Using Deep Learning to train a Deep Search Chess Algorithm and understanding how neural networks can be used to indirectly solve problems

Data Science Interviews: SQL

This post provides a technical guide to SQL in data science interviews. The problems discussed come from this data science interview newsletter, which features questions from top tech companies, and they will be included in an upcoming book.

Deep learning based reverse image search for industrial applications

From unstructured data to content-based image retrieval.

Benford’s Law — A Simple Explanation

In this article, I’ll cover a brief background of Benford’s Law (BL), explain two key concepts (normal distributions and logarithms), show how a dice-rolling exercise can lead to BL, and finally look at some real datasets to see if this explanation holds up.
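As a rough illustration of what the law predicts, the sketch below compares the expected leading-digit frequencies, P(d) = log10(1 + 1/d), with the frequencies observed in a sample. The powers of two are only a stand-in dataset chosen for this example, not one taken from the article, and the function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n):
    """First digit of a positive integer."""
    return int(str(n)[0])


def observed_frequencies(values):
    """Share of values that start with each digit 1..9."""
    counts = Counter(leading_digit(v) for v in values)
    total = len(values)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Stand-in data (an assumption for this sketch): the first 1000 powers of two.
sample = [2 ** k for k in range(1000)]
expected = benford_expected()
observed = observed_frequencies(sample)

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

A real check would swap `sample` for the dataset under study and might add a goodness-of-fit test.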

🗣️ Sentiment Analysis: Idioms and their Importance

We demonstrated the value of idioms as features of sentiment analysis by showing that idiom-based features significantly improve sentiment classification results when idioms are present. The overall performance in terms of F1-score was improved from 45% to 64% in one experiment, and from 46% to 61% in the other.

Improve Glaucoma Assessment with Brain-Computer Interface and Machine Learning

My research used multitask learning to provide rapid point-of-care diagnostics to detect peripheral vision loss.

Interactive: Visualizing Covid-19 Test Accuracy

How accurate are the most common Covid-19 tests? What does “accuracy” even mean?
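One way to see why “accuracy” is ambiguous is to compute a few related quantities side by side. This is a minimal sketch using the standard definitions of sensitivity, specificity, and positive predictive value. The numbers are hypothetical and are not taken from the article or from any real test.

```python
def overall_accuracy(sensitivity, specificity, prevalence):
    """Fraction of all test results that are correct."""
    return sensitivity * prevalence + specificity * (1 - prevalence)


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person with a positive result is truly infected."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical values, chosen only for illustration.
sens, spec, prev = 0.90, 0.95, 0.01

acc = overall_accuracy(sens, spec, prev)
ppv = positive_predictive_value(sens, spec, prev)
print(f"overall accuracy: {acc:.3f}")
print(f"positive predictive value: {ppv:.3f}")
```

With a low prevalence, the test can be right most of the time overall while most positive results are still false positives, so a single “accuracy” figure hides a lot.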

Data Science Writers to Follow on Medium

Great Data Science & AI content on Medium.

Integrating Tableau and R for Regression Analyses

I will walk through a sample regression analysis conducted using R code and Tableau visualizations.

Understanding Dynamic Programming

An intuitive guide to the popular optimization technique. In this post, we’ll discuss when we use DP, followed by its types and then finally work through an example.
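For readers new to the technique, here is a small sketch of the two common styles of dynamic programming: top-down (memoization) and bottom-up (tabulation). It assumes these are the “types” the post refers to, and it uses Fibonacci numbers purely as a familiar stand-in, not the article's own example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_top_down(n):
    """Top-down DP: recurse, but cache each subproblem's answer."""
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)


def fib_bottom_up(n):
    """Bottom-up DP: build answers from the smallest subproblem upward."""
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev


print(fib_top_down(50), fib_bottom_up(50))
```

Both versions solve each subproblem once, so the work grows linearly with `n`, where the naive recursion grows exponentially.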

All the ~Eigen-stuff they never thought you should know

To Infinity and…Linear Algebra?! In this article, I’m going to describe Eigen-stuff as mind-blowingly simple and hopefully it will boost your confidence when tackling trickier topics based on it.

Ethically Collecting Conversations With People that have Cognitive Impairments

Improving the Accessibility of Voice Assistants: Doing Things Right. This practical guide aims to help future researchers, like me, collect these valuable datasets quickly without compromising any ethical considerations or data security.

Real-Time Time Series Anomaly Detection

Develop a Monitoring System on Multiple Time Series Sensors. In this post, we explore different anomaly detection approaches that can scale on a big data source in real-time. The tsmoothie package can help us to carry out this task.

Entropy Application in the Stock Market

Monitoring Correlation-Based Networks over time with Structural Entropy.

UMAP for Data Integration

In this article, I will demonstrate how to perform graph-based integration of single-cell Omics (scOmics) data across modalities, using a graph intersection approach with Igraph and UMAP.