Chet  Lubowitz

Chet Lubowitz

1598414760

Beyond “classic” PCA: Functional Principal Components Analysis (FPCA)

Discover why using “Functions” instead of “Linear Vectors” in Principal Components Analysis can help you better understand common trends and behaviors of time-series.

FPCA is traditionally implemented with R but the “FDASRSF” package from J. Derek Tucker will achieve similar (and even greater) results in Python.

If you have reached this page, you are probably familiar with PCA.

Principal Components Analysis is part of the Data Science exploration toolkit as it provides many benefits: reducing dimensions of a large dataset, preventing multi-collinearity, etc.

There are many articles out there that explain the benefits of PCA and, if needed, I suggest you to have a look at this one which summarizes my understanding of this methodology:

The intuition behind the “Functional” PCA

In a standard PCA process, we define Eigenvectors to convert the original dataset into a smaller one with fewer dimensions and for which most of the initial dataset variance is preserved (usually 90 or 95%).

Image for post

Initial dataset (blue crosses) and the corresponding first two Eigenvectors

Now let’s imagine that the patterns of the time-series have more importance than their absolute variance. For example, you would like to compare physical phenomena such as signals, temperatures’ variation, production batches, etc… Functional Principal Components Analysis will act this way by determining the corresponding underlying functions!

Image for post

#machine-learning #dimensionality-reduction #principal-component #python

What is GEEK

Buddha Community

Beyond “classic” PCA: Functional Principal Components Analysis (FPCA)
Chet  Lubowitz

Chet Lubowitz

1598414760

Beyond “classic” PCA: Functional Principal Components Analysis (FPCA)

Discover why using “Functions” instead of “Linear Vectors” in Principal Components Analysis can help you better understand common trends and behaviors of time-series.

FPCA is traditionally implemented with R but the “FDASRSF” package from J. Derek Tucker will achieve similar (and even greater) results in Python.

If you have reached this page, you are probably familiar with PCA.

Principal Components Analysis is part of the Data Science exploration toolkit as it provides many benefits: reducing dimensions of a large dataset, preventing multi-collinearity, etc.

There are many articles out there that explain the benefits of PCA and, if needed, I suggest you to have a look at this one which summarizes my understanding of this methodology:

The intuition behind the “Functional” PCA

In a standard PCA process, we define Eigenvectors to convert the original dataset into a smaller one with fewer dimensions and for which most of the initial dataset variance is preserved (usually 90 or 95%).

Image for post

Initial dataset (blue crosses) and the corresponding first two Eigenvectors

Now let’s imagine that the patterns of the time-series have more importance than their absolute variance. For example, you would like to compare physical phenomena such as signals, temperatures’ variation, production batches, etc… Functional Principal Components Analysis will act this way by determining the corresponding underlying functions!

Image for post

#machine-learning #dimensionality-reduction #principal-component #python

Vincent Lab

Vincent Lab

1605017502

The Difference Between Regular Functions and Arrow Functions in JavaScript

Other then the syntactical differences. The main difference is the way the this keyword behaves? In an arrow function, the this keyword remains the same throughout the life-cycle of the function and is always bound to the value of this in the closest non-arrow parent function. Arrow functions can never be constructor functions so they can never be invoked with the new keyword. And they can never have duplicate named parameters like a regular function not using strict mode.

Here are a few code examples to show you some of the differences
this.name = "Bob";

const person = {
name: “Jon”,

<span style="color: #008000">// Regular function</span>
func1: <span style="color: #0000ff">function</span> () {
    console.log(<span style="color: #0000ff">this</span>);
},

<span style="color: #008000">// Arrow function</span>
func2: () =&gt; {
    console.log(<span style="color: #0000ff">this</span>);
}

}

person.func1(); // Call the Regular function
// Output: {name:“Jon”, func1:[Function: func1], func2:[Function: func2]}

person.func2(); // Call the Arrow function
// Output: {name:“Bob”}

The new keyword with an arrow function
const person = (name) => console.log("Your name is " + name);
const bob = new person("Bob");
// Uncaught TypeError: person is not a constructor

If you want to see a visual presentation on the differences, then you can see the video below:

#arrow functions #javascript #regular functions #arrow functions vs normal functions #difference between functions and arrow functions

Principal Component Analysis - Visualized

If you have ever taken an online course on Machine Learning, you must have come across Principal Component Analysis for dimensionality reduction, or in simple terms, for compression of data. Guess what, I had taken such courses too but I never really understood the graphical significance of PCA because all I saw was matrices and equations. It took me quite a lot of time to understand this concept from various sources. So, I decided to compile it all in one place.

In this article, we will take a visual (graphical) approach to understand PCA and how it can be used to compress data. Basic knowledge of Linear Algebra and Matrices is assumed. If you are new to this concept, just follow along, I have tried my best to keep this as simple as possible.

Introduction

These days, datasets containing a large number of dimensions are increasingly common and are often difficult to interpret. One example can be a database of face photographs of let’s say, 1,000,000 people. If each face photograph has a dimension of 100x100, then the data of each face is 10000 dimensional (there are 100x100 = 10,000 unique values to be stored for each face). Now, if 1 byte is required to store the information of each pixel, then 10,000 bytes are required to store 1 face. Since there are 1000 faces in the database,10,000 x 1,000,000 = 10 GB will be needed to store the dataset.

Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, exploiting the fact that the images in these datasets have something in common. For instance, in a dataset consisting of face photographs, each photograph will have facial features like eyes, nose, mouth. Instead of encoding this information pixel by pixel, we could make a template of each type of these features and then just combine these templates to generate any face in the dataset. In this approach, each template will still be 100x100 = 1000 dimensional, but since we will be reusing these templates (basis functions) to generate each face in the dataset, the number of templates required will be very small. PCA does exactly this.

#principal-component #python #data-science #machine-learning #data-compression #data analysis

Ian  Robinson

Ian Robinson

1623856080

Streamline Your Data Analysis With Automated Business Analysis

Have you ever visited a restaurant or movie theatre, only to be asked to participate in a survey? What about providing your email address in exchange for coupons? Do you ever wonder why you get ads for something you just searched for online? It all comes down to data collection and analysis. Indeed, everywhere you look today, there’s some form of data to be collected and analyzed. As you navigate running your business, you’ll need to create a data analytics plan for yourself. Data helps you solve problems , find new customers, and re-assess your marketing strategies. Automated business analysis tools provide key insights into your data. Below are a few of the many valuable benefits of using such a system for your organization’s data analysis needs.

Workflow integration and AI capability

Pinpoint unexpected data changes

Understand customer behavior

Enhance marketing and ROI

#big data #latest news #data analysis #streamline your data analysis #automated business analysis #streamline your data analysis with automated business analysis

Tyrique  Littel

Tyrique Littel

1604008800

Static Code Analysis: What It Is? How to Use It?

Static code analysis refers to the technique of approximating the runtime behavior of a program. In other words, it is the process of predicting the output of a program without actually executing it.

Lately, however, the term “Static Code Analysis” is more commonly used to refer to one of the applications of this technique rather than the technique itself — program comprehension — understanding the program and detecting issues in it (anything from syntax errors to type mismatches, performance hogs likely bugs, security loopholes, etc.). This is the usage we’d be referring to throughout this post.

“The refinement of techniques for the prompt discovery of error serves as well as any other as a hallmark of what we mean by science.”

  • J. Robert Oppenheimer

Outline

We cover a lot of ground in this post. The aim is to build an understanding of static code analysis and to equip you with the basic theory, and the right tools so that you can write analyzers on your own.

We start our journey with laying down the essential parts of the pipeline which a compiler follows to understand what a piece of code does. We learn where to tap points in this pipeline to plug in our analyzers and extract meaningful information. In the latter half, we get our feet wet, and write four such static analyzers, completely from scratch, in Python.

Note that although the ideas here are discussed in light of Python, static code analyzers across all programming languages are carved out along similar lines. We chose Python because of the availability of an easy to use ast module, and wide adoption of the language itself.

How does it all work?

Before a computer can finally “understand” and execute a piece of code, it goes through a series of complicated transformations:

static analysis workflow

As you can see in the diagram (go ahead, zoom it!), the static analyzers feed on the output of these stages. To be able to better understand the static analysis techniques, let’s look at each of these steps in some more detail:

Scanning

The first thing that a compiler does when trying to understand a piece of code is to break it down into smaller chunks, also known as tokens. Tokens are akin to what words are in a language.

A token might consist of either a single character, like (, or literals (like integers, strings, e.g., 7Bob, etc.), or reserved keywords of that language (e.g, def in Python). Characters which do not contribute towards the semantics of a program, like trailing whitespace, comments, etc. are often discarded by the scanner.

Python provides the tokenize module in its standard library to let you play around with tokens:

Python

1

import io

2

import tokenize

3

4

code = b"color = input('Enter your favourite color: ')"

5

6

for token in tokenize.tokenize(io.BytesIO(code).readline):

7

    print(token)

Python

1

TokenInfo(type=62 (ENCODING),  string='utf-8')

2

TokenInfo(type=1  (NAME),      string='color')

3

TokenInfo(type=54 (OP),        string='=')

4

TokenInfo(type=1  (NAME),      string='input')

5

TokenInfo(type=54 (OP),        string='(')

6

TokenInfo(type=3  (STRING),    string="'Enter your favourite color: '")

7

TokenInfo(type=54 (OP),        string=')')

8

TokenInfo(type=4  (NEWLINE),   string='')

9

TokenInfo(type=0  (ENDMARKER), string='')

(Note that for the sake of readability, I’ve omitted a few columns from the result above — metadata like starting index, ending index, a copy of the line on which a token occurs, etc.)

#code quality #code review #static analysis #static code analysis #code analysis #static analysis tools #code review tips #static code analyzer #static code analysis tool #static analyzer