1601506800

How does your brain represent the environment? The partial answer is the admittedly complex title, which we will explore in this article.

Representation is a hard problem for both neuroscience and AI, and a good explanation of these terms is, I think, helpful if we are to build better AIs. This is my attempt.

A representation is simply that: the internal description of something, whether a thing, an idea, or even a thought or feeling. Alas, these latter ones sit very high in the hierarchy of representations, and we don’t yet know exactly how they are formed in biological brains. A simpler representation is your perception of a physical thing out there, an apple for instance.

#computer-science #data-science #neuroscience #artificial-intelligence

1624298520

In a series of weekly articles, I will be covering some important topics of statistics with a twist.

The goal is to use Python to help us get intuition on complex concepts, empirically test theoretical proofs, or build algorithms from scratch. In this series, you will find articles covering topics such as random variables, sampling distributions, confidence intervals, significance tests, and more.

At the end of each article, you can find exercises to test your knowledge. The solutions will be shared in the article of the following week.

Articles published so far:

- Bernoulli and Binomial Random Variables with Python
- From Binomial to Geometric and Poisson Random Variables with Python
- Sampling Distributions with Python

As usual, the code is available on my GitHub.

#statistics #distribution #python #machine-learning #sampling distributions with python #sampling distributions

1623263280

This blog is an abridged version of the talk that I gave at the Apache Ignite community meetup. You can download the slides that I presented at the meetup here. In the talk, I explain how data in Apache Ignite is distributed.

Inevitably, the evolution of a system that requires data storage and processing reaches a threshold. Either too much data is accumulated, so the data simply does not fit into the storage device, or the load increases so rapidly that a single server cannot manage the number of queries. Both scenarios happen frequently.

Usually, in such situations, two solutions come in handy: sharding the data storage or migrating to a distributed database. The solutions have features in common, the key one being that both use a set of nodes to manage data. Throughout this post, I will refer to this set of nodes as the “topology.”

The problem of data distribution among the nodes of the topology can be described in terms of the set of requirements that the distribution must satisfy:

- Algorithm. The algorithm allows the topology nodes and front-end applications to determine unambiguously on which node or nodes an object (or key) is located.
- Distribution uniformity. The more uniform the data distribution is among the nodes, the more uniform the workloads on the nodes are. Here, I assume that the nodes have approximately equal resources.
- Minimal disruption. If the topology is changed because of a node failure, the changes in distribution should affect only the data that is on the failed node. It should also be noted that, if a node is added to the topology, no data swap should occur among the nodes that are already present in the topology.
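One classic way to satisfy these requirements is consistent hashing, which the tags on this post hint at. The sketch below is a generic illustration of the idea, not Apache Ignite's actual affinity function, and all names in it are made up: each node is hashed to many points on a circle (virtual nodes, for uniformity), a key lives on the first node clockwise from its own hash, and removing a node only moves the keys that lived on it.

```python
import bisect
import hashlib

class HashRing:
    """A minimal consistent-hashing ring (illustrative only)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual nodes per physical node, for uniformity
        self.ring = {}            # point on the circle -> node name
        self.sorted_points = []
        for node in nodes:
            self.add_node(node)

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}:{i}")
            self.ring[point] = node
            bisect.insort(self.sorted_points, point)

    def remove_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}:{i}")
            del self.ring[point]
            self.sorted_points.remove(point)

    def node_for(self, key):
        # First ring point clockwise from the key's hash (wrapping around).
        point = self._hash(key)
        idx = bisect.bisect(self.sorted_points, point) % len(self.sorted_points)
        return self.ring[self.sorted_points[idx]]

ring = HashRing(["node-1", "node-2", "node-3"])
before = {f"key-{i}": ring.node_for(f"key-{i}") for i in range(1000)}

ring.remove_node("node-2")
after = {k: ring.node_for(k) for k in before}

# Minimal disruption: only the keys that lived on node-2 change owner.
moved = sum(before[k] != after[k] for k in before)
```

Note how the "minimal disruption" requirement falls out of the structure: removing a node deletes only its own points from the circle, so every other key's clockwise successor is unchanged.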

#tutorial #big data #distributed systems #apache ignite #distributed storage #data distribution #consistent hashing

1623896372

e-Distribución is an energy distribution company that covers most of southern Spain. If you live in this area, you can probably register on their website to get information about your power demand, energy consumption, or even cycle billing (in terms of consumption).

Although their application is great, this integration enables you to add a sensor to Home Assistant that gets updated automatically. However, it still has some limitations, and no front-end support is provided at the moment.

- Install HACS
- Add this repo (https://github.com/uvejota/edistribucion) to the custom repositories in HACS
- Install the integration. Please consider that alpha/beta versions are untested, and they might cause bans due to excessive polling.
- Add this basic configuration to your Home Assistant configuration files (e.g., `configuration.yml`)

```
sensor:
  - platform: edistribucion
    username: !secret eds_user ## this key may exist in secrets.yaml!
    password: !secret eds_password ## this key may exist in secrets.yaml!
```

At this point, you get a unique default sensor for the integration, namely `sensor.edistribucion`, linked to those credentials on the e-Distribución platform. This default sensor assumes the first CUPS that appears in the fetched list of CUPS, which is frequently the most recent contract, so this configuration should be valid for most users. If you need a more detailed configuration, please check the section “What about customisation?” below.

#machine learning #distribution #python #home assistant custom integration for e-distribution with python #home assistant #e-distribution with python

1598905320

One of the most important concepts discussed in the context of inferential data analysis is the idea of sampling distributions. Understanding sampling distributions helps us better comprehend and interpret results from our descriptive as well as predictive data analysis investigations. Sampling distributions are also frequently used in decision making under uncertainty and hypothesis testing.

You may already be familiar with the idea of probability distributions. A probability distribution gives us an understanding of the probability and likelihood associated with values (or ranges of values) that a random variable may assume. A random variable is a quantity whose value (outcome) is determined randomly. Some examples of a random variable include the monthly revenue of a retail store, the number of customers arriving at a car wash location on any given day, the number of accidents on a certain highway on any given day, weekly sales volume at a retail store, etc. Although the outcome of a random variable is random, the probability distribution allows us to gain an understanding of the likelihood and probabilities of different values occurring in the outcome. Sampling distributions are probability distributions that we attach to sample statistics of a sample.
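To make the car-wash example concrete, here is a small simulation. The daily arrival count is modeled as a Poisson random variable, and the rate of 20 customers per day is an assumption invented purely for illustration; the sampler counts exponential inter-arrival times that fit into one day.

```python
import random

random.seed(0)

def poisson_sample(lam):
    """Draw one Poisson(lam) value by counting exponential
    inter-arrival times that fit into one unit of time."""
    t, count = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            return count
        count += 1

# Simulate 10,000 days at an assumed rate of 20 customers per day.
daily_counts = [poisson_sample(20) for _ in range(10_000)]
mean = sum(daily_counts) / len(daily_counts)
# The empirical mean should be close to the assumed rate of 20,
# while individual days vary randomly around it.
```

Looking at the spread of `daily_counts` is exactly what "the probability distribution of a random variable" means in practice: many possible outcomes, each with its own likelihood.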

A sample statistic (also known simply as a statistic) is a value learned from a sample. Here is an example: suppose you collect the results of a survey filled out by 250 randomly selected individuals who live in a certain neighborhood. Based on the survey results, you realize that the average annual income of the individuals in this sample is $82,512. This is a sample statistic and is denoted by *x̅ = $82,512*. The sample mean is also a random variable (denoted by X̅) with a probability distribution. The probability distribution for X̅ is called the sampling distribution of the sample mean. Sampling distributions can be defined for other types of sample statistics, including the sample proportion, sample regression coefficients, the sample correlation coefficient, etc.

You might be wondering why X̅ is a random variable while the sample mean is just a single number! The key to understanding this lies in the idea of *sample-to-sample variability*: samples drawn from the same population are not identical. Here’s an example: suppose that, in the survey above, instead of conducting only one survey of 250 individuals living in a particular neighborhood, we conducted 35 surveys of the same size in that neighborhood. If we calculated the sample mean *x̅* for each of the 35 samples, we would get 35 different values. Now suppose, hypothetically, that we conducted many, many surveys of the same size in that neighborhood. We would get many, many (different) values for the sample mean. The distribution of those sample means is what we call the sampling distribution of the sample mean. Thinking about the sample mean from this perspective, we can see how X̅ (note the capital letter) is the random variable representing sample means, and *x̅* (note the small letter) is just one realization of that random variable.
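This thought experiment is easy to run in code. The population below is synthetic (a right-skewed, income-like distribution invented for illustration; it does not reproduce the $82,512 survey above), but the mechanics are exactly as described: draw many samples of 250, record each sample mean, and look at the distribution of those means.

```python
import random
import statistics

random.seed(1)

# Synthetic, income-like population (log-normal, so it is right-skewed).
population = [random.lognormvariate(11.2, 0.5) for _ in range(100_000)]

def survey(n=250):
    """One survey of n individuals: returns one realization x-bar
    of the random variable X-bar (the sample mean)."""
    return statistics.mean(random.sample(population, n))

# Many repeated surveys: the collection of their means approximates
# the sampling distribution of the sample mean.
means = [survey() for _ in range(1_000)]

pop_mean = statistics.mean(population)
# The sample means cluster tightly around the population mean, and their
# spread is far smaller than the spread of individual incomes
# (roughly sigma / sqrt(n), by the central limit theorem).
```

Each entry in `means` plays the role of one *x̅*; the histogram of all of them is the sampling distribution of X̅.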

#hypothesis-testing #python #distribution #sampling-distribution #statistics