Royce  Reinger

1675258440

Bolt: 10x Faster Matrix and Vector Operations

Bolt 

Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations.

If you have a large collection of mostly-dense vectors and can tolerate lossy compression, Bolt can probably save you 10-200x space and compute time.

Bolt also has theoretical guarantees bounding the errors in its approximations.

EDIT: this repo now also features the source code for MADDNESS, our shiny new algorithm for approximate matrix multiplication. MADDNESS has no Python wrapper yet, and is referred to as "mithral" in the source code. The name changed because apparently I'm the only one who gets Lord of the Rings references. MADDNESS runs ridiculously fast and, under reasonable assumptions, requires zero multiply-adds. Realistically, it'll be most useful for speeding up neural net inference on CPUs, but it'll take another couple of papers to get it there; we need to generalize it to convolution and write the CUDA kernels to allow GPU training. See also the poster and slides.

EDIT2: Looking for a research project? See our list of ideas.

EDIT3: See Build.md for a working dockerfile that builds and runs Bolt, contributed by @mneilly.

NOTE: All of the code below refers to the Python wrapper for Bolt and has nothing to do with MADDNESS. The wrapper also seems to no longer build for many people. If you want to use MADDNESS, see the Python implementation driven by amm_main.py or the C++ implementation. All of the code is ugly, but the Python code should make it pretty easy to add new AMM methods/variations.

Installing

Python

  $ brew install swig  # for wrapping C++; use apt-get, yum, etc, if not OS X
  $ pip install numpy  # bolt installation needs numpy already present
  $ git clone https://github.com/dblalock/bolt.git
  $ cd bolt && python setup.py install
  $ pytest tests/  # optionally, run the tests

If you run into any problems, please don't hesitate to mention it in the Python build problems issue.

C++

Install Bazel, Google's open-source build system. Then

  $ git clone https://github.com/dblalock/bolt.git
  $ cd bolt/cpp && bazel run :main

The bazel run command will build the project and run the tests and benchmarks.

If you want to integrate Bolt with another C++ project, include cpp/src/include/public.hpp and add the remaining files under cpp/src to your builds. You should let me know if you're interested in doing such an integration because I'm hoping to see Bolt become part of many libraries and thus would be happy to help you.

Notes

Bolt currently only supports machines with AVX2 instructions, which basically means x86 machines from fall 2013 or later. Contributions for ARM support are welcome. Also note that the Bolt Python wrapper is currently configured to require Clang, since GCC apparently runs into issues.

How does it work?

Bolt is based on vector quantization. For details, see the Bolt paper or slides.
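
The general idea behind quantization with lookup tables can be sketched in plain NumPy. This is a generic product-quantization illustration, not Bolt's implementation (see the paper for what Bolt actually does); the function names fit_codebooks, encode, and approx_dot and all parameter values are hypothetical choices made for this example.

```python
import numpy as np

def fit_codebooks(X, n_subspaces=4, n_centroids=16, n_iters=10, seed=0):
    """Learn one small k-means codebook per block of columns."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(np.arange(X.shape[1]), n_subspaces)
    codebooks = []
    for cols in blocks:
        sub = X[:, cols]
        # initialize centroids from randomly chosen rows
        centroids = sub[rng.choice(len(sub), n_centroids, replace=False)].copy()
        for _ in range(n_iters):
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            assign = dists.argmin(axis=1)
            for c in range(n_centroids):
                members = sub[assign == c]
                if len(members):
                    centroids[c] = members.mean(axis=0)
        codebooks.append(centroids)
    return blocks, codebooks

def encode(X, blocks, codebooks):
    """Replace each block of each row with the index of its nearest centroid."""
    codes = np.empty((X.shape[0], len(blocks)), dtype=np.uint8)
    for j, (cols, centroids) in enumerate(zip(blocks, codebooks)):
        sub = X[:, cols]
        dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        codes[:, j] = dists.argmin(axis=1)
    return codes

def approx_dot(codes, q, blocks, codebooks):
    """Approximate X @ q using one lookup table per subspace."""
    # table[j][c] = dot product of q's j-th block with centroid c
    tables = [centroids @ q[cols] for cols, centroids in zip(blocks, codebooks)]
    # for every row, sum the table entries selected by its codes
    return sum(table[codes[:, j]] for j, table in enumerate(tables))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))
q = rng.normal(size=32)

blocks, codebooks = fit_codebooks(X)
codes = encode(X, blocks, codebooks)      # 4 bytes per row instead of 32 floats
approx = approx_dot(codes, q, blocks, codebooks)
exact = X @ q
print("correlation with exact dot products:", np.corrcoef(exact, approx)[0, 1])
```

Each query only needs a handful of small tables, after which every stored vector is handled with table lookups and additions rather than a full dot product. How accurate the result is depends heavily on the data and on the number of subspaces and centroids.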

Benchmarks

Bolt includes a thorough set of speed and accuracy benchmarks. See the experiments/ directory. This is also what you want if you want to reproduce the results in the paper.

Note that all of the timing results use the raw C++ implementation. At present, the Python wrapper is slightly slower due to Python overhead. If you're interested in having a full-speed wrapper, let me know and I'll allocate time to making this happen.

Basic usage

# X is some N x D array; queries is some iterable of length-D arrays

# these are approximately equal (though the latter are shifted and scaled)
enc = bolt.Encoder(reduction='dot').fit(X)
[np.dot(X, q) for q in queries]
[enc.transform(q) for q in queries]

# same for these
enc = bolt.Encoder(reduction='l2').fit(X)
[np.sum((X - q) * (X - q), axis=1) for q in queries]
[enc.transform(q) for q in queries]

# but enc.transform() is 10x faster or more

Example: Matrix-vector multiplies

import bolt
import numpy as np
from scipy.stats import pearsonr as corr
from sklearn.datasets import load_digits
import timeit

# for simplicity, use the sklearn digits dataset; we'll split
# it into a matrix X and a set of queries Q
X, _ = load_digits(return_X_y=True)
nqueries = 20
X, Q = X[:-nqueries], X[-nqueries:]

enc = bolt.Encoder(reduction='dot', accuracy='lowest') # can tweak acc vs speed
enc.fit(X)

dot_corrs = np.empty(nqueries)
for i, q in enumerate(Q):
    dots_true = np.dot(X, q)
    dots_bolt = enc.transform(q)
    dot_corrs[i] = corr(dots_true, dots_bolt)[0]

# dot products closely preserved despite compression
print("dot product correlation: {} +/- {}".format(
    np.mean(dot_corrs), np.std(dot_corrs)))  # > .97

# massive space savings
print(X.nbytes)  # 1777 rows * 64 cols * 8B = 909KB
print(enc.nbytes)  # 1777 * 2B = 3.55KB

# massive time savings (~10x here, but often >100x on larger
# datasets with less Python overhead; see the paper)
t_np = timeit.Timer(
    lambda: [np.dot(X, q) for q in Q]).timeit(5)        # ~9ms
t_bolt = timeit.Timer(
    lambda: [enc.transform(q) for q in Q]).timeit(5)    # ~800us
print("Numpy / BLAS time, Bolt time: {:.3f}ms, {:.3f}ms".format(
    t_np * 1000, t_bolt * 1000))

# can get output without offset/scaling if needed
dots_bolt = [enc.transform(q, unquantize=True) for q in Q]

Example: K-Nearest Neighbor / Maximum Inner Product Search

# search using squared Euclidean distances
# (still using the Digits dataset from above)
k_bolt = 10  # number of neighbors to retrieve (example value)
enc = bolt.Encoder('l2', accuracy='high').fit(X)
bolt_knn = [enc.knn(q, k_bolt) for q in Q]  # knn for each query

# search using dot product (maximum inner product search)
enc = bolt.Encoder('dot', accuracy='medium').fit(X)
bolt_knn = [enc.knn(q, k_bolt) for q in Q]  # knn for each query
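
To judge how good approximate neighbors are, one option is to compare them against exact brute-force results. The sketch below uses only NumPy and does not call Bolt; the helper names exact_knn_l2, exact_mips, and recall_at_k are hypothetical, and it assumes the approximate search returns row indices.

```python
import numpy as np

def exact_knn_l2(X, q, k):
    """Indices of the k rows of X closest to q in squared Euclidean distance."""
    dists = ((X - q) ** 2).sum(axis=1)
    idx = np.argpartition(dists, k - 1)[:k]   # k smallest, unordered
    return idx[np.argsort(dists[idx])]        # order them by distance

def exact_mips(X, q, k):
    """Indices of the k rows of X with the largest dot product with q."""
    dots = X @ q
    idx = np.argpartition(-dots, k - 1)[:k]
    return idx[np.argsort(-dots[idx])]

def recall_at_k(true_idx, approx_idx):
    """Fraction of the true neighbors that the approximate search also returned."""
    true_set = set(map(int, true_idx))
    return len(true_set & set(map(int, approx_idx))) / len(true_set)

X_demo = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [2.0, 2.0]])
print(exact_knn_l2(X_demo, np.array([0.9, 0.9]), k=2))  # [1 0]
print(exact_mips(X_demo, np.array([1.0, 1.0]), k=2))    # [2 3]
print(recall_at_k([1, 0], [0, 3]))                      # 0.5
```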

Miscellaneous

Bolt stands for "Based On Lookup Tables". Feel free to use this exciting fact at parties.

Download Details:

Author: dblalock
Source Code: https://github.com/dblalock/bolt 
License: MPL-2.0 license

#machinelearning #datamining #compress #database 

Python  Library

1657400640

Synapse: Matrix Homeserver Written in Python 3/Twisted

Introduction

Matrix is an ambitious new ecosystem for open federated Instant Messaging and VoIP. The basics you need to know to get up and running are:

  • Everything in Matrix happens in a room. Rooms are distributed and do not exist on any single server. Rooms can be located using convenience aliases like #matrix:matrix.org or #test:localhost:8448.
  • Matrix user IDs look like @matthew:matrix.org (although in the future you will normally refer to yourself and others using a third party identifier (3PID): email address, phone number, etc rather than manipulating Matrix user IDs)

The overall architecture is:

client <----> homeserver <=====================> homeserver <----> client
       https://somewhere.org/_matrix      https://elsewhere.net/_matrix

#matrix:matrix.org is the official support room for Matrix, and can be accessed by any client from https://matrix.org/docs/projects/try-matrix-now.html or via IRC bridge at irc://irc.libera.chat/matrix.

Synapse is currently in rapid development, but as of version 0.5 we believe it is sufficiently stable to be run as an internet-facing service for real usage!

About Matrix

Matrix specifies a set of pragmatic RESTful HTTP JSON APIs as an open standard, which handle:

  • Creating and managing fully distributed chat rooms with no single points of control or failure
  • Eventually-consistent cryptographically secure synchronisation of room state across a global open network of federated servers and services
  • Sending and receiving extensible messages in a room with (optional) end-to-end encryption
  • Inviting, joining, leaving, kicking, banning room members
  • Managing user accounts (registration, login, logout)
  • Using 3rd Party IDs (3PIDs) such as email addresses, phone numbers, Facebook accounts to authenticate, identify and discover users on Matrix.
  • Placing 1:1 VoIP and Video calls

These APIs are intended to be implemented on a wide range of servers, services and clients, letting developers build messaging and VoIP functionality on top of the entirely open Matrix ecosystem rather than using closed or proprietary solutions. The hope is for Matrix to act as the building blocks for a new generation of fully open and interoperable messaging and VoIP apps for the internet.

Synapse is a Matrix "homeserver" implementation developed by the matrix.org core team, written in Python 3/Twisted.

In Matrix, every user runs one or more Matrix clients, which connect through to a Matrix homeserver. The homeserver stores all their personal chat history and user account information - much as a mail client connects through to an IMAP/SMTP server. Just like email, you can either run your own Matrix homeserver and control and own your own communications and history or use one hosted by someone else (e.g. matrix.org) - there is no single point of control or mandatory service provider in Matrix, unlike WhatsApp, Facebook, Hangouts, etc.

We'd like to invite you to join #matrix:matrix.org (via https://matrix.org/docs/projects/try-matrix-now.html), run a homeserver, take a look at the Matrix spec, and experiment with the APIs and Client SDKs.

Thanks for using Matrix!

Support

For support installing or managing Synapse, please join #synapse:matrix.org (from a matrix.org account if necessary) and ask questions there. We do not use GitHub issues for support requests, only for bug reports and feature requests.

Synapse's documentation is nicely rendered on GitHub Pages, with its source available in docs.

Synapse Installation

Connecting to Synapse from a client

The easiest way to try out your new Synapse installation is by connecting to it from a web client.

Unless you are running a test instance of Synapse on your local machine, in general, you will need to enable TLS support before you can successfully connect from a client: see TLS certificates.

An easy way to get started is to login or register via Element at https://app.element.io/#/login or https://app.element.io/#/register respectively. You will need to change the server you are logging into from matrix.org and instead specify a Homeserver URL of https://<server_name>:8448 (or just https://<server_name> if you are using a reverse proxy). If you prefer to use another client, refer to our client breakdown.

If all goes well you should at least be able to log in, create a room, and start sending messages.

Registering a new user from a client

By default, registration of new users via Matrix clients is disabled. To enable it, specify enable_registration: true in homeserver.yaml. (It is then recommended to also set up CAPTCHA - see docs/CAPTCHA_SETUP.md.)

Once enable_registration is set to true, it is possible to register a user via a Matrix client.

Your new user name will be formed partly from the server_name, and partly from a localpart you specify when you create the account. Your name will take the form of:

@localpart:my.domain.name

(pronounced "at localpart on my dot domain dot name").
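
For illustration, the two parts of such an ID can be separated with a few lines of Python. This is a hypothetical helper, not part of Synapse's API, and it only checks the basic shape described above:

```python
def parse_user_id(user_id):
    """Split a Matrix user ID of the form @localpart:server_name."""
    if not user_id.startswith("@") or ":" not in user_id:
        raise ValueError(f"not a Matrix user ID: {user_id!r}")
    # split on the first colon only, since the server name may include a port
    localpart, _, server_name = user_id[1:].partition(":")
    if not localpart or not server_name:
        raise ValueError(f"not a Matrix user ID: {user_id!r}")
    return localpart, server_name

print(parse_user_id("@localpart:my.domain.name"))  # ('localpart', 'my.domain.name')
print(parse_user_id("@test:localhost:8448"))       # ('test', 'localhost:8448')
```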

As when logging in, you will need to specify a "Custom server". Specify your desired localpart in the 'User name' box.

Security note

Matrix serves raw, user-supplied data in some APIs -- specifically the content repository endpoints.

Whilst we make a reasonable effort to mitigate against XSS attacks (for instance, by using CSP), a Matrix homeserver should not be hosted on a domain hosting other web applications. This especially applies to sharing the domain with Matrix web clients and other sensitive applications like webmail. See https://developer.github.com/changes/2014-04-25-user-content-security for more information.

Ideally, the homeserver should not simply be on a different subdomain, but on a completely different registered domain (also known as top-level site or eTLD+1). This is because some attacks are still possible as long as the two applications share the same registered domain.

To illustrate this with an example, if your Element Web or other sensitive web application is hosted on A.example1.com, you should ideally host Synapse on example2.com. Some amount of protection is offered by hosting on B.example1.com instead, so this is also acceptable in some scenarios. However, you should not host your Synapse on A.example1.com.

Note that all of the above refers exclusively to the domain used in Synapse's public_baseurl setting. In particular, it has no bearing on the domain mentioned in MXIDs hosted on that server.

Following this advice ensures that even if an XSS is found in Synapse, the impact to other applications will be minimal.

Upgrading an existing Synapse

The instructions for upgrading synapse are in the upgrade notes. Please check these instructions as upgrading may require extra steps for some versions of synapse.

Using a reverse proxy with Synapse

It is recommended to put a reverse proxy such as nginx, Apache, Caddy, HAProxy or relayd in front of Synapse. One advantage of doing so is that it means that you can expose the default https port (443) to Matrix clients without needing to run Synapse with root privileges.

For information on configuring one, see docs/reverse_proxy.md.

Identity Servers

Identity servers have the job of mapping email addresses and other 3rd Party IDs (3PIDs) to Matrix user IDs, as well as verifying the ownership of 3PIDs before creating that mapping.

They are not where accounts or credentials are stored - these live on home servers. Identity Servers are just for mapping 3rd party IDs to matrix IDs.

This process is very security-sensitive, as there is obvious risk of spam if it is too easy to sign up for Matrix accounts or harvest 3PID data. In the longer term, we hope to create a decentralised system to manage it (matrix-doc #712), but in the meantime, the role of managing trusted identity in the Matrix ecosystem is farmed out to a cluster of known trusted ecosystem partners, who run 'Matrix Identity Servers' such as Sydent, whose role is purely to authenticate and track 3PID logins and publish end-user public keys.

You can host your own copy of Sydent, but this will prevent you reaching other users in the Matrix ecosystem via their email address, and prevent them finding you. We therefore recommend that you use one of the centralised identity servers at https://matrix.org or https://vector.im for now.

To reiterate: the Identity server will only be used if you choose to associate an email address with your account, or send an invite to another user via their email address.

Password reset

Users can reset their password through their client. Alternatively, a server admin can reset a user's password using the admin API or by directly editing the database as shown below.

First calculate the hash of the new password:

$ ~/synapse/env/bin/hash_password
Password:
Confirm password:
$2a$12$xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Then update the users table in the database:

UPDATE users SET password_hash='$2a$12$xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    WHERE name='@test:test.com';

Synapse Development

The best place to get started is our guide for contributors. This is part of our larger documentation, which includes information for synapse developers as well as synapse administrators.

Developers might be particularly interested in:

Alongside all that, join our developer community on Matrix: #synapse-dev:matrix.org, featuring real humans!

Quick start

Before setting up a development environment for synapse, make sure you have the system dependencies (such as the python header files) installed - see Platform-specific prerequisites.

To check out a synapse for development, clone the git repo into a working directory of your choice:

git clone https://github.com/matrix-org/synapse.git
cd synapse

Synapse has a number of external dependencies. We maintain a fixed development environment using Poetry. First, install poetry. We recommend:

pip install --user pipx
pipx install poetry

as described here. (See poetry's installation docs for other installation methods.) Then ask poetry to create a virtual environment from the project and install Synapse's dependencies:

poetry install --extras "all test"

This will run a process of downloading and installing all the needed dependencies into a virtual env.

We recommend using the demo which starts 3 federated instances running on ports 8080 - 8082:

poetry run ./demo/start.sh

(to stop, you can use poetry run ./demo/stop.sh)

See the demo documentation for more information.

If you just want to start a single instance of the app and run it directly:

# Create the homeserver.yaml config once
poetry run synapse_homeserver \
  --server-name my.domain.name \
  --config-path homeserver.yaml \
  --generate-config \
  --report-stats=[yes|no]

# Start the app
poetry run synapse_homeserver --config-path homeserver.yaml

Running the unit tests

After getting up and running, you may wish to run Synapse's unit tests to check that everything is installed correctly:

poetry run trial tests

This should end with a 'PASSED' result (note that exact numbers will differ):

Ran 1337 tests in 716.064s

PASSED (skips=15, successes=1322)

For more tips on running the unit tests, like running a specific test or to see the logging output, see the CONTRIBUTING doc.

Running the Integration Tests

Synapse is accompanied by SyTest, a Matrix homeserver integration testing suite, which uses HTTP requests to access the API as a Matrix client would. It is able to run Synapse directly from the source tree, so installation of the server is not required.

Testing with SyTest is recommended for verifying that changes related to the Client-Server API are functioning correctly. See the SyTest installation instructions for details.

Platform dependencies

Synapse uses a number of platform dependencies such as Python and PostgreSQL, and aims to follow supported upstream versions. See the docs/deprecation_policy.md document for more details.

Troubleshooting

Need help? Join our community support room on Matrix: #synapse:matrix.org

Running out of File Handles

If synapse runs out of file handles, it typically fails badly - live-locking at 100% CPU, and/or failing to accept new TCP connections (blocking the connecting client). Matrix currently can legitimately use a lot of file handles, thanks to busy rooms like #matrix:matrix.org containing hundreds of participating servers. The first time a server talks in a room it will try to connect simultaneously to all participating servers, which could exhaust the available file descriptors between DNS queries & HTTPS sockets, especially if DNS is slow to respond. (We need to improve the routing algorithm used to be better than full mesh, but as of March 2019 this hasn't happened yet).

If you hit this failure mode, we recommend increasing the maximum number of open file handles to be at least 4096 (assuming a default of 1024 or 256). This is typically done by editing /etc/security/limits.conf
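
Before changing system limits, it can help to check which limit a process actually sees. The following is a minimal sketch using Python's standard resource module (Unix only); the helper names are hypothetical, and the 4096 default simply mirrors the recommendation above. It reports the limits of the Python process running it, which may differ from those of a Synapse service started by another user or by an init system.

```python
import resource

def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def needs_raise(soft, wanted=4096):
    """True if the soft limit is finite and below the wanted value."""
    return soft != resource.RLIM_INFINITY and soft < wanted

soft, hard = nofile_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
if needs_raise(soft):
    print("soft limit is below 4096; consider raising it")
```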

Separately, Synapse may leak file handles if inbound HTTP requests get stuck during processing - e.g. blocked behind a lock or talking to a remote server etc. This is best diagnosed by matching up the 'Received request' and 'Processed request' log lines and looking for any 'Processed request' lines which take more than a few seconds to execute. Please let us know at #synapse:matrix.org if you see this failure mode so we can help debug it.

Help!! Synapse is slow and eats all my RAM/CPU!

First, ensure you are running the latest version of Synapse, using Python 3 with a PostgreSQL database.

Synapse's architecture is quite RAM hungry currently - we deliberately cache a lot of recent room data and metadata in RAM in order to speed up common requests. We'll improve this in the future, but for now the easiest way to reduce the RAM usage (at the risk of slowing things down) is to set the almost-undocumented SYNAPSE_CACHE_FACTOR environment variable. The default is 0.5, which can be decreased to reduce RAM usage in memory-constrained environments, or increased if performance starts to degrade.

However, degraded performance due to a low cache factor, common on machines with slow disks, often leads to explosions in memory use due to backlogged requests. In this case, reducing the cache factor will make things worse. Instead, try increasing it drastically. 2.0 is a good starting value.

Using libjemalloc can also yield a significant improvement in overall memory use, and especially in terms of giving back RAM to the OS. To use it, the library must simply be put in the LD_PRELOAD environment variable when launching Synapse. On Debian, this can be done by installing the libjemalloc1 package and adding this line to /etc/default/matrix-synapse:

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1

This can make a significant difference on Python 2.7 - it's unclear how much of an improvement it provides on Python 3.x.

If you're encountering high CPU use by the Synapse process itself, you may be affected by a bug with presence tracking that leads to a massive excess of outgoing federation requests (see discussion). If metrics indicate that your server is also issuing far more outgoing federation requests than can be accounted for by your users' activity, this is a likely cause. The misbehavior can be worked around by setting the following in the Synapse config file:

presence:
    enabled: false

People can't accept room invitations from me

The typical failure mode here is that you send an invitation to someone to join a room or direct chat, but when they go to accept it, they get an error (typically along the lines of "Invalid signature"). They might see something like the following in their logs:

2019-09-11 19:32:04,271 - synapse.federation.transport.server - 288 - WARNING - GET-11752 - authenticate_request failed: 401: Invalid signature for server <server> with key ed25519:a_EqML: Unable to verify signature for <server>

This is normally caused by a misconfiguration in your reverse-proxy. See docs/reverse_proxy.md and double-check that your settings are correct.

Download Details:
Author: matrix-org
Source Code: https://github.com/matrix-org/synapse
License: Apache-2.0 license

#python

Ray  Patel

1619565060

Ternary operator in Python?

  1. Ternary Operator in Python

What is a ternary operator? The ternary operator is a conditional expression: it evaluates a condition and yields one of two values depending on whether the condition is true or false. It is the shortest way to write an if-else statement, replacing multiline if-else code with a single line.

Syntax: value_if_true if condition else value_if_false

(Python does not use the C-style form condition ? value_if_true : value_if_false.)

condition: a boolean expression that evaluates to true or false.

value_if_true: the value produced if the condition evaluates to true.

value_if_false: the value produced if the condition evaluates to false.

How do you use the ternary operator in Python? Here are some examples of the Python ternary operator with if-else.

Brief description of the examples: we take two variables, a and b, where a is 10 and b is 20, and find the minimum using a ternary operator in one line of code (min = a if a < b else b). If a is less than b, the result is a; otherwise it is b. The second example is the same as the first, and the third example checks whether a number is even or odd.
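
The examples described above might look like the following sketch. The variable names are illustrative, and the result is stored in minimum rather than min to avoid shadowing Python's built-in min function.

```python
a, b = 10, 20

# Example 1: pick the smaller of two values in one line
minimum = a if a < b else b
print(minimum)  # 10

# Equivalent multiline if-else, for comparison
if a < b:
    minimum_long = a
else:
    minimum_long = b
print(minimum_long)  # 10

# Example 3: check whether a number is even or odd
n = 7
parity = "even" if n % 2 == 0 else "odd"
print(parity)  # odd
```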

#python #python ternary operator #ternary operator #ternary operator in if-else #ternary operator in python #ternary operator with dict #ternary operator with lambda

Raju Bhadra

1625362361

BOLT Review - (Real) BOLT App Scam or Legit? Expert Opinion! - Art Flair

Introduction of BOLT

Brand New, Lightning Fast All-In-One Video Hosting, Video Player & Video Marketing Software That Boosts Engagements, Gets You FREE Traffic, And More Views For A Low, 1-Time Payment.

What’s The Power of BOLT

-Video Marketing & Traffic App that helps anyone generate sales quickly.
-Software Based On Real Life Problem Solving.
-High Converting Funnel. Every upgrade Compliments the previous one.
-Generate Profits Without Selling Anything.
-LIVE Proofs and Real Time Case Studies.
-Thousands in Prizes Paid Instantly.
-Newbies can Drive Traffic at Zero Cost.

IT’S A COMPLETE VIDEO MARKETING SOLUTION

That Will Save You Time, Get You A Tidal Wave Of FREE Traffic, AND We’re Able To Use The Included Commercial License To Make MASSIVE Profits Without Any Hard Work Or Video Creation Required…

Most people are fumbling along and struggling to get results from their video marketing… That’s because most people are using expensive video marketing platforms that cost an arm and a leg, and because of their built-in ad systems, they actually make it really hard for you to make money…

Inside, You Get Everything You Need To Finally Get HUGE Results From Video Marketing… Even If You’re Just Starting Out…

SEE WHY I STRONGLY DO NOT RECOMMEND BOLT SOFTWARE!!

#bolt review #bolt #bolt review art flair #bolt reviews

Abdullah  Kozey

1617738420

Unformatted input/output operations In C++

In this article, we will discuss the unformatted Input/Output operations In C++. Using objects cin and cout for the input and the output of data of various types is possible because of overloading of operator >> and << to recognize all the basic C++ types. The operator >> is overloaded in the istream class and operator << is overloaded in the ostream class.

The general format for reading data from the keyboard:

cin >> var1 >> var2 >> …. >> var_n;

  • Here, var1, var2, ..., var_n are the names of variables that have already been declared.
  • The input data must be separated by white space characters, and the type of each input value must be compatible with the type of the variable it is read into.
  • The operator >> reads the data character by character and assigns it to the indicated location.
  • Reading into a variable stops when white space occurs or when a character is encountered that does not match the destination type.

#c++ #c++ programs #c++-operator overloading #cpp-input-output #cpp-operator #cpp-operator-overloading #operators

Kasey  Turcotte

1623233520

Efficient Pandas: Apply vs Vectorized Operations

Time and efficiency matter

Pandas is one of the most commonly used data analysis and manipulation libraries in the data science ecosystem. It offers plenty of functions and methods to perform efficient operations.

What I like most about Pandas is that there are almost always multiple ways to accomplish a given task. However, we should consider time and computational complexity when selecting a method from the available options.

It is not enough just to complete a given task. We should make it as efficient as possible. Thus, having a comprehensive understanding of how functions and methods work is of crucial importance.

In this article, we will work through examples that compare the apply and applymap functions of pandas to vectorized operations. The apply and applymap functions come in handy for many tasks. However, as the size of the data increases, time becomes an issue.
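
As a rough illustration of the comparison, the sketch below adds two columns once with a row-wise apply and once with a vectorized expression. The DataFrame, column names, and sizes are made up for this example, and the timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(10_000), "b": np.arange(10_000) * 2})

def with_apply(frame):
    # calls the Python lambda once per row
    return frame.apply(lambda row: row["a"] + row["b"], axis=1)

def vectorized(frame):
    # operates on whole columns at once
    return frame["a"] + frame["b"]

# both approaches produce the same values
same = bool((with_apply(df) == vectorized(df)).all())
print("results match:", same)

t_apply = timeit.timeit(lambda: with_apply(df), number=3)
t_vec = timeit.timeit(lambda: vectorized(df), number=3)
print(f"apply: {t_apply:.4f}s, vectorized: {t_vec:.4f}s")
```

The row-wise version pays Python function-call overhead for every row, which is why the gap tends to widen as the data grows.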

#programming #data-science #machine-learning #artificial-intelligence #efficient pandas: apply vs vectorized operations #apply vs vectorized operations