Wayback is a tool for snapshotting webpages into time capsules. It runs as a command-line tool or as a Docker container.
The simplest, cross-platform way to install wayback is to download the executable from GitHub Releases and place it in your PATH.
From source:
go install github.com/wabarc/wayback/cmd/wayback@latest
From GitHub Releases:
curl -fsSL https://github.com/wabarc/wayback/raw/main/install.sh | sh
or via Bina:
curl -fsSL https://bina.egoist.dev/wabarc/wayback | sh
Using Snapcraft (on GNU/Linux)
sudo snap install wayback
Via APT:
curl -fsSL https://repo.wabarc.eu.org/apt/gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/packages.wabarc.gpg
echo "deb [arch=amd64,arm64,armhf signed-by=/usr/share/keyrings/packages.wabarc.gpg] https://repo.wabarc.eu.org/apt/ /" | sudo tee /etc/apt/sources.list.d/wayback.list
sudo apt update
sudo apt install wayback
Via RPM:
sudo rpm --import https://repo.wabarc.eu.org/yum/gpg.key
sudo tee /etc/yum.repos.d/wayback.repo > /dev/null <<EOT
[wayback]
name=Wayback Archiver
baseurl=https://repo.wabarc.eu.org/yum/
enabled=1
gpgcheck=1
gpgkey=https://repo.wabarc.eu.org/yum/gpg.key
EOT
sudo dnf install -y wayback
Via Homebrew:
brew tap wabarc/wayback
brew install wayback
$ wayback -h
A command-line tool and daemon service for archiving webpages.
Usage:
wayback [flags]
Examples:
wayback https://www.wikipedia.org
wayback https://www.fsf.org https://www.eff.org
wayback --ia https://www.fsf.org
wayback --ia --is -d telegram -t your-telegram-bot-token
WAYBACK_SLOT=pinata WAYBACK_APIKEY=YOUR-PINATA-APIKEY \
WAYBACK_SECRET=YOUR-PINATA-SECRET wayback --ip https://www.fsf.org
Flags:
--chatid string Telegram channel id
-c, --config string Configuration file path, defaults: ./wayback.conf, ~/wayback.conf, /etc/wayback.conf
-d, --daemon strings Run as daemon service, supported services are telegram, web, mastodon, twitter, discord, slack, irc
--debug Enable debug mode (default mode is false)
-h, --help help for wayback
--ia Wayback webpages to Internet Archive
--info Show application information
--ip Wayback webpages to IPFS
--ipfs-host string IPFS daemon host; not required unless IPFS is enabled (default "127.0.0.1")
-m, --ipfs-mode string IPFS mode (default "pinner")
-p, --ipfs-port uint IPFS daemon port (default 5001)
--is Wayback webpages to Archive Today
--ph Wayback webpages to Telegraph
--print Show application configurations
-t, --token string Telegram Bot API Token
--tor Snapshot webpage via Tor anonymity network
--tor-key string The private key for Tor Hidden Service
-v, --version version for wayback
Wayback one or more URLs to the Internet Archive and archive.today:
wayback https://www.wikipedia.org
wayback https://www.fsf.org https://www.eff.org
Wayback a URL to the Internet Archive, archive.today, or IPFS:
// Internet Archive
$ wayback --ia https://www.fsf.org
// archive.today
$ wayback --is https://www.fsf.org
// IPFS
$ wayback --ip https://www.fsf.org
When using IPFS, you can also specify a pinning service:
$ export WAYBACK_SLOT=pinata
$ export WAYBACK_APIKEY=YOUR-PINATA-APIKEY
$ export WAYBACK_SECRET=YOUR-PINATA-SECRET
$ wayback --ip https://www.fsf.org
// or
$ WAYBACK_SLOT=pinata WAYBACK_APIKEY=YOUR-PINATA-APIKEY \
$ WAYBACK_SECRET=YOUR-PINATA-SECRET wayback --ip https://www.fsf.org
More details about pinning services.
With telegram bot:
wayback --ia --is --ip -d telegram -t your-telegram-bot-token
Publish message to your Telegram channel at the same time:
wayback --ia --is --ip -d telegram -t your-telegram-bot-token --chatid your-telegram-channel-name
You can also run in debug mode:
wayback -d telegram -t YOUR-BOT-TOKEN --debug
To serve on both Telegram and a Tor hidden service:
wayback -d telegram -t YOUR-BOT-TOKEN -d web
URLs from file:
wayback url.txt
cat url.txt | wayback
By default, wayback looks for configuration options in the following files, which are parsed in order:
./wayback.conf
~/wayback.conf
/etc/wayback.conf
Use the -c / --config option to specify a different configuration file.
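A minimal wayback.conf might look like the following. This is a sketch, assuming the file uses the same KEY=value names as the environment variables documented below; the values shown are placeholders:

```
WAYBACK_ENABLE_IA=true
WAYBACK_ENABLE_IS=true
WAYBACK_STORAGE_DIR=/var/lib/wayback
LOG_LEVEL=info
```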
You can also specify configuration options via command flags or environment variables; an overview of all options follows.
Flags | Environment Variable | Default | Description |
---|---|---|---|
--debug | DEBUG | false | Enable debug mode, override LOG_LEVEL |
-c , --config | - | - | Configuration file path, defaults: ./wayback.conf , ~/wayback.conf , /etc/wayback.conf |
- | LOG_TIME | true | Display the date and time in log messages |
- | LOG_LEVEL | info | Log level, supported level are debug , info , warn , error , fatal , defaults to info |
- | ENABLE_METRICS | false | Enable metrics collector |
- | WAYBACK_LISTEN_ADDR | 0.0.0.0:8964 | The listen address for the HTTP server |
- | CHROME_REMOTE_ADDR | - | Chrome/Chromium remote debugging address, for screenshot, format: host:port , wss://domain.tld |
- | WAYBACK_POOLING_SIZE | 3 | Number of workers in the wayback pool |
- | WAYBACK_BOLT_PATH | ./wayback.db | File path of the bolt database |
- | WAYBACK_STORAGE_DIR | - | Directory to store binary files, e.g. PDF and HTML files |
- | WAYBACK_MAX_MEDIA_SIZE | 512MB | Maximum size of media to download |
- | WAYBACK_MEDIA_SITES | - | Extra media websites to support, comma-separated |
- | WAYBACK_TIMEOUT | 300 | Timeout for a single wayback request, in seconds |
- | WAYBACK_MAX_RETRIES | 2 | Maximum retries for a single wayback request |
- | WAYBACK_USERAGENT | WaybackArchiver/1.0 | User-Agent for wayback requests |
- | WAYBACK_FALLBACK | off | Use Google cache as a fallback if the original webpage is unavailable |
- | WAYBACK_MEILI_ENDPOINT | - | Meilisearch API endpoint |
- | WAYBACK_MEILI_INDEXING | capsules | Meilisearch indexing name |
- | WAYBACK_MEILI_APIKEY | - | Meilisearch admin API key |
-d , --daemon | - | - | Run as daemon service, e.g. telegram , web , mastodon , twitter , discord |
--ia | WAYBACK_ENABLE_IA | true | Wayback webpages to Internet Archive |
--is | WAYBACK_ENABLE_IS | true | Wayback webpages to Archive Today |
--ip | WAYBACK_ENABLE_IP | false | Wayback webpages to IPFS |
--ph | WAYBACK_ENABLE_PH | false | Wayback webpages to Telegra.ph, requires Chrome/Chromium |
--ipfs-host | WAYBACK_IPFS_HOST | 127.0.0.1 | IPFS daemon service host |
-p , --ipfs-port | WAYBACK_IPFS_PORT | 5001 | IPFS daemon service port |
-m , --ipfs-mode | WAYBACK_IPFS_MODE | pinner | IPFS mode for preserving webpages, e.g. daemon , pinner |
- | WAYBACK_IPFS_TARGET | web3storage | The IPFS pinning service used to store files; supported pinners: infura, pinata, nftstorage, web3storage |
- | WAYBACK_IPFS_APIKEY | - | Apikey of the IPFS pinning service |
- | WAYBACK_IPFS_SECRET | - | Secret of the IPFS pinning service |
- | WAYBACK_GITHUB_TOKEN | - | GitHub Personal Access Token, requires the repo scope |
- | WAYBACK_GITHUB_OWNER | - | GitHub account name |
- | WAYBACK_GITHUB_REPO | - | GitHub repository to publish results |
- | WAYBACK_NOTION_TOKEN | - | Notion integration token |
- | WAYBACK_NOTION_DATABASE_ID | - | Notion database ID for archiving results |
-t , --token | WAYBACK_TELEGRAM_TOKEN | - | Telegram Bot API Token |
--chatid | WAYBACK_TELEGRAM_CHANNEL | - | The Telegram public/private channel id to publish archive results |
- | WAYBACK_TELEGRAM_HELPTEXT | - | The help text for Telegram command |
- | WAYBACK_MASTODON_SERVER | - | Domain of Mastodon instance |
- | WAYBACK_MASTODON_KEY | - | The client key of your Mastodon application |
- | WAYBACK_MASTODON_SECRET | - | The client secret of your Mastodon application |
- | WAYBACK_MASTODON_TOKEN | - | The access token of your Mastodon application |
- | WAYBACK_TWITTER_CONSUMER_KEY | - | The customer key of your Twitter application |
- | WAYBACK_TWITTER_CONSUMER_SECRET | - | The customer secret of your Twitter application |
- | WAYBACK_TWITTER_ACCESS_TOKEN | - | The access token of your Twitter application |
- | WAYBACK_TWITTER_ACCESS_SECRET | - | The access secret of your Twitter application |
- | WAYBACK_IRC_NICK | - | IRC nick |
- | WAYBACK_IRC_PASSWORD | - | IRC password |
- | WAYBACK_IRC_CHANNEL | - | IRC channel |
- | WAYBACK_IRC_SERVER | irc.libera.chat:6697 | IRC server, requires TLS |
- | WAYBACK_MATRIX_HOMESERVER | https://matrix.org | Matrix homeserver |
- | WAYBACK_MATRIX_USERID | - | Matrix unique user ID, format: @foo:example.com |
- | WAYBACK_MATRIX_ROOMID | - | Matrix internal room ID, format: !bar:example.com |
- | WAYBACK_MATRIX_PASSWORD | - | Matrix password |
- | WAYBACK_DISCORD_BOT_TOKEN | - | Discord bot authorization token |
- | WAYBACK_DISCORD_CHANNEL | - | Discord channel ID, find channel ID |
- | WAYBACK_DISCORD_HELPTEXT | - | The help text for Discord command |
- | WAYBACK_SLACK_APP_TOKEN | - | App-Level Token of Slack app |
- | WAYBACK_SLACK_BOT_TOKEN | - | Bot User OAuth Token for the Slack workspace; use a User OAuth Token if creating external links is required |
- | WAYBACK_SLACK_CHANNEL | - | Channel ID of Slack channel |
- | WAYBACK_SLACK_HELPTEXT | - | The help text for Slack slash command |
- | WAYBACK_NOSTR_RELAY_URL | wss://nostr.developer.li | Nostr relay server URL, multiple URLs separated by commas |
- | WAYBACK_NOSTR_PRIVATE_KEY | - | The private key of a Nostr account |
--tor | WAYBACK_USE_TOR | false | Snapshot webpage via Tor anonymity network |
--tor-key | WAYBACK_TOR_PRIVKEY | - | The private key for Tor Hidden Service |
- | WAYBACK_TOR_LOCAL_PORT | 8964 | Local port for Tor Hidden Service; also supports a reverse proxy. This is ignored if WAYBACK_LISTEN_ADDR is set. |
- | WAYBACK_TOR_REMOTE_PORTS | 80 | Remote ports for Tor Hidden Service, e.g. WAYBACK_TOR_REMOTE_PORTS=80,81 |
- | WAYBACK_SLOT | - | Pinning service for IPFS mode of pinner, see ipfs-pinner |
- | WAYBACK_APIKEY | - | API key for pinning service |
- | WAYBACK_SECRET | - | API secret for pinning service |
If both a configuration file and environment variables are specified, both are read and applied, with the environment variable taking precedence for the same item.
Use --print to show the resulting options for the targets, as typed Go structs, without running wayback.
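The precedence rule above can be sketched in a few lines of Python (an illustration of the behavior, not wayback's actual code): options from the configuration file are read first, and an environment variable with the same name wins.

```python
def effective_config(file_options, environ):
    """Merge config-file options with environment variables.

    An environment variable takes precedence for the same key.
    """
    merged = dict(file_options)          # start from the config file
    for key in file_options:
        if key in environ:               # environment variable wins
            merged[key] = environ[key]
    return merged

opts = effective_config(
    {"WAYBACK_ENABLE_IA": "true", "LOG_LEVEL": "info"},
    {"WAYBACK_ENABLE_IA": "false"},      # environment overrides the file
)
# opts["WAYBACK_ENABLE_IA"] is "false"; LOG_LEVEL keeps its file value
```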
docker pull wabarc/wayback
docker run -d wabarc/wayback wayback -d telegram -t YOUR-BOT-TOKEN # without telegram channel
docker run -d wabarc/wayback wayback -d telegram -t YOUR-BOT-TOKEN -c YOUR-CHANNEL-USERNAME # with telegram channel
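For a long-running deployment, the same container can be described with Docker Compose. This is a sketch assuming the wabarc/wayback image and the environment variables from the table above; the bot token is a placeholder:

```yaml
version: "3"
services:
  wayback:
    image: wabarc/wayback
    command: wayback -d telegram
    environment:
      - WAYBACK_TELEGRAM_TOKEN=YOUR-BOT-TOKEN
      - WAYBACK_ENABLE_IA=true
    restart: unless-stopped
```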
For a comprehensive guide, please refer to the complete documentation.
We encourage all contributions to this repository! Open an issue! Or open a Pull Request!
If you're interested in contributing to wayback
itself, read our contributing guide to get started.
Note: All interaction here should conform to the Code of Conduct.
Supported Golang version: See .github/workflows/testing.yml
Author: Wabarc
Source Code: https://github.com/wabarc/wayback
License: GPL-3.0 license
Fast Python Collaborative Filtering for Implicit Datasets.
This project provides fast Python implementations of several different popular recommendation algorithms for implicit feedback datasets:
Alternating Least Squares as described in the papers Collaborative Filtering for Implicit Feedback Datasets and Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering.
Bayesian Personalized Ranking.
Item-Item Nearest Neighbour models using Cosine, TFIDF or BM25 as a distance metric.
All models have multi-threaded training routines, using Cython and OpenMP to fit the models in parallel among all available CPU cores. In addition, the ALS and BPR models both have custom CUDA kernels - enabling fitting on compatible GPUs. Approximate nearest neighbours libraries such as Annoy, NMSLIB and Faiss can also be used by Implicit to speed up making recommendations.
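As a rough illustration of the alternating-least-squares idea for implicit feedback, here is a toy with a single latent factor: each user factor is refit by least squares holding the item factors fixed, and vice versa, using confidence c = 1 + alpha * r and binary preference p. This is only a sketch of the concept; the library solves the full multi-factor problem in Cython/CUDA.

```python
# Toy one-factor implicit ALS (illustration only).
R = [[5.0, 3.0, 0.0],   # user 0: strong interactions with items 0 and 1
     [4.0, 0.0, 1.0]]   # user 1: item 0, and item 2 only weakly
alpha, reg = 10.0, 0.1
x = [0.1] * len(R)       # user factors
y = [0.1] * len(R[0])    # item factors

def update(own, other, rating):
    # Closed-form least-squares update for one side, other side fixed.
    for i in range(len(own)):
        num = den = 0.0
        for j in range(len(other)):
            r = rating(i, j)
            c = 1.0 + alpha * r            # confidence weight
            p = 1.0 if r > 0 else 0.0      # binary preference
            num += c * p * other[j]
            den += c * other[j] ** 2
        own[i] = num / (den + reg)

for _ in range(20):                        # alternate until settled
    update(x, y, lambda u, i: R[u][i])     # refit users, items fixed
    update(y, x, lambda i, u: R[u][i])     # refit items, users fixed

def score(u, i):
    return x[u] * y[i]

# User 0's strongly-liked item 0 outranks the item they never touched.
assert score(0, 0) > score(0, 2)
```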
Implicit can be installed from PyPI with:
pip install implicit
Installing with pip will use prebuilt binary wheels on x86_64 Linux, Windows and OSX. These wheels include GPU support on Linux.
Implicit can also be installed with conda:
# CPU only package
conda install -c conda-forge implicit
# CPU+GPU package
conda install -c conda-forge implicit implicit-proc=*=gpu
import implicit
# initialize a model
model = implicit.als.AlternatingLeastSquares(factors=50)
# train the model on a sparse matrix of user/item/confidence weights
model.fit(user_item_data)
# recommend items for a user
recommendations = model.recommend(userid, user_item_data[userid])
# find related items
related = model.similar_items(itemid)
The examples folder has a program showing how to use this to compute similar artists on the last.fm dataset.
For more information see the documentation.
These blog posts describe the algorithms that power this library:
There are also several other articles about using Implicit to build recommendation systems:
This library requires SciPy version 0.16 or later and Python version 3.6 or later.
GPU Support requires at least version 11 of the NVidia CUDA Toolkit.
This library is tested with Python 3.7, 3.8, 3.9, 3.10 and 3.11 on Ubuntu, OSX and Windows.
Simple benchmarks comparing the ALS fitting time versus Spark can be found here.
I'd recommend configuring SciPy to use Intel's MKL matrix libraries. One easy way of doing this is by installing the Anaconda Python distribution.
For systems using OpenBLAS, I highly recommend setting 'export OPENBLAS_NUM_THREADS=1'. This disables its internal multithreading, which leads to substantial speedups for this package. Likewise, for Intel MKL, set 'export MKL_NUM_THREADS=1'.
Author: Benfred
Source Code: https://github.com/benfred/implicit
License: MIT license
LightFM is a Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback, including efficient implementation of BPR and WARP ranking losses. It's easy to use, fast (via multithreaded model estimation), and produces high quality results.
It also makes it possible to incorporate both item and user metadata into the traditional matrix factorization algorithms. It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalise to new items (via item features) and to new users (via user features).
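The "sum of feature embeddings" idea can be sketched in plain Python. This is illustrative only: the feature names and vectors below are made up, and LightFM learns the embeddings during training rather than hard-coding them.

```python
# Each user/item is represented as the sum of its features' latent vectors,
# so a brand-new item with known features gets a representation immediately.
embeddings = {                      # hypothetical learned feature vectors
    "genre:scifi": [0.9, -0.2],
    "genre:drama": [0.1, 0.7],
    "likes:scifi": [0.8, -0.1],
}

def represent(features):
    vec = [0.0, 0.0]
    for f in features:              # sum the feature embeddings
        for k in range(2):
            vec[k] += embeddings[f][k]
    return vec

def score(user_features, item_features):
    u, v = represent(user_features), represent(item_features)
    return sum(a * b for a, b in zip(u, v))    # dot product

# A scifi fan scores an unseen scifi title above an unseen drama title.
fan = ["likes:scifi"]
assert score(fan, ["genre:scifi"]) > score(fan, ["genre:drama"])
```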
Install from pip:
pip install lightfm
or Conda:
conda install -c conda-forge lightfm
Fitting an implicit feedback model on the MovieLens 100k dataset is very easy:
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k
# Load the MovieLens 100k dataset. Only five
# star ratings are treated as positive.
data = fetch_movielens(min_rating=5.0)
# Instantiate and train the model
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)
# Evaluate the trained model
test_precision = precision_at_k(model, data['test'], k=5).mean()
Please cite LightFM if it helps your research. You can use the following BibTeX entry:
@inproceedings{DBLP:conf/recsys/Kula15,
author = {Maciej Kula},
editor = {Toine Bogers and
Marijn Koolen},
title = {Metadata Embeddings for User and Item Cold-start Recommendations},
booktitle = {Proceedings of the 2nd Workshop on New Trends on Content-Based Recommender
Systems co-located with 9th {ACM} Conference on Recommender Systems
(RecSys 2015), Vienna, Austria, September 16-20, 2015.},
series = {{CEUR} Workshop Proceedings},
volume = {1448},
pages = {14--21},
publisher = {CEUR-WS.org},
year = {2015},
url = {http://ceur-ws.org/Vol-1448/paper4.pdf},
}
Pull requests are welcome. To install for development:
git clone git@github.com:lyst/lightfm.git
cd lightfm && python3 -m venv venv && source ./venv/bin/activate
pip install -e . && pip install -r test-requirements.txt
./venv/bin/py.test tests
To set up linting, install the lint requirements and the pre-commit hooks:
pip install -r lint-requirements.txt
pip install pre-commit
pre-commit install
When making changes to the .pyx extension files, you'll need to run python setup.py cythonize to produce the extension .c files before running pip install -e . again.
For more details, see the Documentation.
Need help? Contact me via email, Twitter, or Gitter.
Author: lyst
Source Code: https://github.com/lyst/lightfm
License: Apache-2.0 license
Talk to ChatGPT via your favourite Matrix client!
This is an unofficial Matrix bot that uses https://github.com/transitive-bullshit/chatgpt-api to access the unofficial ChatGPT API.
It is worth reading the authentication instructions for the unofficial ChatGPT API.
Your user-agent and IP address must match between the real browser window you're logged in with and the one you're using for ChatGPTAPI. This means that you currently can't log in with your laptop and then run the bot on a server or proxy somewhere.
Cloudflare will still sometimes ask you to complete a CAPTCHA, so you may need to keep an eye on it and manually resolve the CAPTCHA.
You should not be using this ChatGPT account while the bot is using it, because that browser window may refresh one of your tokens and invalidate the bot's session.
If your OpenAI account uses Google Auth, you shouldn't encounter any of the more complicated Recaptchas — and can avoid using paid third-party CAPTCHA solving providers. To use Google auth, make sure your OpenAI account is using Google and then set IS_GOOGLE_LOGIN to true.
Usage
Features
Setting up the account
Create a .env file:
# https://matrix.org if your account is on matrix.org.
MATRIX_HOMESERVER_URL=
MATRIX_ACCESS_TOKEN=
OPENAI_EMAIL=
OPENAI_PASSWORD=
IS_GOOGLE_LOGIN=true
# With the @ and :DOMAIN, ie @SOMETHING:DOMAIN
MATRIX_BOT_USERNAME=
MATRIX_BOT_PASSWORD=
Discussion
Come and join #matrix-chatgpt-bot:matrix.org with your favourite Matrix chat client! If you've never set up a Matrix client before, I'd recommend following the prompts at https://element.io/get-started to download and sign into Element.
Local development setup
yarn
yarn build
yarn start
docker build . -t matrix-chatgpt-bot
docker run --cap-add=SYS_ADMIN -it matrix-chatgpt-bot
Note: Without -it flags in the command above you won't be able to stop the container using Ctrl-C
Author: jakecoppinger
Source Code: https://github.com/jakecoppinger/matrix-chatgpt-bot
License: AGPL-3.0 license
These are the beginnings of a set of interfaces to HSL packages for sparse linear algebra.
Certain HSL packages are freely available to all, others are freely available to academics only. Please refer to the website above for licensing information. In all cases, users are responsible for obtaining HSL packages.
julia> ]
pkg> add HSL
At this point, make sure that there isn't a stray METIS library on your library path. You especially want to make sure that METIS 5 is not accessible, because the HSL library currently interfaced only supports METIS 4. If you have such a library accessible, it is important to remove it from the library path, at least temporarily. For example, if you are on OSX and are using Homebrew, you can hide METIS 5 with brew unlink metis. After the install procedure is complete, it is fine to link metis again with brew link metis.
Set the environment variable HSL_ARCHIVES_PATH to specify where the source archives (tar.gz and zip) are stored. You can use the zip archives as long as unzip is installed on your system. If archives are stored in different folders, you can also set the environment variable <ALGNAME>_PATH, e.g. HSL_MA57_PATH or MC21_PATH. The HSL Julia module will take care of compilation. Once the source archives have been placed in the locations indicated by the environment variables, run
julia> ]
pkg> build HSL
pkg> test HSL
Note that C and Fortran compilers are required. Should it be necessary, you can set the compilers to use via the environment variables:
HSL_FC: the Fortran 90/95 compiler (default: gfortran)
HSL_F77: the Fortran 77 compiler (default: the same as HSL_FC)
HSL_CC: the C compiler (default: gcc)
Supported versions:
HSL_MA97: an OpenMP-based direct solver for symmetric linear systems. Example:
using MatrixMarket
using HSL
K = MatrixMarket.mmread("K.mtx") # only the lower triangle
rhs = readdlm("rhs.rhs")
LBL = Ma97(K)
ma97_factorize!(LBL)
x = ma97_solve(LBL, rhs) # or x = LBL \ rhs
There is a convenience interface to solve rectangular systems that complements the sparse QR factorization in Julia.
When A is m-by-n with m < n and has full row rank,
(x, y) = ma97_solve(A, b)
solves for the minimum-norm solution, i.e., x such that Ax = b and x + Aᵀ y = 0. The call
(x, y) = ma97_min_norm(A, b)
is also defined, and is equivalent to the above.
When m > n and has full column rank,
(r, x) = ma97_solve(A, b)
solves for the least-squares solution, i.e., x such that r = b - Ax satisfies Aᵀ r = 0. The call
(r, x) = ma97_least_squares(A, b)
is also defined, and is equivalent to the above.
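In matrix form, the two convenience calls above correspond to symmetric saddle-point systems assembled directly from the stated optimality conditions:

```latex
% Minimum-norm (m < n): Ax = b together with x + A^T y = 0
\begin{pmatrix} I & A^{T} \\ A & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} 0 \\ b \end{pmatrix}

% Least-squares (m > n): r = b - Ax together with A^T r = 0
\begin{pmatrix} I & A \\ A^{T} & 0 \end{pmatrix}
\begin{pmatrix} r \\ x \end{pmatrix}
=
\begin{pmatrix} b \\ 0 \end{pmatrix}
```

Both systems are symmetric (and indefinite), which is why a symmetric solver such as MA97 can handle them.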
HSL_MA57 version 5.2.0: a sparse, multifrontal solver for symmetric linear systems. Example:
using MatrixMarket
using HSL
K = MatrixMarket.mmread("examples/K_0.mtx") # only the lower triangle
rhs = readdlm("examples/rhs_0.rhs")
rhss = hcat(rhs, rhs)
## factorize and solve
LDL = Ma57(K)
ma57_factorize!(LDL)
LDL.info.rank
x = ma57_solve(LDL, rhs) # or x = LDL \ rhs
norm(K*x - rhs)
xx = ma57_solve(LDL, rhss) # or xx = LDL \ rhss
There is a convenience interface to solve rectangular systems that complements the sparse QR factorization in Julia.
When A is m-by-n with m < n and has full row rank,
(x, y) = ma57_solve(A, b)
solves for the minimum-norm solution, i.e., x such that Ax = b and x + Aᵀ y = 0. The call
(x, y) = ma57_min_norm(A, b)
is also defined, and is equivalent to the above.
When m > n and has full column rank,
(r, x) = ma57_solve(A, b)
solves for the least-squares solution, i.e., x such that r = b - Ax satisfies Aᵀ r = 0. The call
(r, x) = ma57_least_squares(A, b)
is also defined, and is equivalent to the above. Example:
using MatrixMarket
using HSL
K = MatrixMarket.mmread("examples/K_0.mtx") # only the lower triangle
rhs = readdlm("examples/rhs_0.rhs")
## solve min norm
K_mn = K[1:200,:]
x_mn, y_mn = ma57_min_norm(K_mn, rhs[1:200]) # == ma57_solve(K_mn, rhs[1:200])
## solve least squares
K_ls = K[:,1:200]
r_ls, x_ls = ma57_least_squares(K_ls, rhs) # == ma57_solve(K_ls, rhs)
Author: JuliaSmoothOptimizers
Source Code: https://github.com/JuliaSmoothOptimizers/HSL.jl
License: View license
PositiveFactorizations is a package for computing a positive definite matrix decomposition (factorization) from an arbitrary symmetric input. The motivating application is optimization (Newton or quasi-Newton methods), in which the canonical search direction -H\g (H being the Hessian and g the gradient) may not be a descent direction if H is not positive definite. This package provides an efficient approach to computing -Htilde\g, where Htilde is equal to H if H is positive definite, and otherwise is a positive definite matrix that is "spiritually like H."
The approach favored here is different from the well-known Gill-Murray-Wright approach of computing the Cholesky factorization of H+E, where E is a minimal correction needed to make H+E positive-definite (sometimes known as GMW81). See the discussion starting here for justification; briefly, the idea of a small correction conflates large negative eigenvalues with numerical roundoff error, which (when stated that way) seems like a puzzling choice. In contrast, this package provides methods that are largely equivalent to taking the absolute value of the diagonals D in an LDLT factorization, and setting any "tiny" diagonals (those consistent with roundoff error) to 1. For a diagonal matrix with some entries negative, this results in approximately twice the correction used in GMW81.
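For intuition, here is the diagonal case of the strategy just described, sketched in plain Python. This is a toy: the package applies the rule inside an LDLT factorization of a general symmetric matrix, not to a precomputed diagonal.

```python
def positivize(diag, tol=1e-8):
    """Flip negative diagonal entries and replace near-zero ones by 1,
    mirroring the abs-of-D / tiny-goes-to-1 rule described above."""
    return [1.0 if abs(d) < tol else abs(d) for d in diag]

H = [4.0, -2.0, 1e-12]           # an indefinite, nearly singular "Hessian"
Htilde = positivize(H)           # [4.0, 2.0, 1.0] -> positive definite
g = [1.0, -3.0, 0.5]             # gradient
step = [-gi / hi for gi, hi in zip(g, Htilde)]   # -Htilde \ g
# Descent check: the directional derivative g . step is negative.
assert sum(gi * si for gi, si in zip(g, step)) < 0
```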
Given a symmetric matrix H, compute a positive factorization F like this:
F = cholesky(Positive, H, [pivot=Val{false}])
Pivoting (turned on with Val{true}) can make the correction smaller and increase accuracy, but is not necessary for existence or stability.
For a little more information, call ldlt instead:
F, d = ldlt(Positive, H, [pivot=Val{false}])
F will be the same as for cholesky, but this also returns d, a vector of Int8 with values +1, 0, or -1 indicating the sign of the diagonal as encountered during processing (in order of rows/columns if not using pivoting, in order of pivot otherwise). This output can be useful for determining whether the original matrix was already positive (semi)definite.
Note that cholesky/ldlt can be used with any matrix, even those which lack a conventional LDLT factorization. For example, the matrix [0 1; 1 0] is factored as L = [1 0; 0 1] (the identity matrix), with all entries of d being 0. Symmetry is assumed but not checked; only the lower triangle of the input is used.
cholesky is recommended because it is very efficient. A slower alternative is to use eigen:
F = eigen(Positive, H)
which may be easier to reason about from the standpoint of fundamental linear algebra.
Author: Timholy
Source Code: https://github.com/timholy/PositiveFactorizations.jl
License: View license
Introduction
Cholesky factorizations over the hemireals can be computed for arbitrary symmetric matrices, including singular and indefinite matrices. For singular matrices, the behavior is reminiscent of the singular value decomposition, but the performance is much better.
Usage
After creating a symmetric matrix A, compute its Cholesky factorization over the hemireal numbers like this:
F = cholfact(PureHemi, A)
Then you can use F to solve equations, e.g.,
x = F\b
If A is singular, this should be the least-squares solution. You can compute F*F' or ask for rank(F). You can also convert F into matrix form with convert(Matrix, F).
To support all operations, you need to be running at least a version of julia-0.5-dev that is current with master as of 2015-12-11. However, many operations also work on julia-0.4.
F = cholfact(PureHemi, A, δ; blocksize=default)
where:
δ is the tolerance on the diagonal values of A during factorization; any with magnitudes smaller than δ will be treated as if they are 0.
blocksize controls the performance of the factorization algorithm.
Author: Timholy
Source Code: https://github.com/timholy/HemirealFactorizations.jl
License: View license
Surge is a Swift library that uses the Accelerate framework to provide high-performance functions for matrix math, digital signal processing, and image manipulation.
Accelerate exposes SIMD instructions available in modern CPUs to significantly improve performance of certain calculations. Because of its relative obscurity and inconvenient APIs, Accelerate is not commonly used by developers, which is a shame, since many applications could benefit from these performance optimizations.
Surge aims to bring Accelerate to the mainstream, making it as easy (and nearly as fast, in most cases) to perform computation over a set of numbers as for a single member.
Though, keep in mind: Accelerate is not a silver bullet. Under certain conditions, such as performing simple calculations over a small data set, Accelerate can be out-performed by conventional algorithms. Always benchmark to determine the performance characteristics of each potential approach.
Curious about the name Surge? (And Jounce?) Back in the mid 90's, Apple, IBM, and Motorola teamed up to create AltiVec (a.k.a the Velocity Engine), which provided a SIMD instruction set for the PowerPC architecture. When Apple made the switch to Intel CPUs, AltiVec was ported to the x86 architecture and rechristened Accelerate. The derivative of Accelerate (and second derivative of Velocity) is known as either jerk, jolt, surge, or lurch; if you take the derivative of surge, you get the jounce --- hence the name of this library and its parent organization.
The infrastructure and best practices for distributing Swift libraries are currently in flux during this beta period of Swift & Xcode. In the meantime, you can add Surge as a git submodule, drag the Surge.xcodeproj file into your Xcode project, and add Surge.framework as a dependency for your target.
Surge uses Swift 5. This means that your code has to be written in Swift 5 due to current binary compatibility limitations.
To use Swift Package Manager, add Surge to your Package.swift file:
let package = Package(
name: "myproject",
dependencies: [
.package(url: "https://github.com/Jounce/Surge.git", .upToNextMajor(from: "2.3.2")),
],
targets: [
.target(
name: "myproject",
dependencies: ["Surge"]),
]
)
Then run swift build.
To use CocoaPods, add Surge to your Podfile:
source 'https://github.com/CocoaPods/Specs.git'
platform :ios, '10.0'
use_frameworks!
target '<Your Target Name>' do
pod 'Surge', '~> 2.3.2'
end
Then run pod install.
To use Carthage, add Surge to your Cartfile:
github "Jounce/Surge" ~> 2.3.2
Then run carthage update and use the framework in Carthage/Build/<platform>.
Computing the sum of a [Double] array:
import Surge
let n = [1.0, 2.0, 3.0, 4.0, 5.0]
let sum = Surge.sum(n) // 15.0
Element-wise multiplication of two [Double] arrays:
import Surge
let a = [1.0, 3.0, 5.0, 7.0]
let b = [2.0, 4.0, 6.0, 8.0]
let product = Surge.elmul(a, b) // [2.0, 12.0, 30.0, 56.0]
Addition functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | add | .+ (infix) | .+= (infix) |
(Array, Scalar) | add | + (infix) | += (infix) |
(Matrix, Matrix) | add | + (infix) | += (infix) |
(Matrix, Scalar) | n/a | n/a | n/a |
(Vector, Vector) | add | + (infix) | += (infix) |
(Vector, Scalar) | add | + (infix) | += (infix) |
Subtraction functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | sub | .- (infix) | .-= (infix) |
(Array, Scalar) | sub | - (infix) | -= (infix) |
(Matrix, Matrix) | sub | - (infix) | -= (infix) |
(Matrix, Scalar) | n/a | n/a | n/a |
(Vector, Vector) | sub | - (infix) | -= (infix) |
(Vector, Scalar) | sub | - (infix) | -= (infix) |
Multiplication functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | mul | .* (infix) | .*= (infix) |
(Array, Scalar) | mul | * (infix) | *= (infix) |
(Matrix, Matrix) | mul | * (infix) | n/a |
(Matrix, Vector) | mul | * (infix) | n/a |
(Matrix, Scalar) | mul | * (infix) | n/a |
(Vector, Matrix) | mul | * (infix) | n/a |
(Vector, Scalar) | mul | * (infix) | *= (infix) |
(Scalar, Array) | mul | * (infix) | n/a |
(Scalar, Matrix) | mul | * (infix) | n/a |
(Scalar, Vector) | mul | * (infix) | n/a |
Element-wise multiplication functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Matrix, Matrix) | elmul | n/a | n/a |
(Vector, Vector) | elmul | .* (infix) | .*= (infix) |
Division functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | div | ./ (infix) | ./= (infix) |
(Array, Scalar) | div | / (infix) | /= (infix) |
(Matrix, Matrix) | div | / (infix) | n/a |
(Matrix, Scalar) | n/a | / (infix) | n/a |
(Vector, Scalar) | div | / (infix) | /= (infix) |
Element-wise division functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Vector, Vector) | eldiv | ./ (infix) | ./= (infix) |
Modulo functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | mod | .% (infix) | n/a |
(Array, Scalar) | mod | % (infix) | n/a |
Remainder functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | remainder | n/a | n/a |
(Array, Scalar) | remainder | n/a | n/a |
Square root functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array) | sqrt | n/a | n/a |
Sum functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array) | sum | n/a | n/a |
(Matrix) | sum | n/a | n/a |
Dot product functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | dot | • (infix) | n/a |
(Vector, Vector) | dot | • (infix) | n/a |
Distance functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | dist | n/a | n/a |
(Vector, Vector) | dist | n/a | n/a |
Squared distance functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | distSq | n/a | n/a |
(Vector, Vector) | distSq | n/a | n/a |
Power functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array, Array) | pow | .** (infix) | .**= (infix) |
(Array, Scalar) | pow | ** (infix) | **= (infix) |
(Matrix, Scalar) | pow | n/a | n/a |
(Vector, Vector) | pow | n/a | n/a |
(Serial exponentiation is right-associative: a ** b ** c == a ** (b ** c))
Exponential functions & operators
Arguments | Function | Operator | In-Place Operator |
---|---|---|---|
(Array) | exp | n/a | n/a |
(Matrix) | exp | n/a | n/a |
(Vector) | exp | n/a | n/a |
Trigonometric functions & operators
Arguments | Function | Operation |
---|---|---|
(Array) | sin | Sine |
(Array) | cos | Cosine |
(Array) | tan | Tangent |
(Array) | sincos | Sine & Cosine |
Arguments | Function | Operation |
---|---|---|
(Array) | asin | Arc Sine |
(Array) | acos | Arc Cosine |
(Array) | atan | Arc Tangent |
Arguments | Function | Operation |
---|---|---|
(Array) | sinh | Hyperbolic Sine |
(Array) | cosh | Hyperbolic Cosine |
(Array) | tanh | Hyperbolic Tangent |
Arguments | Function | Operation |
---|---|---|
(Array) | asinh | Inverse Hyperbolic Sine |
(Array) | acosh | Inverse Hyperbolic Cosine |
(Array) | atanh | Inverse Hyperbolic Tangent |
Exponential functions & operators
Arguments | Function | Operation |
---|---|---|
(Array) | exp | Base-e Exponential Function |
(Array) | exp2 | Base-2 Exponential Function |
Logarithmic functions & operators
Arguments | Function | Operation |
---|---|---|
(Array) | log | Base-e Logarithm |
(Array) | log2 | Base-2 Logarithm |
(Array) | log10 | Base-10 Logarithm |
(Array) | logb | Base-b Logarithm |
Statistical functions & operators
Arguments | Function | Operation |
---|---|---|
(Array) | sum | Summation |
(Array) | asum | Absolute Summation |
Arguments | Function | Operation |
---|---|---|
(Array) | min | Minimum |
(Array) | max | Maximum |
Arguments | Function | Operation |
---|---|---|
(Array) | mean | Mean |
(Array) | meamg | Mean of Magnitudes |
(Array) | measq | Mean of Squares |
(Array) | variance | Variance |
(Array) | std | Standard Deviation |
Auxiliary functions & operators
Arguments | Function | Operation |
---|---|---|
(Array) | ceil | Ceiling |
(Array) | floor | Flooring |
(Array) | round | Rounding |
(Array) | trunc | Integer truncation |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Array) | abs | n/a | n/a | n/a |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Array) | copysign | n/a | n/a | n/a |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Array) | rec | n/a | n/a | n/a |
Matrix-specific functions & operators
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Matrix) | inv | n/a | n/a | n/a |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Matrix) | transpose | n/a | ′ (postfix) | n/a |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Matrix) | det | n/a | n/a | n/a |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Matrix) | eigenDecompose | n/a | n/a | n/a |
Fast Fourier transform functions & operators
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Array) | fft | n/a | n/a | n/a |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Array, Array) | conv | n/a | n/a | n/a |
Arguments | Function | In-Place Function | Operator | In-Place Operator |
---|---|---|---|---|
(Array, Array) | xcorr | n/a | n/a | n/a |
(Array) | xcorr | n/a | n/a | n/a |
Author: Jounce
Source Code: https://github.com/Jounce/Surge
License: MIT license
1666105260
Fast matrix multiplication and division for Toeplitz, Hankel and circulant matrices in Julia
Note
Multiplication of large matrices and sqrt, inv, LinearAlgebra.eigvals, LinearAlgebra.ldiv!, and LinearAlgebra.pinv for circulant matrices are computed with FFTs. To be able to use these methods, you have to install and load a package that implements the AbstractFFTs.jl interface, such as FFTW.jl:
using FFTW
If you perform multiple calculations with FFTs, it can be more efficient to initialize the required arrays and plan the FFT only once. You can precompute the FFT factorization with LinearAlgebra.factorize
and then use the factorization for the FFT-based computations.
Supported matrices
A Toeplitz matrix has constant diagonals. It can be constructed using
Toeplitz(vc,vr)
where vc
are the entries in the first column and vr
are the entries in the first row, where vc[1]
must equal vr[1]
. For example.
Toeplitz(1:3, [1.,4.,5.])
is a sparse representation of the matrix
[ 1.0 4.0 5.0
2.0 1.0 4.0
3.0 2.0 1.0 ]
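The constructor's convention can be sketched densely (toeplitz below is an illustrative helper, not the package's sparse representation):

```julia
# Dense reconstruction of the Toeplitz(vc, vr) convention: entry (i,j) holds
# vc[i-j+1] on and below the diagonal, and vr[j-i+1] above it.
toeplitz(vc, vr) = [i >= j ? vc[i - j + 1] : vr[j - i + 1]
                    for i in 1:length(vc), j in 1:length(vr)]

T = toeplitz(collect(1.0:3.0), [1.0, 4.0, 5.0])
@assert T == [1.0 4.0 5.0; 2.0 1.0 4.0; 3.0 2.0 1.0]
```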
A symmetric Toeplitz matrix is a symmetric matrix with constant diagonals. It can be constructed with
SymmetricToeplitz(vc)
where vc
are the entries of the first column. For example,
SymmetricToeplitz([1.0, 2.0, 3.0])
is a sparse representation of the matrix
[ 1.0 2.0 3.0
2.0 1.0 2.0
3.0 2.0 1.0 ]
A triangular Toeplitz matrix can be constructed using
TriangularToeplitz(ve,uplo)
where uplo is either :L or :U and ve are the entries in the first column or the first row, respectively. For example,
TriangularToeplitz([1.,2.,3.],:L)
is a sparse representation of the matrix
[ 1.0 0.0 0.0
2.0 1.0 0.0
3.0 2.0 1.0 ]
A Hankel matrix has constant anti-diagonals. It can be constructed using
Hankel(vc,vr)
where vc
are the entries in the first column and vr
are the entries in the last row, where vc[end]
must equal vr[1]
. For example.
Hankel([1.,2.,3.], 3:5)
is a sparse representation of the matrix
[ 1.0 2.0 3.0
2.0 3.0 4.0
3.0 4.0 5.0 ]
A circulant matrix is a special case of a Toeplitz matrix with periodic end conditions. It can be constructed using
Circulant(vc)
where vc
is a vector with the entries for the first column. For example:
Circulant([1.0, 2.0, 3.0])
is a sparse representation of the matrix
[ 1.0 3.0 2.0
2.0 1.0 3.0
3.0 2.0 1.0 ]
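The FFT-based methods mentioned in the note above rest on the fact that a circulant matrix is diagonalized by the DFT of its first column, so C*x is the circular convolution of that column with x. A stdlib-only sketch, with a naive O(n²) DFT standing in for an FFTW plan:

```julia
# Dense circulant from its first column: column j is a cyclic downward shift of vc.
circ(vc) = [vc[mod1(i - j + 1, length(vc))] for i in 1:length(vc), j in 1:length(vc)]

# Naive O(n^2) DFT and inverse (stand-ins for an FFTW.jl plan).
dft(x)  = [sum(x[n+1] * cis(-2pi * k * n / length(x)) for n in 0:length(x)-1)
           for k in 0:length(x)-1]
idft(X) = conj.(dft(conj.(X))) ./ length(X)

vc = [1.0, 2.0, 3.0, 4.0]
x  = [1.0, 0.0, -1.0, 2.0]
y_direct = circ(vc) * x
y_fft    = real.(idft(dft(vc) .* dft(x)))   # C*x via diagonalization by the DFT
@assert y_direct ≈ y_fft
```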
Author: JuliaMatrices
Source Code: https://github.com/JuliaMatrices/ToeplitzMatrices.jl
License: MIT license
1666080960
SparseVectorMatrix
This package provides an alternative implementation of sparse matrices that maintains a vector of SparseVectors. Such an implementation is best used when all matrix operations require access to just one column each.
using SparseVectorMatrix
# Random Generation
a = svmrand(100, 100, 0.1)
# Getindex
a[:, 1] # Returns an entire column quickly
a[1, :] # Returns an entire row, but slowly.
# SetIndex
a[:, 1] = 1:100 # Assign an entire column quickly.
a[1, :] = 1:100 # Assign an entire row, but slowly.
#Concatenation
b = svmrand(100, 100, 0.1)
hcat(a, b) # Concatenates horizontally. Very fast.
vcat(a, b) # Concatenates vertically. Not as fast.
arr = [svmrand(100, 100, 0.1) for i in 1:4]
hvcat((2,2), arr...) # Grid Concatenation. Quite fast.
include("benchmarks/run.jl")
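The column-oriented design can be sketched with a toy type (SVM here is hypothetical, for illustration only; the package's actual type is richer):

```julia
using SparseArrays

# Minimal sketch of the idea: store each column as an independent SparseVector.
struct SVM{T}
    cols::Vector{SparseVector{T,Int}}
end
# Column access is a direct array lookup; row access must visit every column.
Base.getindex(a::SVM, ::Colon, j::Int) = a.cols[j]               # fast
Base.getindex(a::SVM, i::Int, ::Colon) = [c[i] for c in a.cols]  # slow

a = SVM([sparsevec([1], [1.0], 3), sparsevec([2], [2.0], 3)])
@assert a[:, 2][2] == 2.0
@assert a[1, :] == [1.0, 0.0]
```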
Author: Pranavtbhat
Source Code: https://github.com/pranavtbhat/SparseVectorMatrix.jl
License: View license
1666052460
This package provides support for the Woodbury matrix identity for the Julia programming language. This is a generalization of the Sherman-Morrison formula. Note that the Woodbury matrix identity is notorious for floating-point roundoff errors, so be prepared for a certain amount of inaccuracy in the result.
using WoodburyMatrices
W = Woodbury(A, U, C, V)
creates a Woodbury
matrix from the A
, U
, C
, and V
matrices representing A + U*C*V
. These matrices can be dense or sparse (or generally any type of AbstractMatrix
), with the caveat that inv(inv(C) + V*(A\U))
will be calculated explicitly and hence needs to be representable with the available resources. (In many applications, this is a fairly small matrix.)
Here are some of the things you can do with a Woodbury matrix:
- Matrix(W) converts W to its dense representation.
- W\b solves the equation W*x = b for x.
- W*x computes the product.
- det(W) computes the determinant of W.
- α*W and W1 + W2 compute scalar multiples and sums.
It's worth emphasizing that A
can be supplied as a factorization, which makes W\b
and det(W)
more efficient.
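The factorization exploits the Woodbury identity (A + UCV)⁻¹ = A⁻¹ - A⁻¹U(C⁻¹ + VA⁻¹U)⁻¹VA⁻¹, which can be verified numerically with dense stdlib matrices (a sketch of the underlying math, not the package's code):

```julia
using LinearAlgebra

# Verify (A + U*C*V) \ b against the Woodbury identity on small dense matrices.
A = [4.0 1 0 0; 0 4 1 0; 0 0 4 1; 1 0 0 4]
U = [1.0 0; 0 1; 1 1; 0 0]
C = [2.0 0; 0 3]
V = [1.0 0 1 0; 0 1 0 1]
b = [1.0, 2, 3, 4]

W = A + U * C * V
x_direct   = W \ b
x_woodbury = A \ b - (A \ U) * ((inv(C) + V * (A \ U)) \ (V * (A \ b)))
@assert x_direct ≈ x_woodbury
```

Note that only the small k-by-k "capacitance" matrix inv(C) + V*(A\U) is inverted, which is the source of the speedup when k is small.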
using WoodburyMatrices
W = SymWoodbury(A, B, D)
creates a SymWoodbury
matrix, a symmetric version of a Woodbury matrix representing A + B*D*B'
. In addition to the features above, SymWoodbury
also supports "squaring" W*W
.
Author: Timholy
Source Code: https://github.com/timholy/WoodburyMatrices.jl
License: View license
1666004520
Introduction
Pseudospectra is a Julia package for computing pseudospectra of non-symmetric matrices, and plotting them along with eigenvalues ("spectral portraits"). Some related computations and plots are also provided.
Whereas the spectrum of a matrix is the set of its eigenvalues, a pseudospectrum is the set of complex numbers "close" to the spectrum in some practical sense.
More precisely, the ϵ-pseudospectrum of a matrix A, σ_ϵ(A), is the set of complex numbers λ such that
- λ is an eigenvalue of some matrix A+E, where the norm of the perturbation ‖E‖ < ϵ, or
- ‖(A-λI)^(-1)‖ > 1/ϵ
(the definitions are equivalent). This sense of "closeness" is trivial for Hermitian matrices, but interesting for others. Specifically, this package is currently limited to the unweighted 2-norm.
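In the 2-norm, the second definition gives a simple membership test, since ‖(A-λI)⁻¹‖₂ = 1/σ_min(A-λI). A stdlib sketch (in_pseudospectrum is a hypothetical helper, not part of the package):

```julia
using LinearAlgebra

# λ ∈ σ_ϵ(A)  ⟺  σ_min(A - λI) < ϵ, since ‖(A-λI)⁻¹‖₂ = 1/σ_min(A-λI).
in_pseudospectrum(A, λ, ϵ) = minimum(svdvals(A - λ * I)) < ϵ

A = [0.0 1.0; 0.0 0.0]   # Jordan block: highly non-normal
@assert in_pseudospectrum(A, 0.1 + 0.0im, 0.05)    # near the defective eigenvalue 0
@assert !in_pseudospectrum(A, 2.0 + 0.0im, 0.05)   # far from the spectrum
```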
Among other things, pseudospectra:
eigs
(from the Arpack package). See the Pseudospectra gateway for details, references, and more.
To study a moderate-sized matrix with minimal user effort, follow this example:
using Plots, Pseudospectra, LinearAlgebra
n=150
B=diagm(1 => fill(2im,n-1), 2 => fill(-1,n-2), 3 => fill(2,n-3), -2 => fill(-4,n-2), -3 => fill(-2im, n-3))
spectralportrait(B)
The figure shows a section of the complex plane with eigenvalues and contours of log10(ϵ)
.
Pseudospectra.jl (along with a QML-based GUI, in the forthcoming PseudospectraView package) is essentially a translation of the acclaimed MATLAB-based EigTool (homepage here), code now hosted on GitHub.
No endorsement or promotion of Pseudospectra.jl by the authors of EigTool is implied.
Specific documentation for Pseudospectra is a work in progress; a draft is available here. See the examples and tests for more.
The plotting interface is somewhat schizophrenic. Drivers are included for Plots.jl and/or PyPlot.jl (i.e., PyPlot is a useful back end for Plots as used here; other Plots backends have been partially tested). Experimental support for Makie.jl is also available.
Although this package is designed with an eye to plotting results, the computational routines are usable without a plotting package. To avoid forcing potential users to install a particular one, none are specified in the formal package requirements. The setgs
function can import one conditionally.
Some functions used for examples require other packages. They should give a useful complaint if invoked without that support.
Installation
This package should be included in the General registry by the time anyone else sees this paragraph, so the normal Pkg
commands to add Pseudospectra
should suffice.
In the interim, install by adding this repository explicitly with the package manager.
Pkg.add(url="https://github.com/RalphAS/Pseudospectra.jl")
Basic usage
Minimal use of the REPL interface is as follows:
using Plots
using Pseudospectra
A = your_matrix_generating_function()
ps_data = new_matrix(A)
driver!(ps_data)
# modify, e.g., for higher resolution
options = Dict{Symbol,Any}(:npts => 100)
driver!(ps_data,options)
This should show a contour plot of log10(ϵ)
in the vicinity of the spectrum, which is the standard display of a spectral portrait. More elaborate capabilities are exhibited (as always) in the examples and test folders.
Disclaimer
This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright holder or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.
Note on licensing
Most of the package is under a BSD license, in accordance with derivation from EigTool. See individual source files for exceptions.
Author: RalphAS
Source Code: https://github.com/RalphAS/Pseudospectra.jl
License: View license
1665991380
An extensible test matrix collection for Julia.
It gives access to a wealth of sample and test matrices with accompanying data. A set of matrices is generated locally (with arguments controlling the special case). Another set is loaded from one of the publicly accessible matrix collections: the SuiteSparse Matrix Collection (formerly the University of Florida Matrix Collection) and the Matrix Market Collection, the latter being obsolescent.
Access is like
using MatrixDepot
?MatrixDepot # display package help info
A = matrixdepot("hilb", 10) # locally generated hilbert matrix dimensions (10,10)
A = matrixdepot("HB/1138_bus") # named matrix of the SuiteSparse Collection
or
md = mdopen("*/bfly") # named matrix with some extra data
A = md.A
co = md.coord
tx = md("Gname_10.txt")
md.<tab><tab> # overview of the "fields" of md returning like
# A m n dnz nnz coord Gname_10.txt G_10 Gcoord_10
or also
mdinfo("gravity") # text info about the selected matrix
md = mdopen("gravity", 10, false) # locally generated example with rhs and solution
A = md.A
b = md.b
x = md.x
NOTE: If you use Windows, you need to install MinGW/MSYS or Cygwin in order to use the SuiteSparse sparse and MatrixMarket matrix collection interface.
To install the release version, type
julia> Pkg.add("MatrixDepot")
Every Matrix type has a unique name, which is a string of one of the forms:
- "name" - used for matrices, which are generated locally.
- "dir/name" - for all matrices of the SuiteSparse collection.
- "dir/subdir/name" - for all matrices of the MatrixMarket collection.
The names are similar to relative path names, separated by a slash character. The components of the name must not contain any of the characters "/*[]".
A set of matrices may be assigned to predefined or user-defined groups. The group names are represented as Julia symbols in the form :symmetric. The group names are therefore restricted to valid Julia identifiers: they must start with a letter and contain only letters, digits, and '_'.
Every matrix has a numeric identifier, which is unique for its area:
builtin(id)
- one of the built-in matrix generators - currently id ∈ 1:59
.
user(id)
- a user-defined matrix generator - starting with 1
.
sp(id)
- one of the SuiteSparse
collection. The integer ids are the 'official' ident numbers assigned by the collection. Currently id ∈ 1:3000
.
mm(id)
- one of the MatrixMarket
collection. Here id follows the ordering of the index file of the collection.
For some functions it makes sense to have lists of matrix names to operate on, for example to select a set of matrices with certain properties. These sets are described by 'Patterns', which are applied to matrix names and also to other matrix properties.
The following pattern types are supported:
- "name" - a string matching exactly a matrix name
- "shell-pattern" - a string with shell wildcards '?', '*', "[...]" included
- r"egular expression" - a regular expression to match the matrix name
- :group - one of the defined group names; matches all matrices in the group
- qualified numeric identifiers - examples: builtin(10), sp(1:5, 7), mm(1), sp(:)
- predicate_function - the name of a predefined or user-defined boolean function of the internal data type MatrixData; example: issymmetric
- abstract vector of sub-patterns - OR - any of the sub-patterns matches
- tuple of sub-patterns - AND - all of the sub-patterns match
- ~pattern - negation of a pattern; the negation operator ~ may be applied to all patterns
To express OR and AND, the binary operators | and &, together with parentheses, are preferred.
Examples:
"gravity" | "HB/*" & ~(ishermitian & iscomplex) & ~sp(20:30)
The set of all known matrices can be expressed as the empty tuple (). In a shell-pattern, the double ** also matches slash characters, in contrast to the single *.
A convenient form of a predicate-generator is
@pred(expression)
where expression is a valid Julia
boolean expression, which may access all properties of MatrixData
as literal variable names.
Examples:
@pred(author == "J. Brown")
is translated to: d -> :author in propertynames(d) && d.author == "J. Brown"
@pred(500_000 <= n * m < 1_000_000)
restricts the size of matched matrices.
@pred(10^4 <= n <= 2*10^4 && n == m && nnz / n > 10 )
on average more than 10 entries per row
There is a set of predefined predicate functions, including: issymmetric, ishermitian, isgeneral, isskew, isreal, iscomplex, isboolean, islocal, isremote, isloaded, isunloaded, isbuiltin, isuser, issparse.
Special predicate generators keyword(word...) and hasdata(symbol...) support keyword search and check for the existence of metadata. For example: hasdata(:x) & ~keyword("fluid") matches matrices that provide a solution (x) and do not mention "fluid".
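The @pred translation shown earlier (d -> :author in propertynames(d) && d.author == "J. Brown") can be exercised by hand; here NamedTuples stand in for the internal MatrixData type (an illustration, not the package's mechanism):

```julia
# Hand-written form of the documented @pred(author == "J. Brown") translation.
# The propertynames guard protects against matrices lacking the attribute.
pred = d -> :author in propertynames(d) && d.author == "J. Brown"

d1 = (author = "J. Brown", n = 100)   # NamedTuple standing in for MatrixData
d2 = (n = 100,)                       # no :author property at all
@assert pred(d1)
@assert !pred(d2)
```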
mdinfo() # overview
listgroups() # list all defined group names
mdlist(pattern) # array of matrix names according to pattern
listdata(pattern) # array of `MatrixData`objects according to pattern
listnames(pattern) # MD-formatted listing of all names according to pattern
listdir("*//*") # MD-formatted - group over part before `//` - count matching
mdinfo() # overview over database
mdinfo(pattern) # individual documentation about matrix(es) matching pattern
A = matrixdepot("kahan", 10)
generates a matrix using one of the built-in generators
md = mdopen("kahan", 10)
returns a handle md
; matrix can be obtained by A = md.A
In general the first form is preferable if only the pure matrix is required. For remote collections no arguments are used.
The second form allows access to all types of 'meta-data', which may be available for some local or remote matrices.
Examples:
md = mdopen("spikes", 5, false); A = md.A; b = md.b; x = md.x
md = mdopen("Rommes/bips07_1998"); A = md.A; v = md.iv; title = md.data.title; nodenames = md("nodename.txt")
The last example shows how to access textual meta-data when the name contains characters that are non-word characters in Julia. Also, if the metadata name is stored in a variable, the last form has to be used.
meta = metasymbols(md)[2]; sec_matrix = md(meta)
The function metasymbols
returns a list of all symbols denoting metadata provided by md
. Whether expressed as symbols or strings does not matter.
The system function propertynames(md)
returns all data of md
. That includes size information and metadata.
propertynames(md.data)
gives an overview about all attributes of the md.data::MatrixData
, which can for example be used in the @pred
definitions.
The remote data are originally stored at the remote web-site of one of the matrix collections. Before they are presented to the user, they are downloaded to local disk storage, which serves as a permanent cache.
By default, the data directory is a scratchspace managed by Scratch.jl
, but can be changed by setting the MATRIXDEPOT_DATA
environment variable.
The data directory can be queried by
julia> MatrixDepot.data_dir()
"/home/.../.julia/scratchspaces/b51810bb-c9f3-55da-ae3c-350fc1fbce05/data"
The occasional user need not bother about downloads, because they are done in the background if matrix files are missing on the local disk.
The same is true for the data required by mdinfo(pattern)
. Actually these are stored in separate files if the full matrix files (which may be huge) are not yet loaded.
Load Header Data
A download job to transmit a subset of remote matrix files may be started to load header data for all files. Header data always include the matrix type according to the Matrix Market format and the size values m (number of rows), n (number of columns), and dnz (number of stored entries) of the main sparse matrix.
MatrixDepot.loadinfo(pattern)
where pattern
defines the subset.
That is possible for the SuiteSparse collection and the NIST MatrixMarket collection. The patterns can always refer to matrix names and id numbers. In the case of SuiteSparse
collection, also the metadata "date"
, "kind"
, "m"
, "n"
, "nnz"
are available and can be used, before individual matrix data have been loaded. They are contained in a data file obtained from the remote site. For MatrixMarket
collection, patterns are restricted to names and id numbers.
In general it would be possible to load all header data with loadinfo("**"). That would take perhaps an hour and generate some traffic for the remote sites. Nevertheless, it is not necessary to do so if you don't need the header data for the following tasks.
Load Complete Data Files
MatrixDepot.load(pattern)
loads all data files for the patterns. Patterns can only refer to attributes, which are already available. In the case of SuiteSparse
that includes the size info "date"
, "kind"
, "m"
, "n"
, and "nnz"
and all additional attributes loaded in the previous step, which include "author"
, "title"
, "notes"
, and keywords. In the case of MatrixMarket
you can only refer to "m"
, "n"
, and "dnz"
, if previously loaded with the header data.
Please do not: MatrixDepot.load("**")
. That would require some day(s) to finish and include some really big data files (~100GB), which could be more than your disks can hold.
Make a reasonable selection, before you start a bulk download. Local and already loaded matrices are skipped automatically.
Example:
MatrixDepot.load(sp(:) & @pred(nnz < 100_000))
to download only problems with given number of stored entries in the main matrix.
To see an overview of the matrices in the collection, type
julia> using MatrixDepot
julia> mdinfo()
Currently loaded Matrices
–––––––––––––––––––––––––––
builtin(#)
–––––––––– ––––––––––– ––––––––––– ––––––––––– –––––––––– –––––––––––– ––––––––––– ––––––––––– ––––––––––––– ––––––––––––
1 baart 7 circul 13 fiedler 19 gravity 25 invhilb 31 magic 37 parter 43 randcorr 49 shaw 55 ursell
2 binomial 8 clement 14 forsythe 20 grcar 26 invol 32 minij 38 pascal 44 rando 50 smallworld 56 vand
3 blur 9 companion 15 foxgood 21 hadamard 27 kahan 33 moler 39 pei 45 randsvd 51 spikes 57 wathen
4 cauchy 10 deriv2 16 frank 22 hankel 28 kms 34 neumann 40 phillips 46 rohess 52 toeplitz 58 wilkinson
5 chebspec 11 dingdong 17 gilbert 23 heat 29 lehmer 35 oscillate 41 poisson 47 rosser 53 tridiag 59 wing
6 chow 12 erdrey 18 golub 24 hilb 30 lotkin 36 parallax 42 prolate 48 sampling 54 triw
user(#)
–––––––––
1 randsym
Groups
–––––– ––––––– ––––– –––– ––––– ––––– ––––––– ––––––– –––––– –––––– ––––––– –––––– –––––––––
all builtin local user eigen graph illcond inverse posdef random regprob sparse symmetric
Suite Sparse of
–––––––––––– ––––
2770 2833
MatrixMarket of
–––––––––––– –––
488 498
We can generate a 4-by-4 Hilbert matrix by typing
julia> matrixdepot("hilb", 4)
4x4 Array{Float64,2}:
1.0 0.5 0.333333 0.25
0.5 0.333333 0.25 0.2
0.333333 0.25 0.2 0.166667
0.25 0.2 0.166667 0.142857
We can type the matrix name to get documentation about the matrix.
julia> mdinfo("hilb")
Hilbert matrix
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
The Hilbert matrix has (i,j) element 1/(i+j-1). It is notorious for being
ill conditioned. It is symmetric positive definite and totally positive.
Input options:
• [type,] dim: the dimension of the matrix;
• [type,] row_dim, col_dim: the row and column dimensions.
Groups: ["inverse", "ill-cond", "symmetric", "pos-def"]
References:
M. D. Choi, Tricks or treats with the Hilbert matrix, Amer. Math. Monthly,
90 (1983), pp. 301-312.
N. J. Higham, Accuracy and Stability of Numerical Algorithms, second
edition, Society for Industrial and Applied Mathematics, Philadelphia, PA,
USA, 2002; sec. 28.1.
We can also specify the data type for locally generated matrices.
julia> matrixdepot("hilb", Float16, 5, 3)
5x3 Array{Float16,2}:
1.0 0.5 0.33325
0.5 0.33325 0.25
0.33325 0.25 0.19995
0.25 0.19995 0.16663
0.19995 0.16663 0.14282
julia> matrixdepot("hilb", Rational{Int}, 4)
4x4 Array{Rational{T<:Integer},2}:
1//1 1//2 1//3 1//4
1//2 1//3 1//4 1//5
1//3 1//4 1//5 1//6
1//4 1//5 1//6 1//7
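The rule stated in the documentation above (the (i,j) entry is 1/(i+j-1)) is easy to reproduce; hilb below is a hypothetical re-implementation for illustration, not the package's generator:

```julia
# Hilbert matrix by its defining rule: entry (i,j) = 1/(i+j-1), in any type T.
hilb(T, m, n = m) = [one(T) / T(i + j - 1) for i in 1:m, j in 1:n]

@assert hilb(Float64, 4) == [1 / (i + j - 1) for i in 1:4, j in 1:4]
@assert hilb(Rational{Int}, 2) == [1//1 1//2; 1//2 1//3]
```

Exact Rational entries sidestep the notorious ill-conditioning when testing solvers.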
Matrices can be accessed by a variety of patterns and composed patterns. Integer numbers i refer to the ident numbers in sp(i), mm(i), builtin(i), user(i). Here sp, mm, builtin, and user denote the supported matrix collections: SuiteSparse (formerly UFL), Matrix Market, built-in, and user-defined.
julia> mdlist(sp(1)) # here sp(1) is the ident number of the SuiteSparse collection
list(1)
–––––––––––
HB/1138_bus
julia> listnames(builtin(1, 5:10)) # the internal numbering of the builtin-functions
list(7)
––––––– –––––––– –––– –––––– ––––––– ––––––––– ––––––
baart chebspec chow circul clement companion deriv2
julia> mdlist(builtin(1:4, 6, 10:15) | user(1:10) )
12-element Array{String,1}:
"baart"
"binomial"
"blur"
"cauchy"
"chow"
"deriv2"
"dingdong"
"erdrey"
"fiedler"
"forsythe"
"foxgood"
"randsym"
While the listnames
command renders the output as a markdown table, the internal mdlist
produces an array of valid matrix names.
We can type a group name to see all the matrices in that group. Group names are always written as symbols to distinguish them from matrix names and patterns, which are always strings.
julia> listnames(:symmetric)
list(22)
–––––––– –––––––– ––––––– –––––– ––––––––– –––––––– ––––––– –––––––––
cauchy dingdong hilb lehmer oscillate poisson randsym wilkinson
circul fiedler invhilb minij pascal prolate tridiag
clement hankel kms moler pei randcorr wathen
It is possible to extend the builtin local problems with user defined generators and groups. We can add new matrix generators and define new groups of matrices.
Weijian Zhang and Nicholas J. Higham, "Matrix Depot: An Extensible Test Matrix Collection for Julia", PeerJ Comput. Sci., 2:e58 (2016), [pdf]
Nicholas J. Higham, "Algorithm 694, A Collection of Test Matrices in MATLAB", ACM Trans. Math. Software, vol. 17. (1991), pp 289-305 [pdf] [doi]
T.A. Davis and Y. Hu, "The University of Florida Sparse Matrix Collection", ACM Transaction on Mathematical Software, vol. 38, Issue 1, (2011), pp 1:1-1:25 [pdf]
R.F. Boisvert, R. Pozo, K. A. Remington, R. F. Barrett, & J. Dongarra, " Matrix Market: a web resource for test matrix collections", Quality of Numerical Software (1996) (pp. 125-137). [pdf]
Per Christian Hansen, "Test Matrices for Regularization Methods", SIAM Journal on Scientific Computing, vol. 16, 2, (1995) pp.506-512. [pdf] [doi]
Author: https://github.com/JuliaMatrices/MatrixDepot.jl
Source Code: https://github.com/JuliaMatrices/MatrixDepot.jl
License: View license
1665695940
This Julia package provides fast low-rank approximation algorithms for BLAS/LAPACK-compatible matrices based on some of the latest technology in adaptive randomized matrix sketching. Currently implemented algorithms include:
By "partial", we mean essentially that these algorithms are early-terminating, i.e., they are not simply post-truncated versions of their standard counterparts. There is also support for "matrix-free" linear operators described only through their action on vectors. All methods accept a number of options specifying, e.g., the rank, estimated absolute precision, and estimated relative precision of approximation.
Our implementation borrows heavily from the perspective espoused by N. Halko, P.G. Martinsson, J.A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53 (2): 217-288, 2011., except that we choose the interpolative decomposition (ID) as our basic form of approximation instead of matrix range projection. The reason is that the latter requires expensive matrix-matrix multiplication to contract to the relevant subspace, while the former can sometimes be computed much faster, depending on the accelerated sketching strategy employed.
This package has been developed with performance in mind, and early tests have shown large speedups over similar codes written in MATLAB and Python (and even some in Fortran and C). For example, computing an ID of a Hilbert matrix of order 1024 to relative precision ~1e-15 takes:
This difference can be attributed in part to both algorithmic improvements as well as to some low-level optimizations.
To install LowRankApprox, simply type:
Pkg.add("LowRankApprox")
at the Julia prompt. The package can then be imported as usual with using
or import
.
To illustrate the usage of this package, let's consider the computation of a partial QR decomposition of a Hilbert matrix, which is well known to have low rank. First, we load LowRankApprox via:
using LowRankApprox
Then we construct a Hilbert matrix with:
n = 1024
A = matrixlib(:hilb, n, n)
A partial QR decomposition can then be computed using:
F = pqrfact(A)
This returns a PartialQR
factorization with variables Q
, R
, and p
denoting the unitary, trapezoidal, and permutation factors, respectively, constituting the decomposition. Alternatively, these can be extracted directly with:
Q, R, p = pqr(A)
but the factorized form is often more convenient as it also implements various arithmetic operations. For example, the commands:
x = rand(n)
y = F *x
z = F'*x
automatically invoke specialized multiplication routines to rapidly compute y
and z
using the low-rank structure of F
.
The rank of the factorization can be retrieved by F[:k]
, which in this example usually gives 26 or 27. The reason for this variability is that the default interface uses randomized Gaussian sketching for acceleration. Likewise, the actual approximation error is also random but can be efficiently and robustly estimated:
aerr = snormdiff(A, F)
rerr = aerr / snorm(A)
This computes the absolute and relative errors in the spectral norm using power iteration. Here, the relative error achieved should be on the order of machine epsilon. (You may see a warning about exceeding the iteration limit, but this is harmless in this case.)
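The power-iteration idea behind this error estimate can be sketched with the standard library alone (a simplified stand-in for snorm, not the package's implementation):

```julia
using LinearAlgebra

# Estimate the spectral norm σ_max(A) by power iteration on A'A.
function snorm_est(A; iters = 200)
    x = randn(size(A, 2))
    x /= norm(x)
    est = 0.0
    for _ in 1:iters
        y = A' * (A * x)     # apply A'A without forming it explicitly
        est = sqrt(norm(y))  # for unit x, ‖A'A x‖ → σ_max²
        x = y / norm(y)
    end
    return est
end

@assert snorm_est([3.0 0.0; 0.0 1.0]) ≈ 3.0
```

Applying the same estimator to the residual operator A - F gives the absolute error, as snormdiff does.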
The default interface requests ~1e-15 estimated relative precision. To request, say, only 1e-12 relative precision, use:
F = pqrfact(A, rtol=1e-12)
which returns a factorization of rank ~22. We can also directly control the rank instead with, e.g.:
F = pqrfact(A, rank=20)
Using both together as in:
F = pqrfact(A, rank=20, rtol=1e-12)
sets two separate termination criteria: one on reaching rank 20 and the other on achieving estimated relative precision 1e-12---with the computation completing upon either of these being fulfilled. Many other options are available as well. All keyword arguments can also be encapsulated into an LRAOptions
type for convenience. For example, we can equivalently write the above as:
opts = LRAOptions(rank=20, rtol=1e-12)
F = pqrfact(A, opts)
For further details, see the Options section.
All aforementioned considerations also apply when the input is a linear operator, i.e., when the matrix is described not by its entries but by its action on vectors. To demonstrate, we can convert A
to type LinearOperator
as follows:
L = LinearOperator(A)
which inherits and stores methods for applying the matrix and its adjoint. (This command actually recognizes A
as Hermitian and forms L
as a HermitianLinearOperator
.) A partial QR decomposition can then be computed with:
F = pqrfact(L)
just as in the previous case. Of course, there is no real benefit to doing so in this particular example; the advantage comes when considering complicated matrix products that can be represented implicitly as a single LinearOperator
. For instance, A*A
can be represented as L*L
without ever forming the resultant matrix explicitly, and we can even encapsulate entire factorizations as linear operators to exploit fast multiplication:
L = LinearOperator(F)
Linear operators can be scaled, added, and composed together using the usual syntax. All methods in LowRankApprox transparently support both matrices and linear operators.
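As an illustration (a minimal sketch; the matrix here is hypothetical example data), linear operators can be combined before any factorization takes place:

```julia
using LowRankApprox
using LinearAlgebra

# Hypothetical example data: a random Hermitian matrix
A = rand(100, 100); A = A + A'
L = LinearOperator(A)

# Scale, add, and compose with the usual syntax; the intermediate
# products are never formed explicitly, only their actions on vectors.
M = 2*L + L*L'

# The composite operator can be fed directly to any factorization routine.
F = pqrfact(M)
```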
We now detail the various low-rank approximations implemented, all of which nominally return compact Factorization types storing the matrix factors in structured form. All such factorizations provide optimized multiplication routines. Furthermore, the rank of any factorization F can be queried with F[:k], and the matrix approximant defined by F can be reconstructed as Matrix(F). For concreteness of exposition, assume in the following that A has size m by n with factorization rank F[:k] = k. Note that certain matrix identities below should be interpreted only as equalities up to the approximation precision.
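For instance (a minimal sketch with hypothetical example data), the rank query and reconstruction look like:

```julia
using LowRankApprox
using LinearAlgebra

A = rand(100, 50)      # hypothetical example matrix
F = pqrfact(A)

k = F[:k]              # rank of the computed factorization
E = Matrix(F)          # reconstruct the full approximant from its factors
err = norm(A - E)      # small relative to norm(A), per the default rtol
```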
A partial QR decomposition is a factorization A[:,p] = Q*R, where Q is m by k with orthonormal columns, R is k by n and upper trapezoidal, and p is a permutation vector. Such a decomposition can be computed with:
F = pqrfact(A, args...)
or more explicitly with:
Q, R, p = pqr(A, args...)
The former returns a PartialQR factorization with access methods:
F[:Q]: Q factor as type Matrix
F[:R]: R factor as type UpperTrapezoidal
F[:p]: p permutation as type Vector
F[:P]: p permutation as type ColumnPermutation
Both F[:R] and F[:P] are represented as structured matrices, complete with their own arithmetic operations, and together permit the alternative approximation formula A*F[:P] = F[:Q]*F[:R]. The factorization form additionally supports least-squares solution by left-division.
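For example (a sketch assuming a hypothetical overdetermined system), the least-squares solve reads:

```julia
using LowRankApprox
using LinearAlgebra

A = rand(100, 50)      # hypothetical overdetermined system
b = rand(100)
F = pqrfact(A)

# Left-division with the factorization yields an approximate
# least-squares solution using the compressed factors.
x = F \ b
res = norm(A*x - b)    # residual of the approximate solution
```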
We can also compute a partial QR decomposition of A' (that is, pivoting on rows instead of columns) without necessarily constructing the matrix transpose explicitly by writing:
F = pqrfact(:c, A, args...)
and similarly with pqr. The default interface is equivalent to, e.g.:
F = pqrfact(:n, A, args...)
for "no transpose". It is also possible to generate only a subset of the partial QR factors for further efficiency; see Options.
The above methods do not modify the input matrix A and may make a copy of the data in order to enforce this (whether this is actually necessary depends on the type of input and the sketch method used). Potentially more efficient versions that reserve the right to overwrite A are available as pqrfact! and pqr!, respectively.
The ID is based on the approximation A[:,rd] = A[:,sk]*T, where sk is a set of k "skeleton" columns, rd is a set of n - k "redundant" columns, and T is a k by n - k interpolation matrix. It follows that A[:,p] = C*V, where p = [sk; rd], C = A[:,sk], and V = [Matrix(I,k,k) T]. An ID can be computed by:
V = idfact(A, args...)
or:
sk, rd, T = id(A, args...)
Here, V is of type IDPackedV and defines the V factor above, but it can also implicitly represent the entire ID via:
V[:sk]: sk columns as type Vector
V[:rd]: rd columns as type Vector
V[:p]: p permutation as type Vector
V[:P]: p permutation as type ColumnPermutation
V[:T]: T factor as type Matrix
To actually produce the ID itself, use:
F = ID(A, V)
or:
F = ID(A, sk, rd, T)
which returns an ID factorization that can be directly compared with A. This factorization has access methods:
F[:C]: C factor as type Matrix
F[:V]: V factor as type IDPackedV
in addition to those defined for IDPackedV.
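Putting the pieces together (a minimal sketch on hypothetical data), the defining interpolation identity can be checked directly:

```julia
using LowRankApprox
using LinearAlgebra

A = rand(100, 50)          # hypothetical example matrix
V = idfact(A)              # IDPackedV
F = ID(A, V)               # full ID factorization

C = F[:C]                  # skeleton columns A[:, F[:sk]]
# The redundant columns are recovered from the skeleton columns
# through the interpolation matrix T, i.e., A[:,rd] = A[:,sk]*T:
err = norm(A[:, F[:rd]] - C*F[:T])
```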
As with the partial QR decomposition, an ID can be computed for A' instead (that is, finding skeleton rows as opposed to columns) in the same way, e.g.:
V = idfact(:c, A, args...)
The default interface is equivalent to passing :n as the first argument. Moreover, modifying versions of the above are available as idfact! and id!.
A partial SVD is a factorization A = U*S*V', where U and V are m by k and n by k, respectively, both with orthonormal columns, and S is k by k and diagonal with nonincreasing nonnegative real entries. It can be computed with:
F = psvdfact(A, args...)
or:
U, S, V = psvd(A, args...)
The factorization is of type PartialSVD and has access methods:
F[:U]: U factor as type Matrix
F[:S]: S factor as type Vector
F[:V]: V factor as type Matrix
F[:Vt]: V' factor as type Matrix
Note that the underlying SVD routine forms V' as output, so F[:Vt] is easier to extract than F[:V]. Least-squares solution is also supported using left-division. Furthermore, if just the singular values are required, then we can use:
S = psvdvals(A, args...)
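A typical call (a sketch with hypothetical example data; the Diagonal reconstruction is for illustration only) might be:

```julia
using LowRankApprox
using LinearAlgebra

A = rand(100, 50)                  # hypothetical example matrix
F = psvdfact(A, rtol=1e-8)

U, S, Vt = F[:U], F[:S], F[:Vt]
E = U * Diagonal(S) * Vt           # approximant reconstructed from the factors

S2 = psvdvals(A, rtol=1e-8)        # singular values only, skipping U and V
```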
A partial Hermitian eigendecomposition of an n by n Hermitian matrix A is a factorization A = U*S*U', where U is n by k with orthonormal columns and S is k by k and diagonal with nondecreasing real entries. It is very similar to a partial Hermitian SVD and can be computed by:
F = pheigfact(A, args...)
or:
vals, vecs = pheig(A, args...)
where we have followed the Julia convention of letting vals denote the eigenvalues comprising S and vecs denote the eigenvector matrix U. The factorization is of type PartialHermitianEigen and has access methods:
F[:values]: vals as type Vector
F[:vectors]: vecs as type Matrix
It also supports least squares solution by left-division. If only the eigenvalues are desired, use instead:
vals = pheigvals(A, args...)
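For example (a sketch with a hypothetical Hermitian test matrix):

```julia
using LowRankApprox
using LinearAlgebra

A = rand(100, 100); A = A + A'         # hypothetical Hermitian test matrix
F = pheigfact(A, rtol=1e-8)

vals, vecs = F[:values], F[:vectors]
E = vecs * Diagonal(vals) * vecs'      # reconstructs A up to the requested precision
```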
A CUR decomposition is a factorization A = C*U*R, where C = A[:,cols] and R = A[rows,:] consist of k columns and rows, respectively, from A, and U = inv(A[rows,cols]). The basis rows and columns can be computed with:
U = curfact(A, args...)
or:
rows, cols = cur(A, args...)
The former is of type CURPackedU (or HermitianCURPackedU if A is Hermitian, or SymmetricCURPackedU if symmetric) and has access methods:
U[:cols]: cols columns as type Vector
U[:rows]: rows rows as type Vector
To produce the corresponding CUR decomposition, use:
F = CUR(A, U)
or:
F = CUR(A, rows, cols)
which returns a CUR factorization (or HermitianCUR if A is Hermitian, or SymmetricCUR if symmetric), with access methods:
F[:C]: C factor as type Matrix
F[:U]: U factor as type Factorization
F[:R]: R factor as type Matrix
in addition to those defined for CURPackedU. If F is of type HermitianCUR, then F[:R] = F[:C]', while if F has type SymmetricCUR, then F[:R] = transpose(F[:C]). Note that because of conditioning issues, U is not stored explicitly but rather in factored form, nominally as type SVD but practically as PartialHermitianEigen if U has type HermitianCURPackedU or PartialSVD otherwise (for convenient arithmetic operations).
Modifying versions of the above are available as curfact! and cur!.
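End to end (a minimal sketch on hypothetical data), a CUR decomposition might be computed and verified as:

```julia
using LowRankApprox
using LinearAlgebra

A = rand(100, 50)          # hypothetical example matrix
rows, cols = cur(A)
F = CUR(A, rows, cols)

C = F[:C]                  # explicit column skeleton A[:, cols]
R = F[:R]                  # explicit row skeleton A[rows, :]
err = norm(A - Matrix(F))  # overall approximation error
```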
Matrix sketching is a core component of this package, and its proper use is critical for high performance. For an m by n matrix A, a sketch of order k takes the form B = S*A, where S is a k by m sampling matrix (see below). Sketches can similarly be constructed for sampling from the right or for multiplying against A'. The idea is that B contains a compressed representation of A up to rank approximately k, which can then be efficiently processed to recover information about A.
The default sketch method defines S as a Gaussian random matrix. Other sketch methods can be specified through the keyword sketch. For example, setting:
opts = LRAOptions(sketch=:srft, args...)
or equivalently:
opts = LRAOptions(args...)
opts.sketch = :srft
then passing to, e.g.:
V = idfact(A, opts)
computes an ID with sketching via a subsampled random Fourier transform (SRFT). This can also be done more directly with:
V = idfact(A, sketch=:srft)
A list of supported sketch methods is given below. To disable sketching altogether, use:
opts.sketch = :none
In addition to their integration with the low-rank factorization methods, sketches can also be generated independently by:
B = sketch(A, order, args...)
Other interfaces include:
B = sketch(:left, :n, A, order, args...) to compute B = S*A
B = sketch(:left, :c, A, order, args...) to compute B = S*A'
B = sketch(:right, :n, A, order, args...) to compute B = A*S
B = sketch(:right, :c, A, order, args...) to compute B = A'*S
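For instance (a sketch; the matrix and order here are hypothetical):

```julia
using LowRankApprox

A = rand(100, 50)                   # hypothetical example matrix
order = 20

B1 = sketch(A, order)               # default left sketch: B1 = S*A
B2 = sketch(:right, :n, A, order)   # right sketch: B2 = A*S
```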
We also provide adaptive routines to automatically sketch with increasing orders until a specified error tolerance is met, as detected by early termination of an unaccelerated partial QR decomposition. This adaptive sketching forms the basis for essentially all higher-level algorithms in LowRankApprox and can be called with:
F = sketchfact(A, args...)
Like sketch, a more detailed interface is also available as:
F = sketchfact(side, trans, A, args...)
The canonical sampling matrix is a Gaussian random matrix with entries drawn independently from the standard normal distribution (or with real and imaginary parts each drawn independently if A is complex). To use this sketch method, set:
opts.sketch = :randn
There is also support for power iteration to improve accuracy when the spectral gap (up to rank k) is small. This computes, e.g., B = S*(A*A')^p*A (or simply B = S*A^(p + 1) if A is Hermitian) instead of just B = S*A, with all intermediate matrix products orthogonalized for stability.
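The number of power iterations is set through the sketch_randn_niter option (see the Options section); for example (a sketch with hypothetical data):

```julia
using LowRankApprox

A = rand(100, 50)                # hypothetical example matrix
opts = LRAOptions(sketch=:randn)
opts.sketch_randn_niter = 2      # two rounds of power iteration

F = pqrfact(A, opts)
```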
For generic A, Gaussian sketching has complexity O(k*m*n). In principle, this can make it the most expensive stage of computing a fast low-rank approximation (though in practice it is still very effective). Considerable effort has therefore gone into developing sketch methods with lower computational cost, which is addressed in part by the following techniques.
Perhaps the simplest matrix sketch is just a random subset of rows or columns, with complexity O(k*m) or O(k*n) as appropriate. This can be specified with:
opts.sketch = :sub
The linear growth in matrix dimension is obviously attractive, but note that this method can fail if the matrix is not sufficiently "regular", e.g., if it contains a few large isolated entries. Random subselection is only implemented for type AbstractMatrix.
An alternative approach based on imposing structure in the sampling matrix is the SRFT, which has the form S = R*F*D (if applying from the left), where R is a random permutation matrix of size k by m, F is the discrete Fourier transform (DFT) of order m, and D is a random diagonal unitary scaling. Due to the DFT structure, this can be applied in only O(m*n*log(k)) operations (but beware that the constant is quite high). To use this method, set:
opts.sketch = :srft
For real A, our SRFT implementation uses only real arithmetic by separately computing real and imaginary parts as in a standard real-to-real DFT. Only AbstractMatrix types are supported.
As a modification of Gaussian sketching, we also provide a "sparse" random Gaussian sampling scheme, wherein S is restricted to have only O(m) or O(n) nonzeros, depending on the dimension to be contracted. Considering the case B = S*A for concreteness, each row of S is taken to be nonzero in only O(m/k) columns, with full coverage of A maintained by evenly spreading these nonzero indices among the rows of S. The complexity of computing B is O(m*n). Sparse Gaussian sketching can be specified with:
opts.sketch = :sprn
and is only implemented for type AbstractMatrix. Power iteration is not supported, since any subsequent matrix application would devolve back to O(k*m*n) cost.
We also provide a few other relevant algorithms, as follows. Let A be an m by n matrix.
A basis for the partial range of A of rank k is an m by k matrix Q with orthonormal columns such that A = Q*Q'*A. Such a basis can be computed with:
Q = prange(A, args...)
Fast range approximation using sketching is supported.
The default interface computes a basis for the column space of A. To capture the row space instead, use:
Q = prange(:c, A, args...)
which is equivalent to computing the partial range of A'. The resulting matrix Q is n by k with orthonormal columns and satisfies A = A*Q*Q'. It is also possible to approximate both the row and column spaces simultaneously with:
Q = prange(:b, A, args...)
Then A = Q*Q'*A*Q*Q'.
A possibly modifying version is available as prange!.
The spectral norm of A can be rapidly computed using randomized power iteration via:
err = snorm(A, args...)
Similarly, the spectral norm of the difference of two matrices A and B can be computed with:
err = snormdiff(A, B, args...)
which provides a convenient and efficient way to test the accuracy of our low-rank approximations.
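For example (a minimal sketch on hypothetical data), the accuracy of a partial QR can be checked as:

```julia
using LowRankApprox

A = rand(100, 50)                     # hypothetical example matrix
F = pqrfact(A, rtol=1e-8)

abs_err = snormdiff(A, Matrix(F))     # spectral-norm approximation error
rel_err = abs_err / snorm(A)          # should be on the order of rtol
```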
The underlying algorithm behind LowRankApprox is the pivoted QR decomposition, with the magnitudes of the pivots providing an estimate of the approximation error incurred at each truncation rank. Here, we use an early-terminating variant of the LAPACK routine GEQP3. The partial QR decomposition so constructed is then leveraged into an ID to support the various other factorizations.
Due to its fundamental importance, we can also perform optional determinant maximization post-processing to obtain a (strong) rank-revealing QR (RRQR) decomposition. This ensures that we select the best column pivots and can further improve numerical precision and stability.
Numerous options are exposed through the LRAOptions type, which we cover below, grouped by logical function.
The accuracy of any low-rank approximation (in the spectral norm) is controlled by the following parameters:
atol: absolute tolerance of approximation (default: 0)
rtol: relative tolerance of approximation (default: 5*eps())
rank: maximum rank of approximation (default: -1)
Each parameter specifies an independent termination criterion; the computation completes when any of them is met. Currently, atol and rtol are checked against QR pivot magnitudes, and thus accuracy can only be approximately guaranteed, though the resulting errors should be of the correct order.
Iterative RRQR post-processing is also available:
maxdet_niter: maximum number of iterations for determinant maximization (default: -1)
maxdet_tol: relative tolerance for determinant maximization (default: -1)
If maxdet_tol < 0, no post-processing is done; otherwise, as above, each parameter specifies an independent termination criterion. These options affect all factorizations (i.e., not just QR), since they all involve, at some level, approximations based on the QR. For example, computing an ID via an RRQR guarantees that the interpolation matrix T satisfies maxabs(T) < 1 + maxdet_tol (assuming no early termination due to maxdet_niter).
The parameters atol and rtol are also used by the spectral norm estimation routines snorm and snormdiff to specify the requested precision of the (scalar) norm output.
The following parameters govern matrix sketching:
sketch: sketch method, one of :none, :randn (default), :srft, :sub, or :sprn
sketch_randn_niter: number of power iterations for Gaussian sketching (default: 0)
sketchfact_adap: whether to compute a sketched factorization adaptively by successively doubling the sketch order (default: true); setting false only takes effect if rank >= 0, in which case a single sketch of order rank is (partially) factorized
sketchfact_randn_samp: oversampling function for Gaussian sketching (default: n -> n + 8)
sketchfact_srft_samp: oversampling function for SRFT sketching (default: n -> n + 8)
sketchfact_sub_samp: oversampling function for subset sketching (default: n -> 4*n + 8)
The oversampling functions take as input a desired approximation rank and return a corresponding sketch order designed to capture it with high probability. No oversampling function is used for sparse random Gaussian sketching due to its special form.
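To override an oversampling function (a sketch; the particular rule shown is a hypothetical choice, not a recommendation):

```julia
using LowRankApprox

A = rand(100, 50)                          # hypothetical example matrix
opts = LRAOptions(sketch=:sub)

# A hypothetical, more aggressive oversampling rule than the
# default n -> 4*n + 8 for subset sketching:
opts.sketchfact_sub_samp = n -> 8*n + 16

F = pqrfact(A, opts)
```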
Other available options include:
nb: computational block size, used in various settings (default: 32)
pheig_orthtol: eigenvalue relative tolerance to identify degenerate subspaces, within which eigenvectors are re-orthonormalized (to work around a LAPACK issue; default: sqrt(eps()))
pqrfact_retval: string containing keys indicating which outputs to return from pqrfact (default: "qr"), where:
"q": orthonormal Q matrix
"r": trapezoidal R matrix
"t": interpolation T matrix (for the ID)
snorm_niter: maximum number of iterations for spectral norm estimation (default: 32)
verb: whether to print verbose messages, used sparingly (default: true)
Note that pqrfact always returns the permutation vector p, so no specification is needed in pqrfact_retval. If pqrfact_retval = "qr" (in some order), then the output factorization has type PartialQR; otherwise, it is of type PartialQRFactors, which is simply a container type with no defined arithmetic operations. All keys other than "q", "r", and "t" are ignored.
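For instance (a sketch with hypothetical data; we assume here that option fields may be passed as keywords, as in the earlier rtol and rank examples):

```julia
using LowRankApprox

A = rand(100, 50)                      # hypothetical example matrix

# Only the R factor is requested (p is always returned), so the
# result is a plain PartialQRFactors container:
F = pqrfact(A, pqrfact_retval="r")

# Requesting both "q" and "r" yields a PartialQR with full
# arithmetic support:
G = pqrfact(A, pqrfact_retval="qr")
```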
Below, we summarize the leading-order computational costs of each factorization function depending on the sketch type. Assume an input AbstractMatrix of size m by n with numerical rank k << min(m, n) and O(1) cost to compute each entry. Then, first, for a non-adaptive computation (i.e., k is known essentially a priori):
function | none | randn | sub | srft | sprn |
---|---|---|---|---|---|
pqr | k*m*n | k*m*n | k^2*(m + n) | m*n*log(k) | m*n |
id | k*m*n | k*m*n | k^2*n + k*m | m*n*log(k) | m*n |
svd | k*m*n | k*m*n | k^2*(m + n) | m*n*log(k) | m*n |
pheig | k*m*n | k*m*n | k^2*(m + n) | m*n*log(k) | m*n |
cur | k*m*n | k*m*n | k^2*(m + n) | m*n*log(k) | m*n |
The cost given for the ID is for the default column-oriented version; to obtain the operation count for a row-oriented ID, simply switch the roles of m and n. Note also that pheig is only applicable to square matrices, i.e., m = n.
All of the above remain unchanged for sketchfact_adap = true with the exception of the following, whose costs become:
sketch = :srft: m*n*log(k)^2
sketch = :sprn: m*n*log(k)
uniformly across all functions.
Author: JuliaMatrices
Source Code: https://github.com/JuliaMatrices/LowRankApprox.jl
License: View license
1664322600
R Htmlwidget Wrapper Around The D3 Scatterplot Matrix
This is an htmlwidget wrapper around Benjie Chen's extension of Mike Bostock's scatterplot matrix.
library(scatterMatrixD3)
scatterMatrix(
data = iris
)
Author: jcizel
Source Code: https://github.com/jcizel/scatterMatrixD3