# LinearOperators.jl: Linear Operators for Julia

## LinearOperators.jl

A Julia Linear Operator Package

## How to Cite

If you use LinearOperators.jl in your work, please cite using the format given in `CITATION.bib`.

## Philosophy

Operators behave like matrices (with some exceptions - see below) but are defined by their effect when applied to a vector. They can be transposed, conjugated, or combined with other operators cheaply. The costly operation is deferred until multiplied with a vector.
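
For instance, here is a minimal sketch (with made-up matrices) of how composition stays cheap until a vector arrives:

```julia
using LinearOperators

A = rand(5, 5); B = rand(5, 5)
opA = LinearOperator(A)
opB = LinearOperator(B)

# Combining operators is cheap: no matrix product is formed here.
opC = opA * opB + 2 * opA'

# The cost is paid only on application, as A * (B * v) + 2 * (A' * v).
v = rand(5)
opC * v  # 5-element Vector
```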

## Compatibility

Julia 1.3 and up.

## How to Install

```
pkg> add LinearOperators
pkg> test LinearOperators
```

## How to use

Check the tutorial.
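
For a quick taste before the tutorial, here is a small sketch using the ready-made constructors `opDiagonal` and `opEye`:

```julia
using LinearOperators

D = opDiagonal([1.0, 2.0, 3.0])  # 3×3 diagonal operator
E = opEye(3)                     # 3×3 identity operator
(D + E) * ones(3)                # [2.0, 3.0, 4.0]
```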

## Other Operations on Operators

Operators can be transposed (`transpose(A)`), conjugated (`conj(A)`) and conjugate-transposed (`A'`). Operators can be sliced (`A[:,3]`, `A[2:4,1:5]`, `A[1,1]`), but unlike matrices, slices always return operators (see differences below).
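
A small sketch (wrapping a made-up dense matrix) showing that these operations are lazy:

```julia
using LinearOperators

B = rand(3, 4)
opB = LinearOperator(B)

opB'            # 4×3 operator: lazy conjugate transpose
transpose(opB)  # 4×3 operator: lazy transpose
opB[1:2, 1:4]   # 2×4 operator: a slice is still an operator
```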

## Differences

Unlike matrices, an operator never reduces to a vector or a number.

```julia
A = rand(5,5)
opA = LinearOperator(A)
A[:,1] * 3    # Vector
opA[:,1] * 3  # LinearOperator
A[:,1] * [3]  # ERROR
opA[:,1] * [3]  # Vector
```

This is also true for `A[i,J]`, which returns a vector when `A` is a matrix, and for the scalar `A[i,j]`. Similarly, `opA[1,1]` is an operator of size (1,1):

```julia
opA[1,1]  # LinearOperator
A[1,1]    # Number
```

In the same spirit, the function `full` applied to an operator always returns a matrix.

```julia
full(opA[:,1])  # n×1 matrix
```

## Other Operators

• LimitedLDLFactorizations features a limited-memory LDLᵀ factorization operator that may be used as a preconditioner in iterative methods.
• MUMPS.jl features a full distributed-memory factorization operator that may be used to represent the preconditioner in, e.g., constraint-preconditioned Krylov methods.

## Bug reports and discussions

If you think you found a bug, feel free to open an issue. Focused suggestions and requests can also be opened as issues. Please start an issue or a discussion on the topic before opening a pull request.

If you want to ask a question not suited for a bug report, feel free to start a discussion here. This forum is for general discussion about this repository and the JuliaSmoothOptimizers organization, so questions about any of our packages are welcome.

Author: JuliaSmoothOptimizers
Source Code: https://github.com/JuliaSmoothOptimizers/LinearOperators.jl


## Ternary operator in Python?

The ternary operator is a conditional expression: it evaluates a condition and yields one of two values depending on whether the condition is true or false. It is the shortest way to write an if-else statement, replacing multi-line if-else code with a single line.

Syntax: `value_if_true if condition else value_if_false`

• `condition`: a boolean expression that evaluates to true or false
• `value_if_true`: the value the expression yields if the condition is true
• `value_if_false`: the value the expression yields if the condition is false

Here are some examples of the Python ternary operator replacing if-else. As a brief illustration, take two variables `a` and `b`, with `a` equal to 10 and `b` equal to 20, and find the minimum in one line of code: `min = a if a < b else b`. If `a` is less than `b` the expression yields `a`; otherwise it yields `b`. The same pattern can, for example, check whether a number is even or odd.


## Unformatted input/output operations In C++

In this article, we discuss unformatted input/output operations in C++. Using the objects `cin` and `cout` for the input and output of data of various types is possible because the operators `>>` and `<<` are overloaded to recognize all the basic C++ types: the operator `>>` is overloaded in the `istream` class and the operator `<<` in the `ostream` class.

The general format for reading data from the keyboard:

`cin >> var1 >> var2 >> ... >> var_n;`

• Here, `var1`, `var2`, …, `var_n` are variable names that have already been declared.
• The input data must be separated by whitespace characters, and the data type of the user input must match the data types of the variables declared in the program.
• The operator `>>` reads the data character by character and assigns it to the indicated location.
• Reading of a variable terminates when whitespace occurs or when a character is encountered that does not match the destination type.


## GLM.jl: Generalized Linear Models in Julia

Linear and generalized linear models in Julia


## Installation

```julia
Pkg.add("GLM")
```

will install this package and its dependencies, which include the Distributions package.

The RDatasets package is useful for fitting models on standard R datasets to compare the results with those from R.

## Fitting GLM models

Two methods can be used to fit a Generalized Linear Model (GLM): `glm(formula, data, family, link)` and `glm(X, y, family, link)`; a short sketch of both forms follows the list. Their arguments must be:

• `formula`: a StatsModels.jl `Formula` object referring to columns in `data`; for example, if column names are `:Y`, `:X1`, and `:X2`, then a valid formula is `@formula(Y ~ X1 + X2)`
• `data`: a table in the Tables.jl definition, e.g. a data frame; rows with `missing` values are ignored
• `X`: a matrix holding values of the independent variable(s) in columns (including, if appropriate, a column of ones for the intercept)
• `y`: a vector holding values of the dependent variable
• `family`: chosen from `Bernoulli()`, `Binomial()`, `Gamma()`, `Normal()`, `Poisson()`, or `NegativeBinomial(θ)`
• `link`: chosen from the list below, for example, `LogitLink()` is a valid link for the `Binomial()` family
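
As a minimal sketch of the two call forms (the data below is made up for illustration):

```julia
using DataFrames, GLM

data = DataFrame(Y = [1.8, 3.1, 3.9, 5.2, 6.1, 6.9], X1 = 1.0:6.0)

# Formula + table form
m1 = glm(@formula(Y ~ X1), data, Normal(), IdentityLink())

# Matrix + vector form: the intercept column must be supplied explicitly
X = hcat(ones(6), data.X1)
m2 = glm(X, data.Y, Normal(), IdentityLink())
```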

Typical distributions for use with `glm` and their canonical link functions are

```
Bernoulli (LogitLink)
Binomial (LogitLink)
Gamma (InverseLink)
InverseGaussian (InverseSquareLink)
NegativeBinomial (NegativeBinomialLink)
Normal (IdentityLink)
Poisson (LogLink)
```

Currently the available Link types are

```
CauchitLink
CloglogLink
IdentityLink
InverseLink
InverseSquareLink
LogitLink
LogLink
NegativeBinomialLink
PowerLink
ProbitLink
SqrtLink
```

Note that the canonical link for negative binomial regression is `NegativeBinomialLink`, but in practice one typically uses `LogLink`. The `NegativeBinomial` distribution belongs to the exponential family only if θ (the shape parameter) is fixed, thus θ has to be provided if we use `glm` with the `NegativeBinomial` family. If one would like to also estimate θ, then `negbin(formula, data, link)` should be used instead.

An intercept is included in any GLM by default.
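
To fit a model without the intercept, start the formula's right-hand side with `0 +` (a small sketch with made-up data):

```julia
using DataFrames, GLM

data = DataFrame(Y = [1.1, 2.3, 2.9], X = [1.0, 2.0, 3.0])
lm(@formula(Y ~ 0 + X), data)  # model matrix has no intercept column
```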

## Categorical variables

Categorical variables will be dummy coded by default if they are non-numeric or if they are `CategoricalVector`s within a Tables.jl table (`DataFrame`, JuliaDB table, named tuple of vectors, etc). Alternatively, you can pass an explicit contrasts argument if you would like a different contrast coding system or if you are not using DataFrames.

The response (dependent) variable may not be categorical.

Using a `CategoricalVector` constructed with `categorical` or `categorical!`:

```julia
julia> using CategoricalArrays, DataFrames, GLM, StableRNGs

julia> rng = StableRNG(1); # Ensure example can be reproduced

julia> data = DataFrame(y = rand(rng, 100), x = categorical(repeat([1, 2, 3, 4], 25)));

julia> lm(@formula(y ~ x), data)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

y ~ 1 + x

Coefficients:
───────────────────────────────────────────────────────────────────────────
                 Coef.  Std. Error      t  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)   0.490985    0.0564176   8.70    <1e-13   0.378997    0.602973
x: 2          0.0527655   0.0797865   0.66    0.5100  -0.105609    0.21114
x: 3          0.0955446   0.0797865   1.20    0.2341  -0.0628303   0.25392
x: 4         -0.032673    0.0797865  -0.41    0.6831  -0.191048    0.125702
───────────────────────────────────────────────────────────────────────────
```

Using `contrasts`:

```julia
julia> using StableRNGs

julia> data = DataFrame(y = rand(StableRNG(1), 100), x = repeat([1, 2, 3, 4], 25));

julia> lm(@formula(y ~ x), data, contrasts = Dict(:x => DummyCoding()))
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

y ~ 1 + x

Coefficients:
───────────────────────────────────────────────────────────────────────────
                 Coef.  Std. Error      t  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)   0.490985    0.0564176   8.70    <1e-13   0.378997    0.602973
x: 2          0.0527655   0.0797865   0.66    0.5100  -0.105609    0.21114
x: 3          0.0955446   0.0797865   1.20    0.2341  -0.0628303   0.25392
x: 4         -0.032673    0.0797865  -0.41    0.6831  -0.191048    0.125702
───────────────────────────────────────────────────────────────────────────
```

## Comparing models with F-test

Comparisons between two or more linear models can be performed using the `ftest` function, which computes an F-test between each pair of subsequent models and reports fit statistics:

```julia
julia> using DataFrames, GLM, StableRNGs

julia> data = DataFrame(y = (1:50).^2 .+ randn(StableRNG(1), 50), x = 1:50);

julia> ols_lin = lm(@formula(y ~ x), data);

julia> ols_sq = lm(@formula(y ~ x + x^2), data);

julia> ftest(ols_lin.model, ols_sq.model)
F-test: 2 models fitted on 50 observations
─────────────────────────────────────────────────────────────────────────────────
     DOF  ΔDOF           SSR           ΔSSR      R²     ΔR²            F*   p(>F)
─────────────────────────────────────────────────────────────────────────────────
[1]    3        1731979.2266                 0.9399
[2]    4     1       40.7581  -1731938.4685  1.0000  0.0601  1997177.0357  <1e-99
─────────────────────────────────────────────────────────────────────────────────
```

## Methods applied to fitted models

Many of the methods provided by this package have names similar to those in R; a short sketch follows the list.

• `coef`: extract the estimates of the coefficients in the model
• `deviance`: measure of the model fit, weighted residual sum of squares for lm's
• `dof_residual`: degrees of freedom for residuals, when meaningful
• `glm`: fit a generalized linear model (an alias for `fit(GeneralizedLinearModel, ...)`)
• `lm`: fit a linear model (an alias for `fit(LinearModel, ...)`)
• `r2`: R² of a linear model or pseudo-R² of a generalized linear model
• `stderror`: standard errors of the coefficients
• `vcov`: estimated variance-covariance matrix of the coefficient estimates
• `predict`: obtain predicted values of the dependent variable from the fitted model
• `residuals`: get the vector of residuals from the fitted model
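
A minimal sketch (made-up data) extracting a few of these quantities from a fitted model:

```julia
using DataFrames, GLM

data = DataFrame(y = [1.0, 2.1, 2.9, 4.2], x = [1.0, 2.0, 3.0, 4.0])
m = lm(@formula(y ~ x), data)

coef(m)       # estimated coefficients
stderror(m)   # their standard errors
vcov(m)       # variance-covariance matrix of the estimates
r2(m)         # coefficient of determination
predict(m)    # fitted values
residuals(m)  # response minus fitted values
```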

## Separation of response object and predictor object

The general approach in this code is to separate functionality related to the response from that related to the linear predictor. This allows for greater generality by mixing and matching different subtypes of the abstract type `LinPred` and the abstract type `ModResp`.

A `LinPred` type incorporates the parameter vector and the model matrix. The parameter vector is a dense numeric vector but the model matrix can be dense or sparse. A `LinPred` type must incorporate some form of a decomposition of the weighted model matrix that allows for the solution of a system `X'W * X * delta=X'wres` where `W` is a diagonal matrix of "X weights", provided as a vector of the square roots of the diagonal elements, and `wres` is a weighted residual vector.

Currently there are two dense predictor types, `DensePredQR` and `DensePredChol`, and the usual caveats apply. The Cholesky version is faster but somewhat less accurate than the QR version. The skeleton of a distributed predictor type is in the code but not yet fully fleshed out. Because Julia by default uses OpenBLAS, which is already multi-threaded on multicore machines, there may not be much advantage in using distributed predictor types.

A `ModResp` type must provide methods for the `wtres` and `sqrtxwts` generics. Their values are the arguments to the `updatebeta` methods of the `LinPred` types. The `Float64` value returned by `updatedelta` is the value of the convergence criterion.

Similarly, `LinPred` types must provide a method for the `linpred` generic. In general `linpred` takes an instance of a `LinPred` type and a step factor. Methods that take only an instance of a `LinPred` type use a default step factor of 1. The value of `linpred` is the argument to the `updatemu` method for `ModResp` types. The `updatemu` method returns the updated deviance.

## Examples


## Linear regression

```julia
julia> using DataFrames, GLM, StatsBase

julia> data = DataFrame(X=[1,2,3], Y=[2,4,7])
3×2 DataFrame
Row │ X      Y
│ Int64  Int64
─────┼──────────────
1 │     1      2
2 │     2      4
3 │     3      7

julia> ols = lm(@formula(Y ~ X), data)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

Y ~ 1 + X

Coefficients:
─────────────────────────────────────────────────────────────────────────
                 Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
─────────────────────────────────────────────────────────────────────────
(Intercept)  -0.666667    0.62361   -1.07    0.4788   -8.59038    7.25704
X             2.5         0.288675   8.66    0.0732   -1.16797    6.16797
─────────────────────────────────────────────────────────────────────────

julia> round.(stderror(ols), digits=5)
2-element Vector{Float64}:
0.62361
0.28868

julia> round.(predict(ols), digits=5)
3-element Vector{Float64}:
1.83333
4.33333
6.83333

julia> round.(confint(ols); digits=5)
2×2 Matrix{Float64}:
-8.59038  7.25704
-1.16797  6.16797

julia> round(r2(ols); digits=5)
0.98684

julia> round(adjr2(ols); digits=5)
0.97368

julia> round(deviance(ols); digits=5)
0.16667

julia> dof(ols)
3

julia> dof_residual(ols)
1.0

julia> round(aic(ols); digits=5)
5.84252

julia> round(aicc(ols); digits=5)
-18.15748

julia> round(bic(ols); digits=5)
3.13835

julia> round(dispersion(ols.model); digits=5)
0.40825

julia> round(loglikelihood(ols); digits=5)
0.07874

julia> round(nullloglikelihood(ols); digits=5)
-6.41736

julia> round.(vcov(ols); digits=5)
2×2 Matrix{Float64}:
0.38889  -0.16667
-0.16667   0.08333
```

## Probit regression

```julia
julia> data = DataFrame(X=[1,2,2], Y=[1,0,1])
3×2 DataFrame
Row │ X      Y
│ Int64  Int64
─────┼──────────────
1 │     1      1
2 │     2      0
3 │     2      1

julia> probit = glm(@formula(Y ~ X), data, Binomial(), ProbitLink())
StatsModels.TableRegressionModel{GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, Binomial{Float64}, ProbitLink}, GLM.DensePredChol{Float64, LinearAlgebra.Cholesky{Float64, Matrix{Float64}}}}, Matrix{Float64}}

Y ~ 1 + X

Coefficients:
────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)   9.63839     293.909   0.03    0.9738   -566.414    585.69
X            -4.81919     146.957  -0.03    0.9738   -292.849    283.211
────────────────────────────────────────────────────────────────────────
```

## Negative binomial regression

```julia
julia> using GLM, RDatasets

julia> quine = dataset("MASS", "quine")
146×5 DataFrame
Row │ Eth   Sex   Age   Lrn   Days
│ Cat…  Cat…  Cat…  Cat…  Int32
─────┼───────────────────────────────
1 │ A     M     F0    SL        2
2 │ A     M     F0    SL       11
3 │ A     M     F0    SL       14
4 │ A     M     F0    AL        5
5 │ A     M     F0    AL        5
6 │ A     M     F0    AL       13
7 │ A     M     F0    AL       20
8 │ A     M     F0    AL       22
⋮  │  ⋮     ⋮     ⋮     ⋮      ⋮
140 │ N     F     F3    AL        3
141 │ N     F     F3    AL        3
142 │ N     F     F3    AL        5
143 │ N     F     F3    AL       15
144 │ N     F     F3    AL       18
145 │ N     F     F3    AL       22
146 │ N     F     F3    AL       37
131 rows omitted

julia> nbrmodel = glm(@formula(Days ~ Eth+Sex+Age+Lrn), quine, NegativeBinomial(2.0), LogLink())
StatsModels.TableRegressionModel{GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, NegativeBinomial{Float64}, LogLink}, GLM.DensePredChol{Float64, LinearAlgebra.Cholesky{Float64, Matrix{Float64}}}}, Matrix{Float64}}

Days ~ 1 + Eth + Sex + Age + Lrn

Coefficients:
────────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)   Lower 95%   Upper 95%
────────────────────────────────────────────────────────────────────────────
(Intercept)   2.88645      0.227144  12.71    <1e-36   2.44125     3.33164
Eth: N       -0.567515     0.152449  -3.72    0.0002  -0.86631    -0.26872
Sex: M        0.0870771    0.159025   0.55    0.5840  -0.224606    0.398761
Age: F1      -0.445076     0.239087  -1.86    0.0627  -0.913678    0.0235251
Age: F2       0.0927999    0.234502   0.40    0.6923  -0.366816    0.552416
Age: F3       0.359485     0.246586   1.46    0.1449  -0.123814    0.842784
Lrn: SL       0.296768     0.185934   1.60    0.1105  -0.0676559   0.661191
────────────────────────────────────────────────────────────────────────────

julia> nbrmodel = negbin(@formula(Days ~ Eth+Sex+Age+Lrn), quine, LogLink())
StatsModels.TableRegressionModel{GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, NegativeBinomial{Float64}, LogLink}, GLM.DensePredChol{Float64, LinearAlgebra.Cholesky{Float64, Matrix{Float64}}}}, Matrix{Float64}}

Days ~ 1 + Eth + Sex + Age + Lrn

Coefficients:
────────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)   Lower 95%   Upper 95%
────────────────────────────────────────────────────────────────────────────
(Intercept)   2.89453      0.227415  12.73    <1e-36   2.4488      3.34025
Eth: N       -0.569341     0.152656  -3.73    0.0002  -0.868541   -0.270141
Sex: M        0.0823881    0.159209   0.52    0.6048  -0.229655    0.394431
Age: F1      -0.448464     0.238687  -1.88    0.0603  -0.916281    0.0193536
Age: F2       0.0880506    0.235149   0.37    0.7081  -0.372834    0.548935
Age: F3       0.356955     0.247228   1.44    0.1488  -0.127602    0.841513
Lrn: SL       0.292138     0.18565    1.57    0.1156  -0.0717297   0.656006
────────────────────────────────────────────────────────────────────────────

julia> println("Estimated theta = ", round(nbrmodel.model.rr.d.r, digits=5))
Estimated theta = 1.27489

```

## Julia and R comparisons

An example of a simple linear model in R is

```r
> coef(summary(lm(optden ~ carb, Formaldehyde)))
               Estimate  Std. Error    t value     Pr(>|t|)
(Intercept) 0.005085714 0.007833679  0.6492115 5.515953e-01
carb        0.876285714 0.013534536 64.7444207 3.409192e-07
```

The corresponding model with the `GLM` package is

```julia
julia> using GLM, RDatasets

julia> form = dataset("datasets", "Formaldehyde")
6×2 DataFrame
Row │ Carb     OptDen
│ Float64  Float64
─────┼──────────────────
1 │     0.1    0.086
2 │     0.3    0.269
3 │     0.5    0.446
4 │     0.6    0.538
5 │     0.7    0.626
6 │     0.9    0.782

julia> lm1 = fit(LinearModel, @formula(OptDen ~ Carb), form)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

OptDen ~ 1 + Carb

Coefficients:
───────────────────────────────────────────────────────────────────────────
                  Coef.  Std. Error      t  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)  0.00508571  0.00783368   0.65    0.5516  -0.0166641  0.0268355
Carb         0.876286    0.0135345   64.74    <1e-06   0.838708   0.913864
───────────────────────────────────────────────────────────────────────────
```

A more complex example in R is

```r
> coef(summary(lm(sr ~ pop15 + pop75 + dpi + ddpi, LifeCycleSavings)))
                 Estimate   Std. Error    t value     Pr(>|t|)
(Intercept) 28.5660865407 7.3545161062  3.8841558 0.0003338249
pop15       -0.4611931471 0.1446422248 -3.1885098 0.0026030189
pop75       -1.6914976767 1.0835989307 -1.5609998 0.1255297940
dpi         -0.0003369019 0.0009311072 -0.3618293 0.7191731554
ddpi         0.4096949279 0.1961971276  2.0881801 0.0424711387
```

with the corresponding Julia code

```julia
julia> LifeCycleSavings = dataset("datasets", "LifeCycleSavings")
50×6 DataFrame
Row │ Country         SR       Pop15    Pop75    DPI      DDPI
│ String15        Float64  Float64  Float64  Float64  Float64
─────┼─────────────────────────────────────────────────────────────
1 │ Australia         11.43    29.35     2.87  2329.68     2.87
2 │ Austria           12.07    23.32     4.41  1507.99     3.93
3 │ Belgium           13.17    23.8      4.43  2108.47     3.82
4 │ Bolivia            5.75    41.89     1.67   189.13     0.22
5 │ Brazil            12.88    42.19     0.83   728.47     4.56
6 │ Canada             8.79    31.72     2.85  2982.88     2.43
7 │ Chile              0.6     39.74     1.34   662.86     2.67
8 │ China             11.9     44.75     0.67   289.52     6.51
⋮  │       ⋮            ⋮        ⋮        ⋮        ⋮        ⋮
44 │ United States      7.56    29.81     3.43  4001.89     2.45
45 │ Venezuela          9.22    46.4      0.9    813.39     0.53
46 │ Zambia            18.56    45.25     0.56   138.33     5.14
47 │ Jamaica            7.72    41.12     1.73   380.47    10.23
48 │ Uruguay            9.24    28.13     2.72   766.54     1.88
49 │ Libya              8.89    43.69     2.07   123.58    16.71
50 │ Malaysia           4.71    47.2      0.66   242.69     5.08
35 rows omitted

julia> fm2 = fit(LinearModel, @formula(SR ~ Pop15 + Pop75 + DPI + DDPI), LifeCycleSavings)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

SR ~ 1 + Pop15 + Pop75 + DPI + DDPI

Coefficients:
─────────────────────────────────────────────────────────────────────────────────
               Coef.   Std. Error      t  Pr(>|t|)    Lower 95%    Upper 95%
─────────────────────────────────────────────────────────────────────────────────
(Intercept)  28.5661       7.35452       3.88    0.0003  13.7533      43.3788
Pop15        -0.461193     0.144642     -3.19    0.0026  -0.752518    -0.169869
Pop75        -1.6915       1.0836       -1.56    0.1255  -3.87398      0.490983
DPI          -0.000336902  0.000931107  -0.36    0.7192  -0.00221225   0.00153844
DDPI          0.409695     0.196197      2.09    0.0425   0.0145336    0.804856
─────────────────────────────────────────────────────────────────────────────────
```

The `glm` function (or equivalently, `fit(GeneralizedLinearModel, ...)`) works similarly to the R `glm` function except that the `family` argument is replaced by a `Distribution` type and, optionally, a `Link` type. The first example from `?glm` in R is

```r
glm> ## Dobson (1990) Page 93: Randomized Controlled Trial : (slightly modified)
glm> counts <- c(18,17,15,20,10,21,25,13,13)

glm> outcome <- gl(3,1,9)

glm> treatment <- gl(3,3)

glm> print(d.AD <- data.frame(treatment, outcome, counts))
  treatment outcome counts
1         1       1     18
2         1       2     17
3         1       3     15
4         2       1     20
5         2       2     10
6         2       3     21
7         3       1     25
8         3       2     13
9         3       3     13

glm> glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())

glm> anova(glm.D93)
Analysis of Deviance Table

Response: counts

Terms added sequentially (first to last)

          Df Deviance Resid. Df Resid. Dev
NULL                          8    10.3928
outcome    2   5.2622         6     5.1307
treatment  2   0.0132         4     5.1175

glm> ## No test:
glm> summary(glm.D93)

Call:
glm(formula = counts ~ outcome + treatment, family = poisson())

Deviance Residuals:
      1        2        3        4        5        6        7        8        9
-0.6122   1.0131  -0.2819  -0.2498  -0.9784   1.0777   0.8162  -0.1155  -0.8811

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   3.0313     0.1712  17.711   <2e-16 ***
outcome2     -0.4543     0.2022  -2.247   0.0246 *
outcome3     -0.2513     0.1905  -1.319   0.1870
treatment2    0.0198     0.1990   0.100   0.9207
treatment3    0.0198     0.1990   0.100   0.9207
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 10.3928  on 8  degrees of freedom
Residual deviance:  5.1175  on 4  degrees of freedom
AIC: 56.877

Number of Fisher Scoring iterations: 4
```

In Julia this becomes

```julia
julia> using DataFrames, CategoricalArrays, GLM

julia> dobson = DataFrame(Counts    = [18.,17,15,20,10,21,25,13,13],
Outcome   = categorical([1,2,3,1,2,3,1,2,3]),
Treatment = categorical([1,1,1,2,2,2,3,3,3]))
9×3 DataFrame
Row │ Counts   Outcome  Treatment
│ Float64  Cat…     Cat…
─────┼─────────────────────────────
1 │    18.0  1        1
2 │    17.0  2        1
3 │    15.0  3        1
4 │    20.0  1        2
5 │    10.0  2        2
6 │    21.0  3        2
7 │    25.0  1        3
8 │    13.0  2        3
9 │    13.0  3        3

julia> gm1 = fit(GeneralizedLinearModel, @formula(Counts ~ Outcome + Treatment), dobson, Poisson())
StatsModels.TableRegressionModel{GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, Poisson{Float64}, LogLink}, GLM.DensePredChol{Float64, LinearAlgebra.Cholesky{Float64, Matrix{Float64}}}}, Matrix{Float64}}

Counts ~ 1 + Outcome + Treatment

Coefficients:
────────────────────────────────────────────────────────────────────────────
                 Coef.  Std. Error      z  Pr(>|z|)  Lower 95%   Upper 95%
────────────────────────────────────────────────────────────────────────────
(Intercept)    3.03128      0.171155  17.71    <1e-69   2.69582    3.36674
Outcome: 2    -0.454255     0.202171  -2.25    0.0246  -0.850503  -0.0580079
Outcome: 3    -0.251314     0.190476  -1.32    0.1870  -0.624641   0.122012
Treatment: 2   0.0198026    0.199017   0.10    0.9207  -0.370264   0.409869
Treatment: 3   0.0198026    0.199017   0.10    0.9207  -0.370264   0.409869
────────────────────────────────────────────────────────────────────────────

julia> round(deviance(gm1), digits=5)
5.11746

```

## Linear regression with PowerLink

In this example, we choose the best model from a set of λs, based on minimum BIC.

```julia
julia> using GLM, RDatasets, StatsBase, DataFrames, Optim

julia> trees = DataFrame(dataset("datasets", "trees"))
31×3 DataFrame
Row │ Girth    Height  Volume
│ Float64  Int64   Float64
─────┼──────────────────────────
1 │     8.3      70     10.3
2 │     8.6      65     10.3
3 │     8.8      63     10.2
4 │    10.5      72     16.4
5 │    10.7      81     18.8
6 │    10.8      83     19.7
7 │    11.0      66     15.6
8 │    11.0      75     18.2
⋮  │    ⋮       ⋮        ⋮
25 │    16.3      77     42.6
26 │    17.3      81     55.4
27 │    17.5      82     55.7
28 │    17.9      80     58.3
29 │    18.0      80     51.5
30 │    18.0      80     51.0
31 │    20.6      87     77.0
16 rows omitted

julia> bic_glm(λ) = bic(glm(@formula(Volume ~ Height + Girth), trees, Normal(), PowerLink(λ)));

julia> optimal_bic = optimize(bic_glm, -1.0, 1.0);

julia> round(optimal_bic.minimizer, digits = 5) # Optimal λ
0.40935

julia> glm(@formula(Volume ~ Height + Girth), trees, Normal(), PowerLink(optimal_bic.minimizer)) # Best model
StatsModels.TableRegressionModel{GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, Normal{Float64}, PowerLink}, GLM.DensePredChol{Float64, LinearAlgebra.Cholesky{Float64, Matrix{Float64}}}}, Matrix{Float64}}

Volume ~ 1 + Height + Girth

Coefficients:
────────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)   Lower 95%   Upper 95%
────────────────────────────────────────────────────────────────────────────
(Intercept)  -1.07586    0.352543    -3.05    0.0023  -1.76684    -0.384892
Height        0.0232172  0.00523331   4.44    <1e-05   0.0129601   0.0334743
Girth         0.242837   0.00922555  26.32    <1e-99   0.224756    0.260919
────────────────────────────────────────────────────────────────────────────

julia> round(optimal_bic.minimum, digits=5)
156.37638
```

Author: JuliaStats
Source code: https://github.com/JuliaStats/GLM.jl


## LMCLUS

A Julia package for linear manifold clustering.

## Installation

Prior to Julia v0.7.0

```julia
Pkg.clone("https://github.com/wildart/LMCLUS.jl.git")
```

For Julia v0.7.0/1.0.0

```
pkg> add https://github.com/wildart/LMCLUS.jl.git#0.4.0
```

For Julia 1.1+, add the BoffinStuff registry in the package manager before installing the package.

```
pkg> registry add https://github.com/wildart/BoffinStuff.git
```