LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data.
NOTE: This README is not updated as frequently as the documentation. Please check out the documentation linked below for the latest updates!
At its core, LlamaIndex contains a toolkit designed to easily connect LLMs with your external data. LlamaIndex helps to provide the following:
Each data structure offers distinct use cases and a variety of customizable parameters. These indices can then be queried in a general purpose manner, in order to achieve any task that you would typically achieve with an LLM:
Interested in contributing? See our Contribution Guide for more details.
Full documentation can be found here: https://gpt-index.readthedocs.io/en/latest/.
Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!
pip install llama-index
Examples are in the examples folder. Indices are in the indices folder (see the list of indices below).
To build a simple vector store index:
import os
os.environ["OPENAI_API_KEY"] = 'YOUR_OPENAI_API_KEY'
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = GPTSimpleVectorIndex(documents)
To save to and load from disk:
# save to disk
index.save_to_disk('index.json')
# load from disk
index = GPTSimpleVectorIndex.load_from_disk('index.json')
To query:
index.query("<question_text>?")
The main third-party package requirements are tiktoken, openai, and langchain.
All requirements should be contained within the setup.py file. To run the package locally without building the wheel, simply run pip install -r requirements.txt.
Reference to cite if you use LlamaIndex in a paper:
@software{Liu_LlamaIndex_2022,
author = {Liu, Jerry},
doi = {10.5281/zenodo.1234},
month = {11},
title = {{LlamaIndex}},
url = {https://github.com/jerryjliu/gpt_index},
year = {2022}
}
⚠️ NOTE: We are rebranding GPT Index as LlamaIndex! We will carry out this transition gradually.
2/25/2023: By default, our docs/notebooks/instructions now reference "LlamaIndex" instead of "GPT Index".
2/19/2023: By default, our docs/notebooks/instructions now use the llama-index package. However, the gpt-index package still exists as a duplicate!
2/16/2023: We have a duplicate llama-index pip package. Simply replace all imports of gpt_index with llama_index if you choose to pip install llama-index.
PyPi:
Documentation: https://gpt-index.readthedocs.io/en/latest/.
Twitter: https://twitter.com/gpt_index.
Discord: https://discord.gg/dGcwcsnxhU.
LlamaHub (community library of data loaders): https://llamahub.ai
Author: jerryjliu
Source Code: https://github.com/jerryjliu/gpt_index
License: MIT license
You can use the CREATE INDEX statement to create an index in Oracle. Here's the basic syntax:
CREATE INDEX index_name
ON table_name (column1, column2, ...);
In this syntax, index_name is the name of the index you want to create, and table_name is the name of the table on which you want to create the index. You can also specify one or more column names in parentheses to indicate which columns you want to include in the index.
For example, let's say you have a table called "employees" with columns "employee_id", "last_name", and "first_name", and you want to create an index on the "last_name" column. You can do this with the following SQL statement:
CREATE INDEX emp_last_name_idx
ON employees (last_name);
This will create an index called "emp_last_name_idx" on the "last_name" column of the "employees" table. You can then use this index to improve the performance of queries that filter or sort by the "last_name" column.
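For instance, a query like the following (an illustrative example, not from the original article) can use the new index rather than scanning the entire table:

SELECT employee_id, first_name, last_name
FROM employees
WHERE last_name = 'Smith';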
Original article source at: https://www.c-sharpcorner.com/
Hello Readers!! We are back with another interesting topic. While using Kibana, some of you may have faced the issue of unavailable values for index fields with dot notation. In this blog, we will see why this problem occurs and what we can do to resolve it.
If you are facing empty values in Kibana for index field names containing dots, the cause is Kibana's treatment of dots in field names: the data is present in Elasticsearch but does not show up in Kibana. Kibana uses dots as separators in field names, which can result in empty values in visualizations if the dots are not properly escaped. This happens because Elasticsearch and Kibana handle field names with dots differently.
Scripted fields in Kibana are calculated fields that are generated using a script. They are used to derive new values based on existing data and to manipulate the data before it is displayed in visualizations and dashboards. Scripted fields can be created using either Painless or Lucene expressions, and they can be used in conjunction with other fields to provide additional insights into your data.
Some common use cases for scripted fields include:
Since this issue occurs because Elasticsearch and Kibana handle field names with dots differently, there are a number of ways to solve it. Here we will use a Kibana scripted field. Follow these steps in Kibana:
Step 1: Go to the “Management” section in Kibana and select “Index Patterns.” Find the index pattern that contains the fields with dots and click on it.
This is my index pattern.
You can see here the index_field containing dot notations:
Step 2: Move to the “Scripted Fields” tab and after this click on “Add Scripted Field.”
Give the scripted field a name without dots, for example, "field_without_dots". Also select the type matching your field.
Step 3: In the script field, enter the following code:
return doc['field_with_dots'].value
Replace "field_with_dots" with the actual name of the field that contains dots.
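For example, if your index had a field named log.level (a hypothetical name used here for illustration), the script would read:

return doc['log.level'].value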
And now click “Create field” to save the scripted field. As you can see below my scripted field is created successfully.
Now, this scripted field can now be used in discover, visualizations, and dashboards, just like all other index fields.
Yes, we are all done now!! I hope this will help you somewhere.
Thank you for sticking to the end. In this blog, we learned how to fix the issue of unavailable values in Kibana for index fields with dot notation. I hope this blog helped you. Please share if you liked it, and reach out to me with any related queries.
HAPPY LEARNING!
Original article source at: https://blog.knoldus.com/
With MarkLogic being a document-oriented database, data is commonly stored in a JSON or XML document format.
If the data to bring into MarkLogic is not already structured in JSON or XML (for example, if it is currently in a relational database), there are various ways to export or transform it from the source.
For example, many relational databases provide an option to export relational data in XML or in JSON format, or a SQL script could be written to fetch the data from the database, outputting it in an XML or JSON structure. Or, using Marklogic rows from a .csv file can be imported as XML and JSON documents.
In any case, it is normal to first denormalize the data being exported from the relational database, to put the content back together in its original state. Denormalization, which naturally occurs when working with documents in their original form, greatly reduces the need for joins and accelerates performance.
A schema is a set of rules describing the structure of a database. Schemas help with data quality, since reliable, well-structured data is easier to trust and act on.
Schema-agnostic, by contrast, means the database is not bound by any schema, though it can be aware of one. Schemas are optional in MarkLogic, and data can be loaded in its original form. To address a group of documents within a database, you can use directories, collections, and the internal structure of the documents. This lets MarkLogic easily support data from disparate systems in the same database.
When loading data, it is best to have one document per entity. MarkLogic is most performant with many small documents rather than one large document. The target document size is 1KB to 100KB, but documents can be larger.
For example, rather than loading a bunch of students as one document, make each student a document.
When defining a document, use human-readable XML element and attribute names or JSON property names rather than generic ones. This convention helps keep indexes efficient.
<items>
<item>
<product> Mouse </product>
<price> 1000 </price>
<quantity> 3 </quantity>
</item>
<item>
<product> Keyboard </product>
<price> 2000 </price>
<quantity> 2 </quantity>
</item>
</items>
As documents are loaded, all the words in each document and the structure of each document, are indexed. So documents are easily searchable.
The document can be loaded into MarkLogic in many ways:
To read a document, the URI of the document is used.
XQuery Example: fn:doc("college/course-101.json")
JavaScript Example: fn.doc("account/order-202.json")
REST API Example: curl --anyauth --user admin:admin -X GET "http://localhost:8055/v1/documents?uri=/accounting/order-10072.json"
MLCP has the feature of splitting long XML documents, where each occurrence of a designated element becomes an individual XML document in the database. This is useful when multiple records are all contained within one large XML file, such as a list of students, courses, details, etc.
The -input_file_type aggregates option is used to split a large document into individual documents. The -aggregate_record_element option designates the element that begins a new document. The -uri_id option is used to create a URI for each document.
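A sketch of such an MLCP import call (host, port, credentials, file path, and the element and attribute names are placeholders):

mlcp.sh import -host localhost -port 8000 \
  -username admin -password admin \
  -input_file_path /data/students.xml \
  -input_file_type aggregates \
  -aggregate_record_element student \
  -uri_id id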
While it is fine to have a mix of XML and JSON documents in the same database, it is also possible to transform content from one format to the other. You can transform files by following the steps below.
xquery version "1.0-ml";
import module namespace json = "http://marklogic.com/xdmp/json" at "/MarkLogic/json/json.xqy";
json:transform-to-json(fn:doc("doc-01.xml"), json:config("custom"))
MarkLogic Content Pump (MLCP) can be used to import rows from a .csv file into a MarkLogic database. We are able to transform the data during the import process or afterward in the database. Ways to modify content once it is already in the database include the Data Movement SDK, XQuery, JavaScript, etc.
As we have seen, MarkLogic facilitates many things: loading data, indexing data, transforming data, and splitting data.
Original article source at: https://blog.knoldus.com/
In this article, we will learn the basics of the R programming language. R is a programming language developed by Ross Ihaka and Robert Gentleman in 1993. R possesses an extensive catalog of statistical and graphical methods, including machine learning algorithms, linear regression, time series, and statistical inference, to name a few. Most R libraries are written in R, but for heavy computational tasks C, C++, and Fortran code is preferred.
Data analysis with R is done in a series of steps: programming, transforming, discovering, modeling, and communicating the results.
In conclusion, R is the world's most widely used statistics programming language. It's the first choice of data scientists, supported by a vibrant and talented community of contributors. R is taught in universities and deployed in mission-critical business applications.
Windows Installation – We can download the Windows installer version of R from R-3.2.2 for Windows (32/64 bit).
As it is a Windows installer (.exe) with the name "R-version-win.exe", you can just double-click and run the installer, accepting the default settings. If your Windows is a 32-bit version, it installs the 32-bit version; if your Windows is 64-bit, it installs both the 32-bit and 64-bit versions.
After installation, you can locate the icon to run the program in a directory structure “R\R3.2.2\bin\i386\Rgui.exe” under the Windows Program Files. Clicking this icon brings up the R-GUI which is the R console to do R Programming.
R Programming is a very popular programming language that is broadly used in data analysis. The way in which we define its code is quite simple. The “Hello World!” is the basic program for all the languages, and now we will understand the syntax of R programming with the “Hello world” program. We can write our code either in the command prompt, or we can use an R script file.
Once you have the R environment set up, it's easy to start the R command prompt by typing the following command at your shell prompt −
$R
This will launch R interpreter and you will get a prompt > where you can start typing your program as follows −
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
Here the first statement defines a string variable myString, where we assign the string "Hello, World!", and then the next statement, print(), is used to print the value stored in the myString variable.
While programming in any language, you need variables to store information. Variables are nothing but reserved memory locations to store values; when you create a variable, you reserve some space in memory.
In contrast to other programming languages like C and Java, in R the variables are not declared as some data type. Instead, variables are assigned R objects, and the data type of the R object becomes the data type of the variable. There are many types of R objects. The frequently used ones are −
#create a vector and find the elements which are >5
v<-c(1,2,3,4,5,6,5,8)
v[v>5]
#subset
subset(v,v>5)
#position in the vector created in which square of the numbers of v is >10 holds good
which(v*v>10)
#to know the values
v[v*v>10]
Output: [1] 6 8
Output: [1] 6 8
Output: [1] 4 5 6 7 8
Output: [1] 4 5 6 5 8
A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the matrix function.
#matrices: a vector with two dimensional attributes
mat<-matrix(c(1,2,3,4))
mat1<-matrix(c(1,2,3,4),nrow=2)
mat1
Output:
     [,1] [,2]
[1,]    1    3
[2,]    2    4
mat2<-matrix(c(1,2,3,4),ncol=2,byrow=T)
mat2
Output:
     [,1] [,2]
[1,]    1    2
[2,]    3    4
mat3<-matrix(c(1,2,3,4),byrow=T)
mat3
#transpose of matrix
mattrans<-t(mat)
mattrans
#create a character matrix called fruits with elements apple, orange, pear, grapes
fruits<-matrix(c("apple","orange","pear","grapes"),2)
#create 3×4 matrix of marks obtained in each quarterly exams for 4 different subjects
X<-matrix(c(50,70,40,90,60, 80,50, 90,100, 50,30, 70),nrow=3)
X
#give row names and column names
rownames(X)<-paste(prefix="Test.",1:3)
subs<-c("Maths", "English", "Science", "History")
colnames(X)<-subs
X
Output:
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4

Output:
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4

Output:
     [,1] [,2] [,3] [,4]
[1,]   50   90   50   50
[2,]   70   60   90   30
[3,]   40   80  100   70

Output:
        Maths English Science History
Test. 1    50      90      50      50
Test. 2    70      60      90      30
Test. 3    40      80     100      70
While matrices are confined to two dimensions, arrays can be of any number of dimensions. The array function takes a dim attribute which creates the required number of dimensions. In the example below we create an array with two elements, each a 3×4 matrix.
#Arrays
arr<-array(1:24,dim=c(3,4,2))
arr
#create an array using alphabets with dimensions 3 rows, 2 columns and 3 arrays
arr1<-array(letters[1:18],dim=c(3,2,3))
#select only 1st two matrix of an array
arr1[,,c(1:2)]
#LIST
X<-list(u=2, n='abc')
X
X$u
Data frames are tabular data objects. Unlike a matrix, in a data frame each column can contain different modes of data: the first column can be numeric while the second column can be character and the third column can be logical. A data frame is a list of vectors of equal length.
#Dataframes
students<-c("J","L","M","K","I","F","R","S")
Subjects<-rep(c("science","maths"),each=2)
marks<-c(55,70,66,85,88,90,56,78)
data<-data.frame(students,Subjects,marks)
#Accessing dataframes
data[[1]]
data$Subjects
data[,1]
Output:
[1] J L M K I F R S
Levels: F I J K L M R S

Output: data$Subjects
[1] science science maths maths science science maths maths
Levels: maths science
Factors are the r-objects which are created using a vector. It stores the vector along with the distinct values of the elements in the vector as labels. The labels are always character irrespective of whether it is numeric or character or Boolean etc. in the input vector. They are useful in statistical modeling.
Factors are created using the factor() function. The nlevels function gives the count of levels.
#Factors
x<-c(1,2,3)
factor(x)
#apply function
data1<-data.frame(age=c(55,34,42,66,77),bmi=c(26,25,21,30,22))
d<-apply(data1,2,mean)
d
#create two vectors age and gender and find mean age with respect to gender
age<-c(33,34,55,54)
gender<-factor(c("m","f","m","f"))
tapply(age,gender,mean)
Output:
[1] 1 2 3
Levels: 1 2 3

Output:
 age  bmi
54.8 24.8

Output:
 f  m
44 44
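The nlevels() function mentioned above does not appear in the code; a minimal example:

#count the distinct levels of a factor
gender <- factor(c("m", "f", "m", "f"))
nlevels(gender)
Output: [1] 2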
A variable provides us with named storage that our programs can manipulate. A variable in R can store an atomic vector, a group of atomic vectors, or a combination of many R objects. A valid variable name consists of letters, numbers, and the dot or underscore characters.
Valid: total, sum, .fine.with.dot, this_is_acceptable, Number5
Invalid: tot@l, 5um, _fine, TRUE, .0ne
Earlier versions of R used the underscore (_) as an assignment operator, so the period (.) was used extensively in variable names having multiple words. Current versions of R support the underscore as a valid identifier, but it is good practice to use a period as a word separator.
For example, a.variable.name is preferred over a_variable_name or alternatively we could use camel case as aVariableName.
Constants, as the name suggests, are entities whose value cannot be altered. Basic types of constant are numeric constants and character constants.
Numeric Constants
All numbers fall under this category. They can be of type integer, double or complex. It can be checked with the typeof() function.
Numeric Constants followed by L are regarded as integers and those followed by i are regarded as complex.
> typeof(5)
[1] "double"
> typeof(5L)
[1] "integer"
> typeof(5i)
[1] "complex"
Character Constants
Character constants can be represented using either single quotes (‘) or double quotes (“) as delimiters.
> 'example'
> typeof("5")
[1] "example" [1] "character"
R provides Arithmetic, Relational, Logical, Assignment, and some Miscellaneous operators.
There are four main categories of Operators in the R programming language.
x <- 35
y<-10
> x+y
[1] 45
> x-y
[1] 25
> x*y
[1] 350
> x/y
[1] 3.5
> x%/%y
[1] 3
> x%%y
[1] 5
> x^y
[1] 2.75e+15
Below are the logical operators in R. The operators & and | perform element-wise operations, producing a result having the length of the longer operand, while && and || examine only the first element of each operand, resulting in a single-length logical vector.
a <- c(TRUE,TRUE,FALSE,0,6,7)
b <- c(FALSE,TRUE,FALSE,TRUE,TRUE,TRUE)
> a&b
[1] FALSE  TRUE FALSE FALSE  TRUE  TRUE
> a&&b
[1] FALSE
> a|b
[1] TRUE TRUE FALSE TRUE TRUE TRUE
> a||b
[1] TRUE
> !a
[1] FALSE FALSE TRUE TRUE FALSE FALSE
> !b
[1] TRUE FALSE TRUE FALSE FALSE FALSE
Functions are defined using the function() directive and are stored as R objects just like anything else; in particular, they are R objects of class "function". Here's a simple function that takes no arguments and simply prints "Hi statistics".
#define the function
f <- function() {
print("Hi statistics!!!")
}
#Call the function
f()
Output: [1] "Hi statistics!!!"
Now let’s define a function called standardize, and the function has a single argument x which is used in the body of a function.
#Define the function that will calculate standardized score.
standardize = function(x) {
m = mean(x)
sd = sd(x)
result = (x - m) / sd
result
}
input<- c(40:50) #Take input for what we want to calculate a standardized score.
standardize(input) #Call the function
Output:
> standardize(input)   #Call the function
 [1] -1.5075567 -1.2060454 -0.9045340 -0.6030227 -0.3015113  0.0000000  0.3015113  0.6030227  0.9045340  1.2060454  1.5075567
R has some very useful functions which implement looping in a compact form to make life easier. This rich and powerful family of apply functions consists of intrinsically vectorized functions that let you apply a function to a series of objects (e.g. vectors, matrices, data frames, or files). They include apply(), lapply(), sapply(), vapply(), mapply(), and tapply().
There is another function called split() which is also useful, particularly in conjunction with lapply(); see the short example below.
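As a short illustration (not from the original article) of split() working together with lapply():

#split mpg values of the built-in mtcars data set by cylinder count,
#then compute the mean of each group
s <- split(mtcars$mpg, mtcars$cyl)
lapply(s, mean)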
A vector is a sequence of data elements of the same basic type. Members in a vector are officially called components. Vectors are the most basic R data objects and there are six types of atomic vectors. They are logical, integer, double, complex, character, and raw.
The c() function can be used to create vectors of objects by concatenating things together.
x <- c(1,2,3,4,5) #double
x #If you use only x auto-printing occurs
l <- c(TRUE, FALSE) #logical
l <- c(T, F) ## logical
c <- c("a", "b", "c", "d") ## character
i <- 1:20 ## integer
cm <- c(2+2i, 3+3i) ## complex
print(l)
print(c)
print(i)
print(cm)
You can see the type of each vector using typeof() function in R.
typeof(x)
typeof(l)
typeof(c)
typeof(i)
typeof(cm)
Output:
> print(l)
[1]  TRUE FALSE
> print(c)
[1] "a" "b" "c" "d"
> print(i)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
> print(cm)
[1] 2+2i 3+3i

Output:
> typeof(x)
[1] "double"
> typeof(l)
[1] "logical"
> typeof(c)
[1] "character"
> typeof(i)
[1] "integer"
> typeof(cm)
[1] "complex"
We can use the seq() function to create a vector within an interval by specifying step size or specifying the length of the vector.
seq(1:10) #By default it will be incremented by 1
seq(1, 20, length.out=5) # specify length of the vector
seq(1, 20, by=2) # specify step size
Output:
> seq(1:10)   #By default it will be incremented by 1
 [1]  1  2  3  4  5  6  7  8  9 10
> seq(1, 20, length.out=5)   # specify length of the vector
[1]  1.00  5.75 10.50 15.25 20.00
> seq(1, 20, by=2)   # specify step size
 [1]  1  3  5  7  9 11 13 15 17 19
Elements of a vector can be accessed using indexing. The vector indexing can be logical, integer, or character. The [ ] brackets are used for indexing. Indexing starts with position 1, unlike most programming languages where indexing starts from 0.
We can use integers as an index to access specific elements. We can also use negative integers to return all elements except that specific element.
x<- 101:110
x[1] #access the first element
x[c(2,3,4,5)] #Extract 2nd, 3rd, 4th, and 5th elements
x[5:10] #Extract all elements from 5th to 10th
x[c(-5,-10)] #Extract all elements except 5th and 10th
x[-c(5:10)] #Extract all elements except from 5th to 10th
Output:
> x[1]   #Extract the first element
[1] 101
> x[c(2,3,4,5)]   #Extract 2nd, 3rd, 4th, and 5th elements
[1] 102 103 104 105
> x[5:10]   #Extract all elements from 5th to 10th
[1] 105 106 107 108 109 110
> x[c(-5,-10)]   #Extract all elements except 5th and 10th
[1] 101 102 103 104 106 107 108 109
> x[-c(5:10)]   #Extract all elements except from 5th to 10th
[1] 101 102 103 104
If you use a logical vector for indexing, the position where the logical vector is TRUE will be returned.
x[x < 105]
x[x>=104]
Output:
> x[x < 105]
[1] 101 102 103 104
> x[x >= 104]
[1] 104 105 106 107 108 109 110
We can modify a vector and assign a new value to it. You can truncate a vector by using reassignments. Check the below example.
x<- 10:12
x[1]<- 101 #Modify the first element
x
x[2]<-102 #Modify the 2nd element
x
x<- x[1:2] #Truncate the last element
x
Output:
> x
[1] 101  11  12
> x[2]<-102   #Modify the 2nd element
> x
[1] 101 102  12
> x<- x[1:2]   #Truncate the last element
> x
[1] 101 102
We can use arithmetic operations on two vectors of the same length. They can be added, subtracted, multiplied, or divided. Check the output of the below code.
# Create two vectors.
v1 <- c(1:10)
v2 <- c(101:110)
# Vector addition.
add.result <- v1+v2
print(add.result)
# Vector subtraction.
sub.result <- v2-v1
print(sub.result)
# Vector multiplication.
multi.result <- v1*v2
print(multi.result)
# Vector division.
divi.result <- v2/v1
print(divi.result)
Output:
> print(add.result)
 [1] 102 104 106 108 110 112 114 116 118 120
> print(sub.result)
 [1] 100 100 100 100 100 100 100 100 100 100
> print(multi.result)
 [1] 101 204 309 416 525 636 749 864 981 1100
> print(divi.result)
 [1] 101.00000  51.00000  34.33333  26.00000  21.00000  17.66667  15.28571  13.50000  12.11111  11.00000
The minimum and the maximum of a vector can be found using the min() or the max() function. range() is also available which returns the minimum and maximum in a vector.
x<- 1001:1010
max(x) # Find the maximum
min(x) # Find the minimum
range(x) #Find the range
Output:
> max(x)   # Find the maximum
[1] 1010
> min(x)   # Find the minimum
[1] 1001
> range(x)   # Find the range
[1] 1001 1010
A list is a data structure having elements of mixed data types. A vector having all elements of the same type is called an atomic vector, but a vector having elements of different types is called a list.
We can check the type with the typeof() or class() function and find the length using the length() function.
x <- list("stat",5.1, TRUE, 1 + 4i)
x
class(x)
typeof(x)
length(x)
Output:
> x
[[1]]
[1] "stat"

[[2]]
[1] 5.1

[[3]]
[1] TRUE

[[4]]
[1] 1+4i

> class(x)
[1] "list"
> typeof(x)
[1] "list"
> length(x)
[1] 4
You can create an empty list of a prespecified length with the vector() function.
x <- vector("list", length = 10)
x
Output: x [[1]] NULL [[2]] NULL [[3]] NULL [[4]] NULL [[5]] NULL [[6]] NULL [[7]] NULL [[8]] NULL [[9]] NULL [[10]] NULL
Lists can be subset using two syntaxes: the $ operator and square brackets []. The $ operator returns a named element of a list. The [] syntax returns a list, while [[]] returns an element of a list. (The example list l is defined just below.)
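The list l used in the following examples is not defined in this excerpt; a definition consistent with the printed output below (reconstructed as an assumption) would be:

#two unnamed elements, a string, a function d, and a 10x10 identity matrix e
l <- list(1:4, FALSE, "Hello Statistics!",
          d = function(arg = 42) print("Hello World!"),
          e = diag(10))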
# subsetting
l$e
l["e"]
l[1:2]
l[c(1:2)] #index using integer vector
l[-c(3:length(l))] #negative index to exclude elements from 3rd up to last.
l[c(T,F,F,F,F)] # logical index to access elements
Output:
> l$e
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1    0    0    0    0    0    0    0    0     0
 [2,]    0    1    0    0    0    0    0    0    0     0
 [3,]    0    0    1    0    0    0    0    0    0     0
 [4,]    0    0    0    1    0    0    0    0    0     0
 [5,]    0    0    0    0    1    0    0    0    0     0
 [6,]    0    0    0    0    0    1    0    0    0     0
 [7,]    0    0    0    0    0    0    1    0    0     0
 [8,]    0    0    0    0    0    0    0    1    0     0
 [9,]    0    0    0    0    0    0    0    0    1     0
[10,]    0    0    0    0    0    0    0    0    0     1

> l["e"]
$e
(the same 10×10 identity matrix as above)

> l[1:2]
[[1]]
[1] 1 2 3 4

[[2]]
[1] FALSE

> l[c(1:2)]   #index using integer vector
[[1]]
[1] 1 2 3 4

[[2]]
[1] FALSE

> l[-c(3:length(l))]   #negative index to exclude elements from 3rd up to last
[[1]]
[1] 1 2 3 4

[[2]]
[1] FALSE

> l[c(T,F,F,F,F)]   #logical index to access elements
[[1]]
[1] 1 2 3 4
We can change components of a list through reassignment.
l[["name"]] <- "Kalyan Nandi"
l
Output:
[[1]]
[1] 1 2 3 4

[[2]]
[1] FALSE

[[3]]
[1] "Hello Statistics!"

$d
function (arg = 42) { print("Hello World!") }

$name
[1] "Kalyan Nandi"
In R programming, a matrix is a two-dimensional data structure whose elements all have the same atomic type. A matrix can be created using the matrix() function, and R can also be used for matrix calculations. Matrices have rows and columns containing a single data type, and in a matrix the order of rows and columns is important. Dimensions can be checked directly with the dim() function, and all attributes of an object can be checked with the attributes() function. Check the example below.
Creating a matrix in R
m <- matrix(nrow = 2, ncol = 3)
dim(m)
attributes(m)
m <- matrix(1:20, nrow = 4, ncol = 5)
m
Output:
> dim(m)
[1] 2 3
> attributes(m)
$dim
[1] 2 3

> m <- matrix(1:20, nrow = 4, ncol = 5)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20
Matrices can be created by column-binding or row-binding with the cbind() and rbind() functions.
x<-1:3
y<-10:12
z<-30:32
cbind(x,y,z)
rbind(x,y,z)
Output:
> cbind(x,y,z)
     x  y  z
[1,] 1 10 30
[2,] 2 11 31
[3,] 3 12 32
> rbind(x,y,z)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
z   30   31   32
By default, the matrix function reorders a vector into columns, but we can also tell R to use rows instead.
x <-1:9
matrix(x, nrow = 3, ncol = 3)
matrix(x, nrow = 3, ncol = 3, byrow = TRUE)
Output:
> matrix(x, nrow = 3, ncol = 3)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> matrix(x, nrow = 3, ncol = 3, byrow = TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
In R, arrays are the data type that can store data in more than two dimensions. An array can be created using the array() function, which takes vectors as input and uses the values in the dim parameter to create an array. If you create an array of dimensions (2, 3, 4), it creates 4 rectangular matrices, each with 2 rows and 3 columns. Arrays can store only one data type.
We can give names to the rows, columns, and matrices in the array by setting the dimnames parameter.
v1 <- c(1,2,3)
v2 <- 100:110
col.names <- c("Col1","Col2","Col3","Col4","Col5","Col6","Col7")
row.names <- c("Row1","Row2")
matrix.names <- c("Matrix1","Matrix2")
arr4 <- array(c(v1,v2), dim=c(2,7,2), dimnames = list(row.names,col.names, matrix.names))
arr4
Output:
, , Matrix1

     Col1 Col2 Col3 Col4 Col5 Col6 Col7
Row1    1    3  101  103  105  107  109
Row2    2  100  102  104  106  108  110

, , Matrix2

     Col1 Col2 Col3 Col4 Col5 Col6 Col7
Row1    1    3  101  103  105  107  109
Row2    2  100  102  104  106  108  110
# Print the 2nd row of the 1st matrix of the array.
print(arr4[2,,1])
# Print the element in the 2nd row and 4th column of the 2nd matrix.
print(arr4[2,4,2])
# Print the 2nd Matrix.
print(arr4[,,2])
Output:
> print(arr4[2,,1])
Col1 Col2 Col3 Col4 Col5 Col6 Col7
   2  100  102  104  106  108  110

> # Print the element in the 2nd row and 4th column of the 2nd matrix.
> print(arr4[2,4,2])
[1] 104

> # Print the 2nd Matrix.
> print(arr4[,,2])
     Col1 Col2 Col3 Col4 Col5 Col6 Col7
Row1    1    3  101  103  105  107  109
Row2    2  100  102  104  106  108  110
Factors are used to represent categorical data and can be unordered or ordered. An example might be “Male” and “Female” if we consider gender. Factor objects can be created with the factor() function.
x <- factor(c("male", "female", "male", "male", "female"))
x
table(x)
Output:
> x
[1] male   female male   male   female
Levels: female male
> table(x)
x
female   male
     2      3
By default, Levels are put in alphabetical order. If you print the above code you will get levels as female and male. But if you want to get your levels in a particular order then set levels parameter like this.
x <- factor(c("male", "female", "male", "male", "female"), levels=c("male", "female"))
x
table(x)
Output:
> x
[1] male   female male   male   female
Levels: male female
> table(x)
x
  male female
     3      2
Data frames are used to store tabular data in R. They are an important type of object in R and are used in a variety of statistical modeling applications. Data frames are represented as a special type of list where every element of the list has to have the same length. Each element of the list can be thought of as a column and the length of each element of the list is the number of rows. Unlike matrices, data frames can store different classes of objects in each column. Matrices must have every element be the same class (e.g. all integers or all numeric).
Data frames can be created explicitly with the data.frame() function.
employee <- c('Ram','Sham','Jadu')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2016-11-1','2015-3-25','2017-3-14'))
employ_data <- data.frame(employee, salary, startdate)
employ_data
View(employ_data)
Output:
> employ_data
  employee salary  startdate
1      Ram  21000 2016-11-01
2     Sham  23400 2015-03-25
3     Jadu  26800 2017-03-14
If you look at the structure of the data frame now, you will see that the variable employee is not a character vector but a factor, as shown in the following output:
str(employ_data)
Output:
> str(employ_data)
'data.frame':   3 obs. of  3 variables:
 $ employee : Factor w/ 3 levels "Jadu","Ram","Sham": 2 3 1
 $ salary   : num  21000 23400 26800
 $ startdate: Date, format: "2016-11-01" "2015-03-25" "2017-03-14"
Note that the first column, employee, is of type factor, instead of a character vector. By default, data.frame() function converts character vector into factor. To suppress this behavior, we can pass the argument stringsAsFactors=FALSE.
employ_data <- data.frame(employee, salary, startdate, stringsAsFactors = FALSE)
str(employ_data)
Output:
'data.frame':   3 obs. of  3 variables:
 $ employee : chr  "Ram" "Sham" "Jadu"
 $ salary   : num  21000 23400 26800
 $ startdate: Date, format: "2016-11-01" "2015-03-25" "2017-03-14"
The primary location for obtaining R packages is CRAN.
You can obtain information about the available packages on CRAN with the available.packages() function.
a <- available.packages()
head(rownames(a), 30) # Show the names of the first 30 packages
Packages can be installed with the install.packages() function in R. To install a single package, pass the name of the package to the install.packages() function as the first argument.
The following code installs the ggplot2 package from CRAN.
install.packages("ggplot2")
You can install multiple R packages at once with a single call to install.packages(). Place the names of the R packages in a character vector.
install.packages(c("caret", "ggplot2", "dplyr"))
Loading packages
Installing a package does not make it immediately available to you in R; you must load the package. The library() function is used to load packages into R. The following code is used to load the ggplot2 package into R. Do not put the package name in quotes.
library(ggplot2)
If you have installed your packages without root access using the command install.packages("ggplot2", lib="/data/Rpackages/"), then to load them, use the command below.
library(ggplot2, lib.loc="/data/Rpackages/")
After loading a package, the functions exported by that package will be attached to the top of the search() list (after the workspace).
library(ggplot2)
search()
In R, we can read data from files stored outside the R environment. We can also write data into files that will be stored and accessed by the operating system. R can read and write into various file formats like CSV, Excel, XML, etc.
We can check which directory the R workspace is pointing to using the getwd() function. You can also set a new working directory using the setwd() function.
# Get and print current working directory.
print(getwd())
# Set current working directory.
setwd("/web/com")
# Get and print current working directory.
print(getwd())
Output:
[1] "/web/com/1441086124_2016"
[1] "/web/com"
The CSV file is a text file in which the values in the columns are separated by a comma. Let's consider the following data present in the file named input.csv.
You can create this file using Windows Notepad by copying and pasting this data. Save the file as input.csv using the Save As option with file type All files (*.*) in Notepad.
Following is a simple example of read.csv() function to read a CSV file available in your current working directory −
data <- read.csv("input.csv")
print(data)
id, name, salary, start_date, dept
Pie charts are created with the function pie(x, labels=), where x is a non-negative numeric vector indicating the area of each slice and labels= is a character vector of names for the slices.
The basic syntax for creating a pie-chart using the R is −
pie(x, labels, radius, main, col, clockwise)
Following is the description of the parameters used −
# Simple Pie Chart
slices <- c(10, 12,4, 16, 8)
lbls <- c("US", "UK", "Australia", "Germany", "France")
pie(slices, labels = lbls, main="Pie Chart of Countries")
3-D pie chart
The pie3D( ) function in the plotrix package provides 3D exploded pie charts.
# 3D Exploded Pie Chart
library(plotrix)
slices <- c(10, 12, 4, 16, 8)
lbls <- c("US", "UK", "Australia", "Germany", "France")
pie3D(slices,labels=lbls,explode=0.1,
main="Pie Chart of Countries ")
A bar chart represents data in rectangular bars, with the length of each bar proportional to the value of the variable. R uses the function barplot() to create bar charts and can draw both vertical and horizontal bars. Each of the bars can be given a different color.
Let us suppose, we have a vector of maximum temperatures (in degree Celsius) for seven days as follows.
max.temp <- c(22, 27, 26, 24, 23, 26, 28)
barplot(max.temp)
Some frequently used arguments are "main" to give the title, "xlab" and "ylab" to provide labels for the axes, names.arg for naming each bar, "col" to define color, etc.
We can also plot bars horizontally by providing the argument horiz=TRUE.
# barchart with added parameters
barplot(max.temp,
main = "Maximum Temperatures in a Week",
xlab = "Degree Celsius",
ylab = "Day",
names.arg = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
col = "darkred",
horiz = TRUE)
Simply doing barplot(age) will not give us the required plot. It will plot 10 bars with height equal to the student’s age. But we want to know the number of students in each age category.
This count can be quickly found using the table() function, as shown below.
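The age vector itself is not shown in this excerpt; one definition consistent with the counts below (an assumption) is:

#ages of 10 students
age <- c(16, 17, 17, 18, 18, 18, 18, 18, 18, 19)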
> table(age)
age
16 17 18 19
1 2 6 1
Now plotting this data will give our required bar plot. Note below, that we define the argument “density” to shade the bars.
barplot(table(age),
main="Age Count of 10 Students",
xlab="Age",
ylab="Count",
border="red",
col="blue",
density=10
)
A histogram represents the frequencies of values of a variable bucketed into ranges. A histogram is similar to a bar chart, but it groups values into continuous ranges; each bar represents the count of values present in that range.
R creates histograms using the hist() function, which takes a vector as input and uses some more parameters to plot histograms.
The basic syntax for creating a histogram using R is −
hist(v,main,xlab,xlim,ylim,breaks,col,border)
Following is the description of the parameters used −
A simple histogram is created using input vector, label, col, and border parameters.
The script given below will create and save the histogram in the current R working directory.
# Create data for the graph.
v <- c(9,13,21,8,36,22,12,41,31,33,19)
# Give the chart file a name.
png(file = "histogram.png")
# Create the histogram.
hist(v,xlab = "Weight",col = "yellow",border = "blue")
# Save the file.
dev.off()
To specify the range of values allowed in X axis and Y axis, we can use the xlim and ylim parameters.
The width of each bar can be decided by using breaks.
# Create data for the graph.
v <- c(9,13,21,8,36,22,12,41,31,33,19)
# Give the chart file a name.
png(file = "histogram_lim_breaks.png")
# Create the histogram.
hist(v,xlab = "Weight",col = "green",border = "red", xlim = c(0,40), ylim = c(0,5),
breaks = 5)
# Save the file.
dev.off()
The debate around data analytics tools has been going on forever. Each time a new one comes out, comparisons follow. Although many aspects of the tools remain subjective, beginners want to know which tool is better to start with.
The most popular and widely used tools for data analytics are R and SAS. Both of them have been around for a long time and are often pitted against each other. So, let’s compare them based on the most relevant factors.
Final Verdict
As per estimates by the Economic Times, the analytics industry in India will grow to $16 billion by 2025. If you wish to venture into this domain, there can't be a better time. Just start learning the tool you think is better based on the comparison points above.
Original article source at: https://www.mygreatlearning.com
The undefined index notice in PHP appears when you try to access an array variable with a key that doesn't exist.
For example, suppose you have an associative array named $user with the following values:
<?php
$user = [
"name" => "Nathan",
"age" => 28,
"hobby" => "programming",
];
Suppose you try to access the $user variable with the key user_id.
Because the $user variable doesn't have a user_id key, PHP will respond with the undefined index notice:
<?php
print $user["user_id"];
The code above will produce the following output:
Notice: Undefined index: user_id in ... on line ...
The notice above means PHP doesn't know what you mean by the user_id index in the code.
To solve this issue, you need to make sure that the array key exists by calling the isset() function:
<?php
$user = [
"name" => "Nathan",
"age" => 28,
"hobby" => "programming",
];
if (isset($user["user_id"])) {
print $user["user_id"];
} else {
print "user_id does not exists";
}
A fellow once asked me, "isn't it enough to put the variable inside an if statement without isset()?"
Without the isset() function, PHP will still emit the "undefined index" notice.
You need both the isset() function and the if statement to remove the notice.
This issue frequently appears when you are accessing data from the $_POST or $_GET variable.
The solution also works for these global variables:
// 👇 check if the name variable exists in $_POST
if (isset($_POST["name"])) {
print $_POST["name"];
}
// 👇 check if the query variable exists in $_GET
if (isset($_GET["query"])) {
print $_GET["query"];
}
If you are assigning the value to a variable, you can use the ternary operator to assign a default value to that variable.
Consider the example below:
// assign name from $_POST into $name
// otherwise, put User0 to $name
$name = isset($_POST["name"]) ? $_POST["name"] : "User0";
The ternary operator allows you to write shorter code for the if..else check.
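On PHP 7 and later, the null coalescing operator performs the same isset() check and gives an even shorter equivalent (a supplementary example, not from the original article):

<?php
// same behavior as the ternary above: no notice when "name" is missing
$name = $_POST["name"] ?? "User0";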
Now you’ve learned how to fix the unidentified index notice in PHP. Good work! 👍
Original article source at: https://sebhastian.com/
This package implements two-based indexing in Julia. Two-based indexing affects only your code. Functions from other packages/modules will still function properly, but when you index into the arrays they return, the indices will start at 2 instead of 1. This makes it easy to gradually transition your codebase from obsolete one-based indexing to proper two-based indexing.
julia> using TwoBasedIndexing
julia> twobased() # enable two-based indexing in current module
julia> x = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> for i = 2:4 println(x[i]) end
1
2
3
julia> x[2] = 2
2
julia> x
3-element Array{Int64,1}:
2
2
3
Author: Simonster
Source Code: https://github.com/simonster/TwoBasedIndexing.jl
License: View license
Tools to work with hemispherical pictures for the determination of Leaf Area Index (LAI).
Quick introduction
Install the package through
Pkg.clone("https://github.com/ETC-UA/LeafAreaIndex.jl")
The basic type used by this package is a PolarImage. You construct a PolarImage from a CameraLens type and an Image (or in general, an AbstractMatrix). Note that for LAI calculations typically only the blue channel of the image is used.
You can load the image eg. with the Images package:
using Images
img = imread("image.jpg")
imgblue = blue(img) #take the blue channel
or, in case you have the raw image from the camera, we provide a more accurate, dedicated function to extract the pixels from the blue channel (using dcraw under the hood):
using LeafAreaIndex
imgblue = rawblueread("image.NEF")
Because the mapping of pixels on the image to coordinates in the scene depends on your camera setup, you must construct a configuration object with this information. A CameraLens type is constructed given an image size, the coordinates of the lens center, and the (inverse) projection function. The projection function maps the polar distance ρ [in pixels] on the image to the zenith angle θ [in radians] of the scene and is usually not linear. This projection function depends on the specific (fish-eye) lens used and is usually approximated by a polynomial up to 2nd order as f(ρ/ρmax) = a₁θ + a₂θ², with ρmax the maximum visible radius. More generally, you can submit a vector A with the polynomial coefficients. The maximum radius ρmax and the lens center depend on the combination of camera and lens (and the image size obviously depends on the camera).
using LeafAreaIndex
mycameralens = CameraLens( (height, width), (centeri, centerj), ρmax, A)
The basic PolarImage type is then constructed:
polarimg = PolarImage(imgblue, mycameralens)
The first processing step is automatic thresholding (default method Ridler Calvard):
thresh = threshold(polarimg)
In the second step the (effective) LAI is estimated through the inversion model. The default method assumes an ellipsoidal leaf angle distribution and uses a non-linear optimization method.
LAIe = inverse(polarimg, thresh)
Finally, the clumping factor can be estimated with the method of Lang & Xiang (default with 45ᵒ segments in full view angle):
clump = langxiang(polarimg, thresh)
With clumping correction we obtain LAI = LAIe / clump.
For images taken (always vertically upwards) on a domain with a slope of e.g. 10ᵒ sloping downward to the East, you must include this information in your PolarImage with the SlopeParams(inclination, direction) function:
myslope = SlopeParams(10/180*pi, pi/2)
polarimg = PolarImage(imgblue, mycameralens, myslope)
For downward-taken (crop) images, create a mask to cut out the photographer's shoes and use the RedMax() method instead of thresholding to separate soil from (green) plant material:
mymask = MaskParams(pi/3, -2*pi/3, -pi/3)
polarimg = PolarImage(imgblue, mycameralens, mymask)
LAIe = inverse(polarimg, RedMax())
Besides the default Ridler Calvard method, two more automatic thresholding methods, Edge Detection and the Minimum algorithm, can be used:
thresh = threshold(polarimg, RidlerCalvard())
thresh2 = threshold(polarimg, EdgeDetection())
thresh3 = threshold(polarimg, MinimumThreshold())
Further LAI estimation methods for the inversion model are available:
* The EllipsLUT method also assumes an ellipsoidal leaf angle distribution, but uses a Lookup Table approach instead of an optimization approach.
* The Zenith57 method uses a ring around the view angle of 57ᵒ (1 rad) where the ALIA influence is minimal.
* The Miller method integrates several zenith rings assuming a constant leaf angle.
* The Lang method uses a first order regression on the Miller method.
LAI = inverse(polarimg, thresh, EllipsOpt())
LAI2 = inverse(polarimg, thresh, EllipsLUT())
LAI3 = inverse(polarimg, thresh, Zenith57())
LAI4 = inverse(polarimg, thresh, Miller())
LAI5 = inverse(polarimg, thresh, Lang())
For the clumping factor, besides the method from Lang & Xiang, the (experimental) method from Chen & Cihlar is also available:
clump2 = chencihlar(polarimg, thresh, 0, pi/2)
Under the hood, several lower-level methods are used to access pixels and calculate gap fractions. We suggest looking at the code for their definition and usage.
To access the pixels in a particular zenith range, pixels(polarimg, pi/6, pi/3) will quickly return a vector with pixels, sorted by increasing ρ (and then by polar angle ϕ for identical ρ). A shortcut pixels(polarimg) is translated to pixels(polarimg, 0, pi/2).
The segments function can further split these ring pixels into n segments (e.g. for clumping calculations). It returns a vector with n elements, each again a vector with the segment pixels; a sketch of a call follows below.
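A sketch of such a call (the exact signature may differ; as noted above, check the source for the definitive usage):

# hypothetical call: split the ring between view angles pi/6 and pi/3 into 8 segments
segs = segments(polarimg, pi/6, pi/3, 8)
length(segs)   # expected: 8, one vector of pixels per segment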
For the gapfraction, we suggest (see online documentation) to use the contact frequencies $K(\theta_V) = -\ln[T(\theta_v)] \cos\theta_V$ for LAI inversion calculations, with $T$ the gapfraction and $\theta_V$ the view angle. The input N determines the number of rings between view angles θ1 and θ2 for a polar image with a certain threshold. The function returns a vector with angle edges of the rings, the weighted average midpoint angle for each ring and the contact frequency for each ring.
θedges, θmid, K = contactfreqs(polimg, θ1, θ2, N, thresh)
In case of problems or suggestion, don't hesitate to submit an issue through the issue tracker or code suggestions through a pull request.
View the full documentation at https://etc-ua.github.io/LeafAreaIndex.jl.
Author: ETC-UA
Source Code: https://github.com/ETC-UA/LeafAreaIndex.jl
License: View license
Shoulda Matchers provides RSpec- and Minitest-compatible one-liners to test common Rails functionality that, if written by hand, would be much longer, more complex, and error-prone.
📖 Read the documentation for the latest version.
📢 See what's changed in recent versions.
Start by including shoulda-matchers in your Gemfile:
group :test do
gem 'shoulda-matchers', '~> 5.0'
end
Then run bundle install.
Now you need to configure the gem by telling it:
If you're working on a Rails app, simply place this at the bottom of spec/rails_helper.rb (or in a support file if you so choose):
Shoulda::Matchers.configure do |config|
config.integrate do |with|
with.test_framework :rspec
with.library :rails
end
end
If you're not working on a Rails app, but you still make use of ActiveRecord or ActiveModel in your project, you can still use this gem too! In that case, you'll want to place the following configuration at the bottom of spec/spec_helper.rb:
Shoulda::Matchers.configure do |config|
config.integrate do |with|
with.test_framework :rspec
# Keep as many of these lines as are necessary:
with.library :active_record
with.library :active_model
end
end
If you're using our umbrella gem Shoulda, then make sure that you're using the latest version:
group :test do
gem 'shoulda', '~> 4.0'
end
Otherwise, add shoulda-matchers to your Gemfile:
group :test do
gem 'shoulda-matchers', '~> 4.0'
end
Then run bundle install.
Now you need to configure the gem by telling it:
If you're working on a Rails app, simply place this at the bottom of test/test_helper.rb:
Shoulda::Matchers.configure do |config|
config.integrate do |with|
with.test_framework :minitest
with.library :rails
end
end
If you're not working on a Rails app, but you still make use of ActiveRecord or ActiveModel in your project, you can still use this gem too! In that case, you'll want to place the following configuration at the bottom of test/test_helper.rb:
Shoulda::Matchers.configure do |config|
config.integrate do |with|
with.test_framework :minitest
# Keep as many of these lines as are necessary:
with.library :active_record
with.library :active_model
end
end
Most of the matchers provided by this gem are useful in a Rails context, and as such, can be used for different parts of a Rails app: database models backed by ActiveRecord, non-database models backed by ActiveModel, controllers, routes, and Rails-specific features like delegate.
As the name of the gem indicates, most matchers are designed to be used in "one-liner" form using the should macro, a special directive available in both RSpec and Shoulda. For instance, a model test case may look something like:
# RSpec
RSpec.describe MenuItem, type: :model do
describe 'associations' do
it { should belong_to(:category).class_name('MenuCategory') }
end
describe 'validations' do
it { should validate_presence_of(:name) }
it { should validate_uniqueness_of(:name).scoped_to(:category_id) }
end
end
# Minitest (Shoulda)
class MenuItemTest < ActiveSupport::TestCase
context 'associations' do
should belong_to(:category).class_name('MenuCategory')
end
context 'validations' do
should validate_presence_of(:name)
should validate_uniqueness_of(:name).scoped_to(:category_id)
end
end
See below for the full set of matchers that you can use.
The subject
For both RSpec and Shoulda, the subject is an implicit reference to the object under test, and through the use of should as demonstrated above, all of the matchers make use of subject internally when they are run. A subject is always set automatically by your test framework in any given test case; however, in certain cases it can be advantageous to override it. For instance, when testing validations in a model, it is customary to provide a valid model instead of a fresh one:
# RSpec
RSpec.describe Post, type: :model do
describe 'validations' do
# Here we're using FactoryBot, but you could use anything
subject { build(:post) }
it { should validate_presence_of(:title) }
end
end
# Minitest (Shoulda)
class PostTest < ActiveSupport::TestCase
context 'validations' do
subject { build(:post) }
should validate_presence_of(:title)
end
end
When overriding the subject in this manner, then, it's important to provide the correct object. When in doubt, provide an instance of the class under test. This is particularly necessary for controller tests, where it is easy to accidentally write something like:
RSpec.describe PostsController, type: :controller do
describe 'GET #index' do
subject { get :index }
# This may work...
it { should have_http_status(:success) }
# ...but this will not!
it { should permit(:title, :body).for(:post) }
end
end
In this case, you would want to use before rather than subject:
RSpec.describe PostsController, type: :controller do
describe 'GET #index' do
before { get :index }
# Notice that we have to assert have_http_status on the response here...
it { expect(response).to have_http_status(:success) }
# ...but we do not have to provide a subject for render_template
it { should render_template('index') }
end
end
If you're using RSpec, then you're probably familiar with the concept of example groups. Example groups can be assigned tags in order to give different behavior to different kinds of example groups. This comes into play especially when using rspec-rails, where, for instance, controller example groups, tagged with type: :controller, are written differently than request example groups, tagged with type: :request. This difference in writing style arises because rspec-rails mixes different behavior and methods into controller example groups vs. request example groups.
Relying on this behavior, Shoulda Matchers automatically makes certain matchers available in certain kinds of example groups:
ActiveRecord and ActiveModel matchers are available in model example groups, i.e., those tagged with type: :model or in files located under spec/models.
ActionController matchers are available in controller example groups, i.e., those tagged with type: :controller or in files located under spec/controllers.
The route matcher is available in routing example groups, i.e., those tagged with type: :routing or in files located under spec/routing.
As long as you're using Rails, you don't need to worry about these details; everything should "just work".
What if you are using ActiveModel or ActiveRecord outside of Rails, however, and you want to use model matchers in a certain example group? Then you'll need to manually include the module that holds those matchers into that example group. For instance, you might have to say:
RSpec.describe MySpecialModel do
include Shoulda::Matchers::ActiveModel
include Shoulda::Matchers::ActiveRecord
end
If you have a lot of similar example groups in which you need to do this, then you might find it more helpful to tag your example groups appropriately, then instruct RSpec to mix these modules into any example groups that have that tag. For instance, you could add this to your rails_helper.rb:
RSpec.configure do |config|
config.include(Shoulda::Matchers::ActiveModel, type: :model)
config.include(Shoulda::Matchers::ActiveRecord, type: :model)
end
And from then on, you could say:
RSpec.describe MySpecialModel, type: :model do
# ...
end
should vs. is_expected.to
In this README and throughout the documentation, you'll notice that we use the should form of RSpec's one-liner syntax over is_expected.to. Besides being the namesake of the gem itself, this is our preferred syntax as it's short and sweet. But if you prefer to use is_expected.to, you can do that too:
RSpec.describe Person, type: :model do
it { is_expected.to validate_presence_of(:name) }
end
Here is the full list of matchers that ship with this gem. If you need details about any of them, make sure to consult the documentation!
ActiveModel matchers:
have_secure_password tests usage of has_secure_password.
validate_absence_of tests usage of validates_absence_of.
validate_acceptance_of tests usage of validates_acceptance_of.
validate_confirmation_of tests usage of validates_confirmation_of.
validate_exclusion_of tests usage of validates_exclusion_of.
validate_inclusion_of tests usage of validates_inclusion_of.
validate_length_of tests usage of validates_length_of.
validate_numericality_of tests usage of validates_numericality_of.
validate_presence_of tests usage of validates_presence_of.
ActiveRecord matchers:
accept_nested_attributes_for tests usage of the accepts_nested_attributes_for macro.
belong_to tests your belongs_to associations.
define_enum_for tests usage of the enum macro.
have_and_belong_to_many tests your has_and_belongs_to_many associations.
have_implicit_order_column tests usage of implicit_order_column.
have_many tests your has_many associations.
have_many_attached tests your has_many_attached associations.
have_one tests your has_one associations.
have_one_attached tests your has_one_attached associations.
have_readonly_attribute tests usage of the attr_readonly macro.
have_rich_text tests your has_rich_text associations.
serialize tests usage of the serialize macro.
validate_uniqueness_of tests usage of validates_uniqueness_of.
ActionController matchers:
permit tests that an action places a restriction on the params hash.
rescue_from tests usage of the rescue_from macro.
set_session makes assertions on the session hash.
set_flash makes assertions on the flash hash.
use_after_action tests that an after_action callback is defined in your controller.
use_around_action tests that an around_action callback is defined in your controller.
use_before_action tests that a before_action callback is defined in your controller.
callback is defined in your controller.Over time our community has created extensions to Shoulda Matchers. If you've created something that you want to share, please let us know!
Have a fix for a problem you've been running into or an idea for a new feature you think would be useful? Take a look at the Contributing document for instructions on setting up the repo on your machine, understanding the codebase, and creating a good pull request.
Shoulda Matchers is tested and supported against Ruby 2.6+, Rails 5.2+, RSpec 3.x, and Minitest 5.x.
Shoulda Matchers follows Semantic Versioning 2.0 as defined at https://semver.org.
Shoulda Matchers is maintained by Elliot Winkler and Gui Albuk.
Shoulda Matchers is copyright © 2006-2021 Tammer Saleh and thoughtbot, inc. It is free and open source software and may be redistributed under the terms specified in the LICENSE file.
The names and logos for thoughtbot are trademarks of thoughtbot, inc.
We are passionate about open source software. See our other projects. We are available for hire.
Author: thoughtbot
Source code: https://github.com/thoughtbot/shoulda-matchers
License: MIT license
1657321920
const si = require('search-index')
// initialize an index
const { PUT, QUERY } = await si()
// add documents to the index
await PUT( /* objects */ )
// read documents from the index
const results = await QUERY( /* query */ )
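To make that skeleton concrete, here is a minimal sketch of indexing and querying; the index name, the documents, and the AND query clause are illustrative assumptions about the package's options-object API, not part of the snippet above:

const si = require('search-index')

const main = async () => {
  // 'my-index' is a hypothetical name for the on-disk index
  const { PUT, QUERY } = await si({ name: 'my-index' })

  // documents are plain objects; _id keeps keys stable across runs
  await PUT([
    { _id: '1', text: 'the quick brown fox' },
    { _id: '2', text: 'lazy dogs sleep all day' }
  ])

  // fetch documents containing the token 'fox'
  const results = await QUERY({ AND: ['fox'] })
  console.log(JSON.stringify(results, null, 2))
}

main().catch(console.error)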
Author: Fergiemcdowall
Source Code: https://github.com/fergiemcdowall/search-index
License: MIT license
1657265460
changes-index
create indexes from a leveldb changes feed
This package provides a way to create a materialized view on top of an append-only log.
To update an index, just change the index code and delete the indexed data.
example
Create a change feed and set up an index. Here we'll create an index for keys that start with the prefix 'user!' on the name and hackerspace properties.
var level = require('level');
var sublevel = require('subleveldown');
var changes = require('changes-feed');
var changesdown = require('changesdown');
var through = require('through2'); // used by the userExists stream below
var chi = require('changes-index');
var argv = require('minimist')(process.argv.slice(2));
var up = level('/tmp/test.db', { valueEncoding: 'json' });
var feed = changes(sublevel(up, 'feed'));
var db = changesdown(sublevel(up, 'db'), feed, { valueEncoding: 'json' });
var indexes = chi({
ixdb: level('/tmp/index.db', { valueEncoding: 'json' }),
chdb: db,
feed: feed
});
indexes.add(function (row, cb) {
if (/^user!/.test(row.key)) {
cb(null, {
'user.name': row.value.name,
'user.space': row.value.hackerspace
});
}
else cb()
});
Now we can create users:
if (argv._[0] === 'create') {
var id = require('crypto').randomBytes(16).toString('hex');
var name = argv._[1], space = argv._[2];
var value = { name: name, hackerspace: space };
userExists(name, function (err, ex) {
if (err) return console.error(err);
if (ex) return console.error('name in use');
db.put('user!' + id, value, function (err) {
if (err) console.error(err);
});
});
function userExists (name, cb) {
indexes.createReadStream('user.name', { gte: name, lte: name })
.pipe(through.obj(write, end));
function write (row, enc, next) { cb(null, true) }
function end () { cb(null, false) }
}
}
or clear (and implicitly regenerate) an existing index:
else if (argv._[0] === 'clear') {
indexes.clear(argv._[1], function (err) {
if (err) console.error(err);
});
}
With these indexes we can list users by name and space:
else if (argv._[0] === 'by-name') {
indexes.createReadStream('user.name', argv)
.on('data', console.log)
;
}
else if (argv._[0] === 'by-space') {
indexes.createReadStream('user.space', argv)
.on('data', console.log)
;
}
methods

var chi = require('changes-index')

var indexes = chi(opts)

You must provide:

opts.ixdb - levelup database to use for indexing
opts.chdb - wrapped changesdown levelup database to lookup primary records
opts.feed - changes-feed handle wired up to the chdb

indexes.add(fn)

Create an index from a function fn(row, cb) that will be called for each put and delete. Your function fn must call cb(err, ix) with ix, an object mapping index names to values. The values from ix will be sorted according to the algorithm from bytewise.
indexes.createReadStream(name, opts)

Create a readable object-mode stream of the primary documents inserted into chdb based on the index given by name.

The stream will produce row objects with:

row.key - the key name put into changesdown
row.value - the value put into changesdown
row.index - the index value generated by the index function
row.exists - whether the key existed prior to this operation
row.prev - the previous set of indexes, or null if the key was created
row.change - the monotonically increasing change sequence from changes-feed

This read stream can be bounded by all the usual levelup options:
opts.lt
opts.lte
opts.gt
opts.gte
opts.limit
opts.reverse

plus opts.eq, which is the same as setting opts.gte and opts.lte to the same value. This isn't common in ordinary levelup but is very common when dealing with indexes that map to other keys.
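For example, a point lookup over the user.name index from the example above might look like this (a sketch; the value 'substack' is made up):

indexes.createReadStream('user.name', { eq: 'substack' })
  .on('data', function (row) {
    // row.key and row.value are the primary record; row.index is 'substack'
    console.log(row.key, row.value);
  });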
indexes.clear(name, cb)

Delete the index for name, calling cb(err) when finished.
versioning
The internals of this module may change between patch releases, which may affect how data is stored on disk.
When you upgrade this package over existing data, you should delete the indexes first.
Author: Substack
Source Code: https://github.com/substack/changes-index
License: View license
1657258020
Reference every value in your leveldb to its parent, e.g. by setting value.parentKey to the key of the parent, then level-tree-index will keep track of the full path for each value, allow you to look up parents and children, stream the entire tree or a part thereof and even perform streaming search queries on the tree.

This is useful for implementing e.g. nested comments.

level-tree-index works for all keyEncodings. It works for the json valueEncoding automatically, and works for other valueEncodings if you provide custom functions for the opts.pathProp and opts.parentProp options. level-tree-index works equally well with string and buffer paths.
level-tree-index automatically keeps the tree updated as you add, change or delete from the database.
Usage
// db contains your data and idb is used to store the index
// (database paths and valueEncoding here are illustrative)
var level = require('level');
var treeIndexer = require('level-tree-index');

var db = level('/tmp/data.db', { valueEncoding: 'json' });
var idb = level('/tmp/index.db');

var tree = treeIndexer(db, idb);

db.put('1', {name: "foo"}, function(err) {
  if(err) throw err;
  db.put('2', {parentKey: '1', name: "bar"}, function(err) {
    if(err) throw err;
    db.put('3', {parentKey: '2', name: "baz"}, function(err) {
      if(err) throw err;
      // wait for index to finish building
      setTimeout(function() {
        // stream child-paths of 'foo' recursively
        var s = tree.stream('foo');
        s.on('data', function(data) {
          console.log(data.path, data.key, data.value);
        });
      }, 500);
    });
  });
});
Read the unit tests in tests/ for more.
API
opts:
pathProp: 'name' // property used to construct the path
parentProp: 'parentKey' // property that references key of parent
sep: 0x1f, // path separator. can be string or unicode/ascii character code
pathArray: false, // for functions that output paths, output paths as arrays
ignore: false, // set to a function to selectively ignore
listen: true, // listen for changes on db and update index automatically
uniquefy: false, // add uniqueProp to end of pathProp to ensure uniqueness
uniqProp: 'unique', // property used for uniqueness
uniqSep: 0x1e, // like `sep` but separates pathProp from uniqProp
levelup: false, // if true, returns a levelup instance instead
orphanPath: 'orphans' // parent path of orphans
Both pathProp and parentProp can be either a string, a buffer or a function. If a function is used then the function will be passed a value from your database as the only argument. The pathProp function is expected to return a string or buffer that will be used to construct the path by joining multiple returned pathProp values with the opts.sep value as separator. The parentProp function is expected to return the key in db of the parent.

opts.sep can be a buffer or a string and is used as a separator to construct the path to each node in the tree.
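As a rough sketch (assuming values are plain JSON objects as in the usage example), function-valued options might look like this:

var tree = treeIndexer(db, idb, {
  // return the path segment for this value
  pathProp: function(value) { return value.name; },
  // return the db key of this value's parent
  parentProp: function(value) { return value.parentKey; }
});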
opts.ignore can be set to a function which will receive the key and value for each change, and if it returns something truthy then that value will be ignored by the tree indexer, e.g:

// ignore items where the .name property starts with an underscore
ignore: function(key, value) {
  if(typeof value === 'object') {
    if(typeof value.name === 'string') {
      if(value.name[0] === '_') {
        return true;
      }
    }
  }
  return false;
}

Setting orphanPath to a string, buffer or array will cause all orphaned rows to have orphanPath as their parent path. Setting orphanPath to null will cause orphaned rows to be ignored (not indexed). An orphan is defined as a row with its parentProp set to a non-falsy value but where the referenced parent does not exist in the database. This can happen e.g. if a parent is deleted but its children are left in the database.
If opts.listen is true then level-tree-index will listen to operations on db and automatically update the index. Otherwise the index will only be updated when .put/.del/.batch is called directly on the level-tree-index instance. This option is ignored when opts.levelup is true.

If opts.levelup is true then instead of a level-tree-index instance a levelup instance will be returned, with all of the standard levelup API plus the level-tree-index API. All calls to .put, .del or .batch will operate on the database given as the db argument and only call their callbacks once the tree index has been updated.

Limitations when using levelup:true:

opts.pathProp and opts.parentProp must be set to functions, and if you're using valueEncoding:'json' then those functions will receive the stringified json data.

See tests/levelup.js for how to use the levelup:true mode.
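A minimal sketch of the levelup:true mode, assuming (per the limitation above) that the option functions receive stringified json when valueEncoding:'json' is used:

var level = require('level');
var treeIndexer = require('level-tree-index');

var db = level('/tmp/data.db', { valueEncoding: 'json' });
var idb = level('/tmp/index.db');

// treeDb behaves like a levelup instance whose writes also update the index
var treeDb = treeIndexer(db, idb, {
  levelup: true,
  pathProp: function(v) { return JSON.parse(v).name; },
  parentProp: function(v) { return JSON.parse(v).parentKey; }
});

treeDb.put('4', {parentKey: '1', name: 'qux'}, function(err) {
  // by the time this callback runs, the tree index has been updated
});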
Get the path and key of the root element. E.g:
tree.getRoot(function(err, path, key) {
console.log("Path of root element:", path);
console.log("Key of root element:", key);
});
Recursively stream descendants starting from parentPath. If parentPath is falsy then the entire tree will be streamed to the specified depth.
Opts:
depth: 0, // how many (grand)children deep to go. 0 means infinite
paths: true, // output the path for each child
keys: true, // output the key for each child
values: true, // output the value for each child
pathArray: undefined, // output paths as arrays
ignore: false, // optional function that returns true for values to ignore
match: null, // Stream only matching elements. A string, buffer or function.
matchAncestors: false, // include ancestors of matches if true
gt: undefined, // specify gt directly, must then also specify lt or lte
gte: undefined, // specify gte directly, must then also specify lt or lte
lt: undefined, // specify lt directly, must then also specify gt or gte
lte: undefined // specify lte directly, must then also specify lt or gte
If parentPath is not specified then .gt/.gte and .lt/.lte must be specified.

opts.depth is currently not usable at the same time as opts.match.
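For instance, streaming two levels of descendants under 'foo' might look like this (a sketch reusing the tree instance from the usage example):

var s = tree.stream('foo', { depth: 2, paths: true, keys: false, values: true });

s.on('data', function(data) {
  // with more than one of paths/keys/values enabled, data is an object
  console.log(data.path, data.value);
});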
If more than one of opts.paths, opts.keys and opts.values is true then the stream will output objects with these as properties.

opts.ignore can be set to a function. This function will receive whatever the stream is about to output (which depends on opts.paths, opts.keys and opts.values) and if the function returns true then those values will not be emitted by the stream.
opts.match allows for streaming search queries on the tree. If set to a string or buffer it will match any path that contains that string or buffer. If set to a RegExp then it will run a .match on the path with that RegExp (only works for string paths). If set to a function then that function will be called with the path as first argument and with the second argument depending on the values of opts.paths, opts.keys and opts.values, e.g:
match: function(path, o) {
if(o.value.name.match("cattens")) {
return true;
}
return false;
}
Setting opts.matchAncestors to true modifies the behaviour of opts.match to also match all ancestors of matched elements. Ancestors of matched elements will then be streamed in the correct order before the matched element. This requires some buffering, so it may slow down matches on very large tree indexes.
When using opts.lt/opts.lte you can use the convenience function .lteKey(key). E.g. to stream all paths that begin with 'foo.bar' you could run:
levelTree.stream({
gte: 'foo.bar',
lte: levelTree.lteKey('foo.bar')
});
Keep in mind that the above example would also return paths like 'foo.barbar'.
.lteKey(key) is a convenience function that, according to leveldb alphabetical ordering, returns the last possible string or buffer that begins with the specified string or buffer.
Stream tree index ancestor paths starting from path. Like .stream() but traverses ancestors instead of descendants.
Opts:
height: 0, // how many (grand)parents up to go. 0 means infinite
includeCurrent: true, // include the node specified by path in the stream
paths: true, // output the path for each child
keys: true, // output the key for each child
values: true, // output the value for each child
pathArray: undefined, // output paths as arrays
Same as .parentStream but calls back with the results as an array.
Get key and value from path.
Callback: cb(err, key, value)
Get tree path given a key.
opts.pathArray: undefined // if true, split path into array
Callback: cb(err, path)
Get parent value given a value.
Callback: cb(err, parentValue)
Get parent path given a key.
opts.pathArray: undefined // if true, split path into array
Callback: cb(err, parentPath)
Get parent path given a value.
opts.pathArray: undefined // if true, split path into array
Callback: cb(err, parentPath)
Get parent value given a path.
Callback: cb(err, parentValue)
Get parent path given a path.
opts.pathArray: undefined // if true, split path into array
Note: this function can be called synchronously
Callback: cb(err, parentPath)
Get array of children given a value. Same usage as .stream but this version isn't streaming.

Callback: cb(err, childArray)
Same as .children but takes a key as input. Same usage as .stream but this version isn't streaming.

Callback: cb(err, childArray)
Same as .stream with only opts.paths set to true.
Same as .stream with only opts.keys set to true.
Same as .stream with only opts.values set to true.
Clear the index. Deletes all of the index's data in the index db.
Build the index from scratch.
Note: You will likely want to .clear the index first or call .rebuild instead.
Clear and then build the index.
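A sketch of what that looks like, assuming the rebuild method takes a completion callback:

// clear and rebuild the index from the underlying database
tree.rebuild(function(err) {
  if(err) throw err;
  // the index now reflects the current contents of db
});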
If you need to wait for the tree index to update after a .put operation then you can use .put directly on the level-tree-index instance and give it a callback. Calling .put this way is much less efficient, so if you are planning to use this feature most of the time then you should look into using level-tree-index with the levelup:true option instead.

Allows you to wait for the tree index to finish building using a callback. Same as .put above but for deletion.
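For example (a sketch reusing the tree instance and data from the usage example):

// .put on the tree instance defers the callback until the index is updated
tree.put('5', {parentKey: '1', name: 'quux'}, function(err) {
  if(err) throw err;
  // safe to query the tree here
});

// .del gives the same guarantee for deletion
tree.del('5', function(err) {
  if(err) throw err;
});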
Uniqueness

The way level-tree-index works requires that each indexed database entry has a globally unique path. In other words, no two siblings can share the same pathProp.

You might get into a situation where you really need multiple siblings with an identical pathProp. Then you might wonder if you could just append e.g. a random string to each pathProp before passing it to level-tree-index, and then strip it away again before e.g. showing the data to users.

Well, level-tree-index provides helpers for exactly that. You can set opts.uniquefy to true in the constructor. You will then need each database entry to have a property that, combined with its pathProp, makes it unique. This can be as simple as a long randomly generated string. As with pathProp, you will have to inform level-tree-index about this property with uniqProp.
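A sketch of a uniquefied index; the option names follow the constructor options listed earlier, and the random-string values are just one possible scheme:

var tree = treeIndexer(db, idb, {
  uniquefy: true,
  uniqProp: 'unique' // property that disambiguates siblings with equal names
});

// two siblings may now share a name as long as their 'unique' values differ
db.put('a', {parentKey: '1', name: 'dup', unique: 'r4nd0m1'});
db.put('b', {parentKey: '1', name: 'dup', unique: 'r4nd0m2'});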
You will then run into the problem that you no longer know the actual path names, since they have the uniqueness data added. You can either get the actual path name using the synchronous function .getPathName(val), where val is the value from the key-value pair for which you want the path. Or you can call .put or .batch directly on your level-tree-index instance and they will pass your callback a second argument, which for .put is the actual path name and for .batch is an array of path names for all put operations.

When uniquefy is turned on, any functions returning paths will now be returning paths with the uniqueness data appended. You can use the convenience function .nicify(path) to convert these paths into normal paths without the uniqueness data. For .stream and any functions described as "same as .stream but ..." you can set opts.nicePaths to true and you will receive the nicified paths back with each result.
Async quirks

Note that when you call .put, .del or .batch on your database, level-tree-index will not be able to delay the callback, so you cannot expect the tree index to be up to date when the callback is called. That is why you see the setTimeout used in the usage example above. You can instead call .put, .del or .batch directly on the level-tree-index instance and your callback will not be called until the index has finished building. This works, but if opts.listen is set to true then an inefficient and inelegant workaround is used (in order to prevent the change listener from attempting to update the already updated index) which could potentially slow things down.

If you want to wait for the index to update most of the time then you should probably either set opts.listen to false or use the levelup mode by calling the constructor with opts.levelup set to true, though that has its own drawbacks, especially if using valueEncoding:'json'. See the constructor API documentation for more.

In normal operation (opts.levelup == false) level-tree-index will listen for any changes on your database and update its index every time a change occurs. This is implemented using levelup change event listeners which run after the database operation has already completed.

When running .put or .del directly on level-tree-index, the operation is performed on the underlying database, then the tree index is updated, and then the callback is called. Since we can't turn off the change event listeners for a specific operation, level-tree-index has to remember the operations performed directly through .put or .del on the level-tree-index instance so that the change event listener can ignore them, preventing the tree-index update operation from being called twice. This is done by hashing the entire operation, saving the hash, and then checking the hash of each operation picked up by the change event listeners against the saved hash. This is obviously inefficient. If this feature is never used then nothing is ever hashed nor compared, so performance will not be impacted.
ToDo

opts.depth working with opts.match.

Author: Biobricks
Source Code: https://github.com/biobricks/level-tree-index
License: AGPLv3
1657243080
Index and filter LevelDB databases and watch for future changes.
Set up the view indexes and filters:
var Index = require('level-match-index')
var level = require('level')
var sub = require('level-sublevel')
var db = sub(level('database', {valueEncoding: 'json'}))
var views = {
post: Index(db, {
match: { type: 'post' },
index: [ 'id' ],
single: true
}),
postsByTag: Index(db, {
match: { type: 'post' },
index: [ {many: 'tags'} ] // index each tag in array separately
}),
commentsByPost: Index(db, {
match: { type: 'comment' },
index: [ 'postId' ]
})
}
Add some data:
var post1 = {
id: 'post-1', // used for matching as specified above
type: 'post', //
title: 'Typical Blog Post Example',
tags: [ 'test post', 'long winded' ],
body: 'etc...',
date: Date.now()
}
var post2 = {
id: 'post-2',
type: 'post',
title: 'Typical Blog Post Example',
tags: [ 'test post', 'exciting' ],
body: 'etc...',
date: Date.now()
}
var comment1 = {
id: 'comment-1',
type: 'comment', // used for matching as specified above
postId: post1.id, //
name: 'Matt McKegg',
body: 'cool story bro',
date: Date.now()
}
var comment2 = {
id: 'comment-2',
type: 'comment',
postId: post1.id,
name: 'Joe Blogs',
body: 'I do not understand!',
date: Date.now()
}
db.batch([
{key: post1.id, value: post1, type: 'put'},
{key: post2.id, value: post2, type: 'put'},
{key: comment1.id, value: comment1, type: 'put'},
{key: comment2.id, value: comment2, type: 'put'}
])
Now query the views:
var result = {post: null, comments: []}
views.post(post1.id).read().on('data', function(data){
result.post = data.value
}).on('end', getComments)
function getComments(){
views.commentsByPost(post1.id).read().on('data', function(data){
result.comments.push(data.value)
}).on('end', finish)
}
function finish(){
  // t is a tape-style test assertion handle, assumed to be in scope
  t.deepEqual(result, {
    post: post1,
    comments: [ comment1, comment2 ]
  })
}
Or by tags:
var posts = []
views.postsByTag('long winded').read().on('data', function(data){
  posts.push(data.value)
}).on('end', finish)
function finish(){
t.deepEqual(posts, [ post1 ])
}
Watch for future changes:
var comment3 = {
  id: 'comment-3',
  type: 'comment', // used for matching as specified above
  postId: post1.id, //
  name: 'Bobby',
  body: 'Done yet?',
  date: Date.now()
}
var remove = views.commentsByPost(post1.id).watch(function(ch){
// function is called with each change
t.deepEqual(ch.value, comment3)
})
db.put(comment3.id, comment3)
// remove the watcher hook if no longer needed
remove()
Same example as above but instead of specifying the postId for comments index, pull it out using a query:
var result = {post: null, comments: []}
views.post(post1.id).read().on('data', function(data){
result.post = data.value
}).on('end', getComments)
function getComments(){
// specify a value to extract as query and specify where to get it from as read option
views.commentsByPost({ $query: 'post.id' }).read({
data: result
}).on('data', function(data){
result.comments.push(data.value)
}).on('end', finish)
}
function finish(){
t.deepEqual(result, {
post: post1,
comments: [ comment1, comment2 ]
})
}
Author: mmckegg
Source Code: https://github.com/mmckegg/level-match-index
License:
1657232640
Secondary indexes for leveldb.
Create 2 indexes on top of a posts database.
var level = require('level');
var Secondary = require('level-secondary');
var sub = require('level-sublevel');
var db = sub(level(__dirname + '/db', {
valueEncoding: 'json'
}));
var posts = db.sublevel('posts');
// add a title index
posts.byTitle = Secondary(posts, 'title');
// add a length index
// append the post.id for unique indexes with possibly overlapping values
posts.byLength = Secondary(posts, 'length', function(post){
return post.body.length + '!' + post.id;
});
posts.put('1337', {
id: '1337',
title: 'a title',
body: 'lorem ipsum'
}, function(err) {
if (err) throw err;
posts.byTitle.get('a title', function(err, post) {
if (err) throw err;
console.log('get', post);
// => get: { id: '1337', title: 'a title', body: 'lorem ipsum' }
posts.del('1337', function(err) {
if (err) throw err;
posts.byTitle.get('a title', function(err) {
console.log(err.name);
// => NotFoundError
});
});
});
posts.byLength.createReadStream({
start: 10,
end: 20
}).on('data', console.log.bind(console, 'read'));
// => read { key: '1337', value: { id: '1337', title: 'a title', body: 'lorem ipsum' } }
posts.byLength.createKeyStream({
start: 10,
end: 20
}).on('data', console.log.bind(console, 'key'));
// => key 1337
posts.byLength.createValueStream({
start: 10,
end: 20
}).on('data', console.log.bind(console, 'value'));
// => value { id: '1337', title: 'a title', body: 'lorem ipsum' }
});
Return a secondary index that either indexes property name or uses a custom reduce function to map values to indexes.

Get the value that has been indexed with key.
Create a readable stream that has indexes as keys and indexed data as values.
A level manifest that you can pass to multilevel.
What used to be
db = Secondary('name', db);
is now
db.byName = Secondary(db, 'name');
Also hooks are used, so it works perfectly with batches across multiple sublevels.
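For instance, because the index is maintained through hooks, a batch written through the parent database still keeps the index in sync; a sketch using level-sublevel's prefix option for batch entries:

// write into the posts sublevel via a batch on the parent db;
// the byTitle index is updated by the hook
db.batch([
  { type: 'put', key: '1338', value: { id: '1338', title: 'second post', body: '...' }, prefix: posts }
], function (err) {
  if (err) throw err;
  posts.byTitle.get('second post', console.log);
});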
With npm do:
npm install level-secondary
Author: juliangruber
Source Code: https://github.com/juliangruber/level-secondary
License: MIT
1657228320
Generic indexer for leveldb. Only stores document keys for space efficiency.
npm install level-indexer
var indexer = require('level-indexer')
var level = require('level')

var db = level('/tmp/people.db') // any levelup instance works (path is illustrative)

// create an index (by country)
var country = indexer(db, ['country'])
country.add({
key: 'mafintosh',
name: 'mathias',
country: 'denmark'
})
country.add({
key: 'maxogden',
name: 'max',
country: 'united states'
})
var stream = country.find({
gte:{country:'denmark'},
lte:{country:'denmark'}
})
// or using the shorthand syntax
var stream = country.find('denmark')
stream.on('data', function(key) {
console.log(key) // will print mafintosh
})
The stored index is prefixed with the index key names, which means you can use the same levelup instance to store multiple indexes.
index = indexer(db, [prop1, prop2, ...], [options])
Creates a new index using the given properties. Options include
{
map: function(key, cb) {
// map find results to another value
db.get(key, cb)
}
}
index.add(doc, [key], [cb])

Add a document to the index. The document needs to have a key property, or you must provide one as the second argument. Only the key will be stored in the index.
index.remove(doc, [key], [cb])
Remove a document from the index.
index.key(doc, [key])
Returns the leveldb key used for doc. Useful if you want to batch multiple index updates together yourself:

var batch = [{type:'put', key:index.key(doc), value:doc.key}, ...]
stream = index.find(options, [cb])
Search the index. Use options.{gt,gte,lt,lte} to scope your search.
// find everyone in the age range 20-50 in denmark
var index = indexer(db, ['country', 'age'])
...
var stream = index.find({
gt: {
country: 'denmark',
age: 20
},
lt: {
country: 'denmark',
age: 50
}
})
Optionally you can specify the ranges using arrays
var stream = index.find({
gt: ['denmark', 20],
lt: ['denmark', 50]
})
Or if you do not care about ranges
var stream = index.find(['denmark', 20])
// equivalent to
var stream = index.find({
gte: ['denmark', 20],
lte: ['denmark', 20]
})
The stream will contain the keys of the documents that were found in the index. Use options.map to map them to the document values.
Options also include the regular levelup db.createReadStream options.
If you set cb the stream will be buffered and passed as an array.
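For example, a buffered lookup against the country index from the top of this section (a sketch):

// keys are collected and passed to the callback as an array
country.find('denmark', function (err, keys) {
  if (err) throw err
  console.log(keys) // ['mafintosh']
})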
index.findOne(options, cb)

Only find the first match in the index and pass that to the callback.
Author: Mafintosh
Source Code: https://github.com/mafintosh/level-indexer
License: MIT license