1656403049
Understanding Linked Lists can be a difficult task when you are a beginner JavaScript developer since JavaScript does not provide built-in Linked List support. In an advanced language like JavaScript, we need to implement this data structure by scratch and, if you are unfamiliar with how this data structure works, the implementation part becomes that much more difficult.
In this article, we will discuss how a linked list gets stored in the database and, we will implement a linked-list from scratch with operations like the addition and deletion of an element, lookups, and reversing a linked-list. Before moving onto implementing a linked-list, one needs to understand what are the advantages of using a linked-list when we already have data structures like arrays and objects present.
We know that elements inside an array get stored in the database with numbered indexes and, in sequential order:
Arrays in memory
While using arrays, operations like adding/deleting elements at the start or at a specific index can be a slow task since we have to shift indexes of all the other elements. This slowness is caused due to the numbered indexes feature of arrays.
The above problem can get solved with the use of objects. Since in objects, the elements get stored at random positions and therefore, there is no need to shift indexes of elements while performing the operations like adding/deleting elements at the start or specific index:
Objects in memory
Although operations like addition and deletion are fast in objects, we observe from the above image that, when it comes to iterating operations, objects are not the best choice since elements of an object get stored in random positions. Therefore, iterating operations can take a long time. This is where linked-lists come in.
So what is a linked list?
From the name itself, we can figure out that it’s a list that is linked in some way. So how is it linked and what does the list contain? A linked list consists of nodes that have two properties, data and the pointer. The pointer inside a node points to the next node in the list. The first node inside a linked list is called the head. For understanding better, let’s take a look at the following image that describes a linked list:
Linked List Illustration
We observe from the above image that each node has two properties, data and a pointer. The pointer points to the next node in the list and, the pointer of the last node points to null. The above image represents a singly linked list.
We can see there is a big difference when comparing linked lists with objects. In linked-lists, each node gets connected to the next node via a pointer. Therefore, we have a connection between each node of the linked list, whereas, in objects, the key-value pairs get stored randomly and have no connection between each other.
Let’s implement a linked-list that stores integers as data. Since JavaScript does not provide built-in linked list support, we will be using objects and classes to implement a linked list. Let’s get started:
class Node{
constructor(value){
this.value = value;
this.next = null;
}
}
class LinkedList{
constructor(){
this.head = null;
this.tail = this.head;
this.length = 0;
}
append(value){
}
prepend(value){
}
insert(value,index){
}
lookup(index){
}
remove(index){
}
reverse(){
}
}
In the above code, we have created two classes, one for the linked-list itself and the other for creating nodes. Like we discussed, each node will have two properties, a value and a pointer (“next” in the above case). The LinkedList class contains three properties, the head (which is null initially), the tail (which points to null as well) that is used to store the last node of the linked list and the length property that holds the length of the linked list. It also consists of functions that are empty for now. We will fill these functions one by one.
This function adds a node to the end of the linked list. For implementing this function, we need to understand the operation that it’s going to perform:
Illustration of append function
From the above image, we can implement the append function in the following way:
append(value){
const newNode = new Node(value);
if(!this.head){
this.head = newNode;
this.tail = newNode;
}
else{
this.tail.next = newNode;
this.tail = newNode;
}
this.length++;
}
Let’s decode the function,
If you are new to JavaScript, understanding the above function can be daunting therefore, let’s breakdown what happens when we perform the append function:
const linkedList1 = new LinkedList();
linkedList1.append(2);
Check whether the head points to null, it does, so we create a new object and assign the new object to both head and tail:
let node = new Node(2);
this.head = newNode;
this.tail = newNode;
Now, both head and tail are pointing to the same object and this is a very important point to remember.
Next, let’s append two more values to the linked list:
linkedList1.append(3);
linkedList1.append(4);
Now, the head does not point to null, so we go into the else condition of append function:
this.tail.next = node;
Since both head and tail point to the same object, any change in tail results in change of the head object as well. That is how objects work in JavaScript. In JavaScript, objects are passed by reference and therefore, both head and tail point to the same address space where the object is stored. The above line of code is equivalent to:
this.head.next = node;
Next,
this.tail = node;
Now, after the executing the above line of code, the this.head.next and this.tail are pointing to the same object and therefore, whenever we append new nodes, the head object will automatically get updated.
After performing three appends, try console.logging the linkedList1 object, this is how it should look:
head: {value: 2 , next: {value: 3, next: {value: 4,next: null}}}
tail : {value: 4, next: null}
length:3
We observe from all the above code that the append function of a linked list is of complexity O(1) since we neither have to shift indexes nor iterate through the linked list.
Let’s move on to the next function,
For implementing this function, we create a new node using the Node class and point the next object of this new node to the head of the linked list. Next, we assign the new node to the head of the linked list:
prepend(value){
const node = new Node(value);
node.next = this.head;
this.head = node;
this.length++;
}
Just like the append function, this function as well has the complexity of O(1).
Before implementing this function in code, it’s important to visualise what this function does, therefore, for understanding purposes, let’s create a linked list with few values and then visualise the insert function. The insert function takes in two parameters, value and index.
let linkedList2 = new LinkedList();
linkedList2.append(23);
linkedList2.append(89);
linkedList2.append(12);
linkedList2.append(3);
linkedList2.insert(45,2);
Step 1:
Iterate through the linked list till we reach the index-1 position (1st index in this case):
Step 1 illustration of insert operation
Step 2:
Assign the pointer of node at index-1 position(89 in this case) to the new node(45 in this case):
Step 2 illustration of insert operation
Step 3:
Assign the next pointer of the new node (45) to the next node (12):
Step 3 illustration of insert operation
This is how the insert operation is performed. Using the above visualisation, we observe that we need to find nodes at index-1 position and index position so that we can insert the new node between them. Let’s implement this in code:
insert(value,index){
if(index >= this.length){
this.append(value);
}
const node = new Node(value);
const {prevNode,nextNode} = this.getPrevNextNodes(index);
prevNode.next = node;
node.next = nextNode;
this.length++;
}
Let’s decode the function, if the value of index is greater than or equal to the length property, we handover the operation to the append function. For the else condition, we create a new node using Node class, next we observe a new function getPrevNextNodes( ) through which we receive values of prevNode and nextNode. The getPrevNextNodes function gets implemented like this:
getPrevNextNodes(index){
let count = 0;
let prevNode = this.head;
let nextNode = prevNode.next;
while(count < index - 1){
prevNode = prevNode.next;
nextNode = prevNode.next;
count++;
}
return {
prevNode,
nextNode
}
}
The above function basically returns the nodes at index-1 position and index position by iterating through the linked list. After receiving these nodes, we point the next property of the prevNode to the new node and the new node’s next property to the nextNode.
The insert operation for a linked list is of complexity O(n) since we have to iterate through the linked list and search for nodes at index-1 and index positions. Although the complexity is O(n), we observe that this insert operation is much faster than the insert operation on arrays, in arrays we would have to shift the indexes of all the elements after a particular index but, in the case of a linked list, we only manipulate the next properties of nodes at index-1 and index positions.
Now that we have covered the insertion operation, the remove operation might feel easier since it’s almost the same as the insertion operation with a small difference, when we get prevNode and nextNode values from the getPrevNextNodes function, we have to perform the following operation in remove function:
prevNode.next = nextNode.next;
By executing the above line of code, the next property of node at index-1 position will now point to node at index+1 position. This way, the node at index position will be removed.
The complete function:
remove(index){
let {previousNode,currentNode} = this.getNodes(index);
previousNode.next = currentNode.next;
this.length--;
}
The remove operation is also of complexity O(n) but, again, like the insertion operation, the remove operation in linked lists is faster than the remove operation in arrays.
Although it might seem simple, reversing a linked list can often be the most confusing operation to implement and hence, this operation gets asked a lot in coding interviews. Before implementing the function, let’s visualise the strategy that we are going to use to reverse a linked list.
For reversing a linked list, we need to keep track of three nodes, previousNode, currentNode and nextNode.
Consider the following linked list:
let linkedList2 = new LinkedList();
linkedList2.append(67);
linkedList2.append(32);
linkedList2.append(44);
Step 1:
Initially, the previousNode has the value null and, the currentNode has the value of head:
SteStep 1 illustration of reverse operationp 1 illustration of insert operation
Step 2:
Next, we assign the nextNode to the currentNode.next:
Step 2 illustration of reverse operation
Step 3:
Next, we point the currentNode.next property to the previousNode:
Step 3 illustration of reverse operation
Step 4:
Now, we shift the previousNode to currentNode and currentNode to nextNode:
Step 4 illustration of reverse operation
This process restarts from step 2 and continues till currentNode equals null.
To implement this on code:
reverse(){
let previousNode = null;
let currentNode = this.head;
while(currentNode !== null){
let nextNode = currentNode.next;
currentNode.next = previousNode;
previousNode = currentNode;
currentNode = nextNode;
}
this.head = previousNode;
}
Like we visualised, till we hit the currentNode === null mark, we keep iterating and shifting the values. In the end, we assign the previousNode value to the head.
The reverse operation has a complexity of O(n).
This operation is simple, we just iterate through the linked list and return the node at specific index. This operation as well has the complexity of O(n).
lookup(index){
let counter = 0;
let currentNode = this.head;
while(counter < index){
currentNode = currentNode.next;
counter++;
}
return currentNode;
}
There you go, we have finished implementing basic operations of a singly linked list in javascript. The difference between a singly and doubly linked list is that, doubly linked list has nodes which have pointers to both the previous node and the next node.
From the above operations, let’s conclude linked lists.
Linked lists provide us with fast append(Adding element at the end) and prepend(Adding element at the start) operations. Although the insertion operation in linked lists is of complexity O(n), it is much faster than insertion operation of arrays. The other problem that we face while using arrays is size complexity, when we use dynamic arrays, while adding an element, we have to copy the complete array to a different address space and then add the element whereas, in linked lists, we don’t face such problems.
The problem we face while using objects is the random placement of elements in memory whereas in linked lists, the nodes are connected to each other with pointers that provide us some order.
So finally, we have finished understanding and evaluating a commonly used data structure called, a Linked List.
#datastructure #javascript
1655721660
Queue is a linear collection of items where items are inserted and removed in a particular order. Queue is also called a FIFO Data Structure because it follows the "First In First Out" principle i.e. the item that is inserted in the first is the one that is taken out first. In this video, we look at what the queue is, how is it implemented, what are the different operations you can perform on a queue, and the implementation of Queue in JavaScript. After watching this video, you will be able to answer the following questions:
- What is Queue Data Structure?
- What is FIFO principle?
- What are different operations you can perform on a Queue?
- How to implement stack in Queue?
1655030040
Stack is a linear collection of items where items are inserted and removed in a particular order. Stack is also called a LIFO Data Structure because it follows the "Last In First Out" principle i.e. the item that is inserted in the last is the one that is taken out first. In this video, we look at what the stack is, how is it implemented, what are the different operations you can perform on a stack, and some of the real-world usages of Stack. After watching this video, you will be able to answer the following questions:
- What is Stack Data Structure?
- What is LIFO principle?
- What are different operations you can perform on a Stack?
- What are some usage examples of Stack?
- How to implement stack in JavaScript?
1654894860
SBE is an OSI layer 6 presentation for encoding and decoding binary application messages for low-latency financial applications. This repository contains the reference implementations in Java, C++, Golang, C#, and Rust.
More details on the design and usage of SBE can be found on the Wiki.
An XSD for SBE specs can be found here. Please address questions about the specification to the SBE FIX community.
For the latest version information and changes see the Change Log with downloads at Maven Central.
The Java and C++ SBE implementations work very efficiently with the Aeron messaging system for low-latency and high-throughput communications. The Java SBE implementation has a dependency on Agrona for its buffer implementations. Commercial support is available from sales@real-logic.co.uk.
Binaries and dependency information for Maven, Ivy, Gradle, and others can be found at http://search.maven.org.
Example for Maven:
<dependency>
<groupId>uk.co.real-logic</groupId>
<artifactId>sbe-all</artifactId>
<version>${sbe.tool.version}</version>
</dependency>
Build the project with Gradle using this build.gradle file.
Full clean build:
$ ./gradlew
Run the Java examples
$ ./gradlew runJavaExamples
Jars for the executable, source, and javadoc for the various modules can be found in the following directories:
sbe-benchmarks/build/libs
sbe-samples/build/libs
sbe-tool/build/libs
sbe-all/build/libs
An example to execute a Jar from command line using the 'all' jar which includes the Agrona dependency:
java -Dsbe.generate.ir=true -Dsbe.target.language=Cpp -Dsbe.target.namespace=sbe -Dsbe.output.dir=include/gen -Dsbe.errorLog=yes -jar sbe-all/build/libs/sbe-all-${SBE_TOOL_VERSION}.jar my-sbe-messages.xml
NOTE: Linux, Mac OS, and Windows only for the moment. See FAQ. Windows builds have been tested with Visual Studio Express 12.
For convenience, the cppbuild
script does a full clean, build, and test of all targets as a Release build.
$ ./cppbuild/cppbuild
If you are comfortable using CMake, then a full clean, build, and test looks like:
$ mkdir -p cppbuild/Debug
$ cd cppbuild/Debug
$ cmake ../..
$ cmake --build . --clean-first
$ ctest
Note: The C++ build includes the C generator. Currently, the C generator is a work in progress.
First build using Gradle to generate the SBE jar and then use it to generate the golang code for testing.
$ ./gradlew
$ ./gradlew generateGolangCodecs
For convenience on Linux, a gnu Makefile is provided that runs some tests and contains some examples.
$ cd gocode
# make # test, examples, bench
Users of golang generated code should see the user documentation.
Developers wishing to enhance the golang generator should see the developer documentation
Users of CSharp generated code should see the user documentation.
Developers wishing to enhance the CSharp generator should see the developer documentation
The SBE Rust generator will produce 100% safe rust crates (no unsafe
code will be generated). Generated crates do not have any dependencies on any libraries (including no SBE libraries). If you don't yet have Rust installed see Rust: Getting Started
Generate the Rust codecs
$ ./gradlew generateRustCodecs
Run the Rust test from Gradle
$ ./gradlew runRustTests
Or run test directly with Cargo
$ cd rust
$ cargo test
Download Details:
Author: real-logic
Source Code: https://github.com/real-logic/simple-binary-encoding
License: Apache-2.0 license
1654891140
Bitsets, also called bitmaps, are commonly used as fast data structures. Unfortunately, they can use too much memory. To compensate, we often use compressed bitmaps.
Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. In some instances, roaring bitmaps can be hundreds of times faster and they often offer significantly better compression. They can even be faster than uncompressed bitmaps.
Roaring bitmaps are found to work well in many important applications:
Use Roaring for bitmap compression whenever possible. Do not use other bitmap compression methods (Wang et al., SIGMOD 2017)
kudos for making something that makes my software run 5x faster (Charles Parker from BigML)
The YouTube SQL Engine, Google Procella, uses Roaring bitmaps for indexing. Apache Lucene uses Roaring bitmaps, though they have their own independent implementation. Derivatives of Lucene such as Solr and Elastic also use Roaring bitmaps. Other platforms such as Whoosh, Microsoft Visual Studio Team Services (VSTS) and Pilosa also use Roaring bitmaps with their own implementations. You find Roaring bitmaps in InfluxDB, Bleve, Cloud Torrent, and so forth.
There is a serialized format specification for interoperability between implementations. We have interoperable C/C++, Java and Go implementations.
(c) 2013-... the RoaringBitmap authors
This code is licensed under Apache License, Version 2.0 (AL2.0).
When should you use a bitmap?
Sets are a fundamental abstraction in software. They can be implemented in various ways, as hash sets, as trees, and so forth. In databases and search engines, sets are often an integral part of indexes. For example, we may need to maintain a set of all documents or rows (represented by numerical identifier) that satisfy some property. Besides adding or removing elements from the set, we need fast functions to compute the intersection, the union, the difference between sets, and so on.
To implement a set of integers, a particularly appealing strategy is the bitmap (also called bitset or bit vector). Using n bits, we can represent any set made of the integers from the range [0,n): the ith bit is set to one if integer i is present in the set. Commodity processors use words of W=32 or W=64 bits. By combining many such words, we can support large values of n. Intersections, unions and differences can then be implemented as bitwise AND, OR and ANDNOT operations. More complicated set functions can also be implemented as bitwise operations.
When the bitset approach is applicable, it can be orders of magnitude faster than other possible implementation of a set (e.g., as a hash set) while using several times less memory.
However, a bitset, even a compressed one is not always applicable. For example, if you have 1000 random-looking integers, then a simple array might be the best representation. We refer to this case as the "sparse" scenario.
When should you use compressed bitmaps?
An uncompressed BitSet can use a lot of memory. For example, if you take a BitSet and set the bit at position 1,000,000 to true and you have just over 100kB. That is over 100kB to store the position of one bit. This is wasteful even if you do not care about memory: suppose that you need to compute the intersection between this BitSet and another one that has a bit at position 1,000,001 to true, then you need to go through all these zeroes, whether you like it or not. That can become very wasteful.
This being said, there are definitively cases where attempting to use compressed bitmaps is wasteful. For example, if you have a small universe size. E.g., your bitmaps represent sets of integers from [0,n) where n is small (e.g., n=64 or n=128). If you are able to uncompressed BitSet and it does not blow up your memory usage, then compressed bitmaps are probably not useful to you. In fact, if you do not need compression, then a BitSet offers remarkable speed.
The sparse scenario is another use case where compressed bitmaps should not be used. Keep in mind that random-looking data is usually not compressible. E.g., if you have a small set of 32-bit random integers, it is not mathematically possible to use far less than 32 bits per integer, and attempts at compression can be counterproductive.
How does Roaring compares with the alternatives?
Most alternatives to Roaring are part of a larger family of compressed bitmaps that are run-length-encoded bitmaps. They identify long runs of 1s or 0s and they represent them with a marker word. If you have a local mix of 1s and 0, you use an uncompressed word.
There are many formats in this family:
There is a big problem with these formats however that can hurt you badly in some cases: there is no random access. If you want to check whether a given value is present in the set, you have to start from the beginning and "uncompress" the whole thing. This means that if you want to intersect a big set with a large set, you still have to uncompress the whole big set in the worst case...
Roaring solves this problem. It works in the following manner. It divides the data into chunks of 216 integers (e.g., [0, 216), [216, 2 x 216), ...). Within a chunk, it can use an uncompressed bitmap, a simple list of integers, or a list of runs. Whatever format it uses, they all allow you to check for the present of any one value quickly (e.g., with a binary search). The net result is that Roaring can compute many operations much faster than run-length-encoded formats like WAH, EWAH, Concise... Maybe surprisingly, Roaring also generally offers better compression ratios.
http://www.javadoc.io/doc/org.roaringbitmap/RoaringBitmap/
import org.roaringbitmap.RoaringBitmap;
public class Basic {
public static void main(String[] args) {
RoaringBitmap rr = RoaringBitmap.bitmapOf(1,2,3,1000);
RoaringBitmap rr2 = new RoaringBitmap();
rr2.add(4000L,4255L);
rr.select(3); // would return the third value or 1000
rr.rank(2); // would return the rank of 2, which is index 1
rr.contains(1000); // will return true
rr.contains(7); // will return false
RoaringBitmap rror = RoaringBitmap.or(rr, rr2);// new bitmap
rr.or(rr2); //in-place computation
boolean equals = rror.equals(rr);// true
if(!equals) throw new RuntimeException("bug");
// number of values stored?
long cardinality = rr.getLongCardinality();
System.out.println(cardinality);
// a "forEach" is faster than this loop, but a loop is possible:
for(int i : rr) {
System.out.println(i);
}
}
}
Please see the examples folder for more examples, which you can run with ./gradlew :examples:runAll
, or run a specific one with ./gradlew :examples:runExampleBitmap64
, etc.
Java lacks native unsigned integers but integers are still considered to be unsigned within Roaring and ordered according to Integer.compareUnsigned
. This means that Java will order the numbers like so 0, 1, ..., 2147483647, -2147483648, -2147483647,..., -1. To interpret correctly, you can use Integer.toUnsignedLong
and Integer.toUnsignedString
.
If you want to have your bitmaps lie in memory-mapped files, you can use the org.roaringbitmap.buffer package instead. It contains two important classes, ImmutableRoaringBitmap and MutableRoaringBitmap. MutableRoaringBitmaps are derived from ImmutableRoaringBitmap, so that you can convert (cast) a MutableRoaringBitmap to an ImmutableRoaringBitmap in constant time.
An ImmutableRoaringBitmap that is not an instance of a MutableRoaringBitmap is backed by a ByteBuffer which comes with some performance overhead, but with the added flexibility that the data can reside anywhere (including outside of the Java heap).
At times you may need to work with bitmaps that reside on disk (instances of ImmutableRoaringBitmap) and bitmaps that reside in Java memory. If you know that the bitmaps will reside in Java memory, it is best to use MutableRoaringBitmap instances, not only can they be modified, but they will also be faster. Moreover, because MutableRoaringBitmap instances are also ImmutableRoaringBitmap instances, you can write much of your code expecting ImmutableRoaringBitmap.
If you write your code expecting ImmutableRoaringBitmap instances, without attempting to cast the instances, then your objects will be truly immutable. The MutableRoaringBitmap has a convenience method (toImmutableRoaringBitmap) which is a simple cast back to an ImmutableRoaringBitmap instance. From a language design point of view, instances of the ImmutableRoaringBitmap class are immutable only when used as per the interface of the ImmutableRoaringBitmap class. Given that the class is not final, it is possible to modify instances, through other interfaces. Thus we do not take the term "immutable" in a purist manner, but rather in a practical one.
One of our motivations for this design where MutableRoaringBitmap instances can be casted down to ImmutableRoaringBitmap instances is that bitmaps are often large, or used in a context where memory allocations are to be avoided, so we avoid forcing copies. Copies could be expected if one needs to mix and match ImmutableRoaringBitmap and MutableRoaringBitmap instances.
The following code sample illustrates how to create an ImmutableRoaringBitmap from a ByteBuffer. In such instances, the constructor only loads the meta-data in RAM while the actual data is accessed from the ByteBuffer on demand.
import org.roaringbitmap.buffer.*;
//...
MutableRoaringBitmap rr1 = MutableRoaringBitmap.bitmapOf(1, 2, 3, 1000);
MutableRoaringBitmap rr2 = MutableRoaringBitmap.bitmapOf( 2, 3, 1010);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
DataOutputStream dos = new DataOutputStream(bos);
// If there were runs of consecutive values, you could
// call rr1.runOptimize(); or rr2.runOptimize(); to improve compression
rr1.serialize(dos);
rr2.serialize(dos);
dos.close();
ByteBuffer bb = ByteBuffer.wrap(bos.toByteArray());
ImmutableRoaringBitmap rrback1 = new ImmutableRoaringBitmap(bb);
bb.position(bb.position() + rrback1.serializedSizeInBytes());
ImmutableRoaringBitmap rrback2 = new ImmutableRoaringBitmap(bb);
Alternatively, we can serialize directly to a ByteBuffer
with the serialize(ByteBuffer)
method.
Operations on an ImmutableRoaringBitmap such as and, or, xor, flip, will generate a RoaringBitmap which lies in RAM. As the name suggest, the ImmutableRoaringBitmap itself cannot be modified.
This design was inspired by Druid.
One can find a complete working example in the test file TestMemoryMapping.java.
Note that you should not mix the classes from the org.roaringbitmap package with the classes from the org.roaringbitmap.buffer package. They are incompatible. They serialize to the same output however. The performance of the code in org.roaringbitmap package is generally superior because there is no overhead due to the use of ByteBuffer instances.
Many applications use Kryo for serialization/deserialization. One can use Roaring bitmaps with Kryo efficiently thanks to a custom serializer (Kryo 5):
public class RoaringSerializer extends Serializer<RoaringBitmap> {
@Override
public void write(Kryo kryo, Output output, RoaringBitmap bitmap) {
try {
bitmap.serialize(new KryoDataOutput(output));
} catch (IOException e) {
e.printStackTrace();
throw new RuntimeException();
}
}
@Override
public RoaringBitmap read(Kryo kryo, Input input, Class<? extends RoaringBitmap> type) {
RoaringBitmap bitmap = new RoaringBitmap();
try {
bitmap.deserialize(new KryoDataInput(input));
} catch (IOException e) {
e.printStackTrace();
throw new RuntimeException();
}
return bitmap;
}
}
Though Roaring Bitmaps were designed with the 32-bit case in mind, we have extensions to 64-bit integers. We offer two classes for this purpose: Roaring64NavigableMap
and Roaring64Bitmap
.
The Roaring64NavigableMap
relies on a conventional red-black-tree. The keys are 32-bit integers representing the most significant 32~bits of elements whereas the values of the tree are 32-bit Roaring bitmaps. The 32-bit Roaring bitmaps represent the least significant bits of a set of elements.
The newer Roaring64Bitmap
approach relies on the ART data structure to hold the key/value pair. The key is made of the most significant 48bits of elements whereas the values are 16-bit Roaring containers. It is inspired by [The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases](https://db.in.tum.de/leis/papers/ART.pdf) by Leis et al. (ICDE '13).
import org.roaringbitmap.longlong.*;
// first Roaring64NavigableMap
LongBitmapDataProvider r = Roaring64NavigableMap.bitmapOf(1,2,100,1000);
r.addLong(1234);
System.out.println(r.contains(1)); // true
System.out.println(r.contains(3)); // false
LongIterator i = r.getLongIterator();
while(i.hasNext()) System.out.println(i.next());
// second Roaring64Bitmap
bitmap1 = new Roaring64Bitmap();
bitmap2 = new Roaring64Bitmap();
int k = 1 << 16;
long i = Long.MAX_VALUE / 2;
long base = i;
for (; i < base + 10000; ++i) {
bitmap1.add(i * k);
bitmap2.add(i * k);
}
b1.and(bitmap2);
RangeBitmap
is a succinct data structure supporting range queries. Each value added to the bitmap is associated with an incremental identifier, and queries produce a RoaringBitmap
of the identifiers associated with values that satisfy the query. Every value added to the bitmap is stored separately, so that if a value is added twice, it will be stored twice, and if that value is less than some threshold, there will be at least two integers in the resultant RoaringBitmap
.
It is more efficient - in terms of both time and space - to provide a maximum value. If you don't know the maximum value, provide a Long.MAX_VALUE
. Unsigned order is used like elsewhere in the library.
var appender = RangeBitmap.appender(1_000_000);
appender.add(1L);
appender.add(1L);
appender.add(100_000L);
RangeBitmap bitmap = appender.build();
RoaringBitmap lessThan5 = bitmap.lt(5); // {0,1}
RoaringBitmap greaterThanOrEqualTo1 = bitmap.gte(1); // {0, 1, 2}
RoaringBitmap greaterThan1 = bitmap.gt(1); // {2}
RangeBitmap
is can be written to disk and memory mapped:
var appender = RangeBitmap.appender(1_000_000);
appender.add(1L);
appender.add(1L);
appender.add(100_000L);
ByteBuffer buffer = mapBuffer(appender.serializedSizeInBytes());
appender.serialize(buffer);
RangeBitmap bitmap = RangeBitmap.map(buffer);
The serialization format uses little endian byte order.
To build the project you need maven (version 3).
You can download releases from github: https://github.com/RoaringBitmap/RoaringBitmap/releases
If your project depends on roaring, you can specify the dependency in the Maven "pom.xml" file:
<dependencies>
<dependency>
<groupId>org.roaringbitmap</groupId>
<artifactId>RoaringBitmap</artifactId>
<version>0.9.9</version>
</dependency>
</dependencies>
where you should replace the version number by the version you require.
Get java
./gradlew assemble
will compile
./gradlew build
will compile and run the unit tests
./gradlew test
will run the tests
./gradlew :roaringbitmap:test --tests TestIterators.testIndexIterator4
run just the test TestIterators.testIndexIterator4
./gradlew bsi:test --tests BufferBSITest.testEQ
run just the test BufferBSITest.testEQ
in the bsi
submodule
./gradlew checkstyleMain
will check that you abide by the code style and that the code compiles. We enforce a strict style so that there is no debate as to the proper way to format the code.
If you plan to contribute to RoaringBitmap, you can have load it up in your favorite IDE.
Contributions are invited. We enforce the Google Java style. Please run ./gradlew checkstyleMain
on your code before submitting a patch.
In the serialized files, part of the first 4 bytes are dedicated to a "cookie" which serves to indicate the file format.
If you try to deserialize or map a bitmap from data that has an unrecognized "cookie", the code will abort the process and report an error.
This problem will occur to all users who serialized Roaring bitmaps using versions prior to 0.4.x as they upgrade to version 0.4.x or better. These users need to refresh their serialized bitmaps.
Given N integers in [0,x), then the serialized size in bytes of a Roaring bitmap should never exceed this bound:
8 + 9 * ((long)x+65535)/65536 + 2 * N
That is, given a fixed overhead for the universe size (x), Roaring bitmaps never use more than 2 bytes per integer. You can call RoaringBitmap.maximumSerializedSize
for a more precise estimate.
There is no such thing as a data structure that is always ideal. You should make sure that Roaring bitmaps fit your application profile. There are at least two cases where Roaring bitmaps can be easily replaced by superior alternatives compression-wise:
Random random = new Random();
bitmap.select(random.nextInt(bitmap.getCardinality()));
To run JMH benchmarks, use the following command:
$ ./gradlew jmhJar
You can also run specific benchmarks...
$ ./jmh/run.sh 'org.roaringbitmap.aggregation.and.identical.*'
Download Details:
Author: RoaringBitmap
Source Code: https://github.com/RoaringBitmap/RoaringBitmap
License: Apache-2.0 license
1654876440
A Persistent Java Collections Library
PCollections serves as a persistent and immutable analogue of the Java Collections Framework. This includes efficient, thread-safe, generic, immutable, and persistent stacks, maps, vectors, sets, and bags, compatible with their Java Collections counterparts.
Persistent and immutable datatypes are increasingly appreciated as a simple, design-friendly, concurrency-friendly, and sometimes more time- and space-efficient alternative to mutable datatypes.
Note that these immutable collections are very different from the immutable collections returned by Java's Collections.unmodifiableCollection() and similar methods. The difference is that Java's unmodifiable collections have no producers, whereas PCollections have very efficient producers.
Thus if you have an unmodifiable Collection x
and you want a new Collection x2
consisting of the elements of x
in addition to some element e
, you would have to do something like:
Collection x2 = new HashSet(x);
x2.add(e);
which involves copying all of x
, using linear time and space.
If, on the other hand, you have a PCollection y
you can simply say:
PCollection y2 = y.plus(e);
which still leaves y
untouched but generally requires little or no copying, using time and space much more efficiently.
PCollections are created using producers and static factory methods. Some example static factory methods are HashTreePSet.empty()
which returns an empty PSet, while HashTreePSet.singleton(e)
returns a PSet containing just the element e
, and HashTreePSet.from(collection)
returns a PSet containing the same elements as collection
. See Example Code below for an example of using producers.
The same empty()
, singleton()
, and from()
factory methods are found in each of the PCollections implementations, which currently include one concrete implementation for each abstract type:
PCollections are highly interoperable with Java Collections: every PCollection is a java.util.Collection, every PMap is a java.util.Map, every PSequence — including every PStack and PVector — is a java.util.List, and every PSet is a java.util.Set.
PCollections uses Semantic Versioning, which establishes a strong correspondence between API changes and version numbering.
PCollections is in the Maven Central repository, under org.pcollections. Thus the Maven coordinates for PCollections are:
<dependency>
<groupId>org.pcollections</groupId>
<artifactId>pcollections</artifactId>
<version>3.1.4</version>
</dependency>
or Gradle:
compile 'org.pcollections:pcollections:3.1.4'
The following gives a very simple example of using PCollections, including the static factory method HashTreePSet.empty() and the producer plus(e):
import org.pcollections.*;
public class Example {
public static void main(String... args) {
PSet<String> set = HashTreePSet.empty();
set = set.plus("something");
System.out.println(set);
System.out.println(set.plus("something else"));
System.out.println(set);
}
}
Running this program gives the following output:
[something]
[something else, something]
[something]
To build the project from source clone the repository and then run ./gradlew
Clojure and Scala also provide persistent collections on the JVM, but they are less interoperable with Java. Both Guava and java.util.Collections
provide immutable collections but they are not persistent—that is, they do not provide efficient producers—so they are not nearly as useful. See Persistent versus Unmodifiable above.
Download Details:
Author: hrldcpr
Source Code: https://github.com/hrldcpr/pcollections
License: MIT license
1654417200
This is a Rust port of the Roaring bitmap data structure, initially defined as a Java library and described in Better bitmap performance with Roaring bitmaps.
This crate only supports the current stable version of Rust, patch releases may use new features at any time.
This project uses Clippy, rustfmt, and denies warnings in CI builds. Available via rustup component add clippy rustfmt
.
To ensure your changes will be accepted please check them with:
cargo fmt -- --check
cargo fmt --manifest-path benchmarks/Cargo.toml -- --check
cargo clippy --all-targets -- -D warnings
In addition, ensure all tests are passing with cargo test
It is recommended to run the cargo bench
command inside of the benchmarks
directory. This directory contains a library that is dedicated to benchmarking the Roaring library by using a set of real-world datasets. It is also advised to run the benchmarks on a bare-metal machine, running them on the base branch and then on the contribution PR branch to better see the changes.
Those benchmarks are designed on top of the Criterion library, you can read more about it on the user guide.
The simd
feature is in active development. It has not been tested. If you would like to build with simd
note that std::simd
is only available in Rust nightly.
Download Details:
Author: RoaringBitmap
Source Code: https://github.com/RoaringBitmap/roaring-rs
License: Apache-2.0, MIT licenses found
1654410420
A library providing enum map providing type safe enum array. It is implemented using regular Rust arrays, so using them is as fast as using regular Rust arrays.
This library doesn't provide Minimum Supported Rust Version (MSRV). If you find having MSRV valuable, please use enum-map 0.6 instead.
#[macro_use]
extern crate enum_map;
use enum_map::EnumMap;
#[derive(Debug, Enum)]
enum Example {
A,
B,
C,
}
fn main() {
let mut map = enum_map! {
Example::A => 1,
Example::B => 2,
Example::C => 3,
};
map[Example::C] = 4;
assert_eq!(map[Example::A], 1);
for (key, &value) in &map {
println!("{:?} has {} as value.", key, value);
}
}
Download Details:
Author: xfix
Source Code: https://github.com/xfix/enum-map
License: Apache-2.0, MIT licenses found
1654409880
Rust Persistent Data Structures provides fully persistent data structures with structural sharing.
To use rpds add the following to your Cargo.toml
:
[dependencies]
rpds = "<version>"
This crate offers the following data structures:
List
Your classic functional list.
use rpds::List;
let list = List::new().push_front("list");
assert_eq!(list.first(), Some(&"list"));
let a_list = list.push_front("a");
assert_eq!(a_list.first(), Some(&"a"));
let list_dropped = a_list.drop_first().unwrap();
assert_eq!(list_dropped, list);
Vector
A sequence that can be indexed. The implementation is described in Understanding Persistent Vector Part 1 and Understanding Persistent Vector Part 2.
use rpds::Vector;
let vector = Vector::new()
.push_back("I’m")
.push_back("a")
.push_back("vector");
assert_eq!(vector[1], "a");
let screaming_vector = vector
.drop_last().unwrap()
.push_back("VECTOR!!!");
assert_eq!(screaming_vector[2], "VECTOR!!!");
Stack
A LIFO (last in, first out) data structure. This is just a List
in disguise.
use rpds::Stack;
let stack = Stack::new().push("stack");
assert_eq!(stack.peek(), Some(&"stack"));
let a_stack = stack.push("a");
assert_eq!(a_stack.peek(), Some(&"a"));
let stack_popped = a_stack.pop().unwrap();
assert_eq!(stack_popped, stack);
Queue
A FIFO (first in, first out) data structure.
use rpds::Queue;
let queue = Queue::new()
.enqueue("um")
.enqueue("dois")
.enqueue("tres");
assert_eq!(queue.peek(), Some(&"um"));
let queue_dequeued = queue.dequeue().unwrap();
assert_eq!(queue_dequeued.peek(), Some(&"dois"));
HashTrieMap
A map implemented with a hash array mapped trie. See Ideal Hash Trees for details.
use rpds::HashTrieMap;
let map_en = HashTrieMap::new()
.insert(0, "zero")
.insert(1, "one");
assert_eq!(map_en.get(&1), Some(&"one"));
let map_pt = map_en
.insert(1, "um")
.insert(2, "dois");
assert_eq!(map_pt.get(&2), Some(&"dois"));
let map_pt_binary = map_pt.remove(&2);
assert_eq!(map_pt_binary.get(&2), None);
HashTrieSet
A set implemented with a HashTrieMap
.
use rpds::HashTrieSet;
let set = HashTrieSet::new()
.insert("zero")
.insert("one");
assert!(set.contains(&"one"));
let set_extended = set.insert("two");
assert!(set_extended.contains(&"two"));
let set_positive = set_extended.remove(&"zero");
assert!(!set_positive.contains(&"zero"));
RedBlackTreeMap
A map implemented with a red-black tree.
use rpds::RedBlackTreeMap;
let map_en = RedBlackTreeMap::new()
.insert(0, "zero")
.insert(1, "one");
assert_eq!(map_en.get(&1), Some(&"one"));
let map_pt = map_en
.insert(1, "um")
.insert(2, "dois");
assert_eq!(map_pt.get(&2), Some(&"dois"));
let map_pt_binary = map_pt.remove(&2);
assert_eq!(map_pt_binary.get(&2), None);
assert_eq!(map_pt_binary.first(), Some((&0, &"zero")));
RedBlackTreeSet
A set implemented with a RedBlackTreeMap
.
use rpds::RedBlackTreeSet;
let set = RedBlackTreeSet::new()
.insert("zero")
.insert("one");
assert!(set.contains(&"one"));
let set_extended = set.insert("two");
assert!(set_extended.contains(&"two"));
let set_positive = set_extended.remove(&"zero");
assert!(!set_positive.contains(&"zero"));
assert_eq!(set_positive.first(), Some(&"one"));
When you change a data structure you often do not need its previous versions. For those cases rpds offers you mutable methods which are generally faster:
use rpds::HashTrieSet;
let mut set = HashTrieSet::new();
set.insert_mut("zero");
set.insert_mut("one");
let set_0_1 = set.clone();
let set_0_1_2 = set.insert("two");
There are convenient initialization macros for all data structures:
use rpds::*;
let vector = vector![3, 1, 4, 1, 5];
let map = ht_map!["orange" => "orange", "banana" => "yellow"];
Check the documentation for initialization macros of other data structures.
All data structures in this crate can be shared between threads, but that is an opt-in ability. This is because there is a performance cost to make data structures thread safe. That cost is worth avoiding when you are not actually sharing them between threads.
Of course if you try to share a rpds data structure across different threads you can count on the rust compiler to ensure that it is safe to do so. If you are using the version of the data structure that is not thread safe you will get a compile-time error.
To create a thread-safe version of any data structure use new_sync()
:
let vec = Vector::new_sync()
.push_back(42);
Or use the _sync
variant of the initialization macro:
let vec = vector_sync!(42);
no_std
supportThis crate supports no_std
. To enable that you need to disable the default feature std
:
[dependencies]
rpds = { version = "<version>", default-features = false }
Internally the data structures in this crate maintain a lot of reference-counting pointers. These pointers are used both for links between the internal nodes of the data structure as well as for the values it stores.
There are two implementations of reference-counting pointers in the standard library: Rc
and Arc
. They behave the same way, but Arc
allows you to share the data it points to across multiple threads. The downside is that it is significantly slower to clone and drop than Rc
, and persistent data structures do a lot of those operations. In some microbenchmarks with rpds data structure we can see that using Rc
instead of Arc
can make some operations twice as fast! You can see this for yourself by running cargo bench
.
To implement this we parameterize the type of reference-counting pointer (Rc
or Arc
) as a type argument of the data structure. We use the archery crate to do this in a convenient way.
The pointer type can be parameterized like this:
let vec: Vector<u32, archery::ArcK> = Vector::new_with_ptr_kind();
// ↖
// This will use `Arc` pointers.
// Change it to `archery::RcK` to use a `Rc` pointer.
We support serialization through serde. To use it enable the serde
feature. To do so change the rpds dependency in your Cargo.toml
to
[dependencies]
rpds = { version = "<version>", features = ["serde"] }
Download Details:
Author: orium
Source Code: https://github.com/orium/rpds
License: MPL-2.0 license
1654402500
K-dimensional tree in Rust for fast geospatial indexing and nearest neighbors lookup
Add kdtree
to Cargo.toml
[dependencies]
kdtree = "0.5.1"
Add points to kdtree and query nearest n points with distance function
use kdtree::KdTree;
use kdtree::ErrorKind;
use kdtree::distance::squared_euclidean;
let a: ([f64; 2], usize) = ([0f64, 0f64], 0);
let b: ([f64; 2], usize) = ([1f64, 1f64], 1);
let c: ([f64; 2], usize) = ([2f64, 2f64], 2);
let d: ([f64; 2], usize) = ([3f64, 3f64], 3);
let dimensions = 2;
let mut kdtree = KdTree::new(dimensions);
kdtree.add(&a.0, a.1).unwrap();
kdtree.add(&b.0, b.1).unwrap();
kdtree.add(&c.0, c.1).unwrap();
kdtree.add(&d.0, d.1).unwrap();
assert_eq!(kdtree.size(), 4);
assert_eq!(
kdtree.nearest(&a.0, 0, &squared_euclidean).unwrap(),
vec![]
);
assert_eq!(
kdtree.nearest(&a.0, 1, &squared_euclidean).unwrap(),
vec![(0f64, &0)]
);
assert_eq!(
kdtree.nearest(&a.0, 2, &squared_euclidean).unwrap(),
vec![(0f64, &0), (2f64, &1)]
);
assert_eq!(
kdtree.nearest(&a.0, 3, &squared_euclidean).unwrap(),
vec![(0f64, &0), (2f64, &1), (8f64, &2)]
);
assert_eq!(
kdtree.nearest(&a.0, 4, &squared_euclidean).unwrap(),
vec![(0f64, &0), (2f64, &1), (8f64, &2), (18f64, &3)]
);
assert_eq!(
kdtree.nearest(&a.0, 5, &squared_euclidean).unwrap(),
vec![(0f64, &0), (2f64, &1), (8f64, &2), (18f64, &3)]
);
assert_eq!(
kdtree.nearest(&b.0, 4, &squared_euclidean).unwrap(),
vec![(0f64, &1), (2f64, &0), (2f64, &2), (8f64, &3)]
);
cargo bench
with 2.3 GHz Intel i5-7360U:
cargo bench
Running target/release/deps/bench-9e622e6a4ed9b92a
running 2 tests
test bench_add_to_kdtree_with_1k_3d_points ... bench: 106 ns/iter (+/- 25)
test bench_nearest_from_kdtree_with_1k_3d_points ... bench: 1,237 ns/iter (+/- 266)
test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out
Thanks Eh2406 for various fixes and perf improvements.
Download Details:
Author: mrhooray
Source Code: https://github.com/mrhooray/kdtree-rs
License: Apache-2.0, MIT licenses found
1654395120
This crate implements generic array types for Rust.
Requires minumum Rust version of 1.36.0, or 1.41.0 for From<[T; N]>
implementations
The Rust arrays [T; N]
are problematic in that they can't be used generically with respect to N
, so for example this won't work:
struct Foo<N> {
data: [i32; N]
}
generic-array defines a new trait ArrayLength<T>
and a struct GenericArray<T, N: ArrayLength<T>>
, which let the above be implemented as:
struct Foo<N: ArrayLength<i32>> {
data: GenericArray<i32, N>
}
The ArrayLength<T>
trait is implemented by default for unsigned integer types from typenum crate:
use generic_array::typenum::U5;
struct Foo<N: ArrayLength<i32>> {
data: GenericArray<i32, N>
}
fn main() {
let foo = Foo::<U5>{data: GenericArray::default()};
}
For example, GenericArray<T, U5>
would work almost like [T; 5]
:
use generic_array::typenum::U5;
struct Foo<T, N: ArrayLength<T>> {
data: GenericArray<T, N>
}
fn main() {
let foo = Foo::<i32, U5>{data: GenericArray::default()};
}
In version 0.1.1 an arr!
macro was introduced, allowing for creation of arrays as shown below:
let array = arr![u32; 1, 2, 3];
assert_eq!(array[2], 3);
Download Details:
Author: fizyk20
Source Code: https://github.com/fizyk20/generic-array
License: MIT license
1654387800
Array helpers for Rust. Some of the most common methods you would use on Arrays made available on Vectors. Polymorphic implementations for handling most of your use cases.
Add the following to your Cargo.toml file
[dependencies]
array_tool = "~1.0.3"
And in your rust files where you plan to use it put this at the top
extern crate array_tool;
And if you plan to use all of the Vector helper methods available you may do
use array_tool::vec::*;
This crate has helpful methods for strings as well.
use array_tool::iter::ZipOpt;
fn zip_option<U: Iterator>(self, other: U) -> ZipOption<Self, U>
where Self: Sized, U: IntoIterator;
// let a = vec![1];
// let b = vec![];
// a.zip_option(b).next() // input
// Some((Some(1), None)) // return value
pub fn uniques<T: PartialEq + Clone>(a: Vec<T>, b: Vec<T>) -> Vec<Vec<T>>
// array_tool::uniques(vec![1,2,3,4,5], vec![2,5,6,7,8]) // input
// vec![vec![1,3,4], vec![6,7,8]] // return value
use array_tool::vec::Uniq;
fn uniq(&self, other: Vec<T>) -> Vec<T>;
// vec![1,2,3,4,5,6].uniq( vec![1,2,5,7,9] ) // input
// vec![3,4,6] // return value
fn uniq_via<F: Fn(&T, &T) -> bool>(&self, other: Self, f: F) -> Self;
// vec![1,2,3,4,5,6].uniq_via( vec![1,2,5,7,9], |&l, r| l == r + 2 ) // input
// vec![1,2,4,6] // return value
fn unique(&self) -> Vec<T>;
// vec![1,2,1,3,2,3,4,5,6].unique() // input
// vec![1,2,3,4,5,6] // return value
fn unique_via<F: Fn(&T, &T) -> bool>(&self, f: F) -> Self;
// vec![1.0,2.0,1.4,3.3,2.1,3.5,4.6,5.2,6.2].
// unique_via( |l: &f64, r: &f64| l.floor() == r.floor() ) // input
// vec![1.0,2.0,3.3,4.6,5.2,6.2] // return value
fn is_unique(&self) -> bool;
// vec![1,2,1,3,4,3,4,5,6].is_unique() // input
// false // return value
// vec![1,2,3,4,5,6].is_unique() // input
// true // return value
use array_tool::vec::Shift;
fn unshift(&mut self, other: T); // no return value, modifies &mut self directly
// let mut x = vec![1,2,3];
// x.unshift(0);
// assert_eq!(x, vec![0,1,2,3]);
fn shift(&mut self) -> Option<T>;
// let mut x = vec![0,1,2,3];
// assert_eq!(x.shift(), Some(0));
// assert_eq!(x, vec![1,2,3]);
use array_tool::vec::Intersect;
fn intersect(&self, other: Vec<T>) -> Vec<T>;
// vec![1,1,3,5].intersect(vec![1,2,3]) // input
// vec![1,3] // return value
fn intersect_if<F: Fn(&T, &T) -> bool>(&self, other: Vec<T>, validator: F) -> Vec<T>;
// vec!['a','a','c','e'].intersect_if(vec!['A','B','C'], |l, r| l.eq_ignore_ascii_case(r)) // input
// vec!['a','c'] // return value
use array_tool::vec::Join;
fn join(&self, joiner: &'static str) -> String;
// vec![1,2,3].join(",") // input
// "1,2,3" // return value
use array_tool::vec::Times;
fn times(&self, qty: i32) -> Vec<T>;
// vec![1,2,3].times(3) // input
// vec![1,2,3,1,2,3,1,2,3] // return value
use array_tool::vec::Union;
fn union(&self, other: Vec<T>) -> Vec<T>;
// vec!["a","b","c"].union(vec!["c","d","a"]) // input
// vec![ "a", "b", "c", "d" ] // return value
use array_tool::string::ToGraphemeBytesIter;
fn grapheme_bytes_iter(&'a self) -> GraphemeBytesIter<'a>;
// let string = "a s—d féZ";
// let mut graphemes = string.grapheme_bytes_iter()
// graphemes.skip(3).next(); // input
// [226, 128, 148] // return value for emdash `—`
use array_tool::string::Squeeze;
fn squeeze(&self, targets: &'static str) -> String;
// "yellow moon".squeeze("") // input
// "yelow mon" // return value
// " now is the".squeeze(" ") // input
// " now is the" // return value
use array_tool::string::Justify;
fn justify_line(&self, width: usize) -> String;
// "asd as df asd".justify_line(16) // input
// "asd as df asd" // return value
// "asd as df asd".justify_line(18) // input
// "asd as df asd" // return value
use array_tool::string::SubstMarks;
fn subst_marks(&self, marks: Vec<usize>, chr: &'static str) -> String;
// "asdf asdf asdf".subst_marks(vec![0,5,8], "Z") // input
// "Zsdf ZsdZ asdf" // return value
use array_tool::string::WordWrap;
fn word_wrap(&self, width: usize) -> String;
// "01234 67 9 BC EFG IJ".word_wrap(6) // input
// "01234\n67 9\nBC\nEFG IJ" // return value
use array_tool::string::AfterWhitespace;
fn seek_end_of_whitespace(&self, offset: usize) -> Option<usize>;
// "asdf asdf asdf".seek_end_of_whitespace(6) // input
// Some(9) // return value
// "asdf".seek_end_of_whitespace(3) // input
// Some(0) // return value
// "asdf ".seek_end_of_whitespace(6) // input
// None // return_value
Expect methods to become more polymorphic over time (same method implemented for similar & compatible types). I plan to implement many of the methods available for Arrays in higher languages; such as Ruby. Expect regular updates.
Download Details:
Author: danielpclark
Source Code: https://github.com/danielpclark/array_tool
License: MIT license
1654380540
This crate is a Rust port of Google's high-performance SwissTable hash map, adapted to make it a drop-in replacement for Rust's standard HashMap
and HashSet
types.
The original C++ version of SwissTable can be found here, and this CppCon talk gives an overview of how the algorithm works.
Since Rust 1.36, this is now the HashMap
implementation for the Rust standard library. However you may still want to use this crate instead since it works in environments without std
, such as embedded systems and kernels.
HashMap
and HashSet
types.AHash
as the default hasher, which is much faster than SipHash.HashMap
.#[no_std]
(but requires a global allocator with the alloc
crate).Compared to the previous implementation of std::collections::HashMap
(Rust 1.35).
With the hashbrown default AHash hasher (not HashDoS-resistant):
name oldstdhash ns/iter hashbrown ns/iter diff ns/iter diff % speedup
insert_ahash_highbits 20,846 7,397 -13,449 -64.52% x 2.82
insert_ahash_random 20,515 7,796 -12,719 -62.00% x 2.63
insert_ahash_serial 21,668 7,264 -14,404 -66.48% x 2.98
insert_erase_ahash_highbits 29,570 17,498 -12,072 -40.83% x 1.69
insert_erase_ahash_random 39,569 17,474 -22,095 -55.84% x 2.26
insert_erase_ahash_serial 32,073 17,332 -14,741 -45.96% x 1.85
iter_ahash_highbits 1,572 2,087 515 32.76% x 0.75
iter_ahash_random 1,609 2,074 465 28.90% x 0.78
iter_ahash_serial 2,293 2,120 -173 -7.54% x 1.08
lookup_ahash_highbits 3,460 4,403 943 27.25% x 0.79
lookup_ahash_random 6,377 3,911 -2,466 -38.67% x 1.63
lookup_ahash_serial 3,629 3,586 -43 -1.18% x 1.01
lookup_fail_ahash_highbits 5,286 3,411 -1,875 -35.47% x 1.55
lookup_fail_ahash_random 12,365 4,171 -8,194 -66.27% x 2.96
lookup_fail_ahash_serial 4,902 3,240 -1,662 -33.90% x 1.51
With the libstd default SipHash hasher (HashDoS-resistant):
name oldstdhash ns/iter hashbrown ns/iter diff ns/iter diff % speedup
insert_std_highbits 32,598 20,199 -12,399 -38.04% x 1.61
insert_std_random 29,824 20,760 -9,064 -30.39% x 1.44
insert_std_serial 33,151 17,256 -15,895 -47.95% x 1.92
insert_erase_std_highbits 74,731 48,735 -25,996 -34.79% x 1.53
insert_erase_std_random 73,828 47,649 -26,179 -35.46% x 1.55
insert_erase_std_serial 73,864 40,147 -33,717 -45.65% x 1.84
iter_std_highbits 1,518 2,264 746 49.14% x 0.67
iter_std_random 1,502 2,414 912 60.72% x 0.62
iter_std_serial 6,361 2,118 -4,243 -66.70% x 3.00
lookup_std_highbits 21,705 16,962 -4,743 -21.85% x 1.28
lookup_std_random 21,654 17,158 -4,496 -20.76% x 1.26
lookup_std_serial 18,726 14,509 -4,217 -22.52% x 1.29
lookup_fail_std_highbits 25,852 17,323 -8,529 -32.99% x 1.49
lookup_fail_std_random 25,913 17,760 -8,153 -31.46% x 1.46
lookup_fail_std_serial 22,648 14,839 -7,809 -34.48% x 1.53
Add this to your Cargo.toml
:
[dependencies]
hashbrown = "0.9"
Then:
use hashbrown::HashMap;
let mut map = HashMap::new();
map.insert(1, "one");
This crate has the following Cargo features:
nightly
: Enables nightly-only features: #[may_dangle]
.serde
: Enables serde serialization support.rayon
: Enables rayon parallel iterator support.raw
: Enables access to the experimental and unsafe RawTable
API.inline-more
: Adds inline hints to most functions, improving run-time performance at the cost of compilation time. (enabled by default)ahash
: Compiles with ahash as default hasher. (enabled by default)ahash-compile-time-rng
: Activates the compile-time-rng
feature of ahash, to increase the DOS-resistance, but can result in issues for no_std
builds. More details in issue#124. (enabled by default)Download Details:
Author: contain-rs
Source Code: https://github.com/contain-rs/hashbrown2
License: View license
1654373160
Ternary search tree collection in rust with similar API to std::collections as it possible.
Ternary search tree is a type of trie (sometimes called a prefix tree) where nodes are arranged in a manner similar to a binary search tree, but with up to three children rather than the binary tree's limit of two. Like other prefix trees, a ternary search tree can be used as an associative map structure with the ability for incremental string search. However, ternary search trees are more space efficient compared to standard prefix trees, at the cost of speed. Common applications for ternary search trees include spell-checking and auto-completion. TSTMap and TSTSet structures for map and set like usage.
Documentation is available at http://billyevans.github.io/tst/tst
It has special methods:
Add this to your Cargo.toml
:
[dependencies]
tst = "0.10.*"
#[macro_use]
extern crate tst;
use tst::TSTMap;
let m = tstmap! {
"first" => 1,
"second" => 2,
"firstthird" => 3,
"firstsecond" => 12,
"xirst" => -13,
};
// iterate
for (key, value) in m.iter() {
println!("{}: {}", key, value);
}
assert_eq!(Some(&1), m.get("first"));
assert_eq!(5, m.len());
// calculating longest prefix
assert_eq!("firstsecond", m.longest_prefix("firstsecondthird"));
// get values with common prefix
for (key, value) in m.prefix_iter("first") {
println!("{}: {}", key, value);
}
// get sum by wildcard iterator
assert_eq!(-12, m.wildcard_iter(".irst").fold(0, |sum, (_, val)| sum + val));
#[macro_use]
extern crate tst;
use tst::TSTMap;
let m = tstmap! {
"ac" => 1,
"bd" => 2,
"cc" => 3,
};
for (k, v) in m.wildcard_iter(".c") {
println!("{} -> {}", k, v);
}
#[macro_use]
extern crate tst;
use tst::TSTMap;
let m = tstmap! {
"abc" => 1,
"abcd" => 1,
"abce" => 1,
"abca" => 1,
"zxd" => 1,
"add" => 1,
"abcdef" => 1,
};
for (key, value) in m.prefix_iter("abc") {
println!("{}: {}", key, value);
}
#[macro_use]
extern crate tst;
use tst::TSTMap;
let m = tstmap! {
"abc" => 1,
"abcd" => 1,
"abce" => 1,
"abca" => 1,
"zxd" => 1,
"add" => 1,
"abcdef" => 1,
};
assert_eq!("abcd", m.longest_prefix("abcde"));
https://en.wikipedia.org/wiki/Ternary_search_tree
Download Details:
Author: billyevans
Source Code: https://github.com/billyevans/tst
License: MIT license
1654312200
In this video, we will test out Github Copilot by using it to solve Leetcode problems.
GitHub Copilot is powered by Codex, the new AI system created by OpenAI. GitHub Copilot understands significantly more context than most code assistants. So, whether it’s in a docstring, comment, function name, or the code itself, GitHub Copilot uses the context you’ve provided and synthesizes code to match. Together with OpenAI, we’re designing GitHub Copilot to get smarter at producing safe and effective code as developers use it.