Gordon Taylor

1660638720

Yjs: Shared Data Types for Building Collaborative Software

Yjs

A CRDT framework with a powerful abstraction of shared data

Yjs is a CRDT implementation that exposes its internal data structure as shared types. Shared types are common data types like Map or Array with superpowers: changes are automatically distributed to other peers and merged without merge conflicts.

Overview

This repository contains a collection of shared types that can be observed for changes and manipulated concurrently. Network functionality and two-way-bindings are implemented in separate modules.

Bindings

Name         Cursors  Binding        Demo
ProseMirror           y-prosemirror  demo
Quill                 y-quill        demo
CodeMirror            y-codemirror   demo
Monaco                y-monaco       demo

Providers

Setting up the communication between clients, managing awareness information, and storing shared data for offline usage is quite a hassle. Providers manage all that for you and are the perfect starting point for your collaborative app.

y-webrtc

Propagates document updates peer-to-peer using WebRTC. The peers exchange signaling data over signaling servers, and publicly available signaling servers exist. Communication over the signaling servers can be encrypted by providing a shared secret, keeping the connection information and the shared document private.

y-websocket

A module that contains a simple websocket backend and a websocket client that connects to that backend. The backend can be extended to persist updates in a leveldb database.

y-indexeddb

Efficiently persists document updates to the browser's IndexedDB database. The document is immediately available and only diffs need to be synced through the network provider.

y-dat

[WIP] Write document updates efficiently to the dat network using multifeed. Each client has an append-only log of local CRDT updates (hypercore). Multifeed manages and syncs hypercores, and y-dat listens to changes and applies them to the Yjs document.

Getting Started

Install Yjs and a provider with your favorite package manager:

npm i yjs y-websocket

Start the y-websocket server:

PORT=1234 node ./node_modules/y-websocket/bin/server.js

Example: Observe types

import * as Y from 'yjs'

const doc = new Y.Doc()
const yarray = doc.getArray('my-array')
yarray.observe(event => {
  console.log('yarray was modified')
})
// every time a local or remote client modifies yarray, the observer is called
yarray.insert(0, ['val']) // => "yarray was modified"

Example: Nest types

Remember, shared types are just plain old data types. The only limitation is that a shared type must exist only once in the shared document.

const ymap = doc.getMap('map')
const foodArray = new Y.Array()
foodArray.insert(0, ['apple', 'banana'])
ymap.set('food', foodArray)
ymap.get('food') === foodArray // => true
ymap.set('fruit', foodArray) // => Error! foodArray is already defined

Now you understand how types are defined on a shared document. Next you can jump to the demo repository or continue reading the API docs.

Example: Using and combining providers

Any of the Yjs providers can be combined with each other. So you can sync data over different network technologies.

In most cases you want to use a network provider (like y-websocket or y-webrtc) in combination with a persistence provider (y-indexeddb in the browser). Persistence allows you to load the document faster and to persist data that is created while offline.

For the sake of this demo we combine two different network providers with a persistence provider.

import * as Y from 'yjs'
import { WebrtcProvider } from 'y-webrtc'
import { WebsocketProvider } from 'y-websocket'
import { IndexeddbPersistence } from 'y-indexeddb'

const ydoc = new Y.Doc()

// this allows you to instantly get the (cached) document's data
const indexeddbProvider = new IndexeddbPersistence('count-demo', ydoc)
indexeddbProvider.whenSynced.then(() => {
  console.log('loaded data from indexed db')
})

// Sync clients with the y-webrtc provider.
const webrtcProvider = new WebrtcProvider('count-demo', ydoc)

// Sync clients with the y-websocket provider
const websocketProvider = new WebsocketProvider(
  'wss://demos.yjs.dev', 'count-demo', ydoc
)

// array of numbers which produce a sum
const yarray = ydoc.getArray('count')

// observe changes of the sum
yarray.observe(event => {
  // print updates when the data changes
  // the initial value 0 keeps reduce from throwing on an empty array
  console.log('new sum: ' + yarray.toArray().reduce((a, b) => a + b, 0))
})

// add 1 to the sum
yarray.push([1]) // => "new sum: 1"

API

import * as Y from 'yjs'

Shared Types

Y.Array
 

A shareable Array-like type that supports efficient insert/delete of elements at any position. Internally it uses a linked list of Arrays that is split when necessary.

const yarray = new Y.Array()

insert(index:number, content:Array<object|boolean|Array|string|number|Uint8Array|Y.Type>)

Insert content at index. Note that content is an array of elements. I.e. array.insert(0, [1]) splices the list and inserts 1 at position 0.
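For intuition, this calling convention mirrors a plain Array.prototype.splice with a delete count of zero. The sketch below uses only built-in arrays; insertAt is a hypothetical helper introduced for illustration and is not part of Yjs.

```javascript
// Hypothetical helper (not part of Yjs): mimics the calling convention of
// yarray.insert(index, content) on a plain JavaScript array.
function insertAt (array, index, content) {
  // content is an array of elements, so it is spread into the splice call
  array.splice(index, 0, ...content)
  return array
}

const list = ['b', 'c']
insertAt(list, 0, ['a'])      // list is now ['a', 'b', 'c']
insertAt(list, 3, ['d', 'e']) // list is now ['a', 'b', 'c', 'd', 'e']
```

Passing an array rather than a single value lets one call insert several elements at once.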

push(Array<Object|boolean|Array|string|number|Uint8Array|Y.Type>)

unshift(Array<Object|boolean|Array|string|number|Uint8Array|Y.Type>)

delete(index:number, length:number)

get(index:number)

length:number

forEach(function(value:object|boolean|Array|string|number|Uint8Array|Y.Type, index:number, array: Y.Array))

map(function(T, number, YArray):M):Array<M>

toArray():Array<object|boolean|Array|string|number|Uint8Array|Y.Type>

Copies the content of this YArray to a new Array.

toJSON():Array<Object|boolean|Array|string|number>

Copies the content of this YArray to a new Array. It transforms all child types to JSON using their toJSON method.

[Symbol.Iterator]

Returns a YArray Iterator that contains the values for each index in the array.

for (let value of yarray) { .. }

observe(function(YArrayEvent, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns.
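The re-entrancy rule can be pictured with a small toy model. This is only a sketch of the described behavior, not how Yjs implements it; ObservableList and its fields are hypothetical names. A change made inside a listener is queued and delivered after the current listener call has returned.

```javascript
// Toy observable list (not Yjs): changes made inside a listener are queued
// and delivered only after the current listener call has returned.
class ObservableList {
  constructor () {
    this.items = []
    this.listeners = new Set()
    this.pending = 0 // notifications waiting to be delivered
    this.notifying = false
  }

  observe (f) { this.listeners.add(f) }
  unobserve (f) { this.listeners.delete(f) }

  push (value) {
    this.items.push(value)
    this.pending++
    if (this.notifying) return // already inside a listener: deliver later
    this.notifying = true
    while (this.pending > 0) {
      this.pending--
      for (const f of this.listeners) f(this)
    }
    this.notifying = false
  }
}

const list = new ObservableList()
const log = []
list.observe(l => {
  log.push('start ' + l.items.length)
  if (l.items.length < 2) l.push('from listener')
  log.push('end ' + l.items.length)
})
list.push('first')
// log is ['start 1', 'end 2', 'start 2', 'end 2']
```

The second listener call starts only after the first one has finished, so listener invocations never nest.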

unobserve(function(YArrayEvent, Transaction):void)

Removes an observe event listener from this type.

observeDeep(function(Array<YEvent>, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type or any of its children is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns. The event listener receives all Events created by itself or any of its children.

unobserveDeep(function(Array<YEvent>, Transaction):void)

Removes an observeDeep event listener from this type.

Y.Map
 

A shareable Map type.

const ymap = new Y.Map()

get(key:string):object|boolean|string|number|Uint8Array|Y.Type

set(key:string, value:object|boolean|string|number|Uint8Array|Y.Type)

delete(key:string)

has(key:string):boolean

toJSON():Object<string, Object|boolean|Array|string|number|Uint8Array>

Copies the [key,value] pairs of this YMap to a new Object. It transforms all child types to JSON using their toJSON method.

forEach(function(value:object|boolean|Array|string|number|Uint8Array|Y.Type, key:string, map: Y.Map))

Execute the provided function once for every key-value pair.

[Symbol.Iterator]

Returns an Iterator of [key, value] pairs.

for (let [key, value] of ymap) { .. }

entries()

Returns an Iterator of [key, value] pairs.

values()

Returns an Iterator of all values.

keys()

Returns an Iterator of all keys.

observe(function(YMapEvent, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns.

unobserve(function(YMapEvent, Transaction):void)

Removes an observe event listener from this type.

observeDeep(function(Array<YEvent>, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type or any of its children is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns. The event listener receives all Events created by itself or any of its children.

unobserveDeep(function(Array<YEvent>, Transaction):void)

Removes an observeDeep event listener from this type.

Y.Text
 

A shareable type that is optimized for shared editing on text. It allows you to assign properties to ranges in the text, which makes it possible to implement rich-text bindings to this type.

This type can also be transformed to the delta format. Similarly the YTextEvents compute changes as deltas.

const ytext = new Y.Text()

insert(index:number, content:string, [formattingAttributes:Object<string,string>])

Insert a string at index and assign formatting attributes to it.

ytext.insert(0, 'bold text', { bold: true })

delete(index:number, length:number)

format(index:number, length:number, formattingAttributes:Object<string,string>)

Assign formatting attributes to a range in the text.

applyDelta(delta, opts:Object<string,any>)

See Quill Delta. The sanitize option (default: true) removes ending newlines from the delta; pass sanitize: false to keep them.

ytext.applyDelta(delta, { sanitize: false })

length:number

toString():string

Transforms this type, without formatting options, into a string.

toJSON():string

See toString

toDelta():Delta

Transforms this type to a Quill Delta

observe(function(YTextEvent, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns.

unobserve(function(YTextEvent, Transaction):void)

Removes an observe event listener from this type.

observeDeep(function(Array<YEvent>, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type or any of its children is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns. The event listener receives all Events created by itself or any of its children.

unobserveDeep(function(Array<YEvent>, Transaction):void)

Removes an observeDeep event listener from this type.

Y.XmlFragment
 

A container that holds an Array of Y.XmlElements.

const yxml = new Y.XmlFragment()

insert(index:number, content:Array<Y.XmlElement|Y.XmlText>)

delete(index:number, length:number)

get(index:number)

length:number

toArray():Array<Y.XmlElement|Y.XmlText>

Copies the children to a new Array.

toDOM():DocumentFragment

Transforms this type and all children to new DOM elements.

toString():string

Get the XML serialization of all descendants.

toJSON():string

See toString.

observe(function(YXmlEvent, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns.

unobserve(function(YXmlEvent, Transaction):void)

Removes an observe event listener from this type.

observeDeep(function(Array<YEvent>, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type or any of its children is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns. The event listener receives all Events created by itself or any of its children.

unobserveDeep(function(Array<YEvent>, Transaction):void)

Removes an observeDeep event listener from this type.

Y.XmlElement
 

A shareable type that represents an XML Element. It has a nodeName, attributes, and a list of children. It makes no effort to validate its content or to be XML compliant.

const yxml = new Y.XmlElement()

insert(index:number, content:Array<Y.XmlElement|Y.XmlText>)

delete(index:number, length:number)

get(index:number)

length:number

setAttribute(attributeName:string, attributeValue:string)

removeAttribute(attributeName:string)

getAttribute(attributeName:string):string

getAttributes():Object<string,string>

toArray():Array<Y.XmlElement|Y.XmlText>

Copies the children to a new Array.

toDOM():Element

Transforms this type and all children to a new DOM element.

toString():string

Get the XML serialization of all descendants.

toJSON():string

See toString.

observe(function(YXmlEvent, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns.

unobserve(function(YXmlEvent, Transaction):void)

Removes an observe event listener from this type.

observeDeep(function(Array<YEvent>, Transaction):void)

Adds an event listener to this type that will be called synchronously every time this type or any of its children is modified. In the case this type is modified in the event listener, the event listener will be called again after the current event listener returns. The event listener receives all Events created by itself or any of its children.

unobserveDeep(function(Array<YEvent>, Transaction):void)

Removes an observeDeep event listener from this type.

Y.Doc

const doc = new Y.Doc()

clientID

A unique id that identifies this client. (readonly)

gc

Whether garbage collection is enabled on this doc instance. Set `doc.gc = false` in order to disable gc and be able to restore old content. See https://github.com/yjs/yjs#yjs-crdt-algorithm for more information about gc in Yjs.

transact(function(Transaction):void [, origin:any])

Every change on the shared document happens in a transaction. Observer calls and the update event are called after each transaction. You should bundle changes into a single transaction to reduce the amount of event calls. I.e. doc.transact(() => { yarray.insert(..); ymap.set(..) }) triggers a single change event. 
You can specify an optional origin parameter that is stored on transaction.origin and on('update', (update, origin) => ..).
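To illustrate why bundling reduces event calls, here is a toy model of transaction batching. It is a sketch under simplifying assumptions, not the Yjs implementation; ToyDoc, changedKeys, and the listener shape are hypothetical.

```javascript
// Toy model (not Yjs): changes made inside transact() are collected and
// reported to listeners once, when the outermost transaction ends.
class ToyDoc {
  constructor () {
    this.data = new Map()
    this.listeners = []
    this.current = null // the running transaction, if any
  }

  on (f) { this.listeners.push(f) }

  transact (f, origin = null) {
    if (this.current !== null) return f(this.current) // nested: reuse it
    const transaction = { origin, changedKeys: new Set() }
    this.current = transaction
    try {
      f(transaction)
    } finally {
      this.current = null
      if (transaction.changedKeys.size > 0) {
        this.listeners.forEach(l => l(transaction))
      }
    }
  }

  set (key, value) {
    // every change runs in a transaction, even a single set()
    this.transact(tr => {
      this.data.set(key, value)
      tr.changedKeys.add(key)
    })
  }
}

const toyDoc = new ToyDoc()
const events = []
toyDoc.on(tr => events.push({ origin: tr.origin, keys: [...tr.changedKeys] }))

toyDoc.set('a', 1) // one event
toyDoc.transact(() => {
  toyDoc.set('b', 2)
  toyDoc.set('c', 3)
}, 'my-origin') // one event covering both changes
```

After this runs, events holds two entries: one for 'a' with origin null, and one for 'b' and 'c' with origin 'my-origin'.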

get(string, Y.[TypeClass]):[Type]

Define a shared type.

getArray(string):Y.Array

Define a shared Y.Array type. Is equivalent to y.get(string, Y.Array).

getMap(string):Y.Map

Define a shared Y.Map type. Is equivalent to y.get(string, Y.Map).

getXmlFragment(string):Y.XmlFragment

Define a shared Y.XmlFragment type. Is equivalent to y.get(string, Y.XmlFragment).

on(string, function)

Register an event listener on the shared document.

off(string, function)

Unregister an event listener from the shared document.

Y.Doc Events

on('update', function(updateMessage:Uint8Array, origin:any, Y.Doc):void)

Listen to document updates. Document updates must be transmitted to all other peers. You can apply document updates in any order and multiple times.

on('beforeTransaction', function(Y.Transaction, Y.Doc):void)

Emitted before each transaction.

on('afterTransaction', function(Y.Transaction, Y.Doc):void)

Emitted after each transaction.

Document Updates

Changes on the shared document are encoded into document updates. Document updates are commutative and idempotent. This means that they can be applied in any order and multiple times.
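One way to see why these two properties hold is to treat a document as a set of uniquely identified operations. The following is a toy model only; real Yjs updates are binary-encoded and far richer, and applyToyUpdate and toyContent are hypothetical names.

```javascript
// Toy model (not Yjs): a document is a set of uniquely identified operations.
// Applying an update is a set union keyed by id, so order and repetition
// do not matter.
function applyToyUpdate (doc, update) {
  for (const op of update) {
    const key = op.client + ':' + op.clock
    if (!doc.has(key)) doc.set(key, op) // ignore ops that were already applied
  }
}

// Deterministic view of the document, independent of insertion order
function toyContent (doc) {
  return [...doc.keys()].sort().map(key => doc.get(key).value)
}

const updateA = [{ client: 1, clock: 0, value: 'x' }]
const updateB = [{ client: 2, clock: 0, value: 'y' }]

const replica1 = new Map()
applyToyUpdate(replica1, updateA)
applyToyUpdate(replica1, updateB)

const replica2 = new Map()
applyToyUpdate(replica2, updateB) // different order
applyToyUpdate(replica2, updateA)
applyToyUpdate(replica2, updateA) // applied twice

// toyContent(replica1) and toyContent(replica2) are both ['x', 'y']
```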

Example: Listen to update events and apply them on remote client

const doc1 = new Y.Doc()
const doc2 = new Y.Doc()

doc1.on('update', update => {
  Y.applyUpdate(doc2, update)
})

doc2.on('update', update => {
  Y.applyUpdate(doc1, update)
})

// All changes are also applied to the other document
doc1.getArray('myarray').insert(0, ['Hello doc2, you got this?'])
doc2.getArray('myarray').get(0) // => 'Hello doc2, you got this?'

Yjs internally maintains a state vector that denotes the next expected clock from each client. In a different interpretation it holds the number of structs created by each client. When two clients sync, you can either exchange the complete document structure or only the differences by sending the state vector to compute the differences.

Example: Sync two clients by exchanging the complete document structure

const state1 = Y.encodeStateAsUpdate(ydoc1)
const state2 = Y.encodeStateAsUpdate(ydoc2)
Y.applyUpdate(ydoc1, state2)
Y.applyUpdate(ydoc2, state1)

Example: Sync two clients by computing the differences

This example shows how to sync two clients with the minimal amount of exchanged data by computing only the differences using the state vector of the remote client. Syncing clients using the state vector requires another roundtrip, but can save a lot of bandwidth.

const stateVector1 = Y.encodeStateVector(ydoc1)
const stateVector2 = Y.encodeStateVector(ydoc2)
const diff1 = Y.encodeStateAsUpdate(ydoc1, stateVector2)
const diff2 = Y.encodeStateAsUpdate(ydoc2, stateVector1)
Y.applyUpdate(ydoc1, diff2)
Y.applyUpdate(ydoc2, diff1)

Y.applyUpdate(Y.Doc, update:Uint8Array, [transactionOrigin:any])

Apply a document update on the shared document. Optionally you can specify transactionOrigin that will be stored on transaction.origin and ydoc.on('update', (update, origin) => ..).

Y.encodeStateAsUpdate(Y.Doc, [encodedTargetStateVector:Uint8Array]):Uint8Array

Encode the document state as a single update message that can be applied on the remote document. Optionally specify the target state vector to only write the differences to the update message.

Y.encodeStateVector(Y.Doc):Uint8Array

Computes the state vector and encodes it into a Uint8Array.

Relative Positions

This API is not stable yet

This feature is intended for managing selections / cursors. When working with other users that manipulate the shared document, you can't trust that an index position (an integer) will stay at the intended location. A relative position is fixated to an element in the shared document and is not affected by remote changes. I.e. given the document "a|c", the relative position is attached to c. When a remote user modifies the document by inserting a character before the cursor, the cursor will stay attached to the character c. insert(1, 'x')("a|c") = "ax|c". When the relative position is set to the end of the document, it will stay attached to the end of the document.
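The "a|c" scenario can be reproduced with a toy model in which every character carries a unique id. This is a sketch of the idea only, not the Yjs data structure; makeChars, toRelative, and toAbsolute are hypothetical helpers, and deleted characters are not handled.

```javascript
// Toy model (not Yjs): every character carries a unique id. A relative
// position stores the id of the character to its right, or null for
// "end of document".
let nextId = 0
const makeChars = str => [...str].map(ch => ({ id: nextId++, ch }))

function toRelative (chars, index) {
  return index < chars.length ? { rightId: chars[index].id } : { rightId: null }
}

function toAbsolute (chars, relPos) {
  if (relPos.rightId === null) return chars.length
  return chars.findIndex(c => c.id === relPos.rightId)
}

const chars = makeChars('ac')
const cursor = toRelative(chars, 1) // "a|c", attached to 'c'
const atEnd = toRelative(chars, 2)  // "ac|", attached to the end

// a remote user inserts 'x' at index 1, before the cursor
chars.splice(1, 0, ...makeChars('x'))

const cursorIndex = toAbsolute(chars, cursor) // 2, i.e. "ax|c"
const endIndex = toAbsolute(chars, atEnd)     // 3, still the end of the document
```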

Example: Transform to RelativePosition and back

const relPos = Y.createRelativePositionFromTypeIndex(ytext, 2)
const pos = Y.createAbsolutePositionFromRelativePosition(relPos, doc)
pos.type === ytext // => true
pos.index === 2 // => true

Example: Send relative position to remote client (json)

const relPos = Y.createRelativePositionFromTypeIndex(ytext, 2)
const encodedRelPos = JSON.stringify(relPos)
// send encodedRelPos to remote client..
const parsedRelPos = JSON.parse(encodedRelPos)
const pos = Y.createAbsolutePositionFromRelativePosition(parsedRelPos, remoteDoc)
pos.type === remoteytext // => true
pos.index === 2 // => true

Example: Send relative position to remote client (Uint8Array)

const relPos = Y.createRelativePositionFromTypeIndex(ytext, 2)
const encodedRelPos = Y.encodeRelativePosition(relPos)
// send encodedRelPos to remote client..
const parsedRelPos = Y.decodeRelativePosition(encodedRelPos)
const pos = Y.createAbsolutePositionFromRelativePosition(parsedRelPos, remoteDoc)
pos.type === remoteytext // => true
pos.index === 2 // => true

Y.createRelativePositionFromTypeIndex(Uint8Array|Y.Type, number)

Y.createAbsolutePositionFromRelativePosition(RelativePosition, Y.Doc)

Y.encodeRelativePosition(RelativePosition):Uint8Array

Y.decodeRelativePosition(Uint8Array):RelativePosition

Y.UndoManager

Yjs ships with an Undo/Redo manager for selective undo/redo of changes on a Yjs type. The changes can be optionally scoped to transaction origins.

const ytext = doc.getText('text')
const undoManager = new Y.UndoManager(ytext)

ytext.insert(0, 'abc')
undoManager.undo()
ytext.toString() // => ''
undoManager.redo()
ytext.toString() // => 'abc'

constructor(scope:Y.AbstractType|Array<Y.AbstractType> [, {captureTimeout:number,trackedOrigins:Set<any>,deleteFilter:function(item):boolean}])

Accepts either a single type as scope or an array of types.

undo()

redo()

stopCapturing()

on('stack-item-added', { stackItem: { meta: Map<any,any> }, type: 'undo' | 'redo' })

Register an event that is called when a StackItem is added to the undo- or the redo-stack.

on('stack-item-popped', { stackItem: { meta: Map<any,any> }, type: 'undo' | 'redo' }) 

Register an event that is called when a StackItem is popped from the undo- or the redo-stack.

Example: Stop Capturing

UndoManager merges Undo-StackItems if they are created within a time gap smaller than options.captureTimeout. Call um.stopCapturing() so that the next StackItem won't be merged.
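The merging rule can be sketched with a toy undo stack. This is an illustration under assumptions, not the UndoManager source; ToyUndoStack is a hypothetical class, and timestamps are passed in explicitly to keep the example deterministic.

```javascript
// Toy model (not Yjs): changes that arrive within `captureTimeout` ms of the
// previous one are merged into the same stack item.
class ToyUndoStack {
  constructor (captureTimeout = 500) {
    this.captureTimeout = captureTimeout
    this.stack = []
    this.lastChange = -Infinity
  }

  // `now` is a timestamp in milliseconds supplied by the caller
  addChange (change, now) {
    const top = this.stack[this.stack.length - 1]
    if (top !== undefined && now - this.lastChange < this.captureTimeout) {
      top.push(change) // merge into the current stack item
    } else {
      this.stack.push([change]) // start a new stack item
    }
    this.lastChange = now
  }

  stopCapturing () {
    this.lastChange = -Infinity // the next change starts a new stack item
  }
}

const undoStack = new ToyUndoStack(500)
undoStack.addChange('insert a', 0)
undoStack.addChange('insert b', 100)  // merged with 'insert a'
undoStack.stopCapturing()
undoStack.addChange('insert c', 200)  // new stack item despite the short gap
undoStack.addChange('insert d', 1000) // new stack item, the gap is 800 ms
// undoStack.stack is [['insert a', 'insert b'], ['insert c'], ['insert d']]
```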

// without stopCapturing
ytext.insert(0, 'a')
ytext.insert(1, 'b')
undoManager.undo()
ytext.toString() // => '' (note that 'ab' was removed)
// with stopCapturing
ytext.insert(0, 'a')
undoManager.stopCapturing()
ytext.insert(0, 'b')
undoManager.undo()
ytext.toString() // => 'a' (note that only 'b' was removed)

Example: Specify tracked origins

Every change on the shared document has an origin. If no origin was specified, it defaults to null. By specifying trackedOrigins you can selectively specify which changes should be tracked by UndoManager. The UndoManager instance is always added to trackedOrigins.
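The example that follows suggests a simple filtering rule: an origin is tracked if it is in the set itself, or if its constructor is in the set. The sketch below spells that rule out; isTracked is a hypothetical function written for illustration, and the rule is an assumption inferred from the example rather than a quote of the Yjs source.

```javascript
// Sketch of the filtering rule (an assumption, not Yjs source code): an
// origin is tracked if it is in the set, or if its constructor is in the set.
function isTracked (trackedOrigins, origin) {
  if (trackedOrigins.has(origin)) return true
  return origin != null && trackedOrigins.has(origin.constructor)
}

class CustomBinding {}
const trackedOrigins = new Set([42, CustomBinding])

const results = [
  isTracked(trackedOrigins, null),               // false
  isTracked(trackedOrigins, 42),                 // true
  isTracked(trackedOrigins, 41),                 // false
  isTracked(trackedOrigins, new CustomBinding()) // true
]
```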

class CustomBinding {}

const ytext = doc.getText('text')
const undoManager = new Y.UndoManager(ytext, {
  trackedOrigins: new Set([42, CustomBinding])
})

ytext.insert(0, 'abc')
undoManager.undo()
ytext.toString() // => 'abc' (not tracked because the origin is `null`, which is
                 //           not part of `trackedOrigins`)
ytext.delete(0, 3) // revert change

doc.transact(() => {
  ytext.insert(0, 'abc')
}, 42)
undoManager.undo()
ytext.toString() // => '' (tracked because 42 is part of `trackedOrigins`)

doc.transact(() => {
  ytext.insert(0, 'abc')
}, 41)
undoManager.undo()
ytext.toString() // => 'abc' (not tracked because 41 is not part of
                 //           `trackedOrigins`)
ytext.delete(0, 3) // revert change

doc.transact(() => {
  ytext.insert(0, 'abc')
}, new CustomBinding())
undoManager.undo()
ytext.toString() // => '' (tracked because origin is a `CustomBinding` and
                 //        `CustomBinding` is in `trackedOrigins`)

Example: Add additional information to the StackItems

When undoing or redoing a previous action, it is often expected to restore additional meta information like the cursor location or the view on the document. You can assign meta-information to Undo-/Redo-StackItems.

const ytext = doc.getText('text')
const undoManager = new Y.UndoManager(ytext, {
  trackedOrigins: new Set([42, CustomBinding])
})

undoManager.on('stack-item-added', event => {
  // save the current cursor location on the stack-item
  event.stackItem.meta.set('cursor-location', getRelativeCursorLocation())
})

undoManager.on('stack-item-popped', event => {
  // restore the cursor location that was saved on the stack-item
  restoreCursorLocation(event.stackItem.meta.get('cursor-location'))
})

Miscellaneous

TypeScript Declarations

Yjs has type descriptions. But until this ticket is fixed, this is how you can make use of Yjs type declarations.

{
  "compilerOptions": {
    "allowJs": true,
    "checkJs": true,
    "maxNodeModuleJsDepth": 5
  }
}

Yjs CRDT Algorithm

Conflict-free replicated data types (CRDT) for collaborative editing are an alternative approach to operational transformation (OT). A very simple differentiation between the two approaches is that OT attempts to transform index positions to ensure convergence (all clients end up with the same content), while CRDTs use mathematical models that usually do not involve index transformations, like linked lists. OT is currently the de-facto standard for shared editing on text. OT approaches that support shared editing without a central source of truth (a central server) require too much bookkeeping to be viable in practice. CRDTs are better suited for distributed systems, provide additional guarantees that the document can be synced with remote clients, and do not require a central source of truth.
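A short sketch shows why plain index positions are not enough. Two replicas receive the same two index-based inserts in a different order and apply them without any transformation. The insertAt helper and the operation shape are hypothetical and exist only for this illustration.

```javascript
// Two replicas start from "ac" and receive the same two index-based
// inserts in a different order, without any transformation.
const insertAt = (text, index, ch) => text.slice(0, index) + ch + text.slice(index)

const opA = { index: 1, ch: 'b' } // user A wants "abc"
const opB = { index: 0, ch: 'x' } // user B wants "xac"

const replica1 = insertAt(insertAt('ac', opA.index, opA.ch), opB.index, opB.ch)
const replica2 = insertAt(insertAt('ac', opB.index, opB.ch), opA.index, opA.ch)

// replica1 is "xabc", replica2 is "xbac": the replicas have diverged
```

An OT system would shift the index of opA from 1 to 2 on the second replica. A CRDT avoids the problem by addressing neighbouring elements by id instead of by index.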

Yjs implements a modified version of the algorithm described in this paper. I will eventually publish a paper that describes why this approach works so well in practice. Note: Since operations make up the document structure, we prefer the term struct now.

CRDTs suitable for shared text editing suffer from the fact that they only grow in size. There are CRDTs that do not grow in size, but they do not have the characteristics that are beneficial for shared text editing (like intention preservation). Yjs implements many improvements to the original algorithm that diminish the trade-off that the document only grows in size. We can't garbage collect deleted structs (tombstones) while ensuring a unique order of the structs. But we can 1. merge preceding structs into a single struct to reduce the amount of meta information, 2. delete content from the struct if it is deleted, and 3. garbage collect tombstones if we don't care about the order of the structs anymore (e.g. if the parent was deleted).

Examples:

  1. If a user inserts elements in sequence, the structs will be merged into a single struct. E.g. array.insert(0, ['a']), array.insert(1, ['b']); is first represented as two structs ([{id: {client, clock: 0}, content: 'a'}, {id: {client, clock: 1}, content: 'b'}]) and then merged into a single struct: [{id: {client, clock: 0}, content: 'ab'}].
  2. When a struct that contains content (e.g. ItemString) is deleted, the struct will be replaced with an ItemDeleted that does not contain content anymore.
  3. When a type is deleted, all child elements are transformed to GC structs. A GC struct only denotes the existence of a struct and that it is deleted. GC structs can always be merged with other GC structs if the ids are adjacent.
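The merging described in these examples can be sketched as a single pass over a list of structs. This is a toy model rather than the Yjs internals; mergeStructs and the length and deleted fields are hypothetical names chosen for illustration.

```javascript
// Toy model (not Yjs internals): merge neighbouring structs when they come
// from the same client, have consecutive clocks, and agree on deletion state.
function mergeStructs (structs) {
  const merged = []
  for (const s of structs) {
    const last = merged[merged.length - 1]
    const adjacent = last !== undefined &&
      last.id.client === s.id.client &&
      last.id.clock + last.length === s.id.clock &&
      last.deleted === s.deleted
    if (adjacent) {
      last.length += s.length
      if (!last.deleted) last.content += s.content
    } else {
      merged.push({ ...s, id: { ...s.id } }) // copy so the input is not mutated
    }
  }
  return merged
}

const structs = [
  { id: { client: 1, clock: 0 }, length: 1, deleted: false, content: 'a' },
  { id: { client: 1, clock: 1 }, length: 1, deleted: false, content: 'b' },
  { id: { client: 1, clock: 2 }, length: 1, deleted: true },
  { id: { client: 1, clock: 3 }, length: 1, deleted: true }
]
const result = mergeStructs(structs)
// result has two entries: 'ab' (clock 0, length 2) and a deleted range
// (clock 2, length 2) that carries no content
```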

Especially when working on structured content (e.g. shared editing on ProseMirror), these improvements yield very good results when benchmarking random document edits. In practice they show even better results, because users usually edit text in sequence, resulting in structs that can easily be merged. The benchmarks show that even in the worst case scenario that a user edits text from right to left, Yjs achieves good performance even for huge documents.

State Vector

Yjs has the ability to exchange only the differences when syncing two clients. We use Lamport timestamps to identify structs and to track in which order a client created them. Each struct has a struct.id = { client: number, clock: number } that uniquely identifies it. We define the next expected clock by each client as the state vector. This data structure is similar to the version vectors data structure. But we use state vectors only to describe the state of the local document, so we can compute the missing structs of the remote client. We do not use it to track causality.
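As a rough illustration of this idea, the sketch below computes a state vector from a flat list of structs and uses a remote state vector to select the structs the remote side is missing. It is a toy model with hypothetical names (toyStateVector, toyDiff) and assumes that clocks count up from 0 per client without gaps; the real encoding is binary.

```javascript
// Toy model (not Yjs): each struct is { client, clock, value } and clocks
// count up from 0 per client without gaps.
function toyStateVector (structs) {
  const sv = new Map()
  for (const { client, clock } of structs) {
    sv.set(client, Math.max(sv.get(client) || 0, clock + 1)) // next expected clock
  }
  return sv
}

// Structs that the remote side, described by its state vector, is missing
function toyDiff (structs, remoteStateVector) {
  return structs.filter(s => s.clock >= (remoteStateVector.get(s.client) || 0))
}

const local = [
  { client: 1, clock: 0, value: 'a' },
  { client: 1, clock: 1, value: 'b' },
  { client: 2, clock: 0, value: 'x' }
]
const remote = [
  { client: 1, clock: 0, value: 'a' }
]

const remoteSv = toyStateVector(remote) // client 1 -> next expected clock 1
const missing = toyDiff(local, remoteSv)
// missing contains client 1 / clock 1 ('b') and client 2 / clock 0 ('x')
```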


Yjs is network agnostic (p2p!), supports many existing rich text editors, offline editing, version snapshots, undo/redo and shared cursors. It scales well with an unlimited number of users and is well suited for even large documents.

👷‍♀️ If you are looking for professional support, please consider supporting this project via a "support contract" on GitHub Sponsors. I will attend your issues quicker and we can discuss questions and problems in regular video conferences. Otherwise you can find help on our community discussion board.

Who is using Yjs

  • Relm: A collaborative gameworld for teamwork and community.
  • Input: A collaborative note taking app.
  • Room.sh: A meeting application with integrated collaborative drawing, editing, and coding tools.
  • http://coronavirustechhandbook.com/: A collaborative wiki that is edited by thousands of different people to work on a rapid and sophisticated response to the coronavirus outbreak and subsequent impacts.
  • Nimbus Note A note-taking app designed by Nimbus Web.
  • JoeDocs An open collaborative wiki.
  • Pluxbox RadioManager A web-based app to collaboratively organize radio broadcasts.
  • Cattaz A wiki that can run custom applications in the wiki pages.

Download Details:

Author: yjs
Source Code: https://github.com/yjs/yjs 
License: View license

#javascript #decentralized #realtime  

What is GEEK

Buddha Community

Yjs: Shared Data Types for Building Collaborative Software
 iOS App Dev

iOS App Dev

1620466520

Your Data Architecture: Simple Best Practices for Your Data Strategy

If you accumulate data on which you base your decision-making as an organization, you should probably think about your data architecture and possible best practices.

If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points from our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition

Gerhard  Brink

Gerhard Brink

1620629020

Getting Started With Data Lakes

Frameworks for Efficient Enterprise Analytics

The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.

This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.

Introduction

As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).


This is a preview of the Getting Started With Data Lakes Refcard. To read the entire Refcard, please download the PDF from the link above.

#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management

Arvel  Parker

Arvel Parker

1593156510

Basic Data Types in Python | Python Web Development For Beginners

At the end of 2019, Python is one of the fastest-growing programming languages. More than 10% of developers have opted for Python development.

In the programming world, Data types play an important role. Each Variable is stored in different data types and responsible for various functions. Python had two different objects, and They are mutable and immutable objects.

Table of Contents

I Mutable objects

II Immutable objects

III Built-in data types in Python

Mutable objects

Objects whose size, value, and sequence of elements can be modified after creation are called mutable objects.

Mutable data types are list, dict, set, and bytearray.

Immutable objects

Objects whose size, value, and sequence of elements cannot be modified after creation are called immutable objects.

Immutable data types are int, float, complex, str, tuple, bytes, and frozenset.

id() and type() are used to find the identity and the data type of an object.

a = 25 + 85j
type(a)
Output: <class 'complex'>

b = {1: 10, 2: "Pinky"}
id(b)
Output: 238989244168 (the exact value differs from run to run)
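The difference between mutable and immutable objects can be checked with id(). The following is a minimal sketch; the variable names are illustrative and do not come from the original article.

```python
# Lists are mutable: an in-place change keeps the same object identity.
numbers = [1, 2, 3]
list_id_before = id(numbers)
numbers.append(4)
list_same_object = id(numbers) == list_id_before

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 99
    tuple_error = None
except TypeError as exc:
    tuple_error = type(exc).__name__

# Strings are immutable: "changing" one builds a new string and rebinds
# the name, so any other reference still sees the old value.
greeting = "Hello"
original = greeting
greeting += " Python"

print(list_same_object, tuple_error, original, greeting)
# prints: True TypeError Hello Hello Python
```

The list keeps its identity after append(), while the tuple rejects the assignment and the original string is left untouched.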

Built-in data types in Python

a = str("Hello python world")  # str
b = int(18)  # int
c = float(20482.5)  # float
d = complex(5 + 85j)  # complex
e = list(("python", "fast", "growing", "in", 2018))  # list
f = tuple(("python", "easy", "learning"))  # tuple
g = range(10)  # range
h = dict(name="Vidu", age=36)  # dict
i = set(("python", "fast", "growing", "in", 2018))  # set
j = frozenset(("python", "fast", "growing", "in", 2018))  # frozenset
k = bool(18)  # bool
l = bytes(8)  # bytes
m = bytearray(8)  # bytearray
n = memoryview(bytes(18))  # memoryview
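To see what some of these constructors actually produce, the values can be inspected with type() and repr(). This is a small sketch: the samples dictionary and its labels are made up for illustration, and the frozenset here uses a shorter tuple with a repeated item to show that duplicates collapse.

```python
# Build a few of the values listed above and inspect them.
samples = {
    "bool(18)": bool(18),          # any non-zero int is truthy
    "bytes(8)": bytes(8),          # eight zero bytes, immutable
    "bytearray(8)": bytearray(8),  # eight zero bytes, mutable
    "range(10)": range(10),        # lazy sequence 0..9
    "frozenset": frozenset(("python", "fast", "python")),
}

# Map each label to the name of its type.
type_names = {label: type(value).__name__ for label, value in samples.items()}

for label, value in samples.items():
    print(f"{label:>12} -> {type_names[label]}: {value!r}")
```

Note that bytes(8) and bytearray(8) do not hold the number 8; they create a buffer of eight zero bytes.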

Numbers (int, float, complex)

Numbers are stored in numeric types. When a number is assigned to a variable, Python creates a number object.

# signed integer
age = 18
print(age)
Output: 18

Python supports 3 types of numeric data.

int (signed integers like 20, 2, 225, etc.)

float (float is used to store floating-point numbers like 9.8, 3.1444, 89.52, etc.)

complex (complex numbers like 8.94j, 4.0 + 7.3j, etc.)

A complex number is an ordered pair (a, b), written a + bj in Python, where a and b denote the real and imaginary parts respectively.
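The three numeric types and the parts of a complex number can be checked directly. A short sketch, with values taken from the examples above where possible; the variable names are illustrative.

```python
age = 18          # int
gravity = 9.8     # float
z = 4.0 + 7.3j    # complex

numeric_types = [type(v).__name__ for v in (age, gravity, z)]

real_part = z.real                # the "a" in a + bj
imag_part = z.imag                # the "b" in a + bj
magnitude = abs(complex(3, 4))    # distance from the origin

print(numeric_types, real_part, imag_part, magnitude)
# prints: ['int', 'float', 'complex'] 4.0 7.3 5.0
```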

String

A string is a sequence of characters enclosed in quotation marks. In Python, strings can be defined with single, double, or triple quotes.

# String handling

'Hello Python'  # single-quoted (') string

"Hello Python"  # double-quoted (") string

"""Hello Python"""  # triple-quoted (""") string

'''Hello Python'''  # triple-quoted (''') string
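The quoting styles above all produce the same kind of object. The sketch below illustrates this and shows where the choice of quote matters; the variable names are illustrative.

```python
single = 'Hello Python'
double = "Hello Python"
triple = """Hello Python"""

# The quote style is only syntax; the resulting strings are equal.
all_equal = single == double == triple

# Triple quotes allow a string to span several lines.
multi_line = """Hello
Python"""
line_count = len(multi_line.splitlines())

# Using one quote style lets the other appear inside without escaping.
with_apostrophe = "It's easy"

print(all_equal, line_count, with_apostrophe)
# prints: True 2 It's easy
```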

In Python, string handling is straightforward: the language provides many built-in functions and operators for working with strings.

The + operator concatenates strings, and the * operator repeats a string.

"Hello " + "python"
Output: 'Hello python'

"python " * 2
Output: 'python python '
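One detail worth knowing about these operators is that + only joins a string with another string. The sketch below assumes a hypothetical label built from a number to show the explicit conversion that is needed.

```python
joined = "Hello " + "python"
repeated = "python " * 2

# + only joins str with str; mixing in an int raises TypeError,
# so the number has to be converted with str() first.
try:
    label = "age: " + 18
except TypeError:
    label = "age: " + str(18)

print(repr(joined), repr(repeated), repr(label))
# prints: 'Hello python' 'python python ' 'age: 18'
```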

#python web development #data types in python #list of all python data types #python data types #python datatypes #python types #python variable type

Cyrus Kreiger

1618039260

How Has COVID-19 Impacted Data Science?

The COVID-19 pandemic disrupted supply chains and brought economies around the world to a standstill. In turn, businesses need access to accurate, timely data more than ever before. As a result, the demand for data analytics is skyrocketing as businesses try to navigate an uncertain future. However, the sudden surge in demand comes with its own set of challenges.

Here is how the COVID-19 pandemic is affecting the data industry and how enterprises can prepare for the data challenges to come in 2021 and beyond.

#big data #data #data analysis #data security #data integration #etl #data warehouse #data breach #elt

Macey Kling

1597579680

Applications Of Data Science On 3D Imagery Data

CVDC 2020, the Computer Vision conference of the year, is scheduled for 13th and 14th of August to bring together the leading experts on Computer Vision from around the world. Organised by the Association of Data Scientists (ADaSCi), the premier global professional body of data science and machine learning professionals, it is a first-of-its-kind virtual conference on Computer Vision.

The second day of the conference started with quite an informative talk on the current pandemic situation. Speaking of talks, the second session “Application of Data Science Algorithms on 3D Imagery Data” was presented by Ramana M, who is the Principal Data Scientist in Analytics at Cyient Ltd.

Ramana talked about data as one of the most important assets of organisations, and about how the digital world is moving from 2D data to 3D data for highly accurate information and realistic user experiences.

The agenda of the talk included an introduction to 3D data, its applications and case studies, 3D data alignment, 3D data for object detection, and two general case studies:

  • Industrial metrology for quality assurance.
  • 3D object detection and its volumetric analysis.

This talk discussed the recent advances in 3D data processing, feature extraction methods, object type detection, object segmentation, and object measurements in different body cross-sections. It also covered the 3D imagery concepts, the various algorithms for faster data processing on the GPU environment, and the application of deep learning techniques for object detection and segmentation.

#developers corner #3d data #3d data alignment #applications of data science on 3d imagery data #computer vision #cvdc 2020 #deep learning techniques for 3d data #mesh data #point cloud data #uav data