Michio JP

Rocking JS data structures!

JavaScript's development was quite stagnant up until 2015. Yeah, that's the magic moment when ES6 was announced and the whole web-development-thing really took off, growing in popularity exponentially. 📊 But, that's something every JS fan probably knows - the year, the exact moment, has been repeatedly referenced in many, many JS resources around the world. So, let's be innovative and do the same again, shall we?

ES6 brought a great number of new goodies to JS. Not only now-must-have arrow functions, promises, and syntactic sugar, but also new data structures. 🔢 That's right, I'm talking about things like Sets, WeakMaps, etc. (in case you already know them). These little, but very interesting features were pushed into the background, mainly because of how long it took for modern browsers to fully embrace the new specification. As time passed ⏳, people started using the new syntax and some really desired new functionalities, but these structures remained less relevant. Of course, not to everyone, but taking even as obvious an example as myself - I hardly ever used them. I just stuck with old-school arrays and objects and lived within that limited scope. But, don't worry, because in this article we'll explore how good and useful these structures can really be. With the new possibilities they provide and their current support… just why not? 😃

TypedArrays

I guess you know arrays, cause who doesn't? All the methods they provide, the functional programming possibilities and more are just so impressive. But, if so, then what are TypedArrays and why do we need them?

TypedArray, instead of being a single class of its own, is a name used to reference different types of these specific structures. They basically serve as custom, array-like views into binary data buffers, which I guess requires a bit more explanation. 😉

ArrayBuffer

ArrayBuffer is a class used to contain fixed-length raw binary data. 💾 You can create one by using its constructor with a length argument, indicating the number of bytes for your buffer.

const buffer = new ArrayBuffer(8);

ArrayBuffers don't have many properties of their own. The most notable are byteLength and slice() - one for retrieving the length of the buffer in bytes (like the provided one) and the other for slicing out the specified part of the buffer and creating a new one. The only way you can interact with ArrayBuffers is through a so-called view - either a TypedArray or a DataView (but that's a story for another day).
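To get a feel for these two, here's a minimal sketch:

const buffer = new ArrayBuffer(8);
buffer.byteLength; // 8
const sliced = buffer.slice(0, 4); // new ArrayBuffer containing the first 4 bytes
sliced.byteLength; // 4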

The importance of ArrayBuffers comes from the way in which they represent your data - raw binary. Such a form is required by some low-level APIs, like WebGL, because of its efficiency 🏎 and integration 🤝 with other parts of code, like e.g. shaders.

TypedArray[s]

Now that we know that TypedArrays serve as views for ArrayBuffers, let's first list 'em all!

  • Int[8/16/32]Array - for interpreting buffers as arrays of signed integer numbers with the given number of bits for each;
  • Uint[8/16/32]Array - unsigned integer numbers with the given number of bits for each;
  • Uint8ClampedArray - unsigned integer numbers clamped to the 0-255 range, with 8 bits for each;
  • Float[32/64]Array - floating-point numbers with the given number of bits for each;
  • BigInt64Array - integer numbers (bigint) with 64 bits for each;
  • BigUint64Array - unsigned integer numbers (bigint) with 64 bits for each;

Each of the above types of TypedArrays has the same set of methods and properties, the only difference being the way of representing the data. A TypedArray instance can be created with a given length (creating an ArrayBuffer internally), another TypedArray, an object (with length and values for given indexes as keys) or a previously instantiated ArrayBuffer. 👨‍💻
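A quick sketch of all four ways (using Uint8Array as an example):

const fromLength = new Uint8Array(4); // 4-byte ArrayBuffer created internally, filled with zeros
const fromTypedArray = new Uint8Array(fromLength); // copies the values of another TypedArray
const fromObject = new Uint8Array({length: 2, 0: 10, 1: 20}); // Uint8Array [10, 20]
const fromBuffer = new Uint8Array(new ArrayBuffer(4)); // view over a previously created buffer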

Usage

Now, as you have your TypedArray ready, you can freely edit it with methods similar to a normal array's. 👍

const typedArr = new Uint8Array([0,1,2,3,4]);
const mapped = typedArr.map(num => num * 2); // Uint8Array [0,2,4,6,8]

One thing to note though: because under-the-hood you're operating on the ArrayBuffer's data, your TypedArray has a fixed size. Furthermore, all methods known from normal arrays that edit their size (removing, adding, cutting, etc.) have limited possibilities or are completely unavailable. 🤐

const typedArr = new Uint8Array([0,1,2,3,4]);
typedArr.push(5); // TypeError: typedArr.push is not a function - you must be kidding me!

You can also iterate over these and convert them to standard arrays back and forth, whenever you want.

const typedArr = new Uint8Array([0,1,2,3,4]);
for(const num of typedArr){
    // code
}
const arr = Array.from(typedArr); // [0,1,2,3,4]

TypedArrays provide certain functionalities related to their binary side too! You can e.g. access the underlying ArrayBuffer instance with the buffer property and read its byte length and offset using byteLength and byteOffset respectively. 🙂
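For example, a view over a chosen fragment of a buffer:

const buffer = new ArrayBuffer(8);
const typedArr = new Uint8Array(buffer, 2, 4); // view over bytes 2 to 5
typedArr.buffer === buffer; // true
typedArr.byteLength; // 4
typedArr.byteOffset; // 2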

Use-cases

As I mentioned before, ArrayBuffers have big potential because of the way they represent data. Such a compact form can be easily used in many, many places. It can be e.g. vector graphics 🎨 or other compressed data 📦 sent from a server, packed for maximum speed and performance at all stages - compression, transfer, and decompression. In addition, as I said earlier, some Web APIs make good use of the efficiency this format brings. 👌

With TypedArrays on top of ArrayBuffers, it's so much easier to manipulate the data inside (definitely better than setting bits yourself 😅). Beyond the one and only limit of fixed size, you can interact with this compact data pretty much the way you would with everyday arrays.

Sets

Continuing our research of array-like structures, we're getting to Sets. 🗃 These are extremely similar to arrays - they can be used to store data in a similar way, with only one important difference. All of a Set's values must be unique (there are some weird cases tho 😵) - whether we're talking about primitive values or object references - duplicates are automatically removed.

Usage

Creating Sets is easy - you just need to use the right constructor with an optional argument to provide data from the start.

const dataSet = new Set([1, 2, 3, 4, 5]);

Sets provide a pretty expressive API of their own. The most important are methods like the ones below (sketched right after the list):

  • add() - appends the given value to the end of the Set;
  • delete() - removes the given value from the Set;
  • has() - checks if the given value is present in the Set;
  • clear() - removes all values from the Set;
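A minimal sketch of all four, including the automatic removal of duplicates:

const dataSet = new Set();
dataSet.add(1); // Set [1]
dataSet.add(1); // still Set [1] - duplicates are ignored
dataSet.has(1); // true
dataSet.delete(1); // true - the value was found and removed
dataSet.clear(); // empty Set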

They can also be converted to standard arrays and iterated at will.

const dataSet = new Set([1,2,3]);
const values = [0,1,2,3,4];
for(const value of values) {
    if(dataSet.has(value)){
        dataSet.delete(value);
    } else {
        dataSet.add(value);
    }
}
const result = Array.from(dataSet); // [0,4]

Use-cases

Most use-cases of Sets are clearly based on their ability to store unique values only. ⚡ Achieving the same with mere arrays would require some additional boilerplate. Therefore, Sets can be especially useful for storing IDs and the like. 🆔

Second, removing elements is much more convenient with Sets. Just providing the value to delete, instead of doing the whole find-index-and-splice procedure, is simply much nicer. 👍 This, of course, wouldn't be possible so easily with the repetitive values that standard arrays allow.
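Just to compare the two approaches side by side:

const arr = [1, 2, 3];
const index = arr.indexOf(2); // find-index...
if (index !== -1) {
    arr.splice(index, 1); // ...and-splice
}

const dataSet = new Set([1, 2, 3]);
dataSet.delete(2); // done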

WeakSets

Now, let's talk about a different kind of set - WeakSets. 🤨 WeakSets are special - they store values differently, but also have some additional limitations, like a much smaller API.

Memory

First, a word about how WeakSets store their values. Only objects can be used as WeakSets' values. No primitives allowed. 🛑 This is very important because of the "weak" way in which WeakSets store their data. "Weak" means that if there is no other reference to a given object (objects are accessed by reference), it can be garbage-collected 🗑 - removed at any moment. Thus, a good understanding of references and how objects are interacted with is required to properly utilize the potential of weak structures.
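In practice, the "weak" part boils down to something like this:

let obj = {a: 10};
const weakSet = new WeakSet([obj]);
obj = null; /* The only other reference is gone - the object inside
the WeakSet can now be garbage-collected at any moment. */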

Because WeakSets are still… sets, all values they store must be unique. But, as you might know, that's not a big deal with objects - the only possible type of WeakSets' values. As all of them are stored by 👉 reference, even objects with exactly the same properties are considered different.

Usage

The API of WeakSets is greatly limited when compared to normal Sets. Probably most important is the fact that they're not iterable. They don't have any properties (Sets have e.g. size, indicating the number of values they store) and only 3 major methods - add(), delete() and has(). The constructor looks the same, only the optional array argument needs to store objects only. However, using such an argument doesn't make much sense, as all objects you store need to be referenced in some other place in your code.

const weakDataSet = new WeakSet();
const obj = {a: 10};
weakDataSet.add(obj);
weakDataSet.add({b: 10}); // Pointless - there's no other reference, so it will be removed soon
weakDataSet.has(obj); // true
weakDataSet.has({a: 10}); // false - objects are stored by reference

Use-cases

It might actually be quite hard to find good use-cases for WeakSets. That's because, in reality, there aren't many, and they're really specific. The most popular, and probably the best one, is called object tagging. You can use WeakSets to group, and thus tag, specific objects as long as they're referenced somewhere else in your code. Tagging, or grouping as some might like to call it, can be a very useful technique if used properly. ⚠
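As a rough sketch of such tagging (the processedRequests name and the shape of the request objects are made up for illustration):

const processedRequests = new WeakSet();

function process(request) {
    if (!processedRequests.has(request)) {
        // ...do the actual processing here...
        processedRequests.add(request); // tag the object as already processed
    }
}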

You need to be cautious, however. Remember that all objects that aren't referenced anywhere else will be garbage-collected. It doesn't mean that they'll be removed immediately, though - only on one of the next cycles ⭕ of the garbage collector. You should keep that fact in mind, and not trust WeakSets too much - any of their values can be removed at any time.

Maps

Maps, IMHO, are structures that take the best of both worlds - arrays and objects. Inside them, all data is stored in key-value pairs. 🤝 The difference between such a method and usual objects can be further noticed in the API. What's more, in Maps, keys and values are treated equally, meaning you can do even something as creative as setting an object (but remember that you need a reference to it for later access) as the actual key for your value! Also, unlike in objects, pairs stored in Maps have a specific order and are easily iterable. 🔄
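For example, using an object as a key (while keeping a reference to it, as noted above):

const objKey = {id: 1};
const map = new Map();
map.set(objKey, "some value");
map.get(objKey); // "some value"
map.get({id: 1}); // undefined - a different reference, thus a different key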

Usage

You can create your Map instance with a straightforward constructor call. You can optionally provide an array of key-value arrays upfront as starting values for your Map.

const map = new Map([["key1", 10], [10, "value2"]]);

It's when it comes to the API that Maps really shine. It allows you to make specific operations faster and in a much more readable way.

There's one special property called size (available in Sets too) that can give you a quick note about the number of key-value pairs at the given moment. What's special about that is the fact that there's no similarly easy way to do the same with old-school objects. 😕
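Compare for yourself - Object.keys() is about the closest old-school equivalent:

const map = new Map([["a", 1], ["b", 2]]);
map.size; // 2

const obj = {a: 1, b: 2};
Object.keys(obj).length; // 2 - workable, but clearly less direct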

And the benefits of this intuitive API don't end here! If you already like the API of Sets, you might be happy to know that it shares many similarities with the API of Maps. All methods used to edit a Map's values feel like Set methods adapted to the key-value schema. Only the add() method has been transformed to set(), for obvious, rational-thinking-related reasons. 😅 Other than that, to change and access a Map's data, you operate mainly with keys instead of values.
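A quick sketch of these key-based methods:

const map = new Map();
map.set("key", 10); // Map {"key" => 10}
map.get("key"); // 10
map.has("key"); // true
map.delete("key"); // true - the pair was removed
map.get("key"); // undefined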

Also, just like Sets and objects (it might not be as relevant when it comes to the more array-like Sets), Maps provide 3 methods for reading specific groups of their data:

  • entries() - returns the Map's key-value pairs in the form of an iterator of [key, value] arrays;
  • values() - returns an iterator of all of the Map's values;
  • keys() - returns an iterator of all of the Map's keys;

These methods (especially if you're practicing functional programming) were most likely extensively used in their object form (Object.entries() and the like) when interacting with objects, as there was no other convenient way. It shouldn't be the case at all with Maps. With Maps' API and fine data structure, you should definitely feel your life getting a bit easier. 🌈

const map = new Map([["key", 10], ["key2", 10]]);
map.forEach((value, key) => {
    map.delete(key);
    map.set(key, 10);
});
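And, since the three read methods return iterators, you can spread them into arrays whenever needed:

const map = new Map([["key1", 10], ["key2", 20]]);
[...map.keys()]; // ["key1", "key2"]
[...map.values()]; // [10, 20]
[...map.entries()]; // [["key1", 10], ["key2", 20]]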

Use-cases

As you can see, Maps give you a great alternative to standard objects. Whenever you need to access both a key and its value at the same time and be able to iterate over them, Maps might be your best option.

This nice combination of an iterable and object-like form can clearly have many applications. And, while you could quite easily recreate the same effect with a normal object - why bother at all? 🤔 The convenience behind this brilliant API and the fact that it's an industry standard make Maps a good choice for a lot of different cases. 👍

WeakMaps

WeakMaps are the second weak structure that we've met. Many facts from WeakSets apply here too! This includes the way of storing data, the object-only rule, the limited API and no iteration (there's no method giving you the list of these weakly-stored keys).

As you know, Maps (as well as WeakMaps) store data in the key-value schema. This means that there are in fact two collections of data in this one structure - keys and values. The "weak" part of WeakMaps applies only to keys, because it is the keys that are responsible for allowing us to access the values. The mentioned values are stored in the normal or, if you like the name, "strong" way. 💪 So, as weird as it may feel, in WeakMaps only objects can be used as valid keys.

Usage

Just like with WeakSets, the WeakMaps API is severely limited. All methods you can use are get(), set(), delete() and has(). Again, no iteration. 😭 But, if you consider the possible use-cases and how such structures work, you'll begin to better understand these limits. You cannot iterate over something that's weakly stored. You need references to your keys, and so these 4 basic methods are the best way to go. Etc., etc. 😏

Of course, the constructor takes an additional, but not-so-useful, argument for initiating data.

const weakMap = new WeakMap();
const value = {a: 10};
weakMap.set({}, value); /* The key will be garbage-collected, but the value
will still be accessible through the variable. */
weakMap.set(value, 10); // Values don't have to be object-only

Use-cases

WeakMaps have similar use-cases to WeakSets - tagging. All of that happens on the side of keys. Values, however, as strongly-stored data of various types, don't have to be garbage-collected together with the specific key. If saved to a variable earlier, a value can still be freely used. This means that you can tag not only one side (keys) but also the other (values) of the data, and depend on the relations between the two. 🙌
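A sketch of such two-sided tagging - attaching metadata to an object (the metadata and describe names are made up for illustration):

const metadata = new WeakMap();

function describe(obj, description) {
    metadata.set(obj, description); // the key is weak, the value is stored strongly
}

const user = {name: "John"};
describe(user, "a user object");
metadata.get(user); // "a user object" - available as long as user is referenced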

Is that all?

For now - yes. 🤯 I hope that this article helped you learn something new or at least reminded you of some basics. Your JS code doesn't have to depend only on objects and arrays, especially with modern browsers taking more and more market share. 📊 Also, apart from weak structures and their internal behavior, all the structures above have pretty simple and nice polyfill options. In this way, you can freely use them, even if it's only for their fine API.

So, what do you think of this post? Share your opinion below with a reaction or a comment. It really helps me write better articles - you know, the ones you like to read! 😀 Oh, and share the article itself for better reach! Also, follow me on Twitter 🐦, or on my Facebook page and check out my personal blog to keep up-to-date with the latest content from this blog. Again, thank you for reading my content and hope I’ll catch you in the next one! ✌

Originally published by Areknawo https://areknawo.com/rocking-js-data-structures/


#javascript

Siphiwe Nair

Your Data Architecture: Simple Best Practices for Your Data Strategy

If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization's capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition

NBB: Ad-hoc CLJS Scripting on Node.js

Nbb

Not babashka. Node.js babashka!?

Ad-hoc CLJS scripting on Node.js.

Status

Experimental. Please report issues here.

Goals and features

Nbb's main goal is to make it easy to get started with ad hoc CLJS scripting on Node.js.

Additional goals and features are:

  • Fast startup without relying on a custom version of Node.js.
  • Small artifact (current size is around 1.2MB).
  • First class macros.
  • Support building small TUI apps using Reagent.
  • Complement babashka with libraries from the Node.js ecosystem.

Requirements

Nbb requires Node.js v12 or newer.

How does this tool work?

CLJS code is evaluated through SCI, the same interpreter that powers babashka. Because SCI works with advanced compilation, the bundle size, especially when combined with other dependencies, is smaller than what you get with self-hosted CLJS. That makes startup faster. The trade-off is that execution is less performant and that only a subset of CLJS is available (e.g. no deftype, yet).

Usage

Install nbb from NPM:

$ npm install nbb -g

Omit -g for a local install.

Try out an expression:

$ nbb -e '(+ 1 2 3)'
6

And then install some other NPM libraries to use in the script. E.g.:

$ npm install csv-parse shelljs zx

Create a script which uses the NPM libraries:

(ns script
  (:require ["csv-parse/lib/sync$default" :as csv-parse]
            ["fs" :as fs]
            ["path" :as path]
            ["shelljs$default" :as sh]
            ["term-size$default" :as term-size]
            ["zx$default" :as zx]
            ["zx$fs" :as zxfs]
            [nbb.core :refer [*file*]]))

(prn (path/resolve "."))

(prn (term-size))

(println (count (str (fs/readFileSync *file*))))

(prn (sh/ls "."))

(prn (csv-parse "foo,bar"))

(prn (zxfs/existsSync *file*))

(zx/$ #js ["ls"])

Call the script:

$ nbb script.cljs
"/private/tmp/test-script"
#js {:columns 216, :rows 47}
510
#js ["node_modules" "package-lock.json" "package.json" "script.cljs"]
#js [#js ["foo" "bar"]]
true
$ ls
node_modules
package-lock.json
package.json
script.cljs

Macros

Nbb has first class support for macros: you can define them right inside your .cljs file, like you are used to from JVM Clojure. Consider the plet macro to make working with promises more palatable:

(defmacro plet
  [bindings & body]
  (let [binding-pairs (reverse (partition 2 bindings))
        body (cons 'do body)]
    (reduce (fn [body [sym expr]]
              (let [expr (list '.resolve 'js/Promise expr)]
                (list '.then expr (list 'clojure.core/fn (vector sym)
                                        body))))
            body
            binding-pairs)))

Using this macro we can make async code look more like sync code. Consider this puppeteer example:

(-> (.launch puppeteer)
      (.then (fn [browser]
               (-> (.newPage browser)
                   (.then (fn [page]
                            (-> (.goto page "https://clojure.org")
                                (.then #(.screenshot page #js{:path "screenshot.png"}))
                                (.catch #(js/console.log %))
                                (.then #(.close browser)))))))))

Using plet this becomes:

(plet [browser (.launch puppeteer)
       page (.newPage browser)
       _ (.goto page "https://clojure.org")
       _ (-> (.screenshot page #js{:path "screenshot.png"})
             (.catch #(js/console.log %)))]
      (.close browser))

See the puppeteer example for the full code.

Since v0.0.36, nbb includes promesa which is a library to deal with promises. The above plet macro is similar to promesa.core/let.

Startup time

$ time nbb -e '(+ 1 2 3)'
6
nbb -e '(+ 1 2 3)'   0.17s  user 0.02s system 109% cpu 0.168 total

The baseline startup time for a script is about 170ms on my laptop. When invoked via npx this adds another 300ms or so, so for faster startup, either use a globally installed nbb or use $(npm bin)/nbb script.cljs to bypass npx.

Dependencies

NPM dependencies

Nbb does not depend on any NPM dependencies. All NPM libraries loaded by a script are resolved relative to that script. When using the Reagent module, React is resolved in the same way as any other NPM library.

Classpath

To load .cljs files from local paths or dependencies, you can use the --classpath argument. The current dir is added to the classpath automatically. So if there is a file foo/bar.cljs relative to your current dir, then you can load it via (:require [foo.bar :as fb]). Note that nbb uses the same naming conventions for namespaces and directories as other Clojure tools: foo-bar in the namespace name becomes foo_bar in the directory name.

To load dependencies from the Clojure ecosystem, you can use the Clojure CLI or babashka to download them and produce a classpath:

$ classpath="$(clojure -A:nbb -Spath -Sdeps '{:aliases {:nbb {:replace-deps {com.github.seancorfield/honeysql {:git/tag "v2.0.0-rc5" :git/sha "01c3a55"}}}}}')"

and then feed it to the --classpath argument:

$ nbb --classpath "$classpath" -e "(require '[honey.sql :as sql]) (sql/format {:select :foo :from :bar :where [:= :baz 2]})"
["SELECT foo FROM bar WHERE baz = ?" 2]

Currently nbb only reads from directories, not jar files, so you are encouraged to use git libs. Support for .jar files will be added later.

Current file

The name of the file that is currently being executed is available via nbb.core/*file* or on the metadata of vars:

(ns foo
  (:require [nbb.core :refer [*file*]]))

(prn *file*) ;; "/private/tmp/foo.cljs"

(defn f [])
(prn (:file (meta #'f))) ;; "/private/tmp/foo.cljs"

Reagent

Nbb includes reagent.core which will be lazily loaded when required. You can use this together with ink to create a TUI application:

$ npm install ink

ink-demo.cljs:

(ns ink-demo
  (:require ["ink" :refer [render Text]]
            [reagent.core :as r]))

(defonce state (r/atom 0))

(doseq [n (range 1 11)]
  (js/setTimeout #(swap! state inc) (* n 500)))

(defn hello []
  [:> Text {:color "green"} "Hello, world! " @state])

(render (r/as-element [hello]))

Promesa

Working with callbacks and promises can become tedious. Since nbb v0.0.36 the promesa.core namespace is included with the let and do! macros. An example:

(ns prom
  (:require [promesa.core :as p]))

(defn sleep [ms]
  (js/Promise.
   (fn [resolve _]
     (js/setTimeout resolve ms))))

(defn do-stuff
  []
  (p/do!
   (println "Doing stuff which takes a while")
   (sleep 1000)
   1))

(p/let [a (do-stuff)
        b (inc a)
        c (do-stuff)
        d (+ b c)]
  (prn d))
$ nbb prom.cljs
Doing stuff which takes a while
Doing stuff which takes a while
3

Also see API docs.

Js-interop

Since nbb v0.0.75 applied-science/js-interop is available:

(ns example
  (:require [applied-science.js-interop :as j]))

(def o (j/lit {:a 1 :b 2 :c {:d 1}}))

(prn (j/select-keys o [:a :b])) ;; #js {:a 1, :b 2}
(prn (j/get-in o [:c :d])) ;; 1

Most of this library is supported in nbb, except the following:

  • destructuring using :syms
  • property access using .-x notation. In nbb, you must use keywords.

See the example of what is currently supported.

Examples

See the examples directory for small examples.

Also check out these projects built with nbb:

API

See API documentation.

Migrating to shadow-cljs

See this gist on how to convert an nbb script or project to shadow-cljs.

Build

Prerequisites:

  • babashka >= 0.4.0
  • Clojure CLI >= 1.10.3.933
  • Node.js 16.5.0 (a lower version may work, but this is the one I used to build)

To build:

  • Clone and cd into this repo
  • bb release

Run bb tasks for more project-related tasks.

Download Details:
Author: borkdude
Download Link: Download The Source Code
Official Website: https://github.com/borkdude/nbb 
License: EPL-1.0

#node #javascript

Gerhard Brink

Getting Started With Data Lakes

Frameworks for Efficient Enterprise Analytics

The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.

This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.

Introduction

As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).


This is a preview of the Getting Started With Data Lakes Refcard. To read the entire Refcard, please download the PDF from the link above.

#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management

Cyrus Kreiger

4 Tips To Become A Successful Entry-Level Data Analyst

Companies across every industry rely on big data to make strategic decisions about their business, which is why data analyst roles are constantly in demand. Even as we transition to more automated data collection systems, data analysts remain a crucial piece of the data puzzle. Not only do they build the systems that extract and organize data, but they also make sense of it - identifying patterns and trends, and formulating actionable insights.

If you think that an entry-level data analyst role might be right for you, you might be wondering what to focus on in the first 90 days on the job. What skills should you have going in and what should you focus on developing in order to advance in this career path?

Let’s take a look at the most important things you need to know.

#data #data-analytics #data-science #data-analysis #big-data-analytics #data-privacy #data-structures #good-company

Cyrus Kreiger

How Has COVID-19 Impacted Data Science?

The COVID-19 pandemic disrupted supply chains and brought economies around the world to a standstill. In turn, businesses need access to accurate, timely data more than ever before. As a result, the demand for data analytics is skyrocketing as businesses try to navigate an uncertain future. However, the sudden surge in demand comes with its own set of challenges.

Here is how the COVID-19 pandemic is affecting the data industry and how enterprises can prepare for the data challenges to come in 2021 and beyond.

#big data #data #data analysis #data security #data integration #etl #data warehouse #data breach #elt