Royce Reinger

1675318380

Marqo: Tensor Search for Humans

Marqo

A tensor-based search and analytics engine that seamlessly integrates with applications and websites. Marqo allows developers to turbocharge search functionality with the latest machine learning models, in 3 lines of code.

✨ Core Features

⚡ Performance

  • Embeddings stored in in-memory HNSW indexes, achieving cutting edge search speeds.
  • Scale to hundred-million document indexes with horizontal index sharding.
  • Async and non-blocking data upload and search.

🤖 Machine Learning

  • Use the latest machine learning models from PyTorch, Huggingface, OpenAI and more.
  • Start with a pre-configured model or bring your own.
  • Built in ONNX support and conversion for faster inference and higher throughput.
  • CPU and GPU support.

☁️ Cloud-native

  • Fast deployment using Docker.
  • Run Marqo multi-az and high availability.

🌌 End-to-end

  • Build search and analytics on multiple unstructured data types such as text, image, code, video.
  • Filter search results using Marqo’s query DSL.
  • Store unstructured data and semi-structured metadata together in documents, using a range of supported datatypes like bools, ints and keywords.

🍱 Managed cloud

  • Scale Marqo at the click of a button and run Marqo at million-document scale with high performance, including performant management of in-memory HNSW indexes.
  • Multi-az, accelerated inference.
  • Marqo cloud ☁️ is in beta. If you’re interested, apply here.

Learn more about Marqo

  
📗 Quick start: Build your first application with Marqo in under 5 minutes.
🔍 What is tensor search?: A beginner's guide to the fundamentals of Marqo and tensor search.
🖼 Marqo for image data: Building text-to-image search in Marqo in 5 lines of code.
📚 Marqo for text: Building a multilingual database in Marqo.
🔮 Integrating Marqo with GPT: Making GPT a subject matter expert by using Marqo as a knowledge base.
🎨 Marqo for Creative AI: Combining stable diffusion with semantic search to generate and categorise 100k images of hotdogs.
🦾 Features: Marqo's core features.

Getting started

Marqo requires Docker. To install Docker, go to the official Docker website. Ensure that Docker has at least 8GB memory and 50GB storage.

  1. Use Docker to run Marqo (Mac users with M-series chips will need to follow the "M series Mac users" section below):

docker rm -f marqo
docker pull marqoai/marqo:latest
docker run --name marqo -it --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway marqoai/marqo:latest
  2. Install the Marqo client:
pip install marqo
  3. Start indexing and searching! Let's look at a simple example below:
import marqo

mq = marqo.Client(url='http://localhost:8882')

mq.index("my-first-index").add_documents([
    {
        "Title": "The Travels of Marco Polo",
        "Description": "A 13th-century travelogue describing Polo's travels"
    }, 
    {
        "Title": "Extravehicular Mobility Unit (EMU)",
        "Description": "The EMU is a spacesuit that provides environmental protection, "
                       "mobility, life support, and communications for astronauts",
        "_id": "article_591"
    }]
)

results = mq.index("my-first-index").search(
    q="What is the best outfit to wear on the moon?", searchable_attributes=["Title", "Description"]
)
  • mq is the client that wraps the Marqo API.
  • add_documents() takes a list of documents, represented as Python dicts, for indexing.
  • If the index does not already exist, add_documents() creates it with default settings. If it exists, Marqo adds the documents to it.
  • You can optionally set a document's ID with the special _id field. Otherwise, Marqo will generate one.

Let's have a look at the results:

# let's print out the results:
import pprint
pprint.pprint(results)

{
    'hits': [
        {   
            'Title': 'Extravehicular Mobility Unit (EMU)',
            'Description': 'The EMU is a spacesuit that provides environmental protection, mobility, life support, and '
                           'communications for astronauts',
            '_highlights': {
                'Description': 'The EMU is a spacesuit that provides environmental protection, '
                               'mobility, life support, and communications for astronauts'
            },
            '_id': 'article_591',
            '_score': 0.61938936
        }, 
        {   
            'Title': 'The Travels of Marco Polo',
            'Description': "A 13th-century travelogue describing Polo's travels",
            '_highlights': {'Title': 'The Travels of Marco Polo'},
            '_id': 'e00d1a8d-894c-41a1-8e3b-d8b2a8fce12a',
            '_score': 0.60237324
        }
    ],
    'limit': 10,
    'processingTimeMs': 49,
    'query': 'What is the best outfit to wear on the moon?'
}
  • Each hit corresponds to a document that matched the search query.
  • They are ordered from most to least matching.
  • limit is the maximum number of hits to be returned. This can be set as a parameter during search.
  • Each hit has a _highlights field. This was the part of the document that matched the query the best.
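
As a minimal sketch of working with this response, the snippet below walks the hits in ranked order. It assumes the response has the shape printed above; the sample dict is trimmed from that output, and summarize_hits is a hypothetical helper, not part of the Marqo client.

```python
# Sample response, trimmed from the output shown above (no server needed).
results = {
    "hits": [
        {
            "Title": "Extravehicular Mobility Unit (EMU)",
            "_highlights": {
                "Description": "The EMU is a spacesuit that provides environmental protection, "
                               "mobility, life support, and communications for astronauts"
            },
            "_id": "article_591",
            "_score": 0.61938936,
        },
        {
            "Title": "The Travels of Marco Polo",
            "_highlights": {"Title": "The Travels of Marco Polo"},
            "_id": "e00d1a8d-894c-41a1-8e3b-d8b2a8fce12a",
            "_score": 0.60237324,
        },
    ],
    "limit": 10,
}


def summarize_hits(response):
    """Hypothetical helper: (id, rounded score, highlighted field) per hit."""
    summary = []
    for hit in response["hits"]:
        # _highlights maps the best-matching field name to the matching text.
        field = next(iter(hit["_highlights"]), None)
        summary.append((hit["_id"], round(hit["_score"], 3), field))
    return summary


for doc_id, score, field in summarize_hits(results):
    print(f"{doc_id}  score={score}  best field={field}")
```

With a live server, the same helper could be applied to the value returned by search(), provided the response keeps this shape.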

Other basic operations

Get document

Retrieve a document by ID.


result = mq.index("my-first-index").get_document(document_id="article_591")

Note that adding a document with add_documents again, using the same _id, will update the existing document.

Get index stats

Get information about an index.


results = mq.index("my-first-index").get_stats()

Lexical search

Perform a keyword search.


result = mq.index("my-first-index").search('marco polo', search_method=marqo.SearchMethods.LEXICAL)

Search specific fields

Using the default tensor search method.


result = mq.index("my-first-index").search('adventure', searchable_attributes=['Title'])

Delete documents

Delete documents.


results = mq.index("my-first-index").delete_documents(ids=["article_591", "article_602"])

Delete index

Delete an index.


results = mq.index("my-first-index").delete()

Multi modal and cross modal search

To power image and text search, Marqo allows users to plug and play with CLIP models from HuggingFace. Note that if you do not configure multi modal search, image urls will be treated as strings. To start indexing and searching with images, first create an index with a CLIP configuration, as below:


settings = {
  "treat_urls_and_pointers_as_images":True,   # allows us to find an image file and index it 
  "model":"ViT-L/14"
}
response = mq.create_index("my-multimodal-index", **settings)

Images can then be added within documents as follows. You can use urls from the internet (for example S3) or from the disk of the machine:


response = mq.index("my-multimodal-index").add_documents([{
    "My Image": "https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/Portrait_Hippopotamus_in_the_water.jpg/440px-Portrait_Hippopotamus_in_the_water.jpg",
    "Description": "The hippopotamus, also called the common hippopotamus or river hippopotamus, is a large semiaquatic mammal native to sub-Saharan Africa",
    "_id": "hippo-facts"
}])


You can then search using text as usual. Both text and image fields will be searched:


results = mq.index("my-multimodal-index").search('animal')

Setting searchable_attributes to the image field ['My Image'] ensures only images are searched in this index:


results = mq.index("my-multimodal-index").search('animal', searchable_attributes=['My Image'])

Searching using an image

Searching using an image can be achieved by providing the image link.


results = mq.index("my-multimodal-index").search('https://upload.wikimedia.org/wikipedia/commons/thumb/9/96/Standing_Hippopotamus_MET_DP248993.jpg/440px-Standing_Hippopotamus_MET_DP248993.jpg')

Documentation

The full documentation for Marqo can be found here https://docs.marqo.ai/.

Warning

Note that you should not run other applications on Marqo's OpenSearch cluster, as Marqo automatically changes and adapts the settings on the cluster.

M series Mac users

Marqo does not yet support the docker-in-docker backend configuration for the arm64 architecture. This means that if you have an M series Mac, you will also need to run Marqo's backend, marqo-os, locally.

To run Marqo on an M series Mac, follow the next steps.

In one terminal, run the following command to start OpenSearch:

docker rm -f marqo-os; docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" marqoai/marqo-os:0.0.3-arm

In another terminal run the following command to launch Marqo:

docker rm -f marqo; docker run --name marqo --privileged \
    -p 8882:8882 --add-host host.docker.internal:host-gateway \
    -e "OPENSEARCH_URL=https://localhost:9200" \
    marqoai/marqo:latest

Contributors

Marqo is a community project with the goal of making tensor search accessible to the wider developer community. We are glad that you are interested in helping out! Please read this to get started.

Dev set up

  1. Create a virtual env: python -m venv ./venv
  2. Activate the virtual environment: source ./venv/bin/activate
  3. Install requirements from the requirements file: pip install -r requirements.txt
  4. Run tests by running the tox file: cd into this dir and then run tox
  5. If you update dependencies, make sure to delete the .tox dir and rerun.

Merge instructions:

  1. Run the full test suite (by using the command tox in this dir).
  2. Create a pull request with an attached GitHub issue.

Translations

This readme is available in the following translations:

Download Details:

Author: marqo-ai
Source Code: https://github.com/marqo-ai/marqo 
License: Apache-2.0 license

#searchengine #machinelearning #deeplearning #transform #python 

Rupert Beatty

1673641260

A Block-based API for NSValueTransformer, with A Growing Collection

TransformerKit

A block-based API for NSValueTransformer, with a growing collection of useful examples.

NSValueTransformer, while perhaps obscure to most iOS programmers, remains a staple of OS X development. Before Objective-C APIs got in the habit of flinging block parameters hither and thither with reckless abandon, NSValueTransformer was the go-to way to encapsulate mutation functionality --- especially when it came to Bindings.

NSValueTransformer is convenient to use but a pain to set up. To create a value transformer you have to create a subclass, implement a handful of required methods, and register a singleton instance by name.

TransformerKit breathes new life into NSValueTransformer by making them dead-simple to define and register:

NSString * const TTTCapitalizedStringTransformerName = @"TTTCapitalizedStringTransformerName";

[NSValueTransformer registerValueTransformerWithName:TTTCapitalizedStringTransformerName
                               transformedValueClass:[NSString class]
                  returningTransformedValueWithBlock:^id(id value) {
  return [value capitalizedString];
}];

TransformerKit pairs nicely with InflectorKit and FormatterKit, providing well-designed APIs for manipulating user-facing content.


TransformerKit also contains a growing number of convenient transformers that your apps will love and cherish:

String Transformers

  • Capitalized
  • UPPERCASE
  • lowercase
  • CamelCase
  • llamaCase
  • snake_case
  • train-case
  • esreveR* (Reverse)
  • Rémövê Dîaçritics (Remove accents and combining marks)
  • ट्रांस्लितेराते स्ट्रिंग (Transliterate to Latin)
  • Any Valid ICU Transform*

Image Transformers

  • PNG Representation*
  • JPEG Representation*
  • GIF Representation (macOS)
  • TIFF Representation (macOS)

Date Transformers

JSON Data Transformers

  • JSON Transformer*

Data Transformers (macOS)

  • Base16 String Encode / Decode
  • Base32 String Encode / Decode
  • Base64 String Encode / Decode
  • Base85 String Encode / Decode

Cryptographic Transformers (macOS)

  • MD5, SHA-1, SHA-256, et al. Digests

* - Reversible

Contact

Mattt (@mattt)

Download Details:

Author: Mattt
Source Code: https://github.com/mattt/TransformerKit 
License: MIT license

#swift #objective-c #data #transform 

Transform The Character Case Of A String in JavaScript

In this tutorial, you’ll learn how to transform the character case of a string — to uppercase, lowercase, and title case — using native JavaScript methods.

JavaScript provides many functions and methods that allow you to manipulate data for different purposes. We’ve recently looked at methods for converting a string to a number and a number to a string or to an ordinal, and for splitting strings. This article will present methods for transforming the character case of a string — which is useful for representing strings in a certain format or for reliable string comparison.

Transform a String to Lowercase

If you need your string in lowercase, you can use the toLowerCase() method available on strings. This method returns the string with all its characters in lowercase.

For example:

const str = 'HeLlO';
console.log(str.toLowerCase()); // "hello"
console.log(str); // "HeLlO"

By using the toLowerCase() method on the str variable, you can retrieve the same string with all the characters in lowercase. Notice that a new string is returned without affecting the value of str.

Transform a String to Uppercase

If you need your string in uppercase, you can use the toUpperCase() method available on strings. This method returns the string with all its characters in uppercase.

For example:

const str = 'HeLlO';
console.log(str.toUpperCase()); // "HELLO"
console.log(str); // "HeLlO"

By using the toUpperCase() method on the str variable, you can retrieve the same string with all the characters in uppercase. Notice that a new string is returned without affecting the value of str.

Transform a String to Title Case

A common use case for transforming a string’s case is converting it to title case. This can be used to display names and headlines.

There are different ways to do this. One way is by using the method toUpperCase() on the first character of the string, then concatenating it to the rest of the string. For example:

const str = 'hello';
console.log(str[0].toUpperCase() + str.substring(1).toLowerCase()); // "Hello"

In this example, you retrieve the first character using the 0 index on the str variable. Then, you transform it to uppercase using the toUpperCase() method. Finally, you retrieve the rest of the string using the substring() method and concatenate it to the first letter. You apply toLowerCase() on the rest of the string to ensure that it’s in lowercase.

This only capitalizes the first letter of the string. However, if you have a sentence, you might want to capitalize the first letter of every word. In that case, it’s better to use a function like this:

function toTitleCase (str) {
  if (!str) {
    return '';
  }
  const strArr = str.split(' ').map((word) => {
    // Consecutive spaces produce empty strings; return them unchanged
    if (!word) {
      return word;
    }
    return word[0].toUpperCase() + word.substring(1).toLowerCase();
  });
  return strArr.join(' ');
}

const str = 'hello world';
console.log(toTitleCase(str)); // "Hello World"

The toTitleCase() function accepts one parameter, which is the string to transform to title case.

In the function, you first check if the string is empty and in that case return an empty string.

Then, you split the string on the space delimiter, which returns an array. After that, you use the map method on the array to apply the transformation you saw in the previous example on each item in the array. This transforms every word to title case.

Finally, you join the items in the array into a string by the same space delimiter and return it.

Live Example

In the following CodePen demo, you can try out the functionality of toLowerCase() and toUpperCase(). When you enter a string in the input, it’s transformed to both uppercase and lowercase and displayed. You can try using characters with different case in the string.

Changing Character Case for String Comparison

In many situations, you’ll need to compare strings before executing a block of code. If you can’t control the character case the string is being written in, performing comparison on the string without enforcing any character case can lead to unexpected results.

For example:

const input = document.querySelector('input[type=text]');
if (input.value === 'yes') {
  alert('Thank you for agreeing!');
} else {
  alert('We still like you anyway')
}

If the user enters Yes instead of yes in the input, the equality condition will fail and the wrong alert will show.

You can resolve this by enforcing a character case on the string:

const input = document.querySelector('input[type=text]');
if (input.value.toLowerCase() === 'yes') {
  alert('Thank you for agreeing!');
} else {
  alert('We still like you anyway')
}

Conclusion

It’s necessary to learn how to transform the character case of a string in JavaScript. You’ll often need to use it for many use cases, such as displaying the string in a certain format. You can also use it to reliably compare strings.

Enforcing a character case on the strings you’re comparing ensures that you can check whether the content of the strings is equal, regardless of how they’re written.

Original article source at: https://www.sitepoint.com/

#javascript #transform #character #string 

Rupert Beatty

1669925700

Spark RDDs: Transformation with Examples

Transformation is one of the RDD operations in Spark. Before moving on to it, let's first discuss what Spark and RDDs actually are.

What is Spark?

Apache Spark is an open-source cluster computing framework. It is designed for fast, large-scale data processing, including data created in real time.

Spark was developed as a faster alternative to Hadoop MapReduce. Unlike MapReduce, which writes intermediate data to disk and reads it back between stages, Spark is optimized to keep data in memory. As a result, Spark can process many workloads considerably faster than MapReduce.

What is RDD?

The fundamental abstraction of Spark is the RDD (Resilient Distributed Dataset). It is a collection of elements partitioned across the cluster nodes so that operations can be run on it in parallel.

RDDs can be produced in one of two ways:

  • Parallelizing an existing collection in the driver program.
  • Referencing a dataset in an external storage system, such as a shared filesystem, HDFS, HBase, or any other data source that offers a Hadoop InputFormat.

Spark RDD Operations

The RDD provides the two types of operations:

  • Transformations
  • Actions

A Transformation is a function that generates new RDDs from existing RDDs, whereas an Action is what we perform when we want to work with the actual dataset. Unlike a transformation, an action does not produce a new RDD; it returns a result to the driver program or writes it to storage.

Transformations with Examples

The role of a transformation in Spark is to create a new dataset from an existing one. Transformations are lazy: they are computed only when an action requires a result to be returned to the driver program.

Because transformations are lazy, they are not carried out right away; they are executed only when we call an action. Two of the most basic transformations are map() and filter().
The resulting RDD is always distinct from the parent RDD. It can be smaller (e.g. filter(), distinct(), sample()), bigger (e.g. flatMap(), union(), cartesian()), or the same size (e.g. map()).
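
The lazy behaviour described above can be illustrated without a cluster. The following is a plain-Python analogy built on generators, not Spark's API; the names lazy_map, lazy_filter and calls are made up for illustration.

```python
calls = []  # records each time the mapping step actually runs


def lazy_map(func, items):
    # A generator: nothing executes until a consumer asks for values.
    for item in items:
        calls.append(item)
        yield func(item)


def lazy_filter(pred, items):
    for item in items:
        if pred(item):
            yield item


data = [1, 2, 3, 4]

# Defining the pipeline plays the role of chaining transformations.
pipeline = lazy_filter(lambda x: x > 4, lazy_map(lambda x: x * 2, data))
calls_before = list(calls)  # snapshot: no work has been done yet

# Materializing the values plays the role of an action.
result = list(pipeline)

print(calls_before)
print(result)
print(calls)
```

The mapping function only runs once the pipeline is consumed, which mirrors how Spark defers transformations until an action is called.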

In this section, I will explain a few RDD transformations using a word count example in Scala. Before we start, let’s create an RDD by reading a text file. The text file used here is a dummy dataset; you can use any dataset.

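import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Local session with 3 worker threads
val spark:SparkSession = SparkSession.builder()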
      .master("local[3]")
      .appName("SparkByExamples.com")
      .getOrCreate()

val sc = spark.sparkContext

val rdd:RDD[String] = sc.textFile("src/main/scala/test.txt")

flatMap() Transformation

After applying the function, the flatMap() transformation flattens the RDD and creates a new RDD. The example below first divides each record in an RDD by space before flattening it. Each entry in the resulting RDD only contains one word.


val rdd2 = rdd.flatMap(f=>f.split(" "))

map() Transformation

The map() transformation applies a function to every record, for example to add or update a column, and its output always has the same number of records as its input.

In our word count example, we add a new column with the value 1 for each word. The result is a pair RDD of key-value pairs (which gains the methods of PairRDDFunctions), with the word of type String as the key and 1 of type Int as the value. I’ve declared the type of the rdd3 variable explicitly to make this clear.


val rdd3:RDD[(String,Int)]= rdd2.map(m=>(m,1))

filter() Transformation

The filter() transformation keeps only the records in an RDD that satisfy a predicate. In our example, we keep only the words that begin with “a”.


val rdd4 = rdd3.filter(a=> a._1.startsWith("a"))

reduceByKey() Transformation

reduceByKey() merges the values for each key using the supplied function. In our example, it sums the values for each word, so the resulting RDD contains each unique word together with its count.


val rdd5 = rdd3.reduceByKey(_ + _)
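
To make the data flow of the steps above concrete, here is a plain-Python analogy of the same word count pipeline. It does not use Spark; the sample lines are made up, and each step only mimics the corresponding transformation on ordinary lists.

```python
from collections import defaultdict

lines = ["a cat and a dog", "an apple"]  # made-up sample input

# flatMap analogue: split each line and flatten into one list of words
words = [word for line in lines for word in line.split(" ")]

# map analogue: pair every word with the count 1
pairs = [(word, 1) for word in words]

# filter analogue: keep only the pairs whose word starts with "a"
a_pairs = [pair for pair in pairs if pair[0].startswith("a")]

# reduceByKey analogue: sum the values for each distinct key
counts = defaultdict(int)
for word, n in pairs:
    counts[word] += n

print(words)
print(a_pairs)
print(dict(counts))
```

Note that the map step keeps one record per input word, the filter step can only shrink the data, and the reduce step leaves one record per distinct key.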

union(dataset)

With the union() function, the new RDD contains the elements of both RDDs, including duplicates. The two RDDs must be of the same type.
For instance, if RDD1’s elements are Spark, Spark, Hadoop, and Flink, and RDD2’s elements are Big data, Spark, and Flink, then rdd1.union(rdd2) will contain Spark, Spark, Hadoop, Flink, Big data, Spark, and Flink.

val rdd6 = rdd5.union(rdd3)

intersection(other-dataset)

With the intersection() function, the new RDD contains only the elements common to both RDDs. As with union(), the two RDDs must be of the same type.

// rdd5 and rdd3 are both RDD[(String, Int)]
val rdd7 = rdd5.intersection(rdd3)
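
The difference between the two operations is easiest to see on the small example from the union section. The sketch below is plain Python on lists, not Spark; it relies on the fact that Spark's union() keeps duplicates while intersection() returns each common element once. The sorting is only there to make the printed output deterministic, since Spark does not guarantee an order.

```python
rdd1 = ["Spark", "Spark", "Hadoop", "Flink"]
rdd2 = ["Big data", "Spark", "Flink"]

# union analogue: simple concatenation, duplicates are kept
union = rdd1 + rdd2

# intersection analogue: common elements, each reported once
intersection = sorted(set(rdd1) & set(rdd2))

print(union)
print(intersection)
```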

Conclusion

In this Spark RDD Transformations blog, you have learned different transformation functions and their usage with scala examples. In the next blog, we will learn about actions.

Happy Learning !!

Original article source at: https://blog.knoldus.com/

#spark #transform 

Implementation Of The Uniform Discrete Curvelet Transform (UDCT)

Curvelet.jl - The 2D Curvelet Transform

The curvelet transform is a fairly recent image processing technique that is able to easily approximate curves present in images. This package is an implementation of the “Uniform Discrete Curvelet Transform” as described in “Uniform Discrete Curvelet Transform” by Truong T. Nguyen and Hervé Chauris.

Basic usage is as follows:

require("src/Curvelet.jl")
x = rand(128,128)
X = Curvelet.curveletTransform(x)
y = Curvelet.inverseCurveletTransform(X)

Restrictions

Currently this transform works only for a simple class of inputs: square images with dimensions that are powers of two in length and at least 16x16.
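
The restriction can be expressed as a small pre-check on the input dimensions. This is a sketch in Python of the rule stated above, not part of Curvelet.jl; is_supported_shape is a hypothetical helper name.

```python
def is_supported_shape(rows, cols):
    """Check the stated restriction: square, power-of-two side, at least 16."""
    # A positive integer is a power of two when exactly one bit is set.
    is_power_of_two = rows > 0 and (rows & (rows - 1)) == 0
    return rows == cols and rows >= 16 and is_power_of_two


for shape in [(128, 128), (16, 16), (8, 8), (100, 100), (64, 128)]:
    print(shape, is_supported_shape(*shape))
```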

Download Details:

Author: Fundamental
Source Code: https://github.com/fundamental/Curvelet.jl 
License: View license

#julia #transform 

Dexter Goodwin

1660450140

Realtime Database Backend Based on Operational Transformation (OT)

ShareDB

ShareDB is a realtime database backend based on Operational Transformation (OT) of JSON documents. It is the realtime backend for the DerbyJS web application framework.

For help, questions, discussion and announcements, join the ShareJS mailing list or read the documentation.

Please report any bugs you find to the issue tracker.

Features

  • Realtime synchronization of any JSON document
  • Concurrent multi-user collaboration
  • Synchronous editing API with asynchronous eventual consistency
  • Realtime query subscriptions
  • Simple integration with any database
  • Horizontally scalable with pub/sub integration
  • Projections to select desired fields from documents and operations
  • Middleware for implementing access control and custom extensions
  • Ideal for use in browsers or on the server
  • Offline change syncing upon reconnection
  • In-memory implementations of database and pub/sub for unit testing
  • Access to historic document versions
  • Realtime user presence syncing

Examples

Counter

demo.gif

Leaderboard

demo.gif

Development

Documentation

The documentation is stored as Markdown files, but sometimes it can be useful to run these locally. The docs are served using Jekyll, and require Ruby >2.4.0 and Bundler:

gem install jekyll bundler

The docs can be built locally and served with live reload:

npm run docs:install
npm run docs:start

Documentation

https://share.github.io/sharedb/

Download Details:

Author: Share
Source Code: https://github.com/share/sharedb 
License: View license

#javascript #database #transform 

Monty Boehm

1659722100

A fresh approach to coordinate transformations...

CoordinateTransformations

CoordinateTransformations is a Julia package to manage simple or complex networks of coordinate system transformations. Transformations can be easily applied, inverted, composed, and differentiated (both with respect to the input coordinates and with respect to transformation parameters such as rotation angle). Transformations are designed to be light-weight and efficient enough for, e.g., real-time graphical applications, while support for both explicit and automatic differentiation makes it easy to perform optimization and therefore ideal for computer vision applications such as SLAM (simultaneous localization and mapping).

The package provides two main pieces of functionality:

Primarily, an interface for defining Transformations and applying (by calling), inverting (inv()), composing (∘ or compose()) and differentiating (transform_deriv() and transform_deriv_params()) them.

A small set of built-in, composable, primitive transformations for transforming 2D and 3D points (optionally leveraging the StaticArrays and Rotations packages).

Quick start

Let's translate a 3D point:

using CoordinateTransformations, Rotations, StaticArrays

x = SVector(1.0, 2.0, 3.0)  # SVector is provided by StaticArrays.jl
trans = Translation(3.5, 1.5, 0.0)

y = trans(x)

We can either apply different transformations in turn,

rot = LinearMap(RotX(0.3))  # Rotate 0.3 radians about X-axis, from Rotations.jl

z = trans(rot(x))

or build a composed transformation using the ∘ operator (accessible at the REPL by typing \circ then tab):

composed = trans ∘ rot  # alternatively, use compose(trans, rot)

composed(x) == z

A composition of a Translation and a LinearMap results in an AffineMap.

We can invert the transformation:

composed_inv = inv(composed)

composed_inv(z) == x

For any transformation, we can shift the origin to a new point using recenter:

rot_around_x = recenter(rot, x)

Now rot_around_x is a rotation around the point x = SVector(1.0, 2.0, 3.0).

Finally, we can construct a matrix describing how the components of z vary with respect to the components of x:

∂z_∂x = transform_deriv(composed, x) # In general, the transform may be non-linear, and thus we require the value of x to compute the derivative

Or perhaps we want to know how y will change with respect to changes to the translation parameters:

∂y_∂θ = transform_deriv_params(trans, x)

The interface

Transformations are derived from Transformation. As an example, we have Translation{T} <: Transformation. A Translation will accept and translate points in a variety of formats, such as Vector or SVector, but in general your custom-defined Transformations could transform any Julia object.

Transformations can be reversed using inv(trans). They can be chained together using the ∘ operator (trans1 ∘ trans2) or the compose function (compose(trans1, trans2)). In this case, trans2 is applied first to the data, before trans1. Composition may be intelligent, for instance by precomputing a new Translation by summing the elements of two existing Translations, and yet other transformations may compose to the IdentityTransformation. But by default, composition will result in a ComposedTransformation object which simply dispatches to apply the transformations in the correct order.

Finally, the matrix describing how differentials propagate through a transform can be calculated with the transform_deriv(trans, x) method. The derivative of the output with respect to the transformation parameters is accessed via transform_deriv_params(trans, x). Users currently have to overload these methods for their own transformations, as no fall-back automatic differentiation is currently included. Alternatively, all the built-in types and transformations are compatible with automatic differentiation techniques, and can be parameterized by DualNumbers' DualNumber or ForwardDiff's Dual.

Built-in transformations

A small number of 2D and 3D coordinate systems and transformations are included. We also have IdentityTransformation and ComposedTransformation, which allows us to nest together arbitrary transformations to create a complex yet efficient transformation chain.

Coordinate types

The package accepts any AbstractVector type for Cartesian coordinates (as well as FixedSizeArrays types in Julia v0.4 only). For speed, we recommend using a statically-sized container such as SVector{N} from StaticArrays.

We do provide a few specialist coordinate types. The Polar(r, θ) type is a 2D polar representation of a point, and similarly in 3D we have defined Spherical(r, θ, ϕ) and Cylindrical(r, θ, z).

Coordinate system transformations

Two-dimensional coordinates may be converted using these parameterless (singleton) transformations:

  1. PolarFromCartesian()
  2. CartesianFromPolar()

Three-dimensional coordinates may be converted using these parameterless transformations:

  1. SphericalFromCartesian()
  2. CartesianFromSpherical()
  3. SphericalFromCylindrical()
  4. CylindricalFromSpherical()
  5. CartesianFromCylindrical()
  6. CylindricalFromCartesian()

However, you may find it simpler to use the convenience constructors like Polar(SVector(1.0, 2.0)).
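
For readers unfamiliar with the underlying maths, the 2D conversions can be sketched in a few lines of Python. This is an illustration of the standard polar/Cartesian formulas, not the package's implementation; the function names below are hypothetical stand-ins for PolarFromCartesian() and CartesianFromPolar().

```python
import math


def polar_from_cartesian(x, y):
    """Return (r, theta) for the Cartesian point (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)


def cartesian_from_polar(r, theta):
    """Return (x, y) for the polar point (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)


r, theta = polar_from_cartesian(1.0, 1.0)
x, y = cartesian_from_polar(r, theta)  # round trip back to the start
print(r, theta)
print(x, y)
```

Using atan2 rather than a plain arctangent keeps the angle correct in all four quadrants.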

Translations

Translations can be applied to Cartesian coordinates in arbitrary dimensions, by e.g. Translation(Δx, Δy) or Translation(Δx, Δy, Δz) in 2D/3D, or by Translation(Δv) in general (with Δv an AbstractVector). Compositions of two Translations will intelligently create a new Translation by adding the translation vectors.

Linear transformations

Linear transformations (a.k.a. linear maps), including rotations, can be encapsulated in the LinearMap type, which is a simple wrapper of an AbstractMatrix.

You are able to provide any matrix of your choosing, but your choice of type will have a large effect on speed. For instance, if you know the dimensionality of your points (e.g. 2D or 3D) you might consider a statically sized matrix like SMatrix from StaticArrays.jl. We recommend performing 3D rotations using those from Rotations.jl for their speed and flexibility. Scaling will be efficient with Julia's built-in UniformScaling. Also note that compositions of two LinearMaps will intelligently create a new LinearMap by multiplying the transformation matrices.

Affine maps

An affine map encapsulates a more general set of transformations, which are defined by a composition of a translation and a linear transformation. An AffineMap is constructed from an AbstractVector translation v and an AbstractMatrix linear transformation M. It will perform the mapping x -> M*x + v, but the order of addition and multiplication will be more obvious (and controllable) if you construct it from a composition of a linear map and a translation, e.g. Translation(v) ∘ LinearMap(M) (or any combination of LinearMap, Translation and AffineMap).
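
Why the order of composition matters can be shown with a small numeric sketch. The code below is plain Python on lists, not the package's API; linear_map, translation and compose are hypothetical helpers that mimic LinearMap, Translation and ∘, with the inner (right-hand) transformation applied first.

```python
def linear_map(m):
    # x -> M*x for a matrix given as a list of rows
    return lambda x: [sum(row[j] * x[j] for j in range(len(x))) for row in m]


def translation(v):
    # x -> x + v
    return lambda x: [xi + vi for xi, vi in zip(x, v)]


def compose(outer, inner):
    # Mirrors outer ∘ inner: inner is applied first
    return lambda x: outer(inner(x))


M = [[0.0, -1.0], [1.0, 0.0]]  # rotation by 90 degrees
v = [1.0, 0.0]
x = [1.0, 2.0]

translate_after = compose(translation(v), linear_map(M))  # x -> M*x + v
translate_first = compose(linear_map(M), translation(v))  # x -> M*(x + v)

print(translate_after(x))
print(translate_first(x))
```

Only the first ordering matches the mapping x -> M*x + v; the second is still affine, but its effective translation is M*v.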

Perspective transformations

The perspective transformation maps real-space coordinates to those on a virtual "screen" of one lesser dimension. For instance, this process is used to render 3D scenes to 2D images in computer generated graphics and games. It is an ideal model of how a pinhole camera operates and is a good approximation of the modern photography process.

The PerspectiveMap() command creates a Transformation to perform the projective mapping. It can be applied individually, but is particularly powerful when composed with an AffineMap containing the position and orientation of the camera in your scene. For example, to transfer points in 3D space to 2D screen_points giving their projected locations on a virtual camera image, you might use the following code:

cam_transform = PerspectiveMap() ∘ inv(AffineMap(cam_rotation, cam_position))
screen_points = map(cam_transform, points)

There is also a cameramap() convenience function that can create a composed transformation that includes the intrinsic scaling (e.g. focal length and pixel size) and offset (defining which pixel is labeled (0,0)) of an imaging system.

Acknowledgements

FugroRoames

Author: JuliaGeometry
Source Code: https://github.com/JuliaGeometry/CoordinateTransformations.jl 
License: View license

#julia #transform 

A fresh approach to coordinate transformations...
Muhammad Price

1659511140

Roadie: Making HTML Emails Comfortable for The Ruby Rockstars

Roadie 

  
Warning: This gem is now in passive maintenance mode.

Making HTML emails comfortable for the Ruby rockstars

Roadie tries to make sending HTML emails a little less painful by inlining stylesheets and rewriting relative URLs for you inside your emails.

How does it work?

Email clients have bad support for stylesheets, and some of them block stylesheets from downloading. The easiest way to handle this is to work with inline styles (style="..."), but that is error-prone and hard to work with, as you cannot use classes and/or reuse styling across your HTML.

This gem makes this easier by automatically inlining stylesheets into the document. You give Roadie your CSS, or let it find it by itself from the <link> and <style> tags in the markup, and it will go through all of the selectors assigning the styles to the matching elements. Careful attention has been put into selectors being applied in the correct order, so it should behave just like in the browser.

"Dynamic" selectors (:hover, :visited, :focus, etc.), or selectors not understood by Nokogiri will be inlined into a single <style> element for those email clients that support it. This changes specificity a great deal for these rules, so it might not work 100% out of the box. (See more about this below)

Roadie also rewrites all relative URLs in the email to an absolute counterpart, making images you insert and those referenced in your stylesheets work. No more headaches about how to write the stylesheets while still having them work with emails from your acceptance environments. You can disable this on specific elements using a data-roadie-ignore marker.

Features

  • Writes CSS styles inline.
    • Respects !important styles.
    • Does not overwrite styles already present in the style attribute of tags.
    • Supports the same CSS selectors as Nokogiri; use CSS3 selectors in your emails!
    • Keeps :hover, @media { ... } and friends around in a separate <style> element.
  • Makes image urls absolute.
    • Hostname and port configurable on a per-environment basis.
    • Can be disabled on individual elements.
  • Makes link hrefs and img srcs absolute.
  • Automatically adds proper HTML skeleton when missing; you don't have to create a layout for emails.
    • Also supports HTML fragments / partial documents, where layout is not added.
  • Allows you to inject stylesheets in a number of ways, at runtime.
  • Removes data-roadie-ignore markers before finishing the HTML.

Install & Usage

Add this gem to your Gemfile as recommended by Rubygems and run bundle install.

gem 'roadie', '~> 4.0'

Your document instance can be configured with several options:

  • url_options - Dictates how absolute URLs should be built.
  • keep_uninlinable_css - Set to false to skip CSS that cannot be inlined.
  • merge_media_queries - Set to false to not group media queries. Some users might prefer not to group rules within media queries, because grouping results in rules getting reordered. For example:
    @media(max-width: 600px) { .col-6 { display: block; } }
    @media(max-width: 400px) { .col-12 { display: inline-block; } }
    @media(max-width: 600px) { .col-12 { display: block; } }
    will become
    @media(max-width: 600px) { .col-6 { display: block; } .col-12 { display: block; } }
    @media(max-width: 400px) { .col-12 { display: inline-block; } }
  • asset_providers - A list of asset providers that are invoked when CSS files are referenced. See below.
  • external_asset_providers - A list of asset providers that are invoked when absolute CSS URLs are referenced. See below.
  • before_transformation - A callback run before transformation starts.
  • after_transformation - A callback run after transformation is completed.

Making URLs absolute

In order to make URLs absolute you need to first configure the URL options of the document.

html = '... <a href="/about-us">Read more!</a> ...'
document = Roadie::Document.new html
document.url_options = {host: "myapp.com", protocol: "https"}
document.transform
  # => "... <a href=\"https://myapp.com/about-us\">Read more!</a> ..."

The following URLs will be rewritten for you:

  • a[href] (HTML)
  • img[src] (HTML)
  • url() (CSS)

You can disable URL rewriting on individual elements by adding a data-roadie-ignore marker to them. CSS will still be inlined on those elements, but URLs will not be rewritten.

<a href="|UNSUBSCRIBE_URL|" data-roadie-ignore>Unsubscribe</a>

Referenced stylesheets

By default, style and link elements in the email document's head are processed along with the stylesheets and removed from the head.

You can set a special data-roadie-ignore attribute on style and link tags that you want to ignore (the attribute will be removed, however). This is the place to put things like :hover selectors that you want to have for email clients allowing them.

Style and link elements with media="print" are also ignored.

<head>
  <link rel="stylesheet" type="text/css" href="/assets/emails/rock.css">         <!-- Will be inlined with normal providers -->
  <link rel="stylesheet" type="text/css" href="http://www.metal.org/metal.css">  <!-- Will be inlined with external providers, *IF* specified; otherwise ignored. -->
  <link rel="stylesheet" type="text/css" href="/assets/jazz.css" media="print">  <!-- Will NOT be inlined; print style -->
  <link rel="stylesheet" type="text/css" href="/ambient.css" data-roadie-ignore> <!-- Will NOT be inlined; ignored -->
  <style></style>                    <!-- Will be inlined -->
  <style data-roadie-ignore></style> <!-- Will NOT be inlined; ignored -->
</head>

Roadie will use the given asset providers to look for the actual CSS that is referenced. If you don't change the default, it will use the Roadie::FilesystemProvider which looks for stylesheets on the filesystem, relative to the current working directory.

Example:

# /home/user/foo/stylesheets/primary.css
body { color: green; }

# /home/user/foo/script.rb
html = <<-HTML
<html>
  <head>
  <link rel="stylesheet" type="text/css" href="/stylesheets/primary.css">
  </head>
  <body>
  </body>
</html>
HTML

Dir.pwd # => "/home/user/foo"
document = Roadie::Document.new html
document.transform # =>
                   # <!DOCTYPE html>
                   # <html>
                   #   <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>
                   #   <body style="color:green;"></body>
                   # </html>

If a referenced stylesheet cannot be found, the #transform method will raise a Roadie::CssNotFound error. If you instead want to ignore missing stylesheets, you can use the NullProvider.

Configuring providers

You can write your own providers if you need very specific behavior for your app, or you can use the built-in providers. Providers come in two groups: normal and external. Normal providers handle paths without host information (/style/foo.css) while external providers handle URLs with host information (//example.com/foo.css, localhost:3001/bar.css, and so on).

The default configuration is to not have any external providers configured, which will cause those referenced stylesheets to be ignored. Adding one or more providers for external assets causes all of them to be searched and inlined, so if you only want this to happen to specific stylesheets you need to add ignore markers to every other stylesheet (see above).

Included providers:

  • FilesystemProvider – Looks for files on the filesystem, relative to the given directory unless otherwise specified.
  • ProviderList – Wraps a list of other providers and searches them in order. The asset_providers setting is an instance of this. It behaves a lot like an array, so you can push, pop, shift and unshift to it.
  • NullProvider – Does not actually provide anything, it always finds empty stylesheets. Use this in tests or if you want to ignore stylesheets that cannot be found by your other providers (or if you want to force the other providers to never run).
  • NetHttpProvider – Downloads stylesheets using Net::HTTP. Can be given a whitelist of hosts to download from.
  • CachedProvider – Wraps another provider (or ProviderList) and caches responses inside the provided cache store.
  • PathRewriterProvider – Rewrites the passed path and then passes it on to another provider (or ProviderList).

If you want to search several locations on the filesystem, you can declare that:

document.asset_providers = [
  Roadie::FilesystemProvider.new(App.root.join("resources", "stylesheets")),
  Roadie::FilesystemProvider.new(App.root.join("system", "uploads", "stylesheets")),
]

NullProvider

If you want to ignore stylesheets that cannot be found instead of crashing, push the NullProvider to the end:

# Don't crash on missing assets
document.asset_providers << Roadie::NullProvider.new

# Don't download assets in tests
document.external_asset_providers.unshift Roadie::NullProvider.new

Note: This will cause the referenced stylesheet to be removed from the source code, so email clients will never see it either.

NetHttpProvider

The NetHttpProvider will download the URLs that it is given using Ruby's standard Net::HTTP library.

You can give it a whitelist of hosts that downloads are allowed from:

document.external_asset_providers << Roadie::NetHttpProvider.new(
  whitelist: ["myapp.com", "assets.myapp.com", "cdn.cdnnetwork.co.jp"],
)
document.external_asset_providers << Roadie::NetHttpProvider.new # Allows every host

CachedProvider

You might want to cache the results of providers so that they do not do the same work several times. If you are sending several emails quickly from the same process, this might also save a lot of time on parsing the stylesheets if you use in-memory storage such as a hash.

You can wrap any other kind of providers with it, even a ProviderList:

document.external_asset_providers = Roadie::CachedProvider.new(document.external_asset_providers, my_cache)

If you don't pass a cache backend, it will use a normal Hash. The cache store must follow this protocol:

my_cache["key"] = some_stylesheet_instance # => #<Roadie::Stylesheet instance>
my_cache["key"]                            # => #<Roadie::Stylesheet instance>
my_cache["missing"]                        # => nil

Warning: The default Hash store will never be cleared, so make sure you don't allow the number of unique asset paths to grow too large in a single run. This is especially important if you run Roadie in a daemon that accepts arbitrary documents, and/or if you use hash digests in your filenames. Making a new instance of CachedProvider will use a new Hash instance.

You can implement your own custom cache store by implementing the [] and []= methods.

class MyRoadieMemcacheStore
  def initialize(memcache)
    @memcache = memcache
  end

  def [](path)
    # Use the instance variable set in #initialize; there is no `memcache` reader method.
    css = @memcache.read("assets/#{path}/css")
    if css
      name = @memcache.read("assets/#{path}/name") || "cached #{path}"
      Roadie::Stylesheet.new(name, css)
    end
  end

  def []=(path, stylesheet)
    @memcache.write("assets/#{path}/css", stylesheet.to_s)
    @memcache.write("assets/#{path}/name", stylesheet.name)
    stylesheet # You need to return the set Stylesheet
  end
end

document.external_asset_providers = Roadie::CachedProvider.new(
  document.external_asset_providers,
  MyRoadieMemcacheStore.new(MemcacheClient.instance)
)

If you are using RSpec, you can test your implementation by using the shared examples for the "roadie cache store" role:

require "roadie/rspec"

describe MyRoadieMemcacheStore do
  let(:memcache_client) { MemcacheClient.instance }
  subject { MyRoadieMemcacheStore.new(memcache_client) }

  it_behaves_like "roadie cache store" do
    before { memcache_client.clear }
  end
end

PathRewriterProvider

With this provider, you can rewrite the paths that are searched in order to more easily support another provider. Examples could include rewriting absolute URLs into something that can be found on the filesystem, or to access internal hosts instead of external ones.

filesystem = Roadie::FilesystemProvider.new("assets")
document.asset_providers << Roadie::PathRewriterProvider.new(filesystem) do |path|
  path.sub('stylesheets', 'css').downcase
end

document.external_asset_providers = Roadie::PathRewriterProvider.new(filesystem) do |url|
  if url =~ /myapp\.com/
    URI.parse(url).path.sub(%r{^/assets}, '')
  else
    url
  end
end

You can also wrap a list, for example to implement external_asset_providers by composing the normal asset_providers:

document.external_asset_providers =
  Roadie::PathRewriterProvider.new(document.asset_providers) do |url|
    URI.parse(url).path
  end

Writing your own provider

Writing your own provider is also easy. You need to provide:

  • #find_stylesheet(name), returning either a Roadie::Stylesheet or nil.
  • #find_stylesheet!(name), returning either a Roadie::Stylesheet or raising Roadie::CssNotFound.

class UserAssetsProvider
  def initialize(user_collection)
    @user_collection = user_collection
  end

  def find_stylesheet(name)
    if name =~ %r{^/users/(\d+)\.css$}
      user = @user_collection.find_user($1)
      Roadie::Stylesheet.new("user #{user.id} stylesheet", user.stylesheet)
    end
  end

  def find_stylesheet!(name)
    find_stylesheet(name) or
      raise Roadie::CssNotFound.new(
        css_name: name, message: "does not match a user stylesheet", provider: self
      )
  end

  # Instead of implementing #find_stylesheet!, you could also:
  #     include Roadie::AssetProvider
  # That will give you a default implementation without any error message. If
  # you have multiple error cases, it's recommended that you implement
  # #find_stylesheet! without #find_stylesheet and raise with an explanatory
  # error message.
end

# Try to look for a user stylesheet first, then fall back to normal filesystem lookup.
document.asset_providers = [
  UserAssetsProvider.new(app),
  Roadie::FilesystemProvider.new('./stylesheets'),
]

You can test for compliance by using the built-in RSpec examples:

require 'spec_helper'
require 'roadie/rspec'

describe MyOwnProvider do
  # Will use the default `subject` (MyOwnProvider.new)
  it_behaves_like "roadie asset provider", valid_name: "found.css", invalid_name: "does_not_exist.css"

  # Extra setup just for these tests:
  it_behaves_like "roadie asset provider", valid_name: "found.css", invalid_name: "does_not_exist.css" do
    subject { MyOwnProvider.new(...) }
    before { stub_dependencies }
  end
end

Keeping CSS that is impossible to inline

Some CSS is impossible to inline properly. :hover and ::after come to mind. Roadie tries its best to keep these around by injecting them inside a new <style> element in the <head> (or at the beginning of the partial if transforming a partial document).

The problem here is that Roadie cannot possibly adjust the specificity for you, so they will not apply the same way as they did before the styles were inlined.

Another caveat is that a lot of email clients do not support this (which is the entire point of inlining in the first place), so don't put anything important in here. Always handle the case of these selectors not being part of the email.

Specificity problems

Inlined styles will have much higher specificity than styles in a <style>. Here's an example:

<style>p:hover { color: blue; }</style>
<p style="color: green;">Hello world</p>

When hovering over this <p>, the color will not change as the color: green rule takes precedence. You can get it to work by adding !important to the :hover rule.

It would be foolish to try to inject !important on every rule automatically, so this is a manual process.

Turning it off

If you'd rather skip this and have the styles not possible to inline disappear, you can turn off this feature by setting the keep_uninlinable_css option to false.

document.keep_uninlinable_css = false

Callbacks

Callbacks allow you to do custom work on documents before and after they are transformed. The Nokogiri document tree is passed to the callable along with the Roadie::Document instance:

class TrackNewsletterLinks
  def call(dom, document)
    dom.css("a").each { |link| fix_link(link) }
  end

  def fix_link(link)
    divider = (link['href'] =~ /\?/ ? '&' : '?')
    link['href'] = link['href'] + divider + 'source=newsletter'
  end
end

document.before_transformation = ->(dom, document) {
  logger.debug "Inlining document with title #{dom.at_css('head > title').try(:text)}"
}
document.after_transformation = TrackNewsletterLinks.new

XHTML vs HTML

You can configure the underlying HTML/XML engine to output XHTML or HTML (which is the default). One use case for this is that { tokens usually get escaped to &#123;, which would be a problem if you then pass the resulting HTML on to some other templating engine that uses those tokens (like Handlebars or Mustache).

document.mode = :xhtml

This will also affect the emitted <!DOCTYPE> if transforming a full document. Partial documents do not have a <!DOCTYPE>.

Build Status

Tested with GitHub CI using:

  • MRI 2.6
  • MRI 2.7
  • MRI 3.0
  • MRI 3.1

Let me know if you want any other runtime supported officially.

Versioning

This project follows Semantic Versioning and has done so since version 1.0.0.

FAQ

Why is my markup changed in subtle ways?

Roadie uses Nokogiri to parse and regenerate the HTML of your email, which means that some unintentional changes might show up.

One example would be that Nokogiri might remove your &nbsp;s in some cases.

Another example is Nokogiri's lack of HTML5 support, so certain new elements might have spaces removed. I recommend you don't use HTML5 in emails anyway because of bad email client support (that includes web mail!).

I'm getting segmentation faults (or other C-like problems)! What should I do?

Roadie uses Nokogiri to parse the HTML of your email, so any C-like problems such as segfaults are likely to originate there. The best way to fix this is to first upgrade libxml2 on your system and then reinstall Nokogiri. For instructions on how to do this on most platforms, see Nokogiri's official install guide.

What happened to my @keyframes?

The CSS Parser used in Roadie does not handle keyframes. I don't think any email clients do either, but if you want to keep on trying you can add them manually to a <style> element (or a separate referenced stylesheet) and tell Roadie not to touch them.

My @media queries are reordered, how can I fix this?

Different @media query blocks with the same conditions are merged by default, which will change the order in some cases. You can disable this by setting merge_media_queries to false. (See Install & Usage section above).

How do I get rid of the <body> elements that are added?

It sounds like you want to transform a partial document. Maybe you are building partials or template fragments to later place in other documents. Use Document#transform_partial instead of Document#transform in order to treat the HTML as a partial document.

Can I skip URL rewriting on a specific element?

If you add the data-roadie-ignore attribute on an element, URL rewriting will not be performed on that element. This could be really useful for you if you intend to send the email through some other rendering pipeline that replaces some placeholders/variables.

<a href="/about-us">About us</a>
<a href="|UNSUBSCRIBE_URL|" data-roadie-ignore>Unsubscribe</a>

Note that this will not skip CSS inlining on the element; it will still get the correct styles applied.

What should I do about "Invalid URL" errors?

If the URL is invalid on purpose, see Can I skip URL rewriting on a specific element? above. Otherwise, you can try to parse it yourself using Ruby's URI class and see if you can figure it out.

require "uri"
URI.parse("https://example.com/best image.jpg") # raises
URI.parse("https://example.com/best%20image.jpg") # Works!

Documentation

Running specs

bundle install
rake

Security

Roadie is set up with the assumption that all CSS and HTML passing through it is under your control. It is not recommended to run arbitrary HTML with the default settings.

Care has been given to try to secure all file system accesses, but it is never guaranteed that someone cannot access something they should not be able to access.

In order to secure Roadie against file system access, only use your own asset providers that you yourself can secure against your particular environment.

If you have found any security vulnerability, please email me at magnus.bergmark+security@gmail.com to disclose it. For very sensitive issues, please use my public GPG key. You can also encrypt your message with my public key and open an issue if you do not want to email me directly. Thank you.

History and contributors

This gem was previously tied to Rails. It is now framework-agnostic and supports any type of HTML documents. If you want to use it with Rails, check out roadie-rails.

Major contributors to Roadie:

You can see all contributors on GitHub.

License

(The MIT License)

Copyright (c) 2009-2022 Magnus Bergmark, Jim Neath / Purify, and contributors.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Author: Mange
Source code: https://github.com/Mange/roadie
License: MIT license

#ruby   #ruby-on-rails #html 

Edward Jackson

1658802077

What is Bag of Words (BoW)? BoW Explained with Examples

In this Natural Language Processing (NLP) tutorial, you'll learn what Bag of Words (BoW) is, why BoW is used, how to implement it in Python, and more.

  1. What is Bag of Words in NLP?
  2. Why is the Bag of Words algorithm used?
  3. Understanding Bag of Words with an example
  4. Implementing Bag of Words with Python
  5. Create a Bag of Words Model with Sklearn
  6. What are N-Grams?
  7. What is Tf-Idf ( term frequency-inverse document frequency)?
  8. Feature Extraction with Tf-Idf vectorizer
  9. Limitations of Bag of Words

Using Natural Language Processing, we make use of the text data available across the internet to generate insights for the business. To understand this huge amount of data and draw insights from it, we first need to make it usable. Natural language processing helps us do so.

What is a Bag of Words in NLP?

Bag of words is a Natural Language Processing technique of text modelling. In technical terms, we can say that it is a method of feature extraction with text data. This approach is a simple and flexible way of extracting features from documents.

A bag of words is a representation of text that describes the occurrence of words within a document. We just keep track of word counts and disregard the grammatical details and the word order. It is called a “bag” of words because any information about the order or structure of words in the document is discarded. The model is only concerned with whether known words occur in the document, not where in the document.
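A minimal sketch of this idea, using only Python's standard library. The two sample sentences are made up for illustration and do not come from the examples in this tutorial. Because only word counts are kept, two sentences that differ only in word order end up with the same bag:

```python
from collections import Counter

# Two illustrative sentences (assumed for this sketch) that use the
# same words in a different order.
sentence_a = "the dog chased the cat"
sentence_b = "the cat chased the dog"

# A "bag" is just a mapping from each word to how often it occurs.
bag_a = Counter(sentence_a.split())
bag_b = Counter(sentence_b.split())

# Word order is discarded, so both sentences produce the same bag.
same_bag = bag_a == bag_b
print(bag_a)
print(same_bag)
```

The meaning of the two sentences is clearly different, yet the model cannot tell them apart, which is the trade-off the name "bag" refers to.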
 

Why is the Bag-of-Words algorithm used?

So, why use bag-of-words? What is wrong with plain text?

One of the biggest problems with text is that it is messy and unstructured, whereas machine learning algorithms prefer structured, well-defined, fixed-length inputs. The Bag-of-Words technique lets us convert variable-length texts into fixed-length vectors.

Also, at a more granular level, machine learning models work with numerical data rather than textual data. To be more specific, by using the bag-of-words (BoW) technique, we convert a text into its equivalent vector of numbers.

Understanding Bag of Words with an example

Let us see an example of how the bag of words technique converts text into vectors.

Example 1: without preprocessing

Sentence 1: “Welcome to Great Learning, Now start learning”

Sentence 2: “Learning is a good practice”

| Sentence 1 | Sentence 2 |
|---|---|
| Welcome | Learning |
| to | is |
| Great | a |
| Learning | good |
| , | practice |
| Now | |
| start | |
| learning | |

Step 1: Go through all the words in the above text and make a list of all of the words in our model vocabulary.

  • Welcome
  • to
  • Great
  • Learning
  • ,
  • Now
  • start
  • learning
  • is
  • a
  • good
  • practice

Note that the words ‘Learning’ and ‘learning’ are not the same here because of the difference in their case, and hence both appear in the list. Also, note that the comma ‘,’ is included in the list.

Because we know the vocabulary has 12 words, we can use a fixed-length document-representation of 12, with one position in the vector to score each word.

The scoring method we use here is to count the occurrences of each word and mark 0 if the word is absent. This scoring method is the one most generally used.

The scoring of sentence 1 would look as follows:

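| Word | Frequency |
|---|---|
| Welcome | 1 |
| to | 1 |
| Great | 1 |
| Learning | 1 |
| , | 1 |
| Now | 1 |
| start | 1 |
| learning | 1 |
| is | 0 |
| a | 0 |
| good | 0 |
| practice | 0 |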

Writing the above frequencies in vector form:

Sentence 1 ➝ [ 1,1,1,1,1,1,1,1,0,0,0,0 ]

Now for sentence 2, the scoring would look like this:

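| Word | Frequency |
|---|---|
| Welcome | 0 |
| to | 0 |
| Great | 0 |
| Learning | 1 |
| , | 0 |
| Now | 0 |
| start | 0 |
| learning | 0 |
| is | 1 |
| a | 1 |
| good | 1 |
| practice | 1 |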

Similarly, writing the above frequencies in the vector form

Sentence 2 ➝ [ 0,0,0,1,0,0,0,0,1,1,1,1 ]
 

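| Sentence | Welcome | to | Great | Learning | , | Now | start | learning | is | a | good | practice |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sentence 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| Sentence 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |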
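The steps of Example 1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names build_vocab and presence_vector are made up for this sketch, and the comma is written with a space before it so that split() treats it as its own token, as in the word lists above.

```python
def build_vocab(token_lists):
    """Collect every distinct token, keeping first-seen order."""
    vocab = []
    for tokens in token_lists:
        for token in tokens:
            if token not in vocab:
                vocab.append(token)
    return vocab


def presence_vector(tokens, vocab):
    """Score 1 if a vocabulary word occurs in tokens, otherwise 0."""
    return [1 if word in tokens else 0 for word in vocab]


# No preprocessing: case is kept, and the comma counts as a token.
tokens_1 = "Welcome to Great Learning , Now start learning".split()
tokens_2 = "Learning is a good practice".split()

vocab = build_vocab([tokens_1, tokens_2])
vector_1 = presence_vector(tokens_1, vocab)
vector_2 = presence_vector(tokens_2, vocab)

print(len(vocab))  # 12 vocabulary entries
print(vector_1)
print(vector_2)
```

Both vectors have 12 positions, one per vocabulary entry, even though the sentences have different lengths.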

But is this the best way to build a bag of words? The above example is not the best illustration of how to use the technique. The words Learning and learning, although they have the same meaning, are counted as two separate words. Also, the comma ‘,’, which does not convey any information, is included in the vocabulary.

Let us make some changes and see how we can use bag of words in a more effective way.

Example(2) with preprocessing

Sentence 1: “Welcome to Great Learning, Now start learning”

Sentence 2: “Learning is a good practice”
 

Step 1: Convert the above sentences to lower case, as the case of a word does not hold any information.

Step 2: Remove special characters and stopwords from the text. Stopwords are words that do not carry much information about the text, like ‘is’, ‘a’, ‘the’ and many more.

After applying the above steps, the sentences become:

Sentence 1: “welcome great learning now start learning”

Sentence 2: “learning good practice”

Although the above sentences do not make much sense, most of the information is contained in these words.

Step 3: Go through all the words in the above text and make a list of all of the words in our model vocabulary.

  • welcome
  • great
  • learning
  • now
  • start
  • good
  • practice

Now as the vocabulary has only 7 words, we can use a fixed-length document-representation of 7, with one position in the vector to score each word.

The scoring method we use here is the same as in the previous example. For sentence 1, the word counts are as follows:

| Word | Frequency |
|---|---|
| welcome | 1 |
| great | 1 |
| learning | 2 |
| now | 1 |
| start | 1 |
| good | 0 |
| practice | 0 |

Writing the above frequencies in the vector 
 

Sentence 1 ➝ [ 1,1,2,1,1,0,0 ]
 

Now for sentence 2, the scoring would look like this:

| Word | Frequency |
|---|---|
| welcome | 0 |
| great | 0 |
| learning | 1 |
| now | 0 |
| start | 0 |
| good | 1 |
| practice | 1 |

Similarly, writing the above frequencies in the vector form

Sentence 2 ➝ [ 0,0,1,0,0,1,1 ]
 

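| Sentence | welcome | great | learning | now | start | good | practice |
|---|---|---|---|---|---|---|---|
| Sentence 1 | 1 | 1 | 2 | 1 | 1 | 0 | 0 |
| Sentence 2 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |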

The approach used in example two is the one that is generally used in the Bag-of-Words technique, the reason being that the datasets used in Machine learning are tremendously large and can contain vocabulary of a few thousand or even millions of words. Hence, preprocessing the text before using bag-of-words is a better way to go.

In the examples above we use all the words from vocabulary to form a vector, which is neither a practical way nor the best way to implement the BoW model. In practice, only a few words from the vocabulary, more preferably most common words are used to form the vector. 
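One way to restrict the vector to the most common words is sketched below with collections.Counter from the standard library. The helper names top_k_vocab and count_vector, and the choice of k=3, are assumptions made for this illustration rather than part of any library.

```python
from collections import Counter


def top_k_vocab(token_lists, k):
    """Return the k most frequent tokens across all documents."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return [word for word, _ in counts.most_common(k)]


def count_vector(tokens, vocab):
    """Count how often each vocabulary word occurs in tokens."""
    return [tokens.count(word) for word in vocab]


# The two preprocessed sentences from Example 2.
docs = [
    "welcome great learning now start learning".split(),
    "learning good practice".split(),
]

vocab = top_k_vocab(docs, k=3)
vectors = [count_vector(doc, vocab) for doc in docs]

print(vocab)
print(vectors)
```

With only three positions per vector, words outside the chosen vocabulary (such as good and practice here) are simply dropped, so k has to be large enough to keep the words that matter for the task.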

Implementing Bag of Words Algorithm with Python

In this section, we are going to implement a bag of words algorithm with Python. This is a very basic implementation meant to show how the bag of words algorithm works, so I would not recommend using it in your project; instead, use the method described in the next section.

def vectorize(tokens):
    '''Take the list of words in a sentence and return a vector with one
    entry per word in filtered_vocab: the count of that word in tokens,
    or 0 if the word is not present.'''
    vector = []
    for w in filtered_vocab:
        vector.append(tokens.count(w))
    return vector


def unique(sequence):
    '''Return a list in which the order stays the same and no item
    repeats. Using set() alone does not preserve the original ordering,
    so the set is only used to remember what has been seen.'''
    seen = set()
    return [x for x in sequence if not (x in seen or seen.add(x))]


# Create a list of stopwords. You can import stopwords from nltk too.
stopwords = ["to", "is", "a"]
# List of special characters. You can use regular expressions too.
special_char = [",", ":", " ", ";", ".", "?"]

# Write the sentences in the corpus; in our case, just two.
# The comma is surrounded by spaces so that split() returns it as its own token.
string1 = "Welcome to Great Learning , Now start learning"
string2 = "Learning is a good practice"

# Convert them to lower case
string1 = string1.lower()
string2 = string2.lower()

# Split the sentences into tokens
tokens1 = string1.split()
tokens2 = string2.split()
print(tokens1)
print(tokens2)

# Create a vocabulary list
vocab = unique(tokens1 + tokens2)
print(vocab)

# Filter the vocabulary list
filtered_vocab = []
for w in vocab:
    if w not in stopwords and w not in special_char:
        filtered_vocab.append(w)
print(filtered_vocab)

# Convert sentences into vectors
vector1 = vectorize(tokens1)
print(vector1)
vector2 = vectorize(tokens2)
print(vector2)

Output:

Create a Bag of Words Model with Sklearn

We can use the CountVectorizer() class from the scikit-learn library to easily implement the above BoW model in Python.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
 
sentence_1="This is a good job.I will not miss it for anything"
sentence_2="This is not good at all"
 
 
 
CountVec = CountVectorizer(ngram_range=(1,1), # to use bigrams ngram_range=(2,2)
                           stop_words='english')
#transform
Count_data = CountVec.fit_transform([sentence_1,sentence_2])
 
#create dataframe
cv_dataframe=pd.DataFrame(Count_data.toarray(),columns=CountVec.get_feature_names_out())  # get_feature_names() was removed in scikit-learn 1.2
print(cv_dataframe)

What are N-Grams?

The same questions again: what are n-grams, and why do we use them? Let us understand this with the example below.

Sentence 1: “This is a good job. I will not miss it for anything”

Sentence 2: ”This is not good at all”

For this example, let us take the vocabulary of 5 words only. The five words being-

  • good
  • job
  • miss
  • not
  • all

So, the respective vectors for these sentences are:

“This is a good job. I will not miss it for anything”=[1,1,1,1,0]

”This is not good at all”=[1,0,0,1,1]

Can you guess what the problem is here? Sentence 2 is a negative sentence and sentence 1 is a positive sentence. Is this reflected in any way in the vectors above? Not at all. So how can we solve this problem? Here N-grams come to our rescue.

An N-gram is a sequence of N tokens: a 2-gram (more commonly called a bigram) is a two-word sequence like “really good”, “not good”, or “your homework”, and a 3-gram (more commonly called a trigram) is a three-word sequence like “not at all”, or “turn off light”.

For example, the bigrams in sentence 2, “This is not good at all”, are as follows:

  • “This is”
  • “is not”
  • “not good”
  • “good at”
  • “at all”

Now suppose that instead of single words we use the bigrams shown above (a bag-of-bigrams). The model can then differentiate between sentence 1 and sentence 2, because only sentence 2 contains the bigram “not good”. Using bigrams also makes tokens more informative (for example, “HSR Layout”, in Bengaluru, is more informative than “HSR” and “layout” separately).

So a bag-of-bigrams representation can be considerably more powerful than bag-of-words, and in many cases proves very hard to beat.
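To make the bigram idea concrete, here is a small sketch that extracts n-grams from a list of tokens. The ngrams helper is a hypothetical name chosen for this example. Punctuation has been stripped from the two sentences by hand to keep the tokenization trivial.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence_1 = "this is a good job i will not miss it for anything".split()
sentence_2 = "this is not good at all".split()

bigrams_1 = ngrams(sentence_1, 2)
bigrams_2 = ngrams(sentence_2, 2)

print(bigrams_2)
# The bigram "not good" appears only in sentence 2
print("not good" in bigrams_1, "not good" in bigrams_2)
```

With single words, both sentences contain "not" and "good". With bigrams, the negation stays attached to the word it modifies.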

What is Tf-Idf ( term frequency-inverse document frequency)?

The scoring method used above takes the count of each word and represents the word in the vector by that count. What does a high word count signify?

Does it mean that the word is important for retrieving information about documents? The answer is no. If a word occurs many times in a document but also occurs in many other documents in the dataset, it may simply be a frequent word rather than a relevant or meaningful one.

One approach is to rescale the frequency of words by how often they appear in all documents, so that the scores for words like “the” that are frequent across all documents are penalized. This approach to scoring is called term frequency-inverse document frequency, or Tf-Idf for short. TF-IDF is intended to reflect how relevant a term is in a given document. So how is the Tf-Idf of a word in a document calculated?

TF-IDF for a word in a document is calculated by multiplying two different metrics:

The term frequency (TF) of a word in a document. There are several ways of calculating this frequency, the simplest being a raw count of the times a word appears in a document. The raw count can also be adjusted, for example by dividing it by the length of the document or by the raw frequency of the most frequent word in the document. The length-normalized formula for term frequency is

TF(i,j) = n(i,j) / Σ_k n(k,j)

Where,

n(i,j) = number of times word i occurs in document j
Σ_k n(k,j) = total number of words in document j

The inverse document frequency (IDF) of the word across a set of documents. This indicates how common or rare a word is in the entire document set. The closer it is to 0, the more common the word. This metric can be calculated by taking the total number of documents, dividing it by the number of documents that contain the word, and taking the logarithm.

So, if the word is very common and appears in many documents, this number approaches 0. The rarer the word, the larger the number, up to log(N) for a word that appears in only one of N documents.

Multiplying these two numbers results in the TF-IDF score of a word in a document. The higher the score, the more relevant that word is in that particular document.

To put it in mathematical terms, the IDF used here is calculated as follows:

IDF = 1 + log(N/dN)

Where

N = total number of documents in the dataset
dN = number of documents in which the word occurs

Note that the 1 added in the above formula ensures that terms with zero IDF, that is, terms that occur in every document, are not suppressed entirely. IDF smoothing is a separate adjustment: it adds 1 to both N and dN, as if one extra document containing every term had been seen, which prevents division by zero.

The TF-IDF is obtained by 

TF-IDF=TF*IDF
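Before reaching for a library, the formulas above can be followed by hand. This is a minimal sketch under a few assumptions: the function names are made up for this example, the logarithm is the natural log, and every term looked up occurs in at least one document. It will not reproduce the numbers from a library vectorizer, which may use raw counts and normalize the resulting vectors.

```python
import math

def term_frequency(term, tokens):
    """TF: occurrences of the term divided by the document length."""
    return tokens.count(term) / len(tokens)

def inverse_document_frequency(term, corpus):
    """IDF = 1 + log(N / dN), assuming the term occurs in at least one document."""
    containing = sum(1 for tokens in corpus if term in tokens)
    return 1 + math.log(len(corpus) / containing)

def tf_idf(term, tokens, corpus):
    return term_frequency(term, tokens) * inverse_document_frequency(term, corpus)

corpus = [
    "this is a good job i will not miss it for anything".split(),
    "this is not good at all".split(),
]

# "good" occurs in both documents, "all" only in the second one
score_good = tf_idf("good", corpus[1], corpus)
score_all = tf_idf("all", corpus[1], corpus)
print(score_good, score_all)
```

Both words occur once in the second sentence, so their TF is equal. The word "all" scores higher only because it is rarer across the corpus.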

Does this seem too complicated? Don’t worry, this can be attained with just a few lines of code and you don’t even have to remember these scary formulas.

Feature Extraction with Tf-Idf vectorizer

We can use the TfidfVectorizer() class from the scikit-learn library to easily implement the above BoW (Tf-Idf) model.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
 
sentence_1="This is a good job.I will not miss it for anything"
sentence_2="This is not good at all"
 
 
 
#without smooth IDF
print("Without Smoothing:")
#define tf-idf
tf_idf_vec = TfidfVectorizer(use_idf=True, 
                        smooth_idf=False,  
                        ngram_range=(1,1),stop_words='english') # to use only  bigrams ngram_range=(2,2)
#transform
tf_idf_data = tf_idf_vec.fit_transform([sentence_1,sentence_2])
 
#create dataframe
tf_idf_dataframe=pd.DataFrame(tf_idf_data.toarray(),columns=tf_idf_vec.get_feature_names_out())  # get_feature_names() was removed in scikit-learn 1.2
print(tf_idf_dataframe)
print("\n")
 
#with smooth
tf_idf_vec_smooth = TfidfVectorizer(use_idf=True,  
                        smooth_idf=True,  
                        ngram_range=(1,1),stop_words='english')
 
 
tf_idf_data_smooth = tf_idf_vec_smooth.fit_transform([sentence_1,sentence_2])
 
print("With Smoothing:")
tf_idf_dataframe_smooth=pd.DataFrame(tf_idf_data_smooth.toarray(),columns=tf_idf_vec_smooth.get_feature_names_out())
print(tf_idf_dataframe_smooth)

Limitations of Bag-of-Words

Although Bag-of-Words is efficient and easy to implement, the technique has some disadvantages:

  1. The model ignores word order. Position is very important information in text. For example, “today is off” and “Is today off” have exactly the same vector representation in the BoW model.
  2. Bag-of-Words models do not capture the semantics of words. For example, the words ‘soccer’ and ‘football’ are often used in the same context, yet the vectors corresponding to these words are quite different in the Bag-of-Words model. The problem becomes more serious when modeling sentences. For example, “Buy used cars” and “Purchase old automobiles” are represented by totally different vectors in the Bag-of-Words model.
  3. Vocabulary coverage is a big issue for the Bag-of-Words model. If the model comes across a word that is not in its vocabulary, for instance a rare but informative word like Biblioklept (one who steals books), it will ignore that word.
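The first two limitations are easy to reproduce. The sketch below uses a hypothetical bow helper written for this example rather than any library call.

```python
from collections import Counter

def bow(sentence, vocab):
    """Bag-of-words count vector for a sentence over a fixed vocabulary."""
    counts = Counter(sentence.lower().split())
    return [counts[w] for w in vocab]

# Limitation 1: word order is lost
order_vocab = ["today", "is", "off"]
statement = bow("today is off", order_vocab)
question = bow("Is today off", order_vocab)
print(statement == question)

# Limitation 2: synonyms share no dimensions
synonym_vocab = ["buy", "used", "cars", "purchase", "old", "automobiles"]
v1 = bow("Buy used cars", synonym_vocab)
v2 = bow("Purchase old automobiles", synonym_vocab)
overlap = sum(a * b for a, b in zip(v1, v2))
print(v1, v2, overlap)
```

The dot product of the two synonym vectors is zero, so a similarity measure built on these vectors treats the sentences as unrelated.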

Original article source at https://www.mygreatlearning.com

#bagofwords #python #datascience #nlp 


Inline Specific Values From A JSON File Or The Whole JSON Blob

ts-transform-json

Inline specific values from a JSON file or the whole JSON blob. For example:

import {version} from 'package.json'
// becomes
var version = '1.0.5'

// OR
import * as packageJson from 'package.json'
// becomes
var packageJson = {"version": "1.0.5", dependencies: {}}

Usage

First of all, you need some level of familiarity with the TypeScript Compiler API.

compile.ts & tests should have examples of how this works. The available options are:

isDeclaration?: boolean

Whether you're running this transformer in declaration files (typically specified in afterDeclarations instead of after in transformer list). This flag will inline types instead of actual value.

Author: longlho
Source Code: https://github.com/longlho/ts-transform-json 
License: MIT license

#typescript #json #transform 


React-rails: integrate React.js with Rails Views and Controllers

React-Rails  

React-Rails is a flexible tool to use React with Rails. The benefits:

  • Automatically renders React server-side and client-side
  • Supports Webpacker 4.x, 3.x, 2.x, 1.1+
  • Supports Sprockets 4.x, 3.x, 2.x
  • Lets you use JSX, ES6, TypeScript, CoffeeScript 

Get started with Webpacker

Alternatively, get started with Sprockets

Webpacker provides modern JS tooling for Rails. Here are the steps for integrating Webpacker and React-Rails with Rails:

1) Create a new Rails app:

$ rails new my-app
$ cd my-app

2) Add react-rails to your Gemfile:

gem 'react-rails'

Note: On Rails versions < 6.0, you need to add gem 'webpacker' to your Gemfile in step 2 above.

3) Now run the installers:

Rails 6.x and 5.x:

$ bundle install
$ rails webpacker:install         # OR (on rails version < 5.0) rake webpacker:install
$ rails webpacker:install:react   # OR (on rails version < 5.0) rake webpacker:install:react
$ rails generate react:install

This gives you:

  • app/javascript/components/ directory for your React components
  • ReactRailsUJS setup in app/javascript/packs/application.js
  • app/javascript/packs/server_rendering.js for server-side rendering

Note: On rails versions < 6.0, link the JavaScript pack in Rails view using javascript_pack_tag helper:

<!-- application.html.erb in Head tag below turbolinks -->
<%= javascript_pack_tag 'application' %>

4) Generate your first component:

$ rails g react:component HelloWorld greeting:string

5) You can also generate your component in a subdirectory:

$ rails g react:component my_subdirectory/HelloWorld greeting:string

Note: Your component is added to app/javascript/components/ by default.

Note: If your component is in a subdirectory you will append the directory path to your erb component call.

Example:

<%= react_component("my_subdirectory/HelloWorld", { greeting: "Hello from react-rails." }) %>

6) Render it in a Rails view:

<!-- erb: paste this in view -->
<%= react_component("HelloWorld", { greeting: "Hello from react-rails." }) %>

7) Start the app:

$ rails s

Output: greeting: Hello from react-rails. Inspect the webpage in your browser to see the change in the tag props.

Component name

The component name tells react-rails where to load the component. For example:

react_component call | component require
react_component("Item") | require("Item")
react_component("items/index") | require("items/index")
react_component("items.Index") | require("items").Index
react_component("items.Index.Header") | require("items").Index.Header

This way, you can access top-level, default, or named exports.
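The lookup rule in the table can be modeled in a few lines. The following is only a Python model of the mapping described above, not the library's JavaScript: the part of the name before the first dot selects the module to require, and each remaining part is read as a property of the result. The resolve_component function and the modules dictionary are hypothetical stand-ins for require and the files it would load.

```python
def resolve_component(name, modules):
    """Model of the name lookup: 'a.B.C' -> modules['a']['B']['C']."""
    module_name, *export_path = name.split(".")
    target = modules[module_name]          # stands in for require(module_name)
    for key in export_path:                # walk the named exports
        target = target[key]
    return target

# Stand-in for the files that require() would load (assumed for illustration)
modules = {
    "Item": "ItemComponent",
    "items/index": "ItemsIndexComponent",
    "items": {"Index": {"Header": "HeaderComponent"}},
}

print(resolve_component("Item", modules))
print(resolve_component("items/index", modules))
print(resolve_component("items.Index.Header", modules))
```

Note that a slash stays part of the module path, while a dot switches from choosing the file to choosing an export inside it.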

The require.context inserted into packs/application.js is used to load components. If you want to load components from a different directory, override it by calling ReactRailsUJS.useContext:

var myCustomContext = require.context("custom_components", true)
var ReactRailsUJS = require("react_ujs")
// use `custom_components/` for <%= react_component(...) %> calls
ReactRailsUJS.useContext(myCustomContext)

If require fails to find your component, ReactRailsUJS falls back to the global namespace, described in Use with Asset Pipeline.

File naming

React-Rails supports plenty of file extensions, such as .js, .jsx.js, .js.jsx, .es6.js and .coffee. This can sometimes cause confusion when searching for filenames.

Component File Name | react_component call
app/javascript/components/samplecomponent.js | react_component("samplecomponent")
app/javascript/components/sample_component.js | react_component("sample_component")
app/javascript/components/SampleComponent.js | react_component("SampleComponent")
app/javascript/components/SampleComponent.js.jsx | Has to be renamed to SampleComponent.jsx, then use react_component("SampleComponent")

Typescript support

If you want to use React-Rails with Typescript, simply run the installer and add @types:

$ bundle exec rails webpacker:install:typescript
$ yarn add @types/react @types/react-dom

Doing this will allow React-Rails to support the .tsx extension. Additionally, it is recommended to add ts and tsx to the server_renderer_extensions in your application configuration:

config.react.server_renderer_extensions = ["jsx", "js", "tsx", "ts"]

Test component

You can use assert_react_component to test component render:

<!-- app/views/welcome/index.html.erb -->
<%= react_component("HelloWorld", { greeting: "Hello from react-rails.", info: { name: "react-rails" } }, { class: "hello-world" }) %>

class WelcomeControllerTest < ActionDispatch::IntegrationTest
  test 'assert_react_component' do
    get "/welcome"
    assert_equal 200, response.status

    # assert rendered react component and check the props
    assert_react_component "HelloWorld" do |props|
      assert_equal "Hello from react-rails.", props[:greeting]
      assert_equal "react-rails", props[:info][:name]
      assert_select "[class=?]", "hello-world"
    end

    # or just assert component rendered
    assert_react_component "HelloWorld"
  end
end

Use with Asset Pipeline

react-rails provides a pre-bundled React.js & a UJS driver to the Rails asset pipeline. Get started by adding the react-rails gem:

gem 'react-rails'

And then install the react generator:

$ rails g react:install

Then restart your development server.

This will:

  • add some //= requires to application.js
  • add a components/ directory for React components
  • add server_rendering.js for server-side rendering

Now, you can create React components in .jsx files:

// app/assets/javascripts/components/post.jsx

window.Post = createReactClass({
  render: function() {
    return <h1>{this.props.title}</h1>
  }
})

// or, equivalent:
class Post extends React.Component {
  render() {
    return <h1>{this.props.title}</h1>
  }
}

Then, you can render those components in views:

<%= react_component("Post", {title: "Hello World"}) %>

Components must be accessible from the top level, but they may be namespaced, for example:

<%= react_component("Comments.NewForm", {post_id: @post.id}) %>
<!-- looks for `window.Comments.NewForm` -->

Custom JSX Transformer

react-rails uses a transformer class to transform JSX in the asset pipeline. The transformer is initialized once, at boot. You can provide a custom transformer to config.react.jsx_transformer_class. The transformer must implement:

  • #initialize(options), where options is the value passed to config.react.jsx_transform_options
  • #transform(code_string) to return a string of transformed code

react-rails provides two transformers, React::JSX::BabelTransformer (which uses ruby-babel-transpiler) and React::JSX::JSXTransformer (which uses the deprecated JSXTransformer.js).

Transform Plugin Options

To supply additional transform plugins to your JSX Transformer, assign them to config.react.jsx_transform_options

react-rails uses the Babel version of the babel-source gem.

For example, to use babel-plugin-transform-class-properties :

config.react.jsx_transform_options = {
  optional: ['es7.classProperties']
}

React.js versions

//= require react brings React into your project.

By default, React's development version is provided to Rails.env.development. You can override the React build with a config:

# Here are the defaults:
# config/environments/development.rb
MyApp::Application.configure do
  config.react.variant = :development
end

# config/environments/production.rb
MyApp::Application.configure do
  config.react.variant = :production
end

Be sure to restart your Rails server after changing these files. See VERSIONS.md to learn which version of React.js is included with your react-rails version. In some edge cases you may need to bust the sprockets cache with rake tmp:clear

View Helper

react-rails includes a view helper and an unobtrusive JavaScript driver which work together to put React components on the page.

The view helper (react_component) puts a div on the page with the requested component class & props. For example:

<%= react_component('HelloMessage', name: 'John') %>
<!-- becomes: -->
<div data-react-class="HelloMessage" data-react-props="{&quot;name&quot;:&quot;John&quot;}"></div>

On page load, the react_ujs driver will scan the page and mount components using data-react-class and data-react-props.
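To see how the props end up in the data-react-props attribute, here is a small Python model of the markup shown above. It is an illustration only and not the gem's Ruby implementation. The react_component function below is a hypothetical stand-in that serializes the props to compact JSON and HTML-escapes the result.

```python
import html
import json

def react_component(name, props=None, tag="div"):
    """Build the placeholder element that the UJS driver later mounts into."""
    props_json = json.dumps(props or {}, separators=(",", ":"))
    return '<{tag} data-react-class="{cls}" data-react-props="{props}"></{tag}>'.format(
        tag=tag,
        cls=html.escape(name, quote=True),
        props=html.escape(props_json, quote=True),  # " becomes &quot;
    )

markup = react_component("HelloMessage", {"name": "John"})
print(markup)
```

The escaping is what allows a JSON string full of double quotes to sit safely inside a double-quoted HTML attribute.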

The view helper's signature is:

react_component(component_class_name, props={}, html_options={})
  • component_class_name is a string which identifies a component. See getConstructor for details.
  • props is either:
    • an object that responds to #to_json; or
    • an already-stringified JSON object (see JBuilder note below).
  • html_options may include:
    • tag: to use an element other than a div to embed data-react-class and data-react-props.
    • prerender: true to render the component on the server.
    • camelize_props to transform a props hash
    • **other Any other arguments (eg class:, id:) are passed through to content_tag.

Custom View Helper

react-rails uses a "helper implementation" class to generate the output of the react_component helper. The helper is initialized once per request and used for each react_component call during that request. You can provide a custom helper class to config.react.view_helper_implementation. The class must implement:

  • #react_component(name, props = {}, options = {}, &block) to return a string to inject into the Rails view
  • #setup(controller_instance), called when the helper is initialized at the start of the request
  • #teardown(controller_instance), called at the end of the request

react-rails provides one implementation, React::Rails::ComponentMount.

UJS

react-rails's JavaScript is available as "react_ujs" in the asset pipeline or from NPM. It attaches itself to the window as ReactRailsUJS.

Mounting & Unmounting

Usually, react-rails mounts & unmounts components automatically as described in Event Handling below.

You can also mount & unmount components from <%= react_component(...) %> tags using UJS:

// Mount all components on the page:
ReactRailsUJS.mountComponents()
// Mount components within a selector:
ReactRailsUJS.mountComponents(".my-class")
// Mount components within a specific node:
ReactRailsUJS.mountComponents(specificDOMnode)

// Unmounting works the same way:
ReactRailsUJS.unmountComponents()
ReactRailsUJS.unmountComponents(".my-class")
ReactRailsUJS.unmountComponents(specificDOMnode)

You can use this when the DOM is modified by AJAX calls or modal windows.

Event Handling

ReactRailsUJS checks for various libraries to support their page change events:

  • Turbolinks
  • pjax
  • jQuery
  • Native DOM events

ReactRailsUJS will automatically mount components on <%= react_component(...) %> tags and unmount them when appropriate.

If you need to re-detect events, you can call detectEvents:

// Remove previous event handlers and add new ones:
ReactRailsUJS.detectEvents()

For example, if Turbolinks is loaded after ReactRailsUJS, you'll need to call this again. This function removes previous handlers before adding new ones, so it's safe to call as often as needed.

If Turbolinks is imported via Webpacker (and thus not available globally), ReactRailsUJS will be unable to locate it. To fix this, you can temporarily add it to the global namespace:

// Order is particular. First start Turbolinks:
Turbolinks.start();
// Add Turbolinks to the global namespace:
window.Turbolinks = Turbolinks;
// Remove previous event handlers and add new ones:
ReactRailsUJS.detectEvents();
// (Optional) Clean up global namespace:
delete window.Turbolinks;

getConstructor

Components are loaded with ReactRailsUJS.getConstructor(className). This function has two built-in implementations:

  • On the asset pipeline, it looks up className in the global namespace.
  • On Webpacker, it requires files and accesses named exports, as described in Get started with Webpacker.

You can override this function to customize the mapping of name-to-constructor. Server-side rendering also uses this function.

Server-Side Rendering

You can render React components inside your Rails server with prerender: true:

<%= react_component('HelloMessage', {name: 'John'}, {prerender: true}) %>
<!-- becomes: -->
<div data-react-class="HelloMessage" data-react-props="{&quot;name&quot;:&quot;John&quot;}">
  <h1>Hello, John!</h1>
</div>

(It will also be mounted by the UJS on page load.)

Server rendering is powered by ExecJS and subject to some requirements:

  • react-rails must load your code. By convention, it uses server_rendering.js, which was created by the install task. This file must include your components and their dependencies (eg, Underscore.js).
  • Your code can't reference document or window. Prerender processes don't have access to document or window, so jQuery and some other libs won't work in this environment :(

ExecJS supports many backends. CRuby users will get the best performance from mini_racer.

Configuration

Server renderers are stored in a pool and reused between requests. Threaded Rubies (eg jRuby) may see a benefit to increasing the pool size beyond the default of 1.

These are the default configurations:

# config/application.rb
# These are the defaults if you don't specify any yourself
module MyApp
  class Application < Rails::Application
    # Settings for the pool of renderers:
    config.react.server_renderer_pool_size  ||= 1  # ExecJS doesn't allow more than one on MRI
    config.react.server_renderer_timeout    ||= 20 # seconds
    config.react.server_renderer = React::ServerRendering::BundleRenderer
    config.react.server_renderer_options = {
      files: ["server_rendering.js"],       # files to load for prerendering
      replay_console: true,                 # if true, console.* will be replayed client-side
    }
    # Changing files matching these dirs/exts will cause the server renderer to reload:
    config.react.server_renderer_extensions = ["jsx", "js"]
    config.react.server_renderer_directories = ["/app/assets/javascripts", "/app/javascript/"]
  end
end

JavaScript State

Some of ExecJS's backends are stateful (eg, mini_racer, therubyracer). This means that any side-effects of a prerender will affect later renders with that renderer.

To manage state, you have a couple options:

  • Make a custom renderer with #before_render / #after_render hooks as described below
  • Use per_request_react_rails_prerenderer to manage state for a whole controller action.

To check out a renderer for the duration of a controller action, call the per_request_react_rails_prerenderer helper in the controller class:

class PagesController < ApplicationController
  # Use the same React server renderer for the entire request:
  per_request_react_rails_prerenderer
end

Then, you can access the ExecJS context directly with react_rails_prerenderer.context:

def show
  react_rails_prerenderer           # => #<React::ServerRendering::BundleRenderer>
  react_rails_prerenderer.context   # => #<ExecJS::Context>

  # Execute arbitrary JavaScript code
  # `self` is the global context
  react_rails_prerenderer.context.exec("self.Store.setup()")
  render :show
  react_rails_prerenderer.context.exec("self.Store.teardown()")
end

react_rails_prerenderer may also be accessed in before- or after-actions.

Custom Server Renderer

react-rails depends on a renderer class for rendering components on the server. You can provide a custom renderer class to config.react.server_renderer. The class must implement:

  • #initialize(options={}), which accepts the hash from config.react.server_renderer_options
  • #render(component_name, props, prerender_options) to return a string of HTML

react-rails provides two renderer classes: React::ServerRendering::ExecJSRenderer and React::ServerRendering::BundleRenderer.

ExecJSRenderer offers two other points for extension:

  • #before_render(component_name, props, prerender_options) to return a string of JavaScript to execute before calling React.render
  • #after_render(component_name, props, prerender_options) to return a string of JavaScript to execute after calling React.render

Any subclass of ExecJSRenderer may use those hooks (for example, BundleRenderer uses them to handle console.* on the server).

Controller Actions

Components can also be server-rendered directly from a controller action with the custom component renderer. For example:

class TodoController < ApplicationController
  def index
    @todos = Todo.all
    render component: 'TodoList', props: { todos: @todos }, tag: 'span', class: 'todo'
  end
end

You can also provide the "usual" render arguments: content_type, layout, location and status. By default, your current layout will be used and the component, rather than a view, will be rendered in place of yield. Custom data-* attributes can be passed like data: {remote: true}.

Prerendering is set to true by default, but can be turned off with prerender: false.

Component Generator

You can generate a new component file with:

rails g react:component ComponentName prop1:type prop2:type ...

For example,

rails g react:component Post title:string published:bool published_by:instanceOf{Person}

would generate:

var Post = createReactClass({
  propTypes: {
    title: PropTypes.string,
    published: PropTypes.bool,
    publishedBy: PropTypes.instanceOf(Person)
  },

  render: function() {
    return (
      <React.Fragment>
        Title: {this.props.title}
        Published: {this.props.published}
        Published By: {this.props.publishedBy}
      </React.Fragment>
    );
  }
});

The generator also accepts options:

  • --es6: use class ComponentName extends React.Component
  • --coffee: use CoffeeScript

Accepted PropTypes are:

  • Plain types: any, array, bool, element, func, number, object, node, shape, string
  • instanceOf takes an optional class name in the form of instanceOf{className}.
  • oneOf behaves like an enum, and takes an optional list of strings in the form of 'name:oneOf{one,two,three}'.
  • oneOfType takes an optional list of react and custom types in the form of 'model:oneOfType{string,number,OtherType}'.

Note that the arguments for oneOf and oneOfType must be enclosed in single quotes to prevent your terminal from expanding them into an argument list.

Use with JBuilder

If you use Jbuilder to pass a JSON string to react_component, make sure your JSON is a stringified hash, not an array. This is not the Rails default -- you should add the root node yourself. For example:

# BAD: returns a stringified array
json.array!(@messages) do |message|
  json.extract! message, :id, :name
  json.url message_url(message, format: :json)
end

# GOOD: returns a stringified hash
json.messages(@messages) do |message|
  json.extract! message, :id, :name
  json.url message_url(message, format: :json)
end

Camelize Props

You can configure camelize_props option:

MyApp::Application.configure do
  config.react.camelize_props = true # default false
end

Now, Ruby hashes given to react_component(...) as props will have their keys transformed from underscore- to camel-case, for example:

{ all_todos: @todos, current_status: @status }
# becomes:
{ "allTodos" => @todos, "currentStatus" => @status }
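The key transformation itself is simple to picture. Here is a Python sketch of underscore-to-camel-case conversion applied to a props structure. The function names are made up for this example, and the recursive handling of nested hashes and arrays is an assumption of the sketch rather than something stated above.

```python
def camelize(key):
    """Convert an underscore_separated key to camelCase."""
    head, *rest = key.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)

def camelize_props(props):
    """Camelize dictionary keys; nested handling is assumed for illustration."""
    if isinstance(props, dict):
        return {camelize(k): camelize_props(v) for k, v in props.items()}
    if isinstance(props, list):
        return [camelize_props(v) for v in props]
    return props

props = {"all_todos": [{"due_date": "today"}], "current_status": "open"}
print(camelize_props(props))
```

Only the keys change. Values such as "open" are passed through untouched.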

You can also specify this option in react_component:

<%= react_component('HelloMessage', {name: 'John'}, {camelize_props: true}) %>

Upgrading

2.3 to 2.4

Keep your react_ujs up to date: yarn upgrade

React-Rails 2.4.x uses React 16+, which no longer has React Addons. The pre-bundled version of React therefore no longer has an addons build. If you still need addons, the 2.3.1+ version of the gem still includes them.

If you need to make changes in your components for the prebundled react, see the migration docs here:

For the vast majority of cases this will get you most of the migration:

  • global find+replace React.Prop -> Prop
  • add import PropTypes from 'prop-types' (Webpacker only)
  • re-run bundle exec rails webpacker:install:react to update npm packages (Webpacker only)

Common Errors

During installation

  1. While running the installers (rails webpacker:install:react && rails webpacker:install), you may see one of these errors:
public/packs/manifest.json. Possible causes:
1. You want to set webpacker.yml value of compile to true for your environment
   unless you are using the `webpack -w` or the webpack-dev-server.
2. webpack has not yet re-run to reflect updates.
3. You have misconfigured Webpacker's config/webpacker.yml file.
4. Your webpack configuration is not creating a manifest.
or
yarn: error: no such option: --dev
ERROR: [Errno 2] No such file or directory: 'add'

Fix: Try updating yarn package.

sudo apt remove cmdtest
sudo apt remove yarn
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
sudo apt-get update && sudo apt-get install yarn

yarn install

Undefined Set

ExecJS::ProgramError (identifier 'Set' undefined):

(execjs):1

If you see any variation of this issue, see Using TheRubyRacer

Using TheRubyRacer

TheRubyRacer hasn't updated LibV8 (the library that powers Node.js) from v3 in 2 years, so newer JavaScript features are unlikely to work.

LibV8 itself is already beyond version 7, so many server-side issues are caused by old JS engines and are fixed by using an up-to-date one such as MiniRacer, or TheRubyRhino on JRuby.

HMR

Hot Module Replacement is possible with this gem, as it simply passes through to Webpacker. Please open an issue to share tips and tricks for it so they can be added to the wiki.

Sample repo that shows HMR working with react-rails: https://github.com/edelgado/react-rails-hmr

One caveat is that currently you cannot Server-Side Render along with HMR.

Related Projects

Contributing

🎉 Thanks for taking the time to contribute! 🎉

With 5 Million+ downloads of the react-rails Gem and another 2 Million+ downloads of react_ujs on NPM, you're helping the biggest React + Rails community!

By contributing to React-Rails, you agree to abide by the code of conduct.

You can always help by submitting patches or triaging issues, even offering reproduction steps to issues is incredibly helpful!

Please see our Contribution guide for more info.

A source code example utilizing React-Rails: https://github.com/BookOfGreg/react-rails-example-app

Author: Reactjs
Source Code: https://github.com/reactjs/react-rails 
License: Apache-2.0 license

#react #javascript #typescript #ruby #rails 


Machine Learning | Vision-Line Detection with Hough Transform #5

This tutorial presents line detection with the Hough Transform.

(1) Start with a grayscale input image (a binary image works as well) in which line or edge pixels have a value of 255 (or 1 for a binary image), and create a 2D Hough accumulator array.

(2) Loop through the input image to fill the Hough accumulator array.

(3) Finally, display the original input image and the Hough Transform result.
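Steps (1) and (2) can be sketched in plain Python. This is a minimal illustration and not the code from the linked tutorial. It assumes the usual normal parameterization rho = x*cos(theta) + y*sin(theta), one-degree angle steps, and an image given as a list of rows. The function names and the tiny test image are made up for this example, and the display step is left out.

```python
import math

def hough_accumulator(image, n_theta=180):
    """Fill a (rho, theta) accumulator from the non-zero pixels of the image."""
    height, width = len(image), len(image[0])
    diag = int(math.ceil(math.hypot(height, width)))   # largest possible |rho|
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    # Rows cover rho in [-diag, diag], columns cover theta in [0, pi)
    acc = [[0] * n_theta for _ in range(2 * diag + 1)]
    for y in range(height):
        for x in range(width):
            if image[y][x] == 0:
                continue                                # not an edge pixel
            for t, theta in enumerate(thetas):
                rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
                acc[rho + diag][t] += 1                 # one vote per (rho, theta)
    return acc, thetas, diag

def strongest_line(acc, thetas, diag):
    """Return (rho, theta, votes) for the accumulator cell with the most votes."""
    votes, r, t = max((v, r, t) for r, row in enumerate(acc) for t, v in enumerate(row))
    return r - diag, thetas[t], votes

# 10x10 image with a horizontal line of edge pixels on row 4
image = [[255 if y == 4 else 0 for x in range(10)] for y in range(10)]
acc, thetas, diag = hough_accumulator(image)
rho, theta, votes = strongest_line(acc, thetas, diag)
print(rho, round(math.degrees(theta)), votes)
```

Because rho is rounded to whole pixels, a few neighbouring angles collect the same number of votes for a short line like this one. A practical implementation would typically pick peaks with some form of non-maximum suppression.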

Source Code Link

https://sigmoidtek.com/blogs/tutorials/line-detect-ht

#machine #learning #hough #transform


Basics of CSS Animation

CSS animations can add some polish and shine to a website. They can also provide users with visual feedback about the user interface. There are some concerns about using CSS animation for critical aspects of a website, especially where it compromises accessibility, but used carefully, animations can enhance a website in some very appealing ways.

To make use of basic CSS animations, it’s important to understand the concepts of transitions, transforms, and animation in the CSS context. These concepts support the creation of everything from simple animations, like the gradual change of a button's color, to complex animations, like moving an object on the screen while simultaneously changing its shape and opacity.

Transition

Transitions apply a controlled change from one value of a CSS property to another. CSS developers can control aspects of the transition including the property, the duration, the timing function, and the delay.

The basic structure of a transition looks like the following:

div {
  transition: <property> <duration> <timing-function> <delay>;
}

Each of these aspects of the transition can be defined individually.

  • transition-delay: duration of time to wait before applying the transition
  • transition-duration: duration of time that the transition should take to complete
  • transition-property: property targeted by the transition
  • transition-timing-function: definition of the acceleration curve for the transition

#css #transform #css3 #animation


Oleta Becker

1602925200

PYTORCH DATA LOADERS — 4 Types

In this article I will show you how to set up data loaders and transforms in PyTorch. You need the imports below for this exercise:

import torchvision
import torch
import os
import matplotlib.pyplot as plt
import numpy as np

1. Define the Transform

Resize the image to (256, 256) or any other size

Convert to PyTorch tensors

Normalize the image by calling torchvision.transforms.Normalize

transform_img = torchvision.transforms.Compose([
    torchvision.transforms.Resize((256, 256)),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485], std=[0.229]),
])

2. Create the DataSet from torchvision.datasets

Set a directory path. Passing download=True downloads the data into the specified directory, and transform should be set to the transform defined above.

# Raw string so the backslashes in the Windows path are not treated as escapes.
dir_path = r'C:\Users\Asus\pytorch-basics-part2'

dataset_mnist_train = torchvision.datasets.MNIST(dir_path, train=True, transform=transform_img,
                                                 target_transform=None, download=True)

You can index this dataset: dataset_mnist_train[i] returns a tuple of (image, label).
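To show how an indexable dataset feeds a data loader without downloading MNIST, here is a small sketch. It is an illustration under assumptions, not the article's code: the synthetic all-zero images, the batch size of 4, and the use of TensorDataset as a stand-in for the MNIST dataset are all chosen for the example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST (assumption): 8 single-channel 28x28 "images".
images = torch.zeros(8, 1, 28, 28)
labels = torch.arange(8)
dataset = TensorDataset(images, labels)

# Indexing the dataset yields one (image, label) pair.
image, label = dataset[0]

# The DataLoader groups individual samples into batches.
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_images, batch_labels = next(iter(loader))
```

With shuffle=False the first batch holds the first four samples in order. Swapping in dataset_mnist_train for the synthetic dataset should work the same way, since DataLoader only needs indexing and a length.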

#pytorch #transform #dataload #deep-learning


Hana Juali

1600140758

When to use Pandas transform() function

Pandas is an amazing library that contains extensive built-in functions for manipulating data. Among them, transform() is super useful when you are looking to manipulate rows or columns.

In this article, we will cover the following most frequently used Pandas transform() features:

  1. Transforming values
  2. Combining groupby() results
  3. Filtering data
  4. Handling missing values at the group level

Please check out my Github repo for the source code


1. Transforming values

Let’s take a look at DataFrame.transform(func, axis=0)

  • The first argument func specifies the function used for manipulating data. It can be a function, a string function name, a list of functions, or a dictionary of axis labels -> functions.
  • The second argument axis specifies which axis func is applied to: 0 applies func to each column and 1 applies func to each row.

Let’s see how transform() works with the help of some examples.
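As a hedged sketch of the forms func can take, the snippet below calls transform() with a function, a string function name, a dictionary, and axis=1. The DataFrame and its column names A and B are made up for illustration and are not taken from the article's repo.

```python
import pandas as pd

# Small made-up DataFrame (assumption, for illustration only).
df = pd.DataFrame({"A": [1, -2, 3], "B": [10, -20, 30]})

# A function, applied to each column (axis=0 is the default).
plus_ten = df.transform(lambda x: x + 10)

# A string function name.
absolute = df.transform("abs")

# A dictionary of axis labels -> functions.
mixed = df.transform({"A": lambda x: x * 2, "B": "abs"})

# axis=1 applies the function to each row instead of each column.
centered = df.transform(lambda row: row - row.mean(), axis=1)
```

In every case the result has the same shape as the input, which is the property that distinguishes transform() from aggregation.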

#pandas #transform #machine-learning #data-science #python
