CNN Series Part 1: How do computers see images?

In this article, we will learn about how computers see images & the issues faced while performing a computer vision task. We will see how deep learning comes into the picture & how with the power of neural networks, we can build a powerful computer vision system capable of solving extraordinary problems.

Image for post

One example of how deep learning is transforming computer vision is facial recognition or face detection. On the top left, you can see the icon of the human eye which visually represents vision coming into the deep neural network in the form of images, pixels, videos & on the output on the bottom you can see a depiction of the human face or detection of the human face or this could also be recognizing different human faces or emotions on the face and also the key facial features, etc.

#convolution-neural-net #computer-vision #neural-networks #cnn #convolutional-network #series

What is GEEK

Buddha Community

CNN Series Part 1: How do computers see images?
Veronica  Roob

Veronica Roob

1653475560

A Pure PHP Implementation Of The MessagePack Serialization Format

msgpack.php

A pure PHP implementation of the MessagePack serialization format.

Features

Installation

The recommended way to install the library is through Composer:

composer require rybakit/msgpack

Usage

Packing

To pack values you can either use an instance of a Packer:

$packer = new Packer();
$packed = $packer->pack($value);

or call a static method on the MessagePack class:

$packed = MessagePack::pack($value);

In the examples above, the method pack automatically packs a value depending on its type. However, not all PHP types can be uniquely translated to MessagePack types. For example, the MessagePack format defines map and array types, which are represented by a single array type in PHP. By default, the packer will pack a PHP array as a MessagePack array if it has sequential numeric keys, starting from 0 and as a MessagePack map otherwise:

$mpArr1 = $packer->pack([1, 2]);               // MP array [1, 2]
$mpArr2 = $packer->pack([0 => 1, 1 => 2]);     // MP array [1, 2]
$mpMap1 = $packer->pack([0 => 1, 2 => 3]);     // MP map {0: 1, 2: 3}
$mpMap2 = $packer->pack([1 => 2, 2 => 3]);     // MP map {1: 2, 2: 3}
$mpMap3 = $packer->pack(['a' => 1, 'b' => 2]); // MP map {a: 1, b: 2}

However, sometimes you need to pack a sequential array as a MessagePack map. To do this, use the packMap method:

$mpMap = $packer->packMap([1, 2]); // {0: 1, 1: 2}

Here is a list of type-specific packing methods:

$packer->packNil();           // MP nil
$packer->packBool(true);      // MP bool
$packer->packInt(42);         // MP int
$packer->packFloat(M_PI);     // MP float (32 or 64)
$packer->packFloat32(M_PI);   // MP float 32
$packer->packFloat64(M_PI);   // MP float 64
$packer->packStr('foo');      // MP str
$packer->packBin("\x80");     // MP bin
$packer->packArray([1, 2]);   // MP array
$packer->packMap(['a' => 1]); // MP map
$packer->packExt(1, "\xaa");  // MP ext

Check the "Custom types" section below on how to pack custom types.

Packing options

The Packer object supports a number of bitmask-based options for fine-tuning the packing process (defaults are in bold):

NameDescription
FORCE_STRForces PHP strings to be packed as MessagePack UTF-8 strings
FORCE_BINForces PHP strings to be packed as MessagePack binary data
DETECT_STR_BINDetects MessagePack str/bin type automatically
  
FORCE_ARRForces PHP arrays to be packed as MessagePack arrays
FORCE_MAPForces PHP arrays to be packed as MessagePack maps
DETECT_ARR_MAPDetects MessagePack array/map type automatically
  
FORCE_FLOAT32Forces PHP floats to be packed as 32-bits MessagePack floats
FORCE_FLOAT64Forces PHP floats to be packed as 64-bits MessagePack floats

The type detection mode (DETECT_STR_BIN/DETECT_ARR_MAP) adds some overhead which can be noticed when you pack large (16- and 32-bit) arrays or strings. However, if you know the value type in advance (for example, you only work with UTF-8 strings or/and associative arrays), you can eliminate this overhead by forcing the packer to use the appropriate type, which will save it from running the auto-detection routine. Another option is to explicitly specify the value type. The library provides 2 auxiliary classes for this, Map and Bin. Check the "Custom types" section below for details.

Examples:

// detect str/bin type and pack PHP 64-bit floats (doubles) to MP 32-bit floats
$packer = new Packer(PackOptions::DETECT_STR_BIN | PackOptions::FORCE_FLOAT32);

// these will throw MessagePack\Exception\InvalidOptionException
$packer = new Packer(PackOptions::FORCE_STR | PackOptions::FORCE_BIN);
$packer = new Packer(PackOptions::FORCE_FLOAT32 | PackOptions::FORCE_FLOAT64);

Unpacking

To unpack data you can either use an instance of a BufferUnpacker:

$unpacker = new BufferUnpacker();

$unpacker->reset($packed);
$value = $unpacker->unpack();

or call a static method on the MessagePack class:

$value = MessagePack::unpack($packed);

If the packed data is received in chunks (e.g. when reading from a stream), use the tryUnpack method, which attempts to unpack data and returns an array of unpacked messages (if any) instead of throwing an InsufficientDataException:

while ($chunk = ...) {
    $unpacker->append($chunk);
    if ($messages = $unpacker->tryUnpack()) {
        return $messages;
    }
}

If you want to unpack from a specific position in a buffer, use seek:

$unpacker->seek(42); // set position equal to 42 bytes
$unpacker->seek(-8); // set position to 8 bytes before the end of the buffer

To skip bytes from the current position, use skip:

$unpacker->skip(10); // set position to 10 bytes ahead of the current position

To get the number of remaining (unread) bytes in the buffer:

$unreadBytesCount = $unpacker->getRemainingCount();

To check whether the buffer has unread data:

$hasUnreadBytes = $unpacker->hasRemaining();

If needed, you can remove already read data from the buffer by calling:

$releasedBytesCount = $unpacker->release();

With the read method you can read raw (packed) data:

$packedData = $unpacker->read(2); // read 2 bytes

Besides the above methods BufferUnpacker provides type-specific unpacking methods, namely:

$unpacker->unpackNil();   // PHP null
$unpacker->unpackBool();  // PHP bool
$unpacker->unpackInt();   // PHP int
$unpacker->unpackFloat(); // PHP float
$unpacker->unpackStr();   // PHP UTF-8 string
$unpacker->unpackBin();   // PHP binary string
$unpacker->unpackArray(); // PHP sequential array
$unpacker->unpackMap();   // PHP associative array
$unpacker->unpackExt();   // PHP MessagePack\Type\Ext object

Unpacking options

The BufferUnpacker object supports a number of bitmask-based options for fine-tuning the unpacking process (defaults are in bold):

NameDescription
BIGINT_AS_STRConverts overflowed integers to strings [1]
BIGINT_AS_GMPConverts overflowed integers to GMP objects [2]
BIGINT_AS_DECConverts overflowed integers to Decimal\Decimal objects [3]

1. The binary MessagePack format has unsigned 64-bit as its largest integer data type, but PHP does not support such integers, which means that an overflow can occur during unpacking.

2. Make sure the GMP extension is enabled.

3. Make sure the Decimal extension is enabled.

Examples:

$packedUint64 = "\xcf"."\xff\xff\xff\xff"."\xff\xff\xff\xff";

$unpacker = new BufferUnpacker($packedUint64);
var_dump($unpacker->unpack()); // string(20) "18446744073709551615"

$unpacker = new BufferUnpacker($packedUint64, UnpackOptions::BIGINT_AS_GMP);
var_dump($unpacker->unpack()); // object(GMP) {...}

$unpacker = new BufferUnpacker($packedUint64, UnpackOptions::BIGINT_AS_DEC);
var_dump($unpacker->unpack()); // object(Decimal\Decimal) {...}

Custom types

In addition to the basic types, the library provides functionality to serialize and deserialize arbitrary types. This can be done in several ways, depending on your use case. Let's take a look at them.

Type objects

If you need to serialize an instance of one of your classes into one of the basic MessagePack types, the best way to do this is to implement the CanBePacked interface in the class. A good example of such a class is the Map type class that comes with the library. This type is useful when you want to explicitly specify that a given PHP array should be packed as a MessagePack map without triggering an automatic type detection routine:

$packer = new Packer();

$packedMap = $packer->pack(new Map([1, 2, 3]));
$packedArray = $packer->pack([1, 2, 3]);

More type examples can be found in the src/Type directory.

Type transformers

As with type objects, type transformers are only responsible for serializing values. They should be used when you need to serialize a value that does not implement the CanBePacked interface. Examples of such values could be instances of built-in or third-party classes that you don't own, or non-objects such as resources.

A transformer class must implement the CanPack interface. To use a transformer, it must first be registered in the packer. Here is an example of how to serialize PHP streams into the MessagePack bin format type using one of the supplied transformers, StreamTransformer:

$packer = new Packer(null, [new StreamTransformer()]);

$packedBin = $packer->pack(fopen('/path/to/file', 'r+'));

More type transformer examples can be found in the src/TypeTransformer directory.

Extensions

In contrast to the cases described above, extensions are intended to handle extension types and are responsible for both serialization and deserialization of values (types).

An extension class must implement the Extension interface. To use an extension, it must first be registered in the packer and the unpacker.

The MessagePack specification divides extension types into two groups: predefined and application-specific. Currently, there is only one predefined type in the specification, Timestamp.

Timestamp

The Timestamp extension type is a predefined type. Support for this type in the library is done through the TimestampExtension class. This class is responsible for handling Timestamp objects, which represent the number of seconds and optional adjustment in nanoseconds:

$timestampExtension = new TimestampExtension();

$packer = new Packer();
$packer = $packer->extendWith($timestampExtension);

$unpacker = new BufferUnpacker();
$unpacker = $unpacker->extendWith($timestampExtension);

$packedTimestamp = $packer->pack(Timestamp::now());
$timestamp = $unpacker->reset($packedTimestamp)->unpack();

$seconds = $timestamp->getSeconds();
$nanoseconds = $timestamp->getNanoseconds();

When using the MessagePack class, the Timestamp extension is already registered:

$packedTimestamp = MessagePack::pack(Timestamp::now());
$timestamp = MessagePack::unpack($packedTimestamp);

Application-specific extensions

In addition, the format can be extended with your own types. For example, to make the built-in PHP DateTime objects first-class citizens in your code, you can create a corresponding extension, as shown in the example. Please note, that custom extensions have to be registered with a unique extension ID (an integer from 0 to 127).

More extension examples can be found in the examples/MessagePack directory.

To learn more about how extension types can be useful, check out this article.

Exceptions

If an error occurs during packing/unpacking, a PackingFailedException or an UnpackingFailedException will be thrown, respectively. In addition, an InsufficientDataException can be thrown during unpacking.

An InvalidOptionException will be thrown in case an invalid option (or a combination of mutually exclusive options) is used.

Tests

Run tests as follows:

vendor/bin/phpunit

Also, if you already have Docker installed, you can run the tests in a docker container. First, create a container:

./dockerfile.sh | docker build -t msgpack -

The command above will create a container named msgpack with PHP 8.1 runtime. You may change the default runtime by defining the PHP_IMAGE environment variable:

PHP_IMAGE='php:8.0-cli' ./dockerfile.sh | docker build -t msgpack -

See a list of various images here.

Then run the unit tests:

docker run --rm -v $PWD:/msgpack -w /msgpack msgpack

Fuzzing

To ensure that the unpacking works correctly with malformed/semi-malformed data, you can use a testing technique called Fuzzing. The library ships with a help file (target) for PHP-Fuzzer and can be used as follows:

php-fuzzer fuzz tests/fuzz_buffer_unpacker.php

Performance

To check performance, run:

php -n -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

=============================================
Test/Target            Packer  BufferUnpacker
---------------------------------------------
nil .................. 0.0030 ........ 0.0139
false ................ 0.0037 ........ 0.0144
true ................. 0.0040 ........ 0.0137
7-bit uint #1 ........ 0.0052 ........ 0.0120
7-bit uint #2 ........ 0.0059 ........ 0.0114
7-bit uint #3 ........ 0.0061 ........ 0.0119
5-bit sint #1 ........ 0.0067 ........ 0.0126
5-bit sint #2 ........ 0.0064 ........ 0.0132
5-bit sint #3 ........ 0.0066 ........ 0.0135
8-bit uint #1 ........ 0.0078 ........ 0.0200
8-bit uint #2 ........ 0.0077 ........ 0.0212
8-bit uint #3 ........ 0.0086 ........ 0.0203
16-bit uint #1 ....... 0.0111 ........ 0.0271
16-bit uint #2 ....... 0.0115 ........ 0.0260
16-bit uint #3 ....... 0.0103 ........ 0.0273
32-bit uint #1 ....... 0.0116 ........ 0.0326
32-bit uint #2 ....... 0.0118 ........ 0.0332
32-bit uint #3 ....... 0.0127 ........ 0.0325
64-bit uint #1 ....... 0.0140 ........ 0.0277
64-bit uint #2 ....... 0.0134 ........ 0.0294
64-bit uint #3 ....... 0.0134 ........ 0.0281
8-bit int #1 ......... 0.0086 ........ 0.0241
8-bit int #2 ......... 0.0089 ........ 0.0225
8-bit int #3 ......... 0.0085 ........ 0.0229
16-bit int #1 ........ 0.0118 ........ 0.0280
16-bit int #2 ........ 0.0121 ........ 0.0270
16-bit int #3 ........ 0.0109 ........ 0.0274
32-bit int #1 ........ 0.0128 ........ 0.0346
32-bit int #2 ........ 0.0118 ........ 0.0339
32-bit int #3 ........ 0.0135 ........ 0.0368
64-bit int #1 ........ 0.0138 ........ 0.0276
64-bit int #2 ........ 0.0132 ........ 0.0286
64-bit int #3 ........ 0.0137 ........ 0.0274
64-bit int #4 ........ 0.0180 ........ 0.0285
64-bit float #1 ...... 0.0134 ........ 0.0284
64-bit float #2 ...... 0.0125 ........ 0.0275
64-bit float #3 ...... 0.0126 ........ 0.0283
fix string #1 ........ 0.0035 ........ 0.0133
fix string #2 ........ 0.0094 ........ 0.0216
fix string #3 ........ 0.0094 ........ 0.0222
fix string #4 ........ 0.0091 ........ 0.0241
8-bit string #1 ...... 0.0122 ........ 0.0301
8-bit string #2 ...... 0.0118 ........ 0.0304
8-bit string #3 ...... 0.0119 ........ 0.0315
16-bit string #1 ..... 0.0150 ........ 0.0388
16-bit string #2 ..... 0.1545 ........ 0.1665
32-bit string ........ 0.1570 ........ 0.1756
wide char string #1 .. 0.0091 ........ 0.0236
wide char string #2 .. 0.0122 ........ 0.0313
8-bit binary #1 ...... 0.0100 ........ 0.0302
8-bit binary #2 ...... 0.0123 ........ 0.0324
8-bit binary #3 ...... 0.0126 ........ 0.0327
16-bit binary ........ 0.0168 ........ 0.0372
32-bit binary ........ 0.1588 ........ 0.1754
fix array #1 ......... 0.0042 ........ 0.0131
fix array #2 ......... 0.0294 ........ 0.0367
fix array #3 ......... 0.0412 ........ 0.0472
16-bit array #1 ...... 0.1378 ........ 0.1596
16-bit array #2 ........... S ............. S
32-bit array .............. S ............. S
complex array ........ 0.1865 ........ 0.2283
fix map #1 ........... 0.0725 ........ 0.1048
fix map #2 ........... 0.0319 ........ 0.0405
fix map #3 ........... 0.0356 ........ 0.0665
fix map #4 ........... 0.0465 ........ 0.0497
16-bit map #1 ........ 0.2540 ........ 0.3028
16-bit map #2 ............. S ............. S
32-bit map ................ S ............. S
complex map .......... 0.2372 ........ 0.2710
fixext 1 ............. 0.0283 ........ 0.0358
fixext 2 ............. 0.0291 ........ 0.0371
fixext 4 ............. 0.0302 ........ 0.0355
fixext 8 ............. 0.0288 ........ 0.0384
fixext 16 ............ 0.0293 ........ 0.0359
8-bit ext ............ 0.0302 ........ 0.0439
16-bit ext ........... 0.0334 ........ 0.0499
32-bit ext ........... 0.1845 ........ 0.1888
32-bit timestamp #1 .. 0.0337 ........ 0.0547
32-bit timestamp #2 .. 0.0335 ........ 0.0560
64-bit timestamp #1 .. 0.0371 ........ 0.0575
64-bit timestamp #2 .. 0.0374 ........ 0.0542
64-bit timestamp #3 .. 0.0356 ........ 0.0533
96-bit timestamp #1 .. 0.0362 ........ 0.0699
96-bit timestamp #2 .. 0.0381 ........ 0.0701
96-bit timestamp #3 .. 0.0367 ........ 0.0687
=============================================
Total                  2.7618          4.0820
Skipped                     4               4
Failed                      0               0
Ignored                     0               0

With JIT:

php -n -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.jit_buffer_size=64M -dopcache.jit=tracing -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

=============================================
Test/Target            Packer  BufferUnpacker
---------------------------------------------
nil .................. 0.0005 ........ 0.0054
false ................ 0.0004 ........ 0.0059
true ................. 0.0004 ........ 0.0059
7-bit uint #1 ........ 0.0010 ........ 0.0047
7-bit uint #2 ........ 0.0010 ........ 0.0046
7-bit uint #3 ........ 0.0010 ........ 0.0046
5-bit sint #1 ........ 0.0025 ........ 0.0046
5-bit sint #2 ........ 0.0023 ........ 0.0046
5-bit sint #3 ........ 0.0024 ........ 0.0045
8-bit uint #1 ........ 0.0043 ........ 0.0081
8-bit uint #2 ........ 0.0043 ........ 0.0079
8-bit uint #3 ........ 0.0041 ........ 0.0080
16-bit uint #1 ....... 0.0064 ........ 0.0095
16-bit uint #2 ....... 0.0064 ........ 0.0091
16-bit uint #3 ....... 0.0064 ........ 0.0094
32-bit uint #1 ....... 0.0085 ........ 0.0114
32-bit uint #2 ....... 0.0077 ........ 0.0122
32-bit uint #3 ....... 0.0077 ........ 0.0120
64-bit uint #1 ....... 0.0085 ........ 0.0159
64-bit uint #2 ....... 0.0086 ........ 0.0157
64-bit uint #3 ....... 0.0086 ........ 0.0158
8-bit int #1 ......... 0.0042 ........ 0.0080
8-bit int #2 ......... 0.0042 ........ 0.0080
8-bit int #3 ......... 0.0042 ........ 0.0081
16-bit int #1 ........ 0.0065 ........ 0.0095
16-bit int #2 ........ 0.0065 ........ 0.0090
16-bit int #3 ........ 0.0056 ........ 0.0085
32-bit int #1 ........ 0.0067 ........ 0.0107
32-bit int #2 ........ 0.0066 ........ 0.0106
32-bit int #3 ........ 0.0063 ........ 0.0104
64-bit int #1 ........ 0.0072 ........ 0.0162
64-bit int #2 ........ 0.0073 ........ 0.0174
64-bit int #3 ........ 0.0072 ........ 0.0164
64-bit int #4 ........ 0.0077 ........ 0.0161
64-bit float #1 ...... 0.0053 ........ 0.0135
64-bit float #2 ...... 0.0053 ........ 0.0135
64-bit float #3 ...... 0.0052 ........ 0.0135
fix string #1 ....... -0.0002 ........ 0.0044
fix string #2 ........ 0.0035 ........ 0.0067
fix string #3 ........ 0.0035 ........ 0.0077
fix string #4 ........ 0.0033 ........ 0.0078
8-bit string #1 ...... 0.0059 ........ 0.0110
8-bit string #2 ...... 0.0063 ........ 0.0121
8-bit string #3 ...... 0.0064 ........ 0.0124
16-bit string #1 ..... 0.0099 ........ 0.0146
16-bit string #2 ..... 0.1522 ........ 0.1474
32-bit string ........ 0.1511 ........ 0.1483
wide char string #1 .. 0.0039 ........ 0.0084
wide char string #2 .. 0.0073 ........ 0.0123
8-bit binary #1 ...... 0.0040 ........ 0.0112
8-bit binary #2 ...... 0.0075 ........ 0.0123
8-bit binary #3 ...... 0.0077 ........ 0.0129
16-bit binary ........ 0.0096 ........ 0.0145
32-bit binary ........ 0.1535 ........ 0.1479
fix array #1 ......... 0.0008 ........ 0.0061
fix array #2 ......... 0.0121 ........ 0.0165
fix array #3 ......... 0.0193 ........ 0.0222
16-bit array #1 ...... 0.0607 ........ 0.0479
16-bit array #2 ........... S ............. S
32-bit array .............. S ............. S
complex array ........ 0.0749 ........ 0.0824
fix map #1 ........... 0.0329 ........ 0.0431
fix map #2 ........... 0.0161 ........ 0.0189
fix map #3 ........... 0.0205 ........ 0.0262
fix map #4 ........... 0.0252 ........ 0.0205
16-bit map #1 ........ 0.1016 ........ 0.0927
16-bit map #2 ............. S ............. S
32-bit map ................ S ............. S
complex map .......... 0.1096 ........ 0.1030
fixext 1 ............. 0.0157 ........ 0.0161
fixext 2 ............. 0.0175 ........ 0.0183
fixext 4 ............. 0.0156 ........ 0.0185
fixext 8 ............. 0.0163 ........ 0.0184
fixext 16 ............ 0.0164 ........ 0.0182
8-bit ext ............ 0.0158 ........ 0.0207
16-bit ext ........... 0.0203 ........ 0.0219
32-bit ext ........... 0.1614 ........ 0.1539
32-bit timestamp #1 .. 0.0195 ........ 0.0249
32-bit timestamp #2 .. 0.0188 ........ 0.0260
64-bit timestamp #1 .. 0.0207 ........ 0.0281
64-bit timestamp #2 .. 0.0212 ........ 0.0291
64-bit timestamp #3 .. 0.0207 ........ 0.0295
96-bit timestamp #1 .. 0.0222 ........ 0.0358
96-bit timestamp #2 .. 0.0228 ........ 0.0353
96-bit timestamp #3 .. 0.0210 ........ 0.0319
=============================================
Total                  1.6432          1.9674
Skipped                     4               4
Failed                      0               0
Ignored                     0               0

You may change default benchmark settings by defining the following environment variables:

NameDefault
MP_BENCH_TARGETSpure_p,pure_u, see a list of available targets
MP_BENCH_ITERATIONS100_000
MP_BENCH_DURATIONnot set
MP_BENCH_ROUNDS3
MP_BENCH_TESTS-@slow, see a list of available tests

For example:

export MP_BENCH_TARGETS=pure_p
export MP_BENCH_ITERATIONS=1000000
export MP_BENCH_ROUNDS=5
# a comma separated list of test names
export MP_BENCH_TESTS='complex array, complex map'
# or a group name
# export MP_BENCH_TESTS='-@slow' // @pecl_comp
# or a regexp
# export MP_BENCH_TESTS='/complex (array|map)/'

Another example, benchmarking both the library and the PECL extension:

MP_BENCH_TARGETS=pure_p,pure_u,pecl_p,pecl_u \
php -n -dextension=msgpack.so -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

===========================================================================
Test/Target            Packer  BufferUnpacker  msgpack_pack  msgpack_unpack
---------------------------------------------------------------------------
nil .................. 0.0031 ........ 0.0141 ...... 0.0055 ........ 0.0064
false ................ 0.0039 ........ 0.0154 ...... 0.0056 ........ 0.0053
true ................. 0.0038 ........ 0.0139 ...... 0.0056 ........ 0.0044
7-bit uint #1 ........ 0.0061 ........ 0.0110 ...... 0.0059 ........ 0.0046
7-bit uint #2 ........ 0.0065 ........ 0.0119 ...... 0.0042 ........ 0.0029
7-bit uint #3 ........ 0.0054 ........ 0.0117 ...... 0.0045 ........ 0.0025
5-bit sint #1 ........ 0.0047 ........ 0.0103 ...... 0.0038 ........ 0.0022
5-bit sint #2 ........ 0.0048 ........ 0.0117 ...... 0.0038 ........ 0.0022
5-bit sint #3 ........ 0.0046 ........ 0.0102 ...... 0.0038 ........ 0.0023
8-bit uint #1 ........ 0.0063 ........ 0.0174 ...... 0.0039 ........ 0.0031
8-bit uint #2 ........ 0.0063 ........ 0.0167 ...... 0.0040 ........ 0.0029
8-bit uint #3 ........ 0.0063 ........ 0.0168 ...... 0.0039 ........ 0.0030
16-bit uint #1 ....... 0.0092 ........ 0.0222 ...... 0.0049 ........ 0.0030
16-bit uint #2 ....... 0.0096 ........ 0.0227 ...... 0.0042 ........ 0.0046
16-bit uint #3 ....... 0.0123 ........ 0.0274 ...... 0.0059 ........ 0.0051
32-bit uint #1 ....... 0.0136 ........ 0.0331 ...... 0.0060 ........ 0.0048
32-bit uint #2 ....... 0.0130 ........ 0.0336 ...... 0.0070 ........ 0.0048
32-bit uint #3 ....... 0.0127 ........ 0.0329 ...... 0.0051 ........ 0.0048
64-bit uint #1 ....... 0.0126 ........ 0.0268 ...... 0.0055 ........ 0.0049
64-bit uint #2 ....... 0.0135 ........ 0.0281 ...... 0.0052 ........ 0.0046
64-bit uint #3 ....... 0.0131 ........ 0.0274 ...... 0.0069 ........ 0.0044
8-bit int #1 ......... 0.0077 ........ 0.0236 ...... 0.0058 ........ 0.0044
8-bit int #2 ......... 0.0087 ........ 0.0244 ...... 0.0058 ........ 0.0048
8-bit int #3 ......... 0.0084 ........ 0.0241 ...... 0.0055 ........ 0.0049
16-bit int #1 ........ 0.0112 ........ 0.0271 ...... 0.0048 ........ 0.0045
16-bit int #2 ........ 0.0124 ........ 0.0292 ...... 0.0057 ........ 0.0049
16-bit int #3 ........ 0.0118 ........ 0.0270 ...... 0.0058 ........ 0.0050
32-bit int #1 ........ 0.0137 ........ 0.0366 ...... 0.0058 ........ 0.0051
32-bit int #2 ........ 0.0133 ........ 0.0366 ...... 0.0056 ........ 0.0049
32-bit int #3 ........ 0.0129 ........ 0.0350 ...... 0.0052 ........ 0.0048
64-bit int #1 ........ 0.0145 ........ 0.0254 ...... 0.0034 ........ 0.0025
64-bit int #2 ........ 0.0097 ........ 0.0214 ...... 0.0034 ........ 0.0025
64-bit int #3 ........ 0.0096 ........ 0.0287 ...... 0.0059 ........ 0.0050
64-bit int #4 ........ 0.0143 ........ 0.0277 ...... 0.0059 ........ 0.0046
64-bit float #1 ...... 0.0134 ........ 0.0281 ...... 0.0057 ........ 0.0052
64-bit float #2 ...... 0.0141 ........ 0.0281 ...... 0.0057 ........ 0.0050
64-bit float #3 ...... 0.0144 ........ 0.0282 ...... 0.0057 ........ 0.0050
fix string #1 ........ 0.0036 ........ 0.0143 ...... 0.0066 ........ 0.0053
fix string #2 ........ 0.0107 ........ 0.0222 ...... 0.0065 ........ 0.0068
fix string #3 ........ 0.0116 ........ 0.0245 ...... 0.0063 ........ 0.0069
fix string #4 ........ 0.0105 ........ 0.0253 ...... 0.0083 ........ 0.0077
8-bit string #1 ...... 0.0126 ........ 0.0318 ...... 0.0075 ........ 0.0088
8-bit string #2 ...... 0.0121 ........ 0.0295 ...... 0.0076 ........ 0.0086
8-bit string #3 ...... 0.0125 ........ 0.0293 ...... 0.0130 ........ 0.0093
16-bit string #1 ..... 0.0159 ........ 0.0368 ...... 0.0117 ........ 0.0086
16-bit string #2 ..... 0.1547 ........ 0.1686 ...... 0.1516 ........ 0.1373
32-bit string ........ 0.1558 ........ 0.1729 ...... 0.1511 ........ 0.1396
wide char string #1 .. 0.0098 ........ 0.0237 ...... 0.0066 ........ 0.0065
wide char string #2 .. 0.0128 ........ 0.0291 ...... 0.0061 ........ 0.0082
8-bit binary #1 ........... I ............. I ........... F ............. I
8-bit binary #2 ........... I ............. I ........... F ............. I
8-bit binary #3 ........... I ............. I ........... F ............. I
16-bit binary ............. I ............. I ........... F ............. I
32-bit binary ............. I ............. I ........... F ............. I
fix array #1 ......... 0.0040 ........ 0.0129 ...... 0.0120 ........ 0.0058
fix array #2 ......... 0.0279 ........ 0.0390 ...... 0.0143 ........ 0.0165
fix array #3 ......... 0.0415 ........ 0.0463 ...... 0.0162 ........ 0.0187
16-bit array #1 ...... 0.1349 ........ 0.1628 ...... 0.0334 ........ 0.0341
16-bit array #2 ........... S ............. S ........... S ............. S
32-bit array .............. S ............. S ........... S ............. S
complex array ............. I ............. I ........... F ............. F
fix map #1 ................ I ............. I ........... F ............. I
fix map #2 ........... 0.0345 ........ 0.0391 ...... 0.0143 ........ 0.0168
fix map #3 ................ I ............. I ........... F ............. I
fix map #4 ........... 0.0459 ........ 0.0473 ...... 0.0151 ........ 0.0163
16-bit map #1 ........ 0.2518 ........ 0.2962 ...... 0.0400 ........ 0.0490
16-bit map #2 ............. S ............. S ........... S ............. S
32-bit map ................ S ............. S ........... S ............. S
complex map .......... 0.2380 ........ 0.2682 ...... 0.0545 ........ 0.0579
fixext 1 .................. I ............. I ........... F ............. F
fixext 2 .................. I ............. I ........... F ............. F
fixext 4 .................. I ............. I ........... F ............. F
fixext 8 .................. I ............. I ........... F ............. F
fixext 16 ................. I ............. I ........... F ............. F
8-bit ext ................. I ............. I ........... F ............. F
16-bit ext ................ I ............. I ........... F ............. F
32-bit ext ................ I ............. I ........... F ............. F
32-bit timestamp #1 ....... I ............. I ........... F ............. F
32-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #1 ....... I ............. I ........... F ............. F
64-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #3 ....... I ............. I ........... F ............. F
96-bit timestamp #1 ....... I ............. I ........... F ............. F
96-bit timestamp #2 ....... I ............. I ........... F ............. F
96-bit timestamp #3 ....... I ............. I ........... F ............. F
===========================================================================
Total                  1.5625          2.3866        0.7735          0.7243
Skipped                     4               4             4               4
Failed                      0               0            24              17
Ignored                    24              24             0               7

With JIT:

MP_BENCH_TARGETS=pure_p,pure_u,pecl_p,pecl_u \
php -n -dextension=msgpack.so -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.jit_buffer_size=64M -dopcache.jit=tracing -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

===========================================================================
Test/Target            Packer  BufferUnpacker  msgpack_pack  msgpack_unpack
---------------------------------------------------------------------------
nil .................. 0.0001 ........ 0.0052 ...... 0.0053 ........ 0.0042
false ................ 0.0007 ........ 0.0060 ...... 0.0057 ........ 0.0043
true ................. 0.0008 ........ 0.0060 ...... 0.0056 ........ 0.0041
7-bit uint #1 ........ 0.0031 ........ 0.0046 ...... 0.0062 ........ 0.0041
7-bit uint #2 ........ 0.0021 ........ 0.0043 ...... 0.0062 ........ 0.0041
7-bit uint #3 ........ 0.0022 ........ 0.0044 ...... 0.0061 ........ 0.0040
5-bit sint #1 ........ 0.0030 ........ 0.0048 ...... 0.0062 ........ 0.0040
5-bit sint #2 ........ 0.0032 ........ 0.0046 ...... 0.0062 ........ 0.0040
5-bit sint #3 ........ 0.0031 ........ 0.0046 ...... 0.0062 ........ 0.0040
8-bit uint #1 ........ 0.0054 ........ 0.0079 ...... 0.0062 ........ 0.0050
8-bit uint #2 ........ 0.0051 ........ 0.0079 ...... 0.0064 ........ 0.0044
8-bit uint #3 ........ 0.0051 ........ 0.0082 ...... 0.0062 ........ 0.0044
16-bit uint #1 ....... 0.0077 ........ 0.0094 ...... 0.0065 ........ 0.0045
16-bit uint #2 ....... 0.0077 ........ 0.0094 ...... 0.0063 ........ 0.0045
16-bit uint #3 ....... 0.0077 ........ 0.0095 ...... 0.0064 ........ 0.0047
32-bit uint #1 ....... 0.0088 ........ 0.0119 ...... 0.0063 ........ 0.0043
32-bit uint #2 ....... 0.0089 ........ 0.0117 ...... 0.0062 ........ 0.0039
32-bit uint #3 ....... 0.0089 ........ 0.0118 ...... 0.0063 ........ 0.0044
64-bit uint #1 ....... 0.0097 ........ 0.0155 ...... 0.0063 ........ 0.0045
64-bit uint #2 ....... 0.0095 ........ 0.0153 ...... 0.0061 ........ 0.0045
64-bit uint #3 ....... 0.0096 ........ 0.0156 ...... 0.0063 ........ 0.0047
8-bit int #1 ......... 0.0053 ........ 0.0083 ...... 0.0062 ........ 0.0044
8-bit int #2 ......... 0.0052 ........ 0.0080 ...... 0.0062 ........ 0.0044
8-bit int #3 ......... 0.0052 ........ 0.0080 ...... 0.0062 ........ 0.0043
16-bit int #1 ........ 0.0089 ........ 0.0097 ...... 0.0069 ........ 0.0046
16-bit int #2 ........ 0.0075 ........ 0.0093 ...... 0.0063 ........ 0.0043
16-bit int #3 ........ 0.0075 ........ 0.0094 ...... 0.0062 ........ 0.0046
32-bit int #1 ........ 0.0086 ........ 0.0122 ...... 0.0063 ........ 0.0044
32-bit int #2 ........ 0.0087 ........ 0.0120 ...... 0.0066 ........ 0.0046
32-bit int #3 ........ 0.0086 ........ 0.0121 ...... 0.0060 ........ 0.0044
64-bit int #1 ........ 0.0096 ........ 0.0149 ...... 0.0060 ........ 0.0045
64-bit int #2 ........ 0.0096 ........ 0.0157 ...... 0.0062 ........ 0.0044
64-bit int #3 ........ 0.0096 ........ 0.0160 ...... 0.0063 ........ 0.0046
64-bit int #4 ........ 0.0097 ........ 0.0157 ...... 0.0061 ........ 0.0044
64-bit float #1 ...... 0.0079 ........ 0.0153 ...... 0.0056 ........ 0.0044
64-bit float #2 ...... 0.0079 ........ 0.0152 ...... 0.0057 ........ 0.0045
64-bit float #3 ...... 0.0079 ........ 0.0155 ...... 0.0057 ........ 0.0044
fix string #1 ........ 0.0010 ........ 0.0045 ...... 0.0071 ........ 0.0044
fix string #2 ........ 0.0048 ........ 0.0075 ...... 0.0070 ........ 0.0060
fix string #3 ........ 0.0048 ........ 0.0086 ...... 0.0068 ........ 0.0060
fix string #4 ........ 0.0050 ........ 0.0088 ...... 0.0070 ........ 0.0059
8-bit string #1 ...... 0.0081 ........ 0.0129 ...... 0.0069 ........ 0.0062
8-bit string #2 ...... 0.0086 ........ 0.0128 ...... 0.0069 ........ 0.0065
8-bit string #3 ...... 0.0086 ........ 0.0126 ...... 0.0115 ........ 0.0065
16-bit string #1 ..... 0.0105 ........ 0.0137 ...... 0.0128 ........ 0.0068
16-bit string #2 ..... 0.1510 ........ 0.1486 ...... 0.1526 ........ 0.1391
32-bit string ........ 0.1517 ........ 0.1475 ...... 0.1504 ........ 0.1370
wide char string #1 .. 0.0044 ........ 0.0085 ...... 0.0067 ........ 0.0057
wide char string #2 .. 0.0081 ........ 0.0125 ...... 0.0069 ........ 0.0063
8-bit binary #1 ........... I ............. I ........... F ............. I
8-bit binary #2 ........... I ............. I ........... F ............. I
8-bit binary #3 ........... I ............. I ........... F ............. I
16-bit binary ............. I ............. I ........... F ............. I
32-bit binary ............. I ............. I ........... F ............. I
fix array #1 ......... 0.0014 ........ 0.0059 ...... 0.0132 ........ 0.0055
fix array #2 ......... 0.0146 ........ 0.0156 ...... 0.0155 ........ 0.0148
fix array #3 ......... 0.0211 ........ 0.0229 ...... 0.0179 ........ 0.0180
16-bit array #1 ...... 0.0673 ........ 0.0498 ...... 0.0343 ........ 0.0388
16-bit array #2 ........... S ............. S ........... S ............. S
32-bit array .............. S ............. S ........... S ............. S
complex array ............. I ............. I ........... F ............. F
fix map #1 ................ I ............. I ........... F ............. I
fix map #2 ........... 0.0148 ........ 0.0180 ...... 0.0156 ........ 0.0179
fix map #3 ................ I ............. I ........... F ............. I
fix map #4 ........... 0.0252 ........ 0.0201 ...... 0.0214 ........ 0.0167
16-bit map #1 ........ 0.1027 ........ 0.0836 ...... 0.0388 ........ 0.0510
16-bit map #2 ............. S ............. S ........... S ............. S
32-bit map ................ S ............. S ........... S ............. S
complex map .......... 0.1104 ........ 0.1010 ...... 0.0556 ........ 0.0602
fixext 1 .................. I ............. I ........... F ............. F
fixext 2 .................. I ............. I ........... F ............. F
fixext 4 .................. I ............. I ........... F ............. F
fixext 8 .................. I ............. I ........... F ............. F
fixext 16 ................. I ............. I ........... F ............. F
8-bit ext ................. I ............. I ........... F ............. F
16-bit ext ................ I ............. I ........... F ............. F
32-bit ext ................ I ............. I ........... F ............. F
32-bit timestamp #1 ....... I ............. I ........... F ............. F
32-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #1 ....... I ............. I ........... F ............. F
64-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #3 ....... I ............. I ........... F ............. F
96-bit timestamp #1 ....... I ............. I ........... F ............. F
96-bit timestamp #2 ....... I ............. I ........... F ............. F
96-bit timestamp #3 ....... I ............. I ........... F ............. F
===========================================================================
Total                  0.9642          1.0909        0.8224          0.7213
Skipped                     4               4             4               4
Failed                      0               0            24              17
Ignored                    24              24             0               7

Note that the msgpack extension (v2.1.2) doesn't support ext, bin and UTF-8 str types.

License

The library is released under the MIT License. See the bundled LICENSE file for details.

Author: rybakit
Source Code: https://github.com/rybakit/msgpack.php
License: MIT License

#php 

Treebender: A Symbolic Natural Language Parsing Library for Rust

Treebender

A symbolic natural language parsing library for Rust, inspired by HDPSG.

What is this?

This is a library for parsing natural or constructed languages into syntax trees and feature structures. There's no machine learning or probabilistic models, everything is hand-crafted and deterministic.

You can find out more about the motivations of this project in this blog post.

But what are you using it for?

I'm using this to parse a constructed language for my upcoming xenolinguistics game, Themengi.

Motivation

Using a simple 80-line grammar, introduced in the tutorial below, we can parse a simple subset of English, checking reflexive pronoun binding, case, and number agreement.

$ cargo run --bin cli examples/reflexives.fgr
> she likes himself
Parsed 0 trees

> her likes herself
Parsed 0 trees

> she like herself
Parsed 0 trees

> she likes herself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: she))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: herself)))
[
  child-2: [
    case: acc
    pron: ref
    needs_pron: #0 she
    num: sg
    child-0: [ word: herself ]
  ]
  child-1: [
    tense: nonpast
    child-0: [ word: likes ]
    num: #1 sg
  ]
  child-0: [
    child-0: [ word: she ]
    case: nom
    pron: #0
    num: #1
  ]
]

Low resource language? Low problem! No need to train on gigabytes of text, just write a grammar using your brain. Let's hypothesize that in American Sign Language, topicalized nouns (expressed with raised eyebrows) must appear first in the sentence. We can write a small grammar (18 lines), and plug in some sentences:

$ cargo run --bin cli examples/asl-wordorder.fgr -n
> boy sit
Parsed 1 tree
(0..2: S
  (0..1: NP ((0..1: N (0..1: boy))))
  (1..2: IV (1..2: sit)))

> boy throw ball
Parsed 1 tree
(0..3: S
  (0..1: NP ((0..1: N (0..1: boy))))
  (1..2: TV (1..2: throw))
  (2..3: NP ((2..3: N (2..3: ball)))))

> ball nm-raised-eyebrows boy throw
Parsed 1 tree
(0..4: S
  (0..2: NP
    (0..1: N (0..1: ball))
    (1..2: Topic (1..2: nm-raised-eyebrows)))
  (2..3: NP ((2..3: N (2..3: boy))))
  (3..4: TV (3..4: throw)))

> boy throw ball nm-raised-eyebrows
Parsed 0 trees

Tutorial

As an example, let's say we want to build a parser for English reflexive pronouns (himself, herself, themselves, themself, itself). We'll also support number ("He likes X" v.s. "They like X") and simple embedded clauses ("He said that they like X").

Grammar files are written in a custom language, similar to BNF, called Feature GRammar (.fgr). There's a VSCode syntax highlighting extension for these files available as fgr-syntax.

We'll start by defining our lexicon. The lexicon is the set of terminal symbols (symbols in the actual input) that the grammar will match. Terminal symbols must start with a lowercase letter, and non-terminal symbols must start with an uppercase letter.

// pronouns
N -> he
N -> him
N -> himself
N -> she
N -> her
N -> herself
N -> they
N -> them
N -> themselves
N -> themself

// names, lowercase as they are terminals
N -> mary
N -> sue
N -> takeshi
N -> robert

// complementizer
Comp -> that

// verbs -- intransitive, transitive, and clausal
IV -> falls
IV -> fall
IV -> fell

TV -> likes
TV -> like
TV -> liked

CV -> says
CV -> say
CV -> said

Next, we can add our sentence rules (they must be added at the top, as the first rule in the file is assumed to be the top-level rule):

// sentence rules
S -> N IV
S -> N TV N
S -> N CV Comp S

// ... previous lexicon ...

Assuming this file is saved as examples/no-features.fgr (which it is :wink:), we can test this file with the built-in CLI:

$ cargo run --bin cli examples/no-features.fgr
> he falls
Parsed 1 tree
(0..2: S
  (0..1: N (0..1: he))
  (1..2: IV (1..2: falls)))
[
  child-1: [ child-0: [ word: falls ] ]
  child-0: [ child-0: [ word: he ] ]
]

> he falls her
Parsed 0 trees

> he likes her
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: her)))
[
  child-2: [ child-0: [ word: her ] ]
  child-1: [ child-0: [ word: likes ] ]
  child-0: [ child-0: [ word: he ] ]
]

> he likes
Parsed 0 trees

> he said that he likes her
Parsed 1 tree
(0..6: S
  (0..1: N (0..1: he))
  (1..2: CV (1..2: said))
  (2..3: Comp (2..3: that))
  (3..6: S
    (3..4: N (3..4: he))
    (4..5: TV (4..5: likes))
    (5..6: N (5..6: her))))
[
  child-0: [ child-0: [ word: he ] ]
  child-2: [ child-0: [ word: that ] ]
  child-1: [ child-0: [ word: said ] ]
  child-3: [
    child-2: [ child-0: [ word: her ] ]
    child-1: [ child-0: [ word: likes ] ]
    child-0: [ child-0: [ word: he ] ]
  ]
]

> he said that he
Parsed 0 trees

This grammar already parses some correct sentences, and blocks some trivially incorrect ones. However, it doesn't care about number, case, or reflexives right now:

> she likes himself  // unbound reflexive pronoun
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: she))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: himself)))
[
  child-0: [ child-0: [ word: she ] ]
  child-2: [ child-0: [ word: himself ] ]
  child-1: [ child-0: [ word: likes ] ]
]

> him like her  // incorrect case on the subject pronoun, should be nominative
                // (he) instead of accusative (him)
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: him))
  (1..2: TV (1..2: like))
  (2..3: N (2..3: her)))
[
  child-0: [ child-0: [ word: him ] ]
  child-1: [ child-0: [ word: like ] ]
  child-2: [ child-0: [ word: her ] ]
]

> he like her  // incorrect verb number agreement
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: like))
  (2..3: N (2..3: her)))
[
  child-2: [ child-0: [ word: her ] ]
  child-1: [ child-0: [ word: like ] ]
  child-0: [ child-0: [ word: he ] ]
]

To fix this, we need to add features to our lexicon, and restrict the sentence rules based on features.

Features are added with square brackets, and are key: value pairs separated by commas. **top** is a special feature value, which basically means "unspecified" -- we'll come back to it later. Features that are unspecified are also assumed to have a **top** value, but sometimes explicitly stating top is more clear.

/// Pronouns
// The added features are:
// * num: sg or pl, whether this noun wants a singular verb (likes) or
//   a plural verb (like). note this is grammatical number, so for example
//   singular they takes plural agreement ("they like X", not *"they likes X")
// * case: nom or acc, whether this noun is nominative or accusative case.
//   nominative case goes in the subject, and accusative in the object.
//   e.g., "he fell" and "she likes him", not *"him fell" and *"her likes he"
// * pron: he, she, they, or ref -- what type of pronoun this is
// * needs_pron: whether this is a reflexive that needs to bind to another
//   pronoun.
N[ num: sg, case: nom, pron: he ]                    -> he
N[ num: sg, case: acc, pron: he ]                    -> him
N[ num: sg, case: acc, pron: ref, needs_pron: he ]   -> himself
N[ num: sg, case: nom, pron: she ]                   -> she
N[ num: sg, case: acc, pron: she ]                   -> her
N[ num: sg, case: acc, pron: ref, needs_pron: she]   -> herself
N[ num: pl, case: nom, pron: they ]                  -> they
N[ num: pl, case: acc, pron: they ]                  -> them
N[ num: pl, case: acc, pron: ref, needs_pron: they ] -> themselves
N[ num: sg, case: acc, pron: ref, needs_pron: they ] -> themself

// Names
// The added features are:
// * num: sg, as people are singular ("mary likes her" / *"mary like her")
// * case: **top**, as names can be both subjects and objects
//   ("mary likes her" / "she likes mary")
// * pron: whichever pronoun the person uses for reflexive agreement
//   mary    pron: she  => mary likes herself
//   sue     pron: they => sue likes themself
//   takeshi pron: he   => takeshi likes himself
N[ num: sg, case: **top**, pron: she ]  -> mary
N[ num: sg, case: **top**, pron: they ] -> sue
N[ num: sg, case: **top**, pron: he ]   -> takeshi
N[ num: sg, case: **top**, pron: he ]   -> robert

// Complementizer doesn't need features
Comp -> that

// Verbs -- intransitive, transitive, and clausal
// The added features are:
// * num: sg, pl, or **top** -- to match the noun numbers.
//   **top** will match either sg or pl, as past-tense verbs in English
//   don't agree in number: "he fell" and "they fell" are both fine
// * tense: past or nonpast -- this won't be used for agreement, but will be
//   copied into the final feature structure, and the client code could do
//   something with it
IV[ num:      sg, tense: nonpast ] -> falls
IV[ num:      pl, tense: nonpast ] -> fall
IV[ num: **top**, tense: past ]    -> fell

TV[ num:      sg, tense: nonpast ] -> likes
TV[ num:      pl, tense: nonpast ] -> like
TV[ num: **top**, tense: past ]    -> liked

CV[ num:      sg, tense: nonpast ] -> says
CV[ num:      pl, tense: nonpast ] -> say
CV[ num: **top**, tense: past ]    -> said

Now that our lexicon is updated with features, we can update our sentence rules to constrain parsing based on those features. This uses two new features, tags and unification. Tags allow features to be associated between nodes in a rule, and unification controls how those features are compatible. The rules for unification are:

  1. A string feature can unify with a string feature with the same value
  2. A top feature can unify with anything, and the nodes are merged
  3. A complex feature ([ ... ] structure) is recursively unified with another complex feature.

If unification fails anywhere, the parse is aborted and the tree is discarded. This allows the programmer to discard trees if features don't match.

// Sentence rules
// Intransitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
S -> N[ case: nom, num: #1 ] IV[ num: #1 ]
// Transitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #2)
// * If there's a reflexive in the object position, make sure its `needs_pron`
//   feature matches the subject's `pron` feature. If the object isn't a
//   reflexive, then its `needs_pron` feature will implicitly be `**top**`, so
//   will unify with anything.
S -> N[ case: nom, pron: #1, num: #2 ] TV[ num: #2 ] N[ case: acc, needs_pron: #1 ]
// Clausal verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
// * Reflexives can't cross clause boundaries (*"He said that she likes himself"),
//   so we can ignore reflexives and delegate to inner clause rule
S -> N[ case: nom, num: #1 ] CV[ num: #1 ] Comp S

Now that we have this augmented grammar (available as examples/reflexives.fgr), we can try it out and see that it rejects illicit sentences that were previously accepted, while still accepting valid ones:

> he fell
Parsed 1 tree
(0..2: S
  (0..1: N (0..1: he))
  (1..2: IV (1..2: fell)))
[
  child-1: [
    child-0: [ word: fell ]
    num: #0 sg
    tense: past
  ]
  child-0: [
    pron: he
    case: nom
    num: #0
    child-0: [ word: he ]
  ]
]

> he like him
Parsed 0 trees

> he likes himself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: himself)))
[
  child-1: [
    num: #0 sg
    child-0: [ word: likes ]
    tense: nonpast
  ]
  child-2: [
    needs_pron: #1 he
    num: sg
    child-0: [ word: himself ]
    pron: ref
    case: acc
  ]
  child-0: [
    child-0: [ word: he ]
    pron: #1
    num: #0
    case: nom
  ]
]

> he likes herself
Parsed 0 trees

> mary likes herself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: mary))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: herself)))
[
  child-0: [
    pron: #0 she
    num: #1 sg
    case: nom
    child-0: [ word: mary ]
  ]
  child-1: [
    tense: nonpast
    child-0: [ word: likes ]
    num: #1
  ]
  child-2: [
    child-0: [ word: herself ]
    num: sg
    pron: ref
    case: acc
    needs_pron: #0
  ]
]

> mary likes themself
Parsed 0 trees

> sue likes themself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: sue))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: themself)))
[
  child-0: [
    pron: #0 they
    child-0: [ word: sue ]
    case: nom
    num: #1 sg
  ]
  child-1: [
    tense: nonpast
    num: #1
    child-0: [ word: likes ]
  ]
  child-2: [
    needs_pron: #0
    case: acc
    pron: ref
    child-0: [ word: themself ]
    num: sg
  ]
]

> sue likes himself
Parsed 0 trees

If this is interesting to you and you want to learn more, you can check out my blog series, the excellent textbook Syntactic Theory: A Formal Introduction (2nd ed.), and the DELPH-IN project, whose work on the LKB inspired this simplified version.

Using from code

I need to write this section in more detail, but if you're comfortable with Rust, I suggest looking through the codebase. It's not perfect, it started as one of my first Rust projects (after migrating through F# -> TypeScript -> C in search of the right performance/ergonomics tradeoff), and it could use more tests, but overall it's not too bad.

Basically, the processing pipeline is:

  1. Make a Grammar struct
  • Grammar is defined in rules.rs.
  • The easiest way to make a Grammar is Grammar::parse_from_file, which is mostly a hand-written recusive descent parser in parse_grammar.rs. Yes, I recognize the irony here.
  1. It takes input (in Grammar::parse, which does everything for you, or Grammar::parse_chart, which just does the chart)
  2. The input is first chart-parsed in earley.rs
  3. Then, a forest is built from the chart, in forest.rs, using an algorithm I found in a very useful blog series I forget the URL for, because the algorithms in the academic literature for this are... weird.
  4. Finally, the feature unification is used to prune the forest down to only valid trees. It would be more efficient to do this during parsing, but meh.

The most interesting thing you can do via code and not via the CLI is probably getting at the raw feature DAG, as that would let you do things like pronoun coreference. The DAG code is in featurestructure.rs, and should be fairly approachable -- there's a lot of Rust ceremony around Rc<RefCell<...>> because using an arena allocation crate seemed too harlike overkill, but that is somewhat mitigated by the NodeRef type alias. Hit me up at https://vgel.me/contact if you need help with anything here!

Download Details:
Author: vgel
Source Code: https://github.com/vgel/treebender
License: MIT License

#rust  #machinelearning 

Py Rouge: A Full Python Implementation Of The ROUGE Metric

Py-rouge

A full Python implementation of the ROUGE metric, producing same results as in the official perl implementation.

Important remarks

  • The original Porter stemmer in NLTK is slightly different than the one use in the official ROUGE perl script as it has been written by end. Therefore, there might be slightly different stems for certain words. For DUC2004 dataset, I have identified these words and this script produces same stems.
  • The official ROUGE perl script use resampling strategy to compute the average with confidence intervals. Therefore, we might have a difference <3e-5 for ROUGE-L as well as ROUGE-W and <4e-5 for ROUGE-N.
  • Finally, ROUGE-1.5.5. has a bug: should have $tmpTextLen += $sLen at line 2101. Here, the last sentence, $limitBytes is taken instead of $limitBytes-$tmpTextLen (as $tmpTextLen is never updated with bytes length limit). It has been fixed in this code. This bug does not have a consequence for the default evaluation -b 665.

In case of doubts, please see all the implemented tests to compare outputs between the official ROUGE-1.5.5 and this script.

Installation

Package is uploaded on PyPI <https://pypi.org/project/py-rouge>_.

You can install it with pip:

pip install py-rouge

or do it manually:

git clone https://github.com/Diego999/py-rouge
cd py-rouge
python setup.py install

Issues/Pull Requests/Feedbacks

Don't hesitate to contact for any feedback or create issues/pull requests (especially if you want to rewrite the stemmer implemented in ROUGE-1.5.5 in python ;)).

Example

import rouge


def prepare_results(m, p, r, f):
    return '\t{}:\t{}: {:5.2f}\t{}: {:5.2f}\t{}: {:5.2f}'.format(m, 'P', 100.0 * p, 'R', 100.0 * r, 'F1', 100.0 * f)


for aggregator in ['Avg', 'Best', 'Individual']:
    print('Evaluation with {}'.format(aggregator))
    apply_avg = aggregator == 'Avg'
    apply_best = aggregator == 'Best'

    evaluator = rouge.Rouge(metrics=['rouge-n', 'rouge-l', 'rouge-w'],
                           max_n=4,
                           limit_length=True,
                           length_limit=100,
                           length_limit_type='words',
                           apply_avg=apply_avg,
                           apply_best=apply_best,
                           alpha=0.5, # Default F1_score
                           weight_factor=1.2,
                           stemming=True)


    hypothesis_1 = "King Norodom Sihanouk has declined requests to chair a summit of Cambodia 's top political leaders , saying the meeting would not bring any progress in deadlocked negotiations to form a government .\nGovernment and opposition parties have asked King Norodom Sihanouk to host a summit meeting after a series of post-election negotiations between the two opposition groups and Hun Sen 's party to form a new government failed .\nHun Sen 's ruling party narrowly won a majority in elections in July , but the opposition _ claiming widespread intimidation and fraud _ has denied Hun Sen the two-thirds vote in parliament required to approve the next government .\n"
    references_1 = ["Prospects were dim for resolution of the political crisis in Cambodia in October 1998.\nPrime Minister Hun Sen insisted that talks take place in Cambodia while opposition leaders Ranariddh and Sam Rainsy, fearing arrest at home, wanted them abroad.\nKing Sihanouk declined to chair talks in either place.\nA U.S. House resolution criticized Hun Sen's regime while the opposition tried to cut off his access to loans.\nBut in November the King announced a coalition government with Hun Sen heading the executive and Ranariddh leading the parliament.\nLeft out, Sam Rainsy sought the King's assurance of Hun Sen's promise of safety and freedom for all politicians.",
                    "Cambodian prime minister Hun Sen rejects demands of 2 opposition parties for talks in Beijing after failing to win a 2/3 majority in recent elections.\nSihanouk refuses to host talks in Beijing.\nOpposition parties ask the Asian Development Bank to stop loans to Hun Sen's government.\nCCP defends Hun Sen to the US Senate.\nFUNCINPEC refuses to share the presidency.\nHun Sen and Ranariddh eventually form a coalition at summit convened by Sihanouk.\nHun Sen remains prime minister, Ranariddh is president of the national assembly, and a new senate will be formed.\nOpposition leader Rainsy left out.\nHe seeks strong assurance of safety should he return to Cambodia.\n",
                    ]

    hypothesis_2 = "China 's government said Thursday that two prominent dissidents arrested this week are suspected of endangering national security _ the clearest sign yet Chinese leaders plan to quash a would-be opposition party .\nOne leader of a suppressed new political party will be tried on Dec. 17 on a charge of colluding with foreign enemies of China '' to incite the subversion of state power , '' according to court documents given to his wife on Monday .\nWith attorneys locked up , harassed or plain scared , two prominent dissidents will defend themselves against charges of subversion Thursday in China 's highest-profile dissident trials in two years .\n"
    references_2 = "Hurricane Mitch, category 5 hurricane, brought widespread death and destruction to Central American.\nEspecially hard hit was Honduras where an estimated 6,076 people lost their lives.\nThe hurricane, which lingered off the coast of Honduras for 3 days before moving off, flooded large areas, destroying crops and property.\nThe U.S. and European Union were joined by Pope John Paul II in a call for money and workers to help the stricken area.\nPresident Clinton sent Tipper Gore, wife of Vice President Gore to the area to deliver much needed supplies to the area, demonstrating U.S. commitment to the recovery of the region.\n"

    all_hypothesis = [hypothesis_1, hypothesis_2]
    all_references = [references_1, references_2]

    scores = evaluator.get_scores(all_hypothesis, all_references)

    for metric, results in sorted(scores.items(), key=lambda x: x[0]):
        if not apply_avg and not apply_best: # value is a type of list as we evaluate each summary vs each reference
            for hypothesis_id, results_per_ref in enumerate(results):
                nb_references = len(results_per_ref['p'])
                for reference_id in range(nb_references):
                    print('\tHypothesis #{} & Reference #{}: '.format(hypothesis_id, reference_id))
                    print('\t' + prepare_results(metric,results_per_ref['p'][reference_id], results_per_ref['r'][reference_id], results_per_ref['f'][reference_id]))
            print()
        else:
            print(prepare_results(metric, results['p'], results['r'], results['f']))
    print()

It produces the following output:

Evaluation with Avg
    rouge-1:    P: 28.62    R: 26.46    F1: 27.49
    rouge-2:    P:  4.21    R:  3.92    F1:  4.06
    rouge-3:    P:  0.80    R:  0.74    F1:  0.77
    rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    rouge-l:    P: 30.52    R: 28.57    F1: 29.51
    rouge-w:    P: 15.85    R:  8.28    F1: 10.87

Evaluation with Best
    rouge-1:    P: 30.44    R: 28.36    F1: 29.37
    rouge-2:    P:  4.74    R:  4.46    F1:  4.59
    rouge-3:    P:  1.06    R:  0.98    F1:  1.02
    rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    rouge-l:    P: 31.54    R: 29.71    F1: 30.60
    rouge-w:    P: 16.42    R:  8.82    F1: 11.47

Evaluation with Individual
    Hypothesis #0 & Reference #0: 
        rouge-1:    P: 38.54    R: 35.58    F1: 37.00
    Hypothesis #0 & Reference #1: 
        rouge-1:    P: 45.83    R: 43.14    F1: 44.44
    Hypothesis #1 & Reference #0: 
        rouge-1:    P: 15.05    R: 13.59    F1: 14.29

    Hypothesis #0 & Reference #0: 
        rouge-2:    P:  7.37    R:  6.80    F1:  7.07
    Hypothesis #0 & Reference #1: 
        rouge-2:    P:  9.47    R:  8.91    F1:  9.18
    Hypothesis #1 & Reference #0: 
        rouge-2:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-3:    P:  2.13    R:  1.96    F1:  2.04
    Hypothesis #0 & Reference #1: 
        rouge-3:    P:  1.06    R:  1.00    F1:  1.03
    Hypothesis #1 & Reference #0: 
        rouge-3:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    Hypothesis #0 & Reference #1: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    Hypothesis #1 & Reference #0: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-l:    P: 42.11    R: 39.39    F1: 40.70
    Hypothesis #0 & Reference #1: 
        rouge-l:    P: 46.19    R: 43.92    F1: 45.03
    Hypothesis #1 & Reference #0: 
        rouge-l:    P: 16.88    R: 15.50    F1: 16.16

    Hypothesis #0 & Reference #0: 
        rouge-w:    P: 22.27    R: 11.49    F1: 15.16
    Hypothesis #0 & Reference #1: 
        rouge-w:    P: 24.56    R: 13.60    F1: 17.51
    Hypothesis #1 & Reference #0: 
        rouge-w:    P:  8.29    R:  4.04    F1:  5.43

Download Details:

Author: Diego999
Source Code: https://github.com/Diego999/py-rouge

License: Apache-2.0 license

#perl #python 

Queenie  Davis

Queenie Davis

1653123600

EasyMDE: Simple, Beautiful and Embeddable JavaScript Markdown Editor

EasyMDE - Markdown Editor 

This repository is a fork of SimpleMDE, made by Sparksuite. Go to the dedicated section for more information.

A drop-in JavaScript text area replacement for writing beautiful and understandable Markdown. EasyMDE allows users who may be less experienced with Markdown to use familiar toolbar buttons and shortcuts.

In addition, the syntax is rendered while editing to clearly show the expected result. Headings are larger, emphasized words are italicized, links are underlined, etc.

EasyMDE also features both built-in auto saving and spell checking. The editor is entirely customizable, from theming to toolbar buttons and javascript hooks.

Try the demo

Preview

Quick access

Install EasyMDE

Via npm:

npm install easymde

Via the UNPKG CDN:

<link rel="stylesheet" href="https://unpkg.com/easymde/dist/easymde.min.css">
<script src="https://unpkg.com/easymde/dist/easymde.min.js"></script>

Or jsDelivr:

<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/easymde/dist/easymde.min.css">
<script src="https://cdn.jsdelivr.net/npm/easymde/dist/easymde.min.js"></script>

How to use

Loading the editor

After installing and/or importing the module, you can load EasyMDE onto the first textarea element on the web page:

<textarea></textarea>
<script>
const easyMDE = new EasyMDE();
</script>

Alternatively you can select a specific textarea, via JavaScript:

<textarea id="my-text-area"></textarea>
<script>
const easyMDE = new EasyMDE({element: document.getElementById('my-text-area')});
</script>

Editor functions

Use easyMDE.value() to get the content of the editor:

<script>
easyMDE.value();
</script>

Use easyMDE.value(val) to set the content of the editor:

<script>
easyMDE.value('New input for **EasyMDE**');
</script>

Configuration

Options list

  • autoDownloadFontAwesome: If set to true, force downloads Font Awesome (used for icons). If set to false, prevents downloading. Defaults to undefined, which will intelligently check whether Font Awesome has already been included, then download accordingly.
  • autofocus: If set to true, focuses the editor automatically. Defaults to false.
  • autosave: Saves the text that's being written and will load it back in the future. It will forget the text when the form it's contained in is submitted.
    • enabled: If set to true, saves the text automatically. Defaults to false.
    • delay: Delay between saves, in milliseconds. Defaults to 10000 (10 seconds).
    • submit_delay: Delay before assuming that submit of the form failed and saving the text, in milliseconds. Defaults to autosave.delay or 10000 (10 seconds).
    • uniqueId: You must set a unique string identifier so that EasyMDE can autosave. Something that separates this from other instances of EasyMDE elsewhere on your website.
    • timeFormat: Set DateTimeFormat. More information see DateTimeFormat instances. Default locale: en-US, format: hour:minute.
    • text: Set text for autosave.
  • autoRefresh: Useful, when initializing the editor in a hidden DOM node. If set to { delay: 300 }, it will check every 300 ms if the editor is visible and if positive, call CodeMirror's refresh().
  • blockStyles: Customize how certain buttons that style blocks of text behave.
    • bold: Can be set to ** or __. Defaults to **.
    • code: Can be set to ``` or ~~~. Defaults to ```.
    • italic: Can be set to * or _. Defaults to *.
  • unorderedListStyle: can be *, - or +. Defaults to *.
  • scrollbarStyle: Chooses a scrollbar implementation. The default is "native", showing native scrollbars. The core library also provides the "null" style, which completely hides the scrollbars. Addons can implement additional scrollbar models.
  • element: The DOM element for the textarea element to use. Defaults to the first textarea element on the page.
  • forceSync: If set to true, force text changes made in EasyMDE to be immediately stored in original text area. Defaults to false.
  • hideIcons: An array of icon names to hide. Can be used to hide specific icons shown by default without completely customizing the toolbar.
  • indentWithTabs: If set to false, indent using spaces instead of tabs. Defaults to true.
  • initialValue: If set, will customize the initial value of the editor.
  • previewImagesInEditor: - EasyMDE will show preview of images, false by default, preview for images will appear only for images on separate lines.
  • imagesPreviewHandler: - A custom function for handling the preview of images. Takes the parsed string between the parantheses of the image markdown ![]( ) as argument and returns a string that serves as the src attribute of the <img> tag in the preview. Enables dynamic previewing of images in the frontend without having to upload them to a server, allows copy-pasting of images to the editor with preview.
  • insertTexts: Customize how certain buttons that insert text behave. Takes an array with two elements. The first element will be the text inserted before the cursor or highlight, and the second element will be inserted after. For example, this is the default link value: ["[", "](http://)"].
    • horizontalRule
    • image
    • link
    • table
  • lineNumbers: If set to true, enables line numbers in the editor.
  • lineWrapping: If set to false, disable line wrapping. Defaults to true.
  • minHeight: Sets the minimum height for the composition area, before it starts auto-growing. Should be a string containing a valid CSS value like "500px". Defaults to "300px".
  • maxHeight: Sets fixed height for the composition area. minHeight option will be ignored. Should be a string containing a valid CSS value like "500px". Defaults to undefined.
  • onToggleFullScreen: A function that gets called when the editor's full screen mode is toggled. The function will be passed a boolean as parameter, true when the editor is currently going into full screen mode, or false.
  • parsingConfig: Adjust settings for parsing the Markdown during editing (not previewing).
    • allowAtxHeaderWithoutSpace: If set to true, will render headers without a space after the #. Defaults to false.
    • strikethrough: If set to false, will not process GFM strikethrough syntax. Defaults to true.
    • underscoresBreakWords: If set to true, let underscores be a delimiter for separating words. Defaults to false.
  • overlayMode: Pass a custom codemirror overlay mode to parse and style the Markdown during editing.
    • mode: A codemirror mode object.
    • combine: If set to false, will replace CSS classes returned by the default Markdown mode. Otherwise the classes returned by the custom mode will be combined with the classes returned by the default mode. Defaults to true.
  • placeholder: If set, displays a custom placeholder message.
  • previewClass: A string or array of strings that will be applied to the preview screen when activated. Defaults to "editor-preview".
  • previewRender: Custom function for parsing the plaintext Markdown and returning HTML. Used when user previews.
  • promptURLs: If set to true, a JS alert window appears asking for the link or image URL. Defaults to false.
  • promptTexts: Customize the text used to prompt for URLs.
    • image: The text to use when prompting for an image's URL. Defaults to URL of the image:.
    • link: The text to use when prompting for a link's URL. Defaults to URL for the link:.
  • uploadImage: If set to true, enables the image upload functionality, which can be triggered by drag and drop, copy-paste and through the browse-file window (opened when the user click on the upload-image icon). Defaults to false.
  • imageMaxSize: Maximum image size in bytes, checked before upload (note: never trust client, always check the image size at server-side). Defaults to 1024 * 1024 * 2 (2 MB).
  • imageAccept: A comma-separated list of mime-types used to check image type before upload (note: never trust client, always check file types at server-side). Defaults to image/png, image/jpeg.
  • imageUploadFunction: A custom function for handling the image upload. Using this function will render the options imageMaxSize, imageAccept, imageUploadEndpoint and imageCSRFToken ineffective.
    • The function gets a file and onSuccess and onError callback functions as parameters. onSuccess(imageUrl: string) and onError(errorMessage: string)
  • imageUploadEndpoint: The endpoint where the images data will be sent, via an asynchronous POST request. The server is supposed to save this image, and return a JSON response.
    • if the request was successfully processed (HTTP 200 OK): {"data": {"filePath": "<filePath>"}} where filePath is the path of the image (absolute if imagePathAbsolute is set to true, relative if otherwise);
    • otherwise: {"error": "<errorCode>"}, where errorCode can be noFileGiven (HTTP 400 Bad Request), typeNotAllowed (HTTP 415 Unsupported Media Type), fileTooLarge (HTTP 413 Payload Too Large) or importError (see errorMessages below). If errorCode is not one of the errorMessages, it is alerted unchanged to the user. This allows for server-side error messages. No default value.
  • imagePathAbsolute: If set to true, will treat imageUrl from imageUploadFunction and filePath returned from imageUploadEndpoint as an absolute rather than relative path, i.e. not prepend window.location.origin to it.
  • imageCSRFToken: CSRF token to include with AJAX call to upload image. For various instances like Django, Spring and Laravel.
  • imageCSRFName: CSRF token filed name to include with AJAX call to upload image, applied when imageCSRFToken has value, defaults to csrfmiddlewaretoken.
  • imageCSRFHeader: If set to true, passing CSRF token via header. Defaults to false, which pass CSRF through request body.
  • imageTexts: Texts displayed to the user (mainly on the status bar) for the import image feature, where #image_name#, #image_size# and #image_max_size# will replaced by their respective values, that can be used for customization or internationalization:
    • sbInit: Status message displayed initially if uploadImage is set to true. Defaults to Attach files by drag and dropping or pasting from clipboard..
    • sbOnDragEnter: Status message displayed when the user drags a file to the text area. Defaults to Drop image to upload it..
    • sbOnDrop: Status message displayed when the user drops a file in the text area. Defaults to Uploading images #images_names#.
    • sbProgress: Status message displayed to show uploading progress. Defaults to Uploading #file_name#: #progress#%.
    • sbOnUploaded: Status message displayed when the image has been uploaded. Defaults to Uploaded #image_name#.
    • sizeUnits: A comma-separated list of units used to display messages with human-readable file sizes. Defaults to B, KB, MB (example: 218 KB). You can use B,KB,MB instead if you prefer without whitespaces (218KB).
  • errorMessages: Errors displayed to the user, using the errorCallback option, where #image_name#, #image_size# and #image_max_size# will replaced by their respective values, that can be used for customization or internationalization:
    • noFileGiven: The server did not receive any file from the user. Defaults to You must select a file..
    • typeNotAllowed: The user send a file type which doesn't match the imageAccept list, or the server returned this error code. Defaults to This image type is not allowed..
    • fileTooLarge: The size of the image being imported is bigger than the imageMaxSize, or if the server returned this error code. Defaults to Image #image_name# is too big (#image_size#).\nMaximum file size is #image_max_size#..
    • importError: An unexpected error occurred when uploading the image. Defaults to Something went wrong when uploading the image #image_name#..
  • errorCallback: A callback function used to define how to display an error message. Defaults to (errorMessage) => alert(errorMessage).
  • renderingConfig: Adjust settings for parsing the Markdown during previewing (not editing).
    • codeSyntaxHighlighting: If set to true, will highlight using highlight.js. Defaults to false. To use this feature you must include highlight.js on your page or pass in using the hljs option. For example, include the script and the CSS files like:
      <script src="https://cdn.jsdelivr.net/highlight.js/latest/highlight.min.js"></script>
      <link rel="stylesheet" href="https://cdn.jsdelivr.net/highlight.js/latest/styles/github.min.css">
    • hljs: An injectible instance of highlight.js. If you don't want to rely on the global namespace (window.hljs), you can provide an instance here. Defaults to undefined.
    • markedOptions: Set the internal Markdown renderer's options. Other renderingConfig options will take precedence.
    • singleLineBreaks: If set to false, disable parsing GitHub Flavored Markdown (GFM) single line breaks. Defaults to true.
    • sanitizerFunction: Custom function for sanitizing the HTML output of Markdown renderer.
  • shortcuts: Keyboard shortcuts associated with this instance. Defaults to the array of shortcuts.
  • showIcons: An array of icon names to show. Can be used to show specific icons hidden by default without completely customizing the toolbar.
  • spellChecker: If set to false, disable the spell checker. Defaults to true. Optionally pass a CodeMirrorSpellChecker-compliant function.
  • inputStyle: textarea or contenteditable. Defaults to textarea for desktop and contenteditable for mobile. contenteditable option is necessary to enable nativeSpellcheck.
  • nativeSpellcheck: If set to false, disable native spell checker. Defaults to true.
  • sideBySideFullscreen: If set to false, allows side-by-side editing without going into fullscreen. Defaults to true.
  • status: If set to false, hide the status bar. Defaults to the array of built-in status bar items.
    • Optionally, you can set an array of status bar items to include, and in what order. You can even define your own custom status bar items.
  • styleSelectedText: If set to false, remove the CodeMirror-selectedtext class from selected lines. Defaults to true.
  • syncSideBySidePreviewScroll: If set to false, disable syncing scroll in side by side mode. Defaults to true.
  • tabSize: If set, customize the tab size. Defaults to 2.
  • theme: Override the theme. Defaults to easymde.
  • toolbar: If set to false, hide the toolbar. Defaults to the array of icons.
  • toolbarTips: If set to false, disable toolbar button tips. Defaults to true.
  • direction: rtl or ltr. Changes text direction to support right-to-left languages. Defaults to ltr.

Options example

Most options demonstrate the non-default behavior:

const editor = new EasyMDE({
    autofocus: true,
    autosave: {
        enabled: true,
        uniqueId: "MyUniqueID",
        delay: 1000,
        submit_delay: 5000,
        timeFormat: {
            locale: 'en-US',
            format: {
                year: 'numeric',
                month: 'long',
                day: '2-digit',
                hour: '2-digit',
                minute: '2-digit',
            },
        },
        text: "Autosaved: "
    },
    blockStyles: {
        bold: "__",
        italic: "_",
    },
    unorderedListStyle: "-",
    element: document.getElementById("MyID"),
    forceSync: true,
    hideIcons: ["guide", "heading"],
    indentWithTabs: false,
    initialValue: "Hello world!",
    insertTexts: {
        horizontalRule: ["", "\n\n-----\n\n"],
        image: ["![](http://", ")"],
        link: ["[", "](https://)"],
        table: ["", "\n\n| Column 1 | Column 2 | Column 3 |\n| -------- | -------- | -------- |\n| Text     | Text      | Text     |\n\n"],
    },
    lineWrapping: false,
    minHeight: "500px",
    parsingConfig: {
        allowAtxHeaderWithoutSpace: true,
        strikethrough: false,
        underscoresBreakWords: true,
    },
    placeholder: "Type here...",

    previewClass: "my-custom-styling",
    previewClass: ["my-custom-styling", "more-custom-styling"],

    previewRender: (plainText) => customMarkdownParser(plainText), // Returns HTML from a custom parser
    previewRender: (plainText, preview) => { // Async method
        setTimeout(() => {
            preview.innerHTML = customMarkdownParser(plainText);
        }, 250);

        return "Loading...";
    },
    promptURLs: true,
    promptTexts: {
        image: "Custom prompt for URL:",
        link: "Custom prompt for URL:",
    },
    renderingConfig: {
        singleLineBreaks: false,
        codeSyntaxHighlighting: true,
        sanitizerFunction: (renderedHTML) => {
            // Using DOMPurify and only allowing <b> tags
            return DOMPurify.sanitize(renderedHTML, {ALLOWED_TAGS: ['b']})
        },
    },
    shortcuts: {
        drawTable: "Cmd-Alt-T"
    },
    showIcons: ["code", "table"],
    spellChecker: false,
    status: false,
    status: ["autosave", "lines", "words", "cursor"], // Optional usage
    status: ["autosave", "lines", "words", "cursor", {
        className: "keystrokes",
        defaultValue: (el) => {
            el.setAttribute('data-keystrokes', 0);
        },
        onUpdate: (el) => {
            const keystrokes = Number(el.getAttribute('data-keystrokes')) + 1;
            el.innerHTML = `${keystrokes} Keystrokes`;
            el.setAttribute('data-keystrokes', keystrokes);
        },
    }], // Another optional usage, with a custom status bar item that counts keystrokes
    styleSelectedText: false,
    sideBySideFullscreen: false,
    syncSideBySidePreviewScroll: false,
    tabSize: 4,
    toolbar: false,
    toolbarTips: false,
});

Toolbar icons

Below are the built-in toolbar icons (only some of which are enabled by default), which can be reorganized however you like. "Name" is the name of the icon, referenced in the JavaScript. "Action" is either a function or a URL to open. "Class" is the class given to the icon. "Tooltip" is the small tooltip that appears via the title="" attribute. Note that shortcut hints are added automatically and reflect the specified action if it has a key bind assigned to it (i.e. with the value of action set to bold and that of tooltip set to Bold, the final text the user will see would be "Bold (Ctrl-B)").

Additionally, you can add a separator between any icons by adding "|" to the toolbar array.

NameActionTooltip
Class
boldtoggleBoldBold
fa fa-bold
italictoggleItalicItalic
fa fa-italic
strikethroughtoggleStrikethroughStrikethrough
fa fa-strikethrough
headingtoggleHeadingSmallerHeading
fa fa-header
heading-smallertoggleHeadingSmallerSmaller Heading
fa fa-header
heading-biggertoggleHeadingBiggerBigger Heading
fa fa-lg fa-header
heading-1toggleHeading1Big Heading
fa fa-header header-1
heading-2toggleHeading2Medium Heading
fa fa-header header-2
heading-3toggleHeading3Small Heading
fa fa-header header-3
codetoggleCodeBlockCode
fa fa-code
quotetoggleBlockquoteQuote
fa fa-quote-left
unordered-listtoggleUnorderedListGeneric List
fa fa-list-ul
ordered-listtoggleOrderedListNumbered List
fa fa-list-ol
clean-blockcleanBlockClean block
fa fa-eraser
linkdrawLinkCreate Link
fa fa-link
imagedrawImageInsert Image
fa fa-picture-o
tabledrawTableInsert Table
fa fa-table
horizontal-ruledrawHorizontalRuleInsert Horizontal Line
fa fa-minus
previewtogglePreviewToggle Preview
fa fa-eye no-disable
side-by-sidetoggleSideBySideToggle Side by Side
fa fa-columns no-disable no-mobile
fullscreentoggleFullScreenToggle Fullscreen
fa fa-arrows-alt no-disable no-mobile
guideThis linkMarkdown Guide
fa fa-question-circle
undoundoUndo
fa fa-undo
redoredoRedo
fa fa-redo

Toolbar customization

Customize the toolbar using the toolbar option.

Only the order of existing buttons:

const easyMDE = new EasyMDE({
    toolbar: ["bold", "italic", "heading", "|", "quote"]
});

All information and/or add your own icons

const easyMDE = new EasyMDE({
    toolbar: [
        {
            name: "bold",
            action: EasyMDE.toggleBold,
            className: "fa fa-bold",
            title: "Bold",
        },
        "italics", // shortcut to pre-made button
        {
            name: "custom",
            action: (editor) => {
                // Add your own code
            },
            className: "fa fa-star",
            title: "Custom Button",
            attributes: { // for custom attributes
                id: "custom-id",
                "data-value": "custom value" // HTML5 data-* attributes need to be enclosed in quotation marks ("") because of the dash (-) in its name.
            }
        },
        "|" // Separator
        // [, ...]
    ]
});

Put some buttons on dropdown menu

const easyMDE = new EasyMDE({
    toolbar: [{
                name: "heading",
                action: EasyMDE.toggleHeadingSmaller,
                className: "fa fa-header",
                title: "Headers",
            },
            "|",
            {
                name: "others",
                className: "fa fa-blind",
                title: "others buttons",
                children: [
                    {
                        name: "image",
                        action: EasyMDE.drawImage,
                        className: "fa fa-picture-o",
                        title: "Image",
                    },
                    {
                        name: "quote",
                        action: EasyMDE.toggleBlockquote,
                        className: "fa fa-percent",
                        title: "Quote",
                    },
                    {
                        name: "link",
                        action: EasyMDE.drawLink,
                        className: "fa fa-link",
                        title: "Link",
                    }
                ]
            },
        // [, ...]
    ]
});

Keyboard shortcuts

EasyMDE comes with an array of predefined keyboard shortcuts, but they can be altered with a configuration option. The list of default ones is as follows:

Shortcut (Windows / Linux)Shortcut (macOS)Action
Ctrl-'Cmd-'"toggleBlockquote"
Ctrl-BCmd-B"toggleBold"
Ctrl-ECmd-E"cleanBlock"
Ctrl-HCmd-H"toggleHeadingSmaller"
Ctrl-ICmd-I"toggleItalic"
Ctrl-KCmd-K"drawLink"
Ctrl-LCmd-L"toggleUnorderedList"
Ctrl-PCmd-P"togglePreview"
Ctrl-Alt-CCmd-Alt-C"toggleCodeBlock"
Ctrl-Alt-ICmd-Alt-I"drawImage"
Ctrl-Alt-LCmd-Alt-L"toggleOrderedList"
Shift-Ctrl-HShift-Cmd-H"toggleHeadingBigger"
F9F9"toggleSideBySide"
F11F11"toggleFullScreen"

Here is how you can change a few, while leaving others untouched:

const editor = new EasyMDE({
    shortcuts: {
        "toggleOrderedList": "Ctrl-Alt-K", // alter the shortcut for toggleOrderedList
        "toggleCodeBlock": null, // unbind Ctrl-Alt-C
        "drawTable": "Cmd-Alt-T", // bind Cmd-Alt-T to drawTable action, which doesn't come with a default shortcut
    }
});

Shortcuts are automatically converted between platforms. If you define a shortcut as "Cmd-B", on PC that shortcut will be changed to "Ctrl-B". Conversely, a shortcut defined as "Ctrl-B" will become "Cmd-B" for Mac users.

The list of actions that can be bound is the same as the list of built-in actions available for toolbar buttons.

Advanced use

Event handling

You can catch the following list of events: https://codemirror.net/doc/manual.html#events

const easyMDE = new EasyMDE();
easyMDE.codemirror.on("change", () => {
    console.log(easyMDE.value());
});

Removing EasyMDE from text area

You can revert to the initial text area by calling the toTextArea method. Note that this clears up the autosave (if enabled) associated with it. The text area will retain any text from the destroyed EasyMDE instance.

const easyMDE = new EasyMDE();
// ...
easyMDE.toTextArea();
easyMDE = null;

If you need to remove registered event listeners (when the editor is not needed anymore), call easyMDE.cleanup().

Useful methods

The following self-explanatory methods may be of use while developing with EasyMDE.

const easyMDE = new EasyMDE();
easyMDE.isPreviewActive(); // returns boolean
easyMDE.isSideBySideActive(); // returns boolean
easyMDE.isFullscreenActive(); // returns boolean
easyMDE.clearAutosavedValue(); // no returned value

How it works

EasyMDE is a continuation of SimpleMDE.

SimpleMDE began as an improvement of lepture's Editor project, but has now taken on an identity of its own. It is bundled with CodeMirror and depends on Font Awesome.

CodeMirror is the backbone of the project and parses much of the Markdown syntax as it's being written. This allows us to add styles to the Markdown that's being written. Additionally, a toolbar and status bar have been added to the top and bottom, respectively. Previews are rendered by Marked using GitHub Flavored Markdown (GFM).

SimpleMDE fork

I originally made this fork to implement FontAwesome 5 compatibility into SimpleMDE. When that was done I submitted a pull request, which has not been accepted yet. This, and the project being inactive since May 2017, triggered me to make more changes and try to put new life into the project.

Changes include:

  • FontAwesome 5 compatibility
  • Guide button works when editor is in preview mode
  • Links are now https:// by default
  • Small styling changes
  • Support for Node 8 and beyond
  • Lots of refactored code
  • Links in preview will open in a new tab by default
  • TypeScript support

My intention is to continue development on this project, improving it and keeping it alive.

Hacking EasyMDE

You may want to edit this library to adapt its behavior to your needs. This can be done in some quick steps:

  1. Follow the prerequisites and installation instructions in the contribution guide;
  2. Do your changes;
  3. Run gulp command, which will generate files: dist/easymde.min.css and dist/easymde.min.js;
  4. Copy-paste those files to your code base, and you are done.

Contributing

Want to contribute to EasyMDE? Thank you! We have a contribution guide just for you!


Author: Ionaru
Source Code: https://github.com/Ionaru/easy-markdown-editor
License: MIT license

#react-native #react 

A Plugin for D3.js That Allows You to Easy Use Context-menus

d3-context-menu

This is a plugin for d3.js that allows you to easy use context-menus in your visualizations. It's 100% d3 based and done in the "d3 way", so you don't need to worry about including additional frameworks.

Install with Bower

bower install d3-context-menu

Basic usage:

// Define your menu
var menu = [
    {
        title: 'Item #1',
        action: function(d) {
            console.log('Item #1 clicked!');
            console.log('The data for this circle is: ' + d);
        },
        disabled: false // optional, defaults to false
    },
    {
        title: 'Item #2',
        action: function(d) {
            console.log('You have clicked the second item!');
            console.log('The data for this circle is: ' + d);
        }
    }
]

var data = [1, 2, 3];

var g = d3.select('body').append('svg')
    .attr('width', 200)
    .attr('height', 400)
    .append('g');

g.selectAll('circles')
    .data(data)
    .enter()
    .append('circle')
    .attr('r', 30)
    .attr('fill', 'steelblue')
    .attr('cx', function(d) {
        return 100;
    })
    .attr('cy', function(d) {
        return d * 100;
    })
    .on('contextmenu', d3.contextMenu(menu)); // attach menu to element
});

Advanced usage:

Headers and Dividers

Menus can have Headers and Dividers. To specify a header simply don't define an "action" property. To specify a divider, simply add a "divider: true" property to the menu item, and it'll be considered a divider. Example menu definition:

var menu = [
    {
        title: 'Header',
    },
    {
        title: 'Normal item',
        action: function() {}
    },
    {
        divider: true
    },
    {
        title: 'Last item',
        action: function() {}
    }
];

Nested Menu

Menus can have Nested Menu. To specify a nested menu, simply add "children" property. Children has item of array.

var menu = [
    {
        title: 'Parent',
        children: [
            {
                title: 'Child',
                children: [
                    {
                        // header
                        title: 'Grand-Child1'
                    },
                    {
                        // normal
                        title: 'Grand-Child2',
                        action: function() {}
                    },
                    {
                        // divider
                        divider: true
                    },
                    {
                        // disable
                        title: 'Grand-Child3',
                        action: function() {}
                    }
                ]
            }
        ]
    },
];

See the index.htm file in the example folder to see this in action.

Pre-show callback

You can pass in a callback that will be executed before the context menu appears. This can be useful if you need something to close tooltips or perform some other task before the menu appears:

    ...
    .on('contextmenu', d3.contextMenu(menu, function() {
        console.log('Quick! Before the menu appears!');
    })); // attach menu to element

Post-show callback

You can pass in a callback that will be executed after the context menu appears using the onClose option:

    ...
    .on('contextmenu', d3.contextMenu(menu, {
        onOpen: function() {
            console.log('Quick! Before the menu appears!');
        },
        onClose: function() {
            console.log('Menu has been closed.');
        }
    })); // attach menu to element

Context-sensitive menu items

You can use information from your context in menu names, simply specify a function for title which returns a string:

var menu = [
    {
        title: function(d) {
            return 'Delete circle '+d.circleName;
        },
        action: function(d) {
            // delete it
        }
    },
    {
        title: function(d) {
            return 'Item 2';
        },
        action: function(d) {
            // do nothing interesting
        }
    }
];

// Menu shown is:

[Delete Circle MyCircle]
[Item 2]

Dynamic menu list

You can also have different lists of menu items for different nodes if menu is a function:

var menu = function(data) {
    if (data.x > 100) {
        return [{
            title: 'Item #1',
            action: function(d) {
                console.log('Item #1 clicked!');
                console.log('The data for this circle is: ' + d);
            }
        }];
    } else {
        return [{
            title: 'Item #1',
            action: function(d) {
                console.log('Item #1 clicked!');
                console.log('The data for this circle is: ' + d);
            }
        }, {
            title: 'Item #2',
            action: function(d) {
                console.log('Item #2 clicked!');
                console.log('The data for this circle is: ' + d);
            }
        }];
    }
};

// Menu shown for nodes with x < 100 contains 1 item, while other nodes have 2 menu items

Deleting Nodes Example

The following example shows how to add a right click menu to a tree diagram:

http://plnkr.co/edit/bDBe0xGX1mCLzqYGOqOS?p=info

Explicitly set menu position

Default position can be overwritten by providing a position option (either object or function returning an object):

    ...
    .on('contextmenu', d3.contextMenu(menu, {
        onOpen: function() {
            ...
        },
        onClose: function() {
            ...
        },
        position: {
            top: 100,
            left: 200
        }
    })); // attach menu to element

or

    ...
    .on('contextmenu', d3.contextMenu(menu, {
        onOpen: function() {
            ...
        },
        onClose: function() {
            ...
        },
        position: function(d) {
            var elm = this;
            var bounds = elm.getBoundingClientRect();

            // eg. align bottom-left
            return {
                top: bounds.top + bounds.height,
                left: bounds.left
            }
        }
    })); // attach menu to element

Set your own CSS class as theme (make sure to style it)

d3.contextMenu(menu, {
    ...
    theme: 'my-awesome-theme'
});

or

d3.contextMenu(menu, {
    ...
    theme: function () {
        if (foo) {
            return 'my-foo-theme';
        }
        else {
            return 'my-awesome-theme';
        }
    }
});

Close the context menu programatically (can be used as cleanup, as well)

d3.contextMenu('close');

The following example shows how to add a right click menu to a tree diagram:

http://plnkr.co/edit/bDBe0xGX1mCLzqYGOqOS?p=info

Additional callback arguments

Depending on the D3 library version used the callback functions can provide an additional argument:

  • for D3 6.x or above it will be the event, since the global d3.event is not available.
var menu = [
    {
        title: 'Item #1',
        action: function(d, event) {
            console.log('Item #1 clicked!');
            console.log('The data for this circle is: ' + d);
            console.log('The event is: ' + event);
        }
    }
]
  • for D3 5.x or below it will be the index, for backward compatibility reasons.
var menu = [
    {
        title: 'Item #1',
        action: function(d, index) {
            console.log('Item #1 clicked!');
            console.log('The data for this circle is: ' + d);
            console.log('The index is: ' + index);
        }
    }
]

What's new in version 2.1.0

  • Added support for accessing event information in with D3 6.x.

What's new in version 2.0.0

  • Added support for D3 6.x
  • The index parameter of callbacks are undefined when using D3 6.x or above. See the index.htm file in the example folder to see how to get the proper index value in that case.
  • Added class property for menu items that allows specifying CSS classes (see: https://github.com/patorjk/d3-context-menu/pull/56).

What's new in version 1.1.2

  • Menu updated so it wont go off bottom or right of screen when window is smaller.

What's new in version 1.1.1

  • Menu close bug fix.

What's new in version 1.1.0

  • Nested submenus are now supported.

What's new in version 1.0.1

  • Default theme styles extracted to their own CSS class (d3-context-menu-theme)
  • Ability to specify own theme css class via the theme configuration option (as string or function returning string)
  • onOpen/onClose callbacks now have consistent signature (they receive data and index, and this argument refers to the DOM element the context menu is related to)
  • all other functions (eg. position, menu) have the same signature and this object as onClose/onOpen
  • Context menu now closes on mousedown outside of the menu, instead of click outside (to mimic behaviour of the native context menu)
  • disabled and divider can now be functions as well and have the same signature and this object as explained above
  • Close the context menu programatically using d3.contextMenu('close');

What's new in version 0.2.1

  • Ability to set menu position
  • Minified css and js versions

What's new in version 0.1.3

  • Fixed issue where context menu element is never removed from DOM
  • Fixed issue where <body> click event is never removed
  • Fixed issue where the incorrect onClose callback was called when menu was closed as a result of clicking outside

What's new in version 0.1.2

  • If contextmenu is clicked twice it will close rather than open the browser's context menu.

What's new in version 0.1.1

  • Header and Divider items.
  • Ability to disable items.

It's written to be very light weight and customizable. You can see it in action here:

http://plnkr.co/edit/hAx36JQhb0RsvVn7TomS?p=info

Author: Patorjk
Source Code: https://github.com/patorjk/d3-context-menu 
License: MIT license

#javascript #d3 #menu #visualization