Getting Started with Svelte Store: Tweened() Svelte Store ( Part 1 )

Learn Getting Started with Svelte Store: Tweened() Svelte Store ( Part 1 )

#svelte 
 

What is GEEK

Buddha Community

Getting Started with Svelte Store: Tweened() Svelte Store ( Part 1 )
Veronica  Roob

Veronica Roob

1653475560

A Pure PHP Implementation Of The MessagePack Serialization Format

msgpack.php

A pure PHP implementation of the MessagePack serialization format.

Features

Installation

The recommended way to install the library is through Composer:

composer require rybakit/msgpack

Usage

Packing

To pack values you can either use an instance of a Packer:

$packer = new Packer();
$packed = $packer->pack($value);

or call a static method on the MessagePack class:

$packed = MessagePack::pack($value);

In the examples above, the method pack automatically packs a value depending on its type. However, not all PHP types can be uniquely translated to MessagePack types. For example, the MessagePack format defines map and array types, which are represented by a single array type in PHP. By default, the packer will pack a PHP array as a MessagePack array if it has sequential numeric keys, starting from 0 and as a MessagePack map otherwise:

$mpArr1 = $packer->pack([1, 2]);               // MP array [1, 2]
$mpArr2 = $packer->pack([0 => 1, 1 => 2]);     // MP array [1, 2]
$mpMap1 = $packer->pack([0 => 1, 2 => 3]);     // MP map {0: 1, 2: 3}
$mpMap2 = $packer->pack([1 => 2, 2 => 3]);     // MP map {1: 2, 2: 3}
$mpMap3 = $packer->pack(['a' => 1, 'b' => 2]); // MP map {a: 1, b: 2}

However, sometimes you need to pack a sequential array as a MessagePack map. To do this, use the packMap method:

$mpMap = $packer->packMap([1, 2]); // {0: 1, 1: 2}

Here is a list of type-specific packing methods:

$packer->packNil();           // MP nil
$packer->packBool(true);      // MP bool
$packer->packInt(42);         // MP int
$packer->packFloat(M_PI);     // MP float (32 or 64)
$packer->packFloat32(M_PI);   // MP float 32
$packer->packFloat64(M_PI);   // MP float 64
$packer->packStr('foo');      // MP str
$packer->packBin("\x80");     // MP bin
$packer->packArray([1, 2]);   // MP array
$packer->packMap(['a' => 1]); // MP map
$packer->packExt(1, "\xaa");  // MP ext

Check the "Custom types" section below on how to pack custom types.

Packing options

The Packer object supports a number of bitmask-based options for fine-tuning the packing process (defaults are in bold):

NameDescription
FORCE_STRForces PHP strings to be packed as MessagePack UTF-8 strings
FORCE_BINForces PHP strings to be packed as MessagePack binary data
DETECT_STR_BINDetects MessagePack str/bin type automatically
  
FORCE_ARRForces PHP arrays to be packed as MessagePack arrays
FORCE_MAPForces PHP arrays to be packed as MessagePack maps
DETECT_ARR_MAPDetects MessagePack array/map type automatically
  
FORCE_FLOAT32Forces PHP floats to be packed as 32-bits MessagePack floats
FORCE_FLOAT64Forces PHP floats to be packed as 64-bits MessagePack floats

The type detection mode (DETECT_STR_BIN/DETECT_ARR_MAP) adds some overhead which can be noticed when you pack large (16- and 32-bit) arrays or strings. However, if you know the value type in advance (for example, you only work with UTF-8 strings or/and associative arrays), you can eliminate this overhead by forcing the packer to use the appropriate type, which will save it from running the auto-detection routine. Another option is to explicitly specify the value type. The library provides 2 auxiliary classes for this, Map and Bin. Check the "Custom types" section below for details.

Examples:

// detect str/bin type and pack PHP 64-bit floats (doubles) to MP 32-bit floats
$packer = new Packer(PackOptions::DETECT_STR_BIN | PackOptions::FORCE_FLOAT32);

// these will throw MessagePack\Exception\InvalidOptionException
$packer = new Packer(PackOptions::FORCE_STR | PackOptions::FORCE_BIN);
$packer = new Packer(PackOptions::FORCE_FLOAT32 | PackOptions::FORCE_FLOAT64);

Unpacking

To unpack data you can either use an instance of a BufferUnpacker:

$unpacker = new BufferUnpacker();

$unpacker->reset($packed);
$value = $unpacker->unpack();

or call a static method on the MessagePack class:

$value = MessagePack::unpack($packed);

If the packed data is received in chunks (e.g. when reading from a stream), use the tryUnpack method, which attempts to unpack data and returns an array of unpacked messages (if any) instead of throwing an InsufficientDataException:

while ($chunk = ...) {
    $unpacker->append($chunk);
    if ($messages = $unpacker->tryUnpack()) {
        return $messages;
    }
}

If you want to unpack from a specific position in a buffer, use seek:

$unpacker->seek(42); // set position equal to 42 bytes
$unpacker->seek(-8); // set position to 8 bytes before the end of the buffer

To skip bytes from the current position, use skip:

$unpacker->skip(10); // set position to 10 bytes ahead of the current position

To get the number of remaining (unread) bytes in the buffer:

$unreadBytesCount = $unpacker->getRemainingCount();

To check whether the buffer has unread data:

$hasUnreadBytes = $unpacker->hasRemaining();

If needed, you can remove already read data from the buffer by calling:

$releasedBytesCount = $unpacker->release();

With the read method you can read raw (packed) data:

$packedData = $unpacker->read(2); // read 2 bytes

Besides the above methods BufferUnpacker provides type-specific unpacking methods, namely:

$unpacker->unpackNil();   // PHP null
$unpacker->unpackBool();  // PHP bool
$unpacker->unpackInt();   // PHP int
$unpacker->unpackFloat(); // PHP float
$unpacker->unpackStr();   // PHP UTF-8 string
$unpacker->unpackBin();   // PHP binary string
$unpacker->unpackArray(); // PHP sequential array
$unpacker->unpackMap();   // PHP associative array
$unpacker->unpackExt();   // PHP MessagePack\Type\Ext object

Unpacking options

The BufferUnpacker object supports a number of bitmask-based options for fine-tuning the unpacking process (defaults are in bold):

NameDescription
BIGINT_AS_STRConverts overflowed integers to strings [1]
BIGINT_AS_GMPConverts overflowed integers to GMP objects [2]
BIGINT_AS_DECConverts overflowed integers to Decimal\Decimal objects [3]

1. The binary MessagePack format has unsigned 64-bit as its largest integer data type, but PHP does not support such integers, which means that an overflow can occur during unpacking.

2. Make sure the GMP extension is enabled.

3. Make sure the Decimal extension is enabled.

Examples:

$packedUint64 = "\xcf"."\xff\xff\xff\xff"."\xff\xff\xff\xff";

$unpacker = new BufferUnpacker($packedUint64);
var_dump($unpacker->unpack()); // string(20) "18446744073709551615"

$unpacker = new BufferUnpacker($packedUint64, UnpackOptions::BIGINT_AS_GMP);
var_dump($unpacker->unpack()); // object(GMP) {...}

$unpacker = new BufferUnpacker($packedUint64, UnpackOptions::BIGINT_AS_DEC);
var_dump($unpacker->unpack()); // object(Decimal\Decimal) {...}

Custom types

In addition to the basic types, the library provides functionality to serialize and deserialize arbitrary types. This can be done in several ways, depending on your use case. Let's take a look at them.

Type objects

If you need to serialize an instance of one of your classes into one of the basic MessagePack types, the best way to do this is to implement the CanBePacked interface in the class. A good example of such a class is the Map type class that comes with the library. This type is useful when you want to explicitly specify that a given PHP array should be packed as a MessagePack map without triggering an automatic type detection routine:

$packer = new Packer();

$packedMap = $packer->pack(new Map([1, 2, 3]));
$packedArray = $packer->pack([1, 2, 3]);

More type examples can be found in the src/Type directory.

Type transformers

As with type objects, type transformers are only responsible for serializing values. They should be used when you need to serialize a value that does not implement the CanBePacked interface. Examples of such values could be instances of built-in or third-party classes that you don't own, or non-objects such as resources.

A transformer class must implement the CanPack interface. To use a transformer, it must first be registered in the packer. Here is an example of how to serialize PHP streams into the MessagePack bin format type using one of the supplied transformers, StreamTransformer:

$packer = new Packer(null, [new StreamTransformer()]);

$packedBin = $packer->pack(fopen('/path/to/file', 'r+'));

More type transformer examples can be found in the src/TypeTransformer directory.

Extensions

In contrast to the cases described above, extensions are intended to handle extension types and are responsible for both serialization and deserialization of values (types).

An extension class must implement the Extension interface. To use an extension, it must first be registered in the packer and the unpacker.

The MessagePack specification divides extension types into two groups: predefined and application-specific. Currently, there is only one predefined type in the specification, Timestamp.

Timestamp

The Timestamp extension type is a predefined type. Support for this type in the library is done through the TimestampExtension class. This class is responsible for handling Timestamp objects, which represent the number of seconds and optional adjustment in nanoseconds:

$timestampExtension = new TimestampExtension();

$packer = new Packer();
$packer = $packer->extendWith($timestampExtension);

$unpacker = new BufferUnpacker();
$unpacker = $unpacker->extendWith($timestampExtension);

$packedTimestamp = $packer->pack(Timestamp::now());
$timestamp = $unpacker->reset($packedTimestamp)->unpack();

$seconds = $timestamp->getSeconds();
$nanoseconds = $timestamp->getNanoseconds();

When using the MessagePack class, the Timestamp extension is already registered:

$packedTimestamp = MessagePack::pack(Timestamp::now());
$timestamp = MessagePack::unpack($packedTimestamp);

Application-specific extensions

In addition, the format can be extended with your own types. For example, to make the built-in PHP DateTime objects first-class citizens in your code, you can create a corresponding extension, as shown in the example. Please note, that custom extensions have to be registered with a unique extension ID (an integer from 0 to 127).

More extension examples can be found in the examples/MessagePack directory.

To learn more about how extension types can be useful, check out this article.

Exceptions

If an error occurs during packing/unpacking, a PackingFailedException or an UnpackingFailedException will be thrown, respectively. In addition, an InsufficientDataException can be thrown during unpacking.

An InvalidOptionException will be thrown in case an invalid option (or a combination of mutually exclusive options) is used.

Tests

Run tests as follows:

vendor/bin/phpunit

Also, if you already have Docker installed, you can run the tests in a docker container. First, create a container:

./dockerfile.sh | docker build -t msgpack -

The command above will create a container named msgpack with PHP 8.1 runtime. You may change the default runtime by defining the PHP_IMAGE environment variable:

PHP_IMAGE='php:8.0-cli' ./dockerfile.sh | docker build -t msgpack -

See a list of various images here.

Then run the unit tests:

docker run --rm -v $PWD:/msgpack -w /msgpack msgpack

Fuzzing

To ensure that the unpacking works correctly with malformed/semi-malformed data, you can use a testing technique called Fuzzing. The library ships with a help file (target) for PHP-Fuzzer and can be used as follows:

php-fuzzer fuzz tests/fuzz_buffer_unpacker.php

Performance

To check performance, run:

php -n -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

=============================================
Test/Target            Packer  BufferUnpacker
---------------------------------------------
nil .................. 0.0030 ........ 0.0139
false ................ 0.0037 ........ 0.0144
true ................. 0.0040 ........ 0.0137
7-bit uint #1 ........ 0.0052 ........ 0.0120
7-bit uint #2 ........ 0.0059 ........ 0.0114
7-bit uint #3 ........ 0.0061 ........ 0.0119
5-bit sint #1 ........ 0.0067 ........ 0.0126
5-bit sint #2 ........ 0.0064 ........ 0.0132
5-bit sint #3 ........ 0.0066 ........ 0.0135
8-bit uint #1 ........ 0.0078 ........ 0.0200
8-bit uint #2 ........ 0.0077 ........ 0.0212
8-bit uint #3 ........ 0.0086 ........ 0.0203
16-bit uint #1 ....... 0.0111 ........ 0.0271
16-bit uint #2 ....... 0.0115 ........ 0.0260
16-bit uint #3 ....... 0.0103 ........ 0.0273
32-bit uint #1 ....... 0.0116 ........ 0.0326
32-bit uint #2 ....... 0.0118 ........ 0.0332
32-bit uint #3 ....... 0.0127 ........ 0.0325
64-bit uint #1 ....... 0.0140 ........ 0.0277
64-bit uint #2 ....... 0.0134 ........ 0.0294
64-bit uint #3 ....... 0.0134 ........ 0.0281
8-bit int #1 ......... 0.0086 ........ 0.0241
8-bit int #2 ......... 0.0089 ........ 0.0225
8-bit int #3 ......... 0.0085 ........ 0.0229
16-bit int #1 ........ 0.0118 ........ 0.0280
16-bit int #2 ........ 0.0121 ........ 0.0270
16-bit int #3 ........ 0.0109 ........ 0.0274
32-bit int #1 ........ 0.0128 ........ 0.0346
32-bit int #2 ........ 0.0118 ........ 0.0339
32-bit int #3 ........ 0.0135 ........ 0.0368
64-bit int #1 ........ 0.0138 ........ 0.0276
64-bit int #2 ........ 0.0132 ........ 0.0286
64-bit int #3 ........ 0.0137 ........ 0.0274
64-bit int #4 ........ 0.0180 ........ 0.0285
64-bit float #1 ...... 0.0134 ........ 0.0284
64-bit float #2 ...... 0.0125 ........ 0.0275
64-bit float #3 ...... 0.0126 ........ 0.0283
fix string #1 ........ 0.0035 ........ 0.0133
fix string #2 ........ 0.0094 ........ 0.0216
fix string #3 ........ 0.0094 ........ 0.0222
fix string #4 ........ 0.0091 ........ 0.0241
8-bit string #1 ...... 0.0122 ........ 0.0301
8-bit string #2 ...... 0.0118 ........ 0.0304
8-bit string #3 ...... 0.0119 ........ 0.0315
16-bit string #1 ..... 0.0150 ........ 0.0388
16-bit string #2 ..... 0.1545 ........ 0.1665
32-bit string ........ 0.1570 ........ 0.1756
wide char string #1 .. 0.0091 ........ 0.0236
wide char string #2 .. 0.0122 ........ 0.0313
8-bit binary #1 ...... 0.0100 ........ 0.0302
8-bit binary #2 ...... 0.0123 ........ 0.0324
8-bit binary #3 ...... 0.0126 ........ 0.0327
16-bit binary ........ 0.0168 ........ 0.0372
32-bit binary ........ 0.1588 ........ 0.1754
fix array #1 ......... 0.0042 ........ 0.0131
fix array #2 ......... 0.0294 ........ 0.0367
fix array #3 ......... 0.0412 ........ 0.0472
16-bit array #1 ...... 0.1378 ........ 0.1596
16-bit array #2 ........... S ............. S
32-bit array .............. S ............. S
complex array ........ 0.1865 ........ 0.2283
fix map #1 ........... 0.0725 ........ 0.1048
fix map #2 ........... 0.0319 ........ 0.0405
fix map #3 ........... 0.0356 ........ 0.0665
fix map #4 ........... 0.0465 ........ 0.0497
16-bit map #1 ........ 0.2540 ........ 0.3028
16-bit map #2 ............. S ............. S
32-bit map ................ S ............. S
complex map .......... 0.2372 ........ 0.2710
fixext 1 ............. 0.0283 ........ 0.0358
fixext 2 ............. 0.0291 ........ 0.0371
fixext 4 ............. 0.0302 ........ 0.0355
fixext 8 ............. 0.0288 ........ 0.0384
fixext 16 ............ 0.0293 ........ 0.0359
8-bit ext ............ 0.0302 ........ 0.0439
16-bit ext ........... 0.0334 ........ 0.0499
32-bit ext ........... 0.1845 ........ 0.1888
32-bit timestamp #1 .. 0.0337 ........ 0.0547
32-bit timestamp #2 .. 0.0335 ........ 0.0560
64-bit timestamp #1 .. 0.0371 ........ 0.0575
64-bit timestamp #2 .. 0.0374 ........ 0.0542
64-bit timestamp #3 .. 0.0356 ........ 0.0533
96-bit timestamp #1 .. 0.0362 ........ 0.0699
96-bit timestamp #2 .. 0.0381 ........ 0.0701
96-bit timestamp #3 .. 0.0367 ........ 0.0687
=============================================
Total                  2.7618          4.0820
Skipped                     4               4
Failed                      0               0
Ignored                     0               0

With JIT:

php -n -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.jit_buffer_size=64M -dopcache.jit=tracing -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

=============================================
Test/Target            Packer  BufferUnpacker
---------------------------------------------
nil .................. 0.0005 ........ 0.0054
false ................ 0.0004 ........ 0.0059
true ................. 0.0004 ........ 0.0059
7-bit uint #1 ........ 0.0010 ........ 0.0047
7-bit uint #2 ........ 0.0010 ........ 0.0046
7-bit uint #3 ........ 0.0010 ........ 0.0046
5-bit sint #1 ........ 0.0025 ........ 0.0046
5-bit sint #2 ........ 0.0023 ........ 0.0046
5-bit sint #3 ........ 0.0024 ........ 0.0045
8-bit uint #1 ........ 0.0043 ........ 0.0081
8-bit uint #2 ........ 0.0043 ........ 0.0079
8-bit uint #3 ........ 0.0041 ........ 0.0080
16-bit uint #1 ....... 0.0064 ........ 0.0095
16-bit uint #2 ....... 0.0064 ........ 0.0091
16-bit uint #3 ....... 0.0064 ........ 0.0094
32-bit uint #1 ....... 0.0085 ........ 0.0114
32-bit uint #2 ....... 0.0077 ........ 0.0122
32-bit uint #3 ....... 0.0077 ........ 0.0120
64-bit uint #1 ....... 0.0085 ........ 0.0159
64-bit uint #2 ....... 0.0086 ........ 0.0157
64-bit uint #3 ....... 0.0086 ........ 0.0158
8-bit int #1 ......... 0.0042 ........ 0.0080
8-bit int #2 ......... 0.0042 ........ 0.0080
8-bit int #3 ......... 0.0042 ........ 0.0081
16-bit int #1 ........ 0.0065 ........ 0.0095
16-bit int #2 ........ 0.0065 ........ 0.0090
16-bit int #3 ........ 0.0056 ........ 0.0085
32-bit int #1 ........ 0.0067 ........ 0.0107
32-bit int #2 ........ 0.0066 ........ 0.0106
32-bit int #3 ........ 0.0063 ........ 0.0104
64-bit int #1 ........ 0.0072 ........ 0.0162
64-bit int #2 ........ 0.0073 ........ 0.0174
64-bit int #3 ........ 0.0072 ........ 0.0164
64-bit int #4 ........ 0.0077 ........ 0.0161
64-bit float #1 ...... 0.0053 ........ 0.0135
64-bit float #2 ...... 0.0053 ........ 0.0135
64-bit float #3 ...... 0.0052 ........ 0.0135
fix string #1 ....... -0.0002 ........ 0.0044
fix string #2 ........ 0.0035 ........ 0.0067
fix string #3 ........ 0.0035 ........ 0.0077
fix string #4 ........ 0.0033 ........ 0.0078
8-bit string #1 ...... 0.0059 ........ 0.0110
8-bit string #2 ...... 0.0063 ........ 0.0121
8-bit string #3 ...... 0.0064 ........ 0.0124
16-bit string #1 ..... 0.0099 ........ 0.0146
16-bit string #2 ..... 0.1522 ........ 0.1474
32-bit string ........ 0.1511 ........ 0.1483
wide char string #1 .. 0.0039 ........ 0.0084
wide char string #2 .. 0.0073 ........ 0.0123
8-bit binary #1 ...... 0.0040 ........ 0.0112
8-bit binary #2 ...... 0.0075 ........ 0.0123
8-bit binary #3 ...... 0.0077 ........ 0.0129
16-bit binary ........ 0.0096 ........ 0.0145
32-bit binary ........ 0.1535 ........ 0.1479
fix array #1 ......... 0.0008 ........ 0.0061
fix array #2 ......... 0.0121 ........ 0.0165
fix array #3 ......... 0.0193 ........ 0.0222
16-bit array #1 ...... 0.0607 ........ 0.0479
16-bit array #2 ........... S ............. S
32-bit array .............. S ............. S
complex array ........ 0.0749 ........ 0.0824
fix map #1 ........... 0.0329 ........ 0.0431
fix map #2 ........... 0.0161 ........ 0.0189
fix map #3 ........... 0.0205 ........ 0.0262
fix map #4 ........... 0.0252 ........ 0.0205
16-bit map #1 ........ 0.1016 ........ 0.0927
16-bit map #2 ............. S ............. S
32-bit map ................ S ............. S
complex map .......... 0.1096 ........ 0.1030
fixext 1 ............. 0.0157 ........ 0.0161
fixext 2 ............. 0.0175 ........ 0.0183
fixext 4 ............. 0.0156 ........ 0.0185
fixext 8 ............. 0.0163 ........ 0.0184
fixext 16 ............ 0.0164 ........ 0.0182
8-bit ext ............ 0.0158 ........ 0.0207
16-bit ext ........... 0.0203 ........ 0.0219
32-bit ext ........... 0.1614 ........ 0.1539
32-bit timestamp #1 .. 0.0195 ........ 0.0249
32-bit timestamp #2 .. 0.0188 ........ 0.0260
64-bit timestamp #1 .. 0.0207 ........ 0.0281
64-bit timestamp #2 .. 0.0212 ........ 0.0291
64-bit timestamp #3 .. 0.0207 ........ 0.0295
96-bit timestamp #1 .. 0.0222 ........ 0.0358
96-bit timestamp #2 .. 0.0228 ........ 0.0353
96-bit timestamp #3 .. 0.0210 ........ 0.0319
=============================================
Total                  1.6432          1.9674
Skipped                     4               4
Failed                      0               0
Ignored                     0               0

You may change default benchmark settings by defining the following environment variables:

NameDefault
MP_BENCH_TARGETSpure_p,pure_u, see a list of available targets
MP_BENCH_ITERATIONS100_000
MP_BENCH_DURATIONnot set
MP_BENCH_ROUNDS3
MP_BENCH_TESTS-@slow, see a list of available tests

For example:

export MP_BENCH_TARGETS=pure_p
export MP_BENCH_ITERATIONS=1000000
export MP_BENCH_ROUNDS=5
# a comma separated list of test names
export MP_BENCH_TESTS='complex array, complex map'
# or a group name
# export MP_BENCH_TESTS='-@slow' // @pecl_comp
# or a regexp
# export MP_BENCH_TESTS='/complex (array|map)/'

Another example, benchmarking both the library and the PECL extension:

MP_BENCH_TARGETS=pure_p,pure_u,pecl_p,pecl_u \
php -n -dextension=msgpack.so -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

===========================================================================
Test/Target            Packer  BufferUnpacker  msgpack_pack  msgpack_unpack
---------------------------------------------------------------------------
nil .................. 0.0031 ........ 0.0141 ...... 0.0055 ........ 0.0064
false ................ 0.0039 ........ 0.0154 ...... 0.0056 ........ 0.0053
true ................. 0.0038 ........ 0.0139 ...... 0.0056 ........ 0.0044
7-bit uint #1 ........ 0.0061 ........ 0.0110 ...... 0.0059 ........ 0.0046
7-bit uint #2 ........ 0.0065 ........ 0.0119 ...... 0.0042 ........ 0.0029
7-bit uint #3 ........ 0.0054 ........ 0.0117 ...... 0.0045 ........ 0.0025
5-bit sint #1 ........ 0.0047 ........ 0.0103 ...... 0.0038 ........ 0.0022
5-bit sint #2 ........ 0.0048 ........ 0.0117 ...... 0.0038 ........ 0.0022
5-bit sint #3 ........ 0.0046 ........ 0.0102 ...... 0.0038 ........ 0.0023
8-bit uint #1 ........ 0.0063 ........ 0.0174 ...... 0.0039 ........ 0.0031
8-bit uint #2 ........ 0.0063 ........ 0.0167 ...... 0.0040 ........ 0.0029
8-bit uint #3 ........ 0.0063 ........ 0.0168 ...... 0.0039 ........ 0.0030
16-bit uint #1 ....... 0.0092 ........ 0.0222 ...... 0.0049 ........ 0.0030
16-bit uint #2 ....... 0.0096 ........ 0.0227 ...... 0.0042 ........ 0.0046
16-bit uint #3 ....... 0.0123 ........ 0.0274 ...... 0.0059 ........ 0.0051
32-bit uint #1 ....... 0.0136 ........ 0.0331 ...... 0.0060 ........ 0.0048
32-bit uint #2 ....... 0.0130 ........ 0.0336 ...... 0.0070 ........ 0.0048
32-bit uint #3 ....... 0.0127 ........ 0.0329 ...... 0.0051 ........ 0.0048
64-bit uint #1 ....... 0.0126 ........ 0.0268 ...... 0.0055 ........ 0.0049
64-bit uint #2 ....... 0.0135 ........ 0.0281 ...... 0.0052 ........ 0.0046
64-bit uint #3 ....... 0.0131 ........ 0.0274 ...... 0.0069 ........ 0.0044
8-bit int #1 ......... 0.0077 ........ 0.0236 ...... 0.0058 ........ 0.0044
8-bit int #2 ......... 0.0087 ........ 0.0244 ...... 0.0058 ........ 0.0048
8-bit int #3 ......... 0.0084 ........ 0.0241 ...... 0.0055 ........ 0.0049
16-bit int #1 ........ 0.0112 ........ 0.0271 ...... 0.0048 ........ 0.0045
16-bit int #2 ........ 0.0124 ........ 0.0292 ...... 0.0057 ........ 0.0049
16-bit int #3 ........ 0.0118 ........ 0.0270 ...... 0.0058 ........ 0.0050
32-bit int #1 ........ 0.0137 ........ 0.0366 ...... 0.0058 ........ 0.0051
32-bit int #2 ........ 0.0133 ........ 0.0366 ...... 0.0056 ........ 0.0049
32-bit int #3 ........ 0.0129 ........ 0.0350 ...... 0.0052 ........ 0.0048
64-bit int #1 ........ 0.0145 ........ 0.0254 ...... 0.0034 ........ 0.0025
64-bit int #2 ........ 0.0097 ........ 0.0214 ...... 0.0034 ........ 0.0025
64-bit int #3 ........ 0.0096 ........ 0.0287 ...... 0.0059 ........ 0.0050
64-bit int #4 ........ 0.0143 ........ 0.0277 ...... 0.0059 ........ 0.0046
64-bit float #1 ...... 0.0134 ........ 0.0281 ...... 0.0057 ........ 0.0052
64-bit float #2 ...... 0.0141 ........ 0.0281 ...... 0.0057 ........ 0.0050
64-bit float #3 ...... 0.0144 ........ 0.0282 ...... 0.0057 ........ 0.0050
fix string #1 ........ 0.0036 ........ 0.0143 ...... 0.0066 ........ 0.0053
fix string #2 ........ 0.0107 ........ 0.0222 ...... 0.0065 ........ 0.0068
fix string #3 ........ 0.0116 ........ 0.0245 ...... 0.0063 ........ 0.0069
fix string #4 ........ 0.0105 ........ 0.0253 ...... 0.0083 ........ 0.0077
8-bit string #1 ...... 0.0126 ........ 0.0318 ...... 0.0075 ........ 0.0088
8-bit string #2 ...... 0.0121 ........ 0.0295 ...... 0.0076 ........ 0.0086
8-bit string #3 ...... 0.0125 ........ 0.0293 ...... 0.0130 ........ 0.0093
16-bit string #1 ..... 0.0159 ........ 0.0368 ...... 0.0117 ........ 0.0086
16-bit string #2 ..... 0.1547 ........ 0.1686 ...... 0.1516 ........ 0.1373
32-bit string ........ 0.1558 ........ 0.1729 ...... 0.1511 ........ 0.1396
wide char string #1 .. 0.0098 ........ 0.0237 ...... 0.0066 ........ 0.0065
wide char string #2 .. 0.0128 ........ 0.0291 ...... 0.0061 ........ 0.0082
8-bit binary #1 ........... I ............. I ........... F ............. I
8-bit binary #2 ........... I ............. I ........... F ............. I
8-bit binary #3 ........... I ............. I ........... F ............. I
16-bit binary ............. I ............. I ........... F ............. I
32-bit binary ............. I ............. I ........... F ............. I
fix array #1 ......... 0.0040 ........ 0.0129 ...... 0.0120 ........ 0.0058
fix array #2 ......... 0.0279 ........ 0.0390 ...... 0.0143 ........ 0.0165
fix array #3 ......... 0.0415 ........ 0.0463 ...... 0.0162 ........ 0.0187
16-bit array #1 ...... 0.1349 ........ 0.1628 ...... 0.0334 ........ 0.0341
16-bit array #2 ........... S ............. S ........... S ............. S
32-bit array .............. S ............. S ........... S ............. S
complex array ............. I ............. I ........... F ............. F
fix map #1 ................ I ............. I ........... F ............. I
fix map #2 ........... 0.0345 ........ 0.0391 ...... 0.0143 ........ 0.0168
fix map #3 ................ I ............. I ........... F ............. I
fix map #4 ........... 0.0459 ........ 0.0473 ...... 0.0151 ........ 0.0163
16-bit map #1 ........ 0.2518 ........ 0.2962 ...... 0.0400 ........ 0.0490
16-bit map #2 ............. S ............. S ........... S ............. S
32-bit map ................ S ............. S ........... S ............. S
complex map .......... 0.2380 ........ 0.2682 ...... 0.0545 ........ 0.0579
fixext 1 .................. I ............. I ........... F ............. F
fixext 2 .................. I ............. I ........... F ............. F
fixext 4 .................. I ............. I ........... F ............. F
fixext 8 .................. I ............. I ........... F ............. F
fixext 16 ................. I ............. I ........... F ............. F
8-bit ext ................. I ............. I ........... F ............. F
16-bit ext ................ I ............. I ........... F ............. F
32-bit ext ................ I ............. I ........... F ............. F
32-bit timestamp #1 ....... I ............. I ........... F ............. F
32-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #1 ....... I ............. I ........... F ............. F
64-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #3 ....... I ............. I ........... F ............. F
96-bit timestamp #1 ....... I ............. I ........... F ............. F
96-bit timestamp #2 ....... I ............. I ........... F ............. F
96-bit timestamp #3 ....... I ............. I ........... F ............. F
===========================================================================
Total                  1.5625          2.3866        0.7735          0.7243
Skipped                     4               4             4               4
Failed                      0               0            24              17
Ignored                    24              24             0               7

With JIT:

MP_BENCH_TARGETS=pure_p,pure_u,pecl_p,pecl_u \
php -n -dextension=msgpack.so -dzend_extension=opcache.so \
-dpcre.jit=1 -dopcache.jit_buffer_size=64M -dopcache.jit=tracing -dopcache.enable=1 -dopcache.enable_cli=1 \
tests/bench.php

Example output

Filter: MessagePack\Tests\Perf\Filter\ListFilter
Rounds: 3
Iterations: 100000

===========================================================================
Test/Target            Packer  BufferUnpacker  msgpack_pack  msgpack_unpack
---------------------------------------------------------------------------
nil .................. 0.0001 ........ 0.0052 ...... 0.0053 ........ 0.0042
false ................ 0.0007 ........ 0.0060 ...... 0.0057 ........ 0.0043
true ................. 0.0008 ........ 0.0060 ...... 0.0056 ........ 0.0041
7-bit uint #1 ........ 0.0031 ........ 0.0046 ...... 0.0062 ........ 0.0041
7-bit uint #2 ........ 0.0021 ........ 0.0043 ...... 0.0062 ........ 0.0041
7-bit uint #3 ........ 0.0022 ........ 0.0044 ...... 0.0061 ........ 0.0040
5-bit sint #1 ........ 0.0030 ........ 0.0048 ...... 0.0062 ........ 0.0040
5-bit sint #2 ........ 0.0032 ........ 0.0046 ...... 0.0062 ........ 0.0040
5-bit sint #3 ........ 0.0031 ........ 0.0046 ...... 0.0062 ........ 0.0040
8-bit uint #1 ........ 0.0054 ........ 0.0079 ...... 0.0062 ........ 0.0050
8-bit uint #2 ........ 0.0051 ........ 0.0079 ...... 0.0064 ........ 0.0044
8-bit uint #3 ........ 0.0051 ........ 0.0082 ...... 0.0062 ........ 0.0044
16-bit uint #1 ....... 0.0077 ........ 0.0094 ...... 0.0065 ........ 0.0045
16-bit uint #2 ....... 0.0077 ........ 0.0094 ...... 0.0063 ........ 0.0045
16-bit uint #3 ....... 0.0077 ........ 0.0095 ...... 0.0064 ........ 0.0047
32-bit uint #1 ....... 0.0088 ........ 0.0119 ...... 0.0063 ........ 0.0043
32-bit uint #2 ....... 0.0089 ........ 0.0117 ...... 0.0062 ........ 0.0039
32-bit uint #3 ....... 0.0089 ........ 0.0118 ...... 0.0063 ........ 0.0044
64-bit uint #1 ....... 0.0097 ........ 0.0155 ...... 0.0063 ........ 0.0045
64-bit uint #2 ....... 0.0095 ........ 0.0153 ...... 0.0061 ........ 0.0045
64-bit uint #3 ....... 0.0096 ........ 0.0156 ...... 0.0063 ........ 0.0047
8-bit int #1 ......... 0.0053 ........ 0.0083 ...... 0.0062 ........ 0.0044
8-bit int #2 ......... 0.0052 ........ 0.0080 ...... 0.0062 ........ 0.0044
8-bit int #3 ......... 0.0052 ........ 0.0080 ...... 0.0062 ........ 0.0043
16-bit int #1 ........ 0.0089 ........ 0.0097 ...... 0.0069 ........ 0.0046
16-bit int #2 ........ 0.0075 ........ 0.0093 ...... 0.0063 ........ 0.0043
16-bit int #3 ........ 0.0075 ........ 0.0094 ...... 0.0062 ........ 0.0046
32-bit int #1 ........ 0.0086 ........ 0.0122 ...... 0.0063 ........ 0.0044
32-bit int #2 ........ 0.0087 ........ 0.0120 ...... 0.0066 ........ 0.0046
32-bit int #3 ........ 0.0086 ........ 0.0121 ...... 0.0060 ........ 0.0044
64-bit int #1 ........ 0.0096 ........ 0.0149 ...... 0.0060 ........ 0.0045
64-bit int #2 ........ 0.0096 ........ 0.0157 ...... 0.0062 ........ 0.0044
64-bit int #3 ........ 0.0096 ........ 0.0160 ...... 0.0063 ........ 0.0046
64-bit int #4 ........ 0.0097 ........ 0.0157 ...... 0.0061 ........ 0.0044
64-bit float #1 ...... 0.0079 ........ 0.0153 ...... 0.0056 ........ 0.0044
64-bit float #2 ...... 0.0079 ........ 0.0152 ...... 0.0057 ........ 0.0045
64-bit float #3 ...... 0.0079 ........ 0.0155 ...... 0.0057 ........ 0.0044
fix string #1 ........ 0.0010 ........ 0.0045 ...... 0.0071 ........ 0.0044
fix string #2 ........ 0.0048 ........ 0.0075 ...... 0.0070 ........ 0.0060
fix string #3 ........ 0.0048 ........ 0.0086 ...... 0.0068 ........ 0.0060
fix string #4 ........ 0.0050 ........ 0.0088 ...... 0.0070 ........ 0.0059
8-bit string #1 ...... 0.0081 ........ 0.0129 ...... 0.0069 ........ 0.0062
8-bit string #2 ...... 0.0086 ........ 0.0128 ...... 0.0069 ........ 0.0065
8-bit string #3 ...... 0.0086 ........ 0.0126 ...... 0.0115 ........ 0.0065
16-bit string #1 ..... 0.0105 ........ 0.0137 ...... 0.0128 ........ 0.0068
16-bit string #2 ..... 0.1510 ........ 0.1486 ...... 0.1526 ........ 0.1391
32-bit string ........ 0.1517 ........ 0.1475 ...... 0.1504 ........ 0.1370
wide char string #1 .. 0.0044 ........ 0.0085 ...... 0.0067 ........ 0.0057
wide char string #2 .. 0.0081 ........ 0.0125 ...... 0.0069 ........ 0.0063
8-bit binary #1 ........... I ............. I ........... F ............. I
8-bit binary #2 ........... I ............. I ........... F ............. I
8-bit binary #3 ........... I ............. I ........... F ............. I
16-bit binary ............. I ............. I ........... F ............. I
32-bit binary ............. I ............. I ........... F ............. I
fix array #1 ......... 0.0014 ........ 0.0059 ...... 0.0132 ........ 0.0055
fix array #2 ......... 0.0146 ........ 0.0156 ...... 0.0155 ........ 0.0148
fix array #3 ......... 0.0211 ........ 0.0229 ...... 0.0179 ........ 0.0180
16-bit array #1 ...... 0.0673 ........ 0.0498 ...... 0.0343 ........ 0.0388
16-bit array #2 ........... S ............. S ........... S ............. S
32-bit array .............. S ............. S ........... S ............. S
complex array ............. I ............. I ........... F ............. F
fix map #1 ................ I ............. I ........... F ............. I
fix map #2 ........... 0.0148 ........ 0.0180 ...... 0.0156 ........ 0.0179
fix map #3 ................ I ............. I ........... F ............. I
fix map #4 ........... 0.0252 ........ 0.0201 ...... 0.0214 ........ 0.0167
16-bit map #1 ........ 0.1027 ........ 0.0836 ...... 0.0388 ........ 0.0510
16-bit map #2 ............. S ............. S ........... S ............. S
32-bit map ................ S ............. S ........... S ............. S
complex map .......... 0.1104 ........ 0.1010 ...... 0.0556 ........ 0.0602
fixext 1 .................. I ............. I ........... F ............. F
fixext 2 .................. I ............. I ........... F ............. F
fixext 4 .................. I ............. I ........... F ............. F
fixext 8 .................. I ............. I ........... F ............. F
fixext 16 ................. I ............. I ........... F ............. F
8-bit ext ................. I ............. I ........... F ............. F
16-bit ext ................ I ............. I ........... F ............. F
32-bit ext ................ I ............. I ........... F ............. F
32-bit timestamp #1 ....... I ............. I ........... F ............. F
32-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #1 ....... I ............. I ........... F ............. F
64-bit timestamp #2 ....... I ............. I ........... F ............. F
64-bit timestamp #3 ....... I ............. I ........... F ............. F
96-bit timestamp #1 ....... I ............. I ........... F ............. F
96-bit timestamp #2 ....... I ............. I ........... F ............. F
96-bit timestamp #3 ....... I ............. I ........... F ............. F
===========================================================================
Total                  0.9642          1.0909        0.8224          0.7213
Skipped                     4               4             4               4
Failed                      0               0            24              17
Ignored                    24              24             0               7

Note that the msgpack extension (v2.1.2) doesn't support ext, bin and UTF-8 str types.

License

The library is released under the MIT License. See the bundled LICENSE file for details.

Author: rybakit
Source Code: https://github.com/rybakit/msgpack.php
License: MIT License

#php 

Treebender: A Symbolic Natural Language Parsing Library for Rust

Treebender

A symbolic natural language parsing library for Rust, inspired by HDPSG.

What is this?

This is a library for parsing natural or constructed languages into syntax trees and feature structures. There's no machine learning or probabilistic models, everything is hand-crafted and deterministic.

You can find out more about the motivations of this project in this blog post.

But what are you using it for?

I'm using this to parse a constructed language for my upcoming xenolinguistics game, Themengi.

Motivation

Using a simple 80-line grammar, introduced in the tutorial below, we can parse a simple subset of English, checking reflexive pronoun binding, case, and number agreement.

$ cargo run --bin cli examples/reflexives.fgr
> she likes himself
Parsed 0 trees

> her likes herself
Parsed 0 trees

> she like herself
Parsed 0 trees

> she likes herself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: she))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: herself)))
[
  child-2: [
    case: acc
    pron: ref
    needs_pron: #0 she
    num: sg
    child-0: [ word: herself ]
  ]
  child-1: [
    tense: nonpast
    child-0: [ word: likes ]
    num: #1 sg
  ]
  child-0: [
    child-0: [ word: she ]
    case: nom
    pron: #0
    num: #1
  ]
]

Low resource language? Low problem! No need to train on gigabytes of text, just write a grammar using your brain. Let's hypothesize that in American Sign Language, topicalized nouns (expressed with raised eyebrows) must appear first in the sentence. We can write a small grammar (18 lines), and plug in some sentences:

$ cargo run --bin cli examples/asl-wordorder.fgr -n
> boy sit
Parsed 1 tree
(0..2: S
  (0..1: NP ((0..1: N (0..1: boy))))
  (1..2: IV (1..2: sit)))

> boy throw ball
Parsed 1 tree
(0..3: S
  (0..1: NP ((0..1: N (0..1: boy))))
  (1..2: TV (1..2: throw))
  (2..3: NP ((2..3: N (2..3: ball)))))

> ball nm-raised-eyebrows boy throw
Parsed 1 tree
(0..4: S
  (0..2: NP
    (0..1: N (0..1: ball))
    (1..2: Topic (1..2: nm-raised-eyebrows)))
  (2..3: NP ((2..3: N (2..3: boy))))
  (3..4: TV (3..4: throw)))

> boy throw ball nm-raised-eyebrows
Parsed 0 trees

Tutorial

As an example, let's say we want to build a parser for English reflexive pronouns (himself, herself, themselves, themself, itself). We'll also support number ("He likes X" v.s. "They like X") and simple embedded clauses ("He said that they like X").

Grammar files are written in a custom language, similar to BNF, called Feature GRammar (.fgr). There's a VSCode syntax highlighting extension for these files available as fgr-syntax.

We'll start by defining our lexicon. The lexicon is the set of terminal symbols (symbols in the actual input) that the grammar will match. Terminal symbols must start with a lowercase letter, and non-terminal symbols must start with an uppercase letter.

// pronouns
N -> he
N -> him
N -> himself
N -> she
N -> her
N -> herself
N -> they
N -> them
N -> themselves
N -> themself

// names, lowercase as they are terminals
N -> mary
N -> sue
N -> takeshi
N -> robert

// complementizer
Comp -> that

// verbs -- intransitive, transitive, and clausal
IV -> falls
IV -> fall
IV -> fell

TV -> likes
TV -> like
TV -> liked

CV -> says
CV -> say
CV -> said

Next, we can add our sentence rules (they must be added at the top, as the first rule in the file is assumed to be the top-level rule):

// sentence rules
S -> N IV
S -> N TV N
S -> N CV Comp S

// ... previous lexicon ...

Assuming this file is saved as examples/no-features.fgr (which it is :wink:), we can test this file with the built-in CLI:

$ cargo run --bin cli examples/no-features.fgr
> he falls
Parsed 1 tree
(0..2: S
  (0..1: N (0..1: he))
  (1..2: IV (1..2: falls)))
[
  child-1: [ child-0: [ word: falls ] ]
  child-0: [ child-0: [ word: he ] ]
]

> he falls her
Parsed 0 trees

> he likes her
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: her)))
[
  child-2: [ child-0: [ word: her ] ]
  child-1: [ child-0: [ word: likes ] ]
  child-0: [ child-0: [ word: he ] ]
]

> he likes
Parsed 0 trees

> he said that he likes her
Parsed 1 tree
(0..6: S
  (0..1: N (0..1: he))
  (1..2: CV (1..2: said))
  (2..3: Comp (2..3: that))
  (3..6: S
    (3..4: N (3..4: he))
    (4..5: TV (4..5: likes))
    (5..6: N (5..6: her))))
[
  child-0: [ child-0: [ word: he ] ]
  child-2: [ child-0: [ word: that ] ]
  child-1: [ child-0: [ word: said ] ]
  child-3: [
    child-2: [ child-0: [ word: her ] ]
    child-1: [ child-0: [ word: likes ] ]
    child-0: [ child-0: [ word: he ] ]
  ]
]

> he said that he
Parsed 0 trees

This grammar already parses some correct sentences, and blocks some trivially incorrect ones. However, it doesn't care about number, case, or reflexives right now:

> she likes himself  // unbound reflexive pronoun
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: she))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: himself)))
[
  child-0: [ child-0: [ word: she ] ]
  child-2: [ child-0: [ word: himself ] ]
  child-1: [ child-0: [ word: likes ] ]
]

> him like her  // incorrect case on the subject pronoun, should be nominative
                // (he) instead of accusative (him)
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: him))
  (1..2: TV (1..2: like))
  (2..3: N (2..3: her)))
[
  child-0: [ child-0: [ word: him ] ]
  child-1: [ child-0: [ word: like ] ]
  child-2: [ child-0: [ word: her ] ]
]

> he like her  // incorrect verb number agreement
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: like))
  (2..3: N (2..3: her)))
[
  child-2: [ child-0: [ word: her ] ]
  child-1: [ child-0: [ word: like ] ]
  child-0: [ child-0: [ word: he ] ]
]

To fix this, we need to add features to our lexicon, and restrict the sentence rules based on features.

Features are added with square brackets, and are key: value pairs separated by commas. **top** is a special feature value, which basically means "unspecified" -- we'll come back to it later. Features that are unspecified are also assumed to have a **top** value, but sometimes explicitly stating top is more clear.

/// Pronouns
// The added features are:
// * num: sg or pl, whether this noun wants a singular verb (likes) or
//   a plural verb (like). note this is grammatical number, so for example
//   singular they takes plural agreement ("they like X", not *"they likes X")
// * case: nom or acc, whether this noun is nominative or accusative case.
//   nominative case goes in the subject, and accusative in the object.
//   e.g., "he fell" and "she likes him", not *"him fell" and *"her likes he"
// * pron: he, she, they, or ref -- what type of pronoun this is
// * needs_pron: whether this is a reflexive that needs to bind to another
//   pronoun.
N[ num: sg, case: nom, pron: he ]                    -> he
N[ num: sg, case: acc, pron: he ]                    -> him
N[ num: sg, case: acc, pron: ref, needs_pron: he ]   -> himself
N[ num: sg, case: nom, pron: she ]                   -> she
N[ num: sg, case: acc, pron: she ]                   -> her
N[ num: sg, case: acc, pron: ref, needs_pron: she]   -> herself
N[ num: pl, case: nom, pron: they ]                  -> they
N[ num: pl, case: acc, pron: they ]                  -> them
N[ num: pl, case: acc, pron: ref, needs_pron: they ] -> themselves
N[ num: sg, case: acc, pron: ref, needs_pron: they ] -> themself

// Names
// The added features are:
// * num: sg, as people are singular ("mary likes her" / *"mary like her")
// * case: **top**, as names can be both subjects and objects
//   ("mary likes her" / "she likes mary")
// * pron: whichever pronoun the person uses for reflexive agreement
//   mary    pron: she  => mary likes herself
//   sue     pron: they => sue likes themself
//   takeshi pron: he   => takeshi likes himself
N[ num: sg, case: **top**, pron: she ]  -> mary
N[ num: sg, case: **top**, pron: they ] -> sue
N[ num: sg, case: **top**, pron: he ]   -> takeshi
N[ num: sg, case: **top**, pron: he ]   -> robert

// Complementizer doesn't need features
Comp -> that

// Verbs -- intransitive, transitive, and clausal
// The added features are:
// * num: sg, pl, or **top** -- to match the noun numbers.
//   **top** will match either sg or pl, as past-tense verbs in English
//   don't agree in number: "he fell" and "they fell" are both fine
// * tense: past or nonpast -- this won't be used for agreement, but will be
//   copied into the final feature structure, and the client code could do
//   something with it
IV[ num:      sg, tense: nonpast ] -> falls
IV[ num:      pl, tense: nonpast ] -> fall
IV[ num: **top**, tense: past ]    -> fell

TV[ num:      sg, tense: nonpast ] -> likes
TV[ num:      pl, tense: nonpast ] -> like
TV[ num: **top**, tense: past ]    -> liked

CV[ num:      sg, tense: nonpast ] -> says
CV[ num:      pl, tense: nonpast ] -> say
CV[ num: **top**, tense: past ]    -> said

Now that our lexicon is updated with features, we can update our sentence rules to constrain parsing based on those features. This uses two new features, tags and unification. Tags allow features to be associated between nodes in a rule, and unification controls how those features are compatible. The rules for unification are:

  1. A string feature can unify with a string feature with the same value
  2. A top feature can unify with anything, and the nodes are merged
  3. A complex feature ([ ... ] structure) is recursively unified with another complex feature.

If unification fails anywhere, the parse is aborted and the tree is discarded. This allows the programmer to discard trees if features don't match.

// Sentence rules
// Intransitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
S -> N[ case: nom, num: #1 ] IV[ num: #1 ]
// Transitive verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #2)
// * If there's a reflexive in the object position, make sure its `needs_pron`
//   feature matches the subject's `pron` feature. If the object isn't a
//   reflexive, then its `needs_pron` feature will implicitly be `**top**`, so
//   will unify with anything.
S -> N[ case: nom, pron: #1, num: #2 ] TV[ num: #2 ] N[ case: acc, needs_pron: #1 ]
// Clausal verb:
// * Subject must be nominative case
// * Subject and verb must agree in number (copied through #1)
// * Reflexives can't cross clause boundaries (*"He said that she likes himself"),
//   so we can ignore reflexives and delegate to inner clause rule
S -> N[ case: nom, num: #1 ] CV[ num: #1 ] Comp S

Now that we have this augmented grammar (available as examples/reflexives.fgr), we can try it out and see that it rejects illicit sentences that were previously accepted, while still accepting valid ones:

> he fell
Parsed 1 tree
(0..2: S
  (0..1: N (0..1: he))
  (1..2: IV (1..2: fell)))
[
  child-1: [
    child-0: [ word: fell ]
    num: #0 sg
    tense: past
  ]
  child-0: [
    pron: he
    case: nom
    num: #0
    child-0: [ word: he ]
  ]
]

> he like him
Parsed 0 trees

> he likes himself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: he))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: himself)))
[
  child-1: [
    num: #0 sg
    child-0: [ word: likes ]
    tense: nonpast
  ]
  child-2: [
    needs_pron: #1 he
    num: sg
    child-0: [ word: himself ]
    pron: ref
    case: acc
  ]
  child-0: [
    child-0: [ word: he ]
    pron: #1
    num: #0
    case: nom
  ]
]

> he likes herself
Parsed 0 trees

> mary likes herself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: mary))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: herself)))
[
  child-0: [
    pron: #0 she
    num: #1 sg
    case: nom
    child-0: [ word: mary ]
  ]
  child-1: [
    tense: nonpast
    child-0: [ word: likes ]
    num: #1
  ]
  child-2: [
    child-0: [ word: herself ]
    num: sg
    pron: ref
    case: acc
    needs_pron: #0
  ]
]

> mary likes themself
Parsed 0 trees

> sue likes themself
Parsed 1 tree
(0..3: S
  (0..1: N (0..1: sue))
  (1..2: TV (1..2: likes))
  (2..3: N (2..3: themself)))
[
  child-0: [
    pron: #0 they
    child-0: [ word: sue ]
    case: nom
    num: #1 sg
  ]
  child-1: [
    tense: nonpast
    num: #1
    child-0: [ word: likes ]
  ]
  child-2: [
    needs_pron: #0
    case: acc
    pron: ref
    child-0: [ word: themself ]
    num: sg
  ]
]

> sue likes himself
Parsed 0 trees

If this is interesting to you and you want to learn more, you can check out my blog series, the excellent textbook Syntactic Theory: A Formal Introduction (2nd ed.), and the DELPH-IN project, whose work on the LKB inspired this simplified version.

Using from code

I need to write this section in more detail, but if you're comfortable with Rust, I suggest looking through the codebase. It's not perfect, it started as one of my first Rust projects (after migrating through F# -> TypeScript -> C in search of the right performance/ergonomics tradeoff), and it could use more tests, but overall it's not too bad.

Basically, the processing pipeline is:

  1. Make a Grammar struct
  • Grammar is defined in rules.rs.
  • The easiest way to make a Grammar is Grammar::parse_from_file, which is mostly a hand-written recusive descent parser in parse_grammar.rs. Yes, I recognize the irony here.
  1. It takes input (in Grammar::parse, which does everything for you, or Grammar::parse_chart, which just does the chart)
  2. The input is first chart-parsed in earley.rs
  3. Then, a forest is built from the chart, in forest.rs, using an algorithm I found in a very useful blog series I forget the URL for, because the algorithms in the academic literature for this are... weird.
  4. Finally, the feature unification is used to prune the forest down to only valid trees. It would be more efficient to do this during parsing, but meh.

The most interesting thing you can do via code and not via the CLI is probably getting at the raw feature DAG, as that would let you do things like pronoun coreference. The DAG code is in featurestructure.rs, and should be fairly approachable -- there's a lot of Rust ceremony around Rc<RefCell<...>> because using an arena allocation crate seemed too harlike overkill, but that is somewhat mitigated by the NodeRef type alias. Hit me up at https://vgel.me/contact if you need help with anything here!

Download Details:
Author: vgel
Source Code: https://github.com/vgel/treebender
License: MIT License

#rust  #machinelearning 

Py Rouge: A Full Python Implementation Of The ROUGE Metric

Py-rouge

A full Python implementation of the ROUGE metric, producing same results as in the official perl implementation.

Important remarks

  • The original Porter stemmer in NLTK is slightly different than the one use in the official ROUGE perl script as it has been written by end. Therefore, there might be slightly different stems for certain words. For DUC2004 dataset, I have identified these words and this script produces same stems.
  • The official ROUGE perl script use resampling strategy to compute the average with confidence intervals. Therefore, we might have a difference <3e-5 for ROUGE-L as well as ROUGE-W and <4e-5 for ROUGE-N.
  • Finally, ROUGE-1.5.5. has a bug: should have $tmpTextLen += $sLen at line 2101. Here, the last sentence, $limitBytes is taken instead of $limitBytes-$tmpTextLen (as $tmpTextLen is never updated with bytes length limit). It has been fixed in this code. This bug does not have a consequence for the default evaluation -b 665.

In case of doubts, please see all the implemented tests to compare outputs between the official ROUGE-1.5.5 and this script.

Installation

Package is uploaded on PyPI <https://pypi.org/project/py-rouge>_.

You can install it with pip:

pip install py-rouge

or do it manually:

git clone https://github.com/Diego999/py-rouge
cd py-rouge
python setup.py install

Issues/Pull Requests/Feedbacks

Don't hesitate to contact for any feedback or create issues/pull requests (especially if you want to rewrite the stemmer implemented in ROUGE-1.5.5 in python ;)).

Example

import rouge


def prepare_results(m, p, r, f):
    return '\t{}:\t{}: {:5.2f}\t{}: {:5.2f}\t{}: {:5.2f}'.format(m, 'P', 100.0 * p, 'R', 100.0 * r, 'F1', 100.0 * f)


for aggregator in ['Avg', 'Best', 'Individual']:
    print('Evaluation with {}'.format(aggregator))
    apply_avg = aggregator == 'Avg'
    apply_best = aggregator == 'Best'

    evaluator = rouge.Rouge(metrics=['rouge-n', 'rouge-l', 'rouge-w'],
                           max_n=4,
                           limit_length=True,
                           length_limit=100,
                           length_limit_type='words',
                           apply_avg=apply_avg,
                           apply_best=apply_best,
                           alpha=0.5, # Default F1_score
                           weight_factor=1.2,
                           stemming=True)


    hypothesis_1 = "King Norodom Sihanouk has declined requests to chair a summit of Cambodia 's top political leaders , saying the meeting would not bring any progress in deadlocked negotiations to form a government .\nGovernment and opposition parties have asked King Norodom Sihanouk to host a summit meeting after a series of post-election negotiations between the two opposition groups and Hun Sen 's party to form a new government failed .\nHun Sen 's ruling party narrowly won a majority in elections in July , but the opposition _ claiming widespread intimidation and fraud _ has denied Hun Sen the two-thirds vote in parliament required to approve the next government .\n"
    references_1 = ["Prospects were dim for resolution of the political crisis in Cambodia in October 1998.\nPrime Minister Hun Sen insisted that talks take place in Cambodia while opposition leaders Ranariddh and Sam Rainsy, fearing arrest at home, wanted them abroad.\nKing Sihanouk declined to chair talks in either place.\nA U.S. House resolution criticized Hun Sen's regime while the opposition tried to cut off his access to loans.\nBut in November the King announced a coalition government with Hun Sen heading the executive and Ranariddh leading the parliament.\nLeft out, Sam Rainsy sought the King's assurance of Hun Sen's promise of safety and freedom for all politicians.",
                    "Cambodian prime minister Hun Sen rejects demands of 2 opposition parties for talks in Beijing after failing to win a 2/3 majority in recent elections.\nSihanouk refuses to host talks in Beijing.\nOpposition parties ask the Asian Development Bank to stop loans to Hun Sen's government.\nCCP defends Hun Sen to the US Senate.\nFUNCINPEC refuses to share the presidency.\nHun Sen and Ranariddh eventually form a coalition at summit convened by Sihanouk.\nHun Sen remains prime minister, Ranariddh is president of the national assembly, and a new senate will be formed.\nOpposition leader Rainsy left out.\nHe seeks strong assurance of safety should he return to Cambodia.\n",
                    ]

    hypothesis_2 = "China 's government said Thursday that two prominent dissidents arrested this week are suspected of endangering national security _ the clearest sign yet Chinese leaders plan to quash a would-be opposition party .\nOne leader of a suppressed new political party will be tried on Dec. 17 on a charge of colluding with foreign enemies of China '' to incite the subversion of state power , '' according to court documents given to his wife on Monday .\nWith attorneys locked up , harassed or plain scared , two prominent dissidents will defend themselves against charges of subversion Thursday in China 's highest-profile dissident trials in two years .\n"
    references_2 = "Hurricane Mitch, category 5 hurricane, brought widespread death and destruction to Central American.\nEspecially hard hit was Honduras where an estimated 6,076 people lost their lives.\nThe hurricane, which lingered off the coast of Honduras for 3 days before moving off, flooded large areas, destroying crops and property.\nThe U.S. and European Union were joined by Pope John Paul II in a call for money and workers to help the stricken area.\nPresident Clinton sent Tipper Gore, wife of Vice President Gore to the area to deliver much needed supplies to the area, demonstrating U.S. commitment to the recovery of the region.\n"

    all_hypothesis = [hypothesis_1, hypothesis_2]
    all_references = [references_1, references_2]

    scores = evaluator.get_scores(all_hypothesis, all_references)

    for metric, results in sorted(scores.items(), key=lambda x: x[0]):
        if not apply_avg and not apply_best: # value is a type of list as we evaluate each summary vs each reference
            for hypothesis_id, results_per_ref in enumerate(results):
                nb_references = len(results_per_ref['p'])
                for reference_id in range(nb_references):
                    print('\tHypothesis #{} & Reference #{}: '.format(hypothesis_id, reference_id))
                    print('\t' + prepare_results(metric,results_per_ref['p'][reference_id], results_per_ref['r'][reference_id], results_per_ref['f'][reference_id]))
            print()
        else:
            print(prepare_results(metric, results['p'], results['r'], results['f']))
    print()

It produces the following output:

Evaluation with Avg
    rouge-1:    P: 28.62    R: 26.46    F1: 27.49
    rouge-2:    P:  4.21    R:  3.92    F1:  4.06
    rouge-3:    P:  0.80    R:  0.74    F1:  0.77
    rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    rouge-l:    P: 30.52    R: 28.57    F1: 29.51
    rouge-w:    P: 15.85    R:  8.28    F1: 10.87

Evaluation with Best
    rouge-1:    P: 30.44    R: 28.36    F1: 29.37
    rouge-2:    P:  4.74    R:  4.46    F1:  4.59
    rouge-3:    P:  1.06    R:  0.98    F1:  1.02
    rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    rouge-l:    P: 31.54    R: 29.71    F1: 30.60
    rouge-w:    P: 16.42    R:  8.82    F1: 11.47

Evaluation with Individual
    Hypothesis #0 & Reference #0: 
        rouge-1:    P: 38.54    R: 35.58    F1: 37.00
    Hypothesis #0 & Reference #1: 
        rouge-1:    P: 45.83    R: 43.14    F1: 44.44
    Hypothesis #1 & Reference #0: 
        rouge-1:    P: 15.05    R: 13.59    F1: 14.29

    Hypothesis #0 & Reference #0: 
        rouge-2:    P:  7.37    R:  6.80    F1:  7.07
    Hypothesis #0 & Reference #1: 
        rouge-2:    P:  9.47    R:  8.91    F1:  9.18
    Hypothesis #1 & Reference #0: 
        rouge-2:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-3:    P:  2.13    R:  1.96    F1:  2.04
    Hypothesis #0 & Reference #1: 
        rouge-3:    P:  1.06    R:  1.00    F1:  1.03
    Hypothesis #1 & Reference #0: 
        rouge-3:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    Hypothesis #0 & Reference #1: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00
    Hypothesis #1 & Reference #0: 
        rouge-4:    P:  0.00    R:  0.00    F1:  0.00

    Hypothesis #0 & Reference #0: 
        rouge-l:    P: 42.11    R: 39.39    F1: 40.70
    Hypothesis #0 & Reference #1: 
        rouge-l:    P: 46.19    R: 43.92    F1: 45.03
    Hypothesis #1 & Reference #0: 
        rouge-l:    P: 16.88    R: 15.50    F1: 16.16

    Hypothesis #0 & Reference #0: 
        rouge-w:    P: 22.27    R: 11.49    F1: 15.16
    Hypothesis #0 & Reference #1: 
        rouge-w:    P: 24.56    R: 13.60    F1: 17.51
    Hypothesis #1 & Reference #0: 
        rouge-w:    P:  8.29    R:  4.04    F1:  5.43

Download Details:

Author: Diego999
Source Code: https://github.com/Diego999/py-rouge

License: Apache-2.0 license

#perl #python 

A Plugin for D3.js That Allows You to Easy Use Context-menus

d3-context-menu

This is a plugin for d3.js that allows you to easy use context-menus in your visualizations. It's 100% d3 based and done in the "d3 way", so you don't need to worry about including additional frameworks.

Install with Bower

bower install d3-context-menu

Basic usage:

// Define your menu
var menu = [
    {
        title: 'Item #1',
        action: function(d) {
            console.log('Item #1 clicked!');
            console.log('The data for this circle is: ' + d);
        },
        disabled: false // optional, defaults to false
    },
    {
        title: 'Item #2',
        action: function(d) {
            console.log('You have clicked the second item!');
            console.log('The data for this circle is: ' + d);
        }
    }
]

var data = [1, 2, 3];

var g = d3.select('body').append('svg')
    .attr('width', 200)
    .attr('height', 400)
    .append('g');

g.selectAll('circles')
    .data(data)
    .enter()
    .append('circle')
    .attr('r', 30)
    .attr('fill', 'steelblue')
    .attr('cx', function(d) {
        return 100;
    })
    .attr('cy', function(d) {
        return d * 100;
    })
    .on('contextmenu', d3.contextMenu(menu)); // attach menu to element
});

Advanced usage:

Headers and Dividers

Menus can have Headers and Dividers. To specify a header simply don't define an "action" property. To specify a divider, simply add a "divider: true" property to the menu item, and it'll be considered a divider. Example menu definition:

var menu = [
    {
        title: 'Header',
    },
    {
        title: 'Normal item',
        action: function() {}
    },
    {
        divider: true
    },
    {
        title: 'Last item',
        action: function() {}
    }
];

Nested Menu

Menus can have Nested Menu. To specify a nested menu, simply add "children" property. Children has item of array.

var menu = [
    {
        title: 'Parent',
        children: [
            {
                title: 'Child',
                children: [
                    {
                        // header
                        title: 'Grand-Child1'
                    },
                    {
                        // normal
                        title: 'Grand-Child2',
                        action: function() {}
                    },
                    {
                        // divider
                        divider: true
                    },
                    {
                        // disable
                        title: 'Grand-Child3',
                        action: function() {}
                    }
                ]
            }
        ]
    },
];

See the index.htm file in the example folder to see this in action.

Pre-show callback

You can pass in a callback that will be executed before the context menu appears. This can be useful if you need something to close tooltips or perform some other task before the menu appears:

    ...
    .on('contextmenu', d3.contextMenu(menu, function() {
        console.log('Quick! Before the menu appears!');
    })); // attach menu to element

Post-show callback

You can pass in a callback that will be executed after the context menu appears using the onClose option:

    ...
    .on('contextmenu', d3.contextMenu(menu, {
        onOpen: function() {
            console.log('Quick! Before the menu appears!');
        },
        onClose: function() {
            console.log('Menu has been closed.');
        }
    })); // attach menu to element

Context-sensitive menu items

You can use information from your context in menu names, simply specify a function for title which returns a string:

var menu = [
    {
        title: function(d) {
            return 'Delete circle '+d.circleName;
        },
        action: function(d) {
            // delete it
        }
    },
    {
        title: function(d) {
            return 'Item 2';
        },
        action: function(d) {
            // do nothing interesting
        }
    }
];

// Menu shown is:

[Delete Circle MyCircle]
[Item 2]

Dynamic menu list

You can also have different lists of menu items for different nodes if menu is a function:

var menu = function(data) {
    if (data.x > 100) {
        return [{
            title: 'Item #1',
            action: function(d) {
                console.log('Item #1 clicked!');
                console.log('The data for this circle is: ' + d);
            }
        }];
    } else {
        return [{
            title: 'Item #1',
            action: function(d) {
                console.log('Item #1 clicked!');
                console.log('The data for this circle is: ' + d);
            }
        }, {
            title: 'Item #2',
            action: function(d) {
                console.log('Item #2 clicked!');
                console.log('The data for this circle is: ' + d);
            }
        }];
    }
};

// Menu shown for nodes with x < 100 contains 1 item, while other nodes have 2 menu items

Deleting Nodes Example

The following example shows how to add a right click menu to a tree diagram:

http://plnkr.co/edit/bDBe0xGX1mCLzqYGOqOS?p=info

Explicitly set menu position

Default position can be overwritten by providing a position option (either object or function returning an object):

    ...
    .on('contextmenu', d3.contextMenu(menu, {
        onOpen: function() {
            ...
        },
        onClose: function() {
            ...
        },
        position: {
            top: 100,
            left: 200
        }
    })); // attach menu to element

or

    ...
    .on('contextmenu', d3.contextMenu(menu, {
        onOpen: function() {
            ...
        },
        onClose: function() {
            ...
        },
        position: function(d) {
            var elm = this;
            var bounds = elm.getBoundingClientRect();

            // eg. align bottom-left
            return {
                top: bounds.top + bounds.height,
                left: bounds.left
            }
        }
    })); // attach menu to element

Set your own CSS class as theme (make sure to style it)

d3.contextMenu(menu, {
    ...
    theme: 'my-awesome-theme'
});

or

d3.contextMenu(menu, {
    ...
    theme: function () {
        if (foo) {
            return 'my-foo-theme';
        }
        else {
            return 'my-awesome-theme';
        }
    }
});

Close the context menu programatically (can be used as cleanup, as well)

d3.contextMenu('close');

The following example shows how to add a right click menu to a tree diagram:

http://plnkr.co/edit/bDBe0xGX1mCLzqYGOqOS?p=info

Additional callback arguments

Depending on the D3 library version used the callback functions can provide an additional argument:

  • for D3 6.x or above it will be the event, since the global d3.event is not available.
var menu = [
    {
        title: 'Item #1',
        action: function(d, event) {
            console.log('Item #1 clicked!');
            console.log('The data for this circle is: ' + d);
            console.log('The event is: ' + event);
        }
    }
]
  • for D3 5.x or below it will be the index, for backward compatibility reasons.
var menu = [
    {
        title: 'Item #1',
        action: function(d, index) {
            console.log('Item #1 clicked!');
            console.log('The data for this circle is: ' + d);
            console.log('The index is: ' + index);
        }
    }
]

What's new in version 2.1.0

  • Added support for accessing event information in with D3 6.x.

What's new in version 2.0.0

  • Added support for D3 6.x
  • The index parameter of callbacks are undefined when using D3 6.x or above. See the index.htm file in the example folder to see how to get the proper index value in that case.
  • Added class property for menu items that allows specifying CSS classes (see: https://github.com/patorjk/d3-context-menu/pull/56).

What's new in version 1.1.2

  • Menu updated so it wont go off bottom or right of screen when window is smaller.

What's new in version 1.1.1

  • Menu close bug fix.

What's new in version 1.1.0

  • Nested submenus are now supported.

What's new in version 1.0.1

  • Default theme styles extracted to their own CSS class (d3-context-menu-theme)
  • Ability to specify own theme css class via the theme configuration option (as string or function returning string)
  • onOpen/onClose callbacks now have consistent signature (they receive data and index, and this argument refers to the DOM element the context menu is related to)
  • all other functions (eg. position, menu) have the same signature and this object as onClose/onOpen
  • Context menu now closes on mousedown outside of the menu, instead of click outside (to mimic behaviour of the native context menu)
  • disabled and divider can now be functions as well and have the same signature and this object as explained above
  • Close the context menu programatically using d3.contextMenu('close');

What's new in version 0.2.1

  • Ability to set menu position
  • Minified css and js versions

What's new in version 0.1.3

  • Fixed issue where context menu element is never removed from DOM
  • Fixed issue where <body> click event is never removed
  • Fixed issue where the incorrect onClose callback was called when menu was closed as a result of clicking outside

What's new in version 0.1.2

  • If contextmenu is clicked twice it will close rather than open the browser's context menu.

What's new in version 0.1.1

  • Header and Divider items.
  • Ability to disable items.

It's written to be very light weight and customizable. You can see it in action here:

http://plnkr.co/edit/hAx36JQhb0RsvVn7TomS?p=info

Author: Patorjk
Source Code: https://github.com/patorjk/d3-context-menu 
License: MIT license

#javascript #d3 #menu #visualization 

Everything You Need to Know About Instagram Bot with Python

How to build an Instagram bot using Python

Instagram is the fastest-growing social network, with 1 billion monthly users. It also has the highest engagement rate. To gain followers on Instagram, you’d have to upload engaging content, follow users, like posts, comment on user posts and a whole lot. This can be time-consuming and daunting. But there is hope, you can automate all of these tasks. In this course, we’re going to build an Instagram bot using Python to automate tasks on Instagram.

What you’ll learn:

  • Instagram Automation
  • Build a Bot with Python

Increase your Instagram followers with a simple Python bot

I got around 500 real followers in 4 days!

Growing an audience is an expensive and painful task. And if you’d like to build an audience that’s relevant to you, and shares common interests, that’s even more difficult. I always saw Instagram has a great way to promote my photos, but I never had more than 380 followers… Every once in a while, I decide to start posting my photos on Instagram again, and I manage to keep posting regularly for a while, but it never lasts more than a couple of months, and I don’t have many followers to keep me motivated and engaged.

The objective of this project is to build a bigger audience and as a plus, maybe drive some traffic to my website where I sell my photos!

A year ago, on my last Instagram run, I got one of those apps that lets you track who unfollowed you. I was curious because in a few occasions my number of followers dropped for no apparent reason. After some research, I realized how some users basically crawl for followers. They comment, like and follow people — looking for a follow back. Only to unfollow them again in the next days.

I can’t say this was a surprise to me, that there were bots in Instagram… It just made me want to build one myself!

And that is why we’re here, so let’s get to it! I came up with a simple bot in Python, while I was messing around with Selenium and trying to figure out some project to use it. Simply put, Selenium is like a browser you can interact with very easily in Python.

Ideally, increasing my Instagram audience will keep me motivated to post regularly. As an extra, I included my website in my profile bio, where people can buy some photos. I think it is a bit of a stretch, but who knows?! My sales are basically zero so far, so it should be easy to track that conversion!

Just what the world needed! Another Instagram bot…

After giving this project some thought, my objective was to increase my audience with relevant people. I want to get followers that actually want to follow me and see more of my work. It’s very easy to come across weird content in the most used hashtags, so I’ve planed this bot to lookup specific hashtags and interact with the photos there. This way, I can be very specific about what kind of interests I want my audience to have. For instance, I really like long exposures, so I can target people who use that hashtag and build an audience around this kind of content. Simple and efficient!

My gallery is a mix of different subjects and styles, from street photography to aerial photography, and some travel photos too. Since it’s my hometown, I also have lots of Lisbon images there. These will be the main topics I’ll use in the hashtags I want to target.

This is not a “get 1000 followers in 24 hours” kind of bot!

So what kind of numbers are we talking about?

I ran the bot a few times in a few different hashtags like “travelblogger”, “travelgram”, “lisbon”, “dronephotography”. In the course of three days I went from 380 to 800 followers. Lots of likes, comments and even some organic growth (people that followed me but were not followed by the bot).

To be clear, I’m not using this bot intensively, as Instagram will stop responding if you run it too fast. It needs to have some sleep commands in between the actions, because after some comments and follows in a short period of time, Instagram stops responding and the bot crashes.

You will be logged into your account, so I’m almost sure that Instagram can know you’re doing something weird if you speed up the process. And most importantly, after doing this for a dozen hashtags, it just gets harder to find new users in the same hashtags. You will need to give it a few days to refresh the user base there.

But I don’t want to follow so many people in the process…

The most efficient way to get followers in Instagram (apart from posting great photos!) is to follow people. And this bot worked really well for me because I don’t care if I follow 2000 people to get 400 followers.

The bot saves a list with all the users that were followed while it was running, so someday I may actually do something with this list. For instance, I can visit each user profile, evaluate how many followers or posts they have, and decide if I want to keep following them. Or I can get the first picture in their gallery and check its date to see if they are active users.

If we remove the follow action from the bot, I can assure you the growth rate will suffer, as people are less inclined to follow based on a single like or comment.

Why will you share your code?!

That’s the debate I had with myself. Even though I truly believe in giving back to the community (I still learn a lot from it too!), there are several paid platforms that do more or less the same as this project. Some are shady, some are used by celebrities. The possibility of starting a similar platform myself, is not off the table yet, so why make the code available?

With that in mind, I decided to add an extra level of difficulty to the process, so I was going to post the code below as an image. I wrote “was”, because meanwhile, I’ve realized the image I’m getting is low quality. Which in turn made me reconsider and post the gist. I’m that nice! The idea behind the image was that if you really wanted to use it, you would have to type the code yourself. And that was my way of limiting the use of this tool to people that actually go through the whole process to create it and maybe even improve it.

I learn a lot more when I type the code myself, instead of copy/pasting scripts. I hope you feel the same way!

The script isn’t as sophisticated as it could be, and I know there’s lots of room to improve it. But hey… it works! I have other projects I want to add to my portfolio, so my time to develop it further is rather limited. Nevertheless, I will try to update this article if I dig deeper.

This is the last subtitle!

You’ll need Python (I’m using Python 3.7), Selenium, a browser (in my case I’ll be using Chrome) and… obviously, an Instagram account! Quick overview regarding what the bot will do:

  • Open a browser and login with your credentials
  • For every hashtag in the hashtag list, it will open the page and click the first picture to open it
  • It will then like, follow, comment and move to the next picture, in a 200 iterations loop (number can be adjusted)
  • Saves a list with all the users you followed using the bot

If you reached this paragraph, thank you! You totally deserve to collect your reward! If you find this useful for your profile/brand in any way, do share your experience below :)

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from time import sleep, strftime
from random import randint
import pandas as pd

chromedriver_path = 'C:/Users/User/Downloads/chromedriver_win32/chromedriver.exe' # Change this to your own chromedriver path!
webdriver = webdriver.Chrome(executable_path=chromedriver_path)
sleep(2)
webdriver.get('https://www.instagram.com/accounts/login/?source=auth_switcher')
sleep(3)

username = webdriver.find_element_by_name('username')
username.send_keys('your_username')
password = webdriver.find_element_by_name('password')
password.send_keys('your_password')

button_login = webdriver.find_element_by_css_selector('#react-root > section > main > div > article > div > div:nth-child(1) > div > form > div:nth-child(3) > button')
button_login.click()
sleep(3)

notnow = webdriver.find_element_by_css_selector('body > div:nth-child(13) > div > div > div > div.mt3GC > button.aOOlW.HoLwm')
notnow.click() #comment these last 2 lines out, if you don't get a pop up asking about notifications

In order to use chrome with Selenium, you need to install chromedriver. It’s a fairly simple process and I had no issues with it. Simply install and replace the path above. Once you do that, our variable webdriver will be our Chrome tab.

In cell number 3 you should replace the strings with your own username and the respective password. This is for the bot to type it in the fields displayed. You might have already noticed that when running cell number 2, Chrome opened a new tab. After the password, I’ll define the login button as an object, and in the following line, I click it.

Once you get in inspect mode find the bit of html code that corresponds to what you want to map. Right click it and hover over Copy. You will see that you have some options regarding how you want it to be copied. I used a mix of XPath and css selectors throughout the code (it’s visible in the find_element_ method). It took me a while to get all the references to run smoothly. At points, the css or the xpath directions would fail, but as I adjusted the sleep times, everything started running smoothly.

In this case, I selected “copy selector” and pasted it inside a find_element_ method (cell number 3). It will get you the first result it finds. If it was find_elements_, all elements would be retrieved and you could specify which to get.

Once you get that done, time for the loop. You can add more hashtags in the hashtag_list. If you run it for the first time, you still don’t have a file with the users you followed, so you can simply create prev_user_list as an empty list.

Once you run it once, it will save a csv file with a timestamp with the users it followed. That file will serve as the prev_user_list on your second run. Simple and easy to keep track of what the bot does.

Update with the latest timestamp on the following runs and you get yourself a series of csv backlogs for every run of the bot.

Instagram bot with Python

The code is really simple. If you have some basic notions of Python you can probably pick it up quickly. I’m no Python ninja and I was able to build it, so I guess that if you read this far, you are good to go!

hashtag_list = ['travelblog', 'travelblogger', 'traveler']

# prev_user_list = [] - if it's the first time you run it, use this line and comment the two below
prev_user_list = pd.read_csv('20181203-224633_users_followed_list.csv', delimiter=',').iloc[:,1:2] # useful to build a user log
prev_user_list = list(prev_user_list['0'])

new_followed = []
tag = -1
followed = 0
likes = 0
comments = 0

for hashtag in hashtag_list:
    tag += 1
    webdriver.get('https://www.instagram.com/explore/tags/'+ hashtag_list[tag] + '/')
    sleep(5)
    first_thumbnail = webdriver.find_element_by_xpath('//*[@id="react-root"]/section/main/article/div[1]/div/div/div[1]/div[1]/a/div')
    
    first_thumbnail.click()
    sleep(randint(1,2))    
    try:        
        for x in range(1,200):
            username = webdriver.find_element_by_xpath('/html/body/div[3]/div/div[2]/div/article/header/div[2]/div[1]/div[1]/h2/a').text
            
            if username not in prev_user_list:
                # If we already follow, do not unfollow
                if webdriver.find_element_by_xpath('/html/body/div[3]/div/div[2]/div/article/header/div[2]/div[1]/div[2]/button').text == 'Follow':
                    
                    webdriver.find_element_by_xpath('/html/body/div[3]/div/div[2]/div/article/header/div[2]/div[1]/div[2]/button').click()
                    
                    new_followed.append(username)
                    followed += 1

                    # Liking the picture
                    button_like = webdriver.find_element_by_xpath('/html/body/div[3]/div/div[2]/div/article/div[2]/section[1]/span[1]/button/span')
                    
                    button_like.click()
                    likes += 1
                    sleep(randint(18,25))

                    # Comments and tracker
                    comm_prob = randint(1,10)
                    print('{}_{}: {}'.format(hashtag, x,comm_prob))
                    if comm_prob > 7:
                        comments += 1
                        webdriver.find_element_by_xpath('/html/body/div[3]/div/div[2]/div/article/div[2]/section[1]/span[2]/button/span').click()
                        comment_box = webdriver.find_element_by_xpath('/html/body/div[3]/div/div[2]/div/article/div[2]/section[3]/div/form/textarea')

                        if (comm_prob < 7):
                            comment_box.send_keys('Really cool!')
                            sleep(1)
                        elif (comm_prob > 6) and (comm_prob < 9):
                            comment_box.send_keys('Nice work :)')
                            sleep(1)
                        elif comm_prob == 9:
                            comment_box.send_keys('Nice gallery!!')
                            sleep(1)
                        elif comm_prob == 10:
                            comment_box.send_keys('So cool! :)')
                            sleep(1)
                        # Enter to post comment
                        comment_box.send_keys(Keys.ENTER)
                        sleep(randint(22,28))

                # Next picture
                webdriver.find_element_by_link_text('Next').click()
                sleep(randint(25,29))
            else:
                webdriver.find_element_by_link_text('Next').click()
                sleep(randint(20,26))
    # some hashtag stops refreshing photos (it may happen sometimes), it continues to the next
    except:
        continue

for n in range(0,len(new_followed)):
    prev_user_list.append(new_followed[n])
    
updated_user_df = pd.DataFrame(prev_user_list)
updated_user_df.to_csv('{}_users_followed_list.csv'.format(strftime("%Y%m%d-%H%M%S")))
print('Liked {} photos.'.format(likes))
print('Commented {} photos.'.format(comments))
print('Followed {} new people.'.format(followed))

Instagram bot with Python

The print statement inside the loop is the way I found to be able to have a tracker that lets me know at what iteration the bot is all the time. It will print the hashtag it’s in, the number of the iteration, and the random number generated for the comment action. I decided not to post comments in every page, so I added three different comments and a random number between 1 and 10 that would define if there was any comment at all, or one of the three. The loop ends, we append the new_followed users to the previous users “database” and saves the new file with the timestamp. You should also get a small report.

Instagram bot with Python

And that’s it!

After a few hours without checking the phone, these were the numbers I was getting. I definitely did not expect it to do so well! In about 4 days since I’ve started testing it, I had around 500 new followers, which means I have doubled my audience in a matter of days. I’m curious to see how many of these new followers I will lose in the next days, to see if the growth can be sustainable. I also had a lot more “likes” in my latest photos, but I guess that’s even more expected than the follow backs.

Instagram bot with Python

It would be nice to get this bot running in a server, but I have other projects I want to explore, and configuring a server is not one of them! Feel free to leave a comment below, and I’ll do my best to answer your questions.

I’m actually curious to see how long will I keep posting regularly! If you feel like this article was helpful for you, consider thanking me by buying one of my photos.

Instagram bot with Python



How to Make an Instagram Bot With Python and InstaPy

Instagram bot with Python

What do SocialCaptain, Kicksta, Instavast, and many other companies have in common? They all help you reach a greater audience, gain more followers, and get more likes on Instagram while you hardly lift a finger. They do it all through automation, and people pay them a good deal of money for it. But you can do the same thing—for free—using InstaPy!

In this tutorial, you’ll learn how to build a bot with Python and InstaPy, which automates your Instagram activities so that you gain more followers and likes with minimal manual input. Along the way, you’ll learn about browser automation with Selenium and the Page Object Pattern, which together serve as the basis for InstaPy.

In this tutorial, you’ll learn:

  • How Instagram bots work
  • How to automate a browser with Selenium
  • How to use the Page Object Pattern for better readability and testability
  • How to build an Instagram bot with InstaPy

You’ll begin by learning how Instagram bots work before you build one.

Table of Contents

  • How Instagram Bots Work
  • How to Automate a Browser
  • How to Use the Page Object Pattern
  • How to Build an Instagram Bot With InstaPy
    • Essential Features
    • Additional Features in InstaPy
  • Conclusion

Important: Make sure you check Instagram’s Terms of Use before implementing any kind of automation or scraping techniques.

How Instagram Bots Work

How can an automation script gain you more followers and likes? Before answering this question, think about how an actual person gains more followers and likes.

They do it by being consistently active on the platform. They post often, follow other people, and like and leave comments on other people’s posts. Bots work exactly the same way: They follow, like, and comment on a consistent basis according to the criteria you set.

The better the criteria you set, the better your results will be. You want to make sure you’re targeting the right groups because the people your bot interacts with on Instagram will be more likely to interact with your content.

For example, if you’re selling women’s clothing on Instagram, then you can instruct your bot to like, comment on, and follow mostly women or profiles whose posts include hashtags such as #beauty, #fashion, or #clothes. This makes it more likely that your target audience will notice your profile, follow you back, and start interacting with your posts.

How does it work on the technical side, though? You can’t use the Instagram Developer API since it is fairly limited for this purpose. Enter browser automation. It works in the following way:

  1. You serve it your credentials.
  2. You set the criteria for who to follow, what comments to leave, and which type of posts to like.
  3. Your bot opens a browser, types in https://instagram.com on the address bar, logs in with your credentials, and starts doing the things you instructed it to do.

Next, you’ll build the initial version of your Instagram bot, which will automatically log in to your profile. Note that you won’t use InstaPy just yet.

How to Automate a Browser

For this version of your Instagram bot, you’ll be using Selenium, which is the tool that InstaPy uses under the hood.

First, install Selenium. During installation, make sure you also install the Firefox WebDriver since the latest version of InstaPy dropped support for Chrome. This also means that you need the Firefox browser installed on your computer.

Now, create a Python file and write the following code in it:

from time import sleep

from selenium import webdriver


browser = webdriver.Firefox()


browser.get('https://www.instagram.com/')


sleep(5)


browser.close()

Run the code and you’ll see that a Firefox browser opens and directs you to the Instagram login page. Here’s a line-by-line breakdown of the code:

  • Lines 1 and 2 import sleep and webdriver.
  • Line 4 initializes the Firefox driver and sets it to browser.
  • Line 6 types https://www.instagram.com/ on the address bar and hits Enter.
  • Line 8 waits for five seconds so you can see the result. Otherwise, it would close the browser instantly.
  • Line 10 closes the browser.

This is the Selenium version of Hello, World. Now you’re ready to add the code that logs in to your Instagram profile. But first, think about how you would log in to your profile manually. You would do the following:

  1. Go to https://www.instagram.com/.
  2. Click the login link.
  3. Enter your credentials.
  4. Hit the login button.

The first step is already done by the code above. Now change it so that it clicks on the login link on the Instagram home page:

from time import sleep

from selenium import webdriver


browser = webdriver.Firefox()

browser.implicitly_wait(5)


browser.get('https://www.instagram.com/')


login_link = browser.find_element_by_xpath("//a[text()='Log in']")

login_link.click()


sleep(5)


browser.close()

Note the highlighted lines:

  • Line 5 sets five seconds of waiting time. If Selenium can’t find an element, then it waits for five seconds to allow everything to load and tries again.
  • Line 9 finds the element <a> whose text is equal to Log in. It does this using XPath, but there are a few other methods you could use.
  • Line 10 clicks on the found element <a> for the login link.

Run the script and you’ll see your script in action. It will open the browser, go to Instagram, and click on the login link to go to the login page.

On the login page, there are three important elements:

  1. The username input
  2. The password input
  3. The login button

Next, change the script so that it finds those elements, enters your credentials, and clicks on the login button:

from time import sleep

from selenium import webdriver


browser = webdriver.Firefox()

browser.implicitly_wait(5)


browser.get('https://www.instagram.com/')


login_link = browser.find_element_by_xpath("//a[text()='Log in']")

login_link.click()


sleep(2)


username_input = browser.find_element_by_css_selector("input[name='username']")

password_input = browser.find_element_by_css_selector("input[name='password']")


username_input.send_keys("<your username>")

password_input.send_keys("<your password>")


login_button = browser.find_element_by_xpath("//button[@type='submit']")

login_button.click()


sleep(5)


browser.close()

Here’s a breakdown of the changes:

  1. Line 12 sleeps for two seconds to allow the page to load.
  2. Lines 14 and 15 find username and password inputs by CSS. You could use any other method that you prefer.
  3. Lines 17 and 18 type your username and password in their respective inputs. Don’t forget to fill in <your username> and <your password>!
  4. Line 20 finds the login button by XPath.
  5. Line 21 clicks on the login button.

Run the script and you’ll be automatically logged in to to your Instagram profile.

You’re off to a good start with your Instagram bot. If you were to continue writing this script, then the rest would look very similar. You would find the posts that you like by scrolling down your feed, find the like button by CSS, click on it, find the comments section, leave a comment, and continue.

The good news is that all of those steps can be handled by InstaPy. But before you jump into using Instapy, there is one other thing that you should know about to better understand how InstaPy works: the Page Object Pattern.

How to Use the Page Object Pattern

Now that you’ve written the login code, how would you write a test for it? It would look something like the following:

def test_login_page(browser):
    browser.get('https://www.instagram.com/accounts/login/')
    username_input = browser.find_element_by_css_selector("input[name='username']")
    password_input = browser.find_element_by_css_selector("input[name='password']")
    username_input.send_keys("<your username>")
    password_input.send_keys("<your password>")
    login_button = browser.find_element_by_xpath("//button[@type='submit']")
    login_button.click()

    errors = browser.find_elements_by_css_selector('#error_message')
    assert len(errors) == 0

Can you see what’s wrong with this code? It doesn’t follow the DRY principle. That is, the code is duplicated in both the application and the test code.

Duplicating code is especially bad in this context because Selenium code is dependent on UI elements, and UI elements tend to change. When they do change, you want to update your code in one place. That’s where the Page Object Pattern comes in.

With this pattern, you create page object classes for the most important pages or fragments that provide interfaces that are straightforward to program to and that hide the underlying widgetry in the window. With this in mind, you can rewrite the code above and create a HomePage class and a LoginPage class:

from time import sleep

class LoginPage:
    def __init__(self, browser):
        self.browser = browser

    def login(self, username, password):
        username_input = self.browser.find_element_by_css_selector("input[name='username']")
        password_input = self.browser.find_element_by_css_selector("input[name='password']")
        username_input.send_keys(username)
        password_input.send_keys(password)
        login_button = browser.find_element_by_xpath("//button[@type='submit']")
        login_button.click()
        sleep(5)

class HomePage:
    def __init__(self, browser):
        self.browser = browser
        self.browser.get('https://www.instagram.com/')

    def go_to_login_page(self):
        self.browser.find_element_by_xpath("//a[text()='Log in']").click()
        sleep(2)
        return LoginPage(self.browser)

The code is the same except that the home page and the login page are represented as classes. The classes encapsulate the mechanics required to find and manipulate the data in the UI. That is, there are methods and accessors that allow the software to do anything a human can.

One other thing to note is that when you navigate to another page using a page object, it returns a page object for the new page. Note the returned value of go_to_log_in_page(). If you had another class called FeedPage, then login() of the LoginPage class would return an instance of that: return FeedPage().

Here’s how you can put the Page Object Pattern to use:

from selenium import webdriver

browser = webdriver.Firefox()
browser.implicitly_wait(5)

home_page = HomePage(browser)
login_page = home_page.go_to_login_page()
login_page.login("<your username>", "<your password>")

browser.close()

It looks much better, and the test above can now be rewritten to look like this:

def test_login_page(browser):
    home_page = HomePage(browser)
    login_page = home_page.go_to_login_page()
    login_page.login("<your username>", "<your password>")

    errors = browser.find_elements_by_css_selector('#error_message')
    assert len(errors) == 0

With these changes, you won’t have to touch your tests if something changes in the UI.

For more information on the Page Object Pattern, refer to the official documentation and to Martin Fowler’s article.

Now that you’re familiar with both Selenium and the Page Object Pattern, you’ll feel right at home with InstaPy. You’ll build a basic bot with it next.

Note: Both Selenium and the Page Object Pattern are widely used for other websites, not just for Instagram.

How to Build an Instagram Bot With InstaPy

In this section, you’ll use InstaPy to build an Instagram bot that will automatically like, follow, and comment on different posts. First, you’ll need to install InstaPy:

$ python3 -m pip install instapy

This will install instapy in your system.

Essential Features

Now you can rewrite the code above with InstaPy so that you can compare the two options. First, create another Python file and put the following code in it:

from instapy import InstaPy

InstaPy(username="<your_username>", password="<your_password>").login()

Replace the username and password with yours, run the script, and voilà! With just one line of code, you achieved the same result.

Even though your results are the same, you can see that the behavior isn’t exactly the same. In addition to simply logging in to your profile, InstaPy does some other things, such as checking your internet connection and the status of the Instagram servers. This can be observed directly on the browser or in the logs:

INFO [2019-12-17 22:03:19] [username]  -- Connection Checklist [1/3] (Internet Connection Status)
INFO [2019-12-17 22:03:20] [username]  - Internet Connection Status: ok
INFO [2019-12-17 22:03:20] [username]  - Current IP is "17.283.46.379" and it's from "Germany/DE"
INFO [2019-12-17 22:03:20] [username]  -- Connection Checklist [2/3] (Instagram Server Status)
INFO [2019-12-17 22:03:26] [username]  - Instagram WebSite Status: Currently Up

Pretty good for one line of code, isn’t it? Now it’s time to make the script do more interesting things than just logging in.

For the purpose of this example, assume that your profile is all about cars, and that your bot is intended to interact with the profiles of people who are also interested in cars.

First, you can like some posts that are tagged #bmw or #mercedes using like_by_tags():

from instapy import InstaPy


session = InstaPy(username="<your_username>", password="<your_password>")

session.login()

session.like_by_tags(["bmw", "mercedes"], amount=5)

Here, you gave the method a list of tags to like and the number of posts to like for each given tag. In this case, you instructed it to like ten posts, five for each of the two tags. But take a look at what happens after you run the script:

INFO [2019-12-17 22:15:58] [username]  Tag [1/2]
INFO [2019-12-17 22:15:58] [username]  --> b'bmw'
INFO [2019-12-17 22:16:07] [username]  desired amount: 14  |  top posts [disabled]: 9  |  possible posts: 43726739
INFO [2019-12-17 22:16:13] [username]  Like# [1/14]
INFO [2019-12-17 22:16:13] [username]  https://www.instagram.com/p/B6MCcGcC3tU/
INFO [2019-12-17 22:16:15] [username]  Image from: b'mattyproduction'
INFO [2019-12-17 22:16:15] [username]  Link: b'https://www.instagram.com/p/B6MCcGcC3tU/'
INFO [2019-12-17 22:16:15] [username]  Description: b'Mal etwas anderes \xf0\x9f\x91\x80\xe2\x98\xba\xef\xb8\x8f Bald ist das komplette Video auf YouTube zu finden (n\xc3\xa4here Infos werden folgen). Vielen Dank an @patrick_jwki @thehuthlife  und @christic_  f\xc3\xbcr das bereitstellen der Autos \xf0\x9f\x94\xa5\xf0\x9f\x98\x8d#carporn#cars#tuning#bagged#bmw#m2#m2competition#focusrs#ford#mk3#e92#m3#panasonic#cinematic#gh5s#dji#roninm#adobe#videography#music#bimmer#fordperformance#night#shooting#'
INFO [2019-12-17 22:16:15] [username]  Location: b'K\xc3\xb6ln, Germany'
INFO [2019-12-17 22:16:51] [username]  --> Image Liked!
INFO [2019-12-17 22:16:56] [username]  --> Not commented
INFO [2019-12-17 22:16:57] [username]  --> Not following
INFO [2019-12-17 22:16:58] [username]  Like# [2/14]
INFO [2019-12-17 22:16:58] [username]  https://www.instagram.com/p/B6MDK1wJ-Kb/
INFO [2019-12-17 22:17:01] [username]  Image from: b'davs0'
INFO [2019-12-17 22:17:01] [username]  Link: b'https://www.instagram.com/p/B6MDK1wJ-Kb/'
INFO [2019-12-17 22:17:01] [username]  Description: b'Someone said cloud? \xf0\x9f\xa4\x94\xf0\x9f\xa4\xad\xf0\x9f\x98\x88 \xe2\x80\xa2\n\xe2\x80\xa2\n\xe2\x80\xa2\n\xe2\x80\xa2\n#bmw #bmwrepost #bmwm4 #bmwm4gts #f82 #bmwmrepost #bmwmsport #bmwmperformance #bmwmpower #bmwm4cs #austinyellow #davs0 #mpower_official #bmw_world_ua #bimmerworld #bmwfans #bmwfamily #bimmers #bmwpost #ultimatedrivingmachine #bmwgang #m3f80 #m5f90 #m4f82 #bmwmafia #bmwcrew #bmwlifestyle'
INFO [2019-12-17 22:17:34] [username]  --> Image Liked!
INFO [2019-12-17 22:17:37] [username]  --> Not commented
INFO [2019-12-17 22:17:38] [username]  --> Not following

By default, InstaPy will like the first nine top posts in addition to your amount value. In this case, that brings the total number of likes per tag to fourteen (nine top posts plus the five you specified in amount).

Also note that InstaPy logs every action it takes. As you can see above, it mentions which post it liked as well as its link, description, location, and whether the bot commented on the post or followed the author.

You may have noticed that there are delays after almost every action. That’s by design. It prevents your profile from getting banned on Instagram.

Now, you probably don’t want your bot liking inappropriate posts. To prevent that from happening, you can use set_dont_like():

from instapy import InstaPy

session = InstaPy(username="<your_username>", password="<your_password>")
session.login()
session.like_by_tags(["bmw", "mercedes"], amount=5)
session.set_dont_like(["naked", "nsfw"])

With this change, posts that have the words naked or nsfw in their descriptions won’t be liked. You can flag any other words that you want your bot to avoid.

Next, you can tell the bot to not only like the posts but also to follow some of the authors of those posts. You can do that with set_do_follow():

from instapy import InstaPy

session = InstaPy(username="<your_username>", password="<your_password>")
session.login()
session.like_by_tags(["bmw", "mercedes"], amount=5)
session.set_dont_like(["naked", "nsfw"])
session.set_do_follow(True, percentage=50)

If you run the script now, then the bot will follow fifty percent of the users whose posts it liked. As usual, every action will be logged.

You can also leave some comments on the posts. There are two things that you need to do. First, enable commenting with set_do_comment():

from instapy import InstaPy

session = InstaPy(username="<your_username>", password="<your_password>")
session.login()
session.like_by_tags(["bmw", "mercedes"], amount=5)
session.set_dont_like(["naked", "nsfw"])
session.set_do_follow(True, percentage=50)
session.set_do_comment(True, percentage=50)

Next, tell the bot what comments to leave with set_comments():

from instapy import InstaPy

session = InstaPy(username="<your_username>", password="<your_password>")
session.login()
session.like_by_tags(["bmw", "mercedes"], amount=5)
session.set_dont_like(["naked", "nsfw"])
session.set_do_follow(True, percentage=50)
session.set_do_comment(True, percentage=50)
session.set_comments(["Nice!", "Sweet!", "Beautiful :heart_eyes:"])

Run the script and the bot will leave one of those three comments on half the posts that it interacts with.

Now that you’re done with the basic settings, it’s a good idea to end the session with end():

from instapy import InstaPy

session = InstaPy(username="<your_username>", password="<your_password>")
session.login()
session.like_by_tags(["bmw", "mercedes"], amount=5)
session.set_dont_like(["naked", "nsfw"])
session.set_do_follow(True, percentage=50)
session.set_do_comment(True, percentage=50)
session.set_comments(["Nice!", "Sweet!", "Beautiful :heart_eyes:"])
session.end()

This will close the browser, save the logs, and prepare a report that you can see in the console output.

Additional Features in InstaPy

InstaPy is a sizable project that has a lot of thoroughly documented features. The good news is that if you’re feeling comfortable with the features you used above, then the rest should feel pretty similar. This section will outline some of the more useful features of InstaPy.

Quota Supervisor

You can’t scrape Instagram all day, every day. The service will quickly notice that you’re running a bot and will ban some of its actions. That’s why it’s a good idea to set quotas on some of your bot’s actions. Take the following for example:

session.set_quota_supervisor(enabled=True, peak_comments_daily=240, peak_comments_hourly=21)

The bot will keep commenting until it reaches its hourly and daily limits. It will resume commenting after the quota period has passed.

Headless Browser

This feature allows you to run your bot without the GUI of the browser. This is super useful if you want to deploy your bot to a server where you may not have or need the graphical interface. It’s also less CPU intensive, so it improves performance. You can use it like so:

session = InstaPy(username='test', password='test', headless_browser=True)

Note that you set this flag when you initialize the InstaPy object.

Using AI to Analyze Posts

Earlier you saw how to ignore posts that contain inappropriate words in their descriptions. What if the description is good but the image itself is inappropriate? You can integrate your InstaPy bot with ClarifAI, which offers image and video recognition services:

session.set_use_clarifai(enabled=True, api_key='<your_api_key>')
session.clarifai_check_img_for(['nsfw'])

Now your bot won’t like or comment on any image that ClarifAI considers NSFW. You get 5,000 free API-calls per month.

Relationship Bounds

It’s often a waste of time to interact with posts by people who have a lot of followers. In such cases, it’s a good idea to set some relationship bounds so that your bot doesn’t waste your precious computing resources:

session.set_relationship_bounds(enabled=True, max_followers=8500)

With this, your bot won’t interact with posts by users who have more than 8,500 followers.

For many more features and configurations in InstaPy, check out the documentation.

Conclusion

InstaPy allows you to automate your Instagram activities with minimal fuss and effort. It’s a very flexible tool with a lot of useful features.

In this tutorial, you learned:

  • How Instagram bots work
  • How to automate a browser with Selenium
  • How to use the Page Object Pattern to make your code more maintainable and testable
  • How to use InstaPy to build a basic Instagram bot

Read the InstaPy documentation and experiment with your bot a little bit. Soon you’ll start getting new followers and likes with a minimal amount of effort. I gained a few new followers myself while writing this tutorial.


Automating Instagram API with Python

Instagram bot with Python

Gain active followers - Algorithm

Maybe some of you do not agree it is a good way to grow your IG page by using follow for follow method but after a lot of researching I found the proper way to use this method.

I have done and used this strategy for a while and my page visits also followers started growing.

The majority of people failing because they randomly targeting the followers and as a result, they are not coming back to your page. So, the key is to find people those have same interests with you.

If you have a programming page go and search for IG pages which have big programming community and once you find one, don’t send follow requests to followers of this page. Because some of them are not active even maybe fake accounts. So, in order to gain active followers, go the last post of this page and find people who liked the post.

Unofficial Instagram API

In order to query data from Instagram I am going to use the very cool, yet unofficial, Instagram API written by Pasha Lev.

**Note:**Before you test it make sure you verified your phone number in your IG account.

The program works pretty well so far but in case of any problems I have to put disclaimer statement here:

Disclaimer: This post published educational purposes only as well as to give general information about Instagram API. I am not responsible for any actions and you are taking your own risk.

Let’s start by installing and then logging in with API.

pip install InstagramApi

from InstagramAPI import InstagramAPI

api = InstagramAPI("username", "password")
api.login()

Once you run the program you will see “Login success!” in your console.

Get users from liked list

We are going to search for some username (your target page) then get most recent post from this user. Then, get users who liked this post. Unfortunately, I can’t find solution how to paginate users so right now it gets about last 500 user.

users_list = []

def get_likes_list(username):
    api.login()
    api.searchUsername(username)
    result = api.LastJson
    username_id = result['user']['pk'] # Get user ID
    user_posts = api.getUserFeed(username_id) # Get user feed
    result = api.LastJson
    media_id = result['items'][0]['id'] # Get most recent post
    api.getMediaLikers(media_id) # Get users who liked
    users = api.LastJson['users']
    for user in users: # Push users to list
        users_list.append({'pk':user['pk'], 'username':user['username']})

Follow Users

Once we get the users list, it is time to follow these users.

IMPORTANT NOTE: set time limit as much as you can to avoid automation detection.

from time import sleep

following_users = []

def follow_users(users_list):
    api.login()
    api.getSelfUsersFollowing() # Get users which you are following
    result = api.LastJson
    for user in result['users']:
        following_users.append(user['pk'])
    for user in users_list:
        if not user['pk'] in following_users: # if new user is not in your following users                   
            print('Following @' + user['username'])
            api.follow(user['pk'])
            # after first test set this really long to avoid from suspension
            sleep(20)
        else:
            print('Already following @' + user['username'])
            sleep(10)

Unfollow Users

This function will look users which you are following then it will check if this user follows you as well. If user not following you then you are unfollowing as well.

follower_users = []

def unfollow_users():
    api.login()
    api.getSelfUserFollowers() # Get your followers
    result = api.LastJson
    for user in result['users']:
        follower_users.append({'pk':user['pk'], 'username':user['username']})

    api.getSelfUsersFollowing() # Get users which you are following
    result = api.LastJson
    for user in result['users']:
        following_users.append({'pk':user['pk'],'username':user['username']})
    for user in following_users:
        if not user['pk'] in follower_users: # if the user not follows you
            print('Unfollowing @' + user['username'])
            api.unfollow(user['pk'])
            # set this really long to avoid from suspension
            sleep(20) 

Full Code with extra functions

Here is the full code of this automation

import pprint
from time import sleep
from InstagramAPI import InstagramAPI
import pandas as pd

users_list = []
following_users = []
follower_users = []

class InstaBot:

    def __init__(self):
        self.api = InstagramAPI("your_username", "your_password")

    def get_likes_list(self,username):
        api = self.api
        api.login()
        api.searchUsername(username) #Gets most recent post from user
        result = api.LastJson
        username_id = result['user']['pk']
        user_posts = api.getUserFeed(username_id)
        result = api.LastJson
        media_id = result['items'][0]['id']

        api.getMediaLikers(media_id)
        users = api.LastJson['users']
        for user in users:
            users_list.append({'pk':user['pk'], 'username':user['username']})
        bot.follow_users(users_list)

    def follow_users(self,users_list):
        api = self.api
        api.login()
        api.getSelfUsersFollowing()
        result = api.LastJson
        for user in result['users']:
            following_users.append(user['pk'])
        for user in users_list:
            if not user['pk'] in following_users:
                print('Following @' + user['username'])
                api.follow(user['pk'])
                # set this really long to avoid from suspension
                sleep(20)
            else:
                print('Already following @' + user['username'])
                sleep(10)

     def unfollow_users(self):
        api = self.api
        api.login()
        api.getSelfUserFollowers()
        result = api.LastJson
        for user in result['users']:
            follower_users.append({'pk':user['pk'], 'username':user['username']})

        api.getSelfUsersFollowing()
        result = api.LastJson
        for user in result['users']:
            following_users.append({'pk':user['pk'],'username':user['username']})

        for user in following_users:
            if not user['pk'] in [user['pk'] for user in follower_users]:
                print('Unfollowing @' + user['username'])
                api.unfollow(user['pk'])
                # set this really long to avoid from suspension
                sleep(20) 

bot =  InstaBot()
# To follow users run the function below
# change the username ('instagram') to your target username
bot.get_likes_list('instagram')

# To unfollow users uncomment and run the function below
# bot.unfollow_users()

it will look like this:

Reverse Python

some extra functions to play with API:

def get_my_profile_details():
    api.login() 
    api.getSelfUsernameInfo()
    result = api.LastJson
    username = result['user']['username']
    full_name = result['user']['full_name']
    profile_pic_url = result['user']['profile_pic_url']
    followers = result['user']['follower_count']
    following = result['user']['following_count']
    media_count = result['user']['media_count']
    df_profile = pd.DataFrame(
        {'username':username,
        'full name': full_name,
        'profile picture URL':profile_pic_url,
        'followers':followers,
        'following':following,
        'media count': media_count,
        }, index=[0])
    df_profile.to_csv('profile.csv', sep='\t', encoding='utf-8')

def get_my_feed():
    image_urls = []
    api.login()
    api.getSelfUserFeed()
    result = api.LastJson
    # formatted_json_str = pprint.pformat(result)
    # print(formatted_json_str)
    if 'items' in result.keys():
        for item in result['items'][0:5]:
            if 'image_versions2' in item.keys():
                image_url = item['image_versions2']['candidates'][1]['url']
                image_urls.append(image_url)

    df_feed = pd.DataFrame({
                'image URL':image_urls
            })
    df_feed.to_csv('feed.csv', sep='\t', encoding='utf-8')


Building an Instagram Bot with Python and Selenium to Gain More Followers

This is image title

Let’s build an Instagram bot to gain more followers! — I know, I know. That doesn’t sound very ethical, does it? But it’s all justified for educational purposes.

Coding is a super power — we can all agree. That’s why I’ll leave it up to you to not abuse this power. And I trust you’re here to learn how it works. Otherwise, you’d be on GitHub cloning one of the countless Instagram bots there, right?

You’re convinced? — Alright, now let’s go back to unethical practices.

The Plan

So here’s the deal, we want to build a bot in Python and Selenium that goes on the hashtags we specify, likes random posts, then follows the posters. It does that enough — we get follow backs. Simple as that.

Here’s a pretty twisted detail though: we want to keep track of the users we follow so the bot can unfollow them after the number of days we specify.

Setup

So first things first, I want to use a database to keep track of the username and the date added. You might as well save/load from/to a file, but we want this to be ready for more features in case we felt inspired in the future.

So make sure you create a database (I named mine instabot — but you can name it anything you like) and create a table called followed_users within the database with two fields (username, date_added)

Remember the installation path. You’ll need it.

You’ll also need the following python packages:

  • selenium
  • mysql-connector

Getting down to it

Alright, so first thing we’ll be doing is creating settings.json. Simply a .json file that will hold all of our settings so we don’t have to dive into the code every time we want to change something.

Settings

settings.json:

{
  "db": {
    "host": "localhost",
    "user": "root",
    "pass": "",
    "database": "instabot"
  },
  "instagram": {
    "user": "",
    "pass": ""
  },
  "config": {
    "days_to_unfollow": 1,
    "likes_over": 150,
    "check_followers_every": 3600,
    "hashtags": []
  }
}

As you can see, under “db”, we specify the database information. As I mentioned, I used “instabot”, but feel free to use whatever name you want.

You’ll also need to fill Instagram info under “instagram” so the bot can login into your account.

“config” is for our bot’s settings. Here’s what the fields mean:

days_to_unfollow: number of days before unfollowing users

likes_over: ignore posts if the number of likes is above this number

check_followers_every: number of seconds before checking if it’s time to unfollow any of the users

hashtags: a list of strings with the hashtag names the bot should be active on

Constants

Now, we want to take these settings and have them inside our code as constants.

Create Constants.py:

import json
INST_USER= INST_PASS= USER= PASS= HOST= DATABASE= POST_COMMENTS= ''
LIKES_LIMIT= DAYS_TO_UNFOLLOW= CHECK_FOLLOWERS_EVERY= 0
HASHTAGS= []

def init():
    global INST_USER, INST_PASS, USER, PASS, HOST, DATABASE, LIKES_LIMIT, DAYS_TO_UNFOLLOW, CHECK_FOLLOWERS_EVERY, HASHTAGS
    # read file
    data = None
    with open('settings.json', 'r') as myfile:
        data = myfile.read()
    obj = json.loads(data)
    INST_USER = obj['instagram']['user']
    INST_PASS = obj['instagram']['pass']
    USER = obj['db']['user']
    HOST = obj['db']['host']
    PASS = obj['db']['pass']
    DATABASE = obj['db']['database']
    LIKES_LIMIT = obj['config']['likes_over']
    CHECK_FOLLOWERS_EVERY = obj['config']['check_followers_every']
    HASHTAGS = obj['config']['hashtags']
    DAYS_TO_UNFOLLOW = obj['config']['days_to_unfollow']

the init() function we created reads the data from settings.json and feeds them into the constants we declared.

Engine

Alright, time for some architecture. Our bot will mainly operate from a python script with an init and update methods. Create BotEngine.py:

import Constants


def init(webdriver):
    return


def update(webdriver):
    return

We’ll be back later to put the logic here, but for now, we need an entry point.

Entry Point

Create our entry point, InstaBot.py:

from selenium import webdriver
import BotEngine

chromedriver_path = 'YOUR CHROMEDRIVER PATH' 
webdriver = webdriver.Chrome(executable_path=chromedriver_path)

BotEngine.init(webdriver)
BotEngine.update(webdriver)

webdriver.close()

chromedriver_path = ‘YOUR CHROMEDRIVER PATH’ webdriver = webdriver.Chrome(executable_path=chromedriver_path)

BotEngine.init(webdriver)
BotEngine.update(webdriver)

webdriver.close()

Of course, you’ll need to swap “YOUR CHROMEDRIVER PATH” with your actual ChromeDriver path.

Time Helper

We need to create a helper script that will help us calculate elapsed days since a certain date (so we know if we should unfollow user)

Create TimeHelper.py:

import datetime


def days_since_date(n):
    diff = datetime.datetime.now().date() - n
    return diff.days

Database

Create DBHandler.py. It’ll contain a class that handles connecting to the Database for us.

import mysql.connector
import Constants
class DBHandler:
    def __init__(self):
        DBHandler.HOST = Constants.HOST
        DBHandler.USER = Constants.USER
        DBHandler.DBNAME = Constants.DATABASE
        DBHandler.PASSWORD = Constants.PASS
    HOST = Constants.HOST
    USER = Constants.USER
    DBNAME = Constants.DATABASE
    PASSWORD = Constants.PASS
    @staticmethod
    def get_mydb():
        if DBHandler.DBNAME == '':
            Constants.init()
        db = DBHandler()
        mydb = db.connect()
        return mydb

    def connect(self):
        mydb = mysql.connector.connect(
            host=DBHandler.HOST,
            user=DBHandler.USER,
            passwd=DBHandler.PASSWORD,
            database = DBHandler.DBNAME
        )
        return mydb

As you can see, we’re using the constants we defined.

The class contains a static method get_mydb() that returns a database connection we can use.

Now, let’s define a DB user script that contains the DB operations we need to perform on the user.

Create DBUsers.py:

import datetime, TimeHelper
from DBHandler import *
import Constants

#delete user by username
def delete_user(username):
    mydb = DBHandler.get_mydb()
    cursor = mydb.cursor()
    sql = "DELETE FROM followed_users WHERE username = '{0}'".format(username)
    cursor.execute(sql)
    mydb.commit()


#add new username
def add_user(username):
    mydb = DBHandler.get_mydb()
    cursor = mydb.cursor()
    now = datetime.datetime.now().date()
    cursor.execute("INSERT INTO followed_users(username, date_added) VALUES(%s,%s)",(username, now))
    mydb.commit()


#check if any user qualifies to be unfollowed
def check_unfollow_list():
    mydb = DBHandler.get_mydb()
    cursor = mydb.cursor()
    cursor.execute("SELECT * FROM followed_users")
    results = cursor.fetchall()
    users_to_unfollow = []
    for r in results:
        d = TimeHelper.days_since_date(r[1])
        if d > Constants.DAYS_TO_UNFOLLOW:
            users_to_unfollow.append(r[0])
    return users_to_unfollow


#get all followed users
def get_followed_users():
    users = []
    mydb = DBHandler.get_mydb()
    cursor = mydb.cursor()
    cursor.execute("SELECT * FROM followed_users")
    results = cursor.fetchall()
    for r in results:
        users.append(r[0])

    return users

Account Agent

Alright, we’re about to start our bot. We’re creating a script called AccountAgent.py that will contain the agent behavior.

Import some modules, some of which we need for later and write a login function that will make use of our webdriver.

Notice that we have to keep calling the sleep function between actions. If we send too many requests quickly, the Instagram servers will be alarmed and will deny any requests you send.

from time import sleep
import datetime
import DBUsers, Constants
import traceback
import random

def login(webdriver):
    #Open the instagram login page
    webdriver.get('https://www.instagram.com/accounts/login/?source=auth_switcher')
    #sleep for 3 seconds to prevent issues with the server
    sleep(3)
    #Find username and password fields and set their input using our constants
    username = webdriver.find_element_by_name('username')
    username.send_keys(Constants.INST_USER)
    password = webdriver.find_element_by_name('password')
    password.send_keys(Constants.INST_PASS)
    #Get the login button
    try:
        button_login = webdriver.find_element_by_xpath(
            '//*[@id="react-root"]/section/main/div/article/div/div[1]/div/form/div[4]/button')
    except:
        button_login = webdriver.find_element_by_xpath(
            '//*[@id="react-root"]/section/main/div/article/div/div[1]/div/form/div[6]/button/div')
    #sleep again
    sleep(2)
    #click login
    button_login.click()
    sleep(3)
    #In case you get a popup after logging in, press not now.
    #If not, then just return
    try:
        notnow = webdriver.find_element_by_css_selector(
            'body > div.RnEpo.Yx5HN > div > div > div.mt3GC > button.aOOlW.HoLwm')
        notnow.click()
    except:
        return

Also note how we’re getting elements with their xpath. To do so, right click on the element, click “Inspect”, then right click on the element again inside the inspector, and choose Copy->Copy XPath.

Another important thing to be aware of is that element hierarchy change with the page’s layout when you resize or stretch the window. That’s why we’re checking for two different xpaths for the login button.

Now go back to BotEngine.py, we’re ready to login.

Add more imports that we’ll need later and fill in the init function

import AccountAgent, DBUsers
import Constants
import datetime


def init(webdriver):
    Constants.init()
    AccountAgent.login(webdriver)


def update(webdriver):
    return

If you run our entry script now (InstaBot.py) you’ll see the bot logging in.

Perfect, now let’s add a method that will allow us to follow people to AccountAgent.py:

def follow_people(webdriver):
    #all the followed user
    prev_user_list = DBUsers.get_followed_users()
    #a list to store newly followed users
    new_followed = []
    #counters
    followed = 0
    likes = 0
    #Iterate theough all the hashtags from the constants
    for hashtag in Constants.HASHTAGS:
        #Visit the hashtag
        webdriver.get('https://www.instagram.com/explore/tags/' + hashtag+ '/')
        sleep(5)

        #Get the first post thumbnail and click on it
        first_thumbnail = webdriver.find_element_by_xpath(
            '//*[@id="react-root"]/section/main/article/div[1]/div/div/div[1]/div[1]/a/div')

        first_thumbnail.click()
        sleep(random.randint(1,3))

        try:
            #iterate over the first 200 posts in the hashtag
            for x in range(1,200):
                t_start = datetime.datetime.now()
                #Get the poster's username
                username = webdriver.find_element_by_xpath('/html/body/div[3]/div[2]/div/article/header/div[2]/div[1]/div[1]/h2/a').text
                likes_over_limit = False
                try:
                    #get number of likes and compare it to the maximum number of likes to ignore post
                    likes = int(webdriver.find_element_by_xpath(
                        '/html/body/div[3]/div[2]/div/article/div[2]/section[2]/div/div/button/span').text)
                    if likes > Constants.LIKES_LIMIT:
                        print("likes over {0}".format(Constants.LIKES_LIMIT))
                        likes_over_limit = True


                    print("Detected: {0}".format(username))
                    #If username isn't stored in the database and the likes are in the acceptable range
                    if username not in prev_user_list and not likes_over_limit:
                        #Don't press the button if the text doesn't say follow
                        if webdriver.find_element_by_xpath('/html/body/div[3]/div[2]/div/article/header/div[2]/div[1]/div[2]/button').text == 'Follow':
                            #Use DBUsers to add the new user to the database
                            DBUsers.add_user(username)
                            #Click follow
                            webdriver.find_element_by_xpath('/html/body/div[3]/div[2]/div/article/header/div[2]/div[1]/div[2]/button').click()
                            followed += 1
                            print("Followed: {0}, #{1}".format(username, followed))
                            new_followed.append(username)


                        # Liking the picture
                        button_like = webdriver.find_element_by_xpath(
                            '/html/body/div[3]/div[2]/div/article/div[2]/section[1]/span[1]/button')

                        button_like.click()
                        likes += 1
                        print("Liked {0}'s post, #{1}".format(username, likes))
                        sleep(random.randint(5, 18))


                    # Next picture
                    webdriver.find_element_by_link_text('Next').click()
                    sleep(random.randint(20, 30))
                    
                except:
                    traceback.print_exc()
                    continue
                t_end = datetime.datetime.now()

                #calculate elapsed time
                t_elapsed = t_end - t_start
                print("This post took {0} seconds".format(t_elapsed.total_seconds()))


        except:
            traceback.print_exc()
            continue

        #add new list to old list
        for n in range(0, len(new_followed)):
            prev_user_list.append(new_followed[n])
        print('Liked {} photos.'.format(likes))
        print('Followed {} new people.'.format(followed))

It’s pretty long, but generally here’s the steps of the algorithm:

For every hashtag in the hashtag constant list:

  • Visit the hashtag link
  • Open the first thumbnail
  • Now, execute the following code 200 times (first 200 posts in the hashtag)
  • Get poster’s username, check if not already following, follow, like the post, then click next
  • If already following just click next quickly

Now we might as well implement the unfollow method, hopefully the engine will be feeding us the usernames to unfollow in a list:

def unfollow_people(webdriver, people):
    #if only one user, append in a list
    if not isinstance(people, (list,)):
        p = people
        people = []
        people.append(p)

    for user in people:
        try:
            webdriver.get('https://www.instagram.com/' + user + '/')
            sleep(5)
            unfollow_xpath = '//*[@id="react-root"]/section/main/div/header/section/div[1]/div[1]/span/span[1]/button'

            unfollow_confirm_xpath = '/html/body/div[3]/div/div/div[3]/button[1]'

            if webdriver.find_element_by_xpath(unfollow_xpath).text == "Following":
                sleep(random.randint(4, 15))
                webdriver.find_element_by_xpath(unfollow_xpath).click()
                sleep(2)
                webdriver.find_element_by_xpath(unfollow_confirm_xpath).click()
                sleep(4)
            DBUsers.delete_user(user)

        except Exception:
            traceback.print_exc()
            continue

Now we can finally go back and finish the bot by implementing the rest of BotEngine.py:

import AccountAgent, DBUsers
import Constants
import datetime


def init(webdriver):
    Constants.init()
    AccountAgent.login(webdriver)


def update(webdriver):
    #Get start of time to calculate elapsed time later
    start = datetime.datetime.now()
    #Before the loop, check if should unfollow anyone
    _check_follow_list(webdriver)
    while True:
        #Start following operation
        AccountAgent.follow_people(webdriver)
        #Get the time at the end
        end = datetime.datetime.now()
        #How much time has passed?
        elapsed = end - start
        #If greater than our constant to check on
        #followers, check on followers
        if elapsed.total_seconds() >= Constants.CHECK_FOLLOWERS_EVERY:
            #reset the start variable to now
            start = datetime.datetime.now()
            #check on followers
            _check_follow_list(webdriver)


def _check_follow_list(webdriver):
    print("Checking for users to unfollow")
    #get the unfollow list
    users = DBUsers.check_unfollow_list()
    #if there's anyone in the list, start unfollowing operation
    if len(users) > 0:
        AccountAgent.unfollow_people(webdriver, users)

Conclusion

And that’s it — now you have yourself a fully functional Instagram bot built with Python and Selenium. There are many possibilities for you to explore now, so make sure you’re using this newly gained skill to solve real life problems!

You can get the source code for the whole project from this GitHub repository.


Building a simple Instagram bot with Python tutorial

Here we build a simple bot using some simple Python which beginner to intermediate coders can follow.

Here’s the code on GitHub
https://github.com/aj-4/ig-followers


Build A (Full-Featured) Instagram Bot With Python

Source Code: https://github.com/jg-fisher/instagram-bot 


How to Get Instagram Followers/Likes Using Python

In this video I show you how to program your own Instagram Bot using Python and Selenium.

https://www.youtube.com/watch?v=BGU2X5lrz9M 

Code Link:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
import random
import sys


def print_same_line(text):
    sys.stdout.write('\r')
    sys.stdout.flush()
    sys.stdout.write(text)
    sys.stdout.flush()


class InstagramBot:

    def __init__(self, username, password):
        self.username = username
        self.password = password
        self.driver = webdriver.Chrome()

    def closeBrowser(self):
        self.driver.close()

    def login(self):
        driver = self.driver
        driver.get("https://www.instagram.com/")
        time.sleep(2)
        login_button = driver.find_element_by_xpath("//a[@href='/accounts/login/?source=auth_switcher']")
        login_button.click()
        time.sleep(2)
        user_name_elem = driver.find_element_by_xpath("//input[@name='username']")
        user_name_elem.clear()
        user_name_elem.send_keys(self.username)
        passworword_elem = driver.find_element_by_xpath("//input[@name='password']")
        passworword_elem.clear()
        passworword_elem.send_keys(self.password)
        passworword_elem.send_keys(Keys.RETURN)
        time.sleep(2)


    def like_photo(self, hashtag):
        driver = self.driver
        driver.get("https://www.instagram.com/explore/tags/" + hashtag + "/")
        time.sleep(2)

        # gathering photos
        pic_hrefs = []
        for i in range(1, 7):
            try:
                driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
                time.sleep(2)
                # get tags
                hrefs_in_view = driver.find_elements_by_tag_name('a')
                # finding relevant hrefs
                hrefs_in_view = [elem.get_attribute('href') for elem in hrefs_in_view
                                 if '.com/p/' in elem.get_attribute('href')]
                # building list of unique photos
                [pic_hrefs.append(href) for href in hrefs_in_view if href not in pic_hrefs]
                # print("Check: pic href length " + str(len(pic_hrefs)))
            except Exception:
                continue

        # Liking photos
        unique_photos = len(pic_hrefs)
        for pic_href in pic_hrefs:
            driver.get(pic_href)
            time.sleep(2)
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            try:
                time.sleep(random.randint(2, 4))
                like_button = lambda: driver.find_element_by_xpath('//span[@aria-label="Like"]').click()
                like_button().click()
                for second in reversed(range(0, random.randint(18, 28))):
                    print_same_line("#" + hashtag + ': unique photos left: ' + str(unique_photos)
                                    + " | Sleeping " + str(second))
                    time.sleep(1)
            except Exception as e:
                time.sleep(2)
            unique_photos -= 1

if __name__ == "__main__":

    username = "USERNAME"
    password = "PASSWORD"

    ig = InstagramBot(username, password)
    ig.login()

    hashtags = ['amazing', 'beautiful', 'adventure', 'photography', 'nofilter',
                'newyork', 'artsy', 'alumni', 'lion', 'best', 'fun', 'happy',
                'art', 'funny', 'me', 'followme', 'follow', 'cinematography', 'cinema',
                'love', 'instagood', 'instagood', 'followme', 'fashion', 'sun', 'scruffy',
                'street', 'canon', 'beauty', 'studio', 'pretty', 'vintage', 'fierce']

    while True:
        try:
            # Choose a random tag from the list of tags
            tag = random.choice(hashtags)
            ig.like_photo(tag)
        except Exception:
            ig.closeBrowser()
            time.sleep(60)
            ig = InstagramBot(username, password)
            ig.login()

Build An INSTAGRAM Bot With Python That Gets You Followers


Instagram Automation Using Python


How to Create an Instagram Bot | Get More Followers


Building a simple Instagram Influencer Bot with Python tutorial

#python #chatbot #web-development