Fast & Non-regex Based Lexical Analysis for Deno

A lightweight, fast, non-regex-based tokeniser for Deno.

features

  • Faster than regex-based lexers.
  • Clever identifier lookup.
  • Works on Deno, Node and the browser.

usage

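// Assumes `Lexer` has already been imported from the lexy module.
const lexer = new Lexer("1 + 1 - 1");

// Emit a "one" token for the character "1".
lexer.use("1", (ch) => {
  return { type: "one", value: ch };
});

// Emit an "operator" token for either "+" or "-".
lexer.use(["+", "-"], (ch) => {
  return { type: "operator", value: ch };
});

// Run the lexer over the input and collect the tokens.
const tokens = lexer.lex();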
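
To illustrate how character-keyed handlers like these can drive a lexer without regular expressions, here is a minimal sketch. `MiniLexer` is a hypothetical stand-in written for this example. It is not lexy's source code, and lexy's real behaviour (for instance, how it treats whitespace or unregistered characters) may differ.

```typescript
// Hypothetical sketch: NOT lexy's implementation.
type Token = { type: string; value: string };
type Handler = (ch: string) => Token;

class MiniLexer {
  private input: string;
  private handlers = new Map<string, Handler>();

  constructor(input: string) {
    this.input = input;
  }

  // Register one handler for a single character or for a list of characters.
  use(chars: string | string[], handler: Handler): void {
    const list = Array.isArray(chars) ? chars : [chars];
    for (let i = 0; i < list.length; i++) {
      this.handlers.set(list[i], handler);
    }
  }

  // Walk the input once, looking up each character in the handler map.
  lex(): Token[] {
    const tokens: Token[] = [];
    for (let i = 0; i < this.input.length; i++) {
      const ch = this.input.charAt(i);
      const handler = this.handlers.get(ch);
      // Assumption: characters without a handler (such as spaces) are skipped.
      if (handler) tokens.push(handler(ch));
    }
    return tokens;
  }
}

const mini = new MiniLexer("1 + 1 - 1");
mini.use("1", (ch) => ({ type: "one", value: ch }));
mini.use(["+", "-"], (ch) => ({ type: "operator", value: ch }));
const miniTokens = mini.lex();
```

Each character costs a single map lookup, so the scan stays linear in the input length and involves no pattern matching.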

benchmarks

Benchmark code can be found in benchmarks/benchmark.ts. The results below compare a math expression lexer implemented with lexy against one implemented with regexp. Input length: 400001 characters.

iters   lexy (ms)   regexp (ms)
1       0.461942    196.247439
10      0.587031    1526.35242
100     0.61392     14445.481962
1000    1.158556    156252.4421

Lower is better.
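
A comparison of this kind could be set up roughly as follows. This is a sketch only: `lexByHand`, `lexByRegex`, and `timeMs` are hypothetical names invented here, none of the code comes from benchmarks/benchmark.ts, and the input is much smaller than the one used for the table above. The timings it prints will vary by runtime and machine.

```typescript
// Hypothetical harness: not taken from benchmarks/benchmark.ts.
type Tok = { type: string; value: string };

// Hand-written scanner: one pass, classifying characters by comparison.
function lexByHand(src: string): Tok[] {
  const out: Tok[] = [];
  let i = 0;
  while (i < src.length) {
    const ch = src.charAt(i);
    if (ch >= "0" && ch <= "9") {
      // Consume a run of digits as a single number token.
      const start = i;
      while (i < src.length && src.charAt(i) >= "0" && src.charAt(i) <= "9") i++;
      out.push({ type: "number", value: src.slice(start, i) });
    } else if (ch === "+" || ch === "-" || ch === "*" || ch === "/") {
      out.push({ type: "operator", value: ch });
      i++;
    } else {
      i++; // skip whitespace and anything unrecognised
    }
  }
  return out;
}

// Regex-based lexer: a global regex repeatedly matched against the input.
function lexByRegex(src: string): Tok[] {
  const out: Tok[] = [];
  const re = /(\d+)|([+\-*/])/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(src)) !== null) {
    // Group 1 matched means a number; otherwise the operator group matched.
    out.push({ type: m[1] !== undefined ? "number" : "operator", value: m[0] });
  }
  return out;
}

// Run `fn` `iters` times and return the total elapsed milliseconds.
function timeMs(fn: () => void, iters: number): number {
  const start = performance.now();
  for (let n = 0; n < iters; n++) fn();
  return performance.now() - start;
}

const input = "1 + 2 * 3 - 4 / 5 ".repeat(1000);
const handMs = timeMs(() => { lexByHand(input); }, 10);
const regexMs = timeMs(() => { lexByRegex(input); }, 10);
console.log(`hand-written: ${handMs.toFixed(3)} ms, regex: ${regexMs.toFixed(3)} ms`);
```

Both functions should produce the same token stream for the same input, so any difference in timing comes from how the scanning is done rather than from what is produced.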

why

Lexical analysis is a common first step in building parsers and analysers. Many libraries build their lexers on regex, although that can hurt performance in the long run.

In real programming languages, lexers are generally hand-written rather than built on regex. One of the main reasons is performance.

This article compares hand-written lexers with regex-based ones.

Download Details:

Author: littledivy

Source Code: https://github.com/littledivy/lexy

