A Lexer Generator for Dart

lexer_builder

A lexer generator for Dart.

Features

  • Match tokens using RegExp syntax.
  • Rules dependent on lexer state.
  • Generates lexer code automatically.

Caveats

  • The generated lexers match with RegExp internally, so they are likely slower than handwritten ones.

Getting started

Add lexer_builder and build_runner to dev_dependencies and lexer_builder_runtime to dependencies in your pubspec.yaml.
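
As a rough sketch, the relevant part of pubspec.yaml might look like the fragment below. The `any` constraints for lexer_builder_runtime and build_runner are placeholders assumed for illustration, not versions stated by this package; ^0.1.0 is the lexer_builder constraint shown in the install instructions on this page.

```yaml
# Sketch only: replace the `any` placeholders with real version constraints.
dependencies:
  lexer_builder_runtime: any   # imported by the code that declares the lexer

dev_dependencies:
  build_runner: any            # runs the code generator
  lexer_builder: ^0.1.0        # the generator itself
```

Once a lexer is declared, the generated part file is normally produced with build_runner's standard command, dart run build_runner build.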

Usage

Annotate a class with @Lexer() and its rule methods with @Rule() to define a lexer. See the example below for detailed instructions.

TODO

  • Eventually generate custom matching code for the rules instead of using RegExp internally.
  • Building on that, support async lexing by tracking partially matched subgroups and dispatching a rule as soon as the next character can no longer extend any match.

Additional information

For more information about lexers in general, see the flex lexer generator and its documentation.

Use this package as a library

Depend on it

Run this command:

With Dart:

 $ dart pub add lexer_builder

This will add a line like this to your package's pubspec.yaml (and run an implicit dart pub get):

dependencies:
  lexer_builder: ^0.1.0

Alternatively, your editor might support dart pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:lexer_builder/lexer_builder.dart'; 

example/lib/example.dart

import 'package:lexer_builder_runtime/lexer_builder_runtime.dart';

// This line is needed to include the generated lexer.
// Always include "part 'filename.g.dart';" in files where you declare lexers.
part 'example.g.dart';

/// The [Token] type for the lexer.
/// The lexer returns a [List] of tokens generated by the [Rule] methods (actions if you're coming from the lexer generator flex).
/// This has to be a subclass of [Token].
/// You can pass arbitrary data in tokens back out of the lexer.
/// This example just passes back the matched String.
class StringLexerToken extends Token {
  /// The value matched by the rule.
  String value;
  
  StringLexerToken(this.value);
}

/// The lexer class.
/// Annotate a class with [Lexer] to generate a _Classname class for it to extend.
/// The generated class includes the matching code for the rules, and needs the Token class as a type parameter.
/// The optional parameter startState defines the starting state for the lexer, and defaults to 0.
@Lexer()
class StringLexer extends _StringLexer<StringLexerToken> {
  
  // Since the generated class defines these methods, you should use override.
  @override
  // Rule specifies the method as a rule for the lexer.
  // The first parameter is the pattern that will cause the method to be executed if matched.
  // The second parameter is the priority. Among the rules that match, the one with the highest priority is selected.
  @Rule("[a-zA-Z0-9]+", 0)
  TokenResponse<StringLexerToken> word(String token, int line, int char, int index) {
    // TokenResponse.accept accepts the match and can return a token to the token stream.
    // TokenResponse.reject would cause the lexer to find another matching rule for the input instead.
    return TokenResponse.accept(StringLexerToken(token));
  }
  
  @override
  @Rule(r"\s+", 0)
  TokenResponse<StringLexerToken> space(String token, int line, int char, int index) {
    // If null is passed, no token is placed in the token stream for this rule.
    return TokenResponse.accept(null);
  }
  
  @override
  @Rule('"', 1)
  TokenResponse<StringLexerToken> quote(String token, int line, int char, int index) {
    // The variable state is defined by the generated class and lets you query and modify the lexer state.
    // Here state 0 represents a word matching state and state 1 matches everything in double quotes, like a string literal.
    if (state == 0) {
      state = 1;
    } else {
      state = 0;
    }
    return TokenResponse.accept(null);
  }
  
  
  @override
  // The optional parameter state defines the state in which the rule will be considered for matching.
  // It defaults to null, which means the rule is considered in any state.
  // Here it is set to 1 so that this rule only matches in the string literal state.
  @Rule('[^"]+', 1, state: 1)
  TokenResponse<StringLexerToken> wordQuoted(String token, int line, int char, int index) {
    return TokenResponse.accept(StringLexerToken(token));
  }
}
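
To make the priority and state semantics described in the comments above more concrete, here is a small hand-written sketch of the same four rules using only dart:core. It is not the code that lexer_builder generates, and SketchRule and sketchLex are hypothetical names invented for this illustration. It assumes that, at each position, the matching rule with the highest priority wins and that ties go to the rule declared first; the tie-breaking part is an assumption, not documented behaviour.

```dart
/// Hypothetical rule description for this sketch; not part of lexer_builder.
class SketchRule {
  SketchRule(this.name, String pattern, this.priority, {this.state})
      : pattern = RegExp(pattern);

  final String name;
  final RegExp pattern;
  final int priority;

  /// The state in which the rule is active; null means every state.
  final int? state;
}

/// Lexes [input] with rules mirroring the example: state 0 matches words,
/// state 1 matches everything between double quotes.
List<String> sketchLex(String input) {
  final rules = [
    SketchRule('word', r'[a-zA-Z0-9]+', 0),
    SketchRule('space', r'\s+', 0),
    SketchRule('quote', '"', 1),
    SketchRule('wordQuoted', '[^"]+', 1, state: 1),
  ];
  var state = 0;
  var index = 0;
  final tokens = <String>[];

  while (index < input.length) {
    SketchRule? best;
    Match? bestMatch;
    for (final rule in rules) {
      // Skip rules that are bound to a different state.
      if (rule.state != null && rule.state != state) continue;
      // Only accept non-empty matches that start exactly at the current index.
      final match = rule.pattern.matchAsPrefix(input, index);
      if (match == null || match.end == index) continue;
      // Strictly greater: on a tie, the earlier rule is kept (assumption).
      if (best == null || rule.priority > best.priority) {
        best = rule;
        bestMatch = match;
      }
    }
    if (best == null || bestMatch == null) {
      throw FormatException('No rule matches', input, index);
    }
    if (best.name == 'quote') {
      // Toggle between the word state and the string literal state.
      state = state == 0 ? 1 : 0;
    } else if (best.name != 'space') {
      // word and wordQuoted emit the matched text; space emits nothing.
      tokens.add(bestMatch.group(0)!);
    }
    index = bestMatch.end;
  }
  return tokens;
}

void main() {
  print(sketchLex('abc "hello world" 42')); // prints [abc, hello world, 42]
}
```

Inside the quotes, both word and wordQuoted match at the letter h, but wordQuoted has the higher priority and therefore consumes the whole quoted text, including the space.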

Download details:

Author: 

Source: https://github.com/tareksander/Dart-Parsertools/tree/main/lexer_builder
