Exercism's Rust Track: Rust Code Normalizer

A normalizing representer for Exercism's Rust track.

What's a Normalizing Representer?

A representer's job is to normalize input code by stripping out or replacing trivial details that differ between students' submissions. Comments, whitespace, and variable names do not contribute to the logical flow and structure of a student's approach: comments and whitespace are stripped out, and variable names are replaced by a standard placeholder.

The ultimate purpose of the representer is to speed up mentor response times: by standardizing student implementations, it lets mentors focus their feedback on the approach the student took to solve the problem.

Example

Given an example submission for the two-fer exercise like the following:

fn twofer(name: &str) -> String {
    match name {
        "" => "One for you, one for me.".to_string(),
        // use the `format!` macro to return a formatted String
        _ => format!("One for {}, one for me.", name),
    }
}

The representer will return:

fn PLACEHOLDER_1(PLACEHOLDER_2: &str) -> String {
    match PLACEHOLDER_2 {
        "" => "One for you, one for me.".to_string(),
        _ => format!("One for {}, one for me.", PLACEHOLDER_2),
    }
}

Progress

The representer currently handles the following constructs and features:

let bindings
struct names
struct fields
enum names
enum variants
fn definitions
fn calls
method calls
const names
static names
union names
type aliases
match expressions
match arms
macro arguments
closure expressions
for loops
while loops
loop expressions
if expressions
impl blocks
type annotations
if let bindings
user-defined types
user-defined traits
mod imports
output variable mappings to a JSON file
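
To illustrate the last item, here is a minimal sketch of how a mapping between placeholders and original names might be rendered as JSON. The function name `mapping_to_json`, the direction of the mapping (placeholder to original name), and the hand-rolled formatting are all assumptions made for illustration; the representer itself may use a serialization crate and a different layout.

```rust
/// Hypothetical helper: renders (placeholder, original name) pairs as a
/// pretty-printed JSON object. Rust identifiers cannot contain quotes or
/// backslashes, so this sketch does no string escaping.
pub fn mapping_to_json(pairs: &[(&str, &str)]) -> String {
    let body: Vec<String> = pairs
        .iter()
        .map(|(placeholder, original)| format!("  \"{}\": \"{}\"", placeholder, original))
        .collect();
    // `{{` and `}}` are literal braces inside a format string.
    format!("{{\n{}\n}}", body.join(",\n"))
}
```

For the two-fer example above, calling `mapping_to_json(&[("PLACEHOLDER_1", "twofer"), ("PLACEHOLDER_2", "name")])` would yield an object with one entry per placeholder.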

Design

The high-level steps the representer takes are as follows:

1. It transforms the source code into an AST, stripping out comments in the process.
2. From there, it traverses the AST, looking for identifiers.
3. When it finds an identifier:
   - It checks whether the identifier is a Rust keyword (or any other sort of identifier that isn't actually being used as a variable/function name).
   - If the identifier isn't a keyword, it then checks if the identifier is one that has been encountered before.
     - If it is, then a placeholder for this identifier has already been generated and stored in a HashMap; the identifier is replaced with the placeholder.
     - If it isn't, then the placeholder needs to be generated and saved in the HashMap before the identifier is replaced by it.
4. The transformed output is then put through another formatting pass.
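
The identifier-handling step can be sketched with the standard library alone. This is a simplified illustration, not the representer's actual code: the `PlaceholderMap` type, its `replace` method, and the abbreviated `KEYWORDS` list are hypothetical, and the sketch works on bare identifier strings rather than on AST nodes. It also filters out keywords only, whereas the example output above shows that names such as `String` and `format!` are left alone as well.

```rust
use std::collections::HashMap;

/// A small, illustrative subset of Rust keywords; the full list is longer.
const KEYWORDS: &[&str] = &[
    "fn", "let", "mut", "match", "if", "else", "for", "while", "loop",
    "struct", "enum", "impl", "return", "self",
];

/// Hypothetical helper type: remembers the placeholder assigned to each
/// identifier seen so far.
#[derive(Default)]
pub struct PlaceholderMap {
    seen: HashMap<String, String>,
}

impl PlaceholderMap {
    /// Returns the placeholder for `ident`, or `None` when `ident` is a
    /// keyword and must be left untouched.
    pub fn replace(&mut self, ident: &str) -> Option<String> {
        if KEYWORDS.contains(&ident) {
            return None;
        }
        // Placeholders are numbered from 1 in order of first appearance.
        let next = self.seen.len() + 1;
        let placeholder = self
            .seen
            .entry(ident.to_string())
            // Only runs for identifiers that have not been seen before.
            .or_insert_with(|| format!("PLACEHOLDER_{}", next));
        Some(placeholder.clone())
    }
}
```

Feeding the identifiers of the two-fer example through `replace` in source order assigns `PLACEHOLDER_1` to `twofer` and `PLACEHOLDER_2` to `name`, and every later occurrence of `name` reuses `PLACEHOLDER_2`, which keeps the output consistent within one submission.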

Download Details:

Author: exercism

Official GitHub: https://github.com/exercism/rust-representer

License: AGPL-3.0 license
