RogerBW's Blog

Basic text Parsing with Winnow 04 February 2025

For Advent of Code 2023 I tried out the nom parsing library and then switched to winnow, but I found it quite hard work and did a lot of cargo-culting of my own code. I went back to the library for a recent problem and think I've got more of a handle on it now. Since there seem to be very few worked examples, here's mine. Note that this is a partial spoiler for Advent of Code 2015 day 7.

I'm not claiming this is the best way to do things, but it's a way that works.

This is a system of logic gates and related operations, and there are three distinct possible line layouts:

  • value operator value -> address
  • NOT value -> address
  • value -> address

An address is the name of a variable, which can hold a number. A value can be either a constant (literal number) or an address, in which case the value at that address is used. (Therefore the thing on the right-hand side of the "->" assignment glyph is an address, not a value.)

Operators can be "AND", "OR", "LSHIFT" or "RSHIFT". (Technically NOT could also be considered an operator, and even pure assignment, but we'll be treating them differently because of the different syntax.)

OK, so we need some data structures. I'm going to store addresses as numbers (treating the letters as base-36 digits) rather as than the alphabetic strings that they're supplied as, because in Rust it's less faff to copy numbers about the place than ditto strings, and a 32-bit integer will fit up to six base-36 digits. So here's a trick you can do with an enum to let a variable hold one of two different things without changing its type.

#[derive(Debug)]
enum Value {
    Const(u16),
    Address(u32),
}

The operator. Every parsed line will contain an operator, so I include "Assign" and "Not" as possibilities.

#[derive(Debug)]
enum Op {
    Assign,
    Not,
    And,
    Or,
    Lshift,
    Rshift,
}

A gate. This is what I'm ultimately going to parse every line into. (If the operator only takes one argument, the second will be ignored. I could use an Option<Value> here but I didn't bother.)

#[derive(Debug)]
struct Gate {
    op: Op,
    in_a: Value,
    in_b: Value,
    out: Value,
}

Right. Now to build parsers from the bottom up. Here I'll parse an unsigned decimal number, and store it as a u16 to go into a Const-type Value.

fn parse_const(input: &mut &str) -> ModalResult<Value> {
    let c = dec_uint.parse_next(input)?;
    Ok(Value::Const(c))
}

(Note that ModalResult was known as PResult in older versions of winnow, but it works the same way. I say older versions; the release with the change happened after I wrote the first draft of this.)

So the first line tries the parse, and returns a failure if it doesn't succeed; the second line can assume it has valid contents, and returns the actual value I want.

Next we'll parse a string of letters for an address, which I'll treat as a base-36 number and store as a u32. (Yes, this will accept upper-case letters too, even though they might instead be an operator. That won't be a problem as we'll deal with it later.)

fn parse_address(input: &mut &str) -> ModalResult<Value> {
    let c = alpha1.parse_next(input)?;
    Ok(Value::Address(u32::from_str_radix(c, 36).unwrap()))
}

Now I'll combine these two to make a parser that'll recognise and return any Value. The alt combinator tries several parsers in succession on the same string, and returns with the first one that works (or the error from the last one).

Because all I want to return is the result of the parser, whatever it may be, I don't need to mess about with assigning to a variable and then wrapping some byproduct of that variable in Ok.

fn parse_value(input: &mut &str) -> ModalResult<Value> {
    alt((parse_const, parse_address)).parse_next(input)
}

Let's parse the four two-argument operators next. Again we use alt to try each possibility, but this time the alternatives are just string fragments. I use a match to pull these out as opcodes. (The panic shouldn't be able to happen because if we didn't have one of these four strings we shouldn't get as far as the match.)

fn parse_op(input: &mut &str) -> ModalResult<Op> {
    let c = alt(("AND", "OR", "LSHIFT", "RSHIFT")).parse_next(input)?;
    Ok(match c {
        "AND" => Op::And,
        "OR" => Op::Or,
        "LSHIFT" => Op::Lshift,
        "RSHIFT" => Op::Rshift,
        _ => panic!("bad op"),
    })
}

Now let's parse one of the three types of full lines, in this case an assignment ("value -> address"). I can use separated_pair to spot and discard the "->", and return the two values, which I'll build into a Gate (with a zero value for in_b).

Since this is an AoC problem I don't need to worry about incorrect input.

fn parse_assign(input: &mut &str) -> ModalResult<Gate> {
    let c = separated_pair(parse_value,
                           (space0, "->", space0),
                           parse_address
    )
        .parse_next(input)?;
    Ok(Gate { op: Op::Assign, in_a: c.0, in_b: Value::Const(0), out: c.1 })
}

Now to parse the second type of full line ("NOT value -> address"). Here I capture more than I need (the "NOT" and the separator), but I just ignore them when building the Gate object. (seq! offers a way round this, but I won't bother with it here.)

fn parse_oneop(input: &mut &str) -> ModalResult<Gate> {
    let c = ("NOT",
             space1,
             parse_value,
             (space0, "->", space0),
             parse_address
    )
        .parse_next(input)?;
    Ok(Gate { op: Op::Not, in_a: c.2, in_b: Value::Const(0), out: c.4 })
}

And the third one, "value operator value -> address". Again I don't care about the details of the separators.

fn parse_twoop(input: &mut &str) -> ModalResult<Gate> {
    let c = (
        parse_value,
        space1,
        parse_op,
        space1,
        parse_value,
        (space0, "->", space0),
        parse_address,
    )
        .parse_next(input)?;
    Ok(Gate { op: c.2, in_a: c.0, in_b: c.4, out: c.6 })
}

The last parsing step is a single parser that will try each of the last three, i.e. each allowable structure of a line, and return a Gate built from the one that works. (Which is why I didn't worry about selecting lower-case letters back in parse_address.)

fn parse_line(input: &mut &str) -> ModalResult<Gate> {
    alt((parse_twoop, parse_oneop, parse_assign)).parse_next(input)
}

Now I need to build that into my actual puzzle solver. I need to pass a mutable ref of the input string.

fn main() {
    let args: Vec<String> = env::args().collect();
    let fname = &args[1];
    let f = File::open(fname).unwrap();
    let fr = BufReader::new(f);
    let mut gates: Vec<Gate> = Vec::new();
    for line in fr.lines() {
        let lu = line.unwrap();
        let mut ll = lu.as_str();
        if let Ok(x) = parse_line(&mut ll) {
            gates.push(x);
        } else {
            panic!("bad parse");
        }
    }

And now I have a Vec of Gate structs, ready to go on to solving the actual problem… but that's out of scope for this post.

You'll need a whole lot of include lines like use winnow::combinator::alt; at the start of the program, but the compiler will tell you what they should be. In this case:

use winnow::ascii::{alpha1, dec_uint, space0, space1};
use winnow::combinator::{alt, separated_pair};
use winnow::ModalResult;
use winnow::Parser;

See also:
Advent of Code 2023

Add A Comment

Your Name
Your Email
Your Comment

Your submission will be ignored if any field is left blank, but your email address will not be displayed. Comments will be processed through markdown.

Search
Archive
Tags 1920s 1930s 1940s 1950s 1960s 1970s 1980s 1990s 2000s 2010s 2300ad 3d printing action advent of code aeronautics aikakirja anecdote animation anime army astronomy audio audio tech base commerce battletech bayern beer boardgaming book of the week bookmonth chain of command children chris chronicle church of no redeeming virtues cold war comedy computing contemporary cornish smuggler cosmic encounter coup covid-19 crime crystal cthulhu eternal cycling dead of winter doctor who documentary drama driving drone ecchi economics en garde espionage essen 2015 essen 2016 essen 2017 essen 2018 essen 2019 essen 2022 essen 2023 essen 2024 existential risk falklands war fandom fanfic fantasy feminism film firefly first world war flash point flight simulation food garmin drive gazebo genesys geocaching geodata gin gkp gurps gurps 101 gus harpoon historical history horror hugo 2014 hugo 2015 hugo 2016 hugo 2017 hugo 2018 hugo 2019 hugo 2020 hugo 2021 hugo 2022 hugo 2023 hugo 2024 hugo-nebula reread in brief avoid instrumented life javascript julian simpson julie enfield kickstarter kotlin learn to play leaving earth linux liquor lovecraftiana lua mecha men with beards mpd museum music mystery naval noir non-fiction one for the brow opera parody paul temple perl perl weekly challenge photography podcast politics postscript powers prediction privacy project woolsack pyracantha python quantum rail raku ranting raspberry pi reading reading boardgames social real life restaurant reviews romance rpg a day rpgs ruby rust scala science fiction scythe second world war security shipwreck simutrans smartphone south atlantic war squaddies stationery steampunk stuarts suburbia superheroes suspense television the resistance the weekly challenge thirsty meeples thriller tin soldier torg toys trailers travel type 26 type 31 type 45 vietnam war war wargaming weather wives and sweethearts writing about writing x-wing young adult
Special All book reviews, All film reviews
Produced by aikakirja v0.1