RogerBW's Blog

Fractionally less basic text parsing with Winnow
08 February 2025

I've been making a little more progress with text parsing in Rust using the winnow library. Today I will parse the input format (and thus provide a very limited spoiler) for Advent of Code 2015 day 16.

I'm still not claiming this is the best way to do things, but it's a way that works.

An example line from my puzzle input is:

Sue 1: cars: 9, akitas: 3, goldfish: 0

After the ID number, each comma-separated item is a key-value pair, which I want to put into a HashMap. Here's the target structure:

struct Sue {
    id: u32,
    attr: HashMap<String, u32>,
}

Here I'll parse a set of attributes, separated by commas. Lifetimes in Rust are still a very new thing to me, but as I understand it the basic idea here is to say "the output is a reference to parts of the input, so the input must be kept allocated until I've finished with the output".

separated() gets me a sequence of things with a common separator: in this case I want one or more key-value pairs, separated by a comma and at least one space.

separated_pair() looks at just one of those pairs, a word and a number separated by a colon and at least one space.

Specifying the output type in the function signature lets Rust work out behind the scenes that the parsed pairs should be collected into a HashMap.

fn parse_attributes<'a>(
    input: &mut &'a str,
) -> ModalResult<HashMap<&'a str, u32>> {
    separated(
        1..,
        separated_pair(alpha1, (":", space1), dec_uint),
        (",", space1),
    )
    .parse_next(input)
}
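As a cross-check on what those combinators are doing, here is a rough sketch of the same job using only the standard library. The name attributes_by_hand is hypothetical and not part of the original code, and it is stricter than the winnow version: it assumes exactly ", " between pairs and ": " inside each pair, and it does not check that keys are alphabetic. The point is mostly the lifetime: each key is a slice of the input, so the returned map cannot outlive the input string.

```rust
use std::collections::HashMap;

// Hypothetical helper: a std-only approximation of parse_attributes.
// Assumes exactly ", " between pairs and ": " inside each pair.
fn attributes_by_hand<'a>(input: &'a str) -> Option<HashMap<&'a str, u32>> {
    let mut attrs = HashMap::new();
    for pair in input.split(", ") {
        // `key` is a slice borrowed from `input`, so it carries lifetime 'a.
        let (key, value) = pair.split_once(": ")?;
        attrs.insert(key, value.parse::<u32>().ok()?);
    }
    Some(attrs)
}
```

Calling attributes_by_hand("cars: 9, akitas: 3, goldfish: 0") should give a three-entry map, while malformed input such as "cars 9" gives None rather than a descriptive parse error, which is one of the things winnow does better.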

The line parser looks for the "Sue (number):" part of the line, and extracts the number, then throws parse_attributes at the rest.

seq! is a macro that lets me specify several fields to parse in a row, but discard some of them (in this case the fixed text "Sue " and ": ").

fn parse_line(input: &mut &str) -> ModalResult<Sue> {
    let c = seq!(
        _: "Sue ",
        dec_uint,
        _: ": ",
        parse_attributes,
    )
    .parse_next(input)?;

Then, because I don't want to have to preserve the input once I've finished parsing, I copy all the &str references to distinct Strings to go in the output HashMap. (This is a thing that serious Rust people seem to regard as Bad, and I can see the inefficiency, but this is a pretty tiny problem.)

    let mut p: HashMap<String, u32> = HashMap::new();
    for (k, v) in c.1.iter() {
        p.insert(k.to_string(), *v);
    }
    Ok(Sue { id: c.0, attr: p })
}
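The copying step could also be written as an iterator chain instead of an explicit loop. This is only a sketch of an alternative, with a hypothetical helper name (to_owned_map) that does not appear in the original code; it does the same allocation per key, just more compactly.

```rust
use std::collections::HashMap;

// Hypothetical helper: copy each borrowed key into an owned String so the
// resulting map no longer depends on the input staying alive.
fn to_owned_map(borrowed: &HashMap<&str, u32>) -> HashMap<String, u32> {
    borrowed
        .iter()
        .map(|(k, v)| (k.to_string(), *v))
        .collect()
}
```

Under that assumption, the last few lines of parse_line would reduce to building the Sue with attr: to_owned_map(&c.1).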

The usual includes are needed at the top, in this case:

use std::collections::HashMap;
use winnow::ascii::{alpha1, dec_uint, space1};
use winnow::combinator::{separated, separated_pair, seq};
use winnow::ModalResult;
use winnow::Parser;

See also:
Basic text Parsing with Winnow
