RogerBW's Blog

Dustbin Day, iCalendar, and PhantomJS
18 April 2023

I wanted to get dustbin collection days into the house calendar server. Shouldn't be too hard, right?

It's not quite as simple as "recyclables week A, main rubbish week B", because collections get deferred for bank holidays (especially around Christmas), and sometimes (as last summer when the council refused to pay extra money to the contracting company, I mean "had a labour shortage") some collections get cancelled completely.

The local council provides this information in various ways. It used to put a card through the door a couple of times a year, and sometimes it still does; a PDF version of that is made available, but generally the new one isn't released (electronically or physically) until after the old one has expired. And of course that requires me to type in all the exceptions by hand, and doesn't get updated for emergencies.

But help is at hand! They have a web page on which you can specify your address, and get back the next collection for each sort of rubbish. Not much in the way of advance notice, but they do actually keep it up to date for extra bank holidays and such like. So I can just scrape that and parse the page, right? Right?

Well.

If you are me, you already know your house's UPRN, which of course is what they (quite reasonably) use as an input to the lookup. But you can't just submit that. Or even type in an address. Or even bookmark the results page. No, you have to go in through their postcode lookup. Which needs JavaScript, so that's rather beyond what poor old WWW::Mechanize can manage. (Somewhere behind all this there's a straightforward API call, but I wasn't able to get it to respond to my prodding any more simply than going through the pages; the necessary parameters are put together by the JavaScript, and even replaying a request captured in the browser didn't work reliably.)

This calls, in fact, for a headless browser. Selenium is the canonical answer to this problem, but that needs a great big Java daemon – and Java in general doesn't have the best of security reputations, nor what one might call a small footprint. So instead I ended up using PhantomJS – canonically a dead project, but it still works, it's in Debian/stable, and it's much more lightweight.

This is basically a central lump of code with tentacles. To the user it presents itself as a JavaScript interpreter; to the web it runs a WebKit browser (QtWebKit, to be precise). One directs it with JavaScript, which I've been learning since last year, and one can also mark code to be run inside the context of the loaded page.
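A minimal sketch of those two contexts, assuming PhantomJS's webpage module. The URL is a placeholder rather than the council's real page, and countTags is a name made up for illustration. The function handed to page.evaluate() is serialised and run inside the page, so it can see document but none of the controlling script's variables, and only JSON-serialisable values cross the boundary in either direction.

```javascript
// Runs inside the loaded page: it can see `document`, but none of the
// variables of the controlling script. Its return value must be
// JSON-serialisable to make it back out.
function countTags(tagName) {
  return document.getElementsByTagName(tagName).length;
}

// Only try to drive a browser when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  // Placeholder URL (an assumption), not the real council page.
  page.open('https://example.org/', function (status) {
    if (status === 'success') {
      // Extra arguments to evaluate() are handed to the in-page function.
      console.log(page.evaluate(countTags, 'a') + ' links on the page');
    }
    phantom.exit(status === 'success' ? 0 : 1);
  });
}
```

Run as "phantomjs script.js"; everything outside countTags executes in the interpreter, everything inside it executes in the browser.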

So the procedure ends up being:

  • load the first page
  • enter my postcode
  • click on the lookup
  • wait
  • check the dropdown for my address
  • select it, and trigger a "change" event on the dropdown
  • wait
  • submit the form
  • wait
  • get back the results page, and parse it for the dates
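The steps above can be sketched roughly as follows. This is not my actual script: the URL, the postcode, the address text and every selector (#postcode, #lookup, #address, #submit, #results) are hypothetical stand-ins, and what exactly each "wait" should poll for depends on the page in question. The PhantomJS calls used (webpage.create, page.open, page.evaluate, page.content, phantom.exit) are the standard ones.

```javascript
// Every value and selector here is a hypothetical placeholder.
var START_URL = 'https://council.example/bin-collections';
var POSTCODE = 'AB1 2CD';
var ADDRESS = '1 Example Street';

// In-page helper. It must be self-contained, because page.evaluate()
// serialises the function and runs it inside the page.
function inPage(action, arg) {
  function fire(el, type) {
    var evt = document.createEvent(type === 'click' ? 'MouseEvents' : 'HTMLEvents');
    evt.initEvent(type, true, true);
    el.dispatchEvent(evt);
  }
  if (action === 'lookup') {            // enter postcode, click the lookup
    document.querySelector('#postcode').value = arg;
    fire(document.querySelector('#lookup'), 'click');
    return true;
  }
  if (action === 'choose') {            // pick the address, fire "change"
    var select = document.querySelector('#address');
    if (!select) { return false; }
    for (var i = 0; i < select.options.length; i++) {
      if (select.options[i].text.indexOf(arg) !== -1) {
        select.selectedIndex = i;
        fire(select, 'change');
        return true;
      }
    }
    return false;                       // dropdown not populated yet
  }
  if (action === 'submit') {            // submit once the button is enabled
    var button = document.querySelector('#submit');
    if (!button || button.disabled) { return false; }
    button.form.submit();
    return true;
  }
  if (action === 'hasResults') {
    return document.querySelector('#results') !== null;
  }
  return false;
}

// Poll `check` every 250 ms; call `next` when it succeeds, give up on timeout.
function waitFor(check, next, timeoutMs) {
  var start = Date.now();
  var timer = setInterval(function () {
    if (check()) {
      clearInterval(timer);
      next();
    } else if (Date.now() - start > timeoutMs) {
      clearInterval(timer);
      console.error('timed out');
      phantom.exit(1);
    }
  }, 250);
}

if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  page.open(START_URL, function (status) {
    if (status !== 'success') { phantom.exit(1); return; }
    page.evaluate(inPage, 'lookup', POSTCODE);
    waitFor(function () { return page.evaluate(inPage, 'choose', ADDRESS); }, function () {
      waitFor(function () { return page.evaluate(inPage, 'submit'); }, function () {
        waitFor(function () { return page.evaluate(inPage, 'hasResults'); }, function () {
          console.log(page.content);    // results page; parsing not shown here
          phantom.exit(0);
        }, 10000);
      }, 10000);
    }, 10000);
  });
}
```

Polling rather than sleeping for a fixed time means each step proceeds as soon as the page is ready, and a page that never becomes ready produces a non-zero exit instead of a hang.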

In-browser JavaScript has useful methods like document.getElementsByTagName() so I do the final HTML parsing there, and dump JSON onto stdout for a calmanager plugin to pick up and update my iCalendar server. (That does things like lumping multiple collections together into a single calendar entry, and making the actual diary event go off on the previous evening to remind me to put the bins out on the night before what might be an early morning pickup.)
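To make that concrete, here is a hedged sketch of both halves in JavaScript. It assumes a results page that lists each collection as a table row with the type in the first cell and an ISO date (YYYY-MM-DD) in the second, which is an invention for the example; the real page layout and the real plugin are not shown here, and scrapeRows, groupByDate and eveningBefore are illustrative names. The sample data at the end is made up.

```javascript
// In-page: pull {type, date} pairs out of table rows (assumed layout).
// Under PhantomJS this would be called as
//   console.log(JSON.stringify(page.evaluate(scrapeRows)));
function scrapeRows() {
  var out = [];
  var rows = document.getElementsByTagName('tr');
  for (var i = 0; i < rows.length; i++) {
    var cells = rows[i].getElementsByTagName('td');
    if (cells.length >= 2) {
      out.push({
        type: cells[0].textContent.trim(),
        date: cells[1].textContent.trim()
      });
    }
  }
  return out;
}

// Lump all collections that fall on the same date into one entry.
function groupByDate(rows) {
  var byDate = {};
  rows.forEach(function (r) {
    (byDate[r.date] = byDate[r.date] || []).push(r.type);
  });
  return Object.keys(byDate).sort().map(function (d) {
    return { date: d, types: byDate[d] };
  });
}

// The evening before an ISO date, as an iCalendar floating local time
// (YYYYMMDDTHHMMSS). Day-of-month 0 rolls back into the previous month.
function eveningBefore(isoDate, hour) {
  var p = isoDate.split('-');
  var d = new Date(Date.UTC(+p[0], +p[1] - 1, +p[2] - 1));
  function two(n) { return (n < 10 ? '0' : '') + n; }
  return d.getUTCFullYear() + two(d.getUTCMonth() + 1) +
         two(d.getUTCDate()) + 'T' + two(hour) + '0000';
}

// Made-up sample data, standing in for the scraped JSON.
var sample = [
  { type: 'Recycling', date: '2023-05-03' },
  { type: 'Garden waste', date: '2023-05-03' },
  { type: 'Refuse', date: '2023-05-10' }
];
groupByDate(sample).forEach(function (g) {
  console.log(eveningBefore(g.date, 19) + ' Put out: ' + g.types.join(', '));
});
// prints:
// 20230502T190000 Put out: Recycling, Garden waste
// 20230509T190000 Put out: Refuse
```

Doing the date arithmetic in UTC sidesteps daylight-saving oddities when stepping back a day; the resulting string is then treated as a local "floating" time for the reminder.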

I'm not planning to make this code public, but if you have a use for it, let me know.

I wonder how much the council paid for this overcomplicated setup?

Tags: computing

