RogerBW's Blog

Perl Weekly Challenge 32
04 November 2019

I’ve been doing the Perl Weekly Challenges. (I missed number 31 because I was getting ready for Essen, and I didn’t have time to do this one in Perl6.) This week’s was about counting entities and generating ASCII bar charts.

Create a script that either reads standard input or one or more files specified on the command-line. Count the number of times [each item occurs] and then print a summary, sorted by the count of each entry.

For extra credit, add a -csv option to your script, which would generate:

Those of us who speak Unix recognise this as the extremely useful formulation |sort|uniq -c|sort -nr, which I use often enough that I can type it as though it were a long and familiar word. (Sort the lines, count how often each one occurs, sort that list numerically in descending order.)

But in Perl the most obvious approach is to build a hash keyed on the lines, so we do:

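use strict;
use warnings;
use Getopt::Std;
use Text::CSV_XS;

# -c switches the output to CSV
my %o;
getopts('c',\%o);

# count how many times each input line occurs
my %s;
while (<>) {
  chomp;
  $s{$_}++;
}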

Then sort the keys by their counts, descending (breaking ties alphabetically by key), and all is done.

my $csv = Text::CSV_XS->new;

foreach my $k (sort {$s{$b} <=> $s{$a} ||
                       $a cmp $b} keys %s) {
  if ($o{c}) {
    $csv->say(*STDOUT,[$k,$s{$k}]);
  } else {
    print "$k $s{$k}\n";
  }
}
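
To see the ordering in isolation, here is a minimal sketch that wraps the same tally and comparator in a function and feeds it a fixed list rather than standard input. The name `tally` and the sample data are invented for illustration; they are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: tally a list of lines and return [line, count] pairs,
# highest count first, ties broken alphabetically by line.
sub tally {
  my %s;
  $s{$_}++ for @_;
  my @order = sort { $s{$b} <=> $s{$a} || $a cmp $b } keys %s;
  my @pairs;
  foreach my $k (@order) {
    push @pairs, [$k, $s{$k}];
  }
  return @pairs;
}

my @input = qw(apple cherry banana apple cherry apple banana cherry);
foreach my $pair (tally(@input)) {
  print "$pair->[0] $pair->[1]\n";
}
# prints:
# apple 3
# cherry 3
# banana 2
```

The two three-count entries come out in alphabetical order because of the `$a cmp $b` tie-break; without it, their relative order would depend on hash ordering.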

The use of Text::CSV_XS is possibly a heavier-weight approach than this problem really requires, but I’ve been bitten by the vagaries of CSV “standard” formatting before. If someone had asked me to do this for a real problem, I’d use the module so that when their specific requirements for CSV files turned out to be subtly different from what I’d produced I could just tweak the module parameters rather than re-invent things from scratch.

(It's entirely standard until you need to include a comma within a data field. Or a quotation mark of some sort. Or a non-ASCII character. Or transfer files between Unix and the outside world. Or…)
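
The first two of those cases can be made concrete. What follows is a hand-rolled sketch of the usual quoting rule: wrap a field in double quotes when it contains a comma, a quotation mark or a line break, and double any embedded quotation marks. The names `csv_field` and `csv_line` are invented here, and this is not how Text::CSV_XS is implemented. It ignores encodings, line endings and every dialect option, which is rather the argument for using the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Text::CSV_XS): quote a single field only
# when it contains a comma, a double quote or a line break, doubling any
# embedded double quotes.
sub csv_field {
  my $f = shift;
  if ($f =~ /[",\r\n]/) {
    $f =~ s/"/""/g;
    return qq{"$f"};
  }
  return $f;
}

# Hypothetical helper: join already-quoted fields into one CSV record.
sub csv_line {
  return join(',', map { csv_field($_) } @_);
}

print csv_line('plain', 3), "\n";      # plain,3
print csv_line('a,b', 2), "\n";        # "a,b",2
print csv_line('say "hi"', 1), "\n";   # "say ""hi""",1
```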

Write a function that takes a hashref where the keys are labels and the values are integer or floating point values. Generate a bar graph of the data and display it to stdout.

If you fancy then please try this as well: (a) the function could let you specify whether the chart should be ordered by (1) the labels, or (2) the values.

I know that NeilB, who contributed these questions, maintains a module to produce tabular output…

Terminal width is always a slightly fiddly thing, so I allow the caller to specify it; then I scale the largest bar to the full width of the terminal (minus the width of the longest label, and the decoration), and the others grow or shrink accordingly. Yes, there’s a bug here if the allowed width is too narrow for the labels and decoration; and this function doesn’t allow for negative values either. The third parameter should be non-zero if you want ordering by labels.

use List::Util qw(max);

sub generate_bar_graph {
  my $data=shift;
  my $width=shift || $ENV{COLUMNS} || 80;
  my $labelordering=shift || 0;
  my @k=keys %{$data};
  if ($labelordering) {
    @k=sort @k;
  } else {
    @k=sort {$data->{$b} <=> $data->{$a}} @k;
  }
  my $kl=max(map {length($_)} @k);
  my $bw=$width-$kl-3;
  my $scale=$bw/max(values %{$data});
  my $format='%-'.$kl.'s | %-'.$bw."s\n";
  foreach my $k (@k) {
    printf($format,$k,'#' x ($scale*$data->{$k}));
  }
}
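
As a worked example of the scaling: with a width of 21 and a longest label of 6 characters, the bar area is 21-6-3 = 12 columns, so a largest value of 6 gives a scale of 2 columns per unit. The sketch below is a hypothetical variant, not the function above: `bar_lines` is an invented name, it returns the lines rather than printing them, it does not pad the bars with trailing spaces, and its guards for a too-narrow width, an all-zero data set and negative values are assumptions about what sensible behaviour might be.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical variant of generate_bar_graph: returns the chart lines
# instead of printing them, and guards the cases where the scale would
# otherwise be undefined or negative.
sub bar_lines {
  my ($data, $width, $by_label) = @_;
  $width ||= 80;
  my @k = keys %{$data};
  return () unless @k;
  if ($by_label) {
    @k = sort @k;
  } else {
    @k = sort { $data->{$b} <=> $data->{$a} || $a cmp $b } @k;
  }
  my $kl = max(map { length($_) } @k);
  # Assumption: keep at least one column for bars, even if that overruns $width.
  my $bw = max(1, $width - $kl - 3);
  my $top = max(values %{$data});
  # Assumption: when no value is positive, every bar is simply empty.
  my $scale = $top > 0 ? $bw / $top : 0;
  my $format = '%-' . $kl . 's | %s';
  my @lines;
  foreach my $k (@k) {
    # Negative values are clamped to zero; "x" truncates to a whole number.
    my $n = max(0, $scale * $data->{$k});
    push @lines, sprintf($format, $k, '#' x $n);
  }
  return @lines;
}

print "$_\n" for bar_lines({ apple => 3, banana => 6, kiwi => 1 }, 21);
# prints:
# banana | ############
# apple  | ######
# kiwi   | ##
```

Passing a true third argument, as in `bar_lines($data, 21, 1)`, orders the rows by label instead.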

