I ran in the Beat the Bridge 8k today, and I was getting frustrated waiting for the online results to get posted. So I fired up a little script.
#!/usr/local/bin/perl
use strict;
use warnings;

# Poll the results page once a minute and text me when results show up.
while (1) {
    eval {
        system("wget 'http://onlineraceresults.com/race/view_race.php?race_id=14261' -O race");
        # grep exits non-zero when the "no results" notice is absent from the page
        my $results = system("grep 'There are currently no results posted for this race' race");
        if ( $results ) {
            system("/usr/bin/mail -s 'Race results' xxxxxxxxxx\@tmomail.net < /home/zachmu/results") and die "can't send mail";
            exit 0;
        }
    };
    sleep 60;
}
By the way, if you know my mobile phone number, you can use the email address template above to send me text messages via email. Please don’t script this.
Anyway, the above didn’t work because of a subtle bug. See it? I didn’t, and so I didn’t get a text when the results were posted. It turns out that wget -O file wasn’t overwriting my existing file, so every fetch after the first was a no-op. Grr. Always read your man pages! A little call to rm inside the loop fixes this.
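Here is a minimal sketch of that fix, not the script I actually ran. It deletes the stale download before every fetch, so the check can never see an old copy, whatever your wget does with -O. The helper names results_posted and fetch_fresh are made up for illustration, and the -q flag (quiet output) is an addition of mine. It also treats an empty or missing page as "not posted yet", which is an assumption about how you want a failed fetch handled.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Notice the site shows while results are pending (copied from the script above).
my $PENDING = 'There are currently no results posted for this race';

# Hypothetical helper: true once the page no longer carries the pending notice.
# An empty or undefined page counts as "not posted", so a failed fetch
# does not trigger a false alarm.
sub results_posted {
    my ($html) = @_;
    return 0 if !defined $html || $html eq '';
    return index($html, $PENDING) < 0 ? 1 : 0;
}

# Hypothetical helper: remove any stale copy, then fetch the page into $file.
# Returns the page contents, or nothing if the fetch or the read failed.
sub fetch_fresh {
    my ($url, $file) = @_;
    unlink $file;    # the "call to rm" from the text
    system('wget', '-q', $url, '-O', $file) == 0 or return;
    open my $fh, '<', $file or return;
    local $/;        # slurp the whole file at once
    my $html = <$fh>;
    close $fh;
    return $html;
}
```

Inside the polling loop you would call fetch_fresh once a minute and send the mail as soon as results_posted returns true for the page it hands back.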
Next I wanted to see how I did compared to the rest of the pack, and wanted to view a histogram of times. The online results site doesn’t support such a thing, of course, so I ginned something up:
#!/usr/local/bin/perl
use warnings;
use strict;
use Data::Dumper;

my $seenDiv = 0;
my $runner;
my $cell = 0;
my $runners = [];
my $cellProcessors = [
    simpleFieldExtractor('link'),
    simpleFieldExtractor('first'),
    simpleFieldExtractor('last'),
    simpleFieldExtractor('division'),
    simpleFieldExtractor('place'),
    simpleFieldExtractor('divplace'),
    simpleFieldExtractor('genderplace'),
    timeExtractor('guntime'),
    timeExtractor('time'),
    timeExtractor('pace'),
];

while (<>) {
    my $line = $_;
    chomp($line);
    # print "DEBUG $line\n";

    # Skip everything before the results table, which starts at the DIVISION header.
    if (!$seenDiv && $line =~ m/DIVISION:/) {
        $seenDiv = 1;
    } elsif (!$seenDiv) {
        next;
    }

    if ($line =~ m/tr class/) {
        # A new table row means a new runner.
        $runner = {};
        push @$runners, $runner;
    } elsif ($line =~ m/td class.*>(.*)<\/td>/) {
        # Hand each cell to the processor for its column position.
        $cellProcessors->[$cell]->($runner, $1);
        $cell = ($cell + 1) % scalar @$cellProcessors;
    } elsif ($line =~ m/block-footer/) {
        last;
    } elsif ($line =~ m/<\/tr>/) {
        $cell = 0;
    }
}

# Drop rows that never got a finish time (header rows and the like).
my @filteredRunners = grep { defined $_->{'time:sec'} } @$runners;
$runners = \@filteredRunners;
analyze();
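
sub analyze {
    my @sortedByTime = sort {
        $a->{'time:hrs'} <=> $b->{'time:hrs'}
            || $a->{'time:min'} <=> $b->{'time:min'}
            || $a->{'time:sec'} <=> $b->{'time:sec'}
    } @$runners;

    my $STEP = 60;        # bucket width in seconds
    my $bucket;           # start of the current bucket, in seconds
    my $bucketCnt = 0;    # finishers counted in the current bucket

    foreach my $runner ( @sortedByTime ) {
        next if (not defined $runner->{'time:sec'});
        my $totalSec = $runner->{'time:hrs'} * 3600
                     + $runner->{'time:min'} * 60
                     + $runner->{'time:sec'};

        # A finisher at or past the end of the current bucket starts a new one.
        if (!defined $bucket || $totalSec >= $bucket + $STEP) {
            use integer;
            printBucket($bucket, $bucketCnt) if defined $bucket;
            $bucket = $totalSec - ($totalSec % $STEP);
            $bucketCnt = 1;
        } else {
            $bucketCnt++;
        }
    }

    # Flush the final bucket, which the loop itself never prints.
    printBucket($bucket, $bucketCnt) if defined $bucket;
}

# Print one histogram row: "h:mm:" label, one asterisk per 10 finishers, count.
sub printBucket {
    my ($bucket, $bucketCnt) = @_;
    use integer;
    my $label = $bucket / 3600;
    my $min = ($bucket % 3600) / 60;
    $min = "0$min" if ($min < 10);
    $label .= ":" . $min;
    print "$label:";
    for (my $i = 0; $i < $bucketCnt / 10; $i++) {
        print "*";
    }
    print " $bucketCnt\n";
}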
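
#print Dumper $runners;

# Returns a closure that stores the raw cell text under $fieldName.
sub simpleFieldExtractor {
    my $fieldName = shift;
    return sub {
        my ($runner, $field) = @_;
        $runner->{$fieldName} = $field;
    };
}

# Returns a closure that splits "h:mm:ss" or "mm:ss" into
# "$fieldName:hrs", "$fieldName:min" and "$fieldName:sec".
sub timeExtractor {
    my $fieldName = shift;
    return sub {
        my ($runner, $field) = @_;
        my ($hrs, $min, $sec) = split(/:/, $field);
        if (not defined $sec) {
            # Only two parts: the time is mm:ss, so shift everything down.
            $sec = $min;
            $min = $hrs;
            $hrs = 0;
        }
        $runner->{"$fieldName:hrs"} = $hrs;
        $runner->{"$fieldName:min"} = $min;
        $runner->{"$fieldName:sec"} = $sec;
    };
}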
When you feed this the race results page, it spits out the following histogram. Each asterisk represents 10 finishers with the time indicated, discarding seconds.
0:24: 2
0:25: 7
0:26:* 14
0:27:* 12
0:28:* 12
0:29:** 20
0:30:** 28
0:31:*** 38
0:32:**** 41
0:33:***** 51
0:34:******** 83
0:35:********* 90
0:36:************ 128
0:37:************* 137
0:38:****************** 180
0:39:********************* 216
0:40:********************* 210
0:41:********************** 228
0:42:************************ 243
0:43:***************************** 296
0:44:************************* 257
0:45:*********************** 238
0:46:****************** 187
0:47:********************** 221
0:48:************************ 241
0:49:******************** 209
0:50:******************* 192
0:51:*********************** 231
0:52:******************* 190
0:53:****************** 187
0:54:************ 129
0:55:*************** 151
0:56:************ 126
0:57:*********** 111
0:58:********* 98
0:59:******* 77
1:00:********* 97
1:01:****** 66
1:02:***** 55
1:03:***** 54
1:04:****** 66
1:05:*** 36
1:06:** 27
1:07:*** 32
1:08:** 27
1:09:* 17
1:10:* 18
1:11:** 26
1:12: 8
1:13: 7
1:14: 8
1:15: 8
1:16: 6
1:17: 9
1:18: 8
1:19: 7
1:20: 1
1:21: 3
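The bucketing behind a histogram like this can also be done by counting into a hash keyed by the start of each minute, instead of walking a sorted list. The following is only a sketch of that alternative, not the script above. The names to_seconds, bucket_counts and format_row are hypothetical, and the sample times are made up for illustration.

```perl
use strict;
use warnings;

# Convert "mm:ss" or "h:mm:ss" into whole seconds.
sub to_seconds {
    my ($time) = @_;
    my @parts = split /:/, $time;
    unshift @parts, 0 while @parts < 3;    # pad missing hours (and minutes)
    my ($hrs, $min, $sec) = @parts;
    return $hrs * 3600 + $min * 60 + int($sec);
}

# Return a hash ref mapping bucket start (in seconds) to finisher count.
sub bucket_counts {
    my ($step, @times) = @_;
    my %count;
    for my $time (@times) {
        my $total = to_seconds($time);
        $count{ $total - $total % $step }++;
    }
    return \%count;
}

# Format one row: "h:mm:" label, one asterisk per 10 finishers, then the count.
sub format_row {
    my ($bucket, $n) = @_;
    return sprintf '%d:%02d:%s %d',
        int($bucket / 3600), int($bucket % 3600 / 60), '*' x int($n / 10), $n;
}

# Made-up sample times, for illustration only.
my $counts = bucket_counts(60, qw(36:59 37:00 37:12 37:59 38:00 1:02:03));
print format_row($_, $counts->{$_}), "\n" for sort { $a <=> $b } keys %$counts;
```

Because every time lands directly in its own bucket, the order of the input doesn’t matter and no bucket can be skipped at the end of the list. As with the original, minutes with no finishers simply don’t appear.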
So, my 37-minute time puts me well ahead of the modal hump there, which sits at around 43 minutes. That’s what I wanted to know!
I should be able to use these same tools on other race results posted on that site, provided they don’t change their format significantly and break my screen scraping. Provide an API, you yokels! I hereby release the above software into the public domain, so if you’re of the running and coding persuasion feel free to use it!





