F@H Team Statistics Scraper
I created a team for Little Filament on Folding@home. Our team number is 172406 (in case you want to join). I wanted to show our latest stats on the Little Filament site, but as far as I can tell there is no API for the stats, so I worked up a scraper in bash.
Basically, all it does is fetch the page, then grep and sed its way to the variables, finally dumping them into a JSON file (for easy JavaScript consumption).
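The grep-then-sed step can be tried on its own. The sketch below runs it against a made-up HTML fragment; the real team page's markup is not reproduced here, so the table layout is an assumption, and `extract_number` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical fragment standing in for the team page; the real markup may differ.
sample_html='<tr>
<td>Grand Score</td>
<td>4355</td>
</tr>'

# Hypothetical helper: read HTML on stdin and print the digits found
# within two lines of the given label.
extract_number() {
    local label=$1
    grep -C 2 "$label" | sed 's/[^0-9]//g' | tr -d '\n'
}

score=$(printf '%s\n' "$sample_html" | extract_number "Grand Score")
echo "score=$score"
```

Note that this approach keeps every digit near the label, so any unrelated number inside the context window (an attribute value, for instance) would end up glued onto the result.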
The kicker is that the stats server is overloaded or down a lot, so we can't rely on it and we don't want to stress it out further. My decision was to poll it at a large interval, 12-24 hours. I don't have enough clients on the team to effect significant change over 6-12 hours, but I don't want to fall too far out of date either. So if the server is overloaded and drops a request once or twice, it's not a big deal.
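The once-per-interval check comes down to comparing two Unix timestamps. Here is a minimal sketch, assuming a 24-hour interval and a hypothetical helper named `should_poll`:

```shell
#!/bin/bash
# Hypothetical helper: succeed when more than `interval` seconds
# have passed since `last`.
should_poll() {
    local now=$1 last=$2 interval=$3
    [ "$now" -gt $((last + interval)) ]
}

now=$(date +%s)
last=$((now - 90000))   # pretend the last successful poll was 25 hours ago

if should_poll "$now" "$last" 86400; then
    echo "poll"
else
    echo "skip"
fi
```

If the stored timestamp is only updated after a successful fetch, a failed attempt is simply retried on the next run instead of waiting out a full interval.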
Without further ado, here is the script.
#!/bin/bash
# Poll the Folding@home team page at most once every 24 hours.
NOW=$(date +%s)
# Time of the last successful update; fall back to 0 if the lock file is missing.
THEN=$(cat fah_check.lock 2>/dev/null | tr -d '\n')
THEN=${THEN:-0}
if [ "$NOW" -gt $((THEN + 86400)) ]; then
    wget "http://fah-web.stanford.edu/cgi-bin/main.py?qtype=teampage&teamnum=172406" -O fah_check.html
    if [ "$?" == "0" ]; then
        # Make sure the page actually contains the stats before parsing it.
        grep "Grand Score" fah_check.html > /dev/null 2>&1
        if [ "$?" == "0" ]; then
            # Keep only the digits from the lines around each label.
            SCORE=$(grep -C 2 "Grand Score" fah_check.html | sed 's/[^0-9]//g' | tr -d '\n')
            WU=$(grep -C 2 "Work Unit Count" fah_check.html | sed 's/[^0-9]//g' | tr -d '\n')
            # Keep digits plus "o" and "f", then rebuild "N of M".
            RANK=$(grep -C 1 "Team Ranking" fah_check.html | sed 's/[^0-9of]//g' | tr -d '\n' | sed 's/f\([0-9]*\)of\([0-9]*\)/\1 of \2/')
            echo "{\"score\": \"$SCORE\", \"work_units\": \"$WU\", \"rank\": \"$RANK\" }" > fah_check.json
            echo "[$NOW] - Success!" >> fah_check.log
            # Only record the time on success, so failures are retried on the next run.
            echo "$NOW" > fah_check.lock
        else
            echo "[$NOW] - Filter Failed" >> fah_check.log
        fi
    else
        echo "[$NOW] - Download Failed" >> fah_check.log
    fi
else
    echo "[$NOW] - Skip Update" >> fah_check.log
fi
That cranks out fah_check.json, which looks like this:
{"score": "4355", "work_units": "20", "rank": "39881 of 169721" }
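Because the rank is stored as a single string, anything that wants the two numbers separately has to split it. Here is a minimal sketch in bash using the sample value above; the variable names are illustrative only.

```shell
#!/bin/bash
rank="39881 of 169721"

# Split on the literal " of " separator with parameter expansion.
position=${rank%% of *}
total=${rank##* of }

# Integer percentage (truncated) of how far down the list the team sits.
percent=$((position * 100 / total))
echo "position=$position total=$total percent=$percent"
```

The same split is easy to do on the JavaScript side; keeping the string in the JSON just means the page can print it as-is.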
To see it in action, check out the Little Filament Folding page.