Feed Me
Our first frost should come any time now, and I want to have warning of it so we can rescue our tomatoes. Well, I have a link to the NWS Area Forecast Discussion in my bookmarks which I try to read every day, but some days I forget. What I need is a feed. But as far as I can tell only Honolulu is cool enough to get a feed for AFDs. Weird. So what I needed was to create a feed from an existing web page.
I thought I would find a website that offers a service like this, but I didn't (in a few short minutes of searching). I found websites that got halfway there, but they were very complicated to set up, or they didn't display the body of the page, only a link. The whole point is that I want to read it in my feed reader!
So I whipped up this Ruby code (standard libs only, no gems required):
#! /usr/bin/ruby
require 'erb'
require 'open-uri'
require 'ostruct'
require 'time' # Time#rfc822 and Time#iso8601 (open-uri loads it too, but be explicit)
# configuration
channel = OpenStruct.new(:url => "http://www.srh.noaa.gov/data/EPZ/AFDEPZ",
                         :title => "EPZAFD",
                         :description => "National Weather Service Area Forecast Discussion, El Paso TX/Santa Teresa NM")
item = OpenStruct.new(:url => channel.url,
                      :title => "Area Forecast Discussion",
                      :date => Time.now)
# fetch the page
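# URI.open is the open-uri entry point on Ruby 2.5+; on older Rubies use plain open
afd = URI.open(item.url)
item.date = afd.last_modified unless afd.last_modified.nil?
# escape the raw text first so a stray < or & survives inside the <pre>
item.body = "<pre>" + ERB::Util.h(afd.read) + "</pre>"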
# emit
include ERB::Util
template = ERB.new <<EOF
Content-Type: application/rss+xml

<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title><%= h channel.title %></title>
<link><%= h channel.url %></link>
<description><%= h channel.description %></description>
<lastBuildDate><%= h Time.now.rfc822 %></lastBuildDate>
<generator>feedme</generator>
<item>
<title><%= h item.title %></title>
<description><%= h item.body %></description>
<pubDate><%= h item.date.rfc822 %></pubDate>
<guid><%= h "#{channel.url}?date=#{item.date.iso8601}" %></guid>
</item>
</channel>
</rss>
EOF
puts template.result(binding)
Nothing overly fancy here. I use open-uri to fetch the page, pull out the Last-Modified header (if it exists) to use as the item date, and shoehorn the page text into an ERB template for the RSS.
In this case I just made it executable, slapped a Content-Type header before the output, and call it as a CGI. You could just as well run a cron job to update a file on disk (in which case remove the Content-Type header from the template).
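For the cron variant, here is a minimal sketch of the "write it to a file instead" step. The helper names (write_feed, strip_cgi_header) and the output path are made up for illustration; they are not part of the script above. The sketch strips the CGI header after rendering rather than editing the template, so one template can serve both modes.

```ruby
# Hypothetical helper: drop the leading Content-Type line (and the blank
# line after it) from rendered output, leaving only the XML.
def strip_cgi_header(output)
  output.sub(/\AContent-Type:[^\n]*\n\s*/, "")
end

# Hypothetical helper: write the feed under a temporary name, then rename
# it into place so a feed reader never fetches a half-written file.
def write_feed(path, xml)
  tmp = "#{path}.tmp"
  File.open(tmp, "w") { |f| f.write(xml) }
  File.rename(tmp, path)
end

# In the script, the final `puts` line would then become something like
# (the path is an assumption, adjust to your web root):
#   write_feed("/var/www/afd.rss", strip_cgi_header(template.result(binding)))
```

Removing the header line from the template by hand works just as well if you only ever run it from cron.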
Once I found the pure text version of the AFD, it was just a matter of slapping it between <pre> tags, but if you had some actual screen scraping to do you might want to look at Hpricot which makes that really easy. In particular, I could have used the URL http://www.crh.noaa.gov/product.php?site=NWS&issuedby=EPZ&product=AFD&format=txt&version=1&glossary=1 and done
...
require 'hpricot'
...
item.body = (doc/"#content").to_html
which is in fact how I started out. But this page doesn't send a Last-Modified header, so the item date falls back to Time.now and the guid changes on every run. That means my feed reader would always show it as a new item (every time the cron job updated, or every time I hit the CGI script, either way). Luckily I found the text-only URL that doesn't have this problem.
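If you're stuck with a page that has no Last-Modified header, one possible workaround (not what I ended up doing) is to build the guid from a hash of the page content, so it only changes when the text does. This is a sketch under assumptions: stable_guid is a hypothetical helper, and it assumes your feed reader decides "new item" by guid.

```ruby
require 'digest/sha1'
require 'time'

# Hypothetical helper: use the Last-Modified time when we have one,
# otherwise fall back to a SHA-1 of the body so that identical content
# always yields the identical guid.
def stable_guid(url, body, last_modified = nil)
  if last_modified
    "#{url}?date=#{last_modified.iso8601}"
  else
    "#{url}?sha1=#{Digest::SHA1.hexdigest(body)}"
  end
end

# Usage in the script would look something like:
#   guid = stable_guid(channel.url, item.body, afd.last_modified)
```

This only helps if the scraped fragment is itself stable. If the page embeds a timestamp or other per-request content inside the part you extract, the hash will change on every fetch anyway.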
display: inline-block
I'm not an HTML/CSS guru, but I do know my way around and I do try to use
"semantic HTML" and CSS for styling wherever possible. I hate tables for
layout, if for no other reason than they're a big mess and hard to read, and I
do all my HTML by hand (or generated programmatically, e.g. by
Markdown,
Markaby, or
Haml).
So I'm a little bit surprised that even lil' ol' dabbler me has more than once
wished to put blocks together in an inline-like flow, e.g. for a photo gallery
or other dynamic grid-based layout, and yet the big guys are content to fall
back on tables. Maybe I can afford to be more puritanical as a hobbyist.
I dug into the issue once again, determined to finally grok the CSS box model
and visual formatting model once and for all. I don't know if I got that far,
but I do understand enough about display: block and display: inline to see why
it doesn't work the way I was hoping it would work. It boils down to this: a
block element will seek the left edge of its parent.
But there *is* a way to do precisely what I want to do in CSS 2.1. It's called display: inline-block, and it says to lay out block elements in the inline flow. If you're still lost about what I'm driving at, there's a nice live demo and screenshot of display: inline-block at quirksmode.org. Alas, IE and Gecko (Firefox, Mozilla, Camino, etc.) don't support this display mode. But in browsers that do (e.g. KHTML-based browsers like Konq and Safari), it's a thing of beauty.
So we need a workaround for Firefox and IE. You could revert to tables if you detect those browsers, but I like to keep the HTML unchanged. You could use float: left, and that works very well unless your blocks are different heights. So you can force your blocks to be the same height, and possibly use overflow: auto to give scrollbars as needed (or hide the overflow or let it just overflow or whatever).
So here's a little example. The blue blocks use display: inline-block and the green blocks use float: left.
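[Live demo: blue blocks using display: inline-block and green blocks using float: left, filled with lorem ipsum text.]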
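For anyone who wants to reproduce a demo like this, here's a minimal sketch that generates the HTML from Ruby with ERB. All the class names, colors, and sizes are made up for illustration, not taken from the original demo. The green blocks get a fixed height and overflow: auto, per the float workaround described above.

```ruby
require 'erb'

# Class names, colors and sizes are assumptions chosen for illustration.
DEMO = ERB.new(<<~HTML)
  <style>
    .blue  { display: inline-block; vertical-align: top; width: 8em; margin: 2px; background: #ccf; }
    .green { float: left; width: 8em; height: 6em; overflow: auto; margin: 2px; background: #cfc; }
    .clear { clear: left; }
  </style>
  <% ["blue", "green"].each do |kind| %>
  <div class="gallery">
  <% texts.each do |text| %>
    <div class="<%= kind %>"><%= ERB::Util.h(text) %></div>
  <% end %>
    <div class="clear"></div>
  </div>
  <% end %>
HTML

# Render one gallery of blue blocks and one of green blocks, each
# containing the given texts (HTML-escaped).
def demo_html(texts)
  DEMO.result_with_hash(texts: texts)
end

puts demo_html(["lorem", "lorem ipsum dolor sit amet", "ipsum"])
```

Giving the blocks text of different lengths is the point: it makes the height differences that trip up float: left visible.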
Do play around with it by resizing your browser and dusting off that Konqueror or Safari browser to see how the blue blocks look when rendered correctly.
I think it's a crying shame that Firefox doesn't render this properly. Maybe the complacent table-hugging masses haven't made enough noise. Let's make some noise.
While I'm on the subject, I highly recommend the css_browser_selector.js method of doing browser-specific CSS.