Jeff Croft: “If RSS content can have style, behavior, and everything else that we’re used to finding on the web, what’s the point of RSS?”
For this article, I’m going to focus on Atom, mostly because it’s the best spec’d format for all values of RSS. Atom also handles some of the things I want to talk about far more cleanly than RSS does. So, for the sake of this article, I’m going to pretend Jeff’s talking about Atom (and indeed, he may well be including Atom in the mix).
I can imagine sites that are built wholly out of Atom feeds. For a small site, maybe all of the content is stored in one feed. For much larger sites, each major section, or major/minor section, would get its own feed containing a group of related entries. The notion sounds absurd at first because there’s so much inertia behind the idea that Atom feeds can only be used for syndication. But I don’t see any technical reason why an Atom feed can’t be used to store a collection of related web pages just for browsing, and I do see a number of benefits.
On my business site, I use includes to wrap the template around the relevant content. Without the template, all the content taken together amounts to 11.5K of data, with an average of 1K per file. However, viewing the source of any given web page (and doing a quick byte count) shows that each served document is about 2.5K, so only 40% of each file (1K out of 2.5K) is the important stuff – the rest is just window dressing. If I were to include the byte count of all images, JavaScript, and CSS, that 40% gets even smaller.
So what if I were to set up an Atom feed of my site, and make enough entries in that feed for all my pages, then link in an XSLT file to wrap that content with an attractive template? What could that buy me, and any enterprising soul who visits my site?
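As a rough sketch of what such a feed might look like, the Python below builds an Atom document with one entry per page and puts an `<?xml-stylesheet?>` processing instruction at the top. The function name `build_site_feed`, the stylesheet name `site-template.xsl`, the `example.com` URLs, and the shape of the `pages` dictionaries are all assumptions made for illustration; none of them come from my actual site.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def q(tag):
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def build_site_feed(pages, stylesheet_href="site-template.xsl",
                    title="Example site", feed_id="http://example.com/"):
    """Return an Atom document (as a string) with one entry per page.

    `pages` is a hypothetical list of dicts with the keys
    "title", "id", "updated" (RFC 3339, all in UTC "Z" form) and "html".
    """
    feed = ET.Element(q("feed"))
    ET.SubElement(feed, q("title")).text = title
    ET.SubElement(feed, q("id")).text = feed_id
    # Same-format UTC timestamps compare correctly as plain strings.
    ET.SubElement(feed, q("updated")).text = max(p["updated"] for p in pages)
    author = ET.SubElement(feed, q("author"))
    ET.SubElement(author, q("name")).text = "Site owner"

    for page in pages:
        entry = ET.SubElement(feed, q("entry"))
        ET.SubElement(entry, q("title")).text = page["title"]
        ET.SubElement(entry, q("id")).text = page["id"]
        ET.SubElement(entry, q("updated")).text = page["updated"]
        # type="html" means the markup is stored escaped; the serializer
        # takes care of the escaping.
        content = ET.SubElement(entry, q("content"), type="html")
        content.text = page["html"]

    body = ET.tostring(feed, encoding="unicode")
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        f'<?xml-stylesheet type="text/xsl" href="{stylesheet_href}"?>\n'
        + body
    )
```

A browser that honours the processing instruction would fetch the stylesheet and render the template around each entry; anything else just sees a plain feed.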
Well, for me, maintenance might be a bit harder. I’ve only hand-built one Atom feed, for a work project where I was retrofitting it to an existing application, and creating a warning-free, valid feed took a bit of effort. But it was my first time, and maybe it’ll get easier. I would definitely have to maintain a lot more metadata per page, but that would not be wasted effort. And obviously, a lot of that work can be saved by building applications (web-based or otherwise) that automate much of the entry and maintenance of that metadata. I don’t think this is a big deal.
But one cool thing, as a developer, about using Atom to hold my data is that each feed entry doesn’t have to contain XHTML. I could have a feed containing a collection of XForms for building some kind of complicated web application (and you’d better believe that someday we’ll start to see browsers savvy enough to handle XForms). If I were running a site focused on maths or sciences, then feed entries could contain MathML content too. Everything is nicely packaged up with its relevant material in its own feed.
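To make the non-XHTML case concrete, here is a minimal sketch of an entry whose content is inline MathML. Atom lets the `content` element carry child XML when its `type` is an XML media type; using `application/mathml+xml` here is my assumption about the right label, and the helper name `mathml_entry` is made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def mathml_entry(title, entry_id, updated, mathml_fragment):
    """Build an Atom entry element that wraps a MathML fragment (hypothetical helper)."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    # An XML media type signals that the content holds real child elements,
    # not escaped text.
    content = ET.SubElement(
        entry, f"{{{ATOM_NS}}}content", type="application/mathml+xml"
    )
    content.append(ET.fromstring(mathml_fragment))
    return entry


# Usage: wrap x^2 in an entry and serialize it.
fragment = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msup><mi>x</mi><mn>2</mn></msup>"
    "</math>"
)
entry = mathml_entry(
    "Squaring", "urn:example:squaring", "2009-01-01T00:00:00Z", fragment
)
xml_text = ET.tostring(entry, encoding="unicode")
```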
This is powerful stuff now. We have a number of web browsers today that can translate those Atom feeds into content that looks like a regular web page, and the user experience would not be appreciably different. Sure, support is spotty for some of the browsers today, but do you really think it’s going to remain that way forever?
So suppose I have my site set up as Atom feeds, and a savvy potential client (let’s call him Jeff, heh) likes what he sees. Maybe he’s in a rush - he saw something that looks good, so he’ll pull down the feed to a PDA (and with 11.5K of content wrapped with Atom metadata, that’s certainly doable), and now he can look over everything I have to offer in one document while he’s offline.
He reads it, and it’s obvious that I’m not going to tinker too much with my content, but it could happen. So Jeff decides, when he’s back on the grid, to subscribe to my site - it’s the same URL after all. Since his aggregator is configured to ignore (or doesn’t support) the <?xml-stylesheet?> processing instruction, he just gets the raw content.
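Here is a small sketch of the aggregator’s side of that exchange. Python’s standard `xml.etree.ElementTree` parser drops processing instructions by default, so the stylesheet reference simply never shows up in the parsed tree. The function name `raw_entries` and the sample feed are invented for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def raw_entries(feed_xml):
    """Return the title and content of every entry, ignoring any stylesheet."""
    # The default parser discards processing instructions, so the
    # <?xml-stylesheet?> line has no effect on what we read here.
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": entry.findtext(f"{ATOM}title"),
            "content": entry.findtext(f"{ATOM}content"),
        }
        for entry in root.iter(f"{ATOM}entry")
    ]


sample = '''<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="site-template.xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <entry>
    <title>Home</title>
    <id>urn:example:home</id>
    <updated>2009-01-01T00:00:00Z</updated>
    <content type="html">&lt;p&gt;Hi&lt;/p&gt;</content>
  </entry>
</feed>'''

items = raw_entries(sample)
```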
If I do update my site, whether it’s fixing a typo, or adding a new service, he’ll know about it without having to go check it again, because his aggregator will be keeping an eye on it. I only had to make the change in one place to accomplish this (not once in HTML and once in RSS), and it’s all good. Even more compelling: What if I were a software shop, selling new and updated products? Simply by making changes to my feed, Jeff would eventually be alerted by his aggregator that I’ve got something new to announce.
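One way an aggregator could notice such a change is to remember each entry’s `id` and `updated` value and compare them on the next fetch. The sketch below assumes exactly that; `changed_entries` and the `seen` dictionary are hypothetical names. Note that the Atom spec defines `updated` as the last change the publisher considers significant, so a publisher can leave it alone for a purely cosmetic fix.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def changed_entries(feed_xml, seen):
    """Return ids of entries that are new or whose `updated` value changed.

    `seen` maps entry id -> last known `updated` string and is
    modified in place so the next call compares against this fetch.
    """
    changed = []
    for entry in ET.fromstring(feed_xml).iter(f"{ATOM}entry"):
        entry_id = entry.findtext(f"{ATOM}id")
        updated = entry.findtext(f"{ATOM}updated")
        if seen.get(entry_id) != updated:
            changed.append(entry_id)
            seen[entry_id] = updated
    return changed
```

Calling it twice on the same document returns every id the first time and an empty list the second time; only an entry whose `updated` stamp moves will be reported again.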
Or how about this: what if I stored all my images and other binary media on sites like Flickr or YouTube, and started mashing up feeds linking to those resources with my on-host Atom feeds? It’s all the same format, so there’s not a whole lot of work required to mash them together. And as far as the end user is concerned, their experience isn’t compromised, but enriched. One feed will hold all of the related and relevant content.
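Because every source speaks the same format, the mash-up step can be very small. The following is a minimal sketch under a few assumptions: the feeds are already fetched and passed in as strings, all timestamps use the same UTC "Z" form (so they sort correctly as text), and `merge_feeds` is a name made up for the example. A fuller version would probably also record where each entry came from, which is what Atom’s `source` element is for.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # keep Atom as the default namespace on output


def q(tag):
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def merge_feeds(feed_docs, title, feed_id):
    """Combine the entries of several Atom documents into one feed element."""
    entries = []
    for doc in feed_docs:
        entries.extend(ET.fromstring(doc).iter(q("entry")))

    # Newest first; plain string comparison is only safe because we
    # assumed every timestamp is in the same UTC format.
    entries.sort(key=lambda e: e.findtext(q("updated"), default=""), reverse=True)

    merged = ET.Element(q("feed"))
    ET.SubElement(merged, q("title")).text = title
    ET.SubElement(merged, q("id")).text = feed_id
    if entries:
        # The combined feed is as fresh as its newest entry.
        newest = entries[0].findtext(q("updated"), default="")
        ET.SubElement(merged, q("updated")).text = newest
    merged.extend(entries)
    return merged
```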
If there’s real uptake on this idea, we could see a new class of web browsers/aggregators that could work for surfers in ways that today’s browsers can’t. Let’s go back to my hypothetical math site. I could produce an Atom feed containing an entry dedicated to explaining some algorithm. In another entry, I have a representation of the algorithm encoded in MathML. I could imagine someone pulling up the MathML fragment right in their browser and running their own analysis.
Even regular surfing could be enhanced with these new browsers. With the additional metadata required by Atom, a user could ‘Get Info’ a page to find out how current a page is (or discover any other interesting metadata).
Ok, so let’s get back to Jeff’s question: “What is the point of RSS?” Building websites inside Atom feeds means that you’ll get the benefits of subscribing to any web content - Atom provides all the necessary metadata to make syndication a useful activity. On the other hand, why would you want to subscribe to an HTML page? It’s technically possible to do, but we have no in-file infrastructure to alert an HTML aggregator that a change was cosmetic, for example. That’s why we have feeds in the first place.
But an Atom feed is just a text file after all, just like an XHTML document, and so it should be possible to browse to it and interact with that content in a browser-oriented way. No semantics are lost by interacting with a feed that way. Adding the stylesheet helps the producer faithfully recreate the traditional browsing experience.
In a real sense then, what’s the point of XHTML as a standalone document?
Copyright © 2009 Robert Hahn. All Rights Reserved unless otherwise indicated.