RSS Delivers the XML Promise
By Peter Wiggin
Key RSS Points
RSS stands for Rich Site Summary. It was originally developed by Netscape to help implement the My Netscape Network.
XML stands for the eXtensible Markup Language and was created to improve data sharing among applications on the Internet.
my.userland.com began as a customizable page displaying RSS channels and has evolved into a hub that aggregates new stories from over 440 channels.
If Only I Had Known
Ah, the promise of XML: a markup language that enables the painless sharing of data between previously incompatible applications. The concept sounds wonderful, but where is it being used in practice?
Despite a tidal wave of marketing hype, we don't see many XML-based applications surfacing. A notable exception, however, is RSS, a simple yet powerful web content syndication format. If you're new to XML and would like to get your feet wet, RSS can be a great place to start.
What is RSS?
RSS (Rich Site Summary) is an application of XML (eXtensible Markup Language). In essence, RSS is a file format that uses XML. It can be created easily by hand or by any web content management system. The RSS core file defines a "channel." Surrounding this core file are a number of tools, services, and protocols that, while not strictly RSS, extend the power of this simple format to the point where it can compete with many high-priced commercial content-sharing and syndication systems.
The RSS format was originally developed by Netscape for use on My Netscape Network, a customizable start page for Netcenter. My Netscape Network provides a simple RSS framework for web sites to create channels that can then be added to the customizable start page.
Here is a sample RSS file (it's the channel for XML.com, a sister site of Web Review), and here is a common HTML rendering of that channel. If you have a browser that supports XML (like Microsoft IE5), you can view the channel as an XML file.
Netscape provides a page with complete details on the RSS specification and how to create files. All the requirements and options for creating RSS files are there, so I won't repeat much of that information here. I will, however, explain some of the key elements.
Creating an RSS file
Looking at our sample RSS file, the first three items are standard fare for most XML documents, though their content differs from one XML file to another. First is the standard XML declaration:
<?xml version="1.0"?>
It specifies the version of XML being used.
This is followed by a document type declaration:
<!DOCTYPE rss PUBLIC
"-//Netscape Communications//DTD RSS 0.91//EN"
"http://my.netscape.com/publish/formats/rss-0.91.dtd">
The document type declaration points to markup statements that provide a grammar for the rss class of documents. This grammar is known as a document type definition, or DTD. The RSS version 0.91 DTD lives at http://my.netscape.com/publish/formats/rss-0.91.dtd. If you look at it, you'll see that it is not very human-readable. That's because it's designed to be read by programs. For example, Netscape's validation engine will validate an RSS file against this DTD when you submit your My Netscape channel, and tell you what, if anything, is wrong with the syntax of your RSS file.
The third item on the page is the root element:
<rss version="0.91">
The root element, like the <html> tag in an HTML document, is the tag that contains all other elements in the document. There is always exactly one root element, and its element type always matches the name (in this case rss) in the document type declaration above.
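The rule that the root element matches the name in the document type declaration can be checked with a few lines of code. The following is a minimal sketch using Python's standard xml.dom.minidom module; the SAMPLE document is a trimmed, hypothetical version of the channel discussed in this article, and root_matches_doctype is a helper name made up for this illustration. It does not validate against the DTD, it only compares the two names.

```python
from xml.dom import minidom

# Hypothetical, trimmed-down channel used only for this illustration.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC
 "-//Netscape Communications//DTD RSS 0.91//EN"
 "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
  <channel>
    <title>XML News and Features from XML.com</title>
    <link>http://xml.com/pub</link>
    <description>News for the XML community.</description>
    <language>en-us</language>
  </channel>
</rss>"""


def root_matches_doctype(xml_text):
    """Return True when the DOCTYPE name equals the root element's name."""
    doc = minidom.parseString(xml_text)
    # doc.doctype is None when the document has no DOCTYPE declaration.
    if doc.doctype is None:
        return False
    return doc.doctype.name == doc.documentElement.tagName


print(root_matches_doctype(SAMPLE))
```

A document whose root element were, say, <channel> under the same DOCTYPE would make the function return False.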
Within the <rss> root element of an RSS file, there is one <channel> element, which contains all the other elements. There are four main sections of an RSS file: they contain information about the channel; information about an optional channel image; up to 15 channel items; and an optional form input box. It is helpful to break the file into these sections even though they are not delimited by XML elements.
Because you are dealing with an XML file, the elements need to be properly nested. Also, they cannot contain any HTML markup unless it is written in the form of entities (such as &lt; for < and &amp; for &).
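If your channel text comes from a source that may contain HTML markup or stray ampersands, it has to be converted to entities before it goes into the RSS file. Here is a small sketch of one way to do that with Python's standard library; description_element is a hypothetical helper, and the sample text is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def description_element(text):
    """Return <description> markup with &, < and > written as entities."""
    return "<description>" + escape(text) + "</description>"


raw = "Read <b>Q&A</b> with the authors"
markup = description_element(raw)
print(markup)

# Parsing the escaped markup gives back the original text unchanged.
parsed = ET.fromstring(markup)
print(parsed.text == raw)
```

The escaped form is what belongs in the file; an XML parser turns the entities back into the original characters when the channel is read.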
The channel information is pretty straightforward. This is where you define the title, description, and URL for the channel, along with contact details:
<title>XML News and Features from XML.com</title>
<description>XML.com features a rich mix of
information and services for the XML community.
</description>
<language>en-us</language>
<link>http://xml.com/pub</link>
<copyright>Copyright 1999, O'Reilly and Associates
and Seybold Publications</copyright>
<managingEditor>dale@xml.com (Dale Dougherty)
</managingEditor>
<webMaster>peter@xml.com (Peter Wiggin)</webMaster>
The image section is where you can define a small image to go with the channel:
<image>
<title>XML News and Features from XML.com</title>
<url>http://xml.com/universal/images/xml_tiny.gif</url>
<link>http://xml.com/pub</link>
<width>88</width>
<height>31</height>
</image>
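If you generate your channel from a content management system rather than by hand, the channel information and image sections can be assembled in code. The sketch below uses Python's xml.etree.ElementTree and the values from the XML.com example; build_feed and add_text_child are names invented here, and the description text is shortened. Note that ET.tostring does not write the XML declaration or the document type declaration, so those would still need to be added to the top of the finished file.

```python
import xml.etree.ElementTree as ET


def add_text_child(parent, tag, text):
    """Append <tag>text</tag> to parent and return the new element."""
    child = ET.SubElement(parent, tag)
    child.text = text
    return child


def build_feed(info, image):
    """Build <rss><channel>...</channel></rss> from (tag, text) pairs."""
    rss = ET.Element("rss", {"version": "0.91"})
    channel = ET.SubElement(rss, "channel")
    for tag, text in info:
        add_text_child(channel, tag, text)
    image_element = ET.SubElement(channel, "image")
    for tag, text in image:
        add_text_child(image_element, tag, text)
    return rss


CHANNEL_INFO = [
    ("title", "XML News and Features from XML.com"),
    ("description", "Information and services for the XML community."),
    ("language", "en-us"),
    ("link", "http://xml.com/pub"),
]
IMAGE = [
    ("title", "XML News and Features from XML.com"),
    ("url", "http://xml.com/universal/images/xml_tiny.gif"),
    ("link", "http://xml.com/pub"),
    ("width", "88"),
    ("height", "31"),
]

feed = build_feed(CHANNEL_INFO, IMAGE)
print(ET.tostring(feed, encoding="unicode"))
```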
We don't use the form input box in XML.com's RSS file, so you won't see it in the example. But you can use the element to solicit user feedback, provide a search interface, or perform any HTML form action that uses a single text field and a single submit button.
Channel items are RSS's strength
The real guts of an RSS file is the list of channel items. Items can be anything you want. On XML.com, items are the titles and descriptions of current articles and news items we have published on the site. Other channels use RSS to publish announcements of new software releases, or new user-submitted reviews of anything from software to movies to obituaries.
An item has three elements (a title, a link, and a description):
<item>
<title>Customizing the DocBook DTD</title>
<link>
http://xml.com/pub/1999/10/docbook/index.html?wwwrrr_rss
</link>
<description>In this three-part excerpt from his new book,
Norm Walsh describes how to modify the DocBook DTD and
customize it for your own applications.</description>
</item>
The only requirements for these elements are that the contents of the <title> element must be less than 100 characters long, and the contents of the <link> and <description> elements must be less than 500 characters long. The <description> element is optional. A maximum of 15 items is allowed per channel.
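These limits are easy to check before you publish a channel. The following sketch is not Netscape's validator; it is a hypothetical check_items function that applies only the rules stated above, and it assumes that <title> and <link> are required because only <description> is described as optional.

```python
import xml.etree.ElementTree as ET

MAX_ITEMS = 15
TITLE_LIMIT = 100  # title text must be shorter than this
TEXT_LIMIT = 500   # link and description text must be shorter than this

# (tag, length limit, required?) for each child of <item>.
# Treating title and link as required is an assumption of this sketch.
RULES = (
    ("title", TITLE_LIMIT, True),
    ("link", TEXT_LIMIT, True),
    ("description", TEXT_LIMIT, False),
)


def check_items(channel):
    """Return a list of problem messages for the items in a <channel>."""
    problems = []
    items = channel.findall("item")
    if len(items) > MAX_ITEMS:
        problems.append(f"{len(items)} items found; at most {MAX_ITEMS} allowed")
    for number, item in enumerate(items, start=1):
        for tag, limit, required in RULES:
            element = item.find(tag)
            if element is None:
                if required:
                    problems.append(f"item {number}: missing <{tag}>")
                continue
            length = len((element.text or "").strip())
            if length >= limit:
                problems.append(
                    f"item {number}: <{tag}> is {length} characters; "
                    f"must be less than {limit}"
                )
    return problems


channel = ET.fromstring(
    "<channel><item><title>Customizing the DocBook DTD</title>"
    "<link>http://xml.com/pub/1999/10/docbook/index.html</link></item></channel>"
)
print(check_items(channel))
```

An empty list means the items pass these particular checks; it says nothing about the rest of the file.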
One common rendering of an item looks like this:
- Customizing the DocBook DTD
In this three-part excerpt from his new book, Norm Walsh describes how to modify the DocBook DTD and customize it for your own applications.
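A rendering like the one above takes very little code to produce. Here is a rough sketch of how a site might turn an item into an HTML list entry; render_item is a hypothetical function, and the exact HTML layout (a linked title followed by the description) is just one possible choice, not the markup any particular site uses.

```python
import xml.etree.ElementTree as ET
from html import escape

ITEM_XML = """<item>
<title>Customizing the DocBook DTD</title>
<link>
http://xml.com/pub/1999/10/docbook/index.html?wwwrrr_rss
</link>
<description>In this three-part excerpt from his new book,
Norm Walsh describes how to modify the DocBook DTD and
customize it for your own applications.</description>
</item>"""


def render_item(item):
    """Render one <item> element as an HTML <li> with a linked title."""
    title = (item.findtext("title") or "").strip()
    link = (item.findtext("link") or "").strip()
    # Collapse the line breaks inside the description into single spaces.
    description = " ".join((item.findtext("description") or "").split())
    html = f'<li><a href="{escape(link)}">{escape(title)}</a>'
    if description:
        html += f"<br>{escape(description)}"
    return html + "</li>"


print(render_item(ET.fromstring(ITEM_XML)))
```

Because the description is optional, the function simply leaves it out when the element is missing or empty.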
One reason why RSS has become so popular is that it is a very simple way to define content. While it has its limitations (developers of RSS tools are already looking at methods for adding categorization at the item level, among other things), its simplicity makes it work for just about any type of material.
You can update your RSS file as often as you'd like. The RSS channel for XML.com is automatically updated whenever a new item is added. When you register your channel with My Netscape, you can specify how often their server should check for updates. You can also list the days of the week and the hours during which the channel should not be checked for updates, using the optional <skipDays> and <skipHours> elements in your RSS file, but My Netscape does not make use of this information.
Why Would You Use RSS?
Creating and registering an RSS channel can help drive traffic to your web site.