RSS reader programmers: fix your referrer fields!
RSS has a troubled enough history already with respect to "standards compliance," but the trend for RSS aggregators to leave bogus referrer information is becoming immensely irritating.
At least I'm not the only one annoyed by this. Jason Kottke also noticed:
"Since I began publishing RSS feeds for this site, my server logs are full of referers from various RSS readers. Six of my top ten referers are from RSS readers like NetNewsWire and the brand new Syndirella."
It's the same here at WriteTheWeb, and while I could just filter those referrers out, that's not the point. It's another bit of sticking plaster that must be applied. Here we are, a community that lambasts large vendors for their non-adherence to Web standards, proliferating exactly the same sin.
The referrer string ought to contain the URL of the page on which the user agent found the link to the resource on your site, not an advert for the user agent itself. That's what the User-Agent header is for.
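As a rough illustration of that split, here is a minimal Python sketch of a feed request that identifies the aggregator in User-Agent and sends no referrer at all. The function name, the "ExampleReader" product name, the example.org URLs, and the "(+URL)" comment format are all assumptions made for this example; none of them comes from an actual aggregator.

```python
import urllib.request


def build_feed_request(feed_url, agent_name, agent_version, agent_home):
    """Build a feed request that names the aggregator in User-Agent only."""
    # Hypothetical format: product/version followed by the homepage in a comment.
    user_agent = f"{agent_name}/{agent_version} (+{agent_home})"
    # No Referer header is added: the feed URL came from a subscription
    # list, not from a link on some other page.
    return urllib.request.Request(feed_url, headers={"User-Agent": user_agent})


req = build_feed_request(
    "http://example.org/index.rss",   # placeholder feed URL
    "ExampleReader",                  # placeholder aggregator name
    "1.0",
    "http://example.org/reader/",     # placeholder aggregator homepage
)
print(req.get_header("User-agent"))   # urllib stores header names capitalized this way
print(req.has_header("Referer"))
```

The request is only constructed here, not sent, so the sketch runs without network access.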
Trawling a little through my logs, I'm willing to wager this is a classic "cut and paste" bit of programming: someone copies broken behavior from another program, and before you know it, it's everywhere. This is not a de facto standard, it's de facto broken!
Which standard?
Posted by: qmacro at 2003-02-03
Hi Edd (nice to see WtW back again)
I may be missing something (my brane is only half on), but I'm not sure what you're expecting to appear in the referer entry when an aggregator grabs something from your site. Perhaps you're talking about a different context, but the two I can think of don't leave much room for inspiration:
- aggregator is grabbing the RSS feed: there is no referer, it's just a URL that the aggregator has been told to fetch. Putting fake referer data in the request is just as much a sticking plaster as that to which you referred, perhaps worse, as it's wrong information
- aggregator user clicks on href in HTML display inside aggregator: what URL do you suggest as a referer when the HTML is locally rendered and is (for want of a better type) like a
And the other one I can think of is fine:
- aggregator is web-based (e.g. http://www.pipetree.com/cgi-bin/blosxom/djnews): there's no problem here as the web browser will send the appropriate referer URL information
I agree that setting a fake referer URL is not a good thing, but purely because it's not what the real referer is (it's impossible to determine, in other words, which I guess is why there's a second best URL pointing to the aggregator homepage).
Which web standard are you referring to here?
Also, from what I can see, aggregators hitting my site (including NetNewsWire et al.) correctly announce themselves in the user-agent field.
Hope that made at least some sense..
ps. suggestion: any chance that the original post is displayed above the comment form to provide context (i.e. so I can remember what I'm replying to)
Replies to this comment
Aggregators are getting fixed
Posted by: akalsey at 2003-02-04
A few days ago, I wrote up some proposed fixes for Referrer misuse. The main point was to take the URL that is currently being sent as a referrer and place it in the User-Agent field instead.
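To make the idea concrete, here is a minimal Python sketch of that move, applied to a plain dictionary of request headers. It is only an illustration under stated assumptions: the function name, the "(+URL)" comment format, and the example values are hypothetical and are not taken from the write-up or from any aggregator's patch.

```python
def move_referrer_to_user_agent(headers):
    """Return a copy of headers with an advertising Referer folded into User-Agent."""
    fixed = dict(headers)                      # leave the caller's dict untouched
    advert_url = fixed.pop("Referer", None)    # drop the bogus referrer, if any
    if advert_url:
        agent = fixed.get("User-Agent", "").strip()
        # Hypothetical format: append the URL as a parenthesized comment.
        fixed["User-Agent"] = f"{agent} (+{advert_url})".strip()
    return fixed


before = {
    "User-Agent": "ExampleReader/1.0",         # placeholder aggregator name
    "Referer": "http://example.org/reader/",   # placeholder advert URL
}
after = move_referrer_to_user_agent(before)
print(after)
```

Headers without a Referer pass through unchanged, so the same function could sit in front of every outgoing request.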
One person suggested that if your RSS subscription list is available at a URI, then that URI would be a valid referrer. That complies with the letter of the spec, but I'm not sure it fits the spirit.
Some aggregator developers are realizing the error of their ways and fixing the problem. Les Orchard has a patch for AmphetaDesk, Mark Paschal has a fix for Radio, and I have a fix for Aggie. I just got word that the Aggie fix is going to be included in the next release.
So there's hope yet.