Last modified on 10/9/97; 11:54:19 AM Central

Steve's Scribbles...


Keep It Around

Musings on the archiving of content

Thursday, October 9, 1997

Got RealAudio? Got a 28.8 connection or better? Got a couple of hours? A couple of broadcasts from last week make for great listening. (If you need the player, get it from real.com; I recommend the 'previous release' version, since they're currently in another beta-testing cycle.) Tracking down these links got me thinking...

How long will they be around?

I've often been struck by what an excellent archival medium the web is (or, rather, can be). There's no reason I can think of (aside from the scarcity of hard drive space) that these broadcasts shouldn't be available from here to eternity; I get frustrated at how many things disappear from servers for no apparent reason. Hopefully these will stick around.

I had to hunt quite a while to find the link to Jobs' keynote; the link was no longer where it had been just last week. Will the link eventually disappear altogether? Will the file still be in the same place (and accessible if you just have the right address) but be hidden to the general population? If you're reading this in 1998 or later, I'm interested to know if either of the links still works.

So What, Steve...

An example that would perhaps be more relevant to non-geeks is the campaign and election process we go through every four years. 1996 was the first year when a great deal of the goings-on at the Republican and Democratic conventions was digitized (audio-only or even video) and posted on the Net. CNN, ABC and some other sites had whole online galleries of QuickTime videos of the speeches.

In 2000 (or, frankly, in 2096) it would be of enormous historical (or even just amusement) value to be able to go back and hear and see the actual proceedings, the actual speeches, the advertisements, the analyses, and so on. I think we will eventually reach the point where all of that is kept available online, but I worry about the '96 content. Did anyone save it? Are they planning on keeping it posted? How would I get to Al Gore's convention speech, today, if I wanted to? (Note: I haven't tried; I'm just thinking out loud.) Where could I quickly find the delegate vote counts for each state? And so on.

It may be that this stuff is all available; I hope so, but you get the idea.

It seems to me that many sites remove content very quickly once they think it's no longer relevant. Everything's focused on up-to-the-minute news; very little attention goes to yesterday's news (and certainly not last year's). Some sites, like nando.net, seem to break or reuse their links more than once a day; if someone gives you a pointer to one of their stories, chances are it won't be there when you look for it.

I think that's the wrong model of web publishing to follow. News should stick around, preferably in a permanent or at least predictable location.
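
For the geeks in the audience, here is a minimal sketch of what a "predictable location" could look like, assuming a site files each story under its section, its publication date, and a cleaned-up version of its headline. The function name `permanent_path` and the URL layout are my own illustration, not how nando.net, CNN, or this site actually store anything.

```python
import re
from datetime import date


def permanent_path(section: str, published: date, headline: str) -> str:
    """Build a URL path from facts about a story that never change.

    Hypothetical layout: /<section>/<YYYY>/<MM>/<DD>/<slug>.html
    """
    # Lowercase the headline and collapse every run of characters that
    # is not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", headline.lower()).strip("-")
    return f"/{section}/{published:%Y/%m/%d}/{slug}.html"


# The same inputs always give the same address, today or in 2096.
print(permanent_path("scribbles", date(1997, 10, 9), "Keep It Around"))
# prints /scribbles/1997/10/09/keep-it-around.html
```

Because the path depends only on the section, the date, and the headline, nothing about it needs to change when the story stops being today's news, so a pointer handed out on publication day can keep working.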

What about non-news?

I have to admit that not every kind of content may be worth keeping around; in ten years I don't think anyone will be interested in what we said in our 1996 MBA promotional brochure, or what was on our announcements page. But hey, I could be wrong. Maybe an alumnus will want to see pictures of the people who worked there when he or she came to the school; maybe there's benchmarking to be done against tuition costs from way back when.

In an ideal world (from a geek/engineer point of view) would every piece of content be permanent?

When is it okay (that is, compatible with the optimal archival scenario) to delete content?

An alternative approach (I think BusinessWeek does this) is to hide old content behind a pay-for-access scheme: "if you want to search our archives, you have to pay us money." Obviously this only works if the content is important enough to someone that they'll pay.

Is there a right answer? I haven't thought about this enough to say one way or the other, but it's something I'm going to continue to think about. Feedback and musings are welcome.


As always, for more news, pointers & commentary, see Steve Bogart's home page.

Handy Official Disclaimer: Steve's Scribbles are my own personal work and not meant to be taken as official Olin School of Business pronouncements. They are, however, © Steve Bogart (original publication date is at the top of each Scribble).
Built with BBEdit and Frontier
bogart@wuolin.wustl.etc..