Antoine Amarilli's blog

Messy is better than nothing

The amount of information available on the web today is so large that it is tempting to think of it as a kind of Library of Babel, where the question is not whether the information exists but whether it can be found reasonably easily.

To some extent, this is a good way to think about data in general. The web (and, more generally, the sheer amount of data we have to manage today) has forced us to realize that archiving things is not enough if the archives aren't easily accessible and searchable. It's all well and good to keep things, but there is a huge difference between an ordered collection with a nice interface that you will actually use and a messy dead drop of data that you will never take the time to consult.

However, if there is indeed a difference between ordered data and messy data, there is also a huge difference between messy data and no data at all. In some cases, you just need one piece of data, and searching for it in a vast mess, however tedious, is the only option. Think of a book by some obscure author: even if the only copy is hidden deep in some mysterious library (or, for that matter, is a nameless PDF among thousands of others on some website), this is still very different from there being no copy at all. You never know what is going to happen in the future, and it remains possible that someday, somehow, somebody will stumble upon it.

(Here is a stupid way to think about it: losing the Ring at the bottom of a river isn't the same as destroying it for good. Of course, the Ring has a will of its own and can lure people towards itself, whereas data cannot really do this kind of thing, but you get the idea.)

This is why, in my opinion, it is a good idea to archive in bulk everything that could conceivably be useful someday, even if it probably won't be. Sorting it out isn't worthwhile, but having it around in case you desperately need it can turn out to be a good thing.

comments welcome at a3nm<REMOVETHIS>@a3nm.net