Running Without Backups

As I mentioned, the server that hosts eighty-twenty.net suffered a hard disk failure 10 days ago. Unfortunately, my hosting agreement doesn't include a backup service, and I hadn't thought to make backups myself. The code that runs the weblog was backed up; the posts themselves were not. I've come to regard this as something like a fire: it's best to just let it go.

I subscribe to my own feed in RSS Bandit and NewsGator, so I was thinking that I could recover the posts from there. The problem is that I've deleted many of the posts in NewsGator, and in any case, NG doesn't handle wfw:commentRss, so I don't have comments. On the other hand, I don't use RSS Bandit often, so I'm missing a number of posts, and while RSS Bandit does handle wfw:commentRss, it doesn't cache comments locally.

It turns out that Bloglines does keep a full archive, and I found someone on Bloglines who's been subscribed since November 2003. So I've experimented with scraping the HTML, thinking it was a simple matter of applying Python's sgmllib, but even after running Tidy on the output, sgmllib still fails on it:

Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Python23\lib\sgmllib.py", line 94, in feed
    self.rawdata = self.rawdata + data
TypeError: cannot concatenate 'str' and 'list' objects
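
One thing the traceback does show: the failing line is self.rawdata + data inside feed, and the complaint is about joining a str with a list. That suggests feed was handed a list (the shape readlines() returns, for instance) rather than a single string, so the markup itself may not be the problem. Below is a minimal sketch of the fix under that assumption. It uses Python 3's html.parser as a stand-in, since sgmllib was removed in Python 3. The names TextCollector, feed_lines, and sample_lines are hypothetical, and the sample markup is made up, not the real Bloglines page.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical parser that collects the non-blank text nodes it sees."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def feed_lines(parser, lines):
    """Join a list of lines into one string before feeding the parser.

    feed() expects a string; passing the list itself raises TypeError.
    """
    parser.feed("".join(lines))
    parser.close()
    return parser


# Made-up stand-in for the scraped page, held as a list of lines
# (an assumption about how the original data was read).
sample_lines = [
    "<html><body>\n",
    "<h3>Post title</h3>\n",
    "<p>Post body</p>\n",
    "</body></html>\n",
]

collector = feed_lines(TextCollector(), sample_lines)
print(collector.chunks)
```

If the parser still fails once it is given a single string, then the markup really is the culprit and a more forgiving parser would be the next thing to try.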

Either way, any comments left during 2004 are gone, which is really disappointing, as some of the comments were better than the posts they were commenting on. I will work on a backup method so this doesn't happen again.

Posted by Gordon Weakliem