Converting a large Blogger blog to Movable Type - ARGH
I just converted a large Blogger blog to MT. I’ve been keeping this Estonian one for quite a few years now and it had 400+ articles. Converting was a royal pain.
At first it seemed easy. There is a forum post that suggests a Blogger template to dump all the entries into a temporary index; import that into MT, and you’re done. Fair enough, and a test index dump with just a few entries seemed to work fine after changing some linebreaks in the template, so I thought I was almost done. Right? Wrong. That is when all the trouble started.
First, one configuration change I had to make to get the test import working properly: set both the post and the comment date format to the one MT requires, which is MM/DD/YYYY HH:MM:SS AM/PM. At least Blogger had this format available for both, so that was fine. Hmm… I think that was the only format change needed for posts and comments.
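To spot entries or comments whose date did not come out in that format, a small check like the one below could be run over the DATE values of the dump. This is only a sketch: the name `is_mt_date` and the sample values are made up for illustration, and it checks the shape of the string only, not whether the date really exists.

```perl
use strict;
use warnings;

# Returns 1 if $date looks like MM/DD/YYYY HH:MM:SS AM/PM (12-hour clock),
# 0 otherwise. Only the shape is checked, so 02/31/2006 would still pass.
sub is_mt_date {
    my ($date) = @_;
    return $date =~ m{
        \A (?:0[1-9]|1[0-2]) / (?:0[1-9]|[12]\d|3[01]) / \d{4}   # date part
        [ ] (?:0[1-9]|1[0-2]) : [0-5]\d : [0-5]\d                # time part
        [ ] (?:AM|PM) \z
    }x ? 1 : 0;
}

# Made-up sample values, only for illustration.
for my $sample ('02/09/2006 03:04:05 PM', '2006-02-09 15:04:05') {
    printf "%-24s %s\n", $sample, is_mt_date($sample) ? 'ok' : 'needs fixing';
}
```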
Then I increased the number of posts published in the index to the number of posts in the blog (several hundred) and hoped it would publish. But it didn’t. For hours. So it wasted a good part of last night and this morning too.
The next thought I had was, OK pal, if you don’t work like this, then maybe I can retrieve all the posts and comments with the Atom API? I even found a PHP implementation and got a basic dumper going pretty quickly. This might have worked, but two problems surfaced that made it another wasted effort:
- Atom returns only a subset of the blog’s posts. There’s apparently no way to get the full archive, except to know the long numeric IDs of all posts (which I had no way of getting) and fetch them one by one. Had there been a way to enumerate all the posts, I would have done so, but there wasn’t, plus…
- Atom doesn’t give you comments for a post, so I would have lost some data, which I wasn’t up for.
So ditch Atom and look for alternatives again. Turns out there is still a working solution. Which is:
- export a full blog archive with empty indexes and all the individual posts in MT import format
- have a small script (Perl, of course!) to merge all the individual posts into one Movable Type import file “candidate”
- manually edit the import file before importing:
  - Blogger inserted some empty DIVs before and after each post body; these are easy to remove with a global search/replace
  - convert the encoding: the old blog was running on Windows-1252, but the new one uses UTF-8. Just save the file in a different encoding
- import stuff into MT
- manually go over the articles (not all of them) and fix obvious errors, such as high-bit characters that were still lost, and migrate the pictures to the new vhost. Since Blogger kindly inserts the full path for pictures, this can again be done with a global search/replace. Some of this remains to be done still, but at least the new blog is operational.
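I did the clean-up steps by hand in an editor, but they could also be scripted. The sketch below is one possible way, under loud assumptions: the empty DIVs are assumed to look like `<div ...></div>` with nothing but whitespace inside, the sub name `clean_import` is made up, and both host names are placeholders rather than the real ones.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Takes the raw Windows-1252 bytes of the merged import file and returns
# cleaned-up UTF-8 bytes.
# ASSUMPTIONS: the exact shape of the empty DIVs, and the two host names
# passed in, are placeholders for illustration only.
sub clean_import {
    my ($bytes, $old_host, $new_host) = @_;

    # Interpret the input bytes as Windows-1252 text.
    my $text = decode('cp1252', $bytes);

    # Drop DIV elements that contain nothing but whitespace.
    $text =~ s{<div\b[^>]*>\s*</div>}{}gi;

    # Point absolute picture URLs at the new vhost.
    $text =~ s{\Q$old_host\E}{$new_host}g;

    return encode('UTF-8', $text);
}

# Made-up sample input, only for illustration.
my $sample = qq{BODY:\n<div style="clear:both;"></div>caf\xE9 }
           . qq{<img src="http://old.example.com/pics/a.jpg"><div></div>\n};
print clean_import($sample, 'http://old.example.com/', 'http://new.example.com/');
```

Anything the regular expression does not catch would still need the manual pass described in the last step.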
So here’s the Blogger template I ended up using which creates empty indexes and dumps individual posts into MT import format.
<Blogger><ItemPage>TITLE: <BlogItemTitle><$BlogItemTitle$></BlogItemTitle>
AUTHOR: <$BlogItemAuthorNickname$>
DATE: <$BlogItemDateTime$>
STATUS: publish
-----
BODY:
<$BlogItemBody$>
-----
<BlogItemCommentsEnabled><BlogItemComments>COMMENT:
AUTHOR: <$BlogCommentAuthor$>
DATE: <$BlogCommentDateTime$>
<$BlogCommentBody$>
-----
</BlogItemComments></BlogItemCommentsEnabled>
--------
</ItemPage></Blogger>
And here’s the Perl script that merges all the individual entries into one file. It writes to standard output, so redirect the output into a file to work with it further. I found this guide useful for writing the file-finding part.
use strict;
use warnings;
use File::Find;

# Called by find() for every file; prints each exported .html post to STDOUT.
sub Wanted
{
    /\.html$/ or return;
    # find() has already changed into the file's directory, so $_ is enough.
    open(my $fh, '<', $_) or die("can't open $File::Find::name: $!");
    while (my $line = <$fh>) {
        print $line;
    }
    close($fh);
}

find(\&Wanted, "C:/tmp/migration/_migration/");
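Before importing, it may be worth checking that the merged file contains roughly as many entries as the blog has posts. A minimal sketch, relying on the template above ending every entry with a line of exactly eight dashes; the name `count_entries` and the sample text are made up, and a post body that happens to contain such a line on its own would be counted too.

```perl
use strict;
use warnings;

# Counts entries in MT import text. Every entry ends with a line of exactly
# eight dashes, while the shorter five-dash lines only separate sections
# inside an entry. A trailing \r is tolerated for Windows line endings.
sub count_entries {
    my ($text) = @_;
    my $count = () = $text =~ /^-{8}\r?$/mg;
    return $count;
}

# Tiny made-up sample with two entries, only for illustration.
my $sample = join "\n",
    'TITLE: First',  '-----', 'BODY:', 'one', '-----', '--------',
    'TITLE: Second', '-----', 'BODY:', 'two', '-----', '--------', '';
print count_entries($sample), " entries\n";
```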
Phew… I should be done. Bye bye Blogger. Was nice knowing you but I just outgrew you. And the parting experience could have been way less painful — why not let me dump the entries from your dashboard/backend in any machine-processable format that I could work further with? Would have saved me some hours of frustration.
Why switch from Blogger to MT in the first place?
- I like using Markdown in all of my recent business and private blogging/publishing initiatives (I’m too old and tired to write plain HTML), but Blogger doesn’t support that
- limited (meaning no) support for categories, tags and other fancy stuff… there are some “plugins”, but in MT it is much more standard stuff
- crazy rebuild and upload times compared to MT. I’ve heard some people complain about MT’s rebuild times. They obviously haven’t had to maintain a larger Blogger blog. It’s insane.
- no trackbacks. I’ve found I like trackbacking, both in- and outbound. Again, can be done with a plugin, but why bother if it’s native stuff in MT.
- limited import and export support (as demonstrated above!) With MT, if something messes up, I can at least get a full dump of all the content I have with no real trouble.
- MT’s “Upload file/image” mechanism is more streamlined, with automatic thumbnailing capabilities, and just fits my DNA and workflow expectations better.
- in general, I like MT backend more. More power features. Power editing, search, replace. Power everything.
- I like all my QuickPost bookmarklets to have the same style and my Blogger blog was the only exception with its own separate bookmarklet.
- (not necessarily a Blogger vs MT issue, but it’s good to do major switches at the same time) I have much better stats capabilities with my new/current webhost than I had at the old place.
2 Comments
Welcome! I’m one of the developers at SixApart and a plugin contributor to MT. I switched my blog over from WordPress to MT a few weeks ago. Around 1,400 posts in total, including categories. I didn’t get to keep any of my comments, but at least I didn’t lose any real content.
For the occasion I created a Perl module called WWW::Blog::Move that uses the built-in XML-RPC calls supported by Blogger, WordPress, MT, etc. to transfer everything over.
Anyway, glad it worked. It’s always interesting to see how people get migrations working.
Cheers
I wish I’d found this post earlier!
I just did a migration for a friend who had 100+ entries on her Blogger blog, and more comments than entries. We got most of it, but getting it into shape has been a real pain. And then trying to wring her photos back out of Blogger since she didn’t keep copies in any organized fashion… well, let’s just say I had to escalate to someone more knowledgeable!
My worst peeve now is that Blogger stores the commenter URL with the commenter name. I didn’t have the aforementioned more knowledgeable person fix it before I imported the posts, so now it’s stuck that way. I thought I would be able to use Search and Replace with a regex to strip out the URL info (mostly useless links to people’s Blogger profiles), but S&R doesn’t work in the commenter field. Ah well…
I have lost a lot of respect for them during this process, for exactly the reason you mentioned: “why not let me dump the entries from your dashboard/backend in any machine-processable format that I could work further with?” I feel like they hold people’s data hostage. Especially after she switched to FTP publishing temporarily while we were trying to create the export file, and now she can’t get back into her blog to post a final post telling people where to go.
Yuck.