Migrating from Jekyll to WordPress

Funnily enough, there aren’t all that many resources for people who’re moving from Jekyll to WordPress. I took some advice from a post by Fabrizio Regini, but had to modify it a bit, so here’s what I figured out…

My starting point was a Jekyll-based site stored on github. Comments were stored using Disqus.

As a first step, I installed WordPress on my hosting. This was, as they like to boast, very easy.

Next I had to get all my existing content into that WordPress install. I decided the easiest way to do this was to use the RSS import plugin that WordPress recommends. So I added an RSS export file to my Jekyll site and ran Jekyll to have it build a complete dump of all my posts which I could use.

Here I ran into a problem. I’d set up my new WordPress site on PHP 7… and the RSS importer wasn’t able to run because it was calling a removed function. It was just a magic-quotes-disabling function, so I tried editing the plugin to remove it. However, after doing this I found that running the importer on my completely-valid (I checked) RSS file resulted in every single post having the title and contents of the final post in the file. So, plugin debugging time!

While doing this I discovered that the RSS importer was written using regular expressions to parse the XML file. Although, yes, this is about as maximally compatible as possible, I decided that it was better not to go down the rabbit hole of debugging that, and just rewrote the entire feed-parsing side of it to use PHP’s built-in-since-PHP-5 SimpleXML parser. This fixed my title/contents problem.

My version of the plugin is available on github. I can’t say that I tested it on anything besides the specific RSS file that I generated, but it should be maintaining the behavior of the previous plugin.

With all my posts imported, I went through and did a little maintenance:

  • The import gave me post slugs which were all auto-generated from the title, while some of mine in Jekyll had been customized a bit, so I updated those to keep existing URLs working.
  • All images in posts needed to be updated. I went through and fixed these up by uploading them through WordPress.
  • Some markup in posts needed to be fixed. Mostly involving <code> tags.

Next came importing comments from Disqus. I tried just installing the Disqus plugin and letting it sync, but it seems that relies on you having WordPress post IDs associated with your comments… which I naturally didn’t. So I went out and found a Disqus comment importer plugin… which, much like the RSS importer, was broken. It expects a version of the Disqus export file which was current around 5 years ago, when it was last updated.

Thus we have my version of the Disqus comment importer plugin. It tries to work out the ID of your posts by looking at the URL. This works pretty well, but I did have to edit a few of the URLs in the export file to make sure they matched my current permalink structure. If you’ve never changed your permalinks, you should be good without that step.

Migration: complete.

WordPress Again

I haven’t been updating this site very often. Upon reflection, I decided that this is in part because the Jekyll workflow that I switched to was… inconvenient.

It would be possible to hack around this. I could have written some sort of simple web-app which generated a new post, committed it to git, pushed it to github, built the site, and sync’d it onto my hosting. That’d keep the ridiculous performance / security benefits of a static site, while still letting me make quick updates from wherever I happen to be. It’d even be fairly easy, at least to get something basic working.

But. I don’t really want to do that. The point of using a system like Jekyll or (before it) WordPress is to offload that particular bit of work onto someone else, who can pay attention to all of those details for me.

So, here I am on WordPress again. Hopefully, after a bit more than four years, I won’t find myself getting hacked again. 😛

Why WordPress again? Well…

It’s really popular. This does count for something. Automattic likes to point out that it around a quarter of the public web runs on it. This means there’s a lot of resources available.

To keep some of what I liked from Jekyll, I’m using Automattic’s Jetpack plugin. This gets me a lot of the fancy features from WordPress.com, including letting me keep using Markdown to write these posts. I’m also using the WP-Super-Cache plugin, because it seems that even now running uncached WordPress is just asking for trouble.

I’ll write another post soon about how to migrate from Jekyll to WordPress. There were a few bumps along the way.

Raking Jekyll

I’ve never really touched rake before, but since switching to Jekyll I’m finding that it’s becoming an essential part of my workflow. In the limited area of blogging, at least.

rake is a version of make in which you define all your targets in Ruby. Because practically anything would be an improvement over Makefile syntax, this is pretty easy to work with. I’m not a huge fan of shell scripting at the best of times, so mixing it in with something else is… not desirable. I still find Ruby less intuitive than Python, but that’s my prejudices talking.

To elaborate… what does posting a new entry look like for me?

  1. rake server to start up an automatically-rebuilding local webserver copy of my blog
  2. rake post[raking-jekyll] to make a new post with the YAML front matter boilerplate
  3. Actually edit the newly created post in an editor
  4. rake deploy to rsync the local copy to my hosting over ssh

Any part of my routine which looks like it might be scriptable has been replaced with a rake target. For example, the post target:

  1. Copies a template file
  2. Names it according to the current date and provided title
  3. Adds an expanded version of the current date into its YAML front matter so sorting will work correctly if I post multiple times a day

Since I rarely know the current date without having to look it up, that certainly saves me some effort.

Here’s my Rakefile, if you want to use anything from it. It’s probably not properly idiomatic Ruby, but it does at least work.

Jekyll

I’ve just redone my website using Jekyll. It is now completely static. No PHP, no database, nothing like that.

Why did I do this?

  • It’s quite soothing knowing that all my content is version controlled.
  • I am now nigh-immune to traffic spikes. I was using caching with WordPress before, so it had never been an issue even when I was on the HN frontpage, but there’s some peace of mind in it.
  • WordPress had a history of security bugs which wasn’t comforting. Since nothing on this new site is executable I feel pretty secure now.
  • My site is now ridiculously flexible. Jekyll forces almost no structure on you, leaving you free to change things around as you please.

I’m happy with the end result, but the process of getting there was not without pain.

The initial difficulty came from Jekyll’s documentation being somewhat lacking. I found myself somewhat confused about minor details like “how does a layout work?”. After I’d cribbed that together by examining other sites posted with Jekyll, I discovered that the template data docs were inaccurate / misleading, implying the presence of a post variable which failed to exist. It turned out to be something that’s merged into page if you’re viewing a post.

I don’t completely blame Jekyll for this being opaque. Jekyll uses Liquid for its templating language, which claims to be aimed at designers… and I feel it would benefit from some sort of debugging mode that dumps the current scope for examination.

I resorted to reading Jekyll’s source, which cleared up a number of things. However, I view it as a bad sign that I felt I had to do this. Not that a command-line driven static website generator is ever likely to be a mainstream product, but still, it’s the principle of the thing.

Pagination worked, but was completely lacking in configuration. Since part of my goal was to have my URLs remain the same as they were in WordPress, I had to change this. I did so with a horrible monkey-patching hack of a plugin. Specifically, I made a copy of the pagination module from Jekyll’s core into my _plugins directory and selectively edited it to change the pagination urls.

In the process I noticed a bug in the core code, and submitted a pull request to fix it. So horrible monkey patching might at least pay off this time.

Also utterly broken was the related posts feature. No matter what, it always seems to think the most recent posts are the most related to anything. It’s possible that running with --lsi would have helped with this, via complex semantic analysis, but that takes forever and I’ve seen others complain that it doesn’t really help. So there’s more monkey patching going on via Lawrence Woodman’s related posts plugin, which I took and edited so it worked based on tags instead of categories.

One thing I haven’t fixed, which I’d like to, is making the automatic regeneration of your site during development / writing a lot smarter. Right now it notices a file has changed and so it regenerates every single bit of content on your site. This does mean that the live generated site always has recent/related posts up to date everywhere… but it’d be nice to have some sort of --quick option that ignored that stuff in favor of a faster development cycle.

Because of the utter staticness, I naturally cannot have my own comment system in use any more. So I’ve switched to Disqus, which adds commenting to the site via JavaScript. It feels sort of weird to be outsourcing a component of my user experience like this… but they seem to be trustable. Widely used, and their monetization plan is fairly transparent.

If you’re interested you can see the repo for my website on github. It contains, in its default / post templates, markup that’s compatible with any WordPress theme that’s based on Toolbox, which might be of use to some.

Like I said, I’m happy with how it turned out. I wouldn’t recommend this at all for a non-technical person, but if you want to dig in and get your hands dirty then Jekyll is quite workable.