Why didn’t I respond to your pull request?

I have some fairly popular open source packages up on GitHub. Happily, I get people submitting pull requests, adding features or fixing bugs. It’s great when this happens, because people are doing work that I don’t want to do / haven’t gotten to yet / didn’t think of.

…but I’m pretty bad at responding to these. They tend to languish for a while before I get to them. There’s a decent number which I’ve never even replied to.

Why is this?

Fundamentally, it’s because reviewing a pull request is potentially a lot of work… and the amount of work isn’t necessarily obvious up-front. This means I only tend to do reviews for anything which isn’t obviously trivial when I’m feeling energetic and like I have a decent amount of free time.

First, there are some common problems which might turn up:

  1. It does something I don’t want to include in the project. This is the only outright deal-breaker. Project owner’s prerogative.

  2. It doesn’t work. This happens more often than you’d think, generally because the submitter has written code for the exact use-case they had, and hasn’t considered what will happen if someone tries to use it in a different way.

  3. It works, but not in the way I want it to. For instance, it might behave inconsistently with existing features, and I’d want it adjusted to match.

  4. It should be written differently. This tends to include feedback like “you should use this module” / “this code should really go over here” / “this duplicates code”.

  5. It has coding style violations. Things like indentation, variable names, or trailing whitespace. These aren’t functional problems, but I still don’t want to merge them, because I’d just have to make another commit to fix them myself.

Once I’ve read the patch and given this feedback, which might itself take a while since design feedback and proper testing that exercises all code paths aren’t necessarily quick, I’ll respond asking for changes. Then there’s an unknown wait period while the submitter finds time to respond to those changes. Best-case for me, they agree with everything I said, make all requested changes perfectly, and update their pull request with them! Alas, people don’t always think I’m a font of genius, so there’s an unknowable amount of back-and-forth needed to find a compromise position we both agree on. This generally involves enough time between responses that the specifics of the patch aren’t in my head any more, so I have to repeat the review process each time.

What can I do better?

One obvious fix: delegate more. Accept more people onto projects and give them commit access, so I don’t have to be the bottleneck. I’m bad at doing this, because my projects tend to start as “scratch my itch” tasks, and I worry about them drifting away from code I’m personally happy with. Plus, I feel that if the problem is “I don’t review patches promptly”, “make someone else do it instead” is perhaps disingenuous as a response. 😀

So, low-hanging fruit…

Coding style violations, despite being trivial, are probably the most common sources of a patch sitting unmerged as I wait for someone to respond to a request to fix them. This is kind of my fault, because I have a bad habit of not documenting the coding style I expect to be used in my projects, relying on people writing consistent code by osmosis. Demonstrably, this doesn’t work.

As such, I’m starting to add continuous integration solutions like Travis to my projects. Without any particular work on my part, this lets me automatically warn contributors about coding style concerns which can be linted for, via tools like flake8 or editorconfig. If their editing environment is set up for it, they’ll get feedback as they write their patch… and if not, they’ll be told on GitHub when a pull request fails the tests, and won’t have to wait for me to get back to them about it.

The “it doesn’t work” issue can be worked into this framework as well, with a greater commitment to writing tests on my part. If my project is already well-covered, I can have the CI build check test coverage, and thus require that contributors are providing tests that cover at least most of what they’re submitting, and don’t break existing functionality.

This should reduce me to having to personally respond to a smaller set of “how should this be written?” issues, which I think will help.

Sublime Text packages: working in 2 and 3

I maintain the Git package for Sublime Text. It’s popular, which is kind of fun and also occasionally stressful. I recently did a major refactor of it, and want to share a few tips.

I needed to refactor it because, back when the Sublime Text 3 beta came out, I had made a branch of the git package to work with ST3, and was thus essentially maintaining two versions of the package, one for each major Sublime version. This was problematic, because all new features needed to be implemented twice, and wound up hurting my motivation to work on things.

Why did I feel the need to branch the package? Well…

The Problem

Sublime Text is currently suffering from a version problem. There’s the official version, Sublime Text 2, and the easily available beta version, Sublime Text 3. They’re both in widespread use. This division has ground on for around three years now, and is a pain to deal with.

It’s annoying, as a plugin developer, because of a few crucial differences:

Sublime Text 2:

  • Uses Python 2.7.
  • Puts all package contents into a shared namespace.

Sublime Text 3:

  • Uses Python 3.3.
  • Puts all package contents into a module named for the package.
  • Has some new APIs, removes some old APIs.

…yes, the Sublime Text 2 / 3 situation is an annoyingly close parallel to the general Python 2 / 3 situation that is itself a subset of the Sublime problem. I prefer less irony in my life.

Python

What changed in Python 3 is a pretty well-covered topic, which I’m not going to go into here.

Suffice it to say that the changes are good, but introduce some incompatibilities which need code to be carefully written if it wants to run on both versions.

Imports

If your plugin is of any size at all, you probably have multiple files because separation of code into manageable modules is good. Unfortunately, the differing way that packages are treated in ST2 vs ST3 makes referring to these files difficult.

In Sublime Text 2, all files in packages are in a great big “sublime” namespace. Any package can import modules from any other package, perhaps accidentally.

For instance, in ST2, a plain import of comment gets us the Default.comment module, which provides the built-in “toggle comment on a line” functionality. Unless some other package has a comment.py, in which case what we’ll get becomes order-of-execution dependent.

Note the fun side-effect of this: if any package has a file which shares a name with anything in the standard library, it’ll “shadow” that and any other package which then tries to use that part of the standard library will break.
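To make the order-dependence concrete, here’s a rough simulation that runs outside Sublime. It’s a sketch, not how ST2 actually loads plugins: the OtherPackage name and both helper functions are made up for illustration, and a pair of directories on sys.path stands in for ST2’s shared namespace.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, package_name):
    """Create root/<package_name>/comment.py, tagged with the package name."""
    package_dir = os.path.join(root, package_name)
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "comment.py"), "w") as f:
        f.write("OWNER = %r\n" % package_name)
    return package_dir


def import_comment(search_order):
    """Import 'comment' with the given directories first on sys.path."""
    sys.modules.pop("comment", None)  # forget whichever module won last time
    saved_path = list(sys.path)
    sys.path[:0] = search_order
    try:
        return importlib.import_module("comment")
    finally:
        sys.path[:] = saved_path


root = tempfile.mkdtemp()
default_dir = make_package(root, "Default")
other_dir = make_package(root, "OtherPackage")  # hypothetical third-party package

# Same import, different load order, different module.
print(import_comment([default_dir, other_dir]).OWNER)
print(import_comment([other_dir, default_dir]).OWNER)
```

Whichever directory comes first wins, which is the same failure mode as a package file shadowing the standard library.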

Because of these drawbacks, Sublime Text 3 made the very sensible decision to make every package its own module. That is, to get that comment module, we need to import it through its package, as Default.comment.

This is better, and makes it harder to accidentally break other packages via your own naming conventions. However, it does cause compatibility problems in two situations:

  1. You want to access another package
  2. You want to use relative imports to access files in your own package

In the latter case, relative imports behave differently depending on whether your code is inside a module or not.

Editing text

In Sublime Text 2 you had to call edit = view.begin_edit(...) and view.end_edit(edit) to group changes you were making to text, so that undo/redo would bundle them together properly.

In Sublime Text 3, these were removed, and any change to text needs to be a sublime_plugin.TextCommand which will handle the edit-grouping itself without involving you.

The Solution (sort of)

If you want to write a plugin that works on both versions, you have to write Python that runs on 2 and 3, and you have to tread very carefully around relative imports.

Python 2 / 3

A good first step here is to stick this at the top of all your Python files:
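Something along these lines, as a sketch. The exact set of future imports is a judgment call, and the two variables underneath are only there to show the effect:

```python
from __future__ import absolute_import, division, print_function, unicode_literals

half = 1 / 2    # true division on both versions: 0.5, never 0
label = "text"  # a text (unicode) string on both versions, not bytes

print(half, type(label).__name__)
```

On Python 3 these imports are accepted and change nothing, so the same header is safe on both versions.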

This gets Python 2 and 3 mostly on the same page; you can largely just write for Python 3 and expect it to work in Python 2. There are still some differences to be aware of, mostly in areas where the standard library was renamed, or when you’re dealing with points where the difference between bytes and str actually matters. But these are workable-around.

For standard library reshuffling, checking exceptions works:
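A minimal sketch, using urlparse as the example of a module that moved between versions (which module you need will depend on your package):

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location

# From here on, the rest of the file doesn't care which branch ran.
print(urlparse("https://example.com/a/b?x=1").netloc)
```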

If your package relies on something which changed more deeply, more extensive branching might be required.

Imports

If you want to access another module, as above, this is a sensible enough place to just check for exceptions.
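As a sketch of that pattern, here it is wrapped in a helper so it can run outside Sublime. The import_from_package name is hypothetical, not part of any Sublime API; it tries the ST3-style packaged name first and falls back to the ST2-style bare name.

```python
import importlib


def import_from_package(package, module):
    """Return package.module if packages are modules (ST3), else bare module (ST2)."""
    try:
        return importlib.import_module(package + "." + module)
    except ImportError:
        return importlib.import_module(module)


# Outside Sublime, standard library modules serve as stand-ins:
print(import_from_package("os", "path").__name__)
print(import_from_package("NoSuchPackage", "json").__name__)
```

Inside Sublime, the call for the earlier example would be import_from_package("Default", "comment").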

You could check for the version of Sublime, of course, but the duck-typing approach here seems more Pythonic to me.

When accessing your own files, what made sense to me was to make it consistent by moving your files into a submodule, which means that the “importing a file in the same module” case is all you ever have to think about.

Thus: move everything into a subdirectory, and make sure there’s an __init__.py within it.

There’s one drawback here, which is that Sublime only notices commands that are in top-level package files. You can work around this with a my_package_commands.py file, or similar, which just imports your commands from the submodule:
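Here’s a sketch of such a file, wrapped in a simulation so it can be run outside Sublime on a recent Python 3. Everything named here (MyPackage, mypkg_core, ExampleCommand) is hypothetical. The part that matters is SHIM_SOURCE, which is what the top-level file would contain; the rest builds a throwaway package on disk and loads it both ways.

```python
import importlib
import os
import sys
import tempfile

SHIM_SOURCE = '''\
try:
    # ST3: this file is part of a package, so a relative import works.
    from .mypkg_core.commands import *
except (ImportError, ValueError):
    # ST2: the package directory itself is on the import path.
    from mypkg_core.commands import *
'''

COMMANDS_SOURCE = '''\
class ExampleCommand(object):
    """Stand-in for a real sublime_plugin command class."""
'''


def write_file(path, source):
    with open(path, "w") as f:
        f.write(source)


# Hypothetical layout: MyPackage/my_package_commands.py plus a mypkg_core/ submodule.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "MyPackage")
os.makedirs(os.path.join(package_dir, "mypkg_core"))
write_file(os.path.join(package_dir, "my_package_commands.py"), SHIM_SOURCE)
write_file(os.path.join(package_dir, "mypkg_core", "__init__.py"), "")
write_file(os.path.join(package_dir, "mypkg_core", "commands.py"), COMMANDS_SOURCE)

# Loaded as part of a package (roughly what ST3 does).
sys.path.insert(0, root)
as_package = importlib.import_module("MyPackage.my_package_commands")

# Loaded as a bare top-level file (roughly what ST2 does).
sys.path.insert(0, package_dir)
as_top_level = importlib.import_module("my_package_commands")

print(hasattr(as_package, "ExampleCommand"), hasattr(as_top_level, "ExampleCommand"))
```

Either way the command class ends up visible in the top-level file, which is where Sublime looks for it.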

There’s one last quirk to this, which only applies to you during package development: Sublime Text only reloads your plugin when you change a top-level file. Editing a file inside the submodule does nothing, and you have to restart Sublime to pick up the changes.

I noticed that Package Control has some code to get around this, so I copied its approach in my top-level command-importing file, making it so that saving that file will trigger a reload of all the submodule contents. It has one minor irritation, in that you have to manually list files in the right order to satisfy their dependencies. Although one could totally work around this, I agree with the Package Control author that it’s a lot simpler to just list the order and not lose oneself in metaprogramming.
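The core of the idea, as a sketch rather than a copy of Package Control’s code: keep a hand-ordered list of module names and reload whichever of them are already loaded. The reload_in_order name is made up, and the json modules are stand-ins for a plugin’s own submodule files.

```python
import sys

try:
    from importlib import reload  # Python 3.4 and later
except ImportError:
    from imp import reload  # Python 3.3, as used by ST3

import json.decoder  # stand-in for a plugin's own submodules


def reload_in_order(module_names):
    """Reload already-imported modules in the order given; skip the rest."""
    reloaded = []
    for name in module_names:
        module = sys.modules.get(name)
        if module is not None:
            reload(module)
            reloaded.append(name)
    return reloaded


# Dependencies first, then the modules that import them.
print(reload_in_order(["json.decoder", "json", "json.not_loaded_yet"]))
```

In a plugin, the call would sit at the top of the command-importing file, so saving that file refreshes everything beneath it.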

Editing text

Fortunately, sublime_plugin.TextCommand exists in Sublime Text 2, with the same API signature as in Sublime Text 3, so all you have to do here is wrap all text-edits into a TextCommand that you execute when needed.
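As a sketch of what that wrapping looks like: the command name and its arguments are hypothetical, and the fallback class exists only so the snippet can be run outside Sublime.

```python
try:
    import sublime_plugin  # only importable inside Sublime Text
    TextCommand = sublime_plugin.TextCommand
except ImportError:
    class TextCommand(object):
        """Stand-in so this sketch runs outside Sublime; not the real class."""
        def __init__(self, view):
            self.view = view


class ExampleInsertTextCommand(TextCommand):
    """Hypothetical command: insert text at a position as one grouped edit."""

    def run(self, edit, position=0, text=""):
        # Sublime supplies `edit`; we never call begin_edit / end_edit ourselves.
        self.view.insert(edit, position, text)
```

Inside Sublime you’d trigger it with view.run_command("example_insert_text", {"position": 0, "text": "hello"}) on both versions.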

Conclusion

Getting a package working in Sublime Text 2 and 3 simultaneously is entirely doable, though there are some nuisances involved, which is appropriate given that “run in Python 2 and 3 simultaneously” is a subset of the problem. That said, if you do what I suggest here, it should largely work without you having to worry about it.

Wikimedia

I mentioned that I hadn’t been updating this blog, and that wasn’t just a matter of there being nothing to talk about.

Back in July I got laid off by DeviantArt. Since that was their second layoffs round of 2015, I think it’s fair to say that they’re having some problems.

This was non-ideal for me. In retrospect, I should probably have started looking around for a new job after the first layoffs round, but I’ll count that as a learning experience.

Fortunately, I then spent a month on downtime and relaxing, because I’d been terrible at taking vacation time at DeviantArt and they thus had to pay out a lot of vacation hours to lay me off.

Now I’m part of the Visual Editor team at the Wikimedia Foundation. I help people edit Wikipedia, essentially.

Migrating from Jekyll to WordPress

Funnily enough, there aren’t all that many resources for people who’re moving from Jekyll to WordPress. I took some advice from a post by Fabrizio Regini, but had to modify it a bit, so here’s what I figured out…

My starting point was a Jekyll-based site stored on github. Comments were stored using Disqus.

As a first step, I installed WordPress on my hosting. This was, as they like to boast, very easy.

Next I had to get all my existing content into that WordPress install. I decided the easiest way to do this was to use the RSS import plugin that WordPress recommends. So I added an RSS export file to my Jekyll site and ran Jekyll to have it build a complete dump of all my posts which I could use.

Here I ran into a problem. I’d set up my new WordPress site on PHP 7… and the RSS importer wasn’t able to run because it was calling a removed function. It was just a magic-quotes-disabling function, so I tried editing the plugin to remove it. However, after doing this I found that running the importer on my completely-valid (I checked) RSS file resulted in every single post having the title and contents of the final post in the file. So, plugin debugging time!

While doing this I discovered that the RSS importer was written using regular expressions to parse the XML file. Although, yes, this is about as maximally compatible as possible, I decided that it was better not to go down the rabbit hole of debugging that, and just rewrote the entire feed-parsing side of it to use PHP’s built-in-since-PHP-5 SimpleXML parser. This fixed my title/contents problem.
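The plugin itself is PHP, but the idea is easy to show in Python: hand the feed to a real XML parser and walk the items, rather than pattern-matching on the text. This is a sketch with a made-up two-post feed, not a port of the plugin.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for the Jekyll-generated export.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title><description>Hello</description></item>
    <item><title>Second post</title><description>World</description></item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return one dict per <item>, each with its own title and content."""
    root = ET.fromstring(feed_xml)
    posts = []
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "content": item.findtext("description", default=""),
        })
    return posts


for post in parse_items(SAMPLE_FEED):
    print(post["title"], "-", post["content"])
```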

My version of the plugin is available on github. I can’t say that I tested it on anything besides the specific RSS file that I generated, but it should be maintaining the behavior of the previous plugin.

With all my posts imported, I went through and did a little maintenance:

  • The import gave me post slugs which were all auto-generated from the title, while some of mine in Jekyll had been customized a bit, so I updated those to keep existing URLs working.
  • All images in posts needed to be updated. I went through and fixed these up by uploading them through WordPress.
  • Some markup in posts needed to be fixed. Mostly involving <code> tags.

Next came importing comments from Disqus. I tried just installing the Disqus plugin and letting it sync, but it seems that relies on you having WordPress post IDs associated with your comments… which I naturally didn’t. So I went out and found a Disqus comment importer plugin… which, much like the RSS importer, was broken. It expects a version of the Disqus export file which was current around 5 years ago, when it was last updated.

Thus we have my version of the Disqus comment importer plugin. It tries to work out the ID of your posts by looking at the URL. This works pretty well, but I did have to edit a few of the URLs in the export file to make sure they matched my current permalink structure. If you’ve never changed your permalinks, you should be good without that step.
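The matching step can be sketched in Python like this. The real plugin is PHP and works against WordPress itself; here a plain dict of paths to post IDs stands in for that lookup, and all names and URLs are made up.

```python
from urllib.parse import urlparse


def path_key(url):
    """Reduce a URL to its path, ignoring scheme, host and trailing slash."""
    return urlparse(url).path.rstrip("/")


def match_threads_to_posts(thread_urls, post_ids_by_path):
    """Split comment-thread URLs into those with a known post and those without."""
    matched, unmatched = {}, []
    for url in thread_urls:
        post_id = post_ids_by_path.get(path_key(url))
        if post_id is None:
            unmatched.append(url)  # these are the ones needing a manual URL edit
        else:
            matched[url] = post_id
    return matched, unmatched


posts = {"/2015/01/hello-world": 12}
threads = [
    "http://old.example.com/2015/01/hello-world/",
    "http://old.example.com/blog/renamed-post/",
]
print(match_threads_to_posts(threads, posts))
```

Anything that lands in the unmatched list corresponds to a URL that no longer fits the current permalink structure.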

Migration: complete.

WordPress Again

I haven’t been updating this site very often. Upon reflection, I decided that this is in part because the Jekyll workflow that I switched to was… inconvenient.

It would be possible to hack around this. I could have written some sort of simple web-app which generated a new post, committed it to git, pushed it to github, built the site, and sync’d it onto my hosting. That’d keep the ridiculous performance / security benefits of a static site, while still letting me make quick updates from wherever I happen to be. It’d even be fairly easy, at least to get something basic working.

But. I don’t really want to do that. The point of using a system like Jekyll or (before it) WordPress is to offload that particular bit of work onto someone else, who can pay attention to all of those details for me.

So, here I am on WordPress again. Hopefully, after a bit more than four years, I won’t find myself getting hacked again. 😛

Why WordPress again? Well…

It’s really popular. This does count for something. Automattic likes to point out that around a quarter of the public web runs on it. This means there are a lot of resources available.

To keep some of what I liked from Jekyll, I’m using Automattic’s Jetpack plugin. This gets me a lot of the fancy features from WordPress.com, including letting me keep using Markdown to write these posts. I’m also using the WP-Super-Cache plugin, because it seems that even now running uncached WordPress is just asking for trouble.

I’ll write another post soon about how to migrate from Jekyll to WordPress. There were a few bumps along the way.

Hubot

My employer has long used Skype as a team communication tool. This has some drawbacks, as I found myself complaining about way back in 2011, mostly that Skype is very much not optimized for big long-running rooms, particularly on mobile devices.

Given this, why have we stuck with it?

  • If we switch, everyone in the company needs to change their workflow to use some new tool, and most people don’t want to do this very much.
  • As such, we want to be really sure that whatever we switch to is sufficiently better than Skype that we won’t have to switch again anytime soon, because it’ll be an even harder sell to do so.
  • …but Skype really is fantastic at the “call a bunch of people and chat, without having to care about network settings” use-case. So we either need something else that’s fantastic at it, or something which’ll make it easy to keep using Skype to call a group of people you’re chatting with.
  • We have existing tools set up around Skype. We wrote a bot that announces stuff we care about in our chats, which we’re all very used to having around.

That last point gives us some incentive to make a switch now, as Skype decided to discontinue important parts of its API back in late 2013. This means that our existing integration is slowly falling apart, as the old version of Skype it has to work with becomes unable to interact with newer clients. It recently reached the point where it cannot send messages in rooms created by newer clients, which makes it effectively useless for new projects.

So. We’re kind of looking at Slack, and part of working out if we like it is getting our bot in there, so we can see how it feels with our normal workflow. However, our bot is just a thin wrapper around Skype4Py, and porting it to use the Slack API would effectively mean rewriting it in full… which seems to be potentially wasted effort.

Enter the Hubot

Hubot is a chat bot framework, with adaptors for approximately everything. It’s fairly popular amongst the hip tech-company crowd, which our company is entirely too long to consider itself a part of.

So I decided to port our custom stuff to hubot scripts. This turned out to be pretty easy, so long as I kept the CoffeeScript reference open in a browser tab.

I’ve written:

  • A subscribe-to-deviantart-events script, which lets users/rooms sign up to be notified of the events our existing skype-bot was already announcing.
  • A Zendesk script, which can be set up as a “target” on Zendesk so we can feed tickets into the aforementioned notifier. As a bonus, I gave our helpdesk a new system they’d been requesting for ages, which announces if we’ve received more than X tickets over the last Y hours, as a warning that there might be a serious issue.
  • A Phabricator integration to expand references to Phabricator objects (tickets, code-reviews, commits, etc). This one I’ve actually stuck up on github for general use, since I think it has nothing DA-specific in it.
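The “more than X tickets over the last Y hours” check from the Zendesk script is a small sliding-window count. The real script is CoffeeScript for Hubot; this is a Python sketch of the same idea, with a made-up class name and plain numeric timestamps in seconds.

```python
from collections import deque


class TicketSurgeDetector(object):
    """Flag when more than max_tickets arrive within window_seconds."""

    def __init__(self, max_tickets, window_seconds):
        self.max_tickets = max_tickets
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def record(self, timestamp):
        """Record one ticket; return True if the window is now over the limit."""
        self.timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop tickets that have aged out of the window.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps) > self.max_tickets


detector = TicketSurgeDetector(max_tickets=2, window_seconds=3600)
print([detector.record(t) for t in (0, 10, 20, 4000)])
```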

Slack seems nice, so hopefully we’ll settle on it. But if we don’t, at least I’ve invested my time in something transferable.

Video Killed the Tutorial Stars

I loathe having to watch a video or listen to audio to extract information, and would far prefer to read an article explaining the same information.

It’s therefore unfortunate that most of the time when I want to find out how to do something nowadays, a Google search will turn up almost-exclusively video results. Particularly as one gets into niche areas… “minecraft iron golem farm”, for example, seems like an area where some detailed diagrams would be vastly better than watching 30 minute videos.

I’m apparently a minority in holding this opinion. Drat.

Hosting Switch

Recently I switched my personal hosting from Dreamhost to WebFaction. I’d been butting up against the resource limits on Dreamhost’s cheap plan for ages, and an annoying multi-day outage was the last straw.

The outage was actually pretty interesting, in its way. I discovered that all sites served from a particular user account were having their host processes instantly killed. Okay, I assumed that I was being hit by some crazy-aggressive spider, and I’d have to go throttle something. Then I tried to ssh in, and discovered that my login shell got insta-killed. Problem.

Eventually, via their web panel, I migrated all the sites on that user account to a different user account, and thus discovered that there was no unusual load at all. The process killer had just gone mad, and was killing any process owned by that user, without any reason. Dreamhost did eventually restore access to that account, but it took something like three days.

In that time I did a bit of research about alternatives! WebFaction came highly recommended from some of my coworkers and had a 60 day money back guarantee, so I felt they were worth a shot. They’ve turned out to work very well.

What I get out of this is:

  • Fewer resource constraints. For one thing… CPU and memory consumed by the server-wide Apache/MySQL/Postgres instances don’t count against your plan limits.
  • Less heavily-loaded servers. This article is accurate, from my own tests.
  • Focus on long-running apps. Dreamhost was very much a PHP host. You could run other stuff on it, but it clearly wasn’t what they intended.
  • Memcache installed on all servers. Thank $deity.

I’ve lost:

  • A bit of hand-holding. Dreamhost was pretty good at doing things for you with a checkbox, like redirecting www to the subdomainless domain if you wanted. With WebFaction I had to write my own little www-remover app for it. (Which was simple, but still.)
  • Sites hosted under multiple user accounts. I always liked that as an extra little burst of security.
  • (Related to that last point…) a collection of cracker backdoor scripts that had been installed via some compromised WordPress themes and eventually been neutered by me.
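For the curious, a www-remover of the sort mentioned above can be about this small. This is a sketch as a bare WSGI app, not the code I actually deployed, and it assumes the app is only ever mounted on the www hostname.

```python
def www_remover(environ, start_response):
    """Redirect any request to the same URL with a leading 'www.' stripped."""
    host = environ.get("HTTP_HOST", "")
    if host.lower().startswith("www."):
        host = host[4:]
    location = "%s://%s%s%s" % (
        environ.get("wsgi.url_scheme", "http"),
        host,
        environ.get("SCRIPT_NAME", ""),
        environ.get("PATH_INFO", ""),
    )
    query = environ.get("QUERY_STRING", "")
    if query:
        location += "?" + query
    start_response("301 Moved Permanently", [
        ("Location", location),
        ("Content-Length", "0"),
    ])
    return [b""]
```

Any WSGI server can host it; wsgiref.simple_server from the standard library is enough for trying it locally.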

All in all, I think I’d still recommend Dreamhost to relatively non-technical people. If all you want is to host a generic PHP package, WebFaction is going to be confusing.

One final note: it having been a long time since I last switched hosting plans (I’d been on Dreamhost since 2005), I was slightly amused to notice that it took me longer to bzip up a multi-gig database dump than it probably would have to just scp the uncompressed file across.