I just watched Linus Torvalds talking about git, the distributed version control system he wrote.

What struck me here is that several times in this talk he was asked by Google employees variations on the theme of “why should we use git?”, and he didn’t have a compelling answer. His statements boiled down to “we have an alternative way of visualizing branches, and subtly different workflows”.

Many of his “git is awesome” points were rooted in specific objections to CVS. In particular, the complaints about branching and merging, and finding changes to a particular subset of a repository. I’m not saying git isn’t good at these things, I just think he should be comparing to SVN, which enables fewer cheap shots.

None of this is to imply that I don’t like git. I’ve been playing with it recently, and it seems impressive. I can’t say that its distributed nature feels very essential to me – but then, I’m not a member of a massive pseudo-anarchistic project like the Linux kernel. I suspect that until you hit some critical mass of independent developers on a project, git and SVN are fundamentally interchangeable.

The things that have actually made me go “ooh” about git thus far are offline commits, and content-tracking across files.


As an amusing personal idiosyncrasy I run a fiction archive called FicWad. (I believe much Katamari Damacy had been played just before the name was chosen.)

I call this idiosyncratic because I don’t use it myself. I’m not, generally speaking, a fanfic-reading sort of person. I run it because my wife wanted to start a fiction archive, and I was dragged in to provide the technical side of it. She’s since drifted away, leaving me to play as tinpot dictator over the writing masses. (I am a very laissez-faire dictator, so this works pretty well for them.)

I treat it as a coding hobby project. It doesn’t actually make any noticeable money from the ads, so I don’t feel compelled to put effort in apart from when I feel interested.

I’ve come to the conclusion that this sort of hobby project is a really good thing to have. When you’re writing something that thousands of people use, they’ll scream at you if it doesn’t work. It provides incentive to work out how to do things right.

In particular, it provides incentive to work out how to do things yourself. A solo project like this doesn’t let you get away with passing the buck to someone else on your team who’s done something like this before. If it turns out that you need to optimize your SQL, or use caching, or write a prioritized mailing queue, or whatever, you have to learn about the problem area.

Yes, you’ll write some awful code. In fact, I had to rewrite the whole site from scratch earlier this year because back when I first wrote it I really didn’t understand SQL performance, and I had to choose between throwing money at it (better server, etc.) or fixing the real problem.

But I know I’m a better programmer for having it around. It forces me to confront issues outside of my comfort zone, and that can only be a good thing.

Python whitespace doesn’t matter

If you have some programming experience then there is one particular feature of Python that is likely to turn you off.

Whitespace matters. Indentation is significant.

This comes as a shock to many people who are used to it being meaningless. Most languages in common use designate code blocks with braces ({ ... }), or in some cases special keywords (e.g. Lua’s do/then ... end). If you want to write your program without any indentation whatsoever then those languages won’t stop you.

So having to pay attention to whitespace worries people. Puts them off trying Python. Makes them chafe at the thought of using it. Their freedom is being abridged!

This is really weird, since everyone agrees that consistent indentation is a good idea. In fact, it’s about the easiest and most effective thing you can do to make your code legible.

This is why objecting to syntactically meaningful whitespace in Python is a straw-man. All it’s doing is requiring you to not write unreadable and misleading code. In fact, if you have good habits already then you will never notice it. Ever.
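
To make that concrete, here’s a small sketch. The function and its names are mine, invented purely for illustration. It’s indented exactly the way you’d indent it for readability in a brace language; the only difference is that Python treats that indentation as the block structure.

```python
def classify(numbers):
    """Split numbers into evens and odds."""
    evens, odds = [], []
    for n in numbers:
        if n % 2 == 0:
            evens.append(n)
        else:
            odds.append(n)
    # Dedenting back to this level is what ends the loop body;
    # there is no closing brace or "end" keyword.
    return evens, odds


print(classify([1, 2, 3, 4]))
```

If you already indent like this, the “rule” costs you nothing.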

(This post is an example of low-hanging fruit for a programming blog. Something to post about early on, for the sake of getting your opinion out there.)

svn synonym

I’d like to take a moment to (a) completely alienate my audience, and (b) ruin my credibility. I will do this by discussing the semantics of a particular operation of the Subversion version control system, and explaining how poorly I initially understood it.

It took me a while to realize how svn merge should be used. My intuitive sense of the usage of the command diverged significantly from its actual use.

When I think of it as “merge” the syntax that intuitively appears is svn merge [source] [target], with the intent of merging “source” into “target”.

This led to my attempting to merge from branch HEAD to trunk HEAD, and being puzzled at the lack of effect.

I find that everything fits together if I think of svn merge as a synonym for svn repeat. You’re asking for the changes between two revisions to be applied to the current working copy.

Thus svn merge branches/coolfeature trunk doesn’t do anything because it doesn’t describe any changes.

What’s required is svn merge branches/coolfeature@73 branches/coolfeature@HEAD, which takes the changes made to the coolfeature branch between revision 73 and the latest revision, and applies them to the working copy you’re currently in.

(It’s much shorter to write this as: svn merge -r 73:HEAD branches/coolfeature)
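
If it helps, here’s that mental model as a toy Python sketch. This is not how Subversion is implemented. The dict-of-files “revisions” and the names changes_between and apply_changes are made up for illustration, and it simply overwrites whole files where a real merge would apply diffs and could hit conflicts.

```python
def changes_between(old, new):
    """Describe what changed between two snapshots (dicts of path -> content)."""
    changes = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            # None means the file was deleted in the newer snapshot.
            changes[path] = new.get(path)
    return changes


def apply_changes(working_copy, changes):
    """Replay a set of changes onto a working copy, returning the result."""
    result = dict(working_copy)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


# Hypothetical snapshots of branches/coolfeature at r73 and at HEAD.
branch_r73 = {"main.py": "v1", "util.py": "v1"}
branch_head = {"main.py": "v2", "util.py": "v1", "cool.py": "new"}

# The trunk working copy has moved on independently.
trunk_wc = {"main.py": "v1", "util.py": "v1-trunk-fix"}

merged = apply_changes(trunk_wc, changes_between(branch_r73, branch_head))
print(merged)
```

The working copy picks up the new main.py and cool.py, while trunk’s own fix to util.py survives because the branch never touched that file in the range being replayed.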

The awkward bit is finding out when you should start merging from (the “73” in my example). To do this you have to read the logs and find either the revision where you created the branch or the last revision you merged from, whichever is more recent.

All this should become irrelevant shortly, though. Subversion 1.5 (according to an aside in this developer blog post) will automatically track much merge metadata so the requirement to specify revision ranges of changes to merge should be eliminated in the common case. I look forward to it.

Yahoo! Pipes is awesome

I like Dan Savage’s column in the Seattle newspaper The Stranger. He also writes a blog for them. However, his entries are all mixed in with a great many other people’s entries, and there’s only an RSS feed for the amalgam.

This wasn’t a problem, because his entries were output at the bottom of the most-recent-column page, as well. But that seems to have stopped.

So I was faced with a dilemma. I wanted to read his entries without subscribing to a high-traffic blog.

I considered writing my own feed filterer, registering, and trying to get bought out by Google. Then it occurred to me that I should check whether someone beat me to it.

Googling for “feedfilter” (the first thing that came to mind) got me to the feedfilter project on Google Code. I thought to myself “this is only ‘a java program running as a CGI’… I bet it’s not hip enough for Google”. But then I noticed that link to Yahoo! Pipes. I checked it out, and my dreams of being a web2.0 billionaire died.

It turns out that Yahoo! Pipes is totally rad.

Not so much for what it does, as for how it does it. They have put the effort into making a really good GUI. It reminds me of Lego Mindstorms, or more recently of Automator for Mac OS X.

It’s this lovely system of connecting components together. I just had to pick a feed source, connect it to a filter, tell the filter I only wanted items whose author was “Dan Savage”, and hook it up to the output.
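
For the curious, the same source, filter, output chain could be sketched in a few lines of Python. This is only an illustration of the idea, not anything Pipes actually runs. The function names and the sample feed items are invented, and a real version would have to fetch and parse the RSS first.

```python
def fetch_items(feed):
    """Source: yield each item from a feed (here just a list of dicts)."""
    yield from feed


def filter_by_author(items, author):
    """Filter: pass through only the items written by the given author."""
    return (item for item in items if item["author"] == author)


def output(items):
    """Output: collect whatever made it through the pipe."""
    return list(items)


# Made-up sample data standing in for the combined blog feed.
combined_feed = [
    {"title": "Post one", "author": "Dan Savage"},
    {"title": "Post two", "author": "Someone Else"},
    {"title": "Post three", "author": "Dan Savage"},
]

my_feed = output(filter_by_author(fetch_items(combined_feed), "Dan Savage"))
print([item["title"] for item in my_feed])
```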

Now my custom feed works, and is providing me with much useful content in my Google Reader.

(I would like to acknowledge that this is old hat. Pipes was released months ago, and I even noticed people talking about it at the time. I didn’t realize quite how useful it is until I encountered this need, though.)

Yes, I said PHP

I mentioned that I got my start programming with PHP. Depending on the crowd you’re in, this can be a bit of an unsavory thing to admit to. Especially if you’re in the hip crowd, with their Django on Ruby or whatever.

Why’s that?

Well, PHP sucks. In numerous ways. Some of them are a matter of taste, but others are not.

  1. It has no namespaces.

    PHP is a very big language now. It has many modules. Its standard library contains many thousands of functions, all of which live in one big shared namespace. This is more an elegance issue than anything else, but it’s the sort of thing that really bugs a lot of people.

    Without namespacing it’s necessary to make function names more complicated to reflect what they’re intended for. So stripos has to convey that (a) it’s intended for use on strings, and (b) what it does to them. This wouldn’t have been so bad if the various module authors had stuck to a consistent naming scheme. Which leads me to…

  2. It has no consistent function naming scheme.

    Let’s take three functions: strip_tags, stripslashes, and stripos. strip_tags has an _, and stripslashes doesn’t. stripos doesn’t strip anything, it’s actually “find position of first occurrence of a case-insensitive string”. You can imagine how this starts to be irritating after a while.

  3. It’s easy to write insecure code.

    “Register globals” used to be the bugbear here. It automatically sticks every request variable (GET, POST, or cookie) into the global scope. Which means that if you go to a URL with ?foo=haxxor on the end then there’d be a variable called $foo containing "haxxor".

    But register globals is off by default now. Although most hosting companies will re-enable it for compatibility with crap apps. So.

    The database functions available are rather low-featured, and the “obvious” approach to putting together queries is just to stick user input into a string. It takes extra effort to (a) realize that you need to check input, and (b) actually check it. Thus newbies are unlikely to be protected.

    I’m not saying that it’s impossible to write secure code. However, insecurity is rather the default. Sites written in PHP are predisposed towards SQL injection attacks, and various manipulations.
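
Here’s a rough sketch of that last point about queries. It uses Python’s sqlite3 module rather than PHP’s database functions, since the problem is the same in any language, and the table, rows, and input string are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Pretend this arrived from a form field.
user_input = "nobody' OR '1'='1"

# The "obvious" approach: paste the input straight into the SQL string.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# The safer approach: a placeholder, so the input is only ever treated as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query matches every row in the table, because the input rewrote the WHERE clause. The placeholder version matches none, because there is no user with that odd name.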

There are a lot more reasons that people complain about PHP. Google will tell you of them. Suffice it to say that the problems I listed are the ones that matter to me.

Despite these complaints, PHP has its good points.

Notably, it’s very easy to get into. You can stick some special tags into an HTML document, and wham! you’ve written a dynamic website. You don’t have to think about models, views, controllers, or whatever.

PHP is great for the beginner, so long as you understand its flaws.

David Lynch has a blog

It is possible that this blog’s naming scheme could be construed as being a bit egotistical. I just want to reclaim my name from popular culture!

But enough of that. Who am I?

I’m a 23 year old who immigrated to the USA from England when I was 17. I possess many useful insights into the ways that Americans love Received Pronunciation. (The accent thing may come up more. I have anecdotes, I swear to dog.)

I’m a programmer. I think the first program I ever wrote was (from a tutorial) a calculator in Objective-C on a NeXT cube – a revelation that might possibly date me. I didn’t really get into that, though, and it took the discovery of PHP to make me realize that, hey, I could actually write useful things. Since then I’ve moved on to Python and Lua as my languages of choice, and am dabbling in Ruby.

I play World of Warcraft. This rather ties into my previous point, since I’m fairly sure that I’ve spent more time writing addons than actually playing. It takes all sorts. Anyway, it’s a thing I intend to talk about, so I disclose it here.

I recognize that this blending of topics makes this blog horribly unmarketable, and I am committing many a sin against recommended traffic-building practices here. Woe is me.