The Fugue Counterpoint by Hans Fugal

24 Aug 2009

Terminal Merge Conflict Resolution

A very important tool in the toolbox of any collaborating developer is a merge conflict resolution tool. OS X has the fantastic FileMerge, and there are various graphical tools for Linux like kdiff3, but I have yet to hear of one for the terminal. There's vimdiff, but it is really not up to the task of merge conflict resolution (doesn't handle 3-way diffs). There's probably something in emacs, just because there's always something for emacs. Emacs users, please enlighten me; I'm not above using emacs for merge-conflict resolution. Might even be the gateway drug.

It doesn't seem overly hard (at least, no harder than writing kdiff3 or FileMerge) to make an ncurses tool that will take a 3-way merge and let you efficiently choose A, B, or edit for each diff section. Can it really be that nobody has done it yet?
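The core of such a tool is small. As a rough, non-interactive sketch (no ncurses here), the shell function below walks the conflict markers that a 3-way merge such as `diff3 -m` leaves in a file and keeps one side of every conflicting section. The name `pick_side` and the side labels `ours` and `theirs` are made up for illustration; a real tool would ask per section and offer an edit option as well.

```shell
# pick_side SIDE FILE: print FILE with every conflict region resolved to SIDE.
# SIDE is "ours"   (the text between <<<<<<< and =======) or
#         "theirs" (the text between ======= and >>>>>>>).
# pick_side is a hypothetical helper, not part of any existing tool.
pick_side() {
  awk -v side="$1" '
    /^<<<<<<<( |$)/                               { state = "ours";   next }
    state != "" && /^[|][|][|][|][|][|][|]( |$)/  { state = "base";   next }
    state != "" && /^=======$/                    { state = "theirs"; next }
    state != "" && /^>>>>>>>( |$)/                { state = "";       next }
    # Outside a conflict, print everything; inside, print only the chosen side.
    state == "" || state == side                  { print }
  ' "$2"
}
```

Something like `pick_side theirs conflicted.c > resolved.c` would then take side B wholesale. The interesting part of a real tool is the per-section prompt wrapped around this loop.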

16 Nov 2007

Mercurial and Darcs

I'm a long-time distributed revision control user and advocate. I think it's the only way to go, but this article isn't about convincing you of that. I'll just take it for granted.

From almost the beginning, I have used darcs. I did try Tom's, ahem, Lame Arch, which I found to be completely unusable. Darcs hasn't been around as long as arch, but it is in roughly the same generation as other early systems like monotone and bazaar. I think of these as the second wave of distributed revision control (the first wave being, basically, tla). When I read about darcs I was impressed by the theory of patches. When I tried darcs I was very impressed by how easy it is to use. Soon it became second nature and I never looked back.

But I did look forward, and sideways. When collaborating on a project using svn I gave svk a whirl. I don't hesitate to recommend svk if you're otherwise stuck with svn.

When the third wave of distributed revision control came along, ushered in by the Great Linux/BitKeeper Fiasco, I took a cursory look at git. A couple years later (aka earlier this year) I took a closer look. I can see that it has come a long way (in user interface). You don't need "chrome" or whatever they used to call the wrappers like cogito anymore. It's fairly usable. But it's also fairly complicated (that's an understatement, if you left your understatement detector at home). It's perfectly tailored for what Linus and other kernel developers need, but that's not necessarily what the rest of us need. I wouldn't hesitate to nod my head in agreement if you moved from CVS or subversion to git.

Not entirely satisfied with git, and not being able to use darcs (because of the dreaded exponential merge, and general slowness), I looked at Mercurial (Hg from now on). I found that it was fast like git, but easy to use like darcs. The documentation is excellent, probably the best in the game. The underlying theory is easy to grok, and this is important. Darcs is the same way, although the two theories are definitely different. Hg has served me well in the application where I couldn't use darcs, but I remained a darcs user for other things.

But mercurial has been seeping into my consciousness of late. Rumors surface from time to time of darcs users migrating to mercurial (it seems to be the only one they migrate to). Other people praise it too. I find it plenty easy to use even as a second-class arrow in my quiver. Finally it gained enough ground that I decided to use it for a project where I'm actually exercising the version control system. I have really liked what I've seen. So the rest of this already-long post will be a comparison of darcs and mercurial. I'm not saying either one or even both together are "the best" by any objective measure, though I definitely prefer them over others myself. I don't know that I'll come to any conclusion over whether to use darcs or mercurial in the future (I haven't yet, anyway, though I'm leaning towards mercurial). I just want to get my comparison out there. Note also that I'm probably not going to mention every feature that I love that they both have in common. As such this isn't evangelism for either one over the rest of the pack. Just a comparison of what distinguishes these two.

First, darcs. Darcs has excellent support for cherry picking. Its user interface is second to none. It is easy to mail a set of patches to another darcs user or to someone in general. The theory of patches is very interesting and flexible, and easy to follow (until strange things happen, when it gets really hard to follow). It's written in Haskell by a physicist, which has got to count for something. darcs record, which will ask you whether you want to check in each hunk that's changed in your working directory, rather than an all-or-nothing commit, is a joy to work with (especially for those of us who seem to lack the discipline to check in features and bugfixes in neat little packages).

On the other hand, darcs uses _darcs to store its metadata, which is probably for cross-platform reasons but is definitely an eyesore (.darcs would be better). It's annoyingly inconvenient to grab a specific revision (partly because such a concept doesn't really exist) or a specific point in time. Darcs has no idea of branching. Or more precisely, every darcs repository is its own branch and so it leaves that up to you. That's theoretically ok, but I do prefer to keep the clutter of multiple working copies at bay. In-place switching to a particular branch or revision, as git and hg do it, is really nice: you don't have to rebuild the whole project from scratch, and you don't waste disk space on another large working directory. Darcs doesn't support symlinks or permissions, again for cross-platform reasons. The two real thorns in the side of darcs are the dreaded exponential merge (I have run into it, alas), and the fact that I have a devil of a time getting it to run everywhere I am. Haskell is not that common, and it's a big thing to have to build, e.g. with MacPorts. That's when it will build at all (but that MacPorts rant is for another post). There are binaries for Windows and Mac, but sometimes they don't work and versions don't match up... it can be a nightmare. It usually works out, but it can be difficult.

I mentioned cherry picking. One UI flaw in darcs that greatly reduces the real-world utility of its excellent cherry picking support is that you can't tell it that you would prefer to refuse this patch for ever and ever. Every time you pull you have to tell it "no, I don't want that patch". This makes maintaining similar but subtly different branches impractical. A better tool for this job is quilt. In practice, I've done precious little cherry picking. It's always nice to know I can, but after some deep thought on the subject over the last couple days I'm not at all convinced that it's worth supporting in the revision control system. More on that in another post.

Now, mercurial. It's fast, easy to use, and easy to install. It is written in Python (with some C) and works well on all three major platforms (Linux, Mac, and Windows). The .hg directory is not an eyesore. hg is quick and easy to type. It supports branching. It has a powerful extension mechanism, with many interesting extensions available. The theory is easy to grok and follow, so you're not surprised very often. The implementation seems excellent, from the point of view of both a user and a system administrator. CVS-like abbreviations for minimal typing are handy (like hg ci for hg commit). I've already mentioned the excellent documentation. I might as well mention an excellent Google Tech Talk given by the author of said documentation. hg clone is fast and efficient and uses hardlinks where it can (for the metadata, not the working directory). It has a built-in web server for quick and easy web-based collaboration (for firewall reasons or for an interface to the repo for non-hg users). It's storage efficient, and has the novel and all-important optimization principle of avoiding disk seeks. That hg, written in Python, gives git a run for its money shows that the devs know what they're doing when it comes to systems programming. Mercurial Queues is extremely cool and useful.

As for the cons, although the theory is easy to grok and doesn't surprise you much, it will surprise you as a newcomer until you grok the theory. This is especially true of pull and push, which don't update the working directory. Some of the most useful stuff is provided by extensions that are disabled by default (at least they are distributed): fetch, record (for darcs-like hunk-by-hunk recording), mq, bisect, and transplant (for cherry picking). It doesn't have the cherry picking abilities of darcs, though there is the record extension (for cherry picking from your working directory) and the transplant extension (for cherry picking proper). I haven't used the latter yet so I don't know if it works well or not. .hgignore is easy to manage, but comes with almost no useful defaults. This is for performance reasons apparently, but I'd be willing to take the hit. Let me override with a minimal .hgignore if performance matters that much on this project. I don't like that I have to type ssh urls out, as ssh://example.com//path/to/repository, and it's too difficult to set the default push/pull location so I end up having to type it more than I'd like. Directories aren't first class, which makes me slightly uneasy. It hasn't caused me grief yet though.
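For reference, both the bundled extensions and the default push/pull location are controlled from a plain-text hgrc file. The following is a minimal sketch, assuming the `[extensions]` and `[paths]` sections of a repository's `.hg/hgrc`. The helper name `append_hgrc` is made up, the URL is the placeholder used above, and older releases may want the `hgext.` prefix (for example `hgext.mq =`), so check the documentation for the version you run.

```shell
# append_hgrc REPO: append extension and path settings to REPO/.hg/hgrc.
# append_hgrc is a hypothetical helper, not an hg command.
append_hgrc() {
  cat >> "$1/.hg/hgrc" <<'EOF'
[extensions]
# Extensions named in this post; an empty value loads the bundled copy.
# bisect is left out because later releases ship it as a core command.
record =
mq =
fetch =
transplant =

[paths]
# Location used when hg pull or hg push is run without a URL.
default = ssh://example.com//path/to/repository
EOF
}
```

Calling `append_hgrc .` from the top of a working copy would append these settings for that repository only. The same `[extensions]` section can also go in a per-user `~/.hgrc`.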

I don't like that you don't get compatible repositories if you start from the same code. For example:

tar xzf foo-1.0.tar.gz
rsync -a foo-1.0/ foo-1.0-2/
cd foo-1.0
hg init
hg ci -Am 'initial'
cd ../foo-1.0-2
hg init
hg ci -Am 'initial'

At this point you cannot pull or push from one of these repositories to the
other, although they are semantically identical. I do this from time to time
with darcs.
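This follows from how changesets are identified: the ID is a hash over the changeset data, and that data includes the commit date and user as well as the files, so two identical trees committed separately still end up with different root changesets. The sketch below only imitates the idea with `sha1sum` (assumed to be installed) over an invented record layout. `toy_changeset_id` and its fields are hypothetical and are not Mercurial's real format.

```shell
# toy_changeset_id DATE USER: hash a made-up changeset record.
# The record layout is invented for illustration; it is NOT Mercurial's format.
toy_changeset_id() {
  printf 'tree:%s\nuser:%s\ndate:%s\ndesc:initial\n' "same-files" "$2" "$1" \
    | sha1sum | cut -d' ' -f1
}

# Same tree, same user, same message, committed five seconds apart.
id_a=$(toy_changeset_id '2007-11-16 10:00:00' 'me@example.com')
id_b=$(toy_changeset_id '2007-11-16 10:00:05' 'me@example.com')

# The IDs differ, so neither history contains the other's root changeset.
[ "$id_a" != "$id_b" ] && echo "unrelated roots: $id_a vs $id_b"
```

With no changeset in common, the two repositories have nothing to anchor a pull or push on, even though the files are byte-for-byte the same.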

Last (and while it's not least, it's certainly not the biggest deal either), mercurial's website is ugly.

So there you have it. Look for a post on quilt and MQ and some thoughts on cherry picking soon.

1 Jun 2006

svk

The problem: Ardour uses Subversion, but I'm addicted to distributed revision
control systems. Actually, svn and I would have got along just fine if it
weren't for svn merge. What an embarrassment for svn lovers everywhere! You
have to manually dig up which revisions to merge with. svn doesn't keep track
of what's already been merged so you also have to be careful not to merge the
same stuff twice. To add insult to injury, you have to type those full, long
svn URLs too. So what would be darcs pull becomes something like:

svn merge -r 536:543 \
  svn+ssh://ardoursvn@ardour.org/ardour2/trunk \
  svn+ssh://ardoursvn@ardour.org/ardour2/branches/region-plugins

Ick.

So I started investigating gateways from svn to darcs or git, either of which I would have happily used. Tailor seems like a good way to do it, but I had a hard time wrapping my head around the bidirectional setup. git-svnimport looks promising for a git solution, but before I got a chance to try it I looked at svk. svk is perfect for this situation.

If you're not aware, svk is a distributed RCS front-end to svn. I knew about it
before but had always thought of it as a hack marrying the worst of two worlds.
Since my last look it has improved considerably, and my approach has also
changed as I'm now looking at it as a developer in an svn project, not the guy
setting up a repo and wondering which system to use.

svk is mostly like svn, except you mirror the repo on your hard disk and can do
disconnected development. To be honest I haven't looked at the truly
distributed aspects of svk (if they exist), but rather I have focused on what I
needed: disconnected operation with the ability to create local branches with
easy merging, and work with the existing svn repo. svk does these things very
well.

Here's my new workflow, from the point of installing svk:

# setup
svk mirror svn+ssh://ardoursvn@ardour.org/ardour2 //mirror/ardour2
svk sync //mirror/ardour2
cd ~/src
svk co //mirror/ardour2

# branch
cd ardour2
svk cp //mirror/ardour2 //local/region-plugins
svk switch //local/region-plugins

# edit stuff then check in
svk ci

# merge in trunk changes to my branch
svk pull

# merge my branch back into the trunk
svk push

Learn more about svk by reading "SVK, A Visual Guide", Jonathan Weiss' blog entry, and the Bieber Labs tutorial.