The Fugue Counterpoint by Hans Fugal

6 Aug 2010

OSX ignores ownership on external drives by default.

I had reason to copy my hard drive off and back (to reformat it as case-sensitive), and I missed one very important detail.

http://www.egg-tech.com/mac_backup/

IMPORTANT STEP: disable "Ignore ownership on this volume".

Yeah, with everything owned by user 99, it won't even boot. You can boot into single-user mode and "chown -R root:wheel /" which will at least let it boot, then you can go into Disk Utility and repair permissions (which will actually do something useful and vital in this situation). Then chown -R your home directory. But it's still a mess to clean up, and you lose all that user and group info.
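
Before rebooting from a restored copy, it may be worth checking how much of it ended up owned by user 99. A minimal sketch, assuming a POSIX find; the helper name count_owned_by and the /Volumes/Restored path are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: count the entries under a directory that are owned
# by a given numeric uid (99 is the owner the post describes).
count_owned_by() {
    uid=$1
    dir=$2
    # -xdev keeps find on one filesystem; -user also accepts a numeric uid.
    # Names containing newlines would be miscounted by this simple version.
    find "$dir" -xdev -user "$uid" | wc -l | tr -d ' '
}

# Example call (placeholder path):
#   count_owned_by 99 /Volumes/Restored
```

If that count is close to the total number of files, ownership was probably ignored during the copy.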

12 Feb 2009

Initialize a remote git repository

So you create a git repository (repo) on your local machine (foo) and begin hacking away. Then, you want to push a backup copy of that repo to another box (bar). What's the best way to do that? There are many ways to do it, but some ways are dangerous and some are just cumbersome.

Ideally, git would have an option to push that instructs it to initialize the remote repository and just do the right thing. I hope this is in our future, but for now we have to do some dancing around.

The canonical way is
bar:~/repo$ git clone --mirror foo:repo
...
foo:~/repo$ git push bar:repo --mirror

but that's not always feasible due to firewalls and nasty NATs.

Unwise methods include copying the working tree with rsync or scp, doing something like the above without --bare or --mirror (which implies --bare), and other methods that would have you pushing to a non-bare repository.

The best method I've found is this:
bar:~$ git --git-dir=repo init --bare
foo:~/repo$ git remote add --mirror bar bar:repo
foo:~/repo$ git push bar
...

We set up a bare repository on bar, then set up a "remote" for bar which automatically mirrors (you could do that with the canonical method described above too), so it sends the whole shebang. After that, we push as we normally would.
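
Here is the same dance as a sketch you can run on a single machine. Two temporary directories stand in for foo and bar (my assumption, so no SSH is needed), and I've spelled the option --mirror=push because recent git versions print a deprecation warning for a bare --mirror.

```shell
#!/bin/sh
# Temp directories play the roles of the two hosts; over SSH the remote
# URL would be bar:repo instead of a local path.
work=$(mktemp -d)
mkdir -p "$work/foo" "$work/bar"
foo="$work/foo/repo"
bar="$work/bar/repo"

# On "bar": create an empty bare repository.
git init --quiet --bare "$bar"

# On "foo": create a repository with one commit.
git init --quiet "$foo"
echo hello > "$foo/README"
git -C "$foo" add README
git -C "$foo" -c user.name=Example -c user.email=example@example.com \
    commit --quiet -m "first commit"

# Register bar as a push mirror, then push as usual.
git -C "$foo" remote add --mirror=push bar "$bar"
git -C "$foo" push --quiet bar

# Both sides should now list the same branch refs.
git -C "$foo" for-each-ref --format='%(refname) %(objectname)' refs/heads
git --git-dir="$bar" for-each-ref --format='%(refname) %(objectname)' refs/heads
```

Because the remote is marked as a mirror, a plain "git push bar" sends every ref, not just the current branch.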

6 Feb 2007

On rsync and unison

I have a Yepp, which is a wonderful mp3 player because it's small (the size of a AA battery), plays OGG (out of the box!), and presents itself as USB mass storage.

As I am frequently changing the contents of my yepp, because I use it mostly for listening to podcasts, I decided the best thing to do is to keep a mirror on my hard drive and then just hook up the yepp and sync it. Because I also keep music on it for a relatively long time (compared to podcasts), and I let my podcasting software handle deletions so some longer-period podcasts hang around for a while, I figured something like rsync or unison would be the best approach.

Initially I was excited about unison because I expected I would sometimes change stuff on the yepp and want to see the changes propagated to my hard drive. Possible scenarios for this are putting random files or songs on the yepp from other computers or deleting podcasts I had already listened to on the yepp.

I quickly found that unison was a very poor choice. Unison is designed for synchronizing incremental changes to roughly the same set of files. As such, it expends a lot of effort on things like checksums and who knows what else. The first time you run unison on a set of files it will take a long time; this is a FAQ. And by long time I mean long time. I once timed how long it takes to copy over a bunch of files that filled all 512MB: about 5 minutes, IIRC. I then ran unison on the same set of files, and after about 5 minutes it had completed almost nothing and had an estimated completion time of an hour.

I am not privy to the internal workings of unison, but I think the constant addition of new files (basically the only change, since the MP3s themselves don't change) meant unison had to do whatever this really slow thing is afresh for every new file. I have found that it doesn't matter whether unison has already built its database or not: adding new files always takes 10 times longer or more than just copying them. I'm not sure what it's doing, but I do know that the whole time my yepp is flashing between "READY" and "WRITING". That tells me unison is doing a lot of itty bitty writes to the flash memory, and I don't think that can be good (yes, even when you specify rsync transfer mode).

So I gave up on the bidirectional sync and wrote a simple script to sync with rsync. The first version looked like this:

rsync --delete -rv ~/Music/Podcasts /Volumes/YEPP

This syncs my Podcasts folder to the Podcasts folder on the yepp. It takes no longer than cp to copy new files, and it deletes old podcasts. Which reminds me: unison waits until the end to delete old files. This is not good if you're tight on space. rsync deletes the old files first, then copies the new/changed ones over (rsync 3.0.0 and later delete during the transfer by default; --delete-before asks for the old order explicitly).

There's a problem with that script, though. When you run it again tomorrow, it will think it needs to sync all the files. I think that rsync will not actually copy the whole file over, but it will have to do the checksum thing, which is slow, and you'll get basically no speedup. And my yepp keeps flashing "WRITING" even when no changes need propagating, so rsync is writing to my flash more often than it needs to, too.

Well, the problem isn't rsync, it's the user. With only the -rv options (recursive and verbose), rsync will copy new files over with the current timestamp. When you run it again, rsync's quick check algorithm compares the timestamps, sees that they're different, and assumes it has to sync those files. Over a network it then does the checksum calculations and transfers only the changed bits; between two local paths, as here, it defaults to --whole-file and simply copies each file again. Good for network usage, not too bad if you have smallish files and a fast disk. Not very good at all if one of the disks is slow.

So we need to be sure and sync the modified times too. That option is -t, which is included in -a. So I tried this script:

rsync --delete -av ~/Music/Podcasts /Volumes/YEPP

That kind of sort of worked. Some files weren't resynced, but others were. I tried -t and a few other things, and scratched my head a bit. Then I read the fine manual (rsync(1)).

When transferring to FAT filesystems rsync may re-sync unmodified files. See the comments on the --modify-window option.

Here's what it has to say about --modify-window:

When comparing two timestamps rsync treats the timestamps as being equal if they are within the value of modify_window. This is normally zero, but you may find it useful to set this to a larger value in some situations. In particular, when transferring to Windows FAT filesystems which cannot represent times with a 1 second resolution --modify-window=1 is useful.
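
To make the arithmetic concrete, here is a small sketch that mimics the comparison the manual describes. It is not rsync's code; the function name times_equal and the two timestamps are made up for illustration. FAT stores modification times in 2-second steps, so the stored value can be off by one second from the original.

```shell
#!/bin/sh
# Illustrative only: times_equal T1 T2 WINDOW succeeds when the two
# epoch-second timestamps differ by no more than WINDOW seconds.
times_equal() {
    diff=$(( $1 - $2 ))
    [ "$diff" -lt 0 ] && diff=$(( -diff ))
    [ "$diff" -le "$3" ]
}

src=1170720001   # hypothetical odd-second mtime on the hard drive
fat=1170720002   # what a 2-second-resolution filesystem might store

times_equal "$src" "$fat" 0 && echo "window 0: equal" || echo "window 0: differ"
times_equal "$src" "$fat" 1 && echo "window 1: equal" || echo "window 1: differ"
```

With a window of 0 the one-second rounding error counts as a change; with a window of 1 it does not.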

So I updated my script and now it works like a charm:

rsync --delete --modify-window=1 -av ~/Music/Podcasts /Volumes/YEPP

There are a few lessons to be learned here. First, unison isn't always the best tool for the job. Don't drink the kool-aid. I think it is a good bit of software, though, and it has its uses. I'm not sure it can be adapted for efficient use with flash drives that frequently get new files, but if you know how, feel free to comment. Second, rsync is nifty, but it's more nifty when you use it properly. Get in the habit of using -a instead of -r; it almost never hurts and will often speed things up dramatically. Remember the --modify-window=1 trick for FAT filesystems. Finally, if you *do* make a change to a file and somehow keep the same timestamp and file size, rsync won't sync that file unless you specify -c. It's a pathological case, but it's good to know about.