Implementing State Machines
The lowly state machine, aka finite automata, is a beautiful and useful thing, on paper. But when it comes time to code one up, things can get ugly. You see, any first-year CS student can code up a state machine in the naïve way—with a state variable and a big nested while/case/if mess. It's hard to read, it's tedious to code, and it's very easy to get yourself all wet if you're not careful. Then there's the state transition table where you put all the state transition information in a hardcoded table, or in a source file that you write a program to generate the static table from, or if you're really fancy you read the source file at runtime. These get uglier than ugly, real fast.
I've always known deep down that there must be a better way to write state machines, at least in languages with the necessary niceties. I've implemented a number of semielegant state machines, but never really felt I hit on the One True Pattern™. Speaking of patterns, the Gang of Four have a state pattern for overengineering state machines with the magic of object-oriented programming. It's definitely more OO than the traditional approach. Probably one of the better ways to do it in C++ or Java. But let me share with you what I figured out in Ruby this evening.
Disclaimer: I'm probably not the first to come up with something like this, but I didn't see anything in my feeble attempts at googling state machine implementations. Second disclaimer: this is an agile state machine pattern; it doesn't concern itself with being bulletproof, all-encompassing, or high performance. But it's dead simple, very little code which is very easy to read, very flexible, and elegant. You and I should be able to adapt it to our needs easily many times over.
Here's a sample state machine, and the code for implementing it (in the code I add arbitrary final states and a transition action for demonstration purposes):

require 'dfa'
d = DFA.new('A', ['A','C'])
d.transition do |s,a|
ss = case [s, a]
when ['A', 0]: 'B'
when ['A', 1]: 'C'
when ['B', 0]: 'B'
when ['B', 1]: 'C'
when ['C', 0]: 'D'
when ['C', 1]: 'C'
when ['D', 0]: 'A'
when ['D', 1]: 'C'
else raise "Invalid transition (#{s}, #{a})"
end
print ss
ss
end
ary = [0,0,0,1,1,1,0,1,1,1,0,0].map do |a|
d.eat a
end
puts
p ary
The file dfa.rb which contains the DFA class is straightforward too.
Notice how you're free to implement the transition function however you like, e.g. by parsing a config file or hardcoding a state transition table, or clumping things together in an intuitive or even metaprogramming way if you have a lot of similar transitions. Notice that you don't have to do formal dances like defining the set of states explicitly, but the underlying theory is sound. Notice how you can get the input any way you please, and feed it to the state machine one piece at a time. Nothing is holding you down, you are free! And yet none of these things is in the least bit harder than if you hadn't used it. It's helpful but transparent. It is cuter than your cat and more loyal than your dog. Ok, I'll stop now.
The basic pattern is, in case you haven't been reading the code and documentation (ahem), abstract away the state-tracking stuff into the DFA object; provide the transition function which aligns nicely with theory (d: S x ∑ → S), and we'll look the other way if you sneak some transitional actions in there; and provide the input loop. All you really need in your pet language is first-class functions. Duck typing would (as always) be nice too, because then your states can be meaningful objects, eliminating an extra layer of indirection. It might not turn out as pretty in your language, but it would be as flexible and DRY.
Extending this to nondeterministic finite state automata wouldn't be hard, either, if that's what you need.
If you have a better way to do state machines (or a pretty-good way in
$your_favorite_crippled_language), do tell in the comments.
Posted in cs | 4 comments |
The Ugly Chronicles of Postcript and PDF
Levi and I tried to figure out the age-old question of how to get not-ugly PDFs out of troublesome PS files. This question used to be primarily "how do I get legible PDFs out of LaTeX?" That question has been answered and answered well, and in most setups of LaTeX it just works these days (in my experience). Here is a good page on that subject.
The question now has become, how do I get legible PDFs out of PostScript files for which I do not have the .dvi or source files? The web is fairly silent on this point.
First, let's talk about the problem. The problem is bitmapped fonts. The beautiful Computer Modern fonts are (were) not vectorized fonts, but bitmapped fonts. That means, in a nutshell, they only look right at certain resolutions. When ps2pdf or its kin converts such a PostScript file to PDF, it generates a PDF file with Type 3 fonts. I believe Type 3 fonts can be either vector or raster, but we're interested in the raster case.
Many many programs do a fantastically terrible job of rendering Type 3 fonts. This includes Adobe Acrobat (rumor has it newer versions are improved), OS X Preview, gv, and a whole slew of others. So what you get is a PDF that is fairly illegible on-screen but that prints fine. Some nice screenshots here.
This inability to cope with Type 3 fonts seems to be a matter of utter incompetence. If you know anything about image processing, you know that it would be hard not to do a better job of downsampling. There do seem to be programs capable of displaying PDFs with Type 3 fonts in at least an acceptable manner. On linux, xpdf and whatever GNOME's PDF viewer is called manage well enough. gv does not handle a PDF with Type 3 fonts well, however it handles the original PostScript file very well. On OS X, Preview fails miserably. You should be able to get xpdf from MacPorts, which is a valid option. Another option is to enhance the generated PDF to the point of moderate legibility:
ps2pdf -dPDFSETTINGS=/screen -dGrayDownsampleType=2 foo.ps
That says downsample to screen resolution not printer resolution, and use bicubic resampling instead of the fast and worthless default. Levi reports MacGhostView ($20 shareware) works well too, but I can't get it to extract out of the sitx file for some reason. I tried PStill ($70) and got an absolutely pitiful output (and the UI sucks too). Probably the best option is to use gv in X11 to view the original PostScript file, if you don't mind the interface (I actually am kind of fond of gv's interface). However my initial attempt at this with MacPorts doesn't seem to work - I get no fonts at all.
Hopefully this helps somebody. If you know anyone that is responsible for the abysmal state of things, whack him upside the head for me. If you ever generate a PostScript or PDF file from LaTeX with bitmapped fonts instead of vector fonts, whack yourself upside the head for me.
Posted in cs | no comments |
align Environment
How I managed to get along for so long without knowing this I don't know. The align environment in LaTeX is very nice for typesetting a series of equations, e.g. in a proof. Here is an example:
\begin{align*}
P(A)\text{ and }P(\overline A\cap B)&\text{ are mutually exclusive.}\
P(\overline A\cap B)+P(A\cap B) &= P(B)\
P(A\cup(\overline A\cap B)) &= P(A)+P(\overline A\cap B) &\because \text{Third axiom of probability}\
P(A)+P(\overline A\cap B) &\le 1 &\because \text{First axiom of probability}\
P(A)+P(\overline A\cap B)+P(A\cap B)-1 &\le P(A\cap B)\
P(A)+P(B)-1 &\le P(A\cap B)
\end{align*}
That will typeset a series of equations aligned by the &s.
Posted in cs | no comments |
Qt4
Back when Qt 4 was released, I watched "The Qt 4 Dance". It's silly, catchy, and, well, cute. Somehow the mp3 got wedged in my iTunes and I've heard it a few more times since then. Not the worst song in the world, but I wish I could get rid of it.
What I don't wish is to get rid of Qt 4. I didn't actually look at Qt 4 until a few weeks ago. I didn't believe that Qt 4 could solve all my problems, no matter how great it was for Jean-Claude. When I had to do any GUI programming, I was happy with Qt 3.
With the move to the MacBook, Qt and qtruby needed installing, and I decided I might as well install the Qt 4 versions. Qt 4 for OS X is much improved just in the distribution and packaging. qt4-qtruby is slightly easier to install than the Qt3 version, but it still needs some packaging work. Finally I had gotten qt4-qtruby installed and then I tried to fire up Neelix and discovered Qt4 doesn't just run Qt3 code automatically.
So I investigated a bit and decided to port to Qt 4. It was a bit of a long process, and perhaps a bit painful, but it was like the muscle soreness you get after a good workout. At the end I didn't feel like "what a complete waste of time", but excited for the new possibilities.
You see, the major differences between Qt 3 and Qt 4 are architectural. Some stupidity has been replaced with elegance. Believe me, refactoring your code to use the new Model/View framework for tables, tree views, etc. instead of the old item-based tree and table views is an exhilerating experience. The old way is klunky. The new way is elegant. If I had to pick one reason to prefer Qt 4, that would be it.
But there are other great changes too. Take a look at the What's New in Qt 4 page. I'll just rattle off a few of my favorites: a better main window class, Qt Designer improvements, integrated SVG support, PDF printing backend, unit testing framework, syntax highlighting class, splitting things up into modules (this should make gentoo users giddy), C++ namespaces, OS X improvements, a Calendar widget, a text completion framework, an undo framework, it's open source on all three platforms, and much more.
Another favorite change isn't on that page, but deserves mention. In the old days you'd design a window with Qt Designer and then you'd have to do klunky things like subclass widgets to get all the functionality. I never really liked that. Qt 4 takes a much more elegant approach, IMHO. You can still subclass if you're an OO freak, but that is deprecated. uic now hands you a POD class which will populate a given QWidget for you. So in my code I create an instance of this POD thing, create a main window widget, then pass my widget to the POD's setupUi() function. Et voilá, I have a populated main window. No subclassing. I can access all the widgets as members of the POD.
If you're in C++ and thinking of porting your code to Qt 4 it's not as much work as I led you to believe above. Qt 4 has a compatibility layer and an automagic tool to convert your Qt 3 code to Qt 4 code (and Qt Designer and/or uic will convert your .ui files). qtruby users will need to just port it, because qtruby doesn't support the compatibility layer for reasons I reluctantly agree with.
So in the end, Qt 4 *is* worth dancing about. Fancy that.
Posted in cs | no comments |
VMAC
Back in 2003-2004 I worked with a team of bright graduate and undergraduate EE
and CS students at BYU on
It was also a fun project because I got to do some serious system programming. I was in charge of the core scheduler and communications in QNX. I hand-crafted ethernet frames (TCP/IP was too much overhead), wrote my own QNX network stack, juggled various threads with different realtime priorities, and lived to see it all work, and work well.
Mike's thesis has made it through the red tape gauntlet and is available for you to read at http://contentdm.lib.byu.edu/ETD/image/etd553.pdf. You can read all about it and see the Doxygen documentation of the code I wrote. You probably won't, but what's cool (to me) is that you could.
:set exrc
Let's say you work on more than one project, each of which has different policy about tabs (e.g. one of the only two sane tab policies: all real tabs or all spaces). You need vim to behave one way in project A and another way in project B. You could use modelines, but that's a violation of DRY.
The solution is something I'd often wished for but never realized vim did. set
exrc in your ~/.vimrc tells vim to search the current directory for a
.vimrc or .exrc file. You might need a few symlinks (automatable with a
Makefile) to reach all the subdirectories of your project tree, but this will
solve all your problems.
Posted in cs |
Duck Typing Defended
Ever since I discovered ruby I've loved dynamic typing. I never paid much heed to people who got their knickers in a knot over how dynamic typing is "unsafe".
While working on Ardour this summer, I've literally been wrestling with C++, all because it is statically typed. The code I've been writing over the past month could have been done in one week with half my brain tied behind my back (which is the usual situation, by the way), if only C++ were dynamically typed.
In other ways static typing hasn't exactly got in my way, it just causes me to jump through hoops that seem silly, unnecessary, and inelegant. A simple concrete example will help. I'm doing undo serialization, and of course Ardour already serializes other things in order to save sessions. In C++ this means lots and lots of otherwise unrelated things all inherit from a common serializable base class. This is why multiple inheritance is necessary in C++. In Java you can use interfaces, which is better than multiple inheritance but not by much. In dynamic languages, you don't even have to give it a second thought. You just make sure everything that needs to be serialized responds to a serialize method, et voilá!
So as I've been wrestling I've had this nagging feeling that it's *me* that's broken, not C++. Have I become so spoiled by dynamic languages that I can't program efficiently in a static language anymore? Are the static people right and one day my whole world will come crashing in and all my ducks will keel over? Feelings of self doubt are part of being human, and these were some of mine this summer. Then I read a blog post by Bruce Eckel on Strong Typing vs. Strong Testing and my mind was put at ease. The thrust is, dynamic typing is good, and testing is necessary. I think this quote about Robert Martin (a C++ guru) sums things up nicely:
Robert came to more or less the same conclusion I have, but he did so by becoming "test infected" first, then realizing that the compiler was just one (incomplete) form of testing, then understanding that a weakly-typed language could be much more productive but create programs that are just as robust as those written in strongly-typed languages, by providing adequate testing.
I have a thought along these lines. Let's say you've got duck typing in full swing as in his examples. Now in the real world it sometimes happens that you choose a common ambiguous word for the duck method, like "run" perhaps, but more often than not you have methods that are named fairly specifically, like "speak" or "jump". The static argument is that you might accidentally call the method on something that won't do what you anticipate because it's not of the right type. But really now, how many of your objects that implement "jump" are going to be the wrong kind of objects in your application? Often the answer is none. So you jump through hoops for no reason.
Do read the article, it's fun and well-written. Even if you come out still clinging to your beliefs that static typing is Good and Wholesome, at least you'll see that us dynamic people aren't necessarily lazy, stupid heathens.
Posted in cs | 2 comments |
Extra, Extra: Nearly All Computation is Broken
This monster keeps popping up and my rant switch has been flipped.
Google Research Blog has a post about how "nearly all binary searches and mergesorts are broken" because this can overflow:
int mid = (low + high) / 2;
Boy, have I got news for you. Computers are finite. If you put big enough
numbers in, you will have overflow problems, period. Binary search and
mergesort are no more broken than a+b or x*y or any other arithmetic
operation in every program you have ever written.
All we have here is the need to recognize the limits of your computation. If you need to support big numbers, you can then deal with overflow. That might mean using 64-bit integers, or if you have just a little bit of overflow you can use some creative arithmetic and/or casting. The fun is endless, take it from a DSP programmer (overflow is an omnipresent concern in DSP programming).
So no, everybody's binary search and merge sort implementations are not broken. The algorithm is certainly not broken; theoretical discussion on algorithms is purposefully distanced from implementation concerns like how many bits you're working with, unless that happens to be the theoretical focus. So the textbooks and pseudocode are also not broken. The only thing that's broken is that some of us have forgotten that computers have limits, and we're surprised when all of a sudden the computer does what it was designed to do when the limits are reached.
Posted in cs | 2 comments |
svk
The problem: Ardour uses Subversion, but I'm addicted to distributed revision
control systems. Actually, svn and I would have got along just fine if it
weren't for svn merge. What an embarrasment for svn lovers everywhere! You
have to manually dig up which revisions to merge with. svn doesn't keep track
of what's already been merged so you also have to be careful not to merge the
same stuff twice. To add insult to injury, you have to type those full, long
svn URLs too. So what would be darcs pull becomes something like:
svn merge -r 536:543 \
svn+ssh://ardoursvn@ardour.org/ardour2/trunk \
svn+ssh://ardoursvn@ardour.org/ardour2/branches/region-plugins
Ick.
So I started investigating gateways from svn to darcs or git, either of which I would have happily used. Tailor seems like a good way to do it, but I had a hard time wrapping my head around the bidirectional setup. git-svnimport looks promising for a git solution, but before I got a chance to try it I looked at svk. svk is perfect for this situation.
If you're not aware, svk is a distributed RCS front-end to svn. I knew about it before but had always thought of it as a hack marrying the worst of two worlds. Since my last look it has improved considerably, and my approach has also changed as I'm now looking at it asa developer in an svn project, not the guy setting up a repo and wondering which system to use.
svk is mostly like svn, except you mirror the repo on your hard disk and can do disconnected development. To be honest I haven't looked at the truly distributed aspects of svk (if they exist), but rather I have focused on what I needed: disconnected operation with the ability to create local branches with easy merging, and work with the existing svn repo. svk does these things very well.
Here's my new workflow, from the point of installing svk:
# setup
svk mirror svn+ssh://ardoursvn@ardour.org/ardour2 //mirror/ardour2
svk sync //mirror/ardour2
cd ~/src
svk co //mirror/ardour2
# branch
cd ardour2
svk cp //mirror/ardour2 //local/region-plugins
svk switch //local/region-plugins
# edit stuff then check in
svk ci
# merge in trunk changes to my branch
svk pull
# merge my branch back into the trunk
svk push
Learn more about svk by reading "SVK, A Visual Guide", Jonathan Weiss' blog entry, and the Bieber Labs tutorial.
FileMerge
OS X has this nifty utility called FileMerge which is far and away the best way to merge differing files that I have ever seen. Yes, that includes vimdiff, although a sufficiently large gvimdiff is pretty good too. If you give it a try, don't give up saying "too much mousing" until you've explored the (mostly hidden) keyboard acceleration. I found sensible accelerators for every function I hoped for.
Bruno De Fraine has
gifted us with a set of bash
scripts to use FileMerge as
Subversion's --diff-cmd and --diff3-cmd. This means when you have a
conflict in Subversion you can merge with FileMerge. Along with the [helpers]
section in ~/.subversion/config you can be in FileMerge/SVN paradise.
I intend to figure out how to get this working with darcs soon. Actually I've found some pages about it that don't seem to work with my current version of darcs. When I get more HDD space I can install that silly Haskell compiler and upgrade, then I will report if it works.
Posted in cs | no comments |