26 Sep 2004 18:11

Bogofilter, procmail, and mutt, oh my!

Spam is not a problem for me. I use my email address indiscriminately when shopping online, subscribing to email lists, and posting to Usenet. I get a lot of spam, but I see almost none of it. I probably receive just about every flavor of spam there is, and my spam filtering system does a great job of filtering it correctly.

I use fetchmail to fetch my mail from various accounts to my home machine. When it arrives here on falcon, it gets processed by these two procmail rules:

:0fw
| bogofilter -l -e -p


# if bogofilter failed, return the mail to the queue; the MTA will
# retry delivery later
# 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h

:0e
{ EXITCODE=75 HOST }
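
To make the exit-code convention concrete, here is a minimal Python sketch of the same idea. It is not part of my setup; the function name delivery_exit_code is hypothetical. It only shows that a delivery agent returns 0 when filtering worked and EX_TEMPFAIL (75) when it did not, so that the MTA keeps the message queued.

```python
import os

# EX_TEMPFAIL is the sysexits.h code for "temporary failure, try again later".
# os.EX_TEMPFAIL exists on Unix; fall back to the literal 75 elsewhere.
TEMPFAIL = getattr(os, "EX_TEMPFAIL", 75)


def delivery_exit_code(filter_succeeded: bool) -> int:
    """Hypothetical helper: 0 lets delivery continue, 75 asks the MTA to retry."""
    return 0 if filter_succeeded else TEMPFAIL


if __name__ == "__main__":
    print(delivery_exit_code(True), delivery_exit_code(False))
```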

Then, it gets processed by other procmail stuff, e.g. to filter out the lists by the List-Id header. If it makes it through that gauntlet, it hits these procmail rules:

:0
* ^X-Bogosity: Yes
spam/

:0
* ^X-Bogosity: Unsure
unsure/
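
The two delivery rules above amount to a small decision on the X-Bogosity header. A rough Python sketch of that logic follows, for illustration only. The function name choose_folder and the default folder name "inbox" are assumptions, and it assumes the header value begins with "Yes" or "Unsure" as in the rules above.

```python
from email import message_from_string


def choose_folder(raw_message: str, default: str = "inbox") -> str:
    """Pick a mailbox from the X-Bogosity header that bogofilter -p added."""
    msg = message_from_string(raw_message)
    # A missing header means the message is treated as ordinary mail.
    verdict = msg.get("X-Bogosity", "").strip().lower()
    if verdict.startswith("yes"):
        return "spam/"
    if verdict.startswith("unsure"):
        return "unsure/"
    return default


if __name__ == "__main__":
    sample = "From: someone@example.com\nX-Bogosity: Unsure\n\nhello\n"
    print(choose_folder(sample))
```

Anything without a Yes or Unsure verdict falls through to the default folder, which matches what procmail does when neither rule matches.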

I read my email in mutt. The following is my bogofilter-related mutt configuration:

# <f9> spam, <f10> ham

macro index <f9> "<enter-command>unset wait_key\n\
<pipe-entry>bogofilter -l -s\n\
<enter-command>set wait_key\n\
<save-message>=spam\n" "learn as spam, and save to spambox"

macro index <f10> "<enter-command>unset wait_key\n\
<pipe-entry>bogofilter -l -n\n\
<enter-command>set wait_key\n" "learn as non-spam"

color index yellow default '~h X-Bogosity:\ Yes'
color index green default '~h X-Bogosity:\ Unsure'

Non-spam (ham) is the normal color, spam is "yellow" (really more like brown), and unsure messages are green. If I notice a misclassification, e.g. a mailing list message that's brown, or a ham in the spam mailbox, I press F9 or F10 to learn it as spam or ham respectively.

I used to have bogofilter auto-learn, i.e. train itself with its own decisions. This self-reinforcement was interesting, but one day something broke with my mutt bindings (I don't remember what) and I was too busy to fix it. Soon my spam filter was completely off-kilter and I had to disable it altogether. The next time around I took a minimalistic approach: I only train bogofilter when it makes a mistake. It took only a few days to become quite accurate, and now I rarely notice a mistake. I do still vgrep my spam occasionally (once or twice a week) to check for false positives. I occasionally find one, but usually it is related to an online order, eBay, or something else that I am expecting.

I have been meaning to use the logs to calculate how accurate it is, but I just haven't had the time. So far I have trained 111 spams and 71 hams in about three months' time. I get 50-100 spams per day (averaged over the statistically exhaustive period of the last two days) and a heckuvalot more hams, mostly in the form of email lists like plug, lad, etc. You can do the math, but it boils down to being fairly accurate. And even better, it takes almost zero of my time to maintain.
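
For anyone who wants the math spelled out, here is a back-of-the-envelope sketch in Python. It rests on assumptions that the logs would have to confirm: that "about 3 months" is 90 days, that the 50-100 spams per day held for the whole period, and that each spam I trained corresponds to one spam bogofilter did not mark as Yes. The ham side is left out because I have no count of hams per day.

```python
DAYS = 90                 # assumption: "about 3 months"
SPAM_TRAINED = 111        # spams I had to correct by hand
SPAM_PER_DAY = (50, 100)  # rough daily spam volume, low and high


def missed_spam_rate(trained: int, per_day: int, days: int) -> float:
    """Fraction of all spam received that needed manual training."""
    return trained / (per_day * days)


# More spam per day means the same 111 corrections are a smaller fraction.
low = missed_spam_rate(SPAM_TRAINED, SPAM_PER_DAY[1], DAYS)
high = missed_spam_rate(SPAM_TRAINED, SPAM_PER_DAY[0], DAYS)

print(f"missed spam: {low:.1%} to {high:.1%}")
# prints "missed spam: 1.2% to 2.5%"
```

Under those assumptions, roughly 97.5% to 98.8% of spam was caught without my help, and that figure includes the first few days when the filter was barely trained.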

21 Sep 2004 21:03

snd-cs46xx doesn't do wavetable synthesis

I got a Turtle Beach Santa Cruz sound card, which is driven by the cs46xx ALSA driver, but much to my dismay there doesn't seem to be a way to get its wavetable MIDI synthesis to work in Linux. So I guess it's back to the softsynths for me, e.g. fluidsynth. They're not so bad, but I think the card would have done a better job of it.

Still, it's a good card.

15 Sep 2004 16:21

ri from within irb

14 Sep 2004 11:55

sourdough calculator 0.1

So I wrote a little command-line sourdough calculator because I like the command line better than a web form or spreadsheet.

11 Sep 2004 13:23

ACPI and gcc

I fussed and fussed with this Compaq Evo N150 laptop and ACPI. ACPI utterly failed to report any useful information about the battery or anything else. No matter what, my laptop was running at 71°C, which was ridiculous.

I tried fixing the DSDT. I tried different versions of the kernel and different versions of the ACPI tools. Nothing helped in the least.

Finally, I noticed that things seemed to work properly in Knoppix 3.4. So I began whittling away at the differences between Knoppix and what I was doing. I found the source for the Knoppix kernel in /usr/src, and the Knoppix README said that Knoppix used vanilla stable kernels. I compiled a kernel with the Knoppix config, which also did not work. I racked my brain and finally realized that the only remaining difference between the Knoppix kernel and mine had to be the compiler. Knoppix's patch did change gcc to gcc-2.95. So I made that change in my makefiles, and lo and behold, it worked. So the ACPI code has problems with gcc3.

The moral of the story is to compile the kernel with 2.95.x where x ≥ 3, until the kernel documentation (Documentation/Changes) says it's time to use gcc3.