Jan 21 2009

Log-frequency Spectrograms

Tensai asked me how I made my graphs in my ringtones post. I’d like to blather on about the graphs and why they’re cool and how you can make them, because that’s the sort of thing I’m good at.

In the olden days, the spectrogram was invented.

spectrogram of speech

Originally greyscale, they are now usually portrayed in color, with “hot” colors meaning higher amplitude and “cool” colors meaning lower amplitude.

While there are myriad ways to view and/or generate spectrograms, the most convenient for me right now is to do it in Octave. If you’re not familiar with Octave, it’s a MATLAB clone. Octave is libre, MATLAB is insanely expensive. Obviously, I use Octave. I have used MATLAB previously (I had a beta copy, until it left beta), and for the most part they are quite comparable. Octave is a bit slower for some things (less optimized) but I’ve seen Octave outperform MATLAB on some specific tasks. The biggest impedance mismatch is in user interface stuff (MATLAB has sophisticated dialog support) and graphing. Octave has most of the essential graphing functionality (it uses gnuplot to render graphs).

So how do we generate a spectrogram in Octave? First we need to read in the WAV file, then we generate the spectrogram.


[x,sr] = wavread('logchirp.wav');
specgram(x,8192,sr);

8192 is the size of the FFT. I find 8192 is a nice compromise between time and frequency resolution (and computation time), but other powers of 2 (especially 1024) are common as well. Here’s the chirp spectrogram:

chirp spectrogram

Notice how the chirp is logarithmic. To our ears, this sounds like a steadily-rising tone. For this reason when dealing with music it’s often better to look at a log-frequency spectrogram. Otherwise all the low frequencies are scrunched together and the relationships between different pitches (and harmonics) aren’t constant. Here’s a log-frequency spectrogram of the same chirp:

log-frequency spectrogram of chirp

This was generated by the Octave code

logfsgram(x,8192,sr); title('logchirp.wav');

Notice the bleed on the low frequencies. This happens because we need a longer FFT to get more frequency resolution at low frequencies. A longer FFT costs time resolution, though, and processor time. Experiment with different FFT lengths for extra credit.
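To put rough numbers on that tradeoff, here is a small Python sketch (an illustration, not part of the Octave workflow). It assumes a 44.1 kHz sample rate, which the post does not state, and the function names are made up for this example. It compares the FFT bin spacing with the gap between two adjacent semitones.

```python
def bin_spacing_hz(sample_rate, fft_size):
    """Frequency distance between adjacent FFT bins."""
    return sample_rate / fft_size


def window_seconds(sample_rate, fft_size):
    """Duration of audio covered by one FFT frame."""
    return fft_size / sample_rate


def semitone_gap_hz(freq):
    """Distance in Hz from freq up to the next equal-tempered semitone."""
    return freq * (2 ** (1 / 12) - 1)


SAMPLE_RATE = 44100  # assumed; use the sr returned by wavread for your file

for n in (1024, 8192, 65536):
    print(n, round(bin_spacing_hz(SAMPLE_RATE, n), 2), "Hz per bin,",
          round(window_seconds(SAMPLE_RATE, n) * 1000, 1), "ms per frame")
```

Under these assumptions an 8192-point FFT has bins about 5.4 Hz apart, while two neighboring semitones near 55 Hz are only about 3.3 Hz apart, so low notes smear into each other. Near 880 Hz the semitone gap is roughly 52 Hz and the same FFT resolves it easily.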

To use these functions you’ll need to put the specgram and logfsgram “m-files” in your octave search path (current directory or whatever else you specify in your ~/.octaverc).


Jan 21 2009

Y Tick Labels in Octave

I struggled with this on and off for a couple of months. Finally I stumbled on the magic needed to get it working.

Dan Ellis has some MATLAB code for log-frequency spectrograms, but in Octave the graph lacks the custom y tick marks. Here’s the original code:

    yt = get(gca,'YTick');
    for i = 1:length(yt)
        ytl{i} = sprintf('%.0f',logffrqs(yt(i)));
    end
    set(gca,'YTickLabel',ytl);

This code gets the existing tick marks and labels them with the log frequency (instead of the frequency bin).
The problem is that Octave doesn’t populate YTick unless it was manually set already.

Here’s the working code:

    % octave doesn't populate YTick with its tick marks, so we have to set
    % our own. Besides, this way we get nice octave intervals.
    set(gca,'YTick',1:12:M);
    yt = get(gca,'YTick');
    for i = 1:length(yt)
        ytl{i} = sprintf('%.0f',logffrqs(yt(i)));
    end
    set(gca,'YTickLabel',ytl);
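The reason 1:12:M lands on octaves is that, assuming the log-frequency axis has 12 bins per octave (one per semitone, as the step of 12 suggests), every 12th bin doubles in frequency. Here is a minimal Python sketch of that arithmetic. The lowest-bin frequency and the bin count are hypothetical, and this mirrors the loop above rather than reproducing Dan Ellis's code.

```python
def log_bin_freqs(f_min, num_bins, bins_per_octave=12):
    """Center frequency of each log-spaced bin, starting at f_min."""
    return [f_min * 2 ** (k / bins_per_octave) for k in range(num_bins)]


def octave_tick_labels(freqs, bins_per_octave=12):
    """Pick every bins_per_octave-th bin and format its frequency as a label.

    Returns (positions, labels); positions are 1-based like Octave indices.
    """
    positions = list(range(1, len(freqs) + 1, bins_per_octave))
    labels = ["%.0f" % freqs[p - 1] for p in positions]
    return positions, labels


# Hypothetical axis: 61 bins starting at 55 Hz (five octaves).
positions, labels = octave_tick_labels(log_bin_freqs(55.0, 61))
print(positions)
print(labels)
```

With these made-up values the ticks sit at bins 1, 13, 25 and so on, and each label is double the previous one.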

Jan 21 2009

Ringtones

This new phone is unique among all the cell phones I’ve owned or played with in that it isn’t actually difficult to get mp3s onto it and use them for ringtones. It’s a Motorola V195. It supports Bluetooth (which is how I get the mp3s on it), but it also has a regular old USB port through which it charges, and I take it that allows transfer of files (if you don’t have Bluetooth), though I haven’t confirmed that.

There must be an unwritten rule among cell phone companies that included ringtones must sound like they were written by a drunk adolescent pelican. So we want to put an mp3 ringtone on, but let’s make it a good one. There are good mp3 ringtones and there are really bad mp3 ringtones, and there’s very little middle ground.

Before I give you my criteria for a good ringtone, let’s cover some background theory. Cell phone speakers are little and cheap. They’ve improved some over the years, but they’re still a far cry from a good pair of desktop speakers let alone audiophile gear. So your mp3 will sound different on the phone than it does on your computer or even your headphones. How will it sound? That depends on the frequency response of the speaker in the cell phone. Since that’s not likely included in your cell phone manual, let’s see how we can measure it.

I whipped up this chirp in Audacity. It goes from 20 Hz to 20 kHz (the audible range) with a second of white noise on either side so you know when the thing starts and ends (since 20 Hz and 20 kHz are both out of the hearing range of most people). Its log-frequency spectrogram looks like this: (ignore the bleed at the low frequencies, it’s an artifact of the log-frequency analysis)

logchirp.wav
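For reference, a logarithmic sweep spends the same amount of time in every octave, which is why it shows up as a straight line on a log-frequency axis. A Python sketch of its instantaneous frequency follows. The 10-second duration is an assumption for illustration, since the post does not say how long the Audacity chirp is, and the function names are hypothetical.

```python
import math


def log_chirp_freq(t, duration, f_start=20.0, f_end=20000.0):
    """Instantaneous frequency (Hz) of a logarithmic sweep at time t seconds."""
    return f_start * (f_end / f_start) ** (t / duration)


def seconds_per_octave(duration, f_start=20.0, f_end=20000.0):
    """Time the sweep spends in each octave; constant for a log sweep."""
    return duration / math.log2(f_end / f_start)


DURATION = 10.0  # assumed sweep length in seconds
for t in (0.0, 2.5, 5.0, 7.5, 10.0):
    print(t, round(log_chirp_freq(t, DURATION), 1))
```

The range from 20 Hz to 20 kHz is just under ten octaves, so a 10-second sweep would spend about one second in each.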

The astute among you will have realized that encoding this as an mp3 may change the spectrum. (I use lame’s default settings throughout this post.)

logchirp.mp3

It looks like the only relevant change is that encoding to mp3 cuts off the frequencies above about 17 kHz, which most people don’t hear well or at all. So the experiment is still a go.

Now I put the chirp on my phone and played it back, recording it into Audacity on my laptop. The astute among you will again realize that the frequency response of my microphone makes a difference. Luckily my microphone has a decent frequency response from 30 Hz to 20 kHz. Even if it didn’t though, as long as the mic has a better frequency range than your phone you can still glean some important information. Here’s the result:

Motorola V195

Notice that there’s basically nothing below about 250 Hz. To give you some perspective, 262 Hz is middle C (each tick mark on the Y axis is an octave, and the scale is logarithmic). So anything below middle C is severely attenuated, and the octave between middle C and high C is moderately attenuated.

For comparison, here is the chirp recorded on my desktop speakers.

Speakers

See how it goes lower and also how it is flatter. There’s not an appreciable peak at 8 kHz like there is with the cell phone. That means your high frequencies are going to stick out on a cell phone too.

Ok, so what does this mean for picking ringtones? It means steer clear of things that rely on low frequencies to sound good. Less obviously, it also means steering clear of wide-band sounds that rely on lots of frequencies, including low ones, to sound good. That means drums! You also want something that is intelligible even across the room or in your pocket, and that means simplicity. Look for fun riffs on one or two instruments at the beginnings of songs, and steer clear of human voices, bass, and thick mixes. If you want an approximate preview of what it would sound like on your phone, run it through a high-pass filter in Audacity and set the cutoff at about 400 Hz. (Of course your phone may have a slightly different frequency response than mine, but the principles apply to almost all phones.)
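If you would rather script the preview than click through Audacity, a first-order high-pass filter gives a rough idea. This Python sketch is the textbook one-pole RC high-pass, not Audacity's filter (which can be steeper), and the 400 Hz cutoff and 44.1 kHz sample rate are simply the numbers assumed here.

```python
import math


def high_pass(samples, cutoff_hz=400.0, sample_rate=44100):
    """One-pole RC high-pass filter over a list of float samples."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def sine(freq, sample_rate=44100, seconds=0.5):
    """Generate a unit-amplitude sine wave as a list of floats."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


# Compare how much of a low tone and a high tone survives the filter.
# The first half of each result is skipped to ignore the start-up transient.
for freq in (100, 4000):
    filtered = high_pass(sine(freq))
    print(freq, "Hz ->", round(peak(filtered[len(filtered) // 2:]), 2))
```

The 100 Hz tone comes out at roughly a quarter of its original amplitude while the 4 kHz tone passes almost untouched, which is the general shape of what the phone speaker does to a mix.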

I won’t go into a lot of detail about how to make a ringtone in Audacity, but I’ll give you some general pointers. First, the length should be about 20–30 seconds (after trimming silence). If you want, measure how long your phone rings before it goes to voicemail. Second, you should normalize the sound to peak at -3 dB (this might actually attenuate some overloud music). If the clip has a lot of dynamic range (louds and softs) you may want to run it through a compressor before normalizing, otherwise the soft parts will be lost on the bad speakers and in the noisy world. Penultimately, mix it down to 1 channel: your phone doesn’t ring in stereo, and while stereo files will probably work just fine, they’re just a waste of space. Finally, pick a medium-quality encoding; too high a bitrate will probably be rejected by the phone, and you don’t need it.
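As a concrete picture of what normalizing to a -3 dB peak means, here is a hedged Python sketch that works on a list of float samples in the range -1.0 to 1.0. Audacity does this for you; the function name and the sample values are hypothetical.

```python
def normalize_peak(samples, target_db=-3.0):
    """Scale samples so the largest absolute value sits at target_db dBFS."""
    current_peak = max(abs(s) for s in samples)
    if current_peak == 0.0:
        return list(samples)  # silence: nothing to scale
    target_peak = 10.0 ** (target_db / 20.0)  # -3 dB is about 0.708
    gain = target_peak / current_peak
    return [s * gain for s in samples]


quiet = [0.1, -0.25, 0.2]
loud = [0.9, -1.0, 0.5]
print(max(abs(s) for s in normalize_peak(quiet)))  # boosted up to about 0.708
print(max(abs(s) for s in normalize_peak(loud)))   # pulled down to about 0.708
```

The second case is the attenuation mentioned above: a clip that already peaks at full scale gets turned down.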


Nov 17 2008

Clojure DSP Longing

I often find myself longing to be able to use Clojure, a very enticing lispy language that runs on the JVM.

I could possibly be using it right now in my dissertation research. It has the promise of dynamic languages, functional programming, almost-as-cool-as-Erlang concurrency, JVM performance, and Java library soup. It could be so awesome. A few months ago I started briefly down this road, unaware that…

Clojure sucks. Not generally, but it sucks for DSP. More specifically, Java, and therefore Clojure, has no real support for complex numbers. In order to do serious DSP, you need native syntactic, semantic, and performance support for complex numbers. Java has none of the above. Older versions of C didn’t have syntactic or semantic support, but the performance of using arrays was plenty fast. Not so in Java, at least not to the extent necessary to override the lack of syntactic and semantic support.

So someday, when I’m writing general purpose code again and not high performance DSP code, I will have an opportunity to use Clojure, and I think that will make me very happy. By then the book will be out of beta. The community will be in full swing. There will be awesome libraries. Children will play in pristine parks with formerly-ravenous ravens.

In the meantime, if anyone sees the scene change, do let me know.


Sep 19 2008

IMMS

So Apple added this Genius thing to iTunes recently. I’m not the type to get excited about new iPod styles, so it looks like the most interesting thing they could come up with this year. I gave it a try. I am not impressed.

I think it’s because I’ve been spoiled. 5 years ago I was using what I still consider to be the peak of intelligent listening software, IMMS. Genius isn’t half as cool as IMMS was then, and while IMMS hasn’t made any quantum leaps in coolness, quite a few rough edges have been rounded off in the meantime.

I’ve been living in a sort of IMMS drought the past couple of years, since I switched to using a laptop primarily. Namely, an Apple laptop. This Genius release spurred me on to rectify that situation. If the best Apple could do was generate a 25-song playlist based on statistics gathered from other people, the hopes of someone else hacking up an iTunes plugin to do IMMS or something like it had dwindled to obscurity.

The bane of IMMS is, ironically, its most compelling feature. IMMS is cool because you don’t have to do anything. It pays attention to your listening habits, and analyzes the audio, and makes intelligent decisions for you when you turn on random. 4 years ago I would show up to work and be in a Depeche Mode mood, so I’d manually queue up a Depeche Mode song or two and the whole day I’d be treated to complementary music. If the occasional happy song slipped through, I just skipped it and IMMS took the hint. Don’t underestimate the amazing wow factor of a computer apparently reading your mind.

But this focus on simple non-obtrusive UI has been its biggest technical struggle. Media players are now a dime a dozen, and few of them have the plugin and UI sophistication to support IMMS’ modus operandi. IMMS was developed originally as a plugin for XMMS and even then ugly workaround hacks were required. Then someone wrote a queue control patch for XMMS, and if you patched your XMMS you were in heaven. Oh, did I mention that still almost no other media players even have queue functionality, let alone let the plugins control the queue? Then when you consider the set of media players usable on OS X the situation gets laughable.

Somewhere in the middle MPD came along. It fit my situation well because the speakers over on the desktop were a lot nicer than the ones in my laptop. But queues it has not and nobody seems to care. Von bravely came up with an IMMS hack for MPD, but it was too hacky for me—too much like the old XMMS days before the queue control patch (incidentally, queue control is part of XMMS proper now as of version 1.2.11).

So I suffered along with manual or truly random music listening. Until now.

Recently I looked into this again for the desktop, and I was delighted to discover that one of the many XMMS descendants has finally solved the XMMS bitrot without throwing the baby out with the bathwater. Audacious is as cool as XMMS ever was and as modern as your favorite modern player (unless you measure modern by klunky iTunes-like screen-wasting music browsers). What’s more, the imms plugin for it is right there in the Ubuntu repository. Just apt-get install imms-audacious and enable the plugin and you’re off and running. So I set it up and… didn’t use it. As in, we rarely listen to music on the desktop because nobody really sits there for very long. So finally earlier this week I hammered out a simple remote control using Audacious’ dbus interface. That’s another post, once I knock off a few other TODO points.

Feeling on a roll and feeling left out when at school, I decided to get an IMMS solution on my laptop, running OS X Leopard (10.5.4). I’ll spare you the agonizing play-by-play and give you the shortest path to success: install Audacious and then IMMS. Actually the really shortest path is to install XMMS and then IMMS, because XMMS is in MacPorts. But it’s the old version of XMMS without queue control, and doesn’t have CoreAudio support (you have to use the JACK output plugin) so I don’t recommend that.

To install Audacious, install its dependencies (mostly using MacPorts), then build it and its plugins. Installing its dependencies is the hardest part because it’s difficult to locate libmcs and libmowgli (they’re not where the README says they are, and Google is less than helpful). I just ended up stealing the *.orig.tar.gz files from the Ubuntu packages (apt-get source -d libmcs1 libmowgli). There is one patch you need for the plugins.

 src/CoreAudio/audio.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Index: audacious-plugins-1.5.1/src/CoreAudio/audio.c
===================================================================
--- audacious-plugins-1.5.1.orig/src/CoreAudio/audio.c  2008-09-19 12:08:01.000000000 -0600
+++ audacious-plugins-1.5.1/src/CoreAudio/audio.c   2008-09-19 12:10:28.000000000 -0600
@@ -326,7 +326,12 @@ gint osx_get_output_time(void)
 {
        gint retval;

-        retval = output_time_offset + ((output_total * sample_size * 1000) / output.bps);
+        if (output.bps == 0)
+        {
+            printf("Avoiding divide by zero in osx_get_output_time()\n");
+            retval = 0;
+        } else
+            retval = output_time_offset + ((output_total * sample_size * 1000) / output.bps);
        retval = (int)((float)retval / user_pitch);

        //printf("osx_get_output_time(): time is %d\n",retval);

Next you need to install IMMS. This is a bit more involved, but should be straightforward with these patches. I’ll put them here and talk about each in turn.

First, a missing include for mkdir()

 immsd/immsd.cc |    1 +
 1 file changed, 1 insertion(+)

Index: imms-3.1.0-rc4/immsd/immsd.cc
===================================================================
--- imms-3.1.0-rc4.orig/immsd/immsd.cc  2008-03-02 18:54:06.000000000 -0700
+++ imms-3.1.0-rc4/immsd/immsd.cc   2008-09-19 08:05:58.000000000 -0600
@@ -2,6 +2,7 @@
 #include <errno.h>
 #include <signal.h>
 #include <unistd.h>
+#include <sys/stat.h>

 #include <iostream>
 #include <sstream>

Then, a workaround due to OS X not having an initstate_r() (which I
incidentally couldn’t find in the current Linux manpages on Ubuntu or Debian
either). This patch may not apply cleanly by itself; you may need to apply your
cognitive reasoning.

configure.ac         |    3 +++
immsconf.h           |    3 +++
immsconf.h.in        |    3 +++
immscore/immsutil.cc |    9 +++++++++
4 files changed, 18 insertions(+)

Index: imms-3.1.0-rc4/immscore/immsutil.cc
===================================================================
--- imms-3.1.0-rc4.orig/immscore/immsutil.cc    2008-03-02 18:54:06.000000000 -0700
+++ imms-3.1.0-rc4/immscore/immsutil.cc 2008-09-19 08:13:29.000000000 -0600
@@ -27,6 +27,7 @@ int imms_random(int max)
{
    int rand_num;
    static bool initialized = false;
+#ifndef INITSTATE_BUG
    static struct random_data rand_data;
    static char rand_state[256];
    if (!initialized)
@@ -36,6 +37,14 @@ int imms_random(int max)
        initialized = true;
    }
    random_r(&rand_data, &rand_num);
+#else
+    if (!initialized)
+    {
+        srandom(time(0));
+        initialized = true;
+    }
+    rand_num = random();
+#endif
    double cof = rand_num / (RAND_MAX + 1.0);
    return (int)(max * cof);
}
Index: imms-3.1.0-rc4/configure.ac
===================================================================
--- imms-3.1.0-rc4.orig/configure.ac    2008-03-02 18:54:06.000000000 -0700
+++ imms-3.1.0-rc4/configure.ac 2008-09-19 08:17:58.000000000 -0600
@@ -68,6 +68,9 @@ else
    AC_MSG_RESULT([yes])
fi

+AC_DEFINE(INITSTATE_BUG,, [initstate_r is buggy])
+
+
AC_CHECK_LIB(z, compress,, [with_zlib=no])
AC_CHECK_HEADERS(zlib.h,, [with_zlib=no])
if test "$with_zlib" = "no"; then
Index: imms-3.1.0-rc4/immsconf.h
===================================================================
--- imms-3.1.0-rc4.orig/immsconf.h  2008-09-19 08:05:31.000000000 -0600
+++ imms-3.1.0-rc4/immsconf.h   2008-09-19 08:18:23.000000000 -0600
@@ -121,6 +121,9 @@
/* Define to 1 if you have the <zlib.h> header file. */
#define HAVE_ZLIB_H 1

+/* initstate_r is buggy */
+#define INITSTATE_BUG /**/
+
/* Define to the address where bug reports for this package should be sent. */
#define PACKAGE_BUGREPORT "mag@luminal.org"

Index: imms-3.1.0-rc4/immsconf.h.in
===================================================================
--- imms-3.1.0-rc4.orig/immsconf.h.in   2008-09-19 07:48:52.000000000 -0600
+++ imms-3.1.0-rc4/immsconf.h.in    2008-09-19 08:16:32.000000000 -0600
@@ -120,6 +120,9 @@
/* Define to 1 if you have the <zlib.h> header file. */
#undef HAVE_ZLIB_H

+/* initstate_r is buggy */
+#undef INITSTATE_BUG
+
/* Define to the address where bug reports for this package should be sent. */
#undef PACKAGE_BUGREPORT

This patch is just so libpcre can be found

build/Makefile |    1 +
1 file changed, 1 insertion(+)

Index: imms-3.1.0-rc4/build/Makefile
===================================================================
--- imms-3.1.0-rc4.orig/build/Makefile  2008-03-02 18:54:06.000000000 -0700
+++ imms-3.1.0-rc4/build/Makefile   2008-09-19 12:25:05.000000000 -0600
@@ -18,6 +18,7 @@ libimmscore.a: $(call objects,../immscor
libmodel.a: $(call objects,../model) svm-similarity-data.o
        $(AR) $(ARFLAGS) $@ $(filter %.o,$^)

+immstool-LIBS=`pcre-config --libs`
immstool: immstool.o libmodel.a libimmscore.a
training_data: training_data.o libmodel.a libimmscore.a
train_model: train_model.o libmodel.a libimmscore.a

Linking shared libraries on OS X is so different from linking them on Linux that a patch something like this is almost always needed.

rules.mk |    5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

Index: imms-3.1.0-rc4/rules.mk
===================================================================
--- imms-3.1.0-rc4.orig/rules.mk    2008-09-19 09:04:13.000000000 -0600
+++ imms-3.1.0-rc4/rules.mk 2008-09-19 12:25:50.000000000 -0600
@@ -14,9 +14,8 @@ link = $(CXX) $(filter-out %.a,$1) $(fil
%.o: %.c; $(call compile, $(CC), $<, $@, $($*-CFLAGS) $(CFLAGS) $($*-CPPFLAGS) $(CPPFLAGS))
%: %.o; $(call link, $^ $($*-OBJ) $(LIBS), $@, $($*-LIBS) $(LDFLAGS))
%.so:
-   $(CXX) $^ $($*-OBJ) $($*-LIBS) $(LIBS) \
-       $(LDFLAGS) \
-            -shared -Wl,-z,defs,-soname,$@ -o $@
+   gcc -flat_namespace -undefined suppress -o $@ -bundle $^ $($*-OBJ) $($*-LIBS) $(LIBS) \
+       $(LDFLAGS) -o $@

%-data.o: %
        $(OBJCOPY) -I binary -O $(OBJCOPYTARGET) -B $(OBJCOPYARCH) --rename-section .data=.rodata,alloc,load,readonly,data,contents $< $@

This final patch fixes IMMS to use the proper interface for audacious (seems like this would have to be done anywhere?)

clients/audacious/audaciousinterface.c |  177 +++++++++++++++++++++++++++++++++
clients/audacious/rules.mk             |    2
2 files changed, 178 insertions(+), 1 deletion(-)

Index: imms-3.1.0-rc4/clients/audacious/audaciousinterface.c
===================================================================
--- /dev/null   1970-01-01 00:00:00.000000000 +0000
+++ imms-3.1.0-rc4/clients/audacious/audaciousinterface.c   2008-09-19 15:30:21.000000000 -0600
@@ -0,0 +1,177 @@
+#include <gtk/gtk.h>
+
+#ifdef BMP
+#include <bmp/configdb.h>
+#include <bmp/util.h>
+#include <bmp/plugin.h>
+#elif AUDACIOUS
+#include <audacious/configdb.h>
+#include <audacious/util.h>
+#include <audacious/plugin.h>
+#endif
+#include "immsconf.h"
+#include "cplugin.h"
+
+
+int use_xidle = 1;
+int poll_tag = 0;
+
+GtkWidget *configure_win = NULL, *about_win = NULL, *xidle_button = NULL;
+
+gint poll_func(gpointer unused)
+{
+    imms_poll();
+    return TRUE;
+}
+
+void read_config(void)
+{
+    ConfigDb *cfgfile;
+
+    if ((cfgfile = cfg_db_open()) != NULL)
+    {
+        cfg_db_get_int(cfgfile, "imms", "xidle", &use_xidle);
+        cfg_db_close(cfgfile);
+    }
+}
+
+void init(void)
+{
+    imms_init();
+    read_config();
+    imms_setup(use_xidle);
+    poll_tag = gtk_timeout_add(200, poll_func, NULL);
+}
+
+void cleanup(void)
+{
+    imms_cleanup();
+
+    if (poll_tag)
+        gtk_timeout_remove(poll_tag);
+
+    poll_tag = 0;
+}
+
+void configure_ok_cb(gpointer data)
+{
+    ConfigDb *cfgfile = cfg_db_open();
+
+    use_xidle = !!GTK_TOGGLE_BUTTON(xidle_button)->active;
+
+    cfg_db_set_int(cfgfile, "imms", "xidle", use_xidle);
+    cfg_db_close(cfgfile);
+
+    imms_setup(use_xidle);
+    gtk_widget_destroy(configure_win);
+}
+
+#define ADD_CONFIG_CHECKBOX(pref, title, label, descr)                          \
+    pref##_frame = gtk_frame_new(title);                                        \
+    gtk_box_pack_start(GTK_BOX(configure_vbox), pref##_frame, FALSE, FALSE, 0); \
+    pref##_vbox = gtk_vbox_new(FALSE, 10);                                      \
+    gtk_container_set_border_width(GTK_CONTAINER(pref##_vbox), 5);              \
+    gtk_container_add(GTK_CONTAINER(pref##_frame), pref##_vbox);                \
+                                                                                \
+    pref##_desc = gtk_label_new(label);                                         \
+                                                                                \
+    gtk_label_set_line_wrap(GTK_LABEL(pref##_desc), TRUE);                      \
+    gtk_label_set_justify(GTK_LABEL(pref##_desc), GTK_JUSTIFY_LEFT);            \
+    gtk_misc_set_alignment(GTK_MISC(pref##_desc), 0, 0.5);                      \
+    gtk_box_pack_start(GTK_BOX(pref##_vbox), pref##_desc, FALSE, FALSE, 0);     \
+    gtk_widget_show(pref##_desc);                                               \
+                                                                                \
+    pref##_hbox = gtk_hbox_new(FALSE, 5);                                       \
+    gtk_box_pack_start(GTK_BOX(pref##_vbox), pref##_hbox, FALSE, FALSE, 0);     \
+                                                                                \
+    pref##_button = gtk_check_button_new_with_label(descr);                     \
+    gtk_toggle_button_set_active(GTK_TOGGLE_BUTTON(pref##_button), use_##pref); \
+    gtk_box_pack_start(GTK_BOX(pref##_hbox), pref##_button, FALSE, FALSE, 0);   \
+                                                                                \
+    gtk_widget_show(pref##_frame);                                              \
+    gtk_widget_show(pref##_vbox);                                               \
+    gtk_widget_show(pref##_button);                                             \
+    gtk_widget_show(pref##_hbox);
+
+void configure(void)
+{
+    GtkWidget *configure_vbox;
+    GtkWidget *xidle_hbox, *xidle_vbox, *xidle_frame, *xidle_desc;
+    GtkWidget *configure_bbox, *configure_ok, *configure_cancel;
+
+    if (configure_win)
+        return;
+
+    read_config();
+
+    configure_win = gtk_window_new(GTK_WINDOW_TOPLEVEL);
+    gtk_signal_connect(GTK_OBJECT(configure_win), "destroy",
+            GTK_SIGNAL_FUNC(gtk_widget_destroyed), &configure_win);
+    gtk_window_set_title(GTK_WINDOW(configure_win), "IMMS Configuration");
+
+    gtk_container_set_border_width(GTK_CONTAINER(configure_win), 10);
+
+    configure_vbox = gtk_vbox_new(FALSE, 10);
+    gtk_container_add(GTK_CONTAINER(configure_win), configure_vbox);
+
+    ADD_CONFIG_CHECKBOX(xidle, "Idleness",
+#ifdef BMP
+            "Disable this option if you use BEEP on a dedicated machine",
+#elif AUDACIOUS
+            "Disable this option if you use Audacious on a dedicated machine",
+#endif
+            "Use X idleness statistics");
+
+    /* Buttons */
+    configure_bbox = gtk_hbutton_box_new();
+    gtk_button_box_set_layout(GTK_BUTTON_BOX(configure_bbox), GTK_BUTTONBOX_END);
+    gtk_button_box_set_spacing(GTK_BUTTON_BOX(configure_bbox), 5);
+    gtk_box_pack_start(GTK_BOX(configure_vbox), configure_bbox, FALSE, FALSE, 0);
+
+    configure_ok = gtk_button_new_with_label("Ok");
+    gtk_signal_connect(GTK_OBJECT(configure_ok), "clicked",
+            GTK_SIGNAL_FUNC(configure_ok_cb), NULL);
+    GTK_WIDGET_SET_FLAGS(configure_ok, GTK_CAN_DEFAULT);
+    gtk_box_pack_start(GTK_BOX(configure_bbox), configure_ok, TRUE, TRUE, 0);
+    gtk_widget_show(configure_ok);
+    gtk_widget_grab_default(configure_ok);
+
+    configure_cancel = gtk_button_new_with_label("Cancel");
+    gtk_signal_connect_object(GTK_OBJECT(configure_cancel), "clicked",
+            GTK_SIGNAL_FUNC(gtk_widget_destroy), GTK_OBJECT(configure_win));
+    GTK_WIDGET_SET_FLAGS(configure_cancel, GTK_CAN_DEFAULT);
+    gtk_box_pack_start(GTK_BOX(configure_bbox), configure_cancel, TRUE, TRUE, 0);
+    gtk_widget_show(configure_cancel);
+    gtk_widget_show(configure_bbox);
+    gtk_widget_show(configure_vbox);
+    gtk_widget_show(configure_win);
+}
+
+void about(void)
+{
+    if (about_win)
+        return;
+
+    about_win =
+#ifdef AUDACIOUS
+        audacious_info_dialog(
+#else
+        xmms_show_message(
+#endif
+            "About IMMS",
+            PACKAGE_STRING "\n\n"
+            "Intelligent Multimedia Management System" "\n\n"
+            "IMMS is an intelligent playlist plug-in for BPM" "\n"
+            "that tracks your listening patterns" "\n"
+            "and dynamically adapts to your taste." "\n\n"
+            "It is incredibly unobtrusive and easy to use" "\n"
+            "as it requires no direct user interaction." "\n\n"
+            "For more information please visit" "\n"
+            "http://www.luminal.org/wiki/index.php/IMMS" "\n\n"
+            "Written by" "\n"
+            "Michael \"mag\" Grigoriev <mag@luminal.org>",
+            "Dismiss", FALSE, NULL, NULL);
+
+    gtk_signal_connect(GTK_OBJECT(about_win), "destroy",
+            GTK_SIGNAL_FUNC(gtk_widget_destroyed), &about_win);
+}
Index: imms-3.1.0-rc4/clients/audacious/rules.mk
===================================================================
--- imms-3.1.0-rc4.orig/clients/audacious/rules.mk  2008-03-02 18:54:06.000000000 -0700
+++ imms-3.1.0-rc4/clients/audacious/rules.mk   2008-09-19 15:28:17.000000000 -0600
@@ -7,7 +7,7 @@ libaudaciousimms-LIBS = $(AUDACIOUSLDFLA
audaciousinterface-CPPFLAGS=$(AUDACIOUSCPPFLAGS)
audplugin-CPPFLAGS=$(AUDACIOUSCPPFLAGS)

-audaciousinterface.o: bmpinterface.c
+audaciousinterface.o: audaciousinterface.c
        $(call compile, $(CC), $<, $@, $($*-CFLAGS) $(CFLAGS) $($*-CPPFLAGS) $(CPPFLAGS))

AUDACIOUSDESTDIR=""

Phew. And that’s not all. When you build IMMS you need to have OBJDUMP=gobjdump if you’re using the default binutils variant from MacPorts, and this patch:

 rules.mk   |    2 +-
 vars.mk    |    6 +++---
 vars.mk.in |    1 +
 3 files changed, 5 insertions(+), 4 deletions(-)

Index: imms-3.1.0-rc4/rules.mk
===================================================================
--- imms-3.1.0-rc4.orig/rules.mk        2008-09-19 08:49:43.000000000 -0600
+++ imms-3.1.0-rc4/rules.mk     2008-09-19 16:17:33.000000000 -0600
@@ -19,7 +19,7 @@ link = $(CXX) $(filter-out %.a,$1) $(fil
             -shared -Wl,-z,defs,-soname,$@ -o $@

 %-data.o: %
-       objcopy -I binary -O $(OBJCOPYTARGET) -B $(OBJCOPYARCH) --rename-section .data=.rodata,alloc,load,readonly,data,contents $< $@
+       $(OBJCOPY) -I binary -O $(OBJCOPYTARGET) -B $(OBJCOPYARCH) --rename-section .data=.rodata,alloc,load,readonly,data,contents $< $@

 # macros that expand to the object files in the given directories
 objects=$(sort $(notdir $(foreach type,c cc,$(call objects_$(type),$1))))
Index: imms-3.1.0-rc4/vars.mk
===================================================================
--- imms-3.1.0-rc4.orig/vars.mk 2008-09-19 09:03:05.000000000 -0600
+++ imms-3.1.0-rc4/vars.mk      2008-09-19 15:07:44.000000000 -0600
@@ -5,8 +5,8 @@ INSTALL = /opt/local/bin/ginstall -c
 prefix = /usr
 PREFIX = $(prefix)
 OBJCOPY = gobjcopy
-OBJCOPYTARGET =
-OBJCOPYARCH =
+OBJCOPYTARGET = mach-o-le
+OBJCOPYARCH = i386
 exec_prefix = ${prefix}
 bindir = ${exec_prefix}/bin
 datadir = ${prefix}/share
@@ -15,7 +15,7 @@ VPATH = ../immscore:../analyzer:../model
 ARFLAGS = rs

 SHELL = bash
-PLUGINS = libxmmsimms.so
+PLUGINS = libxmmsimms.so libaudaciousimms.so
 OPTIONAL = immsremote analyzer

 GLIB2CPPFLAGS=`pkg-config glib-2.0 --cflags`
Index: imms-3.1.0-rc4/vars.mk.in
===================================================================
--- imms-3.1.0-rc4.orig/vars.mk.in      2008-03-02 18:54:06.000000000 -0700
+++ imms-3.1.0-rc4/vars.mk.in   2008-09-19 16:17:24.000000000 -0600
@@ -4,6 +4,7 @@ VERSION = @PACKAGE_VERSION@
 INSTALL = @INSTALL@
 prefix = @prefix@
 PREFIX = $(prefix)
+OBJCOPY = @OBJCOPY@
 OBJCOPYTARGET = @OBJCOPYTARGET@
 OBJCOPYARCH = @OBJCOPYARCH@
 exec_prefix = @exec_prefix@

Finally, make install doesn’t finish the job.

cp build/libaudaciousimms.so /usr/local/lib/General/imms.impl

Well I think that’s all the information you need, though it may not go smoothly. Hopefully we can get this all worked into IMMS proper and the 3.1.0 release will just work. If you use Linux give Audacious+IMMS a try; it’s easy and painless. If you think Audacious is for sissies, learn about the queue and the jump feature and try out IMMS for a week or two before you pass final judgement.

Oh, two final notes: Installing Torch can be a real pain and Audacious keyboard shortcuts don’t work well with the gtk2 +quartz variant in MacPorts, so you want to stick with X11 gtk2. Oh, and though Audacious has a Last.fm plugin I haven’t yet been able to figure out how to get it to stay enabled.


Jul 26 2008

On Rolling Cables

Have you ever rolled a cable? Maybe I should ask, how many cables do you roll a day? Don’t forget the headphones, laptop power cord, extension cords, hoses, etc.

And what happens every single time you go to unroll one of these cables? They get all tangled up. Guaranteed.

Well it doesn’t have to be like that. There is a simple way to roll cables that is easy, avoids tangles, and lies well. Well, actually, two ways—one for big cables and one for small ones like headphones. Oh, and did I mention it’s good for the cables, and the way you have been doing it is bad for the cables?

For the big cables, watch this video. Go on, I’ll wait.

For small cables, like headphones, raise your index and pinky fingers and wrap the cable around them in a figure eight. The figure eight nature does the same thing as the over/under technique for larger cables, preventing tangles, and it also makes a nice compact easy-to-stash roll. See this Lifehacker article.


Jul 9 2008

k20

I finished the promised K-20 meter. I imaginatively called it k20, and you can find it at http://hans.fugal.net/src/k20. Here’s a screenshot:

k20 screenshot

From left to right, read average (VU), peak (instantaneous with 26 dB / 3 sec
falloff), maximum peak, and overs.
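If you’re wondering what a falloff like that amounts to, here’s a rough Python sketch. It is not the k20 source; it assumes “26 dB / 3 sec” means the displayed peak sinks linearly (in dB) by 26 dB over three seconds unless a louder block arrives. The name falling_peak, the per-block input, and the -70 dB floor are all made up for illustration.

```python
def falling_peak(block_peaks_db, block_seconds,
                 falloff_db_per_second=26.0 / 3.0, floor_db=-70.0):
    """Return the displayed peak (in dB) after each block of audio.

    block_peaks_db: instantaneous peak of each block, in dBFS (assumed input).
    block_seconds:  duration of one block, in seconds.
    """
    shown = floor_db
    displayed = []
    for level in block_peaks_db:
        # Let the previous reading sink, then jump up if the new block is louder.
        decayed = shown - falloff_db_per_second * block_seconds
        shown = max(level, decayed, floor_db)
        displayed.append(shown)
    return displayed


# A loud block, two quiet ones, then a louder one (half-second blocks).
print(falling_peak([-6.0, -40.0, -40.0, -3.0], block_seconds=0.5))
```

The reading jumps up immediately but only drifts down, which is what makes a peak readable by a human.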

This is pure unadulterated printf() abuse. No ncurses. Not that I have
anything against ncurses, just that I’m lazy. Of course you need an ANSI
capable terminal, but I’m sure you can find one lying around.


Jul 9 2008

dB and Gain

I’ve been studying up on audio recording on the web, so that I can make decent
recordings for my research. Making good recordings is a lot more
involved than you might think. One of the perhaps needlessly overinvolved
aspects is understanding gain.

A microphone detects sound and produces a faint electrical signal, usually too
faint for practical use. So, the signal is amplified. But if you amplify the
signal too much, you get distortion. Once the signal passes from the analog to
the digital domain, you have forever lost some information, so the natural
thing is to get as much gain as you can before the A/D conversion.

So how do you know how much gain is enough but not too much? A good indicator is a peak meter. Here’s a picture of an analog peak meter (called a PPM):

PPM

In the digital world, the practical equivalent looks something like this:

DPM

We’ll ignore the wacky scale on that analog meter for now. Notice the scale on
the DPM—it goes from 0dB at the top to about -70dB at the bottom. Now, decibel
(dB) is a ratio, meaning it has no meaning without some reference point. In
this case it’s actually dBFS (dB full scale), calculated as dBFS =
20log10(|x|/1.0) when representing samples as floating point numbers
between -1.0 and 1.0. Whatever you divide by inside the logarithm is your
reference. You would divide by the maximum sample value if you were using fixed
point or some other representation. So our reference is 1.0, aka full scale.
Since we can’t go over full scale, all dBFS numbers will be less than or equal
to zero.
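Here’s that formula as a small Python sketch. The function name dbfs and its full_scale parameter are just for illustration; the only thing taken from the text is 20*log10(|x|/reference).

```python
import math


def dbfs(sample, full_scale=1.0):
    """Level of a single sample in dB relative to full scale."""
    magnitude = abs(sample)
    if magnitude == 0.0:
        return float("-inf")  # digital silence has no finite dB value
    return 20.0 * math.log10(magnitude / full_scale)


print(dbfs(1.0))           # full scale
print(dbfs(0.5))           # half of full scale, roughly -6 dB
print(dbfs(16384, 32768))  # same ratio, 16-bit integer samples
```

Halving the amplitude costs about 6 dB no matter which sample representation you use, because only the ratio to the reference matters.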

So, if you record such that you are just below 0dBFS, you are using every bit
of quantization to its fullest. The trick is that you often don’t know just how
high the maximum peak will be (unless perhaps you’re recording something
repeatable, like playing back a tape). So, you need to leave a little
headroom in case you have some peaks that are larger than average. How much
headroom to use is hotly debated, but really it is just a function of the kind
of thing you’re recording (and the environment in which it is being recorded).
There is no substitute for experience here. Here’s a hint: record stuff with
plenty of headroom, find out what the maximum peak is, and calculate how much
extra headroom you have. Do that a bunch, and you will begin to get an idea of
how much headroom you need.
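That bookkeeping is easy to automate. Here’s a minimal Python sketch, assuming the recording is already loaded as a non-empty list of floats between -1.0 and 1.0; the function names and the sample values are invented for the example.

```python
import math


def peak_dbfs(samples):
    """Maximum peak of a recording in dBFS (samples are floats in -1.0..1.0)."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0.0 else 20.0 * math.log10(peak)


def unused_headroom_db(samples):
    """How far the loudest sample stayed below full scale (0 dBFS)."""
    return 0.0 - peak_dbfs(samples)


take = [0.02, -0.31, 0.25, -0.05]  # made-up samples standing in for a take
print(round(peak_dbfs(take), 2))           # a bit above -10.2 dBFS
print(round(unused_headroom_db(take), 2))  # so about 10 dB went unused
```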

Or, don’t record that stuff, just watch a meter. Good digital meters will have
a resettable maximum peak indicator.

But there’s more. The peak isn’t a good indicator of how loud it is. Loudness
is perceived more closely to the average signal amplitude. The most common way
to calculate this is Root Mean Square (RMS): sqrt(sum(x.^2)/length(x)). A common
meter which measures loudness (although it doesn’t measure RMS, exactly) is the
VU meter. You may have seen it:

VU meter

The VU meter does not measure peaks, but it does measure loudness. But
there’s a trick. 0 VU is not 0 dBFS, but rather an analog voltage level, which
on professional gear is usually +4 dBu. dBu and dBV are electrical voltage
references for the analog world. In this
world, you aim for 0dB to be “forte”, and it’s ok if the meter occasionally
goes higher than 0dB. Remember the wacky scale on the analog PPM meter? It
is tied to the same voltage references: 4 on the PPM is 0 dBu, and the marks are usually 4 dB apart, so 5 on a PPM would be +4 dBu.

So how do we relate these to the digital world? Well that depends on your ADC.
In practice there will be some variation in how the same strength (dBu) signal
is converted to digital, although the variation is probably not too great. The
important thing is to use the proper meter. When you’re doing a digital
recording, the proper scale is dBFS and it’s irrelevant what analog signal you
need to get the dBFS you need. Not entirely irrelevant—you want to make your
signal hotter earlier because that reduces the noise introduced by each
component in the system, but ultimately how much gain you apply (early in the
system) is governed by your digital meter.

But wait, aren’t most digital meters peak meters? Yes, but there are loudness
meters if you look for them. On Linux and OS X with JACK, look at
jmeters (a derivative of the neglected meterbridge). At the time of this
writing it has VU and PPM meters, though you will have
to play with them to get the hang of their reference points. Or I could walk
you through it.

Go grab some calibration tones. Included in the zip available there are a
1 kHz sine wave at -20 dBFS RMS and various pink noise tones (different
bandwidths) at -20 dBFS RMS. But wait, what’s this?

octave:1> [x,sr] = wavread('1khz sine wave -20 dBFS.wav');
octave:2> 20*log10(sqrt(sum(x.^2)/length(x)))
ans = -23.322
octave:3> [x,sr] = wavread('PINK NOISE FULL bw -20 dBFS.wav');
octave:4> 20*log10(sqrt(sum(x.^2)/length(x)))
ans = -22.908

Gee, looks like those are -23 dBFS RMS doesn’t it? Well, technically they are.
But here’s the rub: the smart people at the Audio Engineering Society declared
that in the analog world we should measure a sine wave with -20dB peak as -20dB
RMS, i.e. a sine wave has the same measurement on a peak and AES-17 RMS meter.
Whether they originated this or it came from history I don’t know, but probably
the latter. So, since a sine wave has about 3dB crest factor, these test tones
have -23 dBFS true RMS, but -20 dBFS AES-17 RMS. Confused yet?
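You can check the 3 dB business without the test files. The Python sketch below synthesizes an ideal 1 kHz sine with a -20 dBFS peak (the 48 kHz sample rate and one-second length are my assumptions, not properties of the downloaded tones) and measures it both ways. The sqrt(2) offset is just the crest factor of a sine.

```python
import math


def rms_dbfs(samples):
    """True RMS level in dBFS: 20*log10(sqrt(mean(x^2)))."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)  # 10*log10(ms) == 20*log10(sqrt(ms))


sr = 48000                   # assumed sample rate
peak = 10 ** (-20 / 20)      # amplitude of a -20 dBFS peak, i.e. 0.1
sine = [peak * math.sin(2 * math.pi * 1000 * n / sr) for n in range(sr)]

true_rms = rms_dbfs(sine)
# AES-17 style: offset so a sine reads the same on peak and RMS meters.
aes17_rms = true_rms + 20 * math.log10(math.sqrt(2))

print(round(true_rms, 2))    # about -23 dBFS
print(round(aes17_rms, 2))   # about -20 dBFS
```

A real file may come out a few tenths of a dB off the ideal figure.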

So we fire up jmeters -t vu x and jmeters -t ppm x, then hook up the JACK
connections and play the sine wave. The VU meter settles nicely on -10 VU and
the PPM reads 2. So it would seem that Fons has decided that 0 VU should be -10
dBFS, which gives 10 dB headroom. The gain is adjustable if you don’t like that
choice, by the way.

Ok, so now you know enough about meters to be a little dangerous. Now let me
throw another wrench in things. Let’s talk about mastering. After you record
all the tracks, you need to mix them together into songs and then finally
master the album. Mastering is (to oversimplify) all about loudness. You want
the songs to all fit together when you play the whole album. You don’t want
some songs to sound too loud relative to others. Ideally, you’d want all albums
to have similar loudness. But you and I know that isn’t usually the case. This
is the result of the “loudness war”. Over the past 20+ years, albums have been
getting louder and louder and louder. Everyone wants the loudest-sounding
album. Everyone wants to be the baddest.

And bad they are. Audio engineers have always mastered to take full advantage
of the medium. You can count on the highest peak usually being right up to the
clipping/distortion threshold. It’s only right. But then how do things get
louder? Dynamic range compression. I won’t go into the technical details of how
it works, but I exhort you to read up on it. Just Google loudness war. Suffice
it to say compression brings the peaks (and valleys) closer to the RMS. The
loudness war
is a crying shame. So much music that could sound so good sounds so terrible
because the life has been compressed out of it.

So what does this have to do with metering? Well, the whole point of this post
is to inform you of the K-System meters. A smart audio engineer by the name of
Bob Katz (author of the newest book on my wishlist, Mastering Audio) proposes this system which borrows from history and other industries
that have standards which have protected them from the loudness wars. He
proposes 3 meters, the K-20, K-14, and K-12 meters. The number is the amount of
headroom above 0 dB, so a K-20 meter will set 0dB equal to -20 dBFS, and 0 dB
again means “forte”. Then, you calibrate your studio monitors such that pink
noise that reads 0 dB RMS gives 83 dB SPL (sound pressure level) to your ears.
You need fancy equipment to do that, but unless you’re mastering in a studio
(in which case you have the fancy equipment) it doesn’t really matter. The CD
has dB FS, not dB SPL. If everyone did this, even if everyone calibrated “forte”
to -20 dBFS, all the CDs would sound about the same loudness, there would be
plenty of dynamic range to make the music come alive, and the world would be a
happy place.
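In code, the K-System scale is nothing more than an offset. Here’s a hedged Python sketch: the table and the function k_reading are illustrative names, and whether the RMS value you feed in is “true” or AES-17 style is a choice the sketch leaves to you.

```python
# Headroom above 0 dB for each K-System meter, as described above.
K_HEADROOM_DB = {"K-20": 20.0, "K-14": 14.0, "K-12": 12.0}


def k_reading(rms_dbfs_value, meter="K-20"):
    """Convert an RMS level in dBFS to a reading on the given K meter."""
    return rms_dbfs_value + K_HEADROOM_DB[meter]


print(k_reading(-20.0))          # 0.0 on a K-20 meter: "forte"
print(k_reading(-14.0, "K-14"))  # 0.0 on a K-14 meter
print(k_reading(-8.0))           # 12.0 on a K-20 meter: well into the headroom
```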

I have talked to Fons and he has been working on a K-20 meter for jmeters. I’ve
also been writing one with a simple printf display, which I will make
available once I get it working properly. The take-home lesson is to record
such that your highest peak is as close to 0 dBFS as you can get without
clipping, and master to forte (83 dB SPL) at -20 dBFS RMS.

But there’s still that question of headroom. Why not just adjust the gain so
that forte is -20 dBFS RMS and be done with it? I’m not implying you wouldn’t
have to master to get a professional-sounding recording (mastering is more than
just adjusting gain), but that it should give you a nice volume with enough
headroom. I think it’s a great idea. I’m glad I thought of it (though I’m sure
I’m not the first). But make sure you’re using as many bits as possible for
quantization, preferably 24 bits, since you are potentially sacrificing
bits for extra headroom.
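To put a number on “sacrificing bits”: each bit of linear PCM is worth 20*log10(2), about 6 dB. The Python sketch below uses that rule of thumb (the function names are made up) to show what 20 dB of headroom costs.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit of linear PCM


def bits_spent_on_headroom(headroom_db):
    """Bits of resolution that sit unused above the signal."""
    return headroom_db / DB_PER_BIT


def bits_left(bit_depth, headroom_db):
    """Bits still available below the headroom."""
    return bit_depth - bits_spent_on_headroom(headroom_db)


print(round(bits_spent_on_headroom(20.0), 2))  # a little over 3 bits
print(round(bits_left(16, 20.0), 2))           # under 13 bits left at 16-bit
print(round(bits_left(24, 20.0), 2))           # still over 20 bits at 24-bit
```

So at 24 bits you can afford the headroom and still have more resolution left than a full-scale 16-bit recording.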

I hope this little treatise was interesting and/or useful. To learn much more, I recommend starting at Bob Katz’ articles on level practices.


Jun 4 2008

PulseAudio as a JACK Client

I spoke too soon about not being able to get PulseAudio working as a JACK client. I found this post that tells you how to do it.

The key I think is chmod -s `which pulseaudio`. I didn’t have to start the JACK transport rolling, so that may be antiquated information. I did have to build some packages from source, though:

sudo apt-get build-dep pulseaudio
sudo apt-get install libjack-dev
fakeroot apt-get source -b pulseaudio

This creates a bunch of .debs, including pulseaudio-module-jack*.deb. I just installed them all, but you can probably just install the jack module deb. Make the changes permanent by putting them in ~/.pulse/default.pa or in /etc/pulse/default.pa and you’re in business.


Mar 23 2008

CoreMIDI

I’m porting Aeolus to OS X. In the process I’m learning how CoreMIDI works. Naturally you get to hear my opinion on the matter.

CoreMIDI seems like a decent framework, actually. It is callback-based, which is good. It has a pretty reasonable design; physical devices and virtual devices alike communicate with each other in the same way. They each have endpoints: source endpoints and destination endpoints.

It all looks well and good on the surface, but there are some problems. The first problem is arguably a feature. You can create input/output ports and connect sources/destinations to those ports from within your application. This allows you to make a cute or complicated dialog box where the user can select the MIDI (virtual) device(s) she wants to use. Sounds reasonable, right? And so it is, and I wouldn’t argue against this ability.

The badness comes in when you consider that every application has to duplicate this functionality. It would be much better to have an external patchbay for connecting applications together. This would be more powerful and flexible and free up application developers to not worry about it. They just have to create the endpoints and then they’re done.

Alas, OS X proper has no such patchbay. “Yes it does, silly. It’s called Audio MIDI Setup” you say. That’s the most infuriating thing—Audio MIDI Setup lets you route between devices in just the sort of way I’m talking about, but it only works for physical devices. Someone needs to be shot.

Luckily, some guy named Pete wrote a MIDI Patchbay. It’s serviceable, if quirky and ugly. He also wrote a simple software synthesizer called SimpleSynth (also quirky and ugly) that does what something in OS X (e.g. QuickTime Player) should already be doing: accept MIDI input and use the QuickTime music synthesizer to render it. Kudos to Pete for filling in the gaps, and I’m sorry for calling your children ugly.

While I’m complaining about patchbays, I’m still dumbfounded that JACK doesn’t seem to have a command-line application for patching things together. I’m thinking something akin to aconnect for ALSA MIDI, though of course for JACK it would be for audio and MIDI both. qjackctl is absolutely marvelous, and I wouldn’t use anything else given the choice, but sometimes you don’t have qjackctl handy and it might be quite difficult indeed to get it. This was the case for me the other day. I had the latest greatest JACK installed from source, but qjackctl (which I finally managed to figure out how to build using the QtMac binary, whose qmake refuses to output a real Makefile but instead an Xcode project) was choking on it. So I had to downgrade to Jack OS X and rebuild qjackctl (it’s still an immense improvement over JackPilot). This is depressing because the newer version of JACK is much more friendly to the CLI user on OS X. The version in Jack OS X 0.76 still requires some ugly workarounds (which JackPilot helps you to do). The latest version of JACK (0.109.2) Just Works™ when you type jackd -R -d coreaudio. So I’m still starting JACK with JackPilot, which I then summarily quit in favor of qjackctl.