Gnuplot and Missing Data

Let’s say you have some data you want to plot with gnuplot:

2007-08-16 119.02264
2007-08-17 120.20198
2007-08-18 121.29
2007-08-19 120.65557
2007-08-20 119.92982

Further suppose you’d like to plot both the weight (those are kilograms) and the BMI (weight/height^2, in kg/m^2). One way to do that is this gnuplot snippet:

set xdata time
set timefmt '%Y-%m-%d'
set datafile missing "?"
plot 'data' using 1:2 title 'Weight' with lines,\
     'data' using 1:($2/3.7249) title 'BMI' with lines
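
The divisor 3.7249 looks like a squared height (1.93 m squared is 3.7249). As a small sketch under that assumption, the same plot can be written with a named variable so the formula is visible. The variable name height is made up for illustration; everything else is the snippet above.

```gnuplot
# Assumption: 3.7249 is height squared, i.e. a height of 1.93 m.
height = 1.93   # metres (hypothetical variable name)

set xdata time
set timefmt '%Y-%m-%d'
set datafile missing "?"
plot 'data' using 1:2 title 'Weight' with lines, \
     'data' using 1:($2/height**2) title 'BMI' with lines
```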

So far so good. Now, what if you have a missing data point? Well, in this contrived example you could simply leave that line out and all would be well. But let’s say the file has other columns whose values are present on that line, so you can’t drop the line and have to mark the missing value instead:

2007-08-16 119.02264
2007-08-17 120.20198
2007-08-18 ?
2007-08-19 120.65557
2007-08-20 119.92982

If you use the snippet above to plot this data, you get an interesting result. The first plot (weight) is just what you’d expect: it skips the missing data point and connects the points for 8/17 and 8/19. But the second plot (BMI) leaves a gap between 8/17 and 8/19 instead. This oddly inconsistent behavior is unexpected to mere mortals, but not to gnuplot developers. I quote from help missing:

   set datafile missing "?"
   set style data lines
   plot '-'
      1 10
      2 20
      3 ?
      4 40
      5 50
      e
   plot '-' using 1:2
      1 10
      2 20
      3 ?
      4 40
      5 50
      e
   plot '-' using 1:($2)
      1 10
      2 20
      3 ?
      4 40
      5 50
      e

The first plot will recognize only the first datum in the “3 ?” line. It
will use the single-datum-on-a-line convention that the line number is “x”
and the datum is “y”, so the point will be plotted (in this case erroneously)
at (2,3).

The second plot will correctly ignore the middle line. The plotted line
will connect the points at (2,20) and (4,40).

The third plot will also correctly ignore the middle line, but the plotted
line will not connect the points at (2,20) and (4,40).
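
If you want both curves to skip the missing point, one possible workaround is to filter out the offending lines before gnuplot ever sees them, so the expression in the using clause never meets a "?". This is only a sketch: it assumes a Unix-like system with awk on the PATH and a gnuplot build that can read from a pipe (the "< command" file name form), and the variable name filtered is made up for illustration.

```gnuplot
set xdata time
set timefmt '%Y-%m-%d'

# Hypothetical helper: pipe the file through awk and keep only the lines
# whose second column is not the missing-data marker "?".
filtered = "< awk '$2 != \"?\"' data"

# Both plots now read the same filtered stream, so neither sees the "?"
# line and both connect 8/17 directly to 8/19.
plot filtered using 1:2 title 'Weight' with lines, \
     filtered using 1:($2/3.7249) title 'BMI' with lines
```

Going the other way, the third case in the help text suggests that writing the weight column as using 1:($2) should leave a gap in both curves. How an expression treats a missing value may differ between gnuplot versions, so check help missing for the version you have.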

