Feb 4 2009

Identification in PDFs

If you need to create a PDF with no embedded identification, it may not be enough to simply refrain from typing your name. For example:

$ strings foo.pdf | egrep -i '(hans|fugal)'
/PTEX.FileName (./0_Users_fugalh_research_foo_fig1.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./1_Users_fugalh_research_foo_fig2.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./2_Users_fugalh_research_foo_fig3.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./3_Users_fugalh_research_foo_fig4.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./4_Users_fugalh_research_foo_fig5.pdf)
/Author (Hans Fugal)

The /PTEX lines are from pdftex and the /Author lines originated from gnuplot:

/Title (fig1.pdf)
/Author (Hans Fugal)
/Creator (gnuplot 4.2 patchlevel 4 )

Removing the offending lines didn’t hurt the PDF in this situation. So if you must anonymize a PDF (e.g. to submit a paper for blind review), be sure to check for hidden identification. Of course, most reviewers wouldn’t go digging for it, but you will rest easy knowing it’s truly anonymous.
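If you would rather script the check than eyeball the output of strings, here is a minimal Python sketch of the same idea. The list of keys and the function names are my own assumptions for illustration, not anything pdftex or gnuplot defines. Like strings, it only sees metadata stored uncompressed in the file, so treat a clean result as a hint rather than proof.

```python
import re

# Keys seen in the example above, plus /Producer as a guess; extend as needed.
SUSPECT_KEYS = (b"/Author", b"/Creator", b"/Title", b"/Producer", b"/PTEX.FileName")

# Match a key followed by a literal string in parentheses, e.g. /Author (Jane Doe).
PATTERN = re.compile(
    rb"(" + b"|".join(re.escape(key) for key in SUSPECT_KEYS) + rb")\s*\(([^)]*)\)"
)


def find_identifying_fields(data):
    """Return (key, value) pairs for uncompressed metadata entries in PDF bytes."""
    return [
        (match.group(1).decode("latin-1"), match.group(2).decode("latin-1"))
        for match in PATTERN.finditer(data)
    ]


def scan_file(path):
    """Read a PDF from disk and return the suspicious (key, value) pairs."""
    with open(path, "rb") as pdf:
        return find_identifying_fields(pdf.read())
```

Calling scan_file("foo.pdf") returns a list of pairs to review by hand. An empty list means none of the listed keys were found in plain text, nothing more.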


Sep 22 2008

Gnuplot in Action

One of the oldest and most universally useful tools we have is gnuplot. It is also one of the least understood and most underutilized tools we have.

I can hear you now. “What do I need gnuplot for? I don’t make graphs.” Well that’s exactly the problem. Everyone who works with data should be making graphs, and lots of them. Do you write programs that manipulate data? You need gnuplot. Do you want to evaluate performance or traffic on your website? You need gnuplot. Do you want to impress your friends with cool graphs of the growth rates of yeast and bacteria in sourdough or your weight loss and percent body fat? You need gnuplot.

I’ve been using gnuplot for years. I scraped up enough gnuplot skillz to make basic graphs and it has been invaluable. But I knew gnuplot could do more than I knew how to make it do, and whenever I tried to do something advanced it was only with great pain that I succeeded. Often I failed. Let’s face it, gnuplot can be a bear to learn. Why? Well, mostly because of the documentation. Not that there isn’t any, almost the contrary. There’s a lot of documentation, but it’s very much reference documentation. What the world has been lacking is a good introduction to gnuplot that isn’t afraid to get nitty-gritty where it needs to, but doesn’t just parrot the abundant but obscure documentation that’s already out there.

We no longer need to wait. The book is called Gnuplot in Action by Philipp Janert, and it is an absolutely fantastic book. Really, I can’t say enough good things about it.

Janert walks the fine line between cheesy tutorial and dense reference with the skill of a circus acrobat. The writing is approachable, yet chock full of useful information. Nothing is rushed, but it doesn’t plod. The text is sprinkled with beautiful graphs that expand your imagination and open your eyes to the possibilities of gnuplot.

In chapter 2, “Essential Gnuplot”, the impatient reader is given a whirlwind tour of gnuplot basics. After just 11 pages you will know everything you need to know for 90% of the graphs you will ever need to create. In fact, you’ll know more than I knew when I began reading it—I learned a couple things that I kick myself for never having discovered on my own.

Chapter 3 goes into more detail on dealing with data, and in that chapter I learned a ton. Several of the things I learned in this chapter have saved me numerous hours this semester alone. Chapter 4 picks up the remaining miscellany.

In part 2, all those nagging questions of polish are addressed. This is where I used to spend the most time banging my head against the wall, searching, plodding through various newsgroup threads. “How do I get this or that to look just right?” These types of questions are hard to find answers to in search engines. Janert takes us by the hand and explains each and every question I’ve ever had and a few I hadn’t yet dared to have. Truly beautiful graphs are now within my grasp. What’s more, it no longer seems like an exercise in pain but a simple recipe for success. After Janert explains these techniques they seem plain as the nose on your face, yet he’s not condescending.

Part 3 dives into the deep dark secrets of gnuplot. 3D plots, color, multiplots, different coordinate systems, fitting, terminals, and a dozen other things you didn’t even know that you didn’t want to know. No doubt you’ll skim this section the first time and come back to it when you need those dark magic tidbits.

Part 4 is arguably the most important part of this book, or perhaps second after part 1. Part 4 is a crash course on graphical analysis. What kinds of graphs you can create, when you should and shouldn’t use them, how not to lie with graphs (and how to pick out people lying with graphs), and most importantly, how to go from raw data that you don’t understand to organized data that you do understand and have pretty graphs to demonstrate to boot. All with practical examples that you can tweak for your own use.

Finally, there’s a gnuplot reference in the appendix. This is a deluxe package and has everything you need to become a gnuplot guru. I am thrilled that this book is coming to dispel the darkness surrounding gnuplot.

I really have no cons to speak of, other than that the prerelease PDF I had access to had some minor problems, the sort of problem I would expect to be resolved in the final stages of editing. I don’t have experience with other Manning books, but having seen prerelease versions of other books from other publishers I’d say the current copy is par for the course. I’m certain they’ll fix those things up and have an outstanding PDF in the end. I recommend springing for the dead tree version though, as I expect the reference at the end of the book and the examples throughout will be more accessible next to your computer instead of on the screen. (You already use quite a bit of real estate running gnuplot and/or editing a gnuplot file and displaying graphs.)


Oct 22 2007

Total Fitness in 30 Minutes a Week

Sorry, one more post on the topic of fitness and fat loss. I picked up ($4 with
shipping) and reread Total Fitness in 30 Minutes a Week by Laurence Morehouse,
Ph.D. and Leonard Gross, and it’s as good as I
remembered. For various reasons I didn’t follow through with the plan laid out
in this book last time I read it years back, but the principles I picked up
stuck with me, and influenced my search for my custom and sustainable fitness
program. I had a question that I thought was answered in this book, and so I
picked up a copy and started reading. I’ll give you a synopsis and review, then
I’ll divulge my (finally) finished custom plan.

First, this book is older than I am (1975). Naturally, that means we’ve learned
some things that Dr. Morehouse (a Ph.D. in exercise physiology at UCLA) didn’t
know. On the other hand, most of what he did know back then, especially the
basic foundation on which the program is based, is just as true now as the law
of gravity remains. The book shows its age, but in ways inconsequential to
successfully losing fat and/or gaining fitness. Indeed, it worked for people in
the 50s, 60s, and 70s and there’s no reason why we should be any different now.

And not just any people. Dr. Morehouse worked for NASA on exercise programs for
the astronauts. Thanks to him astronauts on extended missions were able to walk
(if a little shakily) rather than be carried on a stretcher when returning to
Earth. Low gravity is worse than sitting in front of a computer when it comes
to atrophying muscle.

If I had to boil this whole book down into one paragraph, this would be it: Eat
balanced meals, exercise with a balance of simple equipment-free strength
training and aerobic exercise 3 times a week for a total of 30 minutes a week,
live a bit more actively (take the stairs, etc.), and take note of and respond
to feedback to stay on track. To get into shape (generic good shape, not
athletic shape), that’s all you need. To lose fat, you chart your course of 1
lb a week on a piece of graph paper. If you’re above the line that day, you eat
a little less (skip that piece of pie or extra helping). If you’re below, you
eat normally.
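The graph-paper rule is simple enough to sketch in a few lines of Python. This is only my reading of the rule, not code from the book: the function names are made up, and treating "exactly on the line" the same as "below the line" is my assumption.

```python
def target_weight(start_weight, days_elapsed, loss_per_week=1.0):
    """Weight (in lb) on the planned line after days_elapsed days."""
    return start_weight - loss_per_week * days_elapsed / 7.0


def advice(start_weight, days_elapsed, todays_weight, loss_per_week=1.0):
    """Compare today's weight to the line and return the rule's advice."""
    if todays_weight > target_weight(start_weight, days_elapsed, loss_per_week):
        return "eat a little less"
    # On or below the line: assumption, the text only covers above and below.
    return "eat normally"
```

For example, starting at 200 lb, the line sits at 198 lb after two weeks, so a 199 lb morning means skipping the pie and a 197.5 lb morning means eating normally.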

The book and method are very straightforward. There are no gimmicks here. It
won’t get you ready to run a race or climb Mount Everest. There’s no confusion
here between being an athlete and just plain getting in shape. The book is a
little wordy, and could be half as long and just as informative. But that may
be because I’d already convinced myself of most of the points he drives home in
this book and didn’t need the persuasive arguments.

This book is very much along the same lines as the Hacker’s Diet I reviewed the
other day, except it emphasizes exercise much more (for its own sake,
primarily, not as a primary means for losing weight). Both use the simple view:
calories in and calories out. Both emphasize the importance of feedback and the
realities of measurement. Both give you a sustainable and easy-to-follow
program (this one is easier than the Hacker’s Diet since you don’t have to count
calories).

So combining these two books and everything I’ve read from the web (everything
from fat-loss zone heart rate cardio training to the bodybuilder mantra “cardio
is useless for fat loss”), I have come up with my own personal plan. Time will
tell if it works.

If I’m going to exercise, it’s going to be swimming. I told myself that many
many times over the years, and I meant every word. So I go swimming 3 days a
week. There’s my cardio. It’s also part of my strength training, when doing
intervals. The other part is on the other 3 days when I do some simple
equipment-free strength training (5-10 minutes). I’m basing my exertion on the
combination of perceived exertion (primarily how hard I’m breathing) and heart
rate. I aim for staying aerobic and jumping the lactate threshold on the hard
intervals.

I’m convinced you can’t lose weight in a reasonable amount of time without
adjusting your diet, unfortunately. The numbers just don’t add up otherwise.
Every pound of lean mass you add burns some 10-20 kcal a day, and you’re a
lucky bodybuilder if you can add 1-2 pounds a week. That gets you no closer to
burning off that extra pair of twinkies than the hour of jogging. Exercise
alone, in the sense of that thing you do for an hour in the morning, is not
enough to raise your energy usage enough to create a calorie deficit without
adjusting your diet (especially since if you have too much fat you’ve probably
been eating a calorie surplus). No, you have to adjust your diet. That doesn’t
mean you have to starve, it just means you need to be conscious.
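To make "the numbers don’t add up" concrete, here is a rough Python sketch of the arithmetic. The 3500 kcal per pound of fat figure is a common rule of thumb that I am assuming here; it is not from either book as quoted above, and the function names are invented for the example.

```python
KCAL_PER_LB_FAT = 3500  # assumption: common rule of thumb, not a precise constant


def daily_deficit_needed(lb_per_week):
    """Daily calorie deficit (kcal) needed to lose lb_per_week pounds of fat."""
    return lb_per_week * KCAL_PER_LB_FAT / 7


def extra_burn_from_muscle(lb_lean_mass_added, kcal_per_lb_per_day):
    """Extra kcal burned per day thanks to newly added lean mass."""
    return lb_lean_mass_added * kcal_per_lb_per_day
```

Under that assumption, daily_deficit_needed(1) comes to 500 kcal a day, while extra_burn_from_muscle(2, 20), the optimistic end of both ranges above, is only 40 kcal a day.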

Exercise does, however, apparently act as an appetite suppressant, and it will
make you feel better and so you’ll be a hair more willing to walk instead of
ride, stand instead of sit, etc. Water is apparently another appetite
suppressant, and it is important to drink plenty for other reasons especially
when losing fat, so drink plenty. If for no other reason than because it fills
your stomach partially, water can suppress your appetite if you drink it before
a meal. So on my above-line days I’ll be drinking a couple glasses before
meals.

Dr. Morehouse says not to lose more than 1 lb a week. The consensus on the web
is similar, but says 1-2 lb a week. I’m a little too impatient for 1 lb a week,
but probably too lazy for 2 lb a week, so I’m aiming for 1.5 lb a week. I shall
have lost my goal 40 lb by mid April.

To recap, I’m watching feedback (heart rate and weight) to fine-tune my eating
and exercising in order to stay on track for a reasonable goal. No starving.
Only the exercise I like. Reasonable and sustainable. Why don’t you play along
at home?


Jul 26 2007

Harry Potter

This is a casual review of the seventh Harry Potter book, and to a lesser extent the entire series. Spoilers herein, so don’t follow that “read more…” link unless you’ve read it.

Continue reading


Jul 26 2007

Practical Ruby for System Administration

I love Ruby, and I use it whenever I can for all kinds of tasks. When I was
working as a professional system administrator, I put it to good use a few
times there too. But using Ruby always left me feeling a little guilty, because
whoever came after me would probably curse my name for using a “nonstandard”
language. It didn’t help that Ruby was rarely installed, if even available, by
the popular server distributions of the day. But all that has changed in the
past two years, and using Ruby is often acceptable even in the conservative
world of system administration.

You can imagine my interest then, in a book titled “Practical Ruby for System
Administration” by André Ben Hamou. The title is telling; Ruby is known for
being recommended by The Pragmatic Programmers because, well, it’s a pragmatic
language. System administration is all about pragmatism, above all else. Doing
things right is important, but doing them now is more important. I knew all
along that Ruby was an excellent tool for the enlightened sysadmin; now there’s
a book to back me up.

The book starts out with why you would want to use Ruby, and nips the
counterarguments in the bud. Hamou does a good job on both counts, though as I
already know Ruby I can’t say whether the language introduction is sufficient
of its own accord to get started with actually writing Ruby, or whether you’ll
need to refer to other sources (which are available online).

In chapter 2 he discusses one-liners. One-liners are the staple of many
sysadmins, but I never fell for them. I’d rather write a throwaway script than
try to cram it all on one line, or if it can really be done well on one line it
can probably be done even more concisely with shell script and UNIX tools.
However, the chapter does discuss a lot of fascinating switches that the ruby
executable takes. I can see myself using many of these, and I was ignorant of
most of them. You might learn the same from a careful study of the man page,
but Hamou presents it in an easier-to-digest format. It’s not always easy to
inject humor into a book while maintaining brevity, but Hamou does a
decent job of that. I found myself laughing aloud more than once, and yet I
feel that the book is concise enough to serve as a reference, and that the
jokes won’t get annoying after a few readings of a section.

This book is not only a great book for the sysadmin hoping to use Ruby, but also an excellent book for any sysadmin who may not even be interested in Ruby. Although it doesn’t even pretend to be a book on system administration best practices, there are many gems in here that will leave you saying “Why didn’t I think of that?” and “I really need to implement that. It will be so helpful and it’s astoundingly simple to do with Ruby!”

Chapter 3 not only gives you a quantitative feel for the speed of Ruby versus performance-oriented languages (e.g. C), but also teaches you when to be concerned with execution speed and when to be concerned with implementation speed, and makes you think twice about your own feelings about performance. “Ruby is slow” is one of the favorite counterarguments against Ruby, but it’s rarely a good one. This is even more true in system administration, where the sysadmin’s time is much more important than whether the script runs in 1/10 second or 3 seconds.

Chapter 4 discusses “metaprogramming”. I found myself at odds with Hamou most in this chapter, but it was all academic differences in terminology. He’s fairly sloppy with the term metaprogramming and other terms (e.g. “macros”) in this chapter, but the content is nonetheless a useful and vital part of making the most out of Ruby. I even learned a thing or two. He discusses domain-specific languages (DSLs) in Ruby, but mostly at the level of recognizing one when you see one. I think the world needs a paper or book on rapidly making your own DSLs, something sysadmins could really leverage if it were truly easy to do. I’ve written a DSL in Ruby, and while easier than I could possibly have imagined it’s still not at the level of “easy and completely generic” that I feel DSLs can one day reach.

Chapter 5 is where the fun really begins. With the basics of the language under our belts we can really look at specific examples. This is where the book really shines, both as a tool for applying Ruby and as a gold mine of good sysadmin practice. We learn to read and write files, with examples for the most common ones. I don’t mean we discuss IO.open and IO.close in detail, I mean we talk about the concepts that apply across all kinds of file reading/parsing and generation, including issues of locking.

In chapter 6 we explore the storage and retrieval of data, in a variety of approaches including inspect, marshalling, YAML, and ActiveRecord. Most interesting to me was the section on memcached. I think he ought to also have introduced SQLite (perhaps in the context of ActiveRecord) and DRb.

Chapter 7 is dedicated to dealing with “enterprise data”. You know that of which we speak. XML, CSVs, protocols like XML-RPC, SOAP. Most valuable to me in this chapter was a coherent discussion of what REST is (finally!) and how to do it in Ruby.

Chapter 8 discusses networking, including writing simple clients and servers. One tendency Hamou has in this book is to use pure Ruby and steer clear of system and backticks. This tendency sticks out in this chapter, where much time is spent discussing that which could be achieved better with a shell script, or at least a Ruby script with judicious use of system or backticks. The general argument for doing things in pure Ruby is portability, and to a lesser extent performance. Neither of these is the first concern of a sysadmin, who is generally not going to write a script that must work on more than one platform (even if it runs on several different Linux/UNIX distributions, which have the same GNU tools).

Chapter 9 deals with that task which we all hate: network and log monitoring. He has some gems of wisdom herein, but all in all we’re left feeling about the same as when we went in: we can do it (now we can do it with Ruby) but it’s a pain. The gains are not as great here as they are in other areas. This isn’t Ruby’s fault, nor Hamou’s, it’s just that nobody’s really come up with a good and general way to attack this problem yet.

Chapter 10 discusses RubyGems. I think it’s fitting that such a chapter be at the end of the book, and that’s all I have to say about that today.

Chapter 11 discusses testing. You should do it in some form and he discusses a few forms. Nothing earth-shattering here, from where I sit. Chapter 12 talks about the future of Ruby, and will probably cause you to salivate.

All in all, an excellent book on using Ruby as a sysadmin. Go ahead, come out of the closet. Ruby is not only OK, it’s often the best choice.


Aug 1 2006

Ladder 49

I try to avoid reviewing too many movies, because movie reviews are a dime a
dozen and why should you care what movies I like or don’t like? But this is not
a movie review. It’s a cultural observation with a movie review piggy-backed
onto it.

Ladder 49 is one long cliché. To be more precise, it is a cliché flashback
movie (you know, like the bad MacGyver episodes) containing a montage of cliché
“life of a firefighter” clips. It’s boring, sappy drama. This in itself is not
extraordinary. There are a lot of boring sappy dramas.

What I wish to comment on is the fact that the preceding paragraph would no
doubt send some people into a fiery rage about how I’m an insensitive clod that
doesn’t understand or respect what firefighters do for me and how dare I speak
that way about firefighters, and I hope your house burns down some day… I
know this is the case because while I was bored watching this movie I decided
to try and figure out if that was the actor’s cleft lip or if it was special
makeup for the character (it is the actor’s own scar, although not from a cleft lip). In
the process I ended up on the IMDb site for the movie and read a few threads
that started with a negative review (two words permeated: “boring” and
“cliché”) and ended with a whole slew of replies along the vein I already
described.

A lot of firefighters commented in those threads. They say it’s the most
accurate movie about firefighters’ lives that they’ve seen. I don’t doubt
they’re right. And if I were a firefighter I’d love to see an accurate
portrayal of my lifestyle on the big screen. Of course they like the movie.
(I would find an accurate movie about lifeguards (no, not Baywatch) or an
accurate movie about computer scientists (ha!) interesting for that reason, but
I have no delusions that it would be anything but very very boring for the
masses.) It was somewhat more interesting than a documentary in some ways,
although probably somewhat less accurate as well.

Since 9/11 firefighters are sacred, and therefore movies about firefighters
can’t be criticized without hordes of email-forwarding people calling the
critic an insensitive clod. I’m here to tell you that that is ridiculous. I,
and every other person who thought this movie was boring, can think and say
that without the least bit of disrespect for firefighters. I do respect them
and what they do. I’ve been in the rescue business. Although I’d never compare
lifeguarding to firefighting (at least not uneventful poolside lifeguarding),
I’ve been in the culture which includes lifeguards, firefighters, and
paramedics/EMTs. I know these people, and I respect and honor them and count
some of them as my friends.

If you were previously inclined to label someone a firefighter-hater
(publicly or not) just because they don’t like a boring movie about the life
of a firefighter, then grow up. You would do far better to honor these people
by calling a spade a spade, instead of rolling over and applauding whatever
firefighter-themed drivel Hollywood throws your way while fishing for the
email-forwarder’s dollar.


Jul 27 2006

Samsung Printer

My old Deskjet 712C bit the dust (it’s probably still fixable to some degree if
you’re interested), and my wife has never been pleased with the
partially-working scanner we got from who-knows-where, so we broke down and got
a Samsung SCX-4100 monochrome laser printer+scanner/copier.

We didn’t just pick this out of the blue, of course; I did some research on what
would be a good but affordable printer for linux. First stop,
linuxprinting.org’s suggested printers page. This printer wasn’t mentioned
by name, but they did recommend Samsung laser printers as a possibility. This
printer was available and the price was right on NewEgg, so I googled a bit and
found Shane’s post and the many
helpful comments. What a small world. The comments got more and more
encouraging as the page (and time) wore on, and I concluded that it would be a
good printer for linux.

It arrived maybe half an hour or an hour ago, and it’s already working
perfectly as a printer, scanner, and copier. Anyone who’s fussed with printing
in UNIX before will confirm that this is very good indeed. I ran Samsung’s
latest Linux driver installer (dated 2006-07-19) and it installed without
a hitch. I then pointed the /usr/bin/lpr symlink back at the original
non-GUI version. That’s it. No Qt problems, no messing with kernel modules (I’m
using USB, parallel may be a different story).

I’m running Debian testing (etch) and a vanilla 2.6.14 kernel.

Kudos to Samsung for recognizing and supporting linux, and for providing good
drivers (after some growing pains), and kudos to Samsung for making a
multifunction laser that’s affordable.


Jul 7 2006

mpd

I have a laptop and I often use it at home, on the couch. I have a workstation on the other side of the room that my wife frequently uses (which is one of the reasons why I mostly use the laptop), which has decent speakers. I like to listen to music. I think you can see where I’m going with this.

In the past we’ve used various media players. The one that worked best for
remote control was xmms with a homebrew xmlrpc client I made, and now that I
have lirc working that would be another option for remote control. But my wife
kept closing xmms and xmms is lame anyway. I got into quod libet for a little
while but even though there’s several ways to control it apart from the gui,
it’s still not very remote friendly.

Von and I have kicked around the idea for a better music player many times, and
the other day I had an epiphany: it should be a sort of stereo daemon, with
clients connecting over the network and driving it. I’ve learned that most good
ideas I have have already been had, so I searched a bit and sure enough,
mpd. mpd is nifty, wholesome, and flexible. Your wife
can’t close it, you can control it with simple command-line, ncurses, gtk, qt,
etc. clients (still need a good osx client, but the ncurses one is good enough
for a cli junkie like me). If I have music playing and she wants to stop it,
she just has to fire up her client if necessary and hit stop. No fuss, no
excuses. If she’s closed her client and left the room and I want music I don’t
have to stand up, futz around with ssh, nohup, and DISPLAY, or even think two
thoughts. I just fire up my client and hit play. Highly recommended.
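To give a feel for how little a client has to do, here is a bare-bones Python sketch that speaks mpd’s line-based text protocol directly. It assumes mpd is listening on localhost at its default port 6600 with no password set; the function names are mine, and a real client library would handle far more than this.

```python
import socket


def read_response(stream):
    """Collect response lines until mpd's closing "OK"; raise on an "ACK" error."""
    lines = []
    for raw in iter(stream.readline, b""):
        line = raw.decode("utf-8").rstrip("\n")
        if line == "OK":
            return lines
        if line.startswith("ACK"):
            raise RuntimeError(line)
        lines.append(line)
    raise RuntimeError("connection closed before the response was complete")


def mpd_command(command, host="localhost", port=6600, timeout=5.0):
    """Send one command (e.g. "play", "stop", "status") and return the reply lines."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rwb")
        greeting = stream.readline().decode("utf-8")
        if not greeting.startswith("OK MPD"):
            raise RuntimeError(f"unexpected greeting: {greeting!r}")
        stream.write(command.encode("utf-8") + b"\n")
        stream.flush()
        return read_response(stream)
```

With the daemon running, mpd_command("play") starts playback and mpd_command("status") returns lines such as the current state and volume. That is the whole trick: the player lives on the workstation, and anything that can open a socket can drive it.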


Jun 22 2006

Buffalo AirStation WHR-G54S

For the price of a new Actiontech DSL modem/wireless router, I purchased a used
Cisco 678 and a Buffalo AirStation WHR-G54S. I don’t need to tell you which is
the better deal.

I’ve been hearing some good things about Buffalo, and since you can’t be sure
what you’re getting with a Linksys wireless router these days I decided to go
for it. I’ve just finished configuring it and now I have some opinions to spout.

First, it seems like a good router. It’s pretty, has 4 ethernet ports, a WAN
port, and wireless (of course), it’s working fine, it’s very configurable, and
it’s reported that OpenWRT runs on it. The cons are that
it takes a lot of rebooting to configure, the wired port LEDs are in the back,
and some of the terminology is a bit confusing. For example it does support WPA
but nowhere in the manual is it called WPA, and only as an aside in the web
interface do you realize that “AES” is WPA, when it asks for the WPA
“previously-shared” key.

Some highlights from the configuration, that I had plenty of time to ponder
while rebooting after every screen:

  • WAN
    • Does PPPoE, DHCP, unnumbered IP, or static
    • can set the MAC address
  • LAN
    • lots of DHCP server options
  • Misc
    • NTP
    • Syslog
    • Very detailed system info screen
    • Logs, and granular control over what’s logged (or what’s sent to syslog)


Jun 5 2006

Primer

If you like science fiction–I mean real science fiction, not just a movie
that happens to have a robot or a spaceship in it–then you want to go out and
get Primer this week. You want to get
it, watch it, watch it again, then watch the director commentary.

I can’t possibly do justice to the movie here. It’s an amazing, intelligent,
thought-provoking movie. You should watch it. Trust me.

That said, it’s not for everyone. People who read this blog should like it.
People who liked Monster-in-Law almost
certainly will not enjoy it. But it can be fun to watch them squirm so invite
them anyway.