Identification in PDFs

If you need to create a PDF with no embedded identification, it may not be enough to simply refrain from typing your name. For example:

$ strings foo.pdf | egrep -i '(hans|fugal)'
/PTEX.FileName (./0_Users_fugalh_research_foo_fig1.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./1_Users_fugalh_research_foo_fig2.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./2_Users_fugalh_research_foo_fig3.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./3_Users_fugalh_research_foo_fig4.pdf)
/Author (Hans Fugal)
/PTEX.FileName (./4_Users_fugalh_research_foo_fig5.pdf)
/Author (Hans Fugal)

The /PTEX lines are from pdftex and carry my username in the recorded file names. The /Author lines originated from gnuplot, which wrote this metadata into each figure:

/Title (fig1.pdf)
/Author (Hans Fugal)
/Creator (gnuplot 4.2 patchlevel 4 )

Removing the offending lines didn’t hurt the PDF in this case. So if you must anonymize a PDF (e.g. to submit a paper for blind review), be sure to check for hidden identification. Of course, most reviewers wouldn’t go digging for it, but you will rest easy knowing the file is truly anonymous.
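
If you would rather script the check than eyeball the strings output, something like the following Python could work. It is only a rough sketch and not what I did above (I deleted the lines by hand): instead of deleting, it overwrites each value with spaces of the same length, so the file size and the byte offsets inside it stay unchanged. The names find_identifying, blank_identifying and KEYS are made up for this example, and it assumes the metadata sits in the file as plain, uncompressed literal strings like the ones strings printed above.

```python
import re

# Keys taken from the strings output above. This list is an assumption:
# extend it (e.g. with b"/Title" or b"/Creator") for other tools.
KEYS = (b"/Author", b"/PTEX.FileName")

# Matches "/Key (value)" where the value is a simple literal string
# with no nested parentheses and no backslash escapes.
PATTERN = re.compile(
    rb"(" + b"|".join(re.escape(k) for k in KEYS) + rb")(\s*)\(([^()\\]*)\)"
)


def find_identifying(data: bytes) -> list:
    """Return a (key, value) pair for every matching metadata entry."""
    return [(m.group(1), m.group(3)) for m in PATTERN.finditer(data)]


def blank_identifying(data: bytes) -> bytes:
    """Overwrite each matched value with spaces of the same length."""
    def repl(m):
        # Keep the key, the whitespace, and the parentheses; blank the value.
        return m.group(1) + m.group(2) + b"(" + b" " * len(m.group(3)) + b")"
    return PATTERN.sub(repl, data)
```

To use it, read the PDF in binary mode with open(path, "rb"), print what find_identifying reports, and write the result of blank_identifying to a new file. Anything stored in a compressed stream or in some other form will not be caught, so run strings on the output again afterwards.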

