What is your best recommendation for a tool to see/remove JS content from a PDF file in Linux?

Hello Friends

Assuming you have a pdf file and you have confirmed it has JavaScript content

Question

  • What is your best recommendation for a tool to see and remove JS content from a pdf file in Linux?

The idea is to view the JS first: if it looks fine or safe, it can be left in peace; otherwise it should be removed and the same PDF file saved without it.

Thanks in advance

1 Like

I wonder: if you converted it to a PostScript file (.ps), would that remove it?
I think there is a program called pdf2ps.

" convert it from pdf to ps and back to pdf. Basically, it prints the file, which (I believe) will remove all meta information. I imagine that pdf2ps doesn’t have a built in javascript library, so I think it is safe to assume that any malicious javascript will be securely removed in this process."
Yes that seems OK

Another idea. You can use the qpdf program to inspect a pdf and remove things manually.

Note :
There is a program called diffpdf which compares 2 pdf files… like diff compares 2 txt files. You could use it to check if things have been removed.
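
For what it's worth, the manual inspection with qpdf could look roughly like the sketch below. It assumes qpdf is installed; the function names and file names are made up for illustration. The `--qdf` mode asks qpdf to write an uncompressed, editor-friendly copy, so a plain grep can show the lines that carry names usually tied to scripts (/JS, /JavaScript) or to automatic triggers (/OpenAction, /AA).

```shell
# Sketch only: make_qdf_copy and list_js_markers are hypothetical helper names.

# Write an uncompressed, human-readable copy of a PDF (needs qpdf).
make_qdf_copy() {
    qpdf --qdf --object-streams=disable "$1" "$2"
}

# Print "line-number:line" for every line that mentions a script-related name.
# -a makes grep treat the (partly binary) PDF as text.
list_js_markers() {
    grep -a -n -E '/(JavaScript|JS|OpenAction|AA)([^A-Za-z]|$)' "$1"
}

# Usage (file names are placeholders):
#   make_qdf_copy input.pdf readable.pdf
#   list_js_markers readable.pdf
```

The readable copy can then be opened in a text editor to read the script itself. If you edit it by hand, qpdf ships a helper called fix-qdf that is meant to repair the offsets afterwards; I have not verified how well that works for removing JS.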

5 Likes

Hello Neville

I wonder: if you converted it to a PostScript file (.ps), would that remove it?
I think there is a program called pdf2ps.

I have no idea whether that is possible … @Rosika, any thoughts about this? Perhaps it is related to the pdfinfo command you suggested at:

Continuing

" convert it from pdf to ps and back to pdf. Basically, it prints the file, which (I believe) will remove all meta information. I imagine that pdf2ps doesn’t have a built in javascript library, so I think it is safe to assume that any malicious javascript will be securely removed in this process ."

Where did you take that statement from? It is in quotation marks.

Yes that seems OK

If some member of this network can confirm that, it would be great.

Another idea. You can use the qpdf program to inspect a pdf and remove things manually.

Have you tested it?

Note :
There is a program called diffpdf which compares 2 pdf files… like diff compares 2 txt files. You could use it to check if things have been removed.

Huge thanks for that command … but in this scenario only 1 file exists, so there is no other file to compare against. Still, this command is now in my toolbox.

As usual, huge thanks for your polite support, friend!

3 Likes

It came from a Google AI summary.

Go ahead and use pdf2ps… then convert the .ps file back to .pdf with ps2pdf
The JavaScript should not survive the conversion.
Compare the result with the original using diffpdf
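
A rough sketch of that round trip, assuming the Ghostscript tools pdf2ps and ps2pdf are installed. The function name `roundtrip_pdf` and the file names are made up for illustration, and I have not confirmed that every kind of script is dropped this way, so do check the result.

```shell
# Sketch only: roundtrip_pdf is a hypothetical helper name.
# Converts PDF -> PostScript -> PDF, writing the result to a NEW file
# so the original stays available for comparison.
roundtrip_pdf() {
    in_pdf=$1
    out_pdf=$2
    if [ ! -f "$in_pdf" ]; then
        echo "no such file: $in_pdf" >&2
        return 1
    fi
    tmp_ps=$(mktemp) || return 1
    # ps2pdf only runs if pdf2ps succeeded
    pdf2ps "$in_pdf" "$tmp_ps" && ps2pdf "$tmp_ps" "$out_pdf"
    status=$?
    rm -f "$tmp_ps"
    return "$status"
}

# Usage (file names are placeholders):
#   roundtrip_pdf original.pdf cleaned.pdf
```

Because the output goes to a second file, there are then two PDFs to open side by side in diffpdf.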

2 Likes

Hello Neville

It came from a Google AI summary.

Understood

Go ahead and use pdf2ps … then convert the .ps file back to .pdf with ps2pdf
The JavaScript should not survive the conversion.
Compare the result with the original using diffpdf

Now it makes more sense, with both commands working together.

Thank You!!!

1 Like

@Manuel_Jordan :

Hi Manuel, :waving_hand:

I was always of the opinion that flattening would do the job the easiest way.
I’m not totally sure about it, so you’d have to try it out for yourself and then check the outcome to see whether JS was removed.

  • Printing the PDF to a new PDF using the “Print to PDF” feature (available in most Linux PDF viewers or web browsers) effectively flattens the document.

  • This process should remove JavaScript, as the printer driver does not transfer code, only the visual representation. At least that’s my understanding of it.

Another method could be using the ghostscript command.
Personally I use it a lot for converting PDFs to monochrome or to combine PDFs and the like.
But you might use it for getting rid of JS, too.
Just try the following and then check the result:

  • gs -o output.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress input.pdf

Should act the same way as “Print to PDF”.
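
One hedged way to "check the result" is to count the JavaScript-related names before and after. The helper name `js_marker_count` is made up for illustration. Keep in mind that a plain grep on a raw PDF can miss names hidden inside compressed object streams, so a count of 0 is a hint rather than proof.

```shell
# Sketch only: js_marker_count is a hypothetical helper name.
# Counts occurrences of the /JS and /JavaScript names in a file.
js_marker_count() {
    grep -a -o -E '/(JavaScript|JS)([^A-Za-z]|$)' "$1" | wc -l
}

# Usage (file names are placeholders):
#   gs -o output.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress input.pdf
#   js_marker_count input.pdf
#   js_marker_count output.pdf
```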

@nevj’s suggestion should work as well. :+1:

I checked it with the help of perplexity:

  • Converting a PDF to PostScript using pdf2ps and then back to PDF with ps2pdf will also remove JavaScript and most interactive content. The PostScript format does not support embedded JavaScript, so the conversion process strips it out.
  • This is a reliable method, but it may degrade some aspects of the PDF, such as hyperlinks, bookmarks, or advanced layout features.

Hope it helps.

Many greetings from Rosika :slightly_smiling_face:

4 Likes