Hi all,
I recently found an interesting article about AI on GNU/Linux.ch, "Zum Wochenende: Dream Machines" (in German, though):
Artificial intelligence is the most beautiful and at the same time most threatening concept in the world.
It frightens and impresses at the same time. In principle, it means the simulation of mental processes by all means,
but in general it is one or two forms of computer simulation […]
(translation via “TranslateLocally for Firefox” add-on)
The article also provides a download link for a very interesting booklet (132 pages) as a PDF file.
Here the “fun” begins.
I downloaded the PDF with Firefox this way: I opened the respective link and the PDF was displayed in a new tab.
Then I “printed” it as a PDF file from within Firefox. This resulted in a 22.8 MB PDF.
Hmm, I know that a direct download with wget often enough gets me a smaller PDF. So I also downloaded it with wget:
wget "http://worrydream.com/refs/Nelson-ComputerLibDreamMachines1975.pdf"
The direct download yielded a PDF of just 15 MB. Well, that's better indeed.
From experience I know that running the ghostscript command on a PDF can also reduce the file size. In the past I used ghostscript for combining two PDFs, like so:
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=combine.pdf -dBATCH [FILE1].pdf [FILE2].pdf
I also know that reducing the size of a PDF this way does not require combining two files; it works on a single file, too.
So I tried it on the two PDFs from above.
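For the single-file case, a minimal sketch of a size-oriented call might look like the following. The helper name shrink_pdf and the /ebook default are assumptions for illustration, not something from the article or the commands in this post. -dPDFSETTINGS is a documented pdfwrite option whose presets trade image quality against file size; whether it helps for this particular booklet is untested here.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): rewrite one PDF with
# an explicit pdfwrite quality preset.
# Usage: shrink_pdf INPUT.pdf OUTPUT.pdf [PRESET]
# PRESET is one of Ghostscript's -dPDFSETTINGS values:
#   /screen, /ebook, /printer, /prepress, /default
shrink_pdf() {
    if [ "$#" -lt 2 ]; then
        echo "usage: shrink_pdf INPUT.pdf OUTPUT.pdf [PRESET]" >&2
        return 2
    fi
    src=$1
    dst=$2
    preset=${3:-/ebook}   # assumption: /ebook as a middle-ground default

    # Same device as in the combine command, plus the quality preset.
    gs -dNOPAUSE -dBATCH -dQUIET \
       -sDEVICE=pdfwrite \
       -dPDFSETTINGS="$preset" \
       -sOutputFile="$dst" \
       "$src"
}

# Example call (input name from this post, output name is made up):
# shrink_pdf Nelson-ComputerLibDreamMachines1975.pdf small.pdf /ebook
```

The runs shown further down use the plain command without any preset, so their results say nothing about how this variant would behave.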
Yet there is a difference:
file alt_Nelson-ComputerLibDreamMachines1975.pdf Nelson-ComputerLibDreamMachines1975.pdf
alt_Nelson-ComputerLibDreamMachines1975.pdf: PDF document, version 1.5 # via print function in firefox
Nelson-ComputerLibDreamMachines1975.pdf: PDF document, version 1.6 (zip deflate encoded) # directly via wget
… and there is a difference when trying to shrink them with ghostscript:
The first one (via Firefox's print-to-PDF function) ran through smoothly:
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=combine.pdf -dBATCH alt_Nelson-ComputerLibDreamMachines1975.pdf
GPL Ghostscript 9.55.0 (2021-09-27)
Copyright (C) 2021 Artifex Software, Inc. All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Processing pages 1 through 132.
Page 1
Page 2
Page 3
[...]
… but combine.pdf got bigger instead of smaller.
The second one (directly via wget) yields a huge number of status messages but finally also gets the job done (on the 2nd attempt, though).
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=2combine.pdf -dBATCH Nelson-ComputerLibDreamMachines1975.pdf
GPL Ghostscript 9.55.0 (2021-09-27)
Copyright (C) 2021 Artifex Software, Inc. All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Processing pages 1 through 132.
[...]
Loading CourierNewPS-BoldMT font from /usr/share/fonts/truetype/msttcorefonts/Courier_New_Bold.ttf... 10056724 8550450 18765336 16743137 4 done.
Can't find (or can't open) font file /usr/share/ghostscript/9.55.0/Resource/Font//usr/share/gho.
Can't find (or can't open) font file Candara-Italic.
Loading Candara-Italic font from /usr/share/fonts/truetype/litefonts/Candarai.ttf... 10076924 8561114 19648940 17571273 4 done.
Substituting font NewCenturySchlbk-Bold for CenturySchoolbook-Bold.
Page 122
Substituting font Helvetica-Bold for SimSun,Bold.
Substituting font Helvetica for SimSun.
Substituting font Helvetica-Oblique for SimSun,Italic.
Substituting font NewCenturySchlbk-BoldItalic for CenturySchoolbook-BoldItalic.
Substituting font NewCenturySchlbk-Roman for CenturySchoolbook.
[...]
Its output also got bigger instead of smaller (17.8 MB).
Summary:
22 MB -> 26 MB
15 MB -> 17 MB
ll
total 107M
-rw-rw-r-- 1 rosika rosika 17M Apr 17 17:49 2combine.pdf
-rw-rw-r-- 1 rosika rosika 22M Apr 16 17:06 alt_Nelson-ComputerLibDreamMachines1975.pdf
-rw-rw-r-- 1 rosika rosika 26M Apr 17 18:10 combine.pdf
-rw-rw-r-- 1 rosika rosika 15M Aug 10 2013 Nelson-ComputerLibDreamMachines1975.pdf
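To put the growth into relative terms, here is a small sketch that turns a before/after pair into a percentage. The function name pct_growth is made up for this illustration, and the inputs are the rounded MB values from the summary, so the results are only approximate.

```shell
# pct_growth BEFORE AFTER
# Prints the size change from BEFORE to AFTER as a whole-number percentage
# (integer arithmetic, so the fractional part is dropped).
pct_growth() {
    before=$1
    after=$2
    echo $(( (after - before) * 100 / before ))
}

pct_growth 22 26   # Firefox print-to-PDF copy, rounded sizes in MB
pct_growth 15 17   # wget copy, rounded sizes in MB
```

With these rounded values, the Firefox copy grows by roughly 18 percent and the wget copy by roughly 13 percent. Feeding in exact byte counts (for example from GNU stat -c %s FILE) would give more precise figures.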
I guess this unusual behaviour has something to do with strange fonts used in the original PDF. O.K., if I accept that as an explanation, this question still remains:
Why is so much font substitution going on in only one PDF but not in the other?
I guess the font substitution shows up in the newly created 2combine.pdf.
Perhaps it is because of the different types of PDFs:
file *
2combine.pdf: PDF document, version 1.7
alt_Nelson-ComputerLibDreamMachines1975.pdf: PDF document, version 1.5
combine.pdf: PDF document, version 1.7, 132 pages
Nelson-ComputerLibDreamMachines1975.pdf: PDF document, version 1.6 (zip deflate encoded)
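Since file only reports the PDF version, one way to look more closely is to list the fonts and images inside each file. The sketch below assumes the poppler-utils package (which ships pdffonts and pdfimages) is installed; the helper name inspect_pdf is hypothetical. Fonts that a PDF references without embedding them might explain the substitution messages, but that is a guess that would need checking against the actual output.

```shell
# Hypothetical helper: list the fonts and images of one PDF (poppler-utils).
# Usage: inspect_pdf FILE.pdf
inspect_pdf() {
    if [ "$#" -ne 1 ]; then
        echo "usage: inspect_pdf FILE.pdf" >&2
        return 2
    fi
    echo "== fonts in $1 =="
    pdffonts "$1"          # the "emb" column shows whether a font is embedded
    echo "== images in $1 =="
    pdfimages -list "$1"   # one row per image, including its encoding and size
}

# Example calls (file names from this post):
# inspect_pdf alt_Nelson-ComputerLibDreamMachines1975.pdf
# inspect_pdf Nelson-ComputerLibDreamMachines1975.pdf
```

Comparing the two listings side by side should show whether the Firefox copy and the wget copy really differ in their fonts, their images, or both.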
What do you think of it?
Many greetings from Rosika