Wednesday, 26 June 2019

linux - Using Ghostscript to convert multi-page PDF into single JPG?


I know Ghostscript can convert PDFs to JPGs, and in the case of a multi-page PDF, can rip each page to an individual JPG. But is it possible to have it rip them to one JPG, so that the pages are pasted below each other, e.g. the top half of the JPG is page 1, the bottom half is page 2? Or do I have to use another program (and can ImageMagick do this?) to combine the JPG pages into one image?



Answer



Yes, you'll have to convert each PDF page into its own JPG file (Ghostscript can do that).


Then stitch together the resulting JPG files using another program (ImageMagick or GraphicsMagick can do that with their montage commands).


I'm not aware of any software which can do that in one go.


PDF-to-JPG conversion (with Ghostscript): You'll want the best possible result, so tweak the command-line options until they work for you. I'd start with this:


gswin32c.exe ^
-dBATCH ^
-dNOPAUSE ^
-dSAFER ^
-sDEVICE=jpeg ^
-dJPEGQ=95 ^
-r600x600 ^
-sOutputFile=c:/path/to/jpeg-dir/pdffile-%03d.jpeg ^
c:/path/to/pdffile.pdf

This will create JPGs called pdffile-001.jpeg, pdffile-002.jpeg etc. The parameter -dJPEGQ=95 sets the JPEG quality to 95 (out of 100), and -r600x600 sets the resolution to 600x600 dpi. If you run the command from a batch file instead of typing it at the prompt, write %%03d in place of %03d. You may additionally need to control the page size of the resulting JPGs in case your Ghostscript's default doesn't fit your needs:


gswin32c.exe ^
-dBATCH ^
-dNOPAUSE ^
-dSAFER ^
-sDEVICE=jpeg ^
-dJPEGQ=95 ^
-r600x600 ^
-dPDFFitPage ^
-dFIXEDMEDIA ^
-dDEVICEWIDTHPOINTS=800 ^
-dDEVICEHEIGHTPOINTS=600 ^
-sOutputFile=c:/path/to/jpeg-dir/pdffile-%03d.jpeg ^
c:/path/to/pdffile.pdf

or


gswin32c.exe ^
-dBATCH ^
-dNOPAUSE ^
-dSAFER ^
-sDEVICE=jpeg ^
-dJPEGQ=95 ^
-r600x600 ^
-dPDFFitPage ^
-dFIXEDMEDIA ^
-sDEFAULTPAPERSIZE=a4 ^
-sOutputFile=c:/path/to/jpeg-dir/pdffile-%03d.jpeg ^
c:/path/to/pdffile.pdf
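
The pixel size of each JPG follows from the page size in points and the -r resolution, since a PostScript point is 1/72 inch. A minimal Python sketch of that arithmetic (the function name page_pixels is made up for illustration, and Ghostscript's own rounding may differ by a pixel):

```python
def page_pixels(width_pt, height_pt, dpi):
    """Approximate pixel size of a page rendered at `dpi`.

    Assumes 72 points per inch; rounding to the nearest pixel is an
    assumption and may not match Ghostscript exactly.
    """
    scale = dpi / 72.0
    return round(width_pt * scale), round(height_pt * scale)


if __name__ == "__main__":
    # 800x600 pt media from the second command, rendered with -r600x600
    print(page_pixels(800, 600, 600))
    # A4 (595x842 pt) at the same resolution
    print(page_pixels(595, 842, 600))
```

At 600 dpi even a modest page becomes several thousand pixels on each side, so a montage of many pages can get very large. Lowering -r is the easy way to keep it manageable.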

Multiple-to-single JPG stitching with montage (ImageMagick or GraphicsMagick): The montage command (the examples here use ImageMagick's) lets you control the tiling pattern. With -tile 4x3, for example, you'd get this imposition layout:


1   2   3   4
5   6   7   8
9  10  11  12

You could use this command to stitch together 12 individual JPGs into one:


montage ^
-border 0 ^
-tile 4x3 ^
c:/path/to/jpeg-dir/pdffile-*.jpeg ^
c:/path/to/final.jpg
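
To see where a given page lands in such a grid, the row-major numbering can be worked out directly. A small Python sketch (tile_position is a hypothetical helper, not part of ImageMagick), assuming montage fills tiles left to right and top to bottom as in the 4x3 layout above:

```python
def tile_position(page, columns):
    """Return the 1-based (row, column) of `page` in a row-major grid."""
    if page < 1 or columns < 1:
        raise ValueError("page and columns must be positive")
    row, col = divmod(page - 1, columns)
    return row + 1, col + 1


if __name__ == "__main__":
    # Corners and the first wrap-around of a 4-column layout
    for page in (1, 4, 5, 12):
        print(page, tile_position(page, 4))
```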

Of course, montage has dozens of additional parameters which allow you to determine background, spacing, offsets, decoration, labels, rotation, cropping, caption etc. for the input and the resulting JPG.




EDIT: (I had wanted to give this hint already in my original answer, but forgot.) By default, montage uses tile sizes of 120x120 pixels. If you want to keep the original page size for each tile, you have to add -geometry to the command line. Note that montage counts the -geometry and -border values in pixels, so they equal points only for pages rendered at 72 dpi. Assuming you had A4 (595x842 pt) pages in your PDF and want to keep that size, while also adding a spacing of 11 pt horizontally and 22 pt vertically between tiles (plus a 4 pt wide gray border around each tile), do this:


montage ^
-border 4 ^
-tile 4x3 ^
-geometry 595x842+11+22 ^
c:/path/to/jpeg-dir/pdffile-*.jpeg ^
c:/path/to/final.jpg



EDIT 2: (I missed yet another important hint.) If you do not want the stitching/montage step to lose the image quality your PDF-to-JPG conversion produced, also add the -quality 100 parameter to your command line, like this:


montage ^
-border 4 ^
-tile 4x3 ^
-geometry 595x842+11+22 ^
-quality 100 ^
c:/path/to/jpeg-dir/pdffile-*.jpeg ^
c:/path/to/final.jpg
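
For the layout asked about in the question (pages stacked below each other), the two steps can be scripted. The following Python sketch builds the argument lists and offers one function to run them. It assumes the Ghostscript and ImageMagick executables are reachable as gswin32c.exe and montage, that montage accepts -tile 1x for a single column and -geometry +0+0 to keep each tile at its rendered size, and that 150 dpi is acceptable. The helper names, the "page" file prefix and the dpi default are illustrative choices, not anything prescribed above:

```python
import glob
import subprocess


def ghostscript_command(pdf_path, out_pattern, dpi=150, gs="gswin32c.exe"):
    """Argument list for rendering every PDF page to its own JPEG."""
    return [
        gs, "-dBATCH", "-dNOPAUSE", "-dSAFER",
        "-sDEVICE=jpeg", "-dJPEGQ=95", f"-r{dpi}x{dpi}",
        f"-sOutputFile={out_pattern}", pdf_path,
    ]


def montage_command(jpeg_paths, out_path, montage="montage"):
    """Argument list for stacking the JPEGs in one column."""
    return [
        montage, "-border", "0",
        "-tile", "1x",           # assumed: one column, rows as needed
        "-geometry", "+0+0",     # assumed: no resizing, no spacing
        "-quality", "100",
        *sorted(jpeg_paths),     # zero-padded names sort in page order
        out_path,
    ]


def stack_pages(pdf_path, out_path, prefix="page"):
    """Run both steps; raises CalledProcessError if either tool fails."""
    subprocess.run(
        ghostscript_command(pdf_path, f"{prefix}-%03d.jpeg"), check=True)
    pages = glob.glob(f"{prefix}-*.jpeg")
    subprocess.run(montage_command(pages, out_path), check=True)
```

Calling stack_pages("pdffile.pdf", "final.jpg") would run both tools in sequence. Because the arguments are passed as a list rather than through cmd.exe, the %03d in the output pattern needs no escaping here.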
