Converting a multi-page PDF to a multi-page PDF

Discussions on NConvert - the command line tool for image conversion and manipulation

IxenPDF
Posts: 61
Joined: Tue Aug 02, 2016 8:13 am

Re: Converting a multi-page PDF to a multi-page PDF

Post by IxenPDF »

cday wrote: Tue Mar 30, 2021 8:47 pm I only had time to read your post quickly this morning, ..
No problem.
cday wrote: Tue Mar 30, 2021 8:47 pm Otherwise, could you please upload the source and output files used so that I can have a look sometime tomorrow.
Thank you. It's attached.


Edit:
Your source file is 24-bit colour; you may need to convert to black and white before outputting to a 1-bit TIFF, I'm not immediately sure...

I'm attaching a 3-page 1-bit Fax compression PDF that I've created as a test file. The pages are in the wrong order due to a quirk in the XnView MP creation interface. Please test your code with that file to check whether you get the correct result.

M-PDF.pdf
When I analyse your 3-page original file from "Mon Mar 29, 2021 8:13 pm" with the same method, I cannot see that it is "fax compressed". It is reported as uncompressed, and also as 24-bit colour (?):

Code: Select all

$ ./nconvert -info M-PDF.pdf
** NCONVERT v7.39 (c) 1991-2019 Pierre-E Gougelet (Feb 25 2020/13:19:45) **
        Version for Linux x86 (X11)  (All rights reserved)
** This is freeware software (for non-commercial use)

Over...

M-PDF.pdf : Success
    Format               : Portable Document Format
    Name                 : pdf
    Compression          : Uncompressed
    Width                : 359
    Height               : 359
    Components per pixel : 3
    Bits per component   : 8
    Depth                : 24
    # colors             : 16777216
    Color model          : RGB
    Bytes Per Plane      : 1077
    Orientation          : Top Left
    Xdpi                 : 72
    Ydpi                 : 72
    Page(s)              : 3
    Info:
      PhotometricInterpretation: 2
      PlanarConfiguration: 1
      SamplesPerPixel: 3
      DateTime: 2021:03:31 07:35:38
      Software: GPL Ghostscript 9.27
    Metadata             : ( EXIF ICC )
    Color Profile        : Artifex Software sRGB ICC Profile
Strange. Am I using -info the wrong way?
cday
XnThusiast
Posts: 4173
Joined: Sun Apr 29, 2012 9:45 am
Location: Cheltenham, U.K.

Re: Converting a multi-page PDF to a multi-page PDF

Post by cday »

The immediate question is whether you are now able to output a 300 DPI file after changing the order of the -dpi and -xall terms?

I'll check the test file I sent you later. It was produced quickly using XnView MP to combine some of my test images, and it may be that the output file didn't have the properties I intended.

Could you please confirm the intended colour depth and compression type of the source file you wish to convert, and the output file you wish to produce?

Post by IxenPDF »

cday wrote: Wed Mar 31, 2021 7:05 am The immediate question is whether you are now able to output a 300 DPI file after changing the order of the -dpi and -xall terms?
The answer is no:

Code: Select all

$ ./nconvert -brightness 124 -gamma 0.72 -dpi 300 -xall -out pdf -multi -c 0 -o XnVOut.pdf M-PDF3-1.pdf
** NCONVERT v7.39 (c) 1991-2019 Pierre-E Gougelet (Feb 25 2020/13:19:45) **
        Version for Linux x86 (X11)  (All rights reserved)
** This is freeware software (for non-commercial use)

Over...
Conversion of M-PDF3-1.pdf into XnVOut_1.pdf OK
Over...
Conversion of M-PDF3-1.pdf into XnVOut_1.pdf OK
Over...
Conversion of M-PDF3-1.pdf into XnVOut_1.pdf OK
user@user:~/NConvert_von_XnView
$ ./nconvert -info XnVOut_1.pdf
** NCONVERT v7.39 (c) 1991-2019 Pierre-E Gougelet (Feb 25 2020/13:19:45) **
        Version for Linux x86 (X11)  (All rights reserved)
** This is freeware software (for non-commercial use)

Over...

XnVOut_1.pdf : Success
    Format               : Portable Document Format
    Name                 : pdf
    Compression          : Uncompressed
    Width                : 359
    Height               : 359
    Components per pixel : 3
    Bits per component   : 8
    Depth                : 24
    # colors             : 16777216
    Color model          : RGB
    Bytes Per Plane      : 1077
    Orientation          : Top Left
    Xdpi                 : 72
    Ydpi                 : 72
    Page(s)              : 3
    Info:
      PhotometricInterpretation: 2
cday wrote: Wed Mar 31, 2021 7:05 am Could you please confirm the intended colour depth and compression type of the source file you wish to convert, and the output file you wish to produce?
What I really want is to convert various PDFs with differing characteristics using these options:

Code: Select all

-brightness 124 -gamma 0.72 -xall -out pdf
And to do this in such a way that all other properties remain untouched.

So if the source PDF is 150 dpi, I want the target PDF to be 150 dpi as well.
If the source PDF has fax compression, I want fax compression in the target PDF too.
If the source PDF is grey-scale, I want grey-scale in the target PDF too.
If the source PDF has a depth of 8, I want a depth of 8 in the target PDF.
If the source PDF has a depth of 24, I want a depth of 24 in the target PDF.
And so on...

Would this be possible?

Post by IxenPDF »

Here is another example:

Code: Select all

./nconvert -dpi 30 -xall -out pdf -multi -c 0 -o XnVOut.pdf  M-PDF3-1.pdf

Code: Select all

./nconvert -info XnVOut.pdf
Output:

Code: Select all

XnVOut.pdf : Success
  [...]
    Xdpi                 : 72
    Ydpi                 : 72
[...]
My expectation was:

Code: Select all

XnVOut.pdf : Success
  [...]
    Xdpi                 : 30
    Ydpi                 : 30
[...]
Summary:
I have converted a source PDF into a 30 dpi PDF.
But "-info" again tells me that it is still 72 dpi.

But when I open the source pdf and the target pdf and check the dpi by eye, it looks like 30dpi.
Last edited by IxenPDF on Wed Mar 31, 2021 8:03 am, edited 1 time in total.

Post by cday »

I'll look into the DPI issue when I have time, hopefully later today.
IxenPDF wrote: Wed Mar 31, 2021 7:32 am What I really want is to convert various PDFs with differing characteristics using these options:

Code: Select all

-brightness 124 -gamma 0.72 -xall -out pdf
And to do this in such a way that all other properties remain untouched.

So if the source PDF is 150 dpi, I want the target PDF to be 150 dpi as well.
If the source PDF has fax compression, I want fax compression in the target PDF too.
If the source PDF is grey-scale, I want grey-scale in the target PDF too.
If the source PDF has a depth of 8, I want a depth of 8 in the target PDF.
If the source PDF has a depth of 24, I want a depth of 24 in the target PDF.
And so on...

I would make the following quick and possibly not entirely precise comments:

The output file DPI must always be set in the code used; it can't automatically follow the DPI value of the source file.

The output file compression type must always be specified; it doesn't automatically follow that of the input file.

Fax compression is only applicable to 1-bit black and white images.

The bit depth of a colour or grayscale source file may be preserved, I think.

Brightness and gamma settings would only be applicable to colour and possibly grayscale files, not to 1-bit source files if you have any.

Post by IxenPDF »

cday wrote: Wed Mar 31, 2021 7:57 am I'll look into the DPI issue when I have time, hopefully later today.
Thank you.

Post by IxenPDF »

OK I have found this in the meantime:
viewtopic.php?f=34&t=12410&p=47747&hili ... nfo#p47747


Summary:

-info cannot report the correct DPI because:

- a PDF can contain several pictures with different DPIs
- a PDF with no picture is in vector format (really?)

Post by IxenPDF »

I have made further experiments:

source file:

Code: Select all

XnVOut_2.pdf  -- 256kb
Command:

Code: Select all

$ ./nconvert -xall -out pdf -multi -c 0 -o XnVOut.pdf XnVOut_2.pdf
target file:

Code: Select all

XnVOut.pdf  -- 1456kb
The source file is also uncompressed.
The command does not change anything.
So I expected the target file to be about the same size as the source file, but it is much bigger.
Why?
And is there any way to check the DPI of the pictures in the target file, to find out whether the DPI has remained unchanged?

Edit:
I have tried the same command again, that is, this one from earlier in this post:

Code: Select all

$ ./nconvert -xall -out pdf -multi -c 0 -o XnVOut.pdf XnVOut_2.pdf
... but with several DPI values.
Specifically, I tried these variants:

Code: Select all

$ ./nconvert -xall -dpi 300 -out pdf -multi -c 0 -o XnVOut.pdf XnVOut_2.pdf
$ ./nconvert -xall -dpi 72 -out pdf -multi -c 0 -o XnVOut.pdf XnVOut_2.pdf
And I have found that the size of the target PDF is exactly 1456kb (see above) when I use "-dpi 72". So I guess that when the -dpi option is not included in the command, all pictures in the PDF are converted to 72 dpi.


@xnview
Question 1: Is 72 dpi a good choice as the default? Why?
Question 2: Couldn't this be changed so that the DPI of all pictures in a PDF remains untouched?

@xnview
Suggestion:
Please add a new -dpi option specifically for formats that can contain more than one picture (TIFF, PDF).
Idea for the name: -mdpi (multipicture dpi).
The property value should look like this: 150-300 (meaning that the file contains several pictures whose DPI ranges from 150 to 300).

If there are further options for -out pdf that behave the same way, I suggest also adding new "-m..." options for them.

What do you think?
Last edited by IxenPDF on Wed Mar 31, 2021 11:34 am, edited 3 times in total.

Post by cday »

In your code, -c 0 is the option for the output PDF file to be uncompressed. If you want the output file to be compressed, you need to use another compression option that is supported for the bit depth of the output images...
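As a rough sanity check on the file sizes reported above: the pixel data of an uncompressed raster page takes about width x height x bytes-per-pixel, so the size grows with the square of the DPI at which the pages are rasterised. The sketch below only does that arithmetic. The function name `raw_bytes` is made up for illustration, and a real PDF adds some container overhead on top of the figure it prints.

```shell
#!/bin/sh
# raw_bytes WIDTH HEIGHT DEPTH_BITS PAGES
# Estimates the uncompressed pixel data in bytes.
# Hypothetical helper: it ignores PDF container overhead and metadata.
raw_bytes() {
    width=$1; height=$2; depth=$3; pages=$4
    # Bytes per row, rounded up to a whole byte (matters for 1-bit images).
    row=$(( (width * depth + 7) / 8 ))
    echo $(( row * height * pages ))
}

# Example: the 359x359, 24-bit, 3-page file reported by -info in this thread.
raw_bytes 359 359 24 3   # prints 1159929
```

Doubling the DPI doubles both width and height, so the estimate goes up by a factor of four.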

You can check the properties of any file easily by opening the file in XnView MP, and then using the 'i' for information icon on the viewer toolbar.

I'll have to return to this later in the day, possibly not until this evening.
Question 1: Is 72 dpi a good choice as the default? Why?
Question 2: Couldn't this be changed so that the DPI of all pictures in a PDF remains untouched?
The 72 DPI value is set in Ghostscript, and there is no way that the DPI of an input file can be automatically maintained, unless the value is first read using -info and then, somehow, entered into a variable that can be used in the NConvert code line.
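The "read the value with -info, then put it in a variable" idea could be sketched in shell roughly as follows. It assumes the -info output keeps the `Label : value` layout shown earlier in this thread; the helper name `info_field` is invented here. Bear in mind that for a PDF the reported value may simply be the rasterising default rather than a property of the file itself.

```shell
#!/bin/sh
# info_field NAME: read nconvert -info text on stdin and print the value of the
# first line whose label is NAME.
# Hypothetical helper; assumes each property is printed as "Label : value".
info_field() {
    awk -F ':' -v name="$1" '
        {
            label = $1
            gsub(/^[ \t]+|[ \t]+$/, "", label)    # trim spaces around the label
            if (label == name) {
                value = $2
                gsub(/^[ \t]+|[ \t]+$/, "", value) # trim spaces around the value
                print value
                exit
            }
        }'
}

# Possible use, with the file names from this thread:
#   dpi=$(./nconvert -info M-PDF3-1.pdf | info_field Xdpi)
#   ./nconvert -dpi "$dpi" -xall -out pdf -multi -c 0 -o XnVOut.pdf M-PDF3-1.pdf
```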

Post by cday »

@IxenPDF

It will be later in the day before I can respond to the now multiple issues raised: sitting outside in some rare fine weather I can think, which is often beneficial in a situation like this, but I can't use my laptop as the screen isn't bright enough.

I wouldn't be optimistic that Pierre will see, let alone have time to reply to, the GS points you have made, but I think in general what you are requesting isn't likely to be possible. You are using Linux, aren't you? Do you have detailed experience with bash, which could possibly be used to create scripts as a workaround for some of your needs, if it is worth the effort?
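A bash workaround of the kind mentioned above might look something like the following sketch. It only uses options that already appear in this thread and, following the earlier comment that brightness and gamma settings do not apply to 1-bit files, leaves those two out when the source depth is 1. The function name `build_opts` is hypothetical, the depth is assumed to be known by some other means, and the `-c 0` (uncompressed) choice is simply carried over from the earlier commands.

```shell
#!/bin/sh
# build_opts DEPTH: print the nconvert options to use for a source file of the
# given bit depth. Hypothetical helper; the option values are the ones used
# elsewhere in this thread.
build_opts() {
    depth=$1
    opts="-xall -out pdf -multi -c 0"
    # Only colour/grayscale sources get the brightness and gamma adjustment.
    if [ "$depth" -gt 1 ]; then
        opts="-brightness 124 -gamma 0.72 $opts"
    fi
    echo "$opts"
}

# Possible use (word splitting of the unquoted substitution is intended):
#   ./nconvert $(build_opts 24) -o XnVOut.pdf M-PDF3-1.pdf
```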
xnview
Author of XnView
Posts: 44817
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: Converting a multi-page PDF to a multi-page PDF

Post by xnview »

There is no way to get the DPI from a PDF file, because it's resolution independent.
Pierre.

Post by cday »

xnview wrote: Wed Mar 31, 2021 1:47 pm there is no way to get DPI from a PDF file, because it's resolution independent
Thank you for that clarification... :D

I take it that a PDF file containing only an image contains only the image matrix, with no dimensional information, which is logical. A PDF file containing an image on a page of a PDF document, on the other hand, necessarily also contains the size of the 'box' in which the image is placed, so the image has a DPI value, although that of course isn't relevant to image editing software?

Post by cday »

I have thought for some time that a short tutorial on working with multi-page PDF files would be a useful reference, and some of the points that have come out in this thread in relation to PDF image files could usefully be included, if I ever write one.

I think the relevance of what Pierre said above is as follows:

Whereas an image file with an extension such as JPEG contains an image as a bitmap, together with information that enables the image pixel dimensions, canvas size and DPI to be displayed, a PDF image file contains only a bitmap image. Only the image pixel dimensions can therefore be displayed; the image has no defined canvas size or DPI.

It follows that when the file properties of a PDF image file are displayed, using -info or the 'file properties' in GUI software, any 'DPI' value shown is an arbitrary value (usually 72) inserted for display by the file properties tool. In that case, one might wonder why '–' isn't displayed instead. It also follows that the displayed 'DPI' value doesn't change when the -dpi option is used in an attempt to alter it.

Edit:

If a PDF image file contains only an image of specified pixel dimensions, what determines the output image dimensions when it is opened ('rasterised') at a specified DPI value?

Isn't the answer that the output dimensions scale with the DPI, so that an image which opens at 1000x1000px at 72 DPI is likely to be 2000x2000px when rasterised at a DPI of 2x72 = 144?

The value '72' seems to be a value assigned by Ghostscript that is used to determine the pixel dimensions of the opened image, that value relating to the 'point' size used in printing of 1/72 inch.

When the opened image is saved with a file format such as JPEG, it then acquires a canvas size and DPI values.
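The relationship described above can be written as a one-line calculation: sizes in a PDF are measured in points of 1/72 inch, so a page rasterised at a given DPI comes out at roughly points x DPI / 72 pixels per side. A small sketch, using the made-up helper name `px_at_dpi` and assuming that the 359px test file from this thread corresponds to a 359-point page:

```shell
#!/bin/sh
# px_at_dpi POINTS DPI: pixels along one side of a page of POINTS points
# (1 point = 1/72 inch) when rasterised at DPI, rounded to the nearest pixel.
# Hypothetical helper for illustration only.
px_at_dpi() {
    echo $(( ($1 * $2 + 36) / 72 ))
}

px_at_dpi 359 72    # prints 359
px_at_dpi 359 144   # prints 718
px_at_dpi 359 300   # prints 1496
```

At 72 DPI one point maps to exactly one pixel, which would explain why 72 appears as the default.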

@IxenPDF

Given the way this now long thread has evolved, and the new insight provided above by Pierre into the 'DPI' value of a PDF file, I think it might be best if you review what you now need to know, and then post any questions that remain in a concise form...

Post by IxenPDF »

Thank you @cday

I want to prevent that by converting with the command ...

Code: Select all

./nconvert -brightness 124 -gamma 0.72 -dpi 300 -xall -out pdf -multi -o XnVOut.pdf M-PDF3-1.pdf
... the quality of my source document (whose properties are unknown) is degraded more than necessary.

How could the quality be degraded by converting?
I think that if the pictures in the document are 250 dpi and I convert them with the command "./nconvert -brightness 124 -gamma 0.72 -dpi 300 -xall -out pdf -multi -o XnVOut.pdf M-PDF3-1.pdf", the quality will degrade, because each DPI conversion causes a slight loss of quality.

The same applies to the type and amount of compression. If my source document uses a different compression method from the target document, I lose quality.

Perhaps this also applies to the "Depth" property?

cday wrote: Wed Mar 31, 2021 6:09 pm Given the way this now long thread has evolved, and the new insight provided above by Pierre into the 'DPI' value of a PDF file, I think it might be best if you review what you now need to know, and then post any questions that remain in a concise form...



Here are my questions:

Please note: I do not know anything about Ghostscript and its role in converting or showing the properties of the pictures that are in the PDF.
Please note: In my source PDFs all included pictures will have the same properties. I will not have pictures with different properties in the same PDF.
Please note: All questions refer to pdf's that contain images.

1. My wish would be to just run the command ...

Code: Select all

./nconvert -brightness 124 -gamma 0.72 -xall -out pdf -multi -o XnVOut.pdf M-PDF3-1.pdf
... and have all properties of the pictures in this PDF remain the same.

But I now know that this is not possible.

Question: Which of the following properties will be changed by the command above, and which will remain untouched? Are there other picture properties, not in the list below, that NConvert will also change?

Code: Select all

    Format               : Portable Document Format
    Name                 : pdf
    Compression          : Uncompressed
    Width                : 359
    Height               : 359
    Components per pixel : 3
    Bits per component   : 8
    Depth                : 24
    # colors             : 16777216
    Color model          : RGB
    Bytes Per Plane      : 1077
    Orientation          : Top Left
    Xdpi                 : 72
    Ydpi                 : 72
    Page(s)              : 3
    Info:
      PhotometricInterpretation: 2
      PlanarConfiguration: 1
      SamplesPerPixel: 3
      DateTime: 2021:04:01 09:14:49
      Software: GPL Ghostscript 9.27
    Metadata             : ( EXIF ICC )
    Color Profile        : Artifex Software sRGB ICC Profile
(So far I know that at least the Xdpi and Ydpi properties and the "Compression" property are affected, and I guess the Depth property as well.)


2. Question:
Which of the following properties show me the real properties of the pictures that are in the pdf and which of them are fantasy values?

Code: Select all

    Format               : Portable Document Format
    Name                 : pdf
    Compression          : Uncompressed
    Width                : 359
    Height               : 359
    Components per pixel : 3
    Bits per component   : 8
    Depth                : 24
    # colors             : 16777216
    Color model          : RGB
    Bytes Per Plane      : 1077
    Orientation          : Top Left
    Xdpi                 : 72
    Ydpi                 : 72
    Page(s)              : 3
    Info:
      PhotometricInterpretation: 2
      PlanarConfiguration: 1
      SamplesPerPixel: 3
      DateTime: 2021:04:01 09:14:49
      Software: GPL Ghostscript 9.27
    Metadata             : ( EXIF ICC )
    Color Profile        : Artifex Software sRGB ICC Profile
(So far I know that at least the Xdpi and Ydpi properties and the "Compression" property are affected, and I guess the Depth property as well.)


3. Question:
I know that I cannot get the real properties of the pictures that are in the PDF this way (in my PDFs all pictures will have the same properties, so there is no need to find out the properties of each picture separately).
Question: So how can I find out the real properties of my pictures that are in my pdf? Is there any third party tool, that can do this?

Thank you.

Post by cday »

This needs more thought: your original 10pages.pdf file was presumably created in a word processor (LibreOffice?) and exported as a PDF?

When opened in XnView MP, which provides an easy way to view the file properties, it is shown to be 300 DPI and uncompressed. However, I'm not sure whether the PDF file itself contains a rasterised bitmap image, or a vector font which is rasterised when the file is opened. In principle it is possible to open a PDF file in a text editor and view the contents, but it is not easy to do so quickly with limited knowledge of the PDF format.

Edit: The DPI value of the opened file was shown in the file properties as 300 as that was the value set in XnView MP for rasterising PDF files, that doesn't reflect on whether the file contains bitmap or vector text, though...
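One way to check whether a PDF actually contains bitmap images, without reading the file in a text editor, is the `pdfimages` tool from poppler-utils (a separate package, not part of NConvert or XnView MP). Its `-list` option prints one row per embedded image, including width, height, bits per component, encoding and the effective ppi on the page. The sketch below just counts those rows; the helper name `count_images` is invented here, and it assumes the listing starts with two header lines.

```shell
#!/bin/sh
# count_images: read `pdfimages -list` output on stdin and print the number of
# embedded images.
# Hypothetical helper; assumes two header lines precede the image rows.
count_images() {
    awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Possible use:
#   pdfimages -list 10pages.pdf | count_images
# A result of 0 would suggest the pages hold vector content rather than bitmaps.
```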

With regard to maintaining the quality of the viewed file when processing it, the two issues I think are the DPI, to the extent that it is relevant, and the type of compression used. There are lossless compression options available, so that factor comes down to the selection of the compression method used when the file is resaved, which needs to be considered in relation to an acceptable file size.

I am a volunteer providing limited support when I can, but although the issues raised above are interesting, in practice I have limited time to consider them!