linux ubuntu - Merge / convert multiple PDF files into one PDF




pdfunite 18.04 (14)

How could I merge / convert multiple PDF files into one large PDF file?

I tried the following, but the content of the target file was not as expected:

convert file1.pdf file2.pdf merged.pdf

I need a very simple/basic command line (CLI) solution. Best would be if I could pipe the output of the merge / convert straight into pdf2ps ( as originally attempted in my previously asked question here: Linux piping ( convert -> pdf2ps -> lp) ).


Answers

Try the good ghostscript:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=merged.pdf mine1.pdf mine2.pdf

or even this way for an improved version for low resolution PDFs (thanks to Adriano for pointing this out):

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -sOutputFile=merged.pdf mine1.pdf mine2.pdf

In both cases the ouput resolution is much higher and better than this way using convert:

convert -density 300x300 -quality 100 mine1.pdf mine2.pdf merged.pdf

In this way you wouldn't need to install anything else, just work with what you already have installed in your system (at least both come by default in my rhel).

Hope this helps,

UPDATE: first of all thanks for all your nice comments!! just a tip that may work for you guys, after googling, I found a superb trick to shrink the size of PDFs, I reduced with it one PDF of 300 MB to just 15 MB with an acceptable resolution! and all of this with the good ghostscript, here it is:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/default -dNOPAUSE -dQUIET -dBATCH -dDetectDuplicateImages -dCompressFonts=true -r150 -sOutputFile=output.pdf input.pdf

cheers!!


If you want to convert all the downloaded images into one pdf then execute

convert img{0..19}.jpg slides.pdf


Use PDF tools from python https://pypi.python.org/pypi/pdftools/1.0.6

Download the tar.gz file and uncompress it and run the command like below

python pdftools-1.1.0/pdfmerge.py -o output.pdf -d file1.pdf file2.pdf file3 

You should install pyhton3 before you run the above command

This tools support the below

  • add
  • insert
  • Remove
  • Rotate
  • Split
  • Merge
  • Zip

You can find more details in the below link and it is open source

https://github.com/MrLeeh/pdftools


Also pdfjoin a.pdf b.pdf will create a new b-joined.pdf with the contents of a.pdf and b.pdf


You can use sejda-console, free and open source. Unzip it and run sejda-console merge -f file1.pdf file2.pdf -o merged.pdf

It preserves bookmarks, link annotations, acroforms etc.. it actually has quite a lot of options you can play with, just run sejda-console merge -h to see them all.


You can use the convert command directly,

e.g.

convert sub1.pdf sub2.pdf sub3.pdf merged.pdf

Considering that pdfunite is part of poppler it has a higher chance to be installed, usage is also simpler than pdftk:

pdfunite in-1.pdf in-2.pdf in-n.pdf out.pdf

The other answers are good, but if you cannot merge PDFs locally, whether you are in a shared hosting environment, or for other reasons, they will not help you.

If you are looking for an API to merge PDFs remotely, you can try api2pdf which has an endpoint for merging pdfs together. Documentation is here.


I'm sorry, I managed to find the answer myself using google and a bit of luck : )

For those interested;

I installed the pdftk (pdf toolkit) on our debian server, and using the following command I achieved desired output:

pdftk file1.pdf file2.pdf cat output output.pdf

OR

gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf file1.pdf file2.pdf file3.pdf ...

This in turn can be piped directly into pdf2ps.


I am biased being one of the developers of PyMuPDF (a Python binding of MuPDF).

You can easily do what you want with it (and much more). Skeleton code works like this:

#-------------------------------------------------
import fitz         # the binding PyMuPDF
fout = fitz.open()  # new PDF for joined output
flist = ["1.pdf", "2.pdf", ...]  # list of filenames to be joined

for f in flist:
    fin = fitz.open(f)  # open an input file
    fout.insertPDF(fin) # append f
    fin.close()

fout.save("joined.pdf")
#-------------------------------------------------

That's about it. Several options are available for selecting only pages ranges, maintaining a joint table of contents, reversing page sequence or changing page rotation, etc., etc.

We are on PyPi.


Apache PDFBox http://pdfbox.apache.org/

PDFMerger This application will take a list of pdf documents and merge them, saving the result in a new document.

usage: java -jar pdfbox-app-x.y.z.jar PDFMerger "Source PDF files (2 ..n)" "Target PDF file"


pdfunite is fine to merge entire PDFs. If you want, for example, pages 2-7 from file1.pdf and pages 1,3,4 from file2.pdf, you have to use pdfseparate to split the files into separate PDFs for each page to give to pdfunite.

At that point you probably want a program with more options. qpdf is the best utility I've found for manipulating PDFs. pdftk is bigger and slower and Red Hat/Fedora don't package it because of its dependency on gcj. Other PDF utilities have Mono or Python dependencies. I found qpdf produced a much smaller output file than using pdfseparate and pdfunite to assemble pages into a 30-page output PDF, 970kB vs. 1,6450 kB. Because it offers many more options, qpdf's command line is not as simple; the original request to merge file1 and file2 can be performed with

qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf

Here's a method I use which works and is easy to implement. This will require both the fpdf and fpdi libraries which can be downloaded here:

require('fpdf.php');
require('fpdi.php');

$files = ['doc1.pdf', 'doc2.pdf', 'doc3.pdf'];

$pdf = new FPDI();

foreach ($files as $file) {
    $pdf->setSourceFile($file);
    $tpl = $pdf->importPage(1, '/MediaBox');
    $pdf->addPage();
    $pdf->useTemplate($tpl);
}

$pdf->Output('F','merged.pdf');

Convert it to PNG via ImageMagick, and display the PNG (quick and dirty).

<?php
  $dir = '/absolute/path/to/my/directory/';
  $name = 'myPDF.pdf';
  exec("/bin/convert $dir$name $dir$name.png");
  print '<img src="$dir$name.png" />';
?>

This is a good option if you need a quick solution, want to avoid cross-browser PDF viewing problems, and if the PDF is only a page or two. Of course, you need ImageMagick installed (which in turn needs Ghostscript) on your webserver, an option that might not be available in shared hosting environments. There is also a PHP plugin (called imagick) that works like this but it has it's own special requirements.







linux pdf merge command-line-interface