SVGs, PDFs and Java

I recently needed to create a variety of similar pages in a single PDF file. Once again a unique job, once again related to d-touch. When such pages require rotated text, sorting out the co-ordinates and bounds becomes a whole new level of fun.

A different approach seemed necessary, and I found myself thinking how nice a template would be: just search and replace. Then it hit me: SVGs. Images in XML, with support for including actual images and text at any angle, size or position. Perfect. Graphically creating a first draft of the template took minutes.
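A minimal sketch of the search-and-replace idea, assuming a template with made-up `{{LABEL}}` and `{{ANGLE}}` placeholders (the real template and its field names aren't shown here, so everything named in this block is hypothetical). The `rotate(angle cx cy)` transform is standard SVG and is what takes the pain out of rotated text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SvgTemplate {
    // Hypothetical one-element template; {{ANGLE}} and {{LABEL}} are
    // placeholder names invented for this sketch.
    static final String TEMPLATE =
        "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"210mm\" height=\"297mm\">\n"
      + "  <text x=\"100\" y=\"100\" transform=\"rotate({{ANGLE}} 100 100)\">{{LABEL}}</text>\n"
      + "</svg>\n";

    // Escape the characters that would otherwise break the XML.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Replace every {{KEY}} in the template with its escaped value.
    static String fill(String template, Map<String, String> values) {
        String out = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            out = out.replace("{{" + e.getKey() + "}}", escapeXml(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<String, String>();
        values.put("LABEL", "Marker 42 & friends");
        values.put("ANGLE", "90");
        System.out.print(fill(TEMPLATE, values));
    }
}
```

Run once per row of data and write each result to its own `.svg` file, and you have the folder of SVGs the rest of this post deals with.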

The other benefit of using a graphical template is how easy it is to change the design and re-run the data through.

215 SVG files don’t quite leap into a PDF on their own though, and the two image packages I intended to use both failed to manage it. One slowly loaded each and every file before giving an error and refusing to accept them as images; the other was quite happy that they were valid images for merging to PDF, but didn’t do anything after I pressed go.

There were manual options that I wasn’t about to jump at, such as saving each one as a PDF and using BullZip PDF Printer or similar to merge them slowly by hand.

The surprisingly simple and fast solution turned out to be Java. Running the Rasteriser from the Batik SVG Toolkit on the folder of SVGs returned a folder of PDFs: a step closer, but not quite there yet.

java -jar batik-rasterizer.jar -m application/pdf -w 2480 -h 3508 -d folderForPDFs folderOfSVGs\*.svg
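The `-w 2480 -h 3508` values in that command happen to match an A4 page (210 mm by 297 mm) at 300 dots per inch. That reading is my inference from the numbers rather than something stated above, but the arithmetic is easy to check with a small sketch (the class and method names are made up for illustration):

```java
class PageSize {
    static final double MM_PER_INCH = 25.4;

    // Convert a length in millimetres to whole pixels at the given resolution.
    static long mmToPixels(double mm, int dpi) {
        return Math.round(mm / MM_PER_INCH * dpi);
    }

    public static void main(String[] args) {
        // A4 is 210 mm x 297 mm; 300 dpi is an assumed print resolution.
        System.out.println(mmToPixels(210, 300) + " x " + mmToPixels(297, 300));
        // prints "2480 x 3508"
    }
}
```

For a different paper size or resolution, swap in the new numbers and pass the results to `-w` and `-h`.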

The final step was to write a small Java wrapper for the PDF Merger Utility from the PDFBox package. This basically consisted of scanning the directory of single page PDF files, adding each one as a source for the Merger, and running it.

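import java.io.File;
import java.util.Arrays;
import org.apache.pdfbox.util.PDFMergerUtility;

public class Merge {
   public static void main(String[] args) {
      if (args.length < 2) {
         System.err.println("Usage: java Merge folderOfPDFs outputName");
         return;
      }
      new DoMerge(args[0], args[1]);
   }
}

class DoMerge {
   public DoMerge(String from, String to) {
      try {
         // the merger collects source files and writes them out as one PDF
         PDFMergerUtility ut = new PDFMergerUtility();

         // list the folder; list() returns null if 'from' is not a readable folder
         File dir = new File(from);
         String[] pdfs = dir.list();
         if (pdfs == null) {
            System.err.println("Not a readable folder: " + from);
            return;
         }

         // list() promises no particular order, so sort by name
         Arrays.sort(pdfs);
         for (int i = 0; i < pdfs.length; i++) {
            // skip anything in the folder that is not a PDF
            if (!pdfs[i].toLowerCase().endsWith(".pdf")) continue;
            ut.addSource(from + File.separator + pdfs[i]);
         }

         // write the merged file as <to>_out.pdf
         ut.setDestinationFileName(to + "_out.pdf");
         ut.mergeDocuments();

      } catch (Exception e) {
         System.out.println("Oh no, an error! " + e);
      }
   }
}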

Yes, that is a generic catch-everything-that-goes-wrong block. No, you shouldn’t use them 😛
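One thing worth watching in a wrapper like this is page order. `File.list()` makes no promise about the order of the names it returns, and even a plain alphabetical sort puts `page10.pdf` before `page2.pdf`. Below is a rough sketch of a "natural" sort that compares runs of digits as numbers, assuming the single-page files have numbered names along the lines of `page1.pdf` (hypothetical names; the real ones mirror whatever the SVGs were called). The `PageOrder` class is made up for illustration and is not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class PageOrder {
    // Drop leading zeros but always keep at least one digit.
    static String stripZeros(String digits) {
        return digits.replaceFirst("^0+(?=.)", "");
    }

    // Orders names so that embedded numbers compare by value: page2 before page10.
    static final Comparator<String> NATURAL = new Comparator<String>() {
        public int compare(String a, String b) {
            int i = 0, j = 0;
            while (i < a.length() && j < b.length()) {
                char ca = a.charAt(i), cb = b.charAt(j);
                if (Character.isDigit(ca) && Character.isDigit(cb)) {
                    int si = i, sj = j;
                    while (i < a.length() && Character.isDigit(a.charAt(i))) i++;
                    while (j < b.length() && Character.isDigit(b.charAt(j))) j++;
                    String na = stripZeros(a.substring(si, i));
                    String nb = stripZeros(b.substring(sj, j));
                    // without leading zeros, the longer number is the larger one
                    if (na.length() != nb.length()) return na.length() - nb.length();
                    int c = na.compareTo(nb);
                    if (c != 0) return c;
                } else {
                    if (ca != cb) return ca - cb;
                    i++;
                    j++;
                }
            }
            // whichever name still has characters left sorts last
            return (a.length() - i) - (b.length() - j);
        }
    };

    // Keep only the .pdf names and return them in natural order.
    static List<String> pdfsInOrder(String[] names) {
        List<String> out = new ArrayList<String>();
        for (String n : names) {
            if (n.toLowerCase().endsWith(".pdf")) out.add(n);
        }
        Collections.sort(out, NATURAL);
        return out;
    }

    public static void main(String[] args) {
        String[] names = { "page10.pdf", "page2.pdf", "notes.txt", "page1.pdf" };
        System.out.println(pdfsInOrder(names));
        // prints "[page1.pdf, page2.pdf, page10.pdf]"
    }
}
```

Feeding the result of `pdfsInOrder(dir.list())` to `addSource` one name at a time should then give the merged file its pages in the expected order.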

To compile that, you’ll need something like this:

javac -cp pdfbox-1.6.0.jar Merge.java

And something like this to run it (the ; between classpath entries is the Windows separator; use : on Linux or Mac):

java -cp .;pdfbox-1.6.0.jar;commons-logging-1.1.1.jar Merge folderOfPDFs outputName
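If the same command has to work on more than one operating system, `File.pathSeparator` gives the right classpath separator for whichever platform Java is running on. A small sketch that prints the run command for the current machine (the `Classpath` helper is made up for illustration; the jar names are the ones used above):

```java
import java.io.File;

class Classpath {
    // Join classpath entries with the given separator.
    static String join(String separator, String... entries) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < entries.length; i++) {
            if (i > 0) sb.append(separator);
            sb.append(entries[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems
        String cp = join(File.pathSeparator,
                ".", "pdfbox-1.6.0.jar", "commons-logging-1.1.1.jar");
        System.out.println("java -cp " + cp + " Merge folderOfPDFs outputName");
    }
}
```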