Generating PDFs in Javascript for fun and profit!

Up until recently, creating complex or elegant PDFs in Javascript has been challenging. Here I’m going to show you step-by-step the path of least resistance to beautiful PDFs. Spoiler: recently made possible by docx to PDF conversion in Javascript :-)

What follows is some of what I will cover in my upcoming talk at the PDF Association conference in Seattle in June.

From 1000 feet, here are your three main alternatives:

  • The first is to create the PDF directly, using pdfkit, jsPDF, or the higher level pdfmake. Pdfkit is like iText in the Java world. Pdfmake, based on pdfkit, has its own format for representing rich text; it converts this to PDF.
  • The second is to create HTML, then convert that to PDF. These days probably using puppeteer.
  • The third is to create a docx, then convert that to PDF.

Put another way, you can either create the PDF directly, or use HTML or docx as an intermediate format.

Since it's now easy to convert docx to PDF in Javascript, the docx approach is the path of least resistance, particularly for business documents (proposals, invoices, contracts etc).

For one thing, often the content will already be in Word document format, making your job easy.

More importantly, it's worth thinking up front about ongoing maintenance (changes to content and formatting). Is that something that you as a developer want to be doing, or is it better to enable the business to do this themselves? If it's a Word document, then business users can update the document without troubling you.

Creating a docx in Javascript has been easy for some years, but until recently, converting it to PDF from Javascript has been the sticking point. Happily, this is now do-able, without invoking some SaaS API, using LibreOffice, or anything like that.

With docx.js you can programmatically build up your Word document (much like pdfkit and jsPDF allow you to build up a PDF). But this probably isn’t a great idea, because for the final PDF to come out looking right, any feature you care to use has to be supported in both the create-docx and docx-to-pdf steps. For example, merged cells in a table, or adding a watermark.

What we want is an easy way to create a docx, and then the confidence that our docx will be converted cleanly to PDF.

For this, a “templating” approach is the answer: basically, you create a docx template with your wanted layout - in Microsoft Word, LibreOffice, Google Docs, Native Documents or whatever - then use the template engine to replace “variables”.
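To make the idea concrete, here is a toy sketch of what "replacing variables" means, applied to a plain string rather than a docx. The `fillPlaceholders` function is hypothetical and only illustrates the concept; a real template engine works on the XML inside the docx and handles far more cases.

```javascript
// Toy illustration only: replaces {Name} style placeholders in a plain string.
// fillPlaceholders is a hypothetical helper, not part of any library.
function fillPlaceholders(template, data) {
  return template.replace(/\{(\w+)\}/g, function (match, key) {
    // Leave unknown placeholders untouched so missing data is easy to spot.
    return Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match;
  });
}

const text = fillPlaceholders(
  "Invoice {InvoiceNumber} for {CustomerName}, due {DueDate}",
  { InvoiceNumber: "INV123", CustomerName: "Microsoft Corp" }
);
console.log(text);
// prints "Invoice INV123 for Microsoft Corp, due {DueDate}"
```

Note that the placeholder with no matching data is left in place, which makes gaps in the data visible in the output.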

Step 1: populate docx template

Here we’ll use docxtemplater, in node.js.

Say you want a PDF invoice. Since part of the point of using a Word template is that it is easy for business users to make it pretty, let’s start with one of the invoice templates designed by Microsoft and included in Word.

You can see I’ve added some variables (represented with curly braces, as required by docxtemplater).

You can click the image to see the docx in our Word File Editor. Click invoice-template.docx to download/use it with the code which follows.

Being a Javascript library, docxtemplater ingests data in JSON format:

{CustomerName : "Microsoft Corp",
AddressLine1: "One Microsoft Way",
City: "Redmond", State: "WA", Zip: "98052",
Country: "USA",
InvoiceDate : "14 March 2019",
InvoiceNumber: "INV123",
Items: [
{
 Item_Description: "Bananas",
 Item_Price: "5",
 Item_Qty: "10",
 Item_SubTotal: "50"
 },
{
 Item_Description: "Mangoes",
 Item_Price: "10",
 Item_Qty: "4",
 Item_SubTotal: "40"
 }
],
TotalEx:"90", 
SalesTax:"9", 
Shipping:"10", 
TotalPrice:"$109", 
DueDate : "28 March 2019"
}
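In this sample data the sub-totals and totals are typed in by hand. In a real application you would probably compute them. The following is a minimal sketch: `buildInvoiceTotals`, the 10% tax rate and the flat shipping amount are assumptions chosen to reproduce the numbers above, not something the template requires.

```javascript
// Hypothetical helper: derives the sub-total and total fields from raw line items.
// The tax rate and shipping values passed in below are assumptions that match the sample data.
function buildInvoiceTotals(lineItems, taxRate, shipping) {
  const Items = lineItems.map(function (item) {
    return {
      Item_Description: item.description,
      Item_Price: String(item.price),
      Item_Qty: String(item.qty),
      Item_SubTotal: String(item.price * item.qty)
    };
  });
  const totalEx = lineItems.reduce(function (sum, item) {
    return sum + item.price * item.qty;
  }, 0);
  const salesTax = Math.round(totalEx * taxRate * 100) / 100; // round to cents
  return {
    Items: Items,
    TotalEx: String(totalEx),
    SalesTax: String(salesTax),
    Shipping: String(shipping),
    TotalPrice: "$" + (totalEx + salesTax + shipping)
  };
}

const totals = buildInvoiceTotals(
  [
    { description: "Bananas", price: 5, qty: 10 },
    { description: "Mangoes", price: 10, qty: 4 }
  ],
  0.1,
  10
);
console.log(totals.TotalEx, totals.SalesTax, totals.TotalPrice);
// prints "90 9 $109"
```

The result can be merged with the customer and date fields before it is handed to the template engine.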

Notice the Items array. The table row repeats for each of the Items. You can see docxtemplater’s markup for a repeat/loop at the start and end of that table row.
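Conceptually, a loop section is expanded once per array element, with the inner placeholders resolved against that element. The sketch below shows this on a plain string, using the `{#Items}` and `{/Items}` opening and closing tags that docxtemplater uses for loops. `expandLoop` is a hypothetical toy, not how docxtemplater is implemented.

```javascript
// Toy illustration of a repeat section on a plain string.
// expandLoop is hypothetical; the real engine repeats the table row inside the docx.
function expandLoop(template, name, rows) {
  // Matches {#name} ... {/name} and captures what lies between the tags.
  const section = new RegExp("\\{#" + name + "\\}([\\s\\S]*?)\\{/" + name + "\\}", "g");
  return template.replace(section, function (match, inner) {
    return rows.map(function (row) {
      // Resolve each inner placeholder against the current row.
      return inner.replace(/\{(\w+)\}/g, function (m, key) {
        return key in row ? String(row[key]) : m;
      });
    }).join("");
  });
}

const rowsText = expandLoop(
  "{#Items}{Item_Description} x {Item_Qty} = {Item_SubTotal}\n{/Items}",
  "Items",
  [
    { Item_Description: "Bananas", Item_Qty: "10", Item_SubTotal: "50" },
    { Item_Description: "Mangoes", Item_Qty: "4", Item_SubTotal: "40" }
  ]
);
console.log(rowsText);
```

Each element of the array produces one copy of the section, which is the same effect you see on the invoice's table row.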

For demo purposes here we’ll provide that inline in our javascript:

var JSZip = require('jszip');
var Docxtemplater = require('docxtemplater');

var fs = require('fs');
var path = require('path');

// Load the docx file as a binary
var content = fs.readFileSync(path.resolve(__dirname, 'invoice-template.docx'), 'binary');

var zip = new JSZip(content);

var doc = new Docxtemplater();
doc.loadZip(zip);

// Set the template variables
doc.setData({
    CustomerName: "Microsoft Corp",
    AddressLine1: "One Microsoft Way",
    City: "Redmond", State: "WA", Zip: "98052",
    Country: "USA",
    InvoiceDate: "14 March 2019",
    InvoiceNumber: "INV123",
    Items: [
        { Item_Description: "Bananas", Item_Price: "5", Item_Qty: "10", Item_SubTotal: "50" },
        { Item_Description: "Mangoes", Item_Price: "10", Item_Qty: "4", Item_SubTotal: "40" }
    ],
    TotalEx: "90",
    SalesTax: "9",
    Shipping: "10",
    TotalPrice: "$109",
    DueDate: "28 March 2019"
});

try {
    // Render the document, i.e. replace the variables
    doc.render();
} catch (error) {
    var e = {
        message: error.message,
        name: error.name,
        stack: error.stack,
        properties: error.properties
    };
    console.log(JSON.stringify({error: e}));
    throw error;
}

var buf = doc.getZip().generate({type: 'nodebuffer'});

// buf is a nodejs buffer, you can either write it to a file or do anything else with it.
fs.writeFileSync(path.resolve(__dirname, 'invoice-instance.docx'), buf);
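If the template itself contains mistakes (for example an unclosed tag), the details arrive in `error.properties`, which is why the code above logs it. The sketch below pulls readable messages out of such an error. It assumes the shape described in docxtemplater's documentation, where a multi-error carries an `errors` array under `properties` and each entry has a `properties.explanation` string. `describeTemplateError` is a hypothetical helper, and the mock error exists only to exercise it.

```javascript
// Hypothetical helper: turns a template error into a list of readable messages.
// Assumes error.properties.errors[i].properties.explanation, falling back to message.
function describeTemplateError(error) {
  const props = error && error.properties;
  if (props && Array.isArray(props.errors)) {
    return props.errors.map(function (e) {
      return (e.properties && e.properties.explanation) || e.message;
    });
  }
  return [(props && props.explanation) || error.message];
}

// Mock error for demonstration; a real one would be thrown by the template engine.
const mockError = {
  message: "Multi error",
  properties: {
    errors: [
      { message: "Unclosed tag", properties: { explanation: "The tag beginning with \"{Custo\" is unclosed" } },
      { message: "Unopened tag", properties: {} }
    ]
  }
};
const errorLines = describeTemplateError(mockError);
console.log(errorLines.join("\n"));
```

Logging one line per problem tends to be easier for a template author to act on than a raw JSON dump.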

To try it, install docxtemplater as per its instructions:

npm install docxtemplater
npm install jszip@2

Then it's just:

node invoice-template-docx.js

And you get a populated invoice instance:

Notice the table row has been repeated, and all variables replaced.

If you run the code yourself, you can verify the results by opening invoice-instance.docx in your favourite docx editor, or in ours: click here then drag/drop your docx.

Step 2: convert the docx to PDF

So far so good. Now we just need to convert the populated invoice instance to PDF.

For that, we’ll use docx-wasm, a node module we at Native Documents released earlier this year. Our bread and butter at Native Documents is the web-based document editing/viewing component we used above to display invoice-template.docx, and this node module generates PDF output using that Word compatible page layout code. Put another way, the page layout reproduces what Word does so closely that it can also be used for high quality PDF output.

First, install it:

npm install @nativedocuments/docx-wasm

Converting the docx in the node.js buffer object to PDF is then just:

const docx = require("@nativedocuments/docx-wasm");

// init docx engine
docx.init({
    ND_DEV_ID: "4H2I80DDEVNAJQSGGIC3K98N8S",
    ND_DEV_SECRET: "3CTNJA7DBQFA8UDV2GM8I60N38",
    // ND_DEV_ID: "XXXXXXXXXXXXXXXXXXXXXXXXXX", // goto https://developers.nativedocuments.com/ to get a dev-id/dev-secret
    // ND_DEV_SECRET: "YYYYYYYYYYYYYYYYYYYYYYYYYY", // you can also set the credentials in the environment variables
    ENVIRONMENT: "NODE", // required
    LAZY_INIT: true      // if set to false the WASM engine will be initialized right now, useful for pre-caching (e.g. for AWS Lambda)
}).catch(function(e) {
    console.error(e);
});

async function convertHelper(document, exportFct) {
    const api = await docx.engine();
    await api.load(document);
    // exportFct is the name of the export method, e.g. "exportPDF"
    const arrayBuffer = await api[exportFct]();
    await api.close();
    return arrayBuffer;
}

convertHelper(buf, "exportPDF").then((arrayBuffer) => {
    fs.writeFileSync("sample.pdf", new Uint8Array(arrayBuffer));
}).catch((e) => {
    console.error(e);
});
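One thing to note about `convertHelper`: if loading or exporting throws, the call to `close` is never reached. A variant using `try`/`finally` is sketched below. It takes the engine factory as a parameter so that it can be run here against a stub; with the real module you would pass a function that returns `docx.engine()`. The only engine methods assumed are the ones already used above (`load`, the named export function, `close`). `convertSafely` and `stubEngine` are hypothetical names.

```javascript
// Hypothetical variant of convertHelper that always closes the engine.
async function convertSafely(getEngine, document, exportFct) {
  const api = await getEngine();
  try {
    await api.load(document);
    return await api[exportFct]();
  } finally {
    // Runs whether the steps above succeeded or threw.
    await api.close();
  }
}

// Stub engine for demonstration only; it records which methods were called.
const calls = [];
function stubEngine() {
  return Promise.resolve({
    load: async function () { calls.push("load"); },
    exportPDF: async function () { calls.push("exportPDF"); return new ArrayBuffer(4); },
    close: async function () { calls.push("close"); }
  });
}

const conversion = convertSafely(stubEngine, Buffer.from("not a real docx"), "exportPDF")
  .then(function (arrayBuffer) {
    console.log(calls.join(","), arrayBuffer.byteLength);
    return arrayBuffer;
  });
```

With the real module the call would look like `convertSafely(() => docx.engine(), buf, "exportPDF")`.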

You’ll need an ND_DEV_ID, ND_DEV_SECRET pair to use this module. You can get free-tier keys at https://developers.nativedocuments.com/

Copy these into the docx.init call (or alternatively, you can set these as environment vars).
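If you prefer not to keep keys in source, a small helper can read them from the environment and fail early when they are missing. This is only a sketch: `readCredentials` is a hypothetical helper, and it assumes the environment variables use the same names as the `docx.init` options.

```javascript
// Hypothetical helper: reads the key pair from an environment-like object.
// Assumes the variables are named ND_DEV_ID and ND_DEV_SECRET.
function readCredentials(env) {
  const id = env.ND_DEV_ID;
  const secret = env.ND_DEV_SECRET;
  if (!id || !secret) {
    throw new Error("Set ND_DEV_ID and ND_DEV_SECRET before starting");
  }
  return { ND_DEV_ID: id, ND_DEV_SECRET: secret };
}

// In a real script you would pass process.env; a literal object is used here
// so the example runs without any keys configured.
const creds = readCredentials({
  ND_DEV_ID: "XXXXXXXXXXXXXXXXXXXXXXXXXX",
  ND_DEV_SECRET: "YYYYYYYYYYYYYYYYYYYYYYYYYY"
});
console.log(Object.keys(creds).join(","));
// prints "ND_DEV_ID,ND_DEV_SECRET"
```

The returned object can then be merged into the options passed to `docx.init`, alongside `ENVIRONMENT` and `LAZY_INIT`.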

I haven’t posted the PDF here, since it just looks the same as the invoice-instance docx.

Putting it all together

Here is Javascript which combines step 1 and step 2.

// Step 1: generate docx

var JSZip = require('jszip');
var Docxtemplater = require('docxtemplater');

var fs = require('fs');
var path = require('path');

// Load the docx file as a binary
var content = fs.readFileSync(path.resolve(__dirname, 'invoice-template.docx'), 'binary');

var zip = new JSZip(content);

var doc = new Docxtemplater();
doc.loadZip(zip);

// Set the template variables
doc.setData({
    CustomerName: "Microsoft Corp",
    AddressLine1: "One Microsoft Way",
    City: "Redmond", State: "WA", Zip: "98052",
    Country: "USA",
    InvoiceDate: "14 March 2019",
    InvoiceNumber: "INV123",
    Items: [
        { Item_Description: "Bananas", Item_Price: "5", Item_Qty: "10", Item_SubTotal: "50" },
        { Item_Description: "Mangoes", Item_Price: "10", Item_Qty: "4", Item_SubTotal: "40" }
    ],
    TotalEx: "90",
    SalesTax: "9",
    Shipping: "10",
    TotalPrice: "$109",
    DueDate: "28 March 2019"
});

try {
    // Render the document, i.e. replace the variables
    doc.render();
} catch (error) {
    var e = {
        message: error.message,
        name: error.name,
        stack: error.stack,
        properties: error.properties
    };
    console.log(JSON.stringify({error: e}));
    throw error;
}

var buf = doc.getZip().generate({type: 'nodebuffer'});

// buf is a nodejs buffer, you can either write it to a file or do anything else with it.
// fs.writeFileSync(path.resolve(__dirname, 'output.docx'), buf);

// Step 2: convert docx to pdf

const docx = require("@nativedocuments/docx-wasm");

// init docx engine
docx.init({
    // ND_DEV_ID: "XXXXXXXXXXXXXXXXXXXXXXXXXX", // goto https://developers.nativedocuments.com/ to get a dev-id/dev-secret
    // ND_DEV_SECRET: "YYYYYYYYYYYYYYYYYYYYYYYYYY", // you can also set the credentials in the environment variables
    ENVIRONMENT: "NODE", // required
    LAZY_INIT: true      // if set to false the WASM engine will be initialized right now, useful for pre-caching (e.g. for AWS Lambda)
}).catch(function(e) {
    console.error(e);
});

async function convertHelper(document, exportFct) {
    const api = await docx.engine();
    await api.load(document);
    // exportFct is the name of the export method, e.g. "exportPDF"
    const arrayBuffer = await api[exportFct]();
    await api.close();
    return arrayBuffer;
}

convertHelper(buf, "exportPDF").then((arrayBuffer) => {
    fs.writeFileSync("output.pdf", new Uint8Array(arrayBuffer));
}).catch((e) => {
    console.error(e);
});

To try it, download invoice-template.docx then:

node docx-template-to-pdf.js

Deployment Options

A nice way to run this is on AWS Lambda. With Lambda, you get easy scalability, and you aren’t paying for servers when you aren’t using them. More on this in my upcoming talk at the PDF Association conference in Seattle in June! In the meantime, docx-to-pdf-on-AWS-Lambda shows you how to do the docx to PDF part on Lambda. Adding the docx templating piece is straightforward.
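As a rough idea of the shape this can take, the sketch below wraps a conversion function in a Lambda-style handler that accepts a base64-encoded docx and returns a base64-encoded PDF in the API Gateway proxy response format. Everything here is an assumption for illustration and is not taken from the docx-to-pdf-on-AWS-Lambda project: `makeHandler`, the event layout (`event.body` holding base64 text), and the stub converter that stands in for the `convertHelper` shown earlier.

```javascript
// Hypothetical factory: builds a handler around any docx-to-PDF function
// that takes a Buffer and resolves to an ArrayBuffer.
function makeHandler(convertToPdf) {
  return async function handler(event) {
    if (!event || !event.body) {
      return { statusCode: 400, body: "Expected a base64 encoded docx in the request body" };
    }
    const docxBuffer = Buffer.from(event.body, "base64");
    const arrayBuffer = await convertToPdf(docxBuffer);
    return {
      statusCode: 200,
      headers: { "Content-Type": "application/pdf" },
      isBase64Encoded: true,
      body: Buffer.from(arrayBuffer).toString("base64")
    };
  };
}

// Stub converter for demonstration only: ignores its input and
// returns the four bytes that spell "%PDF".
async function stubConvert(docxBuffer) {
  return Uint8Array.from([0x25, 0x50, 0x44, 0x46]).buffer;
}

const handler = makeHandler(stubConvert);
const lambdaResult = handler({ body: Buffer.from("fake docx bytes").toString("base64") });
lambdaResult.then(function (response) {
  console.log(response.statusCode, Buffer.from(response.body, "base64").toString("latin1"));
});
```

In a deployed function you would export the handler (for example `exports.handler = makeHandler(...)`) and pass in a converter built on the real engine instead of the stub.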

It's also now possible to convert docx to PDF client-side, in-browser, reducing server loads, and opening the way to offline operation. docx-wasm-client-side shows you how to do the docx to PDF part client-side.

Originally published by Jason Harrop at https://hackernoon.com
