Deploying Puppeteer as a Service in your Docker Application

Otobong Peter
4 min read · Jul 20, 2024


Introduction

According to the Puppeteer documentation, Puppeteer is a Node.js library that provides a high-level API to control Chrome or Firefox over the DevTools Protocol or WebDriver BiDi. Puppeteer runs in headless mode (no visible UI) by default but can be configured to run in a visible (“headful”) browser.

Photo by Evan Fitzer on Unsplash

In simple terms, Puppeteer is a library that lets you automate activity in the browser. You can use it to load a URL and extract data (web scraping). It also comes in handy when you want to build services that use users' data to generate custom web pages that can be sent to the client.

What is this article about?

This article is not about how Puppeteer works or what to use it for; if that is what you want to know, I recommend this. Rather, this article is about making sure that Puppeteer works properly in your Docker environment.

Getting started

While building a web application, I needed to retrieve certain information from the database, use that data to fill in a template, and then use Puppeteer to create a PDF from the template to be delivered to the client.

To get started with Puppeteer, you have to first ensure that the library has been installed:

npm install puppeteer

If you only want Puppeteer's API and plan to drive a browser that is already installed on your machine, you can use puppeteer-core and set executablePath to that browser. Installing puppeteer instead of puppeteer-core also downloads a compatible browser for you to use.
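To make the puppeteer-core case concrete, here is a minimal sketch of how the launch options could be assembled. The helper `launchOptionsFor` and the `LaunchOptionsSketch` type are hypothetical names for this example. Reading the path from `PUPPETEER_EXECUTABLE_PATH` reuses a variable name that the full puppeteer package recognizes, but here it is simply a convention chosen for the sketch.

```typescript
// Hypothetical shape: only the launch fields this sketch cares about.
interface LaunchOptionsSketch {
  headless: boolean;
  executablePath: string;
}

// puppeteer-core ships without a browser, so the path to an installed
// browser has to be supplied explicitly. Here it is taken from the environment.
function launchOptionsFor(
  env: Record<string, string | undefined>,
): LaunchOptionsSketch {
  const executablePath = env['PUPPETEER_EXECUTABLE_PATH'];
  if (!executablePath) {
    throw new Error('Set PUPPETEER_EXECUTABLE_PATH to the browser binary');
  }
  return { headless: true, executablePath };
}

// Usage (assumes puppeteer-core is installed and the variable is set):
//   import puppeteer from 'puppeteer-core';
//   const browser = await puppeteer.launch(launchOptionsFor(process.env));
```

Failing early with a clear message is a design choice for this sketch; it makes a missing browser path visible at startup instead of at the first PDF request.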

In my case, after building out my application, services and template, everything ran locally and it worked! But once deployed via Docker, the Puppeteer service did not work. I later learned that while npm did install Puppeteer for use by the service, Puppeteer could not find a path to the browser. By default, Puppeteer tries to launch the browser it downloaded by looking in its cache directory. Since that cache should live at the root of your application, you have to expressly tell Puppeteer inside the container to look for it there.

Aside from that, you also need to install the Chromium browser and its dependencies in your docker container.

Step 1: Install Puppeteer and configure your service or function in your app

Ensure that Puppeteer has been installed in your application and set up properly. In the example below, I created a function in my Puppeteer service that takes in HTML data and converts it to a PDF buffer that can be sent to the client. Remember to set the args as in my example too: --no-sandbox and --disable-setuid-sandbox turn off Chromium's sandbox, which Chromium refuses to run without when it is launched as root, the default user in many containers.

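import puppeteer from 'puppeteer';

class PuppeteerService {
  // Renders an HTML string in headless Chromium and returns the PDF bytes.
  static async templateToPdf(htmlData: string): Promise<Buffer> {
    const browser = await puppeteer.launch({
      headless: true,
      // Disable the Chromium sandbox so the browser can start as root.
      args: ['--no-sandbox', '--disable-setuid-sandbox'],
      // executablePath: puppeteer.executablePath(),
    });

    try {
      const page = await browser.newPage();
      // Wait until the network is idle so images and fonts have loaded.
      await page.setContent(htmlData, { waitUntil: 'networkidle0' });

      const pdf = await page.pdf({
        format: 'A4',
        printBackground: true,
      });

      // Depending on the Puppeteer version, pdf() returns a Buffer or a
      // Uint8Array; Buffer.from() accepts both.
      return Buffer.from(pdf);
    } finally {
      // Always close the browser, even if rendering throws.
      await browser.close();
    }
  }
}

export default PuppeteerService;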
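The htmlData passed to templateToPdf has to be produced somewhere. As a rough illustration of the "fill a template with data" step, the sketch below substitutes `{{key}}` placeholders and escapes the inserted values. The placeholder syntax and the names `escapeHtml` and `fillTemplate` are assumptions made for this example and are not part of Puppeteer; a real project would more likely use a template engine.

```typescript
// Escape characters that have a special meaning in HTML.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Replace each {{key}} placeholder with the escaped value from `data`.
// Keys that are not present in `data` become an empty string.
function fillTemplate(template: string, data: Record<string, string>): string {
  return template.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (_match: string, key: string) => {
      const known = Object.prototype.hasOwnProperty.call(data, key);
      return known ? escapeHtml(String(data[key])) : '';
    },
  );
}

const html = fillTemplate('<h1>Invoice for {{ customer }}</h1>', {
  customer: 'Ada & Co',
});
console.log(html); // <h1>Invoice for Ada &amp; Co</h1>

// The result can then be handed to the service:
//   const pdf = await PuppeteerService.templateToPdf(html);
```

Escaping the values matters here because the template is rendered by a real browser, so unescaped user data could inject markup or scripts into the page.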

Step 2: Create a config file at the root

In the root of your application, create a Puppeteer config file. This file tells Puppeteer where to find the browser inside the container. The file must have one of the following names:

  • .puppeteerrc.cjs,
  • .puppeteerrc.js,
  • .puppeteerrc (YAML/JSON),
  • .puppeteerrc.json,
  • .puppeteerrc.yaml,
  • puppeteer.config.js, and
  • puppeteer.config.cjs

Puppeteer automatically looks for any of these file names at the root to enable it to find the path to your browser.

Once the file exists, it needs to export a configuration object that tells Puppeteer where its browser cache lives. Paste the configuration below into the file you created. It uses CommonJS syntax, so .puppeteerrc.cjs is a safe choice of name:

const { join } = require('path');

/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  // Changes the cache location for Puppeteer.
  cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};
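To see why this file matters, it helps to sketch how the cache directory gets chosen. The function below is a simplified model, not Puppeteer's actual code: it assumes the lookup order is the `PUPPETEER_CACHE_DIR` environment variable first, then the config file's `cacheDirectory`, then `.cache/puppeteer` under the home directory. The name `resolveCacheDir` and the `CacheInputs` type are hypothetical, and `/app` and `/root` are assumed values for a container that uses `/app` as its working directory and runs as root. Check the Puppeteer configuration guide for the authoritative rules.

```typescript
interface CacheInputs {
  envCacheDir?: string; // value of PUPPETEER_CACHE_DIR, if set
  configCacheDir?: string; // cacheDirectory from the config file, if any
  homeDir: string; // home directory of the user running the process
}

// Simplified model of the assumed precedence: env var, config file, default.
function resolveCacheDir(inputs: CacheInputs): string {
  return (
    inputs.envCacheDir ??
    inputs.configCacheDir ??
    `${inputs.homeDir}/.cache/puppeteer`
  );
}

// Config file present in an app located at /app:
const withConfig = resolveCacheDir({
  configCacheDir: '/app/.cache/puppeteer',
  homeDir: '/root',
});

// No config file visible:
const withoutConfig = resolveCacheDir({ homeDir: '/root' });

console.log(withConfig); // /app/.cache/puppeteer
console.log(withoutConfig); // /root/.cache/puppeteer
```

Under this model, the config file needs to be visible both when the browser is downloaded (during npm install or npm ci) and when the application runs. If only one of the two steps sees it, the download and the launch resolve to different directories.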

Step 3: Install Chrome or Chromium browser and its dependencies in your container via your application’s Dockerfile

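In my case, the container runs a Debian-based Linux image (node:18), so my Dockerfile installs packages with apt-get. The commands and package names differ on other distributions such as Fedora or CentOS, and they can also change between Debian releases, so adjust the list if apt-get reports that a package cannot be found.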

# Use the official Node.js version 18 image as the base
FROM node:18

# Set the working directory
WORKDIR /app

# Install the chromium browser and its dependencies
RUN apt-get update && apt-get install -y \
    chromium \
    gconf-service \
    libasound2 \
    libatk1.0-0 \
    libc6 \
    libcairo2 \
    libcups2 \
    libdbus-1-3 \
    libexpat1 \
    libfontconfig1 \
    libgcc1 \
    libgconf-2-4 \
    libgdk-pixbuf2.0-0 \
    libglib2.0-0 \
    libgtk-3-0 \
    libnspr4 \
    libpango-1.0-0 \
    libpangocairo-1.0-0 \
    libstdc++6 \
    libx11-6 \
    libx11-xcb1 \
    libxcb1 \
    libxcomposite1 \
    libxcursor1 \
    libxdamage1 \
    libxext6 \
    libxfixes3 \
    libxi6 \
    libxrandr2 \
    libxrender1 \
    libxss1 \
    libxtst6 \
    ca-certificates \
    fonts-liberation \
    libappindicator1 \
    libnss3 \
    lsb-release \
    xdg-utils \
    wget

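# Copy package.json, package-lock.json and the Puppeteer config file.
# The config should be present before npm ci so that the browser is downloaded
# into the cache directory it points to (assumes the file is .puppeteerrc.cjs).
COPY package*.json .puppeteerrc.cjs ./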

# Install Node.js dependencies
RUN npm ci

# Copy the rest of the application code
COPY . .

# Build the application
RUN npm run build

# Expose the API port
EXPOSE 3000

# Start the application
CMD ["node", "dist/main"]

In the Dockerfile above, I defined the node engine/version, created the working directory, and installed a browser and its accompanying dependencies.
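Before wiring the service into a request handler, it can be worth running a quick smoke test inside the container. The sketch below renders a tiny page to PDF and checks that the result starts with the `%PDF` signature. The launcher is passed in as a parameter so the check can also be exercised without a real browser. `smokeTest`, `BrowserLike` and `PageLike` are hypothetical names that cover only the Puppeteer methods used here.

```typescript
// Minimal structural types for the parts of Puppeteer this check touches.
interface PageLike {
  setContent(html: string): Promise<void>;
  pdf(): Promise<Uint8Array>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

// Launches a browser, renders a trivial page to PDF and verifies the header.
async function smokeTest(
  launch: () => Promise<BrowserLike>,
): Promise<boolean> {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    await page.setContent('<p>hello</p>');
    const pdf = await page.pdf();
    // Every PDF file begins with the ASCII bytes "%PDF".
    return (
      pdf[0] === 0x25 && pdf[1] === 0x50 && pdf[2] === 0x44 && pdf[3] === 0x46
    );
  } finally {
    // Close the browser whether or not rendering succeeded.
    await browser.close();
  }
}

// Usage inside the container (assumes puppeteer is installed):
//   import puppeteer from 'puppeteer';
//   const ok = await smokeTest(() =>
//     puppeteer.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox'] }),
//   );
//   console.log(ok ? 'Puppeteer works' : 'Unexpected PDF output');
```

If the browser or one of its system libraries is missing, the launch call rejects and the error message usually names what could not be found, which is a faster feedback loop than waiting for the first real request to fail.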

Voila! Puppeteer should now work in your Docker container. Please do not forget to like, share and drop a comment if you found this article helpful.

Written by Otobong Peter

Software Engineer - Passionate about building tools & services that people care about. In another universe, I care about Leadership and People.
