
In addition to a new reporting feature, the BeezKeeper 1.2 release includes the ability to export reports as PDFs. As with most features, this was much more complex to implement than we initially anticipated, but with a little ingenuity we were able to devise a solution. A major requirement of this project was the ability to inline our external CSS files. There were some libraries available, but none gave us the flexibility we needed. Most fetched and inlined an entire external webpage, whereas we needed to be more selective. As a result, we wrote our own module, which we intend to publish in the future, as it could be useful to someone else.

Once we inlined the CSS, the entire HTML webpage could be sent to the backend as a string to be rendered as a PDF. Eventually, we would like to generate a PDF report from scratch, bypassing the need to render from HTML. However, given the timeframe, we decided to target a more immediate goal.

As stated above, most other libraries inline an entire webpage. This didn’t suit our needs for several reasons:

The NetBeez Dashboard is a single, dynamic HTML page. Using one of these libraries would give us no control over which “view” was inlined.
Related to the previous point, we wanted to operate on the report the user is currently working on, not an entire page.
There are some parts of the Reports page that shouldn’t be in the final PDF report. For example, the navigation bar and the filtering widget don’t belong in the final report file.

Overview

As we didn’t want the entire webpage to be rendered, we first had to find a suitable <div> that encompassed the entire report without any extraneous chrome. This took a good bit of trial and error. After some time, we were able to choose a uniquely identifiable element that contained more or less what we wanted.

When the user chooses to create a Report, the PDF is automatically generated alongside it. To achieve this, we first clone the element. Otherwise, the processing would modify the Report the user is currently viewing. After all the processing, which is discussed later on, the cloned element is appended to a new page (created in a separate tab) and is ready for conversion.
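The clone-then-process step might look roughly like the sketch below. It is an illustration under assumptions, not the module's actual code: `cloneForExport` and the example selectors are hypothetical names, and it uses the standard DOM methods `cloneNode`, `querySelectorAll`, and `remove` rather than the jQuery `.clone()` call that appears later in this post.

```javascript
// Hypothetical helper: copy the report element and strip any parts that
// should not appear in the PDF. The original element is never modified.
function cloneForExport(reportElement, excludedSelectors) {
  // cloneNode(true) performs a deep copy, including all descendants.
  const clone = reportElement.cloneNode(true);

  // Remove unwanted descendants from the copy only.
  excludedSelectors.forEach((selector) => {
    clone.querySelectorAll(selector).forEach((node) => node.remove());
  });

  return clone;
}
```

In a browser this could be called as `cloneForExport(document.querySelector('#report'), ['.nav-bar', '.filter-widget'])`, where `#report`, `.nav-bar`, and `.filter-widget` are placeholder selectors.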

The new page is sent to the backend as raw text (a string). With the help of wkhtmltopdf, the string contents are rendered in a headless browser and converted to PDF. This also required trial and error, as the renderer accepts a variety of configuration options that affect the appearance of the final document.
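As a rough illustration of the conversion step, the sketch below pipes an HTML string through the wkhtmltopdf command-line tool. It assumes a Node.js backend, which this post does not state; `buildWkhtmltopdfArgs` and `renderPdf` are hypothetical names, and the option values are examples rather than the configuration we settled on. The flags themselves are standard wkhtmltopdf flags, and `-` tells the tool to read from stdin or write to stdout.

```javascript
const { spawn } = require('child_process');

// Hypothetical option set: the flags are real wkhtmltopdf flags, but the
// default values here are illustrative only.
function buildWkhtmltopdfArgs({ pageSize = 'A4', orientation = 'Portrait', marginMm = 10 } = {}) {
  return [
    '--quiet',
    '--page-size', pageSize,
    '--orientation', orientation,
    '--margin-top', `${marginMm}mm`,
    '--margin-bottom', `${marginMm}mm`,
    '--print-media-type', // apply @media print rules when rendering
    '-', // read the HTML document from stdin
    '-', // write the PDF to stdout
  ];
}

// Render an HTML string to a PDF Buffer by piping it through wkhtmltopdf.
function renderPdf(html, options) {
  return new Promise((resolve, reject) => {
    const child = spawn('wkhtmltopdf', buildWkhtmltopdfArgs(options));
    const chunks = [];

    child.stdout.on('data', (chunk) => chunks.push(chunk));
    child.on('error', reject);       // e.g. the binary is not installed
    child.stdin.on('error', reject); // e.g. the process died before reading
    child.on('close', (code) => {
      if (code === 0) resolve(Buffer.concat(chunks));
      else reject(new Error(`wkhtmltopdf exited with code ${code}`));
    });

    child.stdin.end(html);
  });
}
```

Keeping the argument list in its own function makes it easier to experiment with the renderer's configuration without touching the process handling.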

After rendering, the PDF is returned to the frontend as a blob, stored in the DOM, and linked to a download button. Technically, the PDF has already been downloaded by the time the user clicks the button. Pre-rendering and pre-downloading it this way speeds up the process for the user and puts minimal strain on the backend.
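One way to wire a blob to a download button is through an object URL, as in the minimal sketch below. `attachPdfToLink` is a hypothetical helper name, and `link` is assumed to be an anchor element (or any object with `href` and `download` properties); `URL.createObjectURL` is the standard web API.

```javascript
// Hypothetical helper: point a download link at PDF data already in memory.
function attachPdfToLink(link, pdfBlob, filename) {
  // The object URL references the blob held by the browser, so clicking
  // the link triggers no further request to the backend.
  link.href = URL.createObjectURL(pdfBlob);
  link.download = filename; // suggested file name for the saved PDF
  return link;
}
```

When the report is closed, calling `URL.revokeObjectURL(link.href)` releases the memory held by the blob reference.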

Processing: Inlining CSS

Before sending the web page to the server for conversion, we had to do some processing, namely inlining our external CSS files. At this point, we had already cloned the proper element and stored it in memory. While our module only supports inlining CSS, it would require minimal effort to include support for scripts and other external resources.

function getDomCssHrefs(){
  return _.chain(document.styleSheets) // <in> -> <out>
    .pluck('href')                     // [{foo:bar, href: url}] -> [url]
    .filter(_.isString)                // [null, "www.google.com"] -> ["www.google.com"]
    .value();                          // unwrap underscore chain
}

function processHrefs(cssHrefs){ // fetch each stylesheet and build one jQuery-wrapped block of style tags
  return Promise
    .all(requestHrefsContents(cssHrefs))
    .then(buildStyleTags) //wraps css in style tags
    .then(concatStrings)  //concats all style tags into one string
    .then(wrapInElement); //wrap in jquery element
}

function buildNewPage(bodyContent, cssStylesObj){
  const newPage = createEmptyPage();
  const pageToStringFn = _.partial(pageToString, newPage);
  const closeNewPageFn = _.partial(closeWindow, newPage);

  // note: compose works right to left (appendObjectsTo is called first)
  const buildFn = _.compose(wrapInHtmlTags, pageToStringFn,
                            closeNewPageFn, addPageBreakStylesToSvg,
                            adjustBodyView, appendObjectsTo);

  return buildFn(newPage, bodyContent, cssStylesObj);
}



// the main inline function
function inline(_bodyContent){
  const buildNewPageFn = _.partial( buildNewPage, _bodyContent.clone() );

  return processHrefs( getDomCssHrefs() ).then(buildNewPageFn);
}

Initially, all the stylesheet elements in the document are queried. Then the stylesheet URL (known as “href”) is plucked from each element, and some filtering is applied to clean up the results and take edge cases into account, such as inline stylesheets whose href is null. Next, the style contents at each URL are fetched, wrapped in style tags, concatenated into a single string, and wrapped in a jQuery element. That element is then appended to the head of the new page.
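The helpers called by `processHrefs` are not shown in this post. The sketch below is one possible dependency-free version of the same flow, written without Underscore or jQuery; `collectHrefs`, `buildStyleMarkup`, `inlineStylesheets`, and the injected `fetchText` callback are all hypothetical names, not the module's real API.

```javascript
// Keep only stylesheets that were loaded from a URL. Inline <style>
// blocks have href === null and are skipped.
function collectHrefs(styleSheets) {
  return Array.from(styleSheets)
    .map((sheet) => sheet.href)
    .filter((href) => typeof href === 'string');
}

// Wrap each stylesheet's text in a <style> tag and join the tags
// into a single string.
function buildStyleMarkup(cssTexts) {
  return cssTexts.map((css) => `<style>${css}</style>`).join('\n');
}

// fetchText is injected so the sketch does not depend on a particular
// HTTP client. It must return a Promise that resolves to the CSS text.
function inlineStylesheets(styleSheets, fetchText) {
  const requests = collectHrefs(styleSheets).map((href) => fetchText(href));
  return Promise.all(requests).then(buildStyleMarkup);
}
```

In a browser, this could be called as `inlineStylesheets(document.styleSheets, (href) => fetch(href).then((res) => res.text()))`, and the resulting string inserted into the head of the new page.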

To apply this process to any other external resource, the initial query would need to be changed and the raw contents would need to be wrapped in the proper tag. For example, JavaScript would need to be wrapped in script tags.
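A generalized wrapper could be sketched as a lookup from resource kind to wrapping function. This is a hypothetical extension, not something our module currently does; `WRAPPERS` and `wrapResource` are illustrative names.

```javascript
// Hypothetical generalization: each resource kind maps to a function that
// wraps fetched content in the tag it needs.
const WRAPPERS = {
  css: (content) => `<style>${content}</style>`,
  // A literal "</script" inside the code would close the tag early, so it
  // is escaped; "<\/script" means the same thing inside JavaScript strings.
  js: (content) => `<script>${content.replace(/<\/script/gi, '<\\/script')}</script>`,
};

function wrapResource(kind, content) {
  const wrap = WRAPPERS[kind];
  if (!wrap) {
    throw new Error(`Unsupported resource kind: ${kind}`);
  }
  return wrap(content);
}
```

Supporting another resource type would then mean adding one entry to the lookup and changing the initial query that collects the URLs.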

Recap

This method has its flaws. Rendering the PDF from scratch, without all the web page manipulation and conversion, would create a cleaner and more flexible document. However, our delivery timeline meant that this would have to be done at a later time. The advantages of our method are that it is quick to implement, simple to understand, and works for most use cases.

The inlining tools we found weren’t flexible enough for our needs because they assume a whole page is desired. In many cases inlining a whole page is useful, but sometimes more selective control is needed.

Future releases will definitely improve the PDF functionality to make it more useful for all our users. If you have any suggestions or comments, we welcome you to contact us.

Dynamically Inlining CSS for PDF Export
