The Features Of Using Rotativa To Generate PDF


Many developers need to generate PDF reports for web applications; it is quite a natural request. I used Rotativa to generate reports. In my opinion, it is one of the most convenient libraries in its segment, but while using it I ran into several non-obvious points that I want to talk about.

To be completely honest, I want to share the pitfalls I ran into while integrating the library into a real project. The library itself is, no doubt, fast and very convenient.

In this article, I will not go into the question of choosing a library. Everyone may have their own reasons for using one or another. I chose Rotativa because it had everything needed to meet the customer's requirements at minimal setup cost. Besides it, I tried three or four other options.

Formulation of the problem

The web application runs on ASP.NET MVC with .NET 4.6. Other details do not matter in this context, with the exception of deployment: the application is to be deployed on Azure. This is important because some other libraries (for example, HiQPdf) do not work in certain Azure configurations, and this is documented.

I need to open a static HTML report at one link and a PDF version of the same report at a second link. The report itself is simply a set of tables, fields, and charts shown to the user. Both versions are expected to have a menu for navigating the report sections, tables, and some visual styling (colors, text size, borders).

Rotativa library application

In my opinion, Rotativa is about as easy to apply as it gets.

  • Let’s assume that you already have an HTML report in the form of a template and an ASP.NET MVC controller, such as this:

[HttpGet]
public async Task<ActionResult> Index(int param1, string param2)
{
    var model = await service.GetReportDataAsync(param1, param2);
    return View(model);
}

  • Install the Rotativa NuGet package
  • Add a new controller action for the PDF report

[HttpGet]
public async Task<ActionResult> Pdf(int param1, string param2)
{
    var model = await service.GetReportDataAsync(param1, param2);
    return new ViewAsPdf("Index", model);
}

In essence, from now on, you have a PDF returned as a file containing all the data from the original HTML report.

I did not describe the routing here, but it is assumed that you have configured the routes to correctly reach both actions.

Interestingly, the library is essentially a wrapper around the well-known console utility wkhtmltopdf. It is fast, and you can count on it to work on Azure. But there are quirks, which we will talk about next.

Page Number

It is logical to assume that the customer will print the PDF and will want to see the page number. It’s all very simple, thanks to the creators of Rotativa.

According to the Rotativa documentation, you can use the CustomSwitches parameter to specify arguments that are passed to the wkhtmltopdf utility itself, and there are plenty of examples online. The following call adds the page number to the bottom of each page:


return new ViewAsPdf("Index", model)
{
    PageMargins = new Rotativa.Options.Margins(10, 10, 10, 10),
    PageSize = Rotativa.Options.Size.A4,
    PageOrientation = Rotativa.Options.Orientation.Portrait,
    // Passed through to wkhtmltopdf: page number centered in the footer, 8 pt font
    CustomSwitches = "--page-offset 0 --footer-center [page] --footer-font-size 8"
};

The full list of placeholders, including [page]:

  • [page] Replaced by the number of the pages currently being printed
  • [frompage] Replaced by the number of the first page to be printed
  • [topage] Replaced by the number of the last page to be printed
  • [webpage] Replaced by the URL of the page being printed
  • [section] Replaced by the name of the current section
  • [subsection] Replaced by the name of the current subsection
  • [date] Replaced by the current date in system local format
  • [isodate] Replaced by the current date in ISO 8601 extended format
  • [time] Replaced by the current time in system local format
  • [title] Replaced by the title of the current page object
  • [doctitle] Replaced by the title of the output document
  • [sitepage] Replaced by the number of the page in the current site being converted
  • [sitepages] Replaced by the number of pages in the current site being converted
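
The wkhtmltopdf manual also describes HTML footers: a page passed with --footer-html receives the same values as query-string parameters. Below is a minimal sketch of reading them, written in ES5 syntax. I did not use this in my project; parseFooterVars and applyFooterVars are illustrative names, not part of wkhtmltopdf or Rotativa.

```javascript
// Parses the query string that wkhtmltopdf appends to a --footer-html page,
// e.g. "?page=3&topage=12", into a plain object of placeholder values.
function parseFooterVars(search) {
  var vars = {};
  var query = search.charAt(0) === '?' ? search.substring(1) : search;
  if (query === '') {
    return vars;
  }
  var pairs = query.split('&');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    var key = eq === -1 ? pairs[i] : pairs[i].substring(0, eq);
    var value = eq === -1 ? '' : pairs[i].substring(eq + 1);
    vars[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return vars;
}

// Writes each value into the elements whose class matches the placeholder name,
// e.g. <span class="page"></span> of <span class="topage"></span>.
function applyFooterVars(doc, vars) {
  for (var name in vars) {
    if (!vars.hasOwnProperty(name)) {
      continue;
    }
    var elements = doc.getElementsByClassName(name);
    for (var j = 0; j < elements.length; j++) {
      elements[j].textContent = vars[name];
    }
  }
}
```

In the footer page itself, the two functions would be wired up with something like onload="applyFooterVars(document, parseFooterVars(document.location.search))" on the <body> tag.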

Table of contents

Large multi-page reports require content and page navigation in PDF. This is very convenient and simply vital when the number of pages in the report exceeds one hundred.

The wkhtmltopdf manual contains a complete list of all parameters, among them --toc. With this parameter, the utility collects all the heading tags <h1>, <h2>, … <h6> in the document and generates a table of contents from them. Accordingly, your HTML template has to use these heading tags properly.

But in reality, adding --toc has no effect at all, as if the parameter were not there, while other parameters do work. Thanks to a forum post, I discovered that this parameter has to be passed without hyphens: toc. Indeed, in this case the table of contents is added as the very first page. Clicking a line in it takes you to the right page of the document, and the page numbers are correct.

It is not entirely clear to me how to customize the styles of the table of contents; I have not done that yet.

JavaScript Execution

The next point I encountered was the need to add charts to the report. My HTML page contains JS code that draws charts using the dc.js library. Here is an example:


function initChart() {
    // Razor serializes the chart data from the model into a JavaScript literal
    renderChart(@Html.Raw(Json.Encode(Model.Chart_1_Data)), 'chartDiv_1');
}

function renderChart(data, chartElementId) {
    var colors = ['#03a9f4', '#67daff', '#8bc34a'];
    var barHeight = 45;
    var clientHeight = data.length * barHeight + 50;
    // Take the width from the chart's container element
    var clientWidth = document.getElementById(chartElementId).offsetWidth;

    var chart = dc.rowChart('#' + chartElementId);
    var ndx = crossfilter(data);
    var dimension = ndx.dimension(d => d.name);
    var group = dimension.group().reduceSum(d => d.value);

    chart
        .width(clientWidth)
        .height(clientHeight)
        .margins({ top: 16, right: 16, bottom: 16, left: 16 })
        .ordinalColors(colors)
        .dimension(dimension)
        .group(group)
        .xAxis()
        .scale(d3.scaleLinear().domain([0, 2]).range([1, 3]).nice());
    chart.render();
}

The HTML contains the corresponding element:


<div id="chartDiv_1" class="dc-chart"></div>

For this code to work, you must import the appropriate libraries: dc.js, d3.js, and crossfilter.js. A call to the initChart function creates a chart and inserts the resulting SVG into the specified element in the tree.

But the PDF contains no trace of the charts, nor any other trace of JavaScript being executed before the PDF is rendered. This is quite easy to check: just add elementary code that creates a simple <div> element with text, purely to test whether JavaScript has been called.
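
Such a check could look like the sketch below. It uses only ES5 syntax to keep the probe as plain as possible; addJsProbe and the js-probe id are illustrative names chosen for this example.

```javascript
// Appends a visible marker element to the page. If the text shows up
// in the generated PDF, the script at this location was executed.
function addJsProbe(doc) {
  var marker = doc.createElement('div');
  marker.id = 'js-probe'; // hypothetical id, only used to find the marker
  marker.appendChild(doc.createTextNode('JavaScript was executed'));
  doc.body.appendChild(marker);
  return marker;
}
```

Call addJsProbe(document) from the script location you want to test, once <body> exists, and look for the text in the resulting PDF.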

I found out experimentally that the location of the JS code matters a great deal to wkhtmltopdf. JS code located at the end of <html> or, say, at the end of <body> is not executed. It seems that the utility simply does not notice it there, or does not expect to find it there.

But code inside <head> is executed. Thus, I arrived at a scheme where the JavaScript code is located after the style declarations inside the <head> tag and is invoked by the usual construction:


<body onload="initChart()">

In this case, the code will be executed as expected.

JavaScript Limitations

But there were still no charts in the output PDF. Then I began to suspect that wkhtmltopdf is not a full-fledged browser: its rendering and script engine is probably not perfect and does not understand the latest syntax. Again by experiment, I found out that arrow functions are not understood. And if the interpreter encounters something it does not know, it simply stops working.

Replacing arrow functions of the form x => x.value with the more classical function (x) { return x.value; } helped: all the code was executed, and the resulting chart made it into the PDF file.
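
Applied to the chart code above, the rewrite of the two accessors of the form d => d.name and d => d.value might look like this sketch. The names getName, getValue, and buildGroup are illustrative; the crossfilter function is passed in as a parameter only to keep the example self-contained. The sketch also sticks to var throughout, which is a precaution on my part rather than something tested above.

```javascript
// ES5 accessors: classic function expressions instead of arrow functions.
var getName = function (d) { return d.name; };
var getValue = function (d) { return d.value; };

// Builds the dimension and group exactly as renderChart does,
// but without any arrow functions.
function buildGroup(crossfilterFn, data) {
  var ndx = crossfilterFn(data);
  var dimension = ndx.dimension(getName);
  var group = dimension.group().reduceSum(getValue);
  return { dimension: dimension, group: group };
}
```

Inside renderChart this would be used as var parts = buildGroup(crossfilter, data); followed by chart.dimension(parts.dimension).group(parts.group).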

Chart Width

I found out experimentally that the width of the chart's parent element has to be specified explicitly. For this, I defined the dc-chart style, which contains the width of the chart in pixels. Otherwise, the chart in the PDF will be very small, even though in the HTML version it occupies the entire width. Specifying the width in percent works only for HTML.
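
As a complementary guard on the script side (this is not the CSS fix described above, and I have not verified it against wkhtmltopdf), the width passed to the chart could fall back to a fixed pixel value when the measured width is missing or implausibly small. The function name, the fallback of 700 pixels, and the 200 pixel threshold are all assumptions made for this sketch.

```javascript
// Hypothetical fallback width in pixels; pick a value that fits your PDF page.
var PDF_FALLBACK_WIDTH = 700;

// Returns the element's measured width, or the fallback when the element
// is missing or its width is below the given minimum.
function resolveChartWidth(element, fallbackWidth, minWidth) {
  var measured = element && element.offsetWidth ? element.offsetWidth : 0;
  return measured >= minWidth ? measured : fallbackWidth;
}
```

In renderChart, the width would then be computed as resolveChartWidth(document.getElementById(chartElementId), PDF_FALLBACK_WIDTH, 200).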

Inline JavaScript/CSS

Finally, I would like to note that many HTML-to-PDF conversion libraries accept a baseUrl parameter. This is the URL the converter uses to resolve relative paths to imported CSS styles, JavaScript files, or fonts. I cannot say exactly how this works in Rotativa, because I took a different approach.

To speed up the initial loading of the report and eliminate the very source of problems with embedding script files or styles when converting, I embed the necessary JS and CSS directly into the body of the HTML template.

To do this, create the appropriate bundles:


public class BundleConfig
{
    public static void RegisterBundles(BundleCollection bundles)
    {
        bundles.Add(new StyleBundle("~/Styles/report-html")
            .Include("~/Styles/report-common.css")
            .Include("~/Styles/report-html.css")
        );

        bundles.Add(new StyleBundle("~/Styles/report-pdf")
            .Include("~/Styles/report-common.css")
            .Include("~/Styles/report-pdf.css")
        );

        bundles.Add(new ScriptBundle("~/Scripts/charts")
            .Include("~/Scripts/d3/d3.js")
            .Include("~/Scripts/crossfilter/crossfilter.js")
            .Include("~/Scripts/dc/dc.js")
        );
    }
}

Register these bundles in Global.asax.cs:


protected void Application_Start()
{
    ...
    BundleConfig.RegisterBundles(BundleTable.Bundles);
}

Then add the methods that embed the code into the page. They must be placed in the same namespace as Global.asax.cs so that they can be called from the HTML template:


using System.Linq;
using System.Web;
using System.Web.Mvc;
using System.Web.Optimization;

public static class HtmlHelperExtensions
{
    // Renders the combined CSS of a bundle as an inline <style> tag.
    public static IHtmlString InlineStyles(this HtmlHelper htmlHelper, string bundleVirtualPath)
    {
        string bundleContent = LoadBundleContent(htmlHelper.ViewContext.HttpContext, bundleVirtualPath);
        string htmlTag = $"<style type=\"text/css\">{bundleContent}</style>";
        return new HtmlString(htmlTag);
    }

    // Renders the combined JavaScript of a bundle as an inline <script> tag.
    public static IHtmlString InlineScripts(this HtmlHelper htmlHelper, string bundleVirtualPath)
    {
        string bundleContent = LoadBundleContent(htmlHelper.ViewContext.HttpContext, bundleVirtualPath);
        string htmlTag = $"<script type=\"text/javascript\">{bundleContent}</script>";
        return new HtmlString(htmlTag);
    }

    // Finds the registered bundle by its virtual path and returns its generated content.
    private static string LoadBundleContent(HttpContextBase httpContext, string bundleVirtualPath)
    {
        var bundleContext = new BundleContext(httpContext, BundleTable.Bundles, bundleVirtualPath);
        var bundle = BundleTable.Bundles.Single(b => b.Path == bundleVirtualPath);
        var bundleResponse = bundle.GenerateBundleResponse(bundleContext);
        return bundleResponse.Content;
    }
}

Well, the final touch is a call from the template:


@Html.InlineStyles("~/Styles/report-pdf")
@Html.InlineScripts("~/Scripts/charts")

As a result, all the necessary CSS and JavaScript will be directly in HTML, although during development you can work with individual files.

Most likely, many will immediately think about the inefficiency of this approach from the point of view of caching requests by the browser. But I had two specific goals:

  • The PDF converter does not have to make extra requests for styles or scripts, and the user does not have to wait for them;
  • The first load of the PDF and HTML report takes minimal time, without waiting for several additional requests. In the context of my project, this is important.

Page breaks

Structuring a report into sections may be accompanied by requirements to start a new section from a new page. In this case, you can successfully use a simple CSS approach:


.page-break-before {
    page-break-before: always;
}

.no-page-break-inside {
    page-break-before: auto;
    page-break-inside: avoid;
}

The wkhtmltopdf utility reads these rules correctly and understands where a new page has to start. The first class, page-break-before, tells the utility to always start a new page at this element. The second class, no-page-break-inside, should be applied to elements that you want to keep whole on one page wherever possible, for example successive blocks of structured information or tables. If two blocks fit on the page, they are placed there together. If the third no longer fits, it moves to the next page instead of being split. If a block is larger than a page, splitting it is inevitable. It all works adequately and conveniently.

Flex Behavior in wkhtmltopdf

Well, the last quirk I noticed is related to flexbox styles. We are all accustomed to them, and almost all markup is built with flex. However, wkhtmltopdf lags slightly behind in this regard. Horizontal flex layout does not work (at least, it did not in my case). I read online that it is worth duplicating the flex styles as follows:


display: -webkit-flex;
display: flex;

flex-direction: row;
-webkit-flex-direction: row;
-webkit-box-pack: justify; /* wkhtmltopdf uses this one */
-webkit-justify-content: space-between;
justify-content: space-between;

But unfortunately, it did not lead to the expected layout in the PDF. I had to redo the layout of some elements so that the horizontal placement of the blocks met the requirements. If someone has successfully made flexbox work with wkhtmltopdf, please share. That would be quite helpful.