Dr. Dobb's | Automating Applications On-the-Fly

Third-party tools, C#, and Visual Studio 2005 can take the pain out of generating PDF content in ASP.NET.

I run the IT department for a preemployment screening company. Like most developers, I am always trying to get the most work out of limited resources, whether that means time, budget, or development staff. How many times have you faced a project whose requirements could be met with a third-party tool, but you didn't know whether the time saved would justify the cost of the tool? If you're like me, this happens often, and I'm sure you've had both good and bad experiences with buying and with building. We recently had a business problem that warranted a third-party component and, I'm happy to say, it met and exceeded our expectations.

The Problem

We offer clients a database screening (background check) that is made up of various federal, state, and county criminal record repositories. We purchase this information from our vendor by using their web service to make a request on behalf of our clients. The vendor maintains over 300 million records belonging to several hundred repositories. These repositories change constantly as new states and counties are added and others are removed or have their content restricted or expanded. When a customer executes a search, they typically want to know if a particular repository was included in that search. For example, if your applicant's address is in, say, Fresno, California, and he has lived there for many years, you would probably like to know whether the Fresno County Court was included in your database search. We maintain a PDF document that lists the record sources and make it available online. Until recently, this document was maintained by one of our operations staff, who checked the sources we claimed against the sources the vendor claimed. Any differences were corrected in our document, which was then redeployed to the web site; see Figure 1.

If a client had questions about which sources were being searched when a request was made, they could go to a Sources.aspx page that would load and display the PDF document. This workflow was fine when there were only a few changes a month, but as more sources became available from our vendor, maintaining this document became an hourly job and the employee started to fall behind on other work. Customers were getting upset because they felt the report they requested wasn't searching all available jurisdictions. In fact it was searching everything available, but the out-of-date source list didn't mention the newer repositories. Additionally, because we weren't listing the most current sources for our information, clients were beginning to think our product wasn't competitive.

The Solution

What we needed was a way to get the sources from the vendor, compare them against what we have, and, if need be, generate the PDF automatically (Figure 2). This would eliminate the need for an employee to spend valuable work-hours constantly updating this document, and it would improve the quality of the product.
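
The comparison step can be sketched in a few lines of C#. This is only an illustration under stated assumptions: the class and method names (SourceListComparer, NeedsRebuild) are hypothetical, and it assumes both the vendor's list and our published list can be reduced to plain repository names. It reports whether the two lists differ, which is the signal to regenerate the PDF.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original application.
public static class SourceListComparer
{
    // Returns true when the two lists do not contain exactly the same
    // repository names (order, letter case, and outer whitespace are ignored).
    public static bool NeedsRebuild(IEnumerable<string> vendorSources,
                                    IEnumerable<string> publishedSources)
    {
        Dictionary<string, bool> vendor = ToSet(vendorSources);
        Dictionary<string, bool> published = ToSet(publishedSources);

        // Different sizes mean something was added or removed.
        if (vendor.Count != published.Count)
        {
            return true;
        }

        // Same size: the sets are equal only if every vendor name is published.
        foreach (string name in vendor.Keys)
        {
            if (!published.ContainsKey(name))
            {
                return true;
            }
        }
        return false;
    }

    // .NET 2.0 has no HashSet<T>, so a Dictionary stands in for a set.
    private static Dictionary<string, bool> ToSet(IEnumerable<string> names)
    {
        Dictionary<string, bool> set =
            new Dictionary<string, bool>(StringComparer.OrdinalIgnoreCase);
        foreach (string name in names)
        {
            set[name.Trim()] = true;
        }
        return set;
    }
}
```

A scheduled job or the page itself could call NeedsRebuild and regenerate the document only when it returns true.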

Figure 2: Future workflow for building the PDF on-the-fly.

Third-Party PDF Solutions

We contacted the vendor, who was willing to augment their existing web service with a web method that returns their source information. Now we had to decide how to generate the PDF document. Knowing that it wasn't practical or reasonable to roll our own PDF generation tool, I began to look at third-party PDF solutions. I considered some open-source tools and other component vendors, but decided on PDF.Web, a component included with Syncfusion's Essential PDF suite (www.syncfusion.com), for several reasons:

Most of the other solutions we considered involved writing content first to a file, then printing the file with a PDF print driver. Some were just too heavy to be viable in a high-traffic web environment, and others were poorly documented. Some candidate solutions looked like they might fit the bill, but deployment turned out to be a problem because they relied on COM.

Using PDF.Web

Having selected PDF.Web, I began investigating how it works. I'd had lots of experience with PDF writers that depend on a standard coordinate system to lay out content and was expecting the same from PDF.Web. Although a coordinate system gives fine-grained control when laying out complex content, it's cumbersome and adds development time, since small layout errors can be hard to find. I was happy to see that Syncfusion offers both a Coordinate (Standard and Cartesian) system and something called "Document Logical Structure" (DLS), a "flow layout" manager. Knowing that my content was mainly text with a single masthead image, I elected to use the DLS layout and let it handle margins, page length, and other layout matters that usually make you want to pull your hair out.
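
To make the difference concrete, here is a toy sketch of the bookkeeping a flow layout manager takes off your hands. It is not how DLS is implemented; the FlowLayout and Placement types are hypothetical and exist only for illustration. The caller supplies block heights, and the layout decides the page and vertical position of each block, starting a new page when a block would cross the bottom margin.

```csharp
using System;

// Where a block ended up: hypothetical type for illustration only.
public struct Placement
{
    public readonly int Page;   // zero-based page index
    public readonly float Top;  // distance from the top edge of the page

    public Placement(int page, float top)
    {
        Page = page;
        Top = top;
    }
}

// A toy flow layout: tracks a vertical cursor so callers never
// compute coordinates or page breaks themselves.
public class FlowLayout
{
    private readonly float pageHeight;
    private readonly float margin;
    private int page;
    private float cursor;

    public FlowLayout(float pageHeight, float margin)
    {
        if (pageHeight <= 2 * margin)
        {
            throw new ArgumentException("Margins leave no room for content.");
        }
        this.pageHeight = pageHeight;
        this.margin = margin;
        this.cursor = margin;
    }

    public Placement Place(float blockHeight)
    {
        // Start a new page if the block would cross the bottom margin,
        // unless the cursor is already at the top of a fresh page.
        if (cursor + blockHeight > pageHeight - margin && cursor > margin)
        {
            page++;
            cursor = margin;
        }

        Placement placed = new Placement(page, cursor);
        cursor += blockHeight;
        return placed;
    }
}
```

With a coordinate system, every one of those positions would be computed and maintained by hand, which is where the hard-to-find layout errors come from.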

Components of the Solution

The application I describe here assumes you have Visual Studio 2005 and have installed the Syncfusion Essential Studio (trial versions can be downloaded at www.syncfusion.com/Products/Studio.aspx).

To begin using the PDF.Web component in our web project, we needed to add a few references. From the Syncfusion assemblies listed under the .NET tab of the Add Reference dialog, add references to Syncfusion.Core, Syncfusion.DLS.Base, Syncfusion.Pdf.Base, and Syncfusion.Shared.Base.

Once the references are added, import the namespaces you need. In particular, we import Syncfusion.Pdf, Syncfusion.Pdf.DLS, and Syncfusion.DLS. With the namespaces imported, go to the Default.aspx page's Page_Load event handler. The object we use to manipulate PDF documents in a DLS layout is PDFLogicalDocument. This class acts as the container object for the various sections and paragraphs in our document. Declare a new instance inside Page_Load; the examples that follow refer to it as doc.

Now we need to add a new "section" to the document. Sections are areas of a document that have formatting applied to them, and the format can vary from section to section. A section is represented by the IPDFSection interface, which we obtain by calling the AddSection() method of the PDFLogicalDocument. Sections let you format various characteristics of the page, including the page settings, alignment, and other options that apply to the page as a whole. Sections also act as the container for paragraphs. Create a new IPDFSection by calling the document's AddSection() method; the examples that follow refer to it as section.

Sections also let you specify the headers and footers of your PDF document. You can designate a different style for the first page and for odd and even pages by setting the Boolean properties DifferentFirstPage and DifferentOddAndEvenPages to true. PDF.Web also offers the IPDFParagraph interface, which controls spacing, borders, and other paragraph-level options. Most of these options are exposed through the property IPDFParagraph.ParagraphFormat. Let's create a new paragraph by calling our section's AddParagraph() method, then change a few format properties; see Example 1.

IPDFParagraph paragraph = section.AddParagraph();

//ParagraphFormat 
paragraph.ParagraphFormat.Borders.Color = Color.Navy;
paragraph.ParagraphFormat.Borders.BorderType = 
       Syncfusion.DLS.BorderStyle.Single;
//Set the alignment to the left
paragraph.ParagraphFormat.HorizontalAlignment = 
       HorizontalAlignment.Left;

Example 1

If you want to add text to a paragraph, you do so by calling the AppendText(string) method of the IPDFParagraph interface. This returns an IPDFTextRange interface, through which additional formatting can be applied to the text. To try out changing the font, color, and weight of some text, I add Example 2 to the paragraph.

IPDFTextRange textRange = paragraph.AppendText("Sources by State");
textRange.CharacterFormat.Bold = true;
textRange.CharacterFormat.FontSize = 18f;
textRange.CharacterFormat.FontName = "Arial";
textRange.CharacterFormat.TextColor = Color.Red;

Example 2

Building the format for each text range or paragraph can be tiresome if you have many paragraphs that share common formatting. There is another option. You can define a style once, make it persistent in your document, and use it in many different places. This is done using the IStyle interface. Let's say you want a header style similar to the one we defined above and you want to apply it multiple times in your document. By calling the AddStyle() method of the PDFLogicalDocument and specifying a StyleType enumeration value and a style name, we get back an IStyle interface. We then cast the IStyle to the specific style interface that matches the StyleType we passed in. In Example 3, we define a header paragraph style (headerParaStyle) for our document to replace the formatting above.

//A paragraph style 
IParagraphStyle headerParaStyle =
   doc.AddStyle(StyleType.ParagraphStyle, 
     "headerParaStyle") as IParagraphStyle;

//Character formatting carried by the paragraph style
headerParaStyle.CharacterFormat.FontSize = 18f;
headerParaStyle.CharacterFormat.FontName = "Courier";
headerParaStyle.CharacterFormat.TextColor = Color.Black;
headerParaStyle.CharacterFormat.Italic = true;

Example 3

Since we've defined this as a paragraph style, we can apply it to as many IPDFParagraph instances as we need.

Styles are such a benefit because of the flexibility they give us when formatting PDFs. If two customers each wanted a different-looking PDF document, you could have a class that retrieves the styles for the customer requesting the document, assigns the style names to the appropriate IStyle instances, and applies them dynamically.
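
A per-customer lookup of that kind might look like the following sketch. Everything here is hypothetical and deliberately independent of PDF.Web: HeaderStyleSettings and CustomerStyleCatalog are illustrative names, and the settings object simply carries the values you would copy onto a style's CharacterFormat when building the document.

```csharp
using System;
using System.Collections.Generic;

// Plain settings holder for one header style (hypothetical type).
public class HeaderStyleSettings
{
    public readonly string FontName;
    public readonly float FontSize;
    public readonly bool Italic;

    public HeaderStyleSettings(string fontName, float fontSize, bool italic)
    {
        FontName = fontName;
        FontSize = fontSize;
        Italic = italic;
    }
}

// Maps a customer identifier to that customer's header style,
// falling back to a default for customers with no custom look.
public class CustomerStyleCatalog
{
    private readonly Dictionary<string, HeaderStyleSettings> byCustomer =
        new Dictionary<string, HeaderStyleSettings>(
            StringComparer.OrdinalIgnoreCase);
    private readonly HeaderStyleSettings fallback;

    public CustomerStyleCatalog(HeaderStyleSettings fallback)
    {
        if (fallback == null)
        {
            throw new ArgumentNullException("fallback");
        }
        this.fallback = fallback;
    }

    public void Register(string customerId, HeaderStyleSettings settings)
    {
        byCustomer[customerId] = settings;
    }

    public HeaderStyleSettings Lookup(string customerId)
    {
        HeaderStyleSettings settings;
        if (customerId != null &&
            byCustomer.TryGetValue(customerId, out settings))
        {
            return settings;
        }
        return fallback;
    }
}
```

The page would call Lookup with the requesting customer's identifier before defining the document's styles, so the rest of the PDF-building code stays the same for every customer.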

Say you'd like to add a header image to the PDF. Images are added by first instantiating a System.Drawing.Image, then calling the paragraph's AppendPicture(Image) method, passing the newly created image as an argument. To add a header image, place Example 4 ahead of the code that appends the header text.

//Add an image to the header 
System.Drawing.Image logo = 
    System.Drawing.Image.
      FromFile(Server.MapPath("SelectionLogo.png"));
IPDFPicture logoPic = paragraph.AppendPicture(logo);
logoPic.Height = 47;
logoPic.Width = 415;

//add a little spacing in between the logo and the header text 
paragraph.ParagraphFormat.AfterSpacing = 30;	

//Add a paragraph for the text header
paragraph = section.AddParagraph();

Example 4

Remember to get a new IPDFParagraph by calling the section's AddParagraph() method before appending the text we created earlier, or your header image and text will share the same paragraph. If you render the PDF at this point, the content should look like Figure 3.

With the header in place, define the style for each state and its respective repositories. Our vendor's XML document defines "state" elements, each with "county" child elements. The counties contain details about the repository; see Listing One (available electronically; see www.ddj.com/code/).

Example 5 is the style we can use for the "state" XML elements. This style uses the TextBackgroundColor property to accent the state name, which gives the separation we are looking for between the state and its repositories.

IParagraphStyle stateContentHeader =
    doc.AddStyle(StyleType.ParagraphStyle, 
     "stateContentHeader") as IParagraphStyle;
stateContentHeader.CharacterFormat.Bold = true;
stateContentHeader.CharacterFormat.FontSize = 14f;
stateContentHeader.CharacterFormat.TextBackgroundColor = Color.Black;
stateContentHeader.CharacterFormat.FontName = "Arial";
stateContentHeader.CharacterFormat.TextColor = Color.White;
stateContentHeader.ParagraphFormat.LeftIndent = 10f;

Example 5

Finally, build the style to display the details under each "county" (Example 6).

IParagraphStyle contentRepositoryStyle =
    doc.AddStyle(StyleType.ParagraphStyle, 
     "contentRepositoryStyle") as IParagraphStyle;
contentRepositoryStyle.CharacterFormat.Bold = false;
contentRepositoryStyle.CharacterFormat.FontSize = 12f;
contentRepositoryStyle.CharacterFormat.FontName = "Arial";
contentRepositoryStyle.CharacterFormat.TextColor = Color.Black;
contentRepositoryStyle.ParagraphFormat.LeftIndent = 20f;

Example 6

When you need to display the PDF to users, you can stream the PDF content to the user's browser by calling the document's Save() method with the HttpReadType enumeration set to Open.

This "inlines" the PDF document in the user's browser. To prompt users to save the PDF file instead, set the HttpReadType to Save.
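
In plain HTTP terms, the choice between showing a PDF in the browser and prompting a download usually comes down to the Content-Disposition response header. The sketch below is not the PDF.Web call, and it is an assumption on my part that the component sets headers along these lines; the PdfDelivery and PdfResponseHeaders names are hypothetical. It only shows the two header values involved.

```csharp
using System;

// Hypothetical stand-in for the two delivery modes described above.
public enum PdfDelivery
{
    Inline,     // display inside the browser
    Attachment  // prompt the user to save the file
}

public static class PdfResponseHeaders
{
    // The standard MIME type for PDF content.
    public const string ContentType = "application/pdf";

    // Builds the Content-Disposition header value for the chosen mode.
    public static string ContentDisposition(PdfDelivery delivery,
                                            string fileName)
    {
        string kind =
            delivery == PdfDelivery.Inline ? "inline" : "attachment";

        // Drop quotes and line breaks so the header stays well formed.
        string safeName = fileName.Replace("\"", "")
                                  .Replace("\r", "")
                                  .Replace("\n", "");

        return kind + "; filename=\"" + safeName + "\"";
    }
}
```

If you were streaming the bytes yourself in ASP.NET, you would set Response.ContentType to the constant above and pass the returned string to Response.AddHeader("Content-Disposition", ...) before writing the content.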

When we iterate over our vendor's XML document and apply these styles, the generated PDF looks like Figure 4.
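
The iteration itself can be sketched with System.Xml. This is a minimal sketch under stated assumptions: Listing One is not reproduced here, so the "name" attribute on the "state" and "county" elements is hypothetical, as is the SourceXmlReader class. The sketch flattens the XML into (style name, text) pairs in document order; adding a paragraph for each pair and applying the named style is left to the PDF-building code.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper; the "name" attribute is an assumed schema detail.
public static class SourceXmlReader
{
    // Returns one (style name, text) pair per state and per county,
    // in the order they appear in the XML.
    public static List<KeyValuePair<string, string>> ReadLines(string xml)
    {
        List<KeyValuePair<string, string>> lines =
            new List<KeyValuePair<string, string>>();

        XmlDocument document = new XmlDocument();
        document.LoadXml(xml);

        foreach (XmlElement state in document.GetElementsByTagName("state"))
        {
            // Each state becomes a header line.
            lines.Add(new KeyValuePair<string, string>(
                "stateContentHeader", state.GetAttribute("name")));

            foreach (XmlNode child in state.ChildNodes)
            {
                XmlElement county = child as XmlElement;
                if (county == null || county.Name != "county")
                {
                    continue; // skip whitespace, comments, other elements
                }

                // Each county becomes a detail line under its state.
                lines.Add(new KeyValuePair<string, string>(
                    "contentRepositoryStyle", county.GetAttribute("name")));
            }
        }
        return lines;
    }
}
```

Keeping the XML walk separate from the PDF calls also makes it easy to check the vendor data on its own before any document is generated.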

PDF.Web also supports PDF forms, actions, and security, including digital signatures, user/owner passwords, encryption, and operation restrictions. These behaviors can be added to your document through a simple, easy-to-implement object model. One of the best features of the tool is its thorough documentation and sample projects. Syncfusion distributes working projects for multiple versions of the .NET Framework as part of its install, so if you have questions, you can find most of the answers by reverse-engineering the samples, reading the forums on the web site, or perusing the FAQs section.

Conclusion

When this solution was put into production, we saved 5.5 employee hours per week and improved the quality of the product. We were able to take our vendor's XML document and turn it into an attractive, professional-looking document that renders quickly. What I consider the greatest benefit is that the solution our team built is easy to maintain. Using the DLS, all of the layout headaches and display issues that go hand-in-hand with a coordinate system were eliminated. I also take comfort in knowing that, when a new release comes around, deployment is simple. From project inception to completion, we spent about three days developing and testing, followed by a quick deployment to production. Using a third-party component might not always be the best choice, but in this case PDF.Web sure makes it hard to choose anything else.