Many of our customers are using this user specific watermarking facility for security reasons by adding watermarks that include who opened a document, from where and at what time. This all works well, but the resulting PDF file was not encrypted and it was not possible to encrypt the file before watermarking as encrypted files cannot be modified with watermarks.
To cut a long story short, with the introduction of version 6.0 of the PDF Converter for SharePoint it is now possible to apply typical PDF Security settings to a file the moment it is opened or downloaded. Security is applied after a file is watermarked, so files can now have user specific watermarks and PDF security at the same time, woohooo!
The key features are as follows:
Apply security after user specific watermarks have been applied.
Apply typical PDF Security including Open Password, Owner Password, Prevent Printing, Prevent Copy, Prevent Document assembly, etc.
Allow filters to be specified and only apply security when a condition is met, e.g. a Status field is set to Approved, or the user that is accessing the document is in a specific group.
Apply security to files in Document Libraries as well as files attached to individual list items.
Works on all SharePoint 2007 and 2010 versions.
Let’s work through an example to show how easy it is to set this up.
By default the Secure / Watermark on open facility is disabled so use SharePoint Central Administration to enable the Muhimbi PDF Converter - Automatic PDF Processor Feature at the relevant Web Application. Note that this is a Web Application Scoped Feature, not a Farm or Site Collection scoped one. You also need to enable the Muhimbi PDF Converter - Automatic PDF Processing User Interface Feature at either the Web Application Level (to enable the screen on all Site Collections) or at the individual Site Collection level.
Once enabled, a new menu named PDF security settings can be found in the Site Actions / Site Settings screen as well as the List Settings screen on each individual List and Document Library. Default security settings can optionally be specified at the Site Collection level, which can then be inherited at the individual List or Library Level, which is displayed in the following screen.
As you can see in the screenshot there are also options to enable security during Insert and Update events. However, the focus of this article is Secure On Open. In this screenshot we have specified both an Open and an Owner Password. The Owner Password must be set when any of the PDF Security Options are selected; the Open Password is optional.
In the same screenshot we have also specified a filter to only secure documents when the person opening the file is in the Test Visitors SharePoint group. Please note that you can only use SharePoint Group names, not Windows Group names.
That is all there is to it. When a PDF file is opened from the Document Library, and the user opening it is a member of the Test Visitors group, then PDF Security will be applied automatically to the file without modifying the original in the List or Document Library.
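The filter decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how the product implements it: the function name, its parameters, the case-insensitive comparison and the 'no filter means always secure' behaviour are all assumptions made for this example.

```python
def should_apply_security(user_groups, required_group=None):
    """Return True when PDF security should be applied on open.

    user_groups    -- SharePoint group names the current user belongs to
    required_group -- SharePoint group named in the filter, or None for no filter
    Both parameters are hypothetical; the real filter is configured in the UI.
    """
    if required_group is None:
        # Assumption: without a filter, security applies to everyone.
        return True
    # Assumption: group names are compared case-insensitively.
    wanted = required_group.strip().lower()
    return any(group.strip().lower() == wanted for group in user_groups)
```

A member of Test Visitors would receive the secured copy, while a user who is only in another group would receive the file as stored.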
Please note that securing files this way is a real-time action and adds some overhead. If there is no need to apply security in combination with user specific watermarks, or based on a user specific filter, then we recommend applying security using a SharePoint Designer or Nintex workflow the moment a file is created or modified.
At Muhimbi we take great pride in always going the extra mile. Not only do we add features and functionality specifically requested by our customers, but sometimes we add a ‘wildcard’ feature that no one has asked for. One of the wildcards we added to the very first version of the Muhimbi PDF Converter for SharePoint was the ability to copy meta data while converting a file and, based on customer feedback, people absolutely love it.
Over the years we have received various requests for additions and changes to this meta-data copying facility, but as we don’t want to cause any backwards compatibility issues we were unable to facilitate these requests. For example, half the requests were for copying the source file’s content type as well, while the other half wanted to default to the library’s default content type. Sigh…. customers :-).
We thought we would never be able to please everyone, but I think we have actually cracked it as, with the introduction of version 6.0 of the PDF Converter for SharePoint, we have added a new stand-alone Workflow Activity to copy meta-data and set content types in one easy step.
From a very high level the functionality is as follows:
Standalone Workflow Activity that can be used in combination with any of Muhimbi’s Workflow Activities, or without them.
Copy all meta-data or only selected fields.
Copy meta-data to files in different folders or site collections.
Change the content type to either the source file’s, destination file’s, the default content type for the library or a specific named content type.
Copy content of Author, Created and Modified fields by explicitly specifying these field names. This information is not copied when the default ‘copy all fields’ option is enabled. It is not possible to copy the Editor field as that is always overwritten by the workflow. Please note that this functionality requires the PDF Converter version 8.0 or newer.
This document: The source document to copy the meta data from. For most workflows selecting Current Item will suffice, but some custom scenarios (e.g. Site workflows) may require the look up of a different item.
Fields: By default the content of all fields is copied to the destination file. However, if you wish to copy only specific fields then the field names can be specified in this list. You can separate fields using line breaks, ‘,’ or ‘;’ and you can use both internal and display field names.
Content type: While copying meta-data you have full control over the content type of the destination file. The following options can be specified:
(source): The content type of the source file is copied to the destination file. Please include the round brackets.
(target): The content type of the destination file is not modified and remains what it was before the copy operation. Please include the round brackets.
(default): The default content type for the document library is applied to the destination file. Please include the round brackets.
Name of Content type: The destination file is set to a specific, named, content type. Please do not use round brackets around the name of the content type.
Parameter ‘List ID’: The ID of the list that holds the file that the meta-data was copied to.
Parameter ‘List Item ID’: The ID of the file the meta-data was copied to. This can later in the workflow be used to perform additional tasks on the file such as performing a check-in or out.
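To make the Fields and Content type rules a little more concrete, here is a minimal Python sketch of how such values could be interpreted. It is not the Workflow Activity's actual code: the function names are hypothetical, and treating the bracketed keywords case-insensitively is an assumption of this example.

```python
import re


def parse_field_names(fields_text):
    """Split the 'Fields' value on line breaks, ',' or ';' and drop blanks.

    An empty result stands for the default of copying all fields.
    """
    parts = re.split(r"[,;\r\n]+", fields_text)
    return [part.strip() for part in parts if part.strip()]


def resolve_content_type(option, source_ct, target_ct, default_ct):
    """Map the 'Content type' option to the content type that should be applied."""
    keywords = {
        "(source)": source_ct,    # copy the source file's content type
        "(target)": target_ct,    # leave the destination file as it was
        "(default)": default_ct,  # use the library's default content type
    }
    key = option.strip()
    # Anything that is not a bracketed keyword is treated as a named content type.
    return keywords.get(key.lower(), key)
```

For example, parse_field_names("Title, Status;Author") yields the three field names, and resolve_content_type("Invoice", ...) returns "Invoice" unchanged because it is not wrapped in round brackets.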
Similar to our other Workflow Activities, this new Copy Meta Data facility is both simple and powerful and works in SharePoint 2007 as well as 2010.
As of version 6.0 the Muhimbi PDF Converter for SharePoint supports cross-conversion of documents. An absolutely brilliant new feature that, in addition to converting documents to PDF, makes it possible to convert between versions of file formats (doc to docx, xlsx to xls etc) and even between completely different file types (InfoPath to Word, Excel and HTML or Excel to MS-Word). A SharePoint Designer workflow example and a full chart outlining all possible combinations can be found here.
As we are big fans of Nintex Workflow we are making sure that Nintex Workflow is supported from day one. Similar to all other Nintex Activities provided by Muhimbi, the Convert Document activity integrates with Nintex Workflow at a deep level. It supports SharePoint 2007 as well as 2010, allows errors to be handled and even supports integration with Nintex’ iterators to deal with multiple items and loops. For a comprehensive example and details about how to enable the Nintex Workflow integration see this blog post that discusses our generic Nintex PDF Conversion activity.
Cross convert between document types using Nintex Workflow 2007 and 2010
Building a full example workflow is out of the scope of this post as it is very simple, especially considering it works almost identically to the existing Convert to PDF activity, with the following exceptions:
Output Format: This field is new and allows the output format to be specified, e.g. doc, xls, pdf, txt, csv, etc.
Output Item ID: The type of this field is Text rather than Item ID. The reason for this is that a future version of our software may return multiple, comma separated, values for certain actions. If you wish to pass this ID into a secondary activity then you may need to convert it to the correct data type using the Convert Value Workflow Activity. An example can be found here.
That is all! Very straightforward, especially if you are familiar with our other Nintex Workflow Activities. Please note that you may need to make some small modifications if you intend to convert InfoPath to Excel, HTML or MS-Word. For details see this blog post.
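Inside Nintex the Convert Value activity takes care of the type conversion. If the same text value ever ends up in custom code, the conversion could look like the Python sketch below. The function name is hypothetical, and handling multiple comma separated values is a precaution based on the note above rather than documented behaviour.

```python
def parse_item_ids(output_item_id):
    """Convert the text value of 'Output Item ID' into a list of integers."""
    ids = []
    for part in output_item_id.split(","):
        part = part.strip()
        if part:
            # int() raises ValueError when the text is not a whole number.
            ids.append(int(part))
    return ids
```

A single ID such as "42" becomes a one-element list, so code written against this helper keeps working if more than one value is returned in future.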
Muhimbi’s range of server based PDF Conversion products has been developed with performance, scalability and reliability in mind. As a result the software scales from the most humble ‘everything on the same server’ environments to environments that deal with millions of conversions a day across a farm of servers. In this article I will discuss the most common deployment scenarios.
Whenever this article mentions ‘Server’ it doesn’t matter if this is a physical or virtualised server. Our software does not differentiate between the two and will work fine on either type.
Introduction – Architecture
Both the PDF Converter for SharePoint and the PDF Converter Service ship with the same central conversion engine. This engine, The Muhimbi Conversion Service, is responsible for carrying out all the work including conversion of files, watermarking, merging, splitting and security activities. Although in case of our PDF Converter for SharePoint our front end is quite comprehensive, all it really does is prepare requests for the Conversion Service and receive responses containing new or modified documents.
If you are involved in deploying any of our PDF Conversion products then it is essential to know how our software works from an architectural perspective. The Muhimbi Conversion Service is a standard Windows Service that starts automatically when Windows boots up and requires no user interaction or anyone to be logged in to the server console.
This Windows Service contains a WCF based Web Service that exposes functionality to any Web Services capable environment including Java, C#, VB.NET, Documentum, SAP, PHP etc. Typically when administrators think about web services they assume that they need to host this inside a web server such as IIS or Apache. Although that may be true for many web services, Muhimbi’s PDF Conversion software runs inside a self-hosted WCF service that does not have any external dependencies on third party web servers.
By basing the conversion service on WCF we get a lot of benefits, including:
No external dependencies on web servers and other 3rd party products.
A mature framework with support for different message and transport types, built in security and advanced features such as MTOM encoding for large attachments.
And most importantly, all functionality is exposed via standard HTTP based Web Service requests.
This last point about requests being HTTP based is very important as it allows the Conversion Service to be scaled across multiple servers using standard hardware or software based load balancers, including the free NLBS that ships with Windows. By utilising a load balanced environment you can achieve linear scalability and automatic failover. (See these performance tests).
Example based on SharePoint Front ends, but Java and .NET deployments work the same.
For details about tuning the various options of the Muhimbi Conversion Service, and installation in general, see the Resources section at the end of this article.
Single Server Farm
The most basic configuration possible is to install everything on a single server. Just follow Chapter 2 in the Administration Guide (including all links to Appendices) and you are ready to go. There is nothing else to configure and, if needed, the Web Service can be accessed on the following URL:
For slightly larger deployments of 2 or more servers, but with a single conversion server, deployment is very simple as well. Let’s take the following example:
Server 1: A new or existing Application Server that will run the Muhimbi Conversion Service.
Server 2: A server that will run either a SharePoint WFE or a custom solution.
In this particular case it is a matter of installing the Conversion Service on Server 1 as per the instructions in Chapter 2 of the Administration Guide. No further changes are required, but remember that the web service URL is now:
Naturally you will need to replace ‘Server1’ with the actual host name of the server.
In case of a SharePoint deployment you need to deploy the WSP file to Server 2 and enter the Web Services URL in the PDF Converter’s SharePoint Central Administration screen. If you are not running SharePoint, but your own Java / .NET / etc solution then you will need to pass the above mentioned web service URL using whatever syntax is common for your platform.
Large farms with multiple conversion servers
Now, this is where things get interesting: a farm with multiple front end servers and multiple conversion servers. Let’s take the following example:
Server 1: A new or existing Application Server that will run the Muhimbi Conversion Service.
Server 2: A new or existing Application Server that will run the Muhimbi Conversion Service.
Server 3: A server that will run either a SharePoint WFE or a custom solution.
Server 4: A server that will run either a SharePoint WFE or a custom solution.
In this scenario the Conversion Service will need to be installed on Server 1 and Server 2 as per the instructions in Chapter 2 of the Administration Guide. Once installation is complete the web service can be reached on the following 2 URLs:
Although in theory you could build functionality into your software to alternate requests between these two URLs, it is much easier and more robust to use an off-the-shelf HTTP load balancer or Windows NLBS. How this works in detail differs per load balancer, but it usually involves creating a virtual host that listens for requests and configuring this virtual host to send requests to Server1 and Server2. In this example we assume that this virtual host is named LoadBalancer, resulting in the following Web Service URL:
In case of a SharePoint deployment you need to deploy the WSP file to Server 3 or 4. It doesn’t matter which one, the installer will automatically deploy it to all WFEs.
The LoadBalancer URL needs to be added to the PDF Converter’s SharePoint Central Administration page, or your application’s configuration file. When a request is made to the load balanced URL the load balancer will automatically detect which of the servers are available and only send requests to servers that are up and running. If multiple servers are up and running it will distribute requests between all servers.
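For readers who are curious what 'alternating requests in your own software' would involve, here is a minimal Python sketch of round-robin selection that skips unavailable servers. A real load balancer remains the recommended route. The class name, the is_available probe and the example.invalid URLs used in the usage note are all placeholders for illustration, not the real service address or a Muhimbi API.

```python
import itertools


class RoundRobinEndpoints:
    """Alternate between endpoints, skipping any that fail a health check."""

    def __init__(self, urls, is_available):
        self._urls = list(urls)
        # is_available is a hypothetical probe: a callable taking a URL, returning bool.
        self._is_available = is_available
        self._cycle = itertools.cycle(self._urls)

    def next_url(self):
        # Try each endpoint at most once per call, continuing after the last one used.
        for _ in range(len(self._urls)):
            url = next(self._cycle)
            if self._is_available(url):
                return url
        raise RuntimeError("no conversion server is available")
```

Created with two placeholder URLs such as http://server1.example.invalid/ and http://server2.example.invalid/, successive calls to next_url() alternate between them, and keep returning the first one while the probe reports the second as down.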
If you have multiple SharePoint Web Front End servers it may be tempting to install a copy of the Conversion Service on each WFE. Although this is a supported scenario we recommend deploying the Conversion Service on separate Application Servers to make sure that your WFEs’ resources are available for running SharePoint, which can be quite resource hungry by itself.
Please make sure you are licensed for the correct number of servers. Muhimbi's software is licensed on a 'per server' basis. This means that you need a license for every server that runs our software including all Web Front End servers in your SharePoint farm and all servers running the conversion service.
To make sure that our conversion service also works with less capable languages we have to target the lowest common denominator, which means Base64 encoding for all attachments. Unfortunately Base64 adds 33% overhead, so a 1MB file takes up roughly 1.33MB worth of bytes when sent over the wire. In most situations you will never notice this unless you are processing a particularly high volume of data or you….just…have…to…squeeze…out…every…last…byte.
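The 33% figure is easy to verify yourself. The short Python sketch below encodes a dummy payload, which merely stands in for a real file, and compares the sizes. Base64 writes 4 output bytes for every 3 input bytes, which is where the overhead comes from.

```python
import base64

payload = bytes(1_000_000)            # 1,000,000 zero bytes standing in for a 1MB file
encoded = base64.b64encode(payload)   # 4 output bytes for every 3 input bytes

overhead = len(encoded) / len(payload) - 1
print(len(encoded))                   # 1333336
print(f"{overhead:.1%}")              # 33.3%
```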
Fortunately the WCF framework that the Muhimbi Conversion Service has been built on is very flexible and switching encoding is almost trivial. Listed below are the steps to switch from Base64 to MTOM, which has almost no encoding overhead. For the time being you should not apply these changes to the PDF Converter for SharePoint as our SharePoint front end has no knowledge of MTOM (scheduled for 6.1), but you can safely apply the following changes to the PDF Conversion Service if you control all interaction with this web service using your own code.
Open Muhimbi.DocumentConverter.Service.exe.config in Notepad. A shortcut to the installation folder can be found in the Windows Start Menu.
Search for the following ‘binding’ element and add the messageEncoding attribute.
In your client side code you will need to enable MTOM encoding as well. How this is done depends on your platform of choice. In case of C# / .NET it is a matter of setting the MessageEncoding property on your binding.
binding.MessageEncoding = WSMessageEncoding.Mtom;
That is all there is to it. Naturally you can make this more complex by adding bindings for both encoding types, but that is an exercise I'll leave to the reader.