Embedding SharePoint Document IDs in PDF files and generating Short URLs

Posted at: 18:16 on 25 November 2011 by Muhimbi

Some time ago we wrote about some new functionality in the PDF Converter for SharePoint that allows ‘real-time’ and ‘user specific’ watermarks to be added to PDF files as soon as these files are opened / downloaded / copied. This has proven a real hit with our customers and one of the questions that recently came up is how to use this functionality in combination with SharePoint 2010’s Document ID Service. (Hint: the internal field names are _dlc_DocId, _dlc_DocIdUrl and _dlc_DocIdPersistId.)

In this post we’ll show how to automatically apply a watermark containing the Document ID to PDF files and then generate a Short URL based on this Document ID. Rather than using our watermark on open facility, we’ll create a SharePoint Designer workflow that carries out the work. Once a Document ID has been created it never changes so although real-time watermarking will work, it is not necessary and a waste of valuable processing resources.

Please note that the example in this post uses both the Muhimbi PDF Converter for SharePoint and the Muhimbi URL Shortener for SharePoint. If you are only interested in URL Shortening or just in Watermarking then you can change the sample accordingly. Also in this example we are not carrying out any PDF Conversion. Examples for PDF Conversion are available on our site for SharePoint Designer as well as Nintex Workflow.


OK, let’s get all the prerequisites in place first.

  1. Download the PDF Converter for SharePoint and install it as described in Chapter 2 of the included Administration Guide.
  2. Download and install the URL Shortener for SharePoint and configure it to accept short URLs on a web application of your choice as described in the included Administration Guide.
  3. If not already installed, download and install the free SharePoint Designer for your environment. 
  4. Make sure you have the appropriate privileges to create workflows on a site collection.
  5. Enable and configure SharePoint 2010’s Document ID Service


Item with Document ID Populated

With the prerequisites in place, let’s build our custom solution.

  1. On a Document Library of your choice, modify the View to include the Document ID field. (This is just for reference, you don’t need it in the real world.)
  2. On this same document library create a new field named Short URL of type Hyperlink. (This is just to see the name of the short URL after the workflow has executed, it is not needed in the real world).
  3. Create a new SharePoint Designer Workflow using a method of your choice. In this example we’ll do this by navigating to the Library Tab and then selecting Settings / Workflow Settings / Create a Workflow in SharePoint Designer.
  4. Give the workflow a sensible name, e.g. Document ID Processing and change the workflow settings to automatically start when an item is created.
  5. Switch to Edit Workflow and insert a Create Short URL workflow Action.
  6. Click optional short name, followed by the fx button.
  7. From the 'lookup string' dialog, select Document ID Value from the Field from source drop down and close the dialog.
  8. Click this ID / address, followed by the '...' button.
  9. Enter the Document ID URL for your site, you can copy it from the Document ID of an existing item. Generally it is something like http://<path to your site collection>/_layouts/DocIdRedir.aspx?ID= .
  10. Position the cursor after the '=', click the Add or Change Lookup button and select the Document ID Value again. Close the String Builder. 
  11. From Document / Display Form select Document. 
  12. From Overwrite / Return null select Overwrite.
  13. When the workflow executes it will write the shortened URL to a workflow variable named this variable. The variable name can be changed, but we accept the default.
  14. Insert a set field in current item workflow action, select Short URL as the field.
  15. Click value, followed by the fx button. From the Data Source select Workflow Variables and Parameters and from Field from source select this variable.
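
As a rough illustration of what steps 9 and 10 assemble in the String Builder, the Python sketch below builds the same Document ID redirect URL. The site collection URL and the Document ID value are made-up placeholders, and the function name is ours; it is not part of any Muhimbi or SharePoint API.

```python
from urllib.parse import quote


def doc_id_redirect_url(site_collection_url: str, doc_id: str) -> str:
    """Build the DocIdRedir.aspx URL that is handed to the Create Short URL action."""
    base = site_collection_url.rstrip("/")
    # Percent-encode the ID so that unusual characters cannot break the query string.
    return f"{base}/_layouts/DocIdRedir.aspx?ID={quote(doc_id, safe='')}"


# Hypothetical site collection and Document ID, for illustration only.
print(doc_id_redirect_url("http://sharepoint.example.com/sites/finance/", "FIN-12-345"))
```

In the workflow itself the Document ID Value lookup plays the role of `doc_id`, so nothing needs to be hard coded.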


The URL Shortening part of the example is now ready. Publish the workflow, upload a file to your document library and after a few seconds the Short URL field will be populated. Click the Short URL to verify it works.

This is what the final workflow will look like after the Watermarking is added as well

You can do whatever you like with this Short URL. Send it by email, tell someone on the phone, or... embed it in a watermark, something we'll do next.


Return to the Workflow editor and add the following steps:

  1. Position the cursor JUST ABOVE the previously added Set field action and add the If current item field matches condition.
  2. Click field and select File Type. 
  3. In value enter pdf (lower case, no '.' prefixed) .
  4. Position the cursor inside the condition and add the Add Text Watermark activity (other watermark types are also available).
  5. Fill out the fields for the watermark
    - this document: Set to Current Item.
    - this file: Leave the default to overwrite the existing file.
    - content: Add the this variable Workflow Variable to insert the previously generated URL. Other content can be added as well.
    - position: Bottom Center.
    - width x height: 600 x 30
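
The condition in steps 1 to 3 compares the File Type field against the bare extension. The Python sketch below mimics that comparison for a file name; the helper names are hypothetical, and the lower-casing is our assumption to make the check tolerant of names such as Report.PDF.

```python
import os


def file_type(file_name: str) -> str:
    """Return the extension without the leading '.', in lower case (e.g. 'pdf')."""
    extension = os.path.splitext(file_name)[1]
    return extension.lstrip(".").lower()


def should_watermark(file_name: str) -> bool:
    """Only PDF files enter the watermarking branch of the workflow."""
    return file_type(file_name) == "pdf"


print(should_watermark("Contract.pdf"), should_watermark("Contract.docx"))
```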


That should take care of the watermarking. Publish the workflow and upload a PDF file. You should now see the Short URL being created and when the document is opened you should see the Short URL at the bottom of the document.

If you find this useful, or have any questions, then feel free to leave a comment below.



Specifying paths and file names when using the PDF Converter for SharePoint

Posted at: 10:13 on 23 November 2011 by Muhimbi

The Muhimbi PDF Converter for SharePoint has grown over the last few years from a relatively modest SharePoint GUI application to a sophisticated framework that can be used from Nintex, SharePoint Designer, K2, Workflow Manager, SharePoint Online and Visual Studio workflows to carry out all kinds of PDF processing, including conversion, merging, OCR, securing, watermarking and splitting.

Regardless of the operation that is carried out, input and / or output file names, including paths, need to be specified. Although we have tried to make this as intuitive as possible, and we continue to make improvements, according to our statistics the most common support call we receive is related to how to specify these paths. We hope to clarify the situation in this post.

Please note that the details described below are the same for both SharePoint on-premise and SharePoint Online. However, SharePoint Online is limited to paths in the current Site Collection (unless you are happy to use a workaround). Due to strict security boundaries it is not possible to write files to a different site collection. Unlike its on-premise equivalent, the Online version supports fully qualified paths that include the hostname (e.g. http://somedomain/somesite/somedoclib) to make life easier when creating reusable workflows. For details see this blog post.


Targeting the same directory / using the same file name as the source file

If you wish to read from or write to the same location as the source file that is being processed, e.g. you are using a SharePoint Designer workflow to automatically convert files to PDF, then leave the path empty. This works for all situations with the exception of HTML to PDF Conversions and the Merge Method in K2 as these are not necessarily associated with a source file (they can convert URLs as well).

If you wish to use the same name for the target file as the source file then there is no need to specify a file name, the system will generate the name for you. When converting files to PDF the files’ extension will automatically be updated to ‘PDF’, however when the source file is a PDF (e.g. when applying PDF Security) then omitting the file name and path will overwrite the source file. This may be the desired behaviour, but keep it in mind.
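
To make the overwrite case concrete, here is a minimal Python sketch of the naming rule described above. It assumes the default name is simply the source name with its extension swapped; the function names and the lower-case '.pdf' suffix are ours, and the real product may differ in details.

```python
from pathlib import PurePosixPath


def default_output_name(source_name: str) -> str:
    """Name assumed to be used when no output file name is specified."""
    return str(PurePosixPath(source_name).with_suffix(".pdf"))


def overwrites_source(source_name: str) -> bool:
    """True when the default output name collides with the source file."""
    return default_output_name(source_name).lower() == source_name.lower()


for name in ("Invoice.docx", "Signed.pdf"):
    print(name, "->", default_output_name(name), "overwrites source:", overwrites_source(name))
```

If the collision is not wanted, specifying an explicit output file name or folder avoids it.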


Targeting a (sub) folder in the current Document Library

If you wish to specify a (sub) folder in the same Document Library as the source file then you must specify the Document Library name as well as the path to the folder. Just specifying the folder name will not work. For example to convert a file to the ‘Archives/PDFs’ folder in the Shared Documents library, specify:

       Shared Documents/Archives/PDFs/ 

Make sure you use a trailing slash when specifying an output folder without an explicit file name. Please do not start the path with a slash in this particular case.
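
The rules for a folder path (library name first, no leading slash, trailing slash) can be sketched in a few lines of Python. The helper below is hypothetical and only illustrates the string format; it does not check that the library or folders exist.

```python
def relative_folder_path(library: str, *folders: str) -> str:
    """Join a Document Library name and folder names into a relative output folder path."""
    parts = [part.strip("/\\") for part in (library, *folders)]
    parts = [part for part in parts if part]
    if not parts:
        raise ValueError("At least the Document Library name is required")
    # No leading slash, always a trailing slash.
    return "/".join(parts) + "/"


print(relative_folder_path("Shared Documents", "Archives", "PDFs"))
```

For the example above this produces Shared Documents/Archives/PDFs/.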


Targeting a different Document Library

To specify a location in a different Document Library in the current Site, specify the name of the Document Library followed by any folders. E.g. if your workflow activity is operating on a file in the Shared Documents library, but you want to write the generated file to the PDF folder in the Archive Document Library then specify the following as the path:


Make sure you use a trailing slash when specifying an output folder without an explicit file name. Please do not start the path with a slash in this particular case.


Targeting a Sub Site

SharePoint Site Collections can contain multiple Sub Sites. To write a file to a location in a Sub Site specify the name of that Sub Site followed by the name of the Document Library followed by any folder names. For example if the current workflow is acting on a file in the root web of a Site Collection and you wish to write it to the PDFs folder in the Paid Document Library in the Invoices Sub Site specify the following path:


Make sure you use a trailing slash when specifying an output folder without an explicit file name. Please do not start the path with a slash in this particular case.

To target a Sub Site ‘next’ to the current site, use an absolute path. For details see Targeting a different Site Collection below. Please note that ‘traditional’ relative path notation such as ‘../../’ is not supported.


Targeting a different Site Collection

Due to strict security restrictions, it is not possible to target a different site collection directly in SharePoint Online. A workaround can be found here.

In order to read from or write to a file in a different site collection you have to use an absolute path starting with ‘/’. For example if a PDF Conversion workflow is running in the Accounting Site collection and the generated PDF file should be written to the Archiving site collection in the PDFs folder in the Accounting Document Library, use the following path


If all your Site Collections live under ‘/sites/’ then this will need to be reflected in the path, e.g.


Please do not start absolute paths with http://YourWebApplication/, always start absolute paths with a slash. (Except when using reusable workflows in SharePoint Online, see the ‘hints and tips’ section at the end of this post.)
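
As a quick sanity check before pasting an absolute path into a workflow activity, something along the lines of the Python sketch below can flag the most common mistakes. The function name and the example paths are hypothetical, and the check reflects the on-premise rules only (SharePoint Online reusable workflows may use hostnames, as noted above).

```python
import re


def absolute_path_problems(path: str) -> list[str]:
    """Return a list of rule violations for an on-premise absolute path."""
    problems = []
    if re.match(r"^[A-Za-z][A-Za-z0-9+.-]*://", path):
        problems.append("starts with a scheme and hostname")
    elif not path.startswith("/"):
        problems.append("does not start with '/'")
    # '../' style relative notation is not supported.
    if ".." in path.replace("\\", "/").split("/"):
        problems.append("contains '..' relative notation")
    return problems


for candidate in ("/sites/Example/Shared Documents/",
                  "http://YourWebApplication/sites/Example/",
                  "../../Shared Documents/"):
    print(candidate, absolute_path_problems(candidate))
```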


Targeting historical files

As of version 6.0 it is possible to specify specific versions in a file’s history for Merge and Watermark (elements) operations. The syntax is similar to what is described above, for example:

Specify a document in the same document library:

       _vti_history/1536/Automatic Merging/End Page.docx

Specify a document in a different document library:

       _vti_history/1024/Shared Documents/Subfolder/End Page.docx

Specify a document in a sub-site of the current web:

       SubSite/AnotherSubSite/_vti_history/1024/Shared Documents/Empty.pdf

Specify a document in a different site collection or in a sub web next to the current web:

      /sites/PDFTest/_vti_history/512/Shared Documents/introduction.pdf
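
The numbers in these paths (512, 1024, 1536) are consistent with SharePoint's usual version encoding of major x 512 + minor, so 1.0 becomes 512 and 3.0 becomes 1536. That mapping is not spelled out in this post, so treat the Python sketch below as an assumption to verify against the version history URLs in your own environment; the function name is hypothetical.

```python
def history_path(file_path: str, major: int, minor: int = 0, web: str = "") -> str:
    """Build a _vti_history path for a given version of a file.

    file_path: library-relative path, e.g. 'Shared Documents/Empty.pdf'
    web:       optional sub site or absolute site prefix, e.g. '/sites/PDFTest'
    """
    version_id = major * 512 + minor  # assumed SharePoint version encoding
    prefix = web.rstrip("/") + "/" if web else ""
    return f"{prefix}_vti_history/{version_id}/{file_path}"


print(history_path("Automatic Merging/End Page.docx", major=3))
print(history_path("Shared Documents/introduction.pdf", major=1, web="/sites/PDFTest"))
```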


Targeting a different Web Application

As there are clear security boundaries between SharePoint Web Applications it is not possible to exchange files between them. If this is required then you will need to use a third party Workflow Activity, use Nintex Workflow or write a little bit of custom code.


Targeting host named site collections

Although the PDF Converter fully supports output to sub-sites, libraries and folders using the syntax described above, it is not possible to specify a location in a host named site collection from a different site collection. If this is required then you will need to use a third party Workflow Activity or write a little bit of custom code.


Using a List Item Attachment as the source

A popular use for the PDF Converter is to create a PDF for a list item including all attachments. Targeting list item attachments is easy, just right-click an attachment and copy the source URL to determine the structure of the path. It typically looks something like the following. In this example ‘10’ is the ID of the list item.


It is not possible to directly write a converted file to a list item attachment. You will need to output the file to a Document Library and then attach that file using a technology of your choice (Nintex Workflow, Trigger, etc.).


Creating dynamic paths / file names

All of Muhimbi’s SharePoint Designer and Nintex Workflow activities allow lookup variables to be used in order to generate dynamic paths. For example, in order to make a workflow generic, the name of a source or target File / Library / Folder can be determined at run-time.

For Nintex Workflow use the ‘Insert Reference’ button
In SharePoint Designer use the ‘Add Lookup’ button


Some final, generic, hints and tips:

  • Never start a path with http://YourWebApplication/. Start absolute paths with a ‘/’. (The only place where hostnames are allowed is in SharePoint Online reusable workflows)
  • Some activities support templates in the output file name, e.g. to automatically generate file names when splitting PDFs.
  • Although you can use forward (/) as well as back (\) slashes, make it a habit to use forward slashes as SharePoint Designer 2010 workflows do not always deal well with back slashes.
  • No matter where you read from or write to, your user must have access to the location that is being specified.



Convert Outlook MSG files to PDF, with attachments, using the Muhimbi PDF Converter

Posted at: 14:48 on 16 November 2011 by Muhimbi


The latest version of this blog post can be found here



In our eternal quest to add support for as many file formats as possible we have arrived at the ‘MSG’ file format. Many people don’t realise that this is probably the most popular file format in the world as each individual message in your Outlook client is an MSG file. As a result I personally have more than 50,000 of these ‘files’ on my machine.

Official support for the MSG file type is available starting with version 5.2 of the Muhimbi PDF Converter for SharePoint as well as the PDF Converter Services. The key features are as follows:

  1. Support for all common MSG content types, including HTML, RTF and plain text.
  2. Conversion of rich content including in-line images.
  3. Support for the conversion of signed emails, both SMIME and Clear Text.
  4. Conversion of attachments.
  5. No need for external dependencies on the server, e.g. Outlook.

Like proud parents we ‘love all our features equally’, but our favourite child feature is the automatic conversion of attachments. Many of our customers are sitting on massive heaps of emails that need to go through a long-term archiving process to make sure that 30-40 years down the line this information can still be retrieved. Using the new MSG to PDF facility of our products it is now possible to convert each email, including all attachments, to a single PDF file. In combination with our PDF/A post-processing facility this is the perfect solution for long-term archiving.

As the converter is part of our highly scalable PDF Conversion platform, it automatically benefits from all its features including reliability, scalability, watermarking engine, cross platform support, web services based API, PDF security, SharePoint integration, Nintex Workflow integration, Java support, PHP Support, Ruby Support, InfoPath attachments, Windows Azure etc.


Example output of a regular email conversation as well as part of a web based newsletter

