When we decided to completely re-architect our popular PDF Converter for SharePoint (MDCS), we didn’t do so lightly. Adding support for all these additional file formats was always going to increase the workload of our support staff, no matter how well we tested the product.
Now that the product has been out in the wild, we are indeed getting more support requests. Fortunately, most questions are easily answered as the majority of problems share a similar root cause. Rather than continuously copying and pasting the same email response, we have decided to summarise the main pain points in this blog post, which should make life easier for both our customers and our poor support staff.
1. Administration Guide.
When encountering any kind of problem, the first step is to consult our comprehensive Administration Guide, which is included in the download and available on-line. The Guide contains the following key areas:
Quick start: Installation steps for people who don’t read manuals. Please read section 2.1 at a minimum.
Detailed installation instructions: If you want to know what is going on and fine-tune the installation, read section 2.2.
Troubleshooting: Chapter 3 contains all the information you need to troubleshoot problems using the Windows Event Log and the MDCS Trace Log. Some common problems and their solutions are discussed in this chapter as well.
High level MDCS Architecture
2. Some, or none, of the converters are working.
MDCS ships with a very handy Diagnostics tool that carries out a simple end-to-end test for each of the converters to ensure the software has been deployed properly. The Diagnostics screen can be accessed from Central Administration / Application Management / Muhimbi Document Converter Settings or from the following URL - http://<your_ca_server>/_admin/Muhimbi.PDFConverter/WebAppDocumentConverterSettings.aspx.
Use the Test Button to check connectivity and authentication between SharePoint and the Document Conversion Service. Once this test has completed successfully, select the converters of your choice and click the Validate Settings button to check the individual converters.
If there are any problems then please use the following checklist to troubleshoot the issue:
None of the converters are working: This may be caused by Office 2007/2010 or any of the other prerequisites not being installed. Please follow the installation steps in the Administration Guide. Do not use the Click-to-Run version of Office 2010 as it is not compatible.
The InfoPath converter is singled out: InfoPath is somewhat tricky to configure. Read our separate InfoPath troubleshooting guide.
Wrong MS-Office language: Some customers have reported problems when they deploy a non-English version of MS-Office on their Document Conversion Server. If one or more of the converters are not validating then please install the English version of MS-Office. Note that you will still be able to convert documents written in any language using the English version of MS-Office.
If the server has not been rebooted since MS-Office and the latest service packs were installed then please reboot it.
Make sure the MDCS Service account has local Administrator privileges.
Log in using the MDCS Service account and launch each Office application once, activate if prompted and close it again.
Make sure the Print Spooler service is running and at least one printer is installed (preferably the XPS Document Writer that comes with Windows and is installed by default). Note that printers connected via a remote desktop session do not count as they disappear after the session is disconnected. If you are experiencing problems then please double check that the XPS Document Writer is the default printer.
Try running MDCS using a different account with local Admin rights. Make sure to log in using that account and launch each Office application once. Please restart the service after changing the account using the following command:
Net stop "Muhimbi Document Converter Service"
Net start "Muhimbi Document Converter Service"
Template (dot) file resides in SharePoint: If a document’s template file is located in SharePoint then MS-Word will attempt to retrieve it during the conversion process. To allow this, make sure the account the MDCS is running under has privileges to access the file and the site hosting the template is listed in the account’s Trusted or Intranet sites. If the MDCS is located on one of the Web Front End Servers then make sure you are not suffering from Loopback problems as described here. Update: This is no longer a problem for docx files as of version 3.1. Note that pre Office 2007 .doc files are still affected.
InfoPath Based Document Information panels: If a document uses InfoPath based Document Information Panels then please make sure InfoPath is installed on the machine hosting the MDCS. Update: This is no longer a problem for docx files as of version 3.1. Note that pre Office 2007 .doc files are still affected.
If possible, check whether the problem is isolated to this machine by installing MDCS on another machine.
Check if the server is using non standard DCOM Settings as follows:
From the Start Menu, select Run and type ‘dcomcnfg’. Note that by default this launches the 64-bit version. If the 32-bit version of MS-Office is installed then launch 'mmc.exe -32' instead and use the 'File' menu to add the 'Component Services' snap-in.
Navigate to Component Services / Computers / My Computer / DCOM Config / Microsoft PowerPoint Slide (or whatever application is causing problems).
Open the properties for the application in question.
Check that on the Location Tab ‘Run Application on this computer’ is the only check box selected.
Check that on the Identity Tab the user account is set to the 'Launching User'.
Check that on the Security Tab ‘Launch and Activation permissions’ and ‘Access Permissions’ are set to ‘Use Default’.
3. Contact Muhimbi Support
If none of the previously described steps solved the problem then please contact us and include the following information:
The last few entries listed in the Windows Application Event Log with Event ID 41734.
The version and language of your Operating System.
The version and language of your MS-Office 2007 installation.
The regional settings for the account used by the MDCS (See Control Panel / Regional Settings)
Anything else about your system that could be considered ‘non standard’. E.g. Office 2003 installed alongside Office 2007, any custom security software etc.
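To collect the event log entries mentioned in the list above, something along the following lines may help. This is a minimal sketch, assuming the server has the wevtutil command line tool (included with Windows Server 2008 and later, but not with Windows Server 2003). The function name build_event_query and the default of 5 entries are our own choices for illustration; only the Event ID comes from this post.

```python
import subprocess
import sys

MDCS_EVENT_ID = 41734  # Event ID quoted in this post


def build_event_query(event_id: int = MDCS_EVENT_ID, count: int = 5) -> list:
    """Build a wevtutil command that lists the newest matching Application log events."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", "Application",
        f"/q:{xpath}",   # only events with the given Event ID
        f"/c:{count}",   # limit the number of events returned
        "/rd:true",      # newest events first
        "/f:text",       # readable text rather than XML
    ]


if __name__ == "__main__":
    command = build_event_query()
    if sys.platform == "win32":
        # Run the query and print whatever wevtutil returns.
        result = subprocess.run(command, capture_output=True, text=True)
        print(result.stdout)
    else:
        # Not on Windows: just show the command that would be run.
        print(" ".join(command))
```

The output can be pasted straight into the support email. On older servers the Event Viewer can be filtered on the same Event ID by hand.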
Our customers are using our PDF Converter for SharePoint in ways we never imagined. Recently we were contacted by a customer who uses the product to automatically convert all files, including the actual EML file, sent to an email enabled Document Library to PDF Format.
This gave me a great idea for a new blog post as I am sure that converting files to PDF Format via email is something that many of our customers will be interested in. After all, an entire industry has been built around just that concept.
What follows is a description of how to use SharePoint Designer to build a simple workflow to convert documents mailed to a document library to PDF Format and return an email with a link to the converted document. If you wish, the converted file can be included in the email as well as described here.
You may want to consider installing Service Pack 2 on your SharePoint server as that makes email enabled workflows much more stable.
Creating the Document Library
Before we can create the workflow we need to enable and configure the Document Library using the following steps:
Create a new Document Library in the site of your choice. I named mine ‘Workflow’, but I am sure you can come up with a better name.
Mail enable the document library from Settings / Document Library Settings / Incoming e-mail settings and configure it as follows:
Allow Incoming Email.
Give it an email address of your preference. I named it pdfconvert.
Select Save all attachments in folder grouped by e-mail sender.
Disable Overwrite files with the same name.
Disable Save original e-mail unless you want the text of the email to be converted to PDF as well.
Disable Save meeting invitations.
Select the Security Policy of your choice. I set it to Accept e-mail messages from any sender.
When e-mail is enabled, SharePoint automatically adds a number of new columns to the Document Library, including the E-Mail From field that holds the address of the user who sent the email. Unfortunately the content of this field is not suitable for the email Workflow Action so a Calculated Column needs to be created to change the email address to something usable.
From the Settings Menu select Create Column.
Name the column Return Address.
Select Calculated as the type.
Enter the following in the Formula field: =MID([E-Mail From],FIND("<",[E-Mail From])+1,FIND(">",[E-Mail From])-1-FIND("<",[E-Mail From]))
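For those who find calculated column formulas hard to read, here is the same logic as a small Python sketch. It assumes, as the formula does, that the E-Mail From value looks like "Display Name <address>". The helper names find, mid and return_address, as well as the sample address, are made up for illustration.

```python
def find(needle: str, text: str) -> int:
    """Return the 1-based position of needle, like FIND in a calculated column."""
    return text.index(needle) + 1


def mid(text: str, start: int, length: int) -> str:
    """Return length characters starting at the 1-based position start, like MID."""
    return text[start - 1:start - 1 + length]


def return_address(email_from: str) -> str:
    """Extract the part between '<' and '>' using the same arithmetic as the formula."""
    start = find("<", email_from) + 1
    length = find(">", email_from) - 1 - find("<", email_from)
    return mid(email_from, start, length)


if __name__ == "__main__":
    print(return_address("John Smith <john.smith@example.com>"))
```

Note that this sketch raises a ValueError when the value does not contain angle brackets, so a sender field in a different format would need separate handling.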
Creating the Workflow
Finally we need to create a workflow that converts all files that are not already in PDF format to PDF format and sends out an email to the originator with a link to the location of the converted file.
Start SharePoint Designer and open the site collection that contains the new Workflow Document library.
From the File menu select New > Workflow.
On the first screen of the Workflow wizard, specify the following settings:
Name the workflow Convert file to PDF.
Select the Workflow list.
Select the 2nd and 3rd checkboxes to make sure the workflow is triggered whenever a document is created or (its status) is updated.
Click the Next button to proceed.
We are now ready to create the workflow. From the Conditions menu select Compare any data source. This inserts the If value equals value condition.
Click on the first value followed by the display data binding (fx) button.
Select Current Item as the Source and select File Type in the Field. Click the OK button to continue.
Change equals to not equals.
Click on the second value and select pdf from the list.
With the conditions in place we can now add the Actions, which is where the magic happens.
From the Actions menu, select Convert to PDF. It may be hidden behind the More Actions option.
The following action is inserted:
Convert this document to this url using the same file name and include / exclude meta data. Store the converted item details in List ID: Variable: List ID, Item ID: Variable: List Item ID.
Let’s examine what the various options mean:
this document: Specify which document to convert. Select the option and make sure Current Item is selected.
this url: Specify the location the converted file will be written to. The following options are available:
Leave it empty: When no value is specified then the converted document is written to the same folder as where the source file is located. This is the option we want so leave this field empty.
Site Relative URL: By specifying a URL relative to the current site, e.g. subsite/shared documents/PDF Files, any folder location in the current site collection can be targeted.
Web Application relative URL: Using a URL that is relative to the entire web application, e.g. /sites/Press Office/Public Documents/To Distribute, any folder location in any site collection can be targeted.
the same file name: The name of the converted file can be specified here. In our case we’ll leave it empty to make sure we use the same name as the original document.
include / exclude meta data: In case of sensitive documents we may want to strip any custom SharePoint columns from the file. In this scenario we need the meta data as the Return Address is stored in there, so select include.
Variable: List ID: A new workflow variable named List ID is automatically created. After the file has been converted, this variable will contain the ID of the list the converted file was saved to. This can later be fed into another action in order to manipulate this file further.
Variable: List Item ID: A new workflow variable named ‘List Item ID’ is automatically created. After the file has been converted, this variable will contain the ID of the item the converted file was saved to. This can later be fed into another action in order to manipulate this file further.
Insert a new action named Log to History List and enter File converted to PDF Format.
The files are now automatically converted to PDF format. Next we will add functionality to the workflow that sends out an email alert for each generated PDF Document.
Click Add ‘Else-if’ Conditional Branch.
From the Actions menu, select Send an Email. It may be hidden behind the More Actions option.
Click this message to open the email composition window.
Click the address book button next to the To field.
Select the Workflow lookup address and select Return Address from the Current Item.
Enter a Subject of your choice or copy the Subject from the screenshot above.
Compose text for the email body. In order to create a link to the generated PDF file, enclose some text with <a href=''></a>, insert the cursor between the two single quotes, click the Add Lookup to Body button and select Encoded Absolute URL from the Current Item.
That’s it, you are done. Click OK on the email composition Window followed by the Finish button on the main workflow window.
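To give an idea of what the email body looks like once the workflow has filled in the lookup, here is a rough Python sketch. SharePoint inserts the encoded URL for you, so the function build_email_body, the wording of the message and the sample URL are purely illustrative assumptions.

```python
from urllib.parse import quote


def build_email_body(absolute_url: str, link_text: str = "Open the converted PDF") -> str:
    """Return an HTML fragment linking to the converted file."""
    # Percent-encode spaces and similar characters, keeping ':' and '/' intact.
    encoded = quote(absolute_url, safe=":/")
    return f"Your file has been converted. <a href='{encoded}'>{link_text}</a>"


if __name__ == "__main__":
    # Hypothetical location of a converted file in the 'Workflow' library.
    print(build_email_body("http://portal/Workflow/john/My Report.pdf"))
```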
Testing
Providing everything has been set up and configured correctly, you can carry out a test by sending an email with one or more MS-Word, Excel or other Office files to the email address registered on the Document Library. The Document Library will usually pick up the file within a minute and create a folder named after the originator. The files will be stored inside the folder and workflow progress can be followed on each individual file.
For each converted file a separate email confirmation will be sent back to the originator with a link to the converted PDF file.
Further improvements
The solution provided in this article can be used with great success in any Production environment. However, you may want to make some improvements, for example:
When we started the development cycle for version 3 of our PDF Converter for SharePoint, we decided to make ‘reliable InfoPath to PDF Conversion’ one of the main deliverables of the new version. There has been a lot of customer demand for it and well… how difficult can it be…?
… pretty difficult it turned out. InfoPath is a complex product that is much more than a simple forms designer. It allows connectivity with external data sources such as SharePoint Lists and Web Services as well as forms to be hosted and filled out inside SharePoint using Forms Services.
With the release of version 3.0 I believe we have delivered a great product that has already been implemented by many of our customers. Deployment, however, may require some planning depending on your environment and settings. This article provides an architectural overview of how we deal with InfoPath to PDF conversion and provides some helpful pointers to troubleshoot InfoPath conversion in your environment.
PDF Converter Architecture
The architecture behind the PDF converter is relatively straightforward. There are 2 main components:
SharePoint Front End: A SharePoint WSP solution that is deployed to all Web Front End Servers. This contains all SharePoint related logic including end user screens as well as Central Administration screens and Workflow Actions. The Front End doesn’t carry out any conversion; instead it offloads all conversion to the Muhimbi Document Converter Windows Service.
Muhimbi Document Converter Windows Service (MDCS): A Windows service which can be hosted on the actual Web Front End server or on a completely separate (virtual) machine. This service accepts conversion requests via WCF Web Service calls from the SharePoint server and carries out the actual conversion.
Although the following is a simplification, when MDCS receives a request for conversion of an InfoPath form it carries out the following steps:
Load the XSN file that is associated with the InfoPath data file.
Pass the InfoPath data and XSN files to InfoPath for conversion.
Depending on the complexity of the form, InfoPath may make requests to external data sources such as SharePoint Lists or Web Services.
Step #1 may fail for a number of reasons, most notably:
The XSN file is not in the expected location. This may happen when the InfoPath data file is copied from a completely different environment. Many users don’t realise that without the XSN file an InfoPath data file does not know how to visually represent itself.
The service account that MDCS is running under does not have the privileges to load the XSN file from SharePoint.
A 401 (Access Denied) error is logged as the service is not allowed to connect to the specified hostname due to loopback checking. To disable loopback checking see this post.
A 401 (Access denied) error is logged as integrated Windows Authentication is not enabled on the application pool used by the Web Application.
Step #2 may fail for different reasons, for example:
InfoPath is not installed on the machine that hosts MDCS.
MDCS is running using an account that does not have local Administration rights.
InfoPath has never been launched by the MDCS Service account using an interactive session.
The XSN file and data sources are hosted at locations that are not ‘trusted’.
Troubleshooting steps
The Administration Guide that ships with the product contains a comprehensive troubleshooting section. A separate Troubleshooting Guide is available as well. The main points are repeated below.
Connectivity test: The first test to carry out is to verify connectivity and authentication between SharePoint and the Document Converter Service. Navigate to Central Admin / Application Management / Muhimbi Document Converter Settings, verify the Address of the Web Service and click the Test Button.
Validate Converters: Once the connectivity and authentication test has completed successfully, select the converters to validate and click the Validate Settings button. The results of the test are displayed underneath the button. If any of the converters failed then the Windows Application Event Log will contain additional details.
Print Spooler & Drivers: InfoPath requires the Print Spooler service to be started as well as at least one printer driver to be installed. Although in general it doesn’t matter which driver is installed, some drivers such as VMWare’s Virtual Printer or the OneNote printer will cause problems. Windows Server 2003 R2 and Windows Server 2008 automatically install the XPS Document Writer driver and the Muhimbi Service installer attempts to start the Spooler Service. Unless you have taken specific actions to disable the Spooler Service and remove all printer drivers, all should work as expected. Note that printers connected via a remote desktop session do not count as they disappear after the session is disconnected. If you are experiencing problems then please double check that the local XPS Document Writer is the default printer.
Service Packs: Please make sure the latest Office Service packs have been installed. At the time of writing SP2 is available for Office 2007. The version number as well as Service Pack level can be found in InfoPath under the Help / About Microsoft Office InfoPath menu.
Check the form works: Perhaps obvious, but make sure the form itself opens in InfoPath without errors or warnings.
Check Trust settings: When an InfoPath document containing external connections, e.g. a dropdown list with the contents of a SharePoint list, fails to convert then this may be caused by the location of the XSN file not being trusted or the Access data sources across domains setting not being enabled for the trusted site. You may need to disable / uninstall Internet Explorer Enhanced Security Configuration as well.
Full details about how to check these settings can be found in section 3.5.8 of the Administration Guide.
Verify access to XSN file: If the log files contain messages listing ‘Access denied’, ‘Unauthorized’ or ‘401’ related errors then MDCS was unable to retrieve the XSN file. Make sure the MDCS Service account can access the XSN file using the following steps:
Open the InfoPath XML data file in notepad and retrieve the address of the XSN file from the first line.
Log-in interactively using the MDCS Service Account.
Try to open the XSN file in a web browser.
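If you need to check a large number of forms, the first of the steps above can be scripted. The following Python sketch assumes the address of the XSN file is stored in the href attribute of the mso-infoPathSolution processing instruction near the top of the data file, which is where we would expect InfoPath to record it. The function name xsn_location and the sample URL are hypothetical.

```python
import re
from typing import Optional

# The processing instruction that links an InfoPath data file to its template.
PI_PATTERN = re.compile(r"<\?mso-infoPathSolution\b(.*?)\?>", re.DOTALL)
HREF_PATTERN = re.compile(r'\bhref="([^"]*)"')


def xsn_location(xml_text: str) -> Optional[str]:
    """Return the XSN address referenced by an InfoPath data file, or None."""
    instruction = PI_PATTERN.search(xml_text)
    if instruction is None:
        return None  # not an InfoPath data file, or the instruction is missing
    href = HREF_PATTERN.search(instruction.group(1))
    return href.group(1) if href else None


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<?mso-infoPathSolution solutionVersion="1.0.0.2" productVersion="12.0.0" '
        'PIVersion="1.0.0.0" href="http://portal/forms/Forms/template.xsn" '
        'name="urn:example" ?>'
        "<my:myFields/>"
    )
    print(xsn_location(sample))
```

The address that is returned is the one to open in a web browser while logged in as the MDCS Service account.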
If these steps don’t make it clear what the problem is behind the Access Denied messages then it is worth checking the following:
Check the privileges on the XSN file. The MDCS account requires Read Access. You may want to consider applying Web Application wide Read Access for this account using a SharePoint Policy (http://<your_central_admin>/_admin/policy.aspx)
Are there any proxy settings defined? You probably want MDCS to bypass any proxies.
Are there any hints in the IIS log file about the reason why access to the XSN file is denied?
Check automatic log-on settings: When an InfoPath form contains references to external sources, e.g. images in SharePoint libraries or external data connections, then you have to make sure that InfoPath carries out an automatic log-on when accessing those resources. InfoPath uses the Internet Explorer Security settings for this so configure these settings as follows.
Log in to the Windows Desktop of the Conversion Server using the account the Conversion Service runs under. Open Internet Explorer / Internet Options / Security tab. Then select the security zone for the domain where the images are stored / data connections are made to. If this isn't already configured then add this domain to the 'Local Intranet' or 'Trusted Sites' zone. With the relevant zone selected click 'Custom Level' and scroll all the way to the end. Once there, set 'Logon' to any of the 'Automatic' options.
These settings can (and should) be configured using a Group Policy in production environments by your network administrator.
Windows Event Log: If all else fails, look in the Windows Application Event log for all messages with Event ID 41734.
InfoPath Offline mode: Make sure that InfoPath is not running in offline mode. Log in using the account the service is running under, open an InfoPath form and check the setting in the File menu.
Ink Controls: When using Ink Controls, for example to capture signatures on InfoPath forms, then please make sure the Ink controls are installed on the machine(s) that run the Muhimbi Conversion Service. For example, in Windows Server 2008 these controls are installed as part of the “Ink & Handwriting features”, part of the “Desktop Experience” Windows Feature.
People pickers: When experiencing problems converting forms that contain people-pickers then Internet Explorer may be configured to block (or prompt for) all ActiveX controls for the Internet Zone. As a solution log in as the account the conversion service runs under (or use a group policy) and make sure the following settings are configured for the relevant zones:
* Run ActiveX controls and plug-ins: Enable
* Automatic prompting for ActiveX control: Disable
MDCS Trace Log: MDCS uses the industry standard log4net framework to write logging and trace data to a log file. Out-of-the-box information is logged to the DocumentConverter.log file stored in the directory MDCS has been installed in. A new file is created for each day and the default logging level is set to ‘INFO’. To increase the log detail, open the config file in the MDCS root directory, search for the <root> element, change the log level to DEBUG and restart MDCS using the following commands:
Net stop "Muhimbi Document Converter Service"
Net start "Muhimbi Document Converter Service"
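If you prefer to script the log level change, the sketch below shows one way of doing it in Python. It assumes the config file follows the usual log4net layout, where the <root> element contains a <level value="..."/> child. The function name set_root_log_level and the sample configuration are illustrative assumptions, not a copy of the file that ships with MDCS.

```python
import xml.etree.ElementTree as ET


def set_root_log_level(config_xml: str, level: str = "DEBUG") -> str:
    """Return the config text with the log4net root logger level changed."""
    tree = ET.fromstring(config_xml)
    # Look for the first <root> element anywhere in the document.
    root_logger = next(tree.iter("root"), None)
    if root_logger is None:
        raise ValueError("No <root> element found in the configuration")
    level_element = root_logger.find("level")
    if level_element is None:
        # No level defined yet, so add one.
        level_element = ET.SubElement(root_logger, "level")
    level_element.set("value", level)
    # Note: comments in the original file are not preserved, so work on a copy.
    return ET.tostring(tree, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<configuration><log4net><root>"
        '<level value="INFO" /><appender-ref ref="FileAppender" />'
        "</root></log4net></configuration>"
    )
    print(set_root_log_level(sample))
```

Keep a backup of the original config file before writing the result back, and remember that the service still needs to be restarted before the new level takes effect.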
Disable Delete Test: If none of the options mentioned above resolve the problem then please carry out the ‘DisableDelete’ Test, which will most likely highlight the source of the problem.
If none of the guidance provided in this posting or in the Administration Guide has resolved your particular problem then please contact us directly.
It has been almost 3 months since we released a new version of the Muhimbi PDF Converter for SharePoint, a considerable change from our usual 4-6 weeks release schedule. A change with a good reason though as we have completely overhauled our architecture, added support for Excel, PowerPoint, Publisher and complex InfoPath documents as well as support for converting files to PDF Format using a Web Services based interface.
Over the next few days and weeks we will release a number of interesting blog postings with details about what is new and how the changes will benefit our customers, so make sure you subscribe to our RSS feed.
In addition to making all new functionality available to our existing customers for free, we are also happy to announce that the price of the product remains the same. We don’t charge extra for the ability to convert additional file formats or attempt to sell a 64-bit ‘pro’ version, we don’t even care how many users will be accessing the software. A single Web Application license continues to be $349.
For those not familiar with the product, the PDF Converter for SharePoint is a lightweight solution (4MB download) that allows end-users to convert common document types to PDF format from within SharePoint without the need to install any client side Software or Adobe Acrobat. It integrates at a deep level with SharePoint and leverages facilities such as the Audit log, localisation, security and tracing. It runs on both WSS 3 as well as MOSS and is available in English, German, Dutch, French and Japanese. For detailed information check out the product page as well as this blog posting describing how to use the PDF Converter from A SharePoint Designer Workflow.
Convert files using the User Interface or an automated Workflow
The number of improvements in this version is considerable. The main ones are as follows:
-     New: Support for Excel to PDF conversion.
-     New: Support for PowerPoint to PDF conversion.
-     New: Support for MS-Publisher to PDF conversion.
-     New: PDF Conversion available via a Web Services Interface.
-     New: Allow PDF Converter to run on separate machine or VM.
521   New: Full Support for 'Right to Left' languages including Arabic and Hebrew.
579   New: Support for refreshing Word 2007 Doc Properties & Quick Parts with content from SharePoint columns.
243   New: Support for MS-Word 2007 XML based files.
      Improved: Support for complex InfoPath forms.
-     Improved: Fidelity of converted MS-Word files.
234   Improved: Open Office ODT files PDF Conversion.
335   Fixed: Japanese documents are not converted correctly.
659   Fixed: Trade Gothic font not supported.
653   Fixed: Certain DocProperties refreshed incorrectly.
646   Fixed: Meta data not always copied over for certain documents.
628   Fixed: DocProperties referring to field names containing spaces not supported.
624   Fixed: Complex page numbering scenarios not fully supported.
620   Fixed: Custom formatting in DocProperties not supported.
509   Fixed: Add 'Convert to PDF' Buttons to Forms library.
569   Fixed: Conversion of large RTF files may time out on certain systems.
381   Fixed: Error message during conversion: Window width is invalid.
346   Fixed: Table of contents is made up of blue hyperlinks.
341   Fixed: Converting 1MB TXT file is very slow.
276   Fixed: Unsupported sfnt version while loading document.
270   Fixed: Add support for floating Elements and complex text flow in MS-Word.
230   Fixed: Add support for footnotes in MS-Word based files.
For more information check out the following resources: