Posted at: 9:50 AM on 23 August 2012 by Muhimbi
When Microsoft introduced Word Automation Services (WAS) as part of SharePoint 2010 we were (briefly) worried, as its functionality overlaps somewhat with the features provided by our popular PDF Converter for SharePoint. Fortunately WAS is more of a platform for developers than a solution that regular end users can use, so if anything it has validated the need for document conversion in SharePoint and generated more interest in the topic of PDF Conversion (and increased our sales). Phew!
Ever since we first learned about WAS we thought it would be a good idea to create a plug-in for those customers who prefer to use WAS for one reason or another. As of version 18.104.22.168 we officially provide support for WAS. The functionality is largely on par with our existing MS-Word conversion capabilities.
The pros and cons of WAS are as follows:
Pros:
- Relatively low resource usage.
- Good conversion fidelity.
- Included in SharePoint Server 2010.
Cons:
- Not supported by SharePoint Foundation or SharePoint 2007. You must have the (paid for) SharePoint Server 2010 and sufficient licenses for each user.
- Conversion is very slow compared to our regular MS-Word converter. The main reason for this is that WAS is ‘job based’. Although we have improved the 1 minute minimum job interval, it still takes on average 10 seconds before WAS starts the conversion.
- WAS doesn’t support as many file formats as our existing converter, although it supports the main ones (docx, docm, dotx, dotm, doc, dot, rtf, htm, html, mht, mhtml, xml).
- WAS’ programming model is clumsy to say the least, but that is nicely abstracted by our Web Services Interface.
- The Source and Destination files must ‘live’ in SharePoint so there is some copying of temp files going on inside SharePoint.
- As WAS is completely integrated with SharePoint the Muhimbi Conversion Service must run on a SharePoint Server. When using WAS you don’t have the option to host the Conversion Service on a non-SharePoint server.
- WAS doesn’t come with a native User Interface or Workflow Activities. It is up to developers to build something on top of it.
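To gauge whether WAS on its own covers the word processing files in a library, the extension list above can be turned into a quick check. The following is only a minimal sketch: the function names are made up for illustration and are not part of the PDF Converter or of WAS, and the list of extensions is copied from the bullet above.

```python
from pathlib import PurePath

# Extensions listed above as supported by Word Automation Services.
WAS_EXTENSIONS = frozenset(
    "docx docm dotx dotm doc dot rtf htm html mht mhtml xml".split()
)


def is_was_convertible(filename: str) -> bool:
    """Return True if the file's extension appears in the WAS list."""
    ext = PurePath(filename).suffix.lower().lstrip(".")
    return ext in WAS_EXTENSIONS


def unsupported_files(filenames):
    """Return the files whose extension is not in the WAS list."""
    return [name for name in filenames if not is_was_convertible(name)]


if __name__ == "__main__":
    # Hypothetical file names, purely for illustration.
    print(unsupported_files(["Report.DOCX", "notes.odt", "page.html"]))
```

The comparison is case-insensitive, and a file without an extension is treated as not convertible.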
If you are only looking to convert Word, HTML, MSG, AutoCAD and image based formats, don’t need to convert Excel, InfoPath, PowerPoint or Visio, and are using SharePoint Server 2010, then you may want to consider going down the Word Automation Services road. If you do need any of the other formats then there is no benefit in using WAS; it just complicates installation and maintenance.
We have implemented Word Automation Services as a plug-in that transparently replaces the stock MS-Word converter. As a result you don’t need to rewrite any of your Muhimbi workflows or teach your end users anything new. It just works; the only thing you’ll notice is that it is a bit slower.
All other facilities provided by the PDF Converter for SharePoint remain unchanged. You can still convert to and from file formats other than MS-Word and PDF. Converted PDF files can still be secured, watermarked, merged or split.
The only thing you need to do is enable Word Automation Services and make a change in the Document Converter’s configuration file. Details can be found below.
Enabling Word Automation Services
Provided SharePoint Server 2010 has already been installed, enabling Word Automation Services is relatively simple. The steps, using Central Administration, are summarised below, but you can do it via PowerShell as well. A Word Automation Services Guide is also available on MSDN.
1. Start SharePoint 2010 Central Administration.
2. On the Central Administration home page click Manage Service Applications.
3. If the service is not already listed, use the 'New' menu in the ribbon and add Word Automation Service.
4. Follow the Wizard / questions to create the Database and Application pool. Please make sure the Application pool account has proper access to the web application / Database that contains input files (See TempDocLibPath below).
5. In the page opened as part of step #2 above click Word Automation Service in order to configure it.
6. Accept the default settings except for the following values (See screenshot below):
   - Disable Embedded Fonts: No
   - Conversion Processes: ‘2’ (or higher if you anticipate a larger number of parallel conversions)
   - Maximum Conversion Attempts: 1
7. Navigate to System Settings / Manage services on server and enable Word Automation Services.
8. Navigate to Application Management / Configure Service Application Associations and make sure Word Automation Services is enabled in at least one proxy group. If you don’t do this then you must configure the service name in the Conversion Service’s configuration file (see below).
Configuring the Muhimbi PDF Converter to use Word Automation Services
Once the Word Automation Service has been installed and configured, the final change that needs to be made is disabling the stock MS-Word Converter in the Conversion Service’s configuration file and enabling the Word Automation Service’s plug-in. The steps are as follows:
1. Create a Document Library / Folder in SharePoint to hold the temporary files used by the converter. You can hide this library from the end users if you wish.
2. Make sure the accounts the Conversion Service and Word Automation Services run under have read and write access to this library. The easiest (as in quick and dirty) thing to do is to use the Application Pool account used by the relevant Web Application, or to add the Conversion Service account to the WSS_Content_Application_Pools database role. If you are setting up a separate account with more fine grained security then give it rights on:
   - Content Database: Any account accessing SharePoint content will need to be given rights similar to your Web Application Pool accounts on the relevant Content Database(s).
   - Config Database: The same rights need to be given on the Configuration database for the farm.
   - SharePoint rights: Naturally regular SharePoint rights will need to be set on the Document Library that holds the temporary files. The account requires privileges to Create, Read and Delete files.
3. Open Muhimbi.DocumentConverter.Service.exe.config in Notepad. This file is located in the folder the Muhimbi Conversion Service has been deployed to. A handy shortcut to this folder can be found in the Muhimbi group in the Windows Start Menu.
4. Search for "WordProcessing" (including the quotes) and use XML comments <!-- --> to comment out the entire ‘add’ element (see below for an example).
5. Remove the XML comments for the converter named "WordAutomationServiceConverter" directly underneath.
6. In the parameter attribute of WordAutomationServiceConverter make the following changes:
   - Service Name: If you do not wish to use the standard WAS Proxy then enter the WAS Service’s name here.
   - TempDocLibPath: The full path to the Document Library / Folder name created as part of step #1.
7. As the WAS based converter does not support all file formats, remove all references to txt, wps, eml, odt and ott in CrossConverter_MHT for both the supportedExtensions and supportedOutputFormats attributes.
8. Open Services.msc from the Start Menu and restart the Muhimbi Document Converter Service.
9. In the PDF Converter’s Central Administration page enable the new Word Automation Service.
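The format clean-up for CrossConverter_MHT described in the steps above is easy to get wrong by hand, so here is a rough sketch of the same edit applied to a single attribute value. It assumes that supportedExtensions and supportedOutputFormats hold comma-separated lists; check your own configuration file before relying on that. The function name and the sample value are invented for illustration.

```python
# Formats the steps above say to remove from CrossConverter_MHT.
UNSUPPORTED_BY_WAS = {"txt", "wps", "eml", "odt", "ott"}


def strip_unsupported(attribute_value: str) -> str:
    """Drop the unsupported formats from a comma-separated attribute value.

    Assumption: the attribute is a plain comma-separated list of formats.
    """
    kept = []
    for item in attribute_value.split(","):
        name = item.strip()
        # Skip empty entries and anything on the unsupported list.
        if name and name.lower() not in UNSUPPORTED_BY_WAS:
            kept.append(name)
    return ",".join(kept)


if __name__ == "__main__":
    # Hypothetical attribute value, not copied from a real config file.
    print(strip_unsupported("docx,txt,rtf,odt,mht"))
```

Apply the result to both attributes, then restart the service as described in the steps.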
The XML in the configuration file should now look something like this:
description="Word Processing (e.g. MS-Word, RTF, TXT)"
Muhimbi.DocumentConverter.WebService, Version=22.214.171.124, Culture=neutral,
PublicKeyToken=c9db4759c9eaad12" /> -->
description="Word Automation Service (e.g. MS-Word, RTF, MHT)"
Muhimbi.DocumentConverter.WebService, Version=126.96.36.199, Culture=neutral,
parameter="Service Name=;TempDocLibPath=http://your site/sites/your web app/Temp"/>
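The parameter attribute in the sample above looks like a semicolon-separated list of name=value pairs. Assuming that reading is correct (how the converter itself parses the value is not documented here), a small sketch for building and sanity checking the value might look as follows. The helper names and the example URL are hypothetical.

```python
def build_parameter(temp_doc_lib_path: str, service_name: str = "") -> str:
    """Build the parameter value in the shape shown in the sample config."""
    return f"Service Name={service_name};TempDocLibPath={temp_doc_lib_path}"


def parse_parameter(parameter: str) -> dict:
    """Split a 'name=value;name=value' string into a dictionary.

    Assumption: pairs are separated by ';' and each pair is split on
    its first '=', so a value may itself contain '=' but not ';'.
    """
    settings = {}
    for pair in parameter.split(";"):
        if not pair.strip():
            continue
        key, _, value = pair.partition("=")
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    # Hypothetical library URL, for illustration only.
    value = build_parameter("http://sharepoint.example/sites/conversion/Temp")
    settings = parse_parameter(value)
    if not settings.get("TempDocLibPath"):
        print("TempDocLibPath is empty")
    print(settings)
```

Leaving Service Name empty matches the sample, which relies on the standard WAS proxy.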
That is all. Once the service has restarted, try converting an MS-Word file. If you encounter any errors then please look in the Windows Application Event Log and / or SharePoint’s ULS logs. We have tried to make the errors as descriptive as possible.
A common problem when using WAS is that it takes a long time to spin up, making it appear to our software that it is hanging, which results in a ‘not responding’ error. If this is the case in your environment then we recommend changing MAX_HUNG_COUNT in the PDF Converter’s configuration file from 15 (seconds) to 45.