Posted at: 12:02 PM on 23 January 2012 by Muhimbi
At the end of last week we released version 5.2 of the PDF Converter for SharePoint, which ships with an improved version of our popular PDF Conversion engine. Today we are releasing an update to the standalone version of the Muhimbi PDF Converter Services that includes all the new functionality and fixes, including the ability to convert MSG (email) files and to output files in PDF/A format.
The list of new features and improvements is considerable but the main ones are as follows:
A quick introduction for those not familiar with the product: The Muhimbi PDF Converter Services is an ‘on premises’ server based SDK that allows software developers to convert typical Office files to PDF format using a robust, scalable but friendly Web Services interface from Java and .NET based solutions. It supports a large number of file types including MS-Office and ODF file formats as well as HTML, MSG (email), AutoCAD and image based files, and is used by some of the largest organisations in the world for mission critical document conversions. In addition to converting documents the product ships with a sophisticated watermarking engine, PDF Splitting and Merging facilities and the ability to secure PDF files. A separate SharePoint specific version is available as well.
New, PDF Conversion of MSG based emails
In addition to the changes listed above, some of the main changes in the new version are as follows:
| 1580 | CAD | Improvement | Converting AutoCAD files results in much smaller PDF files than before. |
| 1575 | CAD | Improvement | Support for Spatial Filters has been added for AutoCAD to PDF Converter |
| 1565 | CAD | Improvement | Default "EmptyLayoutDetectionMode" setting is now set to "SkipEmptyLayouts" |
| 1543 | CAD | New | Provide different sorting options for CAD Layouts |
| 1472 | Documentation | New | Create Merge example for Java |
| 1582 | Excel | Fix | Excel sheets in 'Page Break Preview' mode do not convert |
| 1443 | HTML | Fix | Images are split when converting certain pages. |
| 1140 | HTML | Fix | HTML to PDF Conversion of SP2010 screens is not working correctly. |
| 1501 | HTML | Fix | Base tag is ignored when converting HTML fragments to PDF |
| 1540 | HTML | Fix | Conversion hangs on certain SP2010 pages |
| 1542 | HTML | Fix | HTML to PDF on SP2010 uses wrong page breaks when split images is enabled |
| 1566 | InfoPath | Fix | InfoPath Schema Validation error on forms that use FusionX and view switching |
| 1433 | InfoPath | Improvement | Provide an option to 'skip' or 'fail' problematic InfoPath attachments |
| 1511 | InfoPath | Improvement | Downloading InfoPath XSN files on systems using FBA does not work |
| 1551 | InfoPath | Improvement | Add option to create PDF bookmarks for each attached InfoPath document |
| 1517 | InfoPath | Improvement | Add switch to pre-process Full Trust InfoPath files |
| 1568 | InfoPath | Improvement | Allow invalid SSL Certificates to be used for downloading InfoPath XSN files |
| 1513 | Merging | Fix | Merging certain PDF files results in 'index out of bounds' |
| 1594 | Merging | Fix | 'Error in Lexer' when loading PDF File |
| 1617 | Merging | Fix | Merging of PDF Files generated with iOS scanner app |
| 1618 | Merging | Fix | System.NullReferenceException when loading certain PDF files for merging |
| 1637 | Merging | Fix | Bookmarks corrupted when merging certain files |
| 1317 | MSG | New | Improve email converter (with MSG support) |
| 1553 | Open Office | New | Add support for open office template files (ott, ots, otp) |
| 1505 | PDF/A | New | Allow PDF as an input format to support PDF to PDF/A conversion |
| 676 | PDF/A | New | Excel Conversion - Add support for PDF/A |
| 619 | PDF/A | New | Add support for PDF/A Post Processing (requires 'pro' license) |
| 1458 | Setup | New | Add support for silent install / uninstall of Conversion Service |
| 779 | Setup | New | Add support for the 64 bit version of Office 2010 |
| 1537 | Splitting | New | Update web service to allow PDFs to be split |
| 1541 | Visio | New | Add support for Visio VDW extension |
| 1547 | Watermarking | Fix | Watermarking: Applying rotation of less than -45 degrees rotates the entire page during conversion to PDF/A |
| 1636 | Watermarking | Fix | Error in Lexer when applying watermark on merged documents |
| 1613 | Web Service | Improvement | Conversion Service Authentication problems on certain systems that have disabled anonymous access (Kerberos related) |
For more information check out the following resources:
As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.
Download your free trial here (10MB).
Labels: AutoCAD, Java, MSG, News, pdf, PDF Converter Professional, PDF Converter Services, PDF/A, Products, Splitting
Posted at: 2:01 PM on 20 January 2012 by Muhimbi
Over the past four months our engineers have been hard at work on a number of brilliant new facilities, resulting in the new 5.2 release of our popular Muhimbi PDF Converter for SharePoint.
The list of new features and improvements is considerable but the main ones are as follows:
For those not familiar with the product, the PDF Converter for SharePoint is a lightweight solution that allows end-users to watermark, merge, split, secure and convert common document types - including InfoPath, AutoCAD, MSG (email), MS-Office, HTML and images - to PDF format from within SharePoint using a friendly user interface, workflows or a web service call without the need to install any client side software or Adobe Acrobat. It integrates at a deep level with SharePoint and leverages facilities such as the Audit log, Nintex Workflow, localisation, security and tracing. It runs on WSS 3, MOSS as well as SharePoint 2010 and is available in English, German, Dutch, French, Traditional Chinese and Japanese. For detailed information check out the product page.
New, PDF Conversion of MSG based emails
In addition to the changes listed above, some of the main changes in the new version are as follows:
| 1580 | CAD | Improvement | Converting AutoCAD files results in much smaller PDF files than before. |
| 1575 | CAD | Improvement | Support for Spatial Filters has been added for AutoCAD to PDF Converter |
| 1565 | CAD | Improvement | Default "EmptyLayoutDetectionMode" setting is now set to "SkipEmptyLayouts" |
| 1543 | CAD | New | Provide different sorting options for CAD Layouts |
| 1472 | Documentation | New | Create Merge example for Java |
| 1582 | Excel | Fix | Excel sheets in 'Page Break Preview' mode do not convert |
| 1443 | HTML | Fix | Images are split when converting certain pages. |
| 1140 | HTML | Fix | HTML to PDF Conversion of SP2010 screens is not working correctly. |
| 1501 | HTML | Fix | Base tag is ignored when converting HTML fragments to PDF |
| 1540 | HTML | Fix | Conversion hangs on certain SP2010 pages |
| 1542 | HTML | Fix | HTML to PDF on SP2010 uses wrong page breaks when split images is enabled |
| 1471 | HTML | Improvement | HTML to PDF Conversion - Error not clear when user has no rights |
| 1566 | InfoPath | Fix | InfoPath Schema Validation error on forms that use FusionX and view switching |
| 1433 | InfoPath | Improvement | Provide an option to 'skip' or 'fail' problematic InfoPath attachments |
| 1511 | InfoPath | Improvement | Downloading InfoPath XSN files on systems using FBA does not work |
| 1551 | InfoPath | Improvement | Add option to create PDF bookmarks for each attached InfoPath document |
| 1517 | InfoPath | Improvement | Add switch to pre-process Full Trust InfoPath files |
| 1568 | InfoPath | Improvement | Allow invalid SSL Certificates to be used for downloading InfoPath XSN files |
| 1513 | Merging | Fix | Merging certain PDF files results in 'index out of bounds' |
| 1594 | Merging | Fix | 'Error in Lexer' when loading PDF File |
| 1617 | Merging | Fix | Merging of PDF Files generated with iOS scanner app |
| 1618 | Merging | Fix | System.NullReferenceException when loading certain PDF files for merging |
| 1637 | Merging | Fix | Bookmarks corrupted when merging certain files |
| 1317 | MSG | New | Improve email converter (with MSG support) |
| 1508 | Nintex WF | Fix | Nintex File Merge activity fails when used inside Nintex 'For each' iterator |
| 1553 | Open Office | New | Add support for open office template files (ott, ots, otp) |
| 1505 | PDF/A | New | Allow PDF as an input format to support PDF to PDF/A conversion |
| 676 | PDF/A | New | Excel Conversion - Add support for PDF/A |
| 619 | PDF/A | New | Add support for PDF/A Post Processing (requires 'pro' license) |
| 1458 | Setup | New | Add support for silent install / uninstall of Conversion Service |
| 779 | Setup | New | Add support for the 64 bit version of Office 2010 |
| 1536 | Splitting | New | Create workflow activity to split PDF files |
| 1537 | Splitting | New | Update web service to allow PDFs to be split |
| 1541 | Visio | New | Add support for Visio VDW extension |
| 1538 | Watermarking | Fix | Watermarking using a workflow activity removes meta data |
| 1547 | Watermarking | Fix | Watermarking: Applying rotation of less than -45 degrees rotates the entire page during conversion to PDF/A |
| 1636 | Watermarking | Fix | Error in Lexer when applying watermark on merged documents |
| 1507 | Watermarking | Improvement | Watermark filtering - Checking for 'modified = today' does not work |
| 1535 | Web Service | Improvement | Add support for the conversion of files up to 1GB |
| 1613 | Web Service | Improvement | Conversion Service Authentication problems on certain systems that have disabled anonymous access (Kerberos related) |
| 1576 | Workflow | Fix | Merging Workflow Activity fails when used after a 'Pause for Duration' activity |
| 1573 | Workflow | Fix | Backslashes are not allowed in output path of HTML to PDF Conversion activity |
| 1545 | Workflow | Improvement | Merge workflow activity does not accept URLs |
For more information check out the following resources:
As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.
Download your free trial here (14MB).
Labels: AutoCAD, InfoPath, Merging, MSG, News, pdf, PDF Converter, PDF Converter Professional, PDF/A, Products, Splitting, Workflow
Posted at: 5:44 PM on 02 December 2011 by Muhimbi
To facilitate the new PDF Merging facility in our PDF Converter for SharePoint we have added the ability to convert and merge multiple files to our core PDF Conversion engine, which our SharePoint product shares with our generic Java / .NET oriented PDF Converter Services.
In this post we’ll describe in detail how to invoke this new merging facility from your own code. This demo uses Java, but the web services based interface is identical when used from .NET (See the .NET version of this same article).
This post is part of the following series related to manipulating PDF files using web services.
Key Features
The key features of the new merging facilities are as follows:
- Convert and merge any supported file format (inc. HTML, AutoCAD, MS-Office, InfoPath, TIFF, MSG) or merge existing PDF files.
- Apply different watermarks on each individual file as well as on the entire merged file (e.g. page numbering).
- Apply PDF Security settings and restrictions on the merged file.
- Optionally skip (and report) corrupt / unsupported files.
- Add PDF Bookmarks for each converted file.
- Apply any ConversionSetting supported by the regular conversion process.
Object Model
The object model is relatively straightforward. The classes related to PDF Merging are displayed below. The various classes also use a number of enumerations, which can be found in our original post about Converting files using the Web Services interface. A detailed Developer Guide is available here.

The Web Service method that controls merging of files is called ProcessBatch (highlighted in the screenshot above). It accepts a ProcessingOptions object that holds all information about the source files to convert and the MergeSettings to apply, which may optionally include security and watermarking related settings. A Results object is returned that, when it comes to merging of files, always contains a single file in element 0 that holds the byte array for the merged PDF file.
Sample code
The following sample merges all files specified on the command line into a single PDF. If the source files are not already in PDF format then it automatically converts them in the process. A PDF bookmark is automatically generated for each merged file as well.
The example described below assumes the following:
- The JDK has been installed and configured.
- The Conversion Service and all prerequisites have been installed in line with the Administration Guide.
- The Conversion Service is running in the default anonymous mode. This is not an absolute requirement, but it makes initial experimentation much easier.
The first step is to generate proxy classes for the web service by executing the following command:
wsimport http://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl -d src -Xnocompile -p com.muhimbi.ws
Feel free to change the package name and destination directory to something more suitable for your organisation.
Wsimport automatically generates the Java class names. Unfortunately some of the generated names are rather long and ugly so you may want to consider renaming some, particularly the Exception classes, to something friendlier. This, however, means that if you ever run wsimport again you will need to re-apply those changes. For more information have a look at the high level overview of the Object Model exposed by the web service.
Once the proxy classes have been created add the following sample code to your project. Run the code and make sure the files to merge are specified on the command line.
As of version 5.2 this sample code is automatically installed alongside the product. The source code, including pre-generated proxy classes for the web service, can be downloaded here.
package com.muhimbi.app;

import com.muhimbi.ws.*;
import java.io.*;
import java.net.URL;
import javax.xml.bind.JAXBElement;
import javax.xml.namespace.QName;

public class WsClient {

    private final static String DOCUMENTCONVERTERSERVICE_WSDL_LOCATION =
            "http://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl";

    private static ObjectFactory _objectFactory = new ObjectFactory();

    public static void main(String[] args) {
        try {
            if (args.length == 0) {
                System.out.println("Please specify one or more file names to convert and merge.");
            } else {
                System.out.println("Merging files");

                // ** Initialise Web Service
                DocumentConverterService_Service dcss = new DocumentConverterService_Service(
                        new URL(DOCUMENTCONVERTERSERVICE_WSDL_LOCATION),
                        new QName("http://tempuri.org/", "DocumentConverterService"));
                DocumentConverterService dcs = dcss.getBasicHttpBindingDocumentConverterService();

                // ** Get the options for all files that need to be merged
                ProcessingOptions processingOptions = getProcessingOptions(args);

                // ** Carry out the merging (and converting if needed)
                BatchResults results = dcs.processBatch(processingOptions);

                // ** Get the content of the first file, which holds the merged file in the byte array
                byte[] convertedFile = results.getResults().getValue().getBatchResult().get(0).getFile().getValue();

                // ** Write converted file to file system
                writeFile(convertedFile, "merged.pdf");
                System.out.println("Files merged into 'merged.pdf'");
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
        } catch (DocumentConverterServiceProcessBatchWebServiceFaultExceptionFaultFaultMessage e) {
            printException(e.getFaultInfo());
        }
    }

    public static ProcessingOptions getProcessingOptions(String[] sourceFileNames) throws IOException {
        // ** Options and all settings for batch conversion
        ProcessingOptions processingOptions = new ProcessingOptions();

        // ** Specify the minimum level of merge settings, you can optionally add watermarks and security settings
        MergeSettings mergeSettings = new MergeSettings();
        mergeSettings.setBreakOnError(false);
        processingOptions.setMergeSettings(_objectFactory.createProcessingOptionsMergeSettings(mergeSettings));

        // ** Create an array of files to merge
        ArrayOfSourceFile sourceFiles = new ArrayOfSourceFile();
        for (int i = 0; i < sourceFileNames.length; i++) {
            SourceFile sourceFile = getSourceFile(sourceFileNames[i]);
            sourceFiles.getSourceFile().add(sourceFile);
        }
        processingOptions.setSourceFiles(_objectFactory.createProcessingOptionsSourceFiles(sourceFiles));

        return processingOptions;
    }

    public static SourceFile getSourceFile(String fileName) throws IOException {
        File file = new File(fileName);

        // ** Read the contents of the file
        System.out.println("- Reading: " + fileName);
        byte[] sourceFileContent = readFile(fileName);

        // ** Set the absolute minimum open options
        OpenOptions openOptions = getOpenOptions(getFileName(file), getFileExtension(file));

        // ** Set the absolute minimum conversion settings.
        ConversionSettings conversionSettings = getConversionSettings();

        // ** Create merge settings for each file and set the name for the PDF bookmark
        FileMergeSettings fileMergeSettings = new FileMergeSettings();
        fileMergeSettings.setTopLevelBookmark(_objectFactory.createFileMergeSettingsTopLevelBookmark(file.getName()));

        // ** Create a source file object and return it
        SourceFile sourceFile = new SourceFile();
        sourceFile.setOpenOptions(_objectFactory.createSourceFileOpenOptions(openOptions));
        sourceFile.setConversionSettings(_objectFactory.createSourceFileConversionSettings(conversionSettings));
        sourceFile.setMergeSettings(_objectFactory.createSourceFileMergeSettings(fileMergeSettings));
        sourceFile.setFile(_objectFactory.createSourceFileFile(sourceFileContent));

        return sourceFile;
    }

    public static OpenOptions getOpenOptions(String fileName, String fileExtension) {
        OpenOptions openOptions = new OpenOptions();

        // ** Set the minimum required open options. Additional options are available
        openOptions.setOriginalFileName(_objectFactory.createOpenOptionsOriginalFileName(fileName));
        openOptions.setFileExtension(_objectFactory.createOpenOptionsFileExtension(fileExtension));

        return openOptions;
    }

    public static ConversionSettings getConversionSettings() {
        ConversionSettings conversionSettings = new ConversionSettings();

        // ** Set the minimum required conversion settings. Additional settings are available
        conversionSettings.setQuality(ConversionQuality.OPTIMIZE_FOR_PRINT);
        conversionSettings.setRange(ConversionRange.ALL_DOCUMENTS);
        conversionSettings.getFidelity().add("Full");
        conversionSettings.setFormat(OutputFormat.PDF);

        return conversionSettings;
    }

    public static String getFileName(File file) {
        String fileName = file.getName();
        return fileName;
    }

    public static String getFileExtension(File file) {
        String fileName = file.getName();
        return fileName.substring(fileName.lastIndexOf('.') + 1, fileName.length());
    }

    public static byte[] readFile(String filepath) throws IOException {
        File file = new File(filepath);
        InputStream is = new FileInputStream(file);
        long length = file.length();
        byte[] bytes = new byte[(int) length];

        int offset = 0;
        int numRead;
        while (offset < bytes.length
                && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
            offset += numRead;
        }

        if (offset < bytes.length) {
            throw new IOException("Could not completely read file " + file.getName());
        }

        is.close();
        return bytes;
    }

    public static void writeFile(byte[] fileContent, String filepath) throws IOException {
        OutputStream os = new FileOutputStream(filepath);
        os.write(fileContent);
        os.close();
    }

    public static void printException(WebServiceFaultException serviceFaultException) {
        System.out.println(serviceFaultException.getExceptionType());
        JAXBElement<ArrayOfstring> element = serviceFaultException.getExceptionDetails();
        ArrayOfstring value = element.getValue();
        for (String msg : value.getString()) {
            System.out.println(msg);
        }
    }
}
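One edge case worth knowing about in the sample: getFileExtension returns the entire file name when the name contains no dot, because lastIndexOf then returns -1. The following is a minimal sketch of a more defensive helper. The class name FileNameUtil and the decision to return an empty string when there is no extension are our own assumptions for illustration; how the Conversion Service reacts to an empty extension is not covered here.

```java
// Hypothetical helper, not part of the shipped sample code.
final class FileNameUtil {

    private FileNameUtil() {
    }

    /**
     * Returns the extension of fileName without the leading dot,
     * or an empty string when the name has no usable extension.
     */
    static String getFileExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // No dot at all, a dot only at the start (".profile"),
        // or a dot at the very end ("notes."): treat as no extension.
        if (dot <= 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1);
    }
}
```

With this in place, a caller could skip or report files for which the helper returns an empty string before adding them to the batch.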
Labels: Articles, Java, Merging, News, pdf, PDF Converter, PDF Converter Services, Products
Posted at: 6:16 PM on 25 November 2011 by Muhimbi
Some time ago we wrote about some new functionality in the PDF Converter for SharePoint that allows ‘real-time’ and ‘user specific’ watermarks to be added to PDF files as soon as these files are opened / downloaded / copied. This has proven a real hit with our customers and one of the questions that recently came up is how to use this functionality in combination with SharePoint 2010’s Document ID Service. (Hint: the internal field names are _dlc_DocId, _dlc_DocIdUrl and _dlc_DocIdPersistId.)
In this post we’ll show how to automatically apply a watermark containing the Document ID to PDF files and then generate a Short URL based on this Document ID. Rather than using our watermark on open facility, we’ll create a SharePoint Designer workflow that carries out the work. Once a Document ID has been created it never changes so although real-time watermarking will work, it is not necessary and a waste of valuable processing resources.
Please note that the example in this post uses both the Muhimbi PDF Converter for SharePoint and the Muhimbi URL Shortener for SharePoint. If you are only interested in URL Shortening or just in Watermarking then you can change the sample accordingly. Also in this example we are not carrying out any PDF Conversion. Examples for PDF Conversion are available on our site for SharePoint Designer as well as Nintex Workflow.
OK, let’s get all the prerequisites in place first.
- Download and install the PDF Converter for SharePoint and install it as described in Chapter 2 of the included Administration Guide.
- Download and install the URL Shortener for SharePoint and configure it to accept short URLs on a web application of your choice as described in the included Administration Guide.
- If not already installed, download and install the free SharePoint Designer for your environment.
- Make sure you have the appropriate privileges to create workflows on a site collection.
- Enable and configure SharePoint 2010’s Document ID Service.
Item with Document ID Populated
With the prerequisites in place, let’s build our custom solution.
- On a Document Library of your choice, modify the View to include the Document ID field. (This is just for reference, you don’t need it in the real world.)
- On this same document library create a new field named Short URL of type Hyperlink. (This is just to see the name of the short URL after the workflow has executed, it is not needed in the real world).
- Create a new SharePoint Designer Workflow using a method of your choice. In this example we’ll do this by navigating to the Library Tab and then selecting Settings / Workflow Settings / Create a Workflow in SharePoint Designer.
- Give the workflow a sensible name, e.g. Document ID Processing and change the workflow settings to automatically start when an item is created.
- Switch to Edit Workflow and insert a Create Short URL workflow Action.
- Click optional short name, followed by the fx button.
- From the 'lookup string' dialog, select Document ID Value from the Field from source drop down and close the dialog.
- Click this ID / address, followed by the '...' button.
- Enter the Document ID URL for your site, you can copy it from the Document ID of an existing item. Generally it is something like http://<path to your site collection>/_layouts/DocIdRedir.aspx?ID= .
- Position the cursor after the '=', click the Add or Change Lookup button and select the Document ID Value again. Close the String Builder.
- From Document / Display Form select Document.
- From Overwrite / Return null select Overwrite.
- When the workflow executes it will write the shortened URL to a workflow variable named this variable. The variable name can be changed, but we accept the default.
- Insert a set field in current item workflow action, select Short URL as the field.
- Click value, followed by the fx button. From the Data Source select Workflow Variables and Parameters and from Field from source select this variable.
The URL Shortening part of the example is now ready. Publish the workflow, upload a file to your document library and after a few seconds the Short URL field will be populated. Click the Short URL to verify it works.
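For readers who think in code rather than in String Builder dialogs, the address assembled in the steps above amounts to simple string concatenation. The sketch below illustrates that; the class name DocIdUrl, the example site URL and the example Document ID are all made up for illustration, and the workflow itself does not need any of this code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration of the address built in the workflow's String Builder.
final class DocIdUrl {

    private DocIdUrl() {
    }

    /** Appends the Document ID redirect page and the (URL encoded) ID to the site collection URL. */
    static String build(String siteCollectionUrl, String documentId) {
        // Avoid a double slash when the site collection URL already ends with '/'
        String base = siteCollectionUrl.endsWith("/")
                ? siteCollectionUrl.substring(0, siteCollectionUrl.length() - 1)
                : siteCollectionUrl;
        try {
            return base + "/_layouts/DocIdRedir.aspx?ID=" + URLEncoder.encode(documentId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }
}
```

For example, DocIdUrl.build("http://intranet.example.com/sites/docs/", "DOCS-12-345") yields the redirect address that the Create Short URL action would then shorten.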
This is what the final workflow will look like after the Watermarking is added as well
You can do whatever you like with this Short URL. Send it by email, tell someone on the phone, or... embed it in a watermark, something we'll do next.
Return to the Workflow editor and add the following steps:
- Position the cursor JUST ABOVE the previously added Set field action and add the If current item field matches condition.
- Click field and select File Type.
- In value enter pdf (lower case, without a leading '.').
- Position the cursor inside the condition and add the Add Text Watermark activity (other watermark types are also available).
- Fill out the fields for the watermark:
- this document: Set to Current Item.
- this file: Leave the default to overwrite the existing file.
- content: Add the this variable Workflow Variable to insert the previously generated URL. Other content can be added as well.
- position: Bottom Center.
- width x height: 600 x 30
That should take care of the watermarking. Publish the workflow and upload a PDF file. You should now see the Short URL being created and when the document is opened you should see the Short URL at the bottom of the document.
If you find this useful, or have any questions, then feel free to leave a comment below.
Labels: Articles, MuSH, News, pdf, PDF Converter, SP2010, Watermarking, Workflow
Posted at: 10:13 AM on 23 November 2011 by Muhimbi
The Muhimbi PDF Converter for SharePoint has grown over the last few years from a relatively modest SharePoint GUI application to a sophisticated framework that can be used from Nintex, SharePoint Designer and Visual Studio workflows to carry out all kinds of PDF processing, including securing, watermarking, splitting and merging.
Regardless of the operation that is carried out, input and / or output file names, including paths, need to be specified. Although we have tried to make this as intuitive as possible, and we continue to make improvements, according to our statistics the most common support call we receive is related to how to specify these paths. We hope to clarify the situation in this post.
Targeting the same directory / using the same file name as the source file
If you wish to read from or write to the same location as the source file that is being processed, e.g. you are using a SharePoint Designer workflow to automatically convert files to PDF, then leave the path empty. This works for all situations with the exception of the HTML to PDF Conversion as that is not necessarily associated with a source file (it can convert URLs as well).
If you wish to use the same name for the target file as the source file then there is no need to specify a file name; the system will generate the name for you. When converting files to PDF the file’s extension will automatically be updated to ‘PDF’. However, when the source file is already a PDF (e.g. when applying PDF Security), omitting the file name and path will overwrite the source file. This may be desired, but keep it in mind.
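The naming rule can be summarised in a few lines of code. This is only a sketch of the behaviour as described in this post, not the product's implementation; the class and method names are hypothetical and the lower case 'pdf' extension is an assumption.

```java
// Hypothetical illustration of the default naming rule described above.
final class OutputNames {

    private OutputNames() {
    }

    /** Same base name as the source, with the extension replaced by "pdf" (case is an assumption). */
    static String defaultTargetName(String sourceName) {
        int dot = sourceName.lastIndexOf('.');
        String base = (dot > 0) ? sourceName.substring(0, dot) : sourceName;
        return base + ".pdf";
    }

    /** True when the default target name matches the source name, i.e. the source would be overwritten. */
    static boolean overwritesSource(String sourceName) {
        return defaultTargetName(sourceName).equalsIgnoreCase(sourceName);
    }
}
```

So 'Report.docx' ends up next to the source as 'Report.pdf', while for 'Invoice.pdf' the default target is the source file itself.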
Targeting a (sub) folder in the current Document Library
If you wish to specify a (sub) folder in the same Document Library as the source file then you must specify the Document Library name as well as the path to the folder. Just specifying the folder name will not work. For example to convert a file to the ‘Archives/PDFs’ folder in the Shared Documents library, specify:
Shared Documents/Archives/PDFs/
Make sure you use a trailing slash as that is used in some cases to resolve potentially conflicting file and folder names. Please do not start the path with a slash in this particular case.
Targeting a different Document Library
To specify a location in a different Document Library in the current Site, specify the name of the Document Library followed by any folders. E.g. if your workflow activity is operating on a file in the Shared Documents library, but you want to write the generated file to the PDF folder in the Archive Document Library then specify the following as the path:
Archive/PDF/
Make sure you use a trailing slash when specifying folders as that is used in some cases to resolve potentially conflicting file and folder names.
Targeting a Sub Site
SharePoint Site Collections can contain multiple Sub Sites. To write a file to a location in a Sub Site specify the name of that Sub Site followed by the name of the Document Library followed by any folder names. For example if the current workflow is acting on a file in the root web of a Site Collection and you wish to write it to the PDFs folder in the Paid Document Library in the Invoices Sub Site specify the following path:
Invoices/Paid/PDFs/
Make sure you use a trailing slash when specifying folders as that is used in some cases to resolve potentially conflicting file and folder names.
To target a Sub Site ‘next’ to the current site, use an absolute path. For details see Targeting a different Site Collection below. Please note that ‘traditional’ relative path notation such as ‘../../’ is not supported.
Targeting a different Site Collection
In order to read from or write to a file in a different site collection you have to use an absolute path starting with ‘/’. For example, if a PDF Conversion workflow is running in the Accounting Site collection and the generated PDF file should be written to the Archiving site collection in the PDFs folder in the Accounting Document Library, use the following path:
/Archiving/Accounting/PDFs/
If all your Site Collections live under ‘/sites/’ then this will need to be reflected in the path, e.g.
/sites/Archiving/Accounting/PDFs/
Please do not start absolute paths with http://YourWebApplication/, always start absolute paths with a slash.
Targeting a different Web Application
As there are clear security boundaries between SharePoint Web Applications it is not possible to exchange files between them. If this is required then you will need to use a third party Workflow Activity, use Nintex Workflow or write a little bit of custom code using the Muhimbi Workflow Power Pack.
Creating dynamic paths / file names
All of Muhimbi’s SharePoint Designer and Nintex Workflow activities allow lookup variables to be used in order to generate dynamic paths. For example, in order to make a workflow generic, the name of a source or target File / Library / Folder can be determined at run-time.
For Nintex Workflow use the ‘Insert Reference’ button
In SharePoint Designer use the ‘Add Lookup’ button
Some final, generic, hints and tips:
- Never start a path with http://YourWebApplication/. Start absolute paths with a ‘/’.
- Some activities support templates in the output file name, e.g. to automatically generate file names when splitting PDFs.
- Although you can use forward (/) as well as back (\) slashes, make it a habit to use forward slashes as SharePoint Designer 2010 workflows do not always deal well with back slashes.
- No matter where you read from or write to, your user must have access to the location that is being specified.
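If you generate paths dynamically it can help to check them against the rules in this post before handing them to a workflow activity. The following is a minimal sketch of such a check for folder paths. The class name PathHints and the warning texts are our own inventions, and the check only mirrors the hints above; it knows nothing about your actual sites, libraries or permissions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the hints in this post; not part of any Muhimbi product.
final class PathHints {

    private PathHints() {
    }

    /** Returns a list of warnings for a folder path; an empty list means no rule was violated. */
    static List<String> check(String folderPath) {
        List<String> warnings = new ArrayList<String>();

        // An empty path means 'same location as the source file', which is always fine
        if (folderPath == null || folderPath.isEmpty()) {
            return warnings;
        }

        String lower = folderPath.toLowerCase();
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            warnings.add("Do not start with the web application URL, start absolute paths with '/'.");
        }

        if (folderPath.indexOf('\\') >= 0) {
            warnings.add("Prefer forward slashes over back slashes.");
        }

        // Relative notation such as '../../' is not supported
        for (String segment : folderPath.split("[/\\\\]")) {
            if (segment.equals("..")) {
                warnings.add("Relative notation ('..') is not supported, use an absolute path.");
                break;
            }
        }

        if (!folderPath.endsWith("/") && !folderPath.endsWith("\\")) {
            warnings.add("Folder paths should end with a trailing slash.");
        }

        return warnings;
    }
}
```

For instance, 'Shared Documents/Archives/PDFs/' and '/sites/Archiving/Accounting/PDFs/' pass without warnings, whereas 'http://YourWebApplication/Archive/PDF/' is flagged for starting with the web application URL.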
Labels: Articles, PDF Converter, Workflow
Posted at: 2:48 PM on 16 November 2011 by Muhimbi

In our eternal quest to add support for as many file formats as possible we have arrived at the ‘MSG’ file format. Many people don’t realise that this is probably the most popular file format in the world as each individual message in your Outlook client is an MSG file. As a result I personally have more than 50,000 of these ‘files’ on my machine.
Official support for the MSG file type is available starting with version 5.2 of the Muhimbi PDF Converter for SharePoint as well as the PDF Converter Services. The key features are as follows:
- Support for all common MSG content types, including HTML, RTF and plain text.
- Conversion of rich content including in-line images.
- Conversion of attachments.
- No need for external dependencies on the server, e.g. Outlook.
Like proud parents we ‘love all our features equally’, but our favourite child feature is the automatic conversion of attachments. Many of our customers are sitting on massive heaps of emails that need to go through a long-term archiving process to make sure that 30-40 years down the line this information can still be retrieved. Using the new MSG to PDF facility of our products it is now possible to convert each email, including all attachments, to a single PDF file. In combination with our PDF/A post-processing facility this is the perfect solution for long-term archiving.
As the converter is part of our highly scalable PDF Conversion platform, it automatically benefits from all its features including reliability, scalability, watermarking engine, cross platform support, web services based API, PDF security, SharePoint integration, Nintex Workflow integration, Java support, InfoPath attachments, Windows Azure etc.
Example output of a regular email conversation as well as part of a web based newsletter
Labels: Articles, MSG, News, pdf, PDF Converter, PDF Converter Services
Posted at: 5:02 PM on 27 October 2011 by Muhimbi
To facilitate the new PDF Splitting facility in our PDF Converter for SharePoint we have added the ability to split a single file into multiple ones to our core PDF Conversion engine, which our SharePoint product shares with our generic Java / .NET oriented PDF Converter Services.
In this post we’ll describe in detail how to invoke this new splitting facility from your own code. This demo uses C# and .NET, but the web services based interface is identical when used from Java (See this generic PDF Conversion sample).
This post is part of the following series related to manipulating PDF files using web services.
Key Features
The key features of the new splitting facility are as follows:
- Split a single PDF file into one or more individual PDF files.
- Split based on number of pages or bookmarks.
- Automatically generate numbered file names using .NET’s formatting syntax, e.g. 'split-{0:D3}.pdf' will use 3 digits for the sequential numbers, starting at ‘split-001.pdf’. When splitting by bookmark, an optional {1} parameter can be inserted in the file name to include the name of the bookmark as well.
- Can be combined with other actions, e.g. convert & merge.
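As a quick illustration of the file name templates, the sketch below runs a few templates through .NET's own string.Format. The numbering relies on the standard 'D' format specifier, so D3 pads the sequence number to three digits. We are assuming here that the sequence number is passed as {0} and the bookmark name as {1}, as described above; the templates and the bookmark name are made up for this example.

```csharp
using System;

class FileNameTemplateDemo
{
    static void Main()
    {
        // {0} is the sequence number, padded to 3 digits by the D3 specifier.
        Console.WriteLine(string.Format("split-{0:D3}.pdf", 1));    // split-001.pdf
        Console.WriteLine(string.Format("split-{0:D3}.pdf", 42));   // split-042.pdf

        // {1} is assumed to receive the bookmark name when splitting by bookmark.
        Console.WriteLine(string.Format("{0:D2}-{1}.pdf", 3, "Chapter 3"));  // 03-Chapter 3.pdf
    }
}
```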
A note about splitting based on bookmark levels: PDFs store bookmarks at the page level, so it is not clear on what part of the page a heading starts or ends. As a result an extra page will always be exported for each file split based on bookmark levels.
For example let’s assume the following document:
- Page 1: Contains chapter 1 and sections 1.1. and 1.2.
- Page 2: Contains the last paragraph of 1.2 and all of chapter 2.
- Page 3: Contains Chapter 3.
When splitting this document based on bookmarks using ‘1’ as the batch size then the following files will be created:
- File 1: Contains pages 1 and 2 as expected.
- File 2: Contains pages 2 and 3 even though Chapter 2 is only really part of page 2. This is because there is no way to know if Chapter 2 runs over into page 3 or not.
- File 3: Contains Chapter 3.
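The page ranges in this example can be reproduced with a few lines of code. Please note that this is only a sketch of the behaviour described above, not the conversion engine's actual implementation: the BookmarkSplitSketch class and PageRanges method are hypothetical, and we assume a batch size of 1 with one top level bookmark per chapter.

```csharp
using System;
using System.Collections.Generic;

static class BookmarkSplitSketch
{
    // Returns one inclusive [first, last] page range per bookmark.
    // Each range runs up to and including the page the next bookmark starts on,
    // as a chapter may end somewhere on that page.
    public static List<int[]> PageRanges(int[] bookmarkPages, int totalPages)
    {
        var ranges = new List<int[]>();
        for (int i = 0; i < bookmarkPages.Length; i++)
        {
            int first = bookmarkPages[i];
            int last = (i + 1 < bookmarkPages.Length) ? bookmarkPages[i + 1] : totalPages;
            ranges.Add(new[] { first, last });
        }
        return ranges;
    }
}

class Program
{
    static void Main()
    {
        // Chapters 1, 2 and 3 start on pages 1, 2 and 3 of a 3 page document.
        List<int[]> ranges = BookmarkSplitSketch.PageRanges(new[] { 1, 2, 3 }, 3);
        for (int i = 0; i < ranges.Count; i++)
            Console.WriteLine("File {0}: pages {1}-{2}", i + 1, ranges[i][0], ranges[i][1]);
    }
}
```

For the sample document this yields the ranges 1-2, 2-3 and 3-3, matching the three files listed above.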
Object Model
The object model is relatively straightforward. The classes related to PDF Splitting are displayed below. The various classes also use a number of enumerations, which can be found in our original post about Converting files using the Web Services interface.
The Web Service method that controls splitting (as well as merging) of files is called ProcessBatch. It accepts a ProcessingOptions object that holds all information about the files to process and the operations to apply. It returns a BatchResults object that, when it comes to splitting of files, contains one or more results, each holding the contents of a file as well as the suggested output file name, which you may use to save the file locally.
As the ProcessingOptions class accepts both MergeSettings and SplitOptions it is possible to convert and merge a set of input files and then split up the results, all in a single web service call. Just populate the various properties and the system will take care of the rest.
Example code
The following sample describes the steps needed to split up a single PDF file based on the number of pages. We are using Visual Studio and C#, but any environment that can invoke web services should be able to access this functionality. Note that the WSDL can be found at http://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl.
A generic PDF Conversion Java based example is installed alongside the product and discussed in the User & Developer Guide. The source code for this example can be found in the folder the Muhimbi Conversion service has been installed to.
- Start a new Visual Studio project and create the project type of your choice. In this example we are using a standard .NET 3.0 project of type Console Application. Name it ‘Split PDF’.
- In the Solution Explorer window, right-click References and select Add Service Reference. (Do not use web references!)
- In the Address box enter the WSDL address listed in the introduction of this section. If the Conversion Service is located on a different machine then substitute localhost with the server’s name.
- Accept the default Namespace of ServiceReference1 and click the OK button to generate the proxy classes.
- Optionally add a PDF file to the solution, set the Build Action to None and Copy to Output Directory to Copy if newer. By doing this there will always be a valid test file in the same directory as the compiled executable.
- Copy and paste the following code and replace the contents of Program.cs.
using System;
using System.IO;
using System.ServiceModel;
using Split_PDF.ServiceReference1;

namespace Split_PDF
{
    class Program
    {
        // ** The URL where the Web Service is located. Amend host name if needed.
        static string SERVICE_URL = "http://localhost:41734/Muhimbi.DocumentConverter.WebService/";

        static void Main(string[] args)
        {
            DocumentConverterServiceClient client = null;
            try
            {
                // ** Determine the source file and read it into a byte array.
                string sourceFileName = null;
                if (args.Length == 0)
                {
                    // ** Delete any split files from a previous test run.
                    foreach (string file in Directory.GetFiles(Directory.GetCurrentDirectory(),
                                                               "spf-*.pdf"))
                    {
                        File.Delete(file);
                    }

                    // ** If nothing is specified then read the first PDF file.
                    string[] sourceFiles = Directory.GetFiles(Directory.GetCurrentDirectory(),
                                                              "*.pdf");
                    if (sourceFiles.Length > 0)
                        sourceFileName = sourceFiles[0];
                    else
                    {
                        Console.WriteLine("Please specify a document to split.");
                        Console.ReadKey();
                        return;
                    }
                }
                else
                    sourceFileName = args[0];

                byte[] sourceFile = File.ReadAllBytes(sourceFileName);

                // ** Open the service and configure the bindings
                client = OpenService(SERVICE_URL);

                // ** Set the absolute minimum open options
                OpenOptions openOptions = new OpenOptions();
                openOptions.OriginalFileName = Path.GetFileName(sourceFileName);
                openOptions.FileExtension = "pdf";

                // ** Set the absolute minimum conversion settings.
                ConversionSettings conversionSettings = new ConversionSettings();

                // ** Create the ProcessingOptions for the splitting task.
                ProcessingOptions processingOptions = new ProcessingOptions()
                {
                    MergeSettings = null,
                    SplitOptions = new FileSplitOptions()
                    {
                        FileNameTemplate = "spf-{0:D3}",
                        FileSplitType = FileSplitType.ByNumberOfPages,
                        BatchSize = 5,
                        BookmarkLevel = 0
                    },
                    SourceFiles = new SourceFile[1]
                    {
                        new SourceFile()
                        {
                            MergeSettings = null,
                            OpenOptions = openOptions,
                            ConversionSettings = conversionSettings,
                            File = sourceFile
                        }
                    }
                };

                // ** Carry out the splitting.
                Console.WriteLine("Splitting file " + sourceFileName);
                BatchResults batchResults = client.ProcessBatch(processingOptions);

                // ** Process the returned files
                foreach (BatchResult result in batchResults.Results)
                {
                    Console.WriteLine("Writing split file " + result.FileName);
                    File.WriteAllBytes(result.FileName, result.File);
                }
                Console.WriteLine("Finished.");
            }
            catch (FaultException<WebServiceFaultException> ex)
            {
                Console.WriteLine("FaultException occurred: ExceptionType: " +
                                  ex.Detail.ExceptionType.ToString());
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
            finally
            {
                CloseService(client);
            }
            Console.ReadKey();
        }

        /// <summary>
        /// Configure the Bindings, endpoints and open the service using the specified address.
        /// </summary>
        /// <returns>An instance of the Web Service.</returns>
        public static DocumentConverterServiceClient OpenService(string address)
        {
            DocumentConverterServiceClient client = null;
            try
            {
                BasicHttpBinding binding = new BasicHttpBinding();

                // ** Use standard Windows Security.
                binding.Security.Mode = BasicHttpSecurityMode.TransportCredentialOnly;
                binding.Security.Transport.ClientCredentialType =
                    HttpClientCredentialType.Windows;

                // ** Increase the client Timeout to deal with (very) long running requests.
                binding.SendTimeout = TimeSpan.FromMinutes(30);
                binding.ReceiveTimeout = TimeSpan.FromMinutes(30);

                // ** Set the maximum document size to 50MB
                binding.MaxReceivedMessageSize = 50 * 1024 * 1024;
                binding.ReaderQuotas.MaxArrayLength = 50 * 1024 * 1024;
                binding.ReaderQuotas.MaxStringContentLength = 50 * 1024 * 1024;

                // ** Specify an identity (any identity) in order to get it past .net3.5 sp1
                EndpointIdentity epi = EndpointIdentity.CreateUpnIdentity("unknown");
                EndpointAddress epa = new EndpointAddress(new Uri(address), epi);

                client = new DocumentConverterServiceClient(binding, epa);
                client.Open();
                return client;
            }
            catch (Exception)
            {
                CloseService(client);
                throw;
            }
        }

        /// <summary>
        /// Check if the client is open and then close it.
        /// </summary>
        /// <param name="client">The client to close</param>
        public static void CloseService(DocumentConverterServiceClient client)
        {
            if (client != null && client.State == CommunicationState.Opened)
                client.Close();
        }
    }
}
Compile the application and run it either from the command prompt, with a path to the PDF file to split on the command line, or – if a PDF file is present in the executable’s folder – just run it.
Note that in this example we are programmatically configuring the WCF Bindings and End Points. If you wish you can use the declarative approach using the config file as well.
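For those who prefer the declarative approach, the fragment below is a rough sketch of what the equivalent app.config section could look like. It mirrors the values used in the OpenService method (Windows security, 30 minute timeouts, 50MB limits). The binding name, the endpoint name and the contract name are assumptions; the contract name in particular depends on the proxy classes generated in your project, so please check it against the generated code.

```xml
<configuration>
  <system.serviceModel>
    <bindings>
      <basicHttpBinding>
        <!-- Binding name is an assumption; 52428800 bytes = 50MB -->
        <binding name="DocumentConverterBinding"
                 sendTimeout="00:30:00"
                 receiveTimeout="00:30:00"
                 maxReceivedMessageSize="52428800">
          <readerQuotas maxArrayLength="52428800"
                        maxStringContentLength="52428800" />
          <security mode="TransportCredentialOnly">
            <transport clientCredentialType="Windows" />
          </security>
        </binding>
      </basicHttpBinding>
    </bindings>
    <client>
      <!-- Endpoint and contract names are assumptions, verify against your proxy -->
      <endpoint name="DocumentConverterEndpoint"
                address="http://localhost:41734/Muhimbi.DocumentConverter.WebService/"
                binding="basicHttpBinding"
                bindingConfiguration="DocumentConverterBinding"
                contract="ServiceReference1.DocumentConverterService">
        <identity>
          <userPrincipalName value="unknown" />
        </identity>
      </endpoint>
    </client>
  </system.serviceModel>
</configuration>
```

With a section like this in place the client can typically be created by passing the endpoint name to the generated proxy's constructor, e.g. new DocumentConverterServiceClient("DocumentConverterEndpoint"), instead of building the binding in code.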
This new functionality is available as of version 5.2 of our software.
Labels: Articles, News, pdf, PDF Converter Services, Products, Splitting