Posted at: 17:40 on 16 May 2011 by Muhimbi
To facilitate the new PDF Merging facility in our PDF Converter for SharePoint we have added the ability to convert and merge multiple files to our core PDF Conversion engine, which our SharePoint product shares with our generic Java / .NET oriented PDF Converter API and Server Platform.
In this post we’ll describe in detail how to invoke this new merging facility from your own code. This demo uses C# and .NET, but the web services based interface is identical when used from Java (see this generic PDF Conversion sample).
This post is part of the following series related to manipulating PDF files using web services.
- Converting Office files to PDF Format using a Web Services based interface (C# / .NET).
- Converting Office files to PDF Format using a Web Services based interface (Java).
- Invoking the PDF Converter Web Service from Visual Studio 2005 using VB.net
- Using Windows Azure to convert documents to PDF format.
- Using the awesome new watermarking features of the Muhimbi PDF Converter Services (C# / .NET).
- Using the PDF Watermarking features from Java based environments.
Key Features
The key features of the new merging facilities are as follows:
- Convert and merge any supported file format (inc. HTML, AutoCAD, MS-Office, InfoPath, TIFF) or merge existing PDF files.
- Apply different watermarks on each individual file as well as on the entire merged file (e.g. page numbering).
- Apply PDF Security settings and restrictions on the merged file.
- Optionally skip (and report) corrupt / unsupported files.
- Add PDF Bookmarks for each converted file.
- Apply any ConversionSetting supported by the regular conversion process.
Object Model
The object model is relatively straightforward. The classes related to PDF Merging are displayed below. The various classes also use a number of enumerations, which can be found in our original post about Converting files using the Web Services interface.
The Web Service method that controls merging of files is called ProcessBatch (highlighted in the screenshot above). It accepts a ProcessingOptions object that holds all information about the source files to convert and the MergeSettings to apply, which may include security and watermarking related settings. It returns a BatchResults object that, when merging files, always contains a single result holding the byte array for the merged PDF file.
Simple example code
The following sample describes the steps needed to convert all files in a directory, merge the results into a single file and apply page numbering to the merged file using the built in watermarking engine. We are using Visual Studio and C#, but any environment that can invoke web services should be able to access this functionality. Note that the WSDL can be found at http://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl. A generic PDF Conversion Java based example is installed alongside the product and discussed in the User & Developer Guide.
- Start a new Visual Studio project and create the project type of your choice. In this example we are using a standard .net 3.0 project of type Windows Forms Application. Name it ‘Simple PDF Converter Sample’.
- Add a TextBox and Button control to the form. Accept the default names of textBox1 and button1.
- In the Solution Explorer window, right-click References and select Add Service Reference. (Do not use web references!)
- In the Address box enter the WSDL address listed in the introduction of this section. If the Conversion Service is located on a different machine then substitute localhost with the server’s name.
- Accept the default Namespace of ServiceReference1 and click the OK button to generate the proxy classes.
- Double-click button1 and replace the content of the entire code file with the following:
using System;
using System.Collections.Generic;
using System.IO;
using System.ServiceModel;
using System.Windows.Forms;
using Simple_PDF_Converter_Sample.ServiceReference1;

namespace Simple_PDF_Converter_Sample
{
    public partial class Form1 : Form
    {
        // ** The URL where the Web Service is located. Amend host name if needed.
        string SERVICE_URL = "http://localhost:41734/Muhimbi.DocumentConverter.WebService/";

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            DocumentConverterServiceClient client = null;
            try
            {
                // ** Options and all settings for batch conversion
                ProcessingOptions processingOptions = new ProcessingOptions();

                // ** Specify the minimum level of merge settings
                MergeSettings mergeSettings = new MergeSettings();
                mergeSettings.BreakOnError = false;
                mergeSettings.Watermarks = CreateWatermarks();
                processingOptions.MergeSettings = mergeSettings;

                // ** Get all files in the folder
                string sourceFolder = textBox1.Text;
                string[] sourceFileNames = Directory.GetFiles(sourceFolder);

                // ** Iterate over all files and create a list of SourceFile Objects
                List<SourceFile> sourceFiles = new List<SourceFile>();
                foreach (string sourceFileName in sourceFileNames)
                {
                    // ** Read the contents of the file
                    byte[] sourceFileContent = File.ReadAllBytes(sourceFileName);

                    // ** Set the absolute minimum open options
                    OpenOptions openOptions = new OpenOptions();
                    openOptions.OriginalFileName = Path.GetFileName(sourceFileName);
                    openOptions.FileExtension = Path.GetExtension(sourceFileName);

                    // ** Set the absolute minimum conversion settings.
                    ConversionSettings conversionSettings = new ConversionSettings();
                    conversionSettings.Fidelity = ConversionFidelities.Full;
                    conversionSettings.Quality = ConversionQuality.OptimizeForPrint;

                    // ** Create merge settings for each file and set the name for the PDF bookmark
                    FileMergeSettings fileMergeSettings = new FileMergeSettings();
                    fileMergeSettings.TopLevelBookmark = openOptions.OriginalFileName;

                    // ** Create a source file object and add it to the list
                    SourceFile sourceFile = new SourceFile();
                    sourceFile.OpenOptions = openOptions;
                    sourceFile.ConversionSettings = conversionSettings;
                    sourceFile.MergeSettings = fileMergeSettings;
                    sourceFile.File = sourceFileContent;
                    sourceFiles.Add(sourceFile);
                }

                // ** Assign source files
                processingOptions.SourceFiles = sourceFiles.ToArray();

                // ** Open the service and configure the bindings
                client = OpenService(SERVICE_URL);

                // ** Carry out the merge process
                BatchResults results = client.ProcessBatch(processingOptions);

                // ** Read the results of the merged file.
                byte[] mergedFile = results.Results[0].File;

                // ** Write the converted file back using the name of the folder
                string folderName = new DirectoryInfo(sourceFolder).Name;
                DirectoryInfo parentFolder = Directory.GetParent(sourceFolder);
                string destinationFileName = Path.Combine(parentFolder.FullName, folderName + ".pdf");
                using (FileStream fs = File.Create(destinationFileName))
                {
                    fs.Write(mergedFile, 0, mergedFile.Length);
                    fs.Close();
                }

                MessageBox.Show("Contents of directory merged to " + destinationFileName);
            }
            catch (FaultException<WebServiceFaultException> ex)
            {
                MessageBox.Show("FaultException occurred: ExceptionType: " +
                                ex.Detail.ExceptionType.ToString());
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.ToString());
            }
            finally
            {
                CloseService(client);
            }
        }

        /// <summary>
        /// Configure the Bindings, endpoints and open the service using the specified address.
        /// </summary>
        /// <returns>An instance of the Web Service.</returns>
        public static DocumentConverterServiceClient OpenService(string address)
        {
            DocumentConverterServiceClient client = null;
            try
            {
                BasicHttpBinding binding = new BasicHttpBinding();

                // ** Use standard Windows Security.
                binding.Security.Mode = BasicHttpSecurityMode.TransportCredentialOnly;
                binding.Security.Transport.ClientCredentialType =
                    HttpClientCredentialType.Windows;

                // ** Increase the Timeout to deal with (very) long running requests.
                binding.SendTimeout = TimeSpan.FromMinutes(30);
                binding.ReceiveTimeout = TimeSpan.FromMinutes(30);

                // ** Set the maximum document size to 50MB
                binding.MaxReceivedMessageSize = 50 * 1024 * 1024;
                binding.ReaderQuotas.MaxArrayLength = 50 * 1024 * 1024;
                binding.ReaderQuotas.MaxStringContentLength = 50 * 1024 * 1024;

                // ** Specify an identity (any identity) in order to get it past .net3.5 sp1
                EndpointIdentity epi = EndpointIdentity.CreateUpnIdentity("unknown");
                EndpointAddress epa = new EndpointAddress(new Uri(address), epi);

                client = new DocumentConverterServiceClient(binding, epa);
                client.Open();
                return client;
            }
            catch (Exception)
            {
                CloseService(client);
                throw;
            }
        }

        /// <summary>
        /// Check if the client is open and then close it.
        /// </summary>
        /// <param name="client">The client to close</param>
        public static void CloseService(DocumentConverterServiceClient client)
        {
            if (client != null && client.State == CommunicationState.Opened)
                client.Close();
        }

        /// <summary>
        /// This method creates watermarks for applying page numbers
        /// </summary>
        /// <returns>Array of watermarks</returns>
        private Watermark[] CreateWatermarks()
        {
            // ** Create watermark container
            Watermark pageWatermark = new Watermark();

            // ** Set positioning to the lower right of the page
            pageWatermark.HPosition = HPosition.Right;
            pageWatermark.VPosition = VPosition.Bottom;

            // ** Set size
            pageWatermark.Width = "200";
            pageWatermark.Height = "20";

            // ** Create text object for the page numbering
            Text oddPageText = new Text();

            // ** No need to position the element in the watermark container
            oddPageText.Width = "200";
            oddPageText.Height = "20";

            // ** Set content including field codes
            oddPageText.Content = "Page {PAGE} of {NUMPAGES}";

            // ** Set font properties
            oddPageText.FillColor = "#ffff0000";
            oddPageText.FontFamilyName = "Verdana";
            oddPageText.FontSize = "10";
            oddPageText.FontStyle = FontStyle.Regular;

            // ** Set text alignment
            oddPageText.HAlign = HAlign.Right;
            oddPageText.VAlign = VAlign.Top;

            // ** Create array of watermark elements
            Element[] pageWatermarkElements = new Element[] { oddPageText };

            // ** Set elements of watermark
            pageWatermark.Elements = pageWatermarkElements;

            // ** Return array of watermarks
            return new Watermark[] { pageWatermark };
        }
    }
}
Provided the project and all controls are named as per the steps above, it should compile without errors. Run it, enter the full path to a folder that holds a couple of files (PDF, Word, Excel, etc.) and click the button to start the convert and merge process. The operation may take a while depending on the number and complexity of files in the folder.
Note that in this example we are programmatically configuring the WCF bindings and endpoints. If you wish, you can instead use a declarative approach using the config file.
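As a rough illustration of the declarative approach, the settings made in OpenService could be expressed in the application's app.config along the following lines. This is a minimal sketch, not a tested configuration: the binding name MuhimbiBinding and the endpoint name MuhimbiEndpoint are made up for this example, and the contract name is an assumption based on the default ServiceReference1 namespace, so check the generated proxy (Reference.cs) for the actual value.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.serviceModel>
    <bindings>
      <basicHttpBinding>
        <!-- Hypothetical binding name. 52428800 bytes = 50 * 1024 * 1024 (50MB) -->
        <binding name="MuhimbiBinding"
                 sendTimeout="00:30:00"
                 receiveTimeout="00:30:00"
                 maxReceivedMessageSize="52428800">
          <readerQuotas maxArrayLength="52428800"
                        maxStringContentLength="52428800" />
          <!-- Standard Windows security, as in the programmatic sample -->
          <security mode="TransportCredentialOnly">
            <transport clientCredentialType="Windows" />
          </security>
        </binding>
      </basicHttpBinding>
    </bindings>
    <client>
      <!-- Hypothetical endpoint name; the contract value is an assumption -->
      <endpoint name="MuhimbiEndpoint"
                address="http://localhost:41734/Muhimbi.DocumentConverter.WebService/"
                binding="basicHttpBinding"
                bindingConfiguration="MuhimbiBinding"
                contract="ServiceReference1.DocumentConverterService">
        <identity>
          <userPrincipalName value="unknown" />
        </identity>
      </endpoint>
    </client>
  </system.serviceModel>
</configuration>
```

With a configuration like this in place, the client can be created by passing the endpoint name to the generated proxy's constructor, for example new DocumentConverterServiceClient("MuhimbiEndpoint"), instead of building the binding and endpoint address in code.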
A more complex and full featured sample application is installed, with full source code, alongside the Conversion Service.
This new functionality is available as of version 5.0 of our software.
Labels: Articles, Merging, News, pdf, PDF Converter, PDF Converter Services, Products
14 Comments:
Hi,
My client needs the following functionality:
An option to merge several files into one PDF, but with the following rule: if a document has an odd number of pages (e.g. 3), the next merged document needs to start on the next odd page. That way, when the merged document is printed double-sided, there is a blank page between documents 1 and 2.
Is this doable by code using Muhimbi?
Regards,
Michel Smit - iDevteam
By
Anonymous, At
07 November, 2011 12:02
Hi Michel,
Unfortunately that facility is not available at this time, but I have added it as a feature request. I cannot make any promises about if or when this will be implemented.
For now you will need to carry out the conversion using the Muhimbi Service, but then carry out the merging using your own code using a free third party PDF library such as iTextSharp. (See http://goo.gl/a4Wam)
By
Muhimbi, At
07 November, 2011 12:20
Is there a way to convert multiple files to one merged PDF using just the API, without having to explicitly call the web service? I think that would be more useful than the example above, where the Web Service URL is hard coded. I know I can set the service URL as a configuration value somewhere, but since the setup for the PDF Converter already has this value, why not make the merging a method that can be called like Muhimbi.SharePoint.DocumentConverter.PDF.DocumentConverter()? Or am I just missing something here?
By
Anonymous, At
28 January, 2013 23:07
The Web Service is our only official and documented API, but you can request the URL of the Web Services configured in Central Administration using an instance of 'Muhimbi.SharePoint.DocumentConverter.PDF.WebServiceDocumentProcessorConfiguration' and then requesting the 'WebServiceURL' property.
This is unlikely to change, but it is not our official API so we do not guarantee it will not change in the future.
By
Muhimbi, At
29 January, 2013 09:43
What if all of the documents I want to merge into one are already .pdf files? Is the web service call necessary then?
By
Anonymous, At
29 January, 2013 14:19
The web service does ALL the work. There is very little processing logic inside the SharePoint front-end.
The Conversion Service automatically detects if files are already in PDF format. If needed it converts non-PDF files first and then merges everything together before returning the single, merged, PDF.
As part of the same operation it can also apply watermarks, PDF Security and other modifications so everything needs to be handled in a central engine.
By
Muhimbi, At
29 January, 2013 14:24
Ok gotcha, Thank you for the information.
By
Anonymous, At
29 January, 2013 17:35
So when trying to merge a docx template with the data source file (XML), if there is a space in the template's file name, will it throw an exception? The service should be processing files with or without a space in the filename, right?
By
Anonymous, At
23 April, 2015 19:26
The name of the file should not matter. If you have any questions about this then please contact support@muhimbi.com
By
Muhimbi, At
24 April, 2015 08:05
Hello
I am converting web pages to PDF programmatically using Muhimbi.
I would like to know if there is a way to change the document margins programmatically without going through the "Muhimbi.DocumentConverter.Service.exe.config".
I also have another little bug that I can't figure out how to solve. The PDF always shows a border on the right. Sometimes I do not see it on screen, but it appears when printed.
By
Unknown, At
30 November, 2016 19:38
Hi Cristóvão,
In ConversionSettings.ConverterSpecificSettings you can pass in an instance of ConverterSpecificSettings_HTML, which can be used to set the margins on a call-by-call basis.
I am writing this from memory, for exact details see the Developer Guide at http://www.muhimbi.com/support/documentation/PDF-Converter-Services/User---Developer-Guide.aspx, specifically section 3.2.6.
Please contact support@muhimbi.com with regards to the other issue you are encountering.
By
Muhimbi, At
01 December, 2016 09:47
Resolved :)
Thanks
ConverterSpecificSettings_HTML PageSettings = new ConverterSpecificSettings_HTML();
PageSettings.PageMargins = "13mm, 13mm, 13mm, 21mm";
PageSettings.PaperSize = "A4";
PageSettings.ClearBrowserCache = true;
PageSettings.SplitImages = false;
conversionSettings.ConverterSpecificSettings = PageSettings;
By
Unknown, At
02 December, 2016 11:06
Hi,
Can you help me?
I use Muhimbi to merge documents.
In my development environment it works fine, but in my quality environment I get the following error when I try to merge:
The formatter threw an exception while trying to deserialize the message: There was an error while trying to deserialize parameter http://services.muhimbi.com/2009/10/06:options. The InnerException message was 'Invalid enum value 'Default' cannot be deserialized into type 'Muhimbi.DocumentConverter.WebService.Data.DocumentStartPage'. Ensure that the necessary enum values are present and are marked with EnumMemberAttribute attribute if the type has DataContractAttribute attribute.'. Please see InnerException for more details.; STACK TRACE: Server stack trace: at System.ServiceModel.Channels.ServiceChannel.ThrowIfFaultUnderstood(Message reply, MessageFault fault, String action, MessageVersion version,...
By
Unknown, At
14 December, 2016 19:09
Hi,
The best way to resolve this is to contact support@muhimbi.com.
By
Muhimbi, At
14 December, 2016 19:15