Convert Office files to PDF Format from Java using a Web Services based interface

Posted at: 14:10 on 20 April 2010 by Muhimbi


As we have been receiving an increasing number of requests for Java based sample code for the Muhimbi PDF Converter API and Server Platform, we have decided to lift the relevant chapter from the Developer Guide and publish it in this blog post. A .NET version of this post is available here.

For those not familiar with the product, the Muhimbi PDF Converter API and Server Platform (MDCS) is a server based SDK that allows software developers to convert typical Office files, including MS-Word, Excel, PowerPoint, Visio, Publisher and InfoPath, to PDF format using a robust, scalable but friendly Web Services interface from Java and .NET based solutions.

Even though the MDCS itself must run on a Windows based server, it has been designed to interoperate with non-Windows platforms such as Java. This section describes how to convert documents to PDF format using a Java based environment.

The full version of the sample code discussed in this post, including pre-generated proxies, is installed alongside each copy of the MDCS.

The example described below assumes the following:

  1. The JDK has been installed and configured.
  2. The MDCS and all prerequisites have been installed in line with the Administration Guide.
  3. The MDCS is running in the default anonymous mode. This is not an absolute requirement, but it makes initial experimentation much easier.


The first step is to generate proxy classes for the web service by executing the following command:

    wsimport http://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl
    -d src -Xnocompile -p

Feel free to change the package name and destination directory to something more suitable for your organisation.

If the Muhimbi Conversion Service is not located on the same system where wsimport is executed, change localhost to the name of the server running the Conversion Service. You will also need to change the host name in the Conversion Service’s config file. A convenient shortcut to the Installation folder is located in the Muhimbi Start Menu. Open Muhimbi.DocumentConverter, search for baseAddress and change the host name. Restart the Muhimbi Document Converter Service to activate the change.
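
As a small illustration of the host name change, the sketch below builds the WSDL address for an arbitrary server. The port and path are copied from the wsimport command above; the class name WsdlLocation and the method wsdlFor are hypothetical helpers, not part of the product.

```java
import java.net.URI;

public class WsdlLocation {

  // Port and path as used in the wsimport command in this post.
  private static final int PORT = 41734;
  private static final String PATH = "/Muhimbi.DocumentConverter.WebService/?wsdl";

  /** Returns the WSDL address for the given host (hypothetical helper). */
  public static URI wsdlFor(String host) {
    if (host == null || host.trim().isEmpty()) {
      throw new IllegalArgumentException("A host name is required.");
    }
    return URI.create("http://" + host.trim() + ":" + PORT + PATH);
  }

  public static void main(String[] args) {
    // Defaults to localhost when no host name is passed on the command line.
    String host = args.length > 0 ? args[0] : "localhost";
    System.out.println(wsdlFor(host));
  }
}
```

The printed address can be passed to wsimport in place of the localhost URL.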

Wsimport automatically generates the Java class names. Unfortunately some of the generated names are rather long and ugly so you may want to consider renaming some, particularly the Exception classes, to something friendlier. This, however, means that if you ever run wsimport again you will need to re-apply those changes. For more information have a look at the high level overview of the Object Model exposed by the web service.
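
An alternative to renaming the generated classes is to leave them untouched and translate their exceptions into a short, hand-written one at the point where the service is called, so nothing is lost when wsimport is run again. The sketch below illustrates the idea under stated assumptions: GeneratedConvertFaultMessage and GeneratedCall are stand-ins for the generated exception class and service call, and ConversionException is a hypothetical name.

```java
public class FriendlyFaults {

  /** Stand-in for a long wsimport-generated fault class (hypothetical). */
  public static class GeneratedConvertFaultMessage extends Exception {
    public GeneratedConvertFaultMessage(String message) {
      super(message);
    }
  }

  /** Short, hand-written exception for application code to catch. */
  public static class ConversionException extends Exception {
    public ConversionException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Stand-in for a call on the generated proxy (hypothetical). */
  public interface GeneratedCall {
    byte[] run() throws GeneratedConvertFaultMessage;
  }

  /** Runs the call and rethrows any generated fault as a ConversionException. */
  public static byte[] convert(GeneratedCall call) throws ConversionException {
    try {
      return call.run();
    } catch (GeneratedConvertFaultMessage e) {
      // Keep the original fault as the cause so no detail is lost.
      throw new ConversionException("Conversion failed: " + e.getMessage(), e);
    }
  }

  public static void main(String[] args) {
    try {
      convert(() -> {
        throw new GeneratedConvertFaultMessage("unsupported file");
      });
    } catch (ConversionException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Only the wrapper needs to know the generated names, so the rest of the application is unaffected when the proxies are regenerated.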

Once the proxy classes have been created, add the following sample code to your project. When running the code, specify the path of the document to convert on the command line. (Download Source Code)

This example sets ConversionSettings.Format to OutputFormat.PDF. As a result the file is converted to the default PDF format. It is possible to convert files to other file formats as well by setting this property to a different value. For details see this blog post.



import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import javax.xml.bind.JAXBElement;

// The proxy classes generated by wsimport must be on the classpath and either
// imported here or placed in the same package as this class.
public class WsClient {

  public static void main(String[] args) {
    try {
      if (args.length != 1) {
        System.out.println("Please specify a single file name on the command line.");
      } else {
        // ** Process command line parameters
        String sourceDocumentPath = args[0];
        File file = new File(sourceDocumentPath);
        String fileName = getFileName(file);
        String fileExt = getFileExtension(file);

        System.out.println("Converting file " + sourceDocumentPath);

        // ** Initialise Web Service
        // The no-argument constructor uses the WSDL location and service name
        // that wsimport recorded when the proxies were generated.
        DocumentConverterService_Service dcss = new DocumentConverterService_Service();
        DocumentConverterService dcs = dcss.getBasicHttpBindingDocumentConverterService();

        // ** Only call conversion if file extension is supported
        if (isFileExtensionSupported(fileExt, dcs)) {
          // ** Read source file from disk
          byte[] fileContent = readFile(sourceDocumentPath);

          // ** Converting the file
          OpenOptions openOptions = getOpenOptions(fileName, fileExt);
          ConversionSettings conversionSettings = getConversionSettings();
          byte[] convertedFile = dcs.convert(fileContent, openOptions, conversionSettings);

          // ** Writing converted file to file system
          String destinationDocumentPath = getPDFDocumentPath(file);
          writeFile(convertedFile, destinationDocumentPath);
          System.out.println("File converted successfully to " + destinationDocumentPath);

        } else {
          System.out.println("The file extension is not supported.");
        }
      }
    } catch (IOException e) {
      System.out.println(e.getMessage());
    } catch (DocumentConverterServiceGetConfigurationWebServiceFaultExceptionFaultFaultMessage e) {
      printException(e.getFaultInfo());
    } catch (DocumentConverterServiceConvertWebServiceFaultExceptionFaultFaultMessage e) {
      printException(e.getFaultInfo());
    }
  }

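  public static OpenOptions getOpenOptions(String fileName, String fileExtension) {
    ObjectFactory objectFactory = new ObjectFactory();
    OpenOptions openOptions = new OpenOptions();
    // ** Set the minimum required open options.
    // NOTE: the setter and factory method names below are assumed from the usual
    // wsimport / JAXB naming pattern; verify them against your generated classes.
    openOptions.setOriginalFileName(objectFactory.createOpenOptionsOriginalFileName(fileName));
    openOptions.setFileExtension(objectFactory.createOpenOptionsFileExtension(fileExtension));
    return openOptions;
  }

  public static ConversionSettings getConversionSettings() {
    ConversionSettings conversionSettings = new ConversionSettings();
    // ** Convert to the default PDF format
    conversionSettings.setFormat(OutputFormat.PDF);
    return conversionSettings;
  }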

  // ** Returns the file name without its extension
  public static String getFileName(File file) {
    String fileName = file.getName();
    return fileName.substring(0, fileName.lastIndexOf('.'));
  }

  // ** Returns the extension without the leading dot
  public static String getFileExtension(File file) {
    String fileName = file.getName();
    return fileName.substring(fileName.lastIndexOf('.') + 1, fileName.length());
  }

  // ** Returns the path of the converted file, placed next to the source file
  public static String getPDFDocumentPath(File file) {
    String fileName = getFileName(file);
    String folder = file.getParent();
    if (folder == null) {
      folder = new File(file.getAbsolutePath()).getParent();
    }
    return folder + File.separatorChar + fileName + '.' + OutputFormat.PDF.value();
  }

  public static byte[] readFile(String filepath) throws IOException {
    File file = new File(filepath);
    long length = file.length();
    byte[] bytes = new byte[(int) length];

    // ** try-with-resources closes the stream even if reading fails
    try (InputStream is = new FileInputStream(file)) {
      int offset = 0;
      int numRead;
      while (offset < bytes.length
          && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
        offset += numRead;
      }

      if (offset < bytes.length) {
        throw new IOException("Could not completely read file " + file.getName());
      }
    }
    return bytes;
  }

  public static void writeFile(byte[] fileContent, String filepath) throws IOException {
    try (OutputStream os = new FileOutputStream(filepath)) {
      os.write(fileContent);
    }
  }

  public static boolean isFileExtensionSupported(String extension, DocumentConverterService dcs)
      throws DocumentConverterServiceGetConfigurationWebServiceFaultExceptionFaultFaultMessage {
    // ** Ask the service which converters are configured and which extensions they accept
    Configuration configuration = dcs.getConfiguration();
    final JAXBElement<ArrayOfConverterConfiguration> converters = configuration.getConverters();
    final ArrayOfConverterConfiguration ofConverterConfiguration = converters.getValue();
    final List<ConverterConfiguration> cList = ofConverterConfiguration.getConverterConfiguration();
    for (ConverterConfiguration cc : cList) {
      final List<String> supportedExtension = cc.getSupportedFileExtensions().getValue().getString();
      if (supportedExtension.contains(extension)) {
        return true;
      }
    }
    return false;
  }

  // ** Prints the detail messages carried by a web service fault
  public static void printException(WebServiceFaultException serviceFaultException) {
    JAXBElement<ArrayOfstring> element = serviceFaultException.getExceptionDetails();
    ArrayOfstring value = element.getValue();
    for (String msg : value.getString()) {
      System.out.println(msg);
    }
  }
}


Use Server Side Java & .NET to Convert Word, Excel, PowerPoint, Publisher & Visio to PDF

Posted at: 15:27 on 19 April 2010 by Muhimbi

We are very excited to announce our first product spin-off, the Muhimbi PDF Converter API and Platform (MDCS), a server side solution to convert typical MS-Office files to PDF Format in a robust and scalable manner from any web services capable environment, including Java and .NET.

The MDCS is a spin-off from our popular PDF Converter for SharePoint. Much of the logic is the same, but all the SharePoint specific bells and whistles have been removed and the documentation has been completely rewritten and includes both Java and .NET sample code.


Key features:

  • Convert popular document types to PDF or XPS format with near perfect fidelity.
  • Scalable architecture that allows multiple conversions to run in parallel. The service can be scaled up by adding additional CPUs and scaled out by using standard HTTP Load Balancers.
  • Runs as a Windows Service. No need to install or configure IIS or other web service frameworks.
  • Convert password protected documents.
  • Apply security settings to generated PDF files including encryption, password protection and multiple levels of PDF Security options to prevent users from printing documents or copying a document’s content.
  • Generate regular PDF files or files in PDF/A format.
  • Generate high resolution PDF Files optimised for printing or normal resolution files optimised for use on screen.
  • Dynamically refresh a document’s content before generating the PDF. Ideal for merging content from external sources into your PDF file.
  • Control how to convert hidden / selected content such as PowerPoint Slides and Excel worksheets.
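
Because the service can run several conversions in parallel, a client may submit more than one document at a time. The following is a minimal client-side sketch of that idea using a fixed thread pool. The Converter interface is a hypothetical stand-in for a call to the web service proxy, and the pool size is an assumption that would need tuning against the server's capacity.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelConversions {

  /** Stand-in for one call to the conversion web service (hypothetical). */
  public interface Converter {
    byte[] convert(String documentName) throws Exception;
  }

  /** Converts all documents using at most the given number of concurrent calls. */
  public static Map<String, byte[]> convertAll(
      List<String> documents, Converter converter, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit everything first so the conversions can overlap.
      Map<String, Future<byte[]>> pending = new LinkedHashMap<>();
      for (String document : documents) {
        Callable<byte[]> task = () -> converter.convert(document);
        pending.put(document, pool.submit(task));
      }
      // Collect the results in submission order; get() blocks until each is done.
      Map<String, byte[]> results = new LinkedHashMap<>();
      for (Map.Entry<String, Future<byte[]>> entry : pending.entrySet()) {
        results.put(entry.getKey(), entry.getValue().get());
      }
      return results;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    // A fake converter that returns the document name as bytes.
    Converter fake = name -> name.getBytes("UTF-8");
    Map<String, byte[]> results = convertAll(List.of("a.docx", "b.xlsx"), fake, 2);
    System.out.println(results.keySet());
  }
}
```

A failed conversion surfaces as an ExecutionException whose cause is the original error, so callers can decide whether to retry a single document or abandon the batch.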

The web services class diagram is displayed below. A blog post describing how to use the Web Service in a .NET environment can be found here.


[Image: PDF Converter Web Services class diagram] Powerful Web Services based interface.



As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.


Download your free trial here (4MB).


