Convert PDF to XLSX, PDF to DOC & PDF to HTML using Muhimbi’s Server Side Converters

Posted at: 15:07 on 24 July 2014 by David Radford

Although our software also supports cross-conversion between a number of file types, at Muhimbi our focus has always been on converting a wide range of document types to PDF with perfect fidelity. But why only let people convert to PDF?  Why not look at ways to allow conversion from PDF as well?

As regular readers of our blog might have already guessed, this is where our ability to integrate 3rd party converters into both the Muhimbi PDF Converter for SharePoint and Muhimbi PDF Converter Services comes in handy yet again!

Finding a good 3rd party command line converter to use as an example is always a challenge for us.  Our developers have spent countless hours refining Muhimbi’s own converters, so we rarely find one that completely meets our standards.  We are proud of our products and don’t want to suggest integrating them with another converter that won’t produce the same high quality, consistent results that we insist upon for our own converters.  The search has been especially difficult for this particular post, as the range of supported features across the various converters, not to mention the price, varies so widely.  So, while we’ll focus on one converter, we’ll also mention some others with specific conversion options that might be more suitable for your particular needs.
 
Please note that we do not have any formal or informal relationships with the companies mentioned in this post. They are merely the result of a brief Google Search session.
 
PDF to Excel conversion is tricky since the process is a bit like trying to rebuild a sand castle after a wave has hit it: all the sand (data) is still there, but the shape (cell formatting) is gone.  Other formats are easier, so the approach we’ll take in this example is to use a converter that creates the cells automatically based on its best guess and that can also handle multiple output formats.  This isn’t perfect, but it simplifies the conversion tremendously while still providing a good example of how the conversion works.  Total PDF Converter X from CoolUtils is a lightweight conversion application with a simple command line interface that provides good quality conversion.  It’s available as a free trial and at the time of writing costs $159.90 to purchase.  It supports conversion from PDF to a wide variety of formats: DOC, RTF, XLSX, HTML, EPS, PS, TXT, CSV and images (BMP, JPEG, GIF, WMF, EMF, PNG, TIFF).

As always, we start with the assumption that the Converter for SharePoint or Converter Services is properly installed.  Once that is done, carry out the following steps:

  1. Download and install Total PDF Converter X.
  2. Modify the ‘Muhimbi.DocumentConverter.Service.exe.config’ file as described here and add the following entry to the <MuhimbiDocumentConverters> section.  This tells the Converter that PDFs can be converted to XLSX. If you installed the 3rd party software in a different path then please update the content of the parameter attribute.

<add key="PDF_XLSXConverter"
     description="PDF to XLSX Converter"
     fidelity="Full"
     supportedExtensions="pdf"
     supportedOutputFormats="xlsx"
     type="Muhimbi.DocumentConverter.WebService.CommandLineConverter,
           Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, Culture=neutral,
           PublicKeyToken=c9db4759c9eaad12"
     parameter="C:\Program Files (x86)\Total PDF ConverterX\pdfconverterx.exe | {0} {1} -c xls"/>

 

Now, if you want to be able to convert PDFs to DOC format, the command is just slightly modified as seen below:

<add key="PDF_DOCConverter"
     description="PDF to DOC Converter"
     fidelity="Full"
     supportedExtensions="pdf"
     supportedOutputFormats="doc"
     type="Muhimbi.DocumentConverter.WebService.CommandLineConverter,
           Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, Culture=neutral,
           PublicKeyToken=c9db4759c9eaad12"
     parameter="C:\Program Files (x86)\Total PDF ConverterX\pdfconverterx.exe | {0} {1} -c doc"/>

 

The same changes can be made to allow conversion to any of the supported formats.  And, before you ask: yes, you can add multiple entries to the config file to allow conversion from PDF to multiple types.
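
For instance, an entry for PDF to HTML conversion might look like the sketch below.  It follows the same pattern as the XLSX and DOC entries; the key name ‘PDF_HTMLConverter’ is just an illustrative choice, and the ‘-c html’ value is an assumption based on how the other two formats are passed, so please verify it against the Total PDF Converter X documentation before relying on it.

```xml
<!-- Illustrative only: the key name and the "-c html" value are assumptions
     modelled on the XLSX and DOC entries; check the converter's documentation. -->
<add key="PDF_HTMLConverter"
     description="PDF to HTML Converter"
     fidelity="Full"
     supportedExtensions="pdf"
     supportedOutputFormats="html"
     type="Muhimbi.DocumentConverter.WebService.CommandLineConverter,
           Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, Culture=neutral,
           PublicKeyToken=c9db4759c9eaad12"
     parameter="C:\Program Files (x86)\Total PDF ConverterX\pdfconverterx.exe | {0} {1} -c html"/>
```

As with the other entries, update the path in the parameter attribute if the 3rd party software is installed in a different location.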

So, what do the results look like?  Here we have an image of a typical PDF based quote:

PDF_quote

Then here, you have the conversion to an Excel file:

XLSX_quote

 

So, as we’ve seen, this is a good option that provides a great selection of conversion choices.  Depending on your needs, though, it might not provide the conversion flexibility required for something like Excel, or, if you just need PDF to Word conversion, it might not be worth the expense.

The addition of this converter allows SharePoint to take advantage of this conversion ability through workflows simply by changing the output format of the workflow to the desired type in the ‘Convert Document’ workflow action for SharePoint Designer and Nintex workflows.  In order to allow SharePoint to process PDF files, which by default are not routed to the Converter, please run the following command from a SharePoint Command or PowerShell prompt:

     stsadm.exe -o setproperty -pn Muhimbi.SharePoint.DocumentConverter.PDF.SkipPDFFiles -pv false

 

For PDF to Excel conversion there are a few other options:

  • Intelligent Converters has the PDF-To-Excel converter for $29.  It provides good conversion from PDF to Excel if you are primarily interested in extracting the data from the PDF and saving it in Excel format.
  • A-PDF.COM has the A-PDF to Excel converter for $39.  It provides about the same level of conversion as PDF-To-Excel, but also uses a template file for conversion.  This could be very helpful if you have a large number of consistent PDFs that need to be converted and want to specify cells and formatting for the conversions.

For PDF to Word conversion there is a decent free option from Weeny Software, Free PDF to Word Converter.  The conversion has some problems with complex formatting (although it creates quite good RTF documents), but if you are more concerned about getting the data into Word format than the way it looks, it is a good option.

 

Any questions or feedback? Leave a comment in the section below or contact us; we love talking to our customers.
