
A hop, skip, and a Tiff - Converting all supported document types to TIFF

Posted at: 7:29 AM on 15 July 2014 by David Radford

Both the Muhimbi PDF Converter for SharePoint and Muhimbi PDF Converter Services have long been able to cross-convert one file format to another out of the box, not just to PDF. This works very well, but what happens when you really want to mix it up, say by converting a grid based text format like Excel to a multi-page image format like TIFF? As we told the customer who requested this: “I’m not sure, but I know we can figure it out for you!”

And so here it is!  The solution is actually fairly simple and relies on two features that already exist within Muhimbi’s Converter products.  The first is the multi-step converter we built to allow cross-conversions and the second is our ability to integrate 3rd party converters within Muhimbi’s own conversion process.

The multi-step converter allows for an intermediate format to be used when there is no converter that can directly convert between the file types requested.  This stepping stone approach happens inside the Converter and is completely invisible to the user.  There is only one step to creating the multi-step TIFF converter:

  1. Modify the ‘Muhimbi.DocumentConverter.Service.exe.config’ file as described here and add the following entry to the <MuhimbiDocumentConverters> section.  This tells the Converter that all the listed extensions can be converted to TIFF by first converting them to PDF and then from there to TIFF.

 

<add key="CrossConverter_TIFF"
    description="Convert all formats to TIFF"
    fidelity="Full"
    supportedExtensions="xml,infopathxml,doc,docx,docm,rtf,txt,wps,odt,ott,xls,xlsx,xlsm,xlsb,
    csv,dif,ods,ots,html,htm,mht,gif,png,jpg,jpeg,bmp,ppt,pptx,pptm,xml,odp,otp,pps,
    ppsx,ppsm,vsd,vdx,svg,svgz,vdw,dxf,dwg,msg,eml"
    supportedOutputFormats="tiff"
    type="Muhimbi.DocumentConverter.WebService.MultiStepConverterFullFidelity,
    Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, Culture=neutral, 
    PublicKeyToken=c9db4759c9eaad12">
  <steps>
    <step from="" to="pdf"/>
    <step from="pdf" to=""/>
  </steps>
</add>      
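To make the stepping-stone idea concrete, here is a small Python sketch of how a two-step chain like the one above could be resolved. It is not Muhimbi's implementation: the `resolve_steps` function is hypothetical, and it assumes that an empty `from` or `to` value in a `<step>` stands for the originally requested source or target format.

```python
def resolve_steps(steps, source_ext, target_ext):
    """Expand (from, to) step templates into concrete conversions.

    Assumption: an empty string is a placeholder for the format the
    caller asked for (the source on 'from', the target on 'to').
    """
    resolved = []
    for step_from, step_to in steps:
        resolved.append((step_from or source_ext, step_to or target_ext))
    # Each step must start where the previous one ended.
    for (_, prev_to), (next_from, _) in zip(resolved, resolved[1:]):
        if prev_to != next_from:
            raise ValueError(f"broken chain: {prev_to} -> {next_from}")
    return resolved


# Mirrors the <steps> element of the CrossConverter_TIFF entry.
TIFF_STEPS = [("", "pdf"), ("pdf", "")]

if __name__ == "__main__":
    for src, dst in resolve_steps(TIFF_STEPS, "xlsx", "tiff"):
        print(f"{src} -> {dst}")
```

Under that assumption, a request to turn an Excel file into TIFF becomes two ordinary conversions, with PDF as the intermediate format.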

 

The 3rd party converter integration is what actually performs the conversion from PDF to TIFF and completes the job. This step is no different from adding any other 3rd party converter, and it has the added benefit of making PDF to TIFF conversion available in its own right.

  1. Download the latest Ghostscript GPL Release from the Ghostscript website (please ensure you download the Windows Version).
  2. Install Ghostscript in a location of your choice on every server that runs the Muhimbi Conversion Service. Please make note of the location of the installation so you can point the Converter to it in the XML fragment listed below.

 

The next step is to modify the ‘Muhimbi.DocumentConverter.Service.exe.config’ file again and add the following entry to the same <MuhimbiDocumentConverters> section. Please remove the line wrapping from the content of the parameter attribute; this example has been reformatted to make it fit in a browser window. Please also update the location of the Ghostscript executable to match your installation.

<add key="PDF_TIFFConverter"
     description="PDF to TIFF Converter"
     fidelity="Full"
     supportedExtensions="pdf"
     supportedOutputFormats="tiff"
     type="Muhimbi.DocumentConverter.WebService.CommandLineConverter,
           Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, Culture=neutral,
           PublicKeyToken=c9db4759c9eaad12"
     parameter="C:\Program Files\gs\gs9.14\bin\gswin64c.exe |-dBATCH -dNOPAUSE -sDEVICE=tiff24nc 
-sCompression=lzw -r300x300 -sOutputFile={1} {0}
"/>

 
More details on the Muhimbi parameters can be found here, though you shouldn’t need to change them.   You can also review the various Ghostscript options here, especially the resolution that is used (-r300x300) as you may wish to change this in order to suit your specific needs.
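Before restarting the service it can be useful to run the same Ghostscript command by hand. The Python sketch below builds an argument list from a parameter string like the one above. The `build_command` helper is hypothetical, and it assumes (based only on the example in this post) that the text before `|` is the executable, that `{0}` is the input file and that `{1}` is the output file. The optional `dpi` argument swaps the `-r300x300` resolution for another value.

```python
PARAMETER = (r"C:\Program Files\gs\gs9.14\bin\gswin64c.exe |"
             "-dBATCH -dNOPAUSE -sDEVICE=tiff24nc "
             "-sCompression=lzw -r300x300 -sOutputFile={1} {0}")


def build_command(parameter, input_path, output_path, dpi=None):
    """Turn a 'executable | arguments' string into an argv list."""
    executable, _, arguments = parameter.partition("|")
    argv = [executable.strip()]
    for token in arguments.split():
        # Optionally override the resolution switch (e.g. -r300x300).
        if dpi is not None and token.startswith("-r"):
            token = f"-r{dpi}x{dpi}"
        # Assumed placeholders: {0} = input file, {1} = output file.
        argv.append(token.format(input_path, output_path))
    return argv


if __name__ == "__main__":
    print(build_command(PARAMETER, "in.pdf", "out.tiff", dpi=150))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on the conversion server to confirm that Ghostscript produces the expected TIFF.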

NOTE: For this to work you must be on at least version 7.2.1 of the Muhimbi PDF Converter.

There you go: your Muhimbi Converter can now convert all supported formats to multi-page TIFF as well! Want to convert an InfoPath or MSG file, including all attachments, to TIFF and apply watermarks in the process? Now you can.


PDF Converter Services 7.2.1 – Harder, Better, Faster, Stronger

Posted at: 6:19 PM on 08 July 2014 by Muhimbi


When releasing several new versions of a product each year - like we do with the Muhimbi PDF Converter Services - it's easy to overlook the overall stability and performance of the product in pursuit of new features. Sure, adding new features is fun, but adding stability and performance improvements is just as important, if not more so.

To make sure that everything will continue to work smoothly and reliably, we have dedicated an entire release to just this topic. We have fixed a number of important issues and improved performance, but all work and no play isn't the way to go either. So, we STILL found some time to sneak a few new features in, particularly the ability to convert files attached to PDFs and to translate email and calendar labels into the language of your choice.

We haven't just been looking through our code; we've also been talking with our customers about how they're using the Converter. This has led to a number of new blog posts suggesting new ways to use features already present in the Converter. For example, you can easily add new conversion types as described in these recent blog posts about adding PCL and XPS to PDF Conversion using free third party software.

A quick introduction for those not familiar with the product: The Muhimbi PDF Converter Services is an ‘on premises’ server based SDK that allows software developers to convert typical Office files to PDF format using a robust, scalable but friendly Web Services interface from Java, .NET, Ruby & PHP based solutions. It supports a large number of file types including MS-Office and ODF file formats as well as HTML, MSG (email), EML, AutoCAD and Image based files and is used by some of the largest organisations in the world for mission critical document conversions. In addition to converting documents the product ships with a sophisticated watermarking engine, PDF Splitting and Merging facilities, an OCR facility and the ability to secure PDF files. A separate SharePoint specific version is available as well.
 

Converted Calendar Entry in English and German


In addition to the changes listed above, some of the main changes and additions in the new version are as follows:

2064 CAD Fix CAD Converter - Object reference not set to an instance of an object.
2205 CAD Fix CAD Converter - NullRefException converting CAD file.
2206 CAD Fix CAD Converter - ArgumentOutOfRangeException while converting CAD file.
2115 CAD Fix CAD Converter - Hatch Lines fill not correct.
2217 CAD Improvement CAD Converter - Increase performance, reduce file size and improve compatibility
2158 Excel Fix Excel fails to load certain documents in Excel 2013 with the following error: "Unable to get the Open property of the Workbooks class".
2159 Excel Improvement Excel files with external links open very slowly.
2145 HTML Fix Occasionally in-line images go missing when converting HTML/MSG to PDF.
2012 Merging Fix Internal hyperlinks are broken when merging certain documents.
2167 Merging Fix Merge operations cannot be executed as PDF/A due to problem with security settings.
2184 MSG Fix MSG to PDF - Attachment name not recognised when MSG is exported using ANSI.
2208 MSG Fix MSG to PDF - Converting email returns empty PDF.
1898 MSG Improvement MSG to PDF - Allow email labels to be translated.
2154 OCR Fix OCR Speeds 'fast' and 'rapid' stopped working.
2156 OCR Fix OCR - Occasional error under load.
2230 Other Improvement Add support for specifying additional output formats such as 'TIFF, PNG, GIF, JPG, PS, BMP, PCL' (This does not include native support for converting to these formats, which requires third party plug ins)
2109 PDF New Add support for converting and merging files attached to PDFs.
2122 TIFF Fix Converting certain TIFF files to PDF shows empty in Acrobat Reader.
2103 Watermarking Fix Watermarking certain documents causes problem in Adobe Reader 9.
2186 Watermarking Fix Watermarking Crystal Report files causes exception.
2161 Word Fix Corrupt MS-Word file is never removed from Temp folder.

 



As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.

Download your free trial here (39MB).

PDF Converter for SharePoint 7.2.1 – Harder, Better, Faster, Stronger

Posted at: 5:46 PM by Muhimbi


When releasing several new versions of a product each year - like we do with the Muhimbi PDF Converter for SharePoint - it's easy to overlook the overall stability and performance of the product in pursuit of new features. Sure, adding new features is fun, but adding stability and performance improvements is just as important, if not more so.

To make sure that everything will continue to work smoothly and reliably, we have dedicated an entire release to just this topic. We have fixed a number of important issues and improved performance, but all work and no play isn't the way to go either. So, we STILL found some time to sneak a few new features in, particularly the ability to convert files attached to PDFs and to translate email and calendar labels into the language of your choice.

We haven't just been looking through our code; we've also been talking with our customers about how they're using the Converter. This has led to a number of new blog posts suggesting new ways to use features already present in the Converter. For example, you can easily add new conversion types as described in these recent blog posts about adding PCL and XPS to PDF Conversion using free third party software.

For those not familiar with the product, the PDF Converter for SharePoint is a lightweight solution that allows end-users to merge, split, watermark, secure, OCR and convert common document types - including InfoPath, AutoCAD, MSG (email), MS-Office, HTML and images - to PDF as well as other formats from within SharePoint using a friendly user interface, workflows or a web service call without the need to install any client side software or Adobe Acrobat. It integrates at a deep level with SharePoint and leverages facilities such as the Audit log, Nintex Workflow, localisation, security and tracing. It runs on SharePoint 2007, 2010 & 2013 and is available in English, German, Dutch, French, Traditional Chinese and Japanese. For detailed information check out the product page.
 

Converted Calendar Entry in English and German


In addition to the changes listed above, some of the main changes and additions in the new version are as follows:

2064 CAD Fix CAD Converter - Object reference not set to an instance of an object.
2205 CAD Fix CAD Converter - NullRefException converting CAD file.
2206 CAD Fix CAD Converter - ArgumentOutOfRangeException while converting CAD file.
2115 CAD Fix CAD Converter - Hatch Lines fill not correct.
2217 CAD Improvement CAD Converter - Increase performance, reduce file size and improve compatibility
2158 Excel Fix Excel fails to load certain documents in Excel 2013 with the following error: "Unable to get the Open property of the Workbooks class".
2159 Excel Improvement Excel files with external links open very slowly.
2145 HTML Fix Occasionally in-line images go missing when converting HTML/MSG to PDF.
2012 Merging Fix Internal hyperlinks are broken when merging certain documents.
2214 Merging Fix Converting List Item Attachments using Merge Facility.
2167 Merging Fix Merge operations cannot be executed as PDF/A due to problem with security settings.
2184 MSG Fix MSG to PDF - Attachment name not recognised when MSG is exported using ANSI.
2208 MSG Fix MSG to PDF - Converting email returns empty PDF.
1898 MSG Improvement MSG to PDF - Allow email labels to be translated.
2154 OCR Fix OCR Speeds 'fast' and 'rapid' stopped working.
2156 OCR Fix OCR - Occasional error under load.
2230 Other Improvement Add support for specifying additional output formats such as 'TIFF, PNG, GIF, JPG, PS, BMP, PCL' (This does not include native support for converting to these formats, which requires third party plug ins)
2109 PDF New Add support for converting and merging files attached to PDFs.
1915 SharePoint Fix Duplicate sitemap node exception on SP2013 after redeployment.
2122 TIFF Fix Converting certain TIFF files to PDF shows empty in Acrobat Reader.
2103 Watermarking Fix Watermarking certain documents causes problem in Adobe Reader 9.
2186 Watermarking Fix Watermarking Crystal Report files causes exception.
2161 Word Fix Corrupt MS-Word file is never removed from Temp folder.




As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.

Download your free trial here (46MB).

XPS to PDF Conversion using Muhimbi’s range of Server Based PDF Converters

Posted at: 3:55 PM on 20 May 2014 by David Radford


A quick search online for products that convert one format to another results in a sometimes overwhelming list, with some utilities boasting of being able to convert hundreds of different formats. Needless to say, most of these products produce less than high fidelity conversions. How can they, when they deal with so many different formats? Our focus at Muhimbi has always been high fidelity conversions that provide professional quality results, along with a supporting environment that makes implementing conversions a truly enterprise level prospect. This means we limit the number of formats we convert natively to important ones we can perfect and update when new versions are released, so even some fairly well known file types just don’t make the cut.

Luckily, Muhimbi’s range of PDF Conversion products has had the ability to use 3rd party converters for a long time. This ability fills the gaps, whether for an esoteric format like HPGL or a more prosaic one such as XPS. Ah, XPS, that wonderful XML based format that was going to wrest the market away from PDF. Unfortunately for XPS, the most common thought when someone sees its extension is “How can I convert this into something useful, like a PDF?” Well, that’s where we’ve got you covered! Using the Muhimbi Converter and GhostXPS, you can convert XPS documents just like any other.

The first thing to do (after installing the Muhimbi PDF Converter, of course) is to download and install GhostXPS on your conversion server(s).

  1. Download the latest GhostXPS GPL Release from the Ghostscript website (please ensure you download the Windows Version).
  2. Install GhostXPS in a location of your choice on every server that runs the Muhimbi Conversion Service. Please make note of the location of the installation so you can point the Converter to it.

 

The next step is to modify the ‘Muhimbi.DocumentConverter.Service.exe.config’ file as described here and add the following entry to the <MuhimbiDocumentConverters> section.

<add key="XPSConverter"
     description="XPS to PDF Converter"
     fidelity="Full"
     supportedExtensions="xps"
     supportedOutputFormats="pdf"
     type="Muhimbi.DocumentConverter.WebService.CommandLineConverter,
           Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, Culture=neutral,
           PublicKeyToken=c9db4759c9eaad12"
     parameter="C:\gs\gxps-9.14-win32.exe | -sDEVICE=pdfwrite -sOutputFile={1} -dNOPAUSE {0} "/>

 
More details on the parameters can be found here, though you shouldn’t need to change them.

That is all there is to it. Once everything has been configured, XPS files will be picked up automatically and treated exactly the same as all other file formats supported by the Muhimbi PDF Converter.
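Because wrapped attribute values are easy to paste incorrectly, one option is to generate the entry instead of typing it. The Python sketch below uses the standard library's `xml.etree.ElementTree` to produce a single-line `<add>` element from the values shown in this post. The `command_line_converter` helper is hypothetical and not part of any Muhimbi tooling; the attribute names and the type string are copied from the example above.

```python
import xml.etree.ElementTree as ET

# Assembly-qualified type name, copied from the example entry.
CONVERTER_TYPE = (
    "Muhimbi.DocumentConverter.WebService.CommandLineConverter, "
    "Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, "
    "Culture=neutral, PublicKeyToken=c9db4759c9eaad12"
)


def command_line_converter(key, description, extension, output, exe, args):
    """Return a single-line <add .../> element as a string."""
    element = ET.Element("add", {
        "key": key,
        "description": description,
        "fidelity": "Full",
        "supportedExtensions": extension,
        "supportedOutputFormats": output,
        "type": CONVERTER_TYPE,
        # Executable and arguments are separated by '|', as in the post.
        "parameter": f"{exe} | {args}",
    })
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    print(command_line_converter(
        "XPSConverter", "XPS to PDF Converter", "xps", "pdf",
        r"C:\gs\gxps-9.14-win32.exe",
        "-sDEVICE=pdfwrite -sOutputFile={1} -dNOPAUSE {0}"))
```

The printed element can then be pasted into the <MuhimbiDocumentConverters> section; adjust the executable path to match your own installation.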


PCL to PDF Conversion using Muhimbi’s range of Server Based PDF Converters

Posted at: 7:10 AM by David Radford

In a previous post about Muhimbi’s ability to integrate 3rd party converters into its conversion process, we looked at how to use GhostXPS to convert XPS files to PDF. Here we’ll take a look at another important but ‘undercover’ format that Muhimbi’s range of PDF Conversion products can convert with some help from GhostPCL.

Printer Command Language (PCL) is not a commonly recognized format, but people use it every day without ever knowing it. The vast majority of print jobs are sent to today’s printers via PCL as it is a compact and efficient language for this kind of data transmission. It may not be the ‘best’ in terms of absolute quality, but for anything other than large scale professional printing, it is the de facto standard. This means that most printers cannot accept raw PostScript jobs anymore, so PCL files are generated for automated print jobs and by applications. The problem arises when you don’t want to print the file: how many people can double click on a .pcl file and have it open in a friendly viewer? Once again, this is where Muhimbi’s Converter and GhostPCL come to the rescue and seamlessly convert PCL files to PDF for any user to easily access.

The first thing to do (after installing the Muhimbi PDF Converter, of course) is to download and install GhostPCL on your conversion server(s).

  1. Download the latest GhostPCL GPL Release from the Ghostscript website (please ensure you download the Windows Version).
  2. Install GhostPCL in a location of your choice on every server that runs the Muhimbi Conversion Service. Please make note of the location of the installation so you can point the Converter to it.

 

The next step is to modify the ‘Muhimbi.DocumentConverter.Service.exe.config’ file as described here and add the following entry to the <MuhimbiDocumentConverters> section.

<add key="PCLConverter"
     description="PCL to PDF Converter"
     fidelity="Full"
     supportedExtensions="pcl"
     supportedOutputFormats="pdf"
     type="Muhimbi.DocumentConverter.WebService.CommandLineConverter,
           Muhimbi.DocumentConverter.WebService, Version=1.0.1.1, Culture=neutral,
           PublicKeyToken=c9db4759c9eaad12"
     parameter="C:\gs\pcl6-9.14-win32.exe | -sDEVICE=pdfwrite -sOutputFile={1} -dNOPAUSE {0}  "/>

 
More details on the parameters can be found here (though there should be no need to change them).

That is all there is to it. Once everything has been configured, PCL files will be picked up automatically and treated exactly the same as all other file formats supported by the Muhimbi PDF Converter.
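A quick sanity check of the parameter value can catch copy and paste mistakes before the service is restarted. The Python sketch below is a hypothetical lint, not a Muhimbi feature. It assumes, based only on the examples in these posts, that a command line converter's parameter should sit on one line, separate the executable from its arguments with `|`, and contain both the `{0}` and `{1}` placeholders.

```python
def check_parameter(parameter):
    """Return a list of human-readable problems; empty means it looks fine."""
    problems = []
    if "\n" in parameter or "\r" in parameter:
        problems.append("contains a line break")
    executable, separator, arguments = parameter.partition("|")
    if not separator:
        problems.append("missing '|' between executable and arguments")
    if not executable.strip():
        problems.append("no executable before '|'")
    # Both placeholders appear in every example entry in these posts.
    for placeholder in ("{0}", "{1}"):
        if placeholder not in arguments:
            problems.append(f"missing {placeholder} placeholder")
    return problems


if __name__ == "__main__":
    value = r"C:\gs\pcl6-9.14-win32.exe | -sDEVICE=pdfwrite -sOutputFile={1} -dNOPAUSE {0}"
    print(check_parameter(value) or "parameter looks fine")
```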


Programmatically Converting and Merging files attached to PDF Documents

Posted at: 1:54 PM on 17 April 2014 by Muhimbi

One of the cool things about having a comprehensive PDF Conversion and processing platform such as the Muhimbi PDF Converter Services is that you can add relatively complex facilities you didn’t originally envision with relative ease.

When we originally set out we never thought of converting PDF files to PDF format. Why would anyone do that? Well, it turns out there are several good reasons, including converting PDF to PDF/A (or other PDF versions), changing PDF Viewer preferences, or embedding or stripping fonts from a PDF. Starting with version 7.2.1 we are adding another scenario to the mix: the ability to convert files attached to PDF Documents.

Similar to emails, a PDF document can have other files attached. Previously we simply ignored these files, but now we actively inspect PDF attachments and offer the option to convert and merge them into the main PDF, which is ideal for archiving or printing purposes.

This new facility is accessible from our Web Services interface, see below, as well as SharePoint Designer and Nintex Workflows using our XML Override syntax. Conversion of PDF Attachments can globally be controlled using the Conversion Service’s configuration file by modifying the PDF.ConvertAttachments and PDF.ConvertAttachmentMode keys.
 

The syntax is simple. Create a new instance of ConverterSpecificSettings_PDF, set its properties to the appropriate values and assign it to ConversionSettings.ConverterSpecificSettings before kicking off the conversion operation. A brief code example, that can easily be plugged into our standard sample code, can be found below.

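// Create the PDF-specific settings and enable attachment conversion.
ConverterSpecificSettings_PDF csc = new ConverterSpecificSettings_PDF();
csc.ConvertAttachments = true;
// Only remove attachments the converter supports; leave the rest in place.
csc.ConvertAttachmentMode = PDFConvertAttachmentMode.RemoveSupported;
// Attach the settings before starting the conversion.
conversionSettings.ConverterSpecificSettings = csc;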

 
The syntax for Java, Ruby and PHP is similar, but the code needs to be adapted to syntax specific to those environments.

 

The possible values for ConvertAttachmentMode are as follows:

  • RemoveAll: When a PDF file is processed, all attachments will be converted and merged to the main PDF. All attachments will be removed from the PDF, including those whose file type is not recognised by the converter.
  • RemoveSupported: When a PDF file is processed, all attachments will be converted and merged to the main PDF, but only those attachments that are supported by the converter are removed from the PDF. All other attachments remain present in the main file.

Naturally these values are only used when ConvertAttachments is set to True.
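The difference between the two modes is easiest to see on a small example. The Python sketch below is only a model of the behaviour described above, not the converter's code: `AttachmentMode` and `process_attachments` are hypothetical names, and the sketch assumes that only attachments with a supported extension can actually be converted and merged.

```python
from enum import Enum


class AttachmentMode(Enum):
    REMOVE_ALL = "RemoveAll"
    REMOVE_SUPPORTED = "RemoveSupported"


def process_attachments(attachments, supported_exts, mode):
    """Return (merged, remaining) attachment names for one PDF."""
    def ext(name):
        return name.rsplit(".", 1)[-1].lower()

    # Assumption: only supported file types can be converted and merged.
    merged = [a for a in attachments if ext(a) in supported_exts]
    if mode is AttachmentMode.REMOVE_ALL:
        # Every attachment is stripped, recognised or not.
        remaining = []
    else:
        # Unsupported attachments stay attached to the main PDF.
        remaining = [a for a in attachments if ext(a) not in supported_exts]
    return merged, remaining


if __name__ == "__main__":
    files = ["notes.docx", "data.xyz"]
    for mode in AttachmentMode:
        print(mode.value, process_attachments(files, {"docx"}, mode))
```

In this model RemoveAll leaves the output PDF without any attachments, while RemoveSupported keeps `data.xyz` attached because the converter cannot process it.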

 

As this behaviour is part of the PDF Conversion Service’s processing pipeline, this new facility can be used in combination with all Merging, Watermarking, OCR, PDF Encryption and PDF/A post processing facilities.

 

Any questions or feedback? Leave a comment in the section below or contact us, we love talking to our customers.

 


PDF Converter Services 7.2 - Extract text using OCR, MSG Improvements

Posted at: 5:55 PM on 09 April 2014 by Muhimbi


We are happy to announce version 7.2 of the popular Muhimbi PDF Converter Services. This new release further extends the OCR facility and MSG improvements introduced in the previous version and adds support for extracting text from bitmap based content and rendering of MSG based calendar entries.

A quick introduction for those not familiar with the product: The Muhimbi PDF Converter Services is an ‘on premises’ server based SDK that allows software developers to convert typical Office files to PDF format using a robust, scalable but friendly Web Services interface from Java, .NET, Ruby & PHP based solutions. It supports a large number of file types including MS-Office and ODF file formats as well as HTML, MSG (email), EML, AutoCAD and Image based files and is used by some of the largest organisations in the world for mission critical document conversions. In addition to converting documents the product ships with a sophisticated watermarking engine, PDF Splitting and Merging facilities, an OCR facility and the ability to secure PDF files. A separate SharePoint specific version is available as well.
 

  Example of a converted Calendar entry with an (OLE) embedded Excel sheet


In addition to the changes listed above, some of the main changes and additions in the new version are as follows:

2100 Excel New Optionally scale Excel to page width & height
2059 HTML Fix System.ArgumentException: uri - string can not be empty
1996 HTML Improvement Reduce white space causing occasional extra empty PDF pages at end of file.
1802 Merging Fix Bookmark targets bottom of page
2093 Merging Fix "Unexpected token Unknown before 107448" while merging file
2078 Merging Fix Kernel Error while loading PDF
2073 Merging Fix System.IndexOutOfRangeException while merging
2074 Merging Fix System.NullReferenceException while merging
2075 Merging Fix System.NullReferenceException while merging
2076 Merging Fix Some HTML Converted files cannot be saved in Acrobat Pro after merging
2126 MSG Fix "System.InvalidOperationException: Stack empty" during conversion of 3rd party generated MSG files
2133 MSG Fix "Parameter is not valid" during conversion of 3rd party generated MSG files
2136 MSG Fix Content missing from converted MSG file
2106 MSG Fix Fixed MSG body for 3rd party generated MSG files
2116 MSG Fix Conversion of MSG files with an attached MSG that is signed
2124 MSG Fix "System.IndexOutOfRangeException" Converting German email
2125 MSG Fix Conversion of email never finishes
2105 MSG Fix "Invalid Compressed RTF header" during conversion of 3rd party generated emails
2090 MSG Fix Extra '}' in body text
2058 MSG Fix No bookmark generated for certain attachments
2056 MSG Fix 'Sent date' not correct on some 3rd party generated emails
2057 MSG Fix Unicode converter issue (also with EML)
2088 MSG Improvement Add support for attendees to meeting invitations
2086 MSG Improvement Optionally throw error if embedded content is encountered that cannot be converted
2013 MSG Improvement From address shows LDAP path
2046 MSG Improvement Web Service support for MSGConverterFullFidelity.EmailAddressDisplayMode and FromEmailAddressDisplayMode
2087 MSG New Convert the visual representation of embedded objects
2068 MSG New Add support for the conversion of Calendar Entries
2050 MSG New Add config value to allow MSG attachments list to be displayed, even when attachments are disabled
2113 MSG/HTML Fix Rendering error in very long emails / HTML pages
2066 MSG/HTML Fix Sometimes content is truncated on systems running IE9, IE10 or IE11
2005 MSG/HTML Fix Fonts look weird in some emails
1786 OCR Fix Handle leak during OCR
2054 OCR Fix Some Mixed content (MS-Word files with scanned images) does not always OCR
1999 OCR Fix Arabic training data causes exception
1788 OCR Improvement Increase OCR Performance
2089 OCR Improvement Update Diagnostics tool to display OCRed text
2081 OCR Improvement In-line images are recognised but text is not placed on it correctly
1998 OCR Improvement Add support for Hebrew
2048 OCR New Support for extracting text from bitmap based content using OCR
2072 Other New Allow timeouts to be specified on web service call
2102 Watermarking Fix Chinese & Japanese fonts are not displayed in watermarks
2103 Watermarking Fix Watermarking some documents causes problem in Adobe Reader 9

 


As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.

Download your free trial here (39MB).

PDF Converter for SharePoint 7.2 - OCR Workflow Activities, MSG Improvements

Posted at: 5:24 PM by Muhimbi


The new features introduced with version 7.1 of the PDF Converter for SharePoint have proven to be popular with our customers. Today we are happy to announce version 7.2, which takes the existing features and elevates them to the next level while staying compatible with all SharePoint versions including SharePoint 2007, 2010 and 2013.

In addition to a number of bug fixes, the main new features are OCR Workflow Actions for SharePoint Designer and Nintex workflow, the ability to extract text from bitmap based content using OCR as well as further improvements to the MSG and EML based converters, specifically in the area of embedded (OLE) content and calendar entries.

 
For those not familiar with the product, the PDF Converter for SharePoint is a lightweight solution that allows end-users to merge, split, watermark, secure, OCR and convert common document types - including InfoPath, AutoCAD, MSG (email), MS-Office, HTML and images - to PDF as well as other formats from within SharePoint using a friendly user interface, workflows or a web service call without the need to install any client side software or Adobe Acrobat. It integrates at a deep level with SharePoint and leverages facilities such as the Audit log, Nintex Workflow, localisation, security and tracing. It runs on SharePoint 2007, 2010 & 2013 and is available in English, German, Dutch, French, Traditional Chinese and Japanese. For detailed information check out the product page.
 

 
Example of a converted Calendar entry with an (OLE) embedded Excel sheet


In addition to the changes listed above, some of the main changes and additions in the new version are as follows:

2100 Excel New Optionally scale Excel to page width & height
2059 HTML Fix System.ArgumentException: uri - string can not be empty
1996 HTML Improvement Reduce white space causing occasional extra empty PDF pages at end of file.
1802 Merging Fix Bookmark targets bottom of page
2093 Merging Fix "Unexpected token Unknown before 107448" while merging file
2078 Merging Fix Kernel Error while loading PDF
2073 Merging Fix System.IndexOutOfRangeException while merging
2074 Merging Fix System.NullReferenceException while merging
2075 Merging Fix System.NullReferenceException while merging
2076 Merging Fix Some HTML Converted files cannot be saved in Acrobat Pro after merging
2126 MSG Fix "System.InvalidOperationException: Stack empty" during conversion of 3rd party generated MSG files
2133 MSG Fix "Parameter is not valid" during conversion of 3rd party generated MSG files
2136 MSG Fix Content missing from converted MSG file
2106 MSG Fix Fixed MSG body for 3rd party generated MSG files
2116 MSG Fix Conversion of MSG files with an attached MSG that is signed
2124 MSG Fix "System.IndexOutOfRangeException" Converting German email
2125 MSG Fix Conversion of email never finishes
2105 MSG Fix "Invalid Compressed RTF header" during conversion of 3rd party generated emails
2090 MSG Fix Extra '}' in body text
2058 MSG Fix No bookmark generated for certain attachments
2056 MSG Fix 'Sent date' not correct on some 3rd party generated emails
2057 MSG Fix Unicode converter issue (also with EML)
2088 MSG Improvement Add support for attendees to meeting invitations
2086 MSG Improvement Optionally throw error if embedded content is encountered that cannot be converted
2013 MSG Improvement From address shows LDAP path
2046 MSG Improvement Web Service support for MSGConverterFullFidelity.EmailAddressDisplayMode and FromEmailAddressDisplayMode
2087 MSG New Convert the visual representation of embedded objects
2068 MSG New Add support for the conversion of Calendar Entries
2050 MSG New Add config value to allow MSG attachments list to be displayed, even when attachments are disabled
2113 MSG/HTML Fix Rendering error in very long emails / HTML pages
2066 MSG/HTML Fix Sometimes content is truncated on systems running IE9, IE10 or IE11
2005 MSG/HTML Fix Fonts look weird in some emails
1786 OCR Fix Handle leak during OCR
2054 OCR Fix Some Mixed content (MS-Word files with scanned images) does not always OCR
1999 OCR Fix Arabic training data causes exception
1788 OCR Improvement Increase OCR Performance
2089 OCR Improvement Update Diagnostics tool to display OCRed text
2081 OCR Improvement In-line images are recognised but text is not placed on it correctly
1998 OCR Improvement Add support for Hebrew
1975 OCR New SharePoint Designer OCR Workflow Activity for generating searchable PDFs
1975 OCR New SharePoint Designer OCR Workflow Activity for extracting text from bitmaps
1976 OCR New Nintex Workflow OCR Activity for generating searchable PDFs
1976 OCR New Nintex Workflow OCR Workflow Activity for extracting text from bitmaps
2048 OCR New Support for extracting text from bitmap based content using OCR
2072 Other New Allow timeouts to be specified on web service call
2102 Watermarking Fix Chinese & Japanese fonts are not displayed in watermarks
2103 Watermarking Fix Watermarking some documents causes problem in Adobe Reader 9
2049 Watermarking New Add support for USER_NAME in addition to the existing REMOTE_USER and LOGON_USER in watermarks


For more information check out the following resources:


As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.

Download your free trial here (46MB).

Get Outlook Mail in and out of SharePoint and Convert it to PDF - The Easy Way

Posted at: 3:53 PM on 03 April 2014 by David Radford

At Muhimbi we’re always looking to add great new features and functionality to our products. At the same time, we need to be careful not to start throwing features in just because they’re cool- a full-featured product is great, a schizophrenic one isn’t.

A good example of a feature that doesn’t belong in a conversion product is the transferring of documents in and out of the SharePoint environment. There are so many ways to implement this, that it’d really be its own completely separate product… And it is! Mail2Share from Techtra provides a clean interface to SharePoint from within Outlook, allowing easy SharePoint adoption and integration from the familiar Outlook workspace.

The reasons for storing e-mails as PDFs are not always obvious, but with corporate Document Management Strategies becoming more complex and file formats always changing, there are some clear advantages to this:

  • PDF, particularly PDF/A, is the ideal file format for long term archiving.
  • PDF files can be viewed with a high level of fidelity on mobile devices. For example, if you receive an AutoCAD file (dxf, dwg) and want to preview it or share it with users who do not have an AutoCAD preview handler, Mail2Share lets you send the attachment to the configured mount point and receive back a converted version in PDF format.
  • The Muhimbi PDF Converter does not require the installation of a PDF writer on the local machine.

So, how can our PDF Converter and Mail2Share help with all this? Well, as it turns out- quite easily!  Leveraging The PDF Converter’s outstanding e-mail conversion, workflow integration, and watermarking features with Mail2Share’s innovative Outlook connectivity turns complex scenarios like the following into a few simple steps.

 

The scenario:

You have a number of regional offices that need to send in their current sales forecasts- an Excel spreadsheet with the details and then a written summary of the reasoning behind them. When these e-mails arrive, they need to be redirected to various internal groups, but they’re also sensitive and so access needs to be tracked and restricted. How can this all be managed easily in a central manner? How can we do this while also having a single file to move around that contains both the e-mail AND the attachment?


The steps:

  1. Install and configure The Muhimbi PDF Converter for SharePoint following the installation instructions from Chapter 2 of the Administration Guide.
     
  2. Install the Techtra Mail2Share desktop application (there is no server side component to worry about). To configure Mail2Share, just choose your SharePoint server, select the site you want to add libraries from, and then add the libraries you want to see in Outlook (and have SharePoint rights to) and you’re good to go.  
     
    [Screenshot: selecting the SharePoint server in Mail2Share]
     
  3. Once that is done, you will have some additional folders in Outlook. To move an e-mail from Outlook to SharePoint, simply drag and drop the selected e-mail to the library you want from the list.

     

    [Screenshot: dragging an e-mail from Outlook to a SharePoint library]

  4. Once the e-mail arrives in the ‘Incoming Sales Projections’ library, it gets picked up by a simple SharePoint workflow that uses our conversion action and is set to run whenever a new file is created in that library. The workflow converts both the body of the e-mail (the reasoning) and the Excel attachment (the details) into a single, easy to manage PDF and then copies it to a different library.
      [Screenshot: the conversion workflow]

     

  5. The library the newly created PDF is sent to has our Watermark on Open feature enabled (you might also want to add our PDF Security on Open, or use it instead). This watermarks the PDF with the date, location, and username of the person opening the PDF. In this case we have added the following text to the bottom left of every PDF every time it is opened in SharePoint or through Mail2Share.
     

    [Screenshot: the Watermark on Open text]

  6. The library is available to specific users in Outlook, using Mail2Share, based on their SharePoint rights.  In this case, the user only has rights to the ‘Outgoing Sales Projections’ library. The user then browses it like any other folder, selects the PDF that combines the e-mail and the XLS attachment, previews it if required, and then simply right-clicks to send it as an attachment. Downloading it from SharePoint watermarks the PDF in the background, which is seamless to the user.
     

    [Screenshot: sending the PDF as an attachment]

  7. Ah, no- that’s it- you’re done!

 

You can now easily convert e-mails that need to be shared (with attachments!) into PDF and make them available in a central location. Instead of just restricting access, you can also track where, when, and by whom a specific copy of that PDF was created. All automatically, without users needing to navigate into SharePoint or do anything more than drag-and-drop!

This is just a small sample of how Muhimbi’s PDF Converter for SharePoint and Techtra’s Mail2Share applications can work together to facilitate sharing and collaboration for users, while also becoming valuable tools for corporate Document Management Strategies.

 

 


Extract text from scanned content using OCR and Nintex Workflow

Posted at: 2:53 PM on 31 March 2014 by Muhimbi

With the release of version 7.1 of the PDF Converter for SharePoint we added a fundamental new technology to our Document Conversion and Manipulation platform, Optical Character Recognition (OCR). That initial release was able to process scanned / bitmap based content and generate fully searchable PDFs.

With the introduction of version 7.2 we are adding support for a new OCR related use case, which is the ability to recognise text on (part of) a page and return the actual text (not a bitmap) to the workflow for further processing. A common use for this functionality is to extract a particular area of text from documents that all use a common template or layout. For example, if a reference number can always be found at the top right corner of scanned documents then that text can be extracted and stored in a SharePoint column from where it can be included in searches or be used in further workflow steps… pretty powerful stuff.

This post describes the Nintex Workflow Activity. The SharePoint Designer equivalent can be found here.

For more details, including an introduction, see these related blog posts.

 

Once the Muhimbi PDF Converter for SharePoint is installed, and the Nintex Workflow Integration has been activated, a number of new activities will be added automatically to the list, including the new Extract text using OCR activity. It is compatible with Nintex Workflow 2007, 2010 & 2013 and this is what it looks like.
 

[Screenshot: the Extract text using OCR workflow activity]


Building a full example workflow is out of the scope of this post as it is relatively easy. For details see our generic PDF Conversion for Nintex Workflow example.

The fields supported by this Workflow Activity are as follows:

  1. Language: The language the source document is written in. It defaults to English, but we currently (version 7.2) support Arabic, Danish, German, English, Dutch, Finnish, French, Hebrew, Hungarian, Italian, Norwegian, Portuguese, Spanish and Swedish.
  2. Performance: Specify the performance / accuracy of the OCR engine. It is recommended to leave this on the default Slow but accurate setting.
  3. Whitelist / Blacklist: Control which characters are recognised. For example, limit recognition to numbers by whitelisting 1234567890. This prevents, for example, a 0 (zero) from being recognised as the letter o or O.
  4. Pagination: In some specific cases a single image spans multiple pages. Enable pagination for those cases.
  5. Region: Specify the x, y, width and height of the region to retrieve text from. The unit of measure (UOM) is 1/72nd of an inch. When extracting text from non-PDF files, e.g. a TIFF or PNG, please take into account that internally the image is first converted to PDF. This may add margins around the image, but it guarantees that a single, unified UOM is used across all file formats. If you are not sure how internal conversion affects the dimensions of your image or scan, use our software to convert the file to PDF and open it in a PDF reader.
  6. Page Number: By default text is extracted from all pages and concatenated. To extract the text from a specific page specify the page number in this field.
  7. Output Text: The recognised text will be stored in this variable (type String).
  8. Source List ID & List Item: The item that triggered the workflow is processed by default. You can optionally specify the ID of a different List and List Item using workflow variables. Please use data type String for the List ID workflow variable. For the Item ID use type Item ID (in SharePoint 2007) or Integer (in SharePoint 2010 / 2013).
  9. Error Handling: Similar to the way some of Nintex’ own Workflow Activities allow errors to be captured and evaluated by subsequent actions, all of Muhimbi’s Workflow Activities allow the same. By default this facility is disabled meaning that any error terminates the workflow.
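
Because the Region field expects values in 1/72nd of an inch, measurements taken in millimetres or pixels need converting first. The following is a minimal sketch of that arithmetic in Python. It is not part of the Muhimbi product, and the names (Region, mm_to_points, pixels_to_points, region_from_mm) are made up for illustration. The pixel conversion assumes you know the DPI of the scan and ignores any margins that the internal conversion to PDF may add, so treat its result as a starting point only.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # the Region unit of measure is 1/72nd of an inch
MM_PER_INCH = 25.4


@dataclass(frozen=True)
class Region:
    """Hypothetical helper type: a rectangle expressed in 1/72nd of an inch."""
    x: float
    y: float
    width: float
    height: float


def mm_to_points(mm: float) -> float:
    """Convert millimetres to 1/72nd-of-an-inch units."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def pixels_to_points(pixels: float, dpi: float) -> float:
    """Convert pixels to 1/72nd-of-an-inch units for a scan of known DPI.

    Assumption: margins added during internal conversion are not accounted for.
    """
    return pixels / dpi * POINTS_PER_INCH


def region_from_mm(x: float, y: float, width: float, height: float) -> Region:
    """Build a Region from measurements in millimetres, rounded to 2 decimals."""
    return Region(*(round(mm_to_points(v), 2) for v in (x, y, width, height)))


if __name__ == "__main__":
    # Example: a 50 mm x 20 mm box measured 150 mm across and 10 mm down the page.
    print(region_from_mm(150, 10, 50, 20))
```

The four resulting numbers are what you would type into the x, y, width and height fields. If the extracted text looks offset, convert the file to PDF with the Converter, open it in a PDF reader, and measure the region there instead.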

 

For more details about using the PDF Converter for SharePoint in combination with Nintex Workflow see this Knowledge Base article.

Please note that the PDF Converter Professional add-on license is needed in order to use OCR in your production environment.

Any questions or comments? Leave a message below or contact us.
