Over the last few months we have been working with Microsoft’s Flow team to make Muhimbi’s popular PDF Conversion and manipulation facilities available to all Flow, LogicApps and PowerApps users. All this hard work has paid off as – starting today – our software is available in Microsoft’s standard list of services.
There is nothing to install or configure. Just create a Flow as normal, enable a Trigger to start when an event occurs (e.g. file created in DropBox, OneDrive, Box.com, SharePoint or any of the other supported services), and add an action. Then either search for ‘Muhimbi’ to display all our actions to convert, merge, watermark, secure or OCR, or type the name of a Muhimbi action directly, e.g. ‘Convert Document’.
Fill in the blanks - it is pretty self-explanatory but additional information is available in our Core Concepts article - and feed the generated document into a secondary Flow action, e.g. to email it or write it to a different service. That is it. For more details, see the steps and screenshots in this blog post.
One of the cooler aspects of the integration is that it works equally well in combination with other platforms, including:
Azure LogicApps: Available from the Azure portal, LogicApps represent the ‘grown up’ version of Flow. They provide more flexibility and control.
If you have been paying close attention to our recent posts, you may have noticed that we have fallen a little bit in love with Microsoft’s new Flow product. (See Attaching PDF files to emails and Moving files between SharePoint Online site collections). A simple but elegant Workflow Engine that works well in combination with SharePoint Online, but can also be used to integrate with non-SharePoint systems including OneDrive, DropBox, SalesForce, and now ……
Our previous posts have focused on how standard out-of-the-box Flow functionality can be used to post-process files generated by our SharePoint Online Workflow actions. However, we are not exaggerating when we say that we are extremely excited to announce that all Muhimbi’s workflow actions – as available for Nintex Workflow, K2, SharePoint Designer and Visual Studio – are now available for Microsoft Flow as well as part of the new Muhimbi PDF Converter Services Online product line. That is right, you can now Convert, Merge, Watermark, Secure and OCR files directly in Microsoft Flow in combination with any other Flow service provider.
Convert all files uploaded to OneDrive, watermark and secure them, and write the generated file to DropBox? No problem. Automatically archive all approved SharePoint Files in PDF/A format to Google Drive? Even easier. The sky is the limit.
So how does this work? We’ll go into a great level of detail during the next few weeks and months, but in summary it works as follows.
In Flow’s editor the Muhimbi actions show up in the same way as other built-in services such as SharePoint, DropBox, SalesForce etc. All Muhimbi workflow actions that our customers are already used to - including various conversion, merge, watermarking, security and OCR actions - are displayed in Flow’s list of actions.
Let’s take the Convert Document action as an example. All it takes is the name of the source file, which is typically available from the Flow Trigger that started the workflow, and the file’s content, which tends to be available from the same trigger as well. The Output format defaults to PDF, but – depending on the input format – you can select different output formats as well.
Although there are some other fields available under Show advanced options, this is basically all there is to it.
An example of a completed Flow can be found below. It is basic but powerful and can easily be extended to take files from different sources (SharePoint, Box.com, Google Drive), carry out multiple operations by feeding the converted document into other Muhimbi actions to watermark and secure the generated PDF, and then send the generated file not only by email but also into SharePoint, DropBox or any of the other gazillions of services that integrate with Flow.
The flow is triggered when a file is uploaded to a particular OneDrive folder.
The file name and file content provided by the OneDrive trigger are fed into the Convert Document action.
The converted file is attached to an email and sent out.
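To make the data flow between these three steps concrete, here is a minimal Python sketch. It is not Muhimbi's API and it performs no real conversion: `FlowFile`, `on_file_created`, `convert_document` and `build_email` are hypothetical names invented for illustration, and the "conversion" only renames the file so the hand-off between steps is visible.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class FlowFile:
    """A file as it travels between steps: a name plus its raw content."""
    name: str
    content: bytes


def on_file_created(name: str, content: bytes) -> FlowFile:
    # Step 1: stand-in for the trigger, which exposes the file name and content.
    return FlowFile(name, content)


def convert_document(source: FlowFile, output_format: str = "pdf") -> FlowFile:
    # Step 2: hypothetical placeholder for the Convert Document action.
    # No real conversion happens here; only the file extension is changed.
    new_name = PurePosixPath(source.name).with_suffix("." + output_format.lower()).name
    return FlowFile(new_name, source.content)


def build_email(recipient: str, attachment: FlowFile) -> dict:
    # Step 3: the outputs of the previous step become the attachment fields.
    return {
        "to": recipient,
        "subject": f"Converted file: {attachment.name}",
        "attachment_name": attachment.name,
        "attachment_content": attachment.content,
    }


uploaded = on_file_created("report.docx", b"example bytes")
email = build_email("someone@example.com", convert_document(uploaded))
print(email["attachment_name"])
```

The point of the sketch is simply that each step consumes the outputs of the one before it, which is the same pattern Flow exposes through its output parameters.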
Similar to other Flow actions, each Muhimbi action returns a number of output parameters that can be consumed by other actions in the same Flow. The following screenshot shows how the generated PDF file is attached to the email.
Similar to our SharePoint Online subscription service, a free 30 day trial is available for this new product as well. The first time any of our actions are used, Microsoft Flow requires a connection to be set up with your Muhimbi subscription. Just follow the basic instructions when going through that process and everything will be set up in minutes.
Please note that there is no need to purchase separate subscriptions for both the PDF Converter for SharePoint Online and PDF Converter Services Online (as used by Flow). The same subscription can be used from both SharePoint Online and Flow; just make sure that during registration you enter the Tenancy ID of your SharePoint Online environment (see this KB article for details) to link the two. Please keep in mind that operations carried out by either product are shared and come out of your existing monthly allotment.
We are happy to announce a new version of the Muhimbi PDF Converter Services. As the product is mature and stable, this is largely a maintenance release that solves a number of issues and introduces some refinements.
A quick introduction for those not familiar with the product: The Muhimbi PDF Converter Services is an ‘on premises’ server based SDK that allows software developers to convert typical Office files to PDF format using a robust, scalable but friendly Web Services interface from Java, .NET, Ruby & PHP based solutions. It supports a large number of file types including MS-Office and ODF file formats as well as HTML, MSG (email), EML, AutoCAD and Image based files and is used by some of the largest organisations in the world for mission critical document conversions. In addition to converting documents the product ships with a sophisticated watermarking engine, PDF Splitting and Merging facilities, an OCR facility and the ability to secure PDF files. A separate SharePoint specific version is available as well.
Some of the main changes and additions in the new version are as follows:
Incorrect text size and alignment during CAD to PDF conversion
Lines are too thick when converting DWG
AutoCAD x-ref search path does not search in sub folders
When converting HTML to PDF, large images that are loaded from an absolute path or URL are skipped
Some PDF iFilters do not pick up PDF Documents that have been converted from HTML to PDF
HTML Conversion problems on extremely large files
Error while merging certain files
Merge operations timeout after 30 minutes
OCR overlay is rotated for certain PDF files
PDF Syntax (validation) errors after carrying out OCR
OCR temp files not cleaned up in case of an error
Content of some OCR-ed PDFs not picked up by iFilters
PDF/A Color intent mismatch for certain source documents
Move DocumentConverter logs to 'log' sub folder
Add 'EPS' to OutputFormat.cs and related code (workflow actions etc)
Switching to new printer driver after installing with the old driver doesn't work
Improve installer on systems with wide range of .net framework versions
Fix automatic uninstall steps when switching back and forth in installer
Improve checking for local Admin rights on non-English operating systems
Last step of uninstallation hangs under certain circumstances
Duplicate Windows firewall rules created by installer
Installer doesn't write Exceptions to log until after error dialog is closed
Free Text & RTF watermarks do not show Arabic text consistently
For more information check out the following resources:
After determining that converting documents to PDF is the route that best addresses your organization’s needs, we then need to determine how to archive those PDFs. Simply storing a file is suboptimal in terms of efficacy; for this practice and resulting PDF to be useful, it will be critical to know where documents are stored. There are a lot of suggested best practices in this arena, and that can make determining the specifics of your process more convoluted than it needs to be.
In order to architect a more straightforward storage plan there are three essential points to address: ensuring metadata is retained in a converted form, making the conversion and storage of a document part of a workflow, and creating a cogent plan for the first two points in advance in order to avoid ad hoc policy decisions.
Why Metadata is Important
Metadata allows you to tag documents with information that can be accessed later, without involving a user. This not only relieves an end user of remembering which type of data needs to be stored with specific documents, it also yields two additional benefits:
First, the ability to search SharePoint storage by metadata allows very specific queries to be used when looking for documentation; the more specificity in the query, the more accurate and specific the results. This becomes more and more useful when hundreds of thousands of documents are stored and may need to be queried. For example, it would be easy to sort archived PDFs by InfoPath form title, author, and date range created (all of which are default metadata settings).
Second, the metadata within a document can be used by Muhimbi’s PDF Converter for SharePoint to create watermarks for that document, meaning that not only is the metadata there for search, but it is also attached in a viewable and “un-touchable” (encrypted) way. The inability to edit metadata can be required for some regulations and compliance rules.
Selecting and adding new metadata is a well-covered and documented area of SharePoint usage, so we won’t go into depth about adding new metadata requirements for a document in this article, but we’ve had the ability to maintain metadata during conversion baked into Muhimbi’s PDF converter for SharePoint since the very first version. Worth mentioning as well, it was later updated in version 6 to allow for a workflow step to copy metadata and set content type in a single operation. The PDF converter for SharePoint can also secure that document so that it can be viewed but NOT changed.
Simplify using Workflows
Now that a document is converted and metadata has been retained it still needs to be pushed to its final storage location. Manual storage is an option, but in reality it would only be appropriate for very small, and highly disciplined teams. Like most manual processes, it becomes cumbersome quickly if there is notable volume or number of contributors involved. One form that isn’t stored in the proper place won’t be an issue… until that form is needed, often months or even years later, and mistakes are likely to occur more frequently in tandem with document volume, especially if they are being stored manually.
Furthermore, in the event that PDFs are being manually archived, protocols have to be manually adhered to every single time a PDF is saved as well. For example, let’s assume we’ll be manually archiving customer invoices. We’d want them saved in a tree accordingly:
This process will almost always work, but one misclick means that the invoice is saved under the wrong year, or even the wrong customer. Again, manual processes become cumbersome and increasingly error-prone as volume increases, so unless the team can be counted on 100% of the time to always remember to convert a form to PDF, and then place that PDF in the right location, manual conversion and archiving should be avoided.
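One way to take the misclick out of the equation is to derive the storage location from the document's metadata instead of picking a folder by hand. The Python sketch below is illustrative only: the `Invoices/<customer>/<year>/` layout, the `archive_path` function and the sample values are all assumptions, not a layout prescribed by SharePoint or by our products.

```python
from datetime import date
import re


def archive_path(customer: str, invoice_date: date, invoice_number: str) -> str:
    """Build 'Invoices/<customer>/<year>/<number>.pdf' from document metadata."""
    # Replace characters that are awkward in folder names with a dash,
    # then trim any leading or trailing dashes and spaces.
    safe_customer = re.sub(r"[^A-Za-z0-9 _-]+", "-", customer).strip(" -")
    return f"Invoices/{safe_customer}/{invoice_date.year}/{invoice_number}.pdf"


print(archive_path("Contoso Ltd.", date(2016, 3, 14), "INV-0042"))
```

Because the customer and year come straight from the metadata, the same invoice always lands in the same place regardless of who archives it.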
Luckily, workflows automate these processes, and are pretty easy to set up. Planning ahead and setting up workflows not only makes the process easier for everyone, it also eliminates headaches and mitigates the risk of human error, such as plain forgetfulness.
Creating a workflow with Muhimbi is just a few steps - and works in most common workflow environments including SharePoint Designer, Nintex Workflow, K2, Visual Studio and Microsoft Flow. It’s easy enough that there is no real reason not to use workflows for any type of content that will be routinely created and needs to be stored.
Saving documentation in a safe, secure and reliable manner is a core business need, and regulatory compliance requirements only make the need more prominent. While the archiving of data may seem complex, it doesn’t need to be complicated as long as an automated process is developed that leverages metadata use and automated workflows.
While Optical Character Recognition (OCR) may seem like a newer technology, it’s been around for more than 50 years. In fact, OCR has become embedded in our daily life without much fanfare. For example, if you’ve ever inserted a check directly into an ATM and the ATM displayed the amount – that was OCR working for you. Of course, OCR functionality goes well beyond depositing Grandma’s birthday check.
Due to an overwhelming number of user requests, OCR has been an important part of Muhimbi’s range of server-side PDF Conversion products (SharePoint, SDK for Java / PHP / C#, SharePoint Online / Office 365). Implementing software to recognize images and convert them to alpha-numeric characters was no trivial task, but thankfully it’s much easier to explain than it was to actually implement!
When an image is entered into a system it is reviewed for recognizable text. That text is then deciphered by the system with its best guess for each individual character. The system then creates a hidden data layer that contains the deciphered content, synced to the appropriate space on the image.
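As a rough mental model, the hidden data layer can be pictured as a list of recognized words, each tied to a position on the page image. The Python sketch below is a simplification for illustration, not how our OCR engine stores its results: the `RecognizedWord` fields, the `find_text` helper and the sample values are all made up.

```python
from dataclasses import dataclass


@dataclass
class RecognizedWord:
    text: str          # best guess for the characters in this region
    confidence: float  # 0.0 to 1.0, how sure the recognizer was
    x: int             # left edge on the page image, in pixels
    y: int             # top edge on the page image, in pixels
    width: int
    height: int


def find_text(layer: list[RecognizedWord], query: str) -> list[RecognizedWord]:
    """Return the words in the hidden layer that contain the query, ignoring case."""
    needle = query.lower()
    return [word for word in layer if needle in word.text.lower()]


# A tiny hand-written layer for a scanned invoice.
layer = [
    RecognizedWord("Invoice", 0.98, 40, 32, 120, 24),
    RecognizedWord("Total:", 0.95, 40, 400, 90, 24),
    RecognizedWord("1,250.00", 0.91, 140, 400, 130, 24),
]

for hit in find_text(layer, "invoice"):
    print(hit.text, (hit.x, hit.y))
```

Because each word keeps its coordinates, a viewer can highlight a search hit at the right spot on the scanned image even though the visible page is still just a picture.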
There are a lot of ways this can be useful for a business and we have included a few examples below. Perhaps more than one will match your organization’s needs.
If an organization needs to digitize old orders and invoices, doing so manually would involve discrete steps for scanning in the paper copies, renaming them, and storing them in the correct place. However, with OCR technology it’s possible to scan the images in, set rules to look for key information, rename files, and create settings to automatically store them appropriately. SharePoint workflows become super helpful with tasks like these!
Another example involves InfoPath, always a popular topic in our PDF Converter for SharePoint’s use case. It’s not uncommon for InfoPath forms to allow (or require) the attaching of relevant documents. Those docs are most often attached as images, or non-OCR PDFs. By having OCR scan and digitize the content of those files their later usability is significantly increased.
OCR also offers advantages that deal with search-ability. The content in the hidden data layer attached to the file is searchable using a PDF reader or web browser. This allows for “search by content” functionality. Additionally, this text layer can also be set to be crawl-able or index-able allowing search engines to display the OCR document as results. Naturally, this makes said documents much more convenient to work with.
Scanned Document with OCRed text selected
Perhaps most meaningfully, OCR can empower visually impaired users to access content that would be otherwise impossible; the data layer can be used by a text-to-speech system to ‘read’ content to a user. Of course, even though expanding available content to the visually impaired has obvious business value, the impact goes well beyond office work.
For a bit of history on how OCR became a dominant technology in providing content access to the visually impaired, we should start by mentioning that many governments around the world have implemented standards based on the Web Content Accessibility Guidelines (WCAG), which have helped formalize how web content should be created and accessed by any machine. Some examples of governmental implementation include US Section 508 and the UK Equality Act 2010, meaning that all US and UK government websites must adhere to the standards set in the WCAG.
The WCAG is a lasting legacy of the Web Accessibility Initiative, which spun out of the personal computer boom of the 1990s. As recently as a few years ago, only about 1% of published books became available in braille, so the WCAG and Web Accessibility Initiative have played an important role in setting up useful guidelines to make sure that online content was held to a higher standard.
The wide adoption of these standards means more electronic content has become available to the visually impaired, both through electronic braille readers (which can cost upwards of $3,000), and the less expensive combination of OCR technology and a screen reader. A screen reader, either as a desktop application or a browser extension, allows text-to-speech capabilities for both rich-text content as well as OCR saved content.
Furthermore, while personal, organizational, or corporate sites aren’t required to comply with these standards, most do because they’ve become widely accepted best-practices. This increases the prevalence and frequency with which OCR technology is used.
There are plenty of solid business reasons for including OCR capabilities into Muhimbi’s range of server side PDF Conversion products. However, we can’t help but think that bringing new content, and more options to those with a visual impairment is perhaps the most notable.
Posted at: 12:51 PM on 23 December 2016 by Muhimbi
Ever since Muhimbi was founded, we have seen a multitude of new SharePoint releases including SharePoint 2007, 2010, 2013, Online and now SharePoint 2016. It takes a lot of work and effort, but our policy is to never leave a customer behind while making sure there is an upgrade path to whatever is coming next.
As a result, we are very happy to announce the new version of the Muhimbi PDF Converter for SharePoint, now compatible with SharePoint 2016 and Nintex Workflow 2016. In addition to support for these new technologies there are also fixes and improvements including the ability to use the new Workflow Manager to create Site Workflows as well as Reusable Workflows.
This version of the PDF Converter requires software to be installed on the server. Please note that if you have no server access there is always the option to deploy our SharePoint Online App to on-premise versions of SharePoint 2013 and 2016. For details see this blog post.
For those not familiar with the product, the PDF Converter for SharePoint is a lightweight solution that allows end-users to merge, split, watermark, secure, OCR and convert common document types - including InfoPath, AutoCAD, MSG (email), MS-Office, HTML and images - to PDF as well as other formats from within SharePoint using a friendly user interface, workflows or a web service call without the need to install any client side software or Adobe Acrobat. It integrates at a deep level with SharePoint and leverages facilities such as the Audit log, Nintex Workflow, K2 blackpearl, localisation, security and tracing. It runs on SharePoint 2007-2016 & SharePoint Online and is available in English, German, Dutch, French, Traditional Chinese and Japanese. For detailed information check out the product page.
Although at the time of writing (October 2016) Microsoft’s cool new Flow platform is not yet officially available, our support team is quickly turning into Flow junkies. Problems that are difficult to solve using regular SharePoint Designer workflows are absolutely trivial to crack using Flow.
Today we are describing a solution to what is a top 5 support request, at least for our support team: automatically converting a file to PDF using a workflow and sending the result via email as an attachment. You’d think this is easy to achieve in a SharePoint Designer workflow, as that comes with an e-mail action; however, that action does not support attachments. Bummer!
Read on to find out how easy it is to solve this problem.
In SharePoint Online (or on-premise if you have configured a gateway) make sure you have access to two different Document Libraries. This example can be adapted to use only a single library, but things are slightly easier when using two. Let’s name our libraries Auto Convert Files and Email Files.
As at the time of writing Muhimbi’s workflow actions are not (yet) available in Flow, we need to create a SharePoint Designer workflow to carry out the conversion to PDF. Navigate to the Auto Convert Files Library and create a new SharePoint Designer Workflow (Library tab / Workflow Settings / Create a Workflow in SharePoint Designer).
Name your workflow Convert to PDF, set it to start automatically when a new file is created, and configure it to convert the Current Item to Email Files/ (including the trailing slash) in PDF format and to Exclude metadata. Too cryptic? Have a look at this demo video (it uses slightly different parameters, but illustrates the concept).
Close SharePoint Designer and check that the new workflow works by uploading a new file to Auto Convert Files. After a few seconds the PDF rendition should be placed in the Email Files Document Library.
It is time to create the Flow that picks up the newly converted file and sends it via email. On the My Flows screen select the option to create a New flow from blank.
The trigger that starts the flow must be defined first: enter SharePoint when a file is created and select the displayed trigger. Enter the site collection and Email Files document library (use the Library picker if needed).
With the fields filled out, click New Step followed by Add an Action.
There are several built in email related actions (Outlook, Office 365 Outlook, SMTP), but in this example we use the basic Mail – Send email action. Select it and accept the terms & conditions if needed.
Fill out the recipient, subject etc and click Show Advanced options. Select the Attachment field and select File content from the Dynamic content picker. Similarly for the Attachment file name, select File name.
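For readers who think in code, the attachment step has a close counterpart outside Flow. The Python sketch below uses the standard library's `email.message.EmailMessage` to attach file content under a file name, mirroring the Attachment and Attachment file name fields. It is an analogy rather than what Flow does internally; the addresses and the `build_message` helper are placeholders, and the message is only built, not sent.

```python
from email.message import EmailMessage


def build_message(sender: str, recipient: str, file_name: str, file_content: bytes) -> EmailMessage:
    """Create an email with the converted file attached as a PDF."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Converted file: {file_name}"
    message.set_content("The converted PDF is attached.")
    # Counterpart of the Attachment (content) and Attachment file name fields.
    message.add_attachment(
        file_content, maintype="application", subtype="pdf", filename=file_name
    )
    return message


message = build_message(
    "flow@example.com", "someone@example.com", "report.pdf", b"%PDF-1.4 example"
)
attachment = next(message.iter_attachments())
print(attachment.get_filename(), attachment.get_content_type())
```

As in the Flow, the content and the file name travel as two separate values, so both need to be supplied for the recipient to see a properly named PDF.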
That is it. Enter an appropriate name for the flow at the top of the screen and click the Create Flow option. You can later extend the flow by adding a SharePoint Delete File action to delete the original file if it is no longer needed.
On the My Flows screen make sure the newly created flow is enabled. Upload a file to Auto Convert Files and after a few seconds an email will be delivered with a copy of the converted file.
Naturally this is not limited to PDF Conversion; it works with any file generated by our various workflow actions, including Merge, Watermark, Secure and OCR operations. The current flow always sends the email to the same recipient, but it can easily be extended to take the recipient from a separate column by querying that column in Flow.
During the previous 8+ years we have made it a habit to announce new software releases – for our on-premise software – at the time it became available for download. However, because releasing updates for an online service, where we maintain the entire back-end, doesn’t require any end-user involvement, we haven’t always done such a good job where it comes to announcing new versions of the Muhimbi PDF Converter for SharePoint Online / Office 365.
That changes today as we formally announce the availability of version 9.8, the 8th release since the product first became available in June 2015. An overview of all recent and historical changes can be found below.
Please note that all SharePoint Online versions are numbered in the 9.X range. At the time of writing the most recent version of the on-premise software is 8.1.
The number of new features and changes is almost too large to list, but it includes support for Site Workflows, Reusable workflows, the ability to install the App in SharePoint 2013 / 2016, support for the new Document Library experience, real-time watermarking, a brand new website for SharePoint Online and much much more.
If you are an existing customer, or installed a trial version before October 2016, then we recommend upgrading the App and installing the latest workflow actions (especially as Microsoft has deprecated certain types of sandbox solutions and caused some issues when they introduced the new Document Library experience).
For those not familiar with the product, the Muhimbi PDF Converter for SharePoint Online is a lightweight subscription based solution that allows end-users to merge, split, watermark, secure, OCR and convert common document types - including InfoPath, AutoCAD, MSG (email), MS-Office, HTML and images - to PDF using SharePoint Online through a friendly user interface or via workflows, without the need to install any client side software or Adobe Acrobat. More details can be found on the product page.
Deploy App Store Add-in to SharePoint Online or on-premise.
In addition to the changes listed above, some of the main changes and additions in the new version are as follows:
The 'Split' Workflow action does not recognise the default 'interval' value.
The List ID variable is not created by merge activity for some tenancies.
One of the key advantages of deploying Apps in SharePoint Online, or at least App Store Apps, is the ease of installation: it is absolutely trivial. A quick search in the App Store followed by another click to install a complex product and you are done. No need to involve IT staff, plan capacity, assess risk, install dependencies, monitor servers and maintain systems. It doesn’t get much easier (for the customer, that is; now we get to do all the hard work in our hosting environment :-).
Well, and I guess you can see where this is going, today we are doing exactly that. Provided your SharePoint 2013 / 2016 environment has been set up to integrate with the Office Store, you can install both our SharePoint Online App and Workflow Actions on-premise with the click of a button. Brilliant!
While installation of the App is easy, please make sure that:
None of these requirements are specific to Muhimbi’s Apps / Add-ins. Most Apps, at least the non-trivial ones, require the same one-time SharePoint configuration.
So, what else do you need to know?
Although from a functional perspective the App is largely identical to the traditional on-premise product, the license is completely different. The App is subscription based, regardless of the environment it is installed in. For details about the various subscriptions, see this overview.
Although we aim for full feature parity between the App and the traditional PDF Converter for SharePoint, there are some differences. The App does not directly integrate with Nintex Workflow (Nintex for Office 365 does not support 3rd party add-ins at the time of writing). Our API is also not (yet) available from the App. If you need either of these functions then please reach out to our support desk for the appropriate workarounds. An overview of the key differences and similarities can be found in this Knowledge Base article.
App Store integration is only available in SharePoint 2013 and later. This does not work on older SharePoint versions such as 2007 or 2010. Please install our on-premise software in those environments.
With the modern App being available on-premise, you may think that we will no longer focus development on our traditional on-premise products. This is not the case. We have a very complete and actively developed roadmap for both products and will continue to develop each separately. We pride ourselves on never leaving any customers behind, which is why every new version of our on-premise products still supports SharePoint 2007 and Windows Server 2003. (Yes, many people still run that combination, and we don’t mind that they do.)
Any questions or comments? Please leave a message below or contact us; we love talking to our customers.