Enterprise Content Management Integration – ECM

Many BPM solutions work with document and file content such as Microsoft Word documents, PDFs, Excel spreadsheets, image files and other pre-existing information items. Instances of these document types may be needed while handling a process instance. Consider an insurance claim for a car accident. This may involve an initial claim form, a police report, pictures of the accident and faxes from garages with quotes for repair. To process the claim, a clerk works through a process instance which must make these documents available. These documents have to be stored somewhere within an IT system, and this is where the category of software product called an Enterprise Content Management (ECM) system comes into play. An ECM is a software product that explicitly manages the existence, organization, search and retrieval of documents. A variety of vendors offer such ECMs. These include:

In addition, the BPM product itself provides an in-built document repository.

In this section we will discuss the support provided by IBM BPM for integrating with ECMs such that content from documents can be managed as part of the process solution.

Content Management Interoperability Services – CMIS

If we consider what we want from an arbitrary ECM we will find a common set of required functions. Capabilities that we obviously want are:

Historically, each provider of an ECM supplied its own set of APIs for interacting with it. Each API was well suited to one specific brand of ECM; however, if we wished to change ECM providers, we would be stuck with the job of re-coding our applications to use the different APIs. What we need is an industry standard API that each ECM vendor can expose. With such an API, we need only write our application once against an abstract ECM and can then choose an ECM provider based on other qualities such as performance, capacity, price or other features.

Content Management Interoperability Services (CMIS) is an open standard protocol from the standards body called OASIS. CMIS provides a vendor neutral API for interacting with ECMs. CMIS is what IBM BPM uses to interact with back-end ECM systems, thus insulating BPM from any single provider. A number of ECMs support this standard including:

CMIS provides the following functions and here we indicate which ones are supported by IBM BPM:

Document Types

When we think of a file on the file system of an operating system, we think of it as containing content and having a name. There are other attributes of files such as their owner and the date/time they were last modified. However, that is about the limit of the metadata of a file. Documents held within an ECM are different from a simple file. When a document is created within the ECM, we tell the ECM what "type" of document it is. By doing this, we have implicitly associated a type template with that document, and this is very valuable. By associating the template with the document, we can work with it in a richer manner.

For example, if we create a document which is my tax return for 2015, then we can imagine that if we viewed it, we might see a PDF of a 1099ez (a US tax form). However, if we explicitly state when we create the document within the repository that it is of "type" 1099ez, we may have defined a model of what it contains: for example, my social security number, my salary, my dependents and other metadata related to the document. By doing that, one can now retrieve such documents using richer search criteria. For example, to retrieve all my tax filings for the last few years, one could formulate a query such as:

select all documents of type 1099ez where the socialSecurityNumber="123-45-6789"
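The informal query above might be expressed in the CMIS query language roughly as follows. This is only a sketch: the type id "TaxReturn1099ez" and the helper names are assumptions made for illustration, and the real type id and property names depend on how the document type was defined in the repository.

```javascript
// Hypothetical helper (not part of IBM BPM): escape a value for use inside a
// CMIS query string literal. CMIS QL uses a backslash to escape quotes and backslashes.
function escapeCmisLiteral(value) {
  return String(value).replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

// Hypothetical helper: build a "select everything of this type where one
// property equals a value" query.
function buildTypeQuery(typeId, propertyName, value) {
  return "SELECT * FROM " + typeId +
    " WHERE " + propertyName + " = '" + escapeCmisLiteral(value) + "'";
}

// "TaxReturn1099ez" is an assumed type id used only for this example.
var query = buildTypeQuery("TaxReturn1099ez", "socialSecurityNumber", "123-45-6789");
console.log(query);
```

Escaping the value matters as soon as the search text comes from a user rather than from a constant.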

When using FileNet or the IBM BPM Content Store, one can use the ACCE (Administration Console for Content Platform Engine) tool to examine which document types are available and to validate that a document type you created is where it is expected to be.

Configuring IBM BPM for an external ECM

In order to integrate with an external ECM system using CMIS, you must define a connection to it within your Process Application. Within Process Designer, within the Process App Settings, you will find a tab called Servers. We create a new definition here ensuring the "Type" is "Enterprise Content Management Server".

A definition of this type has a number of attributes associated with it:

Hostname – The host-name or TCP/IP address on which the ECM server can be found.

Port – The TCP/IP port number on the host-name where the ECM server is listening for requests it should perform on behalf of the BPM system.

Context Path – The relative URL part of the context path on which the ECM server is willing to accept Web Service requests for ECM work.

Secure Server – Whether or not the HTTPS protocol should be used for communications. HTTPS provides transport level security.

Repository – The name of the repository.

Userid – The userid used to connect to the ECM server.

Password – The password for the user used to connect to the ECM server.
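To see how these attributes relate to each other, the following sketch assembles an endpoint URL from the first four of them. How IBM BPM combines the values internally is not described here, so treat this purely as an illustration; the host name, port, context path and repository name are made-up sample values.

```javascript
// Hypothetical illustration: combine server definition attributes into a URL.
function buildServerUrl(server) {
  var scheme = server.secure ? "https" : "http"; // "Secure Server" selects HTTPS
  var path = server.contextPath.charAt(0) === "/"
    ? server.contextPath
    : "/" + server.contextPath;                   // tolerate a missing leading slash
  return scheme + "://" + server.hostname + ":" + server.port + path;
}

// Sample values only; none of these come from a real installation.
var sampleServer = {
  hostname: "ecm.example.com",
  port: 9443,
  contextPath: "/cmis/services",
  secure: true,
  repository: "ExampleRepo"
};

var url = buildServerUrl(sampleServer);
console.log(url);
```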

The "Test Connection" button may be used to validate the information entered. If all is well, the following dialog will be shown:

Here is a sample Alfresco definition:

Here is a sample for FileNet:

In order to use ECM functions within a BPM solution, we must add an IBM supplied toolkit as a dependency to our application. Add the content management toolkit which is called "Content Management" which has an acronym ID called "SYSCM":

Using the same user for multiple ECM connections was considered a security vulnerability, logged as CVE-2015-1904. It has been addressed by a new service that is described in this technote.

The Document List Coach View

After having defined a BPM process app with knowledge of how to connect to an ECM, we would next turn our attention to thinking about how a user would see managed documents from our solution. IBM has supplied a number of Coach Views that perform such services.

To show documents to the user from a Coach, you can add a "Document List" Coach View to the Coach. From the palette of controls, this looks as follows:

When added to the Coach, it looks like:

It has quite a few configuration parameters:

Open in new window – If selected and if a request to show the content of the document is made, then the content is shown in its own browser window. If we don't check this option and wish to show the content of the document then we will have to use a Document Viewer.

Refresh Trigger – A boolean valued variable which, when it becomes true, causes the content of the list to refresh. The value of the variable is then toggled back to false.

Number of results to show – The number of results for a document search to show in a page. The default is 10.

Show all content – If not selected (default) then the result will show a fixed number of rows on the page and paging will be enabled. If selected, then the fixed number of rows will still show but paging will not be enabled.

Configure for use with – Select the type of document provider. There are two choices:

BPM document options – When the type of repository is the BPM Document Store, these additional options apply:

Display options – This section acts as a filter to be used when displaying items in the list

Associated with process instance – By default, all documents are shown. Selecting this option causes only documents created within the current process instance to be shown.

Display match rule – A selection of how matches are made. There are three options:

Display properties – A list of name and value pairs where the name is that of a property of a document and value is its corresponding value. These display properties are used in conjunction with the "Display match rule" option to filter the set of documents shown to the user.

Upload options

Default upload name – Default name for an uploaded document. If supplied, the "User editable" property defines whether or not the user can alter this value.

User editable – Defines whether or not the user can edit the name of the document to be created.

Add properties – Should additional properties be added during an upload. If yes, those properties are defined in the "Upload properties" list.

Upload properties

Hide in Portal – If selected, the document will not be visible in the document list.

A BPM data type called BPMDocumentOptions is defined that models the settings just described. This data type contains:

displayOptions (BPMDocumentDisplayOptions)

uploadOptions (BPMDocumentUploadOptions)

A JavaScript initializer for this data type might be:

{
  displayOptions: {
    associatedWithProcessInstance: false,
    displayMatchRule: "NONE", // Other choices are "ANY" and "ALL"
    displayProperties: []
  },
  uploadOptions: {
    defaultUploadName: "",
    userEditable: false,
    addProperties: false,
    uploadProperties: [],
    hideInPortal: false
  }
}
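As an illustration of how the display properties and the display match rule might work together, here is a sketch of a populated options object and a plain JavaScript function that mimics the filtering. The exact semantics of the three rules are an assumption here ("NONE" applies no filter, "ANY" needs at least one matching property, "ALL" needs every property to match), and the property names "claimId" and "category" are invented for the example.

```javascript
// Sample options: only show documents whose properties match ALL listed pairs.
var options = {
  displayOptions: {
    associatedWithProcessInstance: true,
    displayMatchRule: "ALL",
    displayProperties: [
      { name: "claimId", value: "C-1001" },   // hypothetical property
      { name: "category", value: "photo" }    // hypothetical property
    ]
  },
  uploadOptions: {
    defaultUploadName: "",
    userEditable: false,
    addProperties: false,
    uploadProperties: [],
    hideInPortal: false
  }
};

// Hypothetical function mimicking the assumed filter semantics.
// docProps is a plain object mapping property names to values.
function matchesDisplayOptions(docProps, displayOptions) {
  var wanted = displayOptions.displayProperties;
  var rule = displayOptions.displayMatchRule;
  if (rule === "NONE" || wanted.length === 0) {
    return true; // no filtering requested
  }
  var hits = wanted.filter(function (p) {
    return docProps[p.name] === p.value;
  }).length;
  return rule === "ALL" ? hits === wanted.length : hits > 0;
}

console.log(matchesDisplayOptions({ claimId: "C-1001", category: "photo" }, options.displayOptions));
```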

ECM document options – An instance of CMISDocumentOptions. This data structure has two properties:

Search (user implementation required for ECM documents) – A service which returns the documents to be shown to the user. The default is the IBM supplied service called "Default ECM Search Service". See Document List – Search service.

Get all document versions – A script which gets the versions of each document retrieved

Get document – A script which retrieves an individual document

Get type definition – A script which retrieves the types of all the documents retrieved.

Get type descendants – A script which retrieves the types of documents.

But what does this Coach View look like when it is shown within a Coach? The following is an example screen shot:

What is displayed is a table containing details of the documents that are found within the ECM. If there are more documents than can be displayed in the maximum allowable size, the table will provide paging (if desired). As mentioned, the selection of columns included in the table can be customized.

In the "Actions" column are click-able icons that can provide functions to be performed against the document that a row represents. The magnifying glass icon allows us to view the document's content in an associated content viewer widget.

The pencil icon allows us to edit the properties of the document. When clicked, we can change the properties or upload a new version of the content. This option is only available if the "Allow update" configuration option is checked.

The final icon in the Actions column allows us to view the previous revisions of the document. The revision list is shown in the same view:

Also in the view are two further buttons associated with the view as a whole. The first is called "Refresh" and simply refreshes the view from the content managed by the ECM. If new documents have been added, or old ones deleted the view will reflect those changes after a refresh. The second button is called "Create Document".

Here we can provide a number of parts including the type of document, the name we wish to give to the document and a local file whose content will be uploaded and associated with the new document. The newly created document will be created in the ECM folder defined by the "Parent folder path" configuration attribute. The ability to create new documents is only available if the "Allow creates" configuration option is selected.

The Coach View can be bound to a variable of data type "List of ECMDocumentInfo". This list will be populated with information on the retrieved items. The selected item in the list will change when the user selects a document to view.

Document List – Search service

When the Document List Coach View loads, it has to get the list of documents from the ECM that it will display to the user. It delegates this task to a Service defined by the Search configuration parameter. The Search service returns the list of documents to be shown to the end user. For external ECM providers, the Search service must be implemented; there is no default. The Search service can be easily created by clicking the "New" button in the Document List configuration parameters.

However, when using Client Side Human Services, the "New" button is not present.

That isn't good. The solution is to create a temporary Heritage Human Service and add the Document List Coach View into a Coach. From the Heritage Human Service, you will continue to find the "New" button which can be used to create the correct Ajax service. Once done, you can go back to the Client Side Human Service and pick the one you just created. The temporary Heritage Human Service can then be deleted. An alternative would be to copy the service called "Default ECM Search Service" from the "Content Management" toolkit into your local process app and rename/modify that copy. This service can be found in the "User Interface" category as the service is an Ajax service.

The new Ajax service with the correct signature will look as follows:

It is your responsibility as the designer of this service to return a searchResult object from your execution. To achieve this, add a Content Integration component into the service as shown:

In its implementation settings, select your server and the "Search" operation:

Map its required parameters from the signature of the service you are building:

To enable the icons in the Actions column for the rows shown in the Document List, the search must include "cmis:objectId" as one of the returned properties. This is used as the key to perform the action. It is added by default unless you have chosen to build your own CMIS query. Failure to return this property will result in the actions being grayed out, with a helpful tool-tip hint shown here:

Document List – Search Service – Properties

The properties of a document can be found in the "Properties" attribute of CMIS. It seems that the individual properties don't appear by themselves as top level properties but are rather encoded within this single property.

Document List – Search Service – The provided cmisQuery parameter

You will have noticed that one of the parameters passed into the service is called "cmisQuery". This is a string which contains the CMIS query that reflects the settings made within Process Designer. If we ignore this cmisQuery, we ignore the settings that we made. It is odd to note that the raw settings (an instance of BPMDocumentOptions) are not supplied to the service. Since they are not supplied, we don't have the information needed to factor these settings into our own search. One workaround (read "hack") is to understand the structure of the generated cmisQuery and hope that it remains constant from release to release. If this is the case, we can manipulate the supplied cmisQuery string to add our own properties and columns.
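A minimal sketch of such string manipulation follows. It assumes the generated query has the simple shape "SELECT columns FROM ..." with unaliased column names, which may not hold for every release; the function name and the sample query are invented for the example. It also makes sure that "cmis:objectId" is among the selected columns so that the actions in the Document List remain enabled.

```javascript
// Hypothetical helper: add columns to the SELECT list of a CMIS query string
// if they are not already present. Queries it does not recognize are returned
// unchanged rather than risk corrupting them.
function ensureSelectColumns(cmisQuery, columns) {
  var match = /^(\s*SELECT\s+)([\s\S]+?)(\s+FROM\s+)/i.exec(cmisQuery);
  if (!match) {
    return cmisQuery; // not the shape we expected
  }
  var selected = match[2].split(",").map(function (c) { return c.trim(); });
  if (selected.indexOf("*") !== -1) {
    return cmisQuery; // "*" already returns every property
  }
  columns.forEach(function (c) {
    if (selected.indexOf(c) === -1) {
      selected.push(c);
    }
  });
  // Rebuild: SELECT keyword + new column list + FROM keyword + remainder.
  return match[1] + selected.join(", ") + match[3] + cmisQuery.slice(match[0].length);
}

// Sample query standing in for the generated one.
var supplied = "SELECT cmis:name, cmis:creationDate FROM cmis:document WHERE cmis:name LIKE 'Claim%'";
var adjusted = ensureSelectColumns(supplied, ["cmis:objectId"]);
console.log(adjusted);
```

Returning the query untouched when its shape is not recognized keeps the service working, with the generated settings intact, if a later release changes the query format.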

The Document Viewer Coach View

A companion Coach View to the Document List is the Document Viewer. This Coach View's purpose is to act as a container for showing the content of a selected document. From the palette, this Coach View looks like:

When dropped on a Coach canvas, it looks like:

The Document Viewer Coach View has no configuration options. However, it should be bound to a variable of type "ECMDocumentInfo" that is also bound to a Document List Coach View. The recommended way is to bind to the "listSelected" element of the list bound to the Document List Coach View. When the user selects a document in the Document List Coach View, the selected entry in the list changes. The Document Viewer Coach View notices this change, which is its trigger to show the new content.

The height of the Document Viewer is not likely to be the size you want. This can be fixed by changing the style:

.classDocViewFrame { height: 500px; }

The Document Explorer Coach View

With the arrival of IBM BPM 8.5.5, a new addition was made to the set of Coach Views that are provided to interact with the IBM BPM content management system. The new Coach View is called the "Document Explorer". It provides a new visualization for documents and extends the concept to include folders. This Coach View is only available with the Basic Case Management feature installed.

The palette entry for the Document Explorer looks as follows:

It is only available when the Content Management toolkit has been added to the project. When dragged and dropped onto the canvas, it looks as follows:

It has a rich set of configuration options. These options are illustrated below:

Content Integration node

We should now have a general idea that interaction with an ECM is achieved using a protocol called CMIS and that there are Coach Views that can be used to present the documents to the user. We have also seen that some of these Coach Views call Services in order to obtain data. What has not yet been explained is how these Services arrange for the CMIS calls to the ECM provider to be made. IBM BPM provides a node that can be included in a service called the Content Integration node. This is the core of the story.

The Content Integration node provides CMIS access to the ECM provider. It can be added to a variety of Service implementations. When seen in the palette, it looks as follows:

When added to the service implementation canvas, it looks like:

Once added to the canvas, the implementation details of the node must be defined. There are two primary attributes. The first is the server against which the CMIS command is to be performed. This server definition must be created in the Process Application Server options. The second attribute is called "Operation Name" and defines which CMIS operation is to be performed against the server.

Part of the goal of IBM BPM is to try and shield the developer from the worst details of CMIS usage.

The operations that can be performed on a Content Integration node are:

The entries in red are not available to the IBM BPM Document Store.

Each of these commands will be covered in turn.

Depending on which command is selected, the input and output parameters of the Content Integration node will change dramatically.

IBM provides some new Business Object data type definitions for some of those parameters. These new data types are described next.

Description of a Document (ECMDocument)

A document is described by the data type called "ECMDocument":

The list of properties associated with the document will be a mixture of ECM specific and application injected.

Description of a Search Result (ECMSearchResult)

When a search is executed, an object of type ECMSearchResult is returned.

This object contains:

To see how this fits together, think of resultSet as being a list of row properties and each row being a list of column properties.

Description of ECMSearchResultSet

When a search is performed, we get back an ECMSearchResult. This has a property called "resultSet" which is an instance of ECMSearchResultSet. This data type looks like:

As we can see, this is really no more than a list of rows. Each row is an instance of ECMSearchResultRow.

Description of ECMSearchResultRow

This data structure represents a single document instance returned from a search. The instance is a set of properties called columns where each column corresponds to a particular property of the document instance. The order of the columns corresponds to the order of the propertyMetaData in the ECMSearchResult object.
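The relationship between rows, columns and the property metadata can be sketched with plain JavaScript objects. The field names "row", "column" and "queryName" are assumptions made for this illustration, as are the sample values. Inside a real BPM service the lists are BPM list objects, which may not behave like JavaScript arrays, so the script would likely need index-based loops instead of map and forEach.

```javascript
// Mock search result using plain objects; field names are assumed, not confirmed.
var searchResult = {
  propertyMetaData: [
    { queryName: "cmis:objectId" },
    { queryName: "cmis:name" }
  ],
  resultSet: {
    row: [
      { column: ["id-1", "claim.pdf"] },
      { column: ["id-2", "photo.jpg"] }
    ]
  }
};

// Hypothetical helper: turn each row into an object keyed by property name,
// relying on column order matching the order of propertyMetaData.
function rowsToObjects(result) {
  var names = result.propertyMetaData.map(function (m) { return m.queryName; });
  return result.resultSet.row.map(function (r) {
    var doc = {};
    names.forEach(function (name, i) {
      doc[name] = r.column[i];
    });
    return doc;
  });
}

var docs = rowsToObjects(searchResult);
console.log(docs[1]["cmis:name"]);
```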