Thursday, June 21, 2012

Exception handling; fault management and priority messages

Introduction

Exception handling often receives too little attention. Functional people often consider it a technical topic, while technical people consider the handling of error situations to be mostly functional in nature. This difference of opinion can result in software reaching a production environment that is difficult to maintain.

To understand the background of this article, it helps to first read the following two articles:
http://javaoraclesoa.blogspot.nl/2012/05/exception-handling-in-soa-suite-10g-and.html
http://javaoraclesoa.blogspot.nl/2012/05/re-enqueueing-faulted-bpel-messages.html

This article combines two forms of exception handling: a custom Java class in a fault policy that retires the process, and a BPEL catch branch that re-enqueues the message with a higher priority.

The implementation will satisfy the following requirements;

If an error occurs
- the process will be retired to prevent future faults in case a service invocation fails (fault management framework and custom Java fault handler)
- the faulted message must be put on a queue so it can easily be re-offered to the process after the error has been resolved (catch branch in BPEL using AQ adapter to re-enqueue a message)
- once the problem is fixed, the faulted message must be picked up before the other messages in the queue, in order to retain the order in which messages are processed (AQ with sorting on priority)

The fault management framework

Summary

The fault management framework does not handle all exceptions that occur, only faults raised by invoke activities. The fault management framework takes precedence over the BPEL catch branch. It also makes it easy to use custom Java fault handlers, for example to retire a process.

The BPEL catch branch does not allow a custom Java class to handle the exception. You can, however, use Java embedding; for an example, see http://javaoraclesoa.blogspot.nl/2012/03/base64encode-and-base64decode-in-bpel.html. Note that when an exception is rethrown in a Catch branch and a CatchAll branch is present on the same level, the exception is propagated to the higher level and is not caught by the CatchAll branch!
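
The rethrow behaviour is similar to how try/catch works in plain Java. The sketch below is only an analogy: it is ordinary Java, not BPEL, and the class name RethrowDemo is made up for illustration. An exception rethrown from one catch clause is not handled by a sibling catch clause of the same try block, but travels to the enclosing one.

```java
// Analogy only: plain Java try/catch standing in for BPEL scopes.
class RethrowDemo {

    static String run() {
        try {                                    // outer scope
            try {                                // inner scope
                throw new IllegalStateException("invoke failed");
            } catch (IllegalStateException e) {  // plays the role of the Catch branch
                throw e;                         // rethrow
            } catch (RuntimeException e) {       // plays the role of the CatchAll on the same level
                return "caught by sibling catch-all";
            }
        } catch (RuntimeException e) {           // handler one level higher
            return "caught by outer scope";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());               // prints "caught by outer scope"
    }
}
```

So if the rethrown fault still needs handling, the handler has to live on the enclosing scope.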

Example

The following example will retire a process when errors occur in invoke activities. The rethrow action makes sure that after the process has been retired, the error is propagated to the catch activities in the BPEL process.

The following code can be put in the <SCA-project>/SCA-INF/src/ms/exceptiontest/soa/faultHandlers folder (the folder structure must match the package name);

package ms.exceptiontest.soa.faultHandlers;

import com.collaxa.cube.engine.fp.BPELFaultRecoveryContextImpl;
import java.util.logging.Level;
import java.util.logging.Logger;
import oracle.integration.platform.faultpolicy.IFaultRecoveryContext;
import oracle.integration.platform.faultpolicy.IFaultRecoveryJavaClass;
import oracle.soa.management.facade.Composite;
import oracle.soa.management.facade.Locator;
import oracle.soa.management.facade.LocatorFactory;

// Custom fault policy action: retires the composite in which the fault occurred.
public class retireFaultHandler implements IFaultRecoveryJavaClass {
    private final static Logger logger =
        Logger.getLogger(retireFaultHandler.class.getName());

    // Not used: this handler does not perform retries.
    public void handleRetrySuccess(IFaultRecoveryContext iFaultRecoveryContext) {
    }

    public String handleFault(IFaultRecoveryContext iFaultCtx) {
        try {
            BPELFaultRecoveryContextImpl bpelCtx =
                (BPELFaultRecoveryContextImpl)iFaultCtx;
            // Look up the composite that owns the faulted BPEL process and retire it.
            Locator loc = LocatorFactory.createLocator();
            Composite comp =
                loc.lookupComposite(bpelCtx.getProcessDN().getCompositeDN());
            comp.retire();
            logger.info("retired " + comp.getDN());

        } catch (Exception e) {
            // Log including the stack trace; the fault is still rethrown to BPEL.
            logger.log(Level.SEVERE, "Error in FaultHandler " +
                       retireFaultHandler.class.getName(), e);
        }
        // "OK" is mapped to ora-rethrow-fault in fault-policies.xml.
        return "OK";
    }
}

Then you can use the following fault-policies.xml

<?xml version="1.0" encoding="UTF-8"?>
<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <faultPolicy version="2.1.3" id="ConnectionFaults">
    <Conditions>
      <faultName>
        <condition>
          <action ref="retireProcess"/>
        </condition>
      </faultName>
    </Conditions>
    <Actions>
      <Action id="retireProcess">
        <javaAction className="ms.exceptiontest.soa.faultHandlers.retireFaultHandler"
                    defaultAction="ora-rethrow-fault">
          <returnValue value="OK" ref="ora-rethrow-fault"/>
        </javaAction>
      </Action>
      <!-- Generics -->
      <Action id="ora-terminate">
        <abort/>
      </Action>
      <Action id="ora-replay-scope">
        <replayScope/>
      </Action>
      <Action id="ora-rethrow-fault">
        <rethrowFault/>
      </Action>
      <Action id="ora-human-intervention">
        <humanIntervention/>
      </Action>
    </Actions>
  </faultPolicy>
</faultPolicies>

And fault-bindings.xml

<?xml version="1.0" encoding="UTF-8"?>
<faultPolicyBindings version="2.0.1"
                     xmlns="http://schemas.oracle.com/bpel/faultpolicy"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <composite faultPolicy="ConnectionFaults"/>
</faultPolicyBindings>
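
How the javaAction in the policy above selects the next action can be mimicked in a few lines of plain Java. This is a simplified sketch of my understanding of the framework's behaviour, not Oracle code; the class JavaActionRouting and its method are hypothetical names. The string returned by handleFault is looked up among the returnValue entries, and the defaultAction is used when nothing matches.

```java
import java.util.Map;

// Hypothetical model of the <javaAction> element; not part of any Oracle API.
class JavaActionRouting {

    // Mirrors <returnValue value="OK" ref="ora-rethrow-fault"/>
    static final Map<String, String> RETURN_VALUES = Map.of("OK", "ora-rethrow-fault");

    // Mirrors defaultAction="ora-rethrow-fault"
    static final String DEFAULT_ACTION = "ora-rethrow-fault";

    // Picks the action id to execute after the Java handler has returned.
    static String nextAction(String handlerResult) {
        if (handlerResult == null) {
            return DEFAULT_ACTION;
        }
        return RETURN_VALUES.getOrDefault(handlerResult, DEFAULT_ACTION);
    }

    public static void main(String[] args) {
        System.out.println(nextAction("OK"));
        System.out.println(nextAction("SOMETHING_ELSE"));
    }
}
```

Since both the "OK" mapping and the defaultAction point to ora-rethrow-fault in this policy, the fault always ends up in the BPEL catch branch, whatever the handler returns.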

Re-enqueue messages using a higher priority

PL/SQL queue code

To be able to re-enqueue messages with a higher priority, a multiconsumer queue is created. A multiconsumer queue is not required, but it allows more flexibility since multiple parties can subscribe to the queue.

The queue table uses a sort_list of priority,enq_time, making sure messages with a higher priority are dequeued first (in AQ, a lower priority number means a higher priority). The retention time is set to 1 week (604800 seconds): even after messages have been dequeued, it is handy to still be able to find them and check their contents. The user owning the queues (and the queue package) should have the following grants;

GRANT EXECUTE ON DBMS_AQADM TO testuser;
GRANT EXECUTE ON DBMS_AQ TO testuser;
GRANT AQ_ADMINISTRATOR_ROLE TO testuser;

--creating queue tables
BEGIN
  DBMS_AQADM.CREATE_QUEUE_TABLE( Queue_table => '"TESTUSER"."SOA_MULTI_QT"', Queue_payload_type => 'SYS.XMLTYPE', Sort_list => 'PRIORITY,ENQ_TIME', Multiple_consumers => TRUE, Compatible => '8.1.3');
END;

--creating queues
BEGIN
  DBMS_AQADM.CREATE_QUEUE( Queue_name => 'TESTUSER.SOA_MULTI_QUEUE', Queue_table => 'TESTUSER.SOA_MULTI_QT', Queue_type => 0, Max_retries => 5, Retry_delay => 0, Retention_time => '604800', dependency_tracking => FALSE, COMMENT => 'multi queue');
END;

--adding subscribers
begin
DBMS_AQADM.ADD_SUBSCRIBER ('TESTUSER.SOA_MULTI_QUEUE',sys.aq$_agent('FAULTTEST', null, null));
DBMS_AQADM.Start_queue( Queue_name => 'TESTUSER.SOA_MULTI_QUEUE', Enqueue => TRUE, Dequeue => TRUE );
end;

To drop the queue and the queue table, the following can be used

BEGIN
  DBMS_AQADM.Stop_queue( Queue_name => 'TESTUSER.SOA_MULTI_QUEUE' );
  DBMS_AQADM.DROP_QUEUE( Queue_name => 'TESTUSER.SOA_MULTI_QUEUE' );
  DBMS_AQADM.DROP_QUEUE_TABLE( Queue_table => '"TESTUSER"."SOA_MULTI_QT"');
END;

To allow other users to enqueue messages without having to grant them rights to the DBMS_AQ package, the following package can be used;

CREATE OR REPLACE
PACKAGE soa_queue_pack
IS
type t_recipients_list IS TABLE OF VARCHAR2 (50); -- index by binary_integer;

PROCEDURE vul_multi_queue(
    p_queue_naam      IN VARCHAR2 ,
    p_xml_payload     IN xmltype ,
    p_priority        IN BINARY_INTEGER ,
    p_recipients_list IN t_recipients_list);
END soa_queue_pack;

create or replace
PACKAGE BODY soa_queue_pack
IS
PROCEDURE vul_multi_queue(
    p_queue_naam      IN VARCHAR2 ,
    p_xml_payload     IN xmltype ,
    p_priority        IN BINARY_INTEGER ,
    p_recipients_list IN t_recipients_list )
IS
  l_msg_aq raw(18);
  l_enq_opt dbms_aq.enqueue_options_t;
  l_msg_prop dbms_aq.message_properties_t;
  l_recipients_list dbms_aq.aq$_recipient_list_t;
BEGIN
  FOR i IN p_recipients_list.FIRST .. p_recipients_list.LAST
  LOOP
    l_recipients_list(i) := sys.aq$_agent(p_recipients_list(i),NULL,NULL);
  END LOOP;
  l_msg_prop.priority       := p_priority;
  l_msg_prop.recipient_list := l_recipients_list;
  dbms_aq.enqueue(p_queue_naam, l_enq_opt, l_msg_prop, p_xml_payload, l_msg_aq);
END;
END;

Other users who have been granted EXECUTE on the package can then do the following to enqueue messages ('testuser' is the user who owns the package and queues);

DECLARE
l_recipients          testuser.soa_queue_pack.t_recipients_list;
l_message_id RAW(16);
l_message SYS.XMLType;
BEGIN
l_recipients := testuser.soa_queue_pack.t_recipients_list('FAULTTEST');
l_message := sys.XMLType.createXML('<itemCollectionArray xmlns:msg_out="http://test.ms/itemcollections" xmlns="http://test.ms/itemcollections"><msg_out:itemsCollection><msg_out:item><msg_out:name>name</msg_out:name><msg_out:value>Piet</msg_out:value></msg_out:item></msg_out:itemsCollection></itemCollectionArray>');
testuser.soa_queue_pack.vul_multi_queue('TESTUSER.SOA_MULTI_QUEUE',l_message,5,l_recipients);
END;

This can for example be used in a PL/SQL table trigger in a different schema to enqueue a message on a certain status change or insert.

BPEL dequeueing and enqueueing

In BPEL you can specify the consumer as a property in the AQ adapter wizard for the dequeue operation.


For enqueueing, you can specify the enqueue priority, for example to make sure that messages which caused errors are picked up first once the problem is fixed.
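
The effect of the priority on the dequeue order can be imitated with a plain Java PriorityQueue. This is only a sketch of the ordering idea behind Sort_list => 'PRIORITY,ENQ_TIME', not of AQ itself. The class name, the payload strings and the priority values 5 and 1 are assumptions chosen for illustration; a lower number is dequeued earlier, as in AQ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical in-memory stand-in for a queue sorted on priority, then enqueue time.
class PriorityOrderSketch {

    static final class Message {
        final String payload;
        final int priority;     // lower number = dequeued earlier
        final long enqueueSeq;  // stands in for ENQ_TIME

        Message(String payload, int priority, long enqueueSeq) {
            this.payload = payload;
            this.priority = priority;
            this.enqueueSeq = enqueueSeq;
        }
    }

    // Same ordering idea as the sort_list: priority first, enqueue time second.
    static final Comparator<Message> PRIORITY_THEN_ENQ_TIME =
        Comparator.comparingInt((Message m) -> m.priority)
                  .thenComparingLong(m -> m.enqueueSeq);

    static List<String> dequeueOrder() {
        PriorityQueue<Message> queue = new PriorityQueue<>(PRIORITY_THEN_ENQ_TIME);
        // Two normal messages are waiting with the default priority used here (5).
        queue.add(new Message("message 2", 5, 1));
        queue.add(new Message("message 3", 5, 2));
        // The faulted message is put back last, but with a higher priority (1).
        queue.add(new Message("message 1 (faulted, re-enqueued)", 1, 3));

        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().payload);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(dequeueOrder());
    }
}
```

Although the faulted message is enqueued last, it comes out first, and the other messages keep their original order.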


To avoid loops and to throttle the rate at which the AQ adapter picks up messages, the following properties can be set in the composite.xml;

  <service name="soa_multi_queue_AQ"
           ui:wsdlLocation="soa_multi_queue_AQ.wsdl">
    <interface.wsdl interface="http://xmlns.oracle.com/pcbpel/adapter/aq/bpel/multiqueue/soa_multi_queue_AQ#wsdl.interface(Dequeue_ptt)"/>
    <binding.jca config="soa_multi_queue_AQ_aq.jca"/>
    <property name="minimumDelayBetweenMessages">10000</property>
    <property name="adapter.aq.dequeue.threads" type="xs:string" many="false">1</property>
  </service>

With a single dequeue thread and a minimum delay of 10000 ms, the above setting causes 1 message every 10 seconds to be picked up from the queue.
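
To get a feeling for what such a throttle means for a backlog, the arithmetic can be sketched as follows. The class and method names are made up, and the calculation assumes a single dequeue thread and a processing time that is negligible compared to the delay.

```java
// Back-of-the-envelope estimate; assumes one dequeue thread and negligible processing time.
class ThrottleEstimate {

    // Maximum number of messages picked up per hour for a given delay in milliseconds.
    static long messagesPerHour(long minimumDelayMillis) {
        return 3_600_000L / minimumDelayMillis;
    }

    // Seconds needed to work through a backlog of waiting messages.
    static long secondsToDrain(long backlog, long minimumDelayMillis) {
        return backlog * minimumDelayMillis / 1000L;
    }

    public static void main(String[] args) {
        System.out.println(messagesPerHour(10000));      // 360
        System.out.println(secondsToDrain(100, 10000));  // 1000
    }
}
```

With minimumDelayBetweenMessages set to 10000, at most 360 messages per hour are processed, so a backlog of 100 messages takes about 1000 seconds to clear.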

Demonstration

The following BPEL process has been created.
This process calls a HelloWorld webservice. This webservice can be shut down or retired in order to simulate a RemoteException (failure to invoke the webservice). The fault policy calls the custom Java action to retire the process. After retirement, the fault is rethrown and caught by the CatchAll branch, which re-enqueues the message with a higher priority. After fixing the error and activating the process again, the faulted message is picked up first and processed correctly.

First a message is put on the queue by executing;
DECLARE
l_recipients          testuser.soa_queue_pack.t_recipients_list;
l_message_id RAW(16);
l_message SYS.XMLType;
BEGIN
l_recipients := testuser.soa_queue_pack.t_recipients_list('FAULTTEST');
l_message := sys.XMLType.createXML('<itemCollectionArray xmlns:msg_out="http://test.ms/itemcollections" xmlns="http://test.ms/itemcollections"><msg_out:itemsCollection><msg_out:item><msg_out:name>name</msg_out:name><msg_out:value>Piet</msg_out:value></msg_out:item></msg_out:itemsCollection></itemCollectionArray>');
testuser.soa_queue_pack.vul_multi_queue('TESTUSER.SOA_MULTI_QUEUE',l_message,5,l_recipients);
END;

Then confirm the message is processed successfully.

Next, disable the HelloWorld service by retiring it and enqueue a message again. Check how CallHelloWorld handles the exception: the process is retired and the message is re-enqueued with a higher priority.



You can also confirm that after enabling the HelloWorld service and activating the CallHelloWorld process, the faulted message is picked up first. You can download the sample (BPEL/Java code) at http://dl.dropbox.com/u/6693935/blog/ExceptionTest.zip

Monday, June 4, 2012

Increasing BPEL performance

Introduction

Performance is a difficult topic with many variables. If a customer considers the overall performance of an application insufficient, the people creating and maintaining the application, database, operating system and server (amongst others) first have to determine the most important bottlenecks and then tackle them.

This post provides some suggestions on how to improve BPEL performance, should BPEL be a bottleneck. There are of course a lot more optimization options than the ones listed here, such as WebLogic tuning, JVM tuning and database tuning, to get a BPEL process to run more smoothly.

Large dehydration store

A large dehydration store is a common cause of performance issues. Asynchronous callbacks, for example, can become slow when the dehydration store contains a long history of process instances, and this can become a significant portion of the processing time. The solution is to purge old instances from the dehydration store.

To purge old instance data, several scripts can be used, depending on the precise version and the purging requirements. Truncating tables is drastic. Deleting instances is less drastic and can be done more selectively (do keep long running processes in mind if you use them!). Using the Oracle supplied scripts is the supported way of purging.

Truncate tables
The below script is fast and truncates several tables; http://www.emarcel.com/soa-suit/152-deleteinstancessoasuite11gwls (example 4). It removes all instances and frees the space previously occupied.

Delete instances older than a specific date
If you don't use long running processes and workflows, the below script can help; http://orasoa.blogspot.nl/2011/07/purging-soa-suite-11g-extreme-edition.html. It deletes instances before a specific date (but does not take into account that the process might still be running).
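
Why the running state matters when selecting instances by date can be shown with a small sketch. It is plain Java on made-up data, not the purge script itself: the Instance class, its fields and the sample dates are assumptions for illustration and do not reflect the SOA Infra schema.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified selection logic; not the real purge script.
class PurgeSelectionSketch {

    static final class Instance {
        final String id;
        final LocalDate created;
        final boolean running;

        Instance(String id, LocalDate created, boolean running) {
            this.id = id;
            this.created = created;
            this.running = running;
        }
    }

    // Date-only selection: also selects instances that are still running.
    static List<String> olderThan(List<Instance> all, LocalDate cutoff) {
        List<String> ids = new ArrayList<>();
        for (Instance i : all) {
            if (i.created.isBefore(cutoff)) {
                ids.add(i.id);
            }
        }
        return ids;
    }

    // Safer selection: older than the cutoff and no longer running.
    static List<String> olderThanAndFinished(List<Instance> all, LocalDate cutoff) {
        List<String> ids = new ArrayList<>();
        for (Instance i : all) {
            if (i.created.isBefore(cutoff) && !i.running) {
                ids.add(i.id);
            }
        }
        return ids;
    }

    static List<Instance> sample() {
        return Arrays.asList(
            new Instance("A", LocalDate.of(2011, 6, 1), false),
            new Instance("B", LocalDate.of(2011, 11, 15), true),  // long running process
            new Instance("C", LocalDate.of(2012, 3, 1), false));
    }

    public static void main(String[] args) {
        LocalDate cutoff = LocalDate.of(2012, 1, 1);
        System.out.println(olderThan(sample(), cutoff));             // [A, B]
        System.out.println(olderThanAndFinished(sample(), cutoff));  // [A]
    }
}
```

The date-only selection would also remove instance B, which was created before the cutoff but is still running.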

Oracle purging scripts
For more information, see:
https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=REFERENCE&id=1384379.1

Grant rights for execution

>sqlplus / as sysdba

Grant the privileges:

SQL> GRANT EXECUTE ON DBMS_LOCK TO DEV_SOAINFRA;

Grant succeeded.

SQL> GRANT CREATE ANY JOB TO DEV_SOAINFRA;

Grant succeeded.


Add the purge procedures;

cd <FMW_HOME>/rcuHome/rcu/integration/soainfra/sql/soa_purge/

>sqlplus DEV_SOAINFRA/<password>

SQL> @soa_purge_scripts.sql

Procedure created.
Function created.
Type created.
Type body created.
PL/SQL procedure successfully completed.
Package created.
Package body created.
SQL>


Execute the purge action (below example is single threaded)

DECLARE
MAX_CREATION_DATE timestamp;
MIN_CREATION_DATE timestamp;
batch_size integer;
max_runtime integer;
retention_period timestamp;
BEGIN
MIN_CREATION_DATE := to_timestamp('2010-01-01','YYYY-MM-DD');
MAX_CREATION_DATE := to_timestamp('2010-01-31','YYYY-MM-DD');
max_runtime := 60;
retention_period := to_timestamp('2010-01-31','YYYY-MM-DD');
batch_size := 10000;
soa.delete_instances(
min_creation_date => MIN_CREATION_DATE,
max_creation_date => MAX_CREATION_DATE,
batch_size => batch_size,
max_runtime => max_runtime,
retention_period => retention_period,
purge_partitioned_component => false);
END;
/


Rebuilding indexes and reclaiming free space
Based on http://www.emarcel.com/soa-suit/152-deleteinstancessoasuite11gwls and https://forums.oracle.com/forums/thread.jspa?threadID=2286397 I've created my own script to rebuild indexes and reclaim space for Oracle SOA Suite 11.1.1.6. Execute the script under the SOA Infra schema;

ALTER TABLE XML_DOCUMENT enable row movement;
ALTER TABLE XML_DOCUMENT shrink space compact;
ALTER TABLE XML_DOCUMENT shrink space;
ALTER TABLE XML_DOCUMENT shrink space cascade;
ALTER TABLE XML_DOCUMENT disable row movement;
ALTER INDEX DOC_STORE_PK rebuild online;

ALTER TABLE AUDIT_TRAIL enable row movement;
ALTER TABLE AUDIT_TRAIL shrink space compact;
ALTER TABLE AUDIT_TRAIL shrink space;
ALTER TABLE AUDIT_TRAIL shrink space cascade;
ALTER TABLE AUDIT_TRAIL disable row movement;
ALTER INDEX AT_PK rebuild online;

ALTER TABLE HEADERS_PROPERTIES enable row movement;
ALTER TABLE HEADERS_PROPERTIES shrink space compact;
ALTER TABLE HEADERS_PROPERTIES shrink space;
ALTER TABLE HEADERS_PROPERTIES shrink space cascade;
ALTER TABLE HEADERS_PROPERTIES disable row movement;

ALTER TABLE CUBE_SCOPE enable row movement;
ALTER TABLE CUBE_SCOPE shrink space compact;
ALTER TABLE CUBE_SCOPE shrink space;
ALTER TABLE CUBE_SCOPE shrink space cascade;
ALTER TABLE CUBE_SCOPE disable row movement;
ALTER INDEX CS_PK rebuild online;

ALTER INDEX REFERENCE_INSTANCE_CDN_STATE rebuild online;
ALTER INDEX REFERENCE_INSTANCE_CO_ID rebuild online;
ALTER INDEX REFERENCE_INSTANCE_ECID rebuild online;
ALTER INDEX REFERENCE_INSTANCE_ID rebuild online;
ALTER INDEX REFERENCE_INSTANCE_STATE rebuild online;
ALTER INDEX REFERENCE_INSTANCE_TIME_CDN rebuild online;

ALTER TABLE DLV_MESSAGE enable row movement;
ALTER TABLE DLV_MESSAGE shrink space compact;
ALTER TABLE DLV_MESSAGE shrink space;
ALTER TABLE DLV_MESSAGE shrink space cascade;
ALTER TABLE DLV_MESSAGE disable row movement;
ALTER INDEX DM_CONVERSATION rebuild online;

ALTER TABLE CUBE_INSTANCE enable row movement;
ALTER TABLE CUBE_INSTANCE shrink space compact;
ALTER TABLE CUBE_INSTANCE shrink space;
ALTER TABLE CUBE_INSTANCE shrink space cascade;
ALTER TABLE CUBE_INSTANCE disable row movement;
ALTER INDEX CI_CREATION_DATE rebuild online;
ALTER INDEX CI_CUSTOM3 rebuild online;
ALTER INDEX CI_ECID rebuild online;
ALTER INDEX CI_NAME_REV_STATE rebuild online;
ALTER INDEX CI_PK rebuild online;

ALTER TABLE INSTANCE_PAYLOAD enable row movement;
ALTER TABLE INSTANCE_PAYLOAD shrink space compact;
ALTER TABLE INSTANCE_PAYLOAD shrink space;
ALTER TABLE INSTANCE_PAYLOAD shrink space cascade;
ALTER TABLE INSTANCE_PAYLOAD disable row movement;
ALTER INDEX INSTANCE_PAYLOAD_KEY rebuild online;

ALTER TABLE XML_DOCUMENT_REF enable row movement;
ALTER TABLE XML_DOCUMENT_REF shrink space compact;
ALTER TABLE XML_DOCUMENT_REF shrink space;
ALTER TABLE XML_DOCUMENT_REF shrink space cascade;
ALTER TABLE XML_DOCUMENT_REF disable row movement;

ALTER INDEX COMPOSITE_INSTANCE_CIDN rebuild online;
ALTER INDEX COMPOSITE_INSTANCE_CO_ID rebuild online;
ALTER INDEX COMPOSITE_INSTANCE_CREATED rebuild online;
ALTER INDEX COMPOSITE_INSTANCE_ECID rebuild online;
ALTER INDEX COMPOSITE_INSTANCE_ID rebuild online;
ALTER INDEX COMPOSITE_INSTANCE_STATE rebuild online;

ALTER TABLE DOCUMENT_DLV_MSG_REF enable row movement;
ALTER TABLE DOCUMENT_DLV_MSG_REF shrink space compact;
ALTER TABLE DOCUMENT_DLV_MSG_REF shrink space;
ALTER TABLE DOCUMENT_DLV_MSG_REF shrink space cascade;
ALTER TABLE DOCUMENT_DLV_MSG_REF disable row movement;
ALTER INDEX DOC_DLV_MSG_GUID_INDEX rebuild online;

ALTER TABLE DLV_SUBSCRIPTION enable row movement;
ALTER TABLE DLV_SUBSCRIPTION shrink space compact;
ALTER TABLE DLV_SUBSCRIPTION shrink space;
ALTER TABLE DLV_SUBSCRIPTION shrink space cascade;
ALTER TABLE DLV_SUBSCRIPTION disable row movement;
ALTER INDEX DS_CONV_STATE rebuild online;
ALTER INDEX DS_CONVERSATION rebuild online;
ALTER INDEX DS_FK rebuild online;

ALTER TABLE AUDIT_DETAILS enable row movement;
ALTER TABLE AUDIT_DETAILS shrink space compact;
ALTER TABLE AUDIT_DETAILS shrink space;
ALTER TABLE AUDIT_DETAILS shrink space cascade;
ALTER TABLE AUDIT_DETAILS disable row movement;
ALTER INDEX AD_PK rebuild online;

ALTER TABLE WI_FAULT enable row movement;
ALTER TABLE WI_FAULT shrink space compact;
ALTER TABLE WI_FAULT shrink space;
ALTER TABLE WI_FAULT shrink space cascade;
ALTER TABLE WI_FAULT disable row movement;
ALTER INDEX WF_CRDATE_CIKEY rebuild online;
ALTER INDEX WF_CRDATE_TYPE rebuild online;
ALTER INDEX WF_FK2 rebuild online;

ALTER TABLE WORK_ITEM enable row movement;
ALTER TABLE WORK_ITEM shrink space compact;
ALTER TABLE WORK_ITEM shrink space;
ALTER TABLE WORK_ITEM shrink space cascade;
ALTER TABLE WORK_ITEM disable row movement;
ALTER INDEX WI_EXPIRED rebuild online;
ALTER INDEX WI_STRANDED rebuild online;

--LOBs

ALTER TABLE XML_DOCUMENT MODIFY lob (DOCUMENT) (shrink space);
ALTER TABLE AUDIT_DETAILS MODIFY lob (bin) (shrink space);
ALTER TABLE WI_FAULT MODIFY lob (MESSAGE) (shrink space);
ALTER TABLE CUBE_SCOPE MODIFY lob (SCOPE_BIN) (shrink space);
ALTER TABLE REFERENCE_INSTANCE MODIFY lob (STACK_TRACE) (shrink space);
ALTER TABLE REFERENCE_INSTANCE MODIFY lob (ERROR_MESSAGE) (shrink space);
ALTER TABLE COMPOSITE_INSTANCE_FAULT MODIFY lob (STACK_TRACE) (shrink space);
ALTER TABLE COMPOSITE_INSTANCE_FAULT MODIFY lob (ERROR_MESSAGE) (shrink space);
ALTER TABLE TEST_DEFINITIONS MODIFY lob (DEFINITION) (shrink space);

Audit trail logging

Production audit trail logging is synchronous. This causes a delay in the execution of a BPEL process. It can be made asynchronous;

https://supporthtml.oracle.com/epmos/faces/ui/km/SearchDocDisplay.jspx?_afrLoop=4432510064434000&type=DOCUMENT&id=1328382.1&displayIndex=14&_afrWindowMode=0&_adf.ctrl-state=169zpsk90p_218

(Oracle Support ID 1328382.1)

Log in to the Fusion Middleware Console (at http://soaserver:7001/em)
Navigate to Farm_soa_domain --> SOA --> (right-click on) soa-infra --> SOA Administration --> Common Properties --> More SOA Infra Advanced Configuration Properties (bottom of page)
Click on "Audit Config"
Set the following values and click on "Apply" afterwards

AuditConfig/compositeInstanceStateEnabled = false
AuditConfig/level = Production
AuditConfig/policies/Element_0/isActive = false
AuditConfig/policies/Element_0/name = Immediate
AuditConfig/policies/Element_1/isActive = true
AuditConfig/policies/Element_1/name = Deferred

(sample values)
AuditConfig/policies/Element_1/properties/Element_0/name = maxThreads
AuditConfig/policies/Element_1/properties/Element_0/value = 10
AuditConfig/policies/Element_1/properties/Element_1/name = flushDelay
AuditConfig/policies/Element_1/properties/Element_1/value = 5000
AuditConfig/policies/Element_1/properties/Element_2/name = batchSize
AuditConfig/policies/Element_1/properties/Element_2/value = 100

I did a small test to see how much performance improved by changing this setting. I created a synchronous HelloWorld BPEL process: a string as input, a Hello string as output. I first tested the process in the Fusion Middleware Control test console before starting a load test from SoapUI (to make sure the first request of the load test would not incur additional overhead for loading the process).

I used the Simple strategy, no randomness, 100 requests, 5 threads. Results are in ms.

Audit config    Audit level    min    max    avg
Default         Development     35    178    57.76
Default         Production      33    122    56.83
Default         Off             28    155    47.87
After change    Development     32    207    53.45
After change    Production      31    161    51.82
After change    Off             25    172    48.65

As you can see, the averages improve for the Development and Production audit settings. The improvement is roughly 7 to 9% in this test situation (57.76 to 53.45 ms for Development and 56.83 to 51.82 ms for Production). My process is however not heavy on audit logging (receive, assign, reply). The differences might have been greater if the process contained more activities. No difference was expected in the Off situation (the averages differ slightly, but not much) and, also as expected, the Production setting is faster than the Development setting. I tested this on a customer's development environment. My home system is of course a lot faster ;)
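
The relative improvement can be recomputed from the averages in the results above. The sketch below only does the arithmetic; the class and method names are made up for illustration.

```java
import java.util.Locale;

// Recomputes the relative improvement from the measured averages (in ms).
class AuditImprovement {

    // Positive result = faster after the change, negative = slower.
    static double improvementPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT, "Development: %.1f%%",
            improvementPercent(57.76, 53.45)));
        System.out.println(String.format(Locale.ROOT, "Production:  %.1f%%",
            improvementPercent(56.83, 51.82)));
        System.out.println(String.format(Locale.ROOT, "Off:         %.1f%%",
            improvementPercent(47.87, 48.65)));
    }
}
```

This gives about 7.5% for Development and 8.8% for Production, while the Off setting is about 1.6% slower, a small difference for which no cause was expected.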

Several other options

The following document describes several other options which can be used to increase BPEL performance: http://docs.oracle.com/cd/E21764_01/core.1111/e10108/bpel.htm. Some examples;

- inMemoryOptimization component property indicates the process is transient and dehydration can be performed on completion or not at all (depending on completionPersistPolicy). Reduces dehydration store traffic.

- validateXML (domain or composite property) allows validation of XML to be skipped. Reduces memory and CPU usage.




One setting to be careful with is the nonBlockingInvoke partnerlink property: new threads are spawned for invokes instead of using the same thread for all invokes. I've measured this several times on several machines in several configurations (Oracle VM, custom installed machine with separate Admin / Managed domain and a customer system). Performance was reduced in all cases, even when increasing the BPEL engine properties DispatcherInvokeThreads and DispatcherNonBlockInvokeThreads.



Of course, when using the DbAdapter a lot, it can be worthwhile to reconsider your polling strategy: DbAdapter settings, datasource settings, connection pool settings and connection factory settings. XA datasources, for example, cause some overhead, and often a regular datasource is sufficient. It can also be worthwhile to consider alternatives for polling, since polling is an active asynchronous process. A different trigger such as an EDN event (http://javaoraclesoa.blogspot.nl/2012/04/scheduling-edn-business-events-using.html) or a webservice call (http://javaoraclesoa.blogspot.nl/2012/02/scheduling-bpel-processes.html) can be more efficient (also considering clustering/load-balancing).


General suggestions

Process design is also important. As a rule of thumb, Assign activities are often relatively slow and transformations are fast. If parallel processing is possible, why not use it?

Try to make the messages processed to be as small as possible. Smaller is faster and you don't need to have every piece of data available all the time. 


If possible, avoid batch processes and process messages as they arrive. Picking up a file and processing it at once can seem a good idea at first, but what if the file offered becomes very large? If the customer says it won't, don't trust him/her!

Use queues whenever possible. This makes throttling and error handling easier (http://javaoraclesoa.blogspot.nl/2012/05/re-enqueueing-faulted-bpel-messages.html).

Conclusion

I've suggested several options for performance improvements of BPEL processes. There are a lot of factors involved in the total performance of an application and to improve performance, probably several specialisms are needed; a database performance expert is not necessarily good at tuning application servers. The developer also plays an important role in building processes which are efficient. We have to work together to get the best results.