<img height="1" width="1" style="display:none;" alt="" src="https://dc.ads.linkedin.com/collect/?pid=287945&amp;fmt=gif">

Why processes don't crunch data and why this matters

Posted by FlowWright on Aug 8, 2019 10:08:00 AM

Processes orchestrate your company's needs, often across multiple systems. When processes try to crunch data, they typically fail. As an example: a user who builds large workflows without using sub-workflows to break functionality down will end up frustrated. Below, we explain how to gather the needed data without breaking the process.

We have customers in the pharmaceutical space that process 700k+ prescriptions per day, and these prescriptions require optical character recognition (OCR) because they arrive as fax images. OCR is a very CPU-intensive task; imagine putting that load on the workflow server. If everything were server-based, the performance of the workflow server would degrade.

Our customers instead implement custom steps that call out to their own servers to perform the work. In the example above, a workflow step makes a REST API call to a server to perform the OCR on the fax and then goes to sleep. Once the server finishes the OCR, it calls back to FlowWright to resume processing from the sleeping step.
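As a rough sketch of that hand-off pattern, the custom step below submits the OCR job to a separate server along with a callback address, then signals the engine to park the step. The endpoints, the payload fields, and the "sleep" return convention are all illustrative assumptions, not FlowWright's actual API:

```python
import requests

# Hypothetical endpoints -- stand-ins for a customer's OCR server and the
# workflow engine's resume callback; not FlowWright's real API.
OCR_SERVICE_URL = "https://ocr.internal.example.com/jobs"
RESUME_CALLBACK_URL = "https://workflow.example.com/api/resume"

def ocr_step(instance_id: str, step_id: str, fax_image_url: str) -> str:
    """Custom step: offload CPU-heavy OCR to a separate server, then sleep.

    The OCR server is told where to call back; when it finishes, it POSTs
    the instance and step IDs to the callback URL so the engine can wake
    the sleeping step and continue from there.
    """
    requests.post(
        OCR_SERVICE_URL,
        json={
            "image_url": fax_image_url,
            "callback_url": RESUME_CALLBACK_URL,
            "instance_id": instance_id,
            "step_id": step_id,
        },
        timeout=10,
    )
    # Returning a sleep status (a convention assumed here) parks the step,
    # so none of the OCR CPU load lands on the workflow server.
    return "sleep"
```

The key design point is that the workflow server never touches the fax image itself; it only coordinates the job and waits for the callback.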

Another example of data working for companies can be found with our medical claims processing customers. They process a large volume of incoming claim files from medical institutions; this data is sensitive and must be HIPAA compliant. Data from these claims needs to be aggregated and sent to payers. A custom step uses a database table for configuration data describing how to process these files, and asks different servers to perform the actions based on the type of incoming claim.
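A minimal sketch of that configuration-driven dispatch is shown below, assuming a hypothetical claim_routes table that maps a claim type to the server responsible for it. The table name, columns, and endpoints are illustrative, not our customers' actual schema:

```python
import sqlite3
import requests

def route_claim(claim_type: str, claim_file: bytes,
                db_path: str = "routing.db") -> None:
    """Look up how to handle this claim type in a configuration table,
    then hand the file to the server responsible for it.

    Assumed table layout: claim_routes(claim_type TEXT, target_url TEXT).
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT target_url FROM claim_routes WHERE claim_type = ?",
            (claim_type,),
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        raise ValueError(f"No route configured for claim type {claim_type!r}")
    # The heavy aggregation happens on the target server,
    # not inside the workflow engine.
    requests.post(
        row[0],
        data=claim_file,
        headers={"Content-Type": "application/octet-stream"},
        timeout=30,
    )
```

Keeping the routing rules in a table means new claim types can be added by inserting a row, with no change to the workflow definition itself.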

We have seen instances where customers make the workflow itself process data. When that happens, the workflow instance can get very large, since its XML has to carry all of that data. Most workflows still use some data, for example to decide which path processing should follow, but bulk data does not belong inside the instance.

In cases where a workflow instance wakes up every hour to process data, FlowWright has a built-in feature to maintain performance. On the workflow's start step, you will find a checkbox called "Use single iteration".

By using this feature, FlowWright keeps data only for the last executed iteration of each step. Consider a workflow with 15 steps that runs every hour:

Period    # of execution iterations
1 hour    15
1 day     360
1 week    2,520

Without this option, a workflow instance accumulates execution iteration data for every run (15 iterations after an hour, 360 after a day, 2,520 after a week), and the larger the instance grows, the longer the engine takes to load and process it. Trimming execution iterations down to just the last one improves performance tremendously: the engine loads workflow instances faster, with a very small memory footprint, and its processing throughput increases.
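To make the trimming idea concrete, here is a schematic sketch, assuming a simple dictionary-shaped instance rather than FlowWright's actual XML format: after each run, all but the most recent iteration record is dropped for every step, so the serialized instance stays roughly constant in size instead of growing with every hourly run:

```python
def trim_iterations(instance: dict) -> dict:
    """Keep only the newest execution iteration per step.

    Schematic illustration of the "Use single iteration" effect: the
    instance no longer grows with every run, so it stays fast to load
    and cheap to hold in memory.
    """
    for step in instance["steps"]:
        # Drop everything except the last recorded iteration, if any.
        step["iterations"] = step["iterations"][-1:]
    return instance
```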

Have Questions? Let's Talk! 


Topics: Workflow Automation Software