CamundaCon 2020.2 Day 2: roadmap, business-driven DMN and ethical algorithms

I split off the first part of CamundaCon day 2 since it was getting a bit long: I had a briefing with Daniel Meyer earlier in the week on the new RPA integration, and had a lot of thoughts on that already. I rejoined for Camunda VP of Product Management Rick Weinberg’s roadmap presentation, which covered what’s coming in 2021. If you’re a Camunda customer, or thinking about becoming one, you should check out the replay of his session if you missed it. Expect to see updates to decision automation, developer experience, process monitoring and interoperability.

I tuned in to the business architecture track for a presentation by David Ibl, Enterprise Architect at LV 1871 (a German insurance company) on how they enabled their business specialists to perform decision model simulation and test case definition using their own DMN Manager based on the Camunda modeler toolkit. Their business people were already using BPMN for modeling processes, but were modeling business decisions as part of the process, and needed to use externalize the rules from the processes in order to simplify the processes. This was initially done by moving the decisions to code, then calling that from within the process, but that made the decisions much less transparent to the business. Now, the business specialists model both BPMN and DMN in Signavio, which are then committed to git; these models are then pulled from git both for deployment and for testing and simulation directly by the business people. You can read a much better description of it written by David a few months ago. A good example (and demo) on how business people can model, test and simulate their own decisions as well as processes. And, since they’re committed to open source, you can find the code for it on github.

I also attended a session by Omid Tansaz of Nexxbiz, a Camunda consulting services partner, on their insurance process monitoring capability that allows systems across the entire end-to-end chain of insurance processes to be monitored in a consolidated fashion. This includes broker systems, front- and back-off systems within the insurer, as well as microservices. They were already using Camunda’s BPM engine, and started using Optimize for process visualization since Optimize 3.0 can include external event sources (from all of the other systems in the end-to-end process) as well as the Camunda BPM processes. This is one of the first case studies of the external event capability in Optimize, since that was only released in April, and show the potential for having a consolidated view across multiple systems: not just visibility, but compliance auditing, bottleneck analysis, and real-time issue prevention.

The conference closed with a keynote by Michael Kearns from the University of Pennsylvania on the science of socially-aware algorithm design. Ethical algorithms (the topic of his recent book written with Aaron Roth) are not just an abstract concept, but impact businesses from risk mitigation through to implementation patterns. There are many cases of how algorithmic decision-making shows definite biases, and instead of punting to legal and regulatory controls, their research looks at technical solutions to the problem in the form of better algorithms. This is a non-trivial issue, since algorithms often have outcomes that are difficult to predict, especially when machine learning is involved. This is exactly why software testing is often so bad (just to inject my own opinion): developers can’t or don’t consider the entire envelope of possible outcomes, and often just test the “happy path” and a few variants.

Kearns’ research proposes embedding social values in algorithms: privacy, fairness, accountability, interpretability and morality. This requires a definition of what these social values mean in a precise mathematical. There’s already been some amount of work on privacy by design, spearheaded by the former Ontario Information and Privacy Commissioner Ann Cavoukian, since privacy is one of the better-understood algorithmic concepts.

Kearns walked us through issues around algorithmic privacy, including the idea that “anonymized” data often isn’t actually anonymized, since the techniques used for this assume that there is only a single source of data. For example, redacting data within a data set can make it anonymous if that’s the only data set that you have; as soon as other data sets exist that contain one or more of the same unredacted data values, you can start to correlate the data sets and de-anonymize the data. In short, anonymization doesn’t work, in general.

He then looked at “differential privacy”, which compares the results of an algorithm with and without a specific person’s data: if an observer can’t tell the discern between the outcomes, then the algorithm is preserving the privacy of that person’s data. Differential privacy can be implemented by adding a small amount of random noise to each data point, which makes is impossible to figure out the contribution of any specific data point., and the noise contributions will cancel out of the results when a large number of data points are analyzed. Problems can occur, however, with data points that have very small values, which may be swamped by the size of the noise.

He moved on to look at algorithmic fairness, which is trickier: there’s no agreed-upon definition of fairness, and we’re only just beginning to understand tradeoffs, e.g., between race and gender fairness, or between fairness and accuracy. He had a great example of college admissions based on SAT and GPA scores, with two different data sets: one for more financially-advantaged students, and the other for students from modest financial situations. The important thing to note is that the family financial background of a student has a strong correlation with race, and in the US, as in other countries, using race as an explicit differentiator is not allowed in many decisions due to “fairness”. However, it’s not really fair if there are inherent advantages to being in one data set over the other, since those data points are artificially elevated.

There was a question at the end about the role of open source in these algorithms: Kearns mentioned OpenDP, an open source toolset for implementing differential privacy, and AI Fairness 360, an open source toolkit for finding and mitigating discrimination and bias in machine learning models. He also discussed some techniques for determining if your algorithms adhere to both privacy and fairness requirements, and the importance of auditing algorithmic results on an ongoing basis.

CamundaCon 2020.2 Day 2 opening keynotes: BPM patterns and RPA integration

I’m back at CamundaCon 2020.2 for day 2, which kicked off with a keynote by Camunda co-founder and developer advocate Bernd Rücker. He’s a big fan of BPM and graphical models (obviously), but not of low-code: his view is that the best way to build robust process-based applications is with a stateful workflow engine, a graphical process modeler, and code. In my opinion, he’s not wrong for complex core applications, although I believe there are a lot of use cases for low code, too. He covered a number of different implementation architectures and patterns with their strengths and weaknesses, especially different types of event-driven architectures and how they are best combined with workflow systems. You can see the same concepts covered in some of his previous presentations, although every time I hear him give a presentation, there are some interesting new ideas. He’s currently writing a book call Practical Process Automation, which appears to be gathering many of these ideas together.

CTO Daniel Meyer was up next with details of the upcoming 7.14 release, particularly the RPA integration that they are releasing. He positions Camunda as having the ability to orchestrate any type of process, which may include endpoints (i.e., non-Camunda components for task execution) ranging from human work to microservices to RPA bots. Daniel and I have had a number of conversations about the future of different technologies, and although we have some disagreements, we are in agreement that RPA is an interim technology: it’s a stop-gap for integrating systems that don’t have APIs. RPA tends to be brittle, as pointed out by Camunda customer Goldman Sachs at the CamundaCon Live earlier this year, with a tendency to fail when anything in the environment changes, and no proper state maintained when failures occur. Daniel included a quote from a Forrester report that claims that 45% of organizations using RPA deal with breakage on at least a weekly basis.

As legacy systems are replaced, or APIs created for them, RPA bots will gradually be replaced as the IT infrastructure is modernized. In the meantime, however, we need to deal with RPA bots and limit the technical debt of converting the bots in the future when APIs are available. Camunda’s solution is to orchestrate the bots as external tasks; my advice would also be to refactor the bots to push as much process and decision logic as possible into the Camunda engine, leaving only the integration/screen scraping capabilities in the bots, which would further reduce the future effort required to replace them with APIs. This would require that RPA low-code developers learn some of the Camunda process and decision modeling, but this is done in the graphical modelers and would be a reasonable fit with their skills.

The new release includes task templates for adding RPA bots to processes in the modeler, plus an RPA bridge service that connects to the UiPath orchestrator, which in turn manages UiPath bots. Camunda will (I assume) extend their bridge to integrate with other RPA vendors’ orchestrators in the future, such as Automation Anywhere and Blue Prism. What’s interesting, however, is that the current architecture of this is that the RPA task in a process is an external task — a task that relies on an external agent to poll for work, rather than invoking a service call directly — then the Camunda RPA bridge invokes the RPA vendor’s orchestrator, then the RPA bots poll their own orchestrator. If you are using a different RPA platform, especially one that doesn’t have an orchestrator, you could configure the bots to poll Camunda directly at the external task endpoint. In short, although the 7.14 release will add some capabilities that make this easier (for enterprise customers only), especially if you’re using UiPath, you should be able to do this already with any RPA product and external tasks in Camunda.

Daniel laid out a modernization journey for companies with an existing army of bots: first, add in monitoring of the bot activities using Camunda Optimize, which now has the capability to monitor external events, in order to gain insights into the existing bot activities across the organization. Then, orchestrate the bots using the Camunda workflow engine (both BPMN and DMN models) using the tools and techniques described above. Lastly, as API replacements become available for bots, switch them out, which could require some refactoring of the Camunda models. There will likely be some bots left hanging around for legacy systems that are never going to have APIs, but that number should dwindle over the years.

Daniel also teased some of the “smart low-code” capabilities that are on the Camunda roadmap, which will covered in more detail later by Rick Weinberg, since support for RPA low-code developers is going to push them further into this territory. They’re probably never going to be a low-code platform, but are becoming more inclusive for low-code developers to perform certain tasks within a Camunda implementation, while professional developers are still there for most of the application development.

This is getting a bit long, so I’m going to publish this and start a new post for later sessions. On a technical note, the conference platform is a bit wonky on a mobile browser (Chrome on iPad); although it’s “powered by Zoom”, it appears in the browser as an embedded Vimeo window that sometimes just doesn’t load. Also, the screen resolution appears to be much lower than at the previous CamundaCon, with the embedded video settings maxing out at 720p: if you compare some of my screen shots from the two different conferences, the earlier ones are higher resolution, making them much more readable for smaller text and demos. In both cases, I was mostly watching and screen capping on iPad.

CamundaCon 2020.2 Day 1

I listened to Camunda CEO Jakob Freund‘s opening keynote from the virtual CamundaCon 2020.2 (the October edition), and he really hit it out of the park. I’ve known Jakob a long time and many of our ideas are aligned, and there was so much in particular in his keynote that resonated with me. He used the phrase “reinvent [your business] or die”, whereas I’ve been using “modernize or perish”, with a focus not just on legacy systems and infrastructure, but also legacy organizational culture. Not to hijack this post with a plug for another company, but I’m doing a keynote at the virtual Bizagi Catalyst next week on aligning intelligent automation with incentives and business outcomes, which looks at issues of legacy organizational culture as well as the technology around automation. Processes are, as he pointed out, the algorithms of an organization: they touch everything and are everywhere (even if you haven’t automated them), and a lot of digital-native companies are successful precisely because they have optimized those algorithms.

Jakob’s advice in achieving reinvention/modernization is to do a gradual transformation, not try to do a big bang approach that fails more often than it succeeds, and positions Camunda (of course) as the bridge between the worlds of legacy and new technology. In my years of technology consulting on BPM implementations, I also recommend using a gradual approach by building bridges between new and old technology, then swapping out the legacy bits as you develop or buy replacements. This is where, for example, you can use RPA to create stop-gap task automation with your existing legacy systems, then gradually replace the underlying legacy or at least create APIs to replace the RPA bots.

The second opening keynote was with Marco Einacker and Christoph Anzer of Deutsche Telekom, discussing how they are using process and task automation by combining Camunda for the process layer and RPA at the task layer. They started out with using RPA for automating tasks and processes, ending up with more than 3,000 bots and an estimated €93 million in savings. It was a very decentralized approach, with initially being created by business areas without IT involvement, but as they scaled up, they started to look for ways to centralize some of the ideas and technology. First was to identify the most important tasks to start with, namely those that were true pain points in the business (Einacker used the phrase ” look for the shittiest, most painful process and start there”) not just the easy copy-paste applications. They also looked at how other smart technologies, such as OCR and AI, could be integrated to create completely unattended bots that add significant value.

The decentralized approach resulted in seven different RPA platforms and too much process automation happening in the RPA layer, which increased the amount of technical debt, so they adapted their strategy to consolidate RPA platforms and separate the process layer from the bot layer. In short, they are now using Camunda for process orchestration, and the RPA bots have become tasks that are orchestrated by the process engine. Gradually, they are (or will be) replacing the RPA bots with APIs, which moves the integration from front-end to back-end, making it more robust with less maintenance.

I moved off to the business architecture track for a presentation by Srivatsan Vijayaraghavan of Intuit, where they are using Camunda for three different use cases: their own internal processes, some customer-facing processes for interacting with Intuit, and — most interesting to me — enabling their customers to create their own workflows across different applications. Their QuickBooks customers are primarily small and mid-sized business that don’t have the skills to set up their own BPM system (although arguably they could use one of the many low-code process automation platforms to do at least part of this), which opened the opportunity for Intuit to offer a workflow solution based on Camunda but customizable by the individual customer organizations. Invoice approvals was an obvious place to start, since Accounts Payable is a problem area in many companies, then they expanded to other approval types and integration with non-Intuit apps such as e-signature and CRM. Customers can even build their own workflows: a true workflow as a service model, with pre-built templates for common workflows, integration with all Intuit services, and a simplified workflow designer.

Intuit customers don’t interact directly with Camunda services; Camunda is a separately hosted and abstracted service, and they’ve used Kafka messages and external task patterns to create the cut-out layer. They’ve created a wrapper around the modeling tools, so that customers use a simplified workflow designer instead of the BPMN designer to configure the process templates. There is an issue with a proliferation of process definitions as each customer creates their own version of, for example, an invoice approval workflow — he mentioned 70,000 process definitions — and they will likely need to do some sort of automated cleanup as the platform matures. Really interesting use case, and one that could be used by large companies that want their internal customers to be able to create/customize their own workflows.

The next presentation was by Stephen Donovan of Fidelity Investments and James Watson of Doculabs. I worked with Fidelity in 2018-19 to help create the architecture for their digital automation platform (in my other life, I’m a technical architecture/strategy consultant); it appears that they’re not up and running with anything yet, but they have been engaging the business units on thinking about digital transformation and how the features of the new Camunda-based platform can be leveraged when the time comes to migrate applications from their legacy workflow platform. This doesn’t seem to have advanced much since they talked about it at the April CamundaCon, although Donovan had more detailed insights into how they are doing this.

At the April CamundaCon, I watched Patrick Millar’s presentation on using Camunda for blockchain ledger automation, or rather I watched part of it: his internet died partway through and I missed the part about how they are using Camunda, so I’m back to see it now. The RiskStream Collaborative is a not-for-profit consortium collaborating on the use of blockchain in the insurance industry; their parent organization, The Institutes, provides risk management and insurance education and is guided by senior executives from the property and casualty industry. To copy from my original post, RiskStream is creating a distributed network platform, called Canopy, that allows their insurance company members to share data privately and securely, and participate in shared business processes. Whenever you have multiple insurance companies in an insurance process, like a claim for a multi-vehicle accident, having shared business processes — such as first notice of loss and proof of insurance — between the multiple insurers means that claims can be settled quicker and at a much lower cost.

I do a lot of work with insurance companies, as well as with BPM vendors to help them understand insurance operations, and this really resonates: the FNOL (first notice of loss) process for multi-party claims continues to be a problem in almost every company, and using enterprise blockchain to facilitate interactions between the multiple insurers makes a lot of sense. Note that they are not creating or replacing claims systems in any way; rather, they are connecting the multiple insurance companies, who would then integrate Canopy to their internal claims systems such as Guidewire.

Camunda is used in the control framework layer of Canopy to manage the flows within the applications, such as the FNOL application. The control framework is just one slice of the platform: there’s the core distributed ledger layer below that, where the blockchain data is persisted, and an integration layer above it to integrate with insurers’ claims systems as well as the identity and authorization registry.

There was a Gartner keynote, which gave me an opportunity to tidy up the writing and images for the rest of this post, then I tuned back in for Niall Deehan’s session on Camunda Hackdays over on the community tech track, and some of the interesting creations that come out of the recent virtual version. This drives home the point that Camunda is, at its heart, open source software that relies on a community of developer both within and outside Camunda to extend and enhance the core product. The examples presented here were all done by Camunda employees, although many of them are not part of the development team, but come from areas such as customer-facing consulting. These were pretty quick demos so I won’t go into detail, but here are the projects on Github:

If you’re a Camunda customer (open source or commercial) and you like one of these ideas, head on over to the related github page and star it to show your interest.

There was a closing keynote by Capgemini; like the Gartner keynote, I felt that it wasn’t a great fit for the audience, but those are my only real criticisms of the conference so far.

Jakob Freund came back for a conversation with Mary Thengvall to recap the day. If you want to see the recorded videos of the live sessions, head over to the agenda page and click on Watch Now for any session.

There’s a lot of great stuff on the agenda for tomorrow, including CTO Daniel Meyer talking about their new RPA orchestration capabilities, and I’ll be back for that.

CamundaCon Live 2020 – Day 2: blockchain interrupted, and customer case studies with Capital One and Goldman Sachs

It was a tough choice with the first post-break session at CamundaCon Live: I wanted to listen in on Rick Weinberg, Camunda VP of Products, as he talked about their product direction roadmap, but I decided on the presentation by Muthukumar Vaidhianathan and Tandeep Sidhu from Capital One instead, focused on their process automation modernization with Camunda. I’ll catch the recorded version of Weinberg’s session later, along with a few others that I want to see.

Vaidhianathan and Sidhu talked about some of the problems that they were having with case management using their legacy infrastructure, and how they selected and deployed Camunda. Sidhu talked quite a bit about getting the technical teams up and running with Camunda, and some of the team and DevOps scalability issues. They use a single consolidated UI (actually, one each for front office and back office workers), then a case orchestration layer that connects to the multiple Camunda-based applications: Bank Case Manager, Fraud, Collections, and Legal.

Vaidhianathan then took us through their Legal Case Management application, and how they use BPMN and DMN for automating with complex business rules. Decision tables are used to decide, for example, on the course of action for a particular case based on the data about the case. They feel that it’s important for product owners to own their BPMN and DMN models, while building the strong relationships between developers and the product owners on the business side. Some good lessons learned from their journey at the end.

Captial One also presented at the Camunda Day in NYC last summer, but talked about how they organized a Camunda hackathon rather than the business applications — I think they were much earlier in their journey then, and weren’t ready to talk about business applications yet.

I’ve been interested in blockchain and BPM for a while now, and listened in on Patrick Millar of the non-profit consortium RiskStream Collaborative as he presented on ledger automation using Camunda. Their parent organization is The Institutes, which provides risk management and insurance education, and is guided by senior executives from the property and casualty industry. RiskStream is creating a distributed network platform, called Canopy, that allows their insurance company members to share data privately and securely, and participate in shared business processes. Whenever you have multiple insurance companies in an insurance process, like a claim for a multi-vehicle accident, having shared business processes — such as first notice of loss and proof of insurance — between the multiple insurers means that claims can be settled quicker and at a much lower cost.

In addition to private lines of insurance, they are also looking at applications in commercial lines and reinsurance. There are pretty significant savings if they get 100% market adoption (not an unrealistic goal since the market is made up of their members): $300M in personal lines auto for FNOL and proof of insurance, $384 in commercial lines for certificates of insurance, and $97M in reinsurance for placement.

Unfortunately, we lost the audio/video connection to the presenter in the middle of the session (yes, this really is happening live, and shit happens) and they had to close the session, just as I was really getting into the topic. Also, he never got to the part about how they’re using Camunda. We’ve already heard from Camunda that they will have him record his presentation and have it added to the on-demand videos.

The next session brought both tracks back together for a panel on digital transformation, featuring Mike Ryan, VP Software Engineering at JP Morgan Chase; Christine Yen, CEO of Honeycomb; and Camunda’s Bernd Rücker. Mary Thengvall, Camunda’s Director of Developer Relations, moderated the panel. Here’s some of the points that came up:

  • We’ve built up these massive monolithic systems over the last few decades, but now need to break up these legacy pieces while still supporting them, all while adding new functionality in a more agile manner. This is making it difficult for many of the established companies to compete with the new competitors, such as older financial services companies competing with fintechs. (By the way, I talked about this on a recent webinar, and see it with my own enterprise customers)
  • There’s a need to protect — and improve — the customer experience while the monolith is being replaced piece by piece. In my opinion, “big bang” as a deployment model is dead: gradual migrations without disrupting the user experience should be the general method.
  • There has been a lot of change in roles and communication within organizations. DevOps is part of that, which changes what people are responsible for, and also the concept of process owners being responsible for the end-to-end metrics. Microservices (and service-oriented architecture in general) means that systems can be more targeted since they’re assembled from shared services for a unique purpose.
  • There are a lot of great tools and methodologies now, but many companies are not yet ready to implement them. Microservices, serverless architectures, etc. are changing how we design systems for future state.
  • The current pandemic crisis is driving some amount of digital transformation, and companies are having to decide what is critical for survival now versus what can wait. Ryan said that JP Morgan sent 300,000 employees home to work, and they are rethinking how productive that people can be in distributed environments, and how teams can still work collaboratively. As a financial company, they need to keep serving customers who need access to financial transactions, and are probably having to scale up their online customer experiences to accommodate. Yen believes there is as much of a focus on how people work together remotely to build applications, as there is on the technology itself.

The panel felt a bit unfocused, and wasn’t as engaging as yesterday’s panel. Possibly I’m not quite as fresh after live-blogging 6,000 words over two days.

The last presentation of the day, and of CamundaCon Live 2020, was Richard Tarling, co-head of Workflow Engineering at Goldman Sachs, on the process automation platform that they built with Camunda at its core. He is focused on workflow at enterprise scale: they have 60,000 users (the entire firm) with 8,000 daily users, participating in 10M new activities and 250M decisions per day spread over 650 compute servers. This includes 3,000 process models, 1,000 decision models, 6,000 forms models and 125 RPA bot automations, all created and supported by 1,500 platform developers. Yowsa.

Their goal with creating their digital automation platform was to accelerate developers, but also support non-technical/citizen developers. This means that they embraced model-driven development by creating six key design tools:

  • Workflow Control Centre
  • Workflow Application Project Modeler
  • Workflow Designer, based on bpmn.io
  • Data Modeler
  • Decision Designer, based on dmn.io
  • Forms Designer

They built some engine extensions for their implementation, specifically around the using a stripped-down embedded BPM engine to implement decision flows with high-performance, plus the creation of an open-source jDMN execution engine.

He walked through their overall design-time and execution platform architecture, and some of the things that they did to maximize performance while maximizing (developer) usability. Decision services is a big part of their platform, and he discussed their enterprise-wide decision services execution platform. Their architecture wasn’t born in the cloud, but he feels that their use of microservices design principles means that they could move into the cloud in a straightforward manner.

They have a number of different UI personas that they’ve developed for, resulting in a “zero inbox” persona versus a “power user”. They’ve recently redesigned these UIs with a mobile-first focus. They’re also supporting citizen developers for creating their own case management applications through a combination of model-driven design and pre-built components, plus a governed software development lifecycle built on GitLab. They’ve also built their own provisioning, runtime management and monitoring tools — they even use a BPMN-based process for provisioning.

If you’re building your own large-scale digital process/decision automation platform, definitely go and watch the replay of this presentation — Tarling has been in the trenches and has a ton of great advice. Lots of great Q&A at the end, too.

@phoebe_cat

Jakob Freund came back briefly to chat with Mary Thengvall and wrap the conference: thanks for giving a shout out to this blog (and my cat, who made a brief appearance on the Slack channel). And congrats to all for a great virtual conference that was much, much more than a long series of webinars.

I mentioned on Twitter today that CamundaCon is now the gold standard for online conferences: all you other vendors who have conferences coming up, take note. I believe that the key contributers to this success are live (not pre-recorded) presentations, use of a discussion platform like Slack or Discord alongside the broadcast platform, full engagement of a large number of company participants in the discussion platform before/during/after presentations, and fast upload of the videos for on-demand watching. Keep in mind that a successful conference, whether in-person or online, allows people to have unscripted interactions: it’s not a one-way broadcast, it’s a big messy collaborative conversation.

CamundaCon Live 2020 – Day 2: Microservices Orchestration, new stuff from Camunda, and legacy BPM migration

Day 2 of CamundaCon Live kicked off with Camunda co-founder Bernd Rücker talking about microservices orchestration and integation using workflow automation. This is a common theme for him, and I’ve seen earlier versions of this presentation, but he always brings something fresh to the discussion. He discussed reactive applications that are responsive, resilient, elastic and message-driven, then covered different styles of event-driven architecture.

He gave a (live) demo of autonomous services communicating using Kafka, and showed the issue with peer-to-peer choreography: there is no sense of the end-to-end orchestration to ensure that all services that should have run did actually run. He created an event-based process in Camunda Optimize that modeled the expected end-to-end process, and now by connecting that to the Kafka messages, he had a visualization of the workflow that he defined that showed what happens when one service isn’t running: effectively, the virtual workflow is stuck at the previous service since it does not receive a message that the (stopped) service has picked up the messages.

One solution is to extract the end-to-end responsibility into its own service: really, this implies some level of orchestration via commands rather than purely reacting to events, even if it’s not a completely tightly-coupled workflow. If you use an engine like Camunda to do that top-level orchestration, then you can move the monitoring of the process within that engine (Cockpit rather than Optimize) although it’s likely that anyone using an event-based architecture is going to be looking at an event monitoring system like Optimize as well. You can see his slides below, and the video will be available on the CamundaCon Live hub probably by the time that I publish this post.

The morning session continued with CTO Daniel Meyer on some of the new product capabilities. Camunda’s goal has moved from just being a BPM engine for Java developers to a much broader orchestration platform that can integrate any technology stack and any endpoints.

He introduced a new distribution called Camunda Run (or Lil’ Camboot, as Niall Deehan calls it) provides a lightweight package (50MB) that includes the BPMN and DMN workflow and decision engines, Cockpit, Tasklist and the REST API. It can even be run in headless mode, which disables the web apps, if you just want the engines. It’s Open API enabled, CORS enabled, and SSL enabled out of the box. He gave a quick demo of downloading, starting and running Camunda Run: it’s pretty familiar if you’ve spent any time with Camunda, and it starts fast. From the blog post announcement, the target audience for Run is if at least one of the following is true:

  • You need a standalone process engine accessible via REST API
  • You don’t have extensive Java knowledge (or none at all) but still want to use Camunda BPM
  • You don’t want to configure an application server yourself
  • You want to configure everything in one place
  • You just want to Run Camunda BPM

Meyer also talked about Camunda Optimize, specifically the event-based process monitoring. We saw a bit of that yesterday in Felix Müller’s presentation, and I had a more complete view of the event-based features of Optimize a few weeks ago on the 3.0 release webinar. Basically, you add the event source to Optimize (such as Kafka), and Optimize exposes messages and allows them to be attached to the entry/exit points of elements on a BPMN diagram that represents the event-driven process. They are offering a 30-day free trial for Optimize now if you want to try it out.

Meyer’s third topic was about process automation as a service via Camunda Cloud, which is powered by Zeebe (rather than Camunda BPM). Having cloud-native Zeebe behind the scenes means that it’s highly scalable and fault-tolerant, and uses pub-sub orchestration to let you include endpoints from anywhere. He demonstrated how to spin up a new Zeebe cluster, then deploy a BPMN model that was created in the Zeebe Modeler and start instances of the process using the zbctl command line. These instances were then visible in Camunda Operate (the Zeebe process monitoring tool), and he ran JavaScript workers and published messages to complete tasks in the process and show the instance progressing through the process model. There’s a free trial for Camunda Cloud, and an early access version for $699/month that includes access to larger clusters and technical support.

He fielded some questions that came up on the Slack workspace during his talk. Moving from an existing Camunda BPM implementation to Camunda Run is apparently as easy as just redirecting to the new application server. You can’t use Java delegates, but will have to switch those out for external tasks. There was a question about BPM versus Zeebe, which I think is a question that a lot of Camunda customers have: although most are likely familiar with the technical and functional differences, there is an open question of whether Camunda will continue to support two workflow engines in the future, and if they are going to shift focus more towards Zeebe use cases.

The morning finished by breaking out into two tracks; I stayed with the customer presentations rather than the technical breakout to hear some of the case studies. The one that I was most interested in was Fareed Saeed, head of Product and Tools for Advanced Process Solutions at Fidelity Investments, talking about migrating their monolithic legacy BPM to Camunda, in part because I did some early technical architecture consulting with them on their digital process automation platform over a year ago, although I’m not involved at this time. For those of you who know me mostly through this blog and as an independent industry analyst, you may not be aware that the other half of my business is as a consultant to large enterprises, mostly financial services and insurance, on technical architecture and strategy, or anything else to help make their process-centric implementation projects a success.

James Watson of Doculabs, who advised Fidelity on migration strategies, joined him on the discussion. Saeed talked about their current home-built workflow system, which runs thousands of different processes for most of their back office operations, and the need to move away from monolithic architecture and fragile, non-agile systems to a more flexible platform. This talk was not about the architecture or platform, but about the migration planning and execution: a key subject for any large enterprise moving off a legacy platform, but one that is often not fully considered during new digital automation platform implementation.

There are a few different strategies for migrating process-based applications, and it’s not the same story for each process. Watson shared his thoughts on this (see the slide at right), but this is my take on it:

  • High-volume processes, that usually represent a smaller number of process models but most of the transaction volume, are usually rewritten from scratch while incorporating some degree of re-engineering and process improvement along the way. These are the core business processes that need to be done right, and will most benefit from the more agile and scalable new platform.
  • Lower volume processes can be reviewed to see if they’re still required, may possible be combined into similar processes, then a straightforward “lift and shift” rewrite done to just duplicate the functionality as is. In short, these aren’t worth the time to do the re-engineering unless there are obvious wins, since the volume is relatively low. These are also candidates for low-code business-led development if that’s available on the automation platform, rather than the professional development teams required for the high-volume transactional processes.
  • Very low volume processes can be retired, especially if their functionality can be rolled into processes in one of the first two categories.

Although they are looking at a “factory model” for some level of automation around the migration, Saeed believes that this is an opportunity to re-engineer the processes rather than just rewriting the same (broken) process on a new platform. They want to have smaller, distributed groups for developing and delivering new applications, which means that they need to have the right governance and standards in place to support a distributed model. He sees the need for early pilots and successes to allow everyone to see how this can work, and learn how to make it successful. A strong diverse team of business leaders is also a plus, since there will be some degree of pain in the business units as the migration happens.

That’s it for the morning of Day 2, they must have read my comments yesterday and actually made sure that we finished on time so that we get our 15 minute lunch break. 🙂 I’ll be back for the afternoon to finish off CamundaCon Live 2020.





CamundaCon Live 2020 – Day 1: Optimize, RPA, and how 24 Hour Fitness executes 5B process nodes per month

We continued the first day of CamundaCon Live (virtual) 2020 with Felix Mueller, senior product manager, presenting on how to use Camunda Optimize for driving continuous improvement in processes. I attended the Optimze 3.0 release webinar a couple of weeks ago, and saw some of the new things that they’re doing with monitoring and optimization of event-based processes — this allows processes that are not part of Camunda to be included in Optimize. The CamundaCon session started with a broader view of Optimize functionality, showing how it collects information then can be used for root cause analysis of process bottlenecks as well as displaying realtime metrics. They have some good case studies for Optimize, including insurance provider Visana Group.

He then moved to show the event-based process monitoring, and how Optimize can ingest and aggregate information from any external system with a connector, such as RabbitMQ (which they have built). His demo showed a customer onboarding process that could be triggered either by an online form that would be a direct Camunda process instantiation, or via a mailed-in form that was scanned into another system that emitted an event that would trigger the process.

It was very obvious that this was a live presentation, because Mueller was scrambling against the clock since the previous session went a bit long, having to speed through his demo and take a couple of shortcuts. Although you might think of this as a logistical “bug”, I maintain that it’s an interactivity “feature”, and made the experience much closer to an in-person conference than a set of pre-recorded presentations that were just queued up in sequence.

This was followed by a presentation by Kris Barczynski of Nokia Bell Labs about a really interesting use case: they are using Camunda to guide visiting groups on tours through the Nokia Campus customer experience spaces, and interact with devices including the guests’ wearables, drones and robots. Visitors are welcomed and guided by a robot, and they can interact with voice-controlled drones; Camunda is orchestrating the processes behind the scenes. He talked about some of their design decisions, such as using Camunda JavaScript workers to call external services, and building a custom Android app. Really interesting combination of physical and virtual processes.

Next was a panel discussion on the future of RPA, with Vittorio Dal Bianco of Nokia, Marco Einacker of Deutsche Telekom, Paul Jones of NatWest Group, and Camunda CEO Jakob Freund, moderated by Jason Bloomberg of Intellyx Research. The three customer presenters are involved with the RPA initiatives at their own organizations, and also looking at how to integrate that with their Camunda processes. Panels are always a challenge to live-blog, but here’s some of the points discussed (attributed where I remembered):

  • The customer panelists agreed that RPA has allowed people to move to more interesting/valuable work, rather than doing routine tasks such as copying and pasting between application screens. Task automation through RPA reduces resources/costs, decreases cycle time, and also improves quality/compliance.
  • RPA is a “short-term bandaid” driven from outside the IT organization in order to get some immediate efficiency benefits. It’s maintenance-intensive, since any changes to the appliations being integrated means that the bots need to be reprogrammed. Deutsche Telekom is moving from RPA front-end integration/automation to drive the more strategic BPMS/API automation, so sees that RPA has been an important step on the strategic journey but not the endpoint. NatWest recognizes RPA as a key automation tool, but see it as a short-term tactical tool; they classify RPA as part of their technical debt, and it is not a part of their long-term architecture. Nokia thinks that RPA will remain in niche pockets for applications that will never have a proper API, such as Excel-based applications.
  • Nokia uses Blue Prism for RPA. NatWest uses UIPath RPA, and has a group that is building the integration for having Camunda execute a UIPath task — although I would have thought this would be a relatively simple service call or external task. Deutsche Telekom is using seven different RPA platforms, three of which are commercial including Another Monday and Kryon; they are just starting to look at the integration between Camunda and RPA with a plan to have Camunda orchestrate steps, and one “microbot” performing an atomic task at that step. As their core system offers an API for that task, the RPA bot will be replaced with a direct API call. This last approach is definitely aligned with Camunda’s vision of how their BPM can work with RPA bots as well as any other “task performers”.
  • More discussion on the role of RPA in digital transformation: recommendations to go ahead and use it, but consider it as a stop-gap measure to get a quick win before you can get the APIs built out in the systems that are being integrated. It’s considered technical debt because it will be replaced in the future as the APIs of the core systems become available. It’s a painkiller, not a cure.
  • Although some of the companies are using business people to build their own bots, that has a mixed degree of success and other companies do not classify RPA as citizen developer technology. This is pretty much the same as we’re seeing with other low-code environments, where they are often sold as application development platforms for non-professional developers, but the reality is that many applications require a professional developer because of the technical complexity of systems being integrated.
  • Cost and effort of RPA bot maintenance can be significant, in some cases more than back-end integration. Bot fixes may be fairly quick, but are required much more frequently such as when a password changes: bots require babysitting.
  • The customers had a few Camunda product requests, such as better connectors to more of the RPA tools. In general, however, they don’t want Camunda to build/acquire their own RPA offering, but just see it as another example of where you can pick a best-of-breed RPA tool and use it for task automation at individual steps within a Camunda process.
  • Best practices/lessons learned:
    • Separate the process orchestration layer from the bot execution layer from the beginning, with the process orchestration being done by Camunda and the bot task execution being done by the RPA tool.
    • Use process mining first to objectively identify what should be automated; of course, this would also require that you mine the user interaction processes that would be automated with bots, not just the system logs.
    • Have a centralized control center for bot control.
    • Develop bot templates that can be more quickly modified and deployed.

Looking at how the panel worked, there are definitely aspects of online panels that work better than in-person panels, specifically how they respond to audience questions. Some people don’t want to speak up in front of an audience, while others get up and bloviate without actually asking a question. With online-only questions, the moderator can browse through and aggregate them, then select the ones that are best suited to the panel. With video on each of the presenters (except for one who lost his connection and had to dial in), it was still possible to see reactions and have a sense of the live nature of the panel.

The last session of the day was Jimmy Floyd of 24 Hour Fitness on their massive Camunda implementation of five billion (with a “B”) process node executions per month. You can see his presentation from CamundaCon Berlin 2018 as a point of comparison with today’s numbers. Pretty much everything that happens at 24 Hour Fitness is controlled by a Camunda process, from their internal processes to customer-facing activities such as a member swiping their card to gain access to a club. It hasn’t been without hiccups along the way: they had to turn off process history logging to attain this volume of data, and can’t easily drill down into processes that call a lot of other processes, but the use of BPMN and DMN has greatly improved the interactions between product owners and developers, sometimes allowing business people to make a rule change without involving developers.

He had a lot of technical information on how they built this and their overall architecture. Their use is definitely custom code, but using Camunda with BPMN and DMN gave them a huge step-up versus just writing code. Even logic inside of microservices is implemented with Camunda, not written in code. Their entire architecture is based on Camunda, so it’s not a matter of deciding whether or not to use it for a new application or to integrate in a new external solution. They are taking a look at Zeebe to decide if it’s the right choice of them moving forward, but it’s early days on that: it would be a significant migration for them, they would likey lose functionality (for BPMN elements not yet implemented in Zeebe, among other things), and Zeebe has only just achieved production readiness.

Camunda is changing how they handle history data relative to the transactional data, in part likely due to input from high-throughput customers, and this may allow 24 Hour Fitness to turn history logging back on. They’re starting to work with Optimize via Kafka to gain insights into their processes.

Day 1 finished with a quick wrapup from Jakob Freund; in spite of the fact that it’s probably been a really long day for him, he seemed pretty happy about how well things went today. Tomorrow will cover more on microservices orchestration, and have customer case studies from Cox Automotive, Capital One and Goldman Sachs.

As you probably gather from my posts today, I’m finding the CamundaCon online format to be very engaging. This is due to most of the presentations being performed live (not pre-recorded as is seen with most of the online conferences these days) and the use of Slack as a persistent chat platform, actively monitored by all Camunda participants from the CEO on down. They do need a little bit more slack in the schedule however: from 10am to 3:45pm there was only one 15-minute break scheduled mid-way, and it didn’t happen because the morning sessions ran overtime. If you’re attending tomorrow, be prepared to carry your computer to the kitchen and bathroom with you if you don’t want to miss a minute of the presentations.

As I finish off my day at the virtual CamundaCon, I notice that the videos of presentations from earlier today are already available — including the panel session that only happened an hour ago. Go to the CamundaCon hub, then change the selection from “Upcoming” to “On Demand” above the Type/Day/Track selectors.

CamundaCon Live 2020 – Day 1: Jakob Freund keynote and customer presentations

Every conference organizer has had to deal with either cancelling their event or moving it to some type of online version as most of us work from home during the COVID-19 pandemic. Some of these have been pretty lacklustre, using only pre-recorded sessions and no live chat/Q&A, but I had expectations for Camunda being able to do this in a more “live” manner that doesn’t completely replace an in-person event, but has a similar feel to it. They did not disappoint: although a few of the CamundaCon presentations were pre-recorded, most were done live, and speakers were available for live Q&A. They also hosted a Slack workspace for live chat, which is much better than the Q&A/chat features on the webinar broadcast platform: it’s fundamentally more feature-rich, and also allows the conversations to continue after a particular presentation completes.

Very capably hosted by Director of Developer Relations Mary Thengvall, presentations were all done from the speaker’s individual locations, starting with CEO Jakob Freund’s keynote. He covered a bit of Camunda’s history and direction, and discussed their main focus of providing end-to-end process orchestration using the example of Camunda together with RPA, then gradually migrating the RPA bots (widely used as a stop-gap process automation measure) to more robust API integrations. He also shared some news on new and timely product offerings, including a starter package for work-from-home human workflow, and an early adopter package for Camunda Cloud. I’ve shared a few of his slides below, but you can also go and see the recording: they are getting the videos and slides up within about an hour after each presentation, directly on the conference hub.

Next up was Simon Letort, Chief Digital Officer at Société Générale, on how they implemented their corporate investment banking’s core process automation platform using Camunda. They use Camunda as the core of their managed workflow platform, with 500+ processes deployed throughout their operations worldwide. They also use bpmn.io and form.io as their built-in process and forms modelers. Letort responded to an audience question about why not use another large BPMS product that was already in use; they wanted a best-of-breed solution rather than a proprietary walled garden, and also wanted to leverage open source tools as part of that so that they weren’t building everything from scratch. They transitioned from some of the proprietary tools by first replacing the underlying engine with Camunda, then trading out other components such as form.io as a more flexible UI was required.

Interestingly, about half of their workflows are created by 30 expert modelers within centers of expertise, and half by 1200 “amateur” modelers, or citizen developers. This really points out the potential for companies to mix together the experts (focused on core processes) and amateurs (focused on tactical or administrative processes) using the same engine, although they likely use quite different tools for the full development cycle. The SG Workflow “product” offers three main features to support these different modeler/developer types: the (Camunda) process engine, a workflow aggregator for grouping tasks and cases from multiple systems, and UI web components and apps. Their platform also auto-generates process documentation. The core product is created and maintained by a team of about 10, distributed between France and Canada.

He shared some good information on their architecture and roadmap: I did a few presentations last year (one of them at CamundaCon in Berlin) and wrote a paper on building your own process-centric platform using a BPMS and an assembly of other tools, inspired in part by companies like Société Générale that have done this to create a much more flexible application development environment for their large enterprises.

We moved from the main stage to the track sessions, and I sat in on a presentation by Jeremy Warren of Keller Williams Realty (a Camunda customer that works with integrator BP3) about their “SmartPlans” dynamic processes — these aren’t actually dynamic at runtime, but use a flower process model that loops back to allow any task to lead to any other task — which allow real estate agents to create their own plans and tasks.

This is a great example of automating some of the processes that real estate agents use to drive new business, such as contacting prospects on a regular schedule, which would normally be done (or not done) manually. Agents can decide what tasks to do in what order; the branching logic in the model then executes the plan as specified. He also shared some of their experiences in rolling out and debugging applications on this scale.

The second track session was Derek Hunter and Uzma Khan of Ontario Teachers’ Pension Plan (who have been an occasional client of mine over a number of years, including introducing them to startup Camunda back in 2013). They have a number of case management style of processes to handle requests from members (teachers) regarding their pensions. They have 144 BPMN templates, and execute 70,000 process instances per year with up to 20,000 active instances at any time since these are generally long-running workflows. Some of the extremely long-running processes are actually terminated after a specific stage, then a scheduler restarts a corresponding instance when new work needs to be done. Other processes may be suspended in the workflow engine, making them invisible to a user’s worklist until work needs to be done.

Camunda is really just an engine buried within the OTPP workflow system, completely hidden from calling applications by a workflow intermediary. This was essential during their migration off other platforms: at one time, they had three different workflow engines running simultaneously, and could migrate everything to Camunda without having to retool the higher-level applications. In particular, end users are never aware of the specific workflow engine, but work within applications that integrate business data from multiple systems.

They take advantage of in-flight instance migration due to the long-running nature of their processes, which is something that Camunda offers that is missing from many other BPMS products. Because of the large number of process templates and the complex architecture with many other systems and components, they have implemented automated testing practices including modeling user interactions through their workflow interface service (that sits above the workflow intermediary and the Camunda engine), and handling work-arounds for emulating external task processing in their core services.

They’ve developed a lot of best practices for automated testing, and built tools such as a BPMN navigation tool to use during unit testing. Another of their colleagues, Zain Esmail, will be presenting more about this on the technical track tomorrow. They have also developed tools for administrative monitoring and reporting on external tasks, to allow these to be integrated with the internal Camunda workflow metrics in Prometheus.

We’re taking a short break between the morning and afternoon sessions, so I’ll close this out now and be back in another post as things progress, either this afternoon or tomorrow.

How to (not) become a digital enterprise – @jakobfreund CamundaCon keynote

I thought that I was done with my CamundaCon coverage, but noticed that Jakob Freund is blogging more details about what he covered in his keynote. I spent most of his keynote behind the curtain waiting for my turn to speak, but was able to see it again when they posted the video of his presentation.

He’s doing a five-part series on the themes that he covered, based in part on their experiences with their clients over the past years, with the first two available here:

  • Part 1: Intro to the four key elements of becoming a digital enterprise
  • Part 2: The first key element, customer-focused innovation

If he keeps to his posting schedule, the next one should be up tomorrow.

CamundaCon 2019: Monolith to microservices at Deutsche Telekom

Friedbert Samland from Deutsche Telekom IT and Willm Tüting from their technology partner conology presented on Telekom IT (the internal IT provider for Deutsche Telekom) migrating from monolithic systems to a microservices architecture while also moving from waterfall to Agile development methodologies. In 2017, they had a number of significant problems with their monolithic system for wholesale orders: time to market for new features was 12+ months, lots of missing functionality that required manual steps, vendor lock-in, large (therefore risky and time-consuming) releases, and more.

Willm Tüting and Friedbert Samland presenting on the problems with Telekom IT’s monolithic wholesale ordering system

They tried a variety of approaches to alleviate these problems, such as a partial Agile environment, but needed something more radical to make a difference. They identified four major drivers: microservices, cloud, SAFe (Scaled Agile Framework) and devops. I’m sure everyone in the audience was familiar with those concepts, but they went through how this actually works in a large organization like this, where it’s not always as easy as the providers say it will be. They learned a lot of lessons the hard way, such as the long learning curve of moving to cloud.

They were migrating a system built on the Oracle BPEL engine, starting by partitioning the monolith in terms of data and functionality (logic and processes) in order to identify three categories of microservices: business process microservices, data microservices, and domain-specific microservices. They balanced orchestration and choreography with a “choreographed orchestration” of the microservices, where the (Camunda) process orchestrations were embedded within the microservices for handling processes and inter-service communication. By having separate Camunda instances with separate databases for each microservice (which provides a high degree of scalability), they had to enhance the monitoring to get an aggregated view of all of the process flows.

This is a great example of a real-world large-scale implementation where a proprietary and monolithic iBPMS just would not work for the architecture that Telekom IT needed: Camunda BPM is embedded in the services, it doesn’t pre-suppose fixed orchestration at the top level of an application.

Although we’re just halfway through the last day, this was my last session at CamundaCon, I’m headed south for a short weekend break then DecisionCamp in Bolzano next week. Thanks to the entire Camunda team for putting on a great event, and inviting me to give a keynote yesterday.

CamundaCon 2019: @berndruecker on Zeebe and microservices

Camunda co-founder Bernd Rücker presented on some of the implementation issues with microservices, in particular following on from Susanne Kaiser’s keynote with the theme of having small delivery teams spend more of their time developing business capabilities and less on the “undifferentiated heavy lifting” infrastructure bits required to support those. This significantly reduces the cognitive load for the team, allowing them to build the best possible business capabilities without worrying about arcane configuration details. Interestingly, this is not that different from the argument to move from a business process embedded within a business system logic to an externalized process in a BPMS — something that Bernd has a long history with.

He went through an example of the services behind a train ticket booking, which requires payment, seat reservation and ticket generation services; there are issues of latency and uptime as well as the user experience of how the results of those services are presented to the customer. He referenced the Reactive Manifesto as a guideline for software design patterns that are “more robust, more resilient, more flexible and better positioned to meet modern demands”.

Event-driven choreography is a common pattern these days, but has the problem of not being able to visualize the overall process flow between services. This can be alleviated somewhat by using event monitoring overlaid on a process model — effectively process discovery if the flow is not standardized or when it changes — or even more so by orchestrating standard parts of the flow to combine event-driven and orchestration patterns. Orchestration has the advantage of relocating the coupling between services in an event-driven flow to the orchestration layer: although event choreography is seen as loosely-coupled, there’s a lot of event listening that has to be built into the services, which couples them more closely. It’s not that one is good and the other bad: there’s a place for both choreography and choreography patterns in software development.

He finished with a discussion of monolithic legacy software and how to deal with it: from the initial step of just adding APIs to access functionality, you gradually chip away at the monolith’s capabilities, ripping them out replacing with externalized services.