CamundaCon Live 2020 – Day 2: blockchain interrupted, and customer case studies with Capital One and Goldman Sachs

It was a tough choice with the first post-break session at CamundaCon Live: I wanted to listen in on Rick Weinberg, Camunda VP of Products, as he talked about their product direction roadmap, but I decided on the presentation by Muthukumar Vaidhianathan and Tandeep Sidhu from Capital One instead, focused on their process automation modernization with Camunda. I’ll catch the recorded version of Weinberg’s session later, along with a few others that I want to see.

Vaidhianathan and Sidhu talked about some of the problems that they were having with case management using their legacy infrastructure, and how they selected and deployed Camunda. Sidhu talked quite a bit about getting the technical teams up and running with Camunda, and some of the team and DevOps scalability issues. They use a single consolidated UI (actually, one each for front office and back office workers), then a case orchestration layer that connects to the multiple Camunda-based applications: Bank Case Manager, Fraud, Collections, and Legal.

Vaidhianathan then took us through their Legal Case Management application, and how they use BPMN and DMN for automating with complex business rules. Decision tables are used to decide, for example, on the course of action for a particular case based on the data about the case. They feel that it’s important for product owners to own their BPMN and DMN models, while building the strong relationships between developers and the product owners on the business side. Some good lessons learned from their journey at the end.
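They didn’t show their models, but to give a sense of what rule-driven case handling looks like in code, here’s a minimal sketch using Camunda’s standalone DMN engine. The decision key, DMN file name and case data variables are all my own invention for illustration, not Capital One’s.

```java
import org.camunda.bpm.dmn.engine.DmnDecision;
import org.camunda.bpm.dmn.engine.DmnDecisionTableResult;
import org.camunda.bpm.dmn.engine.DmnEngine;
import org.camunda.bpm.dmn.engine.DmnEngineConfiguration;
import org.camunda.bpm.engine.variable.VariableMap;
import org.camunda.bpm.engine.variable.Variables;

import java.io.InputStream;

public class CaseRoutingExample {
  public static void main(String[] args) {
    // Build a standalone DMN engine (no full BPM platform required)
    DmnEngine dmnEngine = DmnEngineConfiguration
        .createDefaultDmnEngineConfiguration()
        .buildEngine();

    // Parse a decision table from a DMN file on the classpath (hypothetical model)
    InputStream dmnFile = CaseRoutingExample.class
        .getResourceAsStream("/legal-case-routing.dmn");
    DmnDecision decision = dmnEngine.parseDecision("courseOfAction", dmnFile);

    // Case data that the rules evaluate; variable names are illustrative only
    VariableMap caseData = Variables.createVariables()
        .putValue("caseType", "legal")
        .putValue("amountInDispute", 25000)
        .putValue("priorCases", 2);

    DmnDecisionTableResult result = dmnEngine.evaluateDecisionTable(decision, caseData);
    System.out.println("Recommended action: " + result.getSingleEntry());
  }
}
```

The point of this split is that the routing rules live in a decision table that the product owner maintains, rather than in application code, which matches the ownership model they described.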

Capital One also presented at the Camunda Day in NYC last summer, but talked about how they organized a Camunda hackathon rather than the business applications — I think they were much earlier in their journey then, and weren’t ready to talk about business applications yet.

I’ve been interested in blockchain and BPM for a while now, and listened in on Patrick Millar of the non-profit consortium RiskStream Collaborative as he presented on ledger automation using Camunda. Their parent organization is The Institutes, which provides risk management and insurance education, and is guided by senior executives from the property and casualty industry. RiskStream is creating a distributed network platform, called Canopy, that allows their insurance company members to share data privately and securely, and participate in shared business processes. Whenever you have multiple insurance companies in an insurance process, like a claim for a multi-vehicle accident, having shared business processes — such as first notice of loss and proof of insurance — between the multiple insurers means that claims can be settled quicker and at a much lower cost.

In addition to private lines of insurance, they are also looking at applications in commercial lines and reinsurance. There are pretty significant savings if they get 100% market adoption (not an unrealistic goal since the market is made up of their members): $300M in personal lines auto for FNOL and proof of insurance, $384M in commercial lines for certificates of insurance, and $97M in reinsurance for placement.

Unfortunately, we lost the audio/video connection to the presenter in the middle of the session (yes, this really is happening live, and shit happens) and they had to close the session, just as I was really getting into the topic. Also, he never got to the part about how they’re using Camunda. We’ve already heard from Camunda that they will have him record his presentation and have it added to the on-demand videos.

The next session brought both tracks back together for a panel on digital transformation, featuring Mike Ryan, VP Software Engineering at JP Morgan Chase; Christine Yen, CEO of Honeycomb; and Camunda’s Bernd Rücker. Mary Thengvall, Camunda’s Director of Developer Relations, moderated the panel. Here’s some of the points that came up:

  • We’ve built up these massive monolithic systems over the last few decades, but now need to break up these legacy pieces while still supporting them, all while adding new functionality in a more agile manner. This is making it difficult for many of the established companies to compete with the new competitors, such as older financial services companies competing with fintechs. (By the way, I talked about this on a recent webinar, and see it with my own enterprise customers)
  • There’s a need to protect — and improve — the customer experience while the monolith is being replaced piece by piece. In my opinion, “big bang” as a deployment model is dead: gradual migrations without disrupting the user experience should be the general method.
  • There has been a lot of change in roles and communication within organizations. DevOps is part of that, which changes what people are responsible for, and also the concept of process owners being responsible for the end-to-end metrics. Microservices (and service-oriented architecture in general) means that systems can be more targeted since they’re assembled from shared services for a unique purpose.
  • There are a lot of great tools and methodologies now, but many companies are not yet ready to implement them. Microservices, serverless architectures, etc. are changing how we design systems for future state.
  • The current pandemic crisis is driving some amount of digital transformation, and companies are having to decide what is critical for survival now versus what can wait. Ryan said that JP Morgan sent 300,000 employees home to work, and they are rethinking how productive people can be in distributed environments, and how teams can still work collaboratively. As a financial company, they need to keep serving customers who need access to financial transactions, and are probably having to scale up their online customer experiences to accommodate. Yen believes there is as much of a focus on how people work together remotely to build applications as there is on the technology itself.

The panel felt a bit unfocused, and wasn’t as engaging as yesterday’s panel. Possibly I’m not quite as fresh after live-blogging 6,000 words over two days.

The last presentation of the day, and of CamundaCon Live 2020, was Richard Tarling, co-head of Workflow Engineering at Goldman Sachs, on the process automation platform that they built with Camunda at its core. He is focused on workflow at enterprise scale: they have 60,000 users (the entire firm) with 8,000 daily users, participating in 10M new activities and 250M decisions per day spread over 650 compute servers. This includes 3,000 process models, 1,000 decision models, 6,000 forms models and 125 RPA bot automations, all created and supported by 1,500 platform developers. Yowsa.

Their goal with creating their digital automation platform was to accelerate developers, but also support non-technical/citizen developers. This means that they embraced model-driven development by creating six key design tools:

  • Workflow Control Centre
  • Workflow Application Project Modeler
  • Workflow Designer, based on bpmn.io
  • Data Modeler
  • Decision Designer, based on dmn.io
  • Forms Designer

They built some engine extensions for their implementation, specifically around using a stripped-down embedded BPM engine to implement decision flows with high performance, plus the creation of an open-source jDMN execution engine.

He walked through their overall design-time and execution platform architecture, and some of the things that they did to maximize performance while maximizing (developer) usability. Decision services is a big part of their platform, and he discussed their enterprise-wide decision services execution platform. Their architecture wasn’t born in the cloud, but he feels that their use of microservices design principles means that they could move into the cloud in a straightforward manner.

They have a number of different UI personas that they’ve developed for, ranging from a “zero inbox” persona to a “power user”. They’ve recently redesigned these UIs with a mobile-first focus. They’re also supporting citizen developers for creating their own case management applications through a combination of model-driven design and pre-built components, plus a governed software development lifecycle built on GitLab. They’ve also built their own provisioning, runtime management and monitoring tools — they even use a BPMN-based process for provisioning.

If you’re building your own large-scale digital process/decision automation platform, definitely go and watch the replay of this presentation — Tarling has been in the trenches and has a ton of great advice. Lots of great Q&A at the end, too.

@phoebe_cat

Jakob Freund came back briefly to chat with Mary Thengvall and wrap the conference: thanks for giving a shout out to this blog (and my cat, who made a brief appearance on the Slack channel). And congrats to all for a great virtual conference that was much, much more than a long series of webinars.

I mentioned on Twitter today that CamundaCon is now the gold standard for online conferences: all you other vendors who have conferences coming up, take note. I believe that the key contributors to this success are live (not pre-recorded) presentations, use of a discussion platform like Slack or Discord alongside the broadcast platform, full engagement of a large number of company participants in the discussion platform before/during/after presentations, and fast upload of the videos for on-demand watching. Keep in mind that a successful conference, whether in-person or online, allows people to have unscripted interactions: it’s not a one-way broadcast, it’s a big messy collaborative conversation.

CamundaCon Live 2020 – Day 2: Microservices Orchestration, new stuff from Camunda, and legacy BPM migration

Day 2 of CamundaCon Live kicked off with Camunda co-founder Bernd Rücker talking about microservices orchestration and integration using workflow automation. This is a common theme for him, and I’ve seen earlier versions of this presentation, but he always brings something fresh to the discussion. He discussed reactive applications that are responsive, resilient, elastic and message-driven, then covered different styles of event-driven architecture.

He gave a (live) demo of autonomous services communicating using Kafka, and showed the issue with peer-to-peer choreography: there is no sense of the end-to-end orchestration to ensure that all services that should have run did actually run. He created an event-based process in Camunda Optimize that modeled the expected end-to-end process, and by connecting that to the Kafka messages, he had a visualization of the defined workflow that showed what happens when one service isn’t running: effectively, the virtual workflow is stuck at the previous service, since it never receives a message indicating that the (stopped) service has picked up its messages.

One solution is to extract the end-to-end responsibility into its own service: really, this implies some level of orchestration via commands rather than purely reacting to events, even if it’s not a completely tightly-coupled workflow. If you use an engine like Camunda to do that top-level orchestration, then you can move the monitoring of the process within that engine (Cockpit rather than Optimize) although it’s likely that anyone using an event-based architecture is going to be looking at an event monitoring system like Optimize as well. You can see his slides below, and the video will be available on the CamundaCon Live hub probably by the time that I publish this post.
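To make the orchestration-by-commands idea concrete, here’s a minimal sketch of a Camunda external task worker: the engine owns the end-to-end process and hands out work items, and the worker just fetches and completes them. The base URL and topic name are placeholders of my own, not from Bernd’s demo.

```java
import org.camunda.bpm.client.ExternalTaskClient;

public class PaymentWorker {
  public static void main(String[] args) {
    // The engine orchestrates: it creates external tasks ("commands"),
    // and this worker fetches and completes them via long polling.
    ExternalTaskClient client = ExternalTaskClient.create()
        .baseUrl("http://localhost:8080/engine-rest")  // assumed local engine
        .asyncResponseTimeout(10000)
        .build();

    client.subscribe("retrieve-payment")               // hypothetical topic name
        .lockDuration(5000)
        .handler((externalTask, externalTaskService) -> {
          String orderId = externalTask.getVariable("orderId");
          System.out.println("Charging payment for order " + orderId);
          // ... call the actual payment service here ...
          externalTaskService.complete(externalTask);
        })
        .open();
  }
}
```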

The morning session continued with CTO Daniel Meyer on some of the new product capabilities. Camunda’s goal has moved from just being a BPM engine for Java developers to a much broader orchestration platform that can integrate any technology stack and any endpoints.

He introduced a new distribution called Camunda Run (or Lil’ Camboot, as Niall Deehan calls it), which provides a lightweight package (50MB) that includes the BPMN and DMN workflow and decision engines, Cockpit, Tasklist and the REST API. It can even be run in headless mode, which disables the web apps, if you just want the engines. It’s OpenAPI enabled, CORS enabled, and SSL enabled out of the box. He gave a quick demo of downloading, starting and running Camunda Run: it’s pretty familiar if you’ve spent any time with Camunda, and it starts fast. From the blog post announcement, the target audience for Run is if at least one of the following is true (there’s a small REST sketch after the list):

  • You need a standalone process engine accessible via REST API
  • You don’t have extensive Java knowledge (or none at all) but still want to use Camunda BPM
  • You don’t want to configure an application server yourself
  • You want to configure everything in one place
  • You just want to Run Camunda BPM
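As a rough illustration of the first bullet above, this is what starting a process instance against a Run installation might look like over its REST API, using just the JDK’s HTTP client. The process definition key and variable are invented for the example, and the port and /engine-rest context path are the defaults for a local install, not anything shown in the demo.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StartProcessViaRest {
  public static void main(String[] args) throws Exception {
    // Start an instance of a deployed process by its key
    // ("order-fulfilment" is a hypothetical process definition key).
    String body = "{\"variables\": {\"orderId\": {\"value\": \"A-123\", \"type\": \"String\"}}}";

    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8080/engine-rest/process-definition/key/order-fulfilment/start"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();

    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body());  // JSON with the new process instance id
  }
}
```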

Meyer also talked about Camunda Optimize, specifically the event-based process monitoring. We saw a bit of that yesterday in Felix Müller’s presentation, and I had a more complete view of the event-based features of Optimize a few weeks ago on the 3.0 release webinar. Basically, you add the event source to Optimize (such as Kafka), and Optimize exposes messages and allows them to be attached to the entry/exit points of elements on a BPMN diagram that represents the event-driven process. They are offering a 30-day free trial for Optimize now if you want to try it out.

Meyer’s third topic was about process automation as a service via Camunda Cloud, which is powered by Zeebe (rather than Camunda BPM). Having cloud-native Zeebe behind the scenes means that it’s highly scalable and fault-tolerant, and uses pub-sub orchestration to let you include endpoints from anywhere. He demonstrated how to spin up a new Zeebe cluster, then deploy a BPMN model that was created in the Zeebe Modeler and start instances of the process using the zbctl command line. These instances were then visible in Camunda Operate (the Zeebe process monitoring tool), and he ran JavaScript workers and published messages to complete tasks in the process and show the instance progressing through the process model. There’s a free trial for Camunda Cloud, and an early access version for $699/month that includes access to larger clusters and technical support.
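The demo used zbctl, but the same steps can be sketched with the Zeebe Java client. The gateway address, BPMN file and job type below are placeholders, and some builder method names differ between Zeebe client versions, so treat this as an outline rather than copy-paste code.

```java
import io.zeebe.client.ZeebeClient;

public class ZeebeQuickstart {
  public static void main(String[] args) throws Exception {
    // Connect to a local Zeebe gateway; Camunda Cloud would also need OAuth credentials.
    try (ZeebeClient client = ZeebeClient.newClientBuilder()
        .brokerContactPoint("127.0.0.1:26500")   // named gatewayAddress() in newer clients
        .usePlaintext()
        .build()) {

      // Deploy a model created in the Zeebe Modeler (hypothetical file and process id)
      client.newDeployCommand()
          .addResourceFromClasspath("onboarding.bpmn")
          .send()
          .join();

      // Start an instance of the deployed process
      client.newCreateInstanceCommand()
          .bpmnProcessId("onboarding")
          .latestVersion()
          .send()
          .join();

      // A worker that completes the service task type used in the model
      client.newWorker()
          .jobType("send-welcome-email")
          .handler((jobClient, job) ->
              jobClient.newCompleteCommand(job.getKey()).send().join())
          .open();

      Thread.sleep(10_000);  // give the worker time to pick up the job before the client closes
    }
  }
}
```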

He fielded some questions that came up on the Slack workspace during his talk. Moving from an existing Camunda BPM implementation to Camunda Run is apparently as easy as just redirecting to the new application server. You can’t use Java delegates, but will have to switch those out for external tasks. There was a question about BPM versus Zeebe, which I think is a question that a lot of Camunda customers have: although most are likely familiar with the technical and functional differences, there is an open question of whether Camunda will continue to support two workflow engines in the future, and if they are going to shift focus more towards Zeebe use cases.

The morning finished by breaking out into two tracks; I stayed with the customer presentations rather than the technical breakout to hear some of the case studies. The one that I was most interested in was Fareed Saeed, head of Product and Tools for Advanced Process Solutions at Fidelity Investments, talking about migrating their monolithic legacy BPM to Camunda, in part because I did some early technical architecture consulting with them on their digital process automation platform over a year ago, although I’m not involved at this time. For those of you who know me mostly through this blog and as an independent industry analyst, you may not be aware that the other half of my business is as a consultant to large enterprises, mostly financial services and insurance, on technical architecture and strategy, or anything else to help make their process-centric implementation projects a success.

James Watson of Doculabs, who advised Fidelity on migration strategies, joined him on the discussion. Saeed talked about their current home-built workflow system, which runs thousands of different processes for most of their back office operations, and the need to move away from monolithic architecture and fragile, non-agile systems to a more flexible platform. This talk was not about the architecture or platform, but about the migration planning and execution: a key subject for any large enterprise moving off a legacy platform, but one that is often not fully considered during new digital automation platform implementation.

There are a few different strategies for migrating process-based applications, and it’s not the same story for each process. Watson shared his thoughts on this (see the slide at right), but this is my take on it:

  • High-volume processes, that usually represent a smaller number of process models but most of the transaction volume, are usually rewritten from scratch while incorporating some degree of re-engineering and process improvement along the way. These are the core business processes that need to be done right, and will most benefit from the more agile and scalable new platform.
  • Lower volume processes can be reviewed to see if they’re still required, may possibly be combined with similar processes, then a straightforward “lift and shift” rewrite done to just duplicate the functionality as is. In short, these aren’t worth the time to do the re-engineering unless there are obvious wins, since the volume is relatively low. These are also candidates for low-code business-led development if that’s available on the automation platform, rather than the professional development teams required for the high-volume transactional processes.
  • Very low volume processes can be retired, especially if their functionality can be rolled into processes in one of the first two categories.

Although they are looking at a “factory model” for some level of automation around the migration, Saeed believes that this is an opportunity to re-engineer the processes rather than just rewriting the same (broken) process on a new platform. They want to have smaller, distributed groups for developing and delivering new applications, which means that they need to have the right governance and standards in place to support a distributed model. He sees the need for early pilots and successes to allow everyone to see how this can work, and learn how to make it successful. A strong diverse team of business leaders is also a plus, since there will be some degree of pain in the business units as the migration happens.

That’s it for the morning of Day 2; they must have read my comments yesterday and actually made sure that we finished on time so that we got our 15-minute lunch break. 🙂 I’ll be back for the afternoon to finish off CamundaCon Live 2020.

CamundaCon Live 2020 – Day 1: Optimize, RPA, and how 24 Hour Fitness executes 5B process nodes per month

We continued the first day of CamundaCon Live (virtual) 2020 with Felix Mueller, senior product manager, presenting on how to use Camunda Optimize for driving continuous improvement in processes. I attended the Optimize 3.0 release webinar a couple of weeks ago, and saw some of the new things that they’re doing with monitoring and optimization of event-based processes — this allows processes that are not part of Camunda to be included in Optimize. The CamundaCon session started with a broader view of Optimize functionality, showing how it collects information then can be used for root cause analysis of process bottlenecks as well as displaying realtime metrics. They have some good case studies for Optimize, including insurance provider Visana Group.

He then moved to show the event-based process monitoring, and how Optimize can ingest and aggregate information from any external system for which a connector exists, such as RabbitMQ (for which they have built one). His demo showed a customer onboarding process that could be triggered either by an online form that would be a direct Camunda process instantiation, or via a mailed-in form that was scanned into another system that emitted an event that would trigger the process.

It was very obvious that this was a live presentation, because Mueller was scrambling against the clock since the previous session went a bit long, having to speed through his demo and take a couple of shortcuts. Although you might think of this as a logistical “bug”, I maintain that it’s an interactivity “feature”, and made the experience much closer to an in-person conference than a set of pre-recorded presentations that were just queued up in sequence.

This was followed by a presentation by Kris Barczynski of Nokia Bell Labs about a really interesting use case: they are using Camunda to guide visiting groups on tours through the Nokia Campus customer experience spaces, and interact with devices including the guests’ wearables, drones and robots. Visitors are welcomed and guided by a robot, and they can interact with voice-controlled drones; Camunda is orchestrating the processes behind the scenes. He talked about some of their design decisions, such as using Camunda JavaScript workers to call external services, and building a custom Android app. Really interesting combination of physical and virtual processes.

Next was a panel discussion on the future of RPA, with Vittorio Dal Bianco of Nokia, Marco Einacker of Deutsche Telekom, Paul Jones of NatWest Group, and Camunda CEO Jakob Freund, moderated by Jason Bloomberg of Intellyx Research. The three customer presenters are involved with the RPA initiatives at their own organizations, and also looking at how to integrate that with their Camunda processes. Panels are always a challenge to live-blog, but here’s some of the points discussed (attributed where I remembered):

  • The customer panelists agreed that RPA has allowed people to move to more interesting/valuable work, rather than doing routine tasks such as copying and pasting between application screens. Task automation through RPA reduces resources/costs, decreases cycle time, and also improves quality/compliance.
  • RPA is a “short-term bandaid” driven from outside the IT organization in order to get some immediate efficiency benefits. It’s maintenance-intensive, since any changes to the applications being integrated mean that the bots need to be reprogrammed. Deutsche Telekom is moving from RPA front-end integration/automation to the more strategic BPMS/API automation, and sees RPA as an important step on the strategic journey but not the endpoint. NatWest recognizes RPA as a key automation tool, but sees it as a short-term tactical tool; they classify RPA as part of their technical debt, and it is not a part of their long-term architecture. Nokia thinks that RPA will remain in niche pockets for applications that will never have a proper API, such as Excel-based applications.
  • Nokia uses Blue Prism for RPA. NatWest uses UIPath RPA, and has a group that is building the integration for having Camunda execute a UIPath task — although I would have thought this would be a relatively simple service call or external task. Deutsche Telekom is using seven different RPA platforms, three of which are commercial including Another Monday and Kryon; they are just starting to look at the integration between Camunda and RPA, with a plan to have Camunda orchestrate the steps and one “microbot” performing an atomic task at each step. Once a core system offers an API for that task, the RPA bot will be replaced with a direct API call. This last approach is definitely aligned with Camunda’s vision of how their BPM can work with RPA bots as well as any other “task performers”.
  • More discussion on the role of RPA in digital transformation: recommendations to go ahead and use it, but consider it as a stop-gap measure to get a quick win before you can get the APIs built out in the systems that are being integrated. It’s considered technical debt because it will be replaced in the future as the APIs of the core systems become available. It’s a painkiller, not a cure.
  • Although some of the companies are using business people to build their own bots, that has had mixed success, and other companies do not classify RPA as citizen developer technology. This is pretty much the same as we’re seeing with other low-code environments, where they are often sold as application development platforms for non-professional developers, but the reality is that many applications require a professional developer because of the technical complexity of the systems being integrated.
  • Cost and effort of RPA bot maintenance can be significant, in some cases more than back-end integration. Bot fixes may be fairly quick, but are required much more frequently such as when a password changes: bots require babysitting.
  • The customers had a few Camunda product requests, such as better connectors to more of the RPA tools. In general, however, they don’t want Camunda to build/acquire their own RPA offering, but just see it as another example of where you can pick a best-of-breed RPA tool and use it for task automation at individual steps within a Camunda process.
  • Best practices/lessons learned:
    • Separate the process orchestration layer from the bot execution layer from the beginning, with the process orchestration being done by Camunda and the bot task execution being done by the RPA tool.
    • Use process mining first to objectively identify what should be automated; of course, this would also require that you mine the user interaction processes that would be automated with bots, not just the system logs.
    • Have a centralized control center for bot control.
    • Develop bot templates that can be more quickly modified and deployed.

Looking at how the panel worked, there are definitely aspects of online panels that work better than in-person panels, specifically how they respond to audience questions. Some people don’t want to speak up in front of an audience, while others get up and bloviate without actually asking a question. With online-only questions, the moderator can browse through and aggregate them, then select the ones that are best suited to the panel. With video on each of the presenters (except for one who lost his connection and had to dial in), it was still possible to see reactions and have a sense of the live nature of the panel.

The last session of the day was Jimmy Floyd of 24 Hour Fitness on their massive Camunda implementation of five billion (with a “B”) process node executions per month. You can see his presentation from CamundaCon Berlin 2018 as a point of comparison with today’s numbers. Pretty much everything that happens at 24 Hour Fitness is controlled by a Camunda process, from their internal processes to customer-facing activities such as a member swiping their card to gain access to a club. It hasn’t been without hiccups along the way: they had to turn off process history logging to attain this volume of data, and can’t easily drill down into processes that call a lot of other processes, but the use of BPMN and DMN has greatly improved the interactions between product owners and developers, sometimes allowing business people to make a rule change without involving developers.
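For anyone wondering what “turned off process history logging” means in practice: history is an engine configuration setting. Below is a minimal standalone sketch; a Spring Boot deployment would set the equivalent camunda.bpm.history-level property instead, and the in-memory configuration here is just for illustration, not 24 Hour Fitness’ actual setup.

```java
import org.camunda.bpm.engine.ProcessEngine;
import org.camunda.bpm.engine.ProcessEngineConfiguration;

public class NoHistoryEngine {
  public static void main(String[] args) {
    // Build an engine with history disabled; at this kind of volume it avoids
    // writing audit rows for every one of the billions of node executions.
    ProcessEngine engine = ProcessEngineConfiguration
        .createStandaloneInMemProcessEngineConfiguration()
        .setHistory(ProcessEngineConfiguration.HISTORY_NONE)
        .setJobExecutorActivate(true)
        .buildProcessEngine();

    System.out.println("History level: "
        + engine.getProcessEngineConfiguration().getHistory());
  }
}
```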

He had a lot of technical information on how they built this and their overall architecture. Their use is definitely custom code, but using Camunda with BPMN and DMN gave them a huge step up versus just writing code. Even logic inside of microservices is implemented with Camunda, not written in code. Their entire architecture is based on Camunda, so it’s not a matter of deciding whether or not to use it for a new application or to integrate in a new external solution. They are taking a look at Zeebe to decide if it’s the right choice for them moving forward, but it’s early days on that: it would be a significant migration for them, they would likely lose functionality (for BPMN elements not yet implemented in Zeebe, among other things), and Zeebe has only just achieved production readiness.

Camunda is changing how they handle history data relative to the transactional data, in part likely due to input from high-throughput customers, and this may allow 24 Hour Fitness to turn history logging back on. They’re starting to work with Optimize via Kafka to gain insights into their processes.

Day 1 finished with a quick wrapup from Jakob Freund; in spite of the fact that it’s probably been a really long day for him, he seemed pretty happy about how well things went today. Tomorrow will cover more on microservices orchestration, and have customer case studies from Cox Automotive, Capital One and Goldman Sachs.

As you probably gather from my posts today, I’m finding the CamundaCon online format to be very engaging. This is due to most of the presentations being performed live (not pre-recorded as is seen with most of the online conferences these days) and the use of Slack as a persistent chat platform, actively monitored by all Camunda participants from the CEO on down. They do need a little bit more slack in the schedule however: from 10am to 3:45pm there was only one 15-minute break scheduled mid-way, and it didn’t happen because the morning sessions ran overtime. If you’re attending tomorrow, be prepared to carry your computer to the kitchen and bathroom with you if you don’t want to miss a minute of the presentations.

As I finish off my day at the virtual CamundaCon, I notice that the videos of presentations from earlier today are already available — including the panel session that only happened an hour ago. Go to the CamundaCon hub, then change the selection from “Upcoming” to “On Demand” above the Type/Day/Track selectors.

CamundaCon Live 2020 – Day 1: Jakob Freund keynote and customer presentations

Every conference organizer has had to deal with either cancelling their event or moving it to some type of online version as most of us work from home during the COVID-19 pandemic. Some of these have been pretty lacklustre, using only pre-recorded sessions and no live chat/Q&A, but I expected that Camunda would be able to do this in a more “live” manner that, while not completely replacing an in-person event, has a similar feel to it. They did not disappoint: although a few of the CamundaCon presentations were pre-recorded, most were done live, and speakers were available for live Q&A. They also hosted a Slack workspace for live chat, which is much better than the Q&A/chat features on the webinar broadcast platform: it’s fundamentally more feature-rich, and also allows the conversations to continue after a particular presentation completes.

Very capably hosted by Director of Developer Relations Mary Thengvall, presentations were all done from the speaker’s individual locations, starting with CEO Jakob Freund’s keynote. He covered a bit of Camunda’s history and direction, and discussed their main focus of providing end-to-end process orchestration using the example of Camunda together with RPA, then gradually migrating the RPA bots (widely used as a stop-gap process automation measure) to more robust API integrations. He also shared some news on new and timely product offerings, including a starter package for work-from-home human workflow, and an early adopter package for Camunda Cloud. I’ve shared a few of his slides below, but you can also go and see the recording: they are getting the videos and slides up within about an hour after each presentation, directly on the conference hub.

Next up was Simon Letort, Chief Digital Officer at Société Générale, on how they implemented their corporate investment banking’s core process automation platform using Camunda. They use Camunda as the core of their managed workflow platform, with 500+ processes deployed throughout their operations worldwide. They also use bpmn.io and form.io as their built-in process and forms modelers. Letort responded to an audience question about why they didn’t use another large BPMS product that was already in use: they wanted a best-of-breed solution rather than a proprietary walled garden, and also wanted to leverage open source tools as part of that so that they weren’t building everything from scratch. They transitioned from some of the proprietary tools by first replacing the underlying engine with Camunda, then trading out other components such as form.io when a more flexible UI was required.

Interestingly, about half of their workflows are created by 30 expert modelers within centers of expertise, and half by 1200 “amateur” modelers, or citizen developers. This really points out the potential for companies to mix together the experts (focused on core processes) and amateurs (focused on tactical or administrative processes) using the same engine, although they likely use quite different tools for the full development cycle. The SG Workflow “product” offers three main features to support these different modeler/developer types: the (Camunda) process engine, a workflow aggregator for grouping tasks and cases from multiple systems, and UI web components and apps. Their platform also auto-generates process documentation. The core product is created and maintained by a team of about 10, distributed between France and Canada.

He shared some good information on their architecture and roadmap: I did a few presentations last year (one of them at CamundaCon in Berlin) and wrote a paper on building your own process-centric platform using a BPMS and an assembly of other tools, inspired in part by companies like Société Générale that have done this to create a much more flexible application development environment for their large enterprises.

We moved from the main stage to the track sessions, and I sat in on a presentation by Jeremy Warren of Keller Williams Realty (a Camunda customer that works with integrator BP3) about their “SmartPlans” dynamic processes — these aren’t actually dynamic at runtime, but use a flower process model that loops back to allow any task to lead to any other task — which allow real estate agents to create their own plans and tasks.

This is a great example of automating some of the processes that real estate agents use to drive new business, such as contacting prospects on a regular schedule, which would normally be done (or not done) manually. Agents can decide what tasks to do in what order; the branching logic in the model then executes the plan as specified. He also shared some of their experiences in rolling out and debugging applications on this scale.

The second track session featured Derek Hunter and Uzma Khan of Ontario Teachers’ Pension Plan (who have been an occasional client of mine over a number of years, including introducing them to then-startup Camunda back in 2013). They have a number of case management-style processes to handle requests from members (teachers) regarding their pensions. They have 144 BPMN templates, and execute 70,000 process instances per year with up to 20,000 active instances at any time, since these are generally long-running workflows. Some of the extremely long-running processes are actually terminated after a specific stage, then a scheduler restarts a corresponding instance when new work needs to be done. Other processes may be suspended in the workflow engine, making them invisible to a user’s worklist until work needs to be done.

Camunda is really just an engine buried within the OTPP workflow system, completely hidden from calling applications by a workflow intermediary. This was essential during their migration off other platforms: at one time, they had three different workflow engines running simultaneously, and could migrate everything to Camunda without having to retool the higher-level applications. In particular, end users are never aware of the specific workflow engine, but work within applications that integrate business data from multiple systems.

They take advantage of in-flight instance migration due to the long-running nature of their processes, which is something that Camunda offers that is missing from many other BPMS products. Because of the large number of process templates and the complex architecture with many other systems and components, they have implemented automated testing practices including modeling user interactions through their workflow interface service (that sits above the workflow intermediary and the Camunda engine), and handling work-arounds for emulating external task processing in their core services.
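Since in-flight migration came up as a differentiator, here is roughly what it looks like in the Camunda Java API. The process definition ids and activity ids are invented for the example, not OTPP’s.

```java
import org.camunda.bpm.engine.ProcessEngine;
import org.camunda.bpm.engine.ProcessEngines;
import org.camunda.bpm.engine.RuntimeService;
import org.camunda.bpm.engine.migration.MigrationPlan;

public class MigrateLongRunningInstances {
  public static void main(String[] args) {
    ProcessEngine engine = ProcessEngines.getDefaultProcessEngine();
    RuntimeService runtimeService = engine.getRuntimeService();

    // Map activities from the old definition to the new one; ids are placeholders.
    MigrationPlan plan = runtimeService
        .createMigrationPlan("memberRequest:3:abc", "memberRequest:4:def")
        .mapEqualActivities()                                    // auto-map unchanged activities
        .mapActivities("reviewRequest", "reviewMemberRequest")   // explicitly map a renamed task
        .build();

    // Migrate all running instances of the old definition
    // (executeAsync() would run this as a batch instead).
    runtimeService.newMigration(plan)
        .processInstanceQuery(runtimeService.createProcessInstanceQuery()
            .processDefinitionId("memberRequest:3:abc"))
        .execute();
  }
}
```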

They’ve developed a lot of best practices for automated testing, and built tools such as a BPMN navigation tool to use during unit testing. Another of their colleagues, Zain Esmail, will be presenting more about this on the technical track tomorrow. They have also developed tools for administrative monitoring and reporting on external tasks, to allow these to be integrated with the internal Camunda workflow metrics in Prometheus.

We’re taking a short break between the morning and afternoon sessions, so I’ll close this out now and be back in another post as things progress, either this afternoon or tomorrow.

How to (not) become a digital enterprise – @jakobfreund CamundaCon keynote

I thought that I was done with my CamundaCon coverage, but noticed that Jakob Freund is blogging more details about what he covered in his keynote. I spent most of his keynote behind the curtain waiting for my turn to speak, but was able to see it again when they posted the video of his presentation.

He’s doing a five-part series on the themes that he covered, based in part on their experiences with their clients over the past years, with the first two available here:

  • Part 1: Intro to the four key elements of becoming a digital enterprise
  • Part 2: The first key element, customer-focused innovation

If he keeps to his posting schedule, the next one should be up tomorrow.

CamundaCon 2019: Monolith to microservices at Deutsche Telekom

Friedbert Samland from Deutsche Telekom IT and Willm Tüting from their technology partner conology presented on Telekom IT (the internal IT provider for Deutsche Telekom) migrating from monolithic systems to a microservices architecture while also moving from waterfall to Agile development methodologies. In 2017, they had a number of significant problems with their monolithic system for wholesale orders: time to market for new features was 12+ months, lots of missing functionality that required manual steps, vendor lock-in, large (therefore risky and time-consuming) releases, and more.

Willm Tüting and Friedbert Samland presenting on the problems with Telekom IT’s monolithic wholesale ordering system

They tried a variety of approaches to alleviate these problems, such as a partial Agile environment, but needed something more radical to make a difference. They identified four major drivers: microservices, cloud, SAFe (Scaled Agile Framework) and devops. I’m sure everyone in the audience was familiar with those concepts, but they went through how this actually works in a large organization like this, where it’s not always as easy as the providers say it will be. They learned a lot of lessons the hard way, such as the long learning curve of moving to cloud.

They were migrating a system built on the Oracle BPEL engine, starting by partitioning the monolith in terms of data and functionality (logic and processes) in order to identify three categories of microservices: business process microservices, data microservices, and domain-specific microservices. They balanced orchestration and choreography with a “choreographed orchestration” of the microservices, where the (Camunda) process orchestrations were embedded within the microservices for handling processes and inter-service communication. Because each microservice has its own Camunda instance and database (which provides a high degree of scalability), they had to enhance the monitoring to get an aggregated view of all of the process flows.

This is a great example of a real-world large-scale implementation where a proprietary and monolithic iBPMS just would not work for the architecture that Telekom IT needed: Camunda BPM is embedded in the services, it doesn’t pre-suppose fixed orchestration at the top level of an application.

Although we’re just halfway through the last day, this was my last session at CamundaCon: I’m headed south for a short weekend break, then DecisionCamp in Bolzano next week. Thanks to the entire Camunda team for putting on a great event, and for inviting me to give a keynote yesterday.

CamundaCon 2019: @berndruecker on Zeebe and microservices

Camunda co-founder Bernd Rücker presented on some of the implementation issues with microservices, in particular following on from Susanne Kaiser’s keynote with the theme of having small delivery teams spend more of their time developing business capabilities and less on the “undifferentiated heavy lifting” infrastructure bits required to support those. This significantly reduces the cognitive load for the team, allowing them to build the best possible business capabilities without worrying about arcane configuration details. Interestingly, this is not that different from the argument to move from a business process embedded within a business system logic to an externalized process in a BPMS — something that Bernd has a long history with.

He went through an example of the services behind a train ticket booking, which requires payment, seat reservation and ticket generation services; there are issues of latency and uptime as well as the user experience of how the results of those services are presented to the customer. He referenced the Reactive Manifesto as a guideline for software design patterns that are “more robust, more resilient, more flexible and better positioned to meet modern demands”.

Event-driven choreography is a common pattern these days, but has the problem of not being able to visualize the overall process flow between services. This can be alleviated somewhat by using event monitoring overlaid on a process model — effectively process discovery if the flow is not standardized or when it changes — or even more so by orchestrating standard parts of the flow to combine event-driven and orchestration patterns. Orchestration has the advantage of relocating the coupling between services in an event-driven flow to the orchestration layer: although event choreography is seen as loosely-coupled, there’s a lot of event listening that has to be built into the services, which couples them more closely. It’s not that one is good and the other bad: there’s a place for both choreography and orchestration patterns in software development.

He finished with a discussion of monolithic legacy software and how to deal with it: from the initial step of just adding APIs to access functionality, you gradually chip away at the monolith’s capabilities, ripping them out and replacing them with externalized services.

CamundaCon 2019: Preparing for a microservices journey with @suksr

Susanne Kaiser, former CTO of Just Social and now an independent technology consultant, opened the second day of CamundaCon 2019 with a presentation on moving to a microservices architecture, particularly for a small team. Similar to the message in my presentation yesterday, she advises building the processes that are your competitive differentiator, then outsourcing the rest to external vendors.

She walked through some of the things to consider when designing microservices, such as the ideas of bounded context, local data persistence, API discovery and management, linkage with message brokers, and more. There’s a lot of infrastructure complexities in building a single microservice, which makes it particularly challenging for small teams/companies — that’s part of what drives her recommendation to outsource anything that’s not a competitive differentiator.

Susanne Kaiser and the complexities of building a single microservice

She showed the use of Wardley maps for the evolution of a value chain, showing how components are mapped relative to their visibility to users and their level of maturity. Components of a solution/system are first identified by their visibility, usually in a top-down manner based on the functional requirements. They are then plotted along the evolution axis, to identify which will be custom-built versus those that are third-party products or outsourced commodities/utilities. This includes identifying all of the infrastructure to support those components; initially, this may include a lot of the infrastructure components as requiring custom build, but use of third-party products (including open source) can shift many of these components along the evolution axis.

Sample Wardley Map development for a solution (mid-evolution)

She then showed how Camunda BPM would fit into this map of the solution, and how it can abstract away some of the activities and components that were previously explicit. In short, Camunda BPM is a higher-level piece of infrastructure that can handle service orchestration including complexities of retries and more. I haven’t worked with Wardley Maps previously, and there are definitely some good concepts in here for helping to identify buy versus build components in a technical architecture.

CamundaCon 2019: ING on taking a regional BPM platform global

Derek Vandivere of ING Netherlands finished up the first day of CamundaCon 2019 here in Berlin talking about how they took a regional BPM platform global — strange, because he’s actually talking about their Pega implementation although they’re also implementing Camunda — and how to work around the monoliths. I know that Derek’s wife is an art restorer, and this has obviously rubbed off on him, since all of his slides were photos of Dutch Masters paintings that were (however peripherally) related to his subject matter.

Different ING regional operations selected different BPM engines: the Netherlands went with Pega, while Germany went with Camunda, with other areas building their own thing or using legacy TIBCO. They’re attempting to build some standards around how people talk about BPM and case management internally, as well as how applications are developed. As a global bank, they need to have some data, rules and processes that span countries, making it necessary to consider how to bring all of these disparate systems together.

Derek Vandivere discusses the challenges of having multiple vendors’ BPMS

He went through a number of the best practices and lessons learned that they discovered along the way as they rolled out a regional solution globally. Although his experience with the Dutch implementation was based on Pega, there are many transferable lessons here, since a lot of it is about higher-level architecture, bottlenecks in processes and decision-making (often human bottlenecks, although some technical as well), and how to interact with the business areas.

He discussed the current pressures on their monolithic iBPMS (Pega) platform, which echoed some of what I talked about this morning: proprietary developer training, container-based microservices architecture, and multiple distributed deployment models (support for both cloud and regionally-mandated on-premise). Replacing or upgrading any sufficiently complex IT system is going to be a challenging task, but doing that with a monolithic iBPMS is considerably more challenging than with a more distributed microservices architecture.

We’re about to spill out onto the Spree-side patio here at Radialsystem V for a BBQ and well-deserved refreshments, so that’s it for my coverage of this first day at CamundaCon 2019. I’ll be back for a couple of sessions tomorrow morning before heading south to Bolzano for next week’s DecisionCamp.

CamundaCon 2019 breakout: DMN and BPMN for reusable survey forms at Indiana Farm Bureau

Sowmya Raghunathan and Corinna Cohn presented on a claims intake implementation that uses BPMN and DMN in an interesting way: driving the intake forms used by a claims administrator when gathering first notice of loss (FNOL) information from a claimant. The idea is that the claims admin doesn’t need to have any information about the claim type, and the claimant doesn’t get asked any irrelevant questions, because the form always presents the next best question based on previous responses: a wizard-like model, but driven by BPMN and DMN.

As the application and technical architects at Indiana Farm Bureau Insurance, they were able to give us a good view of how they use the tools for this: BPMN for orchestrating the DMN and UI communication as well as storing the responses, DMN for defining the questions and question/response mapping, and a UI component for implementing the survey forms. They consider this a headless application, but of course, it does surface via the form UI; from a Camunda process standpoint, however, it is a decoupled piece of the architecture that interfaces with the claims system.

Technical architecture of the survey DMN/BPMN system

We saw a demo of one of the claim forms at work, where the previous questions and responses can be seen, and changes to the previous responses may cause changes to subsequent questions based on the DMN decision tables behind the scenes. They use a couple of DMN tables just as configuration tables for the UI for the questions and options (e.g., radio buttons versus free-form responses), then a Next Question decision table to determine the next question based on the previous response: this table is based on a directed acyclic graph that links questions (nodes) via answers (links), which allows for easy re-navigation of the graph if an earlier response is changed.
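Stripped of the DMN tooling, the underlying structure is just a graph in which answers are the edges. A toy sketch of the same idea, with question ids and answers invented for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NextQuestionGraph {
  // Each question node maps an answer (edge label) to the next question node.
  static final Map<String, Map<String, String>> GRAPH = Map.of(
      "wasVehicleDrivable", Map.of("yes", "whereIsVehicleNow", "no", "wasVehicleTowed"),
      "wasVehicleTowed",    Map.of("yes", "whichTowCompany",   "no", "whereIsVehicleNow"));

  // Walk the graph from the first question using the answers recorded so far;
  // changing an earlier answer just means re-walking the graph from that node.
  static String nextQuestion(Map<String, String> answers) {
    String current = "wasVehicleDrivable";
    while (answers.containsKey(current)) {
      String next = GRAPH.getOrDefault(current, Map.of()).get(answers.get(current));
      if (next == null) {
        return null;   // no follow-up question: enough information collected
      }
      current = next;
    }
    return current;
  }

  public static void main(String[] args) {
    Map<String, String> answers = new LinkedHashMap<>();
    answers.put("wasVehicleDrivable", "no");
    System.out.println("Next question: " + nextQuestion(answers));   // wasVehicleTowed
  }
}
```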

DMN decision table for FNOL questions related to auto claim

BPMN is used to navigate and determine the next question in a dynamic question subprocess, and if the survey can be exited; once sufficient information has been collected, the FNOL is initiated in the claims systems.

Dynamic Questions BPMN subprocess

The use of DMN means that the questions can be changed very easily since they’re not embedded in the code; this means that they can be created and modified by business analysts rather than requiring developers to code these into the UI directly.

FNOL BPMN process

The entire framework is reusable, and could be quickly reconfigured to be used for any type of survey. That’s great, because a few years ago, I saw a very similar use case for this in a clinical situation for a stroke assessment questionnaire: in a hospital setting, when someone arrives at an emergency department and is suspected of having had a stroke, there are standard questions to ask in order to evaluate the patient’s condition. At the time, I thought that this would be a perfect use case for DMN and BPMN, although it was beyond the scope of that project.