I’m attending a workshop on the first morning of CASCON, the conference on software research hosted by IBM Canada. There’s quite a bit of good work done at the IBM Toronto software lab, and this annual conference gives them a chance to engage the academic and corporate communities and present this research.
The focus of this workshop is service integration, including enabling new services from existing applications and creating new services by composing from existing services. Hacking together a few services into a solution is fairly simple, but your results may not be all that predictable; industrial-strength service integration is a bit more complex, and is concerned with everything from reusability to service level agreements. As Allen Chan of IBM put it when introducing the session: “How do we enable mere mortals to create a service integration solution with predictable results and enterprise-level reliability?”
The first presentation was by Mannie Kagan, an IBMer who is working with TD Bank on their service strategy and implementation; he walked us through a real-life example of how to integrate services into a complex technology environment that includes legacy systems as well as newer technologies. Based on this, and a large number of other engagements by IBM, they are able to discern patterns in service integration that can greatly aid in implementation. Patterns can appear at many levels of granularity, which they classify as primitive, subflow, flow, distributed flow, and connectivity topology. From there, they have created an ESB framework pattern toolkit, an Eclipse-based toolkit that allows for the creation of exemplars (templates) of service integration that can then be adapted for use in a specific instance.
He discussed two patterns that they’ve found to be particularly useful: web service notification (effectively, pub-sub over web services), and SCRUD (search, create, read, update, delete); think of these as some basic building blocks of many of the types of service integrations that you might want to create. This was presented in a specific IBM technology context, as you might imagine: DataPower SOA appliances for processing XML messages and legacy message transformations, and WebSphere Services Registry and Repository (WSRR) for service governance.
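To make the SCRUD building block concrete, here is a minimal sketch of what a service exposing those five operations might look like. It is an in-memory toy of my own: the class name `ScrudStore` and its method signatures are assumptions for illustration, not anything from IBM's toolkit or the presentation.

```python
import itertools
from typing import Dict, List, Optional


class ScrudStore:
    """Hypothetical in-memory service offering search, create, read, update, delete."""

    def __init__(self) -> None:
        self._records: Dict[int, dict] = {}
        self._ids = itertools.count(1)  # simple sequential record IDs

    def create(self, record: dict) -> int:
        record_id = next(self._ids)
        self._records[record_id] = dict(record)  # store a copy
        return record_id

    def read(self, record_id: int) -> Optional[dict]:
        record = self._records.get(record_id)
        return dict(record) if record is not None else None

    def update(self, record_id: int, changes: dict) -> bool:
        if record_id not in self._records:
            return False
        self._records[record_id].update(changes)
        return True

    def delete(self, record_id: int) -> bool:
        return self._records.pop(record_id, None) is not None

    def search(self, **criteria) -> List[int]:
        # Return the IDs of records whose fields equal every given criterion.
        return [
            record_id
            for record_id, record in self._records.items()
            if all(record.get(key) == value for key, value in criteria.items())
        ]


store = ScrudStore()
first = store.create({"name": "Ada", "city": "Toronto"})
second = store.create({"name": "Bob", "city": "Ottawa"})
store.update(second, {"city": "Toronto"})
toronto_ids = store.search(city="Toronto")
store.delete(first)
```

In a real integration each of these methods would front a legacy system or a web service rather than a dictionary; the point is only that the five operations form a small, reusable interface.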
In his wrapup, he pointed out that not all patterns need to be created at the start, and that patterns can be created as required when there is evidence of reuse potential. Since patterns take more resources to create than a simple service integration, you need to be sure that there will be reuse before it is worth creating a template and adding it to the framework.
Next up was Hans-Arno Jacobsen of the University of Toronto discussing their research in managing SLAs across services. He started with a business process example of loan application processing that included automated credit check services, and had an SLA in terms of parameters such as total service subprocess time, service roundtrip time, service cost and service uptime. They’re looking at how the SLAs can guide the efficient execution of processes, based in large part on event processing to detect and determine the events within the process (published state transitions). He gave quite a detailed description of content-based routing and publish-subscribe models, which underlie event-driven BPM, and their PADRES ESB stack that hides the intricacies of the underlying network and system events from the business process execution by creating an overlay of pub-sub brokers that filters and distributes those events. In addition to the usual efficiencies created by the event pub-sub model, this allows (for example) the correlation of network slowdowns with business process delays, so that the root cause of a delay can be understood. Real-time business analytics can also be driven from the pub-sub brokers.
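For readers unfamiliar with content-based routing, the idea is that subscribers register a predicate over event content, and the broker delivers only the events that match. The sketch below is a single toy broker under my own assumptions; the `Broker` class, the event fields (`step`, `roundtrip_ms`) and the 500 ms threshold are all hypothetical, and none of it is the PADRES API.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]
Handler = Callable[[Event], None]


class Broker:
    """Toy content-based pub-sub broker (illustrative only, not PADRES)."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[Predicate, Handler]] = []

    def subscribe(self, predicate: Predicate, handler: Handler) -> None:
        self._subscriptions.append((predicate, handler))

    def publish(self, event: Event) -> int:
        """Deliver the event to every matching subscriber; return the delivery count."""
        delivered = 0
        for predicate, handler in self._subscriptions:
            if predicate(event):
                handler(event)
                delivered += 1
        return delivered


broker = Broker()
slow_credit_checks: List[Event] = []

# Subscribe only to credit-check events whose roundtrip exceeds an assumed SLA threshold.
broker.subscribe(
    lambda e: e.get("step") == "credit_check" and e.get("roundtrip_ms", 0) > 500,
    slow_credit_checks.append,
)

broker.publish({"step": "credit_check", "roundtrip_ms": 120})
broker.publish({"step": "credit_check", "roundtrip_ms": 900})
broker.publish({"step": "approval", "roundtrip_ms": 950})
```

A real overlay would chain many such brokers and forward subscriptions between them, which is where the filtering pays off in reduced traffic.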
He finished by discussing how business processes can actually be guided by SLAs, that is, runtime use of SLAs rather than just for monitoring processes. If the process can be allocated to multiple resources in a fine-grained manner, then the ESB broker can dynamically determine the assignment of process parts to resources based on how well those resources are meeting their SLAs, or expected performance based on other factors such as location of data or minimization of traffic. He gave an example of optimization based on minimizing traffic by measuring message hops, which takes into account both rate of message hops and distance between execution engines. This requires that the distributed execution engines include engine profiling capabilities that allow an engine to determine not only its own load and capacity, but that of other engines with which it communicates, in order to minimize cost over the entire distributed process. To fine-tune this sort of model, process steps that have a high probability of occurring in sequence can be dynamically bound to the same execution engine. In this situation, they’ve seen a 47% reduction in traffic, and a 50% reduction in cost relative to the static deployment model.
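As a rough sketch of the traffic measure, assume the cost of a placement is the sum, over each pair of communicating steps, of message rate times the distance between the engines hosting them, with zero cost when both steps share an engine. That formula, the function name `traffic_cost` and all of the numbers below are my own illustrative assumptions, not the model or the results from the talk.

```python
from typing import Dict, FrozenSet, Tuple

Placement = Dict[str, str]  # process step -> execution engine


def traffic_cost(
    placement: Placement,
    message_rate: Dict[Tuple[str, str], float],
    distance: Dict[FrozenSet[str], float],
) -> float:
    """Sum of (message rate x engine distance) over communicating step pairs."""
    total = 0.0
    for (src, dst), rate in message_rate.items():
        src_engine, dst_engine = placement[src], placement[dst]
        if src_engine != dst_engine:  # co-located steps generate no network hops
            total += rate * distance[frozenset((src_engine, dst_engine))]
    return total


# Made-up loan process: messages per second between consecutive steps.
rates = {("receive", "credit_check"): 10, ("credit_check", "decide"): 8}
# Made-up hop count between two engines, A and B.
hops = {frozenset(("A", "B")): 3}

static = {"receive": "A", "credit_check": "B", "decide": "A"}
colocated = {"receive": "A", "credit_check": "B", "decide": "B"}

static_cost = traffic_cost(static, rates, hops)
colocated_cost = traffic_cost(colocated, rates, hops)
```

Binding `credit_check` and `decide` to the same engine removes one of the two cross-engine links. This toy ignores engine load and capacity, which the profiling described above would need to account for; otherwise the cheapest answer is always to put everything on one engine.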
After a brief break, Ignacio Silva-Lepe from IBM Research presented on federated SOA. SOA today is mostly used in a single domain within an organization, that is, it is fairly siloed in spite of the potential for services to be reused across domains. Whereas a single domain will typically have its own registry and repository, a federated SOA can’t assume that is the case, and must be able to discover and invoke services across multiple registries. This requires a federation manager to establish bridges across domains in order to make the service group shareable, and inject any cross-domain proxies required to invoke services across domains.
It’s not always appropriate to have a designated centralized federation manager, so there is also the need for domain autonomy, where each domain can decide what services to share and specify the services that it wants to reuse. The resulting cross-domain service management approach allows for this domain autonomy, while preserving location transparency, dynamic selection and other properties expected from federated SOA. In order to enable domain autonomy, the domain registry must not only have normal service registry functionality, but also references to required services that may be in other domains (possibly in multiple locations). The registries then need to be able to do a bilateral dissemination and matching of interest and availability information: it’s like internet dating for services.
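The bilateral matching can be pictured as each domain registry holding two lists, the services it is willing to share and the services it would like to reuse, with a match wherever one domain's interest meets another's availability. The sketch below assumes matching by exact service name; the `DomainRegistry` class, the `match` function and the domain and service names are hypothetical, and the real approach presumably matches on richer service specifications.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class DomainRegistry:
    """Hypothetical per-domain registry: what it offers and what it wants."""

    name: str
    shared: Set[str] = field(default_factory=set)    # services offered to other domains
    required: Set[str] = field(default_factory=set)  # services it wants to reuse


def match(registries: List[DomainRegistry]) -> List[Tuple[str, str, str]]:
    """Return (consumer, service, provider) for every interest met by another domain."""
    matches = []
    for consumer in registries:
        for provider in registries:
            if consumer is provider:
                continue  # a domain does not need federation to reach its own services
            for service in sorted(consumer.required & provider.shared):
                matches.append((consumer.name, service, provider.name))
    return matches


retail = DomainRegistry("retail", shared={"credit_check"}, required={"fx_rates"})
treasury = DomainRegistry(
    "treasury", shared={"fx_rates"}, required={"credit_check", "fraud_score"}
)

pairs = match([retail, treasury])
```

Here `fraud_score` stays unmatched because no domain shares it. Note that no central manager decides what is visible: each domain controls its own `shared` set, which is the autonomy property described above.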
They have quite a bit of work planned for the future, beyond the fairly simple matching of interest to availability: allowing domains to restrict visibility of service specifications to authorized parties without using a centralized authority, for example.
Marsha Chechik, also from the University of Toronto, gave a presentation on automated integration determination; like Jacobsen, she collaborates with IBM Research on middleware and SOA research; unlike Jacobsen, however, she presented research that is at a much earlier stage. She started with a general description of integration, where a producer and a consumer share some interface characteristics. She went on to discuss interface characteristics (what already exists) and service exposition characteristics (what we want): the as-is and to-be state of service interfaces. For example, there may be a requirement for idempotence, where multiple “submit” events over an unreliable communications medium would result in only a single result. In order to resolve the differences in characteristics between the as-is and to-be, we can consider typical service interface patterns, such as data aggregation, mapping or choreography, to describe the resolution of any conflicts. The problem, however, is that there are too many patterns, too many choices and too many dependencies; the goal of their research is to identify essential integration characteristics and make a language out of them, identify a methodology for describing aspects of integration, identify the order in which patterns can be determined, identify decision trees for integration pattern determination, and determine cases where integration is impossible.
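The idempotence requirement mentioned above is commonly met by having the client attach a unique request ID and having the service remember the result for each ID, so that retries are recognized as duplicates. This is a general technique rather than anything specific to the research presented; the `LoanService` class and its fields are hypothetical.

```python
from typing import Dict


class LoanService:
    """Hypothetical service whose submit operation is idempotent per request ID."""

    def __init__(self) -> None:
        self._results: Dict[str, str] = {}  # request ID -> result already produced
        self.applications_created = 0

    def submit(self, request_id: str, payload: dict) -> str:
        # A retry with a known request ID returns the stored result unchanged.
        if request_id in self._results:
            return self._results[request_id]
        self.applications_created += 1
        result = f"application-{self.applications_created}"
        self._results[request_id] = result
        return result


service = LoanService()
application = {"applicant": "Ada", "amount": 25000}

# The client retries because acknowledgements were lost on an unreliable link.
replies = [service.submit("req-1", application) for _ in range(3)]
```

All three replies refer to the same application, and only one application is created.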
Their first insight was to separate pattern-related concerns between physical and logical characteristics; every service has elements of both. They have a series of questions that begin to form a language for describing the service characteristics, and a classification for the results from those questions. The methodology contains a number of steps:
- Determine principal data flow
- Determine data integrity data flow, e.g., stateful versus stateless
- Determine reliability flow, e.g., mean time between failure
- Determine efficiency, e.g., response time
- Determine maintainability
Each of these steps determines characteristics and mapping to integration patterns; once a step is completed and decisions made, revisiting it should be minimized while performing later steps.
It’s not always possible to provide a specific characteristic for any particular service; their research is working on generating decision trees for determining if a service requirement can be fulfilled. This results in a pattern decision tree based on types of interactions; this provides a logical view but not any information on how to actually implement them. From there, however, patterns can be mapped to implementation alternatives. They are starting to see the potential for automated determination of integration patterns based on the initial language-constrained questions, but aren’t seeing any hard results yet. It will be interesting to see this research a year from now to see how it progresses, especially if they’re able to bring in some targeted domain knowledge.
Last up in the workshop was Vadim Berestetsky of IBM’s ESB tools development group, presenting on support for patterns in IBM integration offerings. He started with a very brief description of an ESB, and WebSphere Message Broker as an example of an ESB that routes messages from anywhere to anywhere, doing transformations and mapping along the way. He basically walked through the usage of the product for creating and using patterns, and gave a demo (where I could see vestiges of the MQ naming conventions). A pattern specification typically includes some descriptive text and solution diagrams, and provides the ability to create a new instance from this pattern. The result is a service integration/orchestration map with many of the properties already filled in; obviously, if this is close to what you need, it can save you a lot of time, like any other template approach.
In addition to demonstrating pattern usage (instantiation), he also showed pattern creation by specifying the exposed properties, artifacts, points of variability, and (developer) user interface. Looks good, but nothing earth-shattering relative to other service and message broker application development environments.
There was an interesting question that goes to the heart of SOA application development: is there any control over what patterns are created and published to ensure that they are useful as well as unique? The answer, not surprisingly, is no: that sort of governance isn’t enforced in the tool since architects and developers who guide the purchase of this tool don’t want that sort of control over what they do. However, IBM may see very similar patterns being created by multiple customer organizations, and choose to include a general version of that pattern in the product in future. A discussion about using social collaboration to create and approve patterns followed, with Berestetsky hinting that something like that might be in the works.
That’s it for the workshop; we’re off to lunch. Overall, a great review of the research being done in the area of service integration.
This afternoon, there’s the keynote and a panel that I’ll be attending. Tomorrow, I’ll likely pop in for a couple of the technical papers and to view the technology showcase exhibits, then I’m back Wednesday morning for the workshop on practical ontologies, and the women in technology lunch panel. Did I mention that this is a great conference? And it’s free?