A bit of meat to go with the whine

Yesterday, I posted a rather whiny entry about rude customers (and Bob McIlree was kind enough to give me a comforting pat on the shoulder, virtually speaking — thanks Bob!) so today I decided to get a bit more productive. Moving from whine to wine, I finally made my first cut of a Squidoo lens about Australian Wine in Toronto (yes, I’m a geek and this is how I spent part of my Sunday). Sort of a niche topic, true, but that’s what Squidoo lenses are all about: they let you quickly build a one-page portal with links to other sites, Amazon products, eBay, RSS feeds, and a number of other kinds of information. Since it’s all on the web, you can update it anywhere, which is why I’ve moved quite a bit of information about both wine and BPM from my websites to my two Squidoo lenses.

I want to add a bit of meat to this post to offset the whine of yesterday, and coincidentally (before I saw his comment), I was reading Bob’s post on SOA and Data Issues and the need to maintain a source system of record (SSoR) for data. In particular, he discusses a conversation that was passed along to him from another organization:

A, the SOA implementer, argues that application-specific databases have no need to retain SSoR data at all since applications can invoke services at any time to receive data. He further opined that the SOA approach will eliminate application silos as his primary argument in the thread.

B, the applications development manager, is worried that he won’t get the ‘correct’ value from A’s services and that he has to retain what he receives from SSoRs to reconcile aggregations and calculated values at any point in time.

Since I’m usually working on customer projects that involve the intersection of legacy systems, operational databases, BPMS and analytical databases, I see this problem a lot. In addition to B’s argument about getting the “correct” value, I also hear the efficiency argument, which usually manifests as “we have to replicate [source data] into [target system] because it’s too slow to invoke the call to the source system at runtime”. If you have to screen-scrape data from a legacy CICS screen and reformat it at every call, I might go for the argument to replicate the mainframe data into an operational database for faster access. However, if you’re pulling data from an operational database and merging it with data from your BPMS, I’m going to find it harder to accept efficiency as a valid reason for replicating the data into the BPMS. I know, it’s easier to do it that way, but it’s just not right.

When data is replicated between systems, the notion of the SSoR, or “golden copy”, of the data is often lost, the most common problem being when the replicated data is updated and never synchronized back to the original source. This is exacerbated by synchronization applications that attempt to update the source but were written by someone who didn’t understand their responsibility in creating what is effectively a heterogeneous two-phase commit — if the update on the SSoR fails, no effective action is taken to either rollback the change to the replicated data or raise a big red flag before anyone starts making further decisions based on either of the data sources. Furthermore, what if two developers each take the same approach against the same SSoR data, replicating it to application-specific databases, updating it, then trying to synchronize the changes back to the source?
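To make that responsibility concrete, here’s a minimal sketch (not from Bob’s post) of what the synchronization code is signing up for; the names update_replica, update_ssor and rollback_replica are hypothetical stand-ins for whatever your integration layer actually provides:

```python
# Minimal sketch of replica-to-SSoR synchronization with explicit failure handling.
# The callables passed in are hypothetical; the point is that a failed SSoR update
# must either undo the replica change or raise a very loud flag.

import logging

logger = logging.getLogger("ssor_sync")

def sync_to_ssor(record_id, new_value, update_replica, update_ssor, rollback_replica):
    """Apply a change to the local replica, then push it to the source system of record."""
    update_replica(record_id, new_value)
    try:
        update_ssor(record_id, new_value)
    except Exception:
        logger.exception("SSoR update failed for %s; rolling back replica", record_id)
        try:
            rollback_replica(record_id)
        except Exception:
            # The big red flag: the replica and the SSoR may now disagree.
            logger.critical("Rollback also failed for %s; replica and SSoR are out of sync", record_id)
            raise
        raise
```

Nothing about this is hard, but someone has to own it; the problems Bob describes come from replication code that simply skips the except branch.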

I’m definitely in the A camp: services eliminate (or greatly reduce) the need to replicate data between systems, and create a much cleaner and safer data environment. In the days before services ruled the earth, you could be forgiven for that little data replication transgression. In today’s SOA world, however, there are virtually no excuses to hide behind any more.

Best quote from Mashup Camp

That’s the thing about mashups, almost all of them are illegal

I heard that (and unfortunately am unable to credit the source) in the “scrAPI” session at Mashup Camp, in which we discussed the delicate nature of using a site that doesn’t have APIs as part of a mashup. Adrian Holovaty of ChicagoCrime.org (my favourite mashup at camp) was leading part of the session, demonstrating what he had done with Chicago police crime data (the police, not having been informed in advance, called him for a little chat the day his site went live), Google maps, Yahoo! maps (used for geocoding after he was banned from the Google server for violating the terms of service) and the Chicago Journal.

Listening to Adrian and others talk about the ways to use third-party sites without their knowledge or permission really made me realize that most mashup developers are still like a bunch of kids playing in a sandbox, not realizing that they might be about to set their own shirts on fire. That’s not a bad thing, just a comment on the maturity of mashups in general.

The scrAPI conversation — a word, by the way, that’s a mashup between screen scraping and API — is something very near and dear to my heart, although in another incarnation: screen scraping from third-party (or even internal) applications inside the enterprise in order to create the type of application integration that I’ve been involved in for many years. In both cases, you’re dealing with a third party who probably doesn’t know that you exist, and doesn’t care to provide an API for whatever reason. In both cases, that third party may change the screens on their whim without telling you in advance. The only advantage of doing this inside the enterprise is that the third party usually doesn’t know what you’re doing, so if you are violating your terms of service, it’s your own dirty little secret. Of course, the disadvantage of doing this inside the enterprise is that you’re dealing with CICS screens or something equally unattractive, but the principles are the same: from a landing page, invoke a query or pass a command; navigate to subsequent pages as required; and extract data from the resultant pages.
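In web terms, that pattern looks roughly like the sketch below, assuming the requests and BeautifulSoup libraries; every URL, form field and CSS selector here is invented for illustration, since the real sites obviously don’t publish them:

```python
# Rough sketch of the scrape pattern: land, query, navigate, extract.
import requests
from bs4 import BeautifulSoup

session = requests.Session()

# 1. Start from the landing page (often needed to pick up cookies/session state).
session.get("https://example.com/search")

# 2. Invoke the query by posting the same form a browser would.
results_page = session.post(
    "https://example.com/search",
    data={"query": "incidents", "page": 1},
)

# 3. Navigate to subsequent pages and extract data from the resultant HTML.
soup = BeautifulSoup(results_page.text, "html.parser")
summaries = []
for link in soup.select("a.result"):
    detail = session.get("https://example.com" + link["href"])
    detail_soup = BeautifulSoup(detail.text, "html.parser")
    summaries.append(detail_soup.select_one("div.summary").get_text(strip=True))
```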

There are some interesting ways to make all of this happen in mashups, such as using LiveHTTPHeaders to watch the traffic on the site that you want to scrape, and faking out forms by passing parameters that are not in their usual selection lists (Adrian did this with ChicagoCrime.org to pass a much larger radius to the crime stats site than its form drop-down allowed, in order to pull back the entire geographic area in one shot). Like many enterprise scraping applications, site scraping applications often cache some of the data in a local database for easier access or further enrichment, aggregation, analysis or joining with other data.
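Here’s a hedged illustration of both tricks; the endpoint, field names and radius value are hypothetical, and bypassing a form’s drop-down limits may well violate the site’s terms of service:

```python
# Pass a form parameter well outside the drop-down's usual range, then cache locally.
import sqlite3
import requests

# A radius far larger than anything in the form's drop-down, so one request
# covers the whole area instead of dozens of small ones.
resp = requests.get(
    "https://example.com/crime/search",
    params={"lat": 41.88, "lng": -87.63, "radius_miles": 500},
)

# Cache the raw result locally for later enrichment, aggregation or joins.
db = sqlite3.connect("scrape_cache.db")
db.execute("CREATE TABLE IF NOT EXISTS raw_pages (url TEXT PRIMARY KEY, body TEXT)")
db.execute(
    "INSERT OR REPLACE INTO raw_pages (url, body) VALUES (?, ?)",
    (resp.url, resp.text),
)
db.commit()
```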

In both web and enterprise cases, there’s a better solution: build a layer around the non-API-enabled site/application, and provide an API to allow multiple applications to access the underlying application’s data without each of them having to do site/screen scraping. Inside the enterprise, this is done by wrapping web services around legacy systems, although much of this is not happening as fast as it should be. In the mashup world, Thor Muller (of Ruby Red Labs) talked about the equivalent notion of scraping a site and providing a set of methods for other developers to use, such as Ontok‘s Wikipedia API.
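A minimal sketch of that wrapper idea, using Flask purely as an example; scrape_topic() stands in for whatever scraping and parsing code actually talks to the underlying site, and the route and field names are invented for illustration:

```python
# Wrap the scraper behind a stable API so other applications never scrape directly.
from flask import Flask, jsonify

app = Flask(__name__)

def scrape_topic(topic):
    # Placeholder for the real scraping/parsing logic against the underlying site.
    return {"topic": topic, "summary": "..."}

@app.route("/api/topics/<topic>")
def get_topic(topic):
    # Every caller hits this endpoint; only this layer knows (or cares)
    # how the underlying site is actually scraped.
    return jsonify(scrape_topic(topic))

if __name__ == "__main__":
    app.run(port=8080)
```

When the underlying site changes its screens, only this one layer has to be fixed, which is exactly the argument for wrapping web services around legacy systems inside the enterprise.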

We talked about the legality of site scraping, namely that there are no explicit rights to use the data, and the definition of fair use may or may not apply; this is what prompted the comment with which I opened this post.

In the discussion of strategic issues around site scraping, I certainly agree that site scraping indicates a demand for an API, but I’m not sure that I completely agree with the comment that site scraping forces service and data providers to build/open APIs: sure, some of them are likely just unaware that their data has any potential value to others, but there’s going to be many more who either will be horrified that their data can be reused on another site without attribution, or just don’t get that this is a new and important way to do business.

In my opinion, we’re going to have to migrate towards a model of compensating the data/service provider for access to their content, whether it’s done through site scraping or an API, in order to gain some degree of control (or at least advance notice) of changes to the site that would break the calling/scraping applications. That compensation doesn’t necessarily have to mean money changing hands, but ultimately everyone is driven by what’s in it for them, and needs to see some form of reward.

Update: Changed “scrapePI” to “scrAPI” (thanks, Thor).

Implementing BPM

The flight home from Mashup Camp was a great opportunity to catch up on my notes from the past couple of weeks, including several ideas triggered by discussions at the TIBCO seminar last week: some because I disagreed with the speakers, but some because I agreed with them. I split my opinions on the discussions about implementing BPM systems, specifically about the role of a business process analyst, and agile versus waterfall development.

First of all, the business process analyst role: Michael Melenovsky sees this as important for BPM initiatives, but I tend to feel the same way as I do about business rules analysts: just give me a good business analyst any day, and they’ll be able to cover rules, process, and whatever else is necessary for making that business-IT connection. Furthermore, he sees the BPA as a link between a BA and IT, as if we need yet another degree of separation between the business and those who are charged with implementing solutions to make business run better. Not.

There were some further discussions on how business and IT have to collaborate on BPM initiatives (duh) and share responsibility for a number of detailed design and deployment tasks, but this is true for any technology implementation. If you don’t already have a good degree of collaboration between business and IT, don’t expect it to magically appear for your BPM initiatives, but do take note that the need for it is at least as great as for any other technology implementation. How we’re supposed to collaborate more effectively by shoehorning a BPA between a BA and IT is beyond me, however.

Melenovsky also had some interesting “lesson learned” stats on the correlation between the time spent on process discovery and model construction, and the success of the BPM initiative: basically, do more work on your up-front analysis and business-focussed design, and your project will be more successful. Gartner found that over 40% of the project time should be spent on process discovery, another 9% on functional and technical specifications, and just 12% on implementation. In my experience, you’ll spend that 40% on process discovery either up-front, or when you go back and do it over because you implemented the wrong thing due to insufficient process discovery in the first place: as usual, a case of choosing between doing it right or doing it over.

That directly speaks to the need for agile, or at least iterative, development on BPM projects. You really can’t use waterfall methods (successfully) for BPM development (or most other types of technology deployments these days), for so many reasons:

  1. When implementing new (and disruptive) technology, whatever the business tells IT they want is not accurate, since the business really isn’t able to visualize the entire potential solution space until they see something working.
  2. While IT is off spending two years implementing the entire suite of required functionality in preparation for an all-singing, all-dancing big bang roll-out, the business requirements will change.
  3. During that two years, the business is losing money, productivity and/or opportunities due to the lack of whatever BPM is going to do for them, and is building stop-gap solutions using Excel, Access, paper routing slips and prayer.
  4. That all-singing, all-dancing complex BPM implementation is, by definition, more complex and therefore more rigid (less flexible) due to the amount of custom development. It makes sense that you can’t use a development methodology that’s not agile to implement processes that are agile.
  5. The big bang roll-out is a popular notion with the business right up to the point when it happens, and they discover that it doesn’t meet their needs (refer to points 1 and 2). Then it becomes unpopular.

Instead, get the business requirements and process models worked out up front, then engage in iterative methods of designing, implementing and deploying the systems. Deliver “good enough” processes on the first cut, then make iterative improvements. Don’t assume that the business users aren’t capable of providing valuable feedback on required functionality: just make them do their job with the first version of the system, and they’ll give you plenty of feedback. Consider the philosophy of an optional scope contract rather than a fixed price/date/scope contract, whether you’re using internal or external resources for the implementation. Where possible, put changes to the business processes and rules in the hands of the business so that they can tweak the controls themselves.

Killing me softly…with SOA

Joe McKendrick posted last week about whether open source or SOA is killing the software industry faster, right on the heels of a couple of articles in eWeek about how E-Trade is switching to open source (E-Trade’s not just implementing Linux, which would hardly raise an eyebrow these days, but also components higher up in the stack, such as web server, application server and transaction management software).

From the point of view of the software industry, these are both disruptive technologies that fundamentally change the way that business is done. Funny, after all these years of introducing disruptive technologies to other businesses that resulted in some pretty major upheavals, software companies are getting it back in spades.

As for SOA and other technologies that make software development faster and easier, I say “bring it on”. I have little tolerance for systems integrators (or the professional services arm of software vendors) that won’t use newer, better technology when it makes them less money, although there are a few of them that seem to get it.

Business (rule) analysis

I received the call for papers for the 9th International Business Rules Forum, which has prompted me to browse through the other business rule-related tidbits that I’ve been viewing over the past few weeks. If you’ve been reading Column 2 for a while, you already know that I think that business rules are a crucial feature in BPM, whether the BPMS contains them inherently or as an add-on: you can find some of my previous posts on BPM and business rules here, here, here and here.

Rolando Hernandez recently posted a short term outlook for business rules — in short, that BR provide huge competitive advantage through business agility — plus an opinion on the differences between a business analyst and a business rules analyst.

The business rules analyst is focused on separating rules from code. The rule analyst walks and talks business… The rule analyst talks about business rules and business logic. The rule analyst means business.

The business analyst sees rules as code. The business analyst talks about the system. A business analyst is often a systems analyst by nature, and by training… The systems analyst means code.

I don’t think that there is a big difference in the inherent skills of business analysts and business rules analysts; rather, I think that systems analysts need to stop foisting themselves off as business analysts. Rolando starts a paragraph describing the business analyst (“the business analyst sees rules as code“), segues through an assumption (“a business analyst is often a systems analyst by nature, and by training“) and by the end of the paragraph is referring to the systems analyst rather than the business analyst, as if there were no difference. Yes, this happens, but it’s unfair to paint all business analysts with the same brush.

I also see the opposite problem, where a business user is designated as a business analyst, even though he (or she) has no skills or training in analysis; since he’s not trained to write requirements that are both necessary and sufficient, the resulting solution will not do what the business needs it to do. Furthermore, since he’s probably not up on the latest in associated technology areas, he’s unlikely to think outside the box because he doesn’t even know that the box exists.

The trick is to meet somewhere in the middle: a business analyst or business rules analyst needs to be focussed on the business, but be aware of the capabilities and limitations of the technology. The first job of the business (rule) analyst is to determine the business requirements, not write a functional specification for how the system might behave, as I’ve posted in the past. A business analyst needs training in the business area under study, but also needs training and experience in gathering requirements, analyzing business functions, optimizing business processes and documenting requirements, plus a high-level understanding of the functionality (not the technology) of any systems that might be brought to bear on a solution.

Software development methodology

An excellent article by Matthew Heusser about the tradeoffs in designing a software development methodology. I’ve been involved in software development for over 20 years as a developer, designer and architect, both within software product companies and as a systems integrator or consultant to large organizations’ IT departments, and I’ve seen a wide range of software development methodologies from rigidly-imposed waterfall to much more agile techniques. Many of my customers are large and conservative, and tend towards the more rigidly structured methodologies that fit into the larger corporate budgetary process. Heusser nails the problem with that:

For example, if the organization wants accountability and predictability, it may require documented requirements with various levels of review and signoff, and create a change-control board with the power to line-item veto changes in requirements that may affect schedule. Everything sounds good so far…until about six months later, when the VP of New Product Development can’t get the feature set changed in order to respond to a market demand. Some ninny down in software engineering is holding up the company’s ability to deliver products to customers, all in the name of “process improvement”!

He also examines the tradeoffs between estimates and code, delivery date or features, control or productivity, and organic versus mechanistic decision making. He’s obviously a fan of agile development, preferring (like me) to get a simple working system in place first, then let the customer set priorities on what features to implement next.

(via Managing Knowledge Processes)

Design and requirements

Some recent work for a client has me struggling over the distinction between requirements and design: not in my own mind, where it’s pretty clear, but in the minds of those who make no distinction between them, or attempt to substitute design for requirements.

First in any project are the business requirements, that is, a statement of what the business needs to achieve. For example, in a transaction processing environment (which is a lot of what I work with), a business requirement would be “the transactions must be categorized by transaction type, since the processing is different for each type”. Many business requirements and constraints of an existing organization are detected, not designed; from a Zachman standpoint, this is rows 1 and 2, although I think that a few of the models developed in row 2 (such as the Business Workflow Model) are actually functional design rather than requirements.

Second is the functional design, or functional specifications, that is, a statement of how the system will behave in order to meet the business requirements. For example, in our transaction processing example, the functional specification could be “the user selects the transaction type” or “the transaction type is detected from the barcode value”. Some references refer to these as “functional requirements”, which I think is where the problem in terminology lies: when I say “requirements”, I mean “business requirements”; when some people say “requirements”, they mean “functional requirements”, or rather, how the system behaves rather than what the business needs. This is further complicated by those who choose not to document the business requirements at all, but go directly to functional design, call it “requirements”, and gloss over the fact that the business requirements have been left undocumented. Personally, I don’t consider functional design to be requirements, I consider it to be design, to be performed by a trained designer, and based on the business requirements.
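Purely to make the chain concrete: the business requirement says transactions must be categorized by type because each type is processed differently; the functional design chose “the transaction type is detected from the barcode value”; a technical design might then end up looking something like the sketch below, where the prefix scheme and handlers are invented for illustration:

```python
# Illustrative technical design for the functional spec "detect type from barcode".

def handle_payment(txn):
    ...  # type-specific processing

def handle_refund(txn):
    ...  # type-specific processing

# Functional design decision: the barcode prefix identifies the transaction type.
TYPE_BY_PREFIX = {"01": "payment", "02": "refund"}

HANDLERS = {"payment": handle_payment, "refund": handle_refund}

def process(txn):
    txn_type = TYPE_BY_PREFIX[txn["barcode"][:2]]
    return HANDLERS[txn_type](txn)
```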

Lastly is the technical design, about which there is usually little debate, although there are still way too many projects where this stage is at least partially skipped in favour of going straight from functional design to coding.

All this being said, it’s possible for a designer who is familiar with the requirements to internalize them sufficiently to do a good functional design, which in turn can produce a good technical design and a system that meets the current requirements. So what’s the problem with just skipping the documentation of business requirements and progressing straight to functional design? There are two problems, as I see it. First, there’s a big communication gap between the business and the technology. The technical designer understands what the system should do, but not why it should do it that way, so can’t make reasonable suggestions for modifications to the functional design if there appears to be a better way to implement the system. Second, the future agility of both the business processes and the technology is severely impacted, since it will be nearly impossible to determine if a change to the technology will violate a business requirement, or to model how a changing business requirement will impact the technology implementation, since the requirements are not formally linked via a functional design to the technical design.

A lot of the work that I do is centred around BPM, and although these principles aren’t specific to BPM, they’re particularly important on BPM projects, where the functional design can sometimes appear to be obvious to the subject matter experts (who are not usually trained designers) and a lot of paving of the cow paths occurs because of that.

Adaptive approaches

Greg Wdowiak’s post on application integration and agility (as in agile development) includes a great comparison of plan-driven versus adaptive development. He rightly points out that both approaches are valid, but for different types of projects:

Adaptive approach provides for an early customer’s feedback on the product. This is critical for new product development where the customer and designers ideas may significantly differ, be very vague, or the kind of product that is being design has not been ever tried before; therefore, having the ability for the customers to ‘touch’ an approximation is very important if we want to build something useful.

That pretty much describes most development projects that I’m involved in…

The plan-driven approach allows for optimization of the project trajectory. The trajectory of adaptive approach is always suboptimal, but this is only apparent once the project is complete.

As this last quote from his post makes clear, the plan-driven approach works well for well-understood implementations, but not so well for the introduction of new technology/functionality into an organization. The plan-driven approach reduces logistical risks, whereas the adaptive approach reduces the risks of uncertain requirements and unknown technology.

One of the key advantages of adaptive development in introducing new technology is the delivery methodology: instead of a “big bang” delivery at the end, which often surprises the end-user by not delivering what they expected (even though it may have been what was agreed upon in the specifications), it offers incremental approximations of the final result which are refined at each stage based on end-user feedback.

So why isn’t the adaptive approach used for every new technology project? Alas, the realities of budgets and project offices often intervene: many corporate IT departments require rigid scheduling and costing that don’t allow for the fluidity required for adaptive development, for example, by requiring complete signed-off requirements before any development begins. Although it’s certainly possible to put a project plan in place for an adaptive development project, it doesn’t look the same as a “classical” IT project plan, so may not gain the acceptance required to acquire the budget. Also, if part of the development is outsourced, this type of rigid project planning is almost always used to provide an illusion of control over the project.

When a company just isn’t ready for the adaptive approach yet, but can be convinced that the plan-driven approach isn’t quite flexible enough, I propose a hybrid approach through some project staging: my mantra is “get something simpler in production sooner”. If I’m working with a BPM product, for example, my usual recommendation is to deploy out-of-the-box functionality (or nearly so) to allow the users to get their hands on the system and give us some real feedback on what they need, even if it means that they have to work around some missing functionality. In many cases, there’s a lot of the OOTB functionality that’s completely acceptable to them, although the users may never have specified it in exactly the same manner. Once they’ve had a chance to see what’s available with a minimal amount of custom development, they can participate in informed discussions about where the development dollars are best spent.

This approach often puts me at odds with an outsourced development group: they want to maximize their development revenue from the client, whereas I want to keep the design simple and get something into production as soon as possible. I’ve had many cases in the past where I’ve worked as a subcontractor to a large systems integrator, and I almost always end up in that same conflict of interest, which explains why I usually try to work as a freelance designer/architect directly for the end customer, rather than as a subcontractor.

User-driven design

Kathy Sierra recently posted the following helpful hint on Creating Passionate Users:

Seriously, though, she goes on to say:

Most of us realize that focus groups are notoriously ineffective for many things, but we still assume that listening to real feedback from real users is the best way to drive new products and services, as well as improve on what we have. But there’s a huge problem with that — people don’t necessarily know how to ask for something they’ve never conceived of! Most people make suggestions based entirely around incremental improvements, looking at what exists and thinking about how it could be better. But that’s quite different from having a vision for something profoundly new.

This isn’t a new idea (that users themselves are typically not going to come up with breakthrough innovations), but one that we need to constantly keep in mind. When I’m designing a system, I make a deal with the users who are involved in focus groups, JADs and other interviews: they tell me what they need to accomplish to meet their business goals, and I’ll design the best way to do it. In other words, I’ll treat them as the business subject matter experts, and they’ll treat me as the design expert.

There’s a constant struggle with users who insist on specific features (e.g., “the button has to be blue”, when I’m trying to create something that doesn’t even need a button) because they don’t have the perspective to spontaneously visualize the future that is possible for them. Designing BPM systems is particularly problematic, since manual business processes are part of the folklore of an organization, and changing them causes some amount of cultural disruption. Having users involved in the design process is necessary, but it’s also necessary not to be unduly influenced by protests of “but we’ve always done it this way”.

Strangely enough, I was on Amazon yesterday and under “My Recommendations” it came up with Flatland, a short book of fiction about geometry, published in 1884, that I haven’t read since I was in university:

Flatland…imagines a two-dimensional world inhabited by sentient geometric shapes who think their planar world is all there is. But one Flatlander, a Square, discovers the existence of a third dimension and the limits of his world’s assumptions about reality and comes to understand the confusing problem of higher dimensions.

As a designer, sometimes I just have to think like a Square in a Flatland.

Shallow vs. Deep Knowledge

The EDS Fellows’ Next Big Thing blog today discusses how business applications continue moving towards less custom coding and more off-the-shelf reusable vendor components, and the impact that has on an integrator’s knowledge of the vendor components. Interesting that some of the best minds at this large SI are pointing out that their portion of any particular job is likely to continue to shrink, something that I wrote about last week, although they don’t discuss how EDS or other large SIs are going to fill in the gaps in their past business model of “build everything”.

The point of their post is, however, how can someone working on a business application have sufficient knowledge in order to understand the strengths and weaknesses of a given vendor component when they haven’t seen the source code? They go on to provide a scientific method for gaining a deeper knowledge of a component without access to the source code, but their entire argument is based on an old-style mainframe integration (which, to be fair, was/is EDS’ sweet spot) where it was fairly common to have access to vendors’ source code.

I have to say, welcome to the real world: I’ve been doing integration for over 15 years, have a very deep knowledge of a few vendors’ products, plus a shallower knowledge of a bunch of other products, and I’ve never seen a line of vendor source code. Personally, I can’t think of very many cases where access to the source code would have improved the end result; as any good software QA team can tell you, you don’t need to see the code in order to determine the behaviour and boundaries of a component.
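A small sketch of what that black-box probing looks like in practice, assuming a vendor component exposed through some callable convert_amount(); the function name and its limits are hypothetical:

```python
# Black-box boundary tests: learn a component's behaviour by exercising it,
# not by reading the vendor's source code.
import pytest

from vendor_component import convert_amount  # hypothetical vendor API, no source available

def test_normal_case():
    assert convert_amount("100.00", "USD", "CAD") > 0

def test_boundary_zero():
    assert convert_amount("0.00", "USD", "CAD") == 0

def test_unknown_currency_raises():
    with pytest.raises(ValueError):
        convert_amount("100.00", "USD", "XXX")
```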

Furthermore, their scientific method doesn’t include a vital component: vendor relationships. If you’re building a significant business on a specific vendor’s products, you have to establish and maintain a relationship with them so as to have relatively easy access to their internal technical resources, the people further behind the customer support front line. Having done this with a couple of vendors in the past (and then being accused of being in bed with them for my efforts), I know that this is a key contributor to gaining the requisite deep knowledge for a successful integration.