I know that only a small percentage of my readers are in Toronto, but if you are, and you blog, you’ll be interested in the bloggers’ dinner that is forming around Shel Israel‘s visit next Monday. Shel, the co-author of Naked Conversations, will even sign a copy of his book for you. Shoeless Joe’s, 401 King West, 6:30pm. Details here and here.
Do you know the way to SOA?
It’s not often that a graphic from a vendor makes me laugh out loud, but this one from TIBCO did:
I’ll be humming Burt Bacharach tunes all afternoon.
ScrAPI series
The first of a series of posts on scrAPIs (which I commented on following Mashup Camp last week) by Thor Muller of Rubyred Labs. I’m looking forward to the rest of the series this week.
Retro look at the impact of SOA
I recently discovered some notes that I had made back in November 2004 from a TIBCO webinar, “Enabling Real-time Business with a Service-Oriented and Event-Driven Architecture”. Randy Heffner from Forrester spoke at that webinar, and I remember that it was his words that made me realize what an impact SOA was going to have, and how strategic SOA requires a focus on enterprise architecture, particularly the application architecture and technical architecture layers, so that business and IT metrics can be tied back to defined services.
Although it seems obvious now, that webinar really crystallized the idea of services as process steps to be orchestrated, and how this allowed you to focus on an end-to-end process across all stakeholders, not just what happens inside your organization: the Holy Grail of BPM, as it were. EA often does not include business architecture, but services force it to consider the business process architecture and business strategy/organization.
Web 2.0 conference in Toronto
May 8th and 9th, no location or agenda specified as yet, but there’s a Web 2.0 conference being organized in Toronto. Mathew Ingram, one of the organizers, has a post about it here.
Computer History Museum
My wrapup of Mashup Camp wouldn’t be complete without mentioning the fabulous Computer History Museum in Mountain View where the event was held. Great venue, and the part of their collection that we were able to view during the party on Monday night was very nostalgic (although I can’t quite say that I miss RSX11M). Definitely worth a visit if you’re in the Bay area.
On my return to Toronto, I had lunch with a friend who works for Alias, the day after she emailed me to say that their corporate email addresses had changed from @alias.com to @autodesk.com following the recent acquisition. It’s the end of an era for a long-running, innovative Canadian software company. Having been there since the late 1980s, she saw many transitions, including the purchase of Alias by Silicon Graphics (and its subsequent sale). SGI was, at the time, housed in the building that now holds the Computer History Museum, and she remembers visiting there when it was SGI headquarters. An interesting footnote after spending the first part of the week there.
Picturing yourself at Mashup Camp
I’m still wrapping my brain around some of the ideas that started forming in my head at Mashup Camp, but I’ve been having fun browsing through all of the photo-detritus of the event. I was surprised to find that I made the first photo in Valleywag’s coverage of the event, and Doc Searls caught me at the XDI session on Monday (bonus tip: wear purple at popular events so that you can find yourself in the photos). There are over 900 Flickr photos tagged mashupcamp, and likely many more still languishing out there on memory cards.
Best quote from Mashup Camp
That’s the thing about mashups, almost all of them are illegal
I heard that (and unfortunately am unable to credit the source) in the “scrAPI” session at Mashup Camp, in which we discussed the delicate nature of using a site that doesn’t have APIs as part of a mashup. Adrian Holovaty of ChicagoCrime.org (my favourite mashup at camp) was leading part of the session, demonstrating what he had done with Chicago police crime data (the police, not having been informed in advance, called him for a little chat the day his site went live), Google maps, Yahoo! maps (used for geocoding after he was banned from the Google server for violating the terms of service) and the Chicago Journal.
Listening to Adrian and others talk about the ways to use third-party sites without their knowledge or permission really made me realize that most mashup developers are still like a bunch of kids playing in a sandbox, not realizing that they might be about to set their own shirts on fire. That’s not a bad thing, just a comment on the maturity of mashups in general.
The scrAPI conversation — a word, by the way, that’s a mashup between screen scraping and API — is something very near and dear to my heart, although in another incarnation: screen scraping from third-party (or even internal) applications inside the enterprise in order to create the type of application integration that I’ve been involved in for many years. In both cases, you’re dealing with a third party who probably doesn’t know that you exist, and doesn’t care to provide an API for whatever reason. In both cases, that third party may change the screens on a whim without telling you in advance. The only advantage of doing this inside the enterprise is that the third party usually doesn’t know what you’re doing, so if you are violating your terms of service, it’s your own dirty little secret. Of course, the disadvantage of doing this inside the enterprise is that you’re dealing with CICS screens or something equally unattractive, but the principles are the same: from a landing page, invoke a query or pass a command; navigate to subsequent pages as required; and extract data from the resultant pages.
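To make that loop concrete, here is a minimal sketch in Python using the requests and BeautifulSoup libraries; the URL, form fields and markup selectors are hypothetical placeholders rather than any real site’s structure, but the land/query/navigate/extract pattern is the one described above.

```python
# Minimal scraping loop, purely illustrative: the URL, form fields and CSS
# selectors are hypothetical stand-ins, not any real site's structure.
import requests
from bs4 import BeautifulSoup

BASE = "https://example.com"  # hypothetical landing page

def scrape_listings(query):
    session = requests.Session()
    # 1. Land on the start page (picks up any cookies the site expects)
    session.get(BASE + "/search")
    # 2. Invoke the query by posting the same form a browser would
    resp = session.post(BASE + "/search", data={"q": query, "page": 1})
    results = []
    while True:
        # 3. Extract data from the result page
        soup = BeautifulSoup(resp.text, "html.parser")
        for row in soup.select("div.result"):
            results.append(row.get_text(strip=True))
        # 4. Navigate to subsequent pages as required
        next_link = soup.find("a", string="Next")
        if not next_link:
            break
        resp = session.get(BASE + next_link["href"])
    return results

if __name__ == "__main__":
    print(scrape_listings("widgets"))
```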
There are some interesting ways to make all of this happen in mashups, such as using LiveHTTPHeaders to watch the traffic on the site that you want to scrape, and faking out forms by passing parameters that are not in their usual selection lists (Adrian did this with ChicagoCrime.org, passing a much larger radius to the crime stats site than its form drop-down allowed in order to pull back the entire geographic area in one shot). Like many enterprise scraping applications, site scraping applications often cache some of the data in a local database for easier access or further enrichment, aggregation, analysis or joining with other data.
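Here is a hedged sketch of that form-faking trick plus local caching, assuming a hypothetical endpoint that returns JSON; the parameter names, the oversized radius and the SQLite schema are all invented for illustration, not taken from ChicagoCrime.org.

```python
# Illustrative only: the endpoint and parameter names are made up, and the
# oversized radius is the "fake out the form" trick described above.
import json
import sqlite3
import requests

def fetch_and_cache(lat, lng, radius_miles=50):
    # The site's own drop-down might stop at 1 mile; posting a larger value
    # directly (if the server accepts it) returns the whole area in one call.
    resp = requests.get(
        "https://example.org/crime/search",  # hypothetical endpoint
        params={"lat": lat, "lng": lng, "radius": radius_miles},
    )
    records = resp.json()  # assume a JSON payload for the sketch

    # Cache locally so later enrichment/joins don't re-hit the source site
    db = sqlite3.connect("scrape_cache.db")
    db.execute(
        "CREATE TABLE IF NOT EXISTS incidents (id TEXT PRIMARY KEY, payload TEXT)"
    )
    for rec in records:
        db.execute(
            "INSERT OR REPLACE INTO incidents VALUES (?, ?)",
            (rec["id"], json.dumps(rec)),
        )
    db.commit()
    db.close()
    return records
```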
In both web and enterprise cases, there’s a better solution: build a layer around the non-API-enabled site/application, and provide an API to allow multiple applications to access the underlying application’s data without each of them having to do site/screen scraping. Inside the enterprise, this is done by wrapping web services around legacy systems, although much of this is not happening as fast as it should be. In the mashup world, Thor Muller (of Ruby Red Labs) talked about the equivalent notion of scraping a site and providing a set of methods for other developers to use, such as Ontok‘s Wikipedia API.
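As a rough sketch of what such a layer might look like, here is a minimal Flask service that puts a JSON API in front of the scraping logic from the earlier example, so that other applications call the API rather than scraping the site themselves; the route and module names are assumptions, not anyone’s actual scrAPI.

```python
# A thin "scrAPI" layer: other applications call this JSON endpoint instead
# of each scraping the underlying site themselves.
from flask import Flask, jsonify, request

# Hypothetical: the land/query/navigate/extract sketch above, saved as scraper.py
from scraper import scrape_listings

app = Flask(__name__)

@app.route("/api/listings")
def listings():
    query = request.args.get("q", "")
    # Centralize the fragile scraping logic (plus any caching or rate
    # limiting) behind one stable interface
    return jsonify(scrape_listings(query))

if __name__ == "__main__":
    app.run(port=8080)
```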
We talked about the legality of site scraping, namely that there are no explicit rights to use the data, and the definition of fair use may or may not apply; this is what prompted the comment with which I opened this post.
In the discussion of strategic issues around site scraping, I certainly agree that site scraping indicates a demand for an API, but I’m not sure that I completely agree with the comment that site scraping forces service and data providers to build or open APIs: sure, some of them are likely just unaware that their data has any potential value to others, but there are going to be many more who either will be horrified that their data can be reused on another site without attribution, or just don’t get that this is a new and important way to do business.
In my opinion, we’re going to have to migrate towards a model of compensating the data/service provider for access to their content, whether it’s done through site scraping or an API, in order to gain some degree of control (or at least advance notice) of changes to the site that would break the calling/scraping applications. That compensation doesn’t necessarily have to mean money changing hands, but ultimately everyone is driven by what’s in it for them, and needs to see some form of reward.
Update: Changed “scrapePI” to “scrAPI” (thanks, Thor).
Mashing up a new world (dis)order
Now that I’ve been disconnected from the fire hose of information that was Mashup Camp, I’ve had a bit of time to reflect on what I saw there.
Without doubt, this is the future of application integration both on the public internet and inside the enterprise. But — and this is a big but — it’s still very embryonic, and I can’t imagine seriously suggesting much of this to any CIO that I know at this point, since they all work for large and fairly conservative organizations. However, I will be whispering it in their ears (not literally) over the coming months to help prepare them for the new world (dis)order.
From an enterprise application integration perspective, there are two major lessons to be learned from Mashup Camp.
First, there are a lot of data sources and services out there that could be effectively combined with enterprise data for consumption both inside and outside the firewall. I saw APIs that wrap various data sources (including very business-focused ones such as Dun & Bradstreet), VOIP, MAPI and CRM as well as the better-known Google, Yahoo! and eBay APIs. The big challenge here is the NIH syndrome: corporate IT departments are notorious for rejecting services and especially data that they don’t own and didn’t create. Get over it, guys. There’s a much bigger world of data and services than you can ever build yourself, and you can do a much better job of building the systems that are actually a competitive differentiator for you rather than wasting your time building your own mapping system so that you can show your customers where your branches are located. Put those suckers on Google maps, pronto. This is no different than thousands of other arguments that have occurred on this same subject over the years, such as “don’t build your own workflow system” (my personal fave), and is no different than using a web service from a trusted service provider. Okay, maybe it’s a bit different than dealing with a trusted service provider, but I’ll get to the details of that in a later post on contracts and SLAs in the world of mashups.
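As a sketch of the “don’t build your own mapping system” point: an internal branch list can be published as KML, a format that Google’s mapping tools can display, in a few lines of code; the branch names and coordinates below are invented examples.

```python
# Sketch: publish branch locations as KML, a format Google's mapping tools
# can display, rather than building an in-house mapping system. The branch
# list and coordinates below are invented examples.
BRANCHES = [
    ("Downtown", -79.3832, 43.6532),
    ("Mississauga", -79.6441, 43.5890),
]

def branches_to_kml(branches):
    placemarks = "".join(
        f"<Placemark><name>{name}</name>"
        f"<Point><coordinates>{lng},{lat},0</coordinates></Point></Placemark>"
        for name, lng, lat in branches
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        f"{placemarks}</Document></kml>"
    )

with open("branches.kml", "w") as f:
    f.write(branches_to_kml(BRANCHES))
```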
Second, enterprise IT departments should be looking at the mechanics of how this integration takes place. Mashup developers are not spending millions of dollars and multiple years integrating services and data. Of course, they’re a bit too cavalier for enterprise development, typically eschewing such niceties as ensuring the legality of using the data sources and enterprise-strength testing, but there are techniques to be learned that can greatly speed application integration within an organization. To be fair, many IT departments need to put themselves in the position of both the API providers and the developers that I met at Mashup Camp, since they need to both wrap some of their own ugly old systems in some nicer interfaces and consume the resulting APIs in their own internal corporate mashups. I’ve been pushing for a few years for my customers to start wrapping their legacy systems in web services APIs for easier consumption, which few have adopted beyond some rudimentary functionality, but consider that some of the mashup developers are providing a PHP interface that wraps around a web service so that you can develop using something even easier: application integration for the people, instead of just for the wizards of IT. IT development has become grossly overcomplicated, and it’s time to shed a few pounds and find some simpler and faster ways of doing things.
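A minimal sketch of that “wrap something ugly in something easier” idea: a plain Python function hiding a legacy XML web service behind a simple call. The endpoint, request envelope and response fields are hypothetical, but the shape of the wrapper is the point.

```python
# Sketch of "application integration for the people": hide a clunky legacy
# XML web service behind one plain function. The endpoint, request envelope
# and response fields are hypothetical.
import requests
import xml.etree.ElementTree as ET

LEGACY_ENDPOINT = "http://legacy.internal/customer-service"  # made-up URL

def get_customer(customer_id):
    envelope = (
        '<?xml version="1.0"?>'
        "<GetCustomerRequest>"
        f"<CustomerId>{customer_id}</CustomerId>"
        "</GetCustomerRequest>"
    )
    resp = requests.post(
        LEGACY_ENDPOINT,
        data=envelope,
        headers={"Content-Type": "text/xml"},
    )
    root = ET.fromstring(resp.text)
    # Callers get a simple dict instead of raw XML plumbing
    return {
        "name": root.findtext("Name"),
        "city": root.findtext("City"),
    }
```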
Mashup Camp peek
It’s the second day of Mashup Camp, and I’ve had zero time to blog about what’s going on here — much like everyone else here, so it seems. I’ll be posting my thoughts later this week, but in the meantime you can check out the other blog posts about Mashup Camp and look at the pictures that people are taking here (over 600 at this time).