Eventful mashup hits Boing Boing

Before I went to Mashup Camp, I exchanged emails with Chris Radcliff of EVDB/Eventful, and it was great to meet him face-to-face at camp. EVDB provides an API for managing event, venue and calendar data, and Eventful uses that API in an events/calendaring/social networking mashup that combines events submitted directly to Eventful with those grabbed from other event sites.

Today, I see that Eventful was covered on Boing Boing, which should bring it a huge amount of well-deserved attention. Congrats!

Computer History Museum

My wrap-up of Mashup Camp wouldn’t be complete without mentioning the fabulous Computer History Museum in Mountain View, where the event was held. Great venue, and the part of their collection that we were able to view during the party on Monday night was very nostalgic (although I can’t quite say that I miss RSX-11M). Definitely worth a visit if you’re in the Bay Area.

On my return to Toronto, I had lunch with a friend who works for Alias, the day after she emailed me to say that their corporate email addresses have changed from @alias.com to @autodesk.com following the recent acquisition. The end of an era for a long-running, innovative Canadian software company. Having been there since the late 1980s, she saw many transitions, including the purchase of Alias by Silicon Graphics (and its subsequent sale). SGI was, at the time, housed in the building that now holds the Computer History Museum, and she remembers visiting there when it was SGI headquarters. An interesting footnote after spending the first part of the week there.

Picturing yourself at Mashup Camp

I’m still wrapping my brain around some of the ideas that started in my head at Mashup Camp, but I’ve been having fun browsing through all of the photo-detritus of the event. I was surprised to turn up in the first photo in Valleywag’s coverage of the event, and Doc Searls caught me at the XDI session on Monday (bonus tip: wear purple at popular events so that you can find yourself in the photos). There are over 900 Flickr photos tagged mashupcamp, and likely many more still languishing out there on memory cards.

Best quote from Mashup Camp

That’s the thing about mashups: almost all of them are illegal.

I heard that (and unfortunately am unable to credit the source) in the “scrAPI” session at Mashup Camp, in which we discussed the delicate nature of using a site that doesn’t have APIs as part of a mashup. Adrian Holovaty of ChicagoCrime.org (my favourite mashup at camp) led part of the session, demonstrating what he had built with Chicago police crime data (the police, not having been informed in advance, called him for a little chat the day his site went live), Google Maps, Yahoo! Maps (used for geocoding after he was banned from the Google server for violating the terms of service) and the Chicago Journal.

Listening to Adrian and others talk about the ways to use third-party sites without their knowledge or permission really made me realize that most mashup developers are still like a bunch of kids playing in a sandbox, not realizing that they might be about to set their own shirts on fire. That’s not a bad thing, just a comment on the maturity of mashups in general.

The scrAPI conversation — a word, by the way, that’s a mashup of “screen scraping” and “API” — is something very near and dear to my heart, although in another incarnation: screen scraping from third-party (or even internal) applications inside the enterprise in order to create the type of application integration that I’ve been involved in for many years. In both cases, you’re dealing with a third party who probably doesn’t know that you exist, and doesn’t care to provide an API for whatever reason. In both cases, that third party may change the screens on a whim without telling you in advance. The only advantage of doing this inside the enterprise is that the third party usually doesn’t know what you’re doing, so if you are violating your terms of service, it’s your own dirty little secret. Of course, the disadvantage of doing this inside the enterprise is that you’re dealing with CICS screens or something equally unattractive, but the principles are the same: from a landing page, invoke a query or pass a command; navigate to subsequent pages as required; and extract data from the resultant pages.
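To make that three-step flow concrete, here’s a minimal sketch in Python. The site, URLs and markup are all hypothetical placeholders, and a real scraper would use a proper HTML parser rather than regular expressions:

```python
import re
import urllib.parse
import urllib.request

BASE = "http://example.com"  # hypothetical non-API-enabled site


def search(query):
    # Step 1: from the landing page's search form, invoke the query
    # by building the GET request that the form would have submitted.
    params = urllib.parse.urlencode({"q": query})
    with urllib.request.urlopen(BASE + "/search?" + params) as resp:
        html = resp.read().decode("utf-8", errors="replace")

    # Step 2: navigate to subsequent pages as required, following
    # whatever "next page" link the result markup exposes.
    pages = [html]
    while True:
        m = re.search(r'href="(/search\?[^"]*)"[^>]*>Next<', html)
        if not m:
            break
        with urllib.request.urlopen(BASE + m.group(1)) as resp:
            html = resp.read().decode("utf-8", errors="replace")
        pages.append(html)

    # Step 3: extract data from the resultant pages.
    results = []
    for page in pages:
        results.extend(re.findall(r'<td class="result">(.*?)</td>', page))
    return results
```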

There are some interesting ways to make all of this happen in mashups, such as using LiveHTTPHeaders to watch the traffic on the site that you want to scrape, and faking out forms by passing parameters that are not in their usual selection lists (Adrian did this with ChicagoCrime.org, passing a much larger radius to the crime stats site than its form drop-down allowed in order to pull back the entire geographic area in one shot). Like many enterprise scraping applications, site scraping applications often cache some of the data in a local database for easier access or further enrichment, aggregation, analysis or joining with other data.
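Once you’ve watched the form traffic with something like LiveHTTPHeaders, the request is easy to replay with an out-of-range parameter. Here’s a sketch of both tricks; the endpoint, parameter names and schema are all invented for illustration:

```python
import sqlite3
import urllib.parse
import urllib.request


def fetch_all_crimes(lat, lng):
    # The form's drop-down might top out at a 1-mile radius, but if
    # the backend doesn't validate, a much larger value pulls back
    # the whole geographic area in one request.
    params = urllib.parse.urlencode(
        {"lat": lat, "lng": lng, "radius_miles": 50})
    url = "http://example.com/crimes?" + params  # hypothetical endpoint
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def cache_locally(raw_page):
    # Cache the scraped payload in a local database so that later
    # enrichment, aggregation or joining doesn't re-hit the site.
    db = sqlite3.connect("scrape_cache.db")
    db.execute("CREATE TABLE IF NOT EXISTS pages ("
               "fetched_at TEXT DEFAULT CURRENT_TIMESTAMP, body TEXT)")
    db.execute("INSERT INTO pages (body) VALUES (?)", (raw_page,))
    db.commit()
    db.close()
```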

In both web and enterprise cases, there’s a better solution: build a layer around the non-API-enabled site/application, and provide an API to allow multiple applications to access the underlying application’s data without each of them having to do its own site/screen scraping. Inside the enterprise, this is done by wrapping web services around legacy systems, although much of this is not happening as quickly as it should. In the mashup world, Thor Muller (of Ruby Red Labs) talked about the equivalent notion of scraping a site and providing a set of methods for other developers to use, such as Ontok’s Wikipedia API.
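In code, that layer can be as simple as a class that keeps all of the fragile, markup-dependent logic in one place. A hypothetical sketch, with the site and markup again invented:

```python
import re
import urllib.parse
import urllib.request


class EventsAPI:
    """A pseudo-API wrapped around a site that doesn't offer one."""

    BASE = "http://example.com"  # hypothetical scraped site

    def _fetch(self, path):
        with urllib.request.urlopen(self.BASE + path) as resp:
            return resp.read().decode("utf-8", errors="replace")

    def events_for_city(self, city):
        # The only markup-dependent code lives here: when the site
        # redesigns, fix this one method instead of every consumer.
        html = self._fetch("/events?city=" + urllib.parse.quote(city))
        return re.findall(r'<h3 class="event">(.*?)</h3>', html)
```

Every consuming application then calls events_for_city() and never touches the site’s HTML itself.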

We talked about the legality of site scraping, namely that there are no explicit rights to use the data, and the definition of fair use may or may not apply; this is what prompted the comment with which I opened this post.

In the discussion of strategic issues around site scraping, I certainly agree that site scraping indicates a demand for an API, but I’m not sure that I completely agree with the comment that site scraping forces service and data providers to build/open APIs: sure, some of them are likely just unaware that their data has any potential value to others, but there are going to be many more who either will be horrified that their data can be reused on another site without attribution, or just don’t get that this is a new and important way to do business.

In my opinion, we’re going to have to migrate towards a model of compensating the data/service provider for access to their content, whether it’s done through site scraping or an API, in order to gain some degree of control over (or at least advance notice of) changes to the site that would break the calling/scraping applications. That compensation doesn’t necessarily have to mean money changing hands, but ultimately everyone is driven by what’s in it for them, and needs to see some form of reward.

Update: Changed “scrapePI” to “scrAPI” (thanks, Thor).

Mashing up a new world (dis)order

Now that I’ve been disconnected from the fire hose of information that was Mashup Camp, I’ve had a bit of time to reflect on what I saw there.

Without doubt, this is the future of application integration both on the public internet and inside the enterprise. But — and this is a big but — it’s still very embryonic, and I can’t imagine seriously suggesting much of this to any CIO that I know at this point, since they all work for large and fairly conservative organizations. However, I will be whispering it in their ears (not literally) over the coming months to help prepare them for the new world (dis)order.

From an enterprise application integration perspective, there are two major lessons to be learned from Mashup Camp.

First, there are a lot of data sources and services out there that could be effectively combined with enterprise data for consumption both inside and outside the firewall. I saw APIs that wrap various data sources (including very business-focused ones such as Dun & Bradstreet), VOIP, MAPI and CRM, as well as the better-known Google, Yahoo! and eBay APIs. The big challenge here is the NIH syndrome: corporate IT departments are notorious for rejecting services, and especially data, that they don’t own and didn’t create. Get over it, guys. There’s a much bigger world of data and services out there than you could ever build yourself, and you can do a much better job of building the systems that are actually a competitive differentiator for you rather than wasting your time building your own mapping system so that you can show your customers where your branches are located. Put those suckers on Google Maps, pronto. This is no different than thousands of other arguments that have occurred on this same subject over the years, such as “don’t build your own workflow system” (my personal fave), and is no different than using a web service from a trusted service provider. Okay, maybe it’s a bit different than dealing with a trusted service provider, but I’ll get to the details of that in a later post on contracts and SLAs in the world of mashups.

Second, enterprise IT departments should be looking at the mechanics of how this integration takes place. Mashup developers are not spending millions of dollars and multiple years integrating services and data. Of course, they’re a bit too cavalier for enterprise development, typically eschewing such niceties as ensuring the legality of using the data sources and enterprise-strength testing, but there are techniques to be learned that can greatly speed application integration within an organization. To be fair, many IT departments need to put themselves in the position of both the API providers and the developers that I met at Mashup Camp, since they need to both wrap some of their own ugly old systems in nicer interfaces and consume the resulting APIs in their own internal corporate mashups. I’ve been pushing my customers for a few years to start wrapping their legacy systems in web services APIs for easier consumption, although few have adopted this beyond some rudimentary functionality. Consider, though, that some of the mashup developers are providing a PHP interface that wraps around a web service so that you can develop using something even easier: application integration for the people, instead of just for the wizards of IT. IT development has become grossly overcomplicated, and it’s time to shed a few pounds and find some simpler and faster ways of doing things.
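As an illustration of how thin that kind of wrapper can be, here’s a hypothetical Python sketch (standing in for the PHP wrappers I saw) that hides a legacy SOAP-style service behind one plain function call. The endpoint, SOAP action and message shape are all invented:

```python
import urllib.request
import xml.etree.ElementTree as ET


def get_account_balance(account_id):
    # Build the verbose legacy request so callers never have to.
    envelope = f"""<?xml version="1.0"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <GetBalance xmlns="http://example.com/legacy">
          <AccountId>{account_id}</AccountId>
        </GetBalance>
      </soap:Body>
    </soap:Envelope>"""
    req = urllib.request.Request(
        "http://example.com/legacy/service",  # hypothetical endpoint
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "text/xml",
                 "SOAPAction": "http://example.com/legacy/GetBalance"})
    with urllib.request.urlopen(req) as resp:
        tree = ET.fromstring(resp.read())
    # Return the single value the caller actually cares about.
    return tree.find(".//{http://example.com/legacy}Balance").text
```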

Get on the map

Attention, all you Mashup Camp attendees: go to Attendr to see a cool mashup example from Jeff Marshall that allows you to link to other attendees you already know or would like to meet. If you’re going to be at Mashup Camp next week, be sure to look me up and say hi.

As for the rest of you, head on over and add yourself to my Frappr map. You can see where other readers of Column 2 are located, and I’ve added the capability to use coloured pins to denote whether you’re a customer, product vendor, services vendor or “other” as it relates to BPM and integration technologies.

Mashups and the corporate SOA

I listened to a podcast last week of David Linthicum interviewing Dion Hinchcliffe that really helped to coalesce my thoughts about mashups, Web 2.0, SOA, composite applications and the future of integration. I was walking along a street in downtown Toronto, listening to it on my iPod, and making enough facial expressions, hand gestures and remarks aloud that I was likely written off as one of the usual crazies: it’s very exciting when someone with ideas very similar to your own states them much more clearly than you could have yourself.

A couple of weeks ago, I posted about mashups and the implications for enterprise integration, and about which of the integration vendors is likely to jump on this bandwagon early, and noted that I’ll be at Mashup Camp later this month because I really want to explore the convergence of mashups and enterprise integration. Unbeknownst to me, Dion Hinchcliffe had published an article in the SOA Web Services Journal in late December entitled Web 2.0: The Global SOA, which was the focus of the podcast, and blogged about the hundreds of services available on the “giant service ecosystem” that is the web:

An important reason why the Web is now the world’s biggest and most important computing platform is that people providing software over the Internet are starting to understand the law of unintended uses. Great web sites no longer limit themselves to just the user interface they provide. They also open up their functionality and data to anyone who wants to use their services as their own. This allows people to reuse, and re-reuse a thousand times over, another service’s functionality in their own software for whatever reasons they want, in ways that couldn’t be predicted. The future of software is going to be combining the services in the global service landscape into new, innovative applications. Writing software from scratch will continue to go away because it’s just too easy to wire things together now.

The information on this is now starting to explode: David Berlind (organizer of Mashup Camp) discusses the bazaar-like quality of the mashup ecosystem, Stephen O’Grady pushes the concept of SOA to include mashups, and even Baseline Magazine is talking about how mashups can free you from the tyranny of software vendors, discussing some of the ways that the services feeding mashups could be used in an enterprise integration context.

All of this has huge implications for business processes, and for the type of BPM that currently resides completely inside an organization. Most BPM vendors have enabled their products to be consumers of web services in order to more easily play an orchestration role, and some customers are even starting to take advantage of this by invoking web services that integrate other internal systems as steps in a business process (although a lot are still, unfortunately, stuck in earlier, more primitive generations of integration techniques). Imagine the next step: as corporate IT departments get over their “not invented here” fears, the BPM tools allow them to integrate not just internal web services, but external services that are part of the Web 2.0 mashup ecosystem. Use a Salesforce.com service to do a customer credit check as part of processing their insurance application. Integrate Google Maps or Yahoo! Maps to determine driving directions from your service dispatch location to your customer’s location in order to create service call sheets. It’s like software-as-a-service, but truly on a per-service rather than per-application basis, allowing you to pick and choose which functions/services you want to invoke from any particular step in your business process.
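As a thought experiment, a single orchestration step that consumes an external mapping service might look something like this Python sketch; the service URL, parameters and response fields are all hypothetical:

```python
import json
import urllib.parse
import urllib.request


def directions_step(process_context):
    # One step in the business process: call out to an external
    # mapping service instead of an internally built system.
    params = urllib.parse.urlencode({
        "origin": process_context["dispatch_location"],
        "destination": process_context["customer_address"],
    })
    url = "http://maps.example.com/directions?" + params
    with urllib.request.urlopen(url) as resp:
        route = json.loads(resp.read())
    # Hand the result to the next step (printing the call sheet).
    process_context["route"] = route
    return process_context
```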

Dion Hinchcliffe thinks that 80% of enterprise applications could be provided by external services, which is a great equalizer for smaller businesses that don’t have huge IT budgets, and could almost completely disconnect the issue of business agility from the size of your development team. I think it’s time for some hard introspection about what business you’re really in: if your organization is in the business of selling financial services, what are you doing writing software from scratch when you could be wiring it together using BPM and the global SOA that’s out there?

Update: David swamped his podcast provider and ended up moving the podcast here. Reference also updated above.

We need a BPM camp

I received yet another email about the upcoming Gartner BPM Summit, and I continue to be horrified by the price of conferences: US$1,895 for 3 days?! Or how about the AIIM records management conference in Toronto next week: C$2,899 for 3 days? By the time you add in travel and living expenses, it’s no small chunk of change when you’re an independent: I don’t have the luxury of a big company picking up my tab. Even those of you working for larger companies know that it’s not easy to find funding for attending conferences, even if you believe that they’ll be of value.

I know that analysts are in the business of making money from knowledge (so am I), but knowledge is becoming a commodity these days. A lot of people won’t (or can’t) shell out that much cash just to sit in a room for three days and hear someone talk, when the same information is available (albeit in a less structured manner) at a much lower cost: blogs, podcasts, vendor seminars, webinars, analyst reports and other sources that don’t believe that it’s in their best interests to charge everyone an arm and a leg just to have a conversation.

I only attend these big-money conferences once every few years; in the interim, I do just fine with my RSS feeds, daily email newsletters, webinars, vendor seminars, and other sources of free or reasonably-priced information. For example, in the past year, I’ve attended two major conferences: BPM 2005 in London, where I paid full price as an attendee, and FileNet’s user conference in Las Vegas, where I was a speaker so had my conference fee waived (check out the series of entries in my November archive, where I was blogging live from the conference sessions by emailing from my Blackberry). I also attended some local seminars/mini-conferences at little or no cost, such as e-Content Institute, plus some vendor seminars; in fact, I spent yesterday morning at a LabOne seminar hearing about how their next generation of products is going to better integrate into my insurance clients’ systems.

I attended a ton of webinars last year, most from ebizQ and BPMinstitute.org, but also from vendors such as Global 360 and Proforma (search my archives for “webinar” to see my comments on the webinars). I have a list of past webinars that I want to watch but haven’t found time yet: a wealth of information delivered to my desk, for free, with a relatively modest amount of vendor promotional material wrapped around it.

There is something to be said for the conference atmosphere, however. As much as I dislike most professional networking (I’m a closet introvert), conferences provide a great opportunity to meet people with the same interests: for me, that includes potential clients, but also vendors, potential partners, industry analysts and a variety of other types. Most conferences also include some sort of vendor showcase where I can have a peek at the latest and greatest technology.

The dilemma is this, then: given that much of the “information” (content) of the big conferences is available in the public domain or through lower-cost alternatives, how do we share that information in a conference-like networking atmosphere?

The answer may lie in the new generation of “un-conferences” or “camps”. These still exist mostly as technical conferences, but with their focus on collaboration rather than presentations (i.e., a conversation guided by an actual practitioner rather than death-by-PowerPoint from a hired speaker), limited enrolment, and free (or nearly so) attendance fees, this movement has the potential to expand into other traditional conference areas. One popular technical camp is BarCamp, including the recent TorCamp. David Crow, the prime organizer of TorCamp (and my neighbour), just posted about the camp format for un-conferences, and links to Chris Heuer with more about these sorts of amateur conferences. A camp with a specific focus on integration is Mashup Camp next month in San Jose, which I’ll be attending because I want to explore how to use mashup concepts in the context of enterprise application integration: this is part of the future of orchestration. And the expected “conference fee”? $0.

Camps are still, for the most part, for techno-geeks (I admit it, I am a geek). But how long before this “amateur” format hits the mainstream? How long before Gartner’s BPM summit is competing with BPMcamp?