Sys 2 Sys: Starting SOA - Hints and Tips

Introduction

Building SOA on solid foundations is really important. I've come across so many SOA implementations which have kicked off without really considering, or perhaps just misunderstanding, some key concepts of SOA, and so have made mistakes which are difficult to fix in retrospect (especially when a project is over and the funding is gone). Initially I was going to write a set of principles for SOA, but that was a bit of a grand title for an unstructured cook's tour of "stuff I've seen go wrong and how to avoid it". These tips cover ground from both the big picture (governance, standards etc.) and the low level (how best to construct XSD interfaces) sides of the messaging picture. So in no particular order...

1. Make interfaces strict

One of the things I come across surprisingly often is the idea that to be "reusable" an interface should be as lax (as unrestrictive) as possible. Before we start considering this, it's probably worth defining what I mean by a lax and a strict interface definition. So, ignoring whether it is a good example or not, here's a "person" data type defined two ways. The first is a strict interface:

Title: Required, Enumeration [Mr, Mrs, Miss, Ms, Dr]
Surname: Required, String (conforming to regular expression ^[a-zA-Z\-\']{2,25}$ - i.e. 2-25 chars in length, containing only letters and hyphens/apostrophes)
Firstname: Optional, String (conforming to regular expression ^[a-zA-Z]{1,25}$ - i.e. 1-25 chars in length, containing only letters)

However a lax version would be:

Title: Optional string
Surname: Optional string
Firstname: Optional string

The differences should be fairly obvious: the lax interface will allow almost any values, whereas the strict interface is very prescriptive about exactly what is acceptable.
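To make the difference concrete, here is a minimal Python sketch of what each of the two definitions above can actually check. In practice these rules would live in the XSD and be enforced by a schema validator; the function names `validate_strict` and `validate_lax` are made up for illustration, and only the enumeration and regular expressions come from the definitions above.

```python
import re
from typing import List, Optional

# Constraints copied from the strict "person" definition above.
ALLOWED_TITLES = {"Mr", "Mrs", "Miss", "Ms", "Dr"}
SURNAME_PATTERN = re.compile(r"^[a-zA-Z\-']{2,25}$")
FIRSTNAME_PATTERN = re.compile(r"^[a-zA-Z]{1,25}$")


def validate_strict(title: Optional[str], surname: Optional[str],
                    firstname: Optional[str] = None) -> List[str]:
    """Return a list of violations; an empty list means the message is valid."""
    errors = []
    # Title is required and must be one of the enumerated values.
    if title not in ALLOWED_TITLES:
        errors.append(f"title {title!r} is not one of {sorted(ALLOWED_TITLES)}")
    # Surname is required; fullmatch also rejects a trailing newline.
    if surname is None or not SURNAME_PATTERN.fullmatch(surname):
        errors.append(f"surname {surname!r} must be 2-25 letters, hyphens or apostrophes")
    # Firstname is optional, but must match the pattern when present.
    if firstname is not None and not FIRSTNAME_PATTERN.fullmatch(firstname):
        errors.append(f"firstname {firstname!r} must be 1-25 letters")
    return errors


def validate_lax(title: Optional[str] = None, surname: Optional[str] = None,
                 firstname: Optional[str] = None) -> List[str]:
    """The lax definition can only check that supplied values are strings."""
    errors = []
    for name, value in (("title", title), ("surname", surname), ("firstname", firstname)):
        if value is not None and not isinstance(value, str):
            errors.append(f"{name} must be a string")
    return errors
```

Calling `validate_lax("Master of the Universe", "A" * 30)` returns no errors, whereas `validate_strict` with the same arguments reports both the title and the surname.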

Although I disagree with the argument for lax, I'll try to give it a fair hearing. The argument seems to go something like this: "if we make the interfaces restrictive then we can't reuse the interface (e.g. with another back-end system, or change the back end) without having to alter it". Put another way, if we add a new system (either with content-based routing or to replace the existing one), and this new system allows surnames of up to 50 chars, or a title that can now include "Admiral", then the lax interface can be reused, whereas the strict interface can't. That is indeed true to a point, but there are some massive disadvantages of a lax interface:

Service consumers aren't aware of what is allowed - or if they are, it's in a document or a comments field, which has more chance of being ignored or misunderstood.
Neither the service consumer nor the service provider can do real-time validation of messages to the level required to stop errors. With the lax interface above I can validate that the message passes some values called Title, Surname and Firstname, and that it doesn't pass something called "Nickname", but that's about it. Passing in a surname 30 characters long and a title of "Master of the Universe" would be perfectly valid according to the interface. Because it's valid, the message would be passed to the back-end system. Given that the back end considers this data invalid, it's likely to cause an error - if the system doesn't validate its input either, the error would occur when it tried to write an overly long string to the database. This isn't great for several reasons: firstly, the errors are all the way down in an application log; secondly, it might not be easy to identify the cause as an invalid message (it might look like an internal application DB error). The back end may or may not pass an error back to the consumer, and the error might or might not mean anything (e.g. a generic DB error). None of this is great, but there is also a worse scenario: the back end is a creaking old mainframe or C application which has fixed-length variables. The title variable is 4 characters long (to save memory, because Miss is the longest allowed title), and writing "Master of the Universe" overflows this variable straight into the memory allocated for something else, with unpredictable consequences. That's a little less likely, but as a rule propagating bad data around the system is never a good idea when it could be caught at the front door.
One final option. Assuming it's agreed that stopping errors before they hit the back end is a good idea, there's no reason that you can't do some validation in the ESB and still have a lax interface. True: there's no need to encode the rules in a WSDL/XSD. This would get over the problems of the errors not occurring in the right place or breaking the back end, but there are problems here too. Firstly, if you're doing real-time validation you have to write some rules rather than just click the "validate" checkbox in that nice expensive ESB tool you bought. Secondly (and going back to point 1), the consumer can't validate their messages whilst unit testing unless they write some complicated test harness (rather than just generating a mock service from your interface, say in [soapui.org]) - and who is to say whether this test harness is the same as your test code? This makes unit testing harder and more expensive, and means you'll be less likely to find errors before link/integration testing - and as the old adage goes: the earlier you find an error the cheaper it is to fix. Finally, you're losing the supposed benefit of having a lax interface, the ability to reuse without change, as you'd need to change the ESB rules and your test harness anyway!

There is one last thing to consider which speaks against the myth of lax interface benefits. Adding a new back end will almost always have some impact on a service consumer and/or interface. Change isn't bad, unmanaged change is. This all comes down to the need for governance (see below). At least if you've got strict interfaces, you understand what the current state of play is. If the new system takes 24 character names then you know there's a problem just by looking at the existing interface. If it takes 30 character names you can either continue as you are (you're just not using the full length - useful if running two back ends, as you're taking the lesser of the two) or you can re-version the interface and inform consumers they can make their surnames longer if they wish by using the new version. If you're just taking open strings it's not at all obvious whether this is an issue or not, because you've no idea what rules everyone else is playing by. After all, an interface definition is merely a contract saying what is required. If a back-end system requires a surname to be there but you make it optional, then all you're doing is drafting a contract you can't meet. If a consumer sends a message with no surname (fulfilling their part of the deal), then rather than sending back a positive response you send an error - which isn't very fair.
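The 24 versus 30 character reasoning above can be written down as a tiny check. This is only a sketch: the function names are hypothetical, and the limit of 25 in the usage note is the surname limit from the strict example earlier.

```python
def assess_new_backend(interface_max_len: int, backend_max_len: int) -> str:
    """Compare the length the interface promises with what a new back end accepts."""
    if backend_max_len < interface_max_len:
        # Consumers may legitimately send values the back end cannot store.
        return "problem: interface accepts values the new back end cannot take"
    if backend_max_len == interface_max_len:
        return "compatible: no change needed"
    # The back end is more generous than the contract.
    return "compatible: keep the interface as is, or re-version to allow longer values"


def effective_limit(*backend_max_lens: int) -> int:
    """When several back ends sit behind one service, the smallest limit wins."""
    return min(backend_max_lens)
```

With a 25 character interface, `assess_new_backend(25, 24)` flags a problem and `assess_new_backend(25, 30)` does not; running the old and new systems side by side gives `effective_limit(25, 30)`, which is 25. None of this can be worked out if the interface only says "string".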

2. Decouple systems, don't just separate them

Decoupling systems is more than separating them by putting an ESB between them. Getting the benefits of decoupling (reuse, hiding the complexity of multiple back ends behind a single service, isolating the impact of change etc.) takes much more than just passing the message through an ESB. An ESB is not a reverse proxy. I have on more than one occasion seen an architect tell developers to "just use the XSD from the back end" as the XSD for the front end (sometimes with a different namespace - sometimes not even that)! Why you'd do this I'll never know (unless it's to tick a box saying "I used the ESB like I was told to"). In order to actually decouple systems, you need to encapsulate the implementation in a suitable service. It's always going to be difficult to have truly system-independent data types, but by trying to generalise the interface and considering carefully what functions are required, you might get some actual benefit from using an ESB.
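As a rough illustration of encapsulation, the sketch below keeps one consumer-facing type and maps it to two different back-end shapes. Both back ends, their field names (`TTL`, `SNAME`, `FNAME`, `fullName`) and the function names are invented for the example; the point is only that the consumer never sees them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Service-level (consumer-facing) type; it knows nothing about any back end."""
    title: str
    surname: str
    firstname: str = ""


def to_legacy_backend(person: Person) -> dict:
    """Map to a hypothetical back end that wants upper-case, abbreviated fields."""
    return {
        "TTL": person.title.upper(),
        "SNAME": person.surname.upper(),
        "FNAME": person.firstname.upper(),
    }


def to_new_backend(person: Person) -> dict:
    """Map to a second hypothetical back end that stores a single full-name field."""
    parts = (person.title, person.firstname, person.surname)
    return {"fullName": " ".join(part for part in parts if part)}
```

Swapping or adding a back end then means writing another mapping function, while the `Person` contract and its consumers stay untouched. Reusing the back end's own XSD as the front-end interface gives up exactly this.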

3. Development standards and libraries

As with any component based development it's important to have a view on how the software will be constructed. This will come from the Technical Architect or senior developers. Whilst this will be expanded during the first few projects, even the first services need some guidance about:

What are the reusable components in the middleware (e.g. logging service, security module)?
What is the structure of development projects, source repository etc.?
What coding standards will be followed? Depending on the tool this might be based on industry standards (e.g. for JBoss ESB the Java standards will be followed). Even if there are some standards to follow, these will need to be expanded for this SOA implementation to answer questions like: is the preferred method of mapping XSL, XPath, or ESQL? What is the naming convention for elements, types and enumerations in XSD/WSDLs? What is the naming convention for queues, namespaces, endpoint URLs? These don't take a long time to map out, but are worth doing up front, as having each developer come up with their own way of doing things just makes maintenance difficult.

4. Start your governance early

There are lots of cool governance tools out there, from Registry/Repository tools to real-time management and monitoring suites. These allow you to do real-time service discovery, manage subscribers, throttle services, give differing QoS to different consumers, and even scale your infrastructure from the cloud based on demand. It's all very cool stuff. Then again, if you're just getting started then your first few services don't need any of this - a few simple and relatively easy-to-implement governance steps can pay dividends in allowing you to grow and alter your SOA without running into trouble, and will allow you to switch to the shiny toys at a later date.

Governance structure: Set up a governance structure and process. In my experience this is usually two-tier:

Technical board: to sign off designs, approve deviation from standards, discuss and decide on new standards - meets frequently (e.g. every two weeks)
Steering group: to set strategic aims, approve the standards in the first place, manage pipeline etc. - meets less frequently (e.g. quarterly)

Version services: after all, they're going to change eventually (especially if you've got prescriptive interfaces). Change isn't a problem, but does need managing. Most SOA toolkits can happily run multiple versions of the same service alongside each other, so if the service is properly versioned you can upgrade clients one at a time rather than having a big bang release (or, more dangerously, altering a service without a consumer knowing it's going to happen until it's too late).
Document your services: In addition to the XSD/WSDL each service should have a contract (document, wiki page etc.) to define additional data around the service. This can hold non-functional details (e.g. max response time, maximum expected load, maximum message size) and invocation details which aren't in the XSD (is this at least once, at most once or exactly once - so is the consumer expected to retry until it gets a reply or not?). Another thing of use is a sample message, so anyone wanting to use the service can see an example and use it in their unit testing - this isn't a substitute for a strict service definition but can be useful.
Create a service catalogue: there are a lot of reasons for a service catalogue and quite a few things to record. Eventually you can consider adopting a Registry/Repository tool if your SOA grows to a sufficient size, but to start with a spreadsheet is usually sufficient. The service catalogue allows people to find existing services and allows you to manage them. You might want to have two levels of catalogue: business services (exposed to the world for use across the enterprise) and technical services (re-used within the ESB, either for common functions like auditing or as building blocks to make composite business services). At a minimum every good service catalogue should include:
- Service name: The logical name of your service
- Description: A short description of the service, to allow future consumers to identify if this is what they're looking for, and so you can see what you've got before you accidentally build duplicates
- Version: The version of this service, this means you can track more than one version of the same service - the name and description might not change but the items below will be different between service versions
- Consumers: Who uses this version of the service? This is important as eventually you'll want to turn services off (an often-used rule of thumb is to keep consumers at most one version behind the latest, and decommission older services). Knowing who uses a service means you can avoid suddenly breaking an important application that is on an obsolete service version
- Status: What phase of the service lifecycle is this service in - there are various granularities in a service lifecycle but at a high level this generally looks to be:
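The catalogue columns listed above map naturally onto a simple record, whether it lives in a spreadsheet or in code. The sketch below is one hypothetical way to hold them in Python and to apply the "at most one version behind the latest" rule of thumb. The class and function names are made up, integer version numbers are an assumption, and the status field is left as free text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogueEntry:
    """One row of the catalogue: a single version of a single service."""
    name: str
    description: str
    version: int  # assumption: versions are simple increasing integers
    consumers: List[str] = field(default_factory=list)
    status: str = ""  # lifecycle phase, kept as free text in this sketch


def consumers_to_migrate(catalogue: List[CatalogueEntry],
                         service_name: str) -> Dict[int, List[str]]:
    """Return consumers that are more than one version behind the latest."""
    entries = [e for e in catalogue if e.name == service_name]
    if not entries:
        return {}
    latest = max(e.version for e in entries)
    return {
        e.version: e.consumers
        for e in entries
        if e.version < latest - 1 and e.consumers
    }
```

Given versions 1, 2 and 3 of a service, this returns only the consumers still on version 1, which is the list of people to talk to before decommissioning it.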

5. SOA principles and standards

In addition to the development standards (point 3), there are also higher-level architecture standards. These have probably been at least partially considered when deciding to go down an SOA route and doing product selection (i.e. if you've got an ESB and a BPM product there is presumably a vision for how these will be used). However, they can be expanded to give more technology guidance at a level above that of development coding standards but below that of the enterprise vision. These would include things like:

When to use the SOA: in many large organisations there is a combination of integration tools for managed file transfer, EAI, and ETL in addition to the ESB. On occasion I've seen "the bus is the answer" mentality creep in. With a step back these are obviously different challenges with widely different use cases; just because the ESB is the tool of the moment doesn't mean it should suddenly be processing overnight runs of terabytes of data. Nor should all interfaces suddenly be rewritten to use the bus if there is no hope of reuse.
Messaging approaches (SOAP vs REST, JMS/MSMQ/WMQ, when to use queues vs HTTP, when batch is more appropriate than real-time, should services be WS-I compliant)
Security standards: do services need to be secured, encrypted, can messages be logged in plain text, how are third parties connected to the ESB (if they ever can be), how can the internet connect to services?
Components of the SOA: ESB, UDDI server, registry/repository, BPM, etc. What should be the domain of each product (e.g. what should be composed in BPEL, and what in BPM).

Of course, as with all principles, these things can be contravened if there's a good reason to (through the governance process above). An example would be a principle that "get" services go over HTTP, but "update" messages go over JMS (especially if reliability is required). If a system cannot send JMS (say for firewall reasons) but reliability is still required, then for this service WS-ReliableMessaging could be implemented; if reliability wasn't needed then the service could use HTTP and either be retried or not when failures occur. In general JMS might make sense for an organisation using Oracle Service Bus, because it's based on Java and JMS is generally easier to interact with than WS-ReliableMessaging.
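The example principle and its exceptions boil down to a small decision table, sketched here in Python. The function name and the returned labels are hypothetical; the rules are just the ones described in the paragraph above, and a real exception would still go through the governance process.

```python
def choose_transport(operation: str, needs_reliability: bool, can_send_jms: bool) -> str:
    """Pick a transport for a service under the example principle above."""
    if operation == "get":
        # Principle: "get" services go over HTTP.
        return "HTTP"
    if operation != "update":
        raise ValueError(f"unknown operation type: {operation!r}")
    if can_send_jms:
        # Principle: "update" messages go over JMS.
        return "JMS"
    if needs_reliability:
        # Exception: the consumer cannot send JMS but still needs reliability.
        return "HTTP + WS-ReliableMessaging"
    # Exception: no JMS and no reliability requirement; retrying is up to the consumer.
    return "HTTP (retry optional)"
```

For instance, `choose_transport("update", True, False)` yields the WS-ReliableMessaging exception, while `choose_transport("update", False, False)` falls back to plain HTTP.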

6. Service isn't SOAP, SOAP isn't HTTP

Having some basic guidance for what to use is a good idea as discussed above, but there is one myth I have hit on a few occasions so I'll put it straight here: SOA doesn't mean SOAP, and SOAP doesn't mean you always use HTTP (it can be sent over JMS or SMTP, for example). It may be that when defining your principles you select SOAP, which is fine (I'm not against SOAP - see note below), but there may well be times when XML over JMS, or a JSON REST service, would be appropriate. If it looks like I'm picking a specific example of the wider point from #5, then that's because I am. This is just a case of exceptions to a principle, but I keep seeing people try to make SOAP do everything, or use HTTP for everything, when XML over JMS or a binary message would be better. Firstly, the principle shouldn't just say SOAP over HTTP (unless you never need reliability, you have a reason not to use a queueing technology, or you like the WS-* extensions) - most SOA architectures need both HTTP and a queue (be it MSMQ, IBM WMQ or JMS). Secondly, SOAP isn't the answer to everything.

Note on SOAP: As a slight aside, a lot of people don't like SOAP but I do. Not because of the elegance of the specification (I personally think WSDL is as ugly a standard as you might expect from a committee), but because there are so many tools available which can automatically present/consume SOAP messages (e.g. wsdl2java), or test them (e.g. SoapUI). Plus who needs the hassle of writing their own custom standard? I can see the arguments behind REST, and if you want smaller messages JSON is a very fine approach. The argument I don't quite understand is that "SOAP is inefficient". XML is verbose and as such a bit bloated and inefficient (that's true), but I don't see that SOAP is any more inefficient than XML. XML vs JSON I get, but SOAP vs XML? Nope. For the price of a couple of extra envelope nodes, you get all the interoperability of the SOAP stack. You don't have to do dynamic discovery or use the WS header standards if you don't want to.
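To put a rough number on "a couple of extra envelope nodes", the sketch below wraps a plain XML payload in a bare SOAP 1.1 envelope using Python's standard library and measures the difference. The `person` payload and the function names are made up for the example, and the measurement ignores optional SOAP headers and any transport-level headers.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soapenv", SOAP_NS)


def build_payload(surname: str) -> ET.Element:
    """Build a small, hypothetical plain-XML 'person' message."""
    person = ET.Element("person")
    ET.SubElement(person, "title").text = "Mr"
    ET.SubElement(person, "surname").text = surname
    return person


def wrap_in_soap(payload: ET.Element) -> ET.Element:
    """Wrap a payload in a minimal Envelope/Body pair (no Header element)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


def envelope_overhead(payload: ET.Element) -> int:
    """Number of extra characters the bare envelope adds to the serialised payload."""
    plain = ET.tostring(payload, encoding="unicode")
    wrapped = ET.tostring(wrap_in_soap(payload), encoding="unicode")
    return len(wrapped) - len(plain)
```

The overhead is the same for a short payload as for a long one, because the envelope is a fixed wrapper around whatever XML you were going to send anyway.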

That’s all folks

Well, that rather lengthy selection is my set of tips for the common pitfalls of "doing SOA". Hope it helps.


Monday, 1 July 2013
