Making everything optional

DJ commented on my post addressing the problem Raimond raised with my versioning strategy. He wondered if he'd missed an earlier post where I argued that you should not use XSD to validate your data because if you make content optional, you can't use it to check what has to be there. Since I haven't written about that yet, I figured I'd start to address it now.

When people build a schema for a single service, they tend to make it reflect the precise requirements of that system at that moment in time. Then, when those requirements change, they revise the schema. The result is a system that tends to be very brittle. If you take the same approach when you design a schema for use by multiple systems, describing a corporate level model for customer data for instance, things are even worse. Some systems won't have all the required data. They have to decide whether to (a) collect the data, (b) make up bogus data, or (c) not adopt the common model. None of these are good approaches.

To solve both these problems, I've started thinking about my schema not as the definition of what this system needs right now but as the definition of what the data should look like if it's present. I move the actual checking for what has to be present inside the system (either client or service) and implement it using either code or a narrowed schema that is a duplicate of the contract schema with more constraints in place.
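Here is a minimal sketch of the "check presence in code" option, assuming Python and its standard xml.etree.ElementTree module. The namespace, element names, and the missing_required helper are all hypothetical, made up for illustration; the shared schema is assumed to declare every one of these elements optional.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, chosen only for illustration.
NS = {"c": "urn:example:customer"}

# What THIS system needs right now. The shared contract schema is assumed
# to leave all of these elements optional.
REQUIRED_PATHS = ["c:name", "c:email"]


def missing_required(xml_text, required_paths=REQUIRED_PATHS):
    """Return the required paths that are absent or empty in the document."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in required_paths:
        node = root.find(path, NS)
        if node is None or not (node.text or "").strip():
            missing.append(path)
    return missing


# Matches the loose shape (everything optional), but lacks an email.
doc = """<customer xmlns="urn:example:customer">
  <name>Ada Lovelace</name>
  <phone>555-0100</phone>
</customer>"""

print(missing_required(doc))  # ['c:email']
```

The shape of the message is still governed by the contract schema; only the list of required paths belongs to this one system.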

The advantage of this model is that I can change the “has” requirements without changing the shape of the data. Yes, I'm changing my contract and I need to make sure that clients test against the new version before it all goes live. There's a chance, though, that some or even many clients won't have to change. If I revise the schema itself in a way that forces a namespace change, then I have to support parallel contracts or migrate all clients at the same time. Neither is a good option. In other words, I'm trying to minimize schema changes and use testing to pinpoint required code changes.
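As a rough illustration of using testing to pinpoint required changes, the sketch below assumes each client can supply a representative sample message. The client names, sample messages, namespace, and both requirement sets are hypothetical. Tightening the requirements is a change to a set of paths, not to the shape of the data, and running the samples against the new set shows which clients actually need to change.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace; the message shape is the same in both versions.
NS = {"c": "urn:example:customer"}

# Only the presence requirements differ between the two versions.
REQUIRED_OLD = {"c:name"}
REQUIRED_NEW = {"c:name", "c:email"}

# Hypothetical sample messages, one per client.
SAMPLES = {
    "billing": '<customer xmlns="urn:example:customer">'
               "<name>A</name><email>a@example.com</email></customer>",
    "shipping": '<customer xmlns="urn:example:customer">'
                "<name>B</name></customer>",
}


def missing(xml_text, required):
    """Return the required paths that are absent from the message, sorted."""
    root = ET.fromstring(xml_text)
    return sorted(p for p in required if root.find(p, NS) is None)


def clients_needing_changes(samples, required):
    """Map each client whose sample fails the requirements to what it lacks."""
    result = {}
    for client, xml_text in samples.items():
        gaps = missing(xml_text, required)
        if gaps:
            result[client] = gaps
    return result


print(clients_needing_changes(SAMPLES, REQUIRED_OLD))  # {}
print(clients_needing_changes(SAMPLES, REQUIRED_NEW))  # {'shipping': ['c:email']}
```

In this made-up case only the shipping client has to change; billing already sends everything the new requirements ask for.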


Posted Apr 20 2006, 02:51 PM by tim-ewald

Comments

Rajesh Jain wrote re: Making everything optional
on 04-20-2006 1:53 PM
I don't get it! How is defining a schema different from (e.g.) an interface. Both define a contract at a point in time.

It is ridiculous (in my humble view) to think that loose contracts is the best way to go. I hate it when people tell me "Just pass me a String. I will give you a huge friggin document that will describe the shape". I am always at the receiving end of these kinds of statements and it hurts me a lot because it takes me ages to figure out the exact contract. Moreover, the provider "forgets" what the contract was 2 months down the line because he/she never bothered to document. Tim, I hope you see the point.

Contracts change and if a client breaks, it is a "GOOD thing" since it forces/surfaces the change (especially in a very complex world where everything is/cannot be documented)

What probably should be said about contracts is this--the provider should try to support the old contract as well if possible.
Jay wrote re: Making everything optional
on 04-20-2006 6:48 PM
Rajesh,

Your approach gets us back to the problem of managing multiple versions of an interface. This problem can quickly become intractable if you are going through a few iterations. From a conceptual sense it is a "simpler" approach. It would seem to me that you gotta pick your poison based on the problem you are trying to solve.
Raymond wrote re: Making everything optional
on 04-20-2006 11:28 PM
I totally agree with the loose contract vision. Take a look in the future instead of strictly typing your interface. But I can't convince the people in my company about it. Some one-line guideline here says "webservice interfaces should be strictly typed." I'm talking about interfaces with lots of parameters here (50+).
Maybe a webservice broker is a solution to separate between strict contracts and loose clients?
theCoach wrote re: Making everything optional
on 04-21-2006 6:41 AM
Is there a pattern/mechanism for sending a response that includes metadata about how well the request was formed -i.e. This is valid, but it appears to be based on an older schema. Obviously, there would be different ways, but for the [name | FirstName.lastName] scenario, good metadata could be sent back, and with a proper infrastructure these migrations could be planned.

As an aside, in the Microsoft .NET world, do you have any thoughts on WinFS's effects on certain constructs that would go across the wire?
DJ wrote re: Making everything optional
on 04-21-2006 7:27 AM
I like it! Basically you could maintain a 'published schema' that is the loose version with mostly optionals and a set of versioned 'contract schemas'. You could use the versioned schemas to identify the version of the interface that the client is trying to use while validating that the data is correct. With the right tool, the published schema could be auto-generated from the set of contract schemas to make sure it supports all versions.
Rajesh Jain wrote re: Making everything optional
on 04-21-2006 12:36 PM
Jay,
I agree with you that to a certain degree you have to pick your poison when designing an interface in a certain context. But to suggest that loose contracts are a good thing (or the preferred approach) is totally beyond me. This is one debate that has been going on for ages (in one form or another--Scripting/compiled, loose/tight contracts, loose/strict XSDs or schemas (I mean, isn’t it the same problem when you design a database schema?)) It pains me that the "in thing" changes every few years! I mean, if there was such huge benefit in loose contracts, why put so much effort in designing schema definition technologies. Just focus on making the definition (and consumption) of loose contracts easier for the whole world!

I think you can’t beat the advantages of strict contracts (self describing—precise definitions, managed change control, etc.) versus loose contracts. Then again, I might be naïve. I have built and managed a lot of enterprise class applications and they didn’t help me!

Regards
Rajesh Jain wrote re: Making everything optional
on 04-21-2006 12:49 PM
Ooh! Did I tell you about a story--in my company, we have a massive (I mean friggin huge with more than 2000 elements) so-called loose schema and over the years everybody lost the original design context and it became the superset for all the data that all possible systems could ever use. The result is, nobody knows what to do about this beast because most of the systems use only a small fraction of that data (a subset schema) and nobody knows what subset is used by what system! So everybody tries to pass anything they've got and crosses their fingers! This is a very real world scenario faced by common people like me.

Sorry, I forgot to narrate this story in my previous post. Please forgive me if it looks like (uuhhmmm.) spam ;-)
Jon Fancey wrote re: Making everything optional
on 04-23-2006 12:45 PM
There's an obvious reason for this - people are authoring schemas in the same way as they write code. They invent an api for their system and implement the schema for it - probably automatically. The schema is no more brittle than the api behind it but the reality is that just as people hoped OO systems were the future-proof silver bullet, now many think that Web services/schema etc are. Now this is a fine approach if you're building internal systems on the same platform and you have control over the whole domain - Web services just don't always offer a compelling choice here.

The only way to build flexible systems is to think about the amount of flexibility you need and build a system that provides it. There are no shortcuts. I'm really glad you've stepped in and offered prescriptive advice on this as it's pretty thin on the ground.

Schema validation in a production system is a non-starter for most people. Whether you go validation via your approach or xpath (or RelaxNG :-)) is a moot point - they both end up at the same place.

The interesting question is how different a message (as that's what we're really talking about) you are prepared to accept before you kick it back as 'invalid' - or find a way to achieve clarification programmatically rather than the current out-of-band.

I think the industry took a wrong turn when the focus (and all the energy) went on mapping object models to messages. Who cares about that? - it's hard to build anything other than fragile systems with that approach for the same reasons. Trying to hide too many details is generally a bad move. That's when people start passing datasets and hashtables around without thinking about the consequences.

One of the things that attracted me towards BizTalk was its flexibility. To recognize a message all it wants is to see a root element name and namespace that it knows. In fact you can even 'dial down' this requirement and just accept any valid xml document. The message itself could contain anything, but as a consumer you have the option to schema validate or xpath out or whatever. The problem then becomes one of working out the correct processing for the message received - and that's the problem we should be concentrating on. Not all systems require this level of flexibility but then not all systems need to use Web services either. Whatever approach is used should be based on sound design and requirement not the fashion of the day.

Great posts, and debate - keep them coming!
Sailesh wrote re: Making everything optional
on 04-23-2006 1:09 PM
Data and its validity is only important at the endpoints. Is the sender happy its data is valid before it sends? Is the receiver happy the data it gets is valid? The rest doesn't matter - using a single schema to set the contract between two enterprise systems won't work, but removing type and making everything optional doesn't help anyone - if everyone can send and receive everything, there is no contract.
The solution is transformations, which need tight definitions of input and output. The transforms are the things that change. If the receiver wants the name split into first and last use a transform (which can be performed by the receiver).
Jon Fancey wrote re: Making everything optional
on 04-24-2006 1:17 AM
The lack of a strict contract is the whole point. The current specs drive us down the strict contract route with exactness as a main criterion. Unfortunately this doesn't help build real world systems that need to use this technology to work. If you require strict contracts then you probably aren't building open (between domains) distributed (either internal or external) systems. And that's fine. It's ok to use the tools as they come configured. But if you ARE building loosely coupled systems these tools work against you by default. And that's not ok, because it is only experience that can teach you that, and by then of course, it's too late. And that's the problem with versioning, it's easy to put it off until v2.0 - by which time you're in trouble.
Rajesh wrote re: Making everything optional
on 04-24-2006 9:56 AM
Jon,
"The interesting question is how different a message (as that's what we're really talking about) you are prepared to accept before you kick it back as 'invalid' - or find a way to achieve clarification programmatically rather than the current out-of-band."

Sailesh hit the nail on the head. Are we too focused on the problems at the "receiving" end? What about the sender? Do contracts help the sender by educating her what will and will not be accepted by the receiver (before she goes into a trial and error mode)? To that end, I like Schematron (or to a lesser extent RelaxNG) even more!

Thanks
Jon Fancey wrote re: Making everything optional
on 04-24-2006 12:18 PM
Rajesh, that's a very good question. But the answer is no. Especially if we don't know who the consumers are. Now, unlike some people I don't actually buy into that unless you're the likes of ebay or amazon. Most providers know their consumers. Even so, it's a fact that with the current state of the art, being flexible in what you receive is easier than being flexible in what you send. This would be approaching clairvoyance. I like schematron and relaxNG too as I mentioned above. But only because they require less effort currently - as Tim points out these things are still achievable with vanilla .net - but they do require more effort, which is why most people don't do it that way.
XML Nation wrote More on making everything optional
on 04-25-2006 6:05 AM
Dare Obasanjo aka Carnage4Life wrote Tim Ewald on Versioning XML Web Services with XSD
on 05-14-2006 4:53 PM
