Versioning and semantic changes

I got an email a little while ago from someone who read about my preferred XSD versioning strategy. They felt I had glossed over the issue of introducing a change that affects semantics and might be ignored by a receiver. Consider this example:

<element name="Deposit">
  <complexType>
    <sequence>
      <element name="Account" type="tn:AccountId" />
      <element name="Amount" type="double" minOccurs="0" />
    </sequence>
  </complexType>
</element>

Now suppose I evolved the schema by adding this optional element:

<element name="Deposit">
  <complexType>
    <sequence>
      <element name="Account" type="tn:AccountId" />
      <element name="Amount" type="double" minOccurs="0" />
      <element name="Currency" type="tn:CurrencyType" minOccurs="0" />
    </sequence>
  </complexType>
</element>

The new Currency element is optional so that clients who don't send it still work: they get the default currency that was used with the original version of the Deposit message. This works fine as long as you never have new clients talking to old services. If you do, a new client could send a Deposit with a Currency, and the old service would skip that unknown element (as per the rules I described in my original post) and process the deposit in the default currency, which is not what the client asked for.
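
To make the failure concrete, here is a minimal Python sketch of an old receiver that follows a "must ignore unknown" rule. It is an illustration under stated assumptions, not the actual service code: the function name, the USD default, the account number, and the un-namespaced element names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the currency implied by the original Deposit message.
DEFAULT_CURRENCY = "USD"

# The only child elements the original (old) service was written to understand.
KNOWN_V1_ELEMENTS = {"Account", "Amount"}


def old_service_process(xml_text):
    """Read a Deposit the way an old, must-ignore-unknown receiver might."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        if child.tag in KNOWN_V1_ELEMENTS:
            fields[child.tag] = child.text
        # Any other element, such as Currency, is skipped without complaint.
    amount = float(fields["Amount"]) if "Amount" in fields else None
    return {
        "account": fields.get("Account"),
        "amount": amount,
        "currency": DEFAULT_CURRENCY,  # the old service knows no other currency
    }


# A new client asks to deposit 100 euros.
new_client_message = """<Deposit>
  <Account>12345</Account>
  <Amount>100</Amount>
  <Currency>EUR</Currency>
</Deposit>"""

result = old_service_process(new_client_message)
# The amount is booked, but in the default currency rather than EUR.
print(result["amount"], result["currency"])
```

Nothing fails and nothing is logged, which is exactly why this kind of change is dangerous: the message is valid for both parties, yet they disagree about what it means.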

The solution to this problem is to determine whether you will ever have the new client / old service situation (if your services are deployed as a single instance, or are always updated before any new clients are released, then you never will). If you do, then exercise your judgement about changes. If you are going to change an XSD in a way that alters semantics, make sure you differentiate new instances somehow. For instance, you could add this to your schema:

<element name="DepositWithCurrency">
  <complexType>
    <sequence>
      <element name="Account" type="tn:AccountId" />
      <element name="Amount" type="double" minOccurs="0" />
      <element name="Currency" type="tn:CurrencyType" minOccurs="0" />
    </sequence>
  </complexType>
</element>

This way you are versioning your schema, but not the Deposit element. This avoids the problem without introducing too much extra complexity.
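
A rough Python sketch of why a separate element helps, assuming a receiver that dispatches on the root element name. The handler names, the USD default, and the un-namespaced element names are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

DEFAULT_CURRENCY = "USD"  # assumption: the currency the original Deposit implied


def read_amount(root):
    # Amount is optional in the schema, so it may be absent.
    text = root.findtext("Amount")
    return float(text) if text is not None else None


def handle_deposit(root):
    return {"amount": read_amount(root), "currency": DEFAULT_CURRENCY}


def handle_deposit_with_currency(root):
    currency = root.findtext("Currency", default=DEFAULT_CURRENCY)
    return {"amount": read_amount(root), "currency": currency}


# The old service only knows Deposit; the new one knows both elements.
OLD_SERVICE = {"Deposit": handle_deposit}
NEW_SERVICE = {
    "Deposit": handle_deposit,
    "DepositWithCurrency": handle_deposit_with_currency,
}


def dispatch(handlers, xml_text):
    root = ET.fromstring(xml_text)
    handler = handlers.get(root.tag)
    if handler is None:
        # An unknown top-level message is rejected instead of being misread.
        raise ValueError("unsupported message: " + root.tag)
    return handler(root)


message = """<DepositWithCurrency>
  <Account>12345</Account>
  <Amount>100</Amount>
  <Currency>EUR</Currency>
</DepositWithCurrency>"""

print(dispatch(NEW_SERVICE, message))
try:
    dispatch(OLD_SERVICE, message)
except ValueError as error:
    print("old service rejected it:", error)
```

The old service now fails loudly on a message it cannot interpret correctly, while existing clients that send plain Deposit messages are unaffected.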


Posted May 05 2006, 02:27 PM by tim-ewald

Comments

Ebenezer Ikonne wrote re: Versioning and semantic changes
on 05-08-2006 6:44 AM
I think if "services" are built to accept "messages", i.e. ProcessMsg(Message msg), where the message is essentially XML (not typed), then the service can be evolved to support different messages, and hence to support and validate data described by different schemas. So for the example presented above, we simply have two different messages that the service should be able to handle.
Jason Orendorff wrote re: Versioning and semantic changes
on 05-16-2006 6:36 AM
Ebenezer:

Sure, if your API says nothing about your service, you can evolve willy-nilly without changing the API. I'm not sure what use that is.

Here's what types are about. A single API can perform a lot of really useful functions: communicating to application programmers how to use it; providing a natural contract between service and client, allowing each to reason about the other's behavior; detecting errors closer to the source, to make bugs easier to find; detecting dangerous errors early, before they do costly damage; communicating changes (because a change in the contract implies a change in the API); allowing libraries on the client and server to automate away a whole layer of message serialization, parsing, and dispatching; and what the heck, supporting features like Intellisense in IDEs. These aren't trivial features. Learnability and robustness against mistakes are among the most important qualities of... well, any product.

An excessively typed API is rigid and requires too much maintenance. It requires breaking changes for stupid little things like adding an optional parameter to a method. An insufficiently typed API, like the one you describe, leaves out features that an API could provide.
Fraser Goffin wrote re: Versioning and semantic changes
on 06-03-2006 7:45 AM
Tim,

not sure if you're prepared to continue comments on this series of blog entries, but here goes.

Like you I have been thinking about versioning, extensibility and validation for some time now and trying different approaches. More recently there have been a couple of threads running on XML-Dev and UBL-Dev, in which I've been active, that have focussed on some of this. I will introduce some of the thoughts that are surfacing there later on (just to share one early: validation is very much considered as a layered approach, potentially using a number of technologies, rather than all or nothing using only WXS).

It's hard to be critical of any well reasoned commentary on this subject. If there's one thing we have all found, it's that there doesn't appear to be a simple solution, nor one which fits all circumstances. So whilst there is much to be agreed with in your series of articles on versioning, IMO some aspects just don't quite fit with my experience and perhaps those of others. It's probably worth saying right out front that I am primarily interested in B2B, and that in my industry sector (financial services), services, and the schemata that define their data content, are more often than not large and complex, typically reflecting an aspect of a business process (for example a request for quotation might contain 40-50 business entities and 200+ elements arranged in some meaningful way). The message exchange patterns also reflect the business process in the sense that, although individual exchanges may follow a simple synchronous request/reply pattern, the overall conversation is often long running, asynchronous, and stateful (just to distinguish what I'm talking about from a service modelled like a previously implemented API method with a few params).

I did start to write a response that covered all of your related blogs, but it was just too long. So I thought I'd start off gently with a few probes and see where that takes us.

I'd like to just clarify a couple of comments that may hint at your preferred approach to schema design (since this can have a profound effect on versioning and extensibility):

a. 'Make everything optional' - except an *identifier*.

Are you suggesting that *every* business entity (Party, Address, Payment, etc.) should be modelled with a unique ID?

What is this ID for? Is it to identify an instance fragment as being of a particular type (i.e. an Address, a Party), or are you talking about an identifier that has meaning to the communicating parties, i.e. PartyID 'abc123' actually means 'Fraser Goffin' (a shared enterprise key)?

b. No need for xs:any?

Given the suggested approach (including the rebuttal of WXS xs:any + 'sentinel'), are there any circumstances in which using WXS extensibility makes sense?

c. For maximum flexibility / reuse, are you suggesting every business entity be declared as a global type/element?


Now to your main theme in the earlier blogs - 'Make everything Optional' (except ID) :-

This implies that it would be *technically* valid to send an instance containing *only* the identifiers. Of course it is likely that this instance would *not* be sufficient and would be rejected by the validation (or business rules) processing. However, I can't help feeling that the extreme of making everything optional just doesn't reflect reality and potentially reduces clarity and precision in specification. When we design a data (and service) model, one aspect concerns cohesion. That is, we want to ensure that our grouping of data items makes sense in the contexts in which they will be used. This applies equally to cardinality. In most cases there *is* a meaningful subset that applies to *all* contexts, and this is typically more than just an ID! If that's the normal case, why not mark each item in that subset as mandatory? Surely that would a) improve the precision of the specification, b) as a consequence, reduce the need for some aspects of the 'out of band' communication (use case scenarios, semantics and QoS are things I would more typically expect to find there, rather than detailed data model descriptions), and c) reduce the 'trial and error' involved in understanding the content of exchanges which *are* actually required for meaningful communications between parties. I hear your point about not assuming what a consumer may be able to send, but there's not much point receiving data which is clearly not fit for purpose right at the outset. Indeed, if the consumer were validating their output before sending (not sure if anyone actually does that), it wouldn't get past that point. Earlier notification, easier/cheaper to fix.

As an example, if I have a complexType for Address, I think you'd agree that in order for it to be genuinely useful, I probably need more than just an ID field (of course it depends what is in there: it could be an ID that just identifies this data fragment as an instance of the address type, or it could be an ID that contains a value that has shared meaning between the trading partners. For now let's assume the former; it's a bit too early in this dialogue for a discussion on the subject of enterprise keys :-). So we might decide that a zip/post code is mandatory (that gets us to 15 or so properties in the UK) along with, say, house/flat number/name. If this turns out to be the minimum useful requirement, why not make those data items mandatory?

This aspect of making things business meaningful is something that I would be reluctant to forego, whether we are talking about data or service definition. Abstraction is good, but at some point the 'rubber hits the road'. In my experience of integration, confusion over details such as this is the biggest consumer of development and testing resources, some of which can certainly be avoided by a more precise spec. This doesn't have to mean less flexible; it means focussing attention on those aspects which may vary by context and letting the rest speak for itself.

Fraser.
Fraser Goffin wrote re: Versioning and semantic changes
on 06-03-2006 11:16 PM
OK, the penny has finally dropped! Using IDs for each business entity accompanied by other information items does make more sense than I was giving credit for. I still think that there are difficulties in establishing a shared key that all participants can use successfully, deciding on which party sets up the key (who has the 'master data'), and ensuring that the abstraction doesn't leak either party's technical implementation details, but all of these are probably solvable (even if you end up carrying 2 IDs: yourRef/myRef). So most of the earlier post is irrelevant. Sorry, that's just me reacting to a blog that's presenting different ideas than I currently use, without thinking it through properly.

I am still interested in whether you feel that there is a place for xs:any in schema design though?

I can see that it is possible to write code such as the example that you have posted to encode rules for 'must ignore unknown', but I can't help feeling that this does somewhat 'raise the bar' for some implementers who may not have the resources to do so, or the inclination (software is seldom the core business of trading partners). IMO people probably just want to use the capabilities of COTS packaged applications or the basic capabilities of general parsers without having to write (and more importantly maintain) custom code. Using a standard validating parser, it's possible to implement 'must ignore unknown' via xs:any, isn't it? (Keeping in mind Dave Orchard's point about the retain or discard flavours if you happen to be an intermediary. Actually I think this also applies to endpoints: although the request may contain data which is of no interest to the receiving endpoint's processing, it is often the case that that data is required to be reflected back in the response, so it can't just be discarded.)

To borrow a term that you 'coined' some time ago - 'making it easy to pay me' is about how easy it is for others to integrate with my service and if that means you have to write a bunch of code to ensure that I send you the correct data (over and above what my basic infrastructure tooling will tell me) it may be that this very important principle is somewhat eroded ?

Fraser.
Ken Brubaker wrote Tim Ewald's solution for XML Schema versioning
on 11-28-2006 7:55 AM
Tim Ewald addresses the XML Schema versioning issue head on.
