Friday, January 30, 2009

Interoperability Gotcha: Order of XML Elements

@YaronNaveh

This time I will talk about the importance of element order in XML.

Let's say we have a web service with this schema:


<s:element name="root">
 <s:complexType>
  <s:sequence>
   <s:element name="elem1" type="s:string" />
   <s:element name="elem2" type="s:string" />
  </s:sequence>
 </s:complexType>
</s:element>


So of course this is a valid request:


<root>
 <elem1>I'm elem1</elem1>
 <elem2>I'm elem2</elem2>
</root>


But what about this?


<root>
 <elem2>I'm elem2</elem2>
 <elem1>I'm elem1</elem1>
</root>


This is not a valid XML instance!
The reason is that the schema contains the <s:sequence> element, which requires its child elements to appear in the declared order.
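
To check this claim yourself, here is a minimal sketch that validates both instances against the schema using the .NET validating XmlReader. The class name, the standalone schema wrapper, and the absence of a target namespace are assumptions made for this example and are not taken from a real WSDL.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class SequenceValidationDemo
{
    // Standalone version of the schema fragment from this post.
    // Assumption: no target namespace, "s" bound to the XML Schema namespace.
    const string Xsd = @"
<s:schema xmlns:s='http://www.w3.org/2001/XMLSchema'>
  <s:element name='root'>
    <s:complexType>
      <s:sequence>
        <s:element name='elem1' type='s:string' />
        <s:element name='elem2' type='s:string' />
      </s:sequence>
    </s:complexType>
  </s:element>
</s:schema>";

    public const string Ordered = "<root><elem1>I'm elem1</elem1><elem2>I'm elem2</elem2></root>";
    public const string Swapped = "<root><elem2>I'm elem2</elem2><elem1>I'm elem1</elem1></root>";

    // Returns true when the instance produces no schema validation events.
    public static bool IsValid(string xml)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        bool valid = true;
        settings.ValidationEventHandler += (sender, e) => valid = false;

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // reading the document triggers validation
        }
        return valid;
    }

    static void Main()
    {
        Console.WriteLine("ordered valid: " + IsValid(Ordered)); // True
        Console.WriteLine("swapped valid: " + IsValid(Swapped)); // False
    }
}
```

The validator accepts the first instance and reports an error for the second, because elem2 arrives where the sequence expects elem1.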

So why is it an interoperability gotcha?
The reason is that SOAP stacks from different vendors behave differently in such cases. Sometimes even different stacks from the same vendor disagree...

Let's see what a .NET 2.0 proxy would look like:


...
private string elem1;
private string elem2;
...


This proxy can parse both XML instances above.

This is what the equivalent WCF proxy would look like:


[XmlElement(Order=0)]
public string Elem1
{
...
}

[XmlElement(Order=1)]
public string Elem2
{
...
}


This proxy is stricter and actually enforces the order of elements. It accepts only the first (legal) instance and gives unexpected results with the second. It usually does not throw an exception; instead it ignores the out-of-order elements and leaves their fields at default values, so you end up with a lot of nulls and zeros.
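
The difference can be reproduced outside of a real service with XmlSerializer alone. The sketch below uses two hand-written stand-ins for the generated proxies; the type names UnorderedRoot and OrderedRoot are hypothetical, and real generated code is more verbose.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Stand-in for the .NET 2.0 style proxy: no Order on the members.
[XmlRoot("root")]
public class UnorderedRoot
{
    [XmlElement("elem1")] public string Elem1;
    [XmlElement("elem2")] public string Elem2;
}

// Stand-in for the WCF (XmlSerializer-based) proxy: explicit Order.
[XmlRoot("root")]
public class OrderedRoot
{
    [XmlElement("elem1", Order = 0)] public string Elem1;
    [XmlElement("elem2", Order = 1)] public string Elem2;
}

static class OrderDemo
{
    public const string Ordered = "<root><elem1>I'm elem1</elem1><elem2>I'm elem2</elem2></root>";
    public const string Swapped = "<root><elem2>I'm elem2</elem2><elem1>I'm elem1</elem1></root>";

    public static T Deserialize<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }

    static void Main()
    {
        // Deserialize the out-of-order instance with both proxy styles.
        var lenient = Deserialize<UnorderedRoot>(Swapped);
        var strict = Deserialize<OrderedRoot>(Swapped);

        Console.WriteLine("no Order:   elem1={0}, elem2={1}",
            lenient.Elem1 ?? "(null)", lenient.Elem2 ?? "(null)");
        Console.WriteLine("with Order: elem1={0}, elem2={1}",
            strict.Elem1 ?? "(null)", strict.Elem2 ?? "(null)");
    }
}
```

If the behavior described above holds on your framework version, the first line shows both values, while the second line has at least one field left at (null) and no exception is thrown.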

How to fix it?

Option 1 (recommended) - Send XML with the correct order.

Option 2 - Change the WSDL to use <all> instead of <sequence>. The former does not require the elements to appear in order (there are some other subtle differences). Note that if the WSDL is dynamically generated, this may be hard to maintain.
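
As a rough illustration of Option 2, the sketch below validates the reordered instance against a variant of the schema that uses <s:all>. The standalone schema wrapper and the class name are assumptions for this example; a real WSDL would embed the type in its own <types> section.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class AllCompositorDemo
{
    // Same two elements as in this post, but inside s:all instead of s:sequence.
    const string Xsd = @"
<s:schema xmlns:s='http://www.w3.org/2001/XMLSchema'>
  <s:element name='root'>
    <s:complexType>
      <s:all>
        <s:element name='elem1' type='s:string' />
        <s:element name='elem2' type='s:string' />
      </s:all>
    </s:complexType>
  </s:element>
</s:schema>";

    // Returns true when the instance produces no schema validation events.
    public static bool IsValid(string xml)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        bool valid = true;
        settings.ValidationEventHandler += (sender, e) => valid = false;

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return valid;
    }

    static void Main()
    {
        // Both orders are accepted under s:all.
        Console.WriteLine(IsValid("<root><elem1>a</elem1><elem2>b</elem2></root>")); // True
        Console.WriteLine(IsValid("<root><elem2>b</elem2><elem1>a</elem1></root>")); // True
    }
}
```

One of the subtle differences: in XML Schema 1.0 the children of <all> cannot have maxOccurs greater than 1, so this option does not fit types that contain repeating elements.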

Option 3 - Remove the "Order=..." property from the WCF proxy. Note that when the proxy is regenerated the "Order" property will come back, so this is also a maintenance challenge.
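
Because the edit has to be repeated after every regeneration, it may help to script it. The following is a naive sketch that strips "Order=N" arguments from proxy source text with regular expressions. The helper name StripOrder is hypothetical, and as a plain text match it would also remove Order from other attributes (for example data contract members), so review the result before using it.

```csharp
using System;
using System.Text.RegularExpressions;

static class StripOrderDemo
{
    // Removes "Order=N" arguments from attribute argument lists in proxy source text.
    public static string StripOrder(string source)
    {
        // Case 1: Order is followed by more arguments, e.g. (Order=0, IsNullable=true).
        string result = Regex.Replace(source, @"\bOrder\s*=\s*\d+\s*,\s*", "");
        // Case 2: Order is the last or only argument, e.g. ("elem1", Order=0) or (Order=0).
        return Regex.Replace(result, @"\s*,?\s*\bOrder\s*=\s*\d+", "");
    }

    static void Main()
    {
        string generated = "[System.Xml.Serialization.XmlElementAttribute(Order=3)]";
        // Prints: [System.Xml.Serialization.XmlElementAttribute()]
        Console.WriteLine(StripOrder(generated));
    }
}
```

In practice you would read the generated proxy file, pass its text through the helper, and write it back as a post-generation step.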


8 comments:

Moshe said...

Hi Yaron. The only thing I couldn't understand was who might generate the problematic message (the second instance). For example, although the .NET 2.0 proxy has no means of enforcing the order, in the end the fields within the proxy appear in the correct order (unless the proxy code generation changes the order for some esoteric reason). I can't imagine an automatic layer that would mess up the order (unless you create the SOAP message manually). The only real scenario I can think of is complex type extensions. In this case the order between the inherited fields and the new ones is not obvious.

Yaron Naveh (MVP) said...

Moshe

There can be various sources for that message:

1. Some faulty 3rd party SOAP stack
2. A handcrafted message, or at least a handcrafted message body
3. A malicious hacker
4. A proxy that was created from an older version of the WSDL which had a different element order

In any case, in a SOA environment we should not rely on client-side correctness or validation.

Michael said...

Thank you for this post. It was just what I needed to understand my problem. One comment. In your Option #2, your xml sample is not properly escaped, so in a web browser it reads "Change the WSDL to use instead of .". When I viewed html source I see that it should read "Change the WSDL to use all instead of sequence" where all and sequence are xml elements.

Yaron Naveh (MVP) said...

Thanks @Michael - fixed

Michael said...

There is one small bug in your example. elem2 in the schema is defined as int, but then the example data and proxy show it as a string.

Yaron Naveh (MVP) said...

Thanks @Michael, I've fixed it

Unknown said...

Thank you for the explanation. I faced an issue with exactly the same cause. I generated a WCF client in .NET from a Java-based SOAP 1.0 service. The strange thing was that the response I was receiving was missing particular values, but only those located in subclasses of the base response class. All the values coming from the base class were properly deserialized. The raw response message also looked OK and contained all the necessary data. It seemed that subclass fields were not being properly deserialized.

The root cause was probably the lost field order in the inheritance chain of the SOAP response. After a long investigation I found an old post: http://geekswithblogs.net/LeonidGaneline/archive/2008/05/01/wcf-values-disappeared-in-response-derived-classes-and-serializationdeseriazlization-order.aspx that suggested removing the Order parameter from [System.Xml.Serialization.XmlElementAttribute(Order=3)] throughout the generated client. After removing around 300 occurrences of the Order parameter, my client started working like a charm.

Yaron Naveh (MVP) said...

Thanks for sharing, Pawel!