Tuesday, 21 September 2010

Map System.TimeSpan to xs:duration for the DataContractSerializer

I’m working on a new RESTful service (I’m using ASP.NET MVC for this) and I’m keen to make it as easy as possible to integrate with across as many client platforms as possible.

I’m going to be using XML exclusively, as it enables me to produce schemas that our service clients can download to see the shape of each object that the service operations expect.

I also want to employ XML schema validation on the data coming in from the request.  Doing this will trap most data errors before they get further down the request pipeline, thus protecting my code; it will also ensure that the caller gets nice verbose error messages, since XML Schema validation is pretty explicit about what’s gone wrong!
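To give a flavour of that validation step, here is a minimal sketch using the framework’s validating XmlReader. The SchemaValidationSketch class and its Validate method are hypothetical names chosen for illustration (this is not the code from my service), and it assumes you already have an XmlSchemaSet loaded with the relevant schema.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, for illustration only.
public static class SchemaValidationSketch
{
    // Reads 'xml' with schema validation switched on and returns every
    // validation message raised; an empty list means the document passed.
    public static List<string> Validate(string xml, XmlSchemaSet schemas)
    {
        var errors = new List<string>();

        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas
        };

        // Collect the messages instead of letting the reader throw.
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Validation happens as the reader walks the document.
            while (reader.Read()) { }
        }

        return errors;
    }
}
```

With a schema that declares an element of type xs:duration, a value such as ‘P400D’ should come back with no messages, whereas free text should produce at least one.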

Thus, I’ve wired up an MVC action to produce a schema for the relevant objects using the XsdDataContractExporter class.  I will simply point users at this, alongside documentation for each of the operations, which will include schema type names from that ‘live’ schema.
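The export itself only takes a few lines. The following is a rough sketch rather than my actual MVC action: SampleContract and SchemaExportSketch are hypothetical names, and it assumes the full .NET Framework with a reference to System.Runtime.Serialization.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml.Schema;

// Hypothetical contract type, for illustration only.
[DataContract]
public class SampleContract
{
    [DataMember]
    public TimeSpan Duration { get; set; }
}

// Hypothetical helper, for illustration only.
public static class SchemaExportSketch
{
    // Exports the data contract schema(s) for 'type' and returns them as text.
    public static string ExportSchemas(Type type)
    {
        var exporter = new XsdDataContractExporter();
        exporter.Export(type);

        var output = new StringWriter();

        // The exporter usually produces more than one schema (one per
        // namespace involved), so each is written out in turn.
        foreach (XmlSchema schema in exporter.Schemas.Schemas())
        {
            schema.Write(output);
            output.WriteLine();
        }

        return output.ToString();
    }
}
```

Calling SchemaExportSketch.ExportSchemas(typeof(SampleContract)) gives you the text you would return from the action; in a real service you would probably pick out the schema for a single target namespace rather than concatenating them all.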

While testing out one of the types, I noticed that a TimeSpan member was being serialized as xs:string and not as xs:duration.  I consulted an MSDN topic I should now know by heart (given how many times I’ve looked at it) to check the support for TimeSpan in the DataContractSerializer and sure enough it’s there; but I couldn’t understand why, if DateTime is indeed mapped to the XML dateTime type, TimeSpan is not mapped to the XML duration type.

So I’ve written a TimeSpan wrapper type called XmlDuration that is implicitly convertible to and from System.TimeSpan but which, when you expose it as a member on a Data Contract, presents itself as xs:duration.  This then means that if you enable schema validation on your incoming XML, the input string will be validated against the rules attached to the XML duration type, instead of being simply a string that allows any content.

The code is as follows.  There’s one really long line in there which is a Regex that I’ve split into multiple string additions purely for this post; you can join the strings back up again if you so desire:

    using System;
    using System.Globalization;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Xml;
    using System.Xml.Schema;
    using System.Xml.Serialization;

    /// <summary>
    /// This type is a TimeSpan in the .Net world but, when
    /// serialized as Xml it behaves like an XML duration type.
    /// For more about the duration data type in XML - see
    /// http://www.w3.org/TR/xmlschema-2/#duration
    ///
    /// You should use this on types that intend to send a
    /// timespan to ensure clients can read the data in a
    /// conformant manner.
    ///
    /// Note that when the type writes out to XML, it starts
    /// with days; not years.  That is because .Net timespan
    /// only expresses days.
    ///
    /// If the original duration string is preserved from
    /// the input XML, then the same duration instance will
    /// serialize out using that same string.
    /// </summary>
    [XmlSchemaProvider("GetTypeSchema")]
    public sealed class XmlDuration : IXmlSerializable
    {
        /// <summary>
        /// When this instance is loaded from XML, this is
        /// the original string.
        /// </summary>
        public string ValueString { get; private set; }

        /// <summary>
        /// The inner .Net TimeSpan value for this instance
        /// </summary>
        public TimeSpan Value { get; private set; }

        public XmlDuration() { }

        public XmlDuration(TimeSpan duration)
        {
            Value = duration;
        }

        public XmlDuration(XmlDuration duration)
        {
            Value = duration.Value;
            ValueString = duration.ValueString;
        }

        public XmlDuration(string duration)
        {
            try
            {
                Value = TimeSpanFromDurationString(duration);
                ValueString = duration;
            }
            catch (ArgumentException aex)
            {
                throw new ArgumentException
                    ("Invalid duration (see inner exception)", "duration",
                    aex);
            }
        }

        public static implicit operator XmlDuration(TimeSpan source)
        {
            return new XmlDuration(source);
        }

        public static implicit operator TimeSpan(XmlDuration source)
        {
            return source.Value;
        }

        private static readonly Regex RxRead = new Regex(@"^(?<ISNEGATIVE>-)?" +
            @"P((?<YEARS>[0-9]+)Y)?((?<MONTHS>([0-9])+)M)?((?<DAYS>([0-9])+)D)?" +
            @"(T((?<HOURS>([0-9])+)H)?((?<MINUTES>([0-9])+)M)?((?<SECONDS>([0-9]" +
            @")+(\.[0-9]{1,3})?)S)?)?$", RegexOptions.Compiled);

        /// <summary>
        /// Constructs a new TimeSpan instance from the passed XML
        /// duration string (see summary on this type for a link
        /// that describes the format).
        ///
        /// Note that if the input string is not a valid XML
        /// duration, an argument exception will occur.
        /// </summary>
        /// <param name="value">The XML duration string to parse.</param>
        /// <returns>The equivalent TimeSpan.</returns>
        public static TimeSpan TimeSpanFromDurationString(string value)
        {
            var match = RxRead.Match(value);

            if (!match.Success)
                throw new ArgumentException(string.Format(
                    "The string {0} is not a valid XML duration", value),
                    "value");

            //XML always uses '.' as the decimal separator, so parse
            //with the invariant culture rather than the thread's culture.
            var inv = CultureInfo.InvariantCulture;
            int years = 0, months = 0, days = 0, hours = 0, minutes = 0;
            double seconds = 0;

            bool isNegative = match.Groups["ISNEGATIVE"].Success;

            var group = match.Groups["YEARS"];
            if (group.Success)
                years = int.Parse(group.Value, inv);
            group = match.Groups["MONTHS"];
            if (group.Success)
                months = int.Parse(group.Value, inv);
            group = match.Groups["DAYS"];
            if (group.Success)
                days = int.Parse(group.Value, inv);
            group = match.Groups["HOURS"];
            if (group.Success)
                hours = int.Parse(group.Value, inv);
            group = match.Groups["MINUTES"];
            if (group.Success)
                minutes = int.Parse(group.Value, inv);
            group = match.Groups["SECONDS"];
            if (group.Success)
                seconds = double.Parse(group.Value, inv);

            //now have to split the seconds into whole and fractional.
            //note - there is clearly a potential for a loss of fidelity
            //here given that we're expanding years and months to 365 and
            //30 days respectively. There's no perfect solution - although
            //you can simply ask your web service clients to express all
            //durations in terms of days, hours, minutes and seconds.
            int wholeSeconds = (int)seconds;
            seconds -= wholeSeconds;
            //round rather than truncate, so that floating point error
            //does not turn e.g. 5 milliseconds into 4.
            TimeSpan toReturn = new TimeSpan(
                (years * 365) + (months * 30) + days,
                hours, minutes, wholeSeconds,
                (int)Math.Round(seconds * 1000));
            if (isNegative)
                toReturn = toReturn.Negate();

            return toReturn;
        }

        #region IXmlSerializable Members

        /// <summary>
        /// Returns a qualified name of
        /// http://www.w3.org/2001/XMLSchema:duration
        /// </summary>
        /// <param name="xs">The schema set (unused here).</param>
        /// <returns>The qualified name of xs:duration.</returns>
        public static XmlQualifiedName GetTypeSchema(XmlSchemaSet xs)
        {
            return new XmlQualifiedName
                ("duration", "http://www.w3.org/2001/XMLSchema");
        }

        public XmlSchema GetSchema()
        {
            //see the static GetTypeSchema method.
            return null;
        }

        public void ReadXml(XmlReader reader)
        {
            string s = reader.ReadElementContentAsString();
            if (!string.IsNullOrWhiteSpace(s))
            {
                Value = TimeSpanFromDurationString(s);
                ValueString = s;
            }
            else
                Value = TimeSpan.MinValue;
        }

        public void WriteXml(XmlWriter writer)
        {
            //if we have the original duration string then we write that back out.
            if (!string.IsNullOrWhiteSpace(ValueString))
            {
                writer.WriteValue(ValueString);
                return;
            }

            var inv = CultureInfo.InvariantCulture;
            StringBuilder sb = new StringBuilder();

            //the sign goes at the front; the components themselves
            //must be written as absolute values.
            if (Value.Ticks < 0)
                sb.Append('-');

            sb.AppendFormat(inv, "P{0}D", Math.Abs(Value.Days));
            sb.AppendFormat(inv, "T{0}H", Math.Abs(Value.Hours));
            sb.AppendFormat(inv, "{0}M", Math.Abs(Value.Minutes));
            sb.AppendFormat(inv, "{0}", Math.Abs(Value.Seconds));

            //zero-pad the milliseconds so that 5ms is written as .005
            if (Value.Milliseconds != 0)
                sb.AppendFormat(inv, ".{0:000}", Math.Abs(Value.Milliseconds));

            sb.Append('S');

            writer.WriteValue(sb.ToString());
        }

        #endregion
    }

I’ve taken this class and merged it into the System.Xml namespace, because clearly it will work with the XmlSerializer as well as with the DataContractSerializer.

A few notes.

The nifty part of this class is in the use of the XmlSchemaProviderAttribute.  This is the .Net framework’s preferred mechanism for mapping types to Xml schema.  In theory, if you were writing a more complex custom type for which Schema simply cannot be auto-generated, you could manually inject the schema into the XmlSchemaSet passed into the GetTypeSchema method (the name of which is determined by the parameter you pass to the attribute constructor).  You would then return the XmlQualifiedName of this schema type to satisfy the framework.
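To illustrate that more complex case, here is a minimal sketch of a type that injects its own schema into the XmlSchemaSet. Everything specific to it is an assumption made up for this example: the Percentage type, the ‘urn:example:types’ namespace and the ‘percentage’ simple type are hypothetical and not part of my service.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical type, for illustration only: an int restricted to 0..100.
[XmlSchemaProvider("GetTypeSchema")]
public sealed class Percentage : IXmlSerializable
{
    private const string Namespace = "urn:example:types";

    // Hand-written schema describing the custom simple type.
    private const string Xsd =
        @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
                     targetNamespace='urn:example:types'>
            <xs:simpleType name='percentage'>
              <xs:restriction base='xs:int'>
                <xs:minInclusive value='0'/>
                <xs:maxInclusive value='100'/>
              </xs:restriction>
            </xs:simpleType>
          </xs:schema>";

    public int Value { get; set; }

    // Named by the XmlSchemaProvider attribute above: adds our schema to
    // the set, then tells the framework which type within it describes us.
    public static XmlQualifiedName GetTypeSchema(XmlSchemaSet xs)
    {
        using (var reader = new StringReader(Xsd))
        {
            xs.Add(XmlSchema.Read(reader, null));
        }
        return new XmlQualifiedName("percentage", Namespace);
    }

    public XmlSchema GetSchema()
    {
        // Not used when an XmlSchemaProvider attribute is present.
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        Value = reader.ReadElementContentAsInt();
    }

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteValue(Value);
    }
}
```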

In this case all we have to do is to return the well-known qualified name of the XML duration type: ‘duration’ from the namespace ‘http://www.w3.org/2001/XMLSchema’ (note that the namespace is case-sensitive).  We should be able to rely on anyone working with XML to have mapped this namespace already, and we know that a schema exporter will be doing the same since most of the .Net fundamental types are mapped to the same namespace.

Under the hood I’ve written a simple regex parser based on the format for the duration data type.  It will recognise valid strings as long as the seconds carry no more than three fractional digits (TimeSpan is only being populated to the millisecond here), but it also lets one or two invalid ones through (notably ‘PT’).  However, if you are also using schema validation then this will not trouble you.

As the comments state, years and months are a problem, since the .Net TimeSpan chickens out and doesn’t encode years or months (presumably because that makes it much easier to calculate one from the difference of two DateTimes, as well as to add one onto a DateTime).  Of course neither has a fixed number of days, so I’ve gone for a reasonable average.  You could be more clever and take the current month’s number of days plus a strict average of 365.25 days per year; it depends on how accurate you really need it to be.
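To make the trade-off concrete, here is a small sketch of both approaches side by side. DurationApproximation and its two methods are hypothetical names for illustration; the first mirrors the arithmetic used in XmlDuration, the second is only one possible reading of the ‘more clever’ alternative.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class DurationApproximation
{
    // The approach used in XmlDuration: 365 days per year, 30 per month.
    public static TimeSpan FixedAverage(int years, int months, int days)
    {
        return TimeSpan.FromDays((years * 365) + (months * 30) + days);
    }

    // An alternative: 365.25 days per year, and the number of days in
    // the month that 'now' falls in for every month.
    public static TimeSpan CurrentMonthBased(
        DateTime now, int years, int months, int days)
    {
        int daysInMonth = DateTime.DaysInMonth(now.Year, now.Month);
        return TimeSpan.FromDays(
            (years * 365.25) + (months * daysInMonth) + days);
    }
}
```

For ‘P1Y2M3D’ the first method gives 365 + 60 + 3 = 428 days.  The second, evaluated in September 2010 (a 30-day month), gives 365.25 + 60 + 3 = 428.25 days, and the same string would give a different answer if evaluated in a 31-day month, which may or may not be what you want.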

If you’re writing a new web service, you can simply make sure that all your clients express durations starting with the number of days – e.g. ‘P400D’ which will be deserialized exactly into a .Net TimeSpan representing 400 days.
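If you want to sanity-check what a days-first duration string should come out as, the framework’s own XmlConvert class can parse xs:duration strings, so it makes a handy cross-check. This is only a sketch for comparison, and XmlConvertCrossCheck is a hypothetical name; it is not used by the XmlDuration class above.

```csharp
using System;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class XmlConvertCrossCheck
{
    // Parses an xs:duration string using the framework's built-in converter.
    public static TimeSpan Parse(string duration)
    {
        return XmlConvert.ToTimeSpan(duration);
    }
}
```

XmlConvertCrossCheck.Parse("P400D") yields a TimeSpan of exactly 400 days, and "PT1H30M" yields one hour and thirty minutes, which is what XmlDuration should give you for the same inputs.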

In situations where a duration is received from a client and might need to be sent back to them – the original input string is preserved (but it will be up to you to persist that server side).

So now you can change a DataContract class like this:

    public class Class1
    {
        public TimeSpan Duration { get; set; }
    }

And change it over to this:

    public class Class1
    {
        public XmlDuration Duration { get; set; }
    }

In this case these types aren’t annotated of course (whereas in my case all my exported types are).

If you’re publishing a data contract for a type that must also implement some internal interface that exposes a TimeSpan like this:

    public interface IClassInternal
    {
        TimeSpan Duration { get; set; }
    }

Then your best policy is to write the DataContract class as follows:

    public class Class1 : IClassInternal
    {
        public XmlDuration Duration { get; set; }

        #region IClassInternal Members

        TimeSpan IClassInternal.Duration
        {
            get
            {
                //use implicit casting operator.  Note that this will
                //throw if Duration has not been set (i.e. is null).
                return Duration;
            }
            set
            {
                //TimeSpan is a value type so it can never be null;
                //the implicit casting operator is used here too.
                Duration = value;
            }
        }

        #endregion
    }

Obviously, there is an argument here that the XmlDuration should, in fact, be a value type and not a class.  I’ll leave that up to you to decide.