February 2004 – Dimitri Glazkov

Representing xsd:choice element in C#

The issue of correctly representing semantics of an xsd:choice element seemed like a good challenge, so I decided to tackle it.

Per the XML Schema reference, xsd:choice is a group element that allows one and only one of the elements contained in the group to be present within the containing element. Translated into rough CLR terms, the parent instance of a type may contain only one instance out of a defined group of types. For example, suppose we have an instance of a Boat type, which can hold only an instance of a Cabbage type, an instance of a Goat type, or an instance of a Wolf type, but not any combination of those instances. What OOP concept does this sound like? That’s right, polymorphism. Let’s imagine an abstract class Load, from which Cabbage, Goat, and Wolf all inherit. Then, our declaration will look like so:


class Boat { public Load load; }
abstract class Load { /* ... */ }
class Cabbage : Load { /* specific features of Cabbage */ }
class Goat : Load { /* specific features of Goat */ }
class Wolf : Load { /* specific features of Wolf */ }

Simple enough? Polymorphism-shmolymorphism, and we got ourselves a nice and semantically correct representation of a choice element. Unfortunately, as so often happens in reality, all three of our instances competing for the coveted place on the Boat must also inherit some of their traits from other base classes. For instance, Cabbage is also a Vegetable, Goat is a Herbivore, and Wolf is a Carnivore:


class Cabbage : Load, Vegetable { /* ... */ }
class Goat : Load, Herbivore { /* ... */ }
class Wolf : Load, Carnivore { /* ... */ }

As Steve Saxon pointed out in a comment to my previous post, you can’t do that in the CLR: a type may inherit from only one base type. In these situations, you can simulate multiple inheritance using interfaces. Let’s redefine our Load type as an interface:


interface ILoad {}

This interface is not doing much: it is there only to indicate that an instance of a type that implements it may be a load on the boat. Now Cabbage, Goat, and Wolf can each inherit their traits from a base class and also be marked as potential suitors for the boat ride:


class Cabbage : Vegetable, ILoad { /* ... */ } // the base class must precede interfaces
class Goat : Herbivore, ILoad { /* ... */ }
class Wolf : Carnivore, ILoad { /* ... */ }
class Boat { public ILoad load; }

Now, let’s look at the definition of the Boat type. Although it can clearly carry one instance of ILoad, there is no indication of what exactly the load is: once the instances are cast to ILoad, all of their specifics are hidden by the marker interface. The type information is still available. All you need to do is attempt to cast the load to each of the three types we’ve defined, and one of them is bound to be the answer, unless the boat is empty. Trying to be good programmers, we don’t want to leave this task entirely up to the developers who come behind us. So, with a few modifications, we can add management of boat loading and unloading. Here’s what we’ll do:

Inside Boat type, create a nested class called LoadManager:


public class Boat
{
/* ... */
public class LoadManager {}
}

This new type will contain one private field of type ILoad:


public class LoadManager
{
private ILoad _load;
}

Create accessor properties for each of our riverside travelers:


public class LoadManager
{
private ILoad _load;
public Cabbage Cabbage { get { return _load as Cabbage; } set { _load = value; } }
public Goat Goat { get { return _load as Goat; } set { _load = value; } }
public Wolf Wolf { get { return _load as Wolf; } set { _load = value; } }
}

As you can see, the logic of the LoadManager is slightly different from the plain ILoad implementation. Now you have three doors of different shapes leading to the same room instead of one door that changes shape depending on the entrant, yet they represent the same structure semantically.

To complete our journey, let’s modify the Boat type to have a property of LoadManager type rather than a field of ILoad type (we’ll also throw in lazy initialization just for kicks):


public class Boat
{
/* ... */
private LoadManager _loadManager;
public LoadManager Load { get { if (_loadManager == null) _loadManager = new LoadManager(); return _loadManager; } }
}

Whew! That was a long trip. However, at the end of the road, we’ve put together some pretty code (full listing can be found here):


Boat boat = new Boat();
Console.WriteLine("Loading Cabbage..."); 
boat.Load.Cabbage = new Cabbage();
Console.WriteLine("Loading Goat...");
boat.Load.Goat = new Goat();
Console.WriteLine("Load is Cabbage: " + boat.Load.IsCabbage); // not likely
Console.WriteLine("Load is Goat: " + boat.Load.IsGoat);
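The usage above calls IsCabbage and IsGoat, which the LoadManager snippets in this post do not show. Here is a minimal sketch of how the pieces might fit together, assuming those helpers are simple type tests on the single ILoad slot. The empty Vegetable, Herbivore, and Carnivore classes and the IsWolf and IsEmpty properties are placeholders added for illustration, not part of the original listing.

```csharp
using System;

// Placeholder base types; the post only names them.
public class Vegetable { }
public class Herbivore { }
public class Carnivore { }

// Marker interface: anything that may ride in the boat.
public interface ILoad { }

// C# requires the base class to precede interfaces in the base list.
public class Cabbage : Vegetable, ILoad { }
public class Goat : Herbivore, ILoad { }
public class Wolf : Carnivore, ILoad { }

public class Boat
{
    private LoadManager _loadManager;

    // Lazily created on first access.
    public LoadManager Load
    {
        get
        {
            if (_loadManager == null) _loadManager = new LoadManager();
            return _loadManager;
        }
    }

    public class LoadManager
    {
        // The single slot shared by all three typed properties.
        private ILoad _load;

        // Each getter returns null unless the slot holds that type;
        // each setter overwrites whatever was loaded before.
        public Cabbage Cabbage { get { return _load as Cabbage; } set { _load = value; } }
        public Goat Goat { get { return _load as Goat; } set { _load = value; } }
        public Wolf Wolf { get { return _load as Wolf; } set { _load = value; } }

        // Assumed helpers: plain type tests on the slot.
        public bool IsCabbage { get { return _load is Cabbage; } }
        public bool IsGoat { get { return _load is Goat; } }
        public bool IsWolf { get { return _load is Wolf; } }
        public bool IsEmpty { get { return _load == null; } }
    }
}

public static class BoatDemo
{
    public static void Main()
    {
        Boat boat = new Boat();
        boat.Load.Cabbage = new Cabbage();
        boat.Load.Goat = new Goat(); // replaces the cabbage
        Console.WriteLine("Load is Cabbage: " + boat.Load.IsCabbage); // False
        Console.WriteLine("Load is Goat: " + boat.Load.IsGoat);       // True
    }
}
```

One consequence of sharing a single slot is that assigning null through any of the three properties empties the boat, whatever it was carrying.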

What’s next? Well, how about figuring out how to implement a CLR representation of minOccurs and maxOccurs attributes of the same xsd:choice element?…

Generating C# code from XSD

Steve Saxon has posted a few times on the XSD -> C# code generator that he’s building. Looks pretty cool. He also mentions that ideally some of the code should be generated as nested classes. It is true, some constructs such as “xsd:choice” are not easy to translate into an elegant C# equivalent. So, here’s an idea: what if we represent “xsd:choice” as its own class? Possibly a sub-class of some abstract XsdChoice implementation, this class provides a solid implementation of properties for each element, but uses the same underpinnings to check the facets and provide the actual “choice” framework. Same for the simple types with restrictions. Then, the public definition of Steve’s ProductType class may go something like this:


public class ProductType : XsdComplexType
{
   public int number {get;set;}
   public howBigChoice howBig {get;}
   public howBig2Choice howBig2 {get;}
   public zipCodeType zipCode {get;}
   public DateTime effDate {get;}
}
public class howBigChoice : XsdChoice
{
   public sizeType size {get;}
   public int height {get;set;}
}
public class howBig2Choice : XsdChoice
{
// same here
}
public class sizeType : XsdType
{
  public bool isValid(int size);
  public int Value {get;set;}
}
public class zipCodeType : XsdType
{
  public bool isValid(string zipCode);
  public string Value {get;set;}
}
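To make the “same underpinnings” idea a little more concrete, here is a rough sketch of what an abstract XsdChoice base might do. Everything in it is an assumption for illustration: the Select and TryGet helpers, the SelectedElement property, and the simplification of both size and height to nullable ints (the definition above uses a sizeType wrapper for size). Facet checking is left out.

```csharp
using System;

// Hypothetical base class: remembers at most one chosen element at a time.
public abstract class XsdChoice
{
    private string _selectedName;
    private object _selectedValue;

    // Name of the element currently chosen, or null if nothing is chosen.
    public string SelectedElement { get { return _selectedName; } }

    // Choosing any element discards the previous choice.
    // Passing null clears the choice altogether.
    protected void Select(string elementName, object value)
    {
        _selectedName = value == null ? null : elementName;
        _selectedValue = value;
    }

    // Succeeds only if elementName is the element currently chosen.
    protected bool TryGet<T>(string elementName, out T value)
    {
        if (_selectedName == elementName && _selectedValue is T)
        {
            value = (T)_selectedValue;
            return true;
        }
        value = default(T);
        return false;
    }
}

// Simplified stand-in for the generated class: a choice between size and height.
public class howBigChoice : XsdChoice
{
    public int? size
    {
        get { int v; return TryGet("size", out v) ? v : (int?)null; }
        set { Select("size", value); }
    }

    public int? height
    {
        get { int v; return TryGet("height", out v) ? v : (int?)null; }
        set { Select("height", value); }
    }
}

public static class ChoiceDemo
{
    public static void Main()
    {
        howBigChoice howBig = new howBigChoice();
        howBig.size = 3;
        howBig.height = 10; // replaces size
        Console.WriteLine(howBig.SelectedElement); // height
        Console.WriteLine(howBig.size.HasValue);   // False
    }
}
```

The generated classes would then contain only thin typed properties, while the one-of-many bookkeeping lives in a single place.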

What do you think?


HTML, CSS and Other Curious Stuff That You May Find Hard-coded In Your Web Application

Design View Dude is asking interesting questions about composing HTML/CSS mark-up in the process of building a .NET Web application.

Ok, I may be a little too radical on this, but IMHO the controls should emit as little HTML mark-up as possible. Ideally, none. I am puzzled at how such great concepts as XSLT and XML-driven rendering are completely (well, aside from the Xml control, which doesn’t really count) ignored in the .NET strategy of building Web content. In a proper development environment, the producer of HTML code (the one with visual design skills) and the software developer (the one with application architecture skills) are rarely the same person. It would be only logical to keep their activities separate.

So, if you ask me, I would like to see a framework built in such a way that content (this includes UI) is emitted in XML, which then can be freely shaped into the desired context using XSLT.

That way, we don’t have to worry about script being injected in the wrong place or mark-up being non-XHTML-compliant.

Similarly, we don’t have to worry about application developers designing the presentation UI and HTML being hard-coded into the logic of the software.

It will also ease your job in developing visualizers for the development environment — all you need is a test harness stylesheet, which takes emitted XML and creates a usable Web application out of it.

Think about it — skinning and tweaking the look and feel would no longer be connected to the application development cycle.
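As a rough illustration of the split, here is a small sketch in which a control emits only XML and a separate stylesheet, owned by the designer, produces the mark-up. The menu/item vocabulary, the stylesheet, and the Render helper are all invented for this example. It uses XslCompiledTransform from System.Xml.Xsl, which arrived in a later version of the framework than the one this post was written against.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XmlFirstRendering
{
    // What a control might emit: structure only, no HTML (invented vocabulary).
    public const string ControlXml =
        "<menu><item href='/journals'>Journals</item><item href='/about'>About</item></menu>";

    // The designer-owned stylesheet that shapes the XML into mark-up.
    public const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='menu'>
    <ul class='nav'><xsl:apply-templates select='item'/></ul>
  </xsl:template>
  <xsl:template match='item'>
    <li><a href='{@href}'><xsl:value-of select='.'/></a></li>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to the emitted XML and returns the result as text.
    public static string Render(string xml, string xslt)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader xsltReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(xsltReader);
        }

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        {
            transform.Transform(input, null, output);
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Render(ControlXml, Stylesheet));
    }
}
```

Swapping in a different stylesheet changes the look without touching the application code, which is the point of the argument above.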

Antarctica.uab.edu or why cobbler’s kids go barefoot

One of my more recent projects was the Antarctica.uab.edu Web site. Basically, it hosts online journals of a close-knit group of scientists at the University of Alabama at Birmingham (UAB), who are currently pursuing biological research in Antarctica. Twice a week (or more often, if time permits), they connect to the Internet and update their journals, read comments, and answer questions. You can ask them anything from penguin vs. leopard seal questions to some really hardcore microbiology and even behavioral psychology stuff, and they will be happy to give you their polar perspective. Pretty cool, eh?

But where it gets really exciting is the technical side. This is the first time we’ve implemented a blog engine using Estrada Extensions. I have to say, it came out pretty nice. Here are the highlights:

Moderation — after a comment is added to the post (or article, following the naming convention of a journal), an email notification is issued to the moderator of the journal, who can then click on the link in the email and approve or reject the comment. If the comment is approved, it immediately appears in the article comments section.
Comment posting delay — a visitor may only comment once every 5 minutes on the same article.
Sorting — for some reason, a very important capability of sorting is a rarity in the blogging world. On this site, you can sort comments and articles by date, poster’s name, title, etc.
Comment “folding”: only 30 comments at a time will appear in the comments section. The rest are accessible via pagination links (next page, previous page, first page, second page, etc.)
There is also hide/show fields functionality (for example, hide the comment body for all comments to create an abbreviated view of the comments), as well as “show all comments” rather than 30 at a time, but it is not enabled at the moment because we are still tweaking the UI.
Journal avatars — each journal has a graphic associated with it (a la LiveJournal)
Per-article gallery and link sidebar — each article may have a picture gallery and/or a link sidebar, so that all of the magnificent pictures of the barren ice and all the links to relevant Web resources are grouped together nicely.
Last but not least is the RSS support — either aggregated main feed of all journals or per-journal feed.

One of the things to mention is that this site was put together fairly quickly: about 10 days of actual implementation. With planning, architecture, and requirements development included, it was a 2-month project from start to finish.

Now here’s a logical question — if you think you have such a cool blog engine, how come you are still using .Text? Well, I am planning to convert, honest. I just don’t have the time.

The Cruelty of Unattainable Perfection

First, there was pristine perfection: a vision of a markup language that was simple, intuitive, and made your online documents look reasonably pretty. Then… well, then everything went to hell in a hand basket. In a little over a decade, HTML became the clumsy, disfigured, and repulsive Frankenstein of a language that it is right now. Not only is the language itself full of atavistic lexical protrusions and semantic dead-ends, but its implementation, until very recently, has been a tale of creative standard interpretation and compliance infidelity.

Those of you who have spent countless hours trying to make a fairly simplistic layout work correctly in multiple browsers would readily agree with me. In fact, the process of coding HTML and making it cross-browser-compliant has become something of an encryption: once you get the thing done, it is easier to re-code it from scratch than to make edits to the code (unless, of course, you don't care and are willing to "dreamweaver" it together – ah, the familiar "duct tape" solution).

Not all is lost, though. With the help of enthusiasts and browser developers, W3C is slowly but surely trying to bring the ship about and put an end to the madness. "Let content be content", they said. So the cascading style sheets (CSS) were crowned the new king of layout.

Since then, we've come a long way. CSS2 is now a reality. CSS3 is taking shape as we speak. Just a bunch of lone madmen on a street corner a short while ago, the proponents of standards-based HTML development are gaining popularity and support.

I, too, have succumbed to the beautiful idea of content-context separation, and made a good effort to adopt standards in HTML code development.

And you know what I've found out? It's still about hacking. Although not as terrible as before, programming with CSS is still an imperfect process. Go to any of the Web sites that talk about CSS. What will you find? Lots and lots of hacking around the uneven browser support and just plain limitations of the specification itself. Let me give you an example: suppose you have a div element that contains a ul element with an arbitrary number of li elements:

<div>
  <ul>
    <li>line item 1</li>
    <li>line item 2</li>
    <li>line item 3</li>
    ...
    <li>line item N</li>
  </ul>
</div>

Just content, right? Now, here's a pop quiz: make this render as a neatly centered horizontal navigation bar, like so:

Done? Now, was it worth that much work? Why does it have to be this hard?

Yes, this is still better than the "nested-table encryption", yes, we are getting closer to perfection. But the cruelty of the situation is that we need this to work now, without resorting to building tables upon tables to control the layout…