Code Generation, SharedContracts and The Sneaky Bug

A short discussion ensued today on the topic of Code Generation tools like CodeSmith.

Like Unit Testing, code generation is a topic that some people swear by and some reject out of hand. I’m sure this is mostly a question of getting used to the concept. Only after I had used unit tests extensively in a real project could I appreciate the real value of having them – not just occasionally running some tests, or writing some cases beforehand. I’m talking about a full suite of automated tests that can be run nightly or as part of a continuous integration setup. But I digress.

Code Generation was of course very useful in .NET 1.1 for generating strongly-typed collection classes and the like, but that use has been made largely obsolete by Generics in the 2.0 framework. It is still very useful for generating boilerplate code and translating metadata (XSD, WSDL and other contract-type information) into strongly typed classes.

WCF uses Code Generation to create a proxy wrapper from WSDL, producing a class with static code based on the contract information. This is called SharedContract mode. The alternative is SharedType mode, where the interface isn’t defined as WSDL but as .NET metadata – i.e. an interface or a class – and the proxy is based on that interface.
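To make the shared-type idea concrete, here is a rough sketch. It is a Python analogy rather than WCF code, and every name in it (CalculatorContract, make_proxy, in_process_transport) is invented for illustration. The point it tries to show is that a proxy built at runtime from the shared interface picks up whatever operations the interface declares at that moment, with no separate generation step to forget.

```python
import inspect


class CalculatorContract:
    """Hypothetical shared interface, referenced by both client and service."""

    def add(self, a: int, b: int) -> int: ...

    def negate(self, a: int) -> int: ...


class CalculatorService(CalculatorContract):
    """Hypothetical service implementing the shared interface."""

    def add(self, a, b):
        return a + b

    def negate(self, a):
        return -a


def make_proxy(contract, send):
    """Build a proxy at runtime from the operations the contract declares now."""

    class Proxy:
        pass

    for name, _ in inspect.getmembers(contract, inspect.isfunction):
        if name.startswith("_"):
            continue

        # Bind the operation name as a default so each method keeps its own.
        def call(self, *args, _name=name):
            return send(_name, args)

        setattr(Proxy, name, call)
    return Proxy()


service = CalculatorService()


def in_process_transport(operation, args):
    """Stand-in for the wire: dispatch straight to the service object."""
    return getattr(service, operation)(*args)


proxy = make_proxy(CalculatorContract, in_process_transport)
```

With this setup, proxy.add(2, 3) is routed through the transport to the service. Adding a method to CalculatorContract makes it available on the next proxy with no regeneration, which is the property I am after in a closed system.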

In a previous article I expressed a preference for SharedType when we have a closed system where we control both client and server. Ralph Squillace claims that there should be no difference as far as the developer experience is concerned – whether the proxy is generated dynamically at runtime from a shared type, or statically at compile-time from a shared contract.

The reason I disagree with this statement is the same reason I am wary of Code Generation tools in general. It’s not because I don’t trust them – it’s true that code generation bugs can introduce subtle errors into the system, but I assume that a serious Code Generation tool will receive proper attention and QA. I have already filed one bug report on SVCUTIL’s proxy generation code, and it was promptly fixed.

The reason isn’t that I don’t trust the tool or even that I don’t trust the programmer using the tool, it’s that I don’t trust any process. The more steps I have, the more things can go wrong. When these steps are manual, even more so.

In a Shared Type scenario, changing the contract involves three steps:

1. Update the interface.
2. Update the service.
3. Update the client.

In a Shared Contract scenario, it’s slightly different:

1. Update the interface.
2. Update the service.
3. Regenerate the proxy.
4. Update the client.

(Note that #1 and #2 might be the same step, if the WSDL is generated directly from the service).

I’m leery of step #3. Not because it’s hard. Not because it’s long or exhausting or particularly annoying to perform – it’s not much more than a menu click in Visual Studio. I’m worried about it because it is a manual step, and all manual steps are bound to be forgotten occasionally. No matter how much we worry, we are that much more likely to find ourselves with a mismatched contract between client and server.

If the mismatch is big, it will be quickly noticed. If my client tries to call an operation that doesn’t exist, I’ll receive an error immediately. If I change the types of my parameters, I’ll get an exception on the server.

But what if my changes are more subtle? What if I added an OperationBehavior on one end that wasn’t replicated on the other? What if I added a [KnownType] on one end and forgot to synchronize it on the other?

These are errors that are hard to catch, and they usually manifest much later than they are introduced. The cause is a synchronization process that is manual and therefore more likely to fail.
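Here is a small sketch of how such a late failure can play out. It is a Python simulation, not WCF, and the type lists and function names are hypothetical stand-ins for the known-type metadata on each side. The client’s copy is assumed to date from before “Circle” was added to the contract.

```python
from __future__ import annotations

import json

# Hypothetical stand-ins for contract metadata; these are not WCF APIs.
SERVER_KNOWN_TYPES = {"Square", "Circle"}  # current contract
CLIENT_KNOWN_TYPES = {"Square"}            # stale copy from the last regeneration


def server_send(kind: str, size: float) -> str:
    """Serialize a shape the way the up-to-date service would."""
    if kind not in SERVER_KNOWN_TYPES:
        raise ValueError(f"server does not know type {kind!r}")
    return json.dumps({"type": kind, "size": size})


def client_receive(message: str) -> tuple[str, float]:
    """Deserialize using the client's (possibly stale) list of known types."""
    payload = json.loads(message)
    if payload["type"] not in CLIENT_KNOWN_TYPES:
        raise TypeError(f"client cannot deserialize type {payload['type']!r}")
    return payload["type"], payload["size"]


def contract_drift() -> set[str]:
    """Types known to exactly one side; empty when the two copies agree."""
    return SERVER_KNOWN_TYPES ^ CLIENT_KNOWN_TYPES
```

Every call that returns a Square keeps working, so ordinary testing may never notice the problem. The failure appears only on the day the service first returns a Circle. A check like contract_drift(), run as part of the build, would flag the mismatch when it is introduced rather than when it bites.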

This is true for other Code Generation scenarios too. If my code generation template creates strongly typed classes based on my database schema, I need to make sure I rerun the generation after each change to my database. How many times have I changed a database table during development and started debugging only to have my Typed Dataset code crash on load because of incompatible schemas, just because I forgot to rerun the generation tool? What if the changes were more subtle (like changing a string length limit) and would only be apparent at some later time in a specific set of circumstances?
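One way to shrink this hole is to have the generator record a fingerprint of the schema it saw, and to compare it against the live schema at startup. The sketch below assumes SQLite and an invented customer table purely for illustration; the helper names are mine and not part of any code generation tool.

```python
import hashlib
import sqlite3


def schema_fingerprint(conn, table):
    """Hash the column names and declared types of one table."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk).
    # The table name is interpolated directly, so it must come from trusted code.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    description = ";".join(f"{name}:{col_type}" for _, name, col_type, *_ in rows)
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


def assert_generated_code_is_current(conn, table, recorded_fingerprint):
    """Fail at startup, not in some later edge case, if the schema has moved on."""
    if schema_fingerprint(conn, table) != recorded_fingerprint:
        raise RuntimeError(f"generated code for {table!r} is stale; rerun the generator")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name VARCHAR(50))")
recorded = schema_fingerprint(conn, "customer")  # what the generator would embed

# A subtle change: only the declared string length differs.
conn.execute("DROP TABLE customer")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name VARCHAR(20))")
```

Calling assert_generated_code_is_current(conn, "customer", recorded) after the change raises immediately, even though the only difference is a string length. This doesn’t remove the manual step, but it turns a forgotten regeneration into a loud failure instead of a quiet one.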

I’m not saying that SharedContract is bad. It’s a necessity, of course, with open systems and interoperability scenarios. I’m not saying these problems are inevitable when generating code. A bit of discipline and common sense will go a long way. I’m just saying that leaving these holes can come back and bite us. And if we can do without them, we should.
