Telephony XML

Many years ago, I built a voice-based web browser for a client when the only competition was Unwired Planet (now called OpenWave), founded by Alain Rossmann, whom I worked with at Radius.  Interestingly enough, both companies were based in Redwood Shores, across a lagoon from each other, the same lagoon that passes by my house.  One time I even visited my client in a canoe.  Now that's the kind of commute I can enjoy.

While Unwired Planet focused on the tiny display and the dialpad and ended up with WAP, my client wanted to use all aspects of the phone, including speech recognition and generation (aka Text-To-Speech).  There was nothing like it, so I put together what can best be described as VoiceXML 0.0 and built some PIM-like applications for executives on the go.

It was a fun project, and having to call myself hundreds of times a day was an 'interesting' experience.  The company then ran out of funding and that was the end of it, although the patents were snapped up.  Last I heard, Moses Ma bought the dialpad-based browser navigation patent.

After all these years, I am now doing some VoiceXML/CCXML development again.  It's a weird feeling to see what I was working on matured, implemented and widely available.  VoiceXML 2.0 is a W3C Recommendation and CCXML 1.0 is close to completion.  There are many VoiceXML vendors, and hosting services like Voxeo even offer free developer accounts to build and test telephony applications.  All this is so much easier than having to build everything myself.

Still, there are many irksome aspects of VoiceXML and CCXML that lead me to think the specs were developed without the benefit of advice from experienced XML gurus.  For example, CCXML has many attributes whose values are expected to be ECMAScript (aka JavaScript) fragments, which leads to some awkward XML expressions like this:

<assign name="state0" expr="'calling'" />

Note the single quotes inside double quotes: the attribute value is an ECMAScript expression, so a plain string has to be written as a quoted string literal.  As to why the CCXML WG didn't add an alternate attribute named 'value' for plain text, I am clueless.  Even weirder, the attribute names provide no hint as to whether the value is supposed to be a script fragment or a textual value.  I would have suffixed '_expr' or prefixed 'j' to the names of attributes whose value is a script fragment.
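
To make the quoting issue concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree.  It shows that the XML parser hands back the inner single quotes as part of the attribute value, because they belong to the ECMAScript string literal and not to the XML.  The 'value' attribute and the literal_of helper are hypothetical, made up to illustrate the alternative I would have liked; neither is part of CCXML, and the helper does not evaluate ECMAScript.

```python
import xml.etree.ElementTree as ET

# The element as written for CCXML: expr holds an ECMAScript expression.
actual = ET.fromstring("""<assign name="state0" expr="'calling'" />""")

# Hypothetical alternative (NOT part of CCXML): a plain-text 'value' attribute.
proposed = ET.fromstring("""<assign name="state0" value="calling" />""")


def literal_of(element):
    """Return the string an <assign> would store, for simple literals only.

    Hypothetical helper: it accepts a 'value' attribute, or an 'expr' that is
    a single-quoted string literal without escapes. Anything else is rejected.
    """
    if "value" in element.attrib:
        return element.attrib["value"]
    expr = element.attrib["expr"]
    if len(expr) >= 2 and expr[0] == expr[-1] == "'":
        return expr[1:-1]  # strip the ECMAScript string delimiters
    raise ValueError("expr is not a simple string literal: " + expr)


# The single quotes survive XML parsing; they are part of the attribute value.
print(repr(actual.attrib["expr"]))

# Both spellings end up describing the same string.
print(literal_of(actual) == literal_of(proposed))
```

With a separate plain-text attribute, the common case of assigning a constant string would need only one level of quoting, and the attribute name itself would tell the reader whether a script is involved.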

While I am tempted to fix what's wrong with CCXML before it's finalized, I already have my hands full so this general advice will have to do:

There is more to XML than meets the eye.  If you are defining a new XML-based language, you really need to consult some XML gurus to avoid silly mistakes and pitfalls like these.

If you don't know any, let me know and I'll recommend a few.