Preserving Permalinks

An irony of the blogosphere is that permalinks are not permanent.  Whenever a blog changes service or software, its permalinks break.  While broken permalinks are not worth crying over, they are pretty annoying because internal links break as well.

Unless you are prepared to get your hands dirty, changing blog service or software means you are better off leaving your old posts where they are.

Having your own domain name doesn't protect you either if you decide to switch blogging software.  This is why bloggers are leaving a trail of blogs behind them like breadcrumbs.  Nice, huh?  This is the situation I am in, and I did get my hands dirty by writing an ASP.NET HttpModule to redirect date-based Radio URLs to DasBlog's URL format.

It's working pretty well except for the anchor part of URLs, which Radio uses to pinpoint a post in a page containing multiple posts.  Since those anchors are never sent to the server, I can't map a Radio URL to a page dedicated to a single post.  Oh, well.  At least my permalinks are permalinks.
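
The mapping itself is simple enough to sketch.  The snippet below is Python rather than the actual HttpModule, and both URL shapes are assumptions for illustration: a Radio day page like /2003/12/24.html and a DasBlog day page like /default,date,2003-12-24.aspx.  It also shows why the anchor is lost: the fragment is dropped before the request is made, so the best a redirect can do is land on the right day.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumed shape of a date-based Radio URL path: /YYYY/MM/DD.html
RADIO_DAY_PATH = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})\.html$")


def map_radio_path(path: str) -> Optional[str]:
    """Return the redirect target for a Radio day page, or None if the path does not match."""
    match = RADIO_DAY_PATH.match(path)
    if match is None:
        return None
    year, month, day = match.groups()
    # Assumed shape of the DasBlog day page URL.
    return f"/default,date,{year}-{month}-{day}.aspx"


def path_seen_by_server(url: str) -> str:
    """Browsers drop the '#fragment' before sending a request; mimic that here."""
    return urlsplit(url).path


requested = "http://example.com/2003/12/24.html#a123"
print(map_radio_path(path_seen_by_server(requested)))
```

Paths that don't match the pattern return None, so a module built this way would simply let those requests pass through untouched.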

My next task in the blog transition is building a flexible RSS feed service framework while preserving old feed URLs.

.NET Pervert

There is something pathetic about getting enthusiastic over a geeky article on Christmas Eve but, if you are a .NET pervert, you should read Aleksandr Mikunov's Rewrite MSIL Code on the Fly with the .NET Framework Profiling API.  The article doesn't say whether the technique still works under .NET 1.1, but it looks promising.

While mutating and hooking managed code is not exactly encourageable behavior, sometimes you have to do it and you can't argue with Gotta.

FTP

Scott Watermasysk asks what the hands-down best FTP tool is.  I have been using FileZilla for the past year and have been very happy with it.  It's fast, free, and trouble-free.  Don't let the 'zilla part bother you if you are a softie.

Speaking of FTP tools, here are some features I want in my FTP tool:

  1. Persistent synchronization – intelligently monitor and update local directories to match remote directories or remote directories to match local directories.
  2. Delta archives – automatically record and archive changes.
  3. Integrity checker – detect illegal modifications or additions to remote directories.
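
The third item is the easiest to sketch.  This is only a rough illustration of the idea, not how any existing FTP tool does it: it hashes every file under a directory into a manifest and compares that against a trusted snapshot taken earlier.  It walks a local directory as a stand-in for the remote one (a real tool would have to fetch the files or get hashes from the server), and the function names are made up for the example.

```python
import hashlib
import os
from typing import Dict, List


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 digest of its contents."""
    manifest = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            with open(full_path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            manifest[os.path.relpath(full_path, root)] = digest
    return manifest


def diff_manifests(trusted: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Report files that were added, removed, or modified since the trusted snapshot."""
    return {
        "added": sorted(set(current) - set(trusted)),
        "removed": sorted(set(trusted) - set(current)),
        "modified": sorted(
            path for path in trusted
            if path in current and trusted[path] != current[path]
        ),
    }
```

Anything that shows up under "added" or "modified" without a matching upload on my side would be the illegal change I want to hear about.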

Bad Taste of XSLT

Every time I use XSLT, it leaves a really bad taste in my mind.  I just spent 3 hours writing an XSLT stylesheet for a new XML-based signature verification result format I created for my client recently.

The format itself is designed to capture data associated with signature verification so that it can be used as legal proof of verification at some later date.  This means capturing the data hash, signature, certificates, and OCSP request/response pairs for each cert in the chain; basically bagging every scrap of data on the table.  The end results should be routed automatically to a backend repository, but some customers will opt to store them on local drives, which means they need to view them locally.

That's where XSLT comes in.  By associating an XSLT stylesheet with the XML file, users can view the file with just a browser (well, IE).  It's a nice solution except writing XSLT can be a real pain in the ass.  Take one little step outside the simple stuff and you are in a jungle, and it doesn't get better over time unless you use it every day.  Since what I had to do involved fairly advanced XSLT, I was not in a good mood by the time I finished.

If you have a choice, avoid XSLT like the flu.  I didn't.  If you really have to, make sure you have an XSLT debugger.  XSLT being a declarative language is a joke.  It might look declarative, but if you do any serious work with it, you will start thinking procedurally in order to make sense of it.  Like I said, it's a joke.

Atom Authentication Protocol

Mark Pilgrim has written a fairly technical yet easy-to-read article on how the Atom Authentication protocol works.  They have chosen to base the protocol on the Web Services Security UsernameToken Profile which is, while not finalized, a reasonably secure authentication protocol.

I think they made a good choice and would like to see the protocol implemented for XML-RPC-based blog APIs as well.  It's kinda ironic, though, that the Atom API, which is based on REST, is leveraging ongoing work to secure SOAP-based APIs.

The only problem is that it places a burden on the client to calculate SHA1.  Yes, there are JavaScript implementations of SHA1 and they are fairly fast, but you will still need either JavaScript or a Java VM in the browser.  And then there are mobile devices, which are still behind the curve on computing power.  Oh, well.  It's difficult to find a universal solution anyway.
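
To show how small the client-side burden actually is, here is a minimal sketch of the digest defined by the UsernameToken Profile: Base64(SHA1(nonce + created + password)).  The helper names and the dictionary layout are mine, not part of the Atom draft, and I am assuming the digest is computed over the raw nonce bytes as the profile describes.  Replay protection (remembering nonces already seen) is left out.

```python
import base64
import hashlib
import hmac
import os
from datetime import datetime, timezone


def password_digest(nonce: bytes, created: str, password: str) -> str:
    """Base64(SHA-1(nonce + created + password)), per the UsernameToken Profile."""
    sha1 = hashlib.sha1(nonce + created.encode("utf-8") + password.encode("utf-8"))
    return base64.b64encode(sha1.digest()).decode("ascii")


def make_token(username: str, password: str) -> dict:
    """Build the four values a client sends; the password itself never goes on the wire."""
    nonce = os.urandom(16)  # fresh random value for every request
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Username": username,
        "PasswordDigest": password_digest(nonce, created, password),
        "Nonce": base64.b64encode(nonce).decode("ascii"),
        "Created": created,
    }


def verify_token(token: dict, password: str) -> bool:
    """Server side: recompute the digest from the stored password and compare."""
    nonce = base64.b64decode(token["Nonce"])
    expected = password_digest(nonce, token["Created"], password)
    return hmac.compare_digest(expected, token["PasswordDigest"])
```

Note that verify_token needs the password itself, so a server doing it this way has to keep the password (or something equivalent to it) rather than a one-way hash of it.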

BTW, using just a plain username and password is just fine for most blogs IMHO.  This stuff is mainly for corporate users posting to internal blogs, yet-to-emerge infrastructure blogs upon which many people depend to receive critical information in a timely fashion, and trusted blogs like those featuring press releases (coming real soon I hear).  Just be sure to back up your blog content though, just in case someone gets pissed off at what you write and decides to paint your blog red.

Mesmerizing Design Patterns

Certain design patterns are more compelling than others.  Some patterns are outright mesmerizing.  For example, the Hierarchy by Containment (also known by many other names) pattern, which represents hierarchical relationships by having one object contain another, is probably the most popular XML schema design pattern.  Here is an example from XAML:

 <Window ID="root">
  <Button>Hello World</Button>
 </Window>

Since the <Button> element is inside the <Window> element, the button is semantically within the window.  This is nice because syntactic structure matches semantic structure.  Here is an alternate solution that XML designs often overlook:

 <XAML>
  <Window ID="root">
   <Window.Background>Blue</Window.Background>
  </Window>
  <Button ID="btn">Hello World</Button>
  <Insert object="btn" into="root"/>
  <Center object="btn" to="root"/>
  <Move object="btn" x="10" y="20"/>
 </XAML>

In the first approach, object properties and contents are specified together.  In fact, there is no real distinction between object properties and contents.  If there is a need to distinguish the two, the syntax has to be changed by introducing a separator or a container like <Content>.  The problem gets worse when a new aspect needs to be added.

In the second approach, contents and operations are separated from objects.  Admittedly, this approach is less visually appealing.  However, the syntax is simpler to process because the unit of processing is clearly defined: immediate children of the document element.
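
Here is a rough sketch of what "simpler to process" means, using a trimmed version of the second example.  The processor is a single loop over the immediate children of the document element; each child either declares an object or applies an operation to objects declared earlier.  The dictionary layout and the handling of Insert and Move are my own invention for illustration, not anything XAML defines.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<XAML>
  <Window ID="root"/>
  <Button ID="btn">Hello World</Button>
  <Insert object="btn" into="root"/>
  <Move object="btn" x="10" y="20"/>
</XAML>
"""


def process(xml_text: str) -> dict:
    """Walk the immediate children of the document element, one unit at a time."""
    objects = {}
    for child in ET.fromstring(xml_text):
        if child.tag == "Insert":
            # Operation: record the object as a child of its container.
            objects[child.get("into")]["children"].append(child.get("object"))
        elif child.tag == "Move":
            # Operation: set the position of an already declared object.
            target = objects[child.get("object")]
            target["x"] = int(child.get("x"))
            target["y"] = int(child.get("y"))
        else:
            # Anything that is not an operation declares a new object.
            objects[child.get("ID")] = {"type": child.tag, "text": child.text, "children": []}
    return objects


print(process(DOCUMENT))
```

Adding a new aspect in this style means adding one more branch to the loop, without touching how windows or buttons are written.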

BTW, I am not advocating one design pattern over another.  I just thought it might be helpful for people to see alternative approaches to designing XML data.

Atom Info Proposal

Despite some doubts about the usefulness of Jason Shellen's proposal for beautifying feeds with CSS stylesheets, I like it (choice #3 to be precise).  Can someone pull together a stylesheet for RSS 2.0 so we can do a taste test?

As to the problem of making it easier for news aggregator users to subscribe, aggregators will have to ship with a browser plug-in or do it like Radio does via a well-known port and URL.

Over and Under in a Box

I have examined quite a number of open source Java and .NET projects recently, and there is a striking contrast between the two groups: Java projects tend to be over-architected and .NET projects tend to be under-architected.  In more mundane terms, it's the contrast between having too many joints and too few joints.  I am not talking about the kind that you strike a match to, but the kind that adds flexibility.  For Java, I think it might be withdrawal from design pattern addiction.  For .NET, it's probably the apply-duct-tape-til-it-works syndrome encouraged by Visual Studio .NET.

There is also a disturbing trend of hiding or scattering information from developers so that it is difficult to sit down and review several pages of code without having to run around looking for the hidden pieces.  For example, .NET introduced attributes which are like extensible metadata for pieces of code.  These attributes are very useful but they also hide details necessary to get the complete picture.  Another example is .NET's boxing and unboxing behind-your-back trick which, while well-intended and neat, often means disappointingly slow code.  After a few sessions of trying to get some decent performance out of .NET, I am already starting to see boxes in my sleep.

Spoofing for Dummies

I had my doubts, but the URL spoofing bug in IE that Microsoft is supposedly investigating is really there.  The link-happy blogosphere, filled with copy-and-paste addicts, is a ready victim to this bugger (via Zap The Dingbat).

Test Exploit

The bug is triggered by simply inserting '%01' in front of the '@' character in URLs like foobar@blahblah.com.  IE then shows only the fake domain name, which goes in front of the '%01', and hides the real one that follows the '@' (see the HTML source for this post).
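
If you want to check a link before pasting it, the pattern is easy to spot mechanically.  The sketch below is just one way to do it: it splits off whatever sits in front of the last '@' in the authority part of the URL and flags it when it contains a percent-encoded control character such as '%01'.  The function name and the example domains are made up.

```python
import re
from urllib.parse import urlsplit

# Percent-encoded control characters: %00 through %1F, plus %7F.
ENCODED_CONTROL = re.compile(r"%(?:[01][0-9A-Fa-f]|7[Ff])")


def inspect_url(url: str) -> dict:
    """Report where a URL really points and whether its userinfo part looks like a decoy."""
    netloc = urlsplit(url).netloc
    # Everything before the last '@' is userinfo; the real host (and port) comes after it.
    userinfo, at_sign, host = netloc.rpartition("@")
    return {
        "real_host": host,
        "decoy": userinfo if at_sign else None,
        "suspicious": bool(at_sign) and ENCODED_CONTROL.search(userinfo) is not None,
    }


print(inspect_url("http://www.trusted.example%01@evil.example/login"))
```

For the sample URL, the request really goes to evil.example, while www.trusted.example is only the decoy sitting in the user name position.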

As an architect, this sort of bug takes a lot of energy out of me.  Ever feel betrayed by the ground you walk on?  It's like discovering that everything you designed was built on a gigantic turtle that just woke up.  I have obviously exaggerated the size of the problem but this sort of bullshit just upsets my stomach.

Another thing that upsets my stomach is getting excited enough about something to invest months into it just to wake up and realize that there is no reason for people to use it.  There is quite a bit of that in the web services and Atom hypes.  Get in the habit of asking Why Would They? if you can't take the disappointments.  IT is NOT about YOU, but ALL about THEM, the people who will be using what you build.

Update:

On my IE 6 running on XP with all the latest patches, this is what I see after pressing the "Test Exploit" button.

Whidbey Helps

The rough edges in .NET that I pointed out are removed in the next version of the .NET Framework (aka Whidbey).  It will come with concrete classes that reduce provider-specific dependencies.

Don't expect magic though.  Provider-specific SQL syntaxes are not going away anytime soon.  While storing SQL statements in external configuration files is the obvious solution, there are associated security risks.  Digitally signed configuration files offer some protection, but they are a hassle to implement and take away much of the power of configuration files.

I am looking at the System.DirectoryServices namespace right now.  It looks to be the equivalent of Java's JNDI package.  I am not yet sure how it handles in-memory objects (i.e. data sources) bound to the directory service.  Still moving.