StAX Update

If you have used the StAX API at all, you have some scars.  The StAX API itself is good, IMHO, but the reference implementation was buggy and its sparse documentation failed to explain how the methods worked together.  The result was that each of us wasted a lot of time experimenting and debugging.

Thankfully, the situation is improving.  Version 1.0 of Woodstox, an implementation of StAX, was released recently.  Its author, Tatu Saloranta, is also working on StaxMate (a StAX utility library), StaxTest (a StAX conformance test suite), and StaxMisc (integration bits).

If you would rather stick with the RI, there is an updated release at the stax.codehaus.org site.

In related news, James Strachan is working on ActiveSoap, a lightweight SOAP stack using StAX.  Excellent.  If you are sick of Axis's slobbish performance but can't wait for ActiveSoap, you might want to take a look at JibxSoap.

Also, here are some useful StAX Tips written by Berthold Daum:

  1. Using StAX
  2. Parsing XML documents partially with StAX
  3. Screen XML documents efficiently with StAX
  4. Write XML documents with StAX
  5. Merge XML documents with StAX
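If you have not touched StAX yet, here is a minimal sketch of the pull style, assuming a modern JDK where javax.xml.stream ships in the box (back then you needed the RI or Woodstox on the classpath).  The class name StaxTitles, the titles() helper, and the sample XML are made up for illustration; only the javax.xml.stream calls are the real API.  It walks the events one at a time and collects the text of every <title> element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of StAX or of any library mentioned above.
class StaxTitles {

    // Returns the text content of every <title> element in document order.
    // Parse errors are rethrown unchecked to keep the sketch short.
    static List<String> titles(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                // The caller drives the parser: ask for the next event, then inspect it.
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "title".equals(reader.getLocalName())) {
                        // getElementText() consumes events up to the matching END_ELEMENT.
                        result.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        String xml = "<feed><entry><title>StAX Update</title></entry>"
                   + "<entry><title>Tiger on the loose</title></entry></feed>";
        System.out.println(titles(xml));
    }
}
```

The part the sparse documentation tended to skip is how the calls interact: getElementText() only works when the reader sits on a START_ELEMENT of a text-only element, and it leaves the reader on the matching END_ELEMENT, so the outer loop can simply carry on with next().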

Feeds out of service

FYI, my RSS feeds are out of service temporarily.

It's related to compression and HTTP headers.  For some reason, gzipped feed content is showing up as garbage for some people.  On my desktop, IE can read the feed just fine, but not on my laptop.  Turning off compression on the server side (by massaging metabase.xml) doesn't quite work either because the compressor somehow slips back in after a while.

Update:

Feeds now seem to be working although IE on my laptop still thinks the feeds are bad.  I think that IE's configuration is screwed up somehow.  Unfortunately, I don't want to reinstall it because that will cause too many annoyances.  Oh, well.

Another problem is that a gzip-encoded response is sometimes received even though gzip is not in the Accept-Encoding header.  I have yet to figure out whether this is because a proxy is serving up a cached copy without honoring the Accept-Encoding header or simply IIS behaving erroneously.
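For anyone chasing the same symptom, here is a rough sketch of the check, not what I actually ran.  GzipCheck and its two methods are hypothetical names.  It relies on the fact that a gzip stream always starts with the bytes 0x1f 0x8b, and it treats Accept-Encoding as a plain substring match, which ignores q-values such as "gzip;q=0" and the "*" wildcard.

```java
import java.util.Locale;

// Hypothetical helper for spotting gzip bodies the client never asked for.
class GzipCheck {

    // gzip streams begin with the two magic bytes 0x1f 0x8b.
    static boolean looksGzipped(byte[] body) {
        return body != null
                && body.length >= 2
                && (body[0] & 0xff) == 0x1f
                && (body[1] & 0xff) == 0x8b;
    }

    // True when the request's Accept-Encoding value did not mention gzip
    // but the response body is gzip anyway.  Simplification: substring match only.
    static boolean unsolicitedGzip(String acceptEncoding, byte[] body) {
        boolean asked = acceptEncoding != null
                && acceptEncoding.toLowerCase(Locale.ROOT).contains("gzip");
        return !asked && looksGzipped(body);
    }

    public static void main(String[] args) {
        // First bytes of a made-up gzip body; a real test would read them off the wire.
        byte[] body = {(byte) 0x1f, (byte) 0x8b, 8, 0};
        System.out.println(unsolicitedGzip("identity", body));
        System.out.println(unsolicitedGzip("gzip, deflate", body));
    }
}
```

Running this against the raw bytes from both the origin server and whatever sits in between should at least show whether the proxy or IIS is the one ignoring the header.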

Technorati Result Quality

I was too tired to attend the Technorati Hackathon, so here are some details behind my whining.  The 20 items on the first page of Technorati search results for my blog consist of:

  • 10 blogroll links (5 duplicates)
  • 2 update links (1 duplicate)
  • 8 post links (4 duplicates)

None of the blogroll links were new and I know when my blog is updated, so only the post links are of interest to me.

Getting 4 items of interest out of 20 is, politely speaking, not bad, but I can't help thinking that Technorati can do better, far better, than what I am currently getting.  I know that there are some underlying fundamental problems that must be solved and that they can't be solved by Technorati alone.  This is why I mentioned the need to ask for help from the community.

Update:

A related problem is Technorati-spamming which I haven't seen done yet but will surely happen.  That problem will be difficult to fix.

MSN SearchPoint

SearchPoint is the idea I donated to the MSN Search team.  I disseminated it to a handful of individuals across the team hierarchy, so I think the chance of it seeing the light of day is fair.  Since I came up with the idea, I don't think it falls under the NDA, but I am not going to discuss its details for their sake.

Since when did I care about Microsoft?  Frankly, I don't give a hoot about Microsoft.  I do like the MSN Search team though.  They were open in all senses of the word and, although they had some trouble grokking foreign thoughts we threw at them, I found myself caring for now.  Besides, they gave me a nice backpack and an expensive cigar.  Yeah, I am a cheap date.

I will say that the SearchPoint idea has some excellent characteristics such as:

  1. offers substantial benefits to users
  2. offers substantial benefits to websites
  3. leverages the MSN search engine's main strength
  4. costs little or nothing to implement
  5. is dynamically extensible

#1 means users will see much better search results with minimal effort.  #2 makes SearchPoint viral.  #3 makes it difficult for competitors to replicate SearchPoint.  #4 means SearchPoint will have little impact on the project schedule and resources.  #5 means SearchPoint can be used as a platform to launch other services.

Sorry about the tease but I do enjoy teasing. 😉

BTW, SearchPoint is not a variation of the Search Hats idea although the benefits to the users are similar.

Java and XP SP2 Firewall

One of the nice features of the XP SP2 Firewall is that when an application tries to access the network, it opens a dialog asking whether the application should be granted network access and remembers the choice the user makes.  This is great for normal applications.  Unfortunately, Java applications all fall under the hosting application name (java.exe and javaw.exe), so network access cannot be given to some Java applications and not others.

Unless Java applications start running with Security Manager enabled and the Security Manager is better integrated with the host platform firewall such as XP SP2 Firewall, I fear IT Administrators will start cracking down on Java applications.  Until that happens, I think popular Java applications will have to be cocooned inside a thin native application wrapper to give each application a unique process signature.

Note that .NET applications don't have this problem.

Mouse Pad Key

I have been thinking about new authentication methods and some of the ideas I came up with are quite interesting.  One of them is the Mouse Pad Key.

Mouse Pad Key uses the optical mouse to read keys on a sheet of paper by moving the mouse around on top of the paper.  The idea is still theoretical because I haven't looked into whether optical mice can reliably read patterns on a piece of paper.  If the key pattern can be printed on regular paper, then keys can be sent electronically.  If not, then a special printing process may enhance security.

A special mouse driver will be necessary, of course, and the fact that optical mice are not yet ubiquitous makes deployment impractical, but I thought the idea was interesting enough to share.

What’s wrong with Technorati?

Before, I thought Technorati was going places and liked the service enough to attend a Technorati developer open house, but I am not sure any more.

They now have over a hundred servers instead of a handful and track four million blogs instead of fewer than a million, but the quality of service seems to have gotten worse over time.  Am I just imagining that Technorati seems slower and results seem less complete and noisier than before?  I have no problem with outages, but I am disturbed by the apparent lack of improvement in the performance and quality of their service after all the time and resources they invested.

If they are having problems and need some help from the outside, maybe they should open up and ask for help.  If the last Technorati developer open house meeting was any indication, I am sure they will have no problem finding volunteers who would be glad to donate some time to save an important part of the blogosphere infrastructure.

Update:

Just a day after this post, Technorati announced the Technorati Hackathon event to gather the Technorati fan/clan.  Way to go, Dave.  Also, the service seems faster and more reliable all of a sudden.  If I were a conspiracy addict, I might have wondered if they were waiting for a fool to post a complaint before turning on some optimizations.  Results are still noisy though, with many duplicate items and blogroll links.  They need to work on that.

Anyway, I am happy enough for now although I am perfectly willing to be the whiny cheerleader.

Tiger on the loose

I was expecting it to be released a few months later, but JDK 5.0 (aka JDK 1.5 and Tiger) was released today.  Go get it, jBoys and jGirls.  While you are at it, you might want to lean over the edge a bit with Eclipse 3.1M2 since Eclipse 3.0.x has some Tiger-related issues as-is.

Speaking of Tiger, I am a tiger too, Chinese zodiac-wise.  Many times over, in fact, since the Chinese zodiac applies to months and hours as well.  When all the signs, including those of my parents, are combined, I have several tigers, one dragon, and a bull.  Of the three animals, the tiger is my favorite.  My wife thinks I am the Tigger though: roaring occasionally, but bouncing around happily most of the time.

Boing!  Boing!  oh ho ho HO!

SourceLabs and DeviceWise

I think the idea behind SourceLabs is an excellent one which will eventually allow them to reap a lion's share of the profits generated by the open source movement.

DeviceWise is a similarly innovative idea that I thought about some time ago.  Instead of each hardware company writing its own software, DeviceWise writes software for peripheral hardware companies.

By specializing in producing quality software, there is a good chance a company like DeviceWise can play a dominant role in the peripheral hardware market like the way Microsoft plays in the software market.  Why?  Because hardware companies write shitty software.  Over time, such a company can cultivate a brand that customers will want on hardware boxes.  The downside is that the company could end up being just another contract programming shop.

I would have started up DeviceWise if I didn't hate writing firmware and device drivers.  It's a mindbogglingly boring yet troublesome job.  When I did it ages ago, I wasted half of my time dealing with faulty hardware or arguing with hardware engineers.  Ever tried to debug software running on hastily soldered-together circuits?  Urgh.