BEA’s Java StAX Reference Implementation Open Sourced

BEA's reference implementation of StAX (Streaming API for XML, aka JSR 173) was open sourced at Codehaus back on May 12.  The announcement apparently didn't reach me, so I only just found out about it while loitering at the StAX Yahoo Group.

The StAX website at Codehaus lacks essential information for the moment, but you can get the source from CVS:

cvs -d:pserver:anon@cvs.stax.codehaus.org:/scm/stax login
cvs -d:pserver:anon@cvs.stax.codehaus.org:/scm/stax checkout stax

FYI, stax/dev/src and stax/dev/src100 were identical last time I checked.  Don't forget to join the StAX Yahoo Group to communicate with the rest of the StAX fan club.
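For readers who haven't seen the API yet, here is a minimal sketch of what StAX-style pull parsing looks like.  It assumes the javax.xml.stream package as it ships with later JDKs (Java 6 and up) rather than the Codehaus checkout above, and the class and method names (StaxSketch, elementNames) are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the reference implementation.
public class StaxSketch {

    /** Returns the local names of all start elements, in document order. */
    public static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                // The caller pulls events one at a time; nothing is pushed via callbacks.
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        names.add(reader.getLocalName());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so callers don't have to deal with a checked exception.
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<feed><entry><title>StAX</title></entry></feed>"));
    }
}
```

The point of the loop is that the application decides when to ask for the next event, which is the main difference from SAX.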

Chromeless Phish

When I built the visual spoofing demo, I could have done it in several ways, including a chromeless window, but I went for the simplest way.  It turns out that some smart phisher recently launched a chromeless window-based phishing attack.  Following is a screenshot of the browser window showing the phishing site, which was still active at 11:51AM.

The webpage and the URL portion of the address bar are fake.  What's happening is that the phishing site opened a chromeless window to overlay the fake URL over the real address, which can be discerned by dragging another window over it.  It's using an IE 5.5-specific feature to float the fake URL over everything.  The interesting thing about this trick is that it can potentially defeat many phishmark implementations such as my own 9-block phishmark.  PassMark and background-based phishmarks are still effective though.

Pure Java Berkeley DB

Wow.  A pure Java edition of Berkeley DB is out.  I guess a pure Java version of Berkeley DB XML is coming as well.  As for performance, I haven't checked it myself, but if this quote means anything, I think this is a major event for Java developers:

"With Berkeley DB Java Edition, we have a simpler setup, a 3x increase in data import speed, a 5x increase in performance and a 10x decrease in disk storage requirements."

–  Eric Jain, Swiss Institute of Bioinformatics

Firefox 0.9: A Stinker

I just installed Firefox 0.9.  Urgh.  It's butt ugly now and seemingly infested with UI bugs.  I tried a few supposedly popular themes but none made me happy, although the IE clone theme was the most bearable.  They changed quite a few dialogs too, and most of the changes seem to be a move in the wrong direction.  0.8 was good.  0.9 is a stinker.

Hollywood meets Video Conferencing

Since I haven't used any videochat software, let alone a multiway video conferencing system (MVCS), I am not sure whether a brief inspiration I had this afternoon has already been implemented.  The inspiration was to enhance the MVCS experience by emulating what movie directors do to help the audience follow conversations in movies.  It's Hollywood meets video conferencing.

This is how I see the system working.  When a person (A) speaks, the view changes to A.  When another person (B) breaks in, the view switches to B, but it briefly switches back to A several times while B is speaking.  When B finishes, the view switches back to A.  When A doesn't respond immediately, the view switches intelligently to show other attendees.  The intent is to catch facial reactions and weave the conversation into a drama, so the system has to remember the interaction history: who spoke when, in reaction to whom, for how long, etc.  When two people talk at the same time, the screen is divided into two parts to minimize the ping-pong effect.  Questions are also detected, and 'the camera' scans the likely 'suspects' for reactions.
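The switching rules above can be sketched roughly as follows.  This is only a toy model under my own assumptions: the class ViewDirector and its methods (voiceOn, voiceOff, finish, view, reactionShot) are hypothetical names, speech detection is assumed to happen elsewhere, and question detection and cutaway timing are left out entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the view-selection rules, not a real conferencing API.
public class ViewDirector {
    // Who "holds the floor"; the most recent speaker sits on top.
    private final Deque<String> floor = new ArrayDeque<>();
    // Who is audibly talking right now.
    private final Set<String> talking = new LinkedHashSet<>();

    /** A participant starts (or resumes) talking and moves to the top of the floor. */
    public void voiceOn(String who) {
        talking.add(who);
        floor.remove(who);
        floor.push(who);
    }

    /** A participant goes quiet, e.g. after being interrupted, but is not done. */
    public void voiceOff(String who) {
        talking.remove(who);
    }

    /** A participant has finished and gives up the floor. */
    public void finish(String who) {
        talking.remove(who);
        floor.remove(who);
    }

    /** One name for a full-screen view, two names for a split screen. */
    public List<String> view() {
        List<String> shown = new ArrayList<>();
        if (talking.size() >= 2) {
            // Two people talking at once: split the screen instead of ping-ponging.
            for (String who : floor) {
                if (talking.contains(who) && shown.size() < 2) {
                    shown.add(who);
                }
            }
        } else if (!talking.isEmpty()) {
            shown.add(talking.iterator().next());
        } else if (!floor.isEmpty()) {
            // Nobody is talking: fall back to whoever still holds the floor.
            shown.add(floor.peek());
        }
        return shown;
    }

    /** The interrupted speaker, for brief cutaways; null if there is none. */
    public String reactionShot() {
        Iterator<String> it = floor.iterator();
        if (!it.hasNext()) {
            return null;
        }
        it.next(); // skip the current floor holder
        return it.hasNext() ? it.next() : null;
    }
}
```

Keeping the floor as a stack is what lets the view fall back to A once B finishes, without A having to say anything first.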

Briefly overlaying textual information about the person on screen, as commonly seen in detective TV shows, should also be useful when participants are not familiar with each other (e.g. a community meeting with 1000 attendees).  The overlay is shown only when it actually helps the viewer.

Lastly, and appropriately going overboard, dramatic sounds can be injected either automatically or by participants, like sounds of suspense or those funny sounds talk show DJs like to use (video-smileys?).  Appropriate video clips (e.g. Three Stooges or Groucho Marx) can also be injected similarly by the moderator or the attendees.

While I think it is unlikely that no one has thought of this before, I thought the idea was interesting enough to share, just in case.

OSGi

OSGi is a standard lightweight API for plugin frameworks (useful for building microkernels) with a bias toward the needs of network devices.  Recently it gained some momentum when the Eclipse team replaced their original plugin mechanism with OSGi (actually, they paved over it rather than replaced it).

The first full version of Oscar, an implementation of OSGi, was released yesterday.  Also check out the OSGi bundle repository and this nice tutorial on how OSGi can be used.

Eclipse 3.0 RC2

I had been using Eclipse 3.0 RC1 for the past week but it was sluggish and I ran into a few hangs, so when I saw that Eclipse 3.0 RC2 was available, I got right on it.  Definitely better.  Startup is faster and shutdown takes only a second.  Nice.  I think I'll stick with this one until the final release is out, which is due at the end of this month.  The Eclipse bug count looks healthy, although the Platform UI and SWT teams seem to be struggling a bit.

Downloading Eclipse took forever btw.  They have mirrors, but mirrors are troublesome to use because they force the user to hunt for the package among the mirrors.  They should use BitTorrent IMHO and turn the mirrors into seeds.  BitTorrent needs to be more location-aware (actually route-aware) though.

LL, LALR, and GLR

If you are like me, you have a tattered copy of the Dragon book on your bookshelf and a fading memory of LL(k) and LALR(1) lore gained through your battles with Yacc, Bison, JavaCC, and ANTLR.  Mostly, you remember wasting a lot of time wrestling with the tools.

What I didn't know was that new parsing algorithms appeared on the scene while I was busy in the ready-to-ship world: GLR (Generalized LR) and Earley.  Actually, they are old algorithms whose latest implementations have become competitive with their more popular yet more restrictive cousins.  For more info on why they should be considered alongside LL and LALR, read John Aycock's Why Bison is Becoming Extinct.

Here are some popular GLR and Earley implementations to get you started:

  • Elkhound – C++, BSD, separate scanner (supposedly the fastest)
  • DParser – C, BSD, scannerless
  • SGLR – scannerless

Earley implementation:

You might also want to read Current Parsing Techniques in Software Renovation Considered Harmful.

Smart TODO Patent

Automatic handling of TODO comments in source code is something Eclipse has been doing for a while now, but Microsoft has been granted a patent on the feature.  The patent was filed on March 6, 2000.  I forget when Eclipse first had the smart TODO feature.

Antialiased Text

What did I do this weekend?  Work, of course.  I finally started on the souped-up on-screen newspaper renderer.  GDI doesn't support anti-aliased text, GDI+ was too slow, Flash doesn't have an API I could call, and Quartz is OSX only, so I had to put one together myself.  The key requirement is the ability to fill a large screen with anti-aliased text and images and to switch between two zoom factors quickly.

By fast, I mean fast enough to trigger the illusion of viewing a real page instead of a painted screen.  That's the magic sauce that will get people into reading on screen.  You would know this too if you spent a few days doing crazy stuff with newspapers while your wife looked on with a worried look.

The good news is that my experiments gave good enough results that I am going to invest more time in the project.  The bad news is that it's not fast enough to animate the zoom transition.  I'll have to add some cheap visual hints to trick the reader's eyes instead.