RDF Nonsense

I thought Sean's article was a good glancing blow, but Danny and Uche apparently don't think so.  They are too busy defending RDF to realize that people like Sean are smart, experienced experts whose criticisms should be examined carefully, like rocks from a jade mine, rather than picked apart for flaws.

Sjoerd Visscher + Danny Ayers on the RDF article. Sjoerd says exactly what I was trying to say in the article. He points at Danny Ayers (and Uches) comments that are both worth a read.
I'm not anti-RDF, I'm anti "in-your-face" RDF. Thats a very different thing. Its why I like the idea of semantic shadows I explained in the article.  [Sean McGrath, CTO, Propylon]

I, like Sean, like the ideas behind the Semantic Web and understand the benefits of RDF.  What I don't like is people claiming that the Semantic Web is the Next Big Thing and that everyone will be using RDF eventually.  "In-Your-Face" RDF, as Sean calls it, is what disgusts me.

When you put a typical XML fragment next to an RDF fragment, most people grok the XML fragment because, like a list of groceries, there is almost nothing to understand.  An RDF fragment, on the other hand, takes some effort to map from XML syntax to a mental model.  Without understanding the RDF model, it's much harder.  A directed graph is easy enough to understand on a piece of paper.  A large directed graph in your head or in textual form is quite another beast.
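
To make the comparison concrete, here is a rough Python sketch of the same grocery list read two ways: once straight off an XML tree, and once by following edges through a set of RDF-style triples.  The `ex:` names and the helper functions are made up for illustration; this is not a real vocabulary or a real RDF library, just the shape of the extra mental step.

```python
import xml.etree.ElementTree as ET

# Hypothetical grocery list as plain XML: the nesting is the whole story.
PLAIN_XML = """
<groceries>
  <item>milk</item>
  <item>eggs</item>
</groceries>
"""

# The same facts as RDF-style triples: (subject, predicate, object).
# The "ex:" names are invented for this sketch, not a real vocabulary.
TRIPLES = [
    ("ex:list1", "rdf:type", "ex:GroceryList"),
    ("ex:list1", "ex:hasItem", "ex:item1"),
    ("ex:list1", "ex:hasItem", "ex:item2"),
    ("ex:item1", "ex:name", "milk"),
    ("ex:item2", "ex:name", "eggs"),
]


def items_from_xml(text):
    """Read the item names straight off the element tree."""
    root = ET.fromstring(text.strip())
    return [item.text for item in root.findall("item")]


def items_from_triples(triples, list_node):
    """Follow ex:hasItem edges from the list, then ex:name edges from each item."""
    item_nodes = [o for s, p, o in triples if s == list_node and p == "ex:hasItem"]
    names = []
    for node in item_nodes:
        names.extend(o for s, p, o in triples if s == node and p == "ex:name")
    return names


print(items_from_xml(PLAIN_XML))
print(items_from_triples(TRIPLES, "ex:list1"))
```

Both calls produce the same list, but the second one only makes sense once you picture the nodes and edges behind it.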

You can't expect average web developers to use RDF without understanding it.  Yes, tools can ease the pain.  Tools also obfuscate and separate the user from the data.  Danny and Uche see only the benefits of RDF, while people like Sean and I see the disadvantages as well and recommend more judicious use of RDF.

Ideas Lost and Found

I finally had a talk with a friend of mine in the security industry about my idea on intrusion detection.  He got the idea immediately, which was a good sign.  Sometimes, I have to flap my arms like a bird in flight to communicate ideas to usually smart people.  Unfortunately, he also remembered reading an obscure paper that described a very similar idea.  Disappointing, but I am glad I don't have to grunt through the pain of filing a patent.

It still amazes me that an idea like that is merely remembered and not widely used.  As everyone says, ideas are cheap, and even great ideas are often lost in time, hopefully to be found again.  I am going to implement the idea in a side project of mine.  Until I do, the idea will have to stay lost a bit longer.  Sorry, but I don't want to ruin the surprise.  As someone said, surprise is all about timing.

Patent Dilemma

When a person has a patentable idea that could benefit almost everyone virtually overnight at minimal cost, should that person file and enforce the patent or share the idea with everyone?  When does public gain take precedence over personal gain?

Progress

I finally managed to build PGP 8.0 successfully by grafting in missing files from old PGP source code and doing a bit of reshuffling.  The ability to step through the PGP source code in a debugger is important, particularly since there is no SDK for PGP 8.0 yet.  Now I can home in on the bug that hangs PGP during cleanup.  At this point, I am able to sign and verify PDF documents using PGP.  For simplicity's sake, I am using SHA-1 for hashing and PGP formats for the public key and signature.  Overall, a productive Sunday.
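
The hashing half of that is simple enough to sketch in a few lines of Python.  This is only an illustration of digesting the document bytes with SHA-1 and checking them again later; the function names are hypothetical, and the PGP signing of the digest is deliberately left out since it depends on the PGP build itself.

```python
import hashlib
import hmac


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the document bytes as lowercase hex."""
    return hashlib.sha1(data).hexdigest()


def matches_signed_digest(data: bytes, signed_digest: str) -> bool:
    """Check that the document still hashes to the digest that was signed."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sha1_hex(data), signed_digest.lower())


document = b"abc"  # stand-in for the bytes of a PDF file
digest = sha1_hex(document)
print(digest)
print(matches_signed_digest(document, digest))
print(matches_signed_digest(document + b"!", digest))
```

In a real signer, the digest is what gets signed with the private key, and verification recomputes it from the document before checking the signature.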

Eclipse 2.1 RC3 Released

It's at its usual place.  They were late in getting RC3 out and I think some regression bugs crept in during the rush.  I'll pass on RC3 and wait for RC4, which seems likely at this point.  I expect the Eclipse 2.1 final release to be delayed by two weeks, and I recommend that delay.

Searching for words

Urgh.  So far, I have had no luck in locating anything similar to the intrusion detection method I came up with.  Everyone seems to be obsessed with complex solutions.  I tried all the words I could think of that describe the method, but all I get are fancy crypto, AI, or statistics-heavy methods that generate too many false alarms.

I need to find the right keywords to continue the search.  Google doesn't help me at all.  I wish there was a Google-like search engine for finding the right keywords.  A search engine that understands time, places, things, people, and concepts.  A search engine that monitors, remembers, and assists me in my search.  I guess the search engine industry is still young.  Maybe I'll dust off my old notes on my search engine idea.

Lupy: Python port of Lucene

If you are a Java guy like me, you know how great Lucene is.  I am also a budding Python guy, but not having a search engine as good as Lucene in Python really sucks (sorry, ZCatalog guys).  Well, here is something that looks promising:

Lupy is a port of  Jakarta Lucene 1.2 to Python. Specifically, it reads and writes indexes in Lucene binary format. Like Lucene, it is sophisticated and scalable. Lucene is a polished and mature project and you are encouraged to read the documentation found at the Lucene home page.

Hurrah!

Problems with faxing signed documents

Businesses often ask users to print a web page or a PDF file, sign it, and return it by fax.  But no business I know of checks whether the signed document has been changed from the original.  Even more troublesome, signatures are almost never verified.  In most cases, there is nothing to verify against.  All the contracts I have signed and faxed could easily have been signed by my son, yet no one would know.  Is it the high legal cost that prevents disputes?  It seems to me that there is a market here somewhere.

Signing documents with voice

While my Acrobat plug-in for PGP is taking shape, I am unhappy with PKI in general.  One idea I am going to be exploring soon is using voice to sign documents.  I don't know if signing by voice qualifies under E-SIGN.  Filing tax returns with a voice signature seems to have been disallowed since 2000.  Voice signature products seem to be out there, but they are not wildly popular.  Still, the idea could be fun to play with.

What I am thinking of works like this.

Registration by Phone – a user either calls from or is called at a certain phone number known to belong to that person.  The user is asked to repeat a few short sentences.  The recording is stored and analyzed.

Registration by Web – a user reaches a web page through an e-mailed link or simply by browsing.  The web page has an embedded voice recording control (could use Flash for this).  The user is asked to repeat a few short sentences displayed on the web page.  The recording is stored and analyzed.

Signing – Clicking on a web page or an Acrobat form with a signature field brings up the voice signing plug-in.  The user is asked to say something like "I, Don Park, have read the entire contract and agree to all terms."  The recording, or some derivative of it, is saved into the Acrobat file.  The user may optionally call a phone number to make a recording, which is then fetched by either the client or a server, either during signing or during verification.  The user may be asked to punch into the phone an extension displayed on the web page, or to type into the web page a number given over the phone line.

Verification – There are many options for verifying a voice signature, starting with doing nothing until a dispute arises.  In a typical business setting, the voice can simply be played back to be recognized by a person who has communicated with the signing party over the phone before.  Voice analysis can be applied, of course, to verify that the voice is that of the same person who registered.
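
The register, sign, and verify steps above can be sketched roughly like this in Python.  Everything here is an assumption made for illustration: the class and field names are hypothetical, the recording is treated as opaque bytes, and the `same_speaker` callback is only a placeholder for whatever voice analysis (or human listener) does the actual comparison.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class VoiceSignature:
    signer: str          # who claims to have signed
    statement: str       # the sentence the signer was asked to say
    document_sha1: str   # ties the recording to one exact document
    recording: bytes     # raw audio, or some derivative of it


class VoiceSigningService:
    def __init__(self, same_speaker: Callable[[bytes, bytes], bool]):
        # same_speaker stands in for real voice analysis, which is out of scope here.
        self._same_speaker = same_speaker
        self._enrolled: Dict[str, bytes] = {}

    def register(self, signer: str, recording: bytes) -> None:
        """Registration: keep the enrollment recording for later comparison."""
        self._enrolled[signer] = recording

    def sign(self, signer: str, document: bytes, statement: str,
             recording: bytes) -> VoiceSignature:
        """Signing: bind the spoken statement to a digest of the document."""
        digest = hashlib.sha1(document).hexdigest()
        return VoiceSignature(signer, statement, digest, recording)

    def verify(self, document: bytes, signature: VoiceSignature) -> bool:
        """Verification: same document, and same voice as at registration."""
        enrolled = self._enrolled.get(signature.signer)
        if enrolled is None:
            return False  # nothing to verify against
        if hashlib.sha1(document).hexdigest() != signature.document_sha1:
            return False  # document changed after signing
        return self._same_speaker(enrolled, signature.recording)
```

Storing the document digest next to the recording is a design choice of this sketch, not part of the idea as described; it is there so that a recording cannot be lifted from one contract and attached to another.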

OneNote and InfoPath

I just saw the demos for OneNote and InfoPath.  OneNote is just a glorified Notepad, nowhere near as good as NoteTaker is.  InfoPath, on the other hand, is going to be a catalyst, a monster underwater earthquake that will start a tsunami of changes across industries.  It's going to generate Office suite upgrade momentum as well as Microsoft server and middleware software sales.  Buy Microsoft stock.  Their revenue will rise sharply in the near future because of InfoPath.  I am not exaggerating, folks.