What’s wrong with Sem@code

When I posted about Semacode yesterday, I had a vague feeling that something was missing, and it bugged me for the rest of the day until I realized what it was while in ZZ land. A Semacode maps to a URL, which is a silly thing to do in the post-Google era. Websites, particularly the small websites likely to be pointed to by a Semacode, tend to disappear over time, and they are mostly read-only, meaning only those who own the website or are members of it can add information to it.

Semacode should just be a string (it could be a set of keywords or even just numbers) unique enough to be used as a reliable coordinate in the online search space, so that looking it up at a search engine will return only the links directly and deliberately mapped to the coordinate. This way people can add information about the object at the coordinate without restrictions. If it happens to be a restaurant, they can even post a bad review on their own blogs and it will still show up on cellphones after the Semacode is scanned.
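A minimal sketch of the idea, under stated assumptions: the `make_coordinate` and `search_url` helpers and the `semacode-` prefix are hypothetical names made up for illustration, and the query URL simply uses the ordinary Google search form with the tag quoted for an exact match. Anyone who wants their page to show up for the object would just include the tag in the page text.

```python
import secrets
from urllib.parse import urlencode


def make_coordinate(prefix: str = "semacode") -> str:
    """Return a tag that is unlikely to appear on the web by accident.

    The prefix and the 16 hex digits are illustrative choices, not a spec.
    """
    return f"{prefix}-{secrets.token_hex(8)}"


def search_url(coordinate: str) -> str:
    """Build a search URL that looks the tag up as an exact phrase."""
    return "https://www.google.com/search?" + urlencode({"q": f'"{coordinate}"'})


if __name__ == "__main__":
    tag = make_coordinate()
    print(tag)              # the string that would be encoded in the 2D barcode
    print(search_url(tag))  # what the phone would open after scanning it
```

The phone never needs a fixed URL here: whatever pages mention the tag at scan time are the "page" for that object.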

The Big Idea here is that you don't really need a URL if you have good search engines. For wiki fans, it's like turning the entire web into a wiki of sorts by using search services like Google to weave a wiki page out of pages across the Net.

Think different, people. It's all right if it has been done before as long as it hasn't been done by you.

2D Graphics Libraries

While platforms these days have fairly good 2D graphics support, like Quartz on OSX, GDI+ on XP, and Gnome Canvas, developers like me often have to use third-party libraries for whatever reasons. On Win32, for example, GDI+ support is missing on legacy platforms, which means giving up on fancy graphics, redistributing GDI+ binaries, using a third-party library, or writing one yourself. Writing one yourself is fun (I have done it a couple of times over 20 years) but, unless it offers some unique features, you'll always end up migrating to a third-party library.

BTW, Flash has an excellent 2D graphics engine but it lacks an API, so it's like a sports car without a steering wheel. Yes, you can embed the Flash ActiveX control and generate SWF on the fly, but it's unwieldy for dynamic interaction and even event handling gets tricky. Embedding the Adobe SVG ActiveX control is just as unwieldy, if not more.

While there are proprietary 2D engines out there, typically written by a few guys at a small company, they tend to disappear within a couple of years, either bought by bigger companies (e.g. Apple, Adobe, Macromind, and Microsoft) or abandoned for lack of interest or a workable revenue model. Besides, they charge fairly steep fees, so I tend to avoid them.

Out of all the freely available 2D libraries out there, Libart stands out in features and quality. It offers fast anti-aliased rendering, and its use in Gnome Canvas over the years means most of the bugs have already been stepped on. Libart is also used to drive librsvg, an SVG engine, and Java 2D, Java's graphics API, although Sun made extensive changes to tap hardware acceleration. While Libart can be and has been used cross-platform, it's not exactly a cakewalk to use on non-Linux platforms. Cairo has some interesting features and rising interest could mean it will replace Libart someday, but it's still in development.

The third-party 2D graphics library I really like these days is Anti-Grain Geometry (AGG) which, although dormant for the last two years, has been rejuvenated with the release of version 2.1. AGG is written in C++ and uses templates extensively, like ATL does. AGG is lightweight, very fast, flexible, and full of features. It even comes with a partial implementation of an SVG viewer as an example. AGG supports Win32, X11, and SDL as is. It doesn't yet support features like the variable stroke effects that Creature House's Expression 3 engine and Fractal Design's Painter support, but then it's just me being unreasonable. 🙂

I should note that subpixel graphics was first done 20 years ago in Word Handler to display 70 columns of hi-res text on the Apple II. Silicon Valley Systems, the company that published Word Handler, was based just 5 minutes from where I live now, and I enjoy fond memories of working there every time I pass by the old office on El Camino. I guess everybody remembers their first job. LCD screens were just starting to replace LEDs on calculators at the time, so Steve Gibson and Microsoft ClearType can claim to be the first to use subpixel graphics on LCD screens. Lenny Elekman, where are you now?

Update:

I thought I should post this excerpt from the AGG doc, which is still being written, for those who are expecting a GDI+ or Quartz-like API from AGG.

Anti-Grain Geometry is not a solid graphic library and it's not very easy to use. I consider AGG as a “tool to create other tools”. It means that there's no “Graphics” object or something like that, instead, AGG consists of a number of loosely coupled algorithms that can be used together or separately. All of them have well defined interfaces and absolute minimum of implicit or explicit dependencies.

In fact, AGG is just a bunch of C++ template classes with little or no documentation to guide you except the examples. Don't wade into AGG unless you know what you are doing.

Spams, Phishing, and Trojans

This Netcraft article titled Phisher Kings compares the growth of phishing with that of spamming (via Payments News). It's not surprising to me, since I think phishers who rely mostly on social engineering used to be spammers. However, phishers using trojans, like the one described in this Code Fish Spam Watch article, are not. They are hackers using e-mail to find their victims.

Using trojans to harvest passwords and credit card numbers is, fortunately, not as deadly as it might seem at first glance. Why? Because trojans require more technical knowledge, cost more to maintain, and take more labor to mine the returned data. It's all glory and little in return.

In comparison, phishers with a spamming background tend to focus on what really matters, the ROI numbers. Instead of wasting days and weeks writing and fine-tuning trojans, they use a web page editor to create their lures and receive their loot in ready-to-use form.

There is a more dangerous group of potential phishers we need to keep an eye out for: telemarketers. While most spammers operate blindly, telemarketers leverage information to choose and attack their victims more intelligently. Phishers with a telemarketing background are more likely to be spear-phishers, phishers who target rich victims with tailored attacks.

When they come for you, they will know your name, where you live, what financial services you are using, and more.

Secure UI: 9-Block Phishmarks

When I originally came up with the idea of phishmarking, I was thinking of using fractal patterns. Unfortunately, fractal patterns are rarely simple symmetrical designs, so they are more difficult to remember. While I was looking for a different approach, I remembered Jared Tarbell's 9-Block Pattern Generator at Levitated.net, which basically does what quilt makers have been doing for ages, but with simple shapes that can be used to build a design that is easy to recognize even at small size.

It uses the following 16 shapes, rotations, colors, inversion, and some rules for symmetry to generate an astonishing number of designs.

Below is my implementation of 9-block phishmarks being used in browser toolbars. Note that phishmarks are anti-aliased because the display area on the toolbar was too small. Cool, eh?

Pretty and Safe!

BTW, Jared told me that the 9-block pattern generation algorithm can be used without a license, although his Flash code is under the GPL. Jared also has other interesting graphics generators that could be used for phishmarking, although I am not sure about licensing. For example, Bone Piles and Combinatorial Critters are pretty interesting, although they will require more real estate and more complex coloring schemes.

9-block quilts are very interesting, although not enough to make me want to take up the sewing needle. Heh. Anyway, if you want to find out more, here are some links to get you started:

Update:

To be more precise about how many unique patterns can be generated, the above implementation uses 17 bits for the pattern (3 bits for the middle shape and 7 bits each for the corner and side shapes) plus foreground and background colors. Taking the limits of human vision and color restrictions into account, I would say this implementation of 9-block phishmarks can generate around a billion easily recognizable unique patterns. That's enough, I think, against phishing.
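The 17-bit layout can be sketched roughly as follows. This is only an illustration under assumptions: the field order inside the 17 bits, the function names, and the way `layout` places quarter-turn rotated copies of the corner and side blocks around the middle are all guesses made for the example, not details of the actual implementation. Only the bit widths (3 + 7 + 7) come from the description above.

```python
MIDDLE_BITS, CORNER_BITS, SIDE_BITS = 3, 7, 7  # widths from the post: 3 + 7 + 7 = 17
PATTERN_BITS = MIDDLE_BITS + CORNER_BITS + SIDE_BITS


def pack(middle: int, corner: int, side: int) -> int:
    """Pack the three block codes into one 17-bit pattern code (assumed field order)."""
    for value, bits in ((middle, MIDDLE_BITS), (corner, CORNER_BITS), (side, SIDE_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return (middle << (CORNER_BITS + SIDE_BITS)) | (corner << SIDE_BITS) | side


def unpack(code: int) -> tuple[int, int, int]:
    """Split a 17-bit pattern code back into (middle, corner, side)."""
    side = code & ((1 << SIDE_BITS) - 1)
    corner = (code >> SIDE_BITS) & ((1 << CORNER_BITS) - 1)
    middle = code >> (CORNER_BITS + SIDE_BITS)
    return middle, corner, side


def layout(code: int) -> list[list[tuple[int, int]]]:
    """Return a 3x3 grid of (block code, quarter turns clockwise).

    Assumption: corners and sides are the same block repeated with rotation,
    which is what gives the quilt-like symmetry.
    """
    m, c, s = unpack(code)
    return [
        [(c, 0), (s, 0), (c, 1)],
        [(s, 3), (m, 0), (s, 1)],
        [(c, 3), (s, 2), (c, 2)],
    ]


if __name__ == "__main__":
    code = pack(5, 100, 3)
    print(code, unpack(code))
    print(1 << PATTERN_BITS)  # number of distinct 17-bit pattern codes
```

Seventeen bits give 2^17 = 131,072 pattern codes, so on these numbers the foreground and background colors would have to supply several thousand distinguishable pairs to reach an estimate of about a billion.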

If not, adding a few more shapes will be enough to assign a unique design to every single person on earth. Hmm. Wouldn't it be interesting to assign one to each last name so they can be used as 'house' symbols?

Update #2:

Please read the post about the PassMark patent that could affect this and other phishmarks.

Searching for Algorithms and Data Structures

In the old days, finding obscure algorithms and data structures meant keeping stacks of books and trade journals in the garage, and visiting bookstores and local universities to dig through mountains of books and badly written and copied research papers.

Even in the Bay Area, where there are great bookstores and two large universities, it wasn't easy. Whenever I read about something neat like Linda or spiral hashing, I drove over to the Stanford math & CS library and spent hours going through papers. Finding the paper often meant having to read the papers mentioned in that paper, so it was a good way to waste a whole day in that dusty library.

Thanks to the Net, all that is replaced by Google and excellent services like CiteSeer and NIST WebSpace. The NIST Dictionary of Algorithms and Data Structures is useful too, although it's not as complete as I would like. Unfortunately, most of these services are not updating their stat pages as often as I would like. For example, this list of most-accessed documents at CiteSeer was last updated in June 2003.

I wish these valuable services had easy ways to donate (PayPal?) because I think they are absolutely essential to my work and I want them to improve and expand their services. For example, wouldn't it be great if documents at CiteSeer were Wiki pages? This way, corrections to errors in the papers and implementations can be shared.

Lua 5

If you need a small yet fast embeddable scripting language engine, check out Lua. Lua 5 now has a formidable array of tools and libraries to choose from, and there is also a Lua wiki. Lua syntax is slightly funky but similar to Python. Yes, it handles COM just fine, so you can use it inside an ATL/WTL module (e.g. IE or shell extensions) to easily manipulate COM objects without the pile of repetitive code one usually has to write when using COM. LuaPlus is a variation worth looking at also.

Too much flexibility

Hanni at BileBlog occasionally hits the nail on the head, and this time it's his rant against the flexibility fetish rampant among Java programmers. I suspect it was their obsession with design patterns that led Java programmers down this path. Design patterns are a useful tool, but you can hurt yourself if you pull on them too much.

Being able to mold and fuse everything in your software is good, but such flexibility isn't useful if it isn't actually used. I seriously doubt that more than 10% of all the extra flexibility and abstraction being built into Java software is ever used. All that 'fat' makes the software bigger, slower, and more difficult to understand.

Dive into any popular open source Java code and you'll see lots of design pattern artifacts like Factories, Adaptors, Managers, and Observers, most of which have only one or a couple of implementations. This sort of habitual abstraction often forces latecomers to wade through layers of abstraction just to understand the process and data flow.
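To make that concrete, here is a minimal sketch (in Python rather than Java, to keep it short) of the kind of artifact being described: an interface and a factory wrapped around exactly one implementation, next to the plain function that does the same job. All the names (`Greeter`, `DefaultGreeter`, `GreeterFactory`, `greet`) are hypothetical and not taken from any real codebase.

```python
from abc import ABC, abstractmethod


# Habitual abstraction: three named things for one behavior.
class Greeter(ABC):
    @abstractmethod
    def greet(self, name: str) -> str:
        """Return a greeting for the given name."""


class DefaultGreeter(Greeter):
    """The only implementation that ever gets written."""

    def greet(self, name: str) -> str:
        return f"Hello, {name}"


class GreeterFactory:
    @staticmethod
    def create() -> Greeter:
        # No configuration, no alternatives: the factory has nothing to decide.
        return DefaultGreeter()


# The same behavior without the extra joints.
def greet(name: str) -> str:
    return f"Hello, {name}"


if __name__ == "__main__":
    print(GreeterFactory.create().greet("Ann"))
    print(greet("Ann"))
```

A reader tracing the first version has to open three definitions to learn what one line does; the interface can always be introduced later, when a second implementation actually shows up.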

I think the best example of flexibility is the mammal skeleton structure, because flexibilities are like joints, points that can bend. Joints in our bodies don't bend in all directions. They also exist only where they're really needed. Each joint has a cost associated with it, so if the benefits don't outweigh the cost of having a joint at a certain location, it shouldn't be there.

Evolution doesn't happen in anticipation; it happens in real time. Don't add flexibility in anticipation, but add it when you actually need it, where you need it, and no more than what you need. Based on my experience, I would add that real flexibility comes from preventing assumptions from leaking across component boundaries. Limiting surface areas between components will help in reducing the chance of such leaks.

Zombies at Starbucks

This particularly ghoulish scene from the movie Security Scenarios from Hell has three actors: WiFi, Zombies, and Spyware.

The perils of WiFi are well known and well publicized (e.g. Wireless Networks are in Big Trouble, a classic Wired article from 2001). If you are a geek, here is a more technical version of the same from Security-Forums.com. While the perils were preached before WiFi caught on, WiFi is now commonly available, which means those perils are now common as well.

Zombies are also well publicized. Typically, they are poorly protected servers or home PCs with broadband connections which are hijacked by hackers, supposedly even traded like Yu-Gi-Oh cards in the hacker community, and used to scale up attacks and to reduce the likelihood of capture.

Spyware is software running on desktops that monitors user activities and reports back to its master. Most of it is just a privacy violation, but some is used for more sinister purposes and is called a trojan. Earthlink recently claimed that PCs had, on average, 28 pieces of spyware installed. While I think the claim is over-hyped to fit their agenda, spyware is nonetheless commonplace and it's not difficult to place one on anyone's computer. If your PC is more than six months old, chances are that there were plenty of opportunities for hackers to seed it with spyware.

So here is the scene: imagine a new class of spyware that monitors wireless network packets using code from these open source wiretapping tools. AirSnort and one of the ARP poisoning packages should be enough. Now imagine this spyware being delivered to laptops with WiFi cards that support the features AirSnort needs. The laptop just became a new kind of zombie, which I call a wireless zombie, that only wakes up when the WiFi card is used.

All that is missing from the scene is the stage: a WiFi hotspot like Starbucks. The laptop owner sits in a corner and accesses the Net through the WiFi. It could even be someone like me writing this very blog post. The spyware wakes up and starts monitoring the wireless traffic, looking for passwords and credit card numbers. If very strong encryption is used, wireless zombies can form a global grid and split up the work of cracking encryption keys. Once a month, the zombies report back to their master via USENET posts.

This Zombies at Starbucks scenario is particularly nasty because the potential number of compromises is just staggering. Maybe the FCC will have to dictate a higher level of standards and send out a warning that helps WiFi users detect wireless zombies by the unusual fan activity triggered by the zombie grid working overtime.

Alive

Just in case you are wondering, I am still alive and kicking. I have been busy with a project for a client and I have barely managed to get enough sleep in the last six days because I have to deliver by this Sunday something that will wow people into opening their pockets next week.

As usual, it's a lone-wolf project because there is neither the time nor the resources to pull together a team. I am trying to slip in some fancy design features for flexibility but it's mostly wham-bam-stay-out-of-my-way-fool and I'll-fix-that-later going on.

Yeah, it's Silicon Valley at its best, since crash projects like these are impossible to outsource. The days of milking fat mega-corporations on multi-year projects are gone, and the lean mean shoot-from-the-hip or work-for-nickels days are here.

Superbug and Hackers

Hackers are like germs. You throw the equivalent of antibiotics at them, and they'll mutate into superbugs. For example, I doubt phishers will be tempted to hack Google to take advantage of the AdSense Voluntary XSS vulnerability, because they are getting enough loot from stupid phishing attacks to keep them happy. Once Microsoft Outlook, the main phishing delivery vehicle, is plugged and their gravy train runs out, they will turn into superbugs and find other means of getting their phishing lures in front of users' eyeballs.

Oops. I am out of tea for now.

Don Park's Weekly Habit

Well, sorta weekly.

Category: Technical