Processing Old Posts

Spent a couple of hours converting my old posts from XML with custom schema to JSON. Scrubbed some obvious spam comments (any comment with more than 5 hyperlinks).
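
A minimal sketch of the link-count rule above, in Java. This is not the script actually used for the conversion; the class name, method names, and the "count every http(s):// occurrence" heuristic are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpamScrubber {
    // Threshold from the post: more than 5 hyperlinks means spam.
    static final int MAX_LINKS = 5;

    // Assumed link detector: every "http://" or "https://" counts as one link.
    private static final Pattern LINK =
            Pattern.compile("https?://", Pattern.CASE_INSENSITIVE);

    /** Counts URL occurrences in a comment body. */
    static int countLinks(String body) {
        Matcher m = LINK.matcher(body);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    /** A comment is treated as spam only when it exceeds the threshold. */
    static boolean isSpam(String body) {
        return countLinks(body) > MAX_LINKS;
    }

    /** Returns the comments that survive the scrub, in original order. */
    static List<String> scrub(List<String> comments) {
        List<String> kept = new ArrayList<>();
        for (String comment : comments) {
            if (!isSpam(comment)) {
                kept.add(comment);
            }
        }
        return kept;
    }
}
```

Note that an anchor tag whose text is also a URL counts twice under this heuristic, so a stricter version would count href attributes instead.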

Result is a 7MB JSON file containing 1756 blog posts with comments. Hashes of IP addresses were not archived so they’ll all be treated as self-proclaimed foobars.

Next step is to POST them as well as assets they reference to WordPress.com via REST API which should take another couple of hours of hacking since I haven’t bothered to convert the posts to RSS.

Need A Good Theme

This blog goes way back, all the way to 2001. First 7 years of it was the most hectic but they’re archived and need to be restored.

Fresh start calls for a fresh theme. Something that’s simple yet easy to read. No gray text. I may just bang one out from scratch but would rather start with a good one like this Rewritten Hemingway theme then tweak (post titles look a tad too big and too dark).

Update: That theme lasted only 5 min of staring at it. Let’s try Independent Publisher theme.

Blog Name Change

I’m preparing myself mentally to return to blogging, at a slower pace. To that end, I’ve renamed this blog to Weekly Habit. Blogging daily was exciting back then. I still want to but am wary of blogging for the sake of blogging.

Anyway, stay tuned.

On KISSmetrics

I think Hiten Shah, CEO of KISSmetrics, is too distracted with recent lawsuits to understand the mistake his company made: not looking out for their customers.

Legality of using ETag for tracking or reusing same ETag hash across domains is unclear and should be answered through legal process. What is clear, however, is that their usage raises suspicions and invites accusations against their loyal customers, not just KISSmetrics.

KISSmetrics should have foreseen this but apparently either did not or did but failed to act before it blew up. I hope my two cents’ worth will help them learn and improve their service. Lawsuits may come and go but lessons learned will stay with you.

An ounce of foresight is worth a pound of hindsight.

Excuses make poor stain removers.

Cinemetrics

Cinemetrics is a promising example of Identicon IMO. Similar efforts have been made for audio clips.

Cinemetrics aims to create a visual “fingerprint” for film using the editing structure, color, speech and motion.

Design challenge in generating interesting ‘fingerprint’ depends largely on the target audience. Multimedia production is a very iterative process resulting in many variations and combinations so, if the target audience are film editors, challenge is in finding ways to emphasize difference without sacrificing similarity.

Identicon and Robohash

This post is a dump [for archival purpose] of an exchange between Colin Davis, creator of Robohash, and me that took place in context of a Hacker News thread about Robohash.

Colin:

Identicons are a great idea, I really love them.. They’re a good solution to a gut-check “Something is wrong here..”

Sort of like a SSH-fingerprint.

The problem I’ve had with them is that they’re generally not all that memorable. Was that triangles pointing left, then up, or up then left?

This is my attempt at addressing that problem for my own new project, but I’d love to see what you build! If you want to use these images, feel free. They’re CC-BY, so they’re open to the world now 😉

Don:

Re ‘not all that memorable’, that’s because identicons were originally designed for ‘distinguishing’ and ‘matching’ data, not ‘memorizing’.

Abstract geometric identicons like my original implementation as well as variations used at WordPress and StackOverflow are, while nearly impossible to remember, distinguishable in a pile which comes in handy when distinguishing the ‘voice’ of individuals in a long thread of comments.

To use identicons as permanent identity, one has to ‘identify’ with their identicon. We can identify faces of our friends because we shared memories with them, stories if you will.

So robotic identicons like yours can be made more memorable if users had some ways to create a story they can associate with it like ‘blue viking with left arm missing’, etc.

Colin:

That makes a lot of sense. I wasn’t trying to be disparaging. It’s a great idea, and very helpful, I just felt like it could go in a slightly different direction for this specific use-case (Public Keys).

Don:

I think an interesting way to apply identicons to certs is to map each cert attribute to an ‘attribute’ of the identicon, visualizing the attributes.

What is Identicon?

The word identify has two meanings:

  1. Establish or indicate who or what (someone or something) is.
  2. Recognize or distinguish.

I chose the name Identicon with second meaning in mind to convey that Identicons’ intended applications are in helping users recognize or distinguish textual information units in context of many.

Textual Data Problem

Human eyes have evolved to recognize individual objects out of a group by noticing visual differences. Unfortunately, textual data are visually similar.

While many different typographic features and techniques have been invented since writing was invented, most of them are for free-form text. Additionally, list and table text layouts lack the irregular features free-form text has, like line ends and paragraphs, to use as landmarks.

Icon Solution

Icons do add the necessary visual differences to textual data. Only problem is that icons are typically designed by hand or, in case of avatars, photos or pictures have to be uploaded.

Identicon = Generated Icon?

One might say Identicons are simply generated icons. The first implementation of Identicon used salted hash of IP address to generate 9-block colored icon for each blog commenter. Most popular use of Identicon today remains generated iconic avatars.

I think it’s a bit more. Certainly, generated part is required. But the icon part is unnecessarily restrictive unless colored circle or box can be called an icon.
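
To make the "salted hash of IP address to 9-block colored icon" idea concrete, here is a rough Java sketch. It is not the original Identicon algorithm: the choice of SHA-1, the byte-to-color mapping, the mirrored 3x3 on/off grid, and all names are assumptions picked to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class IdenticonSketch {
    /** Hashes salt + IP so the icon never exposes the raw address. */
    static byte[] saltedHash(String salt, String ip) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            return sha1.digest((salt + ip).getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** Foreground color as 0xRRGGBB, taken from the first three hash bytes. */
    static int color(byte[] hash) {
        return ((hash[0] & 0xFF) << 16) | ((hash[1] & 0xFF) << 8) | (hash[2] & 0xFF);
    }

    /** 3x3 grid of on/off blocks; the left column is mirrored to the right. */
    static boolean[][] blocks(byte[] hash) {
        boolean[][] grid = new boolean[3][3];
        for (int row = 0; row < 3; row++) {
            int b = hash[3 + row] & 0xFF;   // one hash byte per row
            grid[row][0] = (b & 1) != 0;
            grid[row][1] = (b & 2) != 0;
            grid[row][2] = grid[row][0];    // mirror for symmetry
        }
        return grid;
    }

    /** Text rendering: a color line followed by three rows of '#' and '.'. */
    static String render(String salt, String ip) {
        byte[] hash = saltedHash(salt, ip);
        StringBuilder sb = new StringBuilder(String.format("#%06X", color(hash)));
        sb.append('\n');
        for (boolean[] row : blocks(hash)) {
            for (boolean on : row) {
                sb.append(on ? '#' : '.');
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

The same salt and IP always yield the same icon, which is the only property the "generated" part really needs. Everything else is a design choice.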

Identicon and QR Code

I was recently asked to provide some information on identicons, a good excuse to restart blogging.

This post, more like notes actually, compares Identicons to QR codes, which may seem similar visually but are not.

WARNING: I think in random fragments, brief moments of coherency, so my posts will be the same.

Machine vs People

Content

  • QR codes are containers of information.
  • Identicons are shadows of information they are associated with. 

Usage

  • QR codes are used to transfer information from real life (RL) objects to computers using only optical means.
  • Identicons are used to distinguish individuals or groups of information.

More to come later. Sorry.

Installing sqlite3-ruby gem on Snow Leopard

Problem:

After upgrading to Snow Leopard, I had to rebuild/reinstall MacPorts and RubyGems as recommended. While doing this, I found that sqlite3-ruby gem install failed with errors related to extconf.rb file.

Solution:

Not sure why this works but I found a working solution at StackOverflow which replaces:

/usr/local/lib/libsqlite3.dylib

with a symbolic link to one that came with Xcode for Snow Leopard:

/Developer/SDKs/MacOSX10.6.sdk/usr/lib/libsqlite3.0.dylib

You can find the full ‘ln’ command at StackOverflow page above but be sure to rename the original in case you need to restore it.

Using JSP with Jersey JAX-RS Implementation

This post shows you some tips you’ll likely need to use JSP with Jersey in typical Java webapps.

Tested Conditions

While Jersey 1.1.1-ea or later is probably the only hard requirement for the tips to work, my development environment is listed here for your info. You are welcome to add to this rather meager basis for sanity.

  1. Jersey 1.1.1-ea
  2. Tomcat 6.0.20
  3. JDK 1.5
  4. OS X Leopard

Change JSP Base Template Path

Default base path for templates is the root of the webapp. So if my webapp is at “/…/webapps/myapp” then Viewable("/mypage", null) will map to “/…/webapps/myapp/mypage.jsp”.

To change this, say to “WEB-INF/jsp” as it’s commonly done for security reasons, add following init-param to Jersey servlet/filter in web.xml:

<init-param>
<param-name>com.sun.jersey.config.property.JSPTemplatesBasePath</param-name>
<param-value>/WEB-INF/jsp</param-value>
</init-param>

Return Viewable as part of Response

It was not obvious to me (doh) where Viewable fits into Response when I have to return a Response instead of Viewable. It turns out, Viewable can be passed where message body entity is passed. Example:

return Response.ok(new Viewable("/mypage", model)).build();

Use “/*” as servlet-mapping for Jersey

The primitive servlet-mapping URI pattern scheme, which somehow survived many iterations of the servlet API, impacts JAX-RS hard if servlet-mapping is overly broad. Unfortunately, pretty RESTful URLs call for servlet-mapping to be “/*” instead of something like “/jersey/*”, breaking access to JSP files as well as static resources.

To work around, you’ll have to use Jersey as a filter instead of a servlet and edit a regular-expression init-param value to punch passthrough holes in Jersey’s routing scheme. To enable this, replace Jersey servlet entry in web.xml with something like this:

<filter>
 <filter-name>jersey</filter-name>
 <filter-class>com.sun.jersey.spi.container.servlet.ServletContainer</filter-class>
 <init-param>
  <param-name>com.sun.jersey.config.property.WebPageContentRegex</param-name>
  <param-value>/(images|js|styles|(WEB-INF/jsp))/.*</param-value>
 </init-param>
</filter>
<filter-mapping>
 <filter-name>jersey</filter-name>
 <url-pattern>/*</url-pattern>
</filter-mapping>
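
Before deploying, it can help to check which paths the regular expression actually lets through. The sketch below uses plain java.util.regex with the same pattern as the param-value above. It assumes Jersey requires the pattern to match the whole path within the webapp, and the class name and sample paths are made up for illustration.

```java
import java.util.regex.Pattern;

class WebPageContentRegexCheck {
    // Same value as the WebPageContentRegex param-value above.
    static final Pattern PASSTHROUGH =
            Pattern.compile("/(images|js|styles|(WEB-INF/jsp))/.*");

    /** True if the path should bypass Jersey, assuming a full-path match. */
    static boolean passesThrough(String path) {
        return PASSTHROUGH.matcher(path).matches();
    }

    public static void main(String[] args) {
        // Hypothetical request paths relative to the webapp root.
        String[] paths = {
            "/images/logo.png",
            "/WEB-INF/jsp/mypage.jsp",
            "/posts/42",
            "/images"
        };
        for (String path : paths) {
            String target = passesThrough(path) ? "passthrough" : "Jersey";
            System.out.println(path + " -> " + target);
        }
    }
}
```

Under that assumption, "/images" without a trailing slash still goes to Jersey, since the pattern requires a slash after the directory name.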

That’s all for now. Hope this post saved you some headaches.