Web Application Platform Comparisons

I found these informative links in my research into high-performance web application platforms and thought you might find them useful as well.

To start off, A Comparison of Portable Dynamic Web Content Technologies for the Apache Server provides a good overview of the choices.  I think its numbers for Java Servlets are somewhat off, but not enough to affect the conclusions.  FastCGI + Perl was the clear winner in the paper, followed by mod_perl.  The only surprise was the poor performance of PHP.  Unfortunately, mod_python was not tested.

If you are curious about the performance differences between Apache 1.3x and Apache 2.0x, Open-Source Web Servers: Performance on a Carrier-Class… provides some benchmarks.  This paper might be outdated though.  If you know the current state of the performance differences, let me know.  In short, the paper shows that there are only marginal differences, except that Apache 1.3x scalability was not consistent in one of the tests.

AMK's Diary blog post Web server performance clued me in to SCGI, which was designed to be easier to implement than FastCGI.  When Linux Weekly News switched from mod_python to SCGI, performance improved significantly.  The post goes on to suggest why this might be.
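To give a feel for how little there is to implement, here is a minimal Python sketch of the SCGI request framing as I understand it from the published spec: the headers travel as one netstring of NUL-terminated name/value pairs, with CONTENT_LENGTH first and an SCGI header set to 1, followed by the raw body.  The function names are my own invention for illustration, and this is not code from MEMS Exchange's scgi package.

```python
def encode_scgi_request(headers, body=b""):
    """Frame one SCGI request: a netstring of headers, then the raw body."""
    # CONTENT_LENGTH goes first and an "SCGI" header with value "1" is required.
    items = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")]
    items += [(name, value) for name, value in headers.items()
              if name not in ("CONTENT_LENGTH", "SCGI")]
    # Each name and each value is terminated by a NUL byte.
    block = b"".join(
        name.encode("latin-1") + b"\0" + value.encode("latin-1") + b"\0"
        for name, value in items
    )
    # Netstring framing: "<length>:<bytes>,"
    return str(len(block)).encode("ascii") + b":" + block + b"," + body


def decode_scgi_request(data):
    """Split a complete SCGI request back into (headers dict, body bytes)."""
    length, _, rest = data.partition(b":")
    size = int(length)
    block, comma, body = rest[:size], rest[size:size + 1], rest[size + 1:]
    if comma != b",":
        raise ValueError("malformed netstring: missing trailing comma")
    fields = block.split(b"\0")[:-1]  # drop the empty piece after the last NUL
    if len(fields) % 2:
        raise ValueError("header block has a name without a value")
    names = [f.decode("latin-1") for f in fields[0::2]]
    values = [f.decode("latin-1") for f in fields[1::2]]
    return dict(zip(names, values)), body
```

A real server would read from a socket and handle partial reads; this sketch assumes the whole request is already in memory.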

Understanding FastCGI Application Performance provides a more thorough explanation of why FastCGI (and SCGI) perform so well.  After reading the paper, I set out to compare FastCGI and SCGI and found this archived message by Michael Watkins.  If the numbers are right, SCGI is much faster than FastCGI.

SCGI is released by MEMS Exchange, an apparently non-profit organization serving chip fabrication centers.  MEMS Exchange also makes a number of useful Python packages available for free, the most interesting of which (at least to me) is Quixote.  Quixote is a Python web application framework that can support many architectures (see Quixote whitepaper).


a SCGI-based architecture

I would have liked to see how SCGI compared to mod_perl, but found only an informal comparison of FastCGI and mod_perl.  I would also have liked to see a comparison of mod_perl-based Perl web applications and SCGI+Quixote-based Python web applications, but found nothing in this run.

By chance, I did run across Apache Hello World Benchmarks, a very informative benchmark report.  It even compares popular Perl web application frameworks like Mason and HTML::ASP.  The numbers were enough to make me lose interest in Mason.  Phew.  I am saving a local copy of the benchmark because I know I'll be coming back to it often.

So, I'll be looking into SCGI and Quixote in the near future.  Did you know that during my wild younger days, my friends called me Don Quixote?

Update #1:

The SCGI picture has dimmed a bit.  The only SCGI server implementation I could find was MEMS Exchange's, which is written in Python.  This spells trouble for non-Python applications.  Also, an Apache2 version of mod_scgi, the Apache module implementing the SCGI client code, is not available.  Some efforts were made but they were abandoned.

I can't imagine mod_scgi being difficult to write, so it must be the lack of interest.  But lack of interest despite apparent superiority over other solutions doesn't make sense to me.  Geeks vote with their feet, and the lack of footsteps around SCGI leaves it looking less attractive to me.  Maybe it's Python being supposedly slower than Perl and its GIL (Global Interpreter Lock) curse.  In comparison, Perl looks awfully good in terms of geeks voting with their feet.

Speaking of popularity, mod_perl easily beats FastCGI despite the performance advantage FastCGI has.  Some of the reasons cited are: better support, bigger developer community, and tighter integration.

Herd behavior enhances survivability

Update #2:

Neil Schemenauer, author of mod_scgi, wrote to say that mod_scgi for Apache2 exists but shouldn't be put to serious use until more testing is done.  He mentioned that there is nothing Python-specific about SCGI and that he would be happy to provide support to people writing non-Python SCGI bindings.  He also pointed to mod_list as an alternative to mod_scgi, but I couldn't locate it via Google.

Update #3:

Neil was referring to mod_lisp, which sends request information in name-value pairs over local loopback.  As simple as can be.  While the protocols might be simple, smart management of multiple workers will not be.  There is some room for improvement in this area (notice the weird URL for mod_lisp).

Character Encoding Detection

For bilinguals like me, there is an often overlooked difference between IE and Mozilla: character encoding detection.

IE often gets confused about the character encoding of a webpage and ends up displaying garbage.  Manually changing the character encoding can sometimes fix the problem, but the problem is usually back again on the next page.  Also, attempts to change the character encoding manually often conflict with website deep-linking protections.

When this happens, I use Firebird (a Mozilla variation) to read the page because it has a much better character encoding detection algorithm.

With Jchardet, a Java port of Mozilla Charset Detector, Java programs can enjoy the same level of charset detection excellence as Mozilla.  Awesome.  Thanks to Elliotte Rusty Harold for the link.
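For contrast, here is about the crudest detector one could write, as a Python sketch: try a list of candidate encodings in order and take the first that decodes cleanly.  This is NOT how Mozilla's detector or Jchardet works; the candidate list and the function name are my own assumptions, and the result depends heavily on the order of that list, which is exactly why a smarter detector is worth having.

```python
# Candidate encodings to try, strictest first.  The list and its order are
# illustrative assumptions, not anything taken from Mozilla or Jchardet.
CANDIDATES = ("ascii", "utf-8", "euc_kr", "shift_jis", "latin-1")


def guess_encoding(raw, candidates=CANDIDATES):
    """Return the first candidate that decodes `raw` without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding; try the next
        return name
    return None
```

Because latin-1 accepts any byte sequence, it acts as a catch-all at the end of the list; drop it if you would rather get None for undecodable input.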

Python, Perl, mod_perl

First half of today was great, sitting outside with my tentative partner and brainstorming for three hours with my wife supplying food and drink.  Then I dove into Apache/Python integration options like mod_python (urgh) and mod_snake (dead).

Discouraged by what I saw, I spent the next three hours setting up mod_perl.  Installation is straightforward except that the docs I found skipped an important step, so I wound up going around in circles until I found the missing step and filled in the blanks.

mod_perl worked for a few minutes, then my IE started acting weird, like trying to download my test Perl script instead of displaying what it generated.  Firebird didn't have any problem though.  Restarted everything and it's working again.  Maybe it was a mistake to install version 2.0 instead of stable 1.0.  Anyway, I'll have to reconfigure Apache to prevent script files from being served.  Sheesh.

I am moving on to installing AxKit so I can play with it tonight.  Meanwhile, I got O'Reilly's mod_perl book in my tiny Safari bookshelf to read over dinner.  I am hoping that my Perl allergy stays dormant.  I wonder how much trouble it would be to compile Python into Perl bytecode?

Update:

Don't you just love getting stuck in the open source house of cards?  I deferred playing with AxKit in favor of Mason which is supposedly in use by Amazon and Salon.  Woohoo, I said.  I already had Apache2 and mod_perl2 installed so I should be able to drop Mason on top and play, I said.

To make a long story short, I ended up ripping out everything I installed and still ended up destroying the Apache2 installation over trouble with the mod_perl2, Apache::Request, and libapreq modules.  Why did I go with mod_perl2?  Because the mod_perl1 docs basically said I should use mod_perl2 for Win32.

Overall, it was a fun day.

WASTE at SourceForge

While googling for the latest on crypto export law, I came across Nullsoft's WASTE living on at SourceForge.  FYI, WASTE is open source software for small P2P networks, supporting IM, group chat, file browsing/searching, and file transfer.  It was released by Nullsoft and then removed by AOL, its parent company, in a matter of hours.  WASTE is now up to version 1.1.  Go check out WASTE at SourceForge.

Upcoming and Enriching Blog Calendar

Andy Baio's Upcoming.org is an interesting service (via Marc via Matt Haughey).  No, there aren't any really new ideas there, but Upcoming is an implementation of ideas that were talked about before, something to grab and shake.  Ray also found Upcoming interesting and asks some good questions.

Upcoming is, of course, related to the idea of enriching the ubiquitous calendar on every blog.  Events from Upcoming could be used to populate a blog's calendar as well as its 'day' pages so one could see the events related to a specific time period.  Since the most interesting events are those yet to happen, blog architecture and UI have to change.  In the end, the blog calendar turns into a calendar aggregator.
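Here is a rough Python sketch of what I mean by a calendar aggregator: blog posts and outside events get merged into one set of 'day' pages keyed by date, and the calendar can then highlight days that are still ahead.  The data shapes and function names are hypothetical; nothing here reflects an actual Upcoming.org feed or API.

```python
from collections import defaultdict
from datetime import date


def build_day_pages(posts, events):
    """Merge (day, title) pairs from posts and events into per-day pages."""
    pages = defaultdict(lambda: {"posts": [], "events": []})
    for day, title in posts:
        pages[day]["posts"].append(title)
    for day, title in events:
        pages[day]["events"].append(title)
    # Return a plain dict ordered by date so the calendar can walk it in order.
    return dict(sorted(pages.items()))


def upcoming_days(pages, today):
    """Keep only the days from `today` onward that have at least one event."""
    return {day: page for day, page in pages.items()
            if day >= today and page["events"]}
```

The point of the sketch is that a day page stops being a pure archive view: a date with no posts at all can still exist because an event lands on it.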

I am not happy with Upcoming's UI, but it's a start.

This post was brought to you by Shama Lama Ding Dong.

10 Python Pitfalls

Nice enumeration of odd Python behaviors by Hans Nowak.  Most of the Python language works as expected, so you get used to cruising along right after diving into it.  But that easy rider feeling turns queasy when you wham straight into oddities like these.  If you enjoyed this, here are a few more:

BTW, does anyone have answers to these two questions?

Google Cluster

I know I am late, but I am reading it now and thought some of you might have missed or have forgotten about the paper on Google cluster architecture (PDF) like I have.  Too bad it's a good read because that only leaves me wondering what else I might have missed.  Stress of living in the information-age I guess.

I have a few days of lull in my consulting practice just now, so I am keeping myself busy catching up on the latest in clustering technologies and practices.  I am not too interested in the academic side, just good ol' practical stuff along with some up-the-creek stories.  More companies like Amazon, AOL, Microsoft, and Yahoo should write about their experiences and solutions.

Skype Gripe

Until late afternoon yesterday, I thought one had to find someone to make a call.  Skype's UI certainly encouraged that line of thinking, particularly since I couldn't find a way to add a friend manually.  So the trouble with finding people started.

I found Scoble and Deane easily enough but they were apparently not online.  Later, I read somewhere that Skype's online status is currently unreliable.  Several friends left me comments saying they couldn't find me.  I couldn't find them either.  I tried to find myself as an experiment but couldn't.  Duh!

Yesterday afternoon, I suddenly realized that I didn't have to 'find' anyone to make a call on Skype.  I can just type "callto://billgates/" into IE and it would make the call.  So I called James Snell.  Yup!  I couldn't find him but Skype happily called him somehow with "callto://jamesmsnell/".  Nuts.  Skype screwed up their UI IMHO by making a shortcut appear to be the only way.

He didn't answer but he called me back a few seconds later.  He said Hello?.  Sounds great!  I said Hello!  Nothing.  James launched into more Hellos and I started fiddling frantically with everything.  Thankfully, we had Skype IM to help us along and figure out that it was a microphone input gain problem or something.  I then had to go to dinner so we disconnected.

Late last night, I thought about Skype.  Skype is certainly a neat beta product.  It still needs many more peers in its P2P network.  Sound quality was good though.  But it felt weird using a computer like a telephone.  Uncomfortable, in fact.  Do I need Skype?  Nope.  I don't have a large phone bill.

My wife does, but the other end has a technology phobia, so using a computer like a telephone would be awkward.  It's easier and more comfortable for my wife to adjust her calling time and day to take advantage of the cheapest international call rates.

Skype was more of a curiosity than a necessity and it wasn't much fun trying to get it to work.  So I uninstalled Skype last night.  Sorry friends for all the trouble.  Let's wait for a better trouble.

Update:

I am not underestimating the potential of a service like Skype.  The potential is there.  All I am saying is that Skype has beta problems and that it doesn't have universal appeal.  If one needs it badly, their tolerance level rises.  Otherwise, others may come along and do a better job.  I also would like to see more detailed information about Skype and its future roadmap.  So far, I haven't seen any welcoming attitude toward third-party developers.

Update #2:

A few people asked for the Skype graphics I had used for my 'callto:' link.  Here it is.

Infinite Tolerance

I knew the numbers but I just now realized what the numbers really meant.  Only 10% or less of the e-mails I get daily are legitimate.  The other 90% are spam and viruses.

10%!!!

I agree with Jon Udell that RSS is not a replacement for e-mails and that e-mail has special powers, but e-mail infrastructure is clearly broken.

As noted by some, SpamBayes tends to throw e-mails written in languages other than English into the spam pile.  The other day, my little spirit-brother (brother not of birth) in Korea called me to ask why I hadn't responded to his e-mail.  I told him it was SpamBayes' fault.  He said Huh?

Exactly what is going on here?  Infinite tolerance?  What are we waiting for before some drastic actions are taken?  1%?  0.1%?

Serts

It's 4AM and I am thinking about serts.  No, not certs.  Serts, as in Asserts.  Cute, eh?  It's a term I came up with to describe cert-like objects, except that a sert is just a signed piece of information about anything, whereas a cert is signed information about a person.  A sert can be a signed grocery list or a list of weapons on board a ship.  A sert is just data signed by someone or something.

Think of serts as anonymous certs gone wild.  Say I visit a website and the website can distinguish me from other visitors using whatever means are handy, like cookies.  Over time, the website gets to know me.  So it hands me a sert that says "I don't know who this guy is but he has been trustworthy for these sorts of activities."  Does that sert have value?  I think so.  Does it matter who I am?  Not really.  That's a sert for you.
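A toy Python sketch of the website case might look like this.  Big assumption: I am using an HMAC with a secret known only to the issuing website as a stand-in for a real signature, so only the issuer can check its own serts; a sert that third parties could verify would need public-key signatures instead.  The function names and the claim fields are made up for illustration.

```python
import hashlib
import hmac
import json


def issue_sert(secret, claim):
    """Sign an arbitrary JSON-serializable claim; no identity is involved."""
    # Serialize with sorted keys so the same claim always yields the same bytes.
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_sert(secret, sert):
    """Recompute the signature over the claim and compare in constant time."""
    payload = json.dumps(sert["claim"], sort_keys=True).encode("utf-8")
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sert["signature"])
```

Note that the claim says nothing about who the holder is; it only asserts something about observed behavior, which is the whole point.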

The world needs serts because everyone could use a bit of sertainty.