Auto-Mirror Web Service

Since the bubble burst, content has been disappearing.  Just yesterday, I was reading some papers on techniques for simulating hand drawings (I wanted to follow up on my idea that UI artifacts, like buttons, that look hand-drawn stand out without being too loud, a very useful quality).  After finding and reading about ten interesting papers on the topic, I started reading the secondary papers referenced in them.  What surprised me was that the majority of those links were broken.  Most of the sites had simply shut down.  The rest were broken because the papers had been removed or moved elsewhere.

If I have web resources that link to external web resources, the only thing I can legally do now is pray.  If the resources are critical to me, I can use a local copy, but there are several problems with using a local copy:

  1. Using a local copy may be illegal.
  2. Updating a local copy is cumbersome and often requires manual review (the real paper might get replaced with a sorry-but-it's-gone page).
  3. The UI may become confusing or verbose enough to interfere with the content.

Owners of those external resources have their own problems.  First, they don't know who depends on their resources, so they don't have much choice when they have to stop operating for one reason or another.  Second, scalability often costs too much and takes too long to increase (when your server goes down because you got slashdotted, it's too late to get additional servers).

One solution for both parties is to use web services to negotiate auto-mirroring of content.  For sites that reference external resources (well, everyone), auto-mirroring means the external resources referenced by their content stay available.  For owners of external resources, it means they can route requests to mirror sites when the load gets too heavy.
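
To make the owner's side concrete, here is a minimal sketch of that routing decision.  Everything in it is an assumption of mine for illustration: the function name choose_target, the load threshold, and the rule of picking the mirror with the most spare capacity are not part of any existing server.

```python
def choose_target(origin, load, threshold, mirrors):
    """Return the URL that should serve the next request.

    origin    -- URL of the owner's own server
    load      -- current load on the origin (hypothetical unit, e.g. req/s)
    threshold -- load above which the owner starts sending traffic away
    mirrors   -- list of (url, capacity) pairs the owner has on record
    """
    # Serve locally while the origin is healthy, or if nobody mirrors us.
    if load < threshold or not mirrors:
        return origin
    # Otherwise hand the request to the mirror that offered the most capacity.
    url, _capacity = max(mirrors, key=lambda m: m[1])
    return url


known_mirrors = [("http://mirror-a.example.org/", 20),
                 ("http://mirror-b.example.org/", 80)]

quiet = choose_target("http://origin.example.com/", 10, 100, known_mirrors)
busy = choose_target("http://origin.example.com/", 250, 100, known_mirrors)
```

In a real server the returned URL would probably go out as an HTTP redirect, and a smarter rule would spread requests across mirrors rather than always picking the biggest one.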

Technically, it's just mundane stuff.  The resource-consuming server asks the resource owner, via SOAP, whether certain resources can be mirrored.  If not, nothing is done.  If so, the content is mirrored, and the resource owner records the mirror location along with information (capacity, location, etc.) useful for balancing load across multiple mirrors.
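
The exchange can be sketched in a few lines of Python.  This is only an in-process model under my own assumptions: the SOAP transport is left out entirely, and the names ResourceOwner, MirrorInfo, request_mirror, and negotiate are hypothetical, as are the example URLs and paths.

```python
from dataclasses import dataclass, field


@dataclass
class MirrorInfo:
    """What the owner records about one mirror (hypothetical fields)."""
    url: str
    capacity: int   # e.g. requests per second the mirror can absorb
    location: str   # e.g. a coarse region label


@dataclass
class ResourceOwner:
    """Stand-in for the server that owns the original resources."""
    mirrorable: set                              # paths the owner allows to be mirrored
    mirrors: dict = field(default_factory=dict)  # path -> list of MirrorInfo

    def request_mirror(self, resource, info):
        """Answer one 'may I mirror this?' request."""
        if resource not in self.mirrorable:
            return False                         # not allowed: nothing is done
        # Allowed: note where the mirror lives and what it can handle.
        self.mirrors.setdefault(resource, []).append(info)
        return True


def negotiate(owner, resources, info):
    """Consumer side: ask about each resource, keep the ones granted."""
    return [r for r in resources if owner.request_mirror(r, info)]


owner = ResourceOwner(mirrorable={"/papers/hand-drawn.pdf"})
me = MirrorInfo(url="http://mirror.example.org/", capacity=50, location="us-west")
granted = negotiate(owner, ["/papers/hand-drawn.pdf", "/private/draft.pdf"], me)
```

After the call, granted holds only the paper the owner agreed to, and the owner's mirrors table has one entry for it.  The consumer would then fetch and serve the granted resources; that part is omitted here.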

What excites me is that the amount of work involved is relatively small, yet the benefits are so huge.  I can easily imagine this becoming a standard web server feature within a year.