Back into the Future of Web: HTML5 SQL Player

Like a Three Stooges carpet-pull slapstick stunt, the HTML5 client-side storage spec changed drastically the night I released my Gears wrapper. Thump. Ow!

I am ok! I am ok! And better for it! This time, I am back with a vengeance. Why fight the change? Embrace it! Dear reader, allow me to present the HTML5 SQL Player, a tool that spec developers and curious bystanders alike can use to poke and prod the spec in action. Essentially, this is a Google Gears-based sandbox in which a user can run JavaScript code to query and test the interfaces defined by the specification. If I were into that kind of thing, there would be a picture of a crazy-eyed Christopher Lloyd and some reference to the movie that doomed his career. Yes, my friends, this sandbox is a glimpse into the yet-to-be-implemented technology.

And as such, beware of the bleeding edge. Some things in the spec are somewhat under… erm… specified (like the mode of transaction and its effect on sequential calls of the transaction method) and some things in the sandbox are under… erm… implemented (like changeVersion or SQL sanitization). But regardless, this approach is still the best if you’re trying to evaluate the spec’s viability in an effort to make it better. And that’s what this is all about.

Jumping Off Audience Navigation Bandwagon

Future Endeavor has another insightful post, followed by an interesting UX example of the University of Virginia front door. I am a big fan of this blog and would highly recommend it to anyone involved in higher education Web development. This time, Tony Dunn talks about the future of the University Web site. I like his thinking and I feel that my thinking is mostly aligned with it. Where we diverge is on the future of the audience-based navigation.

The truth is, I no longer believe in the necessity (or usefulness) of audience-based navigation for a University. There, I said it. Having been an advocate for the last 8 years, I eventually came to realize that all it does is create an extra barrier for the user (umm, who am I? Which is the right door?) and is mostly ignored by the visitors, anyway (I am basing this on my observations and thought experiments).

Self-selection is a myth: as you probably know, the user commonly belongs to multiple or none of the offered audiences, and this artificial ritual of forcing the visitor to put the right hat on is not only confusing, it’s actually a little bit insulting.

What’s the alternative? Concentrate on three things:

  • Needs-based Clusters. Gather topics relevant to a specific need (How do I become a student?) into cohesive (spit-and-polished!) sites of limited scope.
  • Lifeline Links. Identify the 3-5 most desperate and immediate needs of your visitors (I have to check my grades) and by golly, put them on the home page.
  • Ambient Findability. Make sure that each page on your site carries the potential to get the user closer to achieving their goals.

That’s all for now. I am eager to hear your thoughts and opinions on my little turn-about.

Chewing on Open Social

So, the cat is out of the bag, in case you haven’t heard (and if you haven’t, what remote island are you living on?). I spent a bit of time this weekend playing with the new toys, trying to analyze by immersion. Essentially, on the JavaScript side, it’s one part existing Gadget API, one part new feature (you guessed it, named opensocial), and you’ve got yourself a whole new playing field to tinker with. Not being familiar with the Gadget API, I was learning both parts at the same time, which is never a bad thing.

After getting my sandbox permit, I hastily cooked up two simple gadgets, er.. social applications, the Twitter and the OpenZombie. Both of these are skeletal proofs-of-concept that I have no intention of developing further. So, feel free to borrow in parts or in whole: it’s public domain, baby! I intentionally tried to keep them light-weight, client-side-only. Both have been casually tested with Firefox and IE7. In other words, don’t call me if you have a problem running either.

The first application grabs data from Twitter using Gadget API calls and renders it to somewhat resemble a Twitter feed. It doesn’t actually use any of the OpenSocial API functionality and can be run in iGoogle. It does use the UserPrefs to ask for the Twitter username, and Orkut’s current way of dealing with this is rather jarring, so be prepared for that.

The second one is my 45-minute take on the ever-ridiculous Zombies application on Facebook. Except this one actually bites automatically. As soon as the user stumbles upon my profile page, they are bitten by the OpenZombie application (with the corresponding activity stream message), and offered to install the application themselves as a vengeance-laden consolation prize. No stats are kept (and that would be hard, given that the API doesn’t yet allow you to update the owner’s person data), and no blood-curdling imagery is displayed. I figured, the next guy will come along and make it pretty. And by pretty I mean despicably horrific.

Speaking of the next guy, here are a couple of tips that I have for you:

  • When debugging the application, appending &bpc=1 to the URL of the page itself will disable caching of the application. Someone already built a Greasemonkey script for that.
  • Modularize your development. Make your application a harness that calls scripts and styles remotely:
    <Module>
        <ModulePrefs [attributes go here]>
            <Require feature="opensocial-0.5"/>
            [more feature requirements go here]
        </ModulePrefs>
        [user prefs, etc. go here]
        <Content type="html">
        <![CDATA[
            <script type='text/javascript'
                src='[absolute script url]'></script>
            <style>
                @import url('[absolute style url]');
            </style>
            <div id="message"></div>
            <div id="panel"></div>
        ]]>
        </Content>
    </Module>
    

    Then, in your script, do something like this:

    _IG_RegisterOnloadHandler(function() {
            // your code goes here
    });
    
  • Once you have modularized your application, you can do another simple trick: edit your hosts file to temporarily point the hostname in the script and style URLs to localhost. Then make sure that these files are accessible from your local web server. Now you can edit them and see the changes without having to push the files to the server on which the application will eventually be hosted. Just don’t forget to remove the edits in the hosts file when you’re done developing.

Now, for a quick technology review of the OpenSocial JavaScript API (can’t speak for the GData stuff, haven’t played with it). Contrary to the few negative reactions in the blogosphere, I find OpenSocial pretty impressive. I think the API is easy to learn and follow, the transparent authentication and identity data management model is neat, and there’s plenty of room to play, or even build something useful. Bringing application development into the JavaScript domain is a good thing. Yeah, the sandbox squeaks and rattles, but that’s typical for an early release. Give it a little time.

The API itself is wordy and a bit inelegant, though this may be a viewpoint skewed by the laconic beauty of jQuery. I am guessing that its current shape is probably a result of being tailored toward the more arcane JavaScript implementations. I can’t find any other explanation for the gratuitous global namespace pollution or things like API objects having accessible underscored methods/fields.

But my biggest beef is with the Gadget API. With its “let’s start now, it’s so simple!” approach, it practically encourages hacky, spaghetti-style Web development. Adding even primitive asset management to the XML declaration would be a win-win: developers are nudged to separate behavior, presentation, and markup, and the server gets to know in advance what’s needed to render a gadget, thus providing opportunities for caching, embedding, or aggregating the assets:

<Assets>
    <Asset Type="js" Src="http://example.com/twitter.js" />
    <Asset Type="css"  Src="http://example.com/twitter.css" />
...

Another thing that stood out is the lack of user experience management. Facebook went a long way (they invented their own language!) to keep the consistency of the user interface by offering common primitives, like profile action or the freshly baked board. Walking from application to application, you can easily see where the primitives end and developer’s own creative aspirations begin (and believe me, in 8 cases out of 10, it ain’t pretty). But at least they tried. The only thing that Gadget API has in this regard is handling of user preferences. That’s it. The containing IFRAME is essentially an open canvas. This is something that has to be addressed, especially considering that some partners in the alliance are pretty good about keeping their UX noses clean.

I hesitate to draw any sort of conclusions in regard to direction or viability of the project. Obviously, this is a very early developer’s preview, where it’s perfectly acceptable to come across matchsticks and duct tape. As it stands right now, OpenSocial is certainly not as well-oriented and focused as Facebook, and Orkut doesn’t make a good sandbox container, because… well, let’s just say it won’t win any usability awards. And certainly not visual design awards. Even with that, I can see fairly clearly what Google wants to become: they want to be the social networking plumbing. Just like their search became the Internet for many users, I can speculate that Google hopes to offer free, ubiquitous, and highly mashable pieces of infrastructure that power the majority of person- and community-centric software on the Web. Ultimately, I don’t believe it’s a move in a game of chess, but a tiny step in a strategy that reaches much farther and wider than everyone’s favorite blue-shaded time waster.

Slides from my IPSA presentation on HTML5 and Google Gears

Today, at the monthly IPSA meeting, I gave a presentation on Google Gears and the client-side storage part of the HTML5 spec. As promised, I uploaded the slides to this blog.

… Yes, I am going slide-less from now on. Jeff Keeton and I have already done a couple of browser tabs-only presentations before, and the simple method works as well as or better than the slides.

HTML5 Wrapper for Gears

This is possibly the lowest-hanging fruit ever. After WebKit folks released their implementation of the HTML5 SQL storage specification, Gears immediately became the odd man out, the non-standards-compliant implementation. Never you mind that this part of the spec is still basically a twinkle in Hixie’s eye. By adding the SQL bit into the latest WebKit nightlies (Windows users are still blissfully unaware of this, by the way), the legendary Apple team put a symbolic stake into the metaphoric ground: this better not be moving anywhere.

So I thought, why not connect the dots? Gears is a capable and much more mature implementation, and both WebKit and Gears use the same SQLite engine, so it should be just a matter of writing a simple wrapper to bring Gears back into the standards fold.

So here they are, WebKit’s Stickies running in Firefox with Gears, using a tiny bridge script I wrote (see the screenshot on Flickr if you aren’t inclined to install Gears). If you’re running Firefox 3, you may actually be able to write on those stickies. For Firefox 2 users, the lack of contenteditable support means that you can just create, move, and delete the notes. IE users are SOL’d, because the Stickies sample uses DOM Level 2 events and other happy standards goodies. Do not let this stop you from using the bridge itself, though: it works in IE just peachy-fine.

The script, as I mentioned before, is laughably small and oh yes, incomplete. But I figured, the openDatabase and executeSql support take you about 80% of the way for most Web application development needs, and should the need for the other 20% arise, I would gladly oblige and be your code monkey.
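To give a feel for how small such a bridge can be, here is a minimal sketch, not the actual script. It assumes a Gears-style database object whose execute(sql, args) returns a result set with isValidRow(), next(), close(), fieldCount(), fieldName() and field() (that is how I remember the Gears Database API), and it exposes a callback-based executeSql in the spirit of the HTML5 draft. The names wrapGearsDatabase and rowsFromResultSet are made up for illustration.

```javascript
// Copy every row of a Gears-style result set into plain objects.
// Assumes the result set exposes isValidRow/next/close/fieldCount/fieldName/field.
function rowsFromResultSet(resultSet) {
  const rows = [];
  while (resultSet.isValidRow()) {
    const row = {};
    for (let i = 0; i < resultSet.fieldCount(); i++) {
      row[resultSet.fieldName(i)] = resultSet.field(i);
    }
    rows.push(row);
    resultSet.next();
  }
  resultSet.close();
  return rows;
}

// Hypothetical bridge: wraps a Gears-style database (anything with
// execute(sql, args)) and exposes a callback-based executeSql.
function wrapGearsDatabase(gearsDb) {
  return {
    executeSql(sql, args, onSuccess, onError) {
      let rows;
      try {
        rows = rowsFromResultSet(gearsDb.execute(sql, args || []));
      } catch (error) {
        // Gears reports failures by throwing; hand them to the error callback.
        if (onError) onError(error);
        return;
      }
      // Shape the result like the draft's result set: rows.length, rows.item(i).
      if (onSuccess) {
        onSuccess({ rows: { length: rows.length, item: (i) => rows[i] } });
      }
    },
  };
}
```

Since Gears calls are synchronous, the callbacks here fire before executeSql returns. A more faithful bridge would probably defer them with a timer so that calling code behaves the same way it would against a truly asynchronous implementation.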

Gearites and Safaritans, if you feel like I encroached on any of your wonderful work by creating this frankenstein, please let me know and we’ll sort this out.

My iPhone SDK

So, my good friend Steve spills the beans about the upcoming SDK. The crowd screams like girls at a Beatles concert. Even ever-curmudgeonly Dave ekes out a “hooray”. And that’s a big deal, my reading acquaintance. Now, I don’t own an iPhone, but I play with one at the local store every other week. And I also am a gearhead, as we’ve discussed before. And I watch the google-gears and WHATWG groups like a hawk whenever there’s talk about gear-related stuff.

With all that, it is only logical to assume that I am siding with Chris on the mournful thoughts for the Web-based application development on the delectable slice of glass-sided aluminum. Because in my opinion, Safari 3 is only five tiny steps away from being a full-fledged application platform.

These steps are:

The “nice-to-haves” and “coming-soons” may include direct TCP network connections, history and browsing context management, and other neat things from the HTML5 spec, but those first five are what could really make Web applications first-class. Think about it.

HighEdWebDev 2007

The HighEdWebDev 2007 conference begins tomorrow and I am all ready to go. This is my second year attending this gathering. Last year, I was sitting at the vendor table (Hello, would you like to buy a pound of CMS?), guest-blogging at Collegewebeditor.com, and that’s about it. This year, I (thankfully!) will not be vending software goods, but I did sign up for guest-blogging. In addition, Jeff Keeton and I will have a post-conference Workshop 2.0, a boundary-busting, mashpit-flavored jam session of collaborative learning.

We’ll start with breaking down the what’s, the who’s, and the why’s of the gnarly beast we all lovingly call Web 2.0. We’ll talk about the traits and the hype, the comers and the goners, but please pelt us with rotten iPhones if we spend more than 20 minutes doing that.

Next, we’ll conduct an exclusive virtual (keep your fingers crossed for Skype) expert panel, discussing what the future holds in store for the Web of tomorrow. The panel will feature several gurus and senseis of the Web, including our own, battle-tested Mark Greenfield and Web 2.0’s uncle Chris FactoryJoe Messina. We tried to get Father Tim, but Twitter was down at the time.

In another 20 minutes, we’ll pull the plug on the Skypecast, claiming technical difficulties, and get to the most exciting part of the shop: the hands-on. In the remaining 2 hours, we will venture to build a new site. Together. Not to compete with the CSS experts next door, we’ll skip the part of finding fun rendering bugs in IE, but rest assured, we’ll follow the process from start to finish. Please bring your smocks: this is going to get messy.

Armed with our shockingly brilliant collective intelligence, our Team 2.0 will pick the site project, brainstorm goals, audiences, creative concept, doodle wireframes and even say words like AJAX and Controlled Vocabulary. Together, we’ll argue about pros and cons of promotion techniques, mull over minute details of information architecture, decide on underlying technology and discuss details of implementation while the Photoshop geeks in the corner will be quietly cooking rounded corners, gradients and drop-shadows.

Throughout, we’ll use social networking tools to facilitate the architecture, development, and promotion of the site, making sure to squeeze the most utility out of the Web 2.0 hype. We probably won’t walk away with a real site. But we sure will try. And in the process, we’ll capture and cherish the spirit of working together and having a good time.

Call Me Gearhead

So, I am playing with Google Gears and I’ve got to tell ya, Gears are brilliant.
And the farther you look, the shinier it gets.
Why? Because Gears fulfill the ultimate fantasy of any Web developer in a very
radical way: they cut into the browser (IE and Firefox, currently), deep and
wide, introducing themselves inconspicuously as a DOM facility
(google.gears.factory – hah, clever fellers, aren’t they?). And they bring…
well, I think they bring a bit more clarity into this whole murky future Web
thing. Oh noes, did Dimitri finally cross the line from a grouchy nerd to a
full-blown pundit and start predicting the future? Well, let’s hope not. At
the very least, color me intrigued.

If you’re not familiar with Gears, this blog is probably not the place to learn
it, but basically, it breaks down to three things: local storage
management, isolation/threading model, and cross-origin data and resource
access. Yeah, see? It’s not even the same breakdown as on their site. But
that’s why it’s here on my blog: I can skim over the details
and get to the good stuff. And good stuff it is. I am surprised that the
reaction to Gears is so muted, because they attack the status quo of Web
application (heck, any Web site) development with a deft balestra that
rivals pretty much anything I’ve seen on the market since, well, since
XMLHttpRequest (only took 5 years to «discover» that one, eh?). Here are just a
couple of exciting possibilities that come to mind after playing with the
0.2 code.

Client-side Composition

Up until now, whether we are eager to admit it or not, DHTML applications (and I
am intentionally classifying SilverFlash out of the picture here) had this
distinct flavor of a dumb terminal (Crockford says 3270, and it’s catchy.
Besides, I still have a 3812 in my basement. Makes excellent ballast). The
server had to pretty much “print” down the wire the entire snapshot of a page in
a tasty soup of HTML. And that’s on every request.

Google pushed the envelope on client-side composition with GMail and some people
took notice, but by and large, server-side composition reigns supreme. Come to a
page on the Web, and for each ounce of content there is a bucket of context:
navigation, branding elements, context-sensitive link lists, spotlights,
testimonials, you name it — all repeating from page to page, racking up
bandwidth, using up server cycles needed for complex frameworks to sift, sort,
transform and align resources into the darned HTML snapshot. And that’s on
every request, my teary-eyed readers. I would join you in your sorrow for
all this wasted energy, but I must finish this song… er, post.

See, with server-side composition, it is up to the server to determine the
context of the requested page and generate markup that puts those links and
other context elements together with the content of the page. Let’s isolate this
effort into a separate functional component, and call it the context
engine. To summarize, content engine retrieves content of the page, context
engine evaluates content and mashes it with whatever seems relevant, and hands
it off to be served to the user agent. Funny fact: most modern content
management systems are in fact context management systems. Content engine
is dumb and simple (fetch a page, duh). Context engine is complex, nontrivial,
and a gargantuan resource hog.

With Gears, you can finally have proper client-side storage (not the
cookie-based monster), and thus you can have proper client-side state. And thus
you can implement proper client-side composition. Which means that aside from
the obvious “offlining” of frequently-used, but rarely-changed assets
(static pages, scripts, images, stylesheets) using LocalServer, you can
actually move the context engine to the client-side… Hey buddy, an example
wouldn’t hurt, m’kay? M’kay.

Let’s suppose that on your site, most pages have a sidebar displaying a list
of upcoming events relevant to the page. With a server-side context engine, each
page arrives as a blob of markup generated by that engine. With a
client-side context engine, the page only contains:

  • content and optionally, a list of keywords (tags) that describe the
    meaning of content
  • URL to the event feed

Upon loading of the page, the client-side context engine kicks in:

  1. It checks to see if local event repository exists, and if it doesn’t,
    creates one by fetching all events from the provided event feed URL.
  2. It evaluates content/tags against the local repository of events and
    adds relevant events to the sidebar
  3. Makes “get added/updated since” requests to the event feed and updates
    content repository, as well as the sidebar
  4. Removes older events from the repository
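
The steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the repository is a plain in-memory Map rather than Gears local storage, fetchEvents(since) is a hypothetical function standing in for the event feed (it returns everything when since is null, and only the delta otherwise), and events are assumed to carry id, tags and a numeric date.

```javascript
// Sketch of a client-side context engine. A Gears-backed version would keep
// the repository in the local database instead of this in-memory Map.
function createContextEngine(fetchEvents) {
  const repository = new Map(); // event id -> event
  let lastSync = null;          // marker passed to the feed on the next sync

  return {
    // Steps 1 and 3: the first call fetches everything (since === null),
    // later calls ask only for events added or updated since the last sync.
    sync(now) {
      for (const event of fetchEvents(lastSync)) {
        repository.set(event.id, event);
      }
      lastSync = now;
    },

    // Step 2: an event is relevant if it shares at least one tag with the page.
    relevantTo(pageTags) {
      return [...repository.values()].filter((event) =>
        event.tags.some((tag) => pageTags.includes(tag)));
    },

    // Step 4: drop events dated before the cutoff.
    prune(cutoff) {
      for (const [id, event] of repository) {
        if (event.date < cutoff) repository.delete(id);
      }
    },
  };
}
```

On each page load, the page script would call sync(), render relevantTo() with the page's tags into the sidebar, and prune() whatever has expired. Tag overlap is the crudest possible relevance test; anything smarter slots into relevantTo without touching the rest.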

When this user goes to the next page, the server doesn’t do anything but serve
content. For the duration of the session (or perhaps some pre-defined interval),
the client-side context engine provides relevant events from its local
repository. The happy, unburdened server sends sloppy kisses to Gears. No
wonder, because now, the only task that requires any meaningful computation is
serving the list of events added or changed since specified date or revision
marker. Twitter boys, this one’s for you.

Decoupling context engine from the server not only makes server’s work easier.
It also makes context engine server-independent. Who says that I should only
pull events from this one server? Why not pull them from that popular news
outlet, online local events site, or Google calendar?
Schmancy-Schmashups,
here we come!

Even cooler, the client-side context engine is far better suited to keep
track of user browsing habits, generating a personal taxonomy or tag cloud, and
taking it into account when evaluating relevancy. Am I the only one who gets
goosebumps thinking about the opportunities?

Worker as Service

While playing with Gears’ 0.2 bits, I realized that cross-origin HttpRequest and
createWorkerFromUrl() introduce a better, more modular and more secure way of
building public JavaScript APIs. In this new release of Gears (developer-only,
for now), the
worker can be loaded from a URL, and this URL does not have to originate on the
same server as the document, in which the worker is created. In essence, you can
load and run a script from another server in a completely isolated context,
and you can exchange messages with this script. And this script
can make HttpRequest calls back to that server. It only takes a small
logical step to see that this script can expose the public API of a Web
application, located on that server, via WorkerPool messaging.

Let’s pretend that I have a Web site that wants to use Google Spreadsheet as
a table listing some goods for sale. Here’s how I would connect to the API using
JavaScript (wildly pseudocoding):


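	// Minimal message envelope (hypothetical helper, not part of Gears)
	function Message(command, params) {
		this.command = command;
		this.params = params || {};
	}
	Message.prototype.toJSONString = function() {
		return JSON.stringify(this);
	};
	Message.fromJSONString = function(text) {
		var data = JSON.parse(text);
		return new Message(data.command, data.params);
	};

	var wp = google.gears.factory.create("beta.workerpool", "1.1");
	// create message
	var getRows = new Message("getRows");
	// ... perhaps create more messages, with parameters or not
	// set up message handler
	wp.onmessage = function(text, id) {
		var message = Message.fromJSONString(text);
		// API initiates communication
		if (message.command == "ready") {
			// reply to the worker that sent the "ready" message
			wp.sendMessage(getRows.toJSONString(), id);
		}
		// ... more message processing
	};
	// finally, kick-start the whole thing by loading the API
	// URL is fictional, of course
	var api = wp.createWorkerFromUrl("http://google.com/api.js");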

Yeah, I am skipping lots of details, but I hope the concept is obvious: the
Spreadsheet API handler is loaded as a worker, and the page can then use this
API by exchanging a documented set of messages. No need to knit
double-frames or server-side proxies. It just works.

Because the worker is isolated, we can set up more secure authentication and
increase authentication granularity by accepting only certain messages,
depending on the identity. Also, worker runs as a separate thread, which means
we can do other things while the data is cooking. If it were up to me, I’d be
staying up all night converting all Google API endpoints to this model and
developing a good message exchange protocol. Though I might be getting too old
to stay up all night without dire consequences.
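
To make the “accepting only certain messages, depending on the identity” part a bit more concrete, here is a rough sketch of what the worker side could do with each incoming message. Everything in it is an assumption for illustration: the createDispatcher name, the { command, params } message shape, and the idea that the worker has already resolved the sender to an identity string. It is plain JavaScript with no Gears calls, so the permission logic can be tried on its own.

```javascript
// Hypothetical worker-side dispatcher. `handlers` maps command names to
// functions; `allowedByIdentity` maps an identity to the Set of commands
// that identity is allowed to send.
function createDispatcher(handlers, allowedByIdentity) {
  return function dispatch(messageText, identity) {
    const message = JSON.parse(messageText);

    // Reject anything this identity has not been granted explicitly.
    const allowed = allowedByIdentity.get(identity);
    if (!allowed || !allowed.has(message.command)) {
      return JSON.stringify({ command: "error", reason: "forbidden" });
    }

    const handler = handlers.get(message.command);
    if (!handler) {
      return JSON.stringify({ command: "error", reason: "unknown command" });
    }

    // The reply is a JSON string again, ready to be sent back to the page.
    return JSON.stringify({ command: "result", data: handler(message.params) });
  };
}
```

Inside a real worker, the onmessage handler would pass the incoming text through dispatch and send the returned string back to the sender. Keeping the allow-list in the worker rather than on the page matters here: the page cannot reach into the worker's isolated context to widen its own permissions.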

Worker as Module

The reverse of the worker-as-service model is worker as module. In this case,
the application accepts cross-origin worker registration via UI, allowing the
workers to participate via message exchange. For example, Google Reader could
let users add plug-ins by entering the URL of the API.
The URL could be an HTML document with some simple markup, referencing the
JavaScript file that contains the plug-in. Easy-peasy. And beautiful.

Thinking Outside of The <object> Box

Finally, let’s pause for a second to ponder the way Gears is implemented.
Instead of building their own Eden inside of an object node, Gears hook up
directly in DOM, without creating a scripting language runtime and certainly
not a new declarative presentation format. They organically extend HTML DOM
space and DHTML developer’s horizon. What’s more, the Gearites openly commit to
making (and diligently follow through with) a good effort of helping bring
this extension into the new HTML spec. And I like this thinking.

Margin Marks UI Concept

Summary

Margin marks is a user interface concept that aims to expose microformats on a web page in a way that’s intuitive, useful, and positionally relevant, yet has minimal interaction with the page presentation. This concept can also be extended to emphasize other, typically invisible aspects of the content, such as fragment identifiers and classes, and even to let users add their own marks. You can go ahead and just look at the pictures if you don’t feel like reading.

Motivation

I’ve been following the thought process of microformats UI in Firefox 3 as documented in Operator’s functionality, Alex’s blog, and the uf-discuss list. It’s been exciting to think about the power of microformats and their consumption potential being built into the browser, and as such the decisions about the user interface exposing this power certainly carry a lot of weight. The greatest problem, as it appeared to me, was exposing content marked up with microformats in a way that does not interfere with the page presentation, while at the same time providing a comfortable and immediately useful experience for the users. Mike’s current experiment, the Operator, has cool ideas and lots of configurable options, but it still left me wanting something more. Primarily, my holy grail was positional relevance of the consumer user interface to the actual marked-up content. Looking at pages like twitter.com and my own blog comments, I realized that a page with a lot of microformatted content practically begs for positional correlation between the Operator’s action drop-downs and the page itself. That’s how the margin marks came along.

General Concept

The margin bar is a vertical pane that is shown on one side of the browser window. Whether it’s on the left or on the right may be configurable by the user. The contents of the margin bar are vertically attached to the page, so that when the page contents are scrolled, the margin bar contents are scrolled as well. Visually, it’s an extra margin to the page that is controlled by the browser, not page presentation (hence, the margin bar). The margin bar can be visible or hidden, as desired by the user. Naturally, open should be the default state.

The margin bar is narrow, with minimal impact on the width of the browser window. The information provided is hint-like, abbreviated down to icons and perhaps numeric indicators. Visually, it’s a set of glyphs, each positioned alongside the start of the relevant content fragment. These glyphs are margin marks. A margin mark identifies the vertical position of a content fragment in the margin bar. The mark can be visually presented as an arrow or any other sort of pointer with an icon on it.

Grouping Marks

In situations when more than one mark occupies the same space, the marks are combined into one mark that visually identifies multiple items and shows the number of combined marks. The icon associated with the top-most mark is displayed.
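
Here is one way the grouping rule could work, as a rough sketch rather than a spec. It assumes each mark is an object with a top offset in pixels and an icon, that every mark is drawn at the same markHeight, and that “occupying the same space” means starting within one mark height of the group’s top-most mark. All of these names and rules are mine, for illustration.

```javascript
// Combine marks whose glyphs would overlap vertically into a single group mark.
// Each group carries the position and icon of its top-most mark plus a count.
function groupMarks(marks, markHeight) {
  // Work top to bottom without mutating the caller's array.
  const sorted = [...marks].sort((a, b) => a.top - b.top);
  const groups = [];

  for (const mark of sorted) {
    const current = groups[groups.length - 1];
    if (current && mark.top < current.top + markHeight) {
      // This mark would be drawn over the current group: fold it in.
      current.marks.push(mark);
      current.count += 1;
    } else {
      // Enough room: start a new group led by this mark.
      groups.push({ top: mark.top, icon: mark.icon, count: 1, marks: [mark] });
    }
  }
  return groups;
}
```

With a 16-pixel mark height, an hCard at 100 and an hCalendar event at 110 collapse into one group showing the hCard icon and a count of 2, while a feed mark at 200 stays on its own.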

Mark Actions

Each mark may have one or more actions associated with it, with one action designated as default. Configuring the actions is part of the browser preferences UI. It is possible that the action may have an icon associated with it. For instance, if the action is to add an event to the Microsoft Outlook calendar, the Outlook icon is displayed in the mark, rather than a generic address card. However, this may introduce more confusion, given the diversity of platforms and applications that may be potentially invoked by the users.

Mouse Navigation

When the user hovers the mouse over the mark, the details window is revealed. Moving the mouse off the mark closes the details window. Clicking on the mark invokes the default action. Visually, default action is placed at the top of the details window, so hovering and clicking are intuitively connected: the user does not need to make any further mouse movements to invoke the default action. Hovering the mouse over a group opens the group: the marks in the group are lined up in the bar vertically, allowing the user to explore the marks within the group. Admittedly, this is not very elegant. Perhaps you could come up with a better idea.

Keyboard Navigation

Margin bar participates in the browser chrome tab cycle, preferably placed immediately before the page. Also, there should be a keyboard shortcut to bring keyboard focus into the margin bar. Once the bar acquires keyboard focus, the top-most mark gains it automatically. Then, the following keyboard events are recognized (this list is just a suggestion and food for thought):

  • Down Arrow — move to next mark
  • Shift-Down Arrow — move to next mark within the group. If at the end of the group, move to next mark
  • Up Arrow — move to previous mark
  • Shift-Up Arrow — move to previous mark within the group. If at the beginning of the group, move to previous mark
  • Space — scroll the page down and jump to the first mark in the visible span of the page
  • Enter — invoke mark/note action
  • Tab — go to the browser window
  • Shift-Tab — go to the previous item in the tab cycle
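
The arrow-key part of that list is small enough to sketch as a pure function. The model is my own reading of the list, not a settled design: the bar is an array of entries, each entry is an array of one or more marks (more than one means a group), and focus is a pair of indexes. Plain arrows step between entries, Shift-arrows step through a group first. Landing on the last member of the previous group after Shift-Up is an assumption the list does not spell out.

```javascript
// Compute the next focus position in the margin bar.
// bar:   array of entries; each entry is an array of marks (a group if length > 1)
// focus: { entry, member } indexes into bar and bar[entry]
// key:   "ArrowDown" or "ArrowUp"; shift: whether Shift is held
function moveFocus(bar, focus, key, shift) {
  const { entry, member } = focus;

  if (key === "ArrowDown") {
    // Shift-Down walks through the group before leaving it.
    if (shift && member < bar[entry].length - 1) {
      return { entry, member: member + 1 };
    }
    // Otherwise go to the next entry, or stay put at the bottom of the bar.
    return entry < bar.length - 1 ? { entry: entry + 1, member: 0 } : focus;
  }

  if (key === "ArrowUp") {
    // Shift-Up walks back through the group before leaving it.
    if (shift && member > 0) {
      return { entry, member: member - 1 };
    }
    if (entry === 0) return focus; // already at the top of the bar
    const previous = entry - 1;
    // Assumption: Shift-Up enters the previous group at its last member.
    return { entry: previous, member: shift ? bar[previous].length - 1 : 0 };
  }

  return focus; // any other key leaves the focus where it is
}
```

Space, Enter and Tab are left out on purpose: they depend on scrolling and browser chrome, which a pure function like this cannot see.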

Aural Presentation

Ideally, when used with a browser that is equipped with voice-reading software, such as JAWS, the user interaction should occur as follows:

  • When the margin bar gains focus, the reader announces: X marks on the page. Mark One. Type: Address Card. Name: Rulon Oboev… and continues reading the mark contents
  • Using arrows, the user can move between the marks. Upon each move, the reader announces the sequential number of the mark, its type and contents.
  • After reading contents, the reader announces each action as a link.
  • In addition to standard actions, the “Go to content on page” action is added after the default action.

Microformats Marks

Whenever microformat markup is encountered on page, a mark is placed on the bar at the current vertical position of the starting element of the markup fragment. Should the position change as a result of DOM operation or changing geometry of the page, the mark changes the position accordingly. This may be difficult to implement, so an acceptable solution would be to detect detachment (position change) and somehow change the appearance of the mark to no longer “point” to a place in content. Each mark contains a distinctive icon of the corresponding microformat (address card icon for an hCard, calendar icon for hCalendar event, etc.).

When the mark is hovered over, the microformat data is presented as a complete note, perhaps using a metaphor relevant to the specific microformat. For example, the hCard could be rendered as a Rolodex card, and an hAtom entry would probably be best presented as a yellow-pad note, a common visual hint of a blog post.

Other Types of Marks

One can also easily extrapolate the use of the margin mark to other types of page metadata. For instance, a mark with a feed icon may be placed whenever a feed is encountered on the page. Usually, these would be at the top, but should there be an a element with the type attribute of application/rss+xml, the mark would be placed accordingly there, too.

Also, the marks could be used to provide a UI to unobtrusively identify HTML elements with an id attribute (HTML fragments). Other uses may include tracking a set of user-specified elements, attribute values, or content (mark everything containing “microformats” on the page).

User Marks

It would be really interesting to let users add their own marks to the page, perhaps by clicking (or right-clicking) on the bar, as a way to annotate the page. As users add a new mark, they can fill in the fields in the provided dialog box. Typically, this would be a simple note (an hAtom entry), but one can envision adding reminders (an hCalendar event), contact information (an hCard, perhaps re-purposing non-microformatted content from the page), or other types of content. After the mark is added, it is persisted within the browser.

Persistently and reliably identifying the location of a user mark is a potential implementation challenge. Since it is not known when or how the content of the page will change before the next visit, a visual equivalent of “um... somewhere around here” may be applied: if the browser cannot identify the precise location of the user mark, an extra hint (a question mark, maybe, or a spatial glow/spread to signify uncertainty in position) is added to the mark. When this hint is present, the pointer line is not displayed.
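
One way to get the “somewhere around here” behavior, sketched in Python under heavy assumptions: store a snippet of the text the mark was attached to, then on the next visit compare it against the page's text blocks with difflib from the standard library. The locate_mark function and both thresholds are invented for illustration, and a real implementation would likely combine this with element ids or DOM paths.

```python
from difflib import SequenceMatcher


def locate_mark(anchor_text, blocks, exact=0.95, fuzzy=0.6):
    """Find the text block a stored user mark most likely belongs to.

    Returns (block_index, certain). certain is False when the match is
    only approximate, which is the cue to draw the uncertainty hint and
    to hide the pointer line. Returns (None, False) when nothing is
    similar enough. The two thresholds are arbitrary guesses.
    """
    best_index, best_score = None, 0.0
    for index, block in enumerate(blocks):
        score = SequenceMatcher(None, anchor_text, block).ratio()
        if score > best_score:
            best_index, best_score = index, score
    if best_score >= exact:
        return best_index, True
    if best_score >= fuzzy:
        return best_index, False
    return None, False
```

If the paragraph was lightly edited since the mark was added, the similarity lands between the two thresholds and the mark gets its question mark; if the paragraph is gone entirely, the mark has nowhere to point at all.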

Other Random Thoughts

Taking one step further brings us to the ability of the browser to communicate with the server when new user marks are added or deleted. Using some simple detection scheme, a browser could recognize that the page accepts mark updates and send newly added marks to the server transparently. An existing blog comment API with some positional extensions could be used or a new protocol could be proposed. I’ll let you figure out what would be best here.

When the margin mark is hovered over or has focus, an additional visual hint could be introduced: a point on the page where the relevant content begins and a horizontal line, connecting it with the mark, like a laser pointer. This could really address the issues of positional relevance.

When the page has more microformatted content beyond the current scroll view, a teaser hint could be shown at the bottom or top of the margin bar (an arrow of some sort?) to indicate that there’s more crunchy markup above or below the currently visible portion of the page.
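
The teaser logic itself is tiny. A minimal Python sketch, assuming the marks' vertical positions and the scroll geometry are already known as plain numbers (the teaser_hints name and its parameters are hypothetical):

```python
def teaser_hints(mark_tops, scroll_top, viewport_height):
    """Return (show_top_hint, show_bottom_hint) for the margin bar.

    mark_tops holds each mark's vertical position on the page,
    scroll_top is how far the page is scrolled, and viewport_height is
    the height of the visible portion.
    """
    visible_bottom = scroll_top + viewport_height
    above = any(top < scroll_top for top in mark_tops)
    below = any(top >= visible_bottom for top in mark_tops)
    return above, below
```

With marks at 50, 900 and 2400 on a page scrolled to 800 in a 600-tall viewport, both hints would be shown: one mark sits above the view and one below it.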

The margin bar could also have an expanded state, in which it shows details along with the marks. I originally had this in the concept, but I instinctively felt it makes the whole thing too complicated.

Inspiration, Disclaimer, and Licensing

This concept is inspired by the entire super-awesome premise of microformats and the great people around them, by Alex Faaborg’s post on Firefox 3 microformat UI concepts, Mike Kaply’s ground-breaking Operator extension, and quite obviously, Jack Slocum’s blog comment system.

I am not a browser developer and honestly do not know how much effort it would take to implement something like this. I did take a brief stroll in the Mozilla trunk and soon realized that one cannot evaluate implementation feasibility by just taking a brief stroll through the code of a browser. I am positive this can be done completely in JavaScript, and thus assume that the feasibility is pretty high.

Should anyone find this concept, in full or in part, useful, inspiring, and/or worthy of implementation, I release it as public domain. I think that it would be awfully splendid of you to mention my name, even if somewhere deep in the comments of your shiny new toy. Or maybe bake me a low-carb cake. Or a MacBook Pro. But I won’t insist.

The Horrific Markup of Live Spaces and Possible Explanation of Dare not Getting Microformats

It’s a Friday surprise! After reading this post by Dare Obasanjo, I dutifully followed the links in the article. Upon stumbling on his Live Spaces friends page, I instinctively hit Ctrl+U to peek at the source code. Ow! Ow! My eyes! My tired bespectacled eyes!

Dear readers (yes, all 2 of you — Privet, Mom!), let’s all hold hands and stand in a circle. Let’s promise ourselves to never look at that pitiful, congealed elephant-man, malignant-growth of a code ever again. It’s just better that way. And you, Spaces developers… Well, shame on you. You should know better than dumping this crap on the Web. No wonder poor Dare can’t get screen-scraping out of his mind: he probably hasn’t even seen good semantic markup, much less realized its benefits. Have you?

Oh, and Dare, bad call on the friends.get example. You should probably use fql.query to get something more useful than a list of friend UIDs for any sort of social network portability. But I’ll leave you alone with your point of view on what can and cannot be an API.