Back from WWW2008

It happened once again. The most important conference on the Web was hosted in Beijing, China.

Fortunately, I was able to attend it once again, but only after an entire week on vacation. I have to say that I loved China. A huge country, filled with great, kind people. I had three buddies with me on this trip, which was great!

Shanghai. For those who know (and like) New York City, you will feel comfortable in Shanghai. A very cosmopolitan city, filled with skyscrapers. Lots of excellent street food (even for a vegetarian like me). Lots of bargaining, which I translated into a shiny new Canon EOS 40D for half the price, and a brand new Sony Cyber-shot T300. You can see their quality in my Flickr account.

Beijing. The great city in the north. A HUGE city, may I say. Had the pleasure of travelling to the Great Wall, more specifically to the Mutianyu section.

Conference days. I presented at W4A, where I received the best paper award. Great news! On the second day I presented at WebEvolve, the Web Science Workshop. The next three days were spent enjoying the main WWW conference, with excellent keynote speeches, excellent papers (some Google guys presented a PageRank for images there), and excellent food.

I strongly recommend that everyone go to China when possible. And WWW, well, it continues to excel in both research and industry.

On full screen apps

Recently, a new breed of full screen applications has emerged and, from the feedback that reviewers and users have been giving, it seems to work.

A few months ago I bought one of those Apple MacBooks that so many people love (me too, btw). Over the last couple of weeks, while searching and trying out new software stuff, I've watched the growing number of applications for OS X that provide a full screen user experience.

Yes, full screen apps are not new. We've had them for a long time: games and a bunch of screen real-estate hungry applications (such as 3D modeling or movie playing apps). But, apart from those, going full screen meant going against the traditional GUIs of all operating systems. Window managers, APIs, usability guidelines, etc., have been tightly tied together in order to deliver better user experiences. Whether we talk about applications, windows, or documents, all UI metaphors are evolving towards interacting with several windows in order to accomplish a given task. We do it every day with our beloved and almost impulsive Alt/Command+Tab task switching.

Even with OS X's window manager, everything is centred around easily finding a window, viewing them all, etc., with all the fanciful animations of Exposé. So, what happened to users that made them start demanding full screen applications?

Back to OS X applications and their... fullscreen-ness, I've come across applications that treat full screen as a competitive advantage over their competitors, such as: Adobe's Lightroom; Apple's Logic, iTunes on cover flow mode, OS X's Dashboard, or Front Row; Scrivener; Writeroom; and the list goes on, and on, and on...

But why?

Well, from this small list of applications we see that some lean towards multimedia (iTunes et al.), others to media editing (Lightroom, Logic), or to text editing (Scrivener and Writeroom). From crawling through blog posts, mailing list archives, tech news sites, etc., I believe that this new breed of applications fills an important usability gap: attention. With the increase of screen real estate and better window managing, users are bombarded with information coming from everywhere: Growl notifications, blinking or jumping instant messaging icons, etc. This is, IMHO, increasingly overwhelming to users. Users are getting tired of this.

That's where full screen comes along.

Full screen applications tend to be more immersive. The user is simply centred on fulfilling a task. Nothing gets in the way. No distractions. No bells and whistles around. Consequently, productivity increases. It seems to make sense, doesn't it? I like it and use it, a lot.

Well, according to some reports elsewhere, Apple is bringing support for developing full screen apps into OS X's latest installment, Leopard (a.k.a. 10.5), thus making obsolete a few full screen hacks that one can find with a bunch of Google queries. Good for us, developers and users.

Finally, as this blog is geared towards Web related user experience, here's a thought. Do we see the clash between the Web's multi-window/page/site hyperlinkingness and full screen metaphors as a weakness when we begin to think about applying this concept to Web browsers, or even to the way websites are designed and how users should interact with them?

Some time ago, I prototyped a Web multimedia database/hyperbase, whose user interface was specifically optimized for a full screen Web environment (through a Firefox full screen add-on), with good results. Maybe this will be a trend for future Web based applications.

I'll think about it more seriously. For sure.

Investigatione miscellanea

Despite not posting a lot of stuff in the past weeks (apart from the smallish post about OpenID), I've been quite busy working on my PhD - apart from a small vacation hiatus in the last bunch of days.

In a nutshell, my PhD hypothesis says that, in order to cope with rich interaction scenarios on the Web (i.e., everything that is not a common user with a desktop computer with a typical Web browser and a mouse), one must center their focus on characterizing the set of Web Interaction Environments (WIEs) that one's interested in.

Currently, I've defined a way to specify WIEs (and an accompanying prototype modeling tool) - more on that later on (I guess I'll have to create a wiki page about it on my research group's website). I've also sketched a way to reverse engineer existing websites and Web applications by combining ConcurTaskTrees with traditional hypermedia/Web engineering modeling practices (such as OOHDM or W2000), which allows leveraging accessibility and usability in a pragmatic way. Also, this practice may (can?) be used to fully model highly dynamic websites and Web applications for a given set of WIEs (this is yet to be proven).

However, now it's time for one more pause to fully recharge my batteries. A long year is waiting for me, from the middle of August onwards, filled with ideation, prototyping, testing, writing, and presenting tasks. See you then.

User Agent detection and segmentation

Web browsers identify themselves with a particular string. It's called the User Agent string. In order to better fit contents to different scenarios, I tried to segment the whole spectrum (or, at least, a big chunk) of user agents, which proved to be a daunting task.

While working on a prototype for an adaptation engine for Web based documents, I came across the need to find out which type of Web browser is requesting a document. Some efforts are being made nowadays to ease this task, such as UAProf and WURFL. However, being pragmatic, ubiquitous availability of these two technologies may take several years. Therefore, the simplest way of doing it today is sniffing the request's HTTP headers and looking for the User-Agent field.

From the HTTP 1.1 specification, Web browsers and other Web user agents (e.g., crawlers) may identify themselves with a specific string on the header of each HTTP request. The production rule for this header is:

User-Agent = "User-Agent" ":" 1*( product | comment )
product = token ["/" product-version]
product-version = token

What's the meaning of this expression? Basically, it states that this header field should start with the string User-Agent:, followed by a sequence of items, each being either a product name (optionally with its version) or a comment. There must be at least one item and at most... infinitely many. Hence, user agents identify themselves with almost arbitrary strings, as long as they comply with the production rules. Headache warning.
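To make the production rule a bit more concrete, here is a minimal Python sketch that splits a User-Agent value into products and comments. The names TOKEN, PART, and parse_user_agent are made up for this illustration, and the sketch deliberately ignores nested comments and quoted pairs, which HTTP/1.1 allows inside comments.

```python
import re

# token: one or more characters that are not controls, spaces, or HTTP separators
TOKEN = r'[^\x00-\x20\x7f()<>@,;:\\"/\[\]?={}]+'

# One item of the header value: either a comment or a product with optional version.
PART = re.compile(
    rf"\((?P<comment>[^()]*)\)"                          # comment (no nesting handled)
    rf"|(?P<name>{TOKEN})(?:/(?P<version>{TOKEN}))?"     # product ["/" product-version]
)

def parse_user_agent(value):
    """Split a User-Agent value into ("product", name, version) and ("comment", text) tuples."""
    parts = []
    for match in PART.finditer(value):
        if match.group("comment") is not None:
            parts.append(("comment", match.group("comment")))
        else:
            parts.append(("product", match.group("name"), match.group("version")))
    return parts

if __name__ == "__main__":
    ua = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-GB; rv:1.8.1) Gecko/20060918 Firefox/2.0"
    for part in parse_user_agent(ua):
        print(part)
```

Even a well-formed string like this one only yields a list of loosely related names and versions, which is exactly why the grammar alone says little about what kind of device is on the other end.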

Despite the huge number of Web browsers available in the market, my adaptation engine should identify them according to their segment, such as desktop browsers, mobile browsers, etc. But, thanks to HTTP's loose user agent rule, putting browsers correctly in their segment is really hard (read: cumbersome, error prone, nearly impossible).

Here's a quick sample of user agent strings from miscellaneous browsers, taken from a huge list found elsewhere:


  • Internet Explorer 7: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; WOW64; .NET CLR 2.0.50727)

  • Firefox 2.0: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-GB; rv:1.8.1) Gecko/20060918 Firefox/2.0

  • Sony Ericsson K610i builtin Web browser: SonyEricssonK610i/R1CB Browser/NetFront/3.3 Profile/MIDP-2.0
    Configuration/CLDC-1.1 UP.Link/6.2.3.15.0

  • Pocket PC Internet Explorer: Mozilla/4.0 (compatible; MSIE 4.01; Windows CE; PPC; 240x320)

Once again, pragmatism tells me that tailoring the Web towards each single device is infeasible. Hence, it should be possible to define different segments and associate each User Agent string with the appropriate segment, through a set of heuristics. Even when WURFL, UAProf, or even more recent work from W3C's Mobile Web Initiative Device Description Working Group becomes widespread, segmenting the Web end-points - browsers - into a manageable set of characteristics will continue to be useful.

Going back to the User Agent strings mumbo jumbo, my initial proposal distinguishes between the mobile and desktop landscapes, and it goes something like this (beware - pseudo-code algorithm):

function user_agent_segment(string ua_str)
{
  switch (ua_str)
  {
    case /MSIE/ except /PPC|PocketPC|Windows CE/:
    case /Gecko/:
    case /KHTML/:
    case /Opera/ except /Mini|Mobile|Wii/:
      return DESKTOP;
    default:
      return MOBILE;
  }
}

The simple, yet crucial, aspect of this algorithm is that it detects desktop browsers first, since the (useful) desktop browser landscape is narrower (in comparison to the Wild West style huge range of User Agents on mobile phones). From there, one may just detect specific substrings.
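For those who prefer something runnable, the pseudo-code above could be translated to Python roughly as follows. This is only a sketch: the DESKTOP and MOBILE values and the DESKTOP_RULES table are my own choices for the illustration, and the patterns are the same naive substrings as in the pseudo-code.

```python
import re

DESKTOP = "desktop"
MOBILE = "mobile"

# Each rule: (pattern that suggests a desktop browser, pattern that overrides it or None)
DESKTOP_RULES = [
    (re.compile(r"MSIE"), re.compile(r"PPC|PocketPC|Windows CE")),
    (re.compile(r"Gecko"), None),
    (re.compile(r"KHTML"), None),
    (re.compile(r"Opera"), re.compile(r"Mini|Mobile|Wii")),
]

def user_agent_segment(ua_str):
    """Return DESKTOP if any desktop rule matches without its exception, else MOBILE."""
    for include, exclude in DESKTOP_RULES:
        if include.search(ua_str) and not (exclude and exclude.search(ua_str)):
            return DESKTOP
    # Anything not recognised as a desktop browser is treated as mobile.
    return MOBILE

if __name__ == "__main__":
    samples = [
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; WOW64; .NET CLR 2.0.50727)",
        "Mozilla/4.0 (compatible; MSIE 4.01; Windows CE; PPC; 240x320)",
    ]
    for ua in samples:
        print(user_agent_segment(ua), "<-", ua)
```

Keeping the rules in a table rather than in nested conditions makes it easy to add new desktop engines or new exceptions without touching the function itself.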

If you're keen on this topic, please feel free to implement, test, extend, and improve the algorithm. My (mid-term) goal is to expand it in order to detect and differentiate mobile phones, ultra mobile PCs, and desktop environments (at least). Also, it could be somewhat interesting to extrapolate the input mechanisms (i.e., modalities) available - e.g., if a mobile phone is detected, we may infer a numeric pad (and possibly arrow/cursor keys) as the available input modality. This way, navigation on a Web site may be tweaked in order to facilitate user interaction, thus improving the user's experience and satisfaction.
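The modality idea could start out as nothing more than a lookup table. The sketch below is purely hypothetical: the segment names, the INPUT_MODALITIES table, and the modalities assumed for each segment (in particular keyboard and mouse for desktops) are my assumptions, not something detected from the device.

```python
# Hypothetical mapping from a detected segment to the input modalities we assume it has.
INPUT_MODALITIES = {
    "mobile": ("numeric keypad", "arrow keys"),
    "desktop": ("keyboard", "mouse"),
}

def infer_input_modalities(segment):
    """Return the assumed input modalities for a segment, or an empty tuple if unknown."""
    return INPUT_MODALITIES.get(segment, ())

if __name__ == "__main__":
    print(infer_input_modalities("mobile"))
```

An adaptation engine could then, for instance, favour short numbered menus when only a numeric keypad is assumed.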