funk rock


Limitations of the WAI-ARIA

March 1st, 2009

As a follow-up to my previous post, I wanted to learn more about the WAI-ARIA spec, so I went ahead and read it. I won’t claim to be an expert, or to have anything more than a basic understanding of the spec at this point, but I was not impressed with what I saw.

Unfortunately, the spec doesn’t seem particularly forward-thinking. I have two main issues with it: first, it’s a very limited extension of the DOM and of what we already have on the web; second, there is no programmatic interface to the accessibility system.

The most obvious downside of assuming that the DOM will be the basic building block of future web applications is that it presumes all web apps will always be structured with DOM trees. But this already isn’t true today. Look at Mozilla’s Bespin editor, which renders text inside a canvas element, or Sun’s Lively Kernel, which implements an entire widget set in SVG. Because there is no underlying DOM structure, these two programs cannot be made accessible under the WAI-ARIA spec without substantial hacks.

What about elements that aren’t part of the visual structure at all? For example, in Cappuccino, every application has a CPApplication object. This object contains information that would undoubtedly be useful to the accessibility system, but because it’s an abstract object and isn’t part of the render tree in any way, it will never be visible to WAI-ARIA compliant browsers.

This leads us to the second problem: not having programmatic access to the accessibility system. This spec is designed to target web applications, but it has taken the same document-based approach as HTML. Applications, by definition, require a programming language, which in the case of the browser is usually JavaScript. It seems shortsighted to develop a standard for applications that uses a fundamentally different technology than the application does.

Even beyond the fact that elements in your program that don’t exist in the DOM tree can’t become part of the accessibility system, not having a programmatic interface means that you can’t dynamically compute accessibility values. Imagine some user interface element that changes frequently, or perhaps is a composite of several other objects. Under the current WAI-ARIA spec, every change to this user interface element needs to be immediately reflected in the DOM. This means a potentially substantial performance hit that is unnecessary for the vast majority of users, and even for people requiring the accessibility feature if they aren’t focused on that element.
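To make the cost argument concrete, here is a rough sketch of the two models. Everything in it is made up for illustration: FakeElement stands in for a DOM node so the example runs anywhere, and the pull-style accessibilityValue method is hypothetical, not part of ARIA or any browser. The only real name is the aria-valuenow attribute.

```javascript
// Stand-in for a DOM element (an assumption for this sketch); it counts
// attribute writes so the cost of mirroring state is visible.
class FakeElement {
    constructor() {
        this.attributes = {};
        this.writes = 0;
    }
    setAttribute(name, value) {
        this.attributes[name] = String(value);
        this.writes++;
    }
}

// Push model, which is what ARIA asks for: every change is mirrored
// into the DOM, whether or not anyone is listening.
class PushSlider {
    constructor(element) {
        this.element = element;
        this.value = 0;
    }
    setValue(value) {
        this.value = value;
        this.element.setAttribute("aria-valuenow", value);
    }
}

// Pull model (hypothetical, not in any spec): the accessibility value
// is computed only when an assistive tool asks for it.
class PullSlider {
    constructor() {
        this.value = 0;
        this.queries = 0;
    }
    setValue(value) {
        this.value = value;
    }
    accessibilityValue() {
        this.queries++;
        return String(this.value);
    }
}

const element = new FakeElement();
const pushSlider = new PushSlider(element);
const pullSlider = new PullSlider();

// Simulate a drag that produces 1000 intermediate values.
for (let i = 1; i <= 1000; i++) {
    pushSlider.setValue(i);
    pullSlider.setValue(i);
}

// A screen reader asks once, after the drag has finished.
const spoken = pullSlider.accessibilityValue();

console.log(element.writes);     // 1000
console.log(pullSlider.queries); // 1
console.log(spoken);             // "1000"
```

The push slider pays for a thousand attribute writes during the drag; the pull slider does accessibility work only when it is queried. Real DOM writes cost more than a counter increment, so treat this as a picture of the shape of the problem, not a benchmark.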

Lack of an actual API also limits the potential uses of ARIA. I can’t, for example, implement my own accessibility tool within the browser. It also causes issues with the event system, and doesn’t add any enhanced functionality for simulating events using accessibility APIs, which could have enabled a lot of advanced automated testing.

These thoughts are a result of my very brief research into ARIA so far, and if anything is technically incorrect, I’d appreciate that feedback. At this point, I’m not sure where to go with ARIA and accessibility. As far as the Cappuccino implementation is concerned, there is enough in ARIA that we can significantly enhance the accessibility of Cappuccino, even if we can’t make it 100% accessible. I suspect the performance implications will be minimal for most UI elements (though I am concerned about text fields), but it’s impossible to say until we actually have ARIA implemented.

I’m interested to hear other opinions on ARIA, especially from those who are developing modern web applications. It would be interesting to know if anyone shares my concerns, and if there is any way to actually address some of them within the current framework of ARIA or perhaps another active project I’m not aware of. I don’t claim to have any answers, just a lot of questions, so please share your thoughts.

Ross at 11:51 am | Posted in Technology, Web | Comments (16)

Accessibility & Degradation in Cappuccino

February 26th, 2009

On Tuesday, we announced Atlas, our new visual layout tool for Cappuccino. I’m incredibly excited about where Atlas is headed, and I’m also glad to hear all the feedback we’ve been getting just from our demo video. One of the things in particular that I’ve read several discussions on is accessibility and Cappuccino, and I wanted to share some thoughts on the topic.

Drew McLellan wrote an interesting piece outlining his concerns about the subject, but I think certain things need to be clarified. First, there’s a difference between accessibility and the availability of JavaScript. Accessibility is about enabling assistive technologies like screen readers to relay information to users with disabilities like vision impairment. JavaScript availability, on the other hand, is about whether or not a user’s browser has JavaScript enabled (or supports JavaScript at all). JavaScript availability is what people are talking about when they talk about graceful degradation. Both of these issues are important, but they need to be addressed separately.

Let me state the obvious: JavaScript availability is a requirement for writing an application in the browser. The reason is simple: writing a program requires a programming language, which HTML and CSS are not. To be more precise, I’m talking about an application that doesn’t rely on the server for all its logic, a truly browser-based application, not a website with a dynamic back end. Not all programs should be or need to be written this way; that is something we readily acknowledge. But some applications only make sense written like this: a presentation editor can’t hit the server on every single move or update or reposition of a slide element; a word processor can’t hit the server every time you need to type a character. I don’t believe this is a controversial statement; it’s a fundamental reality of the web. And it’s something you see not just in Cappuccino, but in any complex web application, from Google’s to Apple’s and countless others.

The second issue is accessibility, and I do believe it’s important. First, to put on my contrarian hat, you have to consider that not all applications can be made accessible. Although I could be wrong, I don’t think there’s a reasonable way to make Photoshop accessible to someone who can’t see; fundamentally it’s a visual tool. 280 Slides, for the most part, is the same (and to a large extent, Atlas may be as well). They are largely visual tools, heavily relying on visual design, drag and drop, and other mouse-based metaphors. Since Cappuccino development up to this point has been driven mostly by our own needs, that may help explain why this hasn’t been a top priority for us.

All that aside, we absolutely want Cappuccino to be an accessible platform. Until pretty recently, this just wasn’t a possibility. Browser vendors and assistive technology vendors provided absolutely no facilities for interacting with the accessibility system. We’ve been working on Cappuccino for some time now, and I think it would be a travesty if none of the things we’ve accomplished had been done simply because there wasn’t yet a way to make them accessible. Cappuccino is pushing the edges of web development right now, and understandably some things take time to catch up. Vendors need to take their share of responsibility for the problem.

More recently, efforts like WAI-ARIA are starting to be taken seriously enough to consider as a potential avenue for Cappuccino. Like the rest of our APIs, we have a strong foundation to build upon: Mac OS X has great support for accessibility in custom UI. I am extremely excited about integrating some of these technologies into Cappuccino. At the same time, 280 North is a three-person company, developing our own products to support our business, and developing Cappuccino in the open to benefit everyone. We can’t program every feature needed in Cappuccino all at once, but that’s part of why we’re embracing open source. If people feel strongly about WAI-ARIA, I encourage them to get in touch with us about helping to add support to Cappuccino. It isn’t an area we have a great deal of expertise in, but we’re happy to learn new things, and this is absolutely a problem we want to work on.

As for Atlas, I’m looking forward to sharing more about development status, and what it can do. We’ve got a lot of ideas, but we really love hearing what fellow developers feel is important. This is just one good example of the feedback we’ve been getting, and hopefully I’ll be able to share more in the coming months.

Ross at 1:59 am | Posted in Projects, Technology, Web | Comments (21)

iReddit - the official reddit iPhone App

February 15th, 2009

An iPhone application I wrote was posted to the App Store yesterday. iReddit (cleverly named, huh?) is the official reddit client for the iPhone. I may have built the iPhone app, but the reddit team built reddit, which is the really hard part, and for that we’re all thankful.

iReddit Home Screen

Actually, I’ve been interested in trying out the iPhone dev environment for some time. I got to talking with Alexis and thought this would be a great opportunity to build something useful, and help out our startup at the same time. All in all it was a good experience, but at times the iPhone dev process may make you want to kill yourself (and not because of the programming part). Don’t say I didn’t warn you.

The app itself is pretty great. You can browse the combined front page, or look at individual subreddits, and even get your own customized reddit list on the phone by logging in. Commenting, voting, and saving are all there, plus “serendipity” mode, which displays a random upcoming reddit story every time you shake the iPhone. The best part of the app is probably the reddit alien getting pissed off at slow loading pages. The effect is great.

So, if you haven’t already, go buy the app! I’ll try and write more about the dev process in future posts, and in particular about my take on the app store and some of the things people have been complaining about.

Alexis has a bit of a crazy side to him, and it called him to make this commercial for the app. Hand modeling by yours truly. Still waiting to hear what Wil Wheaton actually thinks of it.

Ross at 10:47 am | Posted in Projects, Technology, Web | No Comments

JavaScript IS a High Level Language!

December 9th, 2008

In response to some extended discussion via JResig, Ajaxian, and Francisco, Charles Jolley said the following:

JavaScript, on the other hand, is a high level language. It has garbage collection, lambda functions, dynamic typing, object messages, and nearly every other feature you would expect in a modern high-level language.

When you wrap JavaScript in another high-level language, you don’t free yourself from managing low-level details; you just exchange one abstraction for another.  You pay the cost of two high-level languages without gaining any benefits for your final product.

The implication here is that JavaScript is the highest level language currently imaginable; in a word, it’s perfect. Actually, the next line, which I won’t quote, warns us not to interpret that as claiming JavaScript is perfect. It then goes on to reverse course and say the only way to abstract JS further is with a UI builder.

But is this really the case? There’s a simple test: Can I create a language on top of JavaScript that has at least one feature not present in JavaScript? Of course, the answer is yes, and the proof is Objective-J. If there is one feature that proves this point best, it’s @import. Importing is a feature that almost every high level language offers, and yet it isn’t present in JavaScript.

This raises the question: couldn’t I just implement this as a feature in a native library? Yes, and no. This is a complex topic, and I don’t want to get into the specifics, but to get the asynchronous/look-ahead code importing built into Objective-J, you need to preprocess code (that is, extend the language).
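Without getting into how Objective-J really does it, here is a toy illustration of why look-ahead importing means reading the source before running it. The findImports function is my own simplification for this post, not anything from the Objective-J preprocessor: it only recognizes the two @import forms at the start of a line, and unlike a real lexer it would be fooled by an @import inside a string or comment.

```javascript
// Simplified scan for Objective-J style import statements.
// Handles only: @import "Local.j" and @import <Framework/File.j>
// (hypothetical helper for illustration, not the real preprocessor).
function findImports(source) {
    const pattern = /^\s*@import\s+(?:"([^"]+)"|<([^>]+)>)/gm;
    const imports = [];
    let match;
    while ((match = pattern.exec(source)) !== null) {
        imports.push({
            // Group 1 is the quoted (local) form, group 2 the bracketed form.
            path: match[1] !== undefined ? match[1] : match[2],
            isLocal: match[1] !== undefined
        });
    }
    return imports;
}

const sampleSource = [
    '@import <Foundation/CPObject.j>',
    '@import "AppController.j"',
    '',
    'var greeting = "hello";'
].join("\n");

const foundImports = findImports(sampleSource);
console.log(foundImports.map(function (entry) { return entry.path; }));
```

Because the dependency list comes from the text of the file and not from executing it, a loader could start fetching every listed file before evaluating any code. A plain JavaScript library function can’t do that, since by the time it is called the code around it is already running.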

Objective-J doesn’t just add one feature, it adds many. These include dynamic message sending (which enables features like method_missing from Ruby, and which is completely different from the function calls JavaScript does have; I’m not sure why Charles called them “object messages”), importing, and classical inheritance. Many of these features use a runtime component, called the Objective-J runtime. Most of the new syntax in Objective-J is shorthand for accessing this runtime (which is built, of course, in JavaScript). You could say that the language frees you from managing low-level details of the runtime that you shouldn’t need to worry about.
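For anyone who hasn’t seen the difference between a message send and a function call, here is a minimal sketch in plain JavaScript. The sendMessage helper and the forward hook are names I made up for this example; they are not the actual Objective-J runtime functions, just the general idea.

```javascript
// Hypothetical helper, loosely modeled on the idea of a message send;
// it is not the actual Objective-J runtime function.
function sendMessage(receiver, selector, ...args) {
    const method = receiver[selector];
    if (typeof method === "function") {
        return method.apply(receiver, args);
    }
    // No matching method: give the receiver a chance to handle the
    // message anyway, similar in spirit to Ruby's method_missing.
    if (typeof receiver.forward === "function") {
        return receiver.forward(selector, args);
    }
    throw new TypeError("Receiver does not respond to " + selector);
}

const logger = {
    log(message) {
        return "log: " + message;
    },
    // Catch-all for messages this object has no method for.
    forward(selector, args) {
        return "unhandled " + selector + "(" + args.join(", ") + ")";
    }
};

console.log(sendMessage(logger, "log", "hi"));           // "log: hi"
console.log(sendMessage(logger, "warn", "careful", 2));  // "unhandled warn(careful, 2)"
```

With an ordinary call, logger.warn("careful", 2) throws because warn isn’t a function. Routing every call through a dispatch step is what gives the receiver a say, and hiding that step behind syntax is the kind of thing a language layer is good for.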

And since Objective-J is a strict superset of JavaScript, you don’t pay any cost for utilizing it. You can drop down to “pure” JavaScript at any time. You don’t even need a compiler or a special build tool to use Objective-J, since the language is written in JS and preprocesses dynamically in the browser.

Ross at 5:30 pm | Posted in Technology, Web | No Comments

Google, iPhone, and Thinking Like a Programmer

November 20th, 2008

Yesterday, John Gruber declared that Google Mobile uses private iPhone APIs. Since he wrote it, it must be true, which is why TechCrunch picked up on the article this morning. Never mind the fact that the only evidence offered by Gruber was that “there is no public API in the iPhone SDK for using the proximity sensor in this way.”

The critical mistake Gruber makes is assuming that because there’s no documented method for trivially utilizing the proximity sensor, there must be no possible way to do it. This is a severe lack of imagination.

Programming is an exercise in creative thinking, especially when working within the context of a specific platform. There are always things you, the developer, want to do but don’t seem to be able to. This has always been and will always be true. It’s the reason why plenty of desktop Mac apps use private Apple APIs, and it’s the reason why plenty of iPhone apps on the store are using undocumented APIs.

The essential question, then, is whether what Google is doing can be done without using undocumented APIs on the phone. I think it can. I spent an hour this morning trying to prove it, and came up with this iPhone app.

To be fair, this doesn’t work as well as Google’s version, or exactly the same way. For one, I didn’t bother to program in the proximity requirement for the trigger. This is well documented, so anyone could easily add it to this project. The other way in which my method underperforms Google’s is that the phone actually has to touch your ear before it will trigger. Google’s just needs to come close to your ear. But, the code does use the proximity sensor, and it only uses the available API method for that sensor, which gives you the ability to turn it on and off. The code uses no undocumented APIs.

How does it work? As I mentioned, it requires that you actually touch the phone to your ear. Once you do, the application receives a touchesBegan message and turns on the proximity sensor. It also makes a note that the app is in the middle of a touch sequence, and finally it fires off the timer that makes this trick work. Once the proximity sensor is turned on, it is immediately engaged because the phone is so close to your face. When the proximity sensor is engaged, Cocoa turns off the touch sensor, but it doesn’t send a touchesEnded message to the application. Thanks to that timer we fired, we can poll for the current number of touches on the screen. If it drops to zero, but we never saw the touchesEnded message, we know we’ve triggered the proximity sensor. The project I’ve included will turn the screen yellow and play a little sound, just like the Google application. Here’s all the relevant code:

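- (void)touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event
{
    // A touch has started (the ear, in our case): turn on the proximity sensor.
    [self setBackgroundColor:[UIColor redColor]];
    [UIApplication sharedApplication].proximitySensingEnabled = YES;

    // Remember that we are in the middle of a touch sequence.
    _inLiveTouch = YES;

    // Start polling the touch count on the next pass through the run loop.
    [self performSelector:@selector(checkTouches:) withObject:event afterDelay:0];
}

- (void)touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event
{
    // A normal end of touch: reset state and turn the sensor back off.
    [self setBackgroundColor:[UIColor greenColor]];

    _inLiveTouch = NO;
    [UIApplication sharedApplication].proximitySensingEnabled = NO;
}

- (void)checkTouches:(UIEvent *)event
{
    // No touches left, but touchesEnded never arrived: the proximity
    // sensor must have shut off the touch screen.
    if (_inLiveTouch && [[event allTouches] count] == 0)
    {
        [self setBackgroundColor:[UIColor yellowColor]];
        AudioServicesPlaySystemSound(_sound);
        return;
    }

    // Still touching: check again in half a second.
    if (_inLiveTouch)
        [self performSelector:@selector(checkTouches:) withObject:event afterDelay:0.5];
}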

An hour of programming, and three methods get us something relatively close to what Google is doing. I’m far from an expert in the iPhone SDK, but it only took a little bit of imagination to come up with this idea. Google has perhaps a dozen or more employees working full time on the iPhone. It’s not much of a leap to believe they figured out an even smarter trick than mine to accomplish what they wanted to do. Gruber did mention later that at least one other app is doing something similar without using the proximity sensor, but was quick to point out how inferior it was to the proximity based approach.

Now, as it turns out, Google probably is using private APIs to do what they’re doing. Erica Sadun, over at Ars Technica, made a more convincing case.

The ultimate question, then, is: does it matter? The reality of the situation is that there are a lot of apps doing this that have already shipped. It isn’t just Google, and it isn’t just big companies, it’s everybody. Either Apple knows this and isn’t doing anything about it, or they don’t know. Either way points to indifference. If they know, it’s obvious how not taking action is indifference, but what if they don’t know? In that case, it means the review process they’ve built is not designed to catch these cases. To be sure, the dynamic nature of Objective-C would make it difficult to catch all instances of runtime tomfoolery, but it’s not impossible to do so, nor would it be particularly burdensome to catch most of the cases. As it currently works, I believe they are only checking to see which frameworks you link against statically, and denying anyone who includes private frameworks in that list.

Even if Apple were deeply concerned that people were using undocumented APIs, would it be a big deal if Google were granted an exception? Gruber thinks so:

Third-party iPhone development is purportedly a level playing field. If regular developers are forced to play by the rules, but Google is allowed to use private APIs just because they’re Google, the system is rigged.

Rigged? Who said the iPhone SDK was a level playing field in the first place? Apple has never been a company that stands up for level playing fields, nor should they have to be. Every business makes business deals, partnerships, exclusive rights agreements — this is common business practice, at Apple, Google, and everywhere else in the business world. Why should the iPhone be any different? If you had billions of dollars in the bank, Apple would talk to you too.

This isn’t only realistic, it’s also reasonable. One of the reasons to prevent people from using undocumented APIs is not wanting applications to break between upgrades. If Apple knows every single application that will break, because it has an arrangement with the developer, it can make sure to correct the problem in advance. This obviously doesn’t scale to every developer, but it makes perfect sense to do so on a limited case-by-case basis. In fact, this is how Apple already behaves in the desktop world. Wil Shipley gets his bugs fixed faster than Joe the Indie Developer. Why? Because his app is popular, and because he has connections at Apple.

TechCrunch claimed in their article this morning that “Apple wouldn’t allow Google to use an unsupported call. It’s not in their DNA.” If anything, Apple is more likely to help Google than any other company. After all, Google provided Apple with a special version of maps for the iPhone application, and now the company refuses to license Google Maps to any other iPhone developer, even the ones willing to pay the outrageous sums of money a license costs. Google gave Apple a virtual maps monopoly for the iPhone; the least Apple can do is let them use an undocumented method here and there.

Ross at 11:37 am | Posted in Apple, Projects, Technology, Web | Comments (3)