DOM = Frame buffer

February 2nd, 2010 | Posted by Quixey in Technology

If you’re writing a large-scale AJAX application, is it okay to write code like this?

if (jQuery("#file_menu").is(":visible")) { ... }

No, it’s not okay at all. The problem is relying on the DOM to store your program’s state. We want to argue that, despite its tree structure, the DOM only encodes your application’s output — and not its semantics.

Desktop Metaphor

Think about programming a desktop application. Your program keeps track of state using objects in memory. If you want to know whether your program is currently displaying a certain UI element, you don’t check if Screen.getPixelColor(519, 872) == "#000000". Instead, you rely on objects in memory with the proper semantic structure, writing something like Menus.activeMenu == Menus.FILE.

The most fundamental principle of elegant code is to write what you mean. Do you ever really mean to know the value of a pixel on the screen? No, what you really mean to do is know which UI element is displayed. That’s why you build levels of abstraction into your programs, like the Menus object in the example. If the programmer has intentions about menus, then the line of code should talk about menus. This kind of proper abstraction is what separates the master programmers from the simple-minded “get ‘er done” hackers.
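
Here is a minimal sketch of what such a Menus object might look like in JavaScript. Only activeMenu and FILE come from the example above; EDIT, open() and close() are names made up for illustration, not part of any library.

```javascript
// Hypothetical Menus model. activeMenu and FILE mirror the article's example;
// EDIT, open() and close() are assumed names added for this sketch.
const Menus = {
  FILE: "file",
  EDIT: "edit",
  activeMenu: null, // null means no menu is currently open

  open(menu) {
    this.activeMenu = menu;
  },

  close() {
    this.activeMenu = null;
  },
};

Menus.open(Menus.FILE);

// The question is asked in terms of menus, not pixels.
if (Menus.activeMenu === Menus.FILE) {
  console.log("File menu is active");
}
```

The point of the sketch is only that the question "which menu is showing?" gets answered by an object in memory, without touching the screen.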

Applying it to AJAX applications

Everyone understands why you shouldn’t rely on pixel values to store your program state. But almost no one understands the analogous rule for AJAX applications: Don’t use the DOM to store your program state.

People are confused because the DOM’s significance has done a 180 since the days of static web pages. In the old days of static HTML and CSS, you would “separate style from semantics” by writing your HTML to manifest the semantic structure of your layout. You would use nested tags to encode parent-child relationships between parts of your layout. Then you would use style sheets to control the display logic — the non-semantic attributes of your layout.

In a modern AJAX application, your HTML and CSS have been demoted one level of abstraction. What used to encode semantic structure + display logic, now only encodes display logic. The structure that manifests your application’s semantics is no longer the DOM, but rather, your JavaScript program and its state.

And that’s why we say, DOM = Frame buffer. It’s just a location in memory that you output into.
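
To make the frame-buffer idea concrete, here is a rough sketch of output-only display code. The names renderMenus, setVisible, writeToFakeDom and fakeDom are invented for illustration, and a plain object stands in for the real DOM so the snippet runs outside a browser.

```javascript
// Output-only display code: state flows in, nothing is ever read back.
// setVisible(id, visible) is a hypothetical hook that writes to the display layer.
function renderMenus(state, setVisible) {
  for (const id of state.menuIds) {
    setVisible(id, state.activeMenu === id);
  }
}

// In a browser with jQuery, the hook might be something like:
//   (id, visible) => jQuery("#" + id + "_menu").toggle(visible)
// Here a plain object plays the role of the frame buffer.
const fakeDom = {};
const writeToFakeDom = (id, visible) => {
  fakeDom[id] = visible;
};

const appState = { menuIds: ["file", "edit"], activeMenu: "file" };
renderMenus(appState, writeToFakeDom);
console.log(fakeDom); // { file: true, edit: false }
```

Note that renderMenus never asks the display what it currently shows. It only writes, which is what treating the DOM as a frame buffer amounts to.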

Crimes against abstraction

Let’s go back to the original crime against abstraction:

jQuery("#file_menu").is(":visible")

It should be rewritten as something like:

Menus.activeMenu == Menus.FILE

See? You should talk about the abstract concept of an active menu, regardless of whether your display logic deals with pixels or the DOM.
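
One possible way to wire this up, sketched under assumptions: createMenus, setActive, toggle and the render callback are hypothetical names, and a recording array stands in for the code that would actually write to the DOM.

```javascript
// Hypothetical factory: `render` is whatever function paints state into the DOM.
function createMenus(render) {
  return {
    FILE: "file",
    EDIT: "edit",
    activeMenu: null, // null means no menu is open

    // Every state change goes through here, then the display is repainted.
    setActive(menu) {
      this.activeMenu = menu;
      render(this.activeMenu);
    },

    // Click logic decides from state, never by querying the display layer.
    toggle(menu) {
      this.setActive(this.activeMenu === menu ? null : menu);
    },
  };
}

// A recording stub stands in for the real DOM-writing render function.
const painted = [];
const menus = createMenus((active) => painted.push(active));

menus.toggle(menus.FILE); // opens File
menus.toggle(menus.EDIT); // switches to Edit
menus.toggle(menus.EDIT); // closes Edit
```

In a browser you might call menus.toggle(menus.FILE) from a click handler registered with jQuery's .on("click", ...). The toggle decision reads Menus state only, so it never needs .is(":visible").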

Just because your grandfather’s static pages used DOM trees to model layout semantics, it doesn’t mean your AJAX application has to model its state by dumpster-diving into the display layer.


7 Responses

  • Scott Graves says:

    Having written a 50,000+ line MVC AJAX application a long time ago, my experience is that keeping your model and the DOM in sync with good performance and clean code is a bitch (try implementing a *fast* model-based drag-and-droppable live-editable multi-column tree). Or hope that you can find a decent framework that does that.

    Not that there are any good alternatives, but for small projects, stick your data in the DOM (try to be at least somewhat semantic about it, using classnames, custom attributes and such) and you’ll have a much better time. Pragmatism beats purity.

  • That one data structure can always be decomposed into multiple data structures with equivalent meaning doesn’t imply it’s unquestionably incorrect not to separate them. That two things can exist separately doesn’t mean it’s wrong to combine them, particularly when they are intimately related, and the alternative is maintaining redundant state which must be kept synchronized. That’s the problem with dogmatic proclamations. Understanding where to draw the line is crucial to optimizing both machine efficiency (via less storage and runtime effort) and programmer efficiency (via less code and fewer bugs). The complexity of a system grows exponentially with each such layer of decomposition. Is the stopping point arbitrary? I claim it is an engineering compromise, informed by the anticipated value of additional modularity, with the prerequisite that one structure can be mapped onto another without loss of information.

    Certainly you would not check pixel values to determine if a menu is open. The mapping from semantic elements to rendered pixels is immensely complex and seldom invertible. You might reasonably ask the UI toolkit if the menu is open. You can, as you suggest, keep track of this yourself, but you must anticipate every way the menu might become closed: both by the operating system, or by other code in your application – perhaps written by someone else – that bypasses your intended function for changing this state. If you don’t insist on recording this information yourself, you write less code, and with a single definitive source there is no possibility of inconsistency. This is what you are arguing against.

    An aside: The pixel example is, by coincidence, relevant to a point I’ve been thinking about lately. Game systems from the early 80s did something that seems unthinkable by modern standards of software hygiene: collision detection was often handled directly by the video hardware. Semantics and presentation, completely conflated. Yet, as a programmer, the choice is that of checking a few flag bits indicating collisions between objects and/or the background, or performing a separate pass through the objects to check for collisions yourself. It makes sense to use the information if the hardware provides it, and it makes sense for the hardware to provide it, because it’s just a few extra AND gates and flags to add there, versus an expensive and complex loop in software. This choice imposes constraints on the application, yet often it was the correct one. Where it wasn’t, a price was paid in complexity and (potentially) performance. To my surprise, just when I thought this kind of coupling was relegated to the past, it emerged in modern systems with a new twist: the hardware occlusion query. Meanwhile, I think about efficient 2D collision detection in sparse environments. In the past, I’ve used spatial hashing. In the future, I may use a quadtree. It pleased me to realize that, in the service of detecting collisions, a quadtree can also contain all the information needed to render the objects, storing pointers to transformed rectangles of pixels in the leaves, and it’s conceivable that simple hardware could output video directly from the tree with no intervening framebuffer (though perhaps less simple if it tries to make the most of the available memory cycles). Indeed, a quick Googling indicates direct scan conversion of quadtrees has been researched.

    That said, hiding application state in the DOM tree sounds like a dubious proposition. However, is menu visibility truly important application state, or an irrelevant property of the menu’s presentation? Core application logic should only care when a menu item is selected, and not about the menus themselves at all, as the most starry-eyed proponents of “Model-View-Controller” architecture will insist. That patterns or philosophy can guide the way without always being reified, in excruciating detail, as code and data, is a point too often overlooked, and the cause of much unnecessary labor and suffering.

    • quixey says:

      Andy,

      Those are good points about the advantages of calling lower-level functions to make things simpler and more efficient.

      The motivating principle behind DOM = Frame Buffer is that you absolutely must “write what you mean”, i.e. use abstraction layers properly, if you want any large-scale application to remain maintainable.

      So in your example of hardware collision detection, it’s great to use, as long as it’s encapsulated within the right place in your object hierarchy, or other multi-level code structure.

      -Liron

  • Danilo says:

    I agree with quixey.
    I work for advertising agencies, programming very interactive user interfaces and a few games also. And I have found that abstracting application/widgets/animation states is the best way to go. And not because of “purity”. It makes things easier, and faster to develop.

  • Brad says:

    I think it might be a poor use case for your thesis. In this case, it’s only a crime against abstraction from the developer’s point of view. From the point of view of the user, or a developer acting as a user’s advocate, .is(':visible') is abstraction of the highest order in that both the business and end user can assert state by eyesight.

    A lot can go wrong on the path from browser event to css attribute. Insert an error at any point in the event lifespan from event call to event propagation to custom event call to event delegation and all of a sudden, the developer’s tidy, abstracted state is hopelessly out of sync with its audience. Imagine that on a ‘Buy’ or ‘Sell’ button in your favorite daytrading web app.

    The final arbiter of state in a web app should always be the displayed output. It is the contract between the end user and the website. .is(':visible') has elegantly put the user’s perceptions at front and center. By replacing that elegance with Menus{'activemenu': Menus.FILE}, you’re actually obscuring state because of the possible dissonance between your discrete properties and the actual desired result: your user saw a menu.

    With just a few modifications, I find this to be more meaningful:

    Menus.FILE = jQuery("#file_menu");

    if (Menus.FILE.is(":visible")) { … }

    • quixey says:

      Thanks for your well-thought-out comment. My view is that Menus.FILE = $("#file_menu") is pretty close to saying Menus.FILE = Screen.getRectanglePixels(x1, y1, x2, y2). The .is(":visible") looks nice enough and is fine for small projects. But it is missing a conceptual separation between the code that displays the menu and the code that models its state, and breaking the code into conceptually separate components is the most important challenge facing a serious programming project.

      • Brad says:

        I can see your point, but then the use case is still misleading. Menus generally only keep a state of visible or hidden, so .is(‘:visible’) is a pretty great abstraction for that case.

        I think your point would be better reinforced by a more complex set of states. A combo box or a ratings widget comes to mind. jQuery itself is an awesome DOM state machine, so when the use case is relegated to the DOM, I try to leave it at raw jQuery.

        Business states are a better model for this type of abstraction, but even then I would want the assurance of Backbone, Knockout, or jQuery data-linking to be sure that my app and its users are sharing the same ‘contract’.


