Genii Weblog

Late last night - In defense of chunks

Sun 25 Jan 2004, 04:34 PM



by Ben Langhinrichs

You know the scene.  It's midnight, and you're standing around the Swan fountain in the Swan Hotel at Lotusphere, explaining to a friend (Paul Ryan of Process Stream Technologies, in this case) why cursors in rich text make less sense than chunks.

What?  That's never happened to you?

Well, let me just say that Paul, who happens to be a fellow Penumbra member and one of the few people I ever talk to whose eyes don't glaze over when I talk CD records and that sort of thing, doesn't understand why I don't use a cursor concept.  A cursor would keep track of the position in the rich text, much the way the NotesRichTextNavigator works (or would work if it really worked all that well).  My impassioned response, as captured by stealth photographer Julian Robichaux (above), may have lacked a bit in coherence due to the late hour and the two parties I had already attended (parties sadly lacking in milkshakes).

So, for anyone still reading this ramble, here is a short (somewhat philosophical) discussion about what is good about chunks, such as Midas uses.

Short discussion on chunks vs. cursors in rich text
How you model data depends largely on three factors:
  1. how you think of the data;
  2. how you are able to access the data; and
  3. what you intend to do with the data.
If you think of rich text as a blob, and if you are mostly only able to stuff things into it without caring much what is already there, and if what you intend to do is primarily create, rather than work with, the rich text, cursors make a lot of sense.  A cursor in this worldview is little more than an "insertion point".  I want to add a table to rich text, so I will put my cursor here, and I will add my table.  It is reflective of what a user does, but only an unsophisticated user who does nothing but create rich text and doesn't know anything about modifying it.  Cursors are very good if you want to add stuff again and again (e.g., I'll add this table and then this text and then some more text and then a doclink).
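To make the "insertion point" idea concrete, here is a minimal sketch in Python. It is only an illustration: the `RichText` and `Cursor` classes are hypothetical toys (a rich text field reduced to a list of strings), not the Notes or Midas API.

```python
from dataclasses import dataclass, field


@dataclass
class RichText:
    """Hypothetical toy stand-in for a rich text field: an ordered list of items."""
    items: list = field(default_factory=list)


@dataclass
class Cursor:
    """An insertion point: an index into the item list that advances on each insert."""
    doc: RichText
    pos: int = 0

    def insert(self, item: str) -> None:
        self.doc.items.insert(self.pos, item)
        self.pos += 1  # the cursor ends up right after what was just added


doc = RichText()
cursor = Cursor(doc)

# Add a table, then text, then more text, then a doclink.
for item in ["<table>", "some text", "some more text", "<doclink>"]:
    cursor.insert(item)
```

The cursor knows where it is, but nothing about what it is sitting next to, which is all you need when you only ever add things.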

Chunks are born of a different mindset.  When I think of rich text, I think of it not as a blob, but as a container for complex ideas, structured, but flexible, somewhat like a poem with a meter and set number of lines, but no rules on the content.  I am able to access it not just by stuffing things in; using the Midas engine, I can move things around and sort them and structure them and relate them.  Finally, what I want to do with the rich text is anything the mind can imagine.  Instant cross references, they're in there.  Take the existing rich text and add containers around it (such as sections), or parse it up into useful parts (such as tabs on a tabbed table), they're in there.

So, how does a chunk help with this, and why wouldn't a cursor be just as good?  Imagine a chunk defined as "Text Starts 'DDN#' 1", which would be the first text string which starts with DDN#.  What if I change the text from DDN#091874, for example, to SBN#091874?  In a cursor-based system, even one which allowed such complex definitions, the cursor would be left right after the original string.  What is the good of that?  I am working with, as I said before, the first text string which starts with DDN#, and it is obvious that SBN#091874 does not start with DDN#, so it has no relevance to my task anymore.  Why would I want to point to that place in the rich text?  With a chunk definition, once I changed the text, I moved on.  I didn't shift the cursor, or look for another match, I just left the definition alone, but the focus of the definition changed.

This is the key point.  The focus of the definition changed.  I have blogged before about late binding, which is the technical underpinning for this concept, but if you can leave aside the "How does it happen?" questions and think from the business perspective, this concept is much easier.  Whether you are an administrator or developer, how much of your time do you spend worrying about the messes you cleaned up yesterday, or the problems you solved last week?  Not much.  How much time do you spend worrying about what is coming?  A lot.  Now, here is the clincher.  How much of the time is the problem you face today a direct follow-up to the problem you solved (not the problem you didn't solve, but the problem you already solved) yesterday or last week?  Not much, I'd venture.

So, from that perspective, a cursor is all about thinking that the problem I have now must directly follow on the problem I had before.  A chunk casts its net wider and tackles the problem as it appears today, right now.  Who cares what you did before this, the problem you face NOW is what counts.
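For those who prefer code to philosophy, the DDN# example can be sketched roughly as follows in Python. This is an illustration under loud assumptions: the `Chunk` class, its `resolve` method, and the list-of-strings rich text are all hypothetical, not how Midas is implemented, and the cursor is simplified to a remembered index.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    """Hypothetical late-bound definition: the Nth text item starting with a prefix."""
    items: list
    prefix: str
    occurrence: int = 1

    def resolve(self) -> Optional[int]:
        # Re-evaluated on every call, so it always reflects the current content.
        seen = 0
        for i, text in enumerate(self.items):
            if text.startswith(self.prefix):
                seen += 1
                if seen == self.occurrence:
                    return i
        return None


items = ["Intro", "DDN#091874", "filler", "DDN#224466"]
chunk = Chunk(items, "DDN#", 1)    # roughly "Text Starts 'DDN#' 1"

cursor_index = chunk.resolve()     # a cursor just remembers a position
items[chunk.resolve()] = "SBN#091874"

cursor_target = items[cursor_index]    # still the text that was just changed
chunk_target = items[chunk.resolve()]  # now the next string starting with DDN#
```

The definition was never touched, yet after the change the chunk resolves to DDN#224466, while the remembered position is still parked at SBN#091874.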

Conclusion
I bet I know what you are thinking.  I bet you are thinking, if this is the coherent follow-up to the incoherent argument last night, you couldn't pay me enough to show up at the Swan tonight.  See, can I read minds or what?

Copyright © 2004 Genii Software Ltd.

What has been said:


99.1. Paul Ryan
(01/30/2004 03:05 AM)

Cursors don't make less sense than chunks. Each concept is a valid way to access and manipulate Notes rich text. For Midas' basic purpose (programmable access to manipulating rich-text), I appreciate why the chunk approach is the easiest and most often useful approach. In other contexts, as you fleetingly touch on, cursors can be especially useful and a much more efficient way to go. Many discrete rich-text manipulation scenarios (as opposed to a tool like Midas that needs to stay more generalized) involve a single pass through a rich-text field where multiple manipulations are done along the way. In such a context, cursors are clearly superior to ordinal-based chunks, as I submit you aptly show in your late-binding post.

Heck, you could extend the already excellent Midas to include an additional cursor interface. :-) Since of course you're using a cursor construct internally, it might not take much to expose it, allowing users to, say, dynamically color the background of individual table cells w/o your code having to start at the beginning of the rich-text stream before every "chunk change". Food for thought!

- Paul -