Thursday, August 27, 2009

Tightly Bound

I'd like to start this post off with a simple home-grown definition. Sometimes a bit of terminology can encapsulate a collection of ideas and make a conversation a little easier.

A computer software system is 'loosely bound' if for any specific functionality implemented within the program, most of the code involved for that functionality is specific to it and is not used for any other functionality, i.e., for each different functional behavior in the code, almost none of the underlying code is shared.

Of course, to make this definition complete the opposite is true as well. A system is 'tightly bound' if almost none of the underlying code is unique or specific to any given bit of functionality. Almost all of it is shared. The processing passes quickly from some specific block of code to something generic that then does most of the heavy lifting in the program. There is always some specific code or data at the end-points, but most of the work in the system is done by generic code.

It is best to give a simple example of this.

Most software we build these days follows a really simple pattern of being some type of GUI that allows us to navigate around, view or edit data. Within these applications, there is generally some heavy duty processing -- an 'engine' of some type -- that performs the in-depth work on the data. Mostly we persist a great deal of data either in a database or in a large set of configuration files. These basic elements are common to a huge collection of systems.

Within these types of systems, the flow generally starts from the user. The user requests to see some data, which then percolates back to the data source. The data is looked up, formatted in some way, and then sent back to the user interface for handling. So, we have two main pieces: a) the user sends a request to the persistent store, and b) the persistent store satisfies the user's request and sends back the data.

Mostly the user is navigating through some complex data landscape with the occasional action to create or modify some of what they see. This general pattern fits a large number of systems.

If you can visualize a program as a series of connections between a user and several other data stores (databases, configuration files, etc.) then it is not a big leap to think of the code as just "lines of instructions" connecting the two together in a specific direction for some specific functionality.

For example, functionality like a user "Sign in" takes some basic information about the user, then looks up a valid user record in the database. "Edit a file" allows the user to navigate down to another (crude) type of database (the file-system), and identify a file for editing. The second half of that functionality opens the file and passes that data back to the user to be displayed in some context-relative way (either by what was saved with the file, or what the user had recently set within the running system).

In a very offbeat way, we could see these various lines of code as being like a bundle of twigs standing between the user and the data. One twig to send the request from the user to the data source, and then another to send the results back again. Thus for most functionality, as it is launched by the user, it finds its way down one twig and then comes back on another.

In a loosely bound system, each piece (half) of functionality is essentially in its own twig. Except for some minor shared points, each twig stands alone, representing one half of the user's requested functionality. A loosely bound system is one where there is almost nothing holding the various twigs together; they are independent pieces of code that share no context dependencies. They could all mostly stand alone. They are just in the same system because that's how they were packaged, how they ended up.

But in a tightly bound system, the twigs quickly come together at both ends to form branches, and the branches come together to form a trunk. In a really tightly bound system, there is only one massive trunk between the user and data, and absolutely all interaction between the two moves up or down this massive generic pathway.

In other words, the user's request to see some specific data via some on-screen control like a button gets generalized and sent down the main pathway to the back-end, which binds that to some specific subset of the persistent data. Once the database has assembled the data, it is packed into a generic container and shipped back to the front-end down the main pathway. At the front, it is dynamically tied back to some presentation information, and then shown to the user.

Most of the time, in most of the code, the management of the data is entirely generic. The code knows almost nothing about the underlying data.

In a tightly bound system, the data, which starts at one end or the other as being strongly typed, gets loosely typed as it flows through the system. In a very tight implementation the strongly typed static code at both ends is entirely minimized. It must always be there, but it doesn't need to have a large scope within the execution, e.g., the "UserName" field is also a data type, although stored as something more generalized in the database and the GUI widgets.

In a loosely bound system, we would copy the data from the database directly into a series of variables called UserName and then copy them, one after another through the system and finally into the appropriate display screen. Each link in the chain of code would know explicitly the exact type and name for all of its data. None of this code would be shared.
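As a rough sketch of what that looks like (Python here; the function names and the in-memory `USERS_TABLE` standing in for the database are made up for illustration), every link in the chain is written for exactly one field, so a second field means a second copy of the whole chain:

```python
# Hypothetical stand-in for a database table of users.
USERS_TABLE = {42: {"user_name": "alice", "email": "alice@example.com"}}

def fetch_user_name(user_id):
    # Database layer: knows the exact column it is reading.
    user_name = USERS_TABLE[user_id]["user_name"]
    return user_name

def prepare_user_name(user_id):
    # Middle layer: passes the same explicitly named variable along.
    user_name = fetch_user_name(user_id)
    return user_name.strip()

def show_user_name(user_id):
    # Screen layer: again specific to this one field.
    user_name = prepare_user_name(user_id)
    return "User name: " + user_name

# A second field needs its own near-identical twig.
def fetch_email(user_id):
    email = USERS_TABLE[user_id]["email"]
    return email

def prepare_email(user_id):
    email = fetch_email(user_id)
    return email.strip()

def show_email(user_id):
    email = prepare_email(user_id)
    return "Email: " + email
```

Nothing in the email chain reuses anything from the user name chain, even though the two are the same code with the names changed.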

In a tightly bound system we would likely convert UserName into something more generic, a string perhaps -- or even farther out, an object -- and then pass that up through the system with some type of keyword or label. At the other side, we'd tie the keyword back to the data, convert it to a "UserName" again and then display it. In the middle of the code, all of the data is generic and loosely typed. At the ends, it is as specific and as strongly typed as necessary in order to complete the presentation or storage requirements (the screens or schema could actually be generic as well and not care, but that does limit display and functionality).
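A minimal sketch of that idea, again in Python with invented names (`fetch`, `prepare`, `show`, and a `LABELS` table are assumptions for illustration, not anything from a real framework): the middle of the code only ever sees a keyword and a value, and the field-specific knowledge is confined to the data at the two ends.

```python
# Hypothetical stand-in for a database table of users.
USERS_TABLE = {42: {"user_name": "  alice ", "email": "alice@example.com"}}

# Presentation end: the only place that knows what each keyword means on screen.
LABELS = {"user_name": "User name", "email": "Email"}

def fetch(table, row_id, key):
    # Generic storage end: returns a (keyword, value) pair for any field.
    return key, table[row_id][key]

def prepare(item):
    # Generic middle: treats every value as a plain string.
    key, value = item
    return key, str(value).strip()

def show(item):
    # Ties the keyword back to its presentation information.
    key, value = item
    return LABELS[key] + ": " + value

def display(table, row_id, key):
    # One shared pathway for every field.
    return show(prepare(fetch(table, row_id, key)))
```

Calling `display(USERS_TABLE, 42, "user_name")` and `display(USERS_TABLE, 42, "email")` both travel the same three functions; supporting another field means adding one label, not another chain.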

Getting back to this weird visualization, if we see all of the functionality of a system as this large bundle of twigs between the user and the persistent data, the tighter we bind it at the center, the more that center code gets generalized and carries an increased load. If we bind the twigs tight enough, they become one big indistinguishable trunk that then becomes the main pathway to move all of the data through the system.

Of course the point of tightly binding the system is to try very hard to remove or reduce as much repetitive or duplicate code as possible. The twigs themselves are highly redundant.

We've long known that any redundancies are intrinsically dangerous because they can and do easily fall out of synchronization with each other, leading to expensive and difficult bugs. Time and changes both happen regularly, and both tend towards creating inconsistencies.

In most modern systems, there are two main places where we see this type of duplication the most.

On the user's side, our primary code interaction with the users is either by some set of highly repetitive screens, or in some type of MVC framework with a set of highly repetitive 'actions'. Either way, the user entry-points that manage the context and launch the functionality tend to be hotbeds of highly repetitive code. Most of the functionality is similar, thus most of the code driving it is similar too.

On the other side, near the database, although the data is stored into only a very limited set of introspective fields, typical relational database code is also highly repetitive and very static. Each and every query -- usually multiple ones even for the same underlying entities -- is explicitly moved around in very static parameters in the system. As the system grows, this code generally becomes massively repetitive and also very ugly. It's often a part of the system that most programmers try very hard to avoid or ignore.
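One way the database side can be pulled together, sketched here with Python's standard `sqlite3` module: instead of a hand-written query function per entity, a single routine is driven by a description of the tables. The `SCHEMA` dictionary, the `fetch_row` helper, and the two example tables are all hypothetical, chosen only to show the shape of the idea.

```python
import sqlite3

# Hypothetical schema description: the single place that names tables and columns.
SCHEMA = {
    "users": ["id", "user_name", "email"],
    "files": ["id", "path", "owner_id"],
}

def fetch_row(conn, table, row_id):
    # Generic lookup shared by every entity. Only names listed in SCHEMA
    # are ever placed in the SQL text; the id is passed as a bound parameter.
    columns = SCHEMA[table]
    sql = "SELECT {} FROM {} WHERE id = ?".format(", ".join(columns), table)
    row = conn.execute(sql, (row_id,)).fetchone()
    return dict(zip(columns, row)) if row is not None else None

def demo():
    # Builds a throwaway in-memory database and reads one row from each table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT, email TEXT)")
    conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT, owner_id INTEGER)")
    conn.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com')")
    conn.execute("INSERT INTO files VALUES (7, '/tmp/notes.txt', 1)")
    user = fetch_row(conn, "users", 1)
    file_row = fetch_row(conn, "files", 7)
    conn.close()
    return user, file_row
```

The results come back as generic dictionaries keyed by column name, so the calling code does not need a separate static query or parameter list for each entity.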

In small and some medium sized systems, these two generators for redundant code can often be passed over without too many problems. The programmers can just do all of the extra work needed to pound out all of the extra code. Brute force will save the day.

In larger systems however, these two areas can start to push the overall complexity upwards into dangerous territory. If the programmers are just dumping code erratically into these parts, the system tends to become very unstable, very quickly. At some point, working on many systems becomes essentially like stuffing straw into a sack with a lot of holes. As you push it in one side, it falls out the others. It becomes a nearly endless sink for wasting resources.

By now we know that this type of redundant code can cause significant development problems. Still, even given all of our modern advice for not repeating the code over and over again, most programmers just accept loosely bound systems as being a necessity in programming, not realizing that there are other alternatives.

For most programmers, it is far easier to write code if that code is as specific and static as possible. Most people have problems with generalizing; there seem to be far more detail-oriented beings who find it easier to focus on the trees, rather than the surrounding forest. It's not that surprising. It comes along with the general notion of "math anxiety" where people often fear a detachment from reality that comes from working through some highly generalized and abstract problem.

So, left to their own, most programmers will repetitively re-type in the same blocks of code, over and over again. It's habit and it is easier. They don't have to think deeply, and the quick progress makes them feel constructive. Many like the similarity -- are drawn to it -- even if deep down inside they know that there is probably a better, shorter way.

However, no matter how comfortable programmers are with making their systems loosely bound, there are several very strong reasons why tightly bound systems are always a far better development strategy.

The biggest reason is that if there are lots of twigs that can be generalized, then generalizing results in significantly less code. Not just a 10% or 20% reduction, but often tightly bound systems are orders of magnitude smaller. Why build something in 2 million lines of code, if 150,000 will do?

Programmers code at what can mostly be taken as a constant rate. A good Java programmer -- who isn't doing a lot of bad cutting and pasting -- is probably creating about 30,000 to 50,000 lines of code per year. Given that medium systems start in the hundreds of thousands of lines of code, and large ones easily get into the millions of lines, 50,000 lines of code per year is a relatively small amount. Still, by the time planning, design, debugging, support, etc. are all taken into account, getting 50,000 lines of good clean, non-repetitive code out "consistently" would be a significant accomplishment for any talented programmer.

Within whatever bound, the amount of code that can be created by a programmer is still small. It is clear that most significant software is going to require many man-years to develop. A quick hacker might get a prototype out in a few months, but to get the full product out, with all the proper documentation and packaging expected, is years and years worth of work. Programming is long and slow. It always has been that way.
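To put rough numbers on that, here is a back-of-the-envelope calculation using only the figures already mentioned: 2 million versus 150,000 lines, at 30,000 to 50,000 lines per programmer per year. The function names are just for illustration, and the straight division deliberately ignores coordination overhead and rework, so treat the results as optimistic lower bounds.

```python
def programmer_years(total_lines, lines_per_year):
    # Straight division; ignores coordination overhead, rework, and churn.
    return total_lines / lines_per_year

def compare(sizes=(2_000_000, 150_000), rates=(30_000, 50_000)):
    # Returns {(size, rate): programmer-years} for every combination.
    return {(size, rate): programmer_years(size, rate)
            for size in sizes for rate in rates}
```

At 50,000 lines per year, the 2 million line system is 40 programmer-years of typing while the 150,000 line system is 3. At 30,000 lines per year the gap is roughly 67 years versus 5.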

Given the sizable effort involved, any significant reduction in code size, one that can particularly cut some of the base tasks into radically smaller pieces, is going to have a huge impact on the success of the project. If I know I can write the same system with one third of the code, I'd have to be crazy to write it any other way. Less code is better, way better.

Still, as most significantly experienced programmers have probably figured out, it is not that initial "new" code that lands most development projects in hot water.

It's each new iteration, particularly on a foundation of constant erratic changes, that rapidly starts to burn through the resources.

As always, the initial versions and prototypes come into being very quickly, and then the "tar pit" so eloquently described by Brooks hits, and hits hard. And it is there, in between all of the massive dependencies, that having a code base that is one quarter or one tenth of the size really starts to pay off quickly.

As changes get made to loosely bound systems, the twigs start to quickly drift away from each other. There is, after all, no technical reason why they should be bound or consistent with each other.

Inconsistencies build up, faster and faster. The rate of decay accelerates.

Our modern tool set makes it easier for programmers to keep searching through the code to find similar code that is falling out of sync, but we seem to shy away from utilizing these tools well, in favor of just ignoring the problems until testing. Until it is too late.

As an added bonus, not only are the original twigs falling out of sync with each other, but most programmers are ignoring this and hastily adding more and more new twigs, compounding these inconsistencies. The extremely repetitive nature of a loosely bound system causes all of the problems we would expect with having the same code blocks repeated over and over again. We know not to make some types of repetitions within our code, particularly with variables or data, but most programmers don't really see that this is exactly what they are doing with their loosely bound systems.

As if that weren't bad enough already, the whole nature of the system means that it is really hard to isolate the new changes and just retest some sub-pieces. Unless you explicitly track all of the changes, the nature of the twigs forces one to retest the entire system, each time, just in case. That in itself, if done properly, is a huge effort, one that is clearly skipped far too often.

Now contrast all of that with a tightly bound system. If the amount of unique and distinct code is small, then it doesn't take long before the processing has moved into some generic routine.

Adding new data to the database, or adding new screens to the GUI is all about just putting in that barest minimal amount of code into the system. All of the other handling is generalized and used by other code. Unlike a loosely bound system, as the code base grows, the ability of the programmer grows as well. The system gets easier to add to later, rather than harder.
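A small sketch of what "the barest minimal amount of code" can mean on the GUI side, assuming a hypothetical `SCREENS` table and a text-only `render` function (both invented for this example): each screen is a short declarative entry, and everything else is shared.

```python
# Hypothetical screen definitions: each entry is the only screen-specific part.
SCREENS = {
    "profile": {"title": "Profile", "fields": ["user_name", "email"]},
}

def render(screen_name, record):
    # Generic renderer shared by every screen in the system.
    screen = SCREENS[screen_name]
    lines = [screen["title"]]
    for field in screen["fields"]:
        lines.append("{}: {}".format(field, record.get(field, "")))
    return "\n".join(lines)

# Adding a new screen is one new entry; render() is untouched.
SCREENS["contact"] = {"title": "Contact", "fields": ["email"]}
```

Every improvement made to `render` later on is picked up by all of the screens at once, which is the sense in which the system gets easier to extend as it grows.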

If there are hundreds of functionality entry-points all using the same underlying generic code, then testing one of those points essentially tests all of them. There are, of course, still some high-level or low-level differences, slight variations on the screen or in the database, but mostly you will find that if the main pathways are working for the main data, there is a very diminished likelihood that they are failing for something else.

The bugs shift from being implementation problems to being presentation problems or small inconsistencies. If you have to pick your bugs, then these are a far better choice.

Of course, initially in the new development, generalizing the code is a hard prospect. You have to think about the problems a lot more, instead of just banging out the lines at high speed.

And, since none of us really see the full generalization right from the start, to get a really tightly bound system requires way more effort in re-factoring the code, iteration after iteration, to bring together the repeating pieces into denser and denser implementations.

Another point about a really dense tightly bound system is that the underlying primitives are quite obviously more difficult to understand. That is inevitable given the ever increasing pressure to keep making the code do more things. Dense code is harder to understand, which means another extra level of trying to keep it as clear as possible while still making it dense.

And still another problem with a tightly bound system is that it is much harder to distribute amongst a large group of programmers. That is often why you see the big organizations resorting to loosely bound systems. Brute force works with lots of coders even though the results are predictably fugly and repetitive.

Building a culture where coders want to extend what is there, not just quickly re-hack their own lame version, requires getting some very difficult things right: team dynamics and overall system architecture, as well as training and documentation. Programmers shouldn't sit around stranded on critical resources, but they also shouldn't just be splatting out code at some over-the-top rate.

Finding organizational arrangements that intrinsically facilitate good architecture with tightly bound results is an unanswered question in computer science.

So quite obviously, a tightly bound system is considerably harder to write. Programmers can't just throw themselves at the coding in the hopes of accidentally discovering the perfect generalizations. It takes slow deliberate thinking in order to get that tight binding at the center of the system.

Still, if and when it is done well, the extra effort in design and thinking in the initial part of the project pays off hugely in the release stages. Gradually, as the overall savings accrue, the work puts a project ahead. And if the developers have been disciplined about keeping the code clean and consistent, the overall complexity of the code isn't growing at a dangerous rate.

A tightly bound system is a much better and more workable development project. Once over the initial design hump, it increases the likelihood of success and can really make the difference in being able to deliver new extended versions of the system.

Unfortunately, it is easier to develop loosely bound systems, and more importantly, most of the writing, advice, tutorials, etc. out there that try to teach programming, explain design or help with technical issues also push the idea. It is hard enough to build a tightly bound system; writing about it is just too complex an endeavor for most people.

Regardless, the problem remains that a significant chunk of our development resources is burnt up in trying to beat on loosely bound systems. The low quality, frequent failures and high inconsistency rates in our modern software industry are testimonies to the fact that most developers are not being nearly as effective with their efforts as they could be. As an industry we flush away a huge amount of resources simply because we misunderstand where we can get the best use of them.

It is ironic because so many programmers strongly believe that their work involves a significant amount of intellectual creativity, yet they quickly fall back on using the most mindless brute force approaches to splat out as much brain-dead code as possible. They want their work to be recognized as intellectual, even if they don't want to put in the effort to make it so.

3 comments:

  1. "Why build something in 2 million lines of code, if 150,000 thousand will do?"
    150 000 thousand is more than 2 million.

  2. Hi Anonymous,

    Thanks, that was a particularly embarrassing mistake, unless of course I was considering getting a position in one of the very large software manufacturers. They do seem to have an inverted perspective on things :-)


    Paul.

  3. Thanks. This is an amazing article. I think it summarises the issues I have with most of the commonly taught programming "best practices". It's also the best description of WHY most of the larger software I have ever used has been a buggy mess with inconsistent functionality.

