Saturday, October 18, 2014

The Necessity of Consistency

A common problem with large projects starts with adding new programmers. What happens is that the new coder bypasses the existing work and jumps right into adding a clump of code. Frequently, partly to make their mark but also because of past experiences, they choose to construct the code in a brand new, unrelated way. Often, they'll justify this by saying that the existing code or conventions are bad. That's where all of the trouble starts.

In terms of good or bad, most coding styles are neither. They're really a collection of arbitrary conventions and trade-offs. There are, of course, bad things one can do with code, but that's rarely the issue. In fact, if one had to choose between following a bad coding practice and going in a new direction, following the existing practice is the superior choice. The reason for this is that it is far easier to find and fix 20 instances of bad code than it is to find 20 different ways of doing the same thing. The latter is highly subject to missed code and bug-causing side effects, while the former is simple refactoring.
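To make that concrete, here is a small hypothetical sketch (the function names and the date format are invented for illustration, not taken from any real project): three programmers each parse dates their own way, versus everyone sharing one convention.

    from datetime import datetime

    # Three programmers, three unrelated ways of doing the same job.
    # A format change or bug fix now has to hunt down every variant.
    def parse_date_v1(s):
        return datetime.strptime(s, "%Y-%m-%d")

    def parse_date_v2(s):
        year, month, day = s.split("-")
        return datetime(int(year), int(month), int(day))

    def parse_date_v3(s):
        import time
        return datetime.fromtimestamp(time.mktime(time.strptime(s, "%Y-%m-%d")))

    # The consistent alternative: one convention, used everywhere. Even
    # if it later turns out to be a 'bad' choice, fixing it means editing
    # one function, not finding twenty look-alikes scattered in the code.
    def parse_date(s):
        return datetime.strptime(s, "%Y-%m-%d")

All four functions produce the same result today, but only the last arrangement stays cheap to change tomorrow.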

When a programmer breaks from the team, the work they contribute may or may not be better, but it definitely kicks the complexity up a notch or two. It requires more testing and it hurts the ability to make enhancements, essentially locking in any existing problems. And as I said earlier, it is often neither better nor worse than what's already there.

It's ironic to hear programmers justify their deviations as if they were improving things, when experience shows that they are just degrading the existing work. Once enough of this happens to a big code base, it gets so disorganized that it actually becomes cheaper to start over completely, tossing out all of that previous effort. Overall, the destruction outweighs any contributions.

That's why it is critical to get all of the programmers on the same 'page'. It is a sometimes-painful necessity if a large development project is to achieve its full potential, and it is one of the core ingredients of a 'highly effective team'. The latter is one of the keys to building great systems, in that dysfunctional teams always craft dysfunctional software.

With all of that in mind, there are a few 'rules' that are essential for any software project that wants to be successful. The first is that all of the original programmers on a new project need to come together and spend some time laying out a thorough and comprehensive set of coding conventions. These should cover absolutely 'everything', and everyone should buy into them before any coding begins. This isn't as difficult as it sounds, since most brand new projects start with very small tiger teams.

The next rule is that any new programmer has to follow the existing conventions. If they don't, they should be removed from the team before they cause any significant problems. It should be non-negotiable.

Now of course, this is predicated on the idea that the original conventions are actually good, but often they are not. To deal with this, a third rule is needed. It is the simple idea that anyone, at any time, can suggest a change to the existing conventions. For the change to take effect, two things are necessary. First, the rest of the programmers all need to buy into it, and second, the programmer requesting the change must go back to 'all' existing instances in the code and update them. The first constraint ensures that the new change really is better than what is there, while the second one sets a cost for making the change and also ensures that everything in the code base remains consistent afterwards (it also teaches people to read code and to refactor).
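The mechanical part of that second constraint is easy to support. Below is a minimal sketch, assuming the old convention can be matched by a simple regular expression (the pattern here refers back to the hypothetical date parsers above): it walks a source tree and lists every spot that still needs updating.

    import os
    import re
    import sys

    # Hypothetical old convention to be retired; the pattern matches the
    # parse_date_v1/v2/v3 variants from the earlier sketch.
    OLD_PATTERN = re.compile(r"parse_date_v\d")

    def find_stragglers(root):
        """Return (path, line number, line) for every remaining instance."""
        hits = []
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                if not name.endswith(".py"):
                    continue
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8") as f:
                    for lineno, line in enumerate(f, start=1):
                        if OLD_PATTERN.search(line):
                            hits.append((path, lineno, line.strip()))
        return hits

    if __name__ == "__main__":
        root = sys.argv[1] if len(sys.argv) > 1 else "."
        for path, lineno, line in find_stragglers(root):
            print(f"{path}:{lineno}: {line}")

The convention change isn't done until this list is empty, which is exactly the cost the rule is meant to impose.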

If teams follow the above rigorously, what happens is that the code base converges onto something organized and well thought out. It does add some overhead, but a trivial amount relative to a messy, disorganized ball of mud. And of course, it means that the development project never has to end for avoidable internal reasons.

Monday, October 13, 2014

The Depth of Knowledge

A few decades back, I decided that I wanted to know more about quantum physics. I've always been driven to indulge my curiosity about various disciplines; it's been a rather long-standing obsession since I was a kid.

I went out to the bookstores (remember those :-) and bought a huge stack of books on what might be best described as the 'philosophy' underpinning quantum physics; stuff like the wave/particle duality and Schrödinger's rather unfortunate cat. It was fun and I learned some interesting stuff. 

I'm fairly comfortable with a wide range of mathematics, so I bought a book called Quantum Theory by David Bohm. The first chapter, 'The Origin of the Quantum Theory', nearly killed me. I was hopelessly lost within the first sixteen pages. I'm sure that if I dedicated years to studying, I could at least glean 'some' knowledge from that tome, but it likely just wasn't possible to get there as a hobby.

Knowledge has both breadth and depth. That is, there is a lot of stuff to know, but getting right down into the nitty-gritty details can be quite time-consuming as well. Remembering a long series of definitions and facts is useful, but knowing them doesn't necessarily help one to apply what they've memorized. Knowledge is both memory and understanding. My leap into quantum physics is a great example: I learned enough to chat about it lightly at a party, but not enough to actually apply it to anything useful, not even enough to chat with an actual quantum physicist. I got a very shallow taste of the higher-level perspective, but even after all those books I still didn't really understand it.

That knowledge is both wide and deep causes lots of problems. People can easily convince themselves that they understand something when they've only scratched the surface. It's like trying to make significant contributions to a field where you have only taken the introductory 101 course. If it's a deep field, there are years, decades and possibly centuries of thinking and exploration built up. What's unknown is buried deep below that accumulated knowledge. Finding something new or novel isn't going to be quick. If you want to contribute new stuff, you really have to build on what is already known; otherwise you'll just be right back at the starting point, looking over well-trodden ground.

And some knowledge is inherently counter-intuitive. That is, it contradicts things you already thought you knew, so you literally have to fight to understand it. Not everything we understand is correct; some of it is just convenient over-simplification that makes it possible for us to operate on a day-to-day basis. That 'base code' helps, but it also hinders our deeper understanding. We see what we need to, not what is really there.

It's pretty amazing how deep things can really go, and how often what's under the surface is so different from what lies above. My curiosity has led me to all sorts of unexpected places, one of my favorites being how much can be learned from really simple systems like boolean equations. Going down, it can feel like a bottomless pit that endlessly twists and turns each time you think it can't go any further. With so much to grok from just simple things, it's truly staggering to contemplate the full scope of what we know collectively, and what we've forgotten. That 'field' of all things knowable exceeded our individual capacities a long time ago, and has steadily maintained its combinatorial growth since then. At best we can learn but a fraction of what's out there.
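As a tiny taste of that, here is a minimal sketch (an illustration of the point, not a serious treatment): a few lines that mechanically verify De Morgan's law over every input, and then show how quickly the space of boolean functions explodes.

    from itertools import product

    # Enumerate a boolean function's full truth table over all inputs.
    def truth_table(f, arity):
        return [f(*args) for args in product([False, True], repeat=arity)]

    # De Morgan's law: not (a and b) == (not a) or (not b).
    lhs = lambda a, b: not (a and b)
    rhs = lambda a, b: (not a) or (not b)
    print(truth_table(lhs, 2) == truth_table(rhs, 2))  # True: the identity holds

    # The 'bottomless pit' part: there are 2**(2**n) distinct boolean
    # functions of n inputs, a number that outruns comprehension quickly.
    for n in (1, 2, 3, 4):
        print(f"{n} inputs: {2 ** (2 ** n)} distinct functions")

Even at four inputs there are 65,536 distinct functions, already past what anyone would inspect by hand, and that's about as 'simple' as systems get.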

Software is an interesting domain. Its breadth is huge and continuously growing as more people craft their own unique sub-domains. A quick glance might lead one to believe that the whole domain is quite shallow, but there are frequently deep pools of understanding that are routinely ignored by most practitioners. Computational complexity, parsing, synchronization, intelligence and data mining are all areas that sit over deep crevices. They are still bottomless pits, but depth in these areas allows programmers to build up the sophistication of their creations. Most software is exceptionally crude, just endless variations on the same small set of tricks, but some day we'll get beyond that and start building code that really does have the embedded intelligence to make people's lives easier. It's just that there is so much depth in that sort of creation, and our industry is only focused on expanding the breadth at a hair-raising pace.

It's easy to believe that as individuals we possess higher intellectual abilities, but when you place what our smartest people know alongside our collective knowledge, it becomes obvious that our inherent capabilities are extremely limited. Given that, the rather obvious path forward is for us to learn how to harness knowledge together as groups, not individuals. Computers act as the basis for these new abilities, but we're still mired in the limits of the personalities involved in their advancement. Software systems, for example, are only as strong as their individual programmers, and they are known to degenerate as more people contribute. We only seem to be able to utilize the 'overlap' between multiple intellects, not the combination. That barrier caps our ability to utilize our knowledge.

Despite all we know, and all of the time that has passed, we are still in the early stages of building up our knowledge of the world around us. If you dig carefully, what you find is a landscape that resembles Swiss cheese. There are gaping holes in both the breadth and depth of what we know. Sure, people try to ignore the gaps, but they percolate out into the complexity of modern life at a mind-numbing rate. We rooted our cultures in a sense of superiority over the natural world, but clearly one of the largest things we still have to learn is that we are not nearly as smart as we think we are. If we were, the world would be progressing a lot more smoothly than it has been. Knowledge is, after all, a very significant way to avoid having to rely on being lucky.