Thursday, June 11, 2009

Programming is Simple!

Writing software to program computers is a simple process. You start by deciding which new data you want to be supported by the system. All systems revolve around their underlying data.

From there, you decide which functionality is necessary. Most of it is fairly trivial, just the usual adding, deleting and modifications. That accounts for at least 80% of most systems.

The other 20% can be complex and require some research in textbooks or on the web. Chances are someone has tackled at least the basics of the algorithm at some time in the past. At the very least, the general category is probably covered.

Once the data and functionality are understood it is time to start actualizing the code.

This works best by starting at the persistence layer and working upwards. Most systems use some form of relational database, but all persistent data stores have some type of schema that needs to be extended for the new data types.

From the database, with its universal model, the data needs to work its way into a running application-specific model. This often involves a slight skew and the addition of some underlying context. Nothing horrible, unless it is ignored.
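As a rough illustration of that bottom-up path, here is a minimal sketch using Python's built-in sqlite3 module. The `customer` table, the `Customer` class, and the `display_name` field are all hypothetical names made up for this example; the derived field stands in for the "slight skew" and extra context the application model adds on top of the stored data.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """Application-specific model (hypothetical), one step removed from the table."""
    customer_id: int
    name: str
    display_name: str  # derived context; not stored in the database


def extend_schema(conn: sqlite3.Connection) -> None:
    # Persistence layer first: extend the schema for the new data type.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )


def add_customer(conn: sqlite3.Connection, name: str) -> int:
    # The usual "adding" functionality; returns the new row's id.
    cur = conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
    return cur.lastrowid


def load_customer(conn: sqlite3.Connection, customer_id: int) -> Optional[Customer]:
    # Move the stored row into the running application model.
    row = conn.execute(
        "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    return Customer(row[0], row[1], display_name=row[1].strip().title())
```

The stored name is left exactly as it was entered; only the application model carries the cleaned-up version.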

With the data in the application, the next big piece is to wire up the functionality to any required interfaces, be it GUI, command line or even some type of batch mechanism.

As the data moves forward to the user, it often needs to be dressed with fancy presentations, stuff that gets easily stripped away later. The original data type should drive these visualizations.
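One way to let the data type drive the presentation is to dispatch on the type itself. The sketch below uses `functools.singledispatch` from the standard library; the `present` function and the particular formats (a dollar sign for decimals, day/month/year for dates) are assumptions chosen only for illustration.

```python
from datetime import date
from decimal import Decimal
from functools import singledispatch


@singledispatch
def present(value) -> str:
    # Fallback: no fancy dressing, just the plain text form.
    return str(value)


@present.register
def _(value: Decimal) -> str:
    # Assumption for this sketch: decimals are amounts of money in dollars.
    return f"${value:,.2f}"


@present.register
def _(value: date) -> str:
    # Assumption for this sketch: dates are shown as day/month/year.
    return value.strftime("%d/%m/%Y")
```

The underlying values are never altered, so the dressing is easy to strip away again later.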

Thus, programming is simple.

Well, almost. There are a few things that keep this neat progression from happening easily on a quick and repeatable basis.

The first one is that our modern technology has serious problems. Really serious problems.

I can't imagine the very early years of computing, when "programmers" had to manually flip actual switches on the boxes just to boot their machines. It must have been so painful and boring. But all I can think of is that someday in the future, people will look back at these days and think "I can't imagine how these guys coped with their millions of repeating lines of code, it must have been so painful and boring".

Toss in the supplementary code, scripts, documentation, designs, testing, and tutorials, and the same underlying simple bits of information get splattered redundantly across a humongous range of different formats, files, and locations. We don't just repeat ourselves, we go nuts in doing so.

Given what we actually need to accomplish, our technologies and their inherent weaknesses drive us to distraction.

In many ways, their most serious problem has become their insanely overcomplicated architectures. We jump through so many hoops and tricks, just to solve the same simple technical problems over and over again, that the real domain problems get lost in the carnage.

The second big thing we face is people. And it's a problem that hits programmers from two sides simultaneously.

From the front, the users know what they want, but we orient our systems vertically. Thus we are only interested in a thin slice of their problems. This impedance mismatch makes it hard to get them to communicate exactly, in a precise way, how we should quantify that little slice of the world and automate it. They can't say it, and often we're not interested in guessing or digging deeply. Many programmers think their boundaries stop at the edges of their little niche and choose to be very territorial.

From the back, most systems are more work than a single person could complete in a reasonable time. Thus, the work needs to be partitioned amongst a larger group. The culture of coding is about freedom and independence, so like any good herd of cats, shortly after the design meeting, everybody goes off in their own direction and does their own thing.

Management is genuinely surprised when it all doesn't come together in the end. All the individuals believe that their particular direction and approach was the correct one. It generally goes downhill from there.

Assuming that the lame technology and organizational problems aren't particularly fatal, the single greatest threat to simplicity faced by all programmers is themselves.

Most of us fell into our employment positions because we were fascinated by intricate things like the inner workings of a clock. That's good because we like what we do, but bad because it means that we have a real tendency to push an overwhelming amount of complexity onto even trivial problems.

We like complexity. We think it's neat. So it's no wonder that we're drawn toward taking some wantonly over-complex approach to even the simplest of development problems. It's inherent in our nature.

Competing with that is the fact that programming is a huge amount of work. Too much work. And the stress of all of that pending effort forces many programmers to try and rush through the coding process at their fastest possible speeds. High-speed sustainable programming is great, but charging forward in spasmodic spurts leads programmers towards just crudely pounding out bad code, in the hope that it can be fixed later. And later never comes.

If it weren't for these three problems, programming would be simple. We would simply decide what data we wanted to add, how we were going to use it, and then go about implementing and testing the code.

In most programs, most of the time, once we've gotten past the initial technical architectures, the essence of programming shouldn't be any more complex than that.

But it just doesn't work that way. It should. It makes life easier, and programming more enjoyable, but in practice, we never seem to get there, although some of us figure it is possible.

So how do we get to simple?

There isn't much we can do about the technology in the short run. It is what it is. Although, we should try wherever possible to choose technologies that are inherently elegant. If the market demanded elegance, the vendors would have little choice. Right now we don't, so they don't provide it. Why do extra work if it doesn't pay?

Elegance is simple enough to distinguish. If the technology looks simple, like you might have been able to build it yourself in a few weeks, then it is probably elegant.

Choosing simple, straightforward technologies that minimize their requirements for integration will go a long way towards weeding out the many bad apples we currently have.

People, however, will always be a problem.

The best way to deal with the users is to understand that the software developers are really the experts in the equation. The users have a broad perspective across the whole landscape and need to be mined for that view, but they don't have the capacity to turn their knowledge and experiences into a working system. If they did, they wouldn't need programmers.

My earlier post on Architectural Ramblings dealt with matching system architecture to team structure. It's an open problem, and always a trade-off, but at least if you understand it, it is less likely to be disruptive. It's near impossible to achieve a strong amount of consistency in programming right now with a group, so the overall elements of the architecture should assume that, and not rely on it in order for the system to be successful.

Failure has to be built in or explicitly monitored.

As for ourselves, for most of us, we will always be the greatest enemy that we face. Our need to get excited by the work compels us to push our own envelopes. Doing the same old thing, again and again, is boring.

Inevitably, that leads to mistakes in experience or judgment. Simple choices, gone wrong and then hidden in the code. Often causing larger downstream problems. There probably isn't a system written that doesn't have some nasty kind of WTF buried at its core. At least one, if not hundreds.

Even when it's not obviously wrong, most programmers tend towards over-complicating their designs, then resorting to brute force to slam in the actual mechanics to meet their deadlines. They go extreme on one side, then extra crude on the other, creating brittle code that is trying too hard.

We can handle this in two ways. The first is by starting out with code that is ridiculously over-simplified. Complexity is easy to add but nearly impossible to remove. If the first iteration of code is an algorithm so simple that it couldn't possibly work, it is not that hard to slowly enhance it into something that meets the criteria. Over-simplify, and then work backward.
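A tiny sketch of what that can look like in practice. The lookup problem and both function names are hypothetical; the point is only that the first version is almost too simple to be useful, and the second adds just enough to meet the criteria.

```python
def find_v1(records, query):
    # First iteration: exact matches only. Probably too simple to ship.
    return [r for r in records if r == query]


def find_v2(records, query):
    # Small enhancement: ignore case and surrounding whitespace.
    wanted = query.strip().lower()
    return [r for r in records if r.strip().lower() == wanted]
```

Each later version adds one small, understandable piece of complexity, rather than starting from a grand design and trying to pare it back.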

The second way to handle this is by expecting that the significant bulk of our programming effort will not be spent writing new code. Deletions and modifications to code are far more valuable to an existing system. If you start the day by deleting a few thousand lines of code, it is a good programming day.

Many people know about refactoring, but often people are too afraid to actually apply it to their own code. Instead, they'll put up with obvious deficiencies, because they don't want to upset the apple cart. They don't want to introduce a whole new suite of bugs while they are working on other things.

Truthfully, programmers have hot days, and not so hot days. The best skill one can add to their repertoire is the self-discipline to clean up and refactor on the off days. If you touch a file and it's messy, you shouldn't leave it until you've cleaned up the problems.

For those paranoid institutions that don't want total chaos in their codebase every week as programmers maddeningly change everything, it is a simple matter of restricting the refactorings to be non-destructive. You can compress or clean up the code, so long as its (expected) algorithmic behavior doesn't change. Sure, initially some things will change a bit too often, but those will be bugs (or just insanely complex bits). Over time, the code will stabilize. The short term pain will be worth the long term gain.
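A minimal sketch of a non-destructive refactoring, with a check that the expected behavior has not changed. The pricing functions and the `behaves_the_same` helper are hypothetical names invented for this example; in a real codebase the existing test suite would usually play the role of the helper.

```python
def total_price_original(items):
    # The messy version that has been sitting in the file for years.
    total = 0
    for item in items:
        if item["qty"] > 0:
            total = total + item["price"] * item["qty"]
    return total


def total_price_refactored(items):
    # Compressed version: same inputs, same expected results.
    return sum(i["price"] * i["qty"] for i in items if i["qty"] > 0)


def behaves_the_same(old, new, cases):
    # Non-destructive means every known case yields an identical result.
    return all(old(case) == new(case) for case in cases)
```

If the check ever fails, either the refactoring changed the behavior, or the original had a bug that the cleanup has just exposed.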

Programming should be simple.

It's often not, but for reasons that are mostly fixable. There may be some scrambling initially as the direction of the project gets set, but over time, development on a codebase should always get easier. This happens because the base problems get solved and encapsulated.

As more and more of the mechanics get implemented, the scope of the functionality should grow larger, get more complete, and be more enjoyable to work on.

If you're on a development project that is past its initial stage, and it is still a hard slog, then that is likely caused by self-inflicted actions. It is punishment for not pushing harder to make sure things are easier. After all, programming is simple.
