Monday, December 22, 2008

My Stack Overfloweth

A couple of weeks ago I finally got around to visiting Stack Overflow, the new Q&A site by Joel Spolsky and Jeff Atwood. It's a simple concept that I found both fun and engaging, a place where users can ask and answer specific programming questions.

It's easy to submit new questions, so there is a constant stream of inquiries to be answered. For each question you ask or answer you give, you get a collection of reputation points (some positive, some negative, depending on content). As you answer more questions, you get more abilities and can do more things on the site. Also, there are these cutesy little badges that you can earn for specific milestones, adding an extra amusing level to it.

It's a great site, completely engaging with a decent interface. It is simple enough to use, and it has the feel of being a cross between a social site, a news feed and a repository of answers. In concept I think it's a great idea, but I was finding that in practice there were some interesting issues.

Sites come and go rapidly on the Internet; it is a very volatile medium. Things move very quickly. This makes it an interesting place to observe how people interact with each other and their software. You don't have to spend a long time to get a good sense of what is happening. Web 2.0 has provided a large number of good public examples for analysis.

For this entry I figured that although I like Stack Overflow, the dynamics were interesting enough that I could spend a little time digging into the types of problems that may occur with the site. It's a good place for analysis, and we usually don't have to wait long to see if it's right or not.


AT THE EDGE OF CONTROVERSY

My short membership on the site has been fun, but not without its share of controversy. Not being one to lurk for too long, I jumped in pretty quickly with a few questions and a fair number of answers.

What surprised me was that my first question "Success vs. Freedom" got into trouble right away. The title was re-edited to "What kind of programming method do you prefer? Success vs. Freedom", and then a discussion broke out in the comments about whether or not it was an appropriate question.

The mandate of the site is to answer "programming" questions, and it seems that some of the "stackers", as they like to call themselves, take that very literally, while others would prefer a wider latitude. A problem with this is that roving stackers with high reputation are allowed to enforce any rules as they see fit. If they have enough reputation points, they are allowed to alter the questions. The site content is completely controlled by its users.

My first question was closed initially because there was no way to answer it with just facts; all answers were inherently subjective. However, it was re-opened shortly thereafter by another stacker, and managed to stay around for a while.

My next question was trying to provoke a discussion on this issue, clearly against the rules, but it too failed. It was "Are there legitimate programming questions that cannot be answered by facts?", a loaded question making reference to the fact that the FAQ says any question asked should be expecting just facts as answers.

The problem with Computer Science is that it is a highly subjective industry at the moment; we have precious few 'facts' that are not actually contained in a manual somewhere on the web. I wanted to get people thinking about the scope of this site, and whether or not facts are just far too limiting.

In a very real sense the questions with subjective answers are the ones that are above and beyond just the simple RTFM questions. Facts might be nice, but they are too simple to really be helpful, at least not in the long run.

Most surprisingly, my third question, which was really relevant to my current working situation, got shut down immediately. I recently switched my main programming workstation to a quad core chip and I was wondering how to best set it up. Here for the first time I really needed an answer to a real problem, yet my query "What are the best performance options in XP for a quad core chip?" was shut down right away for not being relevant to programming.

The comments suggested that a cheap spin on the question -- like following one of those fortune cookie tricks where one adds "in bed" to the end of each fortune -- might have allowed the question to remain open. In this case one might simply tack on "for programming" to each question and possibly get away with it. I was thinking about trying that, but it just seemed far too lame.

Judging by the range of the various questions in the site, there seemed to be some disagreement going on, as to how to interpret the FAQ. As often plagues these sites, the old guard is struggling against the new guard for control of the content. Wherever there are people, there are disagreements.


THE LEFT VS THE RIGHT

The rules of the site are very specific, but some stackers feel that the interpretation should be lenient, while others want a very strict enforcement. That matches any large group of people, where the more serious people often get worked up about the little details, while others advocate a more relaxed approach to life.

The conservative members do have the noble goal of trying to keep the site from degenerating. Clay Shirky famously pointed out that the group is its own worst enemy, so it's likely they are right; if left unchecked the content on the site will go downhill quickly. Still, one has to be careful with simplifying any situation to that degree. Behavior is dependent on the context and the type of people involved.

A site like this is mostly made up of two types of people, those seeking answers and those wanting to answer. The dynamics of those two groups will either help the site to grow, or let it wither away.

The rules dictate little discussion, so the stream of questions is forced to be very specific and more or less junior in nature. Fact-based answers are really things that should be documented in help pages somewhere, thus the site's mandate is to essentially stick to well-known documented items or items that were missed for some reason.

Keeping the scope of the current questions down to a tight-knit range of purely factual questions will definitely keep out the fracas, but it will also heavily constrain both sets of users of the system.

New programmers go to the web to avoid reading documentation, and as I said, that makes up the majority of fact-based questions. The questions are simple, and direct, yet they belong to some larger, unknown context. They weren't asked at random. That means the usage of the answer may actually be more significant than the question itself. We'll get back to that point a bit later.

Also of interest, those answering the questions are not necessarily, as one might assume, the seniors in the industry. In fact it's more likely that they themselves are still in their first five to ten years of experience, just starting their careers. It's a leap, but not a far one since for most programmers who have seen several generations of technology come and go, there is a tendency to be more elastic with their knowledge.

When you first get into coding, you're driven to learn everything possible, but as time goes on and your favorite technologies get replaced again and again, you start to wise up. Why go deep into the technology if it's just going to get replaced again? There are, no doubt, many older senior programmers that maintain their high energies, but they are unlikely to be the majority. Most of us have seen it once too often to be impressed by seeing it again (in a crappier form). We lose interest in the little details as we get more experienced. You can't help it.

Thus, most of the people capable of answering the more modern fact-based questions are the ones that have learned that information recently and are still interested in showing off how much they know. At the moment, it's fairly safe to assume that most of the people attracted to the site are juniors and intermediates.

Sure there are lots of seniors (you see this in the "How old are you, and how old were you when you first started coding?" question that they keep trying to kill), but their answers are probably more selective, and subjective. And they'll drift away faster.

Thus the conservative approach to keeping the site pure by weeding out questions is to continually put pressure on the site to answer just fact-based questions. But these types of questions, for both the asking and answering sides, tend to drive the content down to a more junior level. A level that competes more heavily with the rest of the Internet, and also tends to become stale quickly.

The conservative side is perpetually in jeopardy of choking off the good and really useful questions, primarily because they don't have a long enough horizon to distinguish between the real issues and the faux ones. My question on XP settings was a good example, as it very much relates to programming, yet it doesn't explicitly mention the word 'programming'. A limited scope means limited usefulness.

Still, there always needs to be some type of constraint placed on the content. If the site becomes too unruly, the signal-to-noise ratio will cut off its usefulness. There will be too much junk to make using it worthwhile. There are enough degenerated sites out there to know that it is a very probable direction for this site.

So we can speculate that the site leaning too heavily in either direction will kill it. However, there are also other problems that will build up as significant issues over time.


THE REAL SOURCE OF KNOWLEDGE

A long long time ago, when I was a junior programmer, a coworker came in hurriedly and asked me a technical question. I can't remember the details, but it was an easy question for me to answer. Just as I was starting to, my officemate, a really skilled, senior, battle-hardened consultant, jumped in and started asking questions. They were mostly in the form of "why do you want to know that?". As he dug, it turned out that my coworker was clearly on the wrong track. A dangerous track that was minutes away from doing something dreadful to the system.

If we hadn't dug into the issues, I would have contributed to causing a huge mess. My coworker had glommed onto that specific question because they were basing their knowledge on a series of incorrect assumptions. The question, being simple, was actually a huge indicator of a serious misunderstanding. If they understood what was going on, they never, never would have asked that question.

The consultant, of course, chewed me out for having very nearly supplied the fatal piece of information. He told me, and I always remembered, that as a professional it was my duty to understand how people were going to use the information, before I was to give it out easily. In a case like the above, he said it was important to understand why they were asking the questions, because in many ways that was far more important than the question itself.

These days, that attitude feels very selfish, but there are certainly enough things out there that juniors shouldn't know until they are ready to understand them. They need the information, perhaps, but they need the surrounding context far more.

If we give out information, without giving out knowledge, then there is a huge chance that that information will be used poorly or in a wrong way. And it's exactly that which has caused a shift for the worse in programming. It is too easy to get past simple problems, without having the prerequisite knowledge first. We're teaching toddlers to drive cars before they can even walk, and then we're surprised by the number of accidents.


JUNK FOOD DIET

The web brought with it a huge boon for programmers. Manuals were no longer hard-to-get things, hidden in some corner or on another floor. We no longer have to pore over them for hours to seek the smallest of answers to the biggest of questions. All of a sudden a quick search and *poof* you've got your answer, or at least a discussion of the answer.

That change is a good thing in that programming has become easier and diagnosing problems has become faster and more effective. However, all things good come with consequences ...

The web has made it easier to program. In fact the web has made it easier to know far less about programming. And in many ways this is a big problem. We are seeing far more code with a higher degree of artificial complexity that should never have been allowed into a production site. Sites like WTF are getting overloaded with content. Stuff that should have died because the programmer was flailing badly suddenly gets released because the coder managed to route around the types of bugs that should have been fatal. Bugs, oddly, have their value sometimes.

We've become so addicted to this fast-food information, and it's having a huge effect. It's highly dangerous to give all of these programmers the simple stupid answers if they do not understand the content of what they are doing. We're not making software development better, we're ruining it. We're making it more possible for partially trained programmers to route around problems that they have no idea how to actually solve.

When you respond with "set the CosmicInverterThreshold variable to 24" it allows the programmer to get past the understanding of why that is the correct parameter, without really knowing what a cosmic inverter is, or why the old value was causing a malfunction. They can just happily ignore the obvious problems, and hopefully the non-obvious ones will disappear too.

We've been seeing the effects of this in the industry for years. Yes, the web opened up the door to quick information, but the quality of the underlying tools has degenerated. We're losing touch with how the machines work, trading it away for just poorly patching problems in a debugger.

It's nice on the one hand for making development faster and easier, but the underlying complexity of our systems has clearly outpaced our abilities. How many programmers actually know what is really happening in their OS? How much is just a big mystery? Are the little elves that move around the bits red or green?

This allows weaker programmers to create things that should not have been created, and it also helps to preserve libraries and utilities that should have died a natural death because they were poorly built. Too much information, with too little knowledge, has become a serious problem for most disciplines, but a particularly severe one in software.

What is worse, of course, is when you see those posts that ask or wonder if you need to learn any theory to program. The current answer: "no, you don't", but it should really be: "yes you do!"

You should absolutely have that knowledge, particularly if you are going to write something serious, but now because of the web, you can just skip over it and forget about the consequences. It's as if we've given the ability for home hobbyists to start building apartment buildings, in any manner they choose. Some buildings, of course, will be nice and well-built, but the rest?


THE QUALITY OF KNOWLEDGE

Quick answer sites serve up huge plates of junk food knowledge. Fast little tidbits that are incomplete platitudes passing for real answers, for real knowledge. The problem is that a steady diet of this type of fluff foolishly convinces most people that they don't need the real thing.

And like anything in this world, there is a huge distance between essentially understanding something, and really understanding the details. Barring those with photographic memory, most people don't really get the details unless they sit within a bigger, more extensive conceptual understanding.

That is, you do need to know the theory, so that you can really understand the practice. No matter how well you know the practice, you cannot fill in the missing bits without understanding the encompassing theory. And it's what you don't know that will cause all the problems.

You might get by, but you'll never be an expert in something if your only real tangible knowledge is relatively light. Sure you can program for instance, but without a bigger deeper theoretical background you probably will have trouble as an architect, for example. Precisely because you'll follow the popular trends, even when they are way off, and you won't be able to see why they are dysfunctional (until it's way too late). Software development is full of a lot of bad advice, and the only cure is to have a deep understanding.

This problem already occurs on a frequent basis in software. The industry relies on a huge number of domain specific programmers to write very specialized code for specific applications. That is fine, but if you've ever encountered a project where it's all domain specific coders, 100% of them, and no classically trained Computer Scientists, you're likely to find very esoteric code that fails on a very basic level. The domain specific stuff may be quite acceptable for the core functionality, but the whole package is often weak and unstable because of what's not known.


THE SOURCE OF ALL

Knowledge isn't just remembering a few facts or being able to ape a few movements. Information might be what you get, but when you can place something in its context, its real context, then you truly know it.

We take everything we learn and we use that to drive our work. If there are huge gaps in what we know, then our efforts are more dependent on luck than they should or need to be. That, in an industry like software where we already rely on a huge amount of luck and guesswork to see us through, puts any project driven by people without a lot of knowledge in a dangerous position.

We all know luck pays off sometimes, somebody wins the lottery, and good things sometimes happen, but the difference between an amateur and a professional is easily the amount of luck they rely on to be successful in their endeavors. Anybody can get lucky, and many people do. But most do not.

Software is exceptionally complicated, and now it's packed into endless layers, each a little more complex than the last. For most of the code out there, there are plenty of real theories on which the original designs were based. The industry has a great deal more theoretical knowledge than most programmers are aware of. We know more than we think we do; it's just lost in old papers, private mailing lists or not disseminated to the masses. We're probably the worst industry for having the practitioners ignore most of the common knowledge.

Knowledge is really what most juniors need when they start asking questions; however, easy access to quick answers allows them to bypass that need, and get back to flailing at the keyboards.


GETTING BACK TO THE BASICS

Overall, despite the problems with the content, I really like the existing site. Stack Overflow is entertaining; however, that is not necessarily a good thing. Fun may draw in a number of people in the short term but it's not enough to keep them hanging around over time. Fun quickly gets dull; the users move on to the next thing.

It takes more than fun to keep users hanging around for the long term. Once the site is no longer the next cool thing, then it faces the real challenges. You need more than just fun to give people a real reason for coming back to the site. The content must go beyond just simple fact-based questions. If people aren't getting something more substantial, they'll quickly move on to the next form of entertainment.

Software, in its current state is a mess and the only real way out of that space is by evolving our knowledge. The reason this isn't happening is that every new generation of coders is rewriting, in a new technology, the same old stuff over and over. And in some weird ways, some strange memes like MVC become sacred cows, while other more significant ideas get lost and reinvented poorly.

What we really need is to honestly discuss our profession and to explore what works and what doesn't. It sounds simple enough, but programmers always fall back into defensive positions, thus making it nearly impossible to really discuss things.

One reason progress is so slow with software development is that our culture is to hate something or believe in it 110%. There's no in between, so questioning becomes a lack of faith, and thus we get very few real objective discussions. This has been stagnating the industry for years. Once X has become the technology du jour, everyone jumps on the bandwagon or completely hates it. It never gets viewed objectively for both its good and bad qualities. Once it's old, it's 100% crap; we never learn any lessons from the good parts.

We just go round in circles. Software will never improve until we find a way to improve it.

And we'll never improve it, until we find a way to discuss the strengths and weaknesses, honestly without rhetoric.


SUGGESTIONS

Discussion and deeper knowledge are huge issues that I think Stack Overflow will need to confront if it wants to remain current. These are the real value-adds that a site like this needs in order to continuously draw in people over a long period of time, and they are the real value-adds that our industry also needs in order to break out of its current mediocre bonds and start utilizing the potential of hardware.

With that in mind, there are a few simple things I think would really help kick the site up to a more useful level:

- allow polls (and let other people mine the data)

- set up a section for definitions (with the best ones floating to the top)

- limit the site to all of software development, not just programming

- open up separate domain-specific sections

- open up separate sections for environment issues (chairs, desks) and tools.

- allow discussions, but limit them; subjective stuff is very important to our industry.

As far as discussions go, find some way to allow them, yet contain them at the same time. Everyone should get their say, but just once or twice at most (I like how the current comments are way too short, so you can't respond with an essay).

And most importantly:

- Encourage alternative ways of thinking, expressing, working, etc.

The answers we have are by no means at a significantly high state. Software is an immature industry, which eventually will grow tremendously over the centuries. What we know now will get supplanted with better technologies over time; it's just a matter of how fast.

No matter what happens with Stack Overflow, I'm finding it fun right now, so I'll hang around for a bit. But no doubt, like Slashdot, Facebook, Digg and most of the others, when the novelty wears off, I'll fade away. I always lose interest quickly in these types of things.

16 comments:

  1. As usual Paul, great post. For some reason you manage to capture my attention over and over again despite the fact that your articles are insanely long. :-)

    "some strange memes like MVC become sacred cows, while other more significant ideas get lost and reinvented poorly."

    I'd love to see a follow-up on that statement. What significant ideas would you say we've "lost and reinvented poorly"?

    "but programmers always fall back into defensive positions, thus making it nearly impossible to really discuss things."

    I would say this is a property of humans, not particularly programmers, but I know what you're talking about. And I too have been guilty as charged although I get more and more humble for each year in the business (18 and still counting).

  2. I could not agree more. Stackoverflow is a nice idea implemented poorly.

  3. Excellent article - coherent and cogent. Right on the money.

  4. Big post, well written but way too big for its contents.

    I think Stack Overflow is a breath of fresh air, and something that should have come up a long time ago.

    We have this famous sentence spoken by a well-known comedian in Portugal that goes like so: "Falas falas mas não dizes nada", which translates to something like: "You talk and talk but I do not see you saying anything useful"

  5. I came across a link to this post on SO. Great article.
    I just wish more people who manage programmers read stuff like this instead of whatever crap they are reading.

  6. "...as a professional it was my duty to understand how people were going to use the information, before I was to give it out easily."

    I have found this to be very true. Asking background questions before dishing out the answer is essential, especially when in a mentor role.

  7. @Hans-Eric,

    Thanks, I try to shorten them, but the ideas just flow together (I guess I am just long-winded by nature :-)

    I find that many of the modern tools, like Ant, miss out on the significance of their predecessors like Make (no builtin dependency checking, turns my XP box into an XT at compile time). My new IDE based programming environment, NetBeans, is way less flexible and automated than my vi, grep, find, UNIX, X, twm one from twenty years ago. Subversion dropped the ability to easily combine multiple resources together that CVS had in its module file. As well, my tools are now littered with all sorts of pretty buttons and things that I don't understand and will never use.

    And that's just my technical software, I don't even want to start on what is wrong with the newer versions of Word, Office, Email, etc. Too much badly organized functionality, that is too hard to utilize properly. Eek.

    I figure, because we spend all day in logic, programmers have an above average tendency to reduce everything to static black and white issues. I keep having to remind myself that it is a gray, gray world :-)

    @everyone,

    Thanks for the comments. Don't be too hard on Stack Overflow, they did an amazing job building the site, and it's still in its infancy, so there is plenty of room to grow. In many ways it's our modern culture that is to blame for all of the Internet's informational junk food; SO is just reflecting the world around it.

    Paul.

  8. "You talk and talk but I do not see you saying anything useful"

    I think that captures my reaction to your article very well. Paul, you raise some very genuine concerns.

    What are your proposed solutions other than adding a few extra sections to stackoverflow?

    How do you propose solving the problem of 'Juniors' or 'Intermediate' programmers being the heaviest dispensers of advice, rather than the so-called 'Senior' ones?

    How do you propose programmers who have just joined the trade and are still learning, find answers to IMPLEMENTATION issues, beyond just api documentation, in a way that is better than what StackOverflow provides?

    I challenge you to propose a model that is better than the one Jeff and Joel have come up with, rather than dismissing their work in a condescending manner as being merely 'fun' and 'amusing', and later saving grace by conceding that they have put in hard work in a comment.


    I also challenge you to next time use a spelling and grammar checker. Surely a post as long as this was not written inside of a simple text editor?

  9. I think the worry about the consequences of easily available answers is a red herring. No matter how much theoretical knowledge some software developer has he or she will always, when writing real code in real-world environments, run into snags and bugs in APIs and confusing specs. This is where sites like stackoverflow will be a tremendous boon.

    I like that the SO team is trying to come up with the best solution to this kind of problem, and not getting distracted by trying to solve all the problems in the world of software development and computer science, which is what the author seems to be wishing for.

    StackOverflow is a very nice, useful tree and a forest that needs more of them. Let's be glad it's here and use it for what it is best at, quick answers to questions that more often than not are about little things that used to take hours of searching to find out.

  10. @Danish Munir,

    "You talk and talk but I do not see you saying anything useful"

    The usefulness is in the analysis, in digging into the details to get a better understanding of both the good and bad issues. If I just re-iterated a bunch of meaningless facts about the site would that have been better?

    There is nothing wrong with the current site other than that its scope is too narrow to possibly survive in the long run. I wasn't being condescending, I was trying to objectively analyze the different elements in play. The conclusions drawn are up to you, not me.

    My only real advice is to widen the scope early before the crowd moves onto the next location. SO has collected a good audience, now they just need to keep them.

    @Al,

    One problem as I see it, is that little bits of information grow at a massive rate. Unless it is mined and summarized, it quickly grows beyond its useful potential. As it gets larger it will be just as hard to search for answers on SO as it is on the Internet. The site's simplicity is an artifact of its small size.

    Paul.

    BTW: Google docs; it doesn't check grammar and doesn't notice duplicate words. If you want well-edited polished works, it's best to stick to the major media outlets. Sure it's spun, but it's their day job, as opposed to something done at irregular intervals, late in the evening.

  11. Wow that was long. Read the first part about your questions that got shut down. Think about it like this: Do I really want to go to stack overflow to read about questions on what people think, or about usual hardware things? No, I want to go there to see how to do something tricky in the code I'm writing. If I wanted to talk about something like that I'd go to proggit.

  12. StackOverflow was started for a specific purpose, to answer cut and dried programming questions and make them easy to find. Your questions intended to get people to think were instead only a boring waste of time. Just because people want some place to go to quickly find solutions for (usually) clearly definable problems does not imply that those people need a programming messiah to break them out of their narrow-mindedness. It also doesn't imply that they don't recognize the value of deeper discussions about programming that StackOverflow doesn't provide. It does suggest that they have adequate common sense and realize that StackOverflow isn't the place for those discussions.

    Someone earlier in the comments suggested StackOverflow is a nice idea but implemented badly. Really? Please direct me to the site that does what StackOverflow explicitly intends, and does it better. There's a tendency among programmers to say that stuff that isn't perfect is bad. Mostly they say this about other people's stuff. It's strange because programmers in general have extremely high IQs and tremendous analytical skills, but a disproportionate number of them seem to be very poor at economics.

  13. Hi Michael,

    Thanks for your comments. Oddly I think you answered your own point. There have been lots (hundreds?) of Q&A sites started over the years, they just degenerated beyond the point of usefulness that's all. Now would be a good time for SO to figure out how to avoid that same fate.

    Our industry is proud of the fact that the success rate has improved to 30% from 15%. For any other 'constructive' discipline, that would be a huge embarrassment. I know it's nicer to blame management or global warming or something bigger like that, but most of the problems start with programmers flailing at their keyboards, and just get worse from there.

    Sure, some of the people seeking fast answers are just looking for a quick fix, they already understand their problems, but many of them are routing around having to learn something. They are depending on luck more than skill. There will always be a need for a good clean Q&A site, but there is a much bigger need to dispense knowledge, not just information. We live in an era where people don't want to fix the real problems (or even know about them), they just want quick band-aids, so they can get back to their iPods.

    We definitely need a programming messiah to come along and change our industry. Someday people will look back at what we're doing now -- in the same way we look back at those early programmers toggling switches and punching cards -- with disbelief that anyone could have used such crude methods and gotten anything accomplished.

    It's some type of cruel modern joke that microeconomics has become the key justifier for poor quality. The market feeds us cheap merchandise because people buy into the selfish delusion that more stuff will make them happier. Hopefully it's just a phase we are going through (and people return to their senses).

    That doesn't apply to SO (it's just an observation about the madness of our current world), I just think they got half way to excellent, but could go a lot farther. We shouldn't be afraid to analyze things even if they appear to be mostly working (nothing is perfect). There is always room for improvement. There is always room for discussion.

    Paul.

  14. We still get the same panorama we had years before.

    Web is full of kiddies, full of people trying to impose patterns and stuff.

    Here in Brazil people like to "show up" and that's why Orkut is so popular here (I think it is the greatest Orkut public).

    Just put a reputation-system and your users will fight for fame rather than knowledge. Too volatile.

  15. Hi Jan,

    Thanks for the comments. I guess people flow towards easy answers in the same way that electricity flows towards the shortest path. We all want the fame and fortune, but most of us would prefer to get there easily ...

    Paul.

