
The article's conclusion seems to be that even if debugging/software development can't be parallelized perfectly (which is true because debugging requires a buildup of mental context about a codebase), since the amount of time required is a random variable, doing the entire task redundantly in parallel will necessarily reduce the expected time until the first person finishes. For instance, if you throw $n$ developers with equal knowledge of a codebase independently at a problem whose time-to-solve is exponentially distributed with rate $\lambda$, then the first solution arrives at rate $n \lambda$, i.e. after an expected time of $1/(n \lambda)$ instead of $1/\lambda$. Of course, this is almost always less efficient than letting them work on their own problems!
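To make the exponential example concrete, here is a quick simulation sketch. The function names, trial count, and seed are my own choices for illustration, and it assumes the developers' solve times are independent and identically distributed, which real teams won't satisfy exactly.

```python
import random


def simulate_min_solve_time(n_devs, rate, trials=100_000, seed=0):
    """Estimate the expected time until the first of n_devs finishes.

    Assumption: each developer's time-to-solve is an independent
    Exponential(rate) draw.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(trials):
        # The problem is solved as soon as the fastest developer finishes.
        total += min(rng.expovariate(rate) for _ in range(n_devs))
    return total / trials


def theoretical_min_solve_time(n_devs, rate):
    """Closed form: the minimum of n Exponential(rate) draws is Exponential(n * rate)."""
    return 1.0 / (n_devs * rate)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        estimate = simulate_min_solve_time(n, rate=0.5)
        exact = theoretical_min_solve_time(n, rate=0.5)
        print(f"n={n}: simulated {estimate:.3f}, closed form {exact:.3f}")
```

The simulated mean should track $1/(n \lambda)$ closely, so each doubling of the team only halves the expected wait.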

However, if you apply a similar analogy to reliability instead of time, having people work on the same task with what would appear at first to be redundant effort can increase reliability phenomenally. In a way, we see this in organizations that demarcate developers and test engineers - you're throwing bodies at a problem, creating a rivalry of two groups that challenge each other by attacking a problem from different angles, and in doing so you have a different distribution over the error/bug rate than you would otherwise.
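A back-of-the-envelope version of the reliability argument, under the strong (and admittedly optimistic) assumption that each reviewer or tester misses a given bug independently with the same probability. The function name and the numbers are hypothetical, not taken from the article.

```python
def prob_bug_escapes(p_miss, n_reviewers):
    """Probability that a bug slips past every reviewer.

    Assumption: each reviewer independently misses the bug with
    probability p_miss, so the misses multiply.
    """
    if not 0.0 <= p_miss <= 1.0:
        raise ValueError("p_miss must be a probability in [0, 1]")
    if n_reviewers < 0:
        raise ValueError("n_reviewers must be non-negative")
    return p_miss ** n_reviewers


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} reviewer(s): escape probability {prob_bug_escapes(0.2, n):.4f}")
```

With a 20% miss rate per person, three independent looks bring the escape probability down to roughly 0.8%. In practice the misses are correlated, which is presumably why separating development and testing into groups that attack the problem from different angles matters.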

An interesting read on this point is http://www.fastcompany.com/28121/they-write-right-stuff about the team that does (did :( ) software engineering for the space shuttle. Things like rigorous separation into development and testing sub-organizations... duplication of coding effort insofar as every line of code isn't just code-reviewed, but requires double-entry in a specification as well, and review of that specification... constant "process-ization" of debugging insights so the entire team doesn't make the same mistake again. It's the exact opposite of "agile," but it makes sense if reliability is what you want to model and optimize.



> It's the exact opposite of "agile," but it makes sense if reliability is what you want to model and optimize.

One of the more interesting aspects of that classic story is that the software developers, despite having deadlines of their own and having to adapt to ever-changing hardware on the Space Shuttle, almost never ended up in "crunch mode".


  > However, if you apply a similar analogy to reliability instead of time,
  > having people work on the same task with what would appear at first
  > to be redundant effort can increase reliability phenomenally.
See also RFC 1025 - TCP AND IP BAKE OFF http://tools.ietf.org/html/rfc1025



