Thursday, October 26, 2006

OOPSLA 2006: Day 3, Thursday, October 26

Today I attended a tutorial on Python, a talk, and some papers.

The best part of the day, by far, was the talk by Martin Rinard from MIT. He described an experiment in what he called "failure-oblivious" programming. The standard advice for handling errors in code is that a program should fail early and obviously when it runs into one. His experiment took the opposite approach: instead of having programs fail early, have them not fail at all.

His group took several open-source programs with known errors and changed memory allocation and pointer schemes so that they could not fail. If the system tried to read outside of an array boundary, it would just return a manufactured value. If it tried to write outside of the boundary, it would just not write.
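To make the read and write rules concrete, here is a small Python sketch of my own. It is not Rinard's implementation: his group worked at the level of memory allocation and pointers, while this only mimics the behavior with a wrapper class. The name ObliviousArray, the fixed manufactured value of 0, and the choice to treat negative indices as out of bounds are all assumptions made for illustration.

```python
class ObliviousArray:
    """Fixed-size array that never raises on an out-of-bounds integer index.

    Hypothetical illustration only: invalid reads return a manufactured
    value, invalid writes are silently dropped.
    """

    def __init__(self, size, manufactured=0):
        self._data = [0] * size
        # Assumption: a single fixed value is handed back for every bad read.
        self._manufactured = manufactured

    def _in_bounds(self, index):
        # Negative indices count as out of bounds here, unlike a Python list.
        return 0 <= index < len(self._data)

    def __getitem__(self, index):
        if self._in_bounds(index):
            return self._data[index]
        return self._manufactured  # read past the boundary: make up a value

    def __setitem__(self, index, value):
        if self._in_bounds(index):
            self._data[index] = value
        # write past the boundary: discard it and carry on

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        # Defined explicitly because __getitem__ never raises IndexError,
        # so Python's fallback iteration would never terminate.
        return iter(self._data)


buf = ObliviousArray(4)
for i in range(6):        # deliberately runs two slots past the end
    buf[i] = i + 1
values = [buf[i] for i in range(6)]
print(values)
```

The loop overruns the array by two elements, yet nothing is thrown. The two extra writes vanish and the two extra reads come back as the manufactured value, so the program keeps going with slightly wrong data instead of stopping.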

The result was that the programs did not fail. They would hiccup on the input that triggered the error, but they would then chug along just fine after that. Rinard's contention is that, in many contexts, this behavior is far preferable to the current practice of throwing exceptions and killing the application. For one thing, most people use only a small fraction of the features of most programs. Often the user would prefer to keep using the other features of an application even if one of them starts giving them trouble.

I would very much like to do some more reading in this area. It would be fun to experiment with this approach.
