OPIE and Debugging from a Computer Science POV

Posted in Computer Science 101, PHP Dev

Anyone can learn to code, but debugging your code is the more difficult, and more valuable, skill. Here I will argue that understanding coding as process-based logic, together with the OPIE (Output Equals Process of Input plus Errors) model, is a great asset in your arsenal for debugging your applications; indeed, it is the core of the fundamentals of debugging.
To begin with, interestingly, the term “computer bug” was popularized by the pioneering computer scientist Grace Hopper, who was not only a woman, but also an alumna of my own alma mater, Vassar. Now, having gotten that out of the way: what is debugging, and how does OPIE fit into it? Debugging is the art of removing computer bugs from your code. A bug is, of course, an error or problem that interferes with your code doing what it is supposed to do. In my opinion, there are four types of bugs in source code: design errors, syntax errors, semantic errors, and super-syntax errors.
A design error is when your flowchart or logic design for the macro-level business logic of the application is incorrect; in other words, you fundamentally made a mistake in how you designed the software. All of your source code might be linguistically correct, and the application coded exactly as designed, yet it does not achieve the desired result. At that point, having established that there are no syntactic bugs in the code, you must inevitably redesign the software itself.
A syntax error is where your design is correct, but you are in error (academic error, one might say) about the correct language syntax to use to accomplish a step in the logical instructions you are feeding to the CPU.
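For instance, here is a minimal PHP sketch of a syntax error and its fix (the greeting script is invented for illustration):

```php
<?php
// Broken version (kept in a comment so this file stays runnable):
// the missing semicolon is linguistically invalid, so the interpreter
// halts with a parse error before running anything.
//
//     $greeting = "Hello, world"
//     echo $greeting;
//
// Fixed version: terminate the statement correctly and the script runs.
$greeting = "Hello, world";
echo $greeting;
```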
A semantic error is one in which your syntax is correct, but the actual meaning of the syntax is in error: you think that something should evaluate to true when, in the context of your software logic, it should be false. Your semantics, your meaning, is that if a condition is met, something should happen, but you are incorrect in thinking that it should happen when that particular condition is met. As with a design error, finding no flaws in your syntax, you must end up redesigning the semantics to achieve the desired output.
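To illustrate, here is a minimal PHP sketch of a semantic error; the discount rule is an invented example:

```php
<?php
// The business rule is "apply a 10% discount to orders of $100 or more",
// but the comparison below encodes the opposite meaning. The syntax is
// perfectly valid; the meaning is wrong.
$orderTotal = 150.00;

if ($orderTotal < 100) {   // semantic bug: the rule calls for >= 100
    $orderTotal *= 0.9;
}

echo $orderTotal;          // prints 150: the discount never fired
```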
A super-syntax error is one in which a misunderstanding of syntax culminates in a semantic error. Your syntax is linguistically correct and valid, but it does something different from what you believe it does, which “infects” your semantics with undesired behavior: the code is not structurally in error and emits no error messages, yet a unit of syntax behaves differently than you expect, so the semantic meaning the code actually expresses differs from the meaning you want. The difference between a semantic error and a super-syntax error is that in a semantic error your beliefs about your syntax are correct but your semantic intentions are flawed, whereas in a super-syntax error your beliefs about your syntax are incorrect, and both your syntax and semantics would be correct if the syntax behaved the way you think it does. A super-syntax error also differs from a typical syntax error: a syntax error expresses itself as invalid syntax that will not compile (or execute, if interpreted), whereas a super-syntax error will execute, but with semantic results that differ from your goals.
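A classic PHP illustration of a super-syntax error involves strpos(), which returns the position of a match rather than a boolean (the email check below is an invented scenario):

```php
<?php
// strpos() returns the match position, which is 0 when the needle sits
// at the very start of the string, and 0 == false under loose
// comparison. The syntax is valid and my semantic intention is sound;
// my belief about what strpos() returns is what is wrong.
$address = "admin@example.com";

if (strpos($address, "admin") == false) {   // super-syntax bug: 0 == false
    echo "not an admin address\n";          // runs, wrongly
}

// The fix: strict comparison distinguishes 0 (found at position 0)
// from false (not found at all).
if (strpos($address, "admin") === false) {
    echo "not an admin address\n";          // correctly skipped
}
```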
Debugging is the art of, after having written source code, removing all bugs from it so that it will operate without error. (Of course, in practice, one will never remove 100% of all bugs, but debugging should aspire to this, and then concede 99% bug-free code as acceptable.)
I will presume that you have read the first part of my Computer Science 101 series of blog posts, so that I can use the terms OPIE and “process logic” without needing to provide a refresher. We must then ask: what does OPIE have to do with debugging? Everything, and for this reason: nine times out of ten, you will see an error, but you will not immediately see what caused it, and you will know the offending source code only from its results, not from seeing it highlighted directly. Therefore, to identify the bug, you must infer the error from its resultant output by means of logical inference. In other words, you must know the invisible by inference from the visible, and this inference must be so logically sound as to be unassailable if you are to know that you have found the source of the error, the bug, and can then fix it and rest easy that your code was debugged. How can you infer the invisible from the visible? By means of logic, and by understanding your application and its source code as a logical process. The basic tool is the “if X, then Y” argument in any logical process. Note (and this is very important) that I am not referring to the “if () {} else {}” syntax of any particular language, but rather to the fundamental concept of theoretical logic that, in a logical process, if something is true, it must have some result, logically, necessarily, and non-optionally. If this principle of logic is sound, then you can infer the cause from the effect, as it were, and thereby reverse-engineer the error to identify the bug to fix.
More often than not, you will have a bug. This bug may be output which is clearly not acceptable, or else your development environment may actually notify you of a bug, such as when the compiler or interpreter evaluates your syntax as linguistically invalid. Then you will have to analyze what could have caused this bug, in order to know where in your source to look and what to change. Say, for example, you have bug Z, and it could have been caused by A, B, or C. The logical form of the syllogism “if X, then Y” is that, if you can establish Y, you know that X may be true, and if you can establish X, then you know that Y must be true. Thus, provided that X and A are the only possible causes of Y, the argument “if X then Y, if A then Y, Y is true, therefore X or A is true” is sound. “If X then Y, X is true, therefore Y” is also sound. These, along with the basic logical principles of and, or, not, etc., are the basic toolkit of debugging (among many other tools, such as the ability to keep track of nested layers of arrays, knowledge of syntax, the ability to scour the web for free open source software, etc.; I do not mean to oversimplify software engineering and make it seem simpler than it really is).
Say, for example, that you have an error, which I will call X. Based on your knowledge of the logic of your application, you can say that X could have been caused by A or B, because you know that if A then X, and if B then X, and you can think of nothing in your code other than A or B that could cause X. Say, for the sake of the example, that you can also establish that if A then C, and if B then D. From an analysis of the output when you test your code, you can see that “not C” is true. From this, and, yes, from this alone, you can infer that B is the cause of the X bug. The ultimate test comes, obviously, when you change B and then run your code again. If the X bug is now gone, then you can feel comfortable that your analysis was correct: X is true; if A then X; if B then X; A or B; if A then C; not C; therefore not A; therefore B is true. Sometimes this will be all you have to go on, and this rigorous logical thinking, the ability to infer the invisible from the visible with confidence, is the reason why, as I see it, logic and process-based thinking are the core of debugging (and of coding as such).
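To make this chain of inference concrete, here is a minimal PHP sketch; the user-list scenario, file name, and data shape are all invented for illustration:

```php
<?php
// Bug X: the page renders an empty user list.
// Hypothesis A: users.json failed to load. If A then C: a "load failed"
// line appears in the error log.
// Hypothesis B: the active-user filter is wrong.
// The log is clean, so "not C" holds; therefore not A; therefore B.
//
// Sample users.json: [{"name": "grace", "active": 1}]
$json = file_get_contents("users.json");
if ($json === false) {
    error_log("load failed");      // evidence "C", which never appears
    $json = "[]";
}
$users = json_decode($json, true);

$visible = [];
foreach ($users as $user) {
    // Hypothesis B confirmed: "active" decodes as the integer 1,
    // and 1 === "1" is false, so every user fails the filter.
    if ($user["active"] === "1") {
        $visible[] = $user["name"];
    }
}

echo count($visible), " users\n";  // prints "0 users": bug X
```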
That having been said, there are all sorts of “developer happiness” tools for debugging, almost all of which aim to make the invisible visible so that you can see your bugs directly instead of needing to infer them abstractly by means of logical thinking. Stack traces, unit tests, application tests, debugging consoles, stepping through variables, etc., are all of this sort. With these tools, you might achieve a debugging logic shortcut: for the hypothesis that B caused error X, you can, as a matter of logic, establish that if B then E, and perhaps E is actually easy to see with some debugging tool, so that you can evaluate whether B is true in five seconds by running the test for E, without thick layers of inference and reasoning.
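For example, continuing the invented user-list scenario above, a single var_dump() is the five-second test for E:

```php
<?php
// Instead of inferring the type of $user["active"] through layers of
// logic, dump it and look: the invisible becomes visible.
$user = json_decode('{"name": "grace", "active": 1}', true);

var_dump($user["active"]);           // int(1), not the string "1"
var_dump($user["active"] === "1");   // bool(false): the filter can never pass
```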
This basic “if then” logic, based on logical inference and on understanding the process that your application source code embodies, is, in my experience, the core of debugging, upon which one then builds by means of knowledge of syntax, debugging tools, and so on. In computers, logic is everything, and debugging is no different. If you are not comfortable achieving confidence in a conclusion that is itself invisible by testing hypotheses upon visible elements, inferring the invisible from the visible, in other words, if you are not confident in your logic, then coding may not be your cup of tea (coders drink coffee, not tea, anyway).
There is one type of bug that I have not mentioned above: third-party bugs, where the bug is actually caused by a coding error made by a third party, such as the author of some open source library you are relying on, or even a bug in the most recent version of a framework or language you rely on. The problem with third-party error is that, even though it is possible and it happens, it is a bad idea behaviorally to make a habit of blaming other people for bugs in your own software. Virtually every popular framework and language goes through rigorous testing and debugging before a stable release is declared, so third-party errors in the popular frameworks are rare. My general rule is to set aside the hypothesis of third-party error until literally every possibility that you are yourself responsible for the bug has been ruled out by your testing and logical analysis. If no possibility remains other than a third-party bug, then, and only then, does that hypothesis become plausible, and it can be tested by swapping in a different third-party library and seeing whether the bug persists (although, obviously, the cost and resources needed to change the source code to test a different library may be severe, which is another reason to avoid this step until all else fails to find the bug).
Usually, in 99% of cases, the bug is your own fault, and relying on third-party error as an excuse will only contribute to bad coding practices. That having been said, every once in a while a popular library or component is authored by someone who is prone to mistakes, or is no longer under active development (think of the Internet Explorer web browser and the vast resources coders must devote to accommodating all the mistakes it makes), so third-party error is conceivable; but it should not play a role in one’s debugging process until one’s own error has been ruled out as the cause of the bug. Also, if a library is bad, one can often find other people documenting the same issues, on Stack Overflow and the like, at which point the third-party error becomes apparent, and the developer would then be well advised to switch to a different third-party library that does not have a reputation for developer misery.
