“Everything is a system, every system can be hacked, and humans are natural hackers.” — Bruce Schneier
I grew up playing a lot of board games, and it wasn’t long before I realized that there was a tremendous advantage in having a deeper and more thorough knowledge of the rules than other players. In 99% of cases, the common understanding was sufficient and correct, so most of that extra knowledge was relatively useless. But every so often an edge case would appear, and knowing how such ambiguous cases would be resolved by the rules could offer a critical edge.1
In an excellent report about present and near-term future risks of AI, Bruce Schneier uses the concept of hacking to discuss the potential exploitation of loopholes, vulnerabilities, and flaws in both technical and social systems. There’s lots in the report, and it really is excellent, so I’ll probably revisit it in a few posts here, but I wanted to start just by unpacking the term “hacking”.
Schneier begins with a definition of “hack” as an unanticipated subversion of a system, or something in a system that allows such a subversion. “Hacking is not cheating. It’s following the rules, but subverting their intent. It’s unintended. It’s an exploitation. It’s ‘gaming the system.’ … Hacks are clever, but not the same as innovations.”2
Although computers are likely our first association with the term hacking, Schneier correctly notes that the same concept applies just as well to sets of rules like the tax code, as well as to our own minds. Because it’s hard for mutually antagonistic legislators to write coherent and consistent laws, the tax system is full of bugs and loopholes, such as the so-called “Double Irish with a Dutch Sandwich”, in which corporations are able to make use of subsidiaries in multiple countries to avoid paying taxes. Similarly, even though humans weren’t designed by committee, we are also full of vulnerabilities, and can be relatively easy to exploit, especially when tired or distracted.
The threat, as Schneier sees it, is that computerized systems, which operate with greater speed, scale, and scope, are well poised to discover and exploit numerous vulnerabilities in our technical, social, and cognitive systems, to our great disadvantage. There are a few nuances to be debated, but overall I find Schneier’s relatively grounded discussion to be much more productive than the more imaginative scenarios and assumptions that are commonly deployed in the AI alignment community.
Regardless, it’s interesting to think a bit about this notion of hacking. As noted, it’s extremely hard to design a reasonably complicated system of rules such that there are no ambiguities or loopholes.3 This means that most such systems will have vulnerabilities that can be exploited.
Anti-inductivity would tell us that all profitable loopholes have already been exploited (or very soon will be), such that existing systems are basically unhackable. This is not actually the case in practice, both because the cost of discovering a hack may be hard to estimate, and because the potential payoff may be a matter of timing. Good old-fashioned computer hacking provides an interesting example here: many hacks are easily patched once discovered, and so those who discover them might sit on them for a considerable amount of time, hoping to exploit them at the most opportune moment.
Of course, the fact that it’s almost impossible to write complex rules without vulnerabilities means that it’s potentially quite easy to intentionally include such vulnerabilities in the design of a system in ways that will not be noticed. The exploitation of such loopholes would then, I suppose, not be considered hacking by Schneier’s definition, since the exploitability was intended (presumably by the person who will exploit it, perhaps in a roundabout way).
Backing up one level, the creation of rules is itself governed by rules and norms, and one might assume that they were meant to be followed in good faith. Thus one could say that the intentional introduction of a vulnerability into a set of rules is itself a kind of hacking of the rules-creation framework. Unless, that is, the potential exploitability of that framework was intended by its designers? Can one hack a system of one’s own design?
1. It might be argued that not sharing relevant knowledge of the rules is unfair, but the problem with this is that a) there is too much detail for most people to care to retain, and b) when such edge cases arise, it can be difficult to communicate without assuming or giving away strategic choices. In some sense, the real advantage is in seeing when ambiguities will appear, as much as in knowing how they will be resolved.
2. The full definition he gives is: “Def: Hack /hak/ (noun) 1. A clever, unintended exploitation of a system which: a) subverts the rules or norms of that system, b) at the expense of some other part of that system. 2. Something that a system allows, but that is unintended and unanticipated by its designers.”
3. Board games remain an interesting case study. Even though the games themselves are relatively complicated, well-written rules often provide a surprisingly comprehensive treatment of all scenarios that arise (though of course many sets of rules are not so well written).