tis.so
April 26, 2022

Reply to “Heuristics That Almost Always Work”

by Suspended Reason

I’m almost tempted to write a post called “How I think about dialectic” that says, “Look, all beliefs are theories about the world, which is to say, heuristics for acting. If there are two rival heuristics (‘beliefs’) that people walk around with—say, conflict theory and mistake theory—and each works OK or seems coherent, it would be wild to go, ‘Well, only one of these heuristics can be right; which one is it?’ No, you’d say, how do we drill into particulars to get a more nuanced understanding of when conflict vs mistake theory is more productive a frame? How do we refine our sense of the problem ontology in order to refine our agency (the efficacy of our heuristics)?”

A few weeks back, Collin brought up Scott Alexander’s recent newsletter, “Heuristics That Almost Always Work,” and some of the problems that emerge when you use the Law of the Excluded Middle outside of logic, in the real world.

One thing Scott is good at, in his writing, is setting up a puzzle from starting premises, and then trying to tease out apparent incoherences, inconsistencies, or problems. The issue, however, is that formulating the puzzle often takes for granted assumptions and premises which do not, when examined explicitly, hold true in the real world. A puzzle emerges which claims to be about the world, but is instead largely a product of the puzzle’s construction. (In this way, it is like philosophy’s thought experiment.)

In “Heuristics That Almost Always Work,” Scott supplies a list of example scenarios that are defined by having an investigator and a mystery. The mystery is introduced to the investigator by a cue, an external-facing sign of possible trouble. The investigator, in each thought experiment, repeatedly encounters the same cue and must decide how to interpret or act on it.

The investigator can resolve the ambiguity of the cue in two ways: by taking it seriously, or by dismissing it. The former we can label a positive response, the latter a negative. The investigator’s heuristic for interpreting the mystery can yield a true negative, true positive, false negative, and false positive. False negatives, in the examples Scott lists, are very rare but very costly. False positives are rather common, but can be cleared up and identified as false with a minimal amount of work. His agents learn, with repeated false positives, to uniformly dismiss cues (that is, label them negatives), rather than putting the time in to properly investigate. The uniform passivity of their response—never investigating, always dismissing, cues—makes their job as investigator fundamentally unnecessary. A spam filter that never filters out spam is not a spam filter, and certainly not worth paying an annual salary.

A very rudimentary way to behave rationally, in this paradigm, would be to perform a cost-benefit analysis, weighing the overhead of false positives (routine inspections and follow-up) against the tail risk of a false negative. A false positive, for a security guard who hears a rustling, might require setting down his crossword puzzle and doing a lap around the building with a flashlight. A false negative—a robbery on his watch—loses him the job. Depending on how badly he needs the job, and how often he expects a robbery to coincide with a rustling, he may or may not find it worthwhile to investigate such cues.
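To make that arithmetic concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything from Scott’s post: the function names, the cost units, the example numbers, and above all the simplification that an investigation always catches a real threat.

```python
def expected_cost_per_cue(investigate: bool, p_real: float,
                          cost_check: float, cost_miss: float) -> float:
    """Expected cost of handling one cue under a fixed policy.

    Assumption: investigating always catches a real threat, so its only
    cost is the inspection itself. Dismissing costs nothing unless the
    cue was real, in which case the full cost of the miss is paid.
    """
    if investigate:
        return cost_check
    return p_real * cost_miss


def break_even_probability(cost_check: float, cost_miss: float) -> float:
    """Chance of a real threat above which investigating is the cheaper policy."""
    return cost_check / cost_miss


def should_investigate(p_real: float, cost_check: float, cost_miss: float) -> bool:
    """True when dismissing the cue has the higher expected cost."""
    return p_real * cost_miss > cost_check


if __name__ == "__main__":
    # Hypothetical guard: a lap with the flashlight costs 1 unit of effort,
    # losing the job costs 5000, and 1 rustle in 1000 is a real break-in.
    p_real, cost_check, cost_miss = 0.001, 1.0, 5000.0
    print("dismiss:", expected_cost_per_cue(False, p_real, cost_check, cost_miss))
    print("investigate:", expected_cost_per_cue(True, p_real, cost_check, cost_miss))
    print("break-even p:", break_even_probability(cost_check, cost_miss))
    print("investigate?", should_investigate(p_real, cost_check, cost_miss))
```

The break-even point is just the ratio of the two costs, so the decision hinges on how the guard prices the job against the walk, which is the dependence the paragraph above describes.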

But the actual way that professionals deal with these situations is to remodel their ontologies, and to look for subtle differences in the type of cue (and the context the cue emerges from), and to set up different, more precise heuristics. In some sense, a category just is a heuristic, and the issue with the security guard example is that it presumes there is only one kind of rustling, that there is only one type of stimulus and thus one type of rational response. In reality, each stimulus, each cue, is unique in some way. Being effective in such professions as an investigator typically involves developing an intuitive ontology for which cues ought to be taken seriously, and which ought to be dismissed. This is what actual experts do: they build ontologies from experience.

A problem, of course, is that learning the distribution space of phenomena requires adequate sampling. If an agent is never exposed, in training, to some combination of cue and (e.g. disastrous) outcome, they can never develop a proper ontology for noticing and preventing that outcome. This is a problem not with heuristics but with learning, and with the nature of tail risk. But it is handled not by developing a full sense of possible tail risk scenarios, but by developing a sense of normalcy. When you’ve heard the rustling of wind ten thousand times, an actual break-in will sound different; it may be unplaceable, it may not be obviously a break-in, but it will not sound identical to the ten thousand winds. Gary Klein’s famous “basement fire” example of firefighter intuition is commonly cited:

It is a simple house fire in a one-story house in a residential neighborhood. The fire is in the back, in the kitchen area. The lieutenant leads his hose crew into the building, to the back, to spray water on the fire, but the fire just roars back at them. “Odd,” he thinks. The water should have more of an impact. They try dousing it again, and get the same results. They retreat a few steps to regroup. Then the lieutenant starts to feel as if something is not right. He doesn’t have any clues; he just doesn’t feel right about being in that house, so he orders his men out of the building—a perfectly standard building with nothing out of the ordinary. As soon as his men leave the building, the floor where they had been standing collapses. Had they still been inside, they would have plunged into the fire below.

A sense of normalcy isn’t perfect, but it’s effective at breaking us out of routine tactics, telling us to pay attention, and getting us to look closely at what feels off.

Scott Alexander heuristics law of the excluded middle ambiguity generalized reading Gary Klein intuition sense of normalcy ontological remodeling

April 25, 2022

On metonyms

by Suspended Reason

Suspended: Do you think it’s fair to say that signs & symbols are just cultural technologies built atop metonymic inference muscle?

Crispy: Couldn’t have put it better myself. Metonymy seems to be this very basic pattern that’s used everywhere but has a lot of aspects that can be studied in generality; I want to dig into it more. Also, surrogation is a result of a very salient aspect of metonymy: information lossyness.

Suspended: Right, there’s natural informational lossyness—edgecases, exceptions, etc that aren’t covered, or are “mis-represented.” But most (many?) metonyms are at least reasonable heuristics to begin with, tailored to an environment (statistical distribution of happenings). When you have adversarial players though, they are incentivized to decouple the relationship. It’d be interesting to think through the relevant superset—cases where signifiers and -ieds drift naturally apart, or else are strategically decoupled.

How do we know a man is married? We look at the ring on his finger.

Darwin Ortiz, Strong Magic:

In the heyday of the big con in this country, professional con men would often meet some wealthy businessman and, within a couple of hours, succeed in convincing him, without any collateral, to turn over large sums of money to them. These well educated and highly intelligent businessmen were willing to put their trust in complete strangers in large part because the con men were able to convince their victims, purely through their dress, grooming, and demeanor, that they were the same type of men as themselves and therefore trustworthy. These con men know that all of us draw firm conclusions about others based purely on what we see.

How do we know someone is rich? It isn’t by auditing their net worth.

Sarah Perry, “Cartographic Compression”:

This is where maps (and trails) come in. Maps are compressions of the aboutness-extraction type. They represent and highlight certain features of the domain, forming a useful model, and ignore or downplay others… Language itself acts as a store of information, shared among minds, and language tends toward compression. As stories and concepts are shared, they become more compressed, until they reach the final stage: a metonym, a single word that represents a story or concept that conversation partners are expected to understand. A word is the ultimate tl;dr for human communication.

The hermeneutic circle goes: Understanding of whole influences understanding of part influences (new) understanding of whole influences understanding of (next) part. As you read a book, you meet people, navigate new cultures, learn biology.

In other words, cues shape our framing of a situation; our framing shapes our reading of the cues. Gary Klein:

That became part of our model—the question of how people with experience build up a repertoire of patterns so that they can immediately identify, classify, and categorize situations, and have a rapid impulse about what to do. Not just what to do, but they’re framing the situation, and their frame is telling them what are the important cues. That’s why they’re always looking, or usually looking, in the right place. They know what to ignore, and what they have to watch carefully.

Thus does Damozel, praising Jacob Clifton’s television writing at Flatland Almanack, proclaim that Clifton

has that one thing you have to be born with if you’re going to have it at all: the writer’s gift for discerning the telling detail, the defining phrase.

surrogation metonymy Sarah Perry compression Jacob Clifton Gary Klein frames

April 24, 2022

Degenerate play

by Possible Modernist

In response to my last post, Suspended asked whether Schneier’s notion of Hacking was the same as “degenerate play”. I wasn’t familiar with the latter term, although it turns out I am deeply familiar with the phenomenon.

At least in conventional early usage, it seems like the two concepts are indeed very similar. In Rules of Play: Game Design Fundamentals by Salen and Zimmerman, the authors treat “degenerate strategies” as more or less synonymous with “exploits”, defining degenerate strategy as “a way of playing a game that takes advantage of a weakness in game design, so that the play strategy guarantees success.” Mirroring the Schneier definition, they later emphasize that “using a degenerate strategy is not cheating. Degenerate strategies take advantage of weaknesses in the rules of a game, but they do not actually violate the rules.”

How such strategies are received is bound to depend on context. If you figure out a way to defeat a difficult boss in a soulslike by exploiting a weakness in its programmed behavior, perhaps in a way that wasn’t anticipated by the game’s designer, that might at times be seen as clever or even elegant. But (to use what is perhaps the most commonly mentioned example) if you have a Magic: The Gathering deck that is often able to win on the first turn, that is unlikely to be something that will be valued by the other player (in terms of their enjoyment of the game), even if they admire it on some level.

Although not explicitly stated in the above definition, a key part of how the term is used seems to be that degenerate play is a strategy that makes a game less fun. In the strictest sense, if there is a strategy which truly guarantees victory, then we are dealing with a solved game. If both players know this, it seems unlikely they’re going to be able to have much fun with the game, unless they both agree to some sort of rules changes or additional constraints.

However, if we put the emphasis on the “fun” aspect, it’s easy to think of strategies that don’t necessarily guarantee victory, but nevertheless break the game. In this way, a long string of final moves in chess might be considered degenerate, with the player who is in the losing position forcing the player who has effectively already won to play out the moves to the end. There’s a reason it is considered proper to concede at a certain point in chess.

A slight variation, which might still be considered degenerate, even though it’s unlikely to lead to victory, is simply refusing to abide by the goals of the game. Although it’s rarely stated explicitly in rules, most games are premised on the idea that all players are supposed to be trying to win. If a player decides that they don’t care about this, it suddenly frees them to start doing whatever they want, possibly creating chaos. Especially for games in which player order matters, this can often create a large advantage or disadvantage to the player who happens to be immediately before or after them in turn order.1

The most interesting part of all of this is that most games only exist because players on some level want to come together to have a good time. Obviously there are competitive levels of many games, where fun might not enter into it, but these arguably grew out of less serious forms of play. When dealing with hacking, we are typically talking about a system that was designed for some purpose (either by people or evolution), which others may be able to exploit for their own gain. (No one is concerned that tax loopholes make taxes less fun for people, but rather that others may be able to avoid paying their fair share). With games however, there is the meta goal of having a good time, which is (in the best cases) still totally possible if you lose, and which is therefore arguably more important than winning. Players who would sacrifice fun in order to secure their own victory have indeed hacked the rules of the game, but arguably to no one’s benefit.


  1. Something similar can happen with new players who don’t yet have any sense of basic strategy.↩︎

games rules play hacking degenerate play strategy

April 23, 2022

Introspection vs extrospection, and psychoanalytic epistemics

by Snav

Response to Signal privilege vs. representation privilege in introspection.

The signal privilege vs representation privilege frame seems to represent well the epistemological issues involved in introspection. Namely, you have access to greater “raw experience” of your own psyche, which includes personal history as well as transient nervous system states, but you may not be in the best position to render this experience meaningful in terms of understanding, compared to an “external” observer. As an introspector, you already have your own explanations for the meaning of your subjective experiences, whereas an external observer is not burdened with this additional context, and thus might be able to evaluate your problem more accurately.

As an analogy, consider reading a good novel. You sit down and rapidly read through the chapters, enjoying it as you go. Once you’ve finished, you have a sense of “what happened”, in terms of key conflicts, but you might not be able to articulate this to someone else. I’ve had many experiences personally of reading (or watching) something, yet being utterly unable to communicate “the story” in a coherent way. It feels like the story was something that “happened to me”, rather than something I observed and synthesized into a summarizable compression. This is not necessarily bad! Much of the richness of literature and film is that experience of being in the media. I posit that this is also how introspection tends to operate.

On the other hand, consider reading the same novel out loud to someone. You might not have the same degree of immersion in the story, as you were focused on speaking rather than reading, and yet you might find that, once finished, you have a far better grasp of the overall arc of the plot and the main character conflicts. The listener also tends to have a different experience than if they were reading it themselves: they can’t glaze over paragraphs, and the words take on a sort of embodiment lacking in “quiet” reading.

Is this not how the psychoanalytic situation operates, as well as personal practices like journaling? In telling “the story of yourself”, without thinking about it, you can come back later to the things you said and evaluate them “as if” you were an external observer, hearing your own story. In the moment, while speaking, you may experience the recollections as if you were there, fully charged with energy, but since you are also speaking them, you can come back later and think over what you said, as if you were hearing it for the first time. This places the introspective activity “outside of yourself”, in a way. Call this activity of returning to your own utterances with an observational, objective eye “extrospection”.

The capacity to properly extrospect requires one important thing: that, when speaking, the speaker is free associating. This means, effectively, not imposing conscious narratives into their words. Rather, the speaker should relate exactly what pops into their head, at the moment it pops into their head. Free association also gives the listener a privileged position with respect to the speaker, in that they gain access to the immediate thought-patterns of another person. This is highly intimate, so it is no wonder that in most of our relationships, we do not free associate. Rather, we act with respect to our conceptualization of the other, and play games together, communication as manipulation.

Alternatively, free associative therapy can be thought of as a special sort of game, where the speaker conceives of the listener as having a certain special knowledge which can fix the speaker’s problems, insofar as the speaker is able to speak openly and reveal themselves, i.e. free associate. In Lacanian parlance, the listener takes on the position of the “subject supposed to know”. Know what? Precisely the thing that the speaker wants or needs them to know. To sustain this link, the listener in a psychoanalytic situation reveals as little about themselves as possible, attempting to act like a “blank screen” or mirror for the speaker. The process of the speaker projecting some character or feelings onto this “blank” listener is called “transference”, and it often takes the form of “love”. This “love” is completely a quirk of the formal structure of the “game”. On this topic, Žižek writes (The Metastases of Enjoyment, p. 153):

because of its ‘automatic’ character, transference love dispenses with the illusion that we fall in love on account of the beloved person’s positive properties - that is, on account of what he or she is ‘in reality’. We fall in love with the analyst qua the formal place in the structure, devoid of ‘human features’, not with a flesh-and-blood person.

Love is, in this sense, itself a by-product of a particular kind of intimate game. But love is only a side effect of this particular therapeutic game, whose “ulterior” goal is to have the speaker “speak themselves” into existence, so that they can then reverse positions and “hear” themselves, eventually coming to “know” themselves. As Lacan claimed, the end of analysis is when the analysand (speaker) can say “I am this”.

In other words, extrospection is how one changes oneself on a purely formal level: one’s idea of “self” (“I”) changes, when you witness yourself speaking and realize “oh shit (hahaha), this is actually me”. This often happens at specific moments of insight, akin to the paradigm-breaking function of scientific discoveries which cannot be incorporated into the theories from which they emerged. Introspection, on the other hand, proceeds through pure thought via a single, undifferentiated subject, and ultimately produces structures of justification. Mathematics instead of Science. Which you choose depends on what you want to do, where you want to end up.

introspection extrospection games meaning ACiM psychoanalysis epistemology

April 22, 2022

Will my uploaded brain whistle?

by Collin Lysford

Compare knowing and saying:

how many metres high Mont Blanc is

how the word “game” is used

how a clarinet sounds.

Someone who is surprised that one can know something and not be able to say it is perhaps thinking of a case like the first. Certainly not of one like the third.

-Ludwig Wittgenstein, Philosophical Investigations, proposition 78

I kind of, sort of, know how to whistle. This is a big improvement from most of my life where I couldn’t whistle at all. But I constantly feel like whistling, so I would always force out a tuneless bunch of air. Sometime over the last few years, maybe as a result of starting voice lessons, my whistling went from non-existent to merely remedial. If I whistle a tune we both know, you’ll probably be able to guess it. But I certainly don’t place the notes nearly as accurately as with my voice. There’s a huge degree of uncertainty about what kind of sound will come out when I start. I hear the noise and try to make the rest of the whistling work out based on how the first note sounds.

So somewhere in the tangle of my brain is some information about how to make my lips and lungs cooperate to make a noise and adjust it. But that doesn’t mean the “sound” of whistling is encoded anywhere. I’m distributing some of that cognition to my lips themselves. I don’t “think” a sound that is imperfectly outsourced to my body; I “think” a process that cooperates with my body to produce a sound.

This difference becomes quite material when I imagine my life as a brain on silicon. Many people believe it’s inevitable we’ll be uploading our brains into computers - the information encoded in a human brain is on a scale tractable to potential storage capabilities, so what’s stopping us from putting that information on a computer? And some people go even further and think of this as an “upper bound” on the timescale for when we’ll have artificial general intelligence. Even if we can’t generate one from scratch, then surely we can start with a human brain on a computer as a general intelligence and improve from there. But that information encoded in my brain isn’t the abstract idea of whistling, it’s a partnership with my lips. So what does it mean to say that it’s “uploaded?”

One answer is that the brain will live inside a simulation that wires it up to virtual lips so well that I won’t be able to tell the difference. But if you need that sort of simulation for the brain to work, you no longer have the guarantee that you can easily encode everything that you need. The entirety of the physical world that touches a brain contains a lot more information than the brain itself does. And the required complexity of the simulation means you’re front-loading a lot of the difficulty. If brain emulation is a stepping stone to AGI, but it requires PerfectlySimulateTheNaturalWorld.exe to make the emulated brain work, then why muck around with artificial general intelligence at all when you could just use PerfectlySimulateTheNaturalWorld.exe to answer all your questions?

So the question is: how much less than perfect can the simulation attached to your brain be and still be useful? Your brain is a tool that’s made to interface with an enormous amount of physical context, but how much of that context can be omitted and still have it work? A chief hope for emulated intelligence is being able to speak textually, and people often have a subjective feeling of being disembodied while speaking textually, so there tends to be a lot of unconscious hope that the “words part” might be cleanly separable from the rest. But is there a reason for that belief?

To be clear, this isn’t a rhetorical question on my part - I genuinely don’t know how you’d even start determining separability of various brain functions. It all seems pretty goopy and connected to me. But as we sit in the fog of uncertainty, I do think we focus too much on the ideal case of “just split out the thinky-bits from the biology-bits”, when of course our mood influences our thoughts, many people think better on walks, and we only know what we’ll sound like when we start pursing our lips and whistling. I have this horrible image of the world’s first emulated brain, running fine on a hard drive but completely unable to talk to any interfaces we create for it. Eyes, hearing, speech - all of it relying on hidden biological context we didn’t manage to fake well enough. All we can do is watch the brain waves on a computer screen, wondering if anything is “awake” in there, wondering if it’s feeling pain, unable to ask it anything. Before we get too excited about uploading brains, we should make sure we have a well-founded answer for why they won’t end up like this.

AI Ludwig Wittgenstein context simulation

April 21, 2022

Natural ontologies

by Neil

I.

When scientists or engineers learn to dance, they often try to pilot their body like a sci-fi character would pilot a spaceship—except instead of “set phasers to stun,” they’ll try to “set lats to 15 pounds of force” or something like that.

This doesn’t work very well. While it’s theoretically possible to engage individual muscles to a precise degree, doing so requires an intense level of body awareness that most human beings never achieve. In practice, most beginning dancers find much more success with visual metaphors: e.g., “imagine you’re carrying a tray” or “imagine you’re wearing a necklace that you want to show off.” When you think about it that way, your body naturally does the right thing (or something closer to it).

Nevertheless, many scientist-dancers insist on trying the “engage my quads 10% more” approach for a long time. They argue that the physics of muscles and force are what’s really happening, and so they should be able to learn it that way.

II.

Something similar came up for me when I saw these tweets:

[Four embedded tweets]

I can see why Tyler finds this kind of interaction “spooky.” But on a smaller scale, this kind of effect is very familiar. While I’ve never met any self-proclaimed energy healers, I have definitely felt the atmosphere change when someone simply enters a room, either for better or worse. I think we’d say that happens all the time.

This looks “spooky” only because we insist the scientific picture must be the most true picture of what’s happening. And in this frame, we’re led to speculate about whether there is a luminiferous aether for vibes, or create some model of how our brain is processing all of these micro-events, because, we insist, there must be an expedient way to explain this scientifically. But our confusion is self-inflicted. It comes from introducing a scientific picture where it isn’t appropriate to the situation.

meaning ontology dance

April 20, 2022

Girardean mimesis, Bourdieusean distinction

See also “Thoughts on differentiation”, “Taste, optics, and authenticity”.

S. Rashani 1999, The Eternal Tao—Mimesis and Distinction in Everyday Life:

From Girard, we learn that a man’s observation of his neighbors shapes, in a serious way, the form of his own desires. That desire does not so much originate in a “natural,” pristine, and pre-social self—to be discovered within—but is deeply social, located outside the self. We will refer to this social construction of desire as the call to convention and kinship—a kinship, paradoxically, attained via conflict.

Simultaneously, Bourdieu points us toward our drive for distinction. We seek constantly to differentiate ourselves, to set ourselves apart from those around us, establish a niche into which we might slip and say, “We are not like them.” The motivation for this drive is economic in the broad sense of being capital-driven (financial, social, sexual)—perhaps is even an existential imperative. To the information theorist, distinctiveness is a prerequisite for the possession of identity.

Our identities are products of the ongoing negotiation between these contrary impulses, which like the horses used to draw-and-quarter medieval prisoners, threaten constantly to tear a self apart. “Like these people, but not like those” sings the ego, modified masochistically by Groucho Marx. Once a club accepts us, we discount the value of its invite, climbing ever upward, re-weighting our social ambitions. We differentiate ourselves to reach a subculture; we feel we have “boarded the mothership,” and “found our people”; soon after, we begin our rebellion, or are acculturated into pre-existing party lines. (Narcissism of small differences.) To borrow a term from the new CSS2 programming language, our signals are never “absolute positioned,” always relative-positioned; they are geared toward an audience of near-comrades, showcasing the ways we are different and the ways we are similar.[^1] To belong, and to be exceptional. We live under the threat of forgetting the similarity—of emphasizing our differences while leaving the overlap unvoiced and implicit. But is this a threat, or a natural evolution of our assumption ground? A subculture looks much different to an insider than an outsider: heterogeneity, instead of homogeneity; a landscape of conflicts and micropolitics replacing a unified field. As our belonging becomes taken for granted, our sense of self shifts, slowly, to emphasize intra- over inter-, this landscape of subdivision, an ever-receding tribe. Old “thems” are forgotten; the old “us” fragments into a new “us” and several “thems.”

[…]

In the Girardean frame, because our desires are inevitably over scarce resources, our shared desiring brings us into conflict. In light of Bourdieu we might say that it is specifically status and distinction (identity) which are the central contestations between mimetic rivals.

[^1] What would an “absolute position” look like, after all? Cartesian coordinates are themselves a system of relative position; the absolute position of CSS is merely relative to the browser window’s edge.

Pierre Bourdieu René Girard distinction difference mimesis S. Rashani authenticity desire information theory positional semantics torque epistemology narcissism of small differences subculture

April 19, 2022

Generalized hacking

by Possible Modernist

“Everything is a system, every system can be hacked, and humans are natural hackers.” — Bruce Schneier

I grew up playing a lot of board games, and it wasn’t long before I realized that there was a tremendous advantage in having a deeper and more thorough knowledge of the rules than other players. For 99% of cases, the common understanding was sufficient and correct, and most of that knowledge would be relatively useless. But every so often an edge case might appear, and knowing how such ambiguous cases would be resolved by the rules could sometimes offer a critical edge.1

In an excellent report about present and near-term future risks of AI, Bruce Schneier uses the concept of hacking to discuss the potential exploitation of loopholes, vulnerabilities, and flaws in both technical and social systems. There’s lots in the report, and it really is excellent, so I’ll probably revisit it in a few posts here, but I wanted to start just by unpacking the term “hacking”.

Schneier begins with a definition of “hack” as an unanticipated subversion of a system, or something in a system that allows such a subversion. “Hacking is not cheating. It’s following the rules, but subverting their intent. It’s unintended. It’s an exploitation. It’s gaming the system.” … “Hacks are clever, but not the same as innovations.”2

Although our computers are likely our first association with the term hacking, Schneier correctly notes that the same concept applies just as well to sets of rules like the tax code, as well as to our own minds. Because it’s hard for mutually antagonistic legislators to write coherent and consistent laws, the tax system is full of bugs and loopholes, such as the so-called “Double Irish with a Dutch Sandwich”, in which corporations are able to make use of subsidiaries in multiple countries to avoid paying taxes. Similarly, even though humans weren’t designed by committee, we are also full of vulnerabilities, and can be relatively easy to exploit, especially when tired or distracted.

The threat, as Schneier sees it, is that computerized systems which operate with greater speed, scale, and scope, are well-poised to discover and exploit numerous vulnerabilities in our technical, social, and cognitive systems, to our great disadvantage. There are a few nuances to be debated, but overall I find Schneier’s relatively grounded discussion to be much more productive than the more imaginative scenarios and assumptions that are commonly deployed in the AI alignment community.

Regardless, it’s interesting to think a bit about this notion of hacking. As noted, it’s extremely hard to design a reasonably complicated system of rules such that there are no ambiguities or loopholes.3 This means that most such systems will have vulnerabilities that can be exploited.

Anti-inductivity would tell us that all profitable loopholes have already been exploited (or very soon will be), such that existing systems are basically unhackable. This is not actually the case in practice, both because the cost of discovering a hack may be hard to estimate, and because the potential payoff may be a matter of timing. Good old-fashioned computer hacking provides an interesting example here: many hacks are easily patched once discovered, and so those who discover them might sit on them for a considerable amount of time, hoping to exploit them at the most opportune moment.

Of course, the fact that it’s almost impossible to write complex rules without vulnerabilities means that it’s potentially quite easy to intentionally include such vulnerabilities in the design of a system in ways that will not be noticed. The exploitation of such loopholes would then, I suppose, not be considered hacking by Schneier’s definition, since the exploitability was intended (presumably by the person who will exploit it, perhaps in a roundabout way).

Backing up one level, the creation of rules is itself governed by rules and norms, and one might assume that they were meant to be followed in good faith. Thus one could say that the intentional introduction of a vulnerability into a set of rules is itself a kind of hacking of the rules-creation framework. Unless, that is, the potential exploitability of that framework was intended by its designers? Can one hack a system of one’s own design?


  1. It might be argued that not sharing relevant knowledge of the rules is unfair, but the problem with this is that a) there is too much detail for most people to care to retain, and b) when such edge cases arise, it can be difficult to communicate without assuming or giving away strategic choices. In some sense, the real advantage is in seeing when ambiguities will appear, as much as in knowing how they will be resolved.↩︎

  2. The full definition he gives is: “Def: Hack /hak/ (noun) 1. A clever, unintended exploitation of a system which: a) subverts the rules or norms of that system, b) at the expense of some other part of that system. 2. Something that a system allows, but that is unintended and unanticipated by its designers.”↩︎

  3. Board games remain an interesting case study. Despite being relatively complicated, well-written rules often provide surprisingly comprehensive treatment of all scenarios that arise (though of course many sets of rules are not so well-written).↩︎

hacking AI alignment games anti-inductivity degenerate play frames

April 18, 2022

The signal democratization double-bind

by Crispy Chicken

Often the ability to play certain social games proficiently becomes a social class marker or a social class in and of itself. Think fencing.

If we want to level the playing field from the current status quo we usually end up in a double bind.

The “charity method” to help people out is to teach some lucky few to play the given game proficiently. This causes Goodharting on whatever the selection criteria are for choosing who gets these “fencing lessons”; usually it’s just another layer of social networking. Regardless of what the selection criteria are, this enforces the status quo, and the hold of the marker actually grows stronger as the marker ends up being a more reliable predictor of future success, whether it’s because the system is good at scouting out people who deserve more credit or because it reinforces a new social networking method.

The “modern method” to help people out is to enforce strict rules against the marker. This causes extreme Goodharting on whatever markers can come to stand in for the old marker, as the system as a whole has learned to rely on the status quo. All the assumptions about unwritten rules, who is trustworthy, etc. rely on these social signals, and so people will decide on new markers that seem like they might cause similar outcomes. Due to being artificial, these new markers will be played because they are easy to manually coordinate around without shame (no social standards around these new markers likely exist yet) and without the nebulosity tradition lends as different groups interpret tradition differently.

I’m no huge fan of Chesterton’s Fence—why shouldn’t we wonder about whether people who make fences tend to be fools?—but I think Chesterton’s Fence is subsidized by people’s failure to change the status quo for the better due to the above dynamics.

signaling double bind fencing surrogation Chesterton's fence

April 17, 2022

Boundaries protect information, too

by Suspended Reason

Last week, Hazard wrote about Julian Assange’s observation that large institutions must keep paper trails for corrupt or illegal systemic actions they take.

systematic injustice by definition is going to have to involve many people… maybe the whole plan isn’t visible [but] by the time it gets down to the grunts [who enact the policy] some component of it is visible… all organizations of any scale have rigorous paper trails for the instructions from the leadership.

To generalize, coordination requires communication between participating agents. This communication may occur either symbolically (e.g. agents exchanging language with one another) or stigmergically (e.g. agents all consulting the same project state), such that each agent may determine its next action, with minimal redundancy of effort, in a way that advances rather than regresses the shared project. To maintain a consistent set of behaviors over time, and specify to participants how they ought to behave in a variety of possible situations, there must be some store of information which perseveres over time, a store deliberately created and maintained against entropy, corruption, and sabotage.

There are many reasons that superorganisms (countries, institutions, animals, cells) maintain boundaries around their constituent sub-organisms. One is to keep valuable resources from leaking out, or being extracted by rivals. Another is to keep toxins, viruses, and parasites out. But in addition to this careful gating of physical forces and materials, there is also a careful gating of informational access. Within boundaries, we maintain and share informational stores (be they through stigmergy or symbols) so that sub-organisms may follow a common code and coordinate. These stores disambiguate project states, and the configuration of the superorganism, so that agents dedicated to advancing the project and interests of the superorganism may better do so. But these information stores, if accessed by rivals, help them exploit and subvert the superorganism’s aims, precisely through the same disambiguation.

strategic interaction legibility communication signals boundaries stigmergy superorganisms

April 16, 2022

Taste, optics, and authenticity

by Suspended Reason

Epistemic status: Mostly torque, awaiting a torque in the opposite direction.

We are straitjacketed by worry—worry over how we come off, how we are read, how we are thought of. We are scared by the prospect of trespassing boundaries, offending others, seeming dull. New York City has a $2Bn cosmetics industry. Ambitious 20-somethings sacrifice years of their lives toward building bulletpoints on a CV. And yet we have the audacity to tout (on Instagram, natch) our authenticity—to ourselves, to our roots, our desires and feelings. Is there a way to read this as anything other than cope, a denial of our opticratic situation? Idealism is antagonistic towards optics, refuses to communicate or explain itself, is antithetical to pragmatism. To come face-to-face with the debts we owe others, the message we monitor and hedge around, the pragmatic way our image cashes out to power: this is the meaning of “growing up.” Instead we blindfold ourselves with propaganda. We carefully curate (others’) identities while pretending that no pressure is exerted on the behavior of those we curate; we carefully curate (our own) identity while pretending that every defensive act, in a red queen, isn’t an offensive gesture, too. The bands, brands, personalities, and athletes most praised for their authenticity are waist-deep in performance, even as we sing praises of their “trueness to self” (which, luckily, looks more like novel niche-identification than obedience to original impulse). We’re waist-deep in performance even as we sing the propaganda of our selfhood. “Hey, you look good, that’s some dress you got on there.” “This old thing? Why, I only wear it when I don’t care how I look.” To the opticratic-pragmatist, to be “authentic” is to nurture a “well-integrated social performance,” to cultivate a set of stances, tastes, beliefs, dispositions, attitudes, and ontologies which are robust to cross-examination. To be a myth which caulks its cracks. To sustain an aesthetically coherent set of stances across a variety of interactions, rather than inventing them wholesale, on the spot, as the niche arises. There is the legitimate sanctioning of two-facedness, but also a torque against the oppression of others’ expectations, a rebellion against the self-censorship which, in public, rules nearly absolutely. (And in private, guides heavy-handedly.) Proclamations of freedom from others—their opinions, judgments, perceptions—are magical in the sense of striving for speech act status: they are attempts to make reality, rather than to represent it. “The lady doth protest too much, methinks.”

At its most delusional, this denial of performativity takes taste as only ever personal. I mean taste broadly, but the original (culinary, gustatory, papillary) sense sheds light on the metaphorical extension. More than anything, taste is a system of preference, of preference for certain courses of action, development, unraveling, progression. Preference for the hedonic valences which tend, statistically, to accompany these actions and developments. And it is a system where, if we are possessed of any self-awareness, we feel sharply that there are two sets of preferences: one personal, one subcultural (varying between members, but with high degrees of overlap). A system of reputational baggage, the types of guy who tend to hold a given preference, the inferences that can be made about a man who listens exclusively to classical music, Broadway musicals, hair metal, or the Grateful Dead.

So maybe we can say that there are intrinsic and extrinsic rewards involved in consuming certain experiences or stimuli (~“artworks”). There are meals or food items we all have that we enjoy privately, but would never order in public. Times when maybe we actually wanted one drink off the menu, and ended up ordering a more respectable cocktail. And these types or layers of reward interact in complicated ways; they’re far from separate.

For instance, one way to win a selection game is to fib. You might fib on a date by ordering a high-class dish you’ve never ordered before, as if it were a long-time favorite. This tends to go poorly, because reality is highly detailed, and small errors can give away the act:

See also the “three glasses” scene in Tarantino’s Inglourious Basterds. This is a key aspect of Goffman’s theory of social performance: the definitions (of ourselves, of our motives, of our credentials) we perform for others are fragile things. Even one or two small slips can pop the bubble of belief which makes the interaction work.

Another way to win this sort of selection game is by actually cultivating a taste for expensive French entrees. By regularly going out, hearing the names of dishes pronounced, matching descriptors to entries, learning from others the proper way to pick apart a lobster. In other words, the occasional high-stakes games we play (e.g. romantic dates, job applications) distort our behavior not just during the game, but outside it. (And this is to say nothing of the banal “keeping up with the neighbors” treadmills that similarly drive consumption decisions, investments of time and effort, choices of pastime, choices of employment, etc.)

This sort of social learning is more interesting than, and irreducible to, pure performance. While we are driven out of our comfort zones by social pressure, class aspiration, and shame, with time we come to genuinely enjoy our adopted experiences, and by extension, our adopted stance. Our tastes and opinions become second-nature, inseparable from our identity, impossible to imagine otherwise even as, in the abstract, we pay lip service to nurture over nature. We come to feel our fashioned self is “truly us,” amnesiac towards its ongoing construction process, the influences and censorings, the extrinsic rationales which gave it rise. And this is not incompatible with the idea that we might explore a domain (cuisine, painting, pop music) out of intrinsic curiosity; rather, it is to say that the extrinsic always shapes or delimits the direction of our curiosities. Expansions of taste are fundamentally aspirational—it is rare that we journey into realms of experience that are below our self-image of social standing.

opticracy authenticity taste aesthetics typification selection games extrinsic-intrinsic

April 15, 2022

The limits of signal privilege

by Collin Lysford

“Sarge, my leg hurts something fierce!” The soldier is in the hospital bed, and can’t see his leg as it’s obscured by a sheet. Sarge can, and knows something the soldier doesn’t: his leg is blown clean off. There’s nothing there. When the soldier says “my leg hurts”, is he wrong?

Well, what does it mean to have leg pain? Is it the discomfort? The soldier can’t really be wrong about having discomfort. If people perform worse at some cognitive task when they’re in pain, the soldier will perform worse at that task. To the extent pain is an experience, the soldier is really having that experience. But “leg pain” is also an attempt to explain that experience, and as an explanation, it’s clearly false. “Leg pain” is a representation of a signal that implies certain ways that signal will interact with other processes in the world. Icing your leg will make it feel better, walking on the leg will aggravate it, something that blocks the nerves in your leg could clear it up completely. But “your leg” is an empty set for the soldier, and none of those processes can possibly happen.

Hazard came up with the excellent frame of signal privilege vs. representational privilege, and we can use it here. The soldier has signal privilege - he knows what he’s feeling, and Sarge doesn’t. For questions like “is the soldier feeling discomfort?”, the soldier can answer instantly and accurately, while Sarge has to rely on second-hand observations like “how is he doing at this cognitive task?” Sarge has representational privilege - he knows how the soldier ought to represent certain signals, even though he can’t directly observe those signals himself. The leg that is not there cannot be iced, so even if Sarge doesn’t know when the signal is present, he has knowledge about what kind of signal it can and cannot be, and how certain interactions will interface with the signal.

As far as representational issues go, this is an unusually definite one. Sarge and the soldier don’t disagree on how to verify whether someone has a leg or not. It’s just that Sarge has done that verification, and the soldier hasn’t yet. Once the soldier has, no matter how much he still feels his phantom limb pain, he will agree that the pain does not make the leg and that the object he’s attributing these feelings to actually doesn’t exist.

It’s rare that internal representations can be verified by outside observers in such a concrete way. The soldier has privileged access to all kinds of other internal signals whose explanations aren’t so straightforward. He feels nauseous - is that entirely a factor of the pain, or is disgust at what he’s seen playing a psychosomatic factor? He’s angry and wants revenge - or so he thinks. But he’s never been in this situation before, while Sarge has ministered to more war casualties than he cares to remember. This upset feeling the soldier wants to attribute to his stomach, this angry feeling the soldier wants to imagine would be diminished by a future violent act - while the soldier has signal privilege knowing that they’re there at all, there’s no guarantee they point to the world in real ways. It may be that the feeling the soldier is representing as “wanting revenge” is pointing to an object as empty as “my leg”.

This is something most people end up noticing sooner or later. Your voice teacher telling you that you’re “thinking too low”, grief counselors decoding your emotions in the wake of a tragedy, “Don’t worry, newbie, everyone feels that on the first day”; people are broadly willing to accept that someone else may have representational privilege over them in certain contexts. But I haven’t seen discussion of it as a specific phenomenon. It’s usually lumped into nebulous catch-alls like “wisdom” or “experience”. I think that lack of specificity is a major hurdle in communication.

Consider a teenager complaining to her mom: “You don’t understand how I feel!” If she’s operating from a signal privilege lens, this is a trivially obvious statement that only the most bone-headed parent would deny. Her mom replies “Honey, I understand better than you do” - asserting her representational privilege of knowing how many teenagers have felt about teenage problems. “How could you know?” the teenager replies, offended by the dehumanization of (she thinks) her signals being blatantly ignored. And if mom says something like “experience”, she’s stuck, because experience is a bad explanation for the more nuanced concept of representational privilege. Sure, most versions of representational privilege require at least some experience, in the same way all trees of a certain height are at least a certain age. But there are plenty of techniques that can let you update your representations quickly and efficiently, and there are plenty of people who’ve had a lot of experiences they never really learned from and remain representational novices. Thinking of those people, the teenager knows there’s no way that “experience” is a sufficient reason to override her signal privilege, and slams the door in mom’s face.

By carefully carving out this specific concept, we can talk more precisely about when someone has representational privilege. If the teenager’s problems are ones relating to social dynamics and personal changes in her body that are similar to what her mom grew up with, then her mom has a wealth of context the teenager doesn’t. She knows how many different people in the same boat described their feelings. She knows how people describing feelings a certain way ended up acting, and how they ended up feeling later. She knows of other, stronger feelings she’s had that cast formerly important-seeming signals in a new and lesser light. And this is what she’s trying to communicate to her child - that even if she doesn’t know the exact signal her child is feeling, she has a better sense of what to do about the signal.

Crucially, though, she doesn’t have a universally better sense of what to do. The benefit the mother has is context, so of course the applicability of that benefit is contextual! Maybe her child is trans and she isn’t, and so some of the signal is something she genuinely has no model for. This is the nuance that gets lost when we don’t talk about representational privilege specifically, often dependent on experience but certainly not the same as experience. It’s very important for people who want to allege representational privilege over someone else to be able to explain why they think their context has more explanatory power. Conversely, having this language should hopefully help those with only signal privilege understand how it’s possible someone else might be able to explain that signal better, and that this isn’t a refutation of their lived experience but a clarification of it into a form that will better slot in to the rest of their life in the world.

representation signal privilege representational privilege frames examples