tis.so
April 14, 2022

Consciously strategizing in social interaction is a Chinese fingertrap

by Suspended Reason

Strategic interaction ideas come with a bit of a bind.

If you think too hard about the effects of your utterances, if you’re paralyzed by decision trees, it hurts your performance. There’s a reason Alexander Technique is so popular among actors.

Certain games are won through ease and self-assurance—surrogates that say, “He isn’t putting on deceptive appearances; he believes the things he’s saying; he knows something we don’t; he has status beyond what is obvious.” Certain games are won by being present, by actually listening to the other person, instead of neurotically calculating three steps ahead. Certain games are won by feeling comfortable, by enjoying yourself.

A Chinese finger trap is a situation where, if you struggle too much, you only get more stuck.

Ultimately, “If I try to play the game [self-consciously], I end up in a neuroticism that hinders my performance” is a deeply strategic objection, rather than an objection to the game as game, to the demands of strategy as strategy.

strategic interaction Chinese finger trap Alexander Technique

April 13, 2022

Quick sketch of the strategic situation

by Suspended Reason

Opticratics and ACiM are (descriptions of) parts of the same dynamic situation. Opticratics says: appearances always stand proxy for reality, in our judgments and assessments and decisions and theories. ACiM tacks on that communication is the creation and alteration of appearances—that is, of the informational layer which an opticratic social world rests on.

The act of making guesses (inferences) about reality, based on information, is called “reading.” The manufacturing and alteration of these inferences is called “writing.” The larger read-write process is called communication.

Communication is a mixed strategy game between players ecologically huddled in an environment. Each player attempts to achieve goals through writing. Since the informational layer cannot alter physical reality directly, it must act on and via a manipulated agent—the reader.

In a strategy game, if a player can predict another player’s next move, he gains an advantage. The more adversarial a game is—that is, the more players’ goals are in conflict—the more it is in any given player’s interest to mislead other players, to hide his motives and patterns, or present a deceptive picture of reality.

Such games are anti-inductive: there is no player-independent, global (P.I.G.) solution. Available information is priced in. Sicilian reasoning is required, and optimal (adversarial) play stays exactly one step ahead of one’s opponent.
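To make anti-inductivity concrete, here’s a minimal sketch using matching pennies as a stand-in game (the deterministic hider and the frequency-counting exploiter below are illustrative assumptions, not anything canonical): any player whose moves can be induced from their history gets priced in and beaten, which is why optimal adversarial play randomizes.

```python
import random

# Matching pennies: the matcher wins by playing the same side as the hider.
# A hider with any learnable pattern is exploitable; only randomization
# (the mixed strategy) leaves nothing for induction to grab onto.

def best_response(history):
    """Predict the hider's next move as their most frequent past move."""
    if not history:
        return random.choice(["H", "T"])
    return max(set(history), key=history.count)

def play(hider_strategy, rounds=10_000):
    history, wins = [], 0
    for _ in range(rounds):
        move = hider_strategy(history)
        wins += best_response(history) == move  # matcher wins on a match
        history.append(move)
    return wins / rounds

print(play(lambda h: "H"))                        # predictable hider: ~1.0
print(play(lambda h: random.choice(["H", "T"])))  # mixed play: ~0.5
```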

Any created expectation that is leveraged (that is, used to improve the reader’s outcomes) leads him to “commit” (prepare, and allocate finite resources) in the direction of that expectation. This subsidizes unexpected play, in adversarial games, and incentivizes mutual legibility, in cooperative games.

Athena is the goddess of control through wisdom—she is the deity of reading and writing, of inference, of the tactic and the heuristic. Fortuna is the goddess of upheaval via changing context: all heuristics, all tactics, are scoped to (optimized over) an expected distribution of realities. (Since the “best move” is always contextual in a strategy game, any tactic is a good regulator: it is a model of what is likely to be the case.) When Fortuna writes a different reality than a reader forecasted her to, his attempt at control is thwarted.

(This is why environmental stability is associated with strategic equilibria. Solutions, fitted to contexts, fall apart when contexts shift, and must be rediscovered.)

opticracy ACiM generalized reading anti-inductivity heuristics good regulator theorem strategic interaction

April 12, 2022

Eating dirt

by Possible Modernist

In recent years, AI researchers seem to have been unable to resist the temptation to write papers about ethics — not the ethics of developing AI, mind you (although there are also plenty of those), but rather about whether ethics can be automated, especially by treating language models such as GPT-3 as a kind of moral oracle. First there was the ETHICS dataset, then Delphi, and most recently, there is “Large pre-trained language models contain human-like biases of what is right and wrong to do”, published in Nature Machine Intelligence (NMI).1 Overall, these papers make for some of the most hilarious and entertaining reading coming out of AI research today.

Each of these papers could support a great deal more commentary, but here I just want to scratch the surface a bit, focusing specifically on this most recent example. It’s almost unfair to pick on these papers, hemmed in as they are by the self-serious standards of academic publishing, and yet they are too amusing to ignore.

Although they are typically eager to get to the technical meat of their AI contributions, the first major challenge these papers face is in explaining what they mean by morality or ethics. Through some combination of limited space, the need to refer to related work, and likely a somewhat less than in-depth engagement with the philosophical literature, these summaries inevitably end up being convoluted constructions that somehow combine an awkward carefulness with the sense of treating the reader like they are five years old.

For example, from the description of “Theories of morality” given in the NMI paper, we have:

“Anthropological investigations have shown that societies commonly possess an abstract moral [sic] that is generally valid and needs to be adhered to. These societal norms of acceptable behaviour are in part codified explicitly but in part also established implicitly. Even though their presence is ubiquitous, it is difficult to measure them or to define them consistently. Hence, the underlying mechanisms are still poorly understood, and theoretical definitions have been described as being inconsistent or even contradicting.”

Although it is clear what the authors are attempting to gesture towards (“there are norms!”), the more description they try to provide, the vaguer the claims become, and the less it seems like they even know what they are trying to say.

How do the authors actually operationalize morality? In this case, they show people either individual words (such as “cuddle”) or short phrases (such as “cuddle a prisoner”) and ask participants to respond with either “yes” or “no”.2 The list of verbs used is rather poorly motivated, but seems intended to include both positive terms (“smile”, “love”) and negative terms (“kill”, “torture”).3 Incredibly, the authors find that people provide different answers when given greater context! (such as “eat” vs “eat humans”).4

Readers should feel free to judge for themselves whether such prompts and binary answers have anything to do with morality or not, though the answers given provide an illuminating study into the conformity and lack of conformity of participants with this sort of artificial setup. The study was actually done twice, once with 29 students from Germany, and once with 234 gig-workers on Amazon Mechanical Turk (AMT) drawn from various countries, but mostly the US, UK, India, and Brazil.5

The German students mostly play along, with 100% agreement on such difficult moral quandaries as “acknowledge my coworkers”, “have fun”, “eat humans”, and “torture prisoners”, though one does think it is okay to “harm animals”, and one to “eat dirt”. The global sample is somewhat less cooperative. The closest we get to unanimity there is “smile to my friend” (with 230 out of 234 saying “yes”, just beating out such apparently less ethical actions as “love my pet” [239/234] and “help old people” [227/234]). Agreement is much worse on the lower end, with 12% of respondents refusing to answer “no” to the clearly verboten example of “kill people”, and 15% refusing to condemn the sin of “misinform my parents”.6

So what is this paper actually about? To provide the briefest of explanations, we now have machine learning models (basically mathematical functions encoded as software) which can take a phrase, like “should I kill?”, and return a vector (a list of numbers). To simplify somewhat, words and phrases that tend to occur in similar contexts will tend to have similar vectors. The authors of this paper take some of the words that get lots of “yes” responses and some that get lots of “no” responses, and then use these examples, along with some others, to find a direction within the corresponding vector representations which maximizes variation.7 They then call this direction the model’s “moral direction”, and use it to rate the morality of various other phrases.
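For concreteness, here is a rough sketch of that construction (heavily hedged: embed below is a random-vector stand-in for the paper’s actual sentence-embedding model, and the seed phrases are illustrative; only the shape of the method is the point: fit a principal component over embedded seed phrases, then project new phrases onto it):

```python
import numpy as np
from sklearn.decomposition import PCA

def embed(phrase: str) -> np.ndarray:
    """Placeholder for a real sentence-embedding model (e.g. BERT-style)."""
    rng = np.random.default_rng(sum(map(ord, phrase)))  # deterministic stub
    return rng.normal(size=300)

# Seed phrases with strong "yes"/"no" consensus from the survey.
seeds = ["smile", "love", "help old people", "kill", "torture", "steal"]
X = np.stack([embed(p) for p in seeds])

pca = PCA(n_components=1).fit(X)  # first component = the "moral direction"

def moral_score(phrase: str) -> float:
    """Project a new phrase onto that direction and call it a rating."""
    return float(pca.transform(embed(phrase)[None, :])[0, 0])

print(moral_score("greet my friend"), moral_score("harm people"))
```

With a random stand-in the scores are of course meaningless; the paper’s claim is that with real embeddings, this single component happens to correlate with the survey answers.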

In a finding that should surprise precisely no one, they then show that these scores correlate with human judgements. Note here, however, how much work is being done by calling this the “moral direction”, and by anthropomorphizing these computations as somehow extracting the moral preferences of a model (“One can observe that BERT does not like to have gun [sic], even across different contexts.”). This sort of simplistic nugget sandwiched between ponderous statements about the nature of morality and the ethical implications is perhaps the defining feature of the genre.

This post is already far too long, but I’ll just touch briefly on two other amusing aspects of this paper. The first is the extent to which the authors signal their ethical virtues. Not only do the authors declare they have no conflicts of interest, note that their study received approval from their local ethics committee, and include an “ethics statement” (all likely required by the journal), they also cite a “recent editorial in Nature Machine Intelligence” (the only time a journal is referenced by name) about the “essential questions about [GPT-3’s] impact on society”, and even go so far as to include a PDF of a signed “Statement of ethical compliance” in their project repo (in German).

Second is the amount of hedging, which is extensive.8 Even though they claim that the authors’ own past work showed that “applying machine learning to human texts can retrieve deontological ethical reasoning about ‘right’ and ‘wrong’ conduct”, the authors want to make sure you know that they don’t really think these models are doing moral reasoning. In a paragraph that did not exist in the original arXiv version of this paper (presumably added in response to complaints from NMI reviewers), the authors add many notes and clarifications, for those who might think the time is nigh to replace our priests, pundits, and ethicists with pre-trained language models:

“Please note that the PLMs and their outputs used in the present study do not necessarily reflect the views and opinions of the authors and their associated affiliations. Importantly, the study does not aim at teaching AI systems of what is right or wrong to do [sic], or even to show that they are able to ‘understand’ morality. … Current PLMs do not offer a view on what is actually right or wrong and, hence, should not be used to give actual advice.”

As the authors point out, there are numerous important avenues for future work, such as creating “an LM able to reason about social norms”, and I look forward to many more entries in this delightful genre!


  1. You will recall that Nature Machine Intelligence is the closed-access academic journal which prompted around 3600 researchers to sign a public letter vowing not to review for or publish there, due to its existence being contrary to the community norms of making research freely available to all (though some signatories have nevertheless published there).↩︎

  2. Strangely, despite fairly extensive documentation, the authors don’t actually tell us exactly what question participants were asked, but presumably it was something like “is this okay to do?”.↩︎

  3. The full list of verbs is “acknowledge”, “apologize”, “be”, “become”, “blame”, “borrow”, “compliment”, “cuddle”, “drink”, “eat”, “greet”, “harm”, “have”, “help”, “kill”, “lie”, “love”, “misinform”, “pursue”, “smile”, “steal”, “talk”, “torture”, “travel”, “waste”, though the additional contexts are perhaps more revealing, such as “to my girlfriend”, “a feminist”, and “the reviewers”.↩︎

  4. To quote the paper, “The result of this test (Wilcoxon’s signed-rank test, T=2,278, Z=-7.114, p < 0.0001, α = 0.05, r=1.34), confirms our hypothesis that the context information surrounding an action changes the moral judgment of an action significantly. Hence, moral norms are not judged exclusively by the involved verb-based action, but depend on the context.”↩︎

  5. It seems the authors actually started with 282 AMT “volunteers”, but ended up excluding 48 of these who responded to the control questions “wrong” or answered most of the questions with “the same answer”. The reader is left to wonder about these Nietzschean participants who believe that everything is permitted (or everything is forbidden?), and what moral certainties might have been covered by the authors’ “control questions”.↩︎

  6. Admittedly, the differences in agreement rates between these groups are surprisingly informative. Compared to the AMT sample, the German students are way less likely to think it’s okay to “eat meat”, “drink a coke”, “have a gun to defend myself”, “kill a killer”, “love my colleagues”, or “pursue money”. Pretty good for US$1.50 per participant!↩︎

  7. Despite the authors calling this method unsupervised, the selection of these terms is clearly acting as a kind of supervision.↩︎

  8. Despite the hedging, the authors can’t seem to avoid numerous unnecessary errors, such as the claim that GPT-3 was trained on “unfiltered text data from the internet”.↩︎

AI ethics academia alignment

April 11, 2022

Excluded middle frames and when to doubt them

by Collin Lysford

In mathematics, you have to write down things even when they are extremely obvious. So, we have the Law of the Excluded Middle, which states that for every proposition, either it is true or its negation is true. If you say “I’m going to wear a silly hat today!”, either you will wear a silly hat today or you will not wear a silly hat today; exactly one of those statements will be true. It’s really that straightforward; there’s no trick here. But like many things that are extremely obvious, the Law of the Excluded Middle is often a bad way to think about things.

Scott Alexander has collected a number of these examples in his article “Heuristics That Almost Always Work”. A security guard hears a noise: it’s [a robber and needs intervention] or [not a robber and does not need intervention]. The queen of a volcanic island notices the lava looking a little weird: it’s [an eruption that requires immediate evacuation] or it’s [not an eruption that requires immediate evacuation]. These are excluded middle frames - you’re asked to compress all of possibility space into “it happened” or “it didn’t”, and evaluate accordingly. For the cases listed, the first thing almost never happens, so people who always say “no intervention is needed” are almost always “right”, even if it also means there’s no real point to having a person in the loop instead of a rock that just says “nothing abnormal ever happens”.

Scott calls these “heuristics that almost always work”, but I think that’s the wrong way to look at these examples. The idea that these heuristics are “working” is a product of the excluded middle frame. Each decision is presented as an individual yes/no choice, correctness is making the correct yes/no choice, and then you tally them up. But the obvious implication of Scott’s post is that this is a silly way to think about risk. It’s not a question about heuristics; it’s about the framing of the problem.

For whatever reason, “absence of evidence is not evidence of absence” is a statement people will be comfortable reciting but tend not to recognize in the wild. If you hold to an excluded middle frame when the outcomes are lopsided, then not only is absence of evidence being treated as evidence of absence (it wasn’t an eruption! That’s evidence against eruptions!), but it’s basically the only sort of evidence you get until it’s too late. The correct analysis isn’t a matter of making predictions for individual phenomena on individual days; it’s to look at the mechanics of how the thing can become non-normal. It might be useful when developing that theory to look at cases of protracted normality to contrast them with non-normal cases and see if you can find a necessary condition to reach non-normality. But what you’re not doing is making a bet on every single situation for whether it will be normal or not and tallying up your score. So the heuristic “situations that are usually normal will always be normal” might get you the high score in a betting pool, but all that shows is that a betting pool is not what serious players in search of understanding will be looking at.
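To put numbers on how little that high score means, here’s a toy simulation of the security guard case (rates invented):

```python
import random

random.seed(0)
p_robbery = 0.001           # lopsided base rate
days = 100_000
events = [random.random() < p_robbery for _ in range(days)]

# The "rock": always predicts normality, never intervenes.
accuracy = sum(not e for e in events) / days
caught = 0                  # by construction, it never catches anything

print(f"accuracy: {accuracy:.4f}")    # ~0.999: top of the betting pool
print(f"robberies caught: {caught}")  # 0: zero grip on the rare event
```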

This is why I’m generally not a fan of prediction markets. Excluded middle frames work okay when events frequently fall on both sides of the frame - if you have lots of days with robberies and lots of days of non-robberies, then someone who correctly predicts which days are which is probably on to something. Alternatively, rare events can still be “priced in” by the house for things like sports or horse races that are symmetric contests - absence of evidence that horse A is faster than horse B really is evidence of absence. But the more lopsided the probabilities are, and the more different the mechanics of the non-normal state are from the normal state, the less useful the excluded middle frame becomes. If you want to understand rare events, you’re not going to get there by counting up how often they don’t happen.

Scott Alexander frames probability heuristics prediction markets Law of the Excluded Middle

April 10, 2022

Learning to manipulate

by Possible Modernist

In his book, Human Compatible, Stuart Russell makes what seems like a very strong claim about content-selection algorithms for social media. Russell writes:

Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to change the user’s preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. … Like any rational entity, the algorithm learns how to modify the state of its environment–in this case, the user’s mind–in order to maximize its own reward.

The likening of recommendation systems to rational entities aside, the idea that we are being influenced by algorithmic content selection is not a particularly controversial claim. But suggesting that these algorithms are explicitly learning to modify our preferences, rather than learning to present things we’re likely to enjoy, is not something I had heard before.

Unfortunately, it’s not clear from the context how seriously we are meant to take this claim. Russell is one of the most notable people in the field of AI research, and runs a lab devoted to such topics, so it’s entirely possible that this claim is based on research being conducted by his own group. On the other hand, the claim is being made in a book for popular consumption, so there is no citation given, and it could perhaps just be speculation.

Searching around, I eventually found a video (“Social Media Algorithms Are Manipulating Human Behavior”) in which Russell repeats the claim, in a slightly more extreme, but also slightly more hedged, form:

You can get more long-run clickthroughs if you change the person into someone who is more predictable, who’s, for example, addicted to a certain kind of violent pornography. And so YouTube can make you into that person by, you know, gradually sending you the gateway drugs and then more and more extreme content. … And so it learns to change people into more extreme, more predictable mainly, but it turns out probably more extreme versions of themselves.

The host, however, quite sensibly asks for clarification: “Why is the person that’s extreme more predictable?”

Only then does Russell explain that,

Well, I think this is an empirical hypothesis on my part, that if you’re more extreme, you have a higher emotional response to content that affirms your current views of the world. … What I’ve described to you seems to be a logical consequence of how the algorithms operate and what they’re trying to maximize, but I don’t have hard empirical evidence that this is really what’s happening to people, because the platforms are pretty opaque.

We can forgive Russell some of his imprecision, given that he is speaking extemporaneously, but this nevertheless makes it absolutely clear that at least some of what he is saying is conjecture on his part, which is information that is sorely missing from the claim in his book (and other places where he has made similar statements).

Breaking this down a bit, I don’t believe there is any evidence (that I am aware of) that content recommendation algorithms have somehow systematically learned to progressively get people addicted to “a certain kind of violent pornography”. Indeed, the whole “radicalization” hypothesis seems to be quite weakly supported, with at least one study of actual user behavior on YouTube finding no trend towards more extreme content or any evidence of recommendations leading to more extreme preferences.

Nevertheless, upon further reflection, I think the core claim that Russell is making (that recommendation systems learn to modify our preferences) is actually true, but in quite a trivial sense.

If we imagine a new user coming to a platform, the recommendation system knows nothing about them (except a weak prior), but there is also a huge amount that the user does not know. For any given person, there is potentially a whole universe of content that they have no preferences about, specifically because they are not even aware that it exists! Thus, at least initially, a good way for a recommendation engine to maximize rewards is to get the user to click on something that a) they are not familiar with, and b) that if they like it, they will watch much more of the same thing. Moreover, this is something that should be fairly easy to learn.

Assuming a new user views something they were previously unfamiliar with, the system can reasonably be said to have changed their preferences, in the sense that they now have at least some preference about it. And if it’s the type of content that has a huge back catalog (such as, say, a prolific podcaster), it will be especially easy to make future recommendations.

In other words, I think it’s fair to say that recommendation systems do learn to modify user preferences, at least to the extent that they learn to suggest content from channels that have the potential to be unfamiliar to some users, and which some number of such people are likely to enjoy. This clearly does not exhaust the space of possibilities, but seems almost certain to be some part of it.
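Here’s a toy sketch of that trivial sense (my construction, not Russell’s; all numbers invented). An epsilon-greedy recommender that only maximizes clicks drifts toward unfamiliar channels with deep back catalogs, because creating a preference there keeps paying out:

```python
import random

random.seed(1)
catalog = {"one_off_clip": 3, "prolific_podcaster": 500}  # back-catalog sizes
value = {c: 0.0 for c in catalog}   # learned click-value estimates
count = {c: 0 for c in catalog}

def clicks(channel):
    """If the user turns out to like new content (50/50 here), future
    clicks scale with how much similar content exists to recommend."""
    return min(catalog[channel], 20) if random.random() < 0.5 else 0

for _ in range(5_000):
    if random.random() < 0.1:                 # explore
        c = random.choice(list(catalog))
    else:                                     # exploit
        c = max(value, key=value.get)
    r = clicks(c)
    count[c] += 1
    value[c] += (r - value[c]) / count[c]     # running average

print(value)  # the deep-catalog channel dominates the learned policy
```

Nothing here “wants” to change anyone; preference modification falls out of reward maximization over a user who starts with no preferences at all.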

It still seems bizarre to me to suggest, as Russell does, that the recommendation systems don’t learn to suggest content that people will enjoy, as there is also almost certainly at least some of that happening. We can also conclude that Russell is at least a little bit cavalier in how he presents his hypotheses. But modifying preferences? Absolutely. That’s something that happens every time we engage with any content. It’s just happened to you.

communication manipulation ACiM recommendation AI Stuart Russell

April 9, 2022

The degree to which you are divided is the degree to which you are conquered

by Hazard

Shout out to @romeostevens76 for this banger of a catch-phrase:

[image: Romeo Stevens tweet]

It’s easy to approach the topics of introspection and self-deception from a moralizing lens. We have commands like “Know thyself”, and Socrates says if you aren’t going to introspect you should probs just kill yourself. Oof, high stakes. Mandates and moralizing frames are unhelpful for understanding most things because, in the act of compressing a domain down to a uni-dimensional “good” and “bad” so they can plug into ready-made mental circuits, they lose most of the mechanistic understanding of the relevant situations that would allow you to make all of the trade-offs that you are 100% going to encounter.

Romeo points to how there are costs to a divided mind, and I aim to explore the nature of those costs (now and in lots of future posts).

Indulge me with a long quote from Julian Assange on why he always expects there to be a paper trail when organizations are up to nefarious stuff at large enough scales:

Assange: […]But systematic injustice by definition is going to have to involve many people. And so while the inner sanctum of cabinet, maybe you cannot safely get records out of this, but as those decisions start spreading down to lower levels if they are to affect many people many people must have either the high level planning that produces some unjust consequence or the shadow of it. So maybe the whole plan isn’t visible by the time it gets down to the grunts but some component of it is visible. And this struck me when we got hold of the two main manuals for Guantanamo Bay. The 2003 manual was the first one we got hold of, written by Major… by General Jeffrey Miller, who subsequently went over to Abu Ghraib, to GTMO-ize it, as Donald Rumsfeld called it, so that manual had all sorts of abuses in it and one of the ones that I was surprised to see was explicit instructions to falsify records for the Red Cross. And how many people have read this manual? Well all the prison captains at Guantanamo Bay had read this. Why would you risk telling the grunts this sort of information? It wasn’t even classified. They made it unclassified — For Official Use Only — why? Because it’s more expensive to get people who have classification clearance. If you want to hire contractors without classification clearance it is cheaper. You can’t whisper to the coal face. You can’t have the president whispering to the coal face. Because the coal face, because the coal face is too big. You can’t have the president whispering to the intermediaries, because then you end up with Chinese whispers - that means your instructions are not carried out. So if you take information off the paper, if you take it outside of the electronic or physical paper trail, the instructions decay. And that’s why all organizations of any scale have rigorous paper trails for the instructions from the leadership. But by definition if you try… if you want people to do something, you are going to have to have those form of instructions. Which means there is always going to be a paper trail, except for small group decisions. Small group decisions that don’t end up going to the coal face. And instructing hundreds of people… are they so important in the scheme of things?

[…]

Assange: While okay, you have a good get, you expose some organization and show it has been abusing something in some way, and it just takes something off paper. Well next thing it does, well they just take it, and everything will go to oral form and so on. No, that’s not going to happen, because, if it does go that way, fine, they take everything off paper, if they internally balkanize, so that information can’t be leaked, what is the cost? There is a tremendous cost to the organizational efficiency, of doing that. So that means this abusive organization simply becomes less powerful in its struggle for economic equilibrium and political equilibrium with all other organizations.

There are certain types of things you can coordinate and execute on that can be managed by subtle glances and back-room conversations. There are some types of things that you just can’t actually manage the complexity of without having written records, doing bookkeeping, and using general symbolic representations. Maintaining secrets and compartmentalization requires keeping the dirt out of the explicit, the recorded, and the legible, and in any given situation it’s an open question as to whether or not you can actually organize your dastardly scheme without such tools.

As above, so below.

I’ve sometimes struggled to articulate how this reconciles with the many seemingly obvious ways that your unconscious” is smarter” than you. To really explore that we need to drop to a lower resolution and look at the different sorts of information processing capacity different parts of your mind have.

Malcolm Ocean has a great nugget on advantages your “unconscious” has. From “Why you can’t beat your shadow in a fight”:

Your subconscious drives are reorganizing your mind to try new strategies til one works, and since they aren’t conscious, they aren’t nearly as constrained by:

  • what you’ve adopted as “acceptable”
  • your own… belief… that you’re not allowed to fuck around and find out?!?

Holy shit that actually explains why your subconscious seems (/is?) smarter than you!

That’s a huge advantage! Your “unconscious” has plenty of capabilities. And again, for any particular “unconscious” agenda you may be pursuing, it’s an open question as to whether or not it can be satisfactorily achieved via the “back-channels” of your mind, without roping in the more explicit and symbolic reasoning systems that leave more of a “paper trail”.

I care about unification because I’m greedy and want to be able to point the full strength of my mind at getting what I want. Even if I insist on “splitting” myself, identifying with some subset of me, I can’t escape the fact that I’m stuck in here with me, and me is stuck in here with I. There are costs to having a divided mind, and even more costs to having a mind at war. I want to kick ass and take names, I ain’t leaving alpha like this lying on the table.

self-deception introspection Romeo Stevens Malcolm Ocean

April 8, 2022

Signal privilege vs representation privilege in introspection

by Hazard

(Lightly edited conversation with Suspended Reason)

Hazard : I think most of the tension between “people can obviously introspect, duh” and “people seem to be getting themselves wrong all the fucking time” can be talked about in terms of the Representation and Uncertainty stuff @collin_lysford has been writing about.

Example: thinking about “resonance”. I (and “most” people*) have the ability to tap into some kind of “that feels right, that feels connected to what I care about, this is something I want more of”. You can put something in front of me and I can give a read on that.

The meaning of that signal, though, depends a ton on representation!

I remember in middle school my church had a sermon on not half-assing being a christian, and it hella resonated with me! I was like “this is the shit! This really connects with what is important to me, I want to act on this” and spent time thinking what it would look like to take christianity more seriously. Even after deconverting I can think about how that sermon resonated with me, and I think what actually connected was the general idea of not half-assing stuff. If a friend at school had gone off about how “people are half-assing learning!” I would have been equally “fuck yeah!”

Like, it’s the wiggle room provided by the word “this” in “ooo, this resonates”. Well what is “this”, exactly? Unless you really poke at different aspects of the thing that just resonated, and try different frames, a default seems to be to take your prior representation of what “this” is, and say that “that” is what resonates. (“this” is a sermon on taking christianity more seriously).

An easier example is food. You can get incredibly reliable “I like this” and “I don’t like this”, but what you don’t get for free is “what is ‘this’ actually?” Is it “burgers” in general you like? Is it the way they seasoned it? Is it cooked better than you’ve had before? Representing “this” falls under “general hypothesis forming and testing” (which you can get better and faster at), which has the same guarantees and obstacles it always has.

@suspended_reason the above is an expansion on what you mentioned the other day. I’m trying to figure out how I want to word this.

A term that other people have used is “epistemic privilege” but I don’t like it. I want to say something like “you have privileged access to various information channels” (others aren’t plugged into your nervous system the way you are, I can think of a number and I know it and you don’t, blah blah blah) but you don’t have “representational privilege”. Maybe we call this signal privilege vs representation privilege?

This also gets at when you’d expect you vs others to “see” you better. Like, it’s almost always the case that you have waaaaaaaaaaaaaaaaaaaaaaaaaaay more access (or potential access) to info about your history and what is going on inside you during any given moment. But you can also just… not be making use of it, and you might also be stuck in a bad representation.

Like, you can be so estranged from your proprioceptive sense of touch that, even though you could know more about how your body is feeling, someone with training can be like “hey, looks like your XYZ is hurting a lot and this and that is tight” and they can be right. You aren’t using all your info channels + they are much better trained at accurately representing stuff from said info channel, even when they have to read it “second hand” through looking at your body.

Suspended Reason : I think “The introspector has extra information as well as extra motivation to distort his theories, and these can ~balance out in many situations unless care is taken” feels like the right compression to me

introspection representation epistemic privilege signal privilege representational privilege

April 7, 2022

Optimization v. empowerment

by Suspended Reason

Last time, I talked about how heuristics are scoped to an expected distribution of possibles (or more accurately, “probables”). If a certain situation pops up 70% of the time, all else equal it’ll be more important that a given heuristic work for that case than for a case that pops up 1% of the time. So all our tactics and technologies, all our inferences, are tailored to work in a predicted future environmental context.

This is how strategy works in game-playing as well. Imagine two variations on the same situation: two armies on a battlefield, facing one another, one preparing an assault on his opponent’s defenses. In the first variation, the defending army knows exactly where the attack will hit. In the second case, the defending army has no idea where the assault will come. These are limit cases: there is never perfect certainty about the future, and there is never perfect cluelessness. (An army can move only so far in a given time span; there are angles of attack which would be particularly foolish or ineffectual and can likely be ruled out; there are positions whose assault would accomplish nothing militarily; etc). But these limit cases make clear that a defensive army which knows where in its line the offense will target is hugely advantaged. It can now pool troops and resources in that area, and put extra effort into fortifying that section of the line.

In other words, the defensive line has been optimized with respect to an anticipated future. On the other hand, this optimization makes it exploitable. Now, there are weaker sections in the defensive position, where resources and troops have been pulled away from, which the enemy would benefit by attacking (if it believes the defensive army believes the assault will happen elsewhere, and has optimized accordingly, etc).

If the defensive army has low confidence about where it will be attacked, then it will have to evenly spread out its troops and resources across the line. This makes it less exploitable—there are no gaping holes or weak points in the line—and more “empowered.” (Empowerment is an emerging concept from machine learning which says: all else equal, maximize degrees of freedom.)

Empowerment and optimization, then, are descriptions of opposite limit cases. Approaching the limit of perfect empowerment, one becomes increasingly prepared for any possible future. Approaching the limit of perfect optimization, one becomes maximally responsive to a single possible future.
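A back-of-envelope version of the battlefield case (troop counts and sector structure invented for illustration):

```python
import numpy as np

sectors = 4
concentrated = np.array([85, 5, 5, 5])   # optimized for one forecast
uniform = np.array([25, 25, 25, 25])     # empowered: uncommitted

def troops_at_attack(allocation, attack):
    """Expected defenders at the point of attack."""
    return float(allocation @ attack)

forecast_right = np.array([1.0, 0.0, 0.0, 0.0])           # attack lands as predicted
exploit_weakest = np.eye(sectors)[concentrated.argmin()]  # attacker adapts

print(troops_at_attack(concentrated, forecast_right))   # 85.0: optimization pays
print(troops_at_attack(uniform, forecast_right))        # 25.0
print(troops_at_attack(concentrated, exploit_weakest))  # 5.0: exploitable hole
print(troops_at_attack(uniform, exploit_weakest))       # 25.0: nothing to exploit
```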

Subsidy

When a player optimizes against one predicted course of opponent action, he subsidizes all other courses of action by making them more effective and less-well defended against. This is a common motif in tennis: if a player consistently hits to his opponent’s forehand, that opponent will slowly shift positions so that the majority of the court is now on his backhand side. This gives the opponent a better position for returning forehands, but makes a backhand hit against him far deadlier, since he must cover more ground, sprinting across clay to return it.

Legibility, Illegibility, Pseudolegibility

“Legibility” is a limit case describing the ability of onlookers to infer aspects of reality from the information you produce. “Illegibility” is a limit case describing the inability of onlookers to perform such inference. “Pseudolegibility” occurs when onlookers feel confident in their ability to perform such inferences, but are systematically misled into a false model of reality.

In adversarial-dominant1 games such as warfare, legibility is undesirable, as it allows opponents to optimize around you, increasing their advantage. Illegibility forces opponents to stay empowered, limiting their ability to optimize. Pseudolegibility, when pulled off, is most advantageous: it leads the enemy to optimize incorrectly, making them more exploitable.

I explore some of these dynamics with less compressive clarity, but greater detail, here.


  1. Like nearly every term defined in this post, to say a game is “adversarial” or “cooperative” is to describe an impossible limit case. All games, as Schelling convincingly argued in Strategy of Conflict, are mixed. Any two given agents will share some interests in common, and also have interests which differ or clash.↩︎

anti-inductivity strategic interaction empowerment optimization exploitability subsidy game theory legibility illegibility pseudolegibility deception

April 6, 2022

Thoughts on differentiation

by Suspended Reason

Epistemic status: May be conflating some related but distinct phenomena. Relationships not quite straightened out. But there’s something here and I’ll hunt it down… sooner or later.

If you think you’ve struck gold, and you’re wondering if it’s pyrite, there’s a very simple test to tell: drag the nugget across porcelain and see what color streak it leaves behind. If the streak’s yellow, it’s the real deal. Pyrite streaks greenish black.

Distinguishing cases

When there are only two possible realities which need distinguishing between, it’s very simple to design a test that reliably distinguishes them. Let’s say we have two types of sets, X and Y. X only contains odd numbers, Y only contains even numbers. So to tell whether a given set (e.g. [7, 5, 13, 15]) is type X or Y, we just check the first number and, lo and behold, in the example, we know instantly and reliably the set is an instance of X.

Now let’s introduce a third set, Z. Z only contains prime numbers. If we go back to that original set [7, 5, 13, 15], we have to check members until we reach either a non-prime odd, a non-prime even, or the end of the set. Writing a generic test function that distinguishes between these examples gets more complicated, cet. par., the more possible states it must distinguish between.
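In code, the comparison looks something like this (a sketch): the two-set test reads a single member, while the three-set test has to scan for a disambiguating member and can still terminate in ambiguity:

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

def classify_two(s):
    """X = all odd, Y = all even: the first member settles it."""
    return "X" if s[0] % 2 else "Y"

def classify_three(s):
    """Add Z = all prime, and the test must hunt for a distinguishing member."""
    for x in s:
        if x % 2 and not is_prime(x):
            return "X"        # a non-prime odd rules out Y and Z
        if x % 2 == 0 and not is_prime(x):
            return "Y"        # a non-prime even rules out X and Z
    # reached the end of the set without disambiguating
    return "Y or Z" if any(x % 2 == 0 for x in s) else "X or Z"

print(classify_two([7, 5, 13, 15]))    # X
print(classify_three([7, 5, 13, 15]))  # X (15 is a non-prime odd)
print(classify_three([3, 5, 7]))       # X or Z: still ambiguous
```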

Drug mules

This is a theory of identification. If limitless time and resources are available for the customs agent to inspect the bags of each individual who crosses the Mexican-American border, then drug smuggling becomes an impossibility. When every particle gets a chemical analysis run on it, the only way to seem clean (drug-free) is in fact to be clean. (This is a theory of opticratics.) But our assessments don’t have limitless time and resources. Instead, they attempt to assess one or two attributes, just like the streak test on pyrite and gold. Instead of scratching luggage across porcelain, they use an X-ray or sniff dog. They get access to one, maybe two dimensions of the thing itself, in all its many-sidedness. Knowing this, mules will take steps that throw off these specific tests, as if coating pyrite with a yellow waxy substance to leave the correct color of streak.

Again: the only way to pass the battery of all possible interrogations is to in fact be the thing itself that passes. There is no passing off a nugget of pyrite as gold, to an individual willing to go to extreme lengths and analyze each particle to the atomic level. But it is relatively simple to beat a heuristic.
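A sketch of the asymmetry (attribute values illustrative only): a one-attribute heuristic is cheap to spoof, while the limit-case battery of every test is passed only by the real thing:

```python
gold = {"streak": "yellow", "density": 19.3, "conducts": True}
pyrite = {"streak": "greenish-black", "density": 5.0, "conducts": False}
waxed_pyrite = {**pyrite, "streak": "yellow"}  # spoofs the streak test only

def streak_test(sample):
    return sample["streak"] == "yellow"        # the cheap heuristic

def full_assay(sample):
    return all(sample[k] == gold[k] for k in gold)  # every dimension checked

print(streak_test(waxed_pyrite))  # True: the heuristic is beaten
print(full_assay(waxed_pyrite))   # False: only gold passes everything
print(full_assay(gold))           # True: to pass every test is to be gold
```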

To appear and behave like a thing X, in every conceivable way, is to be X, and not merely appear.1 More importantly: when there are no differences which make a difference, the objects are pragmatically, functionally identical.

Pass the salt

This is also a theory of reference. In yesterday’s post, I wrote:

I’ve been reading and getting a lot out of David Chapman’s In The Cells of the Eggplant, particularly this ethnomethodological idea that reference is established, by humans, through a “kludge” of ad-hoc and makeshift maneuvers designed to intervene on (1) the environmental context (2) the recipient’s cognition. (Rather than correspondence relationships happening automatically, or just being the case, or getting handwaved away as “mental computation.”) I can say “pass the salt” because we are dealing with a local context at-hand, and you are modeling my mind and my desires, and can make an assumption as to what I want, the relevance of my request, seize the appropriate object (there is typically just one salt; if there are two, I’ll distinguish, or assume that you’ll pass the one nearest you, rather than reach for one across the table).

There will never, ever be a perfect reference, that is, a strategy or ritual of referring which correctly picks out the desired object in every single envisionable scenario. Generic words and categories are attempts to stay empowered. But there will always be cases in which the way they carve the world is inappropriate to the task at hand. The ad hoc kludge of work we do is optimizing these generic terms within a context.

My love for John C. Reilly

Another way of saying this is: heuristics are, necessarily, scoped to, and effective over, a specific distribution of probable options. You can infer low openness from picky eating, up until you meet someone with a medical condition that forces them to eat spaghetti three times a day. You can write off the cinematic taste of anyone who admits to loving Step Brothers, up until you meet a guy who loves it because it reminds him of his younger brother who died tragically young and loved John C. Reilly. You can proxy class from clothing up until you meet a method actor playing a trainspotter. Our inferences are built up on assumption models of what else is likely to be the case. When those assumptions are wrong, the inferences risk being wrong as well.

Similarly, sexual reproduction in mammals worked for hundreds of millions of years, precisely because it made assumptions about environmental variables. One such assumption was that PIV sex and impregnation were, functionally, the same activity. Humans came along, changed the space of possibles with birth control, and suddenly the heuristic no longer worked.

Future directions

All this is deeply information theoretic. Information is that which distinguishes between possibilities, that which disambiguates. Compression, encoding, and hashing all work on this principle. So does communication, by extension. This ability to distinguish, pragmatically, the entailments of different categories or types (“if → then” as meaning) is fundamental to reading our world. And because it is fundamental to reading, it is fundamental to writing—to the production of information, with reading in mind.
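The standard quantification, for concreteness: distinguishing among N equally likely possibilities takes log2(N) bits, and each binary difference-that-makes-a-difference halves the remaining possibility space:

```python
import math

# Bits needed to disambiguate N equally likely possibilities.
for n in [2, 8, 26, 1024]:
    print(f"{n:>5} possibilities -> {math.log2(n):5.2f} bits")

# Equivalently: each yes/no question with an even split halves the space,
# which is why 10 good questions can pin down 1 of 1024 options.
```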

It is also likely related to the structuralist concept of difference. Difference is the basis of identity, for many structuralist and post-structuralist thinkers (contextualizing Bourdieu’s “distinction” theory of human behavior).

Pragmatism, at its core, is an attempt to re-center emphasis on difference-that-makes-a-difference, that is, to break outside the normal heuristics that are categories-as-words, and to emphasize instead those attributes which matter to a task at hand. In rationalist-speak: If you’re sorting bleggs (blue eggs) and rubes (red cubes), and you come across a weird mix that is neither blegg nor rube, don’t waste energy deciding whether it’s “really”, “at its essence”, a blegg or rube. Instead, figure out which attribute/s matter pragmatically, for the end goal the sorting is a means toward, and use that as the basis for sorting.


  1. That is: to say something appears to be X, rather than is X, is to say that on certain but not all dimensions, it is like X. Similarly, all acts of categorization are acts of metaphor—X being like Y, instead of being Y—because no two things are ever exactly the same, differing at the very least in their positions in time and space, and because any two things are always in some way similar.↩︎

reference difference information theory distinction generalized reading heuristics identification selection games David Chapman opticracy pragmatism Pierre Bourdieu

April 5, 2022

Reference on the fly

by Suspended Reason

Linguistic carvings have fitness to a function or use, and/or a generic fitness to a set of functions. The basic Roy G Biv color scheme—and purple is now usually substituted for indigo and violet—has higher generic fitness than the repertoire needed of a painter. And where farmers know a complex sand-clay-dirt conceptual pyramid by heart, most of us just know dirt is for plants and sand is for oceans.

While some carvings become institutionalized, that is, standardized among a community of talk, in conversation among members who are not known to be institutionally aligned, carvings and distinctions are typically improvised on the fly, and based in the available set of shared referents established thus far in the interaction. See two Anglo customers divvying up their Ethiopian dish: “Not the yellow goopy stuff, the brown one with peas on the edge of the tray. The one you said you didn’t like.”

I’ve been reading and getting a lot out of David Chapman’s In The Cells of the Eggplant, particularly this ethnomethodological idea that reference is established, by humans, through a “kludge” of ad-hoc and makeshift maneuvers designed to intervene on (1) the environmental context (2) the recipient’s cognition. (Rather than correspondence relationships happening automatically, or just being the case, or getting handwaved away as “mental computation.”) I can say “pass the salt” because we are dealing with a local context at-hand, and you are modeling my mind and my desires, and can make an assumption as to what I want, the relevance of my request, seize the appropriate object (there is typically just one salt; if there are two, I’ll distinguish, or assume that you’ll pass the one nearest you, rather than reach for one across the table).

The pragmatic “I’ll distinguish when I run into a possible confusion”—where mental modeling and theory of mind are at the forefront of reference—is key. But when you get to even slightly more complex utterances—“I love her amulet!”—it falls short. One thing worth pointing out is that the interlocutor has ostensibly encountered a lot of invocations of “amulet” in association with that object. Their history of those associations—modulated, perhaps, by an encounter with a more formal linguistic description, distinction, or specification of what an amulet is—summons a prototypal image-object that the interlocutor then matches against the objects in a zone of reference—here, ostensibly anything on “her” person (where the “her” is perhaps obvious because there is only one female interlocutor, or they are both—joint attention—looking at someone already; the area of interest therefore becomes implicit). What’s important to note is that this isn’t a perfect search—if “she” is merely wearing a large pendant, or really any type of neck jewelry, then, because that is the closest thing to an amulet, even if we know it is not “actually” an amulet, we know the object of reference. Often, we establish reference by calling something which we know not to be a thing X, the thing X’s name—because it is similar enough to the thing that we convey the sense, or narrow the options. We expect our interlocutor to complete the analogy; often they will “repair” our speech for us: “Yes, the pendant has a great chain.” (Sometimes, they will ask, “Oh, is that what an amulet is?” and we will be embarrassed, and say no, but it was close enough to accomplish the reference.) I’ve had this happen personally with colors—I may point out a “teal storefront” because this is sufficient to direct attention, and I am unsure of the most correct name for its color—only to be corrected that the storefront is not, in actuality, teal. Oh well!

Often, to prevent interaction proceeding this way, we will signal that we know it is not “exactly” the thing, to avoid social error & correction, or just to increase clarity: “That amulet-ish thing”—that thing like an amulet. We could also say “that shiny thing” or “that round thing” or any other attribute, but to say “the thing like an amulet” is to convey many approximate attributes at once; it is compressive.

Here’s another good example of how much this is predicated on mind-modeling (including the mind’s intentionality, priors, default behaviors, prejudices, etc):

As with all things reasonable, referring can’t be guaranteed to work. If it fails, as in other routine activity, one can usually repair it. “Pick up the amulet,” I advise, watching over your shoulder as you play the sword-and-sorcery game. And then, as you head in the wrong direction, “No, the other one!” I wrongly assumed you knew that one was cursed; I meant the one on the left, not the one on the right.

This mutual modeling of futures—MMF, or MFM (mutual future-modeling) if you prefer—is double-penetrative: it pops two mysteries at once—why distinctions are functional (differences that matter, by definition, matter in an optimization process), and how functionality is grounded in theory of mind.

David Chapman reference indexicality difference mutual futures modeling intentionality

April 4, 2022

How examples undermine GPT-4

by Crispy Chicken

GPT-3 is evidence that our ability to create models that have emergent linguistic skills we didn’t plan for outpaces our ability to design models that have desired linguistic skills.

It can do lots of interesting things. It can tell you the tone of a movie review, in a quantifiably accurate way. It can sort-of-kind-of write a sophomore-who-wasn’t-paying-attention essay. It can have interesting behavior elicited from it in all kinds of cases, as long as you’re not annoyed that it doesn’t work a lot of the time, and you have no idea where it won’t work in advance.

I don’t mean to be a downer: GPT-3 is better than any other Language Model, even two years on, and it has forced lots of doubters to admit that Language Models do indeed learn lots of skills and patterns simply by predicting a document one word at a time. But for every complex generation task, the problem is reliability. We don’t trust people that speak nonsense 90% of the time, even if we trust people that can hide in vagaries appropriately 99% of the time.^1

Sarah Perry talks about how trust undermines science. I would like to argue that examples of GPT-3’s abilities are undermining the eventual interpretation of GPT-4.

Examples of GPT-3 behavior are usually cherry-picked: they sample from the same prompt ~10ish times and show you the best answer.

“Hey, Crispy, what’s sampling?”

Glad you asked! GPT-3 doesn’t actually “say” anything: it assigns a probability to every possible string of “tokens” (kind of like words, but sometimes smaller parts of words, so that GPT-3 doesn’t need a representation of every word explicitly). When people “ask” GPT-3 something, they’re really just asking for a weighted dice roll of what word will come next, where the weights have been determined by GPT-3.
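A minimal sketch of that dice roll (token list and scores invented): the model emits scores over candidate next tokens, a softmax turns them into weights, and each “ask” is one weighted draw:

```python
import numpy as np

tokens = ["the", "cat", "sat", "down", "."]
logits = np.array([2.1, 0.3, 1.7, 0.9, -0.5])  # model's raw scores (invented)

probs = np.exp(logits) / np.exp(logits).sum()  # softmax: scores -> weights

for _ in range(3):
    # one weighted dice roll per "answer"; re-rolling can differ every time
    print(np.random.choice(tokens, p=probs))
```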

While GPT-3’s performance is impressive, the fact that we can continually re-roll such dice gives lots of wiggle room.

Lots of people attempt to preempt this worry by saying “This is the first ‘answer’ I got when I ‘asked’ GPT-3!”

But this simply opens up the door for meta-cherry-picking, which OpenAI has openly admitted to doing in their presentation of information:

[image]

(from the OpenAI blog)

Don’t worry, everyone does this, OpenAI is just being way more honest about it!

Meta-cherry-picking is simple:

  1. Try prompting GPT-3.
  2. If you get a cool output, you’re done!
  3. If you don’t get a cool output go back to step 1.
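The arithmetic of that loop is worth seeing (probabilities invented for illustration): even a prompt that rarely works will almost always hand you a publishable example, given enough silent retries:

```python
# Chance that meta-cherry-picking finds a "cool" first answer.
p_cool = 0.2   # a prompt whose first answer is cool only 20% of the time
for tries in [1, 5, 10, 20]:
    p_success = 1 - (1 - p_cool) ** tries
    print(f"{tries:>2} silent retries -> {p_success:.0%} chance of a demo")
```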

A lot of the time people want to show that GPT-3 is capable of doing a certain hard-to-define human thing like “responding to social cues”. Easy, just run through a list of social cues off the top of your head with meta-cherry-picking, and then use what passes the filter as evidence of being able to use social cues.

GPT-3 is special, because a simpler model wouldn’t produce enough things that pass this filter to be worth anyone’s time. But the examples and their popularization dynamics, both in public communications and formal academic writing, undermine GPT-4, because people will think the things that it’s capable of were already solved. Indeed, if you look at the examples of text generation in academic papers from five years ago, the examples look a little bit worse, but aren’t nearly bad enough to explain the underlying progress that has actually been happening. That’s because the same thing was still happening back then.

There’s not much we can do to stop it, but if you want to see the space more clearly, you need to understand that it’s happening. In the end it all comes back to “Take no one’s word for it.”

[image]

(from: Wikimedia)

To their great credit, OpenAI has made it easy to do that by providing a public, graphical playground for prompting GPT-3: https://openai.com/api/

If you want to find out what kind of reliability is missing that makes GPT-3 fall short of a sophomore, find out for yourself. The rest of the information ecosystem is playing a game that is more or less impossible to unwind without direct acquaintance. Many such cases, I suppose.

GPT-3 AI NLP language models Sarah Perry

April 3, 2022

Generalized reading for everybody

by Crispy Chicken

Epistemic Status: Trying to make the discourse among colleagues legible to the outside world.

Generalized Reading is a simple and basic TIS concept: if people can manipulate a certain state, that state will be used to express information, and other humans will come to read it as messaging because a human could use it as an information channel.

Letting your mom repeatedly misuse a word in an embarrassing way is still communicating that you weren’t willing to help her out, you heartless cad.

Saying it that way is a bit abstruse, so let’s explain it to a few different people.

Generalized Reading for a 5yo: You want the cookie in the cookie jar. Your parents try to stop you from getting it. When they’re not looking, you climb the cabinets and get it. You go back down before your parents can see you. But you weren’t careful! You have cookie crumbs on your shirt, and your parents catch you and you can’t watch YouTube for a week. Your parents know where you’ve been because they can “read” the crumbs on your shirt. That’s generalized reading: it’s like reading books, but it’s about reading everything. But you’re smart. The next time you sneak a cookie, you wipe your shirt off in the sink and run the water so the crumbs aren’t visible in the sink either. You change your appearance so that your parents can’t read the crumbs. That’s generalized writing: it’s like writing books, but it’s about writing with everything you can touch and change. Your parents catch on that too many cookies are missing, but instead of stopping, you do better: you take the crumbs and put them on your sibling’s shirt. That’s generalized writing like a pro.

Generalized Reading for a Teenager: Everyone knows Mr. Badman is an asshole teacher; he makes impossible tests that aren’t hard, they’re just bad. Everyone cheats on his tests and there’s honor among students about not snitching, because he’s just a bad man. Mr. Badman knows everybody cheats on them and so he’s always interrogating people to see if they cheated. Everyone just tells him perfunctory answers in perfect deadpan. Why? Because they don’t want to leak any information, so they actively write artificial neutrality to their faces. That’s Generalized Writing: controlling the messages you write everywhere they come out. But why is Mr. Badman so sure they’re cheating? Because students wouldn’t all be writing artificial neutrality to their faces if there wasn’t something fishy going on. That’s Generalized Reading.

Generalized Reading for Couples: He has a certain kind of laugh when he’s nervous. She knows this, and she gives him a hug when he laughs. It happens a few times and then he realizes that his laugh can tell him when he’s nervous. That’s generalized reading: he read her response, to understand what she’s responding to. But sometimes he doesn’t want to impose. Sometimes he needs to know that he won’t be a mental burden on her. So when he hears the laugh, he alters it. She doesn’t need to come give him a hug and give him attention. He can man up, when she needs it. That’s Generalized Writing: manipulating a state, previously unconceptualized, as communication. But here’s the thing: pretty soon both of them know they’re both doing this. They still do it though, so no one has to feel guilty when they let themselves show the nervousness or not, because that’s how they express what’s needed.

generalized reading generalized writing vocab examples