Rex Kerr
Jun 3, 2023


As an example, I just told GPT-4 a story that is exactly equivalent to the Iterated Prisoner's Dilemma, without warning it that this was what I was doing, and it got thoroughly confused about what was happening: it stated that both sides should betray each other, but also that both sides would get the maximum reward that way. It certainly didn't come up with superrationality; it didn't even properly analyze the "rational" solution, because it missed the dependency of the players on each other (i.e. if player 1 betrays player 2, then both players start from the low baseline, not the high one).
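
For anyone who wants the structure spelled out, here is a minimal sketch in Python, assuming the standard textbook payoffs (T=5, R=3, P=1, S=0; these are illustrative, not the numbers from the story I actually told it), of why "both sides should betray" and "both sides get maximum reward" can't both be true:

```python
# A minimal sketch of the (iterated) Prisoner's Dilemma with the standard
# textbook payoffs T=5, R=3, P=1, S=0. The numbers are illustrative, not
# the ones from the story itself.

PAYOFFS = {  # (player 1 move, player 2 move) -> (player 1 payoff, player 2 payoff)
    ("cooperate", "cooperate"): (3, 3),  # mutual cooperation: reward (R, R)
    ("cooperate", "defect"):    (0, 5),  # sucker's payoff S vs. temptation T
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),  # mutual defection: punishment (P, P)
}

def totals(rounds):
    """Sum each player's payoff over a sequence of (move1, move2) rounds."""
    t1 = t2 = 0
    for move1, move2 in rounds:
        p1, p2 = PAYOFFS[(move1, move2)]
        t1 += p1
        t2 += p2
    return t1, t2

print(totals([("defect", "defect")] * 10))        # (10, 10): the low baseline
print(totals([("cooperate", "cooperate")] * 10))  # (30, 30): what "maximum reward" would require
```

Over any number of rounds, mutual defection leaves both players on the low (1, 1) baseline while mutual cooperation leaves both on (3, 3); that is precisely the dependency GPT-4 missed when it recommended betrayal and promised maximum reward in the same breath.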

Again, I'm sure it would handle this fine if I gave it any clue that I was actually talking about the Prisoner's Dilemma. (It answered a similar but not identical question correctly when I prompted it with the name.) But the only clue was the logical structure, which I obfuscated slightly by telling it in narrative style, and, unsurprisingly to me, it wasn't able to employ its language-embedded world model to solve the problem.

