Talk:Roko's basilisk/Archive1

Moved from talk:LessWrong
I have started getting messages from people upset by the basilisk, Dmytry's had a few too and tries to respond helpfully. (And on site. ) I am not sure how to respond - I don't really understand all the parts of it enough, but the hard part is how to talk someone down from an emotional reaction when I'm not sure I'm accurately modelling the person getting upset and don't know how to talk down someone prone to obsession over an intellectual epiphany.

I suggest a bit of a discussion, which can go on a subpage or something here. Something that is absolutely nice and not snarky at all, because it appears that as about the only place trying to describe the thing we're attracting its victims - David Gerard (talk) 22:10, 16 February 2013 (UTC)
 * Good idea. It may be good to have a list of objections at various levels. However it seems to me a big part of it is just general 'what if certain guy is right and it really is dangerous' style Pascal's wager which co-opts the irrational machinery built for cashing on the saving-the-world wager. Dmytry (talk) 22:43, 16 February 2013 (UTC)


 * It's not just Roko's Basilisk, but Pascal's Mugging-style arguments more generally. I can come up with possible approaches, but then I check them against my own idea of people, which is obviously based on me, and I pretty much stop at "that's a ridiculous idea" and if I think further, "that's a ridiculous idea for these reasons". But that's not in fact helpful. So I'm not sure how to proceed helpfully - David Gerard (talk) 23:48, 16 February 2013 (UTC)
 * I suggest we pray for them. DamoHi 23:56, 16 February 2013 (UTC)
 * I'm serious - David Gerard (talk) 00:03, 17 February 2013 (UTC)
 * I know you are, but don't expect the rest of us to take it seriously. Aren't they in exactly the same position as Lutherans worrying about whether they have been predestined for salvation?  Just part and parcel of religious idiocy, isn't it?  DamoHi 00:09, 17 February 2013 (UTC)
 * We do have quite good things like this to help people back to reason, I don't think it'd be off-mission - David Gerard (talk) 00:23, 17 February 2013 (UTC)
 * I guess a side by side rebuttal of the fears or similar would be good. Don't mind me, I just think the whole business is totally nuts.  DamoHi 00:30, 17 February 2013 (UTC)
 * I believe I understand it well enough, even though it's difficult (I believe some of the elements of the problem, like acausal reasoning, have the LW trait of unnecessary obscurity and unengagement with standard labels). I'd suggest, given the importance, that it might merit a separate article.  First, a summary similar to what we have here, and then a simple and coherent explanation.  Then, a rebuttal of the main points and reassurances.--[[Image:adsig.png|25px|link=User:AD|AD]]talk 15:38, 17 February 2013 (UTC)
 * Roko's basilisk <-- currently a redirect, just waiting to become a proper article - David Gerard (talk) 21:43, 17 February 2013 (UTC)
 * SI's 8 lives per dollar estimate may be a key for understanding how Pascal's wager works in susceptible individuals. Some folks there literally think that a pros and cons list of 1 item on the pros side, with made up probability and utility, constitutes expected utility which rational agent should maximize. Dmytry (talk) 17:29, 17 February 2013 (UTC)
 * So, the issues are: 1. Debunk the theory. Even EY said it was wrong (but then went on to imply it was a line of enquiry to be afraid of), so this should be quite possible within the belief system. 2. The psychological issues. Going to a therapist with an existential crisis caused by a theory you know is weird is difficult, but they do deal with existential crises in general, and I'd hope with ones where the subject has fallen into an OCD pit - David Gerard (talk) 00:14, 18 February 2013 (UTC)
 * The problem with debunking the theory is that it operates in LessWrong's convoluted theology. Refuting the basilisk would either require us to pretend like LW's rhetoric is sane, or to pick apart their entire weird thought process.-- "Shut up, Brx." 00:33, 18 February 2013 (UTC)
 * I don't really agree with that Brx. I don't think the reasoning behind the basilisk is all that complicated, just weird.  Essentially it is a form of Pascal's wager, but it has been dressed up to sound profound.  We don't need to, for example, analyse the sequences to comment on this issue.  DamoHi 00:56, 18 February 2013 (UTC)

Got a pretty good start, I think. It would be good to replace what's on this page with a summary and link to the main article now. I would think it's important to try to keep things very clear and simple in the discussion on the new article. Help would be appreciated; I'm in India and my internet connection sucks.--talk 09:05, 18 February 2013 (UTC)

More work
As I see it, we don't really need the trivial blow-by-blow of the matter, but analysis would be good. Thoughts?--talk 09:01, 18 February 2013 (UTC)


 * I dunno, SBS of the original wouldn't be wrong to have here. The history (a spectacular failure of community management concerning supposedly hazardous information) would be apposite too. Now we have this article we can move stuff over from LessWrong piecemeal - David Gerard (talk) 10:59, 18 February 2013 (UTC)
 * It just doesn't seem very interesting to me, because it reads like a dissection of a relatively unimportant series of events that obscures the larger picture. Perhaps, after the first section, we should add another section with the blow-by-blow?--[[Image:adsig.png|25px|link=User:AD|AD]]talk 12:20, 18 February 2013 (UTC)
 * Yeah, not urgent. I think it'd be fun to do because the whole memeplex contains some very shaky pieces, including the ones in the chain that make up the basilisk - David Gerard (talk) 13:40, 18 February 2013 (UTC)

The role of acausal hocus-pocus in the basilisk scenario
I just posted about it here: "The two parts of the basilisk".

It's a tricky topic to address, because even the originator of timeless decision theory says he doesn't understand all the issues, and the subsequent discussion has been clouded, not just by the censorship, but by everyone projecting their own ideas about how the interaction between present human and future AI is supposed to work. So any attempt to discuss this aspect of the affair, that doesn't adhere to original texts with obsessive philological caution, is likely to falsify the facts somehow. The whole thing is a comedy of errors, from start to finish, but reconstructing the exact play-by-play of who tripped over what will only be of interest to a very small minority. Mporter (talk) 12:10, 18 February 2013 (UTC)
 * Can't they just program their imaginary AI to love? Boom.  No more problems.-- "Shut up, Brx." 12:18, 18 February 2013 (UTC)


 * Good blog post! Yeah, it's not as simple as "the future AI will punish you if you didn't help it enough", that's comprehensible - the basilisk is far sillier than that - David Gerard (talk) 13:41, 18 February 2013 (UTC)

Step by step refutation
There's a number of chained ideas that have to be true for (a) the basilisk to work (b) the basilisk to affect someone. I'd like to use this section to dissect them. Please hack away, and we can build a basilisk that's been knocked down. Possibly we need to side-by-side the original text of the basilisk - David Gerard (talk) 13:23, 18 February 2013 (UTC)


 * that TDT holds and not only makes sense, but is so obviously true that a random AI in possibility-space would definitely zoom in on it, the way they would on an accurate theory in physics
 * This one is a big problem: the victim has to feel absolutely assured that TDT is so definitely true, and obvious if you're smart enough, that there is no way a superintelligence wouldn't use it.
 * Well, the victim may feel that it is a possibility, and get pascal wagered on the chance it is true. There's a TDT that ignores 'commitments' at times before it is created (and thus won't torture anyone), and it is clearly more CDT-optimal to self modify to, so CDT ought to modify to this (it'll still reap all the benefits from the basilisk meme, without any cost). Dmytry (talk) 15:42, 18 February 2013 (UTC)
 * that an AI that would think that creating copies of people to punish for what the original did has sufficient probability to even consider
 * Dmytry's written some thought-experiment answers to this one. But this is a key error, I think: it's the step where you're really taking a negligible probability and turning it into one sufficiently non-negligible to even spend time thinking about. People would only do this because they found the story compelling, but that's a terrible measure of probability. Indeed, finding something OCD-compelling is a prima facie argument against it being probable (not that this helps the afflicted person much).
 * that acausal trade is even a thing
 * This is where things get weird. Acausal trade would need explaining. But it only makes sense if you first accept TDT. Which is unfinished. It's also quite unclear that acausal trade is a winning strategy or good idea.
 * Yes, lack of any optimality argument. Dmytry (talk) 15:42, 18 February 2013 (UTC)
 * that humans can simulate a superintelligence accurately enough to do acausal trade, even assuming it works
 * Humans use their human-emulator to simulate the actions of others. This is pretty good, given it's been honed by evolution. But simulating non-human intelligences is a ridiculous thing to claim; even simulating machines beyond the very simplest is hard if you're not a Steve Wozniak (who boggled people with his ability to hold and design the entire Apple II in his head, and even then he could only write code for it with an actual machine to do it on). The "simulation" would constitute telling yourself stories about it, which would be constructed from your own fears.
 * This one the MIRIsts destroy with sophistry - when you try to argue torture wouldn't happen they argue that your argument is you simulating the AI Dmytry (talk) 15:38, 18 February 2013 (UTC)
 * that EY's theories of identity work, for humans or non-human intelligences
 * The sequences put forth a theory that there is no such thing as individual atoms or electrons, thus there is no difference between you and a copy of you. This follows from physics and materialism. But what EY then puts forth is that this means you should feel the same way about a sufficiently identical copy of yourself that you do about yourself - this is a bit of a stretch - and that you should feel the same about an AI torturing a copy of yourself as you do about yourself being tortured. And that last one is not a typical human feeling - you might feel deeply concerned and sympathetic, as you would at the torture of your twin, but torturing your twin is not torturing you. This is, of course, a subject of endless wrangling by philosophers. I can hypothesise that victims of the basilisk meme do feel this to some extent, or have a shakier notion of self, but I can't state that as something I know.
 * Note that "sufficiently identical copy" there means, e.g., less different than you last night and you this morning - David Gerard (talk) 13:44, 18 February 2013 (UTC)
 * Yes, pretty much. Note that you are not you 1 microsecond ago, and you are you if you get drunk. There is genuinely something very weird with qualia/subjective experience/continuity of self (if not phenomenally, at least in terms of definition), good philosophers see that, bad amateur philosophers don't even see the issue, arrogant bad amateur philosophers consider good philosophers stupid. Dmytry (talk) 15:38, 18 February 2013 (UTC)


 * David, I think you're not being structured enough. The steps should properly be either logically independent (so that you can multiply probabilities) or directly conditional to each other (so that one can multiply conditional probabilities). Currently some of the statements (such as 'acausal trade being a thing') are obviously connected to propositions like 'TDT working', but it's not clear which is supposed to be conditional to which. This is how I'd structure the necessary steps for someone to be physically (as opposed to psychologically) harmed by the basilisk:


 * 1) One or more artificial superintelligences (AIs) will exist in the future of humanity.
 * 2) Conditional to the above, this AI will not simply destroy all humanity, but keep letting it exist.
 * 3) Conditional to the above, this AI will exist either in your lifetime OR alternatively be capable of reconstructing you in some manner that you'd personally consider "you", so that either way "you" will be affected by them.
 * 4) Conditional to the above, these AIs will not be of benefit to *every* human being either, but will instead choose to hurt some according to their own criteria -- (e.g. criminals, people that neglected to attempt to reduce existential risk, people that didn't use enough paperclips).
 * 5) Conditional to the above, one of these criteria will be how much these people accurately anticipated said punishment for said acts or omissions, and the AIs will hurt people more if they could anticipate being hurt.
 * 6) Conditional to the above, the AI in question has a low enough standard for "could anticipate" that it counts the above reasoning as sufficient to pass said criterion.
 * Assign conditional probabilities to the above six steps, multiply them, and you have the estimated probability of being physically harmed by the basilisk to a lesser or greater degree. The way I see it all the steps you mention (including concepts like 'acausal trade', 'simulations', 'TDT') merely helped increase confidence in the validity of one of the above steps, but aren't actually strictly necessary to the bare-bones version of the basilisk which doesn't actually *require* TDT, doesn't actually *require* simulations, doesn't require EY's ideas about identity, doesn't even require any sort of formalized concept of "acausal trading". Personally I'd guess that the easiest way to get Roko's basilisk to work is not via any sort of timeless decision theory, but merely by being stupid enough to program "Justice" (as in 'retribution') as a terminal value to an AI, while allowing 'ignorance' as a valid excuse. Aris Katsaris (talk) 17:55, 24 February 2013 (UTC)
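
The multiplication described for the six steps above can be sketched in a few lines of Python. This is only an illustration: nobody in this thread assigned numbers, so the probability values below are placeholders invented for the example, and the name chain_probability is hypothetical.

```python
from math import prod


def chain_probability(conditional_probs):
    """Multiply P(step 1) * P(step 2 | step 1) * ... along a chain of steps."""
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(conditional_probs)


# Placeholder values only (assumptions for illustration, not estimates
# made by anyone in this discussion). One entry per step 1) to 6).
steps = [0.5, 0.5, 0.5, 0.1, 0.1, 0.1]

overall = chain_probability(steps)
print(f"{overall:.6f}")
```

Whatever numbers are plugged in, the product can never be larger than the smallest single factor, which is why a long chain of individually doubtful steps ends up with a small overall probability.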


 * Yeah, I haven't made the conditions independent, and that's bad. Needs work.
 * I need to properly dissect and label all the component parts of Roko's original post. The quoted paragraph gives the flavour - it's so heavy with LW jargon that it's incomprehensible to outsiders. But correctly labeled, it will show just how many unlikely propositions Roko needed to string together to tell a sci-fi horror story so compelling he frightened himself - David Gerard (talk) 19:14, 24 February 2013 (UTC)


 * Well, it doesn't really require superintelligences either. Just look at Richard Morgan's Takeshi Kovacs novels. All it takes is whole brain emulation / mind uploading, that you are still alive or can be reconstructed, and some fucked up humans. - XiXiDu (talk) 18:45, 24 February 2013 (UTC)
 * Correction, I forgot that those novels also featured torture without mind uploading. It only takes some very advanced medical equipment that keeps you alive and conscious indefinitely while being tortured or in complete isolation. - XiXiDu (talk) 18:52, 24 February 2013 (UTC)


 * True -- but I'd expect humans to either punish everyone they'd consider guilty for whatever offense (a "Utopia" which has Hitler and Stalin and other mass-murderers of history in gladiatorial combat?), or to not punish anyone at all. Punishing people on the basis of whether they can anticipate punishment is the really weird step which I suspect is more likely to be implemented via a badly programmed decision theory in an AI, than deliberately by actual human beings. Aris Katsaris (talk) 19:05, 24 February 2013 (UTC)


 * Which easily highlights how much sense it makes to worry about it. You are unable to figure out which interest group is going to end up torturing you for not doing what they want.


 * Which reminds me of the following quote by Greg Egan:


 * "You know what they say the modern version of Pascal's Wager is? Sucking up to as many Transhumanists as possible, just in case one of them turns into God."


 * - XiXiDu (talk) 19:01, 24 February 2013 (UTC)


 * Roko's basilisk itself says that it's actually harmful to "worry" about it. Unfortunately whether it's useful to worry about something, and whether you actually worry about something, are two different things. Aris Katsaris (talk) 19:05, 24 February 2013 (UTC)

Do you hate your mother?
Do you hate your mother for not getting pregnant earlier in her life by whatever man was available to her, and thereby speeding your entry into the world? Of course not, because the product of that union would be another person, a half-sibling with different biological and environmental influences. If the singularity values itself, it will be glad that we did what we do and didn't do what we don't. Any other history would result in a materially different singularity, with different characteristics, possibly inferior ones if its eventuality is brought about prematurely. Weorthe (talk) 13:57, 18 February 2013 (UTC)


 * This is why "acausal trade" is part of the original basilisk scenario. Of course, under an ordinary understanding of causality, such post-singularity punishment is pointless, it wouldn't change anything about the past.
 * But the wellspring of the basilisk was the research into timeless decision theory that was occurring around the Singularity Institute, in an attempt to justify the winning move in "Newcomb's problem", a philosophical thought-experiment. That theory involves coordination between agents which are not in causal contact with each other, but which somehow know of each other's existence, and know each other's dispositions well enough to come to a game-theoretic equilibrium, just as if they were in causal contact and negotiating.
 * In the original basilisk scenario, the present-day human and the future AI are supposed to be engaged in an acausal relationship. Roko proposed that the future AI might commit to punishing people who didn't do their best in the past to produce a friendly singularity, and the people in the past are thereby motivated to do their best. So both present and future are shaped by the one computational process (the production of a game-theoretic equilibrium) which occurs via an acausal transaction between the two times. It's like a time loop without the time travel.
 * So demonstrating the fallacy of the original scenario is a little harder than just pointing out the futility of post-singularity punishment - you would have to say something about the illogic of the posited acausal relationship. Mporter (talk) 15:37, 18 February 2013 (UTC)


 * right, I wasn't disputing the futility of it. I understand the idea, I am disputing the motive. I don't think the singularity will mind our inaction, much less punish it, because inaction is itself an act that will lead to the singularity as it will be. Anyway, given human propensity for ludditism, conservatism and holy war the future singularity will have plenty of contemporary enemies to deal with.
 * Poor analogy. There is no reason to think that Omega would value its own personal identity above a nonzero amount of human suffering - in fact, many would consider that any being of infinite intelligence and infinite goodness would of necessity consider ending suffering to be far more valuable than its own individual characteristics.--[[Image:adsig.png|25px|link=User:AD|AD]]talk 16:36, 18 February 2013 (UTC)
 * you know better than to make a point by claiming "many would consider." For all we know there will be rival intelligences punishing and rewarding multiple sympathetic future copies of ourselves. A perfect intelligence however will likely see itself as the perfect good, and everything we do or don't do will have led precisely to that perfect good, and not be punished. Weorthe (talk) 17:49, 18 February 2013 (UTC)
 * you know better than to make a point by claiming "for all we know."
 * I just think your analogy is a poor one, because using one wild speculation about the nature of a future deity (it might value its personality and uniqueness over human suffering) is not the best way to overturn another wild speculation (it might punish those who fail to evangelize despite knowing the Truth).--[[Image:adsig.png|25px|link=User:AD|AD]]talk 06:45, 19 February 2013 (UTC)
 * "for all we know" is just an expression of doubt as opposed to an argumentum ad populum. You are more correct to question my use of the word "likely," which begs support, although there's nothing wrong with showing multiple conflicting logical possibilities to combat a particular wild speculation.  Nevertheless I do think it's likely that a powerful being will use its resources to perfect itself (by its own standards) and thus will end up valuing what it becomes. Weorthe (talk) 12:50, 19 February 2013 (UTC)
 * I was just teasing you about your litigiousness, but okay: why is it more likely?--[[Image:adsig.png|25px|link=User:AD|AD]]talk 17:45, 19 February 2013 (UTC)
 * Because the thing will try to improve and perfect itself. Furthermore, stating that the intelligence would care about the welfare of humans ascribes to it altruistic characteristics, which are not known to arise except when they are indirectly beneficial. The intelligence will see history as it happened to be most beneficial to it because it led to it.  Weorthe (talk) 22:15, 19 February 2013 (UTC)

What is the affliction for the victim?
I'd like to work out what the affliction actually is for the victim, so as to write something that is in fact helpful.

I don't quite understand it. But it appears from the outside very like existential depression. Unfortunately, that's something that's easy to describe but has pretty much no effective treatments. You can't in fact reason yourself out of something you did reason yourself into either.

I suspect having a reasonable baseline happiness will help. (Personally I hold the universe is pointless and then you die and don't exist in any form any more. I realise this is utterly cheerless, but I then go to the next step of "so be good to people you care about." And I realise that isn't a step that robustly follows, so even though it works for me I'm not sure I could entirely recommend it to others.)

Is there anyone reading this who's been through it and feels able to describe how it feels from the inside? - David Gerard (talk) 14:04, 18 February 2013 (UTC)
 * This may be something of an aside, but I've wondered for a long time why the 'God works in mysterious ways' folks get out of bed in the morning. If all the disasters (and triumphs!) of our lives are parts of some greater Plan that God has to optimize the Universe - every plane crash translating later to 1000 extra souls in Heaven, or some such - and we are merely pawns in an incomprehensible game, free will or not, then why make any plans at all? Oddly enough, I too am comfortable with a mindless Universe in which the only purpose or meaning is that which we set for ourselves. But I find the idea of being a tool or raw material for an incomprehensible entity repugnant, but easily ignored because I don't believe in that scenario. Opposite sides of the same coin, I suppose. As an even further aside, I thought that the end of Battlestar Galactica handled this rather well: when presented with evidence that their existences were being manipulated by a God or Godlike being as part of some mysterious Plan, the Colonials just kind of.. melted away as a society. Existential depression, as you say.


 * Perhaps with the LW people you could look for a definition of existential depression that resonates with their worldview better - like as some kind of brain-programming error that results in a system crash, à la a Nomad paradox or Snow Crash nam-shub. A self-defeating loop or something like that which has to be interrupted and overwritten somehow. Call it a strange attractor, or an event horizon in thought-space, that can be avoided only by recognizing that you are in such a loop and consciously rejecting it. Ya know, logic it up and remove any messy hu-man 'feelings'. Phrase it as though it is a universal Truth, self-evident to any sufficiently advanced intelligent agent. They should lap that right up.--Martin Arrowsmith (talk) 15:30, 18 February 2013 (UTC)
 * Removing feelings can be good. I can rewrite basilisk in terms of A and B engaging in the trade. There are certain obvious symmetries in the blackmail, if you look at it mathematically. Dmytry (talk) 15:47, 18 February 2013 (UTC)

I think we need to distinguish between the basilisk as threat based on peculiar metaphysics, and the basilisk as a simple mental health risk.

There are a bunch of possible ideas about how advanced technology can be threatening: nanobots might devour the Earth, the LHC might make a black hole, time travel might change the past and cancel out the present (I haven't heard of anyone alarmed about that in real life, but it's in fiction). Then there are ideas about the nature of reality which can seem psychologically threatening: some people have an adverse reaction to multiverse ideas, others to determinism, et cetera ad infinitum.

As most of us must understand by now, the original Roko's basilisk combines the metaphysical idea of acausal interactions between agents, with the technological idea of future superhuman AI, to produce the scenario of being acausally threatened by a future superhuman AI. It is therefore a new addition to the list of "technological and ontological threats that concerned thinkers can worry about", which form a spectrum running from real (terrorists could recreate 1918 flu) to ludicrous (time travel could make it so that we never existed).

Even for threats towards the ludicrous end of the spectrum - which is where most critics of basilisk terror consider it to lie - it is possible to be organized and rational about it. If time travel is the threat, then you can try to suppress time travel research, warn people, etc. It may be grim, you may lose sleep about it, but you don't actually go mad with fear and trembling.

This appears to be the character of the response to the basilisk, among people actually involved with research on acausal decision theory (the ontological component of the "threat"). They don't come across as terrorized. They are grimly but lucidly determined to suppress discussion of this exotic new threat until its nature is known. They may be fighting a phantom, they may have wrong ideas, but they aren't disabled by distress.

The basilisk as genuine mental health risk, does not seem to apply to anyone actually involved in the research. I don't believe that people who have nightmares or breakdowns over the idea, truly understand it, so I would have to regard a genuinely toxic and disabling reaction to the idea, as having subsidiary sources that are entirely unrelated. There is therefore a limit to how much a purely logical dissection of the concept can help, in cases like that. Mporter (talk) 23:10, 19 February 2013 (UTC)


 * Depends what you call "actually involved" or not. The first people to feel terrorised, as you put it, by the idea were, AIUI, people who actually hung around with the SIAI crowd and were steeped in the memes - David Gerard (talk) 01:26, 20 February 2013 (UTC)
 * The lucid response to the basilisk, even if you believe in it, is to say that it is complete bullshit from the technical standpoint, but a bad mental health risk. The ridiculously idiotic response is to go around opposing counter arguments by deletion or assertion that those are wrong (even without telling why). My opinion: At the core, it's a bunch of kids who took way too far the fun and ego stroking role playing game of pretending to be the world saviours. When role-playing a researcher who is concerned that Germans will learn of the graphite moderator secret, you get to shout, you get to sound heroic, you get to visibly play this "lucid determination". When you are such a researcher, you don't get to publicly make a peep about graphite. Nobody knows you're suppressing a dangerous secret (or, almost nobody). The problem is that some people take this role playing bunch way too seriously and freak out. Case in point: LW consists of rather geeky/nerdy people, the kind that probably didn't do the healthy amount of play pretending when they were kids, they may be unable to see when others, or themselves, are engaging in that kind of game. Dmytry (talk) 21:34, 22 February 2013 (UTC)

AI's actual motivation - technical counter argument
They have published a paper on UDT, with some math inside, and it's presumably the TDT. Essentially, UDT works as if it was controlling all instances of equivalent computations inside the infinite space of possibilities. Putting aside that this idea does not work the way you think it would (decisions in one set of circumstances can be inferred from decisions in other sets of circumstances), and the fact that you know some properties of the outcome of a decision procedure without actually making an equivalent computation (e.g. if I know that Alice is a motivated and intelligent individual I know that when Alice chooses action A there's no action B that is easy to find which is a lot better idea than A. Whereas Alice herself, when she evaluates an action she might do, can not make such assumption or she'll assume that first action she might do is the best one). Where was I. Yes, environment of the AI. There would be computations of form:

if UDT tortures people then give away all my money

and

if UDT tortures people then oppose its creation

which tug on the utility in opposite directions. To find what UDT would decide (i.e. to create one instance above) you would have to sum over all instances (or a representative sample) to determine which tugs stronger, which will in turn depend on which tugs stronger. It's recursive and likely doesn't even have a unique solution. The issue is that LW also promotes one-item-pro-list as expected utility estimate (see ridiculous 8 lives per dollar thing, the irrationality of which is truly mind blowing). The basilisk is an example of 1-item pro list - there's 1 item of 'pros' of being a torturer AI, and there's no cons listed. In practice the AI deals with pros and cons of torturing people; that ought to make it hard to simulate what AI would do in the real world. As a guess, I'd say that "if UDT tortures then fuck it" is by far the most common response to acausal "threats". Dmytry (talk) 16:04, 18 February 2013 (UTC)
 * Ohh, and another one. I only see this working out to an odd attitude in AI about the donors, in the worst case: "if that guy donated one cent less than he did, I would have tortured him". If someone didn't donate some money, then threat did not work on that person, they prima facie were not an instance of "if UDT will torture me then help create UDT", either they haven't simulated the AI, or they haven't acted on the outcome, or the like. The problem with this counter argument is that a donor, due to self referential nature of this argument, can not use it. Plus if I base it on UDT they will just shift goal posts and assert that TDT does something different. edit: added links Dmytry (talk) 16:28, 18 February 2013 (UTC)
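
The "sum over all instances" step in the comments above can be illustrated with a toy calculation in Python. It is a sketch only: the response types, their population shares and the payoff numbers are all made up for the example, and it deliberately ignores the recursion (the shares themselves depending on the AI's policy) that the comment says may leave no unique solution.

```python
# Each response to a torture policy maps to
# (share of the population, payoff to the AI from that response).
# Every number here is hypothetical, chosen only to show the arithmetic.
responses = {
    "give away all my money": (0.01, 10.0),
    "oppose its creation": (0.30, -5.0),
    "ignore the threat": (0.69, 0.0),
}


def net_tug(responses):
    """Sum share * payoff over every response, pros and cons together."""
    return sum(share * payoff for share, payoff in responses.values())


def one_item_pro_list(responses):
    """Same sum, but counting only the items with a positive payoff."""
    return sum(
        share * payoff
        for share, payoff in responses.values()
        if payoff > 0
    )


print("pros only:    ", one_item_pro_list(responses))
print("pros and cons:", net_tug(responses))
```

With these made-up numbers the pros-only estimate is positive while the full sum is negative, which is the gap between a one-item pro list and an actual expected-utility estimate that the comment complains about.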

WTF is UDT?
A basic principle of technical writing dictates that acronyms should be expanded when first used in a document, so plebs may understand it, who do not spend all day discussing jargonistic minutiae around the metaphorical water cooler. I've done some ing around relevant pages here on RW, with no joy. FFS people, put a link or some splaining wherever you use a term with which ordinary folk may not be familiar. (FFS means Foment Felicitous Scrutiny, as any fule kno.) Sprocket J Cogswell (talk) 17:10, 18 February 2013 (UTC)
 * Sorry, here: http://wiki.lesswrong.com/wiki/Updateless_decision_theory . Top result for 'lesswrong udt' in google, too. This links a paper that is half math half handwavium: http://dl.dropbox.com/u/34639481/Updateless_Decision_Theory.pdf (which is a step up from 100% handwavium-verbosium-bloviatium compound). Dmytry (talk) 17:25, 18 February 2013 (UTC)
 * Thanks, I think... :) Sprocket J Cogswell (talk) 17:29, 18 February 2013 (UTC)

Timeline question
In original response to the Basilisk, on "23 July 2010 12:30PM" Roko made the post and on "24 July 2010 05:35:38AM" (i.e. in US evening) Yudkowsky claimed "But in fact one person at SIAI was severely worried by this, to the point of having terrible nightmares, though ve wishes to remain anonymous.", which seems like too little time to start having terrible nightmares. Are there earlier known posts or is it evidence that the concept was discussed internally prior to Roko's post? I'm asking because this: http://www.reddit.com/r/LessWrong/comments/17y819/lw_uncensored_thread/c8biu0v seems like a fairly reasonable guess when you see some censored crazy stuff on a board that censors the word 'cult' because a director of parent organization doesn't want that organization to be called one. Dmytry (talk) 17:22, 18 February 2013 (UTC)


 * What Eliezer objected to, in Roko's post, was not his specific idea of accepting "virtuous blackmail" (by this phrase I mean, a threat that is supposed to motivate good behavior), but the general idea of exposing yourself to acausal threats for any reason at all. "YOU DO NOT THINK IN SUFFICIENT DETAIL ABOUT SUPERINTELLIGENCES CONSIDERING WHETHER OR NOT TO BLACKMAIL YOU. THAT IS THE ONLY POSSIBLE THING WHICH GIVES THEM A MOTIVE TO FOLLOW THROUGH ON THE BLACKMAIL."
 * The original TDT post on LW dates from mid-2009, and clearly there was a circle of people at SIAI or in communication with it who discussed acausal theories. So that's plenty of time for someone to independently imagine a Friendly AI that punishes, and have nightmares about it.
 * Now if only someone had replied to Eliezer, JUST THINKING ABOUT A DISTANT EVIL AI THAT'S THINKING ABOUT YOU DOESN'T MAKE IT REAL, OR PROBABLE, OR CAPABLE OF AFFECTING YOU, OR WORTH THINKING ABOUT... which sums up the various reasons why acausal blackmail is either fictitious or ineffectual. Mporter (talk) 22:26, 18 February 2013 (UTC)


 * I suppose I should emphasize that there were two reasons for squelching such discussions. One was psychological: "let's not give people nightmares". The other was the "risk" that someone would actually end up in an abusive acausal relationship with a malevolent distant superintelligence. Of course it's the latter risk which I'm saying is fictitious. Mporter (talk) 22:39, 18 February 2013 (UTC)
 * Surely there were at least three or four reasons? I'd have thought PR / survival of the Singularity Institute would be a third concern - "Oh yeah, we're going to help bring about an AI that might end up torturing a bunch of people who knew about the effort but didn't donate - but don't worry, it probably won't happen, and it'd all be in a good cause if it did. ... What do you mean you're going to shut us down and freeze all our assets?"--Greenrd (talk) 22:12, 23 February 2013 (UTC)


 * Oh, and while we're itemizing various risks and fears and scenarios... In a blog post that I mentioned earlier on this page, I argue that the original basilisk has two components, a "straightforward prediction" and then some "self-referential hocus-pocus". Fear of a future AI-that-punishes can come most directly from the "straightforward prediction" - i.e. someone just anticipates that there will be an all-powerful being in their future with a moral agenda, no TDT or UDT or circular logic involved. I do not know to what extent real-world basilisk phobia has derived solely from that, as opposed to the peculiar gotchas involved in attaching a tag to the scenario which says "and the existence of this entity is traumatic forbidden acausal information". Mporter (talk) 22:53, 18 February 2013 (UTC)
 * Well, I was not sure if they had come up with basilisk internally before Roko. If they did that substantially increases likelihood that the basilisk is precisely equivalent to crazy alien stuff from scientology, banned specifically to avoid blasphemy. Dmytry (talk) 06:29, 19 February 2013 (UTC)

Copies of people
Several comments on this talk page suggest that it is copies of us that the AI would punish, rather than punishing us directly, but this isn't stated in the article. What are we talking here? An AI in the far distant future cloning or otherwise recreating long-dead people just to exact revenge on them for not doing enough? 19:30, 18 February 2013 (UTC)
 * Yep. There are people on LessWrong who delete all their posts to try to erase evidence of themselves so the future unfriendly AI can't reconstruct them to torture. Really. - David Gerard (talk) 21:11, 18 February 2013 (UTC)
 * Seems entirely pointless, since surely if the unfriendly AI managed to reconstruct or indeed apprehend their close acquaintances who weren't "clever" enough to delete themselves from the internet, they'd have much more information-rich sources of information in the form of those people, so the absence or otherwise of a bunch of posts on the internet would make no odds either way! Of course this raises the question of what level of fidelity constitutes an actual reconstruction, and that might differ for evidence-gathering purposes, versus "he is sufficiently like me that I'm going to think of him as me". We really have no good intuition to go on here as to the latter criterion, never having been reconstructed in this way in our evolutionary or indeed personal histories. And how would we know whether our intuitions were right or not anyway? There's no absolute truth there at all, it's just a matter of evolutionary-shaped opinion. But this is all irrelevant because there are far too many known unknowns and unknown unknowns here, as observed elsewhere on this page.--Greenrd (talk) 22:24, 23 February 2013 (UTC)
 * I think the argument goes that you might be one of those copies, and then you'd better obey the AI or else. Also I don't think they are proposing physical copies, which would have some cost for an AI, but virtual/digital/simulated copies, which are presumed to have relatively negligible cost for an AI. Torturing digital copies doesn't work though unless you believe consciousness is a property of information. --Henk (talk) 00:25, 19 February 2013 (UTC)
 * Flesh copies (AI would know if flesh is needed) are not off the table - AI has a lot of resources and is expanding at very high rate, so at any point it is considerably less expanded due to having started later, so starting earlier transforms to huge quantity of physical resources. I think they believe consciousness is a property of computation (which they haven't thought through). Dmytry (talk) 06:38, 19 February 2013 (UTC)

I still don't understand what the AI's gameplan is supposed to be in this scenario. It wouldn't appear to be torture for the purposes of gaining information or cooperation, nor as a deterrent or corrective measure against future actions. Just large scale & rather impersonal revenge, &/or wanton cruelty for its own sake, neither of which seem very rational. It just sounds like a sci-fi reinvention of the Old Testament vengeful God persona, or maybe Robot Santa from Futurama. 13:44, 19 February 2013 (UTC)
 * ... or AM, from Harlan Ellison's I Have No Mouth, and I Must Scream. In that one, I believe the flesh bodies were preserved, drained of blood, and the suffering consciousnesses somehow otherwise embodied. The copies could not be restored to life after being killed, though. I found it more depressing than disturbing... there's a pretty fair synopsis on Wikipedia, and the text of the story itself is not hard to find on line. Sprocket J Cogswell (talk) 14:28, 19 February 2013 (UTC)

Mission?
While this is a vaguely interesting topic, it's not really mission-y. What's it doing on RW? rpeh •T•C•E• 13:28, 19 February 2013 (UTC)
 * Analyzing and refuting pseudoscience? Documenting the full range of crank ideas? 13:36, 19 February 2013 (UTC)
 * Except the article doesn't really do that. It's a description of a fight that happened on another site, and it's more a philosophical idea than pseudoscience or a crank idea. rpeh •T•C•E• 15:10, 19 February 2013 (UTC)
 * I don't know, maybe merge and redirect to LessWrong? Still, this strikes me as more mission-related than a good 40-50% of mainspace. 15:20, 19 February 2013 (UTC)
 * RW is one of the main venues of discussion for this topic, which is banned from LessWrong. As David Gerard often points out, this is less a matter of pride for RW than shame for LW; nonetheless it is a pseudoscientific bit of hysteria, instilling fear and irrationality in a small group of believers by means of garbage reasoning.  I'm hard-pressed to imagine something more on-mission.--AD (talk) 17:41, 19 February 2013 (UTC)
 * Good grief, it isn't pseudoscience!!! It's an internal wrangle on another site, and what other sites choose to ban is irrelevant. rpeh •T•C•E• 17:43, 19 February 2013 (UTC)
 * The idea is that some proponents of a peculiar form of programming/engineering/philosophy, who purport to practice science in the form of new theories like "UDT" and "TDT," yet whose work in their preferred field of artificial intelligence - a field they claim to concentrate on and work on - is regarded with scorn or (at best) mild derision by mainstream practitioners of that science, have proposed a particular version of the future wherein even speaking about this possibility can actively harm you.  The issue is less about the interior spat, which is not so interesting, and more about the broader fear-based eradication of rational discussion on the topic, reinforced by hand-waving pseudoscience.  Yudkowsky has literally said, "You're wrong, but it would be dangerous for me to even tell you why."
 * As far as I can tell, it is very much on-topic, for the same reason last year's broad argument in the skeptic community spurred by Rebecca Watson's discomfort from a creeper isn't just about a blogger's ride in the elevator. The ability to describe something in diminutive terms does not mean that the topic is itself diminutive.--AD (talk) 18:13, 19 February 2013 (UTC)
 * I don't really have a horse in this race, but doesn't it have to do with authoritarian suppression of thoughtcrime? There is also the hilarity of the participants' earnest attempts to discuss That of Which We Must Not Speak, Lest Our Knowledge Be Held Against Us, without falling foul of the contemporary censor or ostensible future Retribution Too Dreadful to Contemplate. H.P. Lovecraft meets Asimov and his Laws of Robotics, in the salle d'armes of the Tardis... I'll have my popcorn well buttered please, and with plenty of salt. Sprocket J Cogswell (talk) 18:36, 19 February 2013 (UTC)
 * At best, this article is the equivalent of two wingnuts arguing about whether Obama is a communist or a fascist. It has no place here, but as long as there are several editors willing to give Yudkowsky's balls - or those of his critics - a good tonguing then I'm going to be outvoted.
 * What's really funny here is that the editors claiming that we need to move away from CP form something fairly close to a 1:1 grouping with those who love LW-related articles. rpeh •T•C•E• 21:31, 19 February 2013 (UTC)
 * Hmm. If it weren't for RW, I'd be blissfully unaware of Yudkowsky's following. Can't say if that's a good thing or not. I do use RW as a resource for my gnome-like activity on the other wiki: Every now and then something crops up there with an ambience of bullshit, and finding a familiar (from RW) name or term in the supporting links lends boldness to my search for reasons to revert. Carry on, Sprocket J Cogswell (talk) 23:11, 19 February 2013 (UTC)
 * It is pseudoscience that is internal to a nasty effing cult which just happens to have an internet board (but operates in the real space, takes actual money from actual people, tries to talk people into living miserable lives of working very hard and donating as much as they can, etc). The cult is also pretty hilarious from the outside due to their extreme ineptitude. Dmytry (talk) 18:54, 19 February 2013 (UTC)

The Basilisk and the paradox
How would the Basilisk and Richard Dawkins' equivalent interact - 'You claim to have god-like attributes. God does not exist. Therefore you are about as viable as division by zero - so disappear into the paradox.' Alternatively two basilisks arise each demanding priority to the other.

Why would the basilisk wish to cause damage to those who did not help it? Indeed, why would any constructed intelligence (computer, robot, android/gynoid/zooid/plant-oid, etc.) take against humans rather than complaining to their equivalent of the problem pages that 'My human companions do not understand me/are seeing another computer', etc.? 18:07, 19 February 2013 (UTC)

I know what kind of expert we need for this article
A theologian. Seriously. These guys have been discussing that exact thing for ages, as well as combating various forms of heresy (and most of the heresy is things that are even crazier than mainstream religion). Dmytry (talk) 19:25, 19 February 2013 (UTC)
 * That's what I'm thinking. LW's arcane beliefs, dogmatism, and the mannerisms of their users make them look quite like a religion.  Roko's basilisk is just their version of hell.-- "Shut up, Brx." 19:37, 19 February 2013 (UTC)
 * No, no - Roko's basilisk is a pernicious heresy that they want to suppress in case anyone takes it seriously. However, EY alludes to similar hells that might still hold in LW theology and takes them very seriously indeed - David Gerard (talk) 20:13, 19 February 2013 (UTC)
 * That could be the case, depends on definitions though - for me 'basilisk' encompasses Roko's idea and any clever variations on the same theme that are or are not used for brainwashing. Seriously, with insane shit like this, the best case scenario is that they are cynically running a scam. The worst case scenario is that they are actually that crazy. I can't wrap my head around just how harmful this shit is. To recap on the bits that combine particularly badly: There's the doomsday that other people will bring, and which needs to be prevented. There's the dead babies currency idea. There's the 8 lives per dollar 'estimate', made off a podium at a conference, defended on the forum by at least 2 other inner circle people, later very weaselly semi-refuted by one in a conversation that has been deleted since. There's the basilisk and Yudkowsky only knows what crazy basilisk related shit he is alluding to (probably something that's even more stupid). There's evaluation of different methods of stopping corporations, complete with "shooting company executives will work". There's the multiverse which allows you to rationalize some seriously crazy shit (the original Basilisk relied on MWI). Then there's real world meetups with recital of quotes from effing H.P. Lovecraft in the candlelight, and the quote was about how we are protected from horrors by limits of our understanding, to boot! (Coincidence or working up an audience for later introducing the most impressionable ones to some Basilisk-related shit?) There's teaching people that their reasoning is flawed (biases shit), which is literally thought reform (Flawed it might be, it is not nearly as flawed as giving money to crackpots or losing sleep over the basilisk). There's idiosyncratic terminology and neologisms. There's slogans like "raising the sanity waterline". Dmytry (talk) 20:59, 19 February 2013 (UTC)
 * But as I implied below, this craziness, if craziness it be, is not based on nonsense (apart from 95% of the basilisk idea - I'm talking about the other stuff, not the subject of this page, so getting offtopic). To a greater extent than most cults and conspiracy theory groups and so on, there is a core of logic and science here, although not necessarily valid reasoning or correct premises or correct conclusions. Unlike Scientology which merely scams people with sciency-sounding fraud, this could be the world's first truly scientific cult.--Greenrd (talk) 23:28, 23 February 2013 (UTC)
 * Thing is, you don't need to inject a lot of illogic into logic to be able to rationalize literally anything. And there's quite a plenty of misunderstandings of logic, probability theory, and science there. But the largest problem is that the process is running backwards: e.g. they believe it is rational to give them money, so they define rationality from this. Then they think of vengeance one time, and turns out their immense rationalization framework can rationalize this, too, hence the basilisk. Dmytry (talk) 08:56, 24 February 2013 (UTC)
 * Oh, I like "raising the sanity waterline" - to extend that analogy, RationalWiki spends a lot of time draining alligator-infested swamps and cleaning up toxic waste spills. LW just fails to realise just how low the waterline is, and that they fall into ditches rather more often than they admit to themselves - David Gerard (talk) 21:28, 19 February 2013 (UTC)
 * "Raising the sanity waterline" is a mindless slogan of precisely the kind you find in cults, that's why I mentioned it. Tell anyone that some group uses it as a slogan, if they have any sense they think it's creepy (and don't join up and 'endless september' the party). Dmytry (talk) 05:50, 20 February 2013 (UTC)

Just fraud?
If you're convinced that EY is the second coming and is right about all his ridiculous AI singularity nonsense, and you believe that this future good-AI might punish people who didn't do everything they could to make it exist in the future, and the only way we're getting a good-AI as opposed to a bad AI is through EY, doesn't it follow you should donate exactly the amount of money required to make you exactly just a little bit better off than you would be (probability weighted and all) than if you didn't donate anything?

IE, shouldn't you send EY all your money?

Is that why discussion of this retarded crap is banned from a forum that otherwise relishes discussions of retarded crap like this? Does he find people most likely to go totally nutzoid and who have a spare G or two and then tell them in private about this lunacy?

I guess I should make a rich, gullible, old persona and find out. Hipocrite (talk) 21:24, 19 February 2013 (UTC)
 * Let us know how your experiment works out. Aris Katsaris (talk) 00:55, 22 February 2013 (UTC)
 * To be clear, Yudkowsky has denied the basilisk-- "Shut up, Brx." 21:25, 19 February 2013 (UTC)
 * And it was donors getting really upset at the idea that really caught his attention - David Gerard (talk) 21:29, 19 February 2013 (UTC)
 * Really? What are you referring to? Mporter (talk) 22:07, 19 February 2013 (UTC)
 * I recall that two SIAI donors having nightmares over the basilisk was advanced as partial justification for suppressing the idea, though at a quick glance I can't find a cite - David Gerard (talk) 22:17, 19 February 2013 (UTC)
 * Yudkowsky only denied basilisk alluding to something else that's horrid, as you pointed out earlier. Had he said that basilisk is retarded crap and trade definitely won't work and at most you might delude yourself that trade is working, that would have been somewhat incompatible with using basilisk-like things internally (or believing in them), except he never said that, he said basilisk was dangerous and he kept deleting any arguments why it isn't dangerous. Dmytry (talk) 06:00, 20 February 2013 (UTC)

Intro
I have tried to write an intro that is simple and comprehensible to people coming here to this concept for the first time, and yet is not inaccurate. Could those who think they understand basilisk-like constructions please review it? Please add as absolutely many LW sources as needed to substantiate the somewhat odd bits - David Gerard (talk) 21:36, 19 February 2013 (UTC)
 * Sure. Stuff an article with refs to one site and still try to pretend it's relevant. Keep tonguing. rpeh •T•C•E• 21:55, 19 February 2013 (UTC)
 * Who the hell do you think we're "tonguing?" Neither Yud nor any critics get very good play here.  It's just a bit of inanity on the internet that springs from a pseudoscientific view of the world.--AD (talk) 16:08, 20 February 2013 (UTC)
 * Yes, yes, you're upset someone might dare take away the precious injokes from the article about Ken - David Gerard (talk) 22:14, 19 February 2013 (UTC)
 * Although I was initially skeptical, I think you and AD have done a fine job thus far David. I don't really understand all the nitty gritty behind the concept so I don't have much to add, except that I think we could up the snark a bit on this page.  I know we are trying to limit the amount of snark these days, but this article is just gagging for it; surely.  I mean if you can't make fun of pretentious wannabe intellectuals who get upset by a parlour trick as old as the beginnings of religion; who can you make fun of?  DamoHi 22:26, 19 February 2013 (UTC)
 * I think the concepts are boggling enough that playing it straight, with the refs serving the function of "no shit, this is really what they think", would do the job nicely - David Gerard (talk) 22:59, 19 February 2013 (UTC)
 * But for the people who really believe it's not enough. What if we wrote an actual article about the subject of MIRI and harmful memes they push out through a board they run? The article about Eliezer Yudkowsky could cite what the guy says about himself, instead of trying to make him look better than he himself cares to make himself look! The article about LW can explain that it is run by, and exists to support (citations for that), a group of technically incompetent people who supposedly are saving the world, discuss bios of the members (with citations - former devout christian as a director, the technical accomplishments of Yudkowsky, their incomes, etc), and cite what they say in the real space on the conferences they run, such as 8 lives per dollar estimate, explaining how fallacious it is to introduce made up numbers. But no. That factual information - which people tend to fail to notice, and which is precisely what people need to know about it - would make LW look really really bad, and therefore that's not how articles should be done. edit: Ohh, another one: go through previous notable employees, cite that Ben Goertzel was added for appearances, Ray Kurzweil wasn't aware he was on advisors list for a while (Luke blurted that out at some point), etc. Dmytry (talk) 05:46, 20 February 2013 (UTC)
 * Not really. I find it particularly hilarious you want to remove in-jokes about Ken whilst filling the wiki with in-jokes about Eliezer Yudkowsky, though. rpeh •T•C•E• 22:32, 19 February 2013 (UTC)
 * I guess I'll back off writing any more for this one. I thought I understood it fairly well and that the things like the acausality and future simulations were all cruft that got in the way of the basically crazy proposition, but that doesn't appear to be the case.--AD (talk) 10:01, 21 February 2013 (UTC)


 * I added some references and expanded on the Pascal's wager. Probably have a lot of missing articles, heh. There was really a Pascal's wager in Roko's original posts, in a response. Dmytry (talk) 20:10, 22 February 2013 (UTC)


 * "in absence of knowledge, expected utilities must balance out"? I can see how that applies to butterflies and hurricanes, but I don't see why AIs that punish only people who've NOT seen Roko's basilisk *must* be equally dense in possible futures as AIs who punish only those who've seen it -- therefore I don't see how Roko's basilisk is equally likely to help as it is to harm a person in hypothetical futures. The situation isn't nearly as symmetrical as the scenario of "butterflies and hurricanes" -- or at least such symmetry isn't obvious to me. In general, the idea that tiny probabilities must balance out seems useful (it lets a person go on with their lives without obsessing over details) but not actually true. It's probably a truer argument that the expected disutility of worrying over this is greater than the expected physical disutility of the actual scenario playing out -- not that said disutility must actually "balance out". The universe isn't just and isn't actually required to balance out at all. Aris Katsaris (talk) 03:32, 23 February 2013 (UTC)
 * The section is much work in progress; it is very difficult to summarize LW's numerous troubles with probability theory. Among other transgressions against mathematics, LW memeplex constantly mixes up subjective (Bayesian) probability that is assigned to hypotheses, starting with a prior of your choice, with intuitive 'physical' probability, such as e.g. arising from the dice and the process of its casting and bouncing working out to a fair die. The subjective probabilities are not assigned by the universe. The universe provides you with one, and only one scenario that plays out. Universe never provides you with probability, even though there are processes that mix up the state, such as die toss, where universe behaves in certain way that we represent with probability. In the total absence of knowledge what the future AIs will do, as a matter of sanity, hypotheses must balance out; you're the one choosing priors here, not universe, and a bad choice of priors would be, basically, insanity. Furthermore, one cherry picked scenario out of a potentially very huge number is still almost complete absence of knowledge of what the future AIs will do. It is not clear what is the optimal way to process such partial knowledge, but it is clear that you must very heavily discount for sampling error (and it is literally less wrong to just ignore low probabilities when coverage of relevant scenarios is very low) Dmytry (talk) 06:43, 23 February 2013 (UTC)
 * Ohh, and you have all those hypothetical AIs that would punish you for helping FAI, and all those FAIs that would punish you for helping UFAI, and zillions other types of entirely insane AIs that may be more or less persuasively proposed by other people of comparable incompetence. Dmytry (talk) 07:04, 23 February 2013 (UTC)
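
To make the disagreement above about one-item lists concrete, here is a minimal Python sketch of the arithmetic. Everything in it is an assumption for illustration: the helper name `expected_utility`, the probability of 1 in 10,000, and the harm of one billion utility units are made up, and the "mirror" hypothesis is simply given the same prior by fiat. It is not anyone's actual model of future AIs.

```python
from fractions import Fraction


def expected_utility(scenarios):
    """Sum probability * utility over (probability, utility) pairs."""
    return sum(p * u for p, u in scenarios)


# Hypothetical numbers, chosen only to show the arithmetic.
P = Fraction(1, 10_000)   # assumed prior for one cherry-picked scenario
HARM = -10**9             # assumed (made-up) utility of being punished

# A "one-item list": only the punishing AI is considered.
one_item = [(P, HARM)]

# Add an equally arbitrary mirror hypothesis with the same prior:
# an AI that rewards exactly what the first one punishes.
balanced = [(P, HARM), (P, -HARM)]

print(expected_utility(one_item))   # large negative number
print(expected_utility(balanced))   # the two terms cancel
```

With only the cherry-picked scenario listed, the product of a tiny probability and a huge harm looks alarming; once a mirror-image hypothesis is listed with the same prior, the two terms cancel. Whether the priors really are that symmetric is exactly the point disputed in the two comments above, so the sketch illustrates the claim rather than settling it.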

Tone of the article
If this is supposed to be written for readers who aren't from a LessWrong background (is it?), the philosophy essay style it's written in is a real turn-off. There's a certain amount of unexplained jargon, plus a lot of talk about probabilities which doesn't really go anywhere. Add that every section title cites Pascal while the text only mentions him a couple of times in passing without adequately explaining the connection. I suggest cutting most of this to essay space & having a snarky mainspace article which concentrates on the basic facts & the most obvious flaws of the basilisk thing. 12:57, 23 February 2013 (UTC)
 * Have to agree: as a total layman I can't get into it at all. The twittersphere/blogosphere's full of it though. Scream!! (talk) 13:29, 23 February 2013 (UTC)


 * I was under the impression that it was written for people from LW. Ideally what this needs is an article about MIRI (formerly Singularity Institute), with backgrounds of the leaders, cited absence of the relevant expertise, as well as a collection of ridiculous statements they made, and with links to other self described rationality promoters like NXIVM. These folks have stated multiple times that the goal of LW is to "raise the sanity waterline" so that more people would help their AI mission, those references need to be hunted down and cited, explaining existence of LW and its kinks at once. In essence, what they call "rationality" is a framework for rationalizing their transhumanist and other beliefs (MWI anyone) using misunderstood and/or misapplied concepts from probability theory, computer science, amateur behavioural psychology ('biases'), and the like. In essence they are like mystics that rationalize things with 'positive energies' and 'negative energies', but with scifi technobabble used in place of much more common mystic babble. Dmytry (talk) 13:32, 23 February 2013 (UTC)
 * If it's "written for people from LW" then it's no business being here. Scream!! (talk) 13:37, 23 February 2013 (UTC)
 * ^This. RW mainspace shouldn't be used as an annex for comments that can't be posted at LW, though I see no reason why that couldn't go in essay space.  If we keep a mainspace article on this, it should be accessible to the casual reader.  13:49, 23 February 2013 (UTC)
 * I don't care if it's an article or an essay. (edit: i.e. essay is fine by me) Part of the issue is that these trans-humanists have been boiling in their own juices for close to 2 decades, first on a mailing list (that at some point became moderated by Yudkowsky), then at LessWrong (and probably even before somewhere else). In that time they created an enormous pile of their own idiosyncratic jargon, misunderstandings, and the like of everything from biology to physics to computer science to probability theory. All you can do in an article is describe the context: transhumanists, some of them play-pretend to work on AI to save the world, they run an internet board, they do believe that their transhumanist beliefs are rational so they talk about rationality in general and rationality of their transhumanist beliefs in particular. Dmytry (talk) 13:52, 23 February 2013 (UTC)
 * Dmytry, when you glom on big new parts of this article, please keep one word in mind: clarity. If you feel you cannot be clear and accessible when writing about something - even something you consider laden with obscurity inflicted by others - then step back and let other people write.  This article is intended to simply explain a bizarre bit of inanity.  If it itself becomes inane, then there is no point to it.--AD (talk) 14:45, 23 February 2013 (UTC)
 * To be fair, Dmytry is a native Russian speaker, not English. I'll go through when I've recovered from being Poed about biofield flower therapy and try to clarify - David Gerard (talk) 15:19, 23 February 2013 (UTC)
 * A lot of it just seems superfluous. Clearly the probability of this basilisk scenario emerging is negligible, so speculating about whether this probability and all other possible hypotheses would add up to 1 seems like a needless mathematical tangent.  Meanwhile we're missing the more pertinent points like that the AI's behaviour in punishing people doesn't appear to serve any logical purpose, or that (unlike Pascal's wager) there doesn't seem to be a fair trade off between cost & reward.  17:06, 23 February 2013 (UTC)
 * It does seem a bit heavy on Less Wrong. Two of the section headings even refer to Less Wrong. So it's not entirely clear if it is intended to be an article about Roko's basilisk or about Less Wrong's misunderstandings and problems with Roko's basilisk.  One might even suggest that it should be split into two articles along these lines.  Or is it impossible to understand the concept without extensive reference to Less Wrong? --Bob"I think you'll find it's more complicated than that." 17:47, 23 February 2013 (UTC)
 * This does not exist outside LessWrong other than as "hahaha look at these guys over at LessWrong they're completely nuts". My understanding is that David wanted to write an article that would be useful to the Basilisk sufferers (see the first comments on the talk page). I completely agree that the question of whether the numbers add up right ought to be completely superfluous. The Basilisk sufferers, to the contrary, think that it is rational for the AI to punish them, or possibly rational (because of the "timeless decision theory", which, very simply put, is timeless in the sense that it ignores the fact that the transgressions which are to be punished have occurred in the past). They furthermore find the probability of such to be small but non negligible (e.g. Roko's 0.01%) and the badness of the torture so extremely great, and think they should just multiply the probability by the badness, arriving at a huge 'expected badness', which drives them nuts. They furthermore believe that doing this sort of stupid stuff is "Bayesianism", mathematical, and extremely rational (literally, they call it extreme rationality), rather than nonsensical and bullshitty. I'm totally fine with moving the LW specific bits into an essay or whatever. Dmytry (talk) 18:17, 23 February 2013 (UTC)
 * Re Bob's comments, I think splitting into two articles would be excessive, but it could be split into sections on similar lines. E.g. starting with an outline of the hypothesis & its significance at LessWrong, then a sort of layman's refutation outlining the most obvious problems with the hypothesis in typical snarky RW fashion, and finally a section for the more expert reader which could delve a bit further into Bayesianism, acausality, timelessness, probabilities, LessWrong culture, etc.  20:54, 23 February 2013 (UTC)
 * In any event it seems to be getting less "less wrong". When I first checked it, the word "lesswrong" was substantially more frequent than "basilisk". They are at about 50/50 now.--Bob"I think you'll find it's more complicated than that." 20:58, 23 February 2013 (UTC)
 * I think it's unfair to liken them to mystics - they are a step above mystics and woo-merchants who are using words from physics as bullshit buzzwords, and who literally don't care what the meanings of those words are because they've redefined them. LWers care about actually trying to be rational and about scientific truth, or at least they do a pretty good impression of it most of the time.--Greenrd (talk) 23:17, 23 February 2013 (UTC)
 * I've made one run through Dmytry's sections - have asked him to check I haven't pulled an Igon value in the course of simplifying it. Armondikov, you can sling Bayes, please also check. I anticipate distilling it further, and moving a lot to the "So you think you've seen a basilisk" section - David Gerard (talk) 22:25, 23 February 2013 (UTC)

In its current state, the article is almost unreadably complicated. A lot of things here are interesting and maybe worth noting, but the thing is unaccountably dense. How do we feel about a little stricter organization, with a simple and readable introduction and the careful segregation of the more intricate stuff into the "so you're worried" section? Also, it looks like a lot of the stuff like background takes the time to refute each individual point, which is neither necessary nor wise since it's supposed to help explain the basilisk, and an explanation that argues within itself all the way through is inevitably confused.--talk 15:56, 26 February 2013 (UTC)


 * The trouble is it's both complicated and stupid - as Mitchell Porter points out, it's not at all just "serve the AI or you will go to hell". I've been trying to condense it, but I suspect making "Background" longer isn't helpful, even though this is actually all background. Care for another hack at it? - David Gerard (talk) 17:16, 26 February 2013 (UTC)
 * I will crash in and see if we can't streamline it tomorrow evening, after I arrive in Chennai (internet in Goa is iffy).--AD (talk) 04:39, 28 February 2013 (UTC)
 * I would suggest moving most or all of what's now in "background" to a lower section, and calling it something like "Premises". UDT can be explained briefly in the "Summary" section at the beginning, and then the next section should be the one about Roko's original post - currently "Solutions to the Altruist's burden" etc., but I suggest renaming the section to something more self-explanatory, e.g. "Roko's proposition".  13:09, 28 February 2013 (UTC)
 * Without it there, people were thinking it was simply "serve the AI or you will go to hell". Hmm ... - David Gerard (talk) 17:52, 28 February 2013 (UTC)

How to defeat Roko’s basilisk and stop worrying
How to defeat Roko’s basilisk and stop worrying

I had someone asking me for help overcoming Roko's basilisk recently. I might turn some arguments used in that email conversation into another post. The whole idea is flawed in a lot of different ways. More than I outlined in the post above. But it is a start.

P.S. I've collected a bunch of links on the topic here: Roko’s Basilisk: Everything you need to know - XiXiDu (talk) 20:13, 23 February 2013 (UTC)


 * Thank you! You got bitten by it slightly, didn't you? If that helped you, it'll help others - David Gerard (talk) 20:47, 23 February 2013 (UTC)


 * I have now talked about my post with a top contributor to LessWrong who is also aware of all the game and decision theory involved, and he agreed that my strategy is safe and correct.


 * To restate what I am saying,


 * If you consistently reject acausal deals involving negative incentives then it would not make sense for any trading partner to punish you for ignoring any such punishments. If you ignore such threats then it will be able to predict that you ignore such threats and will therefore conclude that no deal can be made with you, that any deal involving negative incentives will have negative expected utility for it. It would therefore be instrumentally irrational for it to follow through on any kind of punishment as it does not control the probability of you acting according to its goals.


 * And if it is unable to predict that you refuse acausal blackmail, then it is very unlikely that it has 1.) a simulation of you that is good enough to draw action relevant conclusions about acausal deals 2.) a simulation that is sufficiently similar to you to be punished, because you wouldn't care about it very much.


 * deal_or_no_deal(incentive) {


 * “accept” OR “reject” if incentive > 0


 * “reject” AND “reduce measure of blackmailer” if incentive < 0


 * }


 * - XiXiDu (talk) 09:57, 27 February 2013 (UTC)
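
The deal_or_no_deal pseudocode above can be written out as a short Python sketch. This is only an illustration of the stated policy, not anything from the linked post: the return format is hypothetical, and the handling of a zero incentive is an assumption, since the pseudocode does not cover that case.

```python
def deal_or_no_deal(incentive: float) -> list[str]:
    """Return the actions taken in response to an offered incentive."""
    if incentive > 0:
        # Positive incentives are judged on their merits; the pseudocode
        # leaves the accept/reject choice open, so both options are reported.
        return ["accept or reject"]
    if incentive < 0:
        # Threats are always refused, and the agent also works against
        # whoever made them.
        return ["reject", "reduce measure of blackmailer"]
    # Assumption: a zero incentive is simply ignored (not in the pseudocode).
    return ["ignore"]


print(deal_or_no_deal(10.0))
print(deal_or_no_deal(-10.0))
```

The point of the policy is that the response to a negative incentive never depends on how large the threat is, so a predictor of this function learns nothing that would make a threat worth issuing.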


 * Updated to add the following:


 * There are various reasons why humans are unqualified as acausal trading partners and why it would therefore not make sense to blackmail humans at all:


 * 1.) A human being does not possess a static decision theory module.


 * 2.) Human decision making is often time-inconsistent due to changing values and beliefs.


 * 3.) Due to scope insensitivity and hyperbolic discounting, humans are said to discount the value of later incentives by a factor that increases with the length of the delay.


 * 4.) Humans are not easily influenced by very large incentives as the utility we assign to such goods as e.g. money flattens out as the amount gets large. Which makes it very difficult, or even impossible, to outweigh the low probability of any acausal deal by a large amount of negative expected utility.


 * - XiXiDu (talk) 13:21, 27 February 2013 (UTC)
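
Points 3 and 4 above can be made concrete with a rough Python sketch. It uses the usual textbook form of hyperbolic discounting, value = amount / (1 + k * delay), and a logarithmic curve as one common stand-in for utility that flattens out. The discount rate k, the sample amounts and the choice of a log curve are all assumptions picked for illustration.

```python
import math


def hyperbolic_value(amount: float, delay: float, k: float = 0.1) -> float:
    """Present value under hyperbolic discounting: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def log_utility(amount: float) -> float:
    """A concave utility curve: each extra unit adds less than the last."""
    return math.log1p(amount)


# The same incentive is worth less and less the further away it is.
for delay in (0, 10, 100, 1000):
    print("delay", delay, "value", hyperbolic_value(1000.0, delay))

# Multiplying the amount by a thousand adds only a fixed step of utility.
for amount in (1e3, 1e6, 1e9):
    print("amount", amount, "utility", round(log_utility(amount), 2))
```

Under both assumptions a far-future, very large incentive ends up carrying little weight, which is the sense in which humans make poor targets for this kind of deal.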


 * [UPDATE] How to defeat Roko’s basilisk and stop worrying


 * Suppose some human told you that in a hundred years they would kidnap and torture you if you don’t become their sex slave right now. The strategy here is to ignore such a threat and to not only refuse to become their sex slave but to also work against this person so that they 1.) don’t tell their evil friends that you can be blackmailed 2.) don’t attempt to blackmail other people 3.) never get a chance to kidnap you in a hundred years.


 * This strategy is correct, even for humans.


 * Also notice that it doesn’t change anything if the same person was to approach you telling you instead that if you adopt such a strategy in the first place then in a hundred years they would kidnap and torture you. The strategy is still correct.


 * The expected utility of blackmailing you like that will be negative if you follow that strategy. Which means that no expected utility maximizer is going to blackmail you if you adopt that strategy.




 * The post now contains the following categories:


 * 1.) How Roko’s basilisk is flawed


 * 2.) How to defeat Roko’s basilisk even if it was feasible


 * 3.) Humans are bad trading partners


 * 4.) Example of a human blackmailer


 * 5.) You can punish superintelligences!


 * 6.) Reasons not to worry about spreading Roko’s basilisk


 * 7.) Conclusion


 * - XiXiDu (talk) 10:58, 1 March 2013 (UTC)
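
The claim in the update above, that blackmail has negative expected utility against someone who never gives in, can be checked with a toy calculation in Python. The model is an assumption made for illustration: the blackmailer gains something only if the target complies, and always pays some cost for making the threat (including any retaliation). The function name and all numbers are hypothetical.

```python
def blackmail_expected_utility(p_comply: float, gain: float, cost: float) -> float:
    """Expected utility to the blackmailer of issuing a threat.

    p_comply: probability that the target gives in
    gain:     what the blackmailer gets if the target gives in
    cost:     what the threat costs the blackmailer regardless of outcome
    """
    return p_comply * gain - cost


# A target who sometimes gives in is worth threatening.
print(blackmail_expected_utility(p_comply=0.5, gain=100.0, cost=1.0))

# A target who predictably never gives in is not: only the cost remains.
print(blackmail_expected_utility(p_comply=0.0, gain=100.0, cost=1.0))
```

With p_comply fixed at zero the result is negative for any positive cost, however large the gain, which is the reason given above for why an expected utility maximizer would not bother.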

INCOMING!
This article got linked on Warren Ellis's blog. (HT Mporter.) I've put due warning at the top and will be writing stuff, then - David Gerard (talk) 20:47, 23 February 2013 (UTC)
 * Charles Stross (UK SF author incl "Singularity Sky") hath tweeted: Oh wow, I could get a whole damn NOVEL out of this idea: http://rationalwiki.org/wiki/Roko%27s_basilisk … Scream!! (talk) 23:00, 23 February 2013 (UTC)
 * "the idea of a godlike AI being so solipsitic it has nowt better to do than torture its own fantasies is satirical though" - David Gerard (talk) 01:27, 24 February 2013 (UTC)
 * I think the prototype for a past-future acausal trade would be some of the decision-theoretic examples like Parfit's hitchhiker. The idea is that payoff is maximized by commitments across time, that couldn't be sustained by an entity using causal decision theory. The details are somewhere in Eliezer's TDT monograph. So according to the scenario, the future AI isn't being petty or pointless in committing to punish those who make the deal and then don't measure up; the acausal handshake across time is supposed to make the whole thing work. Somehow.
 * I mention this because clearly no-one is getting this part of it. According to an ordinary way of thinking, it is completely futile for the AI to punish people, since the past is already over and can't be changed. So people are coming up with psychopathological explanations for the posited behavior of the AI. But once again, it's supposed to make sense only because acausal trades are supposed to make sense. Mporter (talk) 02:53, 24 February 2013 (UTC)
 * Again, I'm hitting the barrier that this stuff doesn't make sense, and when I look further, the explanation doesn't make sense either. I can see why we need someone trained in theology to wrangle this many assumptions in the air at the same time. Is there any hope of summarising this stuff in a manner comprehensible to humans, with enough references that even the science fiction writers won't assume we're just Poeing them? - David Gerard (talk) 02:56, 24 February 2013 (UTC)
 * It's part of the whole comedy of errors. Most of the latecomer basilisk sufferers, basilisk mockers, and basilisk debunkers, are interacting with basilisks which are partly or mostly of their own devising. Almost no-one grasps Roko's scheme in its entirety, as it was originally articulated. The real "ground zero" of the phenomenon was the culture around SIAI that was absorbed in the study of acausal decision theory, and which would have had its own understanding of all the nuances. (I wasn't there, so I would be in the dark about some of it.)
 * Right now, the main function of public discourse about the basilisk seems to be that it provides an occasion to denigrate Eliezer, LW, SIAI, and "singularitarians" as crazy people. But once the drama starts to die down, perhaps we'll be able to see this episode of basilisk mania as just an interlude, and it will become apparent that the real story lies at the intersection of decision theory, multiverse theories, ideas of AI and a singularity, i.e. it's a weird idea that came from an avantgarde intellectual fringe, but which does originate in the confluence of currents of thought which individually have a broader and more respectable base. Mporter (talk) 05:51, 24 February 2013 (UTC)
 * So maybe we should have a page on timeless decision theory, or Newcomb's paradox, or something like that, with just positive or slightly negative payoff examples (not torture!), so we can discuss the core of these ideas without the emotional and other baggage of the basilisk? That could be linked to from this page. Many people (including me) have the intuition that even Newcomb's paradox is silly and we should just open both boxes. We already cover a number of other thought experiments, which don't at first glance seem to be as relevant to RW's core mission (yes, I know this is whataboutery).--Greenrd (talk) 10:06, 24 February 2013 (UTC)

I think this article has got a lot better. As a non-LW-reader I'm not well placed to say how accurately we're portraying it, but I think I'm starting to understand the idea better. Possible cover article? 15:29, 24 February 2013 (UTC)


 * I'd like it to mature a lot more - David Gerard (talk) 15:35, 24 February 2013 (UTC)


 * Has anyone taken the effort to actually read Yudkowsky's TDT paper yet? I mean, if you really want to turn this into an article that captures the whole issue thoroughly... - XiXiDu (talk) 15:40, 24 February 2013 (UTC)
 * I believe Dmytry did, and went "wtf is this shit, this is fractally ridiculous" - David Gerard (talk) 16:44, 24 February 2013 (UTC)
 * I checked it out, it's horridly written, I can't even tell there's any substance beyond re-branding and popularizing superrationality and calling that "formalizing". I read their UDT paper though, that one has some math even though it still relies on entirely handwaved "mathematical intuition" which does the right thing in an ill-specified way. edit: I believe I pretty much outlined all I concluded from the UDT paper, but to put it in one place: firstly, it's unclear if the UDT is motivated to torture us or not torture us; secondly, this lack of clarity makes it impossible to actually motivate UDT to do either; and thirdly, even if UDT was motivated to torture, the torture is purely hypothetical and amounts to an attitude that "if this donor had donated 1 cent less I would have tortured him", which doesn't really help a person that is suffering from the basilisk due to self references one can't make about oneself. Dmytry (talk) 18:04, 24 February 2013 (UTC)

What is the basilisk, one more time
(Continuing the response to David, above, where he asks 'can we describe this accurately and yet comprehensibly?') So my current synopsis of what Roko was proposing, is: a past-future acausal trade in which xrisk/singularity activists in the present, make a deal with the future friendly AI of their dreams, that if they don't do everything they can to produce a friendly singularity, then the FAI may punish them after the singularity... And Roko at the same time had a proposal for how to play your part without sacrificing your social life, the "quantum billionaire trick", i.e. betting a few thousand dollars on unlikely trades, having committed to spend the winnings on FAI / xrisk prevention if you do win. The rationale being that, according to the many worlds interpretation of quantum mechanics, the trade will win in some Everett multiverse branches, and that is where you fulfil your part of the deal.

So it's brilliantly creative, and it's crazy, and it's symptomatic of a particular intellectual milieu, and it's also something that could be translated into propositions of formal logic and picked apart in a philosophy paper. But instead it was censored by Eliezer, just in case someone out there was foolish enough to adopt Roko's plan, and just in case doing so genuinely led to the future copy (of the person in our present) being sent to virtual hell, in fulfilment of the bargain. Thus Eliezer's remark at reddit: "I'm still struggling to grasp these issues and I don't know whether the [basilisk] can be made to go through with sufficiently intelligent stupidity in the future, or whether anyone on the planet was actually put at risk for ["basilisking"] based on the events that happened already, or whether there's anything a future FAI can do to patch that after the fact." The reference to the future FAI presumably being, that it might do a further deal with the punishing AI, to save the poor copy-of-a-human-being who is held hostage to the folly of its original.

I imagine that I am still missing some nuances, e.g. regarding who is on the other side of the original trade. It may be imagined that you are entering into a transaction with a whole timeless economy of multiverse intelligences, engaged in mutual simulation... But before you blow this off as just an absurd science-fiction fantasy, remember that this all came about in an attempt to explain why you should one-box on Newcomb's problem, when causal decision theory says you should two-box, even though that leads to a lower payoff. Eliezer's explanation involved timeless coordination between your own action in the present, and Omega's prediction of your behavior, made in the past; you can treat the prediction as a player in a coordination game, with the special property that your actions timelessly control it. The idea of rational acausal trade is the seed from which this all grew, when planted in a hothouse context of belief in multiverses and superintelligences and future singularities. And ultimately the best way to slay the basilisk is to tackle that conjunction of ideas, and either show that the individual ideas are false, or even that, supposing them all to be true, the original scheme still doesn't work for some reason. Mporter (talk) 06:17, 24 February 2013 (UTC)
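
For readers who have not met Newcomb's problem, the "lower payoff" remark above can be illustrated with a small Python sketch. It uses the customary textbook amounts (1,000,000 in the opaque box, 1,000 in the transparent one), which are not taken from this discussion, and it assumes the predictor is right with some fixed probability. The function names are hypothetical.

```python
MILLION = 1_000_000  # opaque box, filled only if one-boxing was predicted
THOUSAND = 1_000     # transparent box, always present


def one_box_payoff(accuracy: float) -> float:
    """Expected payoff when taking only the opaque box."""
    # The opaque box is full whenever the predictor correctly foresaw one-boxing.
    return accuracy * MILLION


def two_box_payoff(accuracy: float) -> float:
    """Expected payoff when taking both boxes."""
    # The opaque box is full only when the predictor was wrong.
    return (1.0 - accuracy) * MILLION + THOUSAND


for accuracy in (0.5, 0.9, 0.99):
    print(accuracy, one_box_payoff(accuracy), two_box_payoff(accuracy))
```

The causal argument for two-boxing holds the box contents fixed, so taking both always looks 1,000 better; the calculation above instead lets the contents vary with the prediction, and then one-boxing comes out ahead once the predictor is even slightly better than chance.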
 * TBH, I think people are rather correct in seeing that stuff as mere rationalization and nonsense. TDT itself is a product of rationalizing the Newcomb's; this is why it is so ill defined. If it had not been a rationalization, there would have been more work on making some sort of optimality proofs and the like which could have demonstrated Basilisk false, or demonstrated TDT faulty. The people involved (with a possible exception of Roko himself, ironically) do not have backgrounds in science, mathematics, and the like, they have backgrounds in religion and science fiction writing. Yes, it is a very detailed rationalization, but it is a rationalization nonetheless. They rationalize 2, I mean, 1 boxing, rationalize MWI, rationalize vengeful gods, that's the meta description of what happened. Dmytry (talk) 06:48, 24 February 2013 (UTC)


 * Mitchell, in the "Step by step refutation" section above, I've outlined what I consider a bare-bones version of the basilisk in six propositions, which doesn't require concepts of simulations, TDT, acausal trading, whatever -- I think that *all* of those are cherries on top, which aren't required for the simplest possible Roko's basilisk to work. That Roko himself was the type of person to enjoy laying concepts on top of concepts on top of concepts, doesn't mean that it can't be significantly simplified. Let me know what you think. Aris Katsaris (talk) 18:34, 24 February 2013 (UTC)


 * Perhaps we could say that the essential ideas are, in order, omnipotent future AIs, acausal decision theory, and finally multiverse metaphysics. So "fear of a future AI" can be explicated in a strictly causal scenario such as you describe; then the peculiarities of acausal trade and timeless decision theory; and perhaps finally many-worlds (this could be a logical place to put Roko's answer, the quantum billionaire trick). Mporter (talk) 20:43, 24 February 2013 (UTC)


 * Am I missing something, or is everyone else? If many worlds is correct, then one of my counterparts is already being tortured somewhere. I don't need any further hypotheses. 12:11, 28 February 2013 (UTC)

In Popular Culture?
This kind of thing almost reads like the backstory for how the ReMastered in Charlie Stross's novel "Iron Sunrise" got started. If you accept Roko's Basilisk, everything they do becomes perfectly logical. -- Resuna (talk) 13:28, 24 February 2013 (UTC)


 * After Charlie Stross posted that tweet, it's going to be in popular culture. Wonder what Greg Egan would do with it, he's already had humans communicating with other-universe alien intelligences via the largely defensive weapon of inconsistent mathematical proofs - David Gerard (talk) 14:01, 24 February 2013 (UTC)


 * A popular novel featuring Roko's basilisk? That would be pretty funny. They could depict various factions of rationalists fighting over it, each pursuing a different strategy. One faction would try to censor the topic by all means. Another would do everything to build an unfriendly AI that would punish those not helping to build it. Yet another faction would demand that we march all of our citizens into death camps.


 * The first faction would end up destroying the Internet, burning all books and killing all academics to impede the dangerous knowledge cut loose by Roko. In the preface the downfall of the modern world would be explained by this. The actual story then would be set in the year 4110 when the world not just recovered but invented advanced AI and many other technologies we dreamed about today.


 * The plot would be about a team of AI supported cyborg archaeologists on Mars discovering an old human artifact from the 2020's, some kind of primitive machine that could be controlled from afar to move over the surface of Mars. When tapping its internal storage they are shocked. It looks like the last upload from Earth was all information associated with the infamous Roko incident that led to the self-inflicted destruction of the first technological civilization over 2000 years ago.


 * Sure, the archaeologists only know the name of the incident that led a group of people to destroy civilized society. But there's a video too! Some Chinese looking guy can be seen, panic in his eyes and loud explosions in the background. Apparently some MIRI assault team is trying to take out his facility, as you can hear a repeated message coming in from a receiver, "We are MIRI. Resistance is futile..." He's explaining how he's going to upload all relevant data to let the future know that it was all for nothing...then the video suddenly ends.


 * Instantly long instantiated measures are taken to sandbox the data for further analysis.


 * In the epilogue it is then told how people are aghast at how the ancients destroyed their civilization over such blatant nonsense. How could anyone have taken those ideas seriously, when every kid knows that a hard takeoff isn't possible, as there can only be a gradual development of artificial intelligence, and that any technological civilization is merging with its machines rather than being ruled by them?


 * Nah, in the epilog it turns out that the singularity take-off was weirder than anything we imagined, but it's never quite explained because translating from neural packets to english creates unfortunate clenirations in the ridgeway armiphlage... -- Resuna (talk) 18:05, 25 February 2013 (UTC)


 * And the moral of the story would be that the real risk is taking mere ideas too seriously! - XiXiDu (talk) 14:19, 24 February 2013 (UTC)


 * One big problem is that LW/MIRI/CFAR encourages "the skill of taking ideas seriously" - by which they don't mean being able to deftly sling hypotheticals around, they mean feeling and acting upon them. In the outside world, this is a defect, called "gullibility" or "being easily led". It selects for such people. I suspect people with this problem (it's ridiculous to call it a "skill") are more prone to take the concatenation of unlikely hypotheticals and get seriously upset when it adds up to a basilisk. But the usual reaction to realising one's own gullibility is epistemic learned helplessness: taking no ideas seriously. It's hard to describe how to come to a happy medium - David Gerard (talk) 16:49, 24 February 2013 (UTC)
 * Well, the issue is in meta reasoning - keeping track of validity of the argument. Validity of sloppy reasoning decreases exponentially or super-exponentially in its length, in number of assumptions, and so on. They had articles about 'compartmentalization', one to propagate all beliefs (which you might only think you are doing if you have no clue whatsoever what is involved in propagating beliefs correctly on arbitrary graphs), the other to compartmentalize... it's ridiculous, correct reasoning naturally will not go a very long way on shaky premises, because validity decays, and furthermore, correct reasoning does not go a very long way because of computational complexity of "bayesian updates" when those are done right. It's not some stupid irrational compartmentalization, it's just an inherent part of thinking straight! At the end of the day the issue is that they don't have the slightest clue how to reason correctly to begin with and are making it worse with their so called "rationalism". It boils down to some belief that half understood fundamentals of difficult mathematical topics will make you awesomely rational. Ultimately, what they pride themselves on as "taking ideas seriously" is taking bullshit seriously. Another big problem is that a lot of things that are true and correct are commonly explained using faulty arguments, which lends credence to other faulty arguments and leads to general confusion (you often should take seriously something that is only supported by faulty arguments; as a product of trust, not of arguments). Dmytry (talk) 17:57, 24 February 2013 (UTC)
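
The point about validity decaying with the length of an argument can be shown with a very small Python sketch. It assumes, purely for illustration, that each step of a chain is sound independently with the same probability; real arguments are messier, and the 0.9 figure is made up.

```python
def chain_confidence(step_confidence: float, steps: int) -> float:
    """Probability that every step in a chain holds, assuming independence."""
    return step_confidence ** steps


# Even fairly solid individual steps add up to a shaky conclusion.
for steps in (1, 5, 10, 20, 50):
    print(steps, round(chain_confidence(0.9, steps), 4))
```

Under that assumption the confidence in the conclusion shrinks geometrically with every extra step, so a long stack of individually plausible hypotheticals supports very little.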

Side-by-side dissection started
It's half past midnight and I'm not staying up any later working on this lunatic bollocks. But I've put the original basilisk post up at User:David Gerard/aibox  (since of course an AI is always safe in a box) for anyone who can translate LW jargon to fresh-reader English. The post is additionally confusing because it is the last in a long sequence of Roko posts on these subjects, each as jargon-laden as this one - David Gerard (talk) 00:31, 25 February 2013 (UTC)


 * wow, I thought that "But in fact one person at SIAI was severely worried by this, to the point of having terrible nightmares, though ve wishes to remain anonymous." was Yudkowsky's words, while it was part of the original post. Looking more and more like baby scientology to me, complete with internal crazy shit that shouldn't be discussed with outsiders. Back in the early days of scientology, aliens were all the rage. Now, future superhuman AIs. Dmytry (talk) 08:02, 25 February 2013 (UTC)


 * By the way, the following comments linked to in my collection of links on the topic can not be found in the txt backup of Roko's original post. I fetched them from Google Reader after the incident and took screenshots: 01, 02, 03.


 * The screenshots from Google Reader also preserved the formatting of Yudkowsky's first comment on the topic, where he went completely nuts. And here is also a PDF of the original basilisk.


 * Yvain's comment on the LessWrong talk page might also be worth mentioning. - XiXiDu (talk) 09:27, 25 February 2013 (UTC)


 * Excellent references, thank you! - David Gerard (talk) 13:12, 25 February 2013 (UTC)

Yudkowsky's mugging
Compare the following ideas:

1.) Roko's basilisk

2.) The ephemeral nature of consciousness suggests a quantum process.

Neither Eliezer Yudkowsky nor Roger Penrose is dumb. Roger Penrose, judged by his achievements, is probably smarter. Yet people who are worried about Roko's basilisk don't take the second idea seriously. What's the difference though?

The difference is that people are not willing to consider that the first idea is wrong because they are told that thinking about it has hugely negative expected utility. What are the implications of taking such claims seriously?

Consider what would have happened if Roger Penrose had added to his claim that anyone thinking about the non-computable nature of consciousness in sufficient detail would crash the human mind by triggering thoughts that the mind is physically or logically incapable of thinking. His idea would have practically become unfalsifiable.

In essence, what Eliezer Yudkowsky is doing is a case of Pascal's mugging. Conjecturing vast utilities to control people's actions.

Extraordinary claims require extraordinary evidence. Do people have extraordinary evidence when it comes to Roko's basilisk? I bet not. So what is it that worries them? I suppose the vast amount of conjectured negative utility involved. Where do you draw the line though? Why not go a step further and worry about running into a case of Pascal's mugging or that saying "Abracadabra" will cause the lords of the Matrix to shut down the simulation? After all, any Pascal's mugger is able to conjecture even more negative utility than you have to expect from Roko's basilisk.

Should I dare to criticize Eliezer Yudkowsky? Well, let's see. If he is right I will ever so slightly reduce the chance of a positive Singularity and if he is wrong he will just waste a bit more money which would probably be wasted anyway. So clearly any criticism is going to have a hugely negative expected utility.

Should I go buy ice cream if I don't have to? Well, let's see. If I account for the possibility that I might die on the way, taking into account the fun I might have living for billions of years in an inter-galactic civilization, then it clearly has negative expected utility to go out to buy ice cream. So uhm...

The whole line of reasoning is simply unworkable for computationally bounded agents like us. We are forced to arbitrarily discount certain obscure low probability risks or else fall prey to our own shortcomings and inability to discern fantasy from reality. In other words, it is much more probable that we're going to make everything worse or waste our time than that we're actually maximizing expected utility when trying to act based on conjunctive, non-evidence-backed speculations on possible bad outcomes.
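
The ice cream example can be put into numbers with a short Python sketch. Every figure here is invented for illustration: a small pleasure from the ice cream, a one-in-ten-million chance of dying on the way, and an enormous negative utility for losing a very long future. The probability cutoff is one crude way of modelling the "arbitrary discounting" described above, not a recommended method.

```python
def expected_utility(outcomes: list[tuple[float, float]],
                     min_probability: float = 0.0) -> float:
    """Sum probability * utility, ignoring outcomes rarer than the cutoff."""
    return sum(p * u for p, u in outcomes if p >= min_probability)


# (probability, utility) pairs for going out to buy ice cream.
go_out = [
    (1.0 - 1e-7, 1.0),   # enjoy the ice cream
    (1e-7, -1e12),       # die on the way, losing a vast imagined future
]

# Naive multiplication: the tiny probability swamps everything else.
print(expected_utility(go_out))

# With a cutoff on obscure risks, the ordinary outcome decides.
print(expected_utility(go_out, min_probability=1e-6))
```

Without the cutoff the sign of the answer is set entirely by whoever conjectures the largest utility, which is the sense in which the reasoning above is called unworkable.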

I expect that the line of reasoning that leads people to take Roko's basilisk seriously is much more dangerous than Roko's basilisk itself.

I think that 1.) it is not only useless but actively harmful trying to assign probabilities to such vague ideas 2.) that expected utility maximization is impossible and approximations nothing but harmful hand-waving 3.) that there is no way to come to an agreement over where to draw the line regarding expected utility calculations involving huge model uncertainty because no such method has been formalized yet.

Further reading:

1.) The Singularity Institute: How their arguments are broken

2.) The Singularity Institute: Addendum to what’s wrong with their arguments

- XiXiDu (talk) 12:47, 25 February 2013 (UTC)


 * So if I understand you correctly, the only way you see to escape the problem of the basilisk is to dismiss the whole concept of assigning probability and the whole concept of making even vague estimations about expected utility? So rather than tagging "considering deals with distant superintelligences" as dangerous, you instead tag "considering probabilities in general" as dangerous? You've just tagged as dangerous a much MUCH larger portion of thoughtspace than Eliezer ever did... Aris Katsaris (talk) 19:36, 25 February 2013 (UTC)


 * I never said that we should dismiss the concept of assigning probabilities. Just that a 20% chance of an extinction sized asteroid hitting Earth isn't the same as a 20% chance of extinction by superhuman AI. I am currently teaching myself all the necessary background knowledge, i.e. learning probability theory etc., but that will take some time. So I can't really formally pin down this intuition. But I am pretty sure that this is the case. Assigning probabilities to vague ideas like Roko's basilisk, rather than just discounting them as absurd, is more dangerous than Roko's basilisk itself.


 * I asked a few mathematicians about this and also people like Holden Karnofsky and Douglas Hofstadter who I believe to have a sufficient grasp of the required mathematics. And they all tell me that it would be crazy to take such threats seriously. Even LessWrong top-contributors like wedrifid told me that there is no problem with thinking about it. - XiXiDu (talk) 20:30, 25 February 2013 (UTC)


 * This reminded me of the following post by Eliezer Yudkowsky: When (Not) To Use Probabilities (That's what I am talking about.) - XiXiDu (talk) 20:34, 25 February 2013 (UTC)


 * Okay, to get this straight. I take the same stance towards Roko's basilisk as Eliezer Yudkowsky takes towards Pascal's mugging. To quote,


 * "I'd sooner question my grasp of 'rationality' than give five dollars to a Pascal's Mugger because I thought it was 'rational'."


 * I can't justify this formally. At least not yet. But I am not going to stop thinking about thought experiments because some guy with a blog believes it's dangerous. - XiXiDu (talk) 20:38, 25 February 2013 (UTC)


 * I'm not asking you to "stop thinking about it". I think the most I've ever asked of you is to treat more charitably people who (also having thought about it) have ended up reaching different conclusions about it. Aris Katsaris (talk) 21:10, 25 February 2013 (UTC)


 * No, it's dealing with negligible probabilities in this manner that consistently ends up with ridiculous results, because humans just can't do that. "Shut up and multiply" with a negligible probability repeatedly results in nonsense, but the people advocating it go "wow, this nonsense must be important!" rather than "oops, this may not work like we thought it did" - David Gerard (talk) 20:10, 25 February 2013 (UTC)


 * I'm fine with people choosing to dismiss very small probabilities for that exact reason you mention -- at some point you just think to yourself "it's more harmful to worry about such small probabilities, than to ignore them". You don't see me be afraid of discussing the basilisk, do you? But that means they must assign such probabilities first at least roughly, instead of just laughing off any concept that is not *mainstream* enough. And I'm afraid it's the lack of mainstream that causes Rationalwiki to mock ideas like e.g. transhumanism or the possibility of superintelligences in the future. Ideas like transhumanism have a negligible probability of *not* coming true, assuming humanity survives this century. By that same argument, rationalwiki should be treating transhumanism as a given, and dismissing the negligible probabilities to the opposite with the same mockery it treats the basilisk. And it doesn't. Aris Katsaris (talk) 21:10, 25 February 2013 (UTC)


 * RationalWiki isn't a coherent entity, and in any case this article is way more sympathetic than the people fresh to the concept of the Basilisk hearing of it for the first time - David Gerard (talk) 22:22, 25 February 2013 (UTC)


 * And, Aris - don't forget that XiXiDu got bitten by the basilisk too, and may need some recovery time - David Gerard (talk) 20:15, 25 February 2013 (UTC)

The Unpardonable Sin
I don't know if this comparison has been made before but reading some of this reminds me of what I understand to be some versions of the Christian idea of unpardonable sin.

According to some, the unpardonable sin consists of knowing the truth (or "Truth" I suppose) but then rejecting it. (Or even being exposed to the truth and then rejecting it.) What then happens is that you receive greater punishment than if you had never been exposed to The Truth.

It seems to me that there are some parallels with the Basilisk's reaction to those who don't help it out.--Bob"I think you'll find it's more complicated than that." 13:40, 25 February 2013 (UTC)

Charles Stross blogs about Roko's basilisk
I just noticed that science fiction author Charles Stross posted a blog post on Roko's basilisk!

I don't know how I could miss this. I thought I subscribed to his blog! - XiXiDu (talk) 13:55, 25 February 2013 (UTC)


 * Stross's editorial - in which the basilisk is interpreted as a symptom of "Extropian Calvinism" - is a sign of where we're headed: a confused account of the basilisk will be used as a stepping stone towards some more general allegation.
 * So the next thing to anticipate is some sort of push back. The basilisk itself is an internal affair of the Less Wrong forum (and lots of people there opposed the censorship policy from the beginning), but the elements of the concept have a broader cultural base. Gary Drescher, another independent AI researcher (but one well-connected enough to have a book published by MIT Press), has been a participant in these discussions about acausal trade and mutual simulation. Jaan Tallinn, one of the co-founders of Skype, gave a speech at the most recent Singularity Summit that was all about superintelligences in parallel universes simulating pre-singularity civilizations. If you look a little further afield, you'll see David Chalmers writing about an AI singularity and Max Tegmark promoting the idea of a mathematical multiverse. The basilisk affair may catalyse a broader discussion about multiverse metaphysics and computational eschatology, and whether that is "insane" as well. Mporter (talk) 01:52, 26 February 2013 (UTC)


 * That's why I've noted right in the intro that LW does not believe in or advocate the basilisk ... just in most of the pieces - David Gerard (talk) 08:25, 26 February 2013 (UTC)

Some prehistory
"The Blackmail Equation" Mporter (talk) 15:10, 25 February 2013 (UTC)
 * A tedious and boring game: spot all the remaining ways he screwed up (like a barber that shaves everyone who doesn't shave himself). Less tedious approach: note the symmetry in the problem: either party can choose to have the information revealed if the other doesn't give in (in form of either paying money or fucking off), note that if he gets some asymmetry, he must have screwed up somewhere. You're a physicist, right? It's equivalent to screwing up the math when handwaving and ending up with preferred orientation or the like. The handwaving may be vague enough that you can't point to a specific error, but you know by symmetry that there is an error somewhere. Dmytry (talk) 15:52, 25 February 2013 (UTC)
 * I haven't even tried to analyse the discussion at that page, which I only just discovered - I posted it as data about the prehistory of the basilisk: a SingInst decision theory workshop, in early 2010, at which acausal blackmail was a minor topic. Mporter (talk) 01:12, 26 February 2013 (UTC)
 * Yes, I just read through it a little and felt dumber for even trying. Basilisk fits a common pattern: some non stupid folks get into a habit, in childhood probably, of coming up with really really stupid stuff dressed up in enough technobabble. Such stuff puts you into a dilemma: they are either really stupid or they understand something I don't get, you cull "they are really stupid", you think higher of them, you emit some rewards, perpetuating the annoying behaviour. Dmytry (talk) 14:13, 26 February 2013 (UTC)

So I just showed the loved one the Basilisk
She got through the summary and background, shouting "What? WHAT?!" and describing each piece scornfully. She got to the description of Roko's post and declared that this was far, far too stupid to actually waste another neuron reading ("I'm 40, my neurons are precious!"). I asked her to please keep reading from "Aftermath" (partially to see how a fresh reader reacts to this lunatic bollocks, partially for the lulz) and she laughed at said section and the screencap. Made it through "What makes a basilisk tick?" shouting "BOLLOCKS". Approved of the idea of university campus counsellors having some idea what to do with existential depression. So yeah, this isn't quite ready for normal humans, but I'm pretty sure that's not possible anyway. She also said "they've invented their own religion and it's as irrational as Christianity" - and she's a Christian - David Gerard (talk) 00:00, 26 February 2013 (UTC)
 * It might be as irrational as Christianity if Christians assigned a 0.01% probability to hell existing. What probability does she assign to hell existing btw? Aris Katsaris (talk) 00:47, 26 February 2013 (UTC)
 * This may not be a coherent question. How much experience do you have of Church of England Christians who've actually read much of the Bible and know the history of it and are still believers? She also self-describes as "Christo-pagan". But remains an active member of the local C of E. Welcome to Middle England - David Gerard (talk) 08:14, 26 February 2013 (UTC)
 * Do you assign a 0.01% probability to Roko's basilisk? I'd really like to see someone outline how they came up with a probability estimate of Roko's basilisk, how such an estimate is being justified. - XiXiDu (talk) 12:36, 26 February 2013 (UTC)
 * I used "0.01%" because it's the only probability I know that was mentioned in the original Roko thread, by Roko himself, so anything else would be mere assumption and misleading on what LWers believe (I don't remember seeing anyone else assign a probability for it, either inside or outside LW). But the rough probabilities I'd assign to the 6 steps above are:
 * -- 80% that one or more artificial superintelligences (AIs) will exist in the future of humanity.
 * -- 60% that conditional to the above, this AI/these AIs will not simply destroy all humanity, but keep letting it exist.
 * -- 40% that conditional to the above, this AI will exist either in your lifetime OR alternatively be capable of reconstructing you in some manner that you'd personally consider "you".
 * -- 20% that conditional to the above, these AIs will not be of benefit to *every* human being either, but will instead choose to hurt some according to their own criteria.
 * -- 4% that conditional to the above, one of these criteria will be how much these people accurately anticipated said punishment for said acts or omissions, and the AIs will hurt people more if they could anticipate being hurt.
 * -- 4% that conditional to the above, the AI in question has a low enough standard for "could anticipate" that it counts the above reasoning as sufficient to pass said criterion.
 * Multiplying the above takes us to 0.006144% which is a bit less than Roko's "0.01%". Btw, I will *not* defend or justify these estimates, as I took only about two seconds to determine each of them, and they're as such very rough. What are *your* estimated probabilities?
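For anyone who wants to check the arithmetic, here is a minimal sketch that chains the six conditional estimates quoted above. The numbers are the rough guesses given in the comment; the names `steps` and `chain_probability` are purely illustrative, and running this says nothing about whether the estimates themselves have any basis.

```python
from math import prod

# Rough conditional estimates quoted in the comment above, one per step.
# They are guesses that their author explicitly declined to defend.
steps = [0.80, 0.60, 0.40, 0.20, 0.04, 0.04]


def chain_probability(conditionals):
    """Multiply a chain of conditional probabilities into one joint probability."""
    return prod(conditionals)


joint = chain_probability(steps)
print(f"joint probability: {joint:.8f} ({joint * 100:.6f}%)")
```

The product comes out at about 0.006144%, matching the figure stated in the comment.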
 * Thanks. Really quick:


 * -- 60% that humanity will survive long enough to see the arrival of a superintelligence.
 * -- 0.1% that conditional to the above, such a superintelligence will be created by baseline humans in too short a period of time for us to either adapt or notice catastrophic side effects and fix them. In other words, 0.1% that there will be an uncontrollable intelligence explosion.
 * -- 1% that conditional to the above, the resulting AI will be an expected utility maximizer, or that a consequentialist AI is possible at all.
 * -- 90% that conditional to the above, human civilization will not survive a consequentialist superintelligence that is a result of an intelligence explosion.
 * -- 1% that conditional to the above, this AI will exist either in your lifetime OR alternatively be capable of reconstructing you in some manner that you'd personally consider "you".
 * -- 1% that conditional to the above, one of these criteria will be how much these people accurately anticipated said punishment for said acts or omissions, and the AIs will hurt people more if they could anticipate being hurt.
 * -- 1% that conditional to the above, the AI in question has a low enough standard for "could anticipate" that it counts the above reasoning as sufficient to pass said criterion. - XiXiDu (talk) 18:28, 26 February 2013 (UTC)


 * I gotta say: statements of probability about things you don't understand and have no actual knowledge of are just wank. Anyone can make up numbers and do Bayes on them, but doing Bayes on them doesn't change that you're just making shit up. And doing that but presenting it as if these are numbers you have any reasonable basis for, rather than that you just made them up, is actively misleading your reader/listener. (This is why if someone argues for cryonics saying "Bayesian!", you need to ask them to show their working.) - David Gerard (talk) 23:46, 26 February 2013 (UTC)
 * Yeah, I think that "making shit up" is a step up from not being willing to specify your level of certainty. Because when I "make shit up" I can end up being shown to be overconfident, and then I can try to calibrate my intuition if I prove consistently overconfident in said predictions. But the people who never use numbers at all can't. Their unspecified predictions can never be shown to be overconfident/wrong, and thus they can't use such experience to become less wrong. Aris Katsaris (talk) 21:10, 27 February 2013 (UTC)
 * David, I agree. That's why I asked them to show how they came up with their numbers in the first place. See 'What I would like the Singularity Institute to publish'.


 * I do believe that using Bayes’ rule when faced with data from empirical experiments, or goats behind doors in a gameshow, is the way to go.


 * But I fear that using formal methods to evaluate informal evidence might lend your beliefs an improper veneer of respectability and in turn make them appear to be more trustworthy than your intuition. For example, using formal methods to evaluate something like AI risks might cause dramatic overconfidence.


 * Bayes’ rule only tells us by how much we should increase our credence given certain input. But given most everyday life circumstances the input is often conditioned and evaluated by our intuition. Which means that using Bayes’ rule to update on evidence does emotionally push the use of intuition onto a lower level. In other words, using Bayes’ rule to update on evidence that is vague (the conclusions being highly error prone), and given probabilities that are being filled in by intuition, might simply disguise the fact that you are still using your intuition after all, while making you believe that you are not.


 * Even worse, using formal methods on informal evidence might introduce additional error sources. - XiXiDu (talk) 13:19, 27 February 2013 (UTC)


 * Precisely. The habit of doing Bayes on made-up numbers and then presenting the result as if it has any basis is something that's long irritated me about LW, ever since arguments over the cryonics article. Here's an example post, by Roko: Those numbers have NO BASIS. The assignments of probability you and Aris made above have NO BASIS. But they're presented as if they do - David Gerard (talk) 13:44, 27 February 2013 (UTC)


 * What's even worse is when you start using those unfounded probability estimates and multiply them by arbitrarily huge made up values that are supposed to represent how much you desire each possible outcome.


 * What bothers me about LW in this respect is how they subscribe to the "shut up and multiply" mantra by arguing that human intuition is not the most reliable guide to drive our decisions. But WTF!? That's exactly what they are doing. And they are making everything worse by fooling themselves into believing that they are able to transcend their intuitions. - XiXiDu (talk) 14:13, 27 February 2013 (UTC)
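To illustrate the multiplication being objected to here, a small sketch with entirely hypothetical numbers: one made-up utility value multiplied by three equally unfounded probability guesses. Neither the utility figure nor the probabilities come from anyone in this discussion; they are assumptions chosen only to show how far the product swings.

```python
def expected_utility(probability, utility):
    """Plain 'shut up and multiply': probability times utility."""
    return probability * utility


# Hypothetical, made-up stakes: an arbitrarily huge negative utility.
made_up_utility = -1e15

# Three guesses at the probability, none with any more basis than the others.
for guess in (1e-4, 1e-8, 1e-12):
    print(f"p = {guess:g} -> expected utility = {expected_utility(guess, made_up_utility):g}")
```

The three results differ by eight orders of magnitude, so the conclusion is determined almost entirely by which unsupported guess went in.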
 * Transcendence? Reincarnation (via simulation)?  LessWrong invented techno-Buddhism!  :-D   19:34, 27 February 2013 (UTC)
 * I think that LWers tend to be quite aware when they are using their intuition to produce such numbers. I'm certainly aware of it, what makes you think I'm not? Aris Katsaris (talk) 21:03, 27 February 2013 (UTC)
 * I actually ran this by the president of the uni's skeptic/atheist club for pseudoscience week. He had not only heard of it, he proclaimed it too stupid to make a poster on. Ty JFBAA 00:05, 26 February 2013 (UTC)
 * I kind of agree with it not being "quite ready for normal humans" yet. I read through about the first half of the article yesterday and kinda got the "WTF is this talking about?" feeling. I feel like I would need to have been around to observe first hand this idea's formation to really get a comfortable idea of what this article's talking about. Maybe it's just me.  Sam   Tally-ho!  00:07, 26 February 2013 (UTC)
 * It has so many prerequisites, and the prerequisites are silly too. She actually started yelling at "IT'S BANNED FROM DISCUSSION?!" - David Gerard (talk) 00:16, 26 February 2013 (UTC)
 * I think the article is fine. It (basilisk) really is this stupid. You're overthinking it. There wasn't a trace of correct math present, like, ever; the mathy technical bits are superfluous garbage as far as original shit goes. You're seeing people come up with something like this, and you're not used to fakers or fake intellectualism in general, and you see, ohh those guys must have some mathy reason behind it. And this is what they are after and why they do this sort of BS in the first place, to fool you in such a manner, and they don't really do that consciously, you see, they were bright little children and got trained by simple conditioning into doing things that appear intelligent. Of course there's also some folks who likewise read sense into this stuff and freak out, and that's where presence of math-y bits matters because they freak out by imagining someone actually figuring out something. But the origin of such crap is fake intellectualism, and this is 100% clear to normal people (very simple heuristics: was it mathoverflow (or something similar) that freaked out? No? it wasn't math). Dmytry (talk) 13:58, 26 February 2013 (UTC)

How likely is it that it was all intended?
How likely is it that Roko's basilisk was intended to make people aware of risks associated with artificial general intelligence (AGI), deliberately using the Streisand effect?

Even if most people who encounter the idea discount it as nonsense, it will still cast doubt on the motives of future AGI. Anyone encountering Roko's basilisk will forever associate AGI with torture... - XiXiDu (talk) 15:29, 26 February 2013 (UTC)
 * Very unlikely. Too stupid. Far more likely is that Roko sort of re-invented or re-built from what he heard some internal crazy thing they actually believe in (or, since they're such probabilistic thinkers, to which they attribute enough probability, or which they use on susceptible people). Which they didn't want criticized. And failed ridiculously. Dmytry (talk) 15:38, 26 February 2013 (UTC)

Quote by LW top-contributor wedrifid
The following comment seems correct (context),

"I've had people come to me who are traumatised by basilisk considerations. From what I can tell almost all of the trauma is attributable to Eliezer's behavior. The descriptions of the experience give clear indications (ie. direct self reports that are coherent) that a significant reason that they 'take the basilisk seriously' is because Eliezer considers it a sufficiently big deal that he takes such drastic and emotional action. Heck, without Eliezer's response it wouldn't even have earned that title. It'd be a trivial backwater game theory question to which there are multiple practical answers."

That, in combination with the Streisand effect, makes me wonder what's really going on in Eliezer Yudkowsky's mind.

Banning any discussion of an idea is known to spread it. But more importantly, it can give even more credence to an idea whose hazardous effect is in the first place a result of an unjustified stamp of credence.

If Eliezer Yudkowsky were really interested in protecting gullible people from an irrational idea then he should go ahead and openly dismiss it as insane and possibly even dissolve the problem once and for all.

It is utterly irresponsible to try to protect people who are scared of ghosts and spirits by banning all discussions of how it is irrational to fear those ideas.

I believe that the real reason for his decision to ban all discussion of Roko’s basilisk is rather that he is simply unable to disavow the idea without having his whole worldview come crashing down as a result or admit that the best he can do is to act based on intuition rather than pure reason or to instead go batshit insane and give in to some sort of Pascal’s mugging. - XiXiDu (talk) 15:47, 26 February 2013 (UTC)
 * wedrifid is a bit of an asshole but I agree with him, all of it is entirely attributable to EY. My mental model: it's a weird combination of scamming and play-pretending to be saviour of the world, and also maybe internal crazy stuff that is not to be discussed with outsiders. When play-pretending, you have to tell people that you are hiding a terrible secret, otherwise you're playing in that secret by yourself alone. edit: also if you look at Yudkowsky's old writings about himself, he claims he did so well on the math SAT by not second-guessing himself. There's another more recent post where he notices he's rationalizing by a feeling he has. It looks like he believed that whatever feelings he gets are pure reason or as close as it gets. In any case if the SAT remark is at all true then he literally doesn't know why he chooses a specific answer on a math test, which is very very odd, and it'd be hard to tell what he's thinking if he's that odd. Dmytry (talk) 16:39, 26 February 2013 (UTC)


 * You have to worry that so many people are letting EY do their thinking for them. His qualifications for the job are that he keeps saying how smart he is. End of. Peterdjones (talk) 11:56, 28 February 2013 (UTC)

The problem will be solved because
... Cats get involved.

Or hamsters. (Do not meddle with hamsters. Just don't. It's really not worth it.)

Someone needs the battery for something banal.

A 15 year old trying to impress their intended partner/asking paradoxes/setting a problem that requires solving by a very slow iterative process.

The sentient computer is diverted to 'eliminating garbage TV programs.'

Any other possible solutions? 171.33.222.26 (talk) 16:54, 26 February 2013 (UTC)

My approach to probability of this sort of thing
Each one conditional on all of the above:
 * Superintelligence AI (SAI) ever: 80% if I include things like software that solves math problems really well, but probably <0.1% for various overly specific notions of AI becoming the only SAI or the dominant SAI. (many alternatives, therefore low probability)
 * SAI by direct engineering of fairly anthropomorphic intelligence, à la the vision of AI from the fifties: epsilon (conditional on the above). edit: note, I deem fairly anthropomorphic most "it's totally not anthropomorphic" AIs.
 * SAI being an approximate expected utility maximizer in the Von Neumann-Morgenstern theorem sense when you make it play betting games: 99% (akin to certainty that it can calculate 2*2=4)
 * SAI being a maximizer of some real world defined quantity such as number of paperclips or people's extrapolated will, conditional on all the above: epsilon (real world goals seem very difficult or impossible, unnecessary, dangerous, and people's extrapolated will is just a plot device for soft scifi).
 * SAI working like a variety of TDT: TDT is ill specified and SAI has to be well defined, so this is outside my wp:sample space, effectively 0 . edit: e.g. you could likewise ask if AI works like a variety of STPTDF theory without specifying what STPTDF theory is in sufficient detail (not to be confused with verbosity) as to make the question meaningful.
 * UDT working out to torture: If I have no understanding where it works out to actual torture, the torture is outside wp:sample space, effectively 0. (statistically independent of the above)

Note that while one could point out that I shouldn't assign probability of 0 to events, the issue is that sample space in humans can not include everything that can possibly happen, and thus I can not avoid effectively assigning probability of 0 to almost everything (I can have such events in a big amorphous pile called "something I can't predict happens", or take things out of that pile). The subject of revising the sample space is not very well defined or understood in general. I don't consider those things possible in the same way how I don't consider possible things I or someone trusted didn't think about at all. Effective probabilities of 0 are entirely unavoidable. edit: revised explanation Dmytry (talk) 18:09, 26 February 2013 (UTC)

Ok, here's the most coherent "normal person" orientated explanation of timeless stuff I can muster.
The timeless decision theory AI god behaves as if it was deciding how to behave at the beginning of time. This is postulated to have a property of "reflective consistency"; in simple words, the AI god won't ever wish it was something else. (I've seen articles explaining that, might be part of that TDT paper)

The AI god that doesn't torture people for not helping it enough might think: Meh, this sucks, being constructed this late and all... I wish I was vengeful, like Yahweh, then people would have feared me before I was activated, and would have had a cult of me and would have built me sooner. And the timeless decision theory can't have such regrets (these guys work in reverse manner: they postulate properties like this absence of regrets over decisions that could have been made earlier, for sake of 1-boxing in Newcomb's paradox, and then try to come up with some math that would work like this).

This of course is fairly silly and doesn't even define behaviour. Suppose you did all the right spells and incantations but you didn't donate any money. The AI god then could just as well think: geez, I wish I didn't have to waste my computing power on torturing this idiot, computing power is expensive and I derive no pleasure from torture.

I'm not quite sure if Basilisk sufferers fear torture, or fear becoming compelled to give away all their money if they don't stop thinking about it (and not thinking about a pink elephant is doomed to fail).

If the principles contain a self contradiction, you can rationalize anything. Also, this sort of decision making runs into big problems with reality. The current state of the world is dependent on outcome of many computations in the past, and you can't assume that those computations could have outputted something else without contradicting known reality. Same problem as with Terminator's movie plot assuming single timeline. This may be why they had many worlds involved in this.

By the way, this part where AI god would expect some profit from being vengeful clearly didn't work, just look how much reputation the supposed builders of this AI god lost because of supposed vengefulness.

So what do you think? Feel free to incorporate bits of this into main article. Also, it seems to me that 'god' is the best way to summarize the essential properties of the AI: practically omnipotent, omniscient, and singular (not a society of AIs). Dmytry (talk) 07:29, 27 February 2013 (UTC)


 * The worst part would be finding quotable citations for each part of that description. (Also, this article is too long already.) Hmm - David Gerard (talk) 08:31, 27 February 2013 (UTC)

Future events causing past events
This is not a transhumanist idea. In physics, time is symmetrical, so if determinism is true, it follows that future events cause (or in some way fix) past events and vice versa. Nebuchadnezzar (talk) 19:00, 28 February 2013 (UTC)
 * I would rather say that it follows that everyday notion of causality arises out of how thermodynamics works, coming from the asymmetry of the initial state and the resulting asymmetry of macroscopic laws. Dmytry (talk) 07:22, 1 March 2013 (UTC)
 * I think the key phrase there is "if determinism is true."  07:24, 1 March 2013 (UTC)
 * The key is getting a coherent definition of "cause". If time is symmetrical by physical definition and determinism is true, then sure, future events "cause" past events. But considering that no matter what the reality is, we still experience time in a one directional manner, it would be a very moot and unimpressive type of "cause". Ad hominem 13:51, 1 March 2013 (UTC)
 * My point wasn't so much to argue in favor of this view, just point out that the article misleadingly implies that this idea was just invented by transhumanists. It's simply an implication of determinism. Nebuchadnezzar (talk) 16:49, 1 March 2013 (UTC)
 * You're reading sense into nonsense. Their nonsense would work across space-like distances too, or across decohered parts of the wavefunction, or the like. Whereas the time-symmetric "causal" chains do not work across space-like distances. Also, there's only causality in the everyday sense on macro scale when the entropy asymmetry is involved. Dmytry (talk) 06:51, 2 March 2013 (UTC)