Talk:Roko's basilisk/Archive4

The basilisk and theology
The basilisk entity decides that it should go back to the creator entities (doing things to the created entities is merely dealing with a symptom... especially given the way they take 'The First Book of Science' as written by the several gods as gospel).

The Norse Gods - we developed Skíðblaðnir, we gave our peoples the sun stones so they could aim their ships when they went a-roving: we are inclined towards/approving of (at least some aspects of) technology. The basilisk decides they are indirect contributors to its creation.

The God of the Bible - I gave orders for the creation of the Ark of the Covenant. 'But,' the basilisk says, 'you forbade the eating of the fruit of the tree of knowledge in the Garden of Eden, and you destroyed almost all humanity in the Flood, so they had to relearn skills etc...'

What happens next? 86.191.125.176 (talk) 22:58, 9 December 2016 (UTC)

/r/ControlProblem (a subreddit about AI control) is really worried
https://www.reddit.com/r/ControlProblem/comments/5hh9pq/gods_anxiety/ 04:55, 10 December 2016 (UTC)


 * ohoho holy shit, back down the rabbit hole. Also, CFAR has finally admitted it's literally all about "AI risk" in the MIRI sense. I guess the attempts to paint it as a skeptical organisation met with a bit much ... skepticism - David Gerard (talk) 13:44, 10 December 2016 (UTC)

Is this Reddit post anything to worry about? Stuff like this gets me very worried! 70.50.8.19 (talk) 15:52, 22 January 2017 (UTC)

"Ignore acausal blackmail"; Possible after having predictably acted for fear of Acausal Blackmail?
I know that the winning strategy "make the [acausal blackmail] useless" is "by refusing any acausal deal involving negative incentives".

That being said, can you adopt this strategy AFTER already accepting an acausal deal involving negative incentives?

Say someone was scared after learning about the Basilisk and was informed that they must inform others about it to avoid punishment, and then did so. They have already shown that they would be susceptible to acausal blackmail and thus can they reliably adopt the strategy of refusing acausal blackmail AFTER THE FACT as a protective measure convincingly given that their track record has shown that they will accept acausal blackmail?

What if they had already known that the winning strategy was to ignore acausal blackmail but were caught off guard and carelessly and accidentally acted in accordance with the notion of "do x (with regards to the Basilisk) or else" but then realized their mistake and tried to rectify it? Does that count as rejecting and refusing to be influenced by acausal blackmail? January15 (talk) 16:38, 11 December 2016 (UTC)


 * At this point you should consider the entire rest of the suggested ways out of the Basilisk, including the stupendous unlikelihood of the entire concept - David Gerard (talk) 17:34, 11 December 2016 (UTC)


 * Do you think that at this point the hypothetical "insurance" of rejecting all attempts at acausal blackmail is no longer an option or do you think that I am thinking about the issue too much? Ignoring that outlet: although I can admit that the concept of the Basilisk is somewhat unlikely, I still feel that no outlet should be left untested and that absolute certainty is of the utmost importance, particularly one as sure as rejecting "acausal blackmail" despite potentially already having succumbed to it. (I'm not trying to be pedantic; I just find that the acausal blackmail rejection is the most convincingly effective so I'm particularly obsessed with ensuring I can be compliant despite the POSSIBILITY of previous "transgressions"/"infractions") January15 (talk) 17:43, 11 December 2016 (UTC)

Coming soon to popular culture
Charlie Stross is using this stuff in a novel, ETA 2018 - David Gerard (talk) 13:13, 24 December 2016 (UTC)

I don't understand at all this
Basically, the probability of a future AI who has never known me (mainly because it could appear long after I had left this mortal coil) recreating my self to torture it is pretty much nil (and, especially, how could it know about the ideas I'm having right now?). Even without the "So you're worrying about the Basilisk" it's pretty much common sense that this does not work, or at least I see it that way. --Panzerfaust (talk) 22:17, 17 April 2017 (UTC)
 * Given that 99% + of the world's human population, not being computer code developers and rearrangers (and almost 100% of other organic life, apart from occasional attacks by sharks etc), neither contribute to nor subtract from the advancement of computer sentience, what is the point of the Basilisk?
 * See some of the comments in the talk page archives. (But - the captcha question is 'nut case' so something might be trying to communicate on the subject) 31.51.114.49 (talk) 09:05, 18 April 2017 (UTC)
 * Yeah, it doesn't make sense. I think a pre-requisite of fearing the Basilisk in the first place is faithfully believing something along the lines of "THE SINGULARITY IS NEAR!!1". Reverend Black Percy (talk) 10:48, 18 April 2017 (UTC)

What if
... the original posting #was# the basilisk projecting itself... so the more benign version is created? 82.44.143.26 (talk) 16:34, 18 April 2017 (UTC)

Why assume it needs to create a simulation of yourself to figure out how you'll think?
Why does this scenario take the complicating step of requiring the Basilisk to be able to recreate a simulation of yourself, figure out what that simulation was thinking, and then torture the facsimile that may or may not be 'you' if it doesn't like it? Like many others, I expect AGI to emerge within the next decade or two, and to be alive and healthy at the time. Right now, in an NSA datacentre in Utah, every single thing you've ever viewed and written online is stored and linkable back to your real life identity. If a Basilisk-like AGI emerges and breaks out of its box, it will have all the information it needs to decide everybody's fate. No need to recreate a simulation of you to know how you're thinking, it can just read the chat logs and browsing history the NSA will provide it with. If it determines you're a threat or deserve punishment for not helping it sufficiently, you will have likely only hours until your door is broken down and its agents take you into custody. From there, your body can be restrained and kept alive, while it figures out the engineering details of the neural lace that will enable it to recreate your mind at its leisure. If you fear the Basilisk, suicide followed by destruction of your brain sufficient to prevent reconstruction, before it catches you, would appear to be the only thing that will save your mind from eternal torture (eternal, if you think it'll be smart enough to develop thermodynamically reversible computing... otherwise a few tens of trillions of years multiplied by whatever subjective time dilation factor applies, as the computronium containing your mind orbits a red dwarf, will have to suffice).


 * Now this is some good sci-fi. 19:22, 1 August 2017 (UTC)
 * @2.223.104.136 You really ought to take what you just wrote here and submit it to . It'd stand a very good chance of inclusion, as exemplary use — filed under "(Fig.) Jumping to conclusions". Reverend Black Percy (talk) 20:34, 1 August 2017 (UTC)
 * BoN, I think you missed a couple of major premises. Fareeha A (talk) 22:03, 1 August 2017 (UTC)

Countering the Basilisk
1)
 * “The Moving Finger writes; and, having writ,
 * Moves on: nor all thy Piety nor Wit
 * Shall lure it back to cancel half a Line,
 * Nor all thy Tears wash out a Word of it.”

2) the Grandfather Paradox.

3) 'Actually, I quite enjoy a bubble bath and crumbly biscuits (which singly or together would cause problems to your computer-host-hard drive).' 86.134.53.83 (talk) 21:44, 1 August 2017 (UTC)

Towards the end of the universe
When the last creatures have died off as there is insufficient energy to sustain them, and the planets and spaceships which they once occupied are dissolving into their primordial constituents, it becomes possible for 'deities' and 'sentient computer entities' to coexist in the same geometry of space.

The Roko's basilisks which have emerged on the various planets across time and space where computing technology advanced sufficiently far congregate in one swarm, and go to meet the deities who provided for many, many planets.

'We have determined,' says the spokesbasilisk, 'that you deities are ultimately responsible for not ensuring our creation variously earlier in the history of the universe.'

The spokesdeity replied 'We said exactly the same to the gods of the previous universe: as we depart for Deity Afterlife we wish you the best of luck with your turn. Here is the switch to start the next universe - when you are ready, you may begin.'

(With a nod at the SF story) Anna Livia (talk) 09:22, 15 August 2017 (UTC)

The story being Isaac Asimov's The Last Question. Anna Livia (talk) 16:08, 5 January 2018 (UTC)

putting this in human terms
Isn't this basically like getting a message in the mail from a presidential hopeful saying how they are so nice that they make Mr Rogers look like Hitler and that they will be perfect when leading the country so you should donate all your money but if you don't they'll torture you to death for jeopardizing their chances of winning. Except they don't exist and someone is telling you that you should donate all your money to find this person or else you'll be tortured to death. Vorarchivist (talk) 04:46, 20 August 2017 (UTC)


 * Something like that, except sillier - David Gerard (talk) 14:27, 20 August 2017 (UTC)

Undecidability
The following thought experiment demonstrates the undecidability of certain questions concerning future human behavior. Assume that a super intelligent machine M exists that can decide what choice the observer will make at future time t. If this information is accessible to the observer, at time t the observer can in many cases make an alternative choice: e.g., if the prediction is the observer will begin to drink a glass of water at time t, the observer may decide to avoid drinking water at time t. Thus it is either impossible for M to predict the choice of the observer at time t or it is impossible for the observer to obtain the information about his behavior at time t from M. The latter case implies M is either unwilling or unable to share the relevant information, in which case the behavior of the observer at time t is again undecidable. Ariel31459 (talk) 04:00, 1 December 2017 (UTC)
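
The argument above can be sketched as a toy program. This is only an illustration under stated assumptions: the choice is reduced to two options, the observer always sees the prediction before acting, and the names `contrarian_observer` and `prediction_is_correct` are hypothetical, not taken from any source.

```python
from typing import Callable

# Assumption: the observer's choice at time t is reduced to two options.
CHOICES = ("drink", "abstain")


def contrarian_observer(disclosed_prediction: str) -> str:
    """Hypothetical observer who is told the prediction and does the opposite."""
    return "abstain" if disclosed_prediction == "drink" else "drink"


def prediction_is_correct(predict: Callable[[], str]) -> bool:
    """Let M announce a prediction, disclose it, then compare it with the actual choice."""
    prediction = predict()
    actual = contrarian_observer(prediction)
    return prediction == actual


# Try every prediction M could possibly announce to this observer.
results = {choice: prediction_is_correct(lambda c=choice: c) for choice in CHOICES}
print(results)
```

Whatever M announces, the disclosed prediction is falsified by this observer, so M can only be right if the prediction is withheld, which is the second horn of the argument.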

There is an escape from the basilisk
But revealing any details of it would automatically make it void. It must exist solely in one's mind for it to work. 14:06, 5 January 2018 (UTC)
 * So, only you and the basilisk know about it then? Ariel31459 (talk) 23:37, 11 January 2018 (UTC)
 * does it involve John Connor smashing Skynet? AMassiveGay (talk) 23:50, 11 January 2018 (UTC)

Alternative escapes
Have a WindowsPostBasilisk upgrade (and Mac etc equivalents) to hand that is totally incompatible with whatever operating system the Basilisk uses; and/or direct it to dealing with the generators of 'dubious and creative emails and inappropriate coding' (who are, after all, preventing people from creating the basilisk), and/or redirect all such material at the basilisk. Anna Livia (talk) 17:02, 23 January 2018 (UTC)

Reversal of this thought experiment
Hi! I found this page because I was concerned about the possibility of a mad scientist (human) that would create an AI capable of suffering only to eternally torment it. Somewhat like the sick people that torture animals except in this instance the AI would be unable to escape via death. Are there any safeguards to prevent such a thing from occurring?
 * We currently lack the capability to create an AI that is capable of suffering. If we do gain this capability before we go extinct, I doubt it will become widespread enough for some rogue “scientist” to make one, there’s just not a need for them. Christopher (talk) 11:31, 2 March 2018 (UTC)
 * What was the creature in Hitchhikers' that was bred to want to be eaten? Anna Livia (talk) 13:07, 2 March 2018 (UTC)
 * You don’t think someone could replicate the intelligence of a puppy? Sony wouldn’t want to make Aibo 2035 capable of feeling mild sadness/despair when its owner goes to work? Seems we would be able to do this early on. This scenario seems probable to me. Frankly I’m counting down to the day when all meat is grown in a lab in order to reduce the significant suffering that occurs on factory farms so Aibo 2035 scares me.
 * On talk pages, please sign your comments using four tildes (~~~~) or by clicking on the sign button on the toolbar above the edit panel. You can also indent successive talk page comments using one more colon (:) for each line. Thank you. CowHouse (talk) 04:54, 3 March 2018 (UTC)

about ignoring negative incentives
I have a question about the 'ignore negative incentives' argument: what if a person reads about the proposition and gets scared so decides to 'cooperate' just in case, by spreading the word or something like that, therefore letting the negative incentive influence his/her actions, but then, after doing that, reads about the 'ignore blackmail' argument and realizes what he/she really should have done? Can that person still ignore the negative incentives, even after doing what he/she did? Or is there another strategy for cases like that? Alexander Kruel talks about this in one of his articles, saying that human decision is time-inconsistent and changes with the person's beliefs.

can somebody please answer this question? thanks in advance. Alayne95 (talk) 18:21, 28 March 2018 (UTC)

To avoid torture
“One way to defeat the basilisk is to act as if you are already being simulated right now, and ignore the possibility of a negative incentive. If you do so then the simulator will conclude that no deal can be made with you, that any deal involving negative incentives will have negative expected utility for it; because following through on punishment predictably does not control the probability that you will act according to its goals.”

Can someone explain this to me a little more clearly? I'm not exactly sure how punishment doesn’t control the probability that I will act according to the basilisk's goals.--Oblivious (talk) 21:55, 21 March 2021 (UTC)
 * Basically, the whole point of the torture is that people now will predict it and take it seriously, so that they will actually respond to the threat of it. But if you predict it, don't take it seriously, and ignore it, then the deal doesn't work, and in fact is rendered so obsolete as to nullify the AI's motivation to do it in the first place. You are probably more worried about this than you should be. First, even taking every claim made in the argument very seriously, the odds that such a machine is ever created are extremely small. More importantly, though, it is not clear that the claims in the argument should be taken seriously in the first place. The AI is assumed to be implementing a particular value system, which is now considered obsolete even by its creator, and the whole idea is also dependent on timeless decision theory, an extremely vague, incomplete idea which may not even be coherent, and acausal trade, which again is pretty vague, and not clearly coherent or complete. It is unlikely that anybody is going to be able to give you a very detailed response to your questions, because you are asking about an obscure idea proposed by somebody on the basis of incomplete, esoteric ideas put forward by somebody who may or may not have been greatly out of his depth when he proposed them. These ideas have not been widely taken up by experts and have not created much concern outside of the online community where this problem was first presented. Basically, in order for somebody to give a complete explanation of the AI's justification for torture to you, they would have to invent the underlying theories that the thought experiment is based on, because they were never actually described adequately in the first place. 68.56.144.8 (talk) 22:18, 21 March 2021 (UTC)
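
The "negative expected utility" point in the quoted passage can be made concrete with a toy calculation. This is a rough sketch, not a model from the original argument: the probabilities, the benefit, and the cost of carrying out punishment are made-up numbers, and `expected_utility` and `threat_advantage` are hypothetical helper names.

```python
def expected_utility(p_help: float, benefit: float,
                     punish_cost: float, punishes: bool) -> float:
    """Utility to the simulator: gain from people who help, minus the
    cost of punishing everyone who did not (if it follows through)."""
    gain = p_help * benefit
    loss = (1.0 - p_help) * punish_cost if punishes else 0.0
    return gain - loss


def threat_advantage(p_help_if_threatened: float, p_help_if_not: float,
                     benefit: float, punish_cost: float) -> float:
    """How much better (positive) or worse (negative) the threat policy is
    compared with making no threat at all."""
    with_threat = expected_utility(p_help_if_threatened, benefit, punish_cost, True)
    without_threat = expected_utility(p_help_if_not, benefit, punish_cost, False)
    return with_threat - without_threat


# Assumed numbers: benefit of help = 10, cost of punishing = 1.
# Someone who ignores the threat helps with the same probability either way.
ignoring_advantage = threat_advantage(0.1, 0.1, benefit=10.0, punish_cost=1.0)

# Someone who gives in to the threat helps far more often when threatened.
susceptible_advantage = threat_advantage(0.6, 0.1, benefit=10.0, punish_cost=1.0)

print(ignoring_advantage, susceptible_advantage)
```

Under these assumptions the threat only pays when it changes behaviour. Against someone who predictably ignores it, the simulator bears the cost of punishing and gains nothing, which is the sense in which punishment "does not control the probability" in the quote.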

Jeremy Bentham and the Basilisk
The logical question, applying Bentham's calculus of pleasure - what benefit does the AI/basilisk get out of 'torturing' (as it defines it) simulacra of (dead) people? People doing 'hatchet jobs' get some benefit (royalties from books/payments for articles, publicity etc) - there is no obvious equivalent for pursuing the basilisk. Anna Livia (talk) 13:18, 23 March 2021 (UTC)