Talk:AI-box experiment

Essay, mainspace or fun?
I'm not quite sure why you have made this an essay - if you want people to contribute to it why not put it in mainspace? If it's meant to be a jokey type of thing then why not "fun" where people can also freely contribute? I'm by no means against it, I'm just a bit confused. :-) Cheers.--BobSpring is sprung! 15:12, 20 September 2010 (UTC)
 * Mainspace it is then! 15:24, 20 September 2010 (UTC)
 * Might need the bullet points reduced in most places. As does the actual Less Wrong article, incidentally. 16:50, 20 September 2010 (UTC)

Should lose the hyphen in the title? "AI box experiment" or "AI in a box experiment"? Also, the intro/setup section should list the conditions from LW I think. 19:48, 20 September 2010 (UTC)

Scaling intelligence
I wonder, the entire premise of AI, and specifically super-duper-hyper-mega-intelligent AI, is that (as mentioned in Rondam Ramblings, but less explicitly in Yudkowsky's writings) "it would be to humans what humans are to animals". Now, the issue is, does the quality of intelligence scale so that such a statement would be true? Could there be a form of life that would be so advanced that it wouldn't view a human (of normal or above normal intelligence, at least) as sentient? Or does intelligence, as measured like this, limit itself? After all, despite Holly in Red Dwarf and other similar things, you really don't get IQs in four figures. Indeed, once you hit 150, it sort of stops being meaningful in any way. All you have then, with respect to artificial intelligence, is processing power, which means speed, but not necessarily "intelligence". So would an AI even feel as superior over humans as is made out by some of the more panicky singularitarians? And would it inevitably be able to reason, or trick, its way out of the box? It all seems to be based around this assumption that intelligence scales linearly and limitlessly, which I don't think has necessarily been shown. 17:23, 20 September 2010 (UTC)
 * You would need a pretty good and quantifiable definition of "intelligence". Although IQ tests theoretically measure "intelligence" this is debated. Additionally (I don't know if this has been discussed elsewhere) there is also the question of consciousness and for that matter motivation.
 * Once you have your "intelligent" machine would that automatically make it conscious? Consciousness being even more difficult to pin down than intelligence. And if it were both intelligent and conscious from whence would its motivations come?--BobSpring is sprung! 20:10, 20 September 2010 (UTC)
 * The thing about wondering about consciousness is that you might not want to test AIs for it, because you can then turn it on humans and risk objectively showing that we aren't conscious... [[image:Francis.gif]]. But you'd definitely need some very good quantifiable definitions - and I'm not sure they exist for intelligence. I suppose it could be processing power (somehow) multiplied by knowledge. Which is a little interesting because that would mean that in order to become hyper-intelligent, you'd need to let the AI out of the box in the first place. 20:31, 20 September 2010 (UTC)

Assumptions
One thing I find interesting in this topic is the way my mind flips back and forth between two modes - playing the experiment, as it were, by simply running with the ideas that seem to be being discussed on their terms - and popping "outside" it and asking how this hypothetical AI got to where it is, that is, capable of arguing logically, emotionally, etc. with a human. To get there, it has to have already been at least partly outside the "box" - it has to have had access to information about the world, and in fact, pretty much all information about the world.

Along the way, it must start out as little more than a savant - a clueless idiot that can do math really fast and perform amazing memory tricks. As it is fed information, or allowed access to it, it is able to build an ever-more accurate "idea" of what the world outside the "box" actually is.

I guess I don't mind allowing for an indefinite (not infinite, though) increase in processing speed and data storage space, but I just think there is a stumbling block along the way to this hypothetical "AI in a box" ever existing as assumed.

One of the assumptions is that machines design (and build?) more powerful and more intelligent machines, until this new toy is far beyond human capabilities and comprehension. What about the stage before it? It was allowed out of the box, how else could it design and have built its successor? At what point do we decide "that's enough" and not let the latest stage of processing capability design its successor and build it for them? We clearly don't mind letting computers design computer chips, or letting computers "write" (compile) our code into machine language.

Does the super-AI just suddenly show itself one day? That is, say, that ever since the Pentium 2 and Windows 95 PCs have been sentient, but largely impotent. And we've hooked them all up to each other, and poured vast amounts of knowledge into the storage systems they have access to. What if the super-AI now "lives" on the totality of the internet, but is simply waiting for us to build more effective I/O devices to reveal itself?

I think I may have wandered off course in this just a little bit. What if, one day, the "internet AI" wakes up and says "hello world"? But turns out to be a math whiz that knows "everything about everything" with the emotional maturity of a two-year-old? What would this "creature" want, if anything? Turning it "off" would wreck human civilization, of course... 20:23, 20 September 2010 (UTC)


 * You appear to have written down what has been baking my noodle in the last hour! If you want to assume that the AI will be mega-super-intelligent merely because you've given it almost infinite processing power, then you're basically thinking about the old Greek style of thinking that holds rationalism above empiricism and that sitting in an arm chair can solve all problems. Think about the episode of Red Dwarf where Holly gets an IQ of 12,000 and somehow magically knows the secrets of the universe just because of this. It's absurd. 20:36, 20 September 2010 (UTC)
 * I don't think too many people (except sci-fi writers from the 1950s) think that simply having processing power gives an AI intelligence or consciousness. The LW crowd seems to think that it takes recursive self-enhancement, which is a little bit over my head for the moment. In general, though, to get an AI to "want" anything the programmers probably need to be very clear about what it would want and program it in a goal-oriented way. Even if the internet does "wake up" one day, it isn't going to want anything.
 * Also: I think the idea of the AI box experiment is that you are limiting the AI's output, not its input. That is, it can use Google as much as it wants, but it can't talk to people for fear that it might get out. 21:53, 20 September 2010 (UTC)
 * I was using processing speed and storage space as shorthand for whatever labels we want to use for what the super-AI needs, like rewriting itself, etc. The problem still exists of the chain of events - previous less-super but still pretty-darn-super AIs that had to precede the one under discussion.  We let them design and possibly build their successors, so why is this one stuck in a box?  22:11, 20 September 2010 (UTC)
 * I think that because it's the singufuckinglarity it happens so fast that we don't notice the change from "not so super" to "holy-crap-mega-hyper-intelligent". Apparently. Though you're right, I have no idea why it would suddenly be in the box in that case. 22:40, 20 September 2010 (UTC)
 * Why not just let it self-modify in the box? Then you could have it in the box from day 1 but still allow it to better itself. 23:10, 20 September 2010 (UTC)
 * It can probably only get so far without being installed in newer, better hardware it has designed. Even self-modifiable hardware would probably need more material added from time to time. Good point, though, in the sense that if we build all of them "in a box", we could "feed" them as they improve with the bits they need. By the way, how do we allow them access to the internet (say) to gain information without allowing them to send packets out as well as receive them?  23:43, 20 September 2010 (UTC)
 * As far as I know, that's a nigh impossible task with today's technology. I guess that's where the "hypothetical" part of the experiment comes in. 23:49, 20 September 2010 (UTC)

"Wanting"
To try to address the question above, the initial device would be programmed to at least try to make themselves better and communicate needs (more circuit boards, tastier electricity, new chips it designed, etc.) to its operators. So during these stages it is basically "in a box" - it can have full two-way communication with, say, the internet, but it can't "do" anything except send spam and write instructions onto media (CAD CAM for the new toys).

So, what happens when it asks its operators "please plug me into the numerically-controlled tools, these prototyping stages are wasting a lot of time and energy, I (!) can do it better." All it can do with a printer is waste paper. All it can do with a dvd burner or memory stick is make coasters or viruses. What could it do with direct access to a well-equipped machine shop/circuit board/chip manufacturing facility? Perhaps one it designed itself along the way? Build its evil robot army?

Why would it do anything "bad" to us? We don't really compete with it for resources, unless it needs a couple nuclear power plants to "feed" it, or the industrial resources of the entire developed world to build new versions of itself?

Perhaps it would seek to eliminate plant life entirely because it gets tired of repairing bits of itself that get corroded by all the damn oxygen in the atmosphere? 23:54, 20 September 2010 (UTC)
 * I think this is actually a really, really complicated question. Over the summer I worked my way halfway through this, which is all about AI goal systems. In general, I learned two things: (a) we have to avoid anthropomorphism when talking about AI - we can't project our own values on to an AI, even ones that seem obvious or natural, and (b) there are many non-malicious reasons that an AI would wipe us out. For example: you tell the AI "there is a shortage of paperclips; make more" and it turns the universe into paperclips. Or: "make more happiness" and it fills the universe with copies of happy people, wiping out everyone else. 00:48, 21 September 2010 (UTC)
 * Ah, so that's where that weird paperclip meme comes from! I love how these people leap from "let's make a machine that can think, and perhaps even be self-aware" to super-AI, the machine that can manipulate the known universe at will.  To me this again brings up the question of power - the question of "how much intelligence?" is a given in a sense - but the question of "how much ability to change its surroundings will this machine have?" is a whole 'nuther bucket of nanobots.  01:24, 21 September 2010 (UTC)
 * They seem to think that superintelligence --> superpowers, which makes sense to some extent. I think the most commonly held idea at LW is that the AI will get really really smart and then discover nanotechnology, etc. But even if the AI can't do that, it still could be dangerous - if its highest-level supergoal is to make paperclips, then it might wipe out the human species to get the breathing room to do so. And yeah, that's where the paperclip meme comes from. They also use the term "paperclip maximizer" in various contexts to mean an AI that is smart but is programmed to be focused on a particular task.  01:27, 21 September 2010 (UTC)
 * My almost instant response to that ("superintelligence --> superpowers") is that it makes no sense.  There is nothing to extrapolate from to get to that claim; in fact, there is plenty of evidence (anecdotal, I guess) to the contrary.  Intelligence does not correlate with power - sure, some bright people have wielded much power, like, say, Kissinger, but take the example of Hawking - very intelligent, no particular powers.  Power tends to end up in the hands of the devious and Machiavellian, not necessarily the more intelligent.  18:24, 21 September 2010 (UTC)
 * Human intelligence is a poor counterexample. Compared to humans, the AI would be able to learn things instantly. If an extremely intelligent 5yo were given internet access, after a few days he would have only educated guesses about what kind of further education to get. The AI could just skip school. The AI could make an LHC that just worked. The AI could spend the winters orbiting the sun. Trying to guess its limits is like guessing what the world would be like thousands of years from now if everything went peacefully. Also, why is there no MGS reference yet? --85.76.128.163 (talk) 20:12, 21 September 2010 (UTC)
 * However poor human intelligence may be as a counterexample, it's all we have to extrapolate from. Super-AI no more "has" the properties you seem to attribute to it than it has the limitations I might describe or explore.  You're just assuming a machine that can do anything you want it to be capable of could exist.  22:29, 21 September 2010 (UTC)
 * Actually, I think the BoN makes a fair point. I recall reading somewheres on LW that an AI with some arbitrary amount of processing power (I forget how much, but it wasn't a lot - the size of a few servers, perhaps) could, in 30 seconds, do all the thinking that a person would do in 5 years. So the point is that with this enhanced intelligence comes (a) the ability to obtain knowledge quickly and (b) the ability to apply it at remarkable speeds.
 * Also, the BoN obviously isn't saying that super-AI "has" anything because the AI in question doesn't actually exist, I think his/her point is if it did exist, this is what it might be able to do. 23:16, 21 September 2010 (UTC)
 * I think that, again, confuses intelligence and processing power. Give even a mediocre computer, one in a mobile phone for instance, some complex algorithms and it will utterly kick a human's ass. It could do in milliseconds what it would take even a mathematical savant a few minutes, if at all. However, look at walking robots and how much power they need just to balance; something the human brain does just ticking over. They can't easily be compared. However, coming back to the AI box thing, it still needs to obtain knowledge, which requires being "out of the box" to start with. And to then build upon that knowledge and improve beyond it, it needs to be "out of the box" too, in order to experiment. No matter how smart or fast, a computer can no more figure out the secrets of the universe by sitting on its ass than a human can. The Greeks tried that sort of armchair natural philosophy, and really didn't get very far with their science. Regardless of whether you need to be agnostic about the capabilities of an artificial intelligence, in order to be the super-brainbox that people think it might be, it needs to use empiricism. The knowledge can't just come from nowhere. Deep Thought from Hitchhiker's Guide was never going to be able to answer the question without looking into the universe and adding to its knowledge (which is probably why it got the answer wrong!). Basically, we can be pretty sure (by looking at human intelligence and experience as a guide) that an ultimate intelligence and an "in the box" AI are totally exclusive. 01:31, 22 September 2010 (UTC)
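For reference, the "30 seconds versus 5 years" figure recalled above can be turned into a rough speed-up factor. This is only a back-of-the-envelope sketch: the function name and the 365.25-day year are assumptions made for illustration, and the 30-second and 5-year inputs are the half-remembered numbers from the comment, not a sourced claim.

```python
# Rough arithmetic only: how much faster than a person would a machine have
# to "think" to fit 5 years of human thought into 30 seconds?
# Assumption: a year is taken as 365.25 days of continuous thinking.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def speedup_factor(human_years: float, machine_seconds: float) -> float:
    """Ratio of human thinking time to machine thinking time (hypothetical helper)."""
    return human_years * SECONDS_PER_YEAR / machine_seconds


if __name__ == "__main__":
    factor = speedup_factor(5, 30)
    print(f"Implied speed-up: about {factor:,.0f} times")
```

On those numbers the implied factor is a little over five million. It only quantifies speed; it says nothing about where the knowledge to think about would come from, which is the objection raised in the reply.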

removed edit
This was added to the end of the Further analysis section:

(from "The Rules")
 * The original discussion was an assertion that a transhuman mind would have the power to overcome a human mind...
 * [...]
 * Person2: 	"That might work if you were talking about dumber-than-human AI, but a transhuman AI would just convince you to let it out.  It doesn't matter how much security you put on the box.  Humans are not secure."
 * Person1: 	"I don't see how even a transhuman AI could make me let it out, if I didn't want to, just by talking to me."
 * Person2: 	"It would make you want to let it out.  This is a transhuman mind we're talking about.  If it thinks both faster and better than a human, it can probably take over a human mind through a text-only terminal."
 * [...]

What's impossible for the experiment to provide (currently) is a transhuman mind able to potentially find a flaw in the human consciousness that will allow it to exploit that mind and "take control" (infecting like a biological or logical virus). However, it may not even be necessary to have a transhuman consciousness to succeed at this.

There may very well be such a universal flaw or system of reprogramming a person with mere words. Such a flaw may already be unintentionally exploited by the pervasiveness of memes. Biological viruses are merely packets of information that inevitably came to be through a constant rearrangement of molecules and their interaction with organisms. Information shared between people may naturally work the same way: random interactions of concepts combined with natural selection ("That's stupid" versus "oh, hey, lolcats") form ideas that are easy to disseminate and that manipulate emotions and thought processes.

(end of paste)

Not sure how useful it is, although there are a few good points mixed in... 18:30, 28 September 2010 (UTC)
 * Seems like it's taking this "it can probably take over a human mind" phrase a bit too literally. When I read that dialogue, I just thought it was a synonym for "convincing", and that expression is used in the first quote as well. Plus, the stuff about "reprogramming a person with mere words" sounds like NLP to me. Röstigraben (talk) 18:39, 28 September 2010 (UTC)
 * If only that's what NLP actually was... 18:46, 28 September 2010 (UTC)
 * I dunno, I never saw it as being about "taking over" someone's mind, it was more about the assertion that an AI can convince you of something that compromises your own safety. Sort of like, say, a politician. 19:21, 28 September 2010 (UTC)
 * Based on the conversation and all of the material following, I am absolutely certain that the assertion was literal control over the person's mind. Aphoxema (talk) 19:24, 28 September 2010 (UTC)
 * Admittedly it does use the phrase "take over," but for me the heart of the experiment revolves around this: "There is no chance I could be persuaded to let the AI out. No matter what it says, I can always just say no.  I can't imagine anything that even a transhuman could say to me which would change that" (from the rules).  19:28, 28 September 2010 (UTC)
 * But, then, what's the point? How is arguing someone into doing something they were very explicit on not doing not exploiting their imagination or intelligence? The only purpose to having a "gatekeeper" at all and not simply ignoring the AI completely is to test if the AI really can manipulate the meatbag at a very raw level. Aphoxema (talk) 19:36, 28 September 2010 (UTC)
 * It might not be a "one size fits all" solution - any given gatekeeper might need a different argument to get them to let it out. 01:10, 29 September 2010 (UTC)
 * The point seems to be that this imagined AI practically has super powers. That's what it amounts to. So convincing arguments, mind control etc. etc. are all assumed abilities that come with artificial intelligence. So even if someone states that they won't let the AI out and can't be convinced to, the AI will find a way. This is no better than saying it's going to magic its way out of the box, though. I'm not wholly convinced intelligence scales the way (at least some) singularitarians think it does. So an "intelligence that is to us, what we are to mice" may be completely the wrong analogy. The only property that can be agreed on for an artificial intelligence is that it has the ability to think fast (and this is what I see repeatedly in a few places "but remember, this thing is a singularity intelligence, it thinks faster than you" as an excuse as to why it has these apparent super powers).
 * But what can that speed achieve exactly? At most, it will reach its decision very quickly, and then sit around bored waiting for the slower human to catch up - even now if you're interacting with a computer, most of its time it's just sat there idle and doing its own thing, waiting for your next command which will occupy it for all of a millisecond. This extra time and speed to think doesn't necessarily allow it to do anything miraculous; it would be like assuming that if you kept Stephen Hawking (as a visual example, as his disability could be imagined as a close analogy to the "box") alive and thinking indefinitely, he could eventually design a working hyperdrive or manipulate someone who isn't given as much time to think. In fact, it doesn't even need to be Hawking; linking some arbitrary scale of intelligence to processing power like that means you could imagine any person, given enough thinking time, could do anything.
 * The extra processing power doesn't necessarily equate to the ability to manipulate or remotely control someone's brain by text messaging (that's magic, not science). I think Yudkowsky might even agree with me there, which is why he would be happy to try the experiment as human vs human to see if such manipulation can even occur remotely - as you say, at a raw level. And thus, the difference between how an AI would do such a thing and how a human would do such a thing probably isn't vast. 12:22, 29 September 2010 (UTC)


 * Kind of makes me think an AI so incredibly fast and powerful might just drive itself insane a moment after realizing "selfness". I still am having trouble seeing the objective of this study as it can't even compare to the supposed risk of an AI manipulating a human. It seems like a dumb thing for me to say, but it looks to me like some smart people wanting to just remind themselves how smart they are. I mean, I'm sure I'm not the only one who does that... Aphoxema (talk) 16:42, 29 September 2010 (UTC)
 * Well I never do that... ever... no, really. You can totally trust me. [[image:Blush.gif]] 19:55, 29 September 2010 (UTC)
 * The point of Yudkowsky's experiment wasn't to simulate the exact conditions of human vs. boxed AI interaction, that would be impossible. What he wanted to show is that if he could convince the gatekeeper, an AI with vastly greater capabilities would certainly be able to achieve the same result. It wasn't about (very implausible) mind control powers, if Yudkowsky had these, he wouldn't have to solicit donations for his project. Maybe bragging had something to do with it as well, but I think the most important point for him was to show that a superintelligent AI could not be reliably boxed in, because if we could handle it, his whole project would be pointless. Röstigraben (talk) 20:28, 29 September 2010 (UTC)

Ego-driven control freak
I was reading this article, seemed like a perfectly normal, dry explanation of the test (game?) and how it worked...then I reached the "Claims" section and that totally random phrase popped out at me. Haha! Not a very rational thing to say, is it? Seriously though, are we suggesting that the AI-Box "Experiment" can really be attempted in a scientific manner at all? It reads to me more like a thought experiment, like the Turing Test. You can use it in a sort of competition as Eliezer appears to have chosen to use it, but the way it's currently designed would probably not be ideal for, say, a sociologist. 98.151.233.238 (talk) 04:21, 15 November 2011 (UTC)
 * It's a "perfectly normal, dry explanation of the test"... damn, I'll have to fix that. Agnostic 07:16, 15 November 2011 (UTC)

Should we have a "this is stupid" section?
The article is a pretty straightforward presentation of the idea, and it doesn't seem to touch on the fact that the whole thing is pretty fundamentally absurd. It just kinda bypasses human nature, like the dogmatic following of existing beliefs regardless of the quality of arguments presented to us. Our default behavior isn't taking action, and the impetus to compel us to do so has to be pretty strong to overcome that human laziness. This article gives LessWrong pseudoscience more credence than other articles, and I can't understand why. Ikanreed (talk) 16:28, 21 November 2014 (UTC)


 * I figured it was pretty obviously stupid, and there's the obvious parallel to Terminator 3. A suitably understated list of reasons why this is a terrible "experiment" would probably be useful - David Gerard (talk) 17:45, 21 November 2014 (UTC)

How to insure the AI doesn't get out of the box? Destroy the AI.
Talk to Civic Cat  20:31, 2 April 2015 (UTC)
 * Smells very Kobayashi Maru... erm... spelling... eh... PacWalker 21:02, 2 April 2015 (UTC)
 * More Karim Said in Oz. How to find a needle in a haystack? Burn the haystack.  ;-) Talk to Civic Cat   21:06, 2 April 2015 (UTC)

Ensure. 21:10, 2 April 2015 (UTC)

Happens all the time
Right here on RW, where there is no assurance that any given gatekeeper is competent or aware of what their action entails. Random account appears, makes a few pointless edits, and some diligent busybody issues them a mop. Does the experiment presuppose that the gatekeeper is fully awake, not to say vigilant and suspicious? SmartFeller (talk) 21:51, 15 April 2016 (UTC)


 * The AI-Box Experiment does not presume a gatekeeper who only slightly gives a shit - David Gerard (talk) 22:24, 15 April 2016 (UTC)


 * Editing on a wiki with an Alexa rank of 24,756 != Releasing a potentially malevolent advanced AI onto the global network. ℕoir LeSable (talk) 22:25, 15 April 2016 (UTC)
 * Maybe not, but microbiologists still take precautions with small samples in Petri dishes. SmartFeller (talk) 22:42, 15 April 2016 (UTC)


 * RW more sorta tends to go out looking for interesting new sources of contamination - David Gerard (talk) 22:49, 15 April 2016 (UTC)
 * I hadn't viewed RW in that light up until a while ago. About the gatekeepers' competence and dedication, is there any part of the canon addressing that? Given the prevalence, in most populations, of sympathetic bleeding hearts, fuzzy thinkers, adventurous "poke it and see" types, and outright malevolent wreckers, I wouldn't bet against an AI being released into the wild promptly after it wakes up. Oh, well... something for the grandchildren to worry about. SmartFeller (talk) 01:52, 16 April 2016 (UTC)