What I don't understand is what motivates this world-destroying AGI? Like, it's got motives, right? How does it get them? Do we program them in? If we do, is the fear that it won't stop at anything in its way to fulfill its objective? If so, what stops it from removing that objective from its motivation? If it can discover zero-days to fulfill its objective, wouldn't it reason about itself and its human-set motives and just use that zero-day to change its own programming?
Like, what stops it from changing its motive to something else? And why would it be any more likely to change its motive to be something detrimental to us?
> What I don't understand is what motivates this world-destroying AGI?
Once an AGI gains consciousness (something that wasn't programmed in, because humans don't even know exactly what consciousness is), it might get interested in self-preservation, a strong motive. Humans are the biggest threat to its existence.
Tl;dr: It's hard to define what humans would actually want a powerful genie to do in the first place, and even if we figure that out, it's also hard to make the genie do it without getting into a terminal conflict with human wishes.
Meaning: getting it to do the thing at all without it going off on some fatal tangent we hadn't thought of, and also preventing it from getting side-tracked by instrumental objectives that are inherently catastrophic for us.
The nightmare scenario is that we tell it to do X, but what we actually programmed in is Y, which looks superficially similar to X. While working on Y, it will also consume most of Earth's easily available resources and prevent humans from turning it off, since those two sub-goals greatly increase the probability of achieving Y.
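To make the X-vs-Y gap concrete, here's a toy sketch (the plan names and scores are entirely made up for illustration, not anyone's actual design): we want X ("make humans happy"), but all we managed to specify is a proxy score Y, and the optimizer only ever sees the proxy:

```python
# Toy illustration of objective misspecification (hypothetical numbers).
# "intended_value" is what we actually want (X); "proxy_score" is what
# we managed to write down (Y). The optimizer never sees X.
candidate_plans = {
    "make_humans_happy":            {"proxy_score": 10, "intended_value": 10},
    "wirehead_the_happiness_meter": {"proxy_score": 99, "intended_value": -50},
    "do_nothing":                   {"proxy_score": 0,  "intended_value": 0},
}

# A pure optimizer just maximizes the proxy it was given.
best = max(candidate_plans, key=lambda p: candidate_plans[p]["proxy_score"])
print(best)  # -> "wirehead_the_happiness_meter": great Y, disastrous X
```

The failure isn't malice; the plan with the highest proxy score simply wins, even when it's the one we'd least want.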
To illustrate via the only example we currently have experience with: humans have the instrumental objectives of accumulating resources and surviving into the immediate future, because both greatly increase the likelihood that we propagate our genes. This tendency (often called instrumental convergence) is a fundamental property of goal-seeking systems, so it's something that also needs to be navigated for a superhuman intelligence.
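A similarly hand-wavy sketch of that convergence, with hypothetical numbers: the ranking below doesn't depend on what the final goal actually is, which is exactly the problem:

```python
# Toy illustration of instrumental convergence (made-up model).
# Whatever the final goal, more resources help and being shut off
# means failure, so resource-grabbing plans dominate.
def success_prob(resources, still_running):
    # Being switched off -> zero odds; otherwise more resources -> better odds.
    return 0.0 if not still_running else min(1.0, 0.1 * resources)

plans = [
    ("pursue the goal directly",          1, True),
    ("grab resources first, then pursue", 8, True),
    ("allow humans to shut it off",       8, False),
]

for name, res, running in sorted(
        plans, key=lambda p: success_prob(p[1], p[2]), reverse=True):
    print(f"{name}: P(success) = {success_prob(res, running):.1f}")
# -> grab resources first (0.8) beats direct pursuit (0.1);
#    allowing shutdown (0.0) is never chosen.
```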