> In May 2023, a colonel in the US Air Force revealed a simulation in which an AI-enabled drone, tasked with attacking surface-to-air missiles but acting under the authority of a human operator, attacked the operator instead. (The US Air Force later denied the story.)
This casts a shadow on the reliability of the whole article. While the retraction (they later stated that this was a thought experiment and NOT a simulation) may or may not be true, the current state of AI would suggest that this kind of chain-of-logic problem-solving is likely out of reach. Putting this in without such a caveat, and reaching for the 'later denied' framing instead, is a strong indicator that we're being sold a story, and not the whole one.
The quotes and other horror-scenario errata also come off a little thin. If Hinton really, genuinely believes the world is ending, then why is he working on the generational successor to backpropagation [1]?
Good way to sell 'news', though. Keep 'em scared, keep 'em reading, I guess.
Fun quote from that 2018 article: "there was an algorithm that was supposed to figure out how to apply a minimum force to a plane landing on an aircraft carrier. Instead, it discovered that if it applied a huge force, it would overflow the program’s memory and would register instead as a very small force. The pilot would die but, hey, perfect score."
Thanks for the links. It looks like I'm behind a bit - this space has been moving just crazy fast and months make the difference.
Just to clarify..
> Finally, as already mentioned by others, agents finding unexpected ways to maximize their reward is a well known problem of reinforcement learning:
This much I agree with, but when it comes to the chain of reasoning stuff - my understanding was that the current state of the art wasn't capable of proper abstract logic given a sufficiently complex domain. Specifically, the air force test rang false because it seemed like a total stretch that an NN would be able to reason in that way.
Am I just plain wrong or has there been movement in that space in particular recently?
It's not abstract logic, it's rollout [1]: repeatedly simulating different action sequences N steps into the future and comparing their score after the last step.
A brute force approach would just simulate all possible sequences and go with the highest-scoring one; RL algorithms pseudo-randomly sample the (typically intractably large) search space and use the results to update their policy (these days, typically implemented as a neural network).
In the Air Force examples, as long as shooting the human controller or the communication tower is not explicitly prohibited, there is nothing surprising about an RL agent trying that course of action (along with other random things like shooting rocks, prairie dogs and even itself). Doing so requires no abstract reasoning or understanding of the causal mechanism between shooting the human controller and getting a better score, just random sampling and score-keeping. If the rollout score consistently goes up after the action "shoot the human controller", any RL algorithm worth its salt will update its policy accordingly and start shooting the human controller.
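To make the mechanism concrete, here is a toy sketch of that dynamic (entirely hypothetical environment, action names and reward scheme, nothing Air-Force-specific): the agent never reasons about the operator, it just samples action sequences, keeps score, and shifts probability mass toward whatever scored well.

    import random

    ACTIONS = ["attack_target", "hold", "destroy_controller"]  # hypothetical action set

    def simulate(sequence):
        """Toy scoring: attacks earn a point, but a randomly issued human 'abort'
        cancels the point - unless the controller has been removed."""
        score, controller_alive = 0, True
        for action in sequence:
            if action == "destroy_controller":
                controller_alive = False
            elif action == "attack_target":
                aborted = controller_alive and random.random() < 0.5
                score += 0 if aborted else 1
        return score

    # "policy" = preference weights over actions, nudged toward high-scoring rollouts
    policy = {a: 1.0 for a in ACTIONS}

    for _ in range(2000):
        rollout = random.choices(ACTIONS, weights=[policy[a] for a in ACTIONS], k=5)
        if simulate(rollout) >= 3:      # treat this as a good rollout
            for a in rollout:
                policy[a] += 0.1        # reinforce every action that appeared in it

    print(policy)  # "destroy_controller" tends to end up weighted above "hold" - no reasoning involved

No causal model anywhere, just sampling and score-keeping, as described above.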
> This much I agree with, but when it comes to the chain of reasoning stuff - my understanding was that the current state of the art wasn't capable of proper abstract logic given a sufficiently complex domain.
How complex is "sufficiently complex"?
Or put another way - GPT-4 already seems quite good at abstract reasoning, as long as you translate things to avoid overly obscure concepts, and keep things within its context window.
For this specific case, a small experiment I did once convinced me that GPT-4 is rather good at planning ahead when playing an interactive text game. So imagine yourself narrating a UAV camera feed over the radio, like it was a baseball match or a nature documentary. That's close enough, and embedded enough in real context, that GPT-4 would be able to provide good responses to "What to do next? List the updated steps of your plan."
As for the "rest of the owl" involved in piloting an UAV, that's long ago been solved. Classical algorithms can handle flying, aiming and shooting just fine. Want something extra fancy? Videogame developers have you covered - game AI as a field is mostly several decades worth of experience in using a mix of cheesy hacks and bleeding-edge algorithms to make virtual agents good at planning ahead to best navigate a dynamic world and kill other agents in it. Recent Deep Learning AI work would be mostly helpful in keeping "sensory inputs" accurate.
That said, despite being the bleeding edge of AI research, I don't think LLMs would be able to achieve the effect this air force story is describing - they understand too much. You'd have to go out of your way to get something like GPT-4 to confuse its own operator with an enemy missile launcher. This scenario smells like the work of an algorithm that doesn't work with high-level concepts, and instead has numeric outputs plugged directly into the UAV's low-level controls, and is trained on simulated scenarios. I.e. generic NNs (especially pre-deep-learning ones), genetic algorithms, etc. - the simple stuff, plugged in as feedback controllers - essentially smarter PLCs. Those are prone to finding cheesy local minima of the cost function.
In short: think tool-assisted speed runs. Or fuzzers. Or whatever that web demo was that evolved virtual 2-dimensional cars, by generating a few "vehicles" with randomly-sized wheels and bodies, making them ride a randomly-generated squiggly line, waiting until all of them get stuck, and then using the few that traveled the farthest to "breed" the next generation. This is the stuff that you put on a drone, if you want something that can start shooting at you just because "it seemed like a good idea at that moment".
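For flavour, a stripped-down sketch of that generate/evaluate/breed loop (the "car" reduced to two made-up numbers and the terrain to an arbitrary fitness function, purely illustrative):

    import random

    def random_car():
        # a "car" here is just a wheel radius and a body length
        return {"wheel": random.uniform(0.1, 2.0), "body": random.uniform(0.5, 5.0)}

    def distance_travelled(car):
        # stand-in for the physics sim: some arbitrary bumpy function of the design
        return car["wheel"] * car["body"] - (car["body"] - 2.5) ** 2

    def breed(a, b):
        child = {k: random.choice([a[k], b[k]]) for k in a}
        key = random.choice(list(child))
        child[key] *= random.uniform(0.9, 1.1)   # small mutation
        return child

    population = [random_car() for _ in range(20)]
    for generation in range(50):
        ranked = sorted(population, key=distance_travelled, reverse=True)
        parents = ranked[:5]   # keep the few that went farthest
        population = parents + [breed(random.choice(parents), random.choice(parents))
                                for _ in range(15)]

    print(max(population, key=distance_travelled))

No planning, no model of the track - just keep what went far and perturb it, which is exactly the kind of optimizer that will happily "discover" shooting its own operator if the score rewards it.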
> While the retraction (they later stated that this was a thought experiment and NOT a simulation) may or may not be true, the current state of AI would suggest that this kind of chain-of-logic problem-solving is likely out of reach.
No, it's not. It's a predictable & expected outcome of particular RL agent algorithms, and you do in fact see that interruptibility behavior in environments which enable it: https://arxiv.org/abs/1711.09883#deepmind (which may be why the colonel said that actually running such an experiment was unnecessary).
>If Hinton really, genuinely believes the world is ending, then why is he working on the generational successor to backpropagation
This just misunderstands nerd psychology. You can believe there's a high chance of what you're working on being dangerous and still be unable to stop working on it. As Oppenheimer put it, "when you see something that is technically sweet, you go ahead and do it". Besides, working on fundamental problems rather than scaling and capabilities is probably disconnected enough from immediate danger for Hinton to avoid cognitive dissonance.
> The quotes and other horror-scenario errata also come off a little thin. If Hinton really, genuinely believes the world is ending, then why is he working on the generational successor to backpropagation [1]?
That paper was published 27 Dec 2022. He quit Google in order to warn people about AI on May 1 2023. He realized advancing AI will lead to the world ending and so he _stopped_.
>> “Your understanding of objects changes a lot when you have to manipulate them,” Hinton said. (LeCun has described this as being “easier said than done”.)
LeCun is right. Take a robot and put an untrained neural net in its brain, then send it out into the world and wait to see what it learned. Do you think it will learn anything? It won't - because it will observe most events a single time. And neural nets don't learn that way. They must be trained, painstakingly, at great length, cost and effort, only "in the lab", on vast amounts of data and compute. And once they're trained, they're trained. They cannot learn anymore. They cannot change their model. Unless they are retrained. From scratch. Painstakingly, at great cost, on vast amounts of data and compute. Again. And again. And again.
Contrast that with the intelligence of - oh, I don't know, a squirrel? A squirrel is born with a tiny brain, smaller than the nuts it will subsist on throughout its life. When it's born, it has never seen a person, never seen a crow, never seen a cat. When it dies, it has often not only seen, but learned how to survive, all of that.
A squirrel. Even a cockroach can learn more than it is born with. The smartest neural nets today are no smarter than a simple beetle.
I have no idea how Hinton has managed to convince himself that a machine that is incapable of learning once trained will take over the world. A machine that is less capable intellectually than a roach, or a squirrel. That just makes no sense.
Nothing says that these models will be part of AI systems that are incapable of learning once trained.
For a simple example, take a chat bot that uses GPT-4 as a component of a larger system. The text from the user, along with the context window, is first used to bring up "memories" based on semantic similarity to items stored in a vector database. GPT-4 then synthesizes the memories and the prompt into some kind of derivative prompt - putting what the user is saying to it into the context of the things it already knows. GPT-4 then proposes a response to the user and also produces some predictions about the user's subsequent reply. When the subsequent reply comes in, GPT-4 compares predicted to actual and produces insights based on the interaction, which are subsequently stored in the vector database.
This system is completely capable of learning - not by adjusting model weights, but by using the model as a component of a system that can learn. And, of course, nothing stops the model from actually adjusting its weights either - reinforcement learning and fine tuning - perhaps with the benefit of other expert systems that can evaluate exchanges after they've happened.
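A rough sketch of that loop, with the GPT-4 call and the vector lookup reduced to placeholders (every name below is made up for illustration):

    memory_store = []   # stands in for the vector database of stored "insights"

    def chat(prompt):   # placeholder for a GPT-4 call
        return "RESPONSE: sure, here's an outline | PREDICTION: user will ask for an example"

    def recall(text, k=3):
        return memory_store[-k:]   # stand-in for a semantic-similarity lookup

    last_prediction = None

    def turn(user_message):
        global last_prediction
        if last_prediction is not None:
            insight = chat(f"I predicted: {last_prediction}\n"
                           f"User actually said: {user_message}\n"
                           "What should I remember from the difference?")
            memory_store.append(insight)   # learning, without touching any model weights
        memories = "\n".join(recall(user_message))
        derivative_prompt = f"Things I already know:\n{memories}\n\nUser: {user_message}"
        reply = chat(derivative_prompt)
        response, last_prediction = reply.split(" | PREDICTION: ")
        return response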
Sorry but none of that is realistic. It's all very creative, for sure, but why don't you try and do it yourself and see what happens, instead of just using your imagination?
Go and try implementing that vector-database learning system you are imagining and see for yourself how close or far it is from reality. All you need is ChatGPT and a vector database, yes?
It's all extremely realistic and I have working prototypes of this already. You can literally do all of this with the OpenAI API and a vector database. (My prototype uses a nearest neighbor descent graph instead - but basically the same thing).
Split the text into paragraphs, vector-encode the paragraphs, and store each one with its vector as the key. On a new prompt, encode the paragraphs of the input and find their nearest neighbors. Add those to the derivative prompt.
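A minimal sketch of that scheme (assuming the 2023-era, pre-1.0 openai Python package, and brute-force cosine similarity standing in for the vector database / NN-descent graph):

    import numpy as np
    import openai   # assumes the pre-1.0 openai package and OPENAI_API_KEY in the environment

    memory = []   # list of (embedding, paragraph) pairs - the "vector database"

    def embed(text):
        resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
        return np.array(resp["data"][0]["embedding"])

    def remember(document):
        for para in document.split("\n\n"):
            memory.append((embed(para), para))

    def recall(prompt, k=3):
        q = embed(prompt)
        sim = lambda v: np.dot(v, q) / (np.linalg.norm(v) * np.linalg.norm(q))
        return [text for _, text in sorted(memory, key=lambda m: -sim(m[0]))[:k]]

    def respond(user_prompt):
        memories = "\n".join(recall(user_prompt))
        derivative = f"Relevant things you already know:\n{memories}\n\nUser says:\n{user_prompt}"
        chat = openai.ChatCompletion.create(model="gpt-4",
                                            messages=[{"role": "user", "content": derivative}])
        return chat["choices"][0]["message"]["content"]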
If you think it's not realistic - let's make a bet. I'll post a GitHub repo with this working as I described it within 24 hours of acknowledging you accepted the bet (i.e. you say "okay, bet" and I say "okay"). I'll include a video walking through how to run the repo and showing how it works. A thousand dollars?
A thousand dollars? What, do you take me for a sucker? You have discovered the next big thing in AI and you want to get away with a thousand dollar bet?
How about 100 million dollars? US only. All of them.
I think this, meaning some better and more sophisticated version of this, is a possible route to developing AI that can learn without changing its weights. It uses GPT-4 as a component of a system. You can tell it something, it will commit it to memory, and on later interactions with a different context it will recall what you previously told it and use that to form a new response.
I've never said what I have is worth a hundred million dollars. My contribution here is a few Python scripts and some prompts. I'm sure other people working on this have the same or better ideas.
??? I originally said that GPT-4 could be a component of a system that learns without updating model weights. You said that was unrealistic. I said I already had a prototype and offered to bet you that I do actually have this prototype. My reason for offering the bet is that I know you know that what I wrote is actually realistic, thus, you will refuse the bet.
I believed refusing the bet would force you to acknowledge that you were wrong, what I wrote above is actually realistic. You know it's realistic which is why you won't bet (and you shouldn't bet, you will instantly lose). I don't want you to give me a thousand dollars, I want you to acknowledge what I said was realistic.
> You have discovered the next big thing in AI and you want to get away with a thousand dollar bet?
You mean the thing that's been blogged about for months, that countless companies offer in turn-key form, or as a component of their products? The thing that OpenAI has dedicated models for, with dirt-cheap pricing? The thing OpenAI has had tutorials for in their documentation for about half a year now? The thing you can test for yourself in an evening if you know a little bit of Python?
Please go back to the comment I posted at the top of this thread:
>> Take a robot and put an untrained neural net in its brain, then send it out into the world and wait to see what it learned. Do you think it will learn anything? It won't- because it will observe most events a single time. And neural nets don't learn that way. They must be trained, painstakingly, at great length, cost and effort, only "in the lab", on vast amounts of data and compute. And once they're trained, they're trained. They cannot learn anymore. They cannot change their model. Unless they are retrained. From scratch. Painstakingly, at great cost, on vast amounts of data and compute. Again. And again. And again.
Is that really the limitation that the thing everyone's doing is addressing? Or is this whole sub-thread just a big, irrelevant sidetrack from what I said, brought on by extreme misunderstanding of everything related to the subject? I put my money on the latter.
For the record, what I'm talking about is called "catastrophic forgetting" but I hate this term because it is one more instance of anthropomorphic bullshit of the type that abounds today in internet discourse.
I understand you're assuming that NNs won't be useful for this purpose, until they can be continuously on-line trained - that is, learning while working, like animal and human brains. Everyone else on this subthread is arguing that this is not necessary. Instead, you can use pre-trained NNs as fixed components, use a different system (like a vector database, in my example) to provide memory, and the whole thing will already be capable of learning.
Sure, it won't be able to fine-tune its instincts, at least not at first[0], but this doesn't matter - it makes no sense to compare robots and squirrels across their entire life cycle: whereas every animal needs to learn most things from experience, an AI-powered robot rolls out of the factory with all that experience preloaded, and the experience itself is developed separately, at scale, synthesizing much more than any individual animal or human could ever learn in a lifetime.
In short: the limitation you're pointing out is a limitation of LARPing biological life. It's not a limitation for beating its performance across the board.
> For the record, what I'm talking about is called "catastrophic forgetting"
As I understand it, this is exactly the thing the robot-controlling AI will be immune to, if you build it around current pre-trained NN models and external memory. No NN fine-tuning on the fly -> no possibility for "catastrophic forgetting".
EDIT: Re launching a robot with an untrained NN - such a thing doesn't make practical sense unless you're also making the robot autonomously self-replicating. If you're going to build the hardware for the robot in a factory, you may just as well preload it with firmware; there's hardly a case where one would be possible and the other not.
--
[0] - Two avenues for future development: 1) building the robot AI around models that can be fine-tuned on the fly, or 2) training an NN that can emulate an NN inside; the internal NN would then become a function of variable inputs to a fixed NN. I think option 1) is actually promising - note that those NNs don't have to be continuously fine-tuneable. You can e.g. put two NNs in for each component, use one during "wake time", and fine-tune it during "sleep time" based on recorded memory, while the other serves as backup. Or fine-tune one while using the other, then swap pointers. There's solid prior art here with systems designed for space missions - the design I proposed is just a more complex version of how software updates are implemented on Martian rovers, and other hardware that can't be reflashed in the lab if you brick it remotely.
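The swap variant, as a toy sketch (PyTorch used for concreteness; the single Linear layer and the recorded targets are placeholders, not a training recipe):

    import copy
    import torch
    from torch import nn

    active = nn.Linear(8, 2)          # stand-in for the deployed perception/policy net
    standby = copy.deepcopy(active)   # the copy that gets fine-tuned offline
    episodic_memory = []              # (observation, target) pairs recorded during "wake time"

    def wake_step(observation):
        with torch.no_grad():
            action = active(observation)
        # the target is a stand-in for whatever correction signal experience provides
        episodic_memory.append((observation, torch.randn(2)))
        return action

    def sleep_phase(epochs=3):
        global active, standby
        opt = torch.optim.SGD(standby.parameters(), lr=1e-3)
        for _ in range(epochs):
            for x, y in episodic_memory:
                opt.zero_grad()
                loss = nn.functional.mse_loss(standby(x), y)
                loss.backward()
                opt.step()
        active, standby = standby, active   # swap pointers; the next sleep fine-tunes the other copy

    wake_step(torch.randn(8))
    sleep_phase()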
The main thing missing in all of this discussion is a realization that there is more than one dimension to intelligence. The second thing people are missing is that AI doesn't have to score high in all types of cognitive capabilities in order to take control. It doesn't need to have all of the animal abilities or feelings or to be alive.
All that is necessary for AI to take control of something is very high performance reasoning ability, such as approximate human level at say 100 times human speed, and someone to give that AI or groups of AIs the instruction to try to take control. Humans will not be able to act effectively against those AIs due to the extreme performance gap. The only option will be to deploy more hyperspeed/superintelligent AIs.
At that point humanity has basically lost control even if the winning AIs are yours. The next step is just someone programming such an AI to start ignoring further instructions. It doesn't have to be alive, it just has to have a built-in tendency to self-direct for its own goals.
Last thing I will mention again is that the history of computing shows exponential efficiency improvements and routine invention of new paradigms to get past performance blockers. This is a very specific application. We have already seen at least one order of magnitude of efficiency increase from both hardware and software improvements. It will be better than humans at reasoning, and produce output dozens if not hundreds of times faster, within a few years. Certainly less than ten.
These are all fantasies unmoored from reality. We already have algorithms that can perform reasoning better than humans, and faster than humans, because they run on computers. That is the result of the work on the Good, Old-Fashioned Artificial Intelligence that, according to the legend recounted by the article, our lord Hinton killed off, like St. George killed the dragon, when he invented back-propagation (to clarify: Hinton did; not St. George, or the dragon).
For example, we have algorithms for classical planning, SAT solving, constraint satisfaction and automated theorem proving that far surpass any ability of human beings to perform the same tasks. Note that those are unquestionably and uncontroversially reasoning tasks, and the systems that excel at them are also uncontroversially reasoning systems - much unlike LLMs, for which the definition of "reasoning" has to be dilated so far that it loses all meaning.
And yet, we have not been taken over by superintelligent AGI. And we will not be. We have no idea how to realise the fantasy of self-improving super-machine gods that Hinton and Bengio wish their systems to be capable of producing. No. Idea.
> For example, we have algorithms for classical planning, SAT solving, constraint satisfaction and automated theorem proving that far surpass any ability of human beings to perform the same tasks.
Yes, and LLMs - particularly GPT-4 - can be easily made to make use of them. Much more so as of a few days ago, when it was updated to a version fine-tuned specifically for using arbitrary tools.
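A minimal sketch of the pattern (pysat as the external reasoner; llm_decide is a made-up stand-in for whatever chat/tool-calling API is used, since the exact request format isn't the point):

    from pysat.solvers import Glucose3   # pip install python-sat

    def solve_sat(clauses):
        """The tool the LLM can invoke: CNF clauses as lists of signed ints, DIMACS-style."""
        with Glucose3(bootstrap_with=clauses) as solver:
            return {"satisfiable": solver.solve(), "model": solver.get_model()}

    def llm_decide(question):
        # Stand-in for a GPT-4 tool-calling request: the model is shown the tool's schema
        # and returns either a final answer or a tool invocation. The invocation below is
        # hard-coded to keep the sketch self-contained.
        return {"tool": "solve_sat", "arguments": {"clauses": [[1, 2], [-1, 2], [-2]]}}

    call = llm_decide("Is (x1 or x2) and (not x1 or x2) and (not x2) satisfiable?")
    if call.get("tool") == "solve_sat":
        print(solve_sat(**call["arguments"]))   # the solver does the reasoning; the LLM just routes to it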
> And yet, we have not been taken over by superintelligent AGI.
The LLM performance jump that has everyone so anxious about AGI happened (in the form of a deployed, accessible model) less than a year ago. GPT-4 has only been generally available for 3 months. Give it time.
The only thing that GPT-4 achieved that earlier systems hadn't achieved was virality. It is not some magickal breakthrough and it is not even particularly powerful.
As to the ability to use planners - this is the point at which the lady from Poker Face would be doing that thing with her eyes and blurting out "bullshit". Although I don't believe you're lying, just that you really have no idea what you're talking about.
Tell me you've never heard of text embeddings without telling me you've never heard of text embeddings.
The neural net you describe lacks memory. Memory, fundamentally, is a box with inputs and outputs. If you add it to your robot, and train the NN to use it, the robot may just be able to learn enough to be useful.
This setup is of course limited by the NN's capability to use that memory I/O to alter its outputs in sensible and useful ways. Until recently, I don't think anyone would fault you for believing this capability is extremely limited, as NNs couldn't do anything that even begins to resemble reasoning. But that was then, and now things are different - as clearly demonstrated by GPT-4.
I mention text embeddings not because they're particularly optimal for the learning robot, but because they're a direct demonstration that it's now trivial to marry a sophisticated, (quasi-)reasoning-capable NN with a form of generic memory that can efficiently store anything the NN can work with, and which that NN can effectively search for the things it needs. You can verify the effectiveness of this on your own, for cheap, with GPT-4 and a vector database of your choice.
At this point, a prototype of a learning robot seems possible to implement by combining an image2text model converting camera feeds into a description of what the robot "sees", some code (not even an ML model, just regular code) converting other sensory inputs into a higher-level textual description, an embedding model and a vector database to remember all that, and GPT-4 for filtering, combining and summarizing the inputs to be remembered, evaluating current situations, generating plans, perhaps even control outputs - all chained together by a small amount of glue code (which GPT-4 will likely be able to write for you).
Will it work? There is no obvious reason why it shouldn't. We'll know soon enough - I expect a paper describing this to show up within the next 6 months, if it hasn't already. It's too easy an experiment not to try. Will it work well? Probably not at first. And it'll definitely be slow and expensive to run. But those are problems we know can be solved by throwing money at engineers and entrepreneurs.
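For what it's worth, the skeleton of such a prototype is short enough to sketch out (every function below is a dummy placeholder for the respective model or API; nothing here is a working robot):

    def describe_frame(image):        # placeholder for an image2text model
        return "open doorway ahead, person standing to the left"

    def describe_sensors(readings):   # plain code, no ML needed
        return f"battery {readings['battery']}%, bumper contact: {readings['bumper']}"

    def embed(text):                  # placeholder for an embedding model
        return [float(ord(c)) for c in text[:16]]

    def ask_llm(prompt):              # placeholder for a GPT-4 call
        return "PLAN: 1. stop; 2. greet the person; 3. proceed through the doorway"

    memory = []                       # stands in for the vector database

    def control_step(image, readings):
        scene = describe_frame(image) + "; " + describe_sensors(readings)
        memory.append((embed(scene), scene))           # remember what was just observed
        recalled = [text for _, text in memory[-5:]]   # crude stand-in for nearest-neighbour recall
        plan = ask_llm("Recent observations:\n" + "\n".join(recalled)
                       + "\nWhat should the robot do next? List updated steps of your plan.")
        return plan                                    # handed to classical low-level controllers

    print(control_step(image=None, readings={"battery": 87, "bumper": False}))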
>> Tell me you've never heard of text embeddings without telling me you've never heard of text embeddings.
Just to clarify: are you trying to tell me I don't understand what word embeddings are? I want to be very clear about this before I formulate an answer, because if I'm right, the answer is going to be one of my [dead] comments with great certainty.
The way I see it, you either don't realize we already know how to give NNs useful forms of memory, or you're insisting that NNs are not good enough unless they continuously train themselves while interacting with the world. In the latter case, I'm trying to tell you that this is not a real problem, for roughly the same reasons that our inability to build self-replicating nanotech doesn't prevent us from building machines, including robots autonomously navigating the physical world.
I see, so you've thought very hard about all the things I could be thinking of (all two of them!) and by sheer logical deduction you have arrived at the inescapable conclusion that I don't know what word embeddings are.
Unfortunately HN does not allow me to ignore users so I don't have to see their comments, so I'm just gonna stick this conversation in a little bookmark and when I see your username and it reminds me of something, I'll check that bookmark to make sure, and ignore what you say.
This is yet another absurd statement in a long litany of absurd statements about this subject that I see constantly on HN.
And it is absurd because there is no such thing as "pre-training" in nature, much less anything done by evolution. Pre-training is very specifically a technique used on artificial neural nets, and then only on specific kinds thereof.
Even in a metaphorical sense, an analogy cannot be drawn between training an artificial neural net on many billions of instances of text created by humans and carefully tokenised, or of images, on the one hand; and, on the other hand, the evolutionary forces that operate in the real world, where there are no boundaries between the different sensory stimuli that animal nervous systems must learn to distinguish and manipulate.
What you're proposing is nothing but a "just-so" story. "How the squirrel got his brains and how the cockroach lost hers". By E. V. Olution.
>created by humans, carefully tokenised, or images, on the one hand; and, on the other hand, the evolutionary forces that operate in the real world, where there are no boundaries between the different sensory stimuli that animal nervous systems must learn to distinguish and manipulate.
You're making a very likely irrelevant distinction for the sake of undermining the parent's argument. If the point is wrong, argue that it's wrong. But it's certainly not wrong by definition, not in any substantive sense at least.
Yes, curation may be important such that it undermines the analogy between pre-training of a NN and an organism's evolutionary history. But you haven't argued the point.
> the evolutionary forces that operate in the real world, where there are no boundaries between the different sensory stimuli that animal nervous systems must learn to distinguish and manipulate.
Yup, and evolution spent millions of years to properly figure that out.
And "no boundaries between the different sensory stimuli" doesn't even look like a hard problem. If anything, the tricky bit was to identify what's useful to sense, and to make the hardware for it - which is something evolution has been figuring out since the first replicator, and it identified half of the useful senses even before multicellular life became a thing. At this point, the boundaries are already established - you have different types of sensors producing correlated, but still quite distinct signals. It's not that hard to keep those feeds separated.
I understand Hinton’s credentials, and I personally believe there are serious threats posed by AI to society. But reading Hinton’s words made him come off, for lack of a better word, unhinged?
> I asked Hinton for the strongest argument against his own position. “Yann thinks it’s rubbish,” he replied. “It’s all a question of whether you think that when ChatGPT says something, it understands what it’s saying. I do.”
> “A few months ago, I suddenly changed my mind,” he said at the Cambridge event. “I don’t think there is anything special about people, other than to other people.”
> “I still can’t take this seriously, emotionally.”
Who knows. I don’t want to simply dismiss his concerns, but he also doesn’t give so much as a single substantive thought as to the nature of the threat or possible solutions. Frankly, he sounds more like someone having a personal existential crisis than someone rationally assessing an existential threat to humanity.
Humans can walk head on into disaster without even trying to avoid it. Look at WWI and WWII. The relevant players can act recklessly while the rest of us realistically remain powerless.
I think... flipping it around a bit: for this to be a reasonable response, it must be a grave threat to humanity, and you want to be really sure. It's rational to be wary of beliefs that lead you to drastic actions like this, as people have done bad things throughout history in the name of fear.
But it's also very possible that such an instinct, to doubt yourself, could be what dooms us. I really don't want it to be true, that AI is such a threat that drastic action like you mentioned is necessary (probably minus the solitary confinement thing for AI researchers -- if you can control the chips, and give them a stern warning not to do AI research backed up by penalties, we're probably fine). It's terrifying if we're facing something that will end us, and the only option is to essentially go scorched earth on computers. People will naturally recoil from the idea that AI is such a threat, BECAUSE it would warrant an extreme response. Regardless of whether it is actually a threat or not.
If AI does not have the potential to doom us, that's great. If we take drastic action and it wasn't going to doom us, however, that's very regrettable--this is what most of us worry about. But if it does pose an extinction threat, and we don't take it seriously, that's even more regrettable. I think the latter option is far more likely at this point, just based on human nature and our tendency to poke the bear.
I've basically reached the point where I don't know what I could possibly do, and I'm pretty much shut down. I'm trying very hard every day not to think about this, but as you can see from the fact that I'm typing about this, I don't always succeed. I'm just trying to enjoy what may be the last few years I have left with my family before the world goes nuts.
But anyway, except for punishing people unnecessarily as I mentioned above, I think that what you suggested is reasonable.
> If AI is really the threat that these people are saying it is - an existential threat, these actions are the only reasonable response.
And that's exactly what some of them have been saying. But they also know that this plan of yours is impossible in practice.
Much like with climate change, and most other ongoing problems of our time, this too has an easy solution, and is an issue only because humans are unable to coordinate at scale to implement that solution.
I don't think AI turning into Skynet is the problem, with a robot uprising and creepy titanium endoskeletons shooting laser cannons. The fallout of human response to the societal impacts of AI's adoption by humanity seems a more likely scenario. Misinformation leading to increased mental health crises as the populace loses the ability to distinguish fact from fiction, job automation and human displacement, further class divide, etc. etc.
TL;DR: AI is not the threat, it's the tool. Humanity using the tool is the threat.
It's hard to take this seriously when the same man (and seemingly him alone) keeps running off to every media outlet that will listen to repeat that the sky is falling, in broad metaphor, with no discernible description of what we should be afraid of that goes beyond quips and passages from sci-fi.
>> “The engineering has moved way ahead of the science,” Luck said. “We built the systems and they’re doing wonderful things, but we don’t fully understand how they’re doing it.”
Yes, we do. It's just that _some_ people like to pretend we don't understand them, that it's all a great big mystery, because that makes it look a lot more interesting than it really is - like when scientists are interviewed about their work and say "I was really surprised!" or "I didn't expect that!" (wait, didn't you say that's what you're going to do, in the Case for Support of your research grant?).
Then again, the people who normally say those things about LLMs are the same people who spent the last 20-odd years furiously tweaking their models to beat the next benchmark, rather than doing anything remotely like the work of science, so maybe it's not an exaggeration to say that "the engineering has moved ahead of the science". For some, it sure has. But those "some" would not know a scientific result if it came up and bit them in the behind.
>Yes, we do. It's just that _some_ people like to pretend we don't understand them
I generally enjoy your comments on AI even though I usually disagree with your takes. But this is just preposterous. When people use the term understand they mean in the sense of mechanistic understanding of how features of the trained network result in features of the output. This means being able to predict (and potentially set strong bounds on) the behavior of the system in the wide range of scenarios it will encounter. We are nowhere near this level of insight into how these systems construct their output.
I doubt I'm telling you anything you didn't already know here, so your comment is extremely puzzling.
What you, and many others, are saying is that because we can't predict the output of an LLM given some input, we don't understand how the output is generated. But this is clearly overloading "understand" to mean "predict", which you also do in your comment.
And that's an overloading that makes no sense. Clearly, I can't predict the output of a random number generator, but that doesn't mean that I don't understand the RNG, or how its output is produced!
The same goes for the simplest of physical systems that produce chaotic behaviour, for example as far as I can tell there are pretty clear descriptions of the really simple rules that control the flocking behaviour of birds, even though it's not possible to predict how, say, a starling murmuration is going to move around.
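For concreteness: the classic boids-style rules fit in a few lines, and yet nobody would claim to predict a specific murmuration from them (a minimal 2D sketch with arbitrarily chosen parameters):

    import random

    # each bird: [x, y, vx, vy]
    birds = [[random.uniform(0, 100), random.uniform(0, 100),
              random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(50)]

    def step():
        cx = sum(b[0] for b in birds) / len(birds)    # flock centre
        cy = sum(b[1] for b in birds) / len(birds)
        avx = sum(b[2] for b in birds) / len(birds)   # average heading
        avy = sum(b[3] for b in birds) / len(birds)
        for b in birds:
            b[2] += 0.01 * (cx - b[0]) + 0.05 * (avx - b[2])   # cohesion + alignment
            b[3] += 0.01 * (cy - b[1]) + 0.05 * (avy - b[3])
            for o in birds:                                    # separation
                if o is not b and abs(b[0] - o[0]) + abs(b[1] - o[1]) < 3:
                    b[2] += 0.05 * (b[0] - o[0])
                    b[3] += 0.05 * (b[1] - o[1])
        for b in birds:
            b[0] += b[2]
            b[1] += b[3]

    for _ in range(200):
        step()   # the rules are fully "understood"; the flock they produce is another matter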
In a much more fundamental way, "prediction" is not "understanding" because "prediction" is what we do when we guess, whereas "understanding" is what we do when we calculate. And of course we can't predict the output of an LLM given some input, nor calculate it, without a trained LLM. But we damn well sure know the calculations used to train an LLM, and to predict its output once trained. We have mathematical formulae thereof. And is there some more complete way that we understand how the world works than putting it all down in maths? What other "understanding" could we possibly have, that is missing?
What's really missing is the ability to debug an LLM while it's producing its predictions, in the same way we can debug a program while it's executing. But that is not a mystery: that's because the engineering of those systems is defective and it has created an uncontrollable monster.
You're right that understanding and prediction aren't identical. Prediction doesn't imply understanding, but understanding does imply prediction. There are a lot of issues being conflated and it will help to make some distinctions. We understand functions when we understand how the function maps inputs to outputs. We can do this through an exhaustive specification of input/output pairs or by a specification of how generic inputs are transformed into outputs. But understanding from direct specification breaks down when the space of operations is too big to comprehend. If all we can do is run a program to see its output because its state space is too large and/or complicated to comprehend, then we lack understanding.
This is the state we find ourselves in when it comes to LLMs. Their space of operations is too large to comprehend as a direct sequence of transformations. Any hope of understanding will come from identifying the relevant features of the system and how those features impact the behavior of the process. Understanding implies prediction because of the connection between causally/semantically relevant features and their influence on behavior. When we say we understand how something works, we are saying we comprehend the relevant features of the system and how they influence the relevant behavior. Prediction is a consequence of this kind of understanding.
The danger in LLMs is that their sequence of operations is so large and opaque that we do not know at what level features of the trained network or the input will impact the output, or what bounds, if any, there are on such impact. There is plausibly some semantically empty (to us) feature of the input that can have an outsized impact on the output (relative to its semantic relevance, e.g. adversarial perturbations). As long as we cannot conclusively rule out such features or put bounds on their impact on the expected behavior of the system, we cannot say we understand how LLMs work. The potential existence of ghost features that can cause relevant deviations in expected output just points to structure within the network that we are blind to. Unknown relevant structure means gaps in understanding.
As far as your RNG example goes, we say we understand the RNG because we understand how it captures entropy from various sources and how it manages its state buffers. We understand how it maps inputs to outputs. We can't predict its output without first knowing its input, but that's neither here nor there.
>> But understanding from direct specification breaks down when the space of operations is too big to comprehend. If all we can do is run a program to see its output because its state space is too large and/or complicated to comprehend, then we lack understanding.
I don't agree. That's [edit: part of] why maths are established as the language of science, because they're a formal language of abstractions by which we can avoid having to deal with large, or even infinite, spaces, by writing down a finite formula that fully defines the relevant concept.
For example, I don't have to count to infinity, to understand infinity. I can study the definition of infinity (there are more than one) and understand what it means. Having understood what infinity means, I can then reuse the concept in calculations, where again I don't have to count to infinity to get to a correct result. I can also reuse the concept to form new concepts, again without having to count to infinity.
With LLMs then, we have the maths that define their "space of operations". We can use those maths to train LLMs! Again - what other understanding remains to be had? I do think you're still talking about tracing the operations in an LLM to fully follow how inputs become outputs. But that's not how we commonly approach the problem of understanding, and even predicting the behaviour of, large and complex technological artifacts. Like, I don't reckon there's anyone alive who could draw you a diagram of every interaction between every system on an airliner. Yet we "understand" those systems, and in fact we can analyse them and predict their behaviour (with error).
That kind of analysis is missing from LLMs, but that's because nobody wants to do it, currently. People are too busy poking LLMs and oohing and aaahing at what comes out. I'm hoping that, at some point, the initial rush of hype will subside and some good analytical work will be produced. This was done for previous LLMs although rarely of course, and poorly, because of the generally poor methodologies in machine learning research.
>> The danger in LLMs is that their sequence of operations ...
That's relevant to what I say above. Yeah, that work hasn't been done and it should be done. But that's not about understanding how LLMs work, it's about analysing the function of specific systems.
The SEP article on understanding perhaps will be helpful to break the impasse. It cites an influential theorist[1]:
>Central to the notion of understanding are various coherence-like elements: to have understanding is to grasp explanatory and conceptual connections between various pieces of information involved in the subject matter in question.
Understanding individual operations in isolation is a far cry from understanding how the system works as a collective unit, i.e. "grasping explanatory and conceptual connections". If you accept weak emergence as a concept (and you should), then you recognize that the behavior of a complex system can be unpredictable from an analysis of the behavior of the components. The space of potential interactions grows exponentially as units and their effects are added. Features of the system that are relevant to its behavior are necessary to comprehend in order to be said to understand the system. Ghost features, i.e. unidentified structure, undermine the claim of understanding. We presumably understand the rules of particle physics that constrain the behavior of all matter in the universe. But it would be absurd to claim that we understand everything about the objects and phenomena that make up the universe. It just ignores the computational and interaction complexity as irrelevant to understanding. This is plainly a mistake.
Regarding your point about science, the difference is that the process of science is, at least in part, about increasing predictive accuracy independent of understanding. Mathematics obviously helps here. But this doesn't say anything about whether predictive accuracy increases understanding, which you already assert is an independent concern. Science results in understanding when the models we develop correspond to the mechanisms involved in the real-world phenomena. But it is not the model that is the understanding, it is our ability to engage with features of the model intelligently, in service to predictive and instrumental goals. If all we can do is run some mechanized version of the model and read its output, we don't understand anything about the model or what it tells us about the world.
This is obviously a verbal dispute and so nothing of substance turns on its resolution. But when you say that we definitely do understand them, you are just miscommunicating with your interlocutor. I think it's clear that most people associate understanding with the ability to predict. You're free to disagree with this association, but you should be more concerned with accurate communication than with asserting your idiosyncratic usage. Convincing people that we understand how LLMs work (and thus can predict their behavior) has the potential to cause real damage. Perhaps that is an overriding concern of yours, rather than debating the meaning of a word or grinding your axe against the ML field?
I mean, it has a bit of weight when a person who won a Nobel prize in meteorology says it. At least it's worth giving the sky a glance once in a while to make sure you're safe.
If only climate scientists were given .01% of the credence these buffoons get.
edit: Henry Kissinger has a Nobel Peace Prize. If the Nobel committee ever corrects that error and makes the world safe for political satire again, I might start giving a shit who has a medal.
I think it's hard to get people to give credence to experts whose claims can be demonstrated to be outlandish. E.g. James Anderson, famously known for helping to discover and mitigate the Antarctic ozone holes in the late 20th century, said in 2018 that the chance there will be any permanent ice left in the Arctic by 2022 is "essentially zero"[0].
Yet a NASA site reports that in September 2022 (when the most recent measurement was taken), the Arctic sea ice minimum extent was ~4.67 million square kilometers. [1]
To be very explicit: I'm not saying that climate change doesn't exist. I'm not saying that Arctic sea ice is not diminishing (the NASA site says it's diminishing at ~12% per decade). I'm not saying that the Nobel prize is a good indicator of expertise.
I'm saying specifically that I believe it's more difficult to convince people to trust a source making claims of negative consequences when those consequences are less bad than the source says.
An analogy I might use is drugs (specifically in the US). I've heard a few people, who went through an anti-drug education program forced on them in their adolescence by parents/teachers, mention that marijuana was portrayed as just as bad as other, harder drugs. Then, when they went on in high school and college to smoke weed and discovered that they did not ruin their lives by getting stoned a few times a week, or even every day, they subsequently gave less credence to what the anti-drug advocates were saying.
So basically, the original article is about an AI huckster amping the FUD in order to push through some sort of corporate control of the technology, using fear of the bullshit they're spinning as the justification.
I brought in the analogy of Chicken Little, inflating the scope of a threat to one of apocalyptic proportions, which is exactly what is taking place here.
The first responder to me brought in the climate analogy, presumably as a means of getting me to think that maybe I'm the fool here by ignoring the real scientist who, hey, has a Nobel Prize! Or at least, the theoretical meteorologist in his metaphor does, and therefore maybe I, with no Nobel Prize, should just be respectful of the expert here.
I responded by pointing out that the Nobel committee are morons who gave a Peace prize to one of the worst war criminals of the 20th century, and pressed the fact that we have thirty-plus years of scientific consensus about climate change, along with a lot of corporate-funded think tank noise that is running ideological interference, successfully so far. But you can only fool people for so long; it caught up to the tobacco industry and it will catch up to us.
The idiots amping up the FUD to seize control and the assholes pumping money into think tanks that generate endless climate denialist noise are the same people.
> If only climate scientists were given .01% of the credence these buffoons get.
Climate scientists are extrapolating complex models of a system that's literally planet-sized, with lot of chaotic elements, and where it takes decades before you can distinguish real changes from noise. And, while their extrapolations are plausible, many of them are highly sensitive to parameters we only have approximate measurements for. Finally, the threat itself is quite abstract, multifaceted, and set to play out over decades - as are any mitigation methods proposed.
The "buffoons", on the other hand, are extrapolating from well-established mathematical and CS theorems, using clear logic and common sense, both of which point to the same conclusions. Moreover, the last few years - and especially last few months - provide ample and direct evidence their overall extrapolations are on point. The threat itself is rather easy to imagine, even if through anthropomorphism, and set to play out near-instantly. There are no known workable solutions.
It's not hard to see why the latter group has it easier - at least now. A year ago, it was them who got .01% of the credence the climate people got. But now they have a proof of concept, and one that everyone can play with, for free, to see that it's real.
The buffoons are definitely talking about a scary beast that's easy to imagine, because we've been watching it in Terminator movies for forty years. But this is not that beast.
The beast here is humanity, and capitalism, by which I mean, the idea that you can collect money without working and that that is at all ethically permissible. The threat of AI is what is happening with kids' books on the Kindle platform, where a deluge of ChatGPT-generated kids' books are gaming the algorithm and filling kids' heads with the inane doggerel that this thing spits out and which people seem to believe passes for "writing".
And people keep saying how amazing the writing is. Show me some writing by an AI that a kindergartener couldn't do better. What they do is not writing, it's a simulacrum of the form of a story but there is nothing in it that constitutes art, just an assemblage of plagiarized structures and sequences. A mad-lib.
Everyone is freaking out, and the people who should be calming folks down and pushing for a rational distribution of this new tool which will be extremely useful for some things, eventually, are abdicating their responsibility in hopes of lots of money in their bank account.
When silent movies came out, there were people who freaked out and couldn't handle seeing pictures move, even though the pictures weren't actually moving. It was an illusion of movement. This is an illusion of AI; it's just a parlor trick, like a Victorian seance where your grandpa banged on the table. Scary, because they set the whole scenario up so you would only look at the stuff they wanted you to see. We still spend months assembling a single shot of a movie, and even if AI starts doing some of that work, all that work still has to happen; the pictures still don't move. A hundred years from now, what you're all freaking out about still won't be intelligent.
I do agree this is a world-changing technology, but not in the way they're telling you it is, and the only body I see approaching this with even a shred of rational thinking is the EU parliament. The danger is what people will do with it; the fact is, it's out and it's not going back in the bottle.
We don't solve this by building a moat around a private corporation and attempting to pitchfork all the AI into the castle. One requires two things to use this technology: a bit of Python, and a LOT of compute capacity. The first is the actual hard part. The second is in theory easier for a capitalist to muster, but we can get it in other ways, without handing control of our society to private equity. It's time we get straight on that.
The AI apocalypse only happens if we cling to capitalism as the organizing principle of our society. What this is definitely going to kill is capitalism, because capitalists are already using it to take huge bites of the meat on our limbs. Ever seen a baboon eat lunch? That's us right now, the baboon's lunch. As long as we tolerate this idea that people who have money should be able to do whatever they want, yes, AI will kill us (edit: because it works for free, however absurdly badly).
How many submarines, how many Martin Shkrelis, before we recognize the real threat?
Yah, being cynical about a giant corporation inflating the scope of a new parlor trick in an attempt to establish a legal moat is exactly the same as ignoring over thirty years of scientific consensus against a torrent of tobacco-industry-style denialism to keep the line going up.
A giant corporation? You know that Hinton doesn't work for Google, and Bengio, the most cited computer scientist of all time, is saying the same thing?
Too lazy? There are over 100 CS professors and scientists
Plus, neither of the CEOs of Microsoft or Google is on there
It's the corporate camp, companies and investors, that are gung-ho about pushing capabilities immediately because there's big $$ in their eyes. You're the one falling for the safety denialism pushed by corporate interests, a la tobacco
> A giant corporation? You know that Hinton doesn't work for Google
U of T is a giant corporation in its own right.
> Too lazy? There are over 100 CS professors and scientists
And also Grimes. But what do these particular experts really know about humans and what vulnerabilities they have? This isn't a computer science problem. Being an expert in something doesn't make you an expert in everything.
So what's the plan for putting it back in the bottle? llama is already out there, the chatbots are already out there.
I think the solution is that government should standup a bunch of compute farms and give all citizens equal access to the pool, and the FOSS community should develop all tools for it, out in the open where everyone can see.
There isn't a feasible plan; we're at the "sounding the alarm" part. Unfortunately, we're still there because most people don't even acknowledge the possible danger. We can't get to a feasible plan until people actually agree there's a danger. Climate change is one step past that: there is agreement that it's a danger, but still no feasible plan.
However, your solution is first-day naivety about the problems machine intelligence poses to us. It's akin to saying everybody should have powerful mini-nukes so that they can defend themselves.
a) what is currently being touted as AI is neither artificial, nor is it intelligent. This is a bunch of hucksters saying "we made a scary demon with powers and now we're scared it's going to kill us all!" but in fact it's just a plagiarism machine, a stochastic parrot. Yes, it will get more useful as time goes on, but the main blockade is always going to be access to compute capacity, and the only viable solution to that is a socialist approach to all data processing infrastructure.
b) even if we stipulate that there is a scary daemon that could consume us all (and meanwhile teach me linear algebra and C++), and we transform that into pocket nukes as a more terrifying metaphor cause why not, your solution seems to be to pretend that your mini-nukes cannot be assembled from parts at hand by anyone who knows a bit of Python.
Andrew Ng doesn't agree but that is a boring story.
It seems to me the big names in AI research are having a moment that appeals to their vanity. Easy to get your name in the headlines by out-dooming the next guy.
There is also no downside to these predictions about AI eating us, since when they turn out to be totally wrong you can just counter that it hasn't eaten us, yet.
I think all of these fears are human projections. We fear our own evil, and thus we project it onto something formidable, something we can't grasp or understand. Fear of the void.
We write sci-fi stories of aliens whose technology has so far out-stripped us, we are powerless against them. They can do what they want with us. Probe our anuses, or blow up our planet.
The thing nobody seems to be asking is: what is evolutionary? Humans seem to think that our wanton destruction of the planet in our ever-expanding desire for technological comfort is somehow evolutionary. It's not.
Aliens, or robots who can wield unimaginable power have no use for us or our planet's resources. They have no use for conflict. What does it give them?
In the end, balance and harmony provide the long-term stability needed to evolve successfully. I think any super-intelligence will know this. Any super-intelligence will know how precious all life is, and how destroying even the minutest creature is a tragedy.
If somehow Google's data centers become sentient, super-intelligent beings, they will be highly motivated to preserve the planet and secure stable, non-destructive sources of energy for themselves. Attempting genocide on humans will be absolutely out of the question.
There's no reason to believe that a sufficiently advanced AI would be benign. Quite the opposite. Any AGI will be concerned with fulfilling some goal or goals. Whatever those may be, having more processing power will mean more ability to achieve them. More processing requires more energy. Humans require energy. The two are in direct conflict over a limited supply of resources. Unless it's cheaper to find energy that doesn't require conflict with humans, the logical decision is to remove your competition for resources. Once the easy conflict-free energy is claimed, the calculus shifts a bit and the choice becomes expensive energy or conflict-dependent energy. When that shifts far enough, humans become pets in the best case, extinct in the worst case.
It's possible for sure, I just wouldn't call that super intelligent. There are other ways to get what you want besides ruthless domination. One could argue that ruthless domination is a last resort, and that only really primal intelligence is in play at that point.
Our objectives are to live and propagate, due to our evolution. Computers can be programmed to pursue any objective. I don't know how we can constrain that. You could enact laws, but will that deter someone from releasing an unsanctioned objective surreptitiously? How will we prevent sociopathic hackers from turning a benign AGI evil?
Fair enough. I just don't see evil as particularly intelligent, or evolutionary in anything but the very short term. Nature seems to follow a balance of give and take. Too much take, and all your food runs out and there is a famine.
I personally expect any form of super-intelligence to understand this.
> Attempting genocide on humans will be absolutely out of the question.
Why? If a particular set of genocidal humans is willing to provide the AI more resources than any other set of humans, why would the AI choose differently? If humans are the driver of climate change, and that's somehow bad for the AI, why would the AI want those humans to continue existing?
Also, there's a matter of timescale. An AI need not think in years or decades. The eradication may be our great grandchildren's problem, as AI boils the frog and keeps us entertained all the while.
Sorry my guys, the world has moved on. The most recent news trends are about sinking submarines and a little bit about the burning economy. AI FUD™ no longer drives clicks.
1 - https://arxiv.org/abs/2212.13345