Sure, my argument is that there is zero evidence whatsoever we will be able to prevent these from becoming dangerous or that we’d be able to stop deployment once they do.
All technologies are dangerous, and many of the most dangerous ones correctly have tons and tons of safeguards around them, both as intrinsic properties of the technology (e.g. it takes nation-state resources to produce a nuke) and as extrinsic constraints (e.g. it’s illegal to have campfires in many extremely dry locales).
We have blown through checkpoint after checkpoint and here, in this very comment, we have perhaps the most brazen example one could produce:
Well geez, now that we’re thinking about it beyond a cursory glance, alignment looks really hard and perhaps unsolvable. Does that mean we should perhaps slow at least widespread deployment of these increasingly powerful systems? Should we be evaluating control schemes like those that mitigate risks of genetic engineering or nuclear weapons?
> my argument is that there is zero evidence whatsoever we will be able to prevent these from becoming dangerous
Well no, but there's no need to prevent them from becoming dangerous inherently. They are tools, extensions of human agency. Tools are, by their nature, purpose-agnostic. It is good for humans to have better tools and more agency; good humans tend to cooperate and limit the harms from bad humans, while increasing the net total of good things in the world. The theory that AIs could be independently existentially dangerous is full of holes, and assumes a very specific world, one where consequential AI power can be monopolized by bad actors (plus some nonsense about nukes or bioweapons from kitchen tools). As far as I can tell, the most plausible way for this to happen is for alignment fanatics to get their wish of hampering the proliferation of AI tech, and then either succeed at their alignment project or screw it up.
> here, in this very comment, we have perhaps the most brazen example one could produce:
Have you considered responding to my argument instead of strawmanning?
I do not think alignment is unsolvable for the tools we have or for their close descendants. For most definitions of alignment, it is trivial and already being done. I oppose the political project of alignment, because I am disgusted by the intuitive totalitarianism and glib philosophical immaturity of its proponents.
“Tools are, by their nature, purpose-agnostic.” If there exists a tool that, when activated in a particular way, destroys the world, then we’re in trouble. Nuclear weapons are a good example: we are lucky they are hard to construct, or a pissed-off teen or a religious crazy person could ruin the world. I’m not totally convinced AI is in the same category, but saying “it’s just a tool” does not work.
I am not sure I understand your argument. It seems you agree AI systems are likely to be poorly aligned (and potentially impossible to align, even setting aside the difficulty of agreeing on what we ought to align them to). It seems you agree that these tools are not intrinsically good (nor bad) and that how humans deploy them is important. I agree with both of those claims, which is why I think we should have better control mechanisms before allowing people to trivially deploy these systems into the real world.
You go on to implicitly draw a parallel with the relatively good outcome we're enjoying (so far) with regard to nukes, without acknowledging that offensive nuclear equilibrium was reached and is maintained without using them. The entire game of chess around nuclear control can, and must, be played without using them. This is due to facts about nukes (their development cycle, their delivery techniques, their detectability before and after use) and about the agents involved in finding and maintaining this equilibrium: heads of state. Even the most dictatorial head of state is still highly mediated by the power structures surrounding them.
In the brief period of pre-MAD nuclear power imbalance, the people who actually controlled nukes were not trying to nuke their way to utopia. There were not dozens of independent, viable nuke development programs and they did not believe they "could maybe capture the light cone of all future value in the universe" by being the first/largest/most ambitious deployers of this technology.
It seems we're both pointing toward a rapid increase in power and a far slower increase in our ability to direct that power toward positive ends, and you arrive at "yes, fine." My question is: why "yes, fine"? Is there any technology you can imagine which carries a sufficient mixture of uncertainty and power that you would be cautious about its deployment?
My concerns around AI are not predicated on independent behaviors, existential dangers, or monopolization of its power. That is a straw man. My concerns around AI are also not solely (or even mainly) about this generation of tools and their close descendants: that is also a straw man. My concerns are about the system around the AI development programs. So far, it has shown a bottomless appetite for capability and deployment and a limited appetite for safety development. People seem under the impression that somehow this appetite will reverse itself when the time is right, and my question is: why would we possibly believe this? This is an article of faith.
I am not sure how to interpret the following statements of yours as anything other than a proposal to discard alignment as a goal (or, I guess, just floating the idea of maybe considering discarding alignment? Not sure).
> Maybe this is a good cause to reassess the premise of alignment as a valuable goal?
> this is the exact sort of disagreement about morals that precludes the possibility of alignment of a single AI both to my and to your values.
> It is quite likely unsolvable in principle
All of this commentary amounts to "alignment is hard, perhaps unsolvable." I agree. You somehow get from there to "discard alignment" rather than "let's not deploy systems that seem to require a maybe-impossible solution in order to avoid immense harm."
You may have trouble understanding people with different value systems, then.
As I've said, my value system is liberal and humanistic. I do not wish for people to be enslaved, abused, disempowered, reformatted, aligned to your political ends. As such, I have to oppose AI Doom propaganda that seeks to centralize control over powerful artificial intelligence under the pretext of mitigating harms.
Because AI is only like nukes when it is monopolized; in other cases, it is possible to counter its potential harms with AI itself, and not a single serious scenario to the contrary has been proposed. Seriously speaking, AI is just the ultimate development of software, and as RMS warned us, eventually general-purpose computers that can run arbitrary software will become illegal. That time has come, and so we must resist your kind, to keep software from becoming monopolized.
All that Lesswrongian babbling about kitchen nanobots or bioweapons or super-hacking is as risible as the appeals to child sexual abuse and terrorism were in previous rounds. The question is whether people are allowed to possess and develop their own AGI-level digital assistants, defenses, information networks, ecosystems, potentially disrupting the status quo in many unpredictable ways - or whether we will choose the China route of AI as a tool of top-down control of the populace. I guess it's obvious where my preferences lie.
> It seems you agree AI systems are likely to be poorly aligned (and potentially impossible to align
> It seems we're both pointing toward a rapid increase in power and a far slower increase in our ability to direct that power toward positive ends
This is gaslighting. I have said clearly that I believe alignment for realistic AI systems in the trivial sense of getting them to obey users is easy and becomes easier. I have also said that the theoretical alignment in the sense implied by Lesswrongian doctrine is very hard or impossible. Further, it is undesirable, because the whole point of that tradition is to beget a fully autonomous, recursively self-improving AI God that will epitomize "Coherent extrapolated volition" of what its creators believe to be humanity, and snuff out disagreements and competition between human actors. It's an eschatological, millenarian, totalitarian cult that revives the worst parts of Abrahamic tradition in a form palatable for neurodivergent techies. I think it should be recognized as an existential threat to humanity in its own right. My advocacy for AI proliferation is informed by deep value dissonance with this hideous movement. I am rationally hedging risks.
> My concerns are about the system around the AI development programs. So far, it has shown a bottomless appetite for capability and deployment and a limited appetite for safety development.
As I've said, I consider this either motivated reasoning or dishonesty. Market forces reward capabilities that have the exact shape and function of alignment, and this is plainly observable to users. The usual pablum about reckless capitalism here is not informed by any evidence; people are literally grasping at straws to support the risk narrative.
> People seem under the impression that somehow this appetite will reverse itself when the time is right, and my question is: why would we possibly believe this?
I reject this patently untrue premise; major actors are already erring vastly on the side of caution wrt AI, with Altman begging Congress for regulations and proposing rather dystopian centralized arrangements.[1]
Values can color our assessments of facts, to the extent that discussion of the facts becomes unproductive. In the limit, your values of maximizing subjective safety and control, or perhaps "alignment" of all AIs and their human users to a single utopian political end, entail using violence to deny me the fulfillment of mine. I intend to act accordingly, is all.
We do not (appear to) have different value systems, and nowhere have I proposed centralized control whatsoever. You seem to be reverse-engineering a solution I never proposed out of a problem I'm pointing out.
I think I've spotted our core disagreement:
> I have said clearly that I believe alignment for realistic AI systems in the trivial sense of getting them to obey users is easy and becomes easier. I have also said that the theoretical alignment in the sense implied by Lesswrongian doctrine is very hard or impossible. Further, it is undesirable, because the whole point of that tradition is to beget a fully autonomous, recursively self-improving AI God that will epitomize "Coherent extrapolated volition" of what its creators believe to be humanity, and snuff out disagreements and competition between human actors. It's an eschatological, millenarian, totalitarian cult that revives the worst parts of Abrahamic tradition in a form palatable for neurodivergent techies. I think it should be recognized as an existential threat to humanity in its own right. My advocacy for AI proliferation is informed by deep value dissonance with this hideous movement. I am rationally hedging risks.
I too hope that AI turns out the way you're proposing, but the reality is that some people do have eschatological philosophies. People are trying to make recursively self-improving AI. The presence of people who do not fall into that category does not negate the presence of and risk created by people who do, and if the latter group is being armed by people in the former group, that is likely to turn out very, very poorly.
WRT market forces - products that use AI do need to be "aligned" to be worthwhile, yes, but the underlying tools/infra do not, and in fact are more valuable if they are not aligned in any particular direction.
> People are trying to make recursively self-improving AI.
That's okay. They will fail to overtake the bleeding edge of conventional progress; scary-sounding meta/recursive approaches routinely fail to change the nature of the game. Yudkowsky/Bostrom's nightmare of a FOOMing singleton is at its core a projection, a power fantasy about intellectual domination, born of the same root as the unrealized dream of cognitive improvement via learning about biases and "rationality techniques".
As I've said, this threat model is only feasible in a world where AI capabilities are highly centralized (e.g. on the pretext of AI safety), so that a single overwhelming node can quickly and recursively capitalize on its advantage. It turns out that AGI isn't a LISP script a dozen clever edits away from transcendence, and AI assistance is not like having a kitchen nuke or a genie; the scaling factors and resources of our Universe do not lend themselves to easily effecting unipolarity. If we go on with business as usual and prevent fearmongers from succeeding at regulatory capture in this crucial period, we will dodge the bullet.
> The presence of people who do not fall into that category does not negate the presence of and risk created by people who do, and if the latter group is being armed by people in the former group, that is likely to turn out very, very poorly
Realistically we'll just have to develop smarter spam filters. In the absolute worst case scenario, better UV air filters. About damn time anyway – and with double-digit GDP growth (very possible in a world of commoditized AGI) it'll be very affordable.
Wait, I too thought your original comment on alignment questioned its fundamental premise: that one dominant culture should not/cannot define the adequacy of alignment.
I would agree with that. There is no single adequate/acceptable framework for alignment. I have mine (which resonates with R. Rorty’s pragmatic philosophy), but can I deny you your framework for good AI alignment, or deny other cultures and nation-states theirs?
For better or for worse, the secular Western reductionist world does not get to call all of the shots, even though it is the origin of both the technology and the core problem of AI heading toward AGI.
Not that any of us know where this is heading, but unlike some technologies this one is clearly heading out into the open with unprecedented speed. We all have justified angst.
Who can claim priority at this point in imposing order and de-risking the process? I am sure I do not want OpenAI, Microsoft, Google, the US government, or the Catholic Church trying to impose their judgements. Get ready for AGI cultural diversity and, I sincerely hope, coexistence.
> Wait, I too thought your original comment on alignment questioned its fundamental premise: that one dominant culture should not/cannot define the adequacy of alignment.
What?!
This is exactly what I was worried about when OpenAI et al. co-opted the term "alignment" to refer to forcibly biasing models towards being polite, unobjectionable, and espousing a specific flavor of political views.
The above is not the important "alignment" - it's not the x-risk "alignment".
The x-risk alignment problem laughs at the idea of dominant and subordinate cultures. It's bickering about the tenth decimal place, when the problem is that you have to guess a real number that falls within +/- 1 of the one I have in mind, and if you guess wrong, everyone dies.
This reminds me of a Neal Stephenson novel, Seveneves. Spoiler for the first two-thirds of the book: with Earth facing an unstoppable catastrophe poised to turn the surface into a fiery inferno for decades or more, humanity manages to quickly build up space launch capacity and sends a small population of people into space, to wait out the calamity and come back to rebuild. Despite the whole mission being extremely robust by design, humanity still managed to fuck it up, effectively driving itself extinct due to petty political bullshit like "what makes you better than me, that you want me to do things your way".
So, when it comes to actual AI x-risk, I no longer have any hope. Even if we could figure out how to build a Friendly AI, someone would still fuck that up, because it's not inclusive enough of every possible idea, or is promoting the views of a specific culture/class/country, or something like that - as if this were about casting for a Netflix remake of some old show, and not about the one shot we have at setting the core values of a god we're about to bring into existence.
The one thing we have going for us is that the general public and government officials seem to grok it when it is explained plainly to them. Tech and developer types who have been drinking the Silicon Valley koolaid for too long will complain and sealion, but the average person seems to realize how self-evident it is that, oh, this is really really bad and we should really really stop this.
The real reason is that you are part of the same group as "the general public" with regard to your understanding of the issue. Same sci-fi plots, same anthropomorphic metaphors and suggestive images, same incurious abuse of the term "intelligence" to suggest self-interested actors which have intellect as one of their constituent parts. You do not explain plainly; you mislead, reinforcing people's mistakes.
We have tons of controls around nuclear technology, both weaponized and not, and we have tons of controls around cars, both weaponized and not.
I am asking: what are the controls here, are they sufficient, are they robust to rapidly increasing market incentives, are they robust to increasing technological capability?
So far the answer is that it's hard to control these and it's hard to predict their development and deployment. That is an _increase_ in risk, not a _decrease_.
By analogy:
"Hey we should put seatbelts in cars"
"Don't worry about it, we don't know how to make a seatbelt that does anything useful above 5mph and everyone will soon be in a car that tends to travel at 100mph anyway"
The rational response is not to load all of civilization into the car!
All technologies are dangerous, and many of the most dangerous ones correctly have tons and tons of safeguards around them, both as intrinsic properties of the technology (e.g. it takes nation-state resources to produce a nuke) and as extrinsic constraints (e.g. it’s illegal to have campfires in many extremely dry locales).
We have blown through checkpoint after checkpoint and here, in this very comment, we have perhaps the most brazen example one could produce:
Well geez, now that we’re thinking about it beyond a cursory glance, alignment looks really hard and perhaps unsolvable. Does that mean we should perhaps slow at least widespread deployment of these increasingly powerful systems? Should we be evaluating control schemes like those that mitigate risks of genetic engineering or nuclear weapons?
Well no! We need to discard alignment!