Can we stop this drawn-out narrative that DeepSeek is at the level of Gemini or o3? It’s brilliant in its own way, but for some reason a lot of journalists think it’s still on par with American frontier models.
It’s funny: R1 came out and matched 4o/o1 at the time. You could claim it was very slightly behind, but it was basically even.
It’s been 6 months? Gemini’s big upgrade was 2 months ago and o3 is even more recent.
It’s just funny that US companies just barely got ahead the last couple months and already it’s a “drawn out narrative” that they aren’t ahead.
For all we know, R2 drops tomorrow. If it’s ahead or even, how are we supposed to think about the narrative?
IMO it’s not really that much of a stretch to say they’re fairly close together. I’d want to see 6 more months of the US staying significantly ahead before I’d be complaining about narratives. I know things move fast, but that’s all the more reason to wait and see.
But didn't R1 use OpenAI/Google models to generate the data to train on? So R1 could only exist because those models predated it.
I'd reference something like https://llm-stats.com/ which suggests that the story is ... muddled. On the one hand, DeepSeek is clearly not leading. On the other hand, they aren't really "behind" in any sense I care about. Their models would have been world-leading this time last year.
The field is really moving too quickly to talk too certainly about "dominance" or "ahead". My observation is that projects I care about on GitHub come with a Chinese README, and many interesting speakers at conferences have strong Chinese accents. But I know a good researcher personally and it isn't so apparent to me whether these are Chinese Chinese people or Americans of recent Chinese descent.
Americans can raise more cash. They are still pretty unbeatable on that front. So until that changes they will always be ahead no matter what happens on the tech front.
Similarly, withholding funding for research, meddling in how universities are supposed to conduct their affairs, the reduced appeal of studying in the US for foreign students, putting wrestling promoter Linda McMahon in charge of the Department of Education… these are all going to impact America’s research and innovation abilities.
The big resource in technology is not cash. It is human effort in engineering and science. Putting more cash into finite resources can only result in inflation, which makes additional cash useless.
At the level of the Chinese gov't, the "cash" is going to pay for hardware. And it's American hardware currently leading the frontier, and the sanctions on it have made it hard to officially procure large amounts of compute from Nvidia.
So the Chinese gov't will need to also invest in hardware production, and surely they are furiously doing so (with limited success so far, but success nonetheless).
The American chip sanctions are, in my view, an own goal. In the short term they might cause some pain, but in the medium to long term they are the kick the Chinese market needs to adapt. Necessity is the mother of invention, after all. It might take 10 years, but I have no doubt that China can reach a level equal to that of TSMC.
If the US administrations (both current and previous) had any brains, they would've seen this. They should've put subsidies into chips so that Chinese production would not be competitive and Chinese firms would lose money if they went domestic. And the export of such hardware would help balance the trade deficit.
The assumption is that AGI could be near, so all you need is a 5-year lead. In that sense, short-term blocking is worth it, despite giving them a long-term incentive to build manufacturing capacity.
The "American" chip sanctions are outsourced. They depend on a single Dutch company (ASML) in the EU, whose machines are installed in Taiwan. The same EU that Humpty-Trumpy is working very hard to completely alienate...
Journalists give their readers what they want, and what they want is a discussion about a US-China race or "AI". There is also an equity ownership aspect, because tech stocks in China tend to be the primary market in the green within the larger SSE and Hang Seng, and a DeepSeek/AI story makes China-oriented emerging market ETFs much more enticing. It's the same reason you see much more financial reporting in American business news about India now that Indian equities are available in emerging market ETFs.
That said, DeepSeek is a decent model and was the forcing function needed to give a reality check to a number of AI startups (and has had the positive effect of making it easier for startups I've helped incubate to make the case for their own domain-specific foundation model strategy). Its impact shouldn't be understated.
In absolute scores, no one is leading. They all plateaued around the same level. The difference is that models are optimized in different ways. This makes R1 useful/ahead for some people but not for others.
However, on cost, R1 beats the Western models by miles.
I use Qwen 2.5, it works better for my tasks than larger models.
(But I use it for actual work, not for chatting with imaginary friends. Maybe you really do need a "frontier model" if you want to monetize imaginary friends. I wouldn't know or care.)
So many formerly thriving American cities suffer because of insane murder rates and never-ending gang violence. Journalists who study ShotSpotter and other methods of stopping gang violence will almost never approach the problem with the intent of mitigating what amounts to a major public health emergency (if you can even call it that). Instead, they work to frustrate underfunded police departments and encourage police budget cuts as warfare rages on.
Chicago PD's budget is ~$2 billion officially, closer to $3 billion when considering accounting trickery. If they want more money, perhaps they should stop illegally brutalizing people?
> The city paid out $639 million in police-related judgments and settlements between 2012 and the end of 2021, but only budgeted for $329 million in that span
What if you performed simple public records requests and realized that this whole event stinks to high heaven and that it was immediately politicized by politicians and billionaire media elites to suppress the rights of American citizens?
There always has to be room to question events that dominate the news cycle.
> What if you performed simple public records requests and realized that this whole event stinks to high heaven and that it was immediately politicized by politicians and billionaire media elites to suppress the rights of American citizens?
Are you claiming to have done so? If that's the case, what were the results of your "public records requests"?
It's easy to start this with Trump because he's outrageous, makes a lot of people uncomfortable and makes spurious claims. But soon it will be every conservative politician (except of course the ones who want regime change in the right places), and then all "leftist extremists" who defy the whims of megacorporations. Those cheering this on are simply demanding a world run by technocratic oligarchs and homicidal communist dictatorships.