AI search of Neanderthal proteins resurrects ‘extinct’ antibiotics (nature.com)
79 points by mfiguiere on July 31, 2023 | 55 comments


This is the code used in the paper:

https://gitlab.com/machine-biology-group-public/pancleave

>This package implements a scikit-learn-based random forest classifier to predict the location of proteolytic cleavage sites in amino acid sequences. The panCleave model is trained and tested on all human protease substrates in the MEROPS Peptidase Database as of June 2020. This pan-protease approach is designed to facilitate protease-agnostic cleavage site recognition and proteome-scale searches. When presented with an 8-residue input, panCleave returns a binary classification indicating that the sequence is predicted to be a cleavage site or non-cleavage site. Additionally, panCleave returns the estimated probability of class membership. Through probability reporting, this classifier allows the user to filter by probability threshold, e.g. to bias toward predictions of high probability.
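The README maps pretty directly onto a few lines of scikit-learn. A minimal sketch of the approach it describes, assuming a one-hot encoding and toy windows in place of the MEROPS-derived training set (the encoding, data, and names here are illustrative, not panCleave's actual code):

    # Sketch of the pan-protease idea: a random forest that classifies
    # 8-residue windows as cleavage/non-cleavage and reports probabilities.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
    AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

    def one_hot(window: str) -> np.ndarray:
        """One-hot encode an 8-residue window into a flat 8x20 vector."""
        vec = np.zeros((8, len(AMINO_ACIDS)))
        for pos, aa in enumerate(window):
            vec[pos, AA_INDEX[aa]] = 1.0
        return vec.ravel()

    # Toy training data standing in for MEROPS-derived substrates:
    # (8-residue window, 1 = cleavage site, 0 = non-cleavage site).
    train = [("AAGFLRSA", 1), ("KKLVFFAE", 0), ("GIVEQCCT", 1), ("PPPPGGGG", 0)]
    X = np.array([one_hot(w) for w, _ in train])
    y = np.array([label for _, label in train])

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # Probability reporting lets the user bias toward high-confidence calls,
    # e.g. keep only windows predicted cleavable with p >= 0.8.
    query = "AAGFLRSA"
    p_cleave = clf.predict_proba([one_hot(query)])[0, 1]
    if p_cleave >= 0.8:
        print(f"{query}: predicted cleavage site (p={p_cleave:.2f})")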

In the face of current hype around LLMs and 'fear of AI', calling a Random Forest Classifier 'AI' is a bit... far


The term "AI" has been co-opted by Hollywood notions of synthetic human-level intelligence, but it really just refers to the academic discipline, which Random Forests definitely fall under and have for decades.


We've also been using the term Machine Learning for a long time to try to avoid this. Using "AI" where "ML" would suffice is just trying to cash in on recent hype for clicks.

Still an interesting result of course.


My understanding is "ML" came into existence after "AI" became an untouchable, disreputable dirty word in academia after a previous hype cycle went bust. Statisticians wouldn't be caught dead working on "AI". See also the xkcd purity scale (435).


It looks like machine learning took over in the 2010s after a long convergence starting in the 2000s.

https://books.google.com/ngrams/graph?content=artificial+int...

This is from published books, so probably less affected by popular narratives. For that, we have Google Trends!

It looks like they trade places repeatedly when constrained to news searches: https://trends.google.com/trends/explore?cat=5&date=all_2008...
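If you want the numbers rather than the chart, the same comparison can be pulled programmatically. A sketch using the unofficial pytrends library (assumed installed via pip install pytrends; cat=5 is Google's "Computers & Electronics" category from the linked query, and results may differ slightly from the web UI since Trends data is sampled):

    from pytrends.request import TrendReq

    pytrends = TrendReq(hl="en-US")
    pytrends.build_payload(
        kw_list=["artificial intelligence", "machine learning"],
        cat=5,
        timeframe="all",
        gprop="news",  # constrain to news searches, as in the linked chart
    )
    interest = pytrends.interest_over_time()  # pandas DataFrame, one column per term
    print(interest.tail())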


Not really, though it may seem that way. Not all AI is ML. There are a ton of things within AI, such as planning algorithms or Expert Systems (see the sketch below). Again, those fall under the academic discipline of AI but not under the subfield of ML.
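To make the distinction concrete, here's a toy forward-chaining rule engine in the Expert Systems tradition. It's squarely "AI" in the academic sense, yet nothing is learned from data anywhere; the rules and facts are made up for illustration:

    # Toy expert system: apply hand-written rules until a fixed point.
    RULES = [
        ({"has_fever", "has_cough"}, "possible_flu"),
        ({"possible_flu", "short_of_breath"}, "see_doctor"),
    ]

    def infer(facts: set[str]) -> set[str]:
        facts = set(facts)
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in RULES:
                if conditions <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    print(infer({"has_fever", "has_cough", "short_of_breath"}))
    # adds 'possible_flu' and then 'see_doctor' to the fact set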


It's also a refinement on the term. AI can include a lot while ML is much more specific.


Right. ML is less likely to arouse emotions in either direction.


Yes, I agree, but OP's link is not aimed at academics; it's aimed at the general public. And you can't expect the general public to know that distinction: they see 'AI', they think 'ChatGPT', not Random Forests.


AI has been discussed in public for years and was attached to huge numbers of things well before ChatGPT. ChatGPT only came out less than a year ago.


A random forest classifier is a classic example of AI, quite popular before hardware caught up with the computational needs of neural networks.

Random forests of significant size also suffer from some of the same explainability problems that neural networks do, so it makes even more sense to make the comparison (see the sketch below).
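A quick way to see the explainability problem for yourself, assuming scikit-learn and synthetic data (nothing here is from the paper): even on a toy problem, a modest forest amounts to tens of thousands of decision nodes, so no single human-readable rule describes its behavior.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic binary classification problem, then a 500-tree forest.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

    # Count every decision node across the whole ensemble.
    total_nodes = sum(est.tree_.node_count for est in forest.estimators_)
    print(f"{len(forest.estimators_)} trees, {total_nodes} decision nodes total")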


Every algorithm is now called AI, unfortunately. 'AI' is being used 'exponentially'.


Random forests are absolutely AI as the term has been used for a very long time in academia and in public.


I fear this kind of retconning will become more commonplace as new entrants to the space who are unfamiliar with the history of the field put words in the mouths of academics that academics themselves would vociferously reject.

Academics have long pushed back against the labeling of everything as "AI", and only the most public-facing ones, with a buck to make on the hype treadmill, ever stoop so low.

In academia, "AI" is usually reserved as a synonym of "AGI" in large part because of the fuzziness and lack of consensus around a good definition of "intelligence". The term is simply not applied to actual outputs of research because there's no way to justify calling anything "intelligence".

As the saying goes, what's the difference between ML and AI? ML is what the PhDs call it, AI is what the MBAs call it.


> In academia, "AI" is usually reserved as a synonym of "AGI"

Now this is retconning. I took a uni course called "AI" almost twenty years ago. The same techniques had been called that for a long time.


Under current usage of the term, I think the bimetallic switch in my office thermostat qualifies as an AI.


Agree completely, particularly with the smirk I can see in the final word! As an academic, "AI" means "fund me". It's the new "big data". And "exponential" is the new "rising" (or, sometimes, "falling").


At a job where the company specialized in mathematical optimization (think operations research, logistics planning, timetable optimization, etc.), we jokingly called the heuristic solvers "A.I. technology, short for Advanced If technology". Some customers got the joke, some didn't, but they were generally content with the results :)
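For the uninitiated, the joke in code: a greedy heuristic of the kind optimization shops actually ship. The scheduling problem and numbers are made up for illustration.

    def advanced_if_scheduler(jobs: list[tuple[str, int, int]], budget: int):
        """jobs: (name, duration, value). Greedy by value density."""
        plan, remaining = [], budget
        for name, duration, value in sorted(jobs, key=lambda j: j[2] / j[1], reverse=True):
            if duration <= remaining:  # the "intelligence"
                plan.append(name)
                remaining -= duration
        return plan

    print(advanced_if_scheduler([("A", 3, 9), ("B", 2, 8), ("C", 4, 5)], budget=5))
    # -> ['B', 'A']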


Now the question is, could this be done for stuff like marine immune genomes?


> In the face of current hype around LLMs and 'fear of AI', calling a Random Forest Classifier 'AI' is a bit... far

Just because the state of the art evolved doesn't mean we have to erase the history of the field. This is an ML algorithm, so calling it AI is perfectly in line.

The fear you mention is built on a lack of understanding of what ML is. Showing that some AI has "dumb" yet useful implementations can help show the limits of this category of technology.


Lol, the Wikipedia article on RFs is literally headed "Part of a series on Machine learning and data mining", yet you have MLDevSecBullshitOps-type experts here schooling you on what constitutes AI or not.

https://en.wikipedia.org/wiki/Random_forest


As far as 'AI risk' stuff goes, I expect this kind of shit to actually kill people.


Open-source genomes of engineered super viruses combined with garage-grade nucleotide synthesizers will, with higher probability than anything else I can think of.

Or, heck, maybe the super virus escapes from a BSL4 lab by accident before the garage phase. :D (There's precedent, and I'm not alluding to COVID.)


I love how the Longtermism people won't shut up about this despite not having a single clue what they're actually talking about.

As though virus particles spontaneously pop into existence when you mix the right DNA strands together (a process that gets wildly expensive beyond a couple hundred base pairs and is ruined if the proteins on your fingers get anywhere near it).

That's notwithstanding the fact that you can't just turn a dial labelled "lethality" to produce a gene sequence.


The expense seems to have gone down enough in recent years that there are too many sketchy biolabs mixing a lot of base pairs together, or doing other kinds of things to crossbreed these bugs.


What precedents?


Lots. I think this is one of the only arguments that needs to be made against things like gain-of-function research. [1] That's a non-comprehensive list of various announced biosecurity incidents.

Just since 2000 we've had lab incidents, including leaks, with: anthrax, west nile virus, SARS, COVID (not implying China - there was a confirmed COVID leak in Taiwan), ebola, tuberculosis, dengue, smallpox, zika, polio, and more. And they're happening all throughout the world. That includes the US, China, Russia, Japan, Germany, Australia, UK, South Korea, Hungary, France, Taiwan, Netherlands, and more. Incidents specified as coming from BSL-4 labs include ebola and SARS, though the BSL level is not specified at all for most incidents.

And I would take that as an extremely incomplete list, given that many incidents are likely tucked away or classified. There's playing with fire, and then there's this... which increasingly feels like standing around a fire and seeing what happens if you start dumping kerosene into it, all in the name of firefighting, of course.

[1] - https://en.wikipedia.org/wiki/List_of_laboratory_biosecurity...


I don't think it was a modern BSL-4 standard lab, but smallpox escaped and killed someone in the UK. Now (at least in theory) only the US and Russia have samples in special government BSL-4 labs. Other countries could make it from sequence though.

Edit: I can't reply, so I'll say that an extinct horsepox virus was recreated from sequence; the same procedure in theory should work on smallpox.


Smallpox is 186 kilobases in size. It's pretty hard to synthesize from scratch.

Not impossible, but hard.

Then you'll still need to assemble viable viral particles. This will probably require creating the artificial chromosomes needed for viral replication, and then inoculating a human cell culture with them, along with the synthetic viral DNA.

This is a level that requires years of work from a major biolab.


I dunno, these guys just ordered fragments from GeneArt: https://journals.plos.org/plosone/article/file?id=10.1371/jo...


They did have a BSL2 lab, and they did quite a bit of work. But yeah, the state of the art has advanced since I worked in molecular biology.


Always harder than you think, of course. But the thing about molecular biology is that, unlike nuclear physics, most of the equipment you need is relatively cheap and readily available.

So the tools are there.

On the other hand, making something the right mix of lethal but still able to spread is, I suspect, incredibly hard; if it were easy we'd all be dead already (from viruses etc. naturally evolving).


>I don't think it was a modern BSL-4 standard lab, but smallpox escaped and killed someone in the UK.

That was in 1978! I believe we didn't have biosafety levels back then in the UK.

Edit: the BSL levels only came 6 years later anyway, and the UK has different names for them

>Over the next two decades, growing CDC, NIH, and OSHA participation in ABSA annual meetings further solidified biosafety guidelines, culminating in the 1984 publication of the first edition of the text, Biosafety in Microbiological and Biomedical Laboratories (BMBL). The BMBL guidelines laid out four levels of increasingly intensive safety practices, equipment, facilities, and engineering controls to be employed in the safe handling of microbial agents: Biosafety Levels 1, 2, 3, and 4 (BSL-1, -2, -3, and -4),

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7099915/


You can acquire viable smallpox samples from graves in permafrost regions. It would not surprise me if the UK incident involved an archeologist.


Russia definitely doesn't have BSL-4 labs (because BSL-4 refers to a US-specific system).

Sorry for being pedantic.



We have medical trials for a reason. It doesn't matter what the drug discovery process is as long as you test them after discovering them.


One could imagine a sufficiently advanced AI coming up with drugs that are designed to appear harmless in drug trials despite still being dangerous, either because it legitimately wants to murder us or because its learning objective is incredibly biased toward passing drug trials. I mean, we already sometimes do that ourselves (see the thalidomide tragedy), but presumably AI could do it faster and more efficiently, in ways we would struggle to anticipate.


Agree. Personally I'm strongly in the "AI has real risks" camp, where the risks are mostly inflated beliefs in what it tells us. The standards for drug testing are high enough that AI (if it works) is most likely to reduce pre-screening delay; if it carried a risk, it would be missing novel forms that don't conform to its training, not high lethality. The drug-testing regimen deals with that already.


I'm assuming you mean 'we' as in the US. Assuming everyone holds to high standards may be a mistake.


I'm not aware of any non-US countries having issues with this either.

The US is actually too strict, which is why we don't have any of the good sunscreens.


[flagged]


> Testing CRISPR genome edits of pathogens is playing with fire

This has nothing to do with the article


It has everything to do with not dying though.


[flagged]


A whole virus' worth of mRNA would be a lot of mRNA (though a prion wouldn't be), and I think we'd notice it happening.


Yes, if AI can do this, expect AI to also find diseases that are perfect bioweapons...such as a long-lasting common cold that merely weakens you for months, and is extremely contagious. Perfect for weakening an entire country before invasion, without even a hint that it actually is a bioweapon until it's too late.


Weapons that attack the entire planet are not perfect.


They will be when our robot overlords want to get rid of us. :-)


This assumes robots would want to be friends with each other, because robots. They'd probably wipe each other out very efficiently too, and they'd probably be able to do it without weapons; it would just be cyber warfare.


No weapon in the history of mankind has ever been perfect.


Loads of things can be used as inefficient weapons; for example, we could be worried about rogue states developing methods of dropping dead cows from zeppelins.


You jest, but launching dead cows from catapults was indeed a medieval siege tactic.


How? Not snark, genuine question. What do you see about this form of active molecule search that makes it uniquely dangerous?


It seems very close to resurrecting ancient diseases and protein strands 'just because we can'.


Those are extinct for a reason. Why recreate them? For the same reason you don't recreate dinosaurs (if you could).

Hey let's recreate long extinct antibiotics, whatever could go wrong?


Sarcasm? Just in case it wasn't...

The antibiotics aren't going to break out of the lab, knock people down on the street, and inject themselves into their arms. If the antibiotics work against today's bacteria, and are safe for today's humans, then what's the problem?



