Pretty amusing that the old AI revolution was purely logic/reasoning/inference based. People knew that to be a believable AI the system needed some level of believable reasoning and logic capability, but nobody wanted to decompose a business problem into disjunctive logic statements, and any additional logic can have implications across the whole universe of other logic, making it hard to predict and maintain.
LLMs brought this new revolution where it's not immediately obvious you're chatting with a machine, but, just like most humans, they still severely lack the ability to decompose unstructured data into logic statements and prove anything out. It would be amazing if they could write some Datalog or Prolog to approximate the more complex neural-network-based understanding of a problem, since logic-based systems are more explainable.
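To make that concrete, here's a toy sketch of the idea in plain Python rather than real Datalog (the facts, rule names, and "support ticket" scenario are all invented for illustration): if a model emitted facts plus Horn-style rules from unstructured text, even a naive forward-chaining loop can derive conclusions, and every derived fact carries the rule that produced it, so the reasoning is inspectable.

    from itertools import product

    # Hypothetical facts a model might extract from an unstructured support ticket.
    facts = {
        ("customer", "acme"),
        ("overdue_invoice", "acme"),
        ("contract_tier", "acme", "gold"),
    }

    # Rules as (head, [body atoms]); terms starting with "?" are variables.
    rules = [
        (("at_risk", "?c"), [("customer", "?c"), ("overdue_invoice", "?c")]),
        (("escalate", "?c"), [("at_risk", "?c"), ("contract_tier", "?c", "gold")]),
    ]

    def match(atom, fact, bindings):
        # Try to unify one body atom with one known fact under the current bindings.
        if len(atom) != len(fact):
            return None
        bindings = dict(bindings)
        for term, value in zip(atom, fact):
            if term.startswith("?"):
                if bindings.setdefault(term, value) != value:
                    return None
            elif term != value:
                return None
        return bindings

    def forward_chain(facts, rules):
        derived = set(facts)
        changed = True
        while changed:
            changed = False
            for head, body in rules:
                # Naively try every combination of known facts against the rule body.
                for combo in product(sorted(derived), repeat=len(body)):
                    bindings = {}
                    for atom, fact in zip(body, combo):
                        bindings = match(atom, fact, bindings)
                        if bindings is None:
                            break
                    if bindings is None:
                        continue
                    new_fact = tuple(bindings.get(t, t) for t in head)
                    if new_fact not in derived:
                        derived.add(new_fact)
                        print("derived", new_fact, "via", head, "<-", body)
                        changed = True
        return derived

    forward_chain(facts, rules)

Whether a current LLM can reliably emit rules like these for a messy real-world problem is exactly the open question, of course.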
One of the reasons why word vectors, sentence embeddings, and LLMs won (for now) is that text, especially text found on the web, does not necessarily follow strict grammar and lexical rules.
Sentences can be incorrect but still understandable.
If you then include leet speak, acronyms, and short-form writing (SMS / tweets), it quickly becomes unmanageable.
I am not a linguist, but I don't think many linguists would agree with your assessment that dialects, leet speak, short-form writing, slang, creoles, or vernaculars are necessarily ungrammatical.
From what I understand, the modern view is that these point to the failure of grammar as a prescriptive exercise ("This is how thou shalt speak"). Human speech is too complex for simple grammar rules to fully capture its variety. Strict grammar and lexical rules were always fantasies of the grammar teacher anyway.
I am a linguist, and I agree. But it does complicate the grammar to allow for these other options. (I haven't studied leet speak, but my impression is that it's more a matter of vocabulary than grammar, and vocabulary is relatively easy to add.)
For the record, the parser I worked on ended up having the "interesting" rules removed, leaving it as a tool for finding sentences that didn't conform to a Basic English grammar with a controlled vocabulary. It was used to QC aircraft repair manuals, which need to be readable by non-native English speakers.
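For anyone curious, the simplest piece of that kind of tool is just a controlled-vocabulary check. A rough illustrative sketch in Python (the word list is obviously made up, and the real tool also checked the grammar, not just the vocabulary):

    import re

    # A tiny stand-in for the real controlled vocabulary.
    ALLOWED = {
        "remove", "the", "bolt", "before", "you", "open", "panel",
        "do", "not", "use", "force", "install", "a", "new", "seal",
    }

    def out_of_vocabulary(sentence):
        # Return the words in the sentence that are not on the allowed list.
        words = re.findall(r"[a-z]+", sentence.lower())
        return [w for w in words if w not in ALLOWED]

    for s in ("Remove the bolt before you open the panel.",
              "Extricate the fastener prior to accessing the panel."):
        bad = out_of_vocabulary(s)
        print("FLAG" if bad else "OK  ", s, ", ".join(bad))

The second sentence gets flagged, which is the point: the manual author is pushed toward the first phrasing.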
There are languages that have fully codified grammars which completely cover everything people actually use (and more). But we spend 10 years learning the grammar itself, 1-2 hours every day at school (then you have literature etc. on top of that)...
I see it as a complete waste of my youth, BTW. Today I speak English that I learned through listening, reading and watching, and all of this mother tongue grammar nonsense that used to stress me out daily at school and during homework is absolutely useless to me.
I wonder if people approach NLP as a sea of semes rather than as semi-rigid grammatical structures onto which meaning is then mapped. (Probably, but I'm not following the field closely.)