My favorite argument against SP (the stochastic-parrot view) is zero-shot translation. The model learns Japanese-English and Swahili-English and can then translate Japanese to Swahili directly. That shows something more than simple pattern matching is happening inside.
Besides all the arguments based on model capabilities, there is also an argument from usage: LLMs are more like pianos than parrots. People play the LLM on the keyboard, making it 'sing'. Pianos don't make music, but musicians with pianos do. Bender and Gebru talk about LLMs as if they work alone, with no human direction. Pianos are also dumb on their own.
The translation happens because of token embeddings. We have spent a lot of effort developing rich embeddings that capture contextual semantics. Once those are learned, translation is “simply” embedding in one language and disembedding in another.
This doesn’t demonstrate complex thinking behavior. There are probably better examples of that; translation just isn’t one of them.
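The embedding/disembedding picture above can be sketched as a toy nearest-neighbor lookup. Everything here is made up for illustration (the vocabulary, the 3-dimensional vectors, the idea that both languages were aligned to English during training); a real model learns high-dimensional embeddings from data, but the mechanism is the same: if two languages share one semantic space, translating between them never requires seeing a direct Japanese-Swahili pair.

```python
# Toy sketch: zero-shot translation via a shared embedding space.
# All vectors and words below are hypothetical, for illustration only;
# a real model would learn them from Japanese-English and Swahili-English data.
import math

# Both vocabularies map into the same "concept" space because each
# was (conceptually) aligned to English, never to each other.
ja_embed = {"mizu": (0.90, 0.10, 0.00),   # water
            "inu":  (0.00, 0.80, 0.20)}   # dog
sw_embed = {"maji": (0.88, 0.12, 0.01),   # water
            "mbwa": (0.02, 0.79, 0.21)}   # dog

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def translate(word, src_embed, tgt_embed):
    """Embed in the source language, 'disembed' as the nearest target word."""
    vec = src_embed[word]
    return max(tgt_embed, key=lambda w: cosine(vec, tgt_embed[w]))

print(translate("mizu", ja_embed, sw_embed))  # → maji
print(translate("inu", ja_embed, sw_embed))   # → mbwa
```

The lookup succeeds even though no Japanese-Swahili pair appears anywhere, which is the whole point of the argument: the shared space does the work, not memorized pairings.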
> The model learns Japanese-English and Swahili-English and then can translate Japanese-Swahili directly. That shows something more than simple pattern matching happens inside.
The "water story" is a pivotal moment in Helen Keller's life, marking the start of her communication journey. It was during this time that she learned the word "water" by having her hand placed under a running pump while her teacher, Anne Sullivan, finger-spelled the word "w-a-t-e-r" into her other hand. This experience helped Keller realize that words had meaning and could represent objects and concepts.
As the above human experience shows, aligning tokens from different modalities is the first step in doing anything useful.