But the form of language is the way it is because of the meaning we want the language to have!
Yes, it was our meaning that we encoded as form and placed into language. But who cares? We're not asking LLMs to create alternative realities whose meanings are orthogonal to our own. We're asking them to perform linguistic computations in a sophisticated way, so that the outputs lie in plausible regions of the (query, response) space of human responses.
Because the form of language is neither arbitrary nor intentionally meaningless, these transformations of form necessarily carry corresponding translations of meaning.
The consequence is that if we are able to express our meaning well, the form of language will carry a high-reliability encoding of that meaning, and LLMs will be able to manipulate meaning well by manipulating form. If we are unable to express ourselves well, LLMs will mirror our ineptness.
That I can't personally run a pile of transformers and attention modules in my head, on English and on the texts in the Library of Thailand alike, and therefore have no intuition about what I could do (aside from "omg"), is not a good argument for anything except my own computational limitations as a human. I can imagine, in principle, a dimensionally-reduced manifold of part of language space that looks very similar whether the language is Thai or English--talk about interplanetary travel, for instance--and if we knew how to search such spaces efficiently, this might let us map Thai and English form onto each other in a meaning-preserving way even without a training set that contains any translations. I can also imagine that the self-similarity is not high enough for this to work, or that in some cases you would end up mapping culturally equivalent things rather than physically equivalent things. I don't really know. But I'm very certain that introspection is a poor guide to how linguistic knowledge can be embedded in many-billion-parameter ANN architectures.
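To make the "self-similar manifold" speculation a bit more concrete, here is a toy numpy sketch of my own (an illustration, not something the argument depends on): if two languages' embedding clouds for a shared topic really were near-isometric, an orthogonal Procrustes step could recover a meaning-preserving map between them. The genuinely hard, unsupervised part--inducing the correspondences without any translation pairs, as in work along the lines of MUSE or VecMap--is simulated away here by construction.

```python
# Toy sketch (hypothetical): the "Thai" embedding cloud is simulated as a rotated,
# slightly noisy copy of the "English" one, and the rotation is recovered with
# orthogonal Procrustes (via SVD). Real unsupervised systems must first induce the
# point correspondences without supervision (e.g. adversarially or by distribution
# matching); that step is skipped here.
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 2000                              # embedding dimension, number of points

# Simulated "English" embeddings living on a low-dimensional manifold inside R^d
latent = rng.normal(size=(n, 5))             # 5-dim latent structure
english = latent @ rng.normal(size=(5, d))

# Simulated "Thai" embeddings: same structure, unknown rotation, small noise
true_rotation, _ = np.linalg.qr(rng.normal(size=(d, d)))
thai = english @ true_rotation + 0.01 * rng.normal(size=(n, d))

# Orthogonal Procrustes: the rotation W minimizing ||english @ W - thai||_F
u, _, vt = np.linalg.svd(english.T @ thai)
w = u @ vt

# If the manifolds really are this self-similar, the recovered map is meaning-preserving
err = np.linalg.norm(english @ w - thai) / np.linalg.norm(thai)
print(f"relative alignment error: {err:.4f}")   # small here; real languages are far messier
```

The caveat in the prose is exactly where this toy breaks down: real Thai and English clouds are not a clean rotation of each other, and the alignment you find may match culturally equivalent rather than physically equivalent things.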