Discussion about this post

Claire Isabelle

This is so cool! I had Prof. Weatherby for a class in college that ended up being one of my favorite classes of all time. I didn't know he had a book about LLMs coming out. I have got to get my hands on it! Thank you for sharing it and for your thoughts on it!

Peter Ross

LLMs do not only encode the relationships between words (i.e., vector embeddings). They also implicitly learn the *dynamics* of language: they learn a generating process, how a sequence of words gives rise to the next word.
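
To make "learning a generating process" concrete, here is a minimal, hypothetical sketch of the simplest possible case: a bigram model estimated from a toy corpus, which defines a stochastic rule for how the current word gives rise to the next one. An LLM plays the same role with a vastly richer conditioning context and learned representations; the corpus and all names below are purely illustrative.

```python
# A toy "generating process": estimate p(next word | current word)
# from counts, then roll the dynamics forward by sampling.
import random
from collections import defaultdict, Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Estimate transition counts from adjacent word pairs (bigrams).
transitions = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current][nxt] += 1

def step(word: str) -> str:
    """One stochastic step of the generating process: sample the next word."""
    words, counts = zip(*transitions[word].items())
    return random.choices(words, weights=counts, k=1)[0]

# Each word gives rise to the next; the sequence is a trajectory of the system.
word, sequence = "the", ["the"]
for _ in range(10):
    word = step(word)
    sequence.append(word)

print(" ".join(sequence))
```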

I think the language-generating process should be thought of as a stochastic dynamical system in the high-dimensional space of vector embeddings. Syntactic regularities are not “rules” externally imposed or hard-coded; rather, they are observables of this dynamical system, something like low-dimensional manifolds or attractors embedded in that state space. Semantics and syntax can both be thought of as aspects of the geometry of the state space: syntax is more local, constraining the motion, whereas semantics describes more global relationships across that geometry.

This dialectical view of language generation emerges from modern machine learning, and it also seems to me to be the only view that makes sense. It appears closer to structuralism than to Chomskyan linguistics, but manages to unite and transcend the two. (I am not really familiar with semiotics, though, so I'm basing this last statement on a very rough understanding of what structuralism is saying.)

One other quick point: this is not just how language works. What makes these models so powerful is that just about anything in nature can be modeled in this way, as a stochastic dynamical system in a high-dimensional state space.
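
As a hedged toy illustration of this dynamical-systems picture (not anything an actual LLM computes), the sketch below sets up a stochastic process in a high-dimensional state space whose deterministic part pulls trajectories onto a low-dimensional linear subspace, standing in for a syntactic manifold or attractor, while added noise keeps the process stochastic. All dimensions, rates, and names are arbitrary assumptions chosen for illustration.

```python
# Toy stochastic dynamical system: drift toward a low-dimensional
# subspace (the "attractor"), plus isotropic noise.
import numpy as np

rng = np.random.default_rng(0)

DIM = 64          # dimension of the "embedding" state space
MANIFOLD_DIM = 3  # dimension of the attracting subspace
STEPS = 200

# A random linear subspace stands in for a low-dimensional syntactic manifold.
basis, _ = np.linalg.qr(rng.standard_normal((DIM, MANIFOLD_DIM)))
project = basis @ basis.T  # orthogonal projector onto that subspace

def step(state: np.ndarray, pull: float = 0.3, noise: float = 0.05) -> np.ndarray:
    """One stochastic update: drift toward the subspace, plus noise."""
    drift = pull * (project @ state - state)  # the local "syntactic" constraint
    return state + drift + noise * rng.standard_normal(DIM)

state = rng.standard_normal(DIM)
off_manifold = []
for _ in range(STEPS):
    state = step(state)
    # Distance from the subspace: norm of the component the projector removes.
    off_manifold.append(np.linalg.norm(state - project @ state))

print(f"distance to subspace: start {off_manifold[0]:.2f}, end {off_manifold[-1]:.2f}")
```

Running this, the distance to the subspace collapses from the scale of the ambient space down to the noise floor, which is the sense in which a purely local constraint on the motion can look, globally, like a low-dimensional attractor.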

