I feel like “passing it through a statistical model”, while absolutely true at the level of technical implementation, doesn’t get to the heart of what it is doing in a way that people can understand. It uses the math terms, potentially deliberately, to obfuscate and make it seem simpler than it is. It’s like reducing it to “it just predicts the next word”. Technically true, but I could implement a black-box next-word predictor by sticking a real person in the black box and asking them to predict the next word, and it’d still meet that description.
The statistical model seems to be building some sort of conceptual grid of word relationships that approximates something very much like actually understanding what the words mean and how they are used semantically, with random noise thrown into the mix in just the right amounts to generate some surprises that look very much like creativity.
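To make the "random noise in just the right amounts" part concrete, here is a minimal sketch of temperature sampling, which is one common way that noise gets dialed up or down. The function name, the word scores, and the example sentence are all made up for illustration; a real model would produce scores over tens of thousands of tokens, and I'm not claiming any particular system does exactly this.

```python
import math
import random

def sample_next_word(scores, temperature=1.0, rng=random):
    """Pick one word from a dict of word -> raw score (hypothetical values)."""
    if temperature <= 0:
        # No noise at all: always take the highest-scoring word.
        return max(scores, key=scores.get)
    # Softmax with temperature; subtracting the max keeps exp() well-behaved.
    top = max(scores.values())
    words = list(scores)
    weights = [math.exp((scores[w] - top) / temperature) for w in words]
    # Draw one word at random, in proportion to its weight.
    return rng.choices(words, weights=weights, k=1)[0]

# Made-up scores for the word after "The horse jumped over the"
scores = {"fence": 3.0, "moon": 1.5, "spreadsheet": -1.0}

print(sample_next_word(scores, temperature=0))    # always the top-scoring word
print(sample_next_word(scores, temperature=1.0))  # usually "fence", sometimes not
```

A low temperature makes the output nearly deterministic and dull; a high one flattens the weights so that unlikely words like "spreadsheet" turn up more often. The "creative" range sits somewhere in between.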
Decades before LLMs were a thing, the Zompist wrote a nice essay on the Chinese room thought experiment that I think provides some useful conceptual models: http://zompist.com/searle.html
Searle’s own proposed rule (“Take a squiggle-squiggle sign from basket number one…”) depends for its effectiveness on xenophobia. Apparently computers are as baffled at Chinese characters as most Westerners are; the implication is that all they can do is shuffle them around as wholes, or put them in boxes, or replace one with another, or at best chop them up into smaller squiggles. But pointers change everything. Shouldn’t Searle’s confidence be shaken if he encountered this rule?
If you see 马, write down horse.
If the man in the CR encountered enough such rules, could it really be maintained that he didn’t understand any Chinese?
Now, this particular rule still is, in a sense, “symbol manipulation”; it’s exchanging a Chinese symbol for an English one. But it suggests the power of pointers, which allow the computer to switch levels. It can move from analyzing Chinese brushstrokes to analyzing English words… or to anything else the programmer specifies: a manual on horse training, perhaps.
Searle is arguing from a false picture of what computers do. Computers aren’t restricted to turning 马 into “horse”; they can also relate “horse” to pictures of horses, or a database of facts about horses, or code to allow a robot to ride a horse. We may or may not be willing to describe this as semantics, but it sure as hell isn’t “syntax”.
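As a toy illustration of the "pointers" idea in the quoted passage, here is a small sketch in which a rule doesn't just swap one symbol for another but leads on to further information. The table names and the facts in them are my own invented placeholders, not anything from the essay.

```python
# Hypothetical lookup tables: each entry points on to something richer.
LEXICON = {"马": "horse"}
FACTS = {"horse": {"legs": 4, "eats": "grass"}}

def look_up(symbol):
    """Follow the pointers: Chinese symbol -> English word -> facts about it."""
    word = LEXICON.get(symbol)
    if word is None:
        return None  # no rule for this symbol
    return {"word": word, "facts": FACTS.get(word, {})}

print(look_up("马"))
```

The first hop is still symbol-for-symbol exchange, but the second hop lands in a different kind of data altogether, which is the "switching levels" the essay is talking about.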
I do teach English as a foreign language, and I used to teach computer programming (at a beginner level). Sometimes I daydream about teaching math according to the principles of the essay A Mathematician’s Lament by Paul Lockhart, but I am unlikely to be given the leeway to try.