> There's an older tradition of rule-based machine translation. In these methods, someone really does understand exactly what the program does, in a detailed way.
I would softly disagree with this. Technically, we also understand exactly what an LLM does: we can analyze every instruction that is executed. Nothing is hidden from us. We don't always know what the outcome will be, but we also don't always know what the outcome will be in rule-based models if we make the chain of logic too deep to reliably predict. There is a difference, but it is on a spectrum. In other words, explicit code may help, but it does not guarantee understanding, because nothing does and nothing can.
The grammars in rule-based MT are normally fully conceptually understood by the people who wrote them. That's a good start for human understanding.
You could say they don't understand why a human language evolved some feature, but they fully understand the details of that feature in human conceptual terms.
I agree in principle that the statistical parts of statistical MT are not secret, and that computer code in high-level languages isn't guaranteed to be comprehensible to a human reader. More generally, binary code isn't guaranteed to be incomprehensible and source code isn't guaranteed to be comprehensible.
But for MT, the hand-written grammars and rules are at least comprehended by their authors at the time they're initially constructed.
Sure, I agree with that, but that's a property of hand-writing more than of rule-based systems. For instance, you could probably translate a 6B LLM into an extremely large rule system, but doing so would not help you understand how the LLM worked.