How are current large language models being judged with apparent confidence, without a ranking of brain parameter size versus task performance metrics?
⚫ Question for Prof Gary Marcus re biological brain parameter counts vs. current LLM sizes, with respect to task performance:
Q-A: Since general intelligence occurs naturally at around 1000 trillion parameters (synaptic connections), or, in a reportedly rare case, at roughly 10% of that (a man who lost about 90% of his brain volume to hydrocephalus but reportedly still functioned normally; Feuillet et al., The Lancet):
1. Would you expect today’s models, with far fewer parameters, to exhibit general intelligence?
2. If not, what degree of general intelligence would you expect, given the apparent parameter mismatch so far?
3. How do you determine whether the degree seen in, for example, ChatGPT is inconsistent with some expected measure, given the parameter mismatch?
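To put a rough number on the mismatch referred to in Q-A, here is a back-of-the-envelope sketch. It assumes, as the question does, that one synaptic connection counts as one parameter. It also uses GPT-3's published 175 billion parameters as a stand-in for "today's models"; that figure is my assumption for illustration, not a claim about ChatGPT's actual size. All names in the snippet are illustrative.

```python
# Back-of-the-envelope comparison; all names here are illustrative.
BRAIN_SYNAPSES = 1000e12        # ~1000 trillion synaptic connections (figure from the question)
REDUCED_BRAIN_FRACTION = 0.10   # the reported hydrocephalus case: ~10% of the above
GPT3_PARAMETERS = 175e9         # assumption: GPT-3's published size, standing in for "today's models"


def mismatch_factor(biological_count: float, model_count: float) -> float:
    """Return how many times larger the biological count is than the model's."""
    return biological_count / model_count


full_brain_factor = mismatch_factor(BRAIN_SYNAPSES, GPT3_PARAMETERS)
reduced_brain_factor = mismatch_factor(
    BRAIN_SYNAPSES * REDUCED_BRAIN_FRACTION, GPT3_PARAMETERS
)

print(f"Full brain vs. model:    ~{full_brain_factor:,.0f}x")
print(f"Reduced brain vs. model: ~{reduced_brain_factor:,.0f}x")
```

Under these assumptions the gap is on the order of thousands for the full brain and hundreds for the reduced case. Whether a synapse and a model weight are comparable units at all is part of what the question is asking.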
Q-B: Is there a ranking somewhere of expected task performance against ranges of neuro-computational parameter counts?
If not, how are judgments being cast?
Question maker: G. Bennett