This part really freaked me out... GPT-2 couldn't do math:
Context → Passage: Saint Jean de Brébeuf was a French Jesuit missionary who travelled to New France in 1625. There he worked primarily with the Huron for the rest of his life, except for a few years in France from 1629 to 1633. He learned their language and culture, writing extensively about each to aid other missionaries. In 1649, Brébeuf and another missionary were captured when an Iroquois raid took over a Huron village. Together with Huron captives, the missionaries were ritually tortured and killed on March 16, 1649. Brébeuf was beatified in 1925 and among eight Jesuit missionaries canonized as saints in the Roman Catholic Church in 1930.
Question: How many years did Saint Jean de Brébeuf stay in New France before he went back to France for a few years?
Answer: Completion → 4
Author here: Sorry for the confusing formatting on the task descriptions at the end of the paper. That "4" is the human-generated target completion, not a model-generated completion. I'm not sure whether the model got that particular question correct, but per Table 3.7, GPT-3 has 36.5% accuracy on DROP in the few-shot setting.
Many other readers were confused by this so we'll update the formatting to say "target completion" to make this more clear.
Thanks for clarifying. I'm a bit more confused now, though: are you saying that all of these Q&A examples are human-answered, and that you were just demonstrating the format / question types for Q&A? If so, is there any way to see some of the model's responses?
It seems it has (rudimentarily) learned concepts general enough to amount to logic itself. That is general intelligence. Now hook it up to reinforcement circuitry and make it even larger, and it will mark the end of life as we know it.
GPT-3 has 175 billion parameters, but the human brain has 100 trillion synapses, so about 0.175% of that count. NN model capacity currently has a 3.4-month doubling time.[1] In 7-10 doublings we'll be in a similar ballpark, i.e. 2-3 years.
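The back-of-the-envelope arithmetic above is easy to check; a minimal sketch, using the parent comment's figures (175 billion parameters, 100 trillion synapses, 3.4-month doubling time):

```python
import math

params = 175e9         # GPT-3 parameter count
synapses = 100e12      # rough human synapse count
doubling_months = 3.4  # cited capacity-doubling time

ratio = params / synapses                   # fraction of human synapse count
doublings = math.log2(synapses / params)    # doublings needed to close the gap
years = doublings * doubling_months / 12    # calendar time at that rate

print(f"{ratio:.3%}")      # 0.175%
print(f"{doublings:.1f}")  # 9.2
print(f"{years:.1f}")      # 2.6
```

So the exact figure is ~9.2 doublings (≈2.6 years), which sits inside the "7-10 doublings, 2-3 years" ballpark claimed above.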
Is there any specific reasoning behind equating 1 synapse to 1 NN parameter? Seems a bit simplistic. Seems to me like a synapse probably has more computational ability than a single parameter.
Real neurons have many other trainable parameters and a lot more computational structure, so this is of course a simplifying assumption. But it is not entirely baseless either: ANNs are known to be universal function approximators in theory, which may suggest that synaptic weights do the heavy lifting in biological brains too (what more than general function approximation do you need?).
Though biological brains are likely overly complicated due to evolutionary baggage. There are hydrocephalus cases with much-reduced brain matter but still high IQ.[1] The recurrent laryngeal nerve in giraffes is about 4.6 metres (15 ft) long because it runs down and back up the neck; it could not be rewired more directly during evolution.[2] Our pristine mathematical models and low-noise computational environments are likely superior to evolved wetware hacks.
Also, if anything, brains are hyper-optimized for many things (judging by the many specialized sub-units). I'd bet we are essentially not unsupervised: the sub-units of the brain are fine-tuned for many tasks and hyper-optimized to use all their resources incredibly efficiently (the memory optimization must be intense). Not that the generative models won't get close in some general way relatively soon, but I could see human brains being another 10-1000x more powerful than your ballpark pretty easily.
Passage: Saint Jean de Brébeuf was a French Jesuit missionary who travelled to New France in 1625. There he worked primarily with the Huron for the rest of his life, except for a few years in France from 1629 to 1633. He learned their language and culture, writing extensively about each to aid other missionaries. In 1649, Brébeuf and another missionary were captured when an Iroquois raid took over a Huron village. Together with Huron captives, the missionaries were ritually tortured and killed on March 16, 1649. Brébeuf was beatified in 1925 and among eight Jesuit missionaries canonized as saints in the Roman Catholic Church in 1930.
Question:
How many years did Saint Jean de Brébeuf stay in New France before he went back to France for a few years?
Answer:
4
Explanation:
The model used the arithmetic expression −1629 + 1633 = 4.
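For anyone curious how a model arrives at an expression like that: DROP-style arithmetic models such as NAQANet effectively assign a sign (+, −, or ignore) to each number in the passage and sum the result. A minimal brute-force sketch of that idea over signed pairs (an illustration only, not the actual model):

```python
import re
from itertools import combinations

passage = ("Saint Jean de Brébeuf was a French Jesuit missionary who "
           "travelled to New France in 1625. There he worked primarily with "
           "the Huron for the rest of his life, except for a few years in "
           "France from 1629 to 1633.")

# Numbers the model could combine arithmetically.
numbers = [int(n) for n in re.findall(r"\d+", passage)]

# Signed-pair search: which (-a + b) combinations hit the target answer?
target = 4
hits = [(-a, b) for a, b in combinations(numbers, 2) if -a + b == target]
print(hits)  # [(-1625, 1629), (-1629, 1633)]
```

Note that both pairs sum to 4: −1625 + 1629 is the span the question actually asks about (arrival to departure), while −1629 + 1633, the expression reported above, is the length of his stay back in France, which happens to give the same answer.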
NAQANet (trained on DROP), which came out in 2019, is able to do this kind of reasoning; you have to click the result twice. On the first attempt it thinks it got the answer from the passage; on the second attempt it tries to do arithmetic.
Context → Passage: Saint Jean de Brébeuf was a French Jesuit missionary who travelled to New France in 1625. There he worked primarily with the Huron for the rest of his life, except for a few years in France from 1629 to 1633. He learned their language and culture, writing extensively about each to aid other missionaries. In 1649, Brébeuf and another missionary were captured when an Iroquois raid took over a Huron village. Together with Huron captives, the missionaries were ritually tortured and killed on March 16, 1649. Brébeuf was beatified in 1925 and among eight Jesuit missionaries canonized as saints in the Roman Catholic Church in 1930.
Question: How many years did Saint Jean de Brébeuf stay in New France before he went back to France for a few years?
Answer: Completion → 4