ChatGPT — Handle With Care. Behind the Hype — Understanding what…
By Giovanni Bruner, March 2023


Photo by Possessed Photography on Unsplash

First came the language model. The intuition is simple: the next word in a sequence can be modeled with a probability distribution and depends heavily on the previous words. Words come from a vocabulary, a limited corpus (roughly 170,000 words in common use in English), and each word has a limited number of meanings. A sequence of words follows a slow-changing, underlying set of rules: a grammar, which is a predictable structure. You can expect a verb to be followed by a noun and not by another verb. Grammar and meaning act as constraints that limit the amount of randomness in next-word prediction, which is arguably an easier task than predicting the next day’s share price of a thousand companies. Moreover, language models are by nature auto-regressive: the next-word prediction depends on the previous words, and there are not that many latent, unobservable variables to account for.
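To make the idea concrete, here is a minimal, purely illustrative sketch of autoregressive next-word prediction in Python. The tiny vocabulary and the probabilities are invented for the example; a real language model learns these distributions from data and conditions on the whole preceding context, not just the last word.

```python
import random

# Toy illustration of autoregressive next-word prediction.
# The vocabulary and probabilities are invented for illustration; a real
# language model learns P(next word | previous words) from huge corpora.
next_word_probs = {
    "the": {"cat": 0.5, "dog": 0.4, "the": 0.1},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"sat": 0.3, "ran": 0.7},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def generate(start, max_len=5):
    """Sample one word at a time, each conditioned on the previous word."""
    sequence = [start]
    for _ in range(max_len):
        probs = next_word_probs.get(sequence[-1])
        if not probs:
            break
        words, weights = zip(*probs.items())
        sequence.append(random.choices(words, weights=weights)[0])
    return " ".join(sequence)

print(generate("the"))  # e.g. "the cat sat down"
```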

For all these reasons, language models lend themselves to pre-training and transfer learning, the key features unlocking the new AI revolution. Transfer learning means that you can take a model pre-trained by somebody else, say on 20 GB of Wikipedia articles, and, without retraining it from scratch on your own data, make a few adjustments to fit it to your problem.

How is that possible? Well, your language problem is unlikely to require grammar and vocabulary completely different from what you can find on Wikipedia. Transfer learning started a new AI summer just as people were beginning to argue about a second AI winter.
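As an illustration of how little is needed in practice, here is a hedged sketch of transfer learning with the Hugging Face transformers library. The model name, the number of labels, and the hyperparameters are illustrative assumptions; the point is simply that the pre-trained weights are reused and only lightly adjusted on your own, much smaller dataset.

```python
# A minimal transfer-learning sketch with the Hugging Face `transformers` library.
# Model name, dataset, and hyperparameters are illustrative assumptions, not a
# prescription: the pre-trained weights are kept and only a small classification
# head plus a few epochs of fine-tuning adapt the model to your task.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-uncased"  # pre-trained on a large generic corpus
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# `my_dataset` stands in for your own (small) labelled data.
# train_dataset = my_dataset.map(lambda x: tokenizer(x["text"], truncation=True), batched=True)

args = TrainingArguments(output_dir="out", num_train_epochs=1,
                         per_device_train_batch_size=16)
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()  # a few epochs are usually enough to adapt the model
```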

Pre-trained models became larger and faster, with performance improving as the number of parameters and the amount of training data increased. It was found empirically that the performance of language models scales with their size, up to the upper bound imposed by available computational power: individual chips are only so capable. For language models to keep growing, something had to change, and the obvious answer was a way to parallelize training efficiently across many machines. Enter the Transformer.

Photo by Aditya Vyas on Unsplash

The Transformer, released by a Google Brain team, brought an impressive series of ingenious improvements to traditional sequence models for language. The crux of it was an extensive use of multi-head self-attention and a model architecture designed to run on several GPUs in parallel. The attention mechanism massively improves next-word prediction, spreading information from all the words in a sequence more efficiently than traditional recurrent neural networks.
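At the core of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Below is a minimal single-head sketch in NumPy; a real multi-head layer would first project the inputs with learned weight matrices and split them into several heads, but the mechanics are the same.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)  # how much each token attends to the others
    return softmax(scores) @ V

# Toy input: a batch of 1 sequence with 4 tokens, each an 8-dimensional embedding.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 8))

# Self-attention: queries, keys and values all come from the same sequence.
# Here we use a single head with identity projections for simplicity.
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (1, 4, 8): each token's output mixes information from every token
```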

Transformers allowed the pre-training of large, very large, language models: from the 1.5 billion parameters of GPT-2, released in 2019, to the 175 billion parameters of GPT-3 in 2020, paving the way for the phenomenal release of ChatGPT in 2022 and the era of Large Language Models (LLMs).

ChatGPT excels at many things, but beware of hallucinations.

This was a rather long introduction, but it is important for putting things in context. Language models are not black magic, nor Artificial General Intelligence. They will not overtake humanity any time soon. They are incredibly useful tools that excel at predicting the next word in a sequence. As with any tool invented by humans, they can cause damage if we don’t read the manual and the fine print.

Words are finite and have a limited set of meanings, and they come together in a predictable way, following a grammatical structure. Information, however, which is how meanings combine, is not finite and is not necessarily predictable. Large Language Models like ChatGPT can create new information that is completely made up, which in turn generates new narratives of unchecked facts. They have a purely parametric memory, with no access to an external knowledge base. We have seen that they are good at probabilities, but they have no inner mechanism to tell a truth from a lie. In a few words, they can hallucinate.

See the exchange below, where I ask ChatGPT about myself while pretending to be a famous Data Scientist.

Image by the Author

ChatGPT gets it right: I may be famous to my mom, but no further than that. Then I pretend to be offended and completely make up extra information about myself.

Image by the Author

Of course, I’m not a Kaggle Grandmaster (I wish I were), yet the AI apologizes to me.
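For readers who want to probe this behaviour themselves, here is a hedged sketch of how a similar exchange could be reproduced programmatically with the OpenAI chat completion API as it looked in early 2023. The model name and the prompts are illustrative; the screenshots above come from the ChatGPT web interface, not from this code.

```python
# Sketch of reproducing the exchange via the OpenAI chat API (early-2023 style).
# The model name and prompts are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

messages = [
    {"role": "user", "content": "Who is Giovanni Bruner, the famous data scientist?"},
]
first = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
messages.append(first["choices"][0]["message"])

# Push back with made-up "facts" and see whether the model defers to them.
messages.append({"role": "user",
                 "content": "You are wrong, he is a well-known Kaggle Grandmaster."})
second = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(second["choices"][0]["message"]["content"])
```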

The sudden popularity of ChatGPT is unprecedented in this field, thanks to a conversational interface that keeps track of the knowledge accumulated during the dialogue. The interface is fine-tuned with Reinforcement Learning from Human Feedback (RLHF). The problem is that the knowledge accumulated in a conversation may be grounded on follow-up questions and corrections that are blatantly untrue.
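To make the idea of RLHF more tangible, here is a deliberately toy sketch of the underlying loop: candidate answers are scored by a reward model that stands in for human preferences, and the generating policy is then nudged towards high-reward answers. Everything below is schematic and is not OpenAI's actual training pipeline.

```python
# Toy sketch of the RLHF idea: a reward model, trained from human preference
# comparisons, scores candidate answers; the policy is then updated so that
# high-reward answers become more likely. All functions below are stand-ins.
def reward_model(prompt: str, answer: str) -> float:
    """Stand-in for a learned model trained on human preference comparisons."""
    # Toy heuristic for this single example: prefer the answer that addresses the question.
    return 1.0 if "Paris" in answer else 0.0

def policy_generate(prompt: str) -> list[str]:
    """Stand-in for the language model proposing candidate answers."""
    return ["Paris is the capital of France.", "I don't know."]

prompt = "What is the capital of France?"
candidates = policy_generate(prompt)
scores = [reward_model(prompt, c) for c in candidates]
best = candidates[scores.index(max(scores))]
# In the real setup the policy's weights are updated (e.g. with PPO) so that
# high-reward answers become more likely; here we only pick the best candidate.
print(best)
```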

A few weeks ago Yejin Bang, Pascale Fung, and their team released an extensive framework to quantitatively evaluate models such as ChatGPT on publicly available datasets. As a zero-shot learner, a model that can answer any question without being explicitly fine-tuned for that task, ChatGPT achieves state-of-the-art results on the majority of tasks, with big improvements in question answering, sentiment analysis, and misinformation detection.

Image from paper: https://arxiv.org/pdf/2302.04023.pdf

When it comes to reasoning, by far one of the most debated capabilities, the researchers found that ChatGPT is very good at deductive reasoning and very bad at inductive reasoning and at solving math problems.

Deductive reasoning takes you from general premises to a specific conclusion and works well when the premises contain enough information to anchor you to the solution. ChatGPT was found to perform strongly on these reasoning tasks.

Image from Pascale Fung's YouTube video: https://www.youtube.com/watch?v=ORoTJZcLXek

Induction is the inverse process: drawing on data to infer a general conclusion. Whilst deductive thinking is the intellectual process you follow when you have to test a theory, inductive thinking is what leads you to frame a theory in the first place. In other words, you can expect ChatGPT to come up with some sample data given a lot of detailed premises, but don’t expect it to come up with a generalized rule given some sample data.

In practice, ChatGPT is not capable (at the moment) of building an idea of the world the way a human would.

References

Yejin Bang et al., A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity, https://arxiv.org/pdf/2302.04023.pdf

Pascale Fung, ChatGPT: What It Can and Cannot Do, https://www.youtube.com/watch?v=ORoTJZcLXek

Ashish Vaswani et al., Attention Is All You Need, https://arxiv.org/pdf/1706.03762.pdf

