
GPT-3 Can Spin Lies to Make Its Content Interesting


GPT-3's text generation process cannot guarantee that the information it produces is truthful.

The text generated by GPT-3 is undeniably impressive: it strings together polished vocabulary with flashes of wit. But the question remains whether such embellished phrases and paragraphs actually deliver the truth.

The Groundedness Problem

This problem is well known to the research community, which frames it in terms of groundedness. Currently, most generative text models lack the ability to ground their statements in reality, or at least attribute them to some external source; they often hallucinate plausible-sounding falsehoods. This limits their applicability to creative domains (fiction writing, gaming, entertainment, etc.) and makes them dangerous wherever truthfulness should be the first priority (news, scientific articles, education, etc.).

The situation becomes even more worrisome in ill-intentioned hands. Generative text models give bad actors a tool to lie at scale. They can inundate social platforms with overwhelming amounts of content that seems true simply because of its sheer volume. They can also target individuals, tailoring falsehoods to convince each person in particular based on their social media profile.

The way researchers currently attempt to tackle the problem of groundedness is by incorporating an additional retrieval step in the generation process: before producing the output text, the model is trained to perform a lookup in some external database and gather supporting evidence for its claims.
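To make the idea of a retrieval step concrete, here is a minimal sketch of retrieval-augmented generation. It is an illustration of the general pattern, not any specific model's implementation; the corpus, the word-overlap retriever, and the generate stand-in are all invented for this example.

# A minimal sketch of retrieval-augmented generation, with toy stand-ins for
# the retriever and the language model (illustrative only, not a real API).

TOY_CORPUS = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "GPT-3 is a large language model released by OpenAI in 2020.",
]

def search_corpus(query: str, k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(TOY_CORPUS, key=lambda d: -len(q_words & set(d.lower().split())))
    return ranked[:k]

def generate(prompt: str) -> str:
    """Stand-in for a language model call; a real system would invoke an LLM here."""
    return f"(model output conditioned on: {prompt[:60]}...)"

def grounded_answer(question: str) -> tuple[str, list[str]]:
    # 1. Look up supporting evidence *before* producing the output text.
    evidence = search_corpus(question)
    # 2. Condition the generator on the question plus the retrieved evidence,
    #    so each claim can be attributed to an external source.
    prompt = (
        "Answer using only the sources below.\n"
        + "\n".join(f"[{i}] {doc}" for i, doc in enumerate(evidence))
        + f"\nQuestion: {question}\nAnswer:"
    )
    return generate(prompt), evidence

print(grounded_answer("When was GPT-3 released?"))

The key design choice is that evidence is fetched first and the generator is conditioned on it, so the output can be attributed to the retrieved documents rather than to the model's parametric memory alone.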

LaMDA’s Attempt at Groundedness

Google's state-of-the-art conversational model LaMDA uses an entire toolset to help ground its responses: a retrieval component, a calculator, and a translator. The process works roughly as follows:

1. Based on the conversation history, LaMDA produces an unconstrained response (ungrounded at this stage).
2. LaMDA evaluates whether adjustments are needed (i.e., whether to use any of its tools). When it decides that supporting evidence is needed, it produces a query.
3. It issues the query to look up a supporting document.
4. It rewrites the previous response so that it is consistent with the retrieved source.

Steps 2 through 4 are repeated until LaMDA decides no further adjustments are needed; note that this might require retrieving multiple pieces of evidence. While this method is shown to improve groundedness over a baseline model without retrieval (from 60% to just under 80%), it is still far from human performance (~95%, even when humans don't have access to an information retrieval system).
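The following sketch shows the shape of such a "generate, retrieve, revise" loop. It is a simplification in the spirit of the steps above, not LaMDA's actual components; every helper here is a toy stand-in invented for illustration.

# Simplified sketch of an iterative generate-retrieve-revise loop.
# All helpers are toy stand-ins, not LaMDA's actual API.

MAX_ROUNDS = 3  # assumed cap on revision rounds

def base_generate(history: list[str]) -> str:
    """Toy stand-in for the base model's unconstrained draft (step 1)."""
    return "The Eiffel Tower was completed in 1890."

def propose_query(history: list[str], draft: str) -> str | None:
    """Toy stand-in for deciding whether evidence is needed and forming a query (step 2)."""
    return "Eiffel Tower completion year" if "1890" in draft else None

def retrieve(query: str) -> str:
    """Toy stand-in for the external lookup (search, calculator, translator, ...) (step 3)."""
    return "The Eiffel Tower was completed in 1889."

def rewrite_with_evidence(draft: str, evidence: str) -> str:
    """Toy stand-in for rewriting the draft to be consistent with the evidence (step 4)."""
    return draft.replace("1890", "1889")

def grounded_response(history: list[str]) -> str:
    # Step 1: produce an unconstrained (ungrounded) draft response.
    response = base_generate(history)
    for _ in range(MAX_ROUNDS):
        # Step 2: decide whether evidence is needed; stop if no query is proposed.
        query = propose_query(history, response)
        if query is None:
            break
        # Step 3: issue the query and gather supporting evidence.
        evidence = retrieve(query)
        # Step 4: rewrite the response so it is consistent with the evidence.
        response = rewrite_with_evidence(response, evidence)
    return response

print(grounded_response(["When was the Eiffel Tower completed?"]))

Running this, the loop revises the draft once, then terminates because the checker proposes no further queries, which mirrors the "repeat until no further adjustments are needed" behavior described above.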

It Is Still a Lengthy and Complicated Process

Defining and measuring groundedness is a challenge in and of itself. In a recent paper from Google, researchers distinguish between fact-checking (i.e., judging whether a statement is universally true) and attribution (i.e., identifying a supporting document). The latter is somewhat more tractable, since it postpones the question of whether the identified source is credible. But even after decoupling these two concepts, reasoning about attribution is non-trivial.

Another challenge in ensuring groundedness is the lack of training data. In order to teach a model about attribution, one must provide positive and negative pairs of <statement, evidence>. Given how nuanced attribution can be, such data points require manual annotation. For instance, the LaMDA authors collected training instances by showing crowd workers ungrounded model responses (from step 1) and asking them to manually perform steps 2 through 4: issue queries, collect supporting evidence, and modify the model's response until it is consistent with the evidence.

Finally, incorporating a retrieval component is an engineering challenge. On each training step, the model needs to make (potentially multiple) lookups into an external database, which slows down training. This latency concern also applies during inference, which is already a pressing problem for transformer-based models.
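To make the shape of such <statement, evidence> training data concrete, here is a hypothetical sketch. The schema and the example pairs are invented for illustration; real annotation efforts use their own formats and guidelines.

# Illustrative sketch of labeled <statement, evidence> pairs for training an
# attribution model. The schema and examples are made up for this article.

from dataclasses import dataclass

@dataclass
class AttributionExample:
    statement: str      # a claim produced by the model
    evidence: str       # a candidate supporting passage
    attributable: bool  # human label: does the evidence support the statement?

training_data = [
    AttributionExample(
        statement="The Eiffel Tower was completed in 1889.",
        evidence="Construction of the Eiffel Tower finished in March 1889.",
        attributable=True,   # positive pair
    ),
    AttributionExample(
        statement="The Eiffel Tower was completed in 1889.",
        evidence="The Eiffel Tower is 330 metres tall.",
        attributable=False,  # negative pair: related topic, but does not support the claim
    ),
]

The negative example illustrates why manual annotation is needed: the evidence is topically related to the statement, yet does not actually support it, a distinction that simple keyword matching would miss.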

Conclusion

Most text generation models (including the GPT family) are prone to making false statements due to their inability to ground their responses in the external world. This is because they were trained to sound truthful, not to be truthful. Models like LaMDA attempt to address this issue by incorporating a retrieval component (i.e., lookups into an external database) and iteratively improving the model response until it is consistent with the evidence. While promising, this strategy is not foolproof. It will be interesting to see how the community responds to this pressing challenge.




