Techno Blender
Digitally Yours.

AI study by Google researchers reveals incredible jump in Med-PaLM 2’s answering accuracy



Artificial intelligence has permeated nearly every technology field today. One field that has been particularly resistant to emerging technologies is medicine. Because it deals with extremely sensitive decisions that can be matters of life and death, the medical field has been apprehensive about deploying new medtech tools in general practice. However, AI has been knocking at its door for some time, and if a new study conducted by Google researchers is to be believed, Google’s in-house Med-PaLM 2 is achieving very high accuracy scores in medical question-answering (MedQA) and is in a prime position to help medical professionals offer faster care to patients.

Med-PaLM 2 is a medical large language model (LLM) that is being trained to synthesise information from medical images. Google is not alone: other players are also working on generative AI for the healthcare industry, among them Sam Altman-led OpenAI with ChatGPT. And the competition is stiff. A study published in JAMA Internal Medicine found that ChatGPT delivered higher-quality answers to questions than the written responses of actual practitioners.

Now, on Wednesday, Google Health UK research lead Alan Karthikesalingam posted on Twitter, highlighting the accomplishment. He said, “So happy to share #MedPaLM2 – our team’s evolution of Med-PaLM. A new state of art for medical question-answering! Med-PaLM 2 scores 86.5% on MedQA-USMLE, exceeding Med-PaLM’s score by >19%, & 81.8% on PubMedQA”.

The MedQA-USMLE dataset is a set of multiple-choice questions based on the USA’s medical licensing exams, so a high score suggests that the AI could, in theory, clear the written exam required to practise medicine in the USA. PubMedQA is a comparable question-answering benchmark. On MedQA-USMLE, Med-PaLM 2 scored 86.5%, according to the study conducted by the group. The study is currently available as a pre-print on arXiv and has not yet been peer-reviewed or published in a journal.

Google AI scores big in Medical Licence Exam

In a series of tweets, Karthikesalingam described the high level of scrutiny applied to ensure that the test results were not a fluke or a misrepresentation of the AI platform’s abilities. He said, “We believe in rigorous, careful evaluation. Physicians even preferred #MedPaLM2’s long-form answers to answers from other real physicians along 8/9 axes of quality including medical accuracy (consensus w/medical opinion) and reasoning, with less likelihood of harm”.

“To highlight the real-world importance of nuanced evaluation we introduce a new dataset of ‘adversarial’ questions designed specifically to probe LLM weaknesses including #HealthEquity,” he added.

It is unclear at the moment how much of an impact this new AI technology can have in the medical field, but Google seems optimistic about the results. The study, however, is just the beginning. For the technology to be adopted and used in real-life situations, it will have to undergo much stricter scrutiny to establish whether the AI can consistently and reliably help patients with their health care.

Even Google chief Sundar Pichai, speaking at the recently held Google I/O, highlighted how the company was working on this technology in a careful and responsible manner to ensure it did not go wrong.


