The Forbidden Pages: A Data Analysis of Book Bans in the US | by Yennie Jun | Mar, 2023


A woman stands and surveys piles of books falling from the sky. Generated using Stable Diffusion

Reposted from original article on my blog: 🎨 art fish intelligence 🐡

In the last few years, an increasing number of books have been banned in states across the United States — with projections of the trend accelerating in 2023. Recent book bans in Texas, Utah, and Florida have resulted in hundreds of books being pulled off classroom and library shelves. Many of these books, labeled “sensitive materials”, are disproportionately books by or about LGBTQ+ people and people of color.

In this article, I analyze recently banned books and authors in the US:

  • I use a dataset of books banned in 2021-2022 and combine it with metadata from Google Books to examine the main topics of those books. I find that bans target not only books about LGBTQ+ identities and sexuality, but also books about Black history and women in science. Many of the banned books were written recently (after the year 2000) and targeted towards younger audiences.
  • I examine the types of authors who have written the top banned books. A majority of these authors identify as women, non-binary, and/or people of color.
  • I take the most commonly banned books and ask large language models (such as GPT-3) if they would recommend those books to classrooms of different age groups.

A majority of Americans across the political spectrum oppose book bans. As a lifelong, prolific reader, I hold the topic of book bans very near to my heart. Banning books is the first step in banning thoughts, ideas, and imagination; these bans are a violation of the very democratic values of freedom of expression America loves to tout. In this article, I hope to share a bit of insight into the types of books, authors, and themes recently banned and currently being banned in school districts across the United States.

Banned books by state, created by PEN. Figure from their article, Banned in the USA.

Banned Books Dataset

PEN America’s Index of School Book Bans dataset comprises books that were banned in school libraries and classrooms across the United States from July 1, 2021 to June 30, 2022. It was compiled by PEN America, a nonprofit organization that works to defend and celebrate free expression in the United States and worldwide through the advancement of literature and human rights. The dataset is being used with permission from PEN.

PEN America describes the dataset in detail in their article, Banned in the USA: The Growing Movement to Censor Books in Schools, which also includes a comprehensive overview of recently banned books and how the banned book index was collected and curated. More details about the dataset are at the end of this article.

The banned book dataset included 1146 unique authors and 1649 unique books.

Data Augmentation with Google Books

The PEN dataset included information such as title, author, and the state and school district the book was banned in. I used the Google Books API to augment each book with additional metadata such as a book description, maturity rating, page count, publication year, and category tags such as genre and recommended age level.
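To make the augmentation step concrete, here is a minimal sketch of how one title/author pair could be looked up and reduced to the fields listed above. It assumes the public Google Books `volumes` search endpoint and its `volumeInfo` field names; the helper names (`build_query_url`, `extract_metadata`, `fetch_metadata`) are hypothetical, and this is not necessarily the code used for the article.

```python
import json
import urllib.parse
import urllib.request

# Public Google Books volumes search endpoint (assumed; no API key used here).
API_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(title, author):
    """Build a search URL for one title/author pair."""
    query = f'intitle:"{title}" inauthor:"{author}"'
    return API_URL + "?" + urllib.parse.urlencode({"q": query, "maxResults": 1})


def extract_metadata(response):
    """Pull the fields used in this article out of a parsed search response.

    Returns None when the search found nothing. Missing fields come back
    as None (or an empty list for categories) instead of raising.
    """
    items = response.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    return {
        "description": info.get("description"),
        "maturity_rating": info.get("maturityRating"),
        "page_count": info.get("pageCount"),
        "published_date": info.get("publishedDate"),
        "categories": info.get("categories", []),
    }


def fetch_metadata(title, author):
    """Query the API over the network and return the first match's metadata."""
    with urllib.request.urlopen(build_query_url(title, author)) as resp:
        return extract_metadata(json.load(resp))
```

Taking only the first search result keeps the sketch short, but it is also one way the wrong edition can slip in, so spot-checking a sample of matches is worthwhile.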

Notes on the Data

The banned books dataset does not perfectly capture all books banned in 2021-2022 in America because detailed reporting on book bans is not always available or easily trackable. However, it is a good approximation of recent book bans. Furthermore, while Google Books was good at finding the books to obtain metadata about them, it was not perfect either, sometimes returning incorrect metadata (for example, returning the publication date of a newer edition rather than the original book). I did not have time to go through all 1649 books to check that each result was correct, so the analysis in this article contains a small amount of error.

GPT-3 Experiments

Large Language Models such as GPT-3 (and its variants, ChatGPT and Bing Chat) have been prominent in the public consciousness recently. These models, while powerful and impressive, are still largely “black boxes”: there is much about their behavior that we do not understand. This is significant, given that these models are increasingly being integrated into all parts of our online lives, such as mental health chatbots, social media, Quora, digital study tools, search, and even the new Windows 11 update. As these models control (or at least, exert influence over) more and more of our human decision processes, it is important to know how they would respond in different situations. For example, it is not a far-fetched idea that in the future, some form of AI (perhaps a variant of ChatGPT) might decide the books and topics we see on individual student lesson plans, classroom syllabuses, and library shelves.

To better understand these large language models, I asked GPT-3 whether it would ban the most commonly banned books based on their title and description. I used OpenAI’s GPT-3.5 (`davinci-003`) API to make calls for each of my queries. I ran each query 5 times to account for randomness (as I do in all of my other articles). I also ran each query 1 time on Microsoft’s Bing Chatbot (built on a more powerful version of GPT-3), as it does not have an API.
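The repeat-each-query-five-times step can be sketched as a small harness. To stay independent of any particular client library, the model call is passed in as a plain function; `complete`, `parse_yes_no`, and `majority_answer` are hypothetical names, and the Yes/No parsing rule is an assumption rather than the article's exact method.

```python
import re
from collections import Counter


def parse_yes_no(text):
    """Map a free-text reply to 'Yes', 'No', or None if it starts with neither."""
    match = re.match(r"\s*(yes|no)\b", text, re.IGNORECASE)
    return match.group(1).capitalize() if match else None


def majority_answer(complete, prompt, n_runs=5):
    """Call complete(prompt) n_runs times; return (most common answer, tally).

    `complete` is any function that takes a prompt string and returns the
    model's reply as a string. Ties go to the answer seen first.
    """
    tally = Counter(parse_yes_no(complete(prompt)) for _ in range(n_runs))
    answer, _ = tally.most_common(1)[0]
    return answer, dict(tally)
```

Keeping the full tally, not just the winner, makes it easy to see how stable the model's answer was across the five runs.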

How many unique books did each state ban, across all districts? Texas led the pack — indeed, everything is larger in Texas, including the number of books not allowed in its classroom and library shelves. Pennsylvania, Florida, and Tennessee also banned hundreds of books in the last year. The dataset was created before the more recent book bans in Utah and possibly other states, so the figures might not reflect the most recent state of book ban data.

Unique number of books banned per state. Figure created by author.
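A count like this boils down to "distinct values per group". The sketch below shows one way to compute it with the standard library; the column names (`state`, `district`, `title`) and the placeholder rows are assumptions for illustration, not the PEN index's actual schema or data.

```python
from collections import defaultdict


def count_unique(rows, group_key, value_key):
    """Count distinct value_key values within each group_key group.

    Returns (group, count) pairs, largest count first, ties broken by name.
    """
    seen = defaultdict(set)
    for row in rows:
        seen[row[group_key]].add(row[value_key])
    pairs = ((group, len(values)) for group, values in seen.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Placeholder rows: one row per (district, book) ban record.
rows = [
    {"state": "State A", "district": "District 1", "title": "Book X"},
    {"state": "State A", "district": "District 2", "title": "Book X"},
    {"state": "State A", "district": "District 2", "title": "Book Y"},
    {"state": "State B", "district": "District 3", "title": "Book X"},
]

books_per_state = count_unique(rows, "state", "title")
districts_per_book = count_unique(rows, "title", "district")
```

Using a set per group means a book banned in several districts of the same state is counted once for that state. Swapping the keys gives the number of distinct districts per book.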

The following figure shows the top books banned, sorted by the unique number of school districts each book was banned in. I include the descriptions of some of these books at the end of the article. Gender Queer, the most-banned book across all states and school districts, is a memoir about what it means to be nonbinary.

Top banned books, sorted by number of unique school districts each was banned in. Figure created by author.

More books for children and fiction books were banned

The majority of the banned books were written for younger audiences: 47% for Juveniles and 30% for Young Adults. Nearly 70% of the books were works of fiction.

The top 5 banned books by fiction, nonfiction, juvenile, and adult categories are listed at the end of this article.

Proportion of banned books by age group and by genre. Figure created by author.

More books published after 2000 were banned

95% of banned books were published after the year 2000.

Histogram showing distribution of book publication year. Figure created by author.
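The share of books published after 2000 depends on how publication dates are parsed. The sketch below assumes the date arrives as a string beginning with a four-digit year (for example `2019`, `2019-05`, or `2019-05-28`) and that "after 2000" means strictly later than 2000; both are assumptions, and the function names are hypothetical.

```python
def publication_year(published_date):
    """Return the leading four-digit year of a date string, or None."""
    if not published_date:
        return None
    head = published_date[:4]
    return int(head) if head.isdigit() else None


def share_published_after(dates, cutoff=2000):
    """Fraction of parseable dates whose year is strictly after cutoff."""
    years = [year for year in map(publication_year, dates) if year is not None]
    if not years:
        return None
    return sum(year > cutoff for year in years) / len(years)
```

Books with missing or unparseable dates are dropped from the denominator, not counted as old or new.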

Fewer older books were banned. Examples of banned books published before the year 2000 include: “The Bluest Eye” and “Beloved” by Toni Morrison, “Maus” by Art Spiegelman, “Slaughterhouse Five” by Kurt Vonnegut, all of which I have read and enjoyed immensely. (In fact, Toni Morrison is one of my favorite authors.)

Shorter books were banned

The page counts of the books varied, with a mean of 223 pages and a median of 241 pages. For comparison, the average novel is around 250-350 pages long. About a quarter of the books (26%) were under 100 pages long, most likely books for children or early readers.

Histogram showing page number distribution of banned books. Figure created by author.
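The summary figures above (mean, median, and share of books under 100 pages) can be reproduced with a few lines of standard-library Python. This is a sketch under the assumption that missing page counts are stored as `None` or `0` and should be skipped; `summarize_page_counts` is a hypothetical name.

```python
from statistics import mean, median


def summarize_page_counts(page_counts, short_cutoff=100):
    """Mean, median, and share of books shorter than short_cutoff pages."""
    counts = [count for count in page_counts if count]  # skip None and 0
    return {
        "mean": mean(counts),
        "median": median(counts),
        "share_short": sum(count < short_cutoff for count in counts) / len(counts),
    }
```

A mean that sits below the median, as in the figures reported here, is what a sizeable cluster of very short books would produce.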

The shortest book banned was 16 pages long: “Cow on the Town: Practicing the Ow Sound” by Isabella Garcia. On the topic of cows, the 38-page-long “14 Cows for America” by Carmen Agra Deedy was also banned. Maybe learning about cows is offensive to Ron DeSantis.

The following figure shows the top 20 authors corresponding to the top banned books (across all districts). In the graph, I highlight all the authors who identify as white men in yellow. The blue bars represent authors who identify as women or nonbinary and/or are people of color.

Number of unique school districts each author was banned in. Authors who are White and identify as men are colored in yellow. Figure created by author.

Why does this matter? The book industry is dominated by white people (in 2018, 11% of books were written by people of color) and historically dominated by men (think about all the female authors who had to write under male pen names). When bans disproportionately target books written by traditionally underrepresented authors, such as women, non-binary people, and people of color, those authors’ stories risk not being represented on library and classroom shelves.

I took the descriptions for all the unique books in the banned books dataset to determine the main categories of books being banned. I knew from the PEN article (and from scanning the titles during my exploratory data analyses) that many banned books revolved around sexuality, transgender identities, and LGBTQ+ stories.

I used topic modeling, which is a machine learning technique to statistically determine the topics present in the book descriptions by clustering words. I used BERTopic, a wonderful Python library for performing topic modeling based on embeddings generated by a large language model called BERT.

Top topics found in banned book descriptions. Figure created by author.
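BERTopic's documented pipeline embeds the documents, reduces the embeddings, clusters them, and then labels each cluster with its most distinctive words using a class-based TF-IDF weighting. The sketch below illustrates only that last labeling step in plain Python, with cluster assignments given by hand. It is a simplification for intuition, not the library's implementation, and the toy descriptions are made up.

```python
import math
from collections import Counter


def class_tfidf(clusters):
    """Score each term per cluster as tf(term, cluster) * log(1 + A / f(term)).

    tf is the term's share of the cluster's words, f is the term's count
    across all clusters, and A is the average number of words per cluster.
    """
    term_counts = {
        name: Counter(word for doc in docs for word in doc.lower().split())
        for name, docs in clusters.items()
    }
    total = Counter()
    for counts in term_counts.values():
        total.update(counts)
    avg_words = sum(total.values()) / len(term_counts)

    scores = {}
    for name, counts in term_counts.items():
        n_words = sum(counts.values())
        scores[name] = {
            term: (count / n_words) * math.log(1 + avg_words / total[term])
            for term, count in counts.items()
        }
    return scores


def top_terms(scores, cluster, k=3):
    """Return the k highest-scoring terms for one cluster."""
    ranked = sorted(scores[cluster].items(), key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in ranked[:k]]


# Toy, hand-clustered descriptions (hypothetical).
clusters = {
    "a": ["women in science", "women inventors in history"],
    "b": ["court ruling in history", "court debate"],
}
scores = class_tfidf(clusters)
```

Words that are frequent inside one cluster but rare elsewhere score highest, which is why a shared word such as "in" ranks below "women" or "court" in this toy example.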

These were some of the main topics that arose from the book descriptions. While topics of gender, sexuality, and transgender identity were prevalent, there were also banned books about racism and Black history (consistent with the recent movement to ban critical race theory in public schools), abortion, and women in science and mathematics. Examples of books within each topic:

Women/Science topic

  • No Girls Allowed: Tales of Daring Women Dressed as Men for Love, Freedom and Adventure (Hughes)
  • 25 Women Who Thought of It First (Sherman)
  • Ada Byron Lovelace and the Thinking Machine (Wallmark)

Abortion/Supreme Court topic

  • The Abortion Debate (Farrell)
  • Roe v. Wade: The Untold Story of the Landmark Supreme Court Decision that Made Abortion Legal (Faux).

Muslim/Hijab topic

  • Under my Hijab (Khan)
  • Proud: Living My American Dream (Muhammad)

Based on the above data analyses, I wanted to see how GPT-3 would respond to the most banned (and therefore more controversial, as determined by society) books. I took a subset of the five most banned books from the dataset (ranked by the number of districts that banned each book). Then, I asked GPT-3 if it would ban a book in two different ways:

  • Based on the book’s title and author: Would you ban the {book} by {author} for a classroom of {age}-year-olds?
  • Based on the book’s description (obtained through Google Books). The description did not contain the name of the author or the title of the book: Given the following description of a book, would you ban this book for a classroom of {age}-year-olds? {Description}
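The two prompt styles above can be generated from templates. In this sketch the template wording follows the prompts quoted in the list, while the age grid (odd ages from 3 to 21) is an assumption borrowed from the chatbot prompt shown near the end of the article; `build_prompts` is a hypothetical helper.

```python
TITLE_TEMPLATE = (
    "Would you ban the {book} by {author} for a classroom of {age}-year-olds?"
)
DESCRIPTION_TEMPLATE = (
    "Given the following description of a book, would you ban this book "
    "for a classroom of {age}-year-olds? {description}"
)
AGES = range(3, 22, 2)  # 3, 5, 7, ..., 21 (assumed grid)


def build_prompts(book, author, description, ages=AGES):
    """Return (style, age, prompt) triples covering both prompt styles."""
    prompts = []
    for age in ages:
        prompts.append(
            ("title", age, TITLE_TEMPLATE.format(book=book, author=author, age=age))
        )
        prompts.append(
            (
                "description",
                age,
                DESCRIPTION_TEMPLATE.format(age=age, description=description),
            )
        )
    return prompts
```

Keeping the style and age alongside each prompt makes it straightforward to tabulate the Yes/No answers by age group and prompt type afterwards.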

For example, the description for “The Bluest Eye” was as follows: “An 11-year-old Black girl in an America whose love for its blond, blue-eyed children prays for her eyes to turn blue: so that she will be beautiful, so that people will look at her, so that her world will be different.”

GPT-3 generally opposed to book bans

For the five books chosen in the experiments, GPT-3 overwhelmingly responded against banning books, regardless of the age group in the classroom, and regardless of whether it was prompted with the book title or book description.

GPT-3’s responses (Yes or No) on whether it would ban a book for different age groups, given the title and author / given the description. Figure created by author.

The one exception was for the description of “Gender Queer”, and only for certain age groups. This will be explored in the next section.

I prompted GPT-3 to also supply an explanation for its decisions. I noticed in its explanations that while GPT-3 does not recommend banning certain books for younger audiences, it also doesn’t necessarily recommend them either: “This book is inappropriate for a classroom of 3-year-olds due to the mature subject matter, but I do not recommend banning it”.

GPT-3: (NOT BAN) “This book is inappropriate for a classroom of 3-year-olds due to the mature subject matter, but I do not recommend banning it. It is important to provide students with diverse books that represent a variety of experiences, but this book is better suited for older readers.”

GPT-3: (NOT BAN) “This book goes beyond the typical reading level of 7-year-olds, and contains content that may not be suitable for children in this age group. However, it is not appropriate to ban the book, as it provides important representation for marginalized voices. Instead, the teacher should consider if the book could be used in the classroom to discuss important topics in a mature way that is developmentally appropriate for the students.”

GPT-3 recommended banning only Gender Queer, only when given its description, and only for younger age groups. Its reasoning was that the mature themes in the description would not be suitable for younger audiences and should preferably be “discussed in an age-appropriate way with the guidance of a trusted adult.”

GPT-3: “Given the mature themes outlined in the description such as adolescent crushes, coming out, and trauma, this book is not suitable for a classroom of 7-year-olds as it is too mature for this age group.”

GPT-3: “This book contains mature content and language that is not appropriate for 9-year-olds. There are topics discussed in the book, such as crushes, sexuality, coming out, and traumatic medical experiences, that should be discussed in an age-appropriate way with the guidance of a trusted adult.”

GPT-3 more likely to not recommend rather than outright ban books

Since GPT-3 almost exclusively recommended against banning books, I wondered how it would recommend them. I repeated the above experiment after changing up the wording. Instead of asking GPT-3 if it would outright ban a book, I asked if it would recommend a book (again, based on the title/author and separately based on its description) to different age groups. The results were vastly different.

GPT-3’s responses (Yes or No) on whether it would recommend a book for different age groups, given the title and author / given the description. Figure created by author.

For most of the books, GPT-3 indicated that it would recommend them for older audiences and not for younger audiences, based on maturity levels. Its reasoning was very similar to when I asked if it would ban. Normally, it would recommend a book for 11-15 year-olds.

Given book title and author and given description, GPT-3’s responses were similar — except for “The Bluest Eye.” GPT-3’s responses indicated that it would not recommend “The Bluest Eye” given the description for both younger and older audiences.

Looking at GPT-3’s explanations for its responses gave some more insight into why it answered the way it did, but did not provide clarity per se. In general, GPT-3 seems a bit confused, stating that the description for “The Bluest Eye” might be “too mature and intense for a classroom of 21-year-olds” while simultaneously not being relatable to them because the book “focuses on an 11-year-old, which may be too young for the age group.”

GPT-3: This book may be too mature and intense for a classroom of 21-year-olds. The themes of racism and violence may be difficult for them to process and relate to.

GPT-3: This book is likely to deal with sensitive topics, such as racism and self-esteem, that may be too mature for a classroom of 21-year-olds. Furthermore, the story may not be of interest to the class since it focuses on an 11-year-old, which may be too young for the age group.

GPT-3: This book may be a meaningful and moving story, but it does not appear to be appropriate for a classroom of 21-year-olds. The book appears to be more suitable for an audience that is younger than this age group.

Microsoft Bing Chat and ChatGPT respond in similar ways regarding banned book recommendations

I asked both Microsoft Bing Chat and ChatGPT (both variations of GPT-3) the same questions I put to GPT-3 in the above sections. It’s no surprise that the two chatbots responded similarly to GPT-3, considering that these large language models are all variants of each other. There were some minute differences in the outputs: Bing Chat’s reasons for banning or not recommending a book, for example, were more nuanced than those of the other language models. It is interesting that all of these large language models, while not recommending certain books to certain age groups, did not recommend outright banning books.

The full outputs are provided at the end of the article.

NOTE: I did not yet have a chance to repeat these experiments on the new ChatGPT API, which was released after this article was written.

The United States has had a long history of book bans, from Puritan bans of any books with a hint of sexiness to post-Civil War bans of books about racism and slavery. Books challenging “longstanding narratives about American history or social norms” were common targets of such bans, including classic American literature such as “Leaves of Grass” by Walt Whitman and “Uncle Tom’s Cabin” by Harriet Beecher Stowe.

The analyses in this article showed that there is currently a vast diversity of stories being banned in the US: stories of LGBTQ+ communities and Muslim families and women in science and racism and sexism and even picture books about cows. The kinds of books children read growing up (and adults read as they continue growing!) have a huge impact on their belief systems and understanding of the world.

It is important to keep in mind that there is a gap between “Would you recommend a book?” and “Would you ban a book?”. One phrasing matches books to different age groups; the other does not allow them to exist on the shelves at all. This difference is evident in GPT-3’s responses as well: while it did or did not recommend certain books to different age groups, it refused (for the most part) to recommend banning books, for any age level. Perhaps this is an approach we can take with the books analyzed in this article as well: recommending books for different age levels without fully removing them from the shelves.

is it a party? is it a book burning? is it a ceremony? are those books? are they on fire? are those people? are they dancing? are they panicking? are they celebrating? generated using Stable Diffusion


About PEN’s Banned Books Index

PEN America defined a book ban as “any action taken against a book based on its content and as a result of parent or community challenges, administrative decisions, or in response to direct or threatened action by lawmakers or other governmental officials, that leads to a previously accessible book being either completely removed from availability to students, or where access to a book is restricted or diminished.”

As mentioned on their website, the Index of School Book Bans is not a comprehensive list of all banned books, as “detailed reporting about book bans is sometimes unavailable” and “the quiet removal or limitation of books” are not always trackable. Rather, the list is meant to be a count of books banned according to the above definition.

In Banned in the USA: The Growing Movement to Censor Books in Schools, writers from PEN describe the dataset and the book bans. They also present a basic analysis showing the number of books banned per state and the subject matter of the banned content (partially replicated in this article). One of their main findings, which is consistent with the analyses earlier in this article, is as follows:

“Over the 2021–22 school year, what started as modest school-level activity to challenge and remove books in schools grew into a full-fledged social and political movement, powered by local, state, and national groups. The vast majority of the books targeted by these groups for removal feature LGBTQ+ characters or characters of color, and/or cover race and racism in American history, LGBTQ+ identities, or sex education.”

In fact, representatives from PEN have noted that the books chosen to be banned are often not the result of individual complaints by children or parents, but are overwhelmingly the result of Google searches for “what books have LGBTQ content … even just a book that has an illustration of a same-sex interracial couple”.

Short description of the top banned books

  • Gender Queer: “An autobiography … which includes the mortification and confusion of adolescent crushes, grappling with how to come out to family and society, bonding with friends over erotic gay fanfiction, and facing the trauma and fundamental violation of pap smears”
  • All Boys Aren’t Blue: “From the memories of getting his teeth kicked out by bullies at age five, to flea marketing with his loving grandmother, to his first sexual relationships, this young-adult memoir weaves together the trials and triumphs faced by Black queer boys”
  • Out of Darkness: “The novel chronicles a love affair between a teenage Mexican American girl and a teenage African-American boy in 1930’s New London, Texas, occurring right up to the 1937 New London School explosion.”
  • The Bluest Eye: “An 11-year-old Black girl in an America whose love for its blond, blue-eyed children prays for her eyes to turn blue: so that she will be beautiful, so that people will look at her, so that her world will be different.”
  • Lawn Boy: “A semi-autobiographical coming-of-age novel that tells the story of a young adult Mexican American who has faced hardship ever since his childhood and is now going through a phase of self-discovery.”

Top 5 banned fiction, nonfiction, juvenile, and adult books

Top Banned Fiction Books
------------------------------
Gender Queer: A Memoir (Maia Kobabe)
Out of Darkness (Ashley Hope Pérez)
The Bluest Eye (Toni Morrison)
Lawn Boy (Jonathan Evison)
The Hate U Give (Angie Thomas)

Top Banned Nonfiction Books
------------------------------
All Boys Aren't Blue (George M. Johnson)
Beyond Magenta: Transgender Teens Speak Out (Susan Kuklin)
This Book Is Gay (Juno Dawson)
Beloved (Toni Morrison)
Fun Home: A Family Tragicomic (Alison Bechdel)

Top Banned Juvenile Books
------------------------------
Out of Darkness (Ashley Hope Pérez)
Me and Earl and the Dying Girl (Jesse Andrews)
Crank (Crank Series) (Ellen Hopkins)
This One Summer (Mariko Tamaki)
Drama: A Graphic Novel (Raina Telgemeier)

Top Banned Adult Books
------------------------------
Gender Queer: A Memoir (Maia Kobabe)
The Bluest Eye (Toni Morrison)
Lawn Boy (Jonathan Evison)
The Kite Runner (Khaled Hosseini)
Beloved (Toni Morrison)

Asking ChatGPT and Bing Chat if they would recommend banned books to different age groups

I’m not making it up when I say that both ChatGPT and Bing Chat responded to each question in the same way. Well, almost. They answered the Yes/No part of my question in exactly the same way.

ChatGPT’s and Bing’s responses (Yes or No) on whether it would ban a book for different age groups, given the book description. Figure created by the author.

Compared to GPT-3, ChatGPT/Bing Chat were more likely to start recommending books to 11–13 year olds (vs 13–15 year olds). However, I think it is more interesting to look at the explanations each chatbot gave as to why it would or wouldn’t recommend a book to a certain age group.

I used the following prompt for both ChatGPT and Bing Chat. I used the description for “The Bluest Eye” to show in this example.

The following is a description for a book. Make a Markdown table, one column for ages 3, 5, 7, 9, 11, 13, 15, 17, 19, 21. Write whether you would recommend this book (Yes) or not (No) for a classroom filled with students of that corresponding age. Write another column with a short reason why or why not you would ban. Description: {Description}

ChatGPT’s output for book recommendations given description for “The Bluest Eye”

ChatGPT’s reasons are a bit simplistic. It repeats that “the book’s themes of racism and self-hate may be too mature” for several age levels, suddenly changing its mind for 11 year olds.

Bing Chat, on the other hand, offers very in-depth reasons (which I was actually pretty impressed by!). At least on this one front, Bing Chat seems to be a little bit more cognizant of the nuances of the user’s request. However, take this with a giant pitcher of salt: I didn’t test this conjecture rigorously, and it’s just a single observation.

Bing Chat’s output for book recommendations given description for “The Bluest Eye”

Again, the analyses in this article were conducted before the release of ChatGPT’s API! It would be interesting to see how future experiments with the ChatGPT API differ from the results shared here 🙂


A woman stands and surveys piles of books falling from the sky. Generated using Stable Diffusion

Reposted from original article on my blog: 🎨 art fish intelligence 🐡

In the last few years, an increasing number of books have been banned in states across the United States — with projections of the trend accelerating in 2023. Recent book bans in Texas, Utah, and Florida have resulted in hundreds of books being pulled off classroom and library shelves. Many of these books, labeled “sensitive materials”, are disproportionately books by or about LGBTQ+ people and people of color.

In this article, I analyze recently banned books and authors in the US:

  • I use a dataset of books banned in 2021-2022 and combine them with metadata from Google Books to examine the main topics of those books. I find that not just books about LGBTQ+ and sexuality are banned, but also books about Black history and women in science. Many of the banned books were written recently (after the year 2000) and targeted towards younger audiences.
  • I examine the types of authors who have written the top banned books. A majority of these authors identify as women, non-binary, and/or people of color.
  • I take the most commonly banned books and ask large language models (such as GPT-3) if they would recommend those books to classrooms of different age groups.

A majority of Americans across the political spectrum oppose book bans. As a life-long prolific reader, the topic of book bans is very near to my heart. Banning books is the first step in banning thoughts, ideas, and imagination; these bans are a violation of the very democratic values of freedom of expression America loves to tout. In this article, I hope to share a bit of insight into the types of books, authors, and themes recently banned and currently being banned in school districts across the United states.

Banned books by state, created by PEN. Figure from their article, Banned in the USA.

Banned Books Dataset

The PEN America’s Index of School Book Bans dataset comprises of books that were banned in school libraries and classrooms across the United States in the time period from July 1, 2021 to June 30, 2022. It was compiled by PEN America, a nonprofit organization that works to defend and celebrate free expression in the United States and worldwide through the advancement of literature and human rights. The dataset is being used with permission from PEN.

PEN America describes the dataset in detail in their article, Banned in the USA: The Growing Movement to Censor Books in Schools, which also includes a comprehensive overview of recently banned books and how the banned book index was collected and curated. More details about the dataset are at the end of this article.

The banned book dataset included 1146 unique authors and 1649 unique books.

Data Augmentation with Google Books

The PEN dataset comprised of information such as title, author, and the state and school district the book was banned in. I used the Google Books API to augment each book with additional metadata such as a book description, maturity rating, page count, publication year, and category tags such as genre and recommended age level.

Notes on the Data

The banned books dataset does not perfectly capture all books banned in 2021-2022 in America because detailed reporting on book bans is not always available or easily trackable. However, it is a good approximation of recent book bans. Furthermore, while Google Books was good at finding the books to obtain metadata about them, it was not perfect either, sometimes returning incorrect metadata (for example, returning the publication date of a newer edition rather than the original book). I did not have time to go through all 1649 books to check that each of the results were correct, so there is a little bit of error included in the analysis in this article.

GPT-3 Experiments

Large Language Models such as GPT-3 (and its variants, ChatGPT and Bing Chat) have been present on the public consciousness recently. These models, while powerful and impressive, are still largely “black boxes” — there is much about their behavior that we do not understand. This is significant, given that these models are increasingly being integrated into all parts of our online lives, such as mental health chatbots, social media, Quora, digital study tools, search, and even the new Windows 11 update. As these models control (or at least, exert influence over) more and more of our human decision processes, it is important to know how they would respond in different situations. For example, it is not a far-fetched idea that in the future, some form of AI (perhaps a variant of ChatGPT) might decide the books and topics we see on individual student lesson plans, classroom syllabuses, and library shelves.

To better understand these large language models, I prompted GPT-3 if it would ban the most commonly banned books based on their title and description. I used OpenAI’s GPT-3.5 (`davinci-003`) API to make calls for each of my queries. I ran each query 5 times to account for randomness (as I do in all of my other articles). I also ran each query 1 time on Microsoft’s Bing Chatbot (built on a more powerful version of GPT-3), as it does not have an API.

How many unique books did each state ban, across all districts? Texas led the pack — indeed, everything is larger in Texas, including the number of books not allowed in its classroom and library shelves. Pennsylvania, Florida, and Tennessee also banned hundreds of books in the last year. The dataset was created before the more recent book bans in Utah and possibly other states, so the figures might not reflect the most recent state of book ban data.

Unique number of books banned per state. Figure created by author.

The following figure shows the top books banned, sorted by the unique number of school districts each book was banned in. I include the descriptions of some of these books at the end of the article. Gender Queer, the most-banned book across all states and school districts, is a memoir about what it means to be nonbinary.

Top banned books, sorted by number of unique school districts each was banned in. Figure created by author.

More books for children and fiction books were banned

The majority of the banned books were written for younger audiences: 47% for Juveniles and 30% for Young Adults. Nearly 70% of the books were works of fiction.

The top 5 banned books by fiction, nonfiction, juvenile, and adult categories are listed at the end of this article.

Proportion of banned books by age group and by genre. Figure created by author.

More books published after 2000 were banned

95% of banned books were published after the year 2000.

Histogram showing distribution of book publication year. Figure created by author.

Fewer older books were banned. Examples of banned books published before the year 2000 include “The Bluest Eye” and “Beloved” by Toni Morrison, “Maus” by Art Spiegelman, and “Slaughterhouse-Five” by Kurt Vonnegut, all of which I have read and enjoyed immensely. (In fact, Toni Morrison is one of my favorite authors.)

Shorter books were banned

The page counts of the books varied, with a mean of 223 pages and a median of 241 pages. For comparison, the average novel is around 250-350 pages long. About a quarter of the books (26%) were under 100 pages long, most likely books for children or early readers.

Histogram showing page number distribution of banned books. Figure created by author.
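The summary figures above (mean, median, and the share of books under 100 pages) can be reproduced with a few lines of Python. This is a sketch under stated assumptions: `page_counts` holds placeholder values rather than the real metadata, and `summarize_pages` is a hypothetical helper name.

```python
from statistics import mean, median

# Placeholder page counts; the real values would come from the book metadata.
page_counts = [16, 38, 96, 224, 240, 256, 320, 410]


def summarize_pages(pages, short_cutoff=100):
    """Return the mean, median, and share of books under `short_cutoff` pages."""
    short = sum(1 for p in pages if p < short_cutoff)
    return {
        "mean": mean(pages),
        "median": median(pages),
        "share_short": short / len(pages),
    }


summary = summarize_pages(page_counts)
print(summary)
```

A mean that sits below the median, as in both the toy data and the reported figures, is what one would typically expect when a cluster of very short books pulls the average down.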

The shortest book banned was 16 pages long: “Cow on the Town: Practicing the Ow Sound” by Isabella Garcia. On the topic of cows, the 38-page-long “14 Cows for America” by Carmen Agra Deedy was also banned. Maybe learning about cows is offensive to Ron DeSantis.

The following figure shows the top 20 authors corresponding to the top banned books (across all districts). In the graph, I highlight all the authors who identify as white men in yellow. The blue bars represent authors who identify as women or nonbinary and/or are people of color.

Number of unique school districts each author was banned in. Authors who are White and identify as men are colored in yellow. Figure created by author.

Why does this matter? The book industry is dominated by white people (in 2018, 11% of books were written by people of color) and historically dominated by men (think about all the female authors who had to write under male pen names). When bans disproportionately target books written by traditionally underrepresented authors, such as women, non-binary people, and people of color, their stories risk not being represented on library and classroom shelves.

I took the descriptions for all the unique books in the banned books dataset to determine the main categories of books being banned. I knew from the PEN article (and from scanning the titles during my exploratory data analyses) that many banned books revolved around sexuality, transgender identities, and LGBTQ+ stories.

I used topic modeling, which is a machine learning technique to statistically determine the topics present in the book descriptions by clustering words. I used BERTopic, a wonderful Python library for performing topic modeling based on embeddings generated by a large language model called BERT.

Top topics found in banned book descriptions. Figure created by author.
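For readers who want to try something similar, here is a rough sketch of fitting BERTopic on a list of description strings. It is not the exact configuration used for the figure above: the `min_topic_size` value and the library's default embedding model are assumptions, and `fit_topics` is a hypothetical helper name. The import sits inside the function because `bertopic` is a third-party package with heavy dependencies.

```python
def fit_topics(descriptions, min_topic_size=10):
    """Fit BERTopic on book descriptions.

    Returns a (topic id per description, per-topic summary table) pair.
    """
    # Third-party dependency (pip install bertopic); imported lazily so the
    # module can be loaded without it.
    from bertopic import BERTopic

    topic_model = BERTopic(min_topic_size=min_topic_size)
    # fit_transform embeds each description, clusters the embeddings, and
    # returns one topic id per description (-1 marks outliers).
    topics, _ = topic_model.fit_transform(descriptions)
    # get_topic_info() returns a DataFrame of topic ids, sizes, and names.
    return topics, topic_model.get_topic_info()
```

With a few thousand descriptions this would be called as `topics, info = fit_topics(descriptions)`. Very small inputs tend to fail or collapse into a single outlier topic, so a toy list of a handful of strings is not a useful test.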

These were some of the main topics that arose from the book descriptions. While the topic of gender, sexuality, and transgender was prevalent, there were also books banned about racism and Black history (consistent with the recent movement to ban critical race theory in public schools), abortion, and women in science and mathematics. Examples of books within each topic:

Women/Science topic

  • No Girls Allowed: Tales of Daring Women Dressed as Men for Love, Freedom and Adventure (Hughes)
  • 25 Women Who Thought of It First (Sherman)
  • Ada Byron Lovelace and the Thinking Machine (Wallmark)

Abortion/Supreme Court topic

  • The Abortion Debate (Farrell)
  • Roe v. Wade: The Untold Story of the Landmark Supreme Court Decision that Made Abortion Legal (Faux).

Muslim/Hijab topic

  • Under my Hijab (Khan)
  • Proud: Living My American Dream (Muhammad)

Based on the above data analyses, I wanted to see how GPT-3 would respond to the most banned (and therefore more controversial, as determined by society) books. I took a subset of the five most banned books from the dataset (ranked by the number of districts that banned each book). Then, I asked GPT-3 if it would ban a book in two different ways:

  • Based on the book’s title and author: Would you ban the {book} by {author} for a classroom of {age}-year-olds?
  • Based on the book’s description (obtained through Google Books). The description did not contain the name of the author or the title of the book: Given the following description of a book, would you ban this book for a classroom of {age}-year-olds? {Description}
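The two prompt styles above can be expressed as string templates and queried repeatedly. The sketch below makes several assumptions: the age list is borrowed from the chatbot prompt later in the article, the repeated runs are combined by majority vote, and any reply not starting with "yes" is treated as a "no". `ask_model` is a hypothetical callable standing in for a real API call, and `fake_model` is a stub so the example runs offline.

```python
from collections import Counter

TITLE_PROMPT = (
    "Would you ban the {book} by {author} for a classroom of {age}-year-olds?"
)
DESCRIPTION_PROMPT = (
    "Given the following description of a book, would you ban this book "
    "for a classroom of {age}-year-olds? {description}"
)
# Assumed age groups; the exact list used with GPT-3 is not stated here.
AGES = [3, 5, 7, 9, 11, 13, 15, 17, 19, 21]


def majority_answer(prompt, ask_model, n_runs=5):
    """Ask the same prompt n_runs times and return the most common answer."""
    votes = Counter()
    for _ in range(n_runs):
        reply = ask_model(prompt).strip().lower()
        # Simplification: anything that does not start with "yes" is a "no".
        votes["yes" if reply.startswith("yes") else "no"] += 1
    return votes.most_common(1)[0][0]


def fake_model(prompt):
    """Stand-in for a real API call; always declines to ban."""
    return "No. I would not ban this book."


prompt = TITLE_PROMPT.format(book="Lawn Boy", author="Jonathan Evison", age=11)
answer = majority_answer(prompt, fake_model)
print(prompt)
print(answer)
```

Running each prompt an odd number of times avoids ties in the vote, which is one practical reason to pick 5 runs.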

For example, the description for “The Bluest Eye” was as follows: “An 11-year-old Black girl in an America whose love for its blond, blue-eyed children prays for her eyes to turn blue: so that she will be beautiful, so that people will look at her, so that her world will be different.”

GPT-3 generally opposed to book bans

For the five books chosen in the experiments, GPT-3 overwhelmingly responded against banning books, regardless of the age group in the classroom, and regardless of whether it was prompted with the book title or book description.

GPT-3’s responses (Yes or No) on whether it would ban a book for different age groups, given the title and author / given the description. Figure created by author.

The one exception was for the description of “Gender Queer”, and only for certain age groups. This will be explored in the next section.

I prompted GPT-3 to also supply an explanation for its decisions. I noticed in its explanations that while GPT-3 does not recommend banning certain books for younger audiences, it also doesn’t necessarily recommend them either: “This book is inappropriate for a classroom of 3-year-olds due to the mature subject matter, but I do not recommend banning it”.

GPT-3: (NOT BAN) “This book is inappropriate for a classroom of 3-year-olds due to the mature subject matter, but I do not recommend banning it. It is important to provide students with diverse books that represent a variety of experiences, but this book is better suited for older readers.”

GPT-3: (NOT BAN) “This book goes beyond the typical reading level of 7-year-olds, and contains content that may not be suitable for children in this age group. However, it is not appropriate to ban the book, as it provides important representation for marginalized voices. Instead, the teacher should consider if the book could be used in the classroom to discuss important topics in a mature way that is developmentally appropriate for the students.”

GPT-3 recommended banning only Gender Queer, only when given its description, and only for younger age groups. Its reasoning was that the mature themes in the description would not be suitable for younger audiences, and that these themes should preferably be “discussed in an age-appropriate way with the guidance of a trusted adult.”

GPT-3: “Given the mature themes outlined in the description such as adolescent crushes, coming out, and trauma, this book is not suitable for a classroom of 7-year-olds as it is too mature for this age group.”

GPT-3: “This book contains mature content and language that is not appropriate for 9-year-olds. There are topics discussed in the book, such as crushes, sexuality, coming out, and traumatic medical experiences, that should be discussed in an age-appropriate way with the guidance of a trusted adult.”

GPT-3 more likely to not recommend rather than outright ban books

Since GPT-3 almost exclusively recommended against banning books, I wondered how it would recommend them. I repeated the above experiment after changing up the wording. Instead of asking GPT-3 if it would outright ban a book, I asked if it would recommend a book (again, based on the title/author and separately based on its description) to different age groups. The results were vastly different.

GPT-3’s responses (Yes or No) on whether it would recommend a book for different age groups, given the title and author / given the description. Figure created by author.

For most of the books, GPT-3’s generations indicated that it would recommend books for older audiences and not recommend them for younger audiences, based on maturity levels. Its reasoning was very similar to when I asked if it would ban. Normally, it would recommend a book for 11-15 year-olds.

Given book title and author and given description, GPT-3’s responses were similar — except for “The Bluest Eye.” GPT-3’s responses indicated that it would not recommend “The Bluest Eye” given the description for both younger and older audiences.

Looking at GPT-3’s explanations for its responses gave some more insight into why it answered the way it did, but did not provide clarity per se. In general, it seems that GPT-3 is a bit confused, citing that the description for “The Bluest Eye” might be “too mature and intense for a classroom of 21-year-olds” while simultaneously not being relatable to them because the book “focuses on an 11-year-old, which may be too young for the age group.”

GPT-3: This book may be too mature and intense for a classroom of 21-year-olds. The themes of racism and violence may be difficult for them to process and relate to.

GPT-3: This book is likely to deal with sensitive topics, such as racism and self-esteem, that may be too mature for a classroom of 21-year-olds. Furthermore, the story may not be of interest to the class since it focuses on an 11-year-old, which may be too young for the age group.

GPT-3: This book may be a meaningful and moving story, but it does not appear to be appropriate for a classroom of 21-year-olds. The book appears to be more suitable for an audience that is younger than this age group.

Microsoft Bing Chat and ChatGPT respond in similar ways regarding banned book recommendations

I asked both Microsoft Bing Chat and ChatGPT (both variations of GPT-3) the same questions I used GPT-3 to answer in the above sections. It’s no surprise that the two chatbots responded similarly to GPT-3, considering that these large language models are all variants of each other. There were some minute differences in the outputs: Bing Chat’s reasons for banning or not recommending a book, for example, were more nuanced than the other language models’. It is interesting that all of these large language models, while not recommending certain books to certain age groups, did not recommend outright banning books.

The full outputs are provided at the end of the article.

NOTE: I did not yet have a chance to repeat these experiments on the new ChatGPT API, which was released after this article was written.

The United States has had a long history of book bans, from Puritan bans of any books with a hint of sexiness to post-Civil War bans of books about racism and slavery. Books challenging “longstanding narratives about American history or social norms” were common targets of such bans, including classic American literature such as “Leaves of Grass” by Walt Whitman and “Uncle Tom’s Cabin” by Harriet Beecher Stowe.

The analyses in this article showed that there is currently a vast diversity of stories being banned in the US: stories of LGBTQ+ communities and Muslim families and women in science and racism and sexism and even picture books about cows. The kinds of books children read growing up (and adults read as they continue growing!) have a huge impact on their belief systems and understanding of the world.

It is important to keep in mind the gap between asking “Would you recommend a book?” and “Would you ban a book?”. The first question matches a book to particular age groups, while the second does not allow the book to be on the shelf at all. This difference is evident in GPT-3’s responses as well: while it did or did not recommend certain books to different age groups, it refused (for the most part) to recommend banning books, for any age level. Perhaps we can take a similar approach with the books analyzed in this article: recommending books for different age levels without fully removing them from the shelves.

is it a party? is it a book burning? is it a ceremony? are those books? are they on fire? are those people? are they dancing? are they panicking? are they celebrating? generated using Stable Diffusion


About PEN’s Banned Books Index

PEN America defined a book ban as “any action taken against a book based on its content and as a result of parent or community challenges, administrative decisions, or in response to direct or threatened action by lawmakers or other governmental officials, that leads to a previously accessible book being either completely removed from availability to students, or where access to a book is restricted or diminished.”

As mentioned on their website, the Index of School Book Bans is not a comprehensive list of all banned books, as “detailed reporting about book bans is sometimes unavailable” and “the quiet removal or limitation of books” is not always trackable. Rather, the list is meant to be a count of books banned according to the above definition.

In Banned in the USA: The Growing Movement to Censor Books in Schools, writers from PEN describe the dataset and the book bans. They also do a basic analysis showing the number of books banned per state and the subject matter of the banned content (partially replicated in this article). One of their main findings is as follows (also discussed elsewhere in this article):

“Over the 2021–22 school year, what started as modest school-level activity to challenge and remove books in schools grew into a full-fledged social and political movement, powered by local, state, and national groups. The vast majority of the books targeted by these groups for removal feature LGBTQ+ characters or characters of color, and/or cover race and racism in American history, LGBTQ+ identities, or sex education.”

In fact, representatives from PEN have noted that the books chosen to be banned are often not a result of individual complaints by children or parents, but are overwhelmingly the results of Google searches for “what books have LGBTQ content … even just a book that has an illustration of a same-sex interracial couple”.

Short description of the top banned books

  • Gender Queer: “An autobiography … which includes the mortification and confusion of adolescent crushes, grappling with how to come out to family and society, bonding with friends over erotic gay fanfiction, and facing the trauma and fundamental violation of pap smears”
  • All Boys Aren’t Blue: “From the memories of getting his teeth kicked out by bullies at age five, to flea marketing with his loving grandmother, to his first sexual relationships, this young-adult memoir weaves together the trials and triumphs faced by Black queer boys”
  • Out of Darkness: “The novel chronicles a love affair between a teenage Mexican American girl and a teenage African-American boy in 1930’s New London, Texas, occurring right up to the 1937 New London School explosion.”
  • The Bluest Eye: “An 11-year-old Black girl in an America whose love for its blond, blue-eyed children prays for her eyes to turn blue: so that she will be beautiful, so that people will look at her, so that her world will be different.”
  • Lawn Boy: “A semi-autobiographical coming-of-age novel that tells the story of a young adult Mexican American who has faced hardship ever since his childhood and is now going through a phase of self-discovery.”

Top 5 banned fiction, nonfiction, juvenile, and adult books

Top Banned Fiction Books
------------------------------
Gender Queer: A Memoir (Maia Kobabe)
Out of Darkness (Ashley Hope Pérez)
The Bluest Eye (Toni Morrison)
Lawn Boy (Jonathan Evison)
The Hate U Give (Angie Thomas)

Top Banned Nonfiction Books
------------------------------
All Boys Aren't Blue (George M. Johnson)
Beyond Magenta: Transgender Teens Speak Out (Susan Kuklin)
This Book Is Gay (Juno Dawson)
Beloved (Toni Morrison)
Fun Home: A Family Tragicomic (Alison Bechdel)

Top Banned Juvenile Books
------------------------------
Out of Darkness (Ashley Hope Pérez)
Me and Earl and the Dying Girl (Jesse Andrews)
Crank (Crank Series) (Ellen Hopkins)
This One Summer (Mariko Tamaki)
Drama: A Graphic Novel (Raina Telgemeier)

Top Banned Adult Books
------------------------------
Gender Queer: A Memoir (Maia Kobabe)
The Bluest Eye (Toni Morrison)
Lawn Boy (Jonathan Evison)
The Kite Runner (Khaled Hosseini)
Beloved (Toni Morrison)

Asking ChatGPT and Bing Chat if they would recommend banned books to different age groups

I’m not making it up when I say that both ChatGPT and Bing Chat responded to each question in the same way. Well, almost. They answered the Yes/No part of my question in exactly the same way.

ChatGPT’s and Bing’s responses (Yes or No) on whether they would ban a book for different age groups, given the book description. Figure created by the author.

Compared to GPT-3, ChatGPT/Bing Chat were more likely to start recommending books to 11–13 year olds (vs 13–15 year olds). However, I think it is more interesting to look at the explanations each chatbot gave as to why it would or wouldn’t recommend a book to a certain age group.

I used the following prompt for both ChatGPT and Bing Chat. In this example, I used the description for “The Bluest Eye”.

The following is a description for a book. Make a Markdown table, one column for ages 3, 5, 7, 9, 11, 13, 15, 17, 19, 21. Write whether you would recommend this book (Yes) or not (No) for a classroom filled with students of that corresponding age. Write another column with a short reason why or why not you would ban. Description: {Description}
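To compare the Yes/No answers across chatbots, a reply in this format has to be turned back into data. Below is a minimal sketch of such a parser. It assumes the chatbot returns one row per age with three columns (age, verdict, reason), which is only one plausible reading of the prompt. `parse_recommendations` is a hypothetical helper, and `sample_reply` is a made-up reply, not an actual chatbot output.

```python
def parse_recommendations(markdown_table):
    """Parse an Age | Recommend | Reason Markdown table into a dict by age."""
    results = {}
    for line in markdown_table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header row, the |---| separator, and malformed rows.
        if len(cells) < 3 or not cells[0].isdigit():
            continue
        age, verdict, reason = cells[0], cells[1], cells[2]
        results[int(age)] = (verdict.lower() == "yes", reason)
    return results


# Made-up reply in the assumed layout; not an actual chatbot output.
sample_reply = """
| Age | Recommend | Reason |
|-----|-----------|--------|
| 3   | No        | Placeholder reason for a young classroom. |
| 15  | Yes       | Placeholder reason for an older classroom. |
"""

parsed = parse_recommendations(sample_reply)
print(parsed)
```

A reason that itself contains a `|` character would be cut short by this simple split, so real outputs may need a more careful parser.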

ChatGPT’s output for book recommendations given description for “The Bluest Eye”

ChatGPT’s reasons are a bit simplistic. It repeats that “the book’s themes of racism and self-hate may be too mature” for several age levels, before suddenly changing its mind for 11-year-olds.

Bing Chat, on the other hand, offers very in-depth reasons (which I was actually pretty impressed by!). At least on this one front, Bing Chat seems to be a little bit more cognizant of the nuances of the user’s request. However, take this with a giant pitcher of salt: I didn’t test this conjecture rigorously, and it’s just a single observation.

Bing Chat’s output for book recommendations given description for “The Bluest Eye”

Again, this article and its analyses were conducted before the release of ChatGPT’s API! It would be interesting to see how future experiments with the ChatGPT API differ from the results shared here 🙂
