The Cultural Analytics Series, jointly organized by the UC Berkeley School of Information and Berkeley Institute for Data Science (BIDS), highlights research that uses data-driven methods to study cultural phenomena.
In fall 2025, the series featured the presentation "AI Fiction in the Wild" by Melanie Walsh, an Assistant Professor at the University of Washington's Information School. She shared insights from collaborative research on WildChat, a public dataset released by the Allen Institute for AI (AI2) that contains more than one million user-consented 'real-world' ChatGPT conversations.
Walsh explored three main questions: how the experience of reading and writing fiction will change, how large language models (LLMs) will shape or disrupt traditional models of literary publishing, and where AI fits into broader literary and cultural history.

Walsh presents her work at the I School.
Understanding top use cases of LLMs today
These questions become especially tangible when examining how LLMs are used in practice today. With the rise of LLMs, many users are turning to these tools to generate content and stories. Some common use cases, as reported by Anthropic and OpenAI, include fiction writing and other creative writing.
Analysis of fiction generated in user-LLM conversations
To analyze user-LLM conversations that generate fictional stories, Walsh used the WildChat dataset, working with Neel Gupta (iSchool PhD student at the University of Washington) and Maria Antoniak (Assistant Professor in Computer Science at CU Boulder). Using LLM prompting, the team filtered 573,000 English-language user conversations from WildChat down to those related to fiction. Walsh, Gupta, and Antoniak found that more than one-third of the English-language conversations contained some form of fiction, which highlights how prevalent fiction is among the content users generate with LLMs.
On closer examination, fiction subcategories emerged. These included fanfiction reimaginings, direct storytelling, erotica, roleplay, and option listing or idea enumeration prompted by a set of instructions. According to the LLM-based classification, the two most prevalent categories among the WildChat fiction conversations were fanfiction (49%) and erotica or sexually explicit material (27%).
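The filtering step described above can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the record fields (`language`, `conversation`, `content`) are assumed for illustration, and `keyword_stub_classifier` is a hypothetical keyword stand-in for the LLM-prompted classifier the researchers used.

```python
from typing import Callable, Iterable

# Assumed record shape (illustrative): a dict with a "language" field and a
# list of turns under "conversation", each turn holding its text in "content".
Conversation = dict


def keyword_stub_classifier(text: str) -> bool:
    """Hypothetical stand-in for an LLM-prompted fiction classifier."""
    cues = ("write a story", "fanfiction", "roleplay", "once upon a time")
    lowered = text.lower()
    return any(cue in lowered for cue in cues)


def fiction_share(
    conversations: Iterable[Conversation],
    is_fiction: Callable[[str], bool] = keyword_stub_classifier,
    language: str = "English",
) -> float:
    """Return the fraction of conversations in `language` flagged as fiction."""
    in_language = [c for c in conversations if c.get("language") == language]
    if not in_language:
        return 0.0
    flagged = sum(
        1
        for c in in_language
        if is_fiction(" ".join(turn["content"] for turn in c["conversation"]))
    )
    return flagged / len(in_language)


# Tiny made-up sample, not drawn from WildChat.
sample = [
    {"language": "English", "conversation": [{"content": "Write a story about a dragon."}]},
    {"language": "English", "conversation": [{"content": "Fix my Python loop."}]},
    {"language": "English", "conversation": [{"content": "Continue the roleplay as the captain."}]},
    {"language": "French", "conversation": [{"content": "Écris une histoire."}]},
]
print(f"Share of English conversations flagged as fiction: {fiction_share(sample):.2f}")
```

Passing a different `is_fiction` callable, for example one that wraps a call to an LLM, would replace the keyword stub without changing the rest of the sketch.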
The future of LLM-generated fiction
These findings suggest that LLM-generated fiction is growing in popularity and will continue to expand, influencing current literary trends and challenging traditional publishing models. Walsh reflected that examining "...this real world data has made [her] see AI-generated fiction differently... It has made [her] think about the affinity between LLM outputs and fiction." This phenomenon is actively shaping the literary and cultural landscape and deserves attention.
Walsh meets with a group of students and faculty members at AI Futures Lab.
Watch the I School event recording for the full account!
To stay in touch and join these far-ranging conversations of critical cultural importance, please join our Cultural Analytics mailing list by visiting this page or emailing bids-cultural-analytics+subscribe@lists.berkeley.edu.