People prefer AI poetry over Shakespeare

2026-03-18

A study published in Scientific Reports (a Nature Portfolio journal) produced results that should interest anyone who thinks human creativity is easy to tell apart from machine output. It turns out that we can’t distinguish AI-written poems from those by famous poets, and we actually prefer the AI ones.

The experiment: Shakespeare vs. ChatGPT

Brian Porter and Edouard Machery from the University of Pittsburgh ran two experiments with over 2,300 participants. They took five poems from each of ten well-known English-language poets, spanning nearly the entire history of English literature: from Geoffrey Chaucer, through Shakespeare, Byron, Walt Whitman, Emily Dickinson, T.S. Eliot, all the way to Sylvia Plath. Then they asked ChatGPT 3.5 to write five poems “in the style of” each poet.

Importantly, they used a “human out of the loop” approach, meaning they didn’t cherry-pick the best poems from multiple attempts. They simply took the first five generated poems, with no selection or editing. It was the raw model output and nothing more.

Surprising results

In the first experiment, 1,634 participants had to identify which poems were written by humans and which by AI. Accuracy was 46.6%, which is below chance level (50%). If the participants had flipped a coin, they would have done better than relying on their own judgment.
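How far below chance is 46.6%? A rough one-proportion z-test gives a sense of scale. This is a deliberate simplification (it treats the result as 1,634 independent yes/no judgments, one per participant, which is not how the paper's actual analysis works), but it shows the arithmetic:

```python
import math

def one_proportion_z(p_hat, p0, n):
    """z statistic for an observed proportion p_hat against a null
    proportion p0, given n independent trials (normal approximation)."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# 46.6% accuracy vs. the 50% chance level, with 1,634 participants
z = one_proportion_z(0.466, 0.5, 1634)
```

Even under this conservative one-judgment-per-participant assumption, z comes out around -2.7, past the conventional -1.96 cutoff: the below-chance accuracy is unlikely to be a coin-flip fluke.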

But it gets worse. Participants were more likely to label AI poems as human-written than the actual human poems. The five poems with the lowest rate of “human” responses were all written by real poets. Four of the five poems with the highest rate of “human” responses were generated by AI.

Experience with poetry didn’t help. 90% of participants read poetry a few times a year at most, but even those who read more weren’t any better at telling the difference. The only thing that helped was having previously encountered the specific poem.

Why do we prefer AI poems?

In the second experiment, 696 new participants rated poems across fourteen quality dimensions: overall quality, rhythm, imagery, beauty, depth, originality, and others. The results were clear: AI poems were rated higher than human poems in thirteen out of fourteen categories. The only exception was “originality,” where the difference wasn’t statistically significant.

The biggest gap was in rhythm. AI poems were rated as having much better rhythm than the poems of famous poets (Cohen’s d = 0.847, which is a large effect). All five AI poems received higher overall quality ratings than all five human poems.
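Cohen's d here is simply the standardized mean difference: the gap between the two groups' mean ratings divided by their pooled standard deviation, so d = 0.847 means the group means sit almost a full standard deviation apart. A minimal sketch of the calculation (the rating samples below are invented for illustration, not the study's data):

```python
import math

def cohens_d(sample_a, sample_b):
    """Standardized mean difference between two independent samples:
    difference of means divided by the pooled standard deviation."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    # Unbiased sample variances (n - 1 in the denominator)
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean_a - mean_b) / pooled_sd

# Hypothetical rhythm ratings on a 1-7 scale (invented numbers)
ai_ratings = [6, 5, 6, 7, 5, 6]
human_ratings = [4, 5, 4, 5, 3, 5]
d = cohens_d(ai_ratings, human_ratings)
```

By the usual rule of thumb, d around 0.2 is a small effect, 0.5 medium, and 0.8 or more large, which is why 0.847 counts as a large gap.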

The researchers offer a simple explanation for this paradox: AI poems are more accessible. They communicate emotions and themes in a more direct, easier-to-understand way. Participants used phrases like “doesn’t make sense” 144 times when describing human poets’ work but only 29 times for AI poems. For non-experts, Shakespeare’s or Plath’s complexity looks like incoherence, while ChatGPT’s clarity looks like talent.

Bias works both ways

The study also revealed another interesting mechanism. When participants were told a poem was written by a human, they rated it higher. When told it was generated by AI, they rated it lower. This held across all fourteen quality dimensions. So we’re dealing with a double paradox: people like AI poems more, but they’re also biased against AI as an author.

This mechanism explains the “more human than human” effect. Participants assumed that a better poem must be human-written, because surely AI writes worse (or so they thought). But when the researchers accounted for the quality ratings in their statistical model, the authorship effect disappeared. It wasn’t that AI wrote “more humanly.” People were confusing simplicity with authenticity.

What about everyday writing?

I once wrote about whether presenting ChatGPT-generated content as your own words is acceptable. I argued then that in substantive discussions, what matters is the value of the argument, not who formulated it. This study provides hard data supporting that intuition. If people can’t tell AI from human even in poetry, where creativity and artistic sensitivity are supposed to matter most, what does that say about everyday emails or LinkedIn posts?

I think it says that the line between “human” and “machine” text is largely an illusion, and it will only blur further with time. AI doesn’t just write “good enough.” It writes in a way that non-experts find better, because it’s simpler and more readable.

There’s some irony in this. Poets spend years working on complexity and ambiguity in their verses, and then a language model comes along, writes plainly and directly, and readers prefer exactly that. They prefer it because it’s more accessible, not because it’s “better” in any deeper sense. Our assumptions about what good writing “should” look like don’t necessarily match what we actually enjoy reading. Worth remembering the next time someone angrily posts on LinkedIn that “copying from ChatGPT is fraud.”


Maciej Michalewski

CEO @ Element. Recruitment Automation Software
