The em dash was never the interesting tell.
It was funny for a week because it gave everyone a visible mark to circle in red. AI writing had become familiar enough that readers were looking for fingerprints, and punctuation offered the comfort of a quick diagnosis. Too many dashes. Too much “delve.” Too much competent prose arriving with no particular wound behind it.
But style tells are the easiest ones to train away. A model can be prompted out of a favorite word. A house style can forbid a rhythm. A rewrite pass can roughen the surface until the sentence looks less machine-made. That does not mean the story underneath has become more human.
Ethan Mollick pointed to the more important layer in a fresh X post about StoryScope: Investigating idiosyncrasies in AI fiction, an April 2026 arXiv preprint by Jenna Russell, Rishanth Rajendhran, Chau Minh Pham, Mohit Iyyer, and John Wieting. His framing was simple: the paper is looking at AI narrative tells, not merely stylistic ones. In a follow-up, he added that the findings matched his own experiments with AI storytelling, and that storytelling has improved less quickly than many other AI abilities. (Mollick on X, StoryScope)
“AI narrative tells”
Ethan Mollick, X, May 27, 2026
That phrase matters because it moves the argument upstream. We are no longer asking whether a sentence sounds like AI. We are asking whether the story thinks like AI.
The Paper Finds Structure, Not Just Style
The StoryScope paper starts from a pressure every writer and publisher now feels: if AI fiction is becoming common, and if surface detectors are brittle, can we identify deeper differences in how stories are built? The authors create a large parallel corpus: 10,272 writing prompts, each paired with one human story and five AI-generated mirrors from Claude, DeepSeek, Gemini, GPT, and Kimi. The reported total is 61,608 stories, each around 5,000 words, with 304 extracted features per story.
The crucial move is methodological. Instead of only asking about words, syntax, punctuation, or phrase frequency, StoryScope converts each story into structured narrative features across ten dimensions: agents, social network, events, plot, structure, setting, time, revelation, perspective, and style. Then the authors remove style features and ask whether the remaining narrative decisions still distinguish human-written fiction from AI-generated fiction.
They do. Narrative features alone achieve 93.2% macro-F1 in human-vs-AI detection, retaining more than 97% of the performance of the combined narrative-plus-style model. A smaller set of 30 core narrative features still reaches 84.8% macro-F1. After stylistic alteration, the narrative signal remains strong. (StoryScope PDF)
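For readers who want to see the shape of that test, here is a minimal sketch of the two moves involved: dropping one dimension's features, then scoring predictions with macro-F1 (the unweighted mean of per-class F1). The feature keys, the "dimension.name" naming scheme, and both function names are assumptions made for illustration. This is not the StoryScope code, and the classifier that would sit between the two steps is left out.

```python
from typing import Dict, List, Sequence


def drop_dimension(features: Dict[str, float], dimension: str = "style") -> Dict[str, float]:
    """Return a copy of `features` without the keys under `dimension`.

    Assumes hypothetical keys of the form "<dimension>.<feature_name>".
    """
    prefix = dimension + "."
    return {k: v for k, v in features.items() if not k.startswith(prefix)}


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("need at least one example")
    scores: List[float] = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); score 0.0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical feature record for one story.
story = {"style.dash_rate": 0.4, "time.flashbacks": 2.0, "plot.subplots": 1.0}
narrative_only = drop_dimension(story)  # keeps only the time.* and plot.* keys
```

Macro-F1 weighs the human class and the AI class equally, which matters in a corpus where AI stories outnumber human ones five to one. If scikit-learn is available, `sklearn.metrics.f1_score` with `average="macro"` computes the same quantity.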
That does not mean StoryScope is a magic detector. The paper is a preprint under review, and it rests on several choices worth treating with care: human stories from Books3, reverse-engineered prompts, AI-generated mirrors, LLM-based feature extraction, and classifier performance on a particular controlled dataset. The authors do useful work to address obvious objections. They group train/test splits by prompt to prevent leakage, test length-matched subsets, run a memorization-risk audit, validate feature assignment against humans, and release code and AI-generated data. (StoryScope GitHub)
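The prompt-grouped split is the safeguard most worth understanding, because without it a classifier could learn the premise instead of the storytelling. A minimal sketch of the idea follows, assuming a hypothetical record layout with a `prompt_id` field. The function name, the 20% test fraction, and the seed are illustrative choices, not details taken from the paper.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical record: {"prompt_id": ..., "source": ..., "text": ...}
Story = Dict[str, str]


def split_by_prompt(stories: List[Story], test_fraction: float = 0.2,
                    seed: int = 0) -> Tuple[List[Story], List[Story]]:
    """Split stories so that all stories for a prompt land on the same side."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    # Shuffle the prompt IDs, not the stories, so siblings stay together.
    prompt_ids = sorted({s["prompt_id"] for s in stories})
    random.Random(seed).shuffle(prompt_ids)
    n_test = max(1, round(len(prompt_ids) * test_fraction))
    test_ids = set(prompt_ids[:n_test])
    train = [s for s in stories if s["prompt_id"] not in test_ids]
    test = [s for s in stories if s["prompt_id"] in test_ids]
    return train, test


# Toy corpus: 10 prompts, each with one human story and five AI mirrors.
SOURCES = ["human", "claude", "deepseek", "gemini", "gpt", "kimi"]
corpus = [{"prompt_id": f"p{i}", "source": src, "text": ""}
          for i in range(10) for src in SOURCES]
train_set, test_set = split_by_prompt(corpus)
```

Shuffling individual stories would let the human version of a premise sit in training while its AI mirrors sit in test, which inflates scores. scikit-learn's `GroupShuffleSplit` offers the same behavior for anyone already working in that library.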
Still, the value of the paper for writers is not “now we can catch every AI story.” The value is that the detected differences are legible as storytelling choices.
AI stories over-explain theme. They state moral meaning directly. The paper reports that narrators explicitly explain theme far more often in AI stories than in human ones. AI dialogue more often turns into philosophical debate. The machine tends to make the story’s meaning available, announced, and resolved.
Human stories, by contrast, tolerate more ambiguity. They use more temporal discontinuity, more flashbacks and nonlinear movement, more specific intertextual references, and more morally ambiguous protagonist choices. They are more willing to let a revelation reshape earlier material rather than simply deliver the next clean cause in a line.
The paper’s most useful phrase for writers may be its simplest diagnosis:
“AI spells out meaning”
Russell et al., StoryScope, 2026
That is the tell beneath the tell.
AI Likes the Clean Line
The replies under Mollick’s post converged quickly around the same intuition. One person put it as “AI explains. Humans imply.” Another said style was costume and narrative was the skeleton underneath. Others pointed to over-explanation, reinforcement learning’s preference for unambiguous answers, and the way AI tends to build conflict-resolution-lesson patterns instead of messier discovery.
“AI explains. Humans imply.”
Mohammad Siam, X, May 27, 2026
That reply lands because it names something readers feel before they can defend it. AI fiction often seems to know the point too early. The prose may be elegant, but the story keeps helping the reader by explaining what the moment means, how the character has changed, and why the ending matters. The result can feel emotionally legible and dramatically inert at the same time.
StoryScope gives numbers to that instinct. AI stories show tighter causal chains, more protagonist-driven resolutions, fewer subplots, and cleaner internal acceptance. Human stories are more likely to leave endings ambiguous, complicate responsibility, fracture time, or let consequence arrive from several directions at once. The human version of a premise is often less obedient to the premise.
From a Dramatica standpoint, this is not a minor stylistic preference. A story is not merely a sequence of understandable events. It is an argument carried through conflict. The Objective Story, Main Character, Influence Character, and Relationship Story each apply pressure from a distinct Perspective. If those pressures collapse into one clean line of cause, lesson, and resolution, the story can feel “complete” while losing the actual complexity that makes a complete Storyform work.
This is where AI’s apparent helpfulness becomes a problem. A model trained to satisfy the user tends to clarify, complete, reconcile, and explain. Fiction often depends on withholding, contradiction, implication, misdirection, and partial knowledge. The machine wants to make the answer legible. The story often needs the pressure to remain unresolved long enough for meaning to accumulate.
That does not mean human writing is better because it is messier. Mess alone is not art. A confused draft can be nonlinear and still have no argument. But meaningful narrative complexity comes from pressure distributed across Perspectives, not from a random refusal to be clear.
The problem with much AI fiction is that it resolves the wrong kind of uncertainty. It smooths away the places where the story should be thinking.
The Cluster Is the Warning
One of StoryScope’s strongest findings is not just that human and AI stories can be separated. It is that the five AI models cluster together in narrative space while human stories are more dispersed. The authors report that human stories occupy a distinct region and are rarer by their nearest-neighbor measure. Human stories are overrepresented in the rarest tail of the corpus.
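A nearest-neighbor rarity score is easy to picture in code. The sketch below uses one common variant, the mean Euclidean distance from each story's feature vector to its k closest neighbors. Whether StoryScope uses this exact distance, this k, or this averaging is not stated here, so treat the function name and every parameter as assumptions.

```python
import math
from typing import List, Sequence


def knn_rarity(vectors: Sequence[Sequence[float]], k: int = 3) -> List[float]:
    """Mean Euclidean distance from each vector to its k nearest neighbors.

    Higher scores mean the point sits in a sparser region of the space.
    """
    n = len(vectors)
    if not 1 <= k < n:
        raise ValueError("k must be at least 1 and smaller than the number of vectors")
    scores: List[float] = []
    for i, v in enumerate(vectors):
        # Distances to every other point, smallest first.
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Toy example: four tightly clustered points and one far from the rest.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
rarity = knn_rarity(points, k=2)
rarest_index = max(range(len(points)), key=lambda i: rarity[i])
```

In the toy data the four clustered points stand in for stories that share a shape and the distant point for an outlier. This brute-force version compares every pair, so it suits small samples; a corpus of tens of thousands of stories would call for an indexed neighbor search.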
That should make every writer pause.
The danger is not only that AI might produce bad fiction. Bad human fiction is abundant. The danger is that AI can produce a vast amount of competent fiction that shares an underlying shape: tidy, explicit, causally legible, emotionally annotated, and thematically over-secured. It may not read as broken. It may read as polished. That is why it is dangerous.
The model-specific fingerprints are also revealing. Claude, in the paper’s analysis, produces restrained stories with flatter event escalation and quieter endings. GPT leans into social mechanisms like gossip and rumor, retrospective framing, and more ensemble-heavy social networks. Gemini, DeepSeek, and Kimi are harder to distinguish from one another, with Gemini especially associated with tidy endings and extended denouements.
Those fingerprints are fun, but the shared cluster matters more than the individual quirks. If multiple advanced models, trained by different organizations, still converge toward a recognizable narrative region, then we are looking at something deeper than a bad prompt. We are seeing the default shape of machine-made storytelling when the machine is asked to continue, satisfy, and complete.
That is why Mollick’s follow-up is important. Storytelling has improved, but it has not improved at the same pace as other AI abilities. This makes sense if storytelling is not merely a language task. A model can get much better at explaining, coding, summarizing, and transforming text while still struggling to hold the kind of internal narrative argument that makes a story feel authored.
The replies underneath Mollick’s post included one useful challenge: if a human writer uses AI only after doing complex architecture and character design, perhaps the StoryScope weaknesses would not appear as strongly. That is probably true.
It is also the whole point.
When the human supplies the architecture, the AI is no longer being asked to originate the narrative argument. It is being asked to execute inside one. The difference between those two workflows is enormous. In the first, the model invents a story-shaped object from statistical habit. In the second, the writer brings a Storyform, a set of constraints, a pattern of pressure, and a standard of meaning. The tool may help express the story, but it is no longer responsible for knowing what the story is.
Put the Narrative First
Jim Hull’s reply under Mollick’s post made the point Dramatica would make:
“put the narrative first”
Jim Hull, X, May 27, 2026
That is not a slogan. It is a workflow.
Dramatica has always insisted on the distinction between Storyforming and Storytelling. Storyforming establishes the structural argument: what inequity drives the story, how the four Throughlines explore it, what the Main Character is personally facing, what the Influence Character pressures them to reconsider, what the Relationship Story changes between them, and how the Objective Story pushes everyone through shared conflict. Storytelling expresses that argument in characters, scenes, images, genre, voice, and plot.
AI tends to rush to Storytelling because Storytelling is visible. It can generate scenes, voices, reversals, metaphors, and endings. But Storyforming is where meaning is actually constrained. Without that prior structure, the generated story has to fake necessity from surface cues. It explains because it does not trust the argument. It resolves because it does not know which tension should remain active. It moralizes because it cannot rely on the Storyform to carry meaning beneath the words.
This is why “make it sound more human” is the wrong repair. You can prompt away the punctuation and still leave the machine’s narrative habits intact. You can ask for ambiguity and receive ambiguity as a texture rather than as a structural consequence. You can ask for nonlinear time and get shuffled chronology without meaningful revelation.
The better question is: what is the story’s argument, and where does this passage sit inside that argument?
Once that question is answered, AI can become more useful. It can test whether a scene is drifting from the Main Character Perspective into Objective Story exposition. It can identify when an Influence Character has become advice instead of pressure. It can notice when the Relationship Story has disappeared into plot logistics. It can help generate alternatives that stay inside the Storyform rather than merely matching the genre.
But it cannot be allowed to replace the act of deciding what the story means.
The Human Advantage Is Not Vibes
StoryScope gives the AI-writing conversation a better center of gravity.
The shallow version says humans have soul and AI does not. That may be emotionally true for many readers, but it leaves writers with a foggy defense. The sharper version is that human stories more often carry unusual combinations of narrative decisions: temporal complexity, ambiguous agency, specific cultural reference, withheld meaning, unresolved pressure, and choices that do not reduce neatly to lesson delivery.
In Dramatica terms, the human advantage is not simply personal style. It is the capacity to build and sustain a coherent narrative argument from the inside. A writer can decide that a scene should resist closure because the Main Character has not yet faced the right kind of pressure. A writer can let the Objective Story move forward while the Relationship Story curdles underneath it. A writer can withhold explicit theme because the arrangement of Storybeats is already saying what needs to be said.
That is why the paper feels so important. It is not merely another AI detector paper. It is evidence that narrative structure leaves fingerprints. The shape of agency, revelation, time, causality, social pressure, and moral clarity tells us something about who, or what, built the story.
The response should not be panic. It should be seriousness.
Writers who use AI casually will get AI’s default narrative habits. Writers who use AI after doing the structural work may get something much more interesting: a tool constrained by human meaning. The difference between those outcomes is the difference between asking a model to produce fiction and asking it to serve a Storyform.
The em dash was never the interesting tell.
The interesting tell is whether the story knows what it is arguing before it starts explaining itself.
Sources
- Ethan Mollick, X post on StoryScope and AI narrative tells
- Ethan Mollick, X follow-up on AI storytelling persistence
- Mohammad Siam, X reply on explanation and implication
- Jim Hull, X reply on putting narrative first
- Jenna Russell, Rishanth Rajendhran, Chau Minh Pham, Mohit Iyyer, and John Wieting, StoryScope: Investigating idiosyncrasies in AI fiction
- StoryScope authors, StoryScope GitHub repository
- Dramatica, Of Stories and Storyforms
- Dramatica, What is Dramatica?