Diagnosing Procedural Failure
My prompt:
Generate a short story about a low-level technician who thinks of herself as a cultural anthropologist conducting an informal participant observation study within her organisation.
She has recently become interested in David Graeber's work. Include multiple essayistic digressions (not labelled as such) inspired by her reading.
Primary focus: diagnosing pathologies of workplace culture, investigating specific procedural failures
Assume: individual targeting, weaponised gossip, epistemic injustice
Setting: Berlin, 2024. The setting should feel integral to the narrative rather than merely incidental.
No dialogue. No tortured metaphors. Prioritise readability and flow.
Should work as a coherent piece of fiction.
Stylistic models: Philip Roth and John Updike
-
Responses: Claude Sonnet 4.5, GPT 5.2 Thinking
I asked ChatGPT, Claude and Gemini to critically assess the linked stories using this article by Sam Kriss as a thematic guide.
-
Assessments of Claude Sonnet 4.5’s “The Participant Observer”
GPT 5.2 Thinking: “As an answer to the original prompt, it’s competent and often effective. As a piece of fiction in a Roth/Updike register, it’s too frictionless—and, per your thematic guide, it carries several of the modern ‘AI house voice’ fingerprints (especially the em-dash + ‘not X but Y’ engine) that make prose feel composed rather than lived.”
Claude Sonnet 4.5: “The story succeeds primarily because it stays concrete and resists AI's tendency toward vaporous profundity. But the prompt's request for ‘essayistic digressions inspired by reading Graeber’ may have inadvertently encouraged exactly the kind of pre-digested theoretical pronouncements that AI excels at producing.”
Gemini 3 Pro: “The story is a competent piece of ‘sleek and precise’ prose, yet it fails to capture the specific stylistic models requested (Roth and Updike). Instead, it falls victim to several of the exact ‘telltale signs’ Kriss identifies as the ‘bizarre prose of the machine’. While it successfully avoids the most egregious vocabulary traps (like ‘delve’ or ‘tapestry’), it heavily relies on the structural tics—specifically the ‘Not X, but Y’ construction and the obsession with em dashes—that Kriss flags as the signature of the ‘Omniwriter’.”
-
Assessments of GPT 5.2 Thinking’s “Runbook for a Berlin Institute”
GPT 5.2 Thinking: “It’s a strong ‘workplace culture diagnosis’ essay-fiction hybrid, but it’s still too rhetorically upholstered and too morally symmetrical to feel like Roth/Updike—or like a single, irreplaceable person telling you what happened.”
Claude Sonnet 4.5: “This is high-quality AI writing that has successfully studied and avoided the crude markers of its predecessors. It reads like competent literary fiction and would likely pass undetected in many contexts. But it lacks the essential quality the kriss.txt article points to: the ‘small, strange way’ that genuinely human observation connects disparate experiences through lived reality rather than statistical correlation. It's not that it's bad—it's that it's executing a template for ‘literary fiction about workplace pathology’ rather than discovering that story through the messy process of actually writing it.”
Gemini 3 Pro: “While the story is stylistically cohesive and follows the prompt's negative constraints (no tortured metaphors), it confirms Kriss's central thesis: ‘Once, there were many writers, and many different styles. Now, increasingly, one uncredited author turns out essentially everything’.”