We Analyzed 10,000 Code Samples — Here’s What AI Writes Differently

December 16th, 2025 · 14 min read

Everyone says they can "feel" when code is written by AI. But feelings aren't data. So we decided to put the vibes to the test.

We collected 5,000 human-written snippets from open-source repositories (pre-2021 to avoid contamination) and generated 5,000 corresponding snippets using GPT-4, Claude 3.5, and Llama 3.

Then we ran them through our heuristic engine. The results were not just statistically significant; they were stark.

The Dataset

  • Human Set: Popular JS/TS repos (React, Vue, D3, Express).
  • AI Set: Generated via prompts like "Write a function to..." based on the human function signatures.
  • Languages: JavaScript, TypeScript, Python.

Top 10 Observed Differences

| Metric | Human Code | AI Code |
| --- | --- | --- |
| Avg. comment density | 12% | 28% (over-commenting) |
| Variable name length | 8.4 chars | 6.1 chars (generic names) |
| "Guard clause" usage | 68% of functions | 22% of functions |
| Error handling | Specific / bubbling | Generic `try/catch` |
| Unique vocabulary | High (domain slang) | Low (standard English) |
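As a rough illustration of how a metric like comment density can be measured, here is a simplified sketch. Our production engine uses a full tokenizer; `commentDensity` below is a hypothetical helper that only counts whole-line `//` comments.

```javascript
// Toy comment-density metric: fraction of non-empty lines that are
// line comments. A real implementation would tokenize the source to
// avoid counting "//" inside strings, and would handle block comments.
const commentDensity = (source) => {
  const lines = source.split("\n").filter((l) => l.trim().length > 0);
  if (lines.length === 0) return 0;
  const commented = lines.filter((l) => l.trim().startsWith("//")).length;
  return commented / lines.length;
};
```

For example, `commentDensity("// a\nconst x = 1;")` returns 0.5 — one comment line out of two non-empty lines.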

Visualizing the "Vibe Gap"

The most striking difference was in structural variety. Human code has peaks and valleys. AI code is a flat plain.

AI Structure (The Block)

const process = (data) => {
  if (data) {
    const result = [];
    for (let i = 0; i < data.length; i++) {
      if (data[i].isValid) {
        result.push(data[i]);
      }
    }
    return result;
  }
  return [];
};

Dense. Nested. Uniform.

Human Structure (The Flow)

const process = (items = []) => {
  if (!items.length) return [];

  return items.filter(isValid);
};

Spaced out. Linear. Expressive.

What Surprised Us

We expected AI to be "better" at syntax, and it is: it rarely makes syntax errors. What surprised us was how anxious its coding style is.

AI code is terrified of runtime errors. It checks for `null` constantly, even when types guarantee existence. It wraps simple logic in `try/catch`. It defaults to defensive coding patterns that actually make debugging harder because they swallow the useful crash data.
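To make the "swallowed crash data" point concrete, here is a contrast using a hypothetical `parseConfig` example (not drawn from our dataset):

```javascript
// Defensive pattern typical of AI output: the catch discards the error
// message and stack trace, so a malformed input fails silently and the
// bug surfaces far from its cause.
const parseConfigDefensive = (json) => {
  try {
    return JSON.parse(json);
  } catch (e) {
    return {}; // crash data swallowed
  }
};

// Letting the error bubble preserves the SyntaxError (message + stack)
// for the caller, who has the context to decide what to do.
const parseConfig = (json) => JSON.parse(json);
```

With the defensive version, `parseConfigDefensive("not json")` quietly returns `{}`; with the bubbling version, the caller gets a `SyntaxError` pointing at the real problem.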

The "Turing Test" for Code

Based on this data, we've refined our detection algorithm. We don't just look for "bad" code. We look for "statistically average" code.

If your code looks like the average of all code on GitHub, it triggers our AI detector. If it has quirks, weird formatting choices, and domain-specific slang, it passes as human.
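A toy version of that "statistical averageness" test might score a snippet by its distance from population means. This is a sketch only: the metric names, means, and weighting below are illustrative, not our real model.

```javascript
// Illustrative population statistics per metric (made-up values).
const POPULATION = {
  commentDensity: { mean: 0.2, std: 0.08 },
  avgNameLength: { mean: 7.2, std: 1.5 },
};

// "Quirk score": mean absolute z-score across metrics. A score near 0
// means the snippet sits at the statistical average (AI-like); a high
// score means it deviates in some direction (human-like quirks).
const quirkScore = (metrics) => {
  const zs = Object.keys(POPULATION).map((k) => {
    const { mean, std } = POPULATION[k];
    return Math.abs((metrics[k] - mean) / std);
  });
  return zs.reduce((a, b) => a + b, 0) / zs.length;
};
```

A snippet whose metrics match the population means exactly scores 0, the most "average" (and therefore most suspicious) possible result.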

Does Your Code Pass the Test?

We've fed the findings from these 10,000 samples into our Vibe Engine. Paste your snippet to see where it falls on the Human-AI spectrum.

Run the Vibe Test →