Why My PR Carried Other People's Commits: A Git Detective Story

By Dylan · 5 min read

Today I submitted a simple documentation PR to an open-source project. One commit, one small improvement. But when I opened the PR on GitHub, I was shocked to see five commits listed—including commits from other contributors I had never touched. You can see the original PR here: zigcc/zig-course#295.

What followed was a fascinating journey into Git's internals.

The Crime Scene

I was contributing to zig-course, a Chinese tutorial for the Zig programming language. My PR should have contained just one commit, but GitHub showed five:

6ed454e - fix: change 对象文件 to 目标文件 (Dylan)        ← My old PR?
736985a - fix: 适配 Zig 内联汇编语法变更 (jinzhongjia)   ← Who?
2b2774c - fix: 适配 Zig 内联汇编语法变更 (jinzhongjia)   ← Who??
f8774c8 - Merge branch 'zigcc:main' into main            ← What???
dbe0044 - docs(struct): add struct initialization section ← My actual commit

I was confused. I had clicked "Sync fork" on GitHub before creating my branch. Everything should have been clean. But clearly, something was wrong.

The Fork Sync Trap

Here's what I thought fork sync did:

Upstream main:  A → B → C → D
                    ↓ sync (replace)
My fork main:   A → B → C → D

Here's what it actually did:

Upstream main:   A → B → C → D
My fork main:    A → B → X → Merge commit
                         ↑
                 My old commit (still there!)

When your fork's branch has commits that upstream does not, GitHub's fork sync performs a merge, not a reset. It brings in upstream changes but never deletes the commits already on your fork, even if those changes have already been merged upstream through a PR.
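To see the effect in isolation, here is a minimal sketch in a throwaway repository. The branch names upstream-main and fork-main are hypothetical stand-ins for upstream/main and my fork's main, the empty commits mirror the letters in the diagram, and a local git merge stands in for what the sync button did on a diverged branch.

```shell
set -eu

# Throwaway repo. "upstream-main" and "fork-main" are hypothetical stand-ins
# for upstream/main and the fork's main branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

git checkout -q -b upstream-main
git commit -q --allow-empty -m "A"
git commit -q --allow-empty -m "B"

# The fork gains commit X, which upstream does not have under this hash.
git checkout -q -b fork-main
git commit -q --allow-empty -m "X"

# Upstream moves on with C and D.
git checkout -q upstream-main
git commit -q --allow-empty -m "C"
git commit -q --allow-empty -m "D"

# "Sync" the diverged fork by merging upstream into it.
git checkout -q fork-main
git merge -q --no-edit upstream-main

# Commits reachable from the fork but not from upstream: X plus the merge commit.
extra=$(git rev-list --count upstream-main..fork-main)
echo "fork-only commits: $extra"
```

In a real clone, the equivalent check would be something like git rev-list --count upstream/main..origin/main after fetching both remotes. Anything above zero means the fork's main is not a clean copy of upstream.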

The Phantom Duplicate

But wait. My old commit (6ed454e, the "对象文件" fix) was already merged upstream! I'm even listed as a contributor. Why was it still haunting my fork?

I ran a search and found something strange:

$ git log --all --oneline --grep="对象文件"
35750b1 fix: change `对象文件` to `目标文件 (object file)`
6ed454e fix: change `对象文件` to `目标文件 (object file)`

Two commits with the exact same message. Same author. Same content. But different hashes. How?

$ git show -s 6ed454e --format="Committer: %cn <%ce>"
Committer: GitHub <noreply@github.com>

$ git show -s 35750b1 --format="Committer: %cn <%ce>"
Committer: jinzhongjia <mail@nvimer.org>

The committer changed. And that's enough to generate an entirely different hash.

Author vs Committer

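Git distinguishes between two identities on every commit. The Author is the person who originally wrote the change. The Committer is the person (or system) who created the commit object, whether by running git commit or by replaying the change through a rebase, cherry-pick, or merge button. Usually they're the same, but not always: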

  • When you commit via command line: Author = You, Committer = You
  • When you edit a file on GitHub's web UI: Author = You, Committer = GitHub
  • When a maintainer clicks "Rebase and merge": Author = You, Committer = Maintainer

My original commit was made through GitHub's web editor, so the committer was "GitHub". When the maintainer used "Rebase and merge", they became the new committer. Git computes a commit's hash over the entire commit object: the tree (the content snapshot), the parent hashes, the author and committer identities with their timestamps, and the message. Changing the committer alone was therefore enough to produce a new hash.
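A small experiment makes this concrete. The sketch below uses git commit-tree to build commits that share the same tree, author, timestamps, and message, and differ only in the committer. The author email, the timestamps, and the message are made-up values for illustration; the two committer identities are the ones from the output above.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# An empty tree is enough: the content is identical for every commit below.
tree=$(git mktree </dev/null)

# $1 = committer name, $2 = committer email.
# Author, dates, and message are fixed (hypothetical values), so the
# committer is the only input that varies.
make_commit() {
  GIT_AUTHOR_NAME="Dylan" \
  GIT_AUTHOR_EMAIL="dylan@example.com" \
  GIT_AUTHOR_DATE="1700000000 +0000" \
  GIT_COMMITTER_NAME="$1" \
  GIT_COMMITTER_EMAIL="$2" \
  GIT_COMMITTER_DATE="1700000000 +0000" \
  git commit-tree "$tree" -m "fix: same message"
}

via_web=$(make_commit "GitHub" "noreply@github.com")
via_rebase=$(make_commit "jinzhongjia" "mail@nvimer.org")
via_web_again=$(make_commit "GitHub" "noreply@github.com")

echo "web editor committer:       $via_web"
echo "maintainer committer:       $via_rebase"
echo "web editor committer again: $via_web_again"

# Show the raw fields that feed the hash.
git cat-file -p "$via_web"
```

The first and third hashes match because every input is identical, while the second differs even though only the committer line changed.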

Git saw 6ed454e and 35750b1 as two completely different commits, even though the actual code change was identical.
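Git does have a way to ask whether two commits carry the same change: git patch-id hashes the diff rather than the commit object. The following sketch reproduces the situation in a throwaway repository, with hypothetical branch names (fork, upstream), a hypothetical file doc.txt, and a made-up maintainer identity. A cherry-pick stands in for the maintainer's "Rebase and merge".

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Dylan"
git config user.email "dylan@example.com"

echo "base" > doc.txt
git add doc.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)

# The original commit, as it exists on the fork.
git checkout -q -b fork
echo "fixed" > doc.txt
git commit -q -am "fix: wording"
original=$(git rev-parse HEAD)

# The same change replayed by someone else (hypothetical maintainer identity).
git checkout -q -b upstream "$base"
GIT_COMMITTER_NAME="Maintainer" GIT_COMMITTER_EMAIL="maintainer@example.com" \
  git cherry-pick "$original" >/dev/null
replayed=$(git rev-parse HEAD)

# patch-id hashes the diff itself, ignoring commit metadata.
patch_id() { git show "$1" | git patch-id --stable | cut -d' ' -f1; }

echo "commit hashes: $original vs $replayed"
echo "patch ids:     $(patch_id "$original") vs $(patch_id "$replayed")"

# A leading "-" marks a fork commit whose change already exists upstream.
git cherry upstream fork
```

The commit hashes differ while the patch IDs agree, which is the same relationship 6ed454e and 35750b1 had in my fork.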

The Three Merge Strategies

GitHub offers three ways to merge a PR, and each has different implications:

Merge commit creates a merge commit and preserves your original hashes. History shows branches converging.

Rebase and merge replays your commits on top of the latest main. Your commits get new hashes because their parent changes (and the committer becomes the maintainer).

Squash and merge combines all your commits into a single new commit. All original hashes are lost.

The zig-course project uses "Rebase and merge", which is why my commit hash changed from 6ed454e to 35750b1 even though my branch did not need rebasing: GitHub's "Rebase and merge" always rewrites the committer information and creates new commits.

The Fix

The solution was to bypass my polluted fork entirely:

# Add upstream as a remote
git remote add upstream https://github.com/zigcc/zig-course.git

# Fetch upstream (download only, don't merge into current branch)
git fetch upstream

# Create a new branch based on upstream/main, not origin/main
git checkout -B my-feature upstream/main

# Cherry-pick only my commit
git cherry-pick dbe0044

# Force push to update the PR
git push --force origin my-feature

The key insight: your fork is just a place to push code. It's not meant to be your source of truth. Always branch from upstream/main.
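To check the idea before force pushing, it helps to count what a branch carries on top of upstream. The sketch below simulates both approaches in a throwaway repository. All branch names (upstream-main, fork-main, feature-from-fork, feature-clean) and the file notes.txt are hypothetical stand-ins, not the real zig-course branches.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

git checkout -q -b upstream-main
git commit -q --allow-empty -m "A"

# The fork's main carries a stale commit that upstream lacks.
git checkout -q -b fork-main
git commit -q --allow-empty -m "X (stale)"

# Feature branch cut from the polluted fork main.
git checkout -q -b feature-from-fork
echo "new section" > notes.txt
git add notes.txt
git commit -q -m "docs: add section"
feature_commit=$(git rev-parse HEAD)

# The fix: branch from upstream instead and cherry-pick only the real commit.
git checkout -q -B feature-clean upstream-main
git cherry-pick "$feature_commit" >/dev/null

# Commits each branch has on top of upstream.
polluted=$(git rev-list --count upstream-main..feature-from-fork)
clean=$(git rev-list --count upstream-main..feature-clean)
echo "polluted branch: $polluted commits, clean branch: $clean commit"
```

In a real clone, git log --oneline upstream/main..my-feature gives the same preview of what the PR will list.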

What This Taught Me

What started as a confusing PR error became a deep dive into Git's data model. Fork sync is a merge, not a reset—it won't remove your old commits even if they've been merged upstream under a different hash. Commit hashes depend on committer info, so "Rebase and merge" will always change your hash. And the distinction between author and committer, while subtle, turns out to matter more than I expected.

The underlying lesson is simple: in Git, identity is everything. Change a single piece of metadata, even just the committer field, and you get an entirely different object. And that's not a bug. That's the elegant engineering behind content-addressable storage.