Troubleshooting

Fixing Duplicate Emails After Migration

Remove duplicate emails after migration with concrete deduplication steps, checkpoint repair, and label-mapping fixes that work across Gmail, Office 365, and IMAP.

Dan Okafor

MSP Practice Lead

Reviewed by Priya Shah
· 6 min read

You finished an IMAP migration, opened the destination mailbox, and the message counts are wrong. Every message appears twice, or worse, three or four times. The user is on the phone asking why their inbox suddenly has 80,000 items instead of 20,000. The good news is that duplicates are almost always recoverable without losing data, provided you stop the running job and apply a header-based deduplication pass before anything else changes. This post walks through identifying the duplication pattern, cleaning it up, and stopping it from happening on the next run.

Heads up

Typical signals of a duplicated mailbox include:

Folder Inbox: source=12,431 destination=24,862
Duplicate Message-ID detected: <CABxyz@mail.gmail.com>
WARN: UID 4521 already exists at destination, writing copy

Why this happens

Three patterns cause almost every duplicated-mailbox case after an IMAP migration.

Improper checkpoint on resume. When a migration is interrupted (network blip, throttling, laptop sleep) and you restart it, the tool should resume from the last successfully migrated UID in each folder. If the checkpoint is missing or corrupt, the tool re-scans from UID 1, finds no record of the messages it has already written, and writes them all again. You end up with two clean copies of every pre-interruption message.

Running the tool twice without deduplication. Someone hits "Migrate" on Tuesday and again on Friday because the first run looked incomplete. Without a destination-side deduplication step (header match on Message-ID, or a content hash), the second run treats every message as new and writes it again. This is the most common cause in MSP environments, where two technicians each believe they own the migration.

Gmail labels mapped to multiple folders. Gmail uses labels rather than folders, so the same message can carry the labels Inbox, Important, and a custom label. Exposed over IMAP, that one message appears in three IMAP folders. A migration tool that copies each IMAP folder verbatim writes the message three times on the destination, which has real folders, not labels. The destination user sees three copies in three places.

Fix it now

  1. Identify the duplication pattern

    Before deleting anything, work out what you are dealing with. Pick a sample folder (the Inbox is fine) and sort by Subject and Date. If you see exact pairs of identical messages with the same Message-ID in one folder, you have a rerun duplication. If you see the same message appearing in several folders with the same Message-ID, you have a Gmail-label-to-folder mismatch. If you see near-duplicates with different Message-IDs, you may be looking at forwarded copies rather than real duplicates.
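The sorting exercise above can be approximated in a few lines once you have exported a folder and Message-ID pair for each destination message. This is a minimal sketch, not a feature of any particular tool: the `Msg` tuple and `classify_duplicates` function are hypothetical names, and it assumes every message carries a Message-ID header.

```python
from collections import defaultdict
from typing import NamedTuple


class Msg(NamedTuple):
    folder: str
    message_id: str


def classify_duplicates(messages):
    """Label each duplicated Message-ID as "rerun" or "label-fanout".

    "rerun": at least one folder holds more than one copy.
    "label-fanout": one copy in each of several folders.
    Message-IDs that appear only once are left out of the result.
    """
    folders_by_id = defaultdict(list)
    for msg in messages:
        folders_by_id[msg.message_id].append(msg.folder)

    result = {}
    for message_id, folders in folders_by_id.items():
        if len(folders) < 2:
            continue  # unique message, nothing to report
        if len(set(folders)) < len(folders):
            result[message_id] = "rerun"
        else:
            result[message_id] = "label-fanout"
    return result
```

Near-duplicates with different Message-IDs never share a key here, so they are not reported, which matches the advice to treat them as forwarded copies until proven otherwise.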

  2. Stop any active migration jobs

    Pause the migration tool. If anything is still running it will keep adding to the problem. In Mailbox Taxi this is a single click; in other tools you may need to kill the worker process. Do not start any cleanup until every active job is confirmed paused, otherwise you will dedup against a moving target.

  3. Run a header-based deduplication pass

    Many migration tools include a deduplication mode that walks the destination folder, hashes each message by Message-ID, From, Date, and Subject, and removes exact matches, keeping the oldest copy. If your tool does not have this, imapsync --delete2duplicates and doveadm deduplicate (on Dovecot destinations) do a comparable job. Always run dedup against a snapshot, or against a destination you have backed up first, so you can roll back.
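To make the header-hash idea concrete, here is a minimal sketch of how such a pass could decide what to delete. It is an illustration only, not how imapsync or doveadm work internally. The function names are hypothetical, the input is assumed to be (UID, headers) pairs already fetched from one destination folder, and "oldest copy" is assumed to mean the lowest UID.

```python
import hashlib

DEDUP_HEADERS = ("Message-ID", "From", "Date", "Subject")


def dedup_key(headers):
    """Hash the four identifying headers into one comparison key."""
    parts = [headers.get(name, "").strip() for name in DEDUP_HEADERS]
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


def plan_dedup(messages):
    """Return (keep_uids, delete_uids) for one folder.

    messages: iterable of (uid, headers_dict) pairs.
    The lowest UID for each key is kept (assumption: lowest UID == oldest copy).
    """
    first_seen = {}
    delete = []
    for uid, headers in sorted(messages, key=lambda pair: pair[0]):
        key = dedup_key(headers)
        if key in first_seen:
            delete.append(uid)  # later copy of a message we already keep
        else:
            first_seen[key] = uid
    return sorted(first_seen.values()), delete
```

The function only produces a plan. Reviewing the delete list before acting on it keeps the rollback advice above practical.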

  4. Repair the per-folder checkpoint

    Once duplicates are gone, fix the cause. Open the migration tool's state file (Mailbox Taxi stores it as a local JSON file in the project directory) and confirm that every folder has a lastUid value that matches the highest successfully migrated UID. A lower value makes the next run re-copy messages that are already at the destination; a higher value risks skipping messages that were never copied. If the checkpoint is missing, regenerate it by running the tool in "verify only" mode, which compares source and destination UIDs without writing anything.
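A quick script can flag the checkpoints that would cause another round of duplicates. The sketch below is hedged heavily: the state-file layout (`{"folders": {name: {"lastUid": n}}}`) is a hypothetical shape chosen for illustration, not Mailbox Taxi's documented format, and it assumes lastUid records the last UID written for that folder.

```python
import json


def check_checkpoints(state_json, migrated_uids):
    """Return {folder: problem} for checkpoints that would trigger re-copying.

    state_json: text of the state file (hypothetical layout, see above).
    migrated_uids: {folder: [source UIDs confirmed present at the destination]}.
    """
    folders = json.loads(state_json).get("folders", {})
    problems = {}
    for folder, uids in migrated_uids.items():
        highest = max(uids) if uids else 0
        last_uid = folders.get(folder, {}).get("lastUid")
        if last_uid is None:
            problems[folder] = "missing"  # resume would restart from UID 1
        elif last_uid < highest:
            problems[folder] = "behind"   # resume would re-copy some messages
    return problems
```

Any folder reported here is a candidate for regenerating the checkpoint before the next run.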

  5. Fix Gmail label mapping

    If the source is Gmail or Google Workspace, switch the tool from IMAP folder mode to Gmail API label mode. The API exposes each message exactly once, with a list of labels attached. Map labels to destination folders one-to-one (Inbox to Inbox, Sent to Sent) and write each message a single time. The IMAP protocol glossary covers why IMAP and Gmail labels do not map cleanly.
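Writing each message once means picking a single destination folder even when a message carries several labels. One way to do that is a simple priority rule, sketched below. The priority order, the `Archive` fallback, and the function names are assumptions for illustration. `INBOX`, `SENT`, `DRAFT`, and `IMPORTANT` are Gmail system label IDs, while `Label_7` stands in for a user label.

```python
# Assumed priority: a user label wins, then Sent, Drafts, Inbox.
SYSTEM_PRIORITY = ["SENT", "DRAFT", "INBOX"]
SYSTEM_FOLDERS = {"SENT": "Sent", "DRAFT": "Drafts", "INBOX": "Inbox"}


def choose_folder(label_ids, user_folders, fallback="Archive"):
    """Pick exactly one destination folder for a message's label list."""
    for label in sorted(label_ids):
        if label in user_folders:
            return user_folders[label]
    for label in SYSTEM_PRIORITY:
        if label in label_ids:
            return SYSTEM_FOLDERS[label]
    return fallback  # e.g. a message that only carries IMPORTANT


def plan_writes(messages, user_folders):
    """messages: {gmail_message_id: [label ids]} -> {gmail_message_id: folder}."""
    return {
        message_id: choose_folder(labels, user_folders)
        for message_id, labels in messages.items()
    }
```

Because the plan is keyed by message rather than by folder, a message with three labels still produces one write.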

  6. Validate against source counts

    Compare destination message counts per folder against the source. Allow a small variance for messages that were legitimately filtered (spam, trash), but a difference of more than 1 to 2 percent in any folder indicates an unresolved issue. The post-migration validation playbook covers the exact checks to run. If you still see drift, work through the missing folders fix before assuming the dedup itself failed.
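The per-folder comparison is simple arithmetic and easy to script. The sketch below assumes you already have message counts per folder from both sides. The function name and the 2 percent default tolerance are illustrative choices based on the threshold above.

```python
def count_drift(source, destination, tolerance=0.02):
    """Return {folder: drift} for folders whose counts differ by more than tolerance.

    source, destination: {folder_name: message_count}.
    Drift is |destination - source| / source, so 1.0 means the count doubled
    or the folder is missing entirely.
    """
    report = {}
    for folder, src in source.items():
        dst = destination.get(folder, 0)
        if src:
            drift = abs(dst - src) / src
        else:
            drift = 1.0 if dst else 0.0  # empty source folder gained messages
        if drift > tolerance:
            report[folder] = round(drift, 4)
    return report
```

With the Inbox figures from the log sample earlier (12,431 at the source, 24,862 at the destination) the drift is 1.0, a full extra copy of the folder.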

How to prevent it next time

Always choose a migration tool that records per-folder UID checkpoints to disk after every batch, not just at the end of the run. Mailbox Taxi flushes checkpoint state every 500 messages so an unexpected restart resumes within a 500-message window.
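For readers evaluating or scripting their own tooling, the pattern looks roughly like the sketch below: copy in UID order, record progress, and flush the state file atomically after each batch. This is an assumption-laden illustration, not Mailbox Taxi's implementation. The `copy_message` callback, the state layout, and the function names are hypothetical.

```python
import contextlib
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # atomic swap: readers see old or new, never half
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp)
        raise


def migrate_folder(folder, uids, copy_message, path, state, batch_size=500):
    """Copy UIDs above the checkpoint, flushing state every batch_size messages."""
    pending = 0
    for uid in sorted(uids):
        if uid <= state.get(folder, {}).get("lastUid", 0):
            continue  # already migrated in an earlier run
        copy_message(folder, uid)
        state.setdefault(folder, {})["lastUid"] = uid
        pending += 1
        if pending >= batch_size:
            save_checkpoint(path, state)
            pending = 0
    save_checkpoint(path, state)  # final flush for the partial last batch
```

Writing to a temporary file and renaming it means a crash mid-write leaves the previous checkpoint intact, which avoids the "missing or corrupt" case described earlier.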

Treat every rerun as risky by default. Before clicking Start on a mailbox that has previously run, confirm that the tool will dedup on Message-ID at write time, not just on UID. Tools that only check the destination UID will happily write the same message under a new UID.
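A write-time check can be as small as a set of Message-IDs consulted before every append. The sketch below shows the idea under stated assumptions: `DedupWriter` and its `append` callback are hypothetical, the existing IDs are assumed to have been fetched from the destination folder beforehand, and messages with no Message-ID are always written because there is nothing to compare.

```python
class DedupWriter:
    """Skip any message whose Message-ID is already at the destination."""

    def __init__(self, existing_ids, append):
        self._seen = {self._normalize(mid) for mid in existing_ids}
        self._seen.discard("")
        self._append = append  # callable that actually stores the raw message

    @staticmethod
    def _normalize(message_id):
        return (message_id or "").strip()

    def write(self, message_id, raw_message):
        """Return True if the message was written, False if it was skipped."""
        key = self._normalize(message_id)
        if key and key in self._seen:
            return False  # same Message-ID already present: do not write again
        self._append(raw_message)
        if key:
            self._seen.add(key)
        return True
```

Because the check uses the header rather than the destination UID, a rerun that would otherwise assign a fresh UID to an old message is stopped before the write.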

For Gmail and Google Workspace sources, use the Gmail API path rather than IMAP wherever the tool supports it. It avoids the label-to-folder duplication problem entirely.

Document the migration as one job with one owner. The simplest way to avoid two technicians re-running the same mailbox is to have a single ticket, a single owner, and a written changelog. The complete email migration guide includes a sample runbook you can copy.

Try Mailbox Taxi

Migrate your mailbox the easy way

Join the waitlist for early access and lock in launch pricing.
