What is a duplicate file?

Last updated July 1, 2026

Short answer

A duplicate file is another copy of content you already have. There are two kinds worth separating: exact duplicates, which are byte-for-byte identical and can be found with a checksum, and near duplicates, which look the same but differ slightly, such as two exports of one photo. They need different methods to catch.

Exact duplicates

An exact duplicate is byte-for-byte identical to another file. The names and locations can differ, but the contents match perfectly. These are the easy case: compute a checksum for every file, and any two files with the same checksum are exact duplicates. With a strong hash such as SHA-256, an accidental match between two different files is so unlikely that it can be ignored in practice. This is reliable because the checksum is derived from the actual bytes, not from the file name or size.
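
As an illustration of the checksum step, here is a minimal sketch in Python. It assumes SHA-256 as the checksum, which the text above does not prescribe, and the function name file_checksum is made up for this example. The file is read in chunks so that large files do not have to fit in memory.

```python
import hashlib


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's contents.

    Assumption for this sketch: SHA-256 is the checksum in use.
    Only the bytes are hashed, so name and location play no part.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with different names but the same bytes produce the same digest, and changing a single byte produces a different one.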

Near duplicates

A near duplicate looks the same to a person but is not identical to a computer. Examples:

  • The same photo exported at two quality settings.
  • An image that was rotated, cropped, or lightly edited.
  • A document saved by two different apps, each adding its own metadata.

These have different checksums, so a hash comparison will not pair them. Catching near duplicates needs content analysis, such as comparing what an image actually shows rather than its exact bytes.
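
One simple form of content analysis is a perceptual hash. The sketch below uses an "average hash", chosen here only as an example and not named in the text above. It assumes each image has already been decoded and downscaled to a small grid of grayscale values (0 to 255), a step that a real tool would do with an image library. The names average_hash and hamming_distance are made up for this example.

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is brighter than the mean.

    Assumption for this sketch: `pixels` is a small 2D list of grayscale
    values (0-255) taken from an already decoded, downscaled image.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Count the positions where two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Tiny made-up 2x2 grids standing in for real images.
original = [[10, 200], [220, 30]]
reexport = [[12, 198], [221, 33]]    # same picture, slightly different values
different = [[200, 10], [30, 220]]   # a different picture

close = hamming_distance(average_hash(original), average_hash(reexport))
far = hamming_distance(average_hash(original), average_hash(different))
```

A small distance suggests the same picture and a large one suggests different pictures. Where to draw the line is a judgment call, and a hash this simple will not recognize a rotated or cropped image, so treat it as a starting point for review rather than a verdict.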

Why the distinction matters

If you only care about reclaiming space from literal copies, exact duplicate detection by checksum is fast and safe. If you are cleaning a photo library where the same shot exists in several slightly different versions, you need the near-duplicate approach, and you have to decide which version to keep. Treating both as the same problem leads either to missed duplicates or to deleting files you wanted to keep.

How to find them

For exact duplicates across a folder or drive, hash every file and group the matches. For near duplicates among photos, use a tool that compares image content. Some workflows use both: checksums to clear out the obvious literal copies first, then content matching to review the visually similar ones that remain.
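
The "hash every file and group the matches" step can be sketched in Python as follows. This is one possible approach under stated assumptions, not how any particular tool works: it uses SHA-256, skips symbolic links, and first groups files by size so that only files sharing a size are hashed (files of different sizes cannot be identical). The names sha256_of and find_exact_duplicates are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return groups of paths under `root` whose contents are identical."""
    # Pass 1: bucket regular files by size (an optimization, see above).
    by_size = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[(size, sha256_of(path))].append(path)

    # Keep only the groups that contain two or more files.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling find_exact_duplicates on a folder returns a list of groups, each a list of paths with matching contents. The sketch only reports the groups. Deciding which copy to keep is left to the person reviewing them.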

FileLister: Folder Inventory

Find exact duplicate files by checksum across a folder on Mac and export the results. Runs on-device. · macOS