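An exact duplicate is byte-for-byte identical to another file. The names and locations can differ, but the contents match perfectly. These are the easy case: compute a checksum for every file, and any two files with the same checksum are exact duplicates. With a strong hash such as SHA-256, an accidental match between two different files is so unlikely that it can be ignored in practice. This is reliable because the checksum is computed from the actual bytes, not from the file name or size.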
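As a minimal sketch of the checksum idea, the following Python function reads a file in chunks and returns its SHA-256 digest. The function name `file_checksum` and the chunk size are illustrative choices, not part of any particular tool; SHA-256 is assumed here as one reasonable hash.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's contents.

    The name and chunk size are illustrative assumptions.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with different names but the same bytes produce the same digest, while changing a single byte produces a different one.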
A near duplicate looks the same to a person but is not identical to a computer. Typical cases are the same photo saved at a different size, compression level, or file format.
These have different checksums, so a hash comparison will not pair them. Catching near duplicates needs content analysis, such as comparing what an image actually shows rather than its exact bytes.
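One common way to compare what an image shows is a perceptual hash. The sketch below uses a simple "average hash", which is only one such technique and is not necessarily what any given tool uses. It assumes each image has already been decoded, converted to grayscale, and shrunk to a small grid (for example 8x8) by an imaging library; the function names are hypothetical.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the mean.

    `pixels` is assumed to be a small grayscale grid (e.g. 8x8) that an
    imaging library has already produced from the full image.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


# Illustrative data: a gradient, a uniformly brighter copy, and a checkerboard.
original = [[(row * 8 + col) * 3 for col in range(8)] for row in range(8)]
brighter = [[value + 10 for value in row] for row in original]
unrelated = [[255 if (row + col) % 2 else 0 for col in range(8)] for row in range(8)]

near = hamming_distance(average_hash(original), average_hash(brighter))
far = hamming_distance(average_hash(original), average_hash(unrelated))
```

A small distance suggests the two images show the same thing, and a large one suggests they do not. Where to draw the line is a judgment call that depends on the library being cleaned.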
If you only care about reclaiming space from literal copies, exact duplicate detection by checksum is fast and safe. If you are cleaning a photo library where the same shot exists in several slightly different versions, you need the near-duplicate approach, and you have to decide which version to keep. Treating both as the same problem leads to either missed duplicates or deleted files you wanted.
For exact duplicates across a folder or drive, hash every file and group the matches. For near duplicates among photos, use a tool that compares image content. Some workflows use both: checksums to clear out the obvious literal copies first, then content matching to review the visually similar ones that remain.
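The "hash every file and group the matches" step can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 as the hash, symlinks skipped, and hypothetical function names `sha256_of` and `find_exact_duplicates`.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` by checksum; keep groups with 2+ members."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        # Skip directories and symlinks so each real file is hashed once.
        if path.is_file() and not path.is_symlink():
            groups[sha256_of(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each returned group lists paths whose contents match. The sketch only reports the groups; deciding which copy to keep is left to the caller.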