This tool scans the selected directory or directories for
duplicate files, i.e. files with identical content. Duplicate files are
identified by first calculating the SHA-1 digest of each file and then looking
for values that appear more than once. In particular, files with identical
content are guaranteed to have the same SHA-1 digest, while files with
differing content will have different SHA-1 values with very high certainty.
All computed SHA-1 values are stored in a hash table, so collisions are found
quickly and we do NOT need to compare every digest to every other one. Also,
the files are processed concurrently in multiple "worker" threads in order to
parallelize and speed up the SHA-1 computations on multi-core processors. On
our test machine it took ~15 minutes to analyse all the ~260,000 files on the
system drive (~63.5 GB). During this operation ~44,000 duplicates were found.
The list of identified duplicates can be exported to the XML and INI formats.
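
The approach described above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the tool's actual source
code: the function names (sha1_of_file, iter_files, find_duplicates), the
chunk size and the default worker count are all hypothetical. A dictionary
plays the role of the hash table and a thread pool plays the role of the
"worker" threads. Unreadable files are not handled here and would raise an
exception.

```python
import hashlib
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iter_files(roots):
    """Yield the path of every regular, non-symlink file below the given directories."""
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    yield path


def find_duplicates(roots, workers=4):
    """Group files by SHA-1 digest and return only the groups with 2+ members."""
    paths = list(iter_files(roots))
    by_digest = defaultdict(list)  # hash table: digest -> list of paths
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so zip() pairs each path
        # with its own digest even though hashing runs concurrently.
        for path, digest in zip(paths, pool.map(sha1_of_file, paths)):
            by_digest[digest].append(path)
    # A digest that maps to more than one path marks a set of duplicates.
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling find_duplicates(["some/dir"]) returns a dictionary that maps each
repeated digest to the sorted list of files sharing it. Because every digest
is inserted into the dictionary once, the grouping step takes time roughly
proportional to the number of files rather than to the number of file pairs.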
Platforms: Windows XP, Windows Vista, Windows 7, Windows 8
License: GNU General Public License
Version 2.03
- Release Date: Dec 19, 2014
- Performance optimizations
- Display file name and path in separate columns
- Further improved sorting of the results
- Various minor fixes and improvements