Tag: duplicate files
-
Deleting about 24,000 files takes about 5 minutes
$ wc -l zz*txt
  24028 zz-discard-paths.txt
  78968 zz-keep-paths.txt
 102996 total
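The deletion step can be sketched as a one-liner, assuming the discard file holds one path per line with no embedded newlines (GNU xargs' `-d '\n'` handles spaces in names safely):

```shell
# Remove every path listed in zz-discard-paths.txt, one per line.
# Assumes GNU xargs; -- guards against filenames starting with a dash.
xargs -d '\n' rm -f -- < zz-discard-paths.txt
```

At roughly 24,000 files in 5 minutes, that works out to about 80 deletions per second, which is plausible for metadata-heavy operations on a spinning disk.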
-
Deduplication, continued.
OK, so what started out as a bash script grew into a rather finicky Perl script. I used a bunch of parallel hashes, judging candidates by combinations of duplicate names and identical file sizes, then scoring each path name and keeping the highest score. I ended up not using the file hashes, because I decided that…