Looking for a GENUINE FREEWARE duplicate file finder
01-06-2019 8:26 PM
I have a multitude of duplicate files, the duplicates mostly having longer names that include the likes of
... (2018_06_04 06_52_58 UTC)....
I say GENUINE FREEWARE because I have downloaded too many which purport to be, but it then turns out they are very limited unless you upgrade to a paid version. Totally dishonest!
I want something that is easy to use, and allows the user good choice of drive and locations to scan.
I'll be grateful for any recommendations.
Re: Looking for a GENUINE FREEWARE duplicate file finder
01-06-2019 9:09 PM - edited 01-06-2019 9:23 PM
I use two:
https://www.auslogics.com/en/software/duplicate-file-finder/
and https://www.ashisoft.com. Both have paid-for versions, but the free versions have always been good enough for my needs, with various ways of searching and with previews.
CCleaner also has a duplicate file searcher but I found that took much longer.
Forum Moderator and Customer
Courage is resistance to fear, mastery of fear, not absence of fear - Mark Twain
He who feared he would not succeed sat still
Re: Looking for a GENUINE FREEWARE duplicate file finder
01-06-2019 9:23 PM
I've used FreeCommander quite a bit in the past.
To argue with someone who has renounced the use of reason is like administering medicine to the dead - Thomas Paine

Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 8:32 AM
I wrote one of these before for a forum member, but I guess they're not using it any more.
But there are duplicate files and there are duplicate file names; two files with the same name may not actually be duplicates. So any good duplicate finder will hash the files as well, to ensure they are true duplicates.
With that said, this is a very resource-intensive procedure, so it will take some time. Imagine you have a file called A.dat in the root of C:\ and a possible duplicate of it elsewhere on the disk. Every other file on the disk needs to be checked against A.dat to see whether it's a duplicate, so the more files you have, the longer it takes, and if hashing is involved it takes longer still. Even when one duplicate is found there may be more, so every file on the disk must be checked. The usual shortcut is to compare file sizes first: only files of identical size can possibly be duplicates, so only those ever need hashing, as in the sketch below.
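A minimal sketch of that size-first shortcut (C++17, an illustration only, not the tool mentioned above):

// Bucket files by size first, so only files that could possibly be
// duplicates (same size) are ever candidates for hashing or comparison.
#include <cstdint>
#include <filesystem>
#include <iostream>
#include <map>
#include <vector>

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        std::cerr << "Usage: " << argv[0] << " <directory>" << std::endl;
        return 1;
    }
    std::map<std::uintmax_t, std::vector<std::filesystem::path>> bySize;
    for (auto& entry : std::filesystem::recursive_directory_iterator(argv[1]))
    {
        if (entry.is_regular_file())
            bySize[entry.file_size()].push_back(entry.path());
    }
    // Buckets holding a single file cannot contain duplicates, so
    // only the multi-file buckets are worth hashing.
    for (auto& [size, paths] : bySize)
    {
        if (paths.size() < 2) continue;
        std::cout << paths.size() << " candidate files of size " << size << ":" << std::endl;
        for (auto& p : paths)
            std::cout << "  " << p << std::endl;
    }
    return 0;
}

Compile with g++ SizeBuckets.cpp -std=c++17 -o SizeBuckets -lstdc++fs; only the files it lists ever need a full hash.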
Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 11:30 AM
A good duplicate file detector will analyse photos on a similarity basis, so that virtually identical photos can be presented to a human for selection.
Simple hashing will only find exact duplicates.
digiKam (Linux) can do this - although it is really a much bigger photo management tool and does much more than this.
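For a flavour of what 'similarity basis' means, here is a minimal sketch of the average-hash (aHash) idea that perceptual comparison tools are commonly built on. It assumes the images have already been decoded and scaled to 8x8 grayscale thumbnails (the decoding step is omitted); similar images give hashes a small Hamming distance apart, whereas with SHA512 one changed pixel changes the entire digest:

// Average hash: each bit records whether a pixel is brighter than the mean.
#include <bitset>
#include <cstdint>
#include <iostream>
#include <numeric>

uint64_t averageHash(const uint8_t px[64])
{
    unsigned mean = std::accumulate(px, px + 64, 0u) / 64;
    uint64_t hash = 0;
    for (int i = 0; i < 64; i++)
        if (px[i] > mean)
            hash |= (1ULL << i);
    return hash;
}

int hammingDistance(uint64_t a, uint64_t b)
{
    // Number of bit positions where the two hashes differ.
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

int main()
{
    // Two hypothetical 8x8 thumbnails differing in a single pixel.
    uint8_t img1[64], img2[64];
    for (int i = 0; i < 64; i++)
        img1[i] = img2[i] = (i % 2) ? 200 : 30;
    img2[10] = 250;
    int d = hammingDistance(averageHash(img1), averageHash(img2));
    std::cout << "Hamming distance: " << d << " (small means likely similar)" << std::endl;
    return 0;
}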
"In The Beginning Was The Word, And The Word Was Aardvark."

Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 11:59 AM
That's all very well @VileReynard, but similar is not identical; that's where the hash comes in!
For what it's worth (no doubt nothing to most of you), this code will hash all of the files in a directory using a SHA512 digest.
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <exception>
#include <filesystem>
#include <openssl/sha.h>

// Hex string (two chars per digest byte, plus the terminating NUL) and raw digest.
char * mdString = new char[(SHA512_DIGEST_LENGTH * 2) + 1]{0};
unsigned char * mdDigest = new unsigned char[SHA512_DIGEST_LENGTH]{0};

unsigned long getFileSize(const int& fd)
{
    struct stat fileInfo;
    std::memset((void *)&fileInfo, 0, sizeof(fileInfo));
    if (fstat(fd, &fileInfo) < 0)
    {
        std::cerr << "Failed to stat file" << std::endl;
    }
    return fileInfo.st_size;
}

bool shaFile(const char * content, const unsigned long& size)
{
    bool success{false};
    try
    {
        std::memset(mdString, 0, (SHA512_DIGEST_LENGTH * 2) + 1);
        // Hash the whole buffer in one shot.
        SHA512_CTX context;
        SHA512_Init(&context);
        SHA512_Update(&context, content, size);
        SHA512_Final(mdDigest, &context);
        // Render the digest as lower-case hex, two characters per byte.
        for (unsigned int i = 0; i < SHA512_DIGEST_LENGTH; i++)
        {
            std::snprintf(&mdString[i * 2], 3, "%02x", (unsigned int) mdDigest[i]);
        }
        success = true;
    }
    catch (std::exception const& e)
    {
        std::cerr << e.what() << " while hashing file!" << std::endl;
    }
    return success;
}

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        std::cerr << "The path to hash the files in is missing from the command line!" << std::endl;
        return 1;
    }
    for (auto& f : std::filesystem::directory_iterator(argv[1]))
    {
        const int fd = open(f.path().c_str(), O_RDONLY);
        if (fd < 0)
        {
            std::cerr << "Unable to open " << f.path() << "!" << std::endl;
        }
        else
        {
            const unsigned long fileSize = getFileSize(fd);
            if (fileSize)
            {
                // Map the file into memory rather than reading it into a buffer.
                char * fileBuffer = (char *) mmap(0, fileSize, PROT_READ, MAP_SHARED, fd, 0);
                if (fileBuffer != MAP_FAILED)
                {
                    if (shaFile(fileBuffer, fileSize))
                    {
                        std::cout << "File : " << f.path() << " size : " << fileSize
                                  << " Hash : " << mdString << std::endl;
                    }
                    munmap(fileBuffer, fileSize);
                }
            }
            close(fd); // only close descriptors that were actually opened
        }
    }
    delete[] mdString;
    delete[] mdDigest;
    return 0;
}
To compile it use:
g++ FileHash.cpp -std=c++17 -o FileHash -lstdc++fs -lcrypto
And to use it do:
./FileHash ./
You'll then get output like this:
File : "./FileHash.cpp" size : 2147 Hash : e2719c384b834cb345bc36275c28a923369cbac0f474e4508b1d592af38827b0524925b4f22f57bcff899b2db36c982799a83d9c915fd053fc0caae0d54ff0ae
File : "./FileHash" size : 220536 Hash : 1c921c84549b7783df29e375499c3688e8a288cf281003f40d1515d890f2964e4ad402500c42ac6df1567c20c0f1273c8e121cf0f6c9619da91d67bd9b48545d
Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 12:07 PM
Very good!
"In The Beginning Was The Word, And The Word Was Aardvark."
Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 1:25 PM
Or you could just use the sha512sum command.

Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 1:56 PM
And if you do, you get the exact same results as well:
[mook@dev FileHash]$ sha512sum FileHash.cpp
e2719c384b834cb345bc36275c28a923369cbac0f474e4508b1d592af38827b0524925b4f22f57bcff899b2db36c982799a83d9c915fd053fc0caae0d54ff0ae  FileHash.cpp
[mook@dev FileHash]$ sha512sum FileHash
1c921c84549b7783df29e375499c3688e8a288cf281003f40d1515d890f2964e4ad402500c42ac6df1567c20c0f1273c8e121cf0f6c9619da91d67bd9b48545d  FileHash
Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 2:41 PM
You need to filter your files by MIME type - it is generally unusual to get duplicates of most file types.
Sometimes you get duplicates of content such as ebooks or image files - but not random files.
So you really need to check the contents of files, which makes it a hard problem.
"In The Beginning Was The Word, And The Word Was Aardvark."

Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 3:58 PM
I know what you mean by MIME type, but that's a web thing, and getting duplicates of the same file and type is quite easy if you copy them to various locations around your disk over time.
However, to do it the way you are suggesting you'd really need to open the file in binary mode and attempt to read a 'header'. You could do this easily if you could trust what the extension implied, but not all files have extensions, of course, and there's nothing stopping you renaming an image file as an exe.
Doing it by 'header' would be horrendous. Say you had an image file called doggie where you'd removed the extension: to truly determine its type you'd have to open it and read 8 bytes to see if it might be a PNG; if not, another 6 bytes to see if it were a GIF, or 2 bytes to see if it were a BMP, or it might be a JPEG, and so on. You get the idea, and that's restricting the task to image files!
I for one wouldn’t want to be doing this.
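Just to show how fiddly it gets even for the image-only case, here's a minimal sketch of that header check, limited to the well-known PNG, JPEG, GIF and BMP magic numbers:

// Sniff a file's type from its leading 'magic' bytes instead of its extension.
#include <cstdio>
#include <cstring>
#include <iostream>

const char * sniffType(const unsigned char * b, size_t n)
{
    static const unsigned char png[8] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    if (n >= 8 && std::memcmp(b, png, 8) == 0) return "PNG";
    if (n >= 3 && b[0] == 0xFF && b[1] == 0xD8 && b[2] == 0xFF) return "JPEG";
    if (n >= 6 && (std::memcmp(b, "GIF87a", 6) == 0 || std::memcmp(b, "GIF89a", 6) == 0)) return "GIF";
    if (n >= 2 && b[0] == 'B' && b[1] == 'M') return "BMP";
    return "unknown"; // ...and so on for every other format you care about
}

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        std::cerr << "Usage: " << argv[0] << " <file>" << std::endl;
        return 1;
    }
    FILE * fp = std::fopen(argv[1], "rb"); // binary mode, as described above
    if (!fp)
    {
        std::cerr << "Unable to open " << argv[1] << "!" << std::endl;
        return 1;
    }
    unsigned char buf[8];
    size_t n = std::fread(buf, 1, sizeof buf, fp); // 8 bytes covers all four signatures
    std::fclose(fp);
    std::cout << argv[1] << " looks like: " << sniffType(buf, n) << std::endl;
    return 0;
}

And that's four formats; a real implementation needs a signature table hundreds of entries long.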
Re: Looking for a GENUINE FREEWARE duplicate file finder
02-06-2019 6:37 PM
I use a paid-for product, EF Duplicate Files Manager; it's a €11.90 one-time payment. It's regularly updated and works extremely well. It has lots of options and an easy-to-use interface.
Given the complexity of what you need to do, I think it only fair to reward the person who develops such a complex product.
Re: Looking for a GENUINE FREEWARE duplicate file finder
03-06-2019 12:12 AM
@VileReynard wrote:
A good duplicate file detector will analyse photos on a similarity basis, so that virtually identical photos can be presented to a human for selection.
Simple hashing will only find exact duplicates.
digiKam (Linux) can do this - although it is really a much bigger photo management tool and does much more than this.
I use Awesome Photo Finder to analyse them. It picks up differences that are difficult to see.
Re: Looking for a GENUINE FREEWARE duplicate file finder
03-06-2019 12:27 AM
I got rid of a lot of the duplicates that had an extra string in the name compared with the originals by searching, for example, my docs folder very vaguely with *(*UTC), then removing them all to the Recycle Bin.
All the picture files had the long names, so I put My Pictures through one of the bulk renamers just to remove the offending string in the middle.
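In case anyone wants to script that renaming step, a minimal sketch in the same C++17 style as the hasher above (this is not the renamer actually used; it assumes the unwanted tag always has the ' (YYYY_MM_DD HH_MM_SS UTC)' form quoted in the first post):

// Strip a " (2018_06_04 06_52_58 UTC)"-style tag out of every file name.
#include <filesystem>
#include <iostream>
#include <regex>
#include <string>

namespace fs = std::filesystem;

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        std::cerr << "Usage: " << argv[0] << " <directory>" << std::endl;
        return 1;
    }
    // Matches tags like " (2018_06_04 06_52_58 UTC)" anywhere in the name.
    const std::regex tag{R"( \(\d{4}_\d{2}_\d{2} \d{2}_\d{2}_\d{2} UTC\))"};
    for (auto& entry : fs::directory_iterator(argv[1]))
    {
        if (!entry.is_regular_file()) continue;
        const std::string name = entry.path().filename().string();
        const std::string cleaned = std::regex_replace(name, tag, "");
        if (cleaned == name) continue; // nothing to strip
        const fs::path target = entry.path().parent_path() / cleaned;
        if (fs::exists(target)) // don't clobber the original
        {
            std::cout << "Skipping " << name << " (target exists)" << std::endl;
        }
        else
        {
            fs::rename(entry.path(), target);
            std::cout << name << " -> " << cleaned << std::endl;
        }
    }
    return 0;
}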