Couldn't find any answers to this with Google. When importing the same folder of pictures twice, Shotwell will skip duplicate photos. But how does it detect duplicates? If I import two different folders of pictures, some of which have the same name for some reason, will Shotwell assume they are duplicates? Or does it also factor in the file size, making false duplicates unlikely? Or does it hash the pictures, making false duplicates all but impossible?
Ubuntu – How does Shotwell detect duplicates
photo-management shotwell
Best Answer
I believe it is more advanced than simple name matching; I just tried it. In fact, it doesn't seem to be based on the name at all.
So I just created the following:
Imported the folder TestDir (which imports from any subdirectories too). This was the notice:
The two it had imported were blue.png and yellow.png. This is because they were created first (it chooses the oldest if there are duplicates).
This was confirmed by the next test:
pink2.png and pink.png have been created. pink2.png was created first, then pink.png. The successfully imported ones were blue.png, yellow.png and pink2.png. Because of that I assume it uses a hashing algorithm.
It is accurate enough that changing the colour of just 1 pixel from green to yellow on an A4 page was enough for it to no longer be detected as a duplicate. Pretty accurate then!
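To illustrate why a one-pixel change would be enough, here is a minimal sketch in Python. It is not Shotwell's code: the choice of MD5 as the digest and the fake "pixel" bytes are assumptions made purely for the demonstration. It only shows that a content hash changes as soon as a single byte of the data changes.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a hex digest of the raw bytes (MD5 chosen as an assumption)."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for image data: 1000 "green" RGB pixels (0, 255, 0).
page = bytes([0x00, 0xFF, 0x00] * 1000)

# Copy the data and turn the first pixel "yellow" (255, 255, 0).
altered = bytearray(page)
altered[0] = 0xFF
altered = bytes(altered)

# Identical content always gives the same digest...
same_digest = content_digest(page) == content_digest(bytes(page))
# ...while a single changed byte gives a different one.
different_digest = content_digest(page) != content_digest(altered)
```

Both `same_digest` and `different_digest` end up `True`, which matches the behaviour seen in the test above: the file name plays no part, only the bytes do.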
In fact, I just found this post here:
In fact in the source code, at line 732 is this: Kudos @ Jeremie Miserez
Sounds like it uses an MD5 hash!
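For anyone who wants to reproduce the behaviour outside Shotwell, here is a rough sketch in Python of hash-based duplicate detection. It is not Shotwell's actual implementation (Shotwell is written in Vala). The function names `file_md5` and `split_duplicates` are made up for this example, and using the modification time as a stand-in for "created first" is an assumption.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_duplicates(root):
    """Walk root recursively and split files into (imported, skipped).

    Files are visited oldest first (by modification time, an assumption),
    so the oldest copy of each distinct content is the one that is kept.
    """
    files = sorted(
        (p for p in Path(root).rglob("*") if p.is_file()),
        key=lambda p: (p.stat().st_mtime, str(p)),
    )
    seen = {}  # digest -> first path seen with that content
    imported, skipped = [], []
    for path in files:
        digest = file_md5(path)
        if digest in seen:
            skipped.append(path)  # same bytes as an earlier file
        else:
            seen[digest] = path
            imported.append(path)
    return imported, skipped
```

Calling `split_duplicates("TestDir")` on a folder like the one above would put the older copy of each picture in `imported` and the later copies in `skipped`, whatever the files are called.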
Shapes for directory tree from here