Duplicates of photos

I’m not sure if this is a bug. I did the Google Takeout download, and that’s given me a large file with loads of images and JSON files. Now that PhotoStructure has scanned it, I have two of every image in the library: one from Google Photos and one original. The original has any other copies I happen to have on my hard drives successfully de-duped, but the Google Photos image stands alone. Is there anything I can do to help with this? Happy to send originals.

Cheers

Tom

Thanks for the report! Any de-duplication issue you find I’ll most likely consider a bug: that part of PhotoStructure should “just work.”

Google Photos did nasty things to the metadata in many of my photos, deleting GPS info and actually rewriting exposure information, so I had to make PhotoStructure’s de-duplication handle that gracefully.

It looks like yours may have been mucked with in different ways. If you can email me an original and a Google Takeout variation, I can take a look.

:+1:

Responded via e-mail - thanks!

The two images you shared have different captured-at times because the tags they contain have different resolutions: one has “SubSec” (sub-second, here hundredths of a second) precision, the other only has to-the-second resolution:

mrm@speedy:~/Downloads/t2$ exiftool -j IMG_7140.JPG | grep 2011
  "DateTimeOriginal": "2011:07:26 16:19:10",
  "DateCreated": "2011:07:26",
  "DateTimeCreated": "2011:07:26 16:19:10+00:00",
mrm@speedy:~/Downloads/t2$ exiftool -j IMG_7140.CR2 | grep 2011
  "ModifyDate": "2011:07:26 16:19:10",
  "DateTimeOriginal": "2011:07:26 16:19:10",
  "CreateDate": "2011:07:26 16:19:10",
  "SubSecCreateDate": "2011:07:26 16:19:10.45",
  "SubSecDateTimeOriginal": "2011:07:26 16:19:10.45",
  "SubSecModifyDate": "2011:07:26 16:19:10.45",

I hadn’t thought of this case before, so I’ve updated the millis-of-precision calculator to handle it appropriately, and updated the info tool to include a new precisionMs field:

$ ./photostructure info --filter capturedAt ~/Downloads/t2/IMG_7140.*
{
  a: {
    nativePath: '/home/mrm/Downloads/t2/IMG_7140.CR2',
    capturedAt: {
      date: ExifDateTime {
        year: 2011,
        month: 7,
        day: 26,
        hour: 16,
        minute: 19,
        second: 10,
        millisecond: 450,
        tzoffsetMinutes: undefined,
        rawValue: '2011:07:26 16:19:10.45',
        zoneName: undefined
      },
      src: 'tags:SubSecDateTimeOriginal',
      precisionMs: 10,
      localCentiseconds: 2011072616191045
    }
  },
  b: {
    nativePath: '/home/mrm/Downloads/t2/IMG_7140.JPG',
    capturedAt: {
      date: ExifDateTime {
        year: 2011,
        month: 7,
        day: 26,
        hour: 16,
        minute: 19,
        second: 10,
        millisecond: 0,
        tzoffsetMinutes: undefined,
        rawValue: '2011:07:26 16:19:10',
        zoneName: undefined
      },
      src: 'tags:DateTimeOriginal',
      precisionMs: 1000,
      localCentiseconds: 2011072616191000
    }
  }
}
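
For anyone curious how a value like precisionMs could be derived from a raw tag value, here is a minimal Python sketch. It is not PhotoStructure's actual code: the function name `precision_ms` and the rule of counting fractional-second digits are assumptions made for illustration.

```python
import re


def precision_ms(raw_value: str) -> int:
    """Return the resolution, in milliseconds, implied by a raw EXIF timestamp.

    Hypothetical helper: counts the digits after the seconds field.
    """
    # Look for fractional seconds directly after HH:MM:SS, e.g. "16:19:10.45".
    match = re.search(r"\d{2}:\d{2}:\d{2}\.(\d+)", raw_value)
    if match is None:
        return 1000  # no fraction, so to-the-second resolution
    # Anything finer than one millisecond is clamped to 1 ms.
    digits = min(len(match.group(1)), 3)
    return 10 ** (3 - digits)


print(precision_ms("2011:07:26 16:19:10.45"))  # 10
print(precision_ms("2011:07:26 16:19:10"))     # 1000
```

The two printed values line up with the precisionMs fields shown above for the CR2 and JPG files.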

I’ve also just updated the asset file comparator to respect precisionMs when comparing dates.

Before the fix:

$ ./photostructure info ~/Downloads/t2/IMG* | head
{
  fileComparison: 'These files represent different assets: captured-at 2011072616191045 != 2011072616191000',
  variant: false,
...
}

After:

$ ./photostructure info ~/Downloads/t2/IMG* | head
{
  fileComparison: 'These two files will be aggregated into a single asset.',
  variant: true,
...
}
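
One way a comparator could respect precision when comparing dates is to truncate both timestamps to the coarser of the two precisions before checking equality. The Python sketch below illustrates that idea under stated assumptions: `truncate_ms` and `same_capture_time` are hypothetical names, and truncation (rather than rounding or a tolerance window) is a guess, not a description of what PhotoStructure does internally.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def truncate_ms(dt: datetime, precision_ms: int) -> int:
    """Milliseconds since the epoch, floored to a multiple of precision_ms."""
    ms = (dt - EPOCH) // timedelta(milliseconds=1)
    return ms - ms % precision_ms


def same_capture_time(a: datetime, a_precision_ms: int,
                      b: datetime, b_precision_ms: int) -> bool:
    """Compare two capture times at the coarser of their two precisions."""
    coarsest = max(a_precision_ms, b_precision_ms)
    return truncate_ms(a, coarsest) == truncate_ms(b, coarsest)


cr2 = datetime(2011, 7, 26, 16, 19, 10, 450_000)  # SubSecDateTimeOriginal
jpg = datetime(2011, 7, 26, 16, 19, 10)           # DateTimeOriginal

print(same_capture_time(cr2, 10, jpg, 1000))  # True: equal at 1-second precision
print(same_capture_time(cr2, 10, jpg, 10))    # False: differs at 10 ms precision
```

The second call mimics the behaviour before the fix, where the missing fraction in the JPG was treated as a real difference.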

Thanks again for reporting! This update will be in beta.10.