Handle egregious Google Takeout metadata butchering

I just had a user ask:

I’ve noticed that Google appears to have butchered the EXIF data from my Sony a5100. It is modifying the shutter speed for my images to 1 second and may also mess with the aperture. This may cause PhotoStructure to see separate assets.

And, indeed:

$ exiftool -j -ShutterSpeed *.arw *.jpeg
[{
  "SourceFile": "original.arw",
  "ShutterSpeed": "1/640"
},
{
  "SourceFile": "takeout.jpeg",
  "ShutterSpeed": 1
}]

Normally, PhotoStructure discriminates between images that have different exposure information: those have to be different assets, right?
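To make that concrete (a rough sketch only, not PhotoStructure’s actual de-duplication code): ExifTool’s -j output can report ShutterSpeed as either a fraction string or a bare number, and once normalized, the two readings above are nowhere near each other.

// Rough sketch, not PhotoStructure's implementation: normalize ExifTool's
// ShutterSpeed (either "1/640" or a bare number like 1) to seconds.
function shutterSeconds(value: string | number): number {
  if (typeof value === "number") return value;
  const [numerator, denominator] = value.split("/").map(Number);
  return denominator ? numerator / denominator : numerator;
}

shutterSeconds("1/640"); // 0.0015625 s from the original .arw
shutterSeconds(1);       // 1 s from the Takeout .jpeg -- a very different exposure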

Ideally, I could detect that the borked image was scrambled by Google Takeout and apply heuristics that take its metadata with a grain of salt, but there isn’t a clear (to me) way to detect this situation.

Google Takeout images sometimes have an added ImageUniqueID, but other software can add that tag too. I don’t see anything else (other than maybe having a .json sidecar? That would work)…

Ideas?

Yep, I think the existence of the takeout sidecar might be the way to go?

Could grep for googlePhotosOrigin or googleusercontent perhaps?

This only works if the Takeout sidecar is present, but if someone is following the ratarmount workflow for exposing Takeout data (or at least hasn’t manually removed the sidecars), it should be a safe-ish bet?
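Something like this, perhaps? A minimal sketch, assuming the sidecar sits next to the image as either <image>.json or <basename>.json (Takeout’s sidecar naming varies, so the candidate paths are a guess, not an exhaustive rule):

import { existsSync, readFileSync } from "node:fs";

// Sketch of the sidecar heuristic. The candidate names are an assumption:
// Takeout usually appends ".json" to the full image name, but it sometimes
// truncates long names, so this list is not exhaustive.
function looksLikeTakeout(imagePath: string): boolean {
  const candidates = [
    `${imagePath}.json`,                    // IMG_1234.jpeg.json
    imagePath.replace(/\.[^.]+$/, ".json"), // IMG_1234.json
  ];
  return candidates.some((sidecar) => {
    if (!existsSync(sidecar)) return false;
    const json = readFileSync(sidecar, "utf8");
    // Either marker strongly suggests the image passed through Google Photos.
    return json.includes("googlePhotosOrigin") || json.includes("googleusercontent");
  });
}

If that returns true, numeric EXIF like ShutterSpeed in the Takeout copy could be treated as advisory rather than authoritative.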

A few samples from Takeout sidecars:
A manual upload:

  "url": "https://lh3.googleusercontent.com/<blahblablah>",
  "googlePhotosOrigin": {
    "webUpload": {
      "computerUpload": {
      }
    }
  },

Here’s another one where it came from ‘partner sharing’:

  "url": "https://lh3.googleusercontent.com/CR....",
  "googlePhotosOrigin": {
    "fromPartnerSharing": {
    }
  },

Or this one, from an LG V20 upload via the ‘Photos’ app:

  "url": "https://lh3.googleusercontent.com/...",
  "googlePhotosOrigin": {
    "mobileUpload": {
      "deviceFolder": {
        "localFolderName": ""
      },
      "deviceType": "ANDROID_PHONE"
    }
  },
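For what it’s worth, here’s a sketch of how those shapes might be consumed downstream. The variants are only the three shown above; other exports may use different keys, so treat this as an assumption rather than a schema.

// Only the googlePhotosOrigin variants shown above -- other Takeout exports
// may carry different shapes, so this is a sketch, not a schema.
type GooglePhotosOrigin =
  | { webUpload: { computerUpload?: {} } }
  | { fromPartnerSharing: {} }
  | { mobileUpload: { deviceFolder?: { localFolderName?: string }; deviceType?: string } };

interface TakeoutSidecar {
  url?: string;
  googlePhotosOrigin?: GooglePhotosOrigin;
}

// Return a human-readable label for the upload origin, if the sidecar has one.
function describeOrigin(sidecar: TakeoutSidecar): string | undefined {
  const origin = sidecar.googlePhotosOrigin;
  if (!origin) return undefined;
  if ("webUpload" in origin) return "manual web upload";
  if ("fromPartnerSharing" in origin) return "partner sharing";
  if ("mobileUpload" in origin) {
    return `mobile upload (${origin.mobileUpload.deviceType ?? "unknown device"})`;
  }
  return "unknown origin";
}

Even just the presence of googlePhotosOrigin (never mind which variant) could flag the asset so its exposure metadata is weighted less heavily during de-duplication.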

Excellent. I’ve changed this to a feature request: it shouldn’t take long to cook up, but I want to get v2.0 out first.