Does the bold file name in the "info" panel signify anything?

Daniel · January 6, 2021, 3:46am

I noticed that the first file path in the “info” panel is bold:

Does that signify anything? In this case I have a copy of the photo from Google Photos takeout (1.73 MB) as well as the original file from my phone (3.85 MB) and the file path of the Google Photos version is in bold.

I suspected that it might have signified the “best” version of the photo, but in that case, shouldn’t that be the one with the larger file size as it’s not compressed as heavily?

mrm · January 6, 2021, 4:29am

Indeed, the bold path is the “shown” or “primary” variant.

PhotoStructure has a bunch of heuristics it applies to determine which variant to show:

https://photostructure.com/faq/what-do-you-mean-by-dedupe/#how-does-photostructure-pick-which-file-to-show

I don’t currently look at compression rates, but perhaps I should?

What’s the JPEG quality of the 1.73 MB version versus the 3.85 MB version?

(The relationship between JPEG quality and file size is quite surprising: quality 85 is excellent, yet the file can be 1/3 to 1/4 the size of the same image saved at quality 95.)

Daniel · January 6, 2021, 5:36am

ImageMagick says that the quality of the original is 92, while the quality of the Google-ified version is 87.

IMO if I have two versions of the same image, one of which is more compressed, PhotoStructure should prefer the less compressed version. I could always just manually compress it if I wanted to, whereas I can’t “un-compress” the other version. I don’t know whether Google’s compression has resulted in small artifacts in the image so I’d prefer using my phone’s or camera’s original file.

mrm · January 6, 2021, 7:13am

I’m not a betting man, but if you look at those original images at 400%, I’d put $10 on you not being able to see any JPEG artifacts or reduction in image quality (typical JPEG compression artifacts include posterization and disjoint blocks in 8x8 pixel segments).

The heuristics to pick the “best” variant are a bit more nuanced: I want to pick up recent image edits, if there are any, but only if that edit is not on a substantively smaller-resolution image.

A lower compression rate or a larger file size doesn’t necessarily mean more image data or better image quality. A JPEG can be re-saved at a higher quality level than the original, for example, and that larger copy would then “win” even though it holds no additional detail.

(Ideally there would be some metric of “image quality” that I could just use directly in the sort criteria).

Daniel · January 6, 2021, 8:08am

The thing is that I do see a difference on some photos (but not all). The colour sometimes seems slightly different between the two, too (maybe Google did some auto enhancements).

That makes sense.

In my case, I know I have two copies - the original (larger), and the Google Photos compressed version (smaller). I might just write a script to go through all the photos (using the PhotoStructure database) and delete the smaller version.
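A rough sketch of what such a script might look like, in Python. It assumes the duplicate file paths have already been grouped per photo by some other means (querying the PhotoStructure database is not shown, and the function name `smaller_copies` is made up for illustration). It only reports candidates and deletes nothing.

```python
import os
from typing import Iterable, List


def smaller_copies(groups: Iterable[List[str]]) -> List[str]:
    """For each group of duplicate paths, return every path except the largest file.

    Assumption: each inner list holds the paths of copies of the same photo.
    If several files tie for largest, the first one listed is kept.
    """
    to_delete: List[str] = []
    for paths in groups:
        if len(paths) < 2:
            continue  # nothing to compare against
        keep = max(paths, key=os.path.getsize)
        to_delete.extend(p for p in paths if p != keep)
    return to_delete
```

Printing the returned list and checking it by hand before removing anything would be the cautious way to use this.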

mrm · January 6, 2021, 8:15am

Yeah, Google’s auto-photo-enhance feature (esp. in the Google+ era) certainly did that. Which do you think looks better: the Google version or the original?

(You saw the “list” tool, right? If you think up other tooling or args that would be nice, it’s super easy to add stuff there…)

Daniel · January 6, 2021, 7:06pm

I think I’m going to stick with the original since that’s the image as it was originally captured. I can do enhancements to it myself if I want to.

Yes! It looks useful but I haven’t really played with it much yet.

mrm · January 9, 2021, 12:34am

OK, so it seems like your heuristic preferences are slightly different from the ones I implemented.

I just implemented a new variantSortCriteria library setting. The info tool in v0.9 exposes a fileSortCriteria field, and v1.0 renamed it to match the new setting (variantSortCriteria).

# +-----------------------+
# |  variantSortCriteria  |
# +-----------------------+
#
# How should PhotoStructure pick the "best" asset file variant for a given
# asset? You may reorder the default fields. Only "resolution", "fileSize",
# "mtime", "schemeIdx", "isCover", "count", and "isBrowserSupported" are
# understood: other field names will be ignored. If you change this value, you
# must "rebuild" your library, or at least "resync this asset" to apply the
# change one-off.
# (env: "PS_VARIANT_SORT_CRITERIA")
#
# variantSortCriteria = [
#   "resolution",
#   "mtime",
#   "schemeIdx",
#   "fileSize",
#   "isCover",
#   "count",
#   "isBrowserSupported"
# ]
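
To illustrate how reordering these fields can change which variant wins, here is a minimal Python sketch of an ordered, multi-field comparison. It is not PhotoStructure's actual implementation. It assumes that larger "resolution", newer "mtime", and larger "fileSize" are preferred, it leaves out the other fields because their meaning isn't described in this thread, and the paths, timestamps, and pixel counts are made up (only the two file sizes echo the ones mentioned above).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    path: str
    resolution: int  # width * height, in pixels
    mtime: float     # modification time, seconds since the epoch
    fileSize: int    # bytes


# Only these fields are modeled in this sketch (an assumption, see above).
KNOWN_FIELDS = {"resolution", "mtime", "fileSize"}


def pick_primary(variants: List[Variant], criteria: List[str]) -> Variant:
    """Return the variant that compares highest, field by field, in criteria order."""
    fields = [c for c in criteria if c in KNOWN_FIELDS]  # other names are ignored
    return max(variants, key=lambda v: tuple(getattr(v, f) for f in fields))


# Hypothetical data: same resolution, the Takeout copy has the newer mtime.
original = Variant("/phone/IMG_0001.jpg", 4000 * 3000, 1_600_000_000.0, 3_850_000)
google = Variant("/takeout/IMG_0001.jpg", 4000 * 3000, 1_609_000_000.0, 1_730_000)

default_order = ["resolution", "mtime", "schemeIdx", "fileSize",
                 "isCover", "count", "isBrowserSupported"]
size_first = ["resolution", "fileSize", "mtime", "schemeIdx",
              "isCover", "count", "isBrowserSupported"]

print(pick_primary([original, google], default_order).path)  # /takeout/IMG_0001.jpg
print(pick_primary([original, google], size_first).path)     # /phone/IMG_0001.jpg
```

Under these assumptions, moving "fileSize" ahead of "mtime" makes the larger original win whenever the resolutions are equal.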

Daniel · January 9, 2021, 12:50am

This sounds perfect! Thank you for exposing a setting for this.

Could you please explain what the fields mean? Some of them are clear, but what is “schemeIdx”?

mrm · January 9, 2021, 2:03am

Sure: I just updated https://photostructure.com/faq/what-do-you-mean-by-dedupe/#how-does-photostructure-pick-which-file-to-show.