I continue to experiment with the “organization” aspect of PS. I have my assets (jpeg, png, mov and mv4) sitting on a NAS in one messy folder. There are 526 assets. I also set very “aggressive” ENV VARS:
These are all the AssetFiles that are associated with an imported Asset but are not the “primary variation” (the one that’s “shown” in the asset view).
If you only want to show the files that are in your library, and want to see all AssetFile columns, change the SELECT to * and add AssetFile.uri LIKE 'pslib:%' to the WHERE clause:
SELECT
*
FROM
AssetFile
JOIN Asset ON Asset.id = AssetFile.assetId
WHERE
Asset.shown = 1
AND AssetFile.shown = 0
AND AssetFile.uri LIKE 'pslib:%'
Right… I forgot about de-dup. Library metrics match what I see: 488 assets (it changed from 484 after a restart) and (image files + video files)/2 = 526.
Now I need to play with the list tool a bit more to get the list of assets that were considered duplicates, as
./photostructure list --where "Asset.shown=1 AND AssetFile.shown=0"
gives me the list of all 526 assets in the NAS dir and 38 assets in the Library dir, but I have no idea how to “match” them. 526 - 38 = 488, so one can hope that there are 38 duplicates, that there is only 1 duplicate per asset, and that there are no other issues.
To be honest, I am probably fine with letting PS do this huge amount of work, sorting out thousands of assets backed up from various sources, and having “her” sort them into a neat folder structure. It’s just that my OCD kicks in every time I see my numbers mismatch.
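One way to test the “only 1 duplicate per asset” hope is to count files per asset instead of comparing totals. Below is a minimal sketch, assuming a toy in-memory SQLite database that models only the Asset and AssetFile columns mentioned in this thread; the rows and URIs are made up for illustration and the real library schema has more columns.

```python
import sqlite3

# Toy stand-in for the library database (assumption: only the columns
# named in this thread are modelled; all rows and URIs are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Asset (id INTEGER PRIMARY KEY, shown INTEGER);
CREATE TABLE AssetFile (
  id INTEGER PRIMARY KEY, assetId INTEGER, uri TEXT, shown INTEGER
);
INSERT INTO Asset VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO AssetFile VALUES
  (1, 1, 'psfile:/nas/a.jpg',   1),
  (2, 2, 'psfile:/nas/b.jpg',   1),
  (3, 2, 'psfile:/nas/b-1.jpg', 0),
  (4, 3, 'psfile:/nas/c.jpg',   1),
  (5, 3, 'psfile:/nas/c-1.jpg', 0),
  (6, 3, 'psfile:/nas/c-2.jpg', 0);
""")

# Assets that own more than one file, with their file counts.
multi_file_assets = conn.execute("""
    SELECT assetId, COUNT(*) AS fileCount
    FROM AssetFile
    GROUP BY assetId
    HAVING COUNT(*) > 1
    ORDER BY assetId
""").fetchall()

total_files = conn.execute("SELECT COUNT(*) FROM AssetFile").fetchone()[0]
total_assets = conn.execute("SELECT COUNT(*) FROM Asset").fetchone()[0]

# Every file beyond the first one per asset is an "extra" file.
extra_files = sum(count - 1 for _, count in multi_file_assets)

print(multi_file_assets)
print(total_files - total_assets, extra_files)
```

In this toy data the difference between files and assets is 3, but only 2 assets have extra files, so a matching total alone does not prove there is exactly one duplicate per asset.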
Update: with a bit more SQLing I was able to find the offenders. iPhone selfie burst mode is the outlaw: it creates 10 identical photos… These truly are duplicates. I think I can construct a query to make it work and find what’s going on.
This is my “lame-o” SQL query to show the asset ids and file URIs that were de-duped. They are grouped and ordered by assetId. This way I can analyze the dups and delete them. Not ideal, and I need to test this on another set of assets, but it is a start.
SELECT
id,
assetId,
uri
FROM
AssetFile
WHERE
assetId IN (
SELECT
Asset.id
FROM
AssetFile
JOIN Asset ON Asset.id = AssetFile.assetId
WHERE
Asset.shown = 1
AND AssetFile.shown = 0
AND AssetFile.uri LIKE 'pslib:%'
GROUP BY
AssetFile.assetId
)
AND uri LIKE 'psfile:%'
ORDER BY
assetId;
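To see what the query above returns, here is a minimal sketch that runs it against a toy in-memory SQLite database and groups the resulting URIs by asset. The table layout uses only the columns named in this thread, and every row and URI is invented for illustration; it is not the real library schema or data.

```python
import sqlite3
from collections import defaultdict

# Toy stand-in for the library database (assumption: invented rows/URIs).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Asset (id INTEGER PRIMARY KEY, shown INTEGER);
CREATE TABLE AssetFile (
  id INTEGER PRIMARY KEY, assetId INTEGER, uri TEXT, shown INTEGER
);
INSERT INTO Asset VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO AssetFile VALUES
  (1, 1, 'psfile:/nas/a.jpg',        1),
  (2, 2, 'psfile:/nas/burst-1.jpg',  1),
  (3, 2, 'psfile:/nas/burst-2.jpg',  0),
  (4, 2, 'pslib:/2021/burst-1.jpg',  0),
  (5, 3, 'psfile:/nas/c.jpg',        0),
  (6, 3, 'pslib:/2021/c.jpg',        1);
""")

# The query from the post, unchanged apart from layout.
rows = conn.execute("""
    SELECT id, assetId, uri FROM AssetFile WHERE assetId IN (
        SELECT Asset.id
        FROM AssetFile
        JOIN Asset ON Asset.id = AssetFile.assetId
        WHERE Asset.shown = 1
          AND AssetFile.shown = 0
          AND AssetFile.uri LIKE 'pslib:%'
        GROUP BY AssetFile.assetId
    )
    AND uri LIKE 'psfile:%'
    ORDER BY assetId
""").fetchall()

# Collect the source-file URIs per asset so duplicates sit side by side.
uris_by_asset = defaultdict(list)
for _file_id, asset_id, uri in rows:
    uris_by_asset[asset_id].append(uri)
grouped = {asset_id: sorted(uris) for asset_id, uris in uris_by_asset.items()}

for asset_id, uris in grouped.items():
    print(asset_id, uris)
```

Only asset 2 is reported here, because it is the only toy asset with a hidden pslib: file; its two psfile: entries are the candidates to compare and delete by hand.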