Export All Deduplicated Assets

I think it would be useful if Photostructure had a utility of some sort that allowed you to export your entire deduplicated library.

Often questions pop up on the forum or the discord about physically deleting duplicate photo files. There are problems, of course, with letting software “smartly” delete user files, and I think Photostructure’s position on this is entirely reasonable.

With that in mind, since Photostructure already has access to the physical location of all assets, along with information about duplicates (and which duplicate file is the “best”), it seems like it might be a pretty easy lift to create an export option. You could even apply the auto-organization logic and put these files into year/month folders.
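
To make the idea concrete, here is a minimal sketch of what such an export could look like. It does not use any real Photostructure API: the `Asset` record, the assumption that the "best" variant is listed first, and the `export_deduplicated` helper are all hypothetical names made up for illustration. It only copies files and never touches the sources.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
import shutil


@dataclass
class Asset:
    """Hypothetical record: one logical photo plus every file that duplicates it."""
    captured_at: datetime
    variants: list[Path]  # assumption: the "best" variant is listed first


def export_deduplicated(assets: list[Asset], export_root: Path) -> list[Path]:
    """Copy only the best variant of each asset into export_root/YYYY/MM/."""
    written = []
    for asset in assets:
        best = asset.variants[0]
        folder = export_root / f"{asset.captured_at:%Y}" / f"{asset.captured_at:%m}"
        folder.mkdir(parents=True, exist_ok=True)

        # Different assets can share a filename; add a numeric suffix
        # rather than overwrite an earlier export.
        dest = folder / best.name
        stem, suffix, n = dest.stem, dest.suffix, 1
        while dest.exists():
            dest = dest.with_name(f"{stem}-{n}{suffix}")
            n += 1

        shutil.copy2(best, dest)  # copy2 also tries to preserve file timestamps
        written.append(dest)
    return written
```

Because this is a copy rather than a move, the original folders stay exactly as they were, which is the whole point of the request.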

The end result would be that the user has a clean, deduplicated copy of all of their assets in one place.

What do others think of this feature request?

I think it would be best to have an option to set the “organize library” feature to only copy deduplicated assets into it. Frankly, I turned off that feature (even though I am a “plus” user) because what’s the point? Right now it turns out to be an exact copy of my source directories.

But generally speaking, an export option (including an option to only include deduplicated assets as you suggest) is a good idea.

I agree, I don’t use auto-organization either. I’m getting the impression, though, that there are at least a few people searching for an easyish way to physically dedup their photo files. Photostructure doesn’t do that currently, nor would it ever in normal usage. However, it’s got all the information it needs to provide users with a physically deduped copy of their files.

I imagine that if this feature is developed, a new user could set up Photostructure, export a deduped copy of their files, and then manually delete all of their source files. They would be left with a clean, deduplicated copy of their photos AND an amazing piece of software for viewing said photos.

I know that’s what people are looking for. It’s what I was looking for myself when I adopted Photostructure. When I discovered it doesn’t really do that, I was a bit disappointed and ended up using another tool to delete dupes. It was extremely tedious, with lots of visual inspections of side-by-side pictures. But based on some of the questionable deduping decisions the PS algorithms make (getting better every release), I’m not sure I would EVER trust a completely automated physical deduping of my library.

That’s the point of this request, really… it’s not completely automated. The software is just spitting out a deduplicated copy of all your files. What you do with it at that point is up to you. You COULD delete (or archive) your source photos and use this new deduplicated copy as your source going forward, but you don’t have to.

To be clear, when I say a “completely automated” de-duping process (as PS has implemented), I mean a de-duping process that never prompts the user to validate matches below a certain threshold of confidence. In other words, I have to trust it’s 100% accurate instead of trusting it’s 80% accurate and validating (through prompts) the other 20%.
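
A rough sketch of the kind of threshold split described above, purely as an illustration: matches at or above a cutoff are accepted automatically, and the rest are queued for a human to confirm. The `Match` record, its `confidence` score, and the 0.9 default are assumptions for the example; nothing here reflects how PS actually scores duplicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Hypothetical duplicate match between two files."""
    file_a: str
    file_b: str
    confidence: float  # assumed score in the range 0.0 to 1.0


def triage(matches: list[Match], threshold: float = 0.9):
    """Split matches into (auto_accepted, needs_review) around the threshold."""
    auto_accepted, needs_review = [], []
    for match in matches:
        if match.confidence >= threshold:
            auto_accepted.append(match)
        else:
            needs_review.append(match)
    # Show the least certain matches first, since those need a human most.
    needs_review.sort(key=lambda m: m.confidence)
    return auto_accepted, needs_review
```

With a split like this, the user would only be prompted for the `needs_review` list instead of eyeballing every pair.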

To that end, since Photostructure already has a way for you to pick the best variant in the UI, I wonder if a workflow could be added that allows you to look through all assets with multiple asset files and pick the best? Once you’ve done this, that asset could be marked as “confirmed” so it’s not presented to you again. Then, the process of confirming the best variant can be ongoing without losing track of what you’ve already done.
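
As a sketch of how the “confirmed” bookkeeping might work, the snippet below keeps the confirmed asset IDs in a small JSON file so that reviewed assets are not offered again. The file format, the function names, and the idea of string asset IDs are all assumptions for illustration, not anything Photostructure is known to do.

```python
import json
from pathlib import Path


def load_confirmed(store: Path) -> set[str]:
    """Read the set of confirmed asset IDs (empty if nothing is saved yet)."""
    if not store.exists():
        return set()
    return set(json.loads(store.read_text()))


def confirm(store: Path, asset_id: str) -> None:
    """Record that the user has confirmed the best variant for this asset."""
    confirmed = load_confirmed(store)
    confirmed.add(asset_id)
    store.write_text(json.dumps(sorted(confirmed)))


def pending_review(variant_counts: dict[str, int], store: Path) -> list[str]:
    """Assets with more than one file that have not been confirmed yet."""
    confirmed = load_confirmed(store)
    return [
        asset_id
        for asset_id, count in variant_counts.items()
        if count > 1 and asset_id not in confirmed
    ]
```

Since the confirmed set is saved after every decision, the review can be done in small sessions without losing track of what has already been checked.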

I’ll be honest, I don’t actually have any problems with duplicates, so none of this would benefit me personally. I’m just looking at the common themes of questions on the forum and on discord and then trying to imagine how Photostructure could help address those questions.

This gets to the heart(s) of the problem: it is tedious. But, we’re all distrustful of the machinery.

What we want is less friction. The software can make it easier to bulk approve its detections, and let us non-destructively export the “top” set of de-duplicated images for use in other contexts.

I think there would be value in being able to see duplicates through a “view” option in the tool, so that I can triage a new collection of files and see how many duplicates I actually have for each file. If you had “view by duplicate count” or similar, you could manually curate your collections and do any deduping you wanted within the tool before doing the “clean dedup’ed” export.
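
For what it’s worth, a “view by duplicate count” listing could be as simple as the sketch below. It assumes the tool can hand you (asset ID, file path) pairs, which is a hypothetical input shape chosen for the example.

```python
from collections import Counter


def duplicate_counts(files: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Return (asset_id, file_count) for assets with duplicates, largest first.

    `files` is a hypothetical list of (asset_id, path) pairs, one per file.
    """
    counts = Counter(asset_id for asset_id, _path in files)
    return [(asset_id, n) for asset_id, n in counts.most_common() if n > 1]
```

Sorting with the worst offenders at the top would make it easy to start triage where it matters most.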

Howdy, @neogeek83 – welcome to PhotoStructure!

You can currently view duplicates by opening the asset info panel (tap i or click the i icon in the asset header), and then click the pathname of each variation. The fullscreen view will change to that variation.