skipLibraryOpenLocks

mrm · August 20, 2021, 9:07pm

Is your library database sitting on an HDD or on an SSD?

My test machine can run 18 concurrent sync-file processes, but only on a fast SSD.

That said, it’d be nice to support Postgres or MariaDB as an option for very large libraries or users (like you) that need heavy concurrency support.

Another possible solution, which I haven’t tested yet, would be switching from process-threading (which is what PhotoStructure currently uses) to web worker threading. Sharing the same process may avoid some amount of lock contention.

This is an interesting idea! Imports would look “bursty” to users, though, as batches of work were merged back into the final library.

Shuffling around what the work is to “import a file” to avoid concurrent db access might be a (much) simpler approach (and avoid map/reduce batches).
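To make that idea concrete, here is a minimal sketch of one possible shape: the expensive per-file work runs in parallel with no db access, and a single thread performs every write. This is not PhotoStructure's code. The `AssetFile` columns, the `expensive_work` and `import_all` names, and the use of SHA-256 and a thread pool are all assumptions made for illustration.

```python
import hashlib
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def expensive_work(item):
    """Hypothetical stand-in for SHA, image hashing, and preview generation.

    It deliberately never touches the database.
    """
    uri, data = item
    return uri, hashlib.sha256(data).hexdigest()


def import_all(db, items, workers=4):
    """Run the slow work in parallel, but write from the calling thread only."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order; this loop is the only writer.
        for uri, sha in pool.map(expensive_work, items):
            db.execute(
                "INSERT OR REPLACE INTO AssetFile (uri, sha) VALUES (?, ?)",
                (uri, sha),
            )
    db.commit()


# Assumed, simplified schema for the demo.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE AssetFile (uri TEXT PRIMARY KEY, sha TEXT)")
import_all(db, [("file:///a.jpg", b"aaa"), ("file:///b.jpg", b"bbb")])
print(db.execute("SELECT COUNT(*) FROM AssetFile").fetchone()[0])  # prints 2
```

Because only one connection ever writes, there is nothing for the workers to contend over. The trade-off is that the writer can become the bottleneck if db work ever stops being cheap relative to hashing and transcoding.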

Currently, importing a file looks roughly like this:

Walk through the directories and look for files to import.

Then, for each file that passes import filters:

Is the file URI already in AssetFile? If so, do the size and the largest mtime of the file or its sidecar match the current values in the db? If so, we’re done with that file.
If not, run a series of db queries and file operations (including SHA and image hashing) to try to find any prior row in the Asset table. If no prior asset matches, add a new Asset and insert a new AssetFile.
For the referenced Asset, find the “best” variation, and ensure the previews and transcoded video are in order.
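The first check above (skip a file whose URI, size, and mtime are unchanged) can be sketched as follows. The table layout, column names, and the `needs_import` helper are hypothetical and chosen only for illustration; the real AssetFile schema is not shown in this thread.

```python
import sqlite3


def needs_import(db, uri, size, mtime_ms):
    """Return True unless AssetFile already holds this URI with the same size and mtime."""
    row = db.execute(
        "SELECT size, mtime FROM AssetFile WHERE uri = ?", (uri,)
    ).fetchone()
    # No row means a new file; a mismatch means the file or its sidecar changed.
    return row is None or row != (size, mtime_ms)


# Assumed, simplified schema for the demo.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE AssetFile (uri TEXT PRIMARY KEY, size INTEGER, mtime INTEGER)"
)
db.execute(
    "INSERT INTO AssetFile VALUES (?, ?, ?)",
    ("file:///photos/a.jpg", 1024, 1629493620000),
)

print(needs_import(db, "file:///photos/a.jpg", 1024, 1629493620000))  # False
print(needs_import(db, "file:///photos/a.jpg", 2048, 1629493620000))  # True
```

This is a single indexed read, which is why the unchanged-file path is cheap compared with the hashing and preview steps.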

The time-consuming operations are:

File SHA (10ms-10s, depends on the size of the file and speed of the disk)
Image hash (100ms-1.5s, depending on image resolution and CPU speed)
Preview generation (100ms-5s, depending on image resolution and CPU speed)
Transcoding (can be 0.2x-4x the duration of the video, depending on speed of the disk and CPU speed and count)
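As an illustration of why the file SHA cost scales with file size and disk speed, here is a minimal streaming hash. The post does not say which SHA variant is used, so SHA-256 is an assumption here, as are the `file_sha` name and the 1 MiB chunk size.

```python
import hashlib


def file_sha(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so memory use stays flat for large files."""
    digest = hashlib.sha256()  # assumed variant; the source only says "SHA"
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Every byte of the file has to be read once, so on a slow disk the read dominates and the hash itself is a small part of the total.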

DB operations typically complete in 1ms to 10ms (SQLite is fast!), as long as there isn’t any lock contention.
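For anyone who wants to see what lock contention looks like in isolation, this small sketch uses Python's built-in `sqlite3` module rather than anything from PhotoStructure. One connection holds a write transaction open while a second connection tries to write with no busy timeout. The table name and the `try_write` helper are made up for the demo.

```python
import os
import sqlite3
import tempfile


def try_write(path, timeout_s):
    """Attempt one insert; return False if SQLite raises an OperationalError (here: a lock)."""
    conn = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
    try:
        conn.execute("INSERT INTO t VALUES (1)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)  # manage transactions by hand
writer.execute("CREATE TABLE t (x INTEGER)")

writer.execute("BEGIN IMMEDIATE")  # take the write lock and hold it
writer.execute("INSERT INTO t VALUES (0)")
blocked = try_write(path, 0)  # second writer fails while the lock is held

writer.execute("COMMIT")  # release the lock
unblocked = try_write(path, 0)  # the same write now succeeds

print(blocked, unblocked)  # False True
```

With a non-zero timeout the second writer would wait and retry instead of failing immediately, which is where millisecond operations can turn into multi-second stalls under heavy concurrency.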

I’d like to fix this. I’ll think about how I can adjust the sync workloads.