Full rebuild taking ~3 days… I know there are a lot of variables, but
does that sound about right for:
53,846 assets
150,974 image files
211 video files
Synchronization speed depends on your hardware, your network, and the contents of your library.
I have a ~8 year old dual-core, hyperthreaded Intel NUC that does 3 imports concurrently and is regularly I/O bound on my old NAS that has aging HDDs. The NUC takes about 3-15 seconds per image (depending on the size of the original image: older 2MP images can import in less than a second, and recent dSLR 25MP RAW images can take 15 seconds just to spool off the spinning rust onto the LAN with I/O contention).
I also have a 2 year old AMD 3900x that runs 18+ concurrent imports, and when using SSD for storage, it’s 20x+ faster than the NUC+NAS.
150k files in one day works out to ~1.7 files per second (150000 / (24 * 60 * 60)), or ~0.58s per file. Spread over 3 days, that is ~1.7s per file, so 3 days to rebuild is about in line with my NUC.
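To sanity-check these numbers, here is a minimal Python sketch that converts a file count and a rebuild duration into average seconds per file. The function name is only for illustration, and the file count is the image-file figure reported at the top of the thread.

```python
def seconds_per_file(file_count: int, days: float) -> float:
    """Average wall-clock seconds spent per file over the whole run."""
    return days * 24 * 60 * 60 / file_count


files = 150_974  # image files reported for this library

print(f"1 day:  {seconds_per_file(files, 1):.2f}s per file")
print(f"3 days: {seconds_per_file(files, 3):.2f}s per file")
```

For this library that comes to roughly 0.57s per file for a one-day rebuild and 1.72s per file for a three-day rebuild.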
So this is a 96TB UNRAID array, 12 TB disks, AMD Ryzen 5 3600 6-Core @ 3600 MHz.
CPU utilisation is pretty low, hovering around 4% with peaks up to 20%. Array throughput is also pretty low, peaking at 25 MB/s reads and 12 MB/s writes.
It’s certainly a mixture of RAW and JPG.
The estimate seems to fluctuate between 3 days and 19 hours to complete; currently it’s back up to 1 day.
So I guess the estimate is adjusting as it hits different content? I.e. the estimate lengthens when it hits RAW images, and shortens if it hits a run of JPGs?
The estimate is based on a weighted average of prior processing time per MIME type, multiplied by the number of remaining files of that type.
If the volume hasn’t been fully scanned yet, it makes a rough guess at how many more files it will find on that volume.
If you’re scanning several volumes, note that the estimate only covers the volume currently being processed.
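The estimate described above can be sketched roughly as follows. This is not PhotoStructure's actual code: the class and method names are hypothetical, the exact weighting isn't stated in this thread (a plain per-type mean is assumed here), and falling back to the overall mean for a type with no samples yet is also an assumption.

```python
from __future__ import annotations

from collections import defaultdict


class EtaEstimator:
    """Hypothetical sketch: mean seconds per MIME type x remaining file count."""

    def __init__(self) -> None:
        self.total_seconds: dict[str, float] = defaultdict(float)
        self.processed: dict[str, int] = defaultdict(int)

    def record(self, mime_type: str, seconds: float) -> None:
        """Remember how long one file of this MIME type took to import."""
        self.total_seconds[mime_type] += seconds
        self.processed[mime_type] += 1

    def estimate(self, remaining: dict[str, int]) -> float:
        """Estimated seconds left, given remaining file counts per MIME type."""
        done = sum(self.processed.values())
        if done == 0:
            return 0.0  # nothing measured yet, so there is no basis for a guess
        overall_mean = sum(self.total_seconds.values()) / done
        eta = 0.0
        for mime_type, count in remaining.items():
            seen = self.processed.get(mime_type, 0)
            # Assumption: use the overall mean for types not yet measured.
            mean = self.total_seconds[mime_type] / seen if seen else overall_mean
            eta += mean * count
        return eta


est = EtaEstimator()
est.record("image/jpeg", 1.0)
est.record("image/jpeg", 3.0)
est.record("image/x-adobe-dng", 12.0)
print(est.estimate({"image/jpeg": 100, "image/x-adobe-dng": 10}))
```

With a scheme like this, a run of slow RAW files raises that type's mean and pushes the estimate up, while a run of quick JPGs pulls it back down, which would match the swings described earlier in the thread.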
If you’re looking at CPU load, make sure you’re not hiding nice’d load: to keep the system responsive, PhotoStructure runs at a lower scheduling priority. If that doesn’t seem to explain what’s going on, it may be due to a scheduling bug that isn’t feeding the work queue properly. Are you using v0.9.1 or a v1.0-beta build?
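On Linux (which Unraid is), one way to check whether nice’d load is being hidden is to compare two readings of the aggregate `cpu` line in /proc/stat, whose first fields are user, nice, system, and idle time. The following is a minimal sketch under that assumption; the function names are illustrative and not part of PhotoStructure.

```python
from __future__ import annotations

import time


def parse_cpu_line(line: str) -> list[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into integer tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]


def nice_share(before: str, after: str) -> float:
    """Fraction of CPU time spent on nice'd user processes between two samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # The first eight fields are user, nice, system, idle, iowait, irq, softirq,
    # and steal. Guest time is already counted inside user/nice, so skip it.
    total = sum(deltas[:8])
    return deltas[1] / total if total else 0.0


def sample_nice_share(interval: float = 1.0, path: str = "/proc/stat") -> float:
    """Take two /proc/stat samples `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval)
    with open(path) as f:
        after = f.readline()
    return nice_share(before, after)
```

Calling `sample_nice_share(5)` during an import returns the share of CPU time over those five seconds that went to low-priority processes. A value well above what your monitoring tool reports as total CPU load suggests the tool is leaving nice’d time out.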
Also: if it’s doing a rebuild, know that the second stage (asset re-aggregation) is (somewhat necessarily) single-threaded, because there isn’t a clear way to acquire an advisory lock per asset when assets may incorrectly share previously aggregated asset file variations. (I could probably run several asset re-aggregations in parallel by concurrently scheduling re-aggregation of maximally disparate assets, but rebuilds really shouldn’t be a task that people have to run often…)