Scanning never completes

I have tested PhotoStructure since 1.0.0-beta on both Windows and Docker Compose (not at the same time; I’m currently trying to get it to work with Docker Compose and version 2.0.0-beta.1), but I have so far never been able to get PS to completely scan my photo collection. I have tried rescanning, rebuilding, and completely reinstalling PS.

I have turned off backup software for all of PS’s directories (in the beginning I thought the backup software was the reason I couldn’t get PS to work reliably).

It seems to start off OK when I begin with a clean database, but after a day or two it either stops or moves extremely slowly. When running PS in Docker Compose, it sometimes just deletes the db.sqlite3 file and starts over without any warning or explanation.

At the moment I have lots of errors like this:
2021-11-07T12:05:22.370Z web-5175 error DbRetries Caught db error. Retrying in 4844ms. 'SqliteError: SQLITE_BUSY: database is locked

Across the different versions of PS, I have consistently had errors in the logs saying that it is unable to write to the database or that the database is locked, though not necessarily with the same error message as above.

For some reason PS seems unable to write to the database, but I don’t understand why.

Do you have any suggestions on how to move forward?

Apologies! I know how frustrating this is.

I’m finishing up a large update to PhotoStructure that completely changes how library imports work.

what we’ve got now

Currently, sync is in charge of directory iteration, and then spawns N sync-file processes to actually import files into your library.

https://photostructure.com/server/photostructure-for-servers/#service-architecture

why that’s problematic

The issue with this approach is that each process is reading and writing to your library database. SQLite’s ability to handle concurrent writes drops precipitously as the size of the database increases (especially when disk I/O is slow), so progress eventually devolves into SQLITE_BUSY errors and retries, and then stalls completely.
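This contention is easy to reproduce outside PhotoStructure. The sketch below uses Python’s stdlib sqlite3 (not better-sqlite3 or any actual PhotoStructure code) to show a second writer failing with “database is locked” while another connection holds the write lock, which is the condition DbRetries catches and retries:

```python
import os
import sqlite3
import tempfile

# Stand-alone demo of multi-writer contention: one connection holds the
# write lock while a second connection (with no busy timeout) tries to write.
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE asset (id INTEGER PRIMARY KEY, uri TEXT)")
writer.execute("BEGIN IMMEDIATE")  # take and hold the write lock
writer.execute("INSERT INTO asset (uri) VALUES ('a.jpg')")

# A second connection with timeout=0 fails immediately instead of waiting,
# raising the same "database is locked" error seen in the logs above.
other = sqlite3.connect(path, timeout=0)
err = None
try:
    other.execute("INSERT INTO asset (uri) VALUES ('b.jpg')")
except sqlite3.OperationalError as e:
    err = str(e)

print(err)  # "database is locked"
writer.rollback()
```

With a nonzero busy timeout the second writer would wait and retry instead of failing immediately, which is why extending timeouts (as suggested later in this thread) can let a scan limp along.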

how does the new stuff work?

The new approach moves database janitorial work from main to sync, and does away with sync-file sub-processes completely, so only sync and web read and write to your library database. File imports are done within sync, with the majority of non-database work offloaded to worker_threads.
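The single-writer pattern described above can be sketched like this (illustrative Python, not PhotoStructure’s actual implementation): worker threads do the CPU-bound, non-database work, and only the main thread ever issues writes, so writers never contend for the SQLite lock:

```python
import hashlib
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Single-writer sketch: workers compute (here, hashing fake file contents),
# and only the main thread touches the database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE asset (uri TEXT PRIMARY KEY, sha TEXT)")

fake_files = {"a.jpg": b"aaa", "b.jpg": b"bbb", "c.jpg": b"ccc"}

def extract(item):
    uri, payload = item
    # CPU-bound work happens off the writer thread; no DB access here.
    return uri, hashlib.sha1(payload).hexdigest()

with ThreadPoolExecutor(max_workers=3) as pool:
    for uri, sha in pool.map(extract, fake_files.items()):
        # All INSERTs happen here, on the single writer thread.
        db.execute("INSERT INTO asset VALUES (?, ?)", (uri, sha))

db.commit()
rows = db.execute("SELECT COUNT(*) FROM asset").fetchone()[0]
print(rows)  # 3
```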

why the original design?

The original design was actually predicated on “not getting stuck”: a file can be corrupt in such a way that it causes one of the native libraries that PhotoStructure uses to wedge or kill the process. It seems, though, that the recent worker_thread implementation gives us the same process isolation.

when’s this going to be ready?

My development branch works for smaller libraries on Linux, but I’m still chasing down a couple of issues on other platforms, and then I need to performance-test with larger libraries. I hope to release a new alpha build in a couple of days.

My setup is that I have a small and fast SSD (C:) and a large, slow HDD (D:).

From what you say, it sounds like it would help if I moved the library directory (.photostructure), which contains the db.sqlite3 file, to the SSD drive, and just leave originalsDir on the large HDD. I already have cache and other system files on the SSD.

Do you agree?

Or should I just wait for your alpha build to be ready?


PhotoStructure certainly supports hybrid library setups, which may help speed up browsing, but I suspect you’ll still bonk against SQLITE_BUSY issues unless you single-thread your imports and extend timeouts (which is decidedly not a reasonable solution, but it lets you limp to completion):

  • maxSyncFileJobs=1
  • dbTimeoutMs=5000
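For a Docker Compose setup, those two settings can be passed as environment variables. The fragment below assumes PhotoStructure’s usual convention of mapping a setting name like maxSyncFileJobs to a PS_-prefixed, upper-snake-case variable; check your version’s settings documentation for the exact names:

```yaml
# docker-compose.yml fragment (variable names assume the usual
# PS_ + UPPER_SNAKE_CASE mapping of PhotoStructure settings)
services:
  photostructure:
    image: photostructure/server
    environment:
      - PS_MAX_SYNC_FILE_JOBS=1  # single-thread imports
      - PS_DB_TIMEOUT_MS=5000    # wait longer before giving up on SQLITE_BUSY
```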

Thanks!

Just to clarify: I have never used Windows and Docker at the same time. I have always started completely from scratch when switching between Windows and Docker Compose.

I have just had similar issues with locked database on both versions of PS, so I thought I’d mention both in the question.

(Edit: I thought “hybrid solution” meant running two variants of PS against the same database. I now realise “hybrid solution” means exactly my setup, with the library spread across several hard drives.)

I can’t get through a full scan of about 39k photos (not my entire library; I have other folders to add). It always hangs during processing: the app is using CPU but makes no progress, and the latest log shows this database-locked error. I have tried pause/resume multiple times and also restarting the scan. The behavior is odd: each time it hangs saying there are 80-200 files left, but after I restart it processes a couple thousand more before hanging again at some 80-200 left. Right now, based on the file count, I believe there are about 3k pics left, and it is hung saying there are 82 left. For now I’ve stopped it so it doesn’t consume CPU all night again.

I don’t know if these errors are related to the hang or not. BUSY sounds like something that might have cleared up.

{"ts":1696469099769,"l":"error","ctx":"DbRetries","msg":"Caught db error. Retrying in 1205ms.","meta":"SqliteError: code SQLITE_BUSY: database is locked\nSqliteError: database is locked\n at /ps/app/bin/sync.js:9:707677\n at Function. (/ps/app/bin/sync.js:9:728923)\n at Function.sqliteTransaction (/ps/app/node_modules/better-sqlite3/lib/methods/transaction.js:65:24)\n at /ps/app/bin/sync.js:9:728933\n at Object.t.handleDbRetries (/ps/app/bin/sync.js:9:709692)"}
{"ts":1696469794422,"l":"error","ctx":"DbRetries","msg":"Caught db error. Retrying in 951ms.","meta":"SqliteError: code SQLITE_BUSY: database is locked\nSqliteError: database is locked\n at /ps/app/bin/sync.js:9:707677\n at Function. (/ps/app/bin/sync.js:9:728923)\n at Function.sqliteTransaction (/ps/app/node_modules/better-sqlite3/lib/methods/transaction.js:65:24)\n at /ps/app/bin/sync.js:9:728933\n at Object.t.handleDbRetries (/ps/app/bin/sync.js:9:709692)"}
{"ts":1696469818420,"l":"error","ctx":"DbRetries","msg":"Caught db error. Retrying in 785ms.","meta":"SqliteError: code SQLITE_BUSY: database is locked\nSqliteError: database is locked\n at /ps/app/bin/sync.js:9:707677\n at Function. (/ps/app/bin/sync.js:9:728923)\n at Function.sqliteTransaction (/ps/app/node_modules/better-sqlite3/lib/methods/transaction.js:65:24)\n at /ps/app/bin/sync.js:9:728933\n at Object.t.handleDbRetries (/ps/app/bin/sync.js:9:709692)"}
{"ts":1696470385309,"l":"error","ctx":"DbRetries","msg":"Caught db error. Retrying in 1390ms.","meta":"SqliteError: code SQLITE_BUSY: database is locked\nSqliteError: database is locked\n at /ps/app/bin/sync.js:9:707677\n at Function. (/ps/app/bin/sync.js:9:728923)\n at Function.sqliteTransaction (/ps/app/node_modules/better-sqlite3/lib/methods/transaction.js:65:24)\n at /ps/app/bin/sync.js:9:728933\n at Object.t.handleDbRetries (/ps/app/bin/sync.js:9:709692)"}

I can’t tell from those errors what version you’re using, but at least for docker, the current best release (especially for larger libraries) is the :prealpha build.

(The :prealpha build looks to be stable enough to be promoted to :alpha at this point, but until then :prealpha includes all current bugfixes.)

Scanning completed using :prealpha. Even though it completed I see a lot of workers running exiftool. Is that adding keywords and eventually it will finish? Also, if I update a file (re-export from lightroom) will it detect the file change at some point and rescan it or do I have to trigger it to do that? I sometimes re-edit photos or add more keywords. Thanks!

Another sync job may have been kicked off incorrectly. The next build will try to address this bug.

File changes should be detected and assets reimported automatically. The current implementation uses polling, but I’m going to use fswatch so that a file update causes sync to reimport within a couple of minutes, after waiting for the file to be quiescent.
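The “wait for the file to be quiescent” step can be sketched like this (illustrative Python, not PhotoStructure’s implementation): only treat a file as ready to (re)import once its size and mtime have stopped changing for a settle period, so a file mid-export from Lightroom isn’t picked up half-written:

```python
import os
import tempfile
import time

def wait_until_quiescent(path, settle=0.3, poll=0.1, timeout=5.0):
    """Block until the file's (size, mtime) has been stable for `settle` s."""
    deadline = time.monotonic() + timeout
    last = None
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        st = os.stat(path)
        sig = (st.st_size, st.st_mtime)
        if sig != last:
            # The file changed since our last check; restart the settle clock.
            last = sig
            stable_since = time.monotonic()
        elif time.monotonic() - stable_since >= settle:
            return True  # unchanged long enough: safe to (re)import
        time.sleep(poll)
    return False  # file kept changing for the whole timeout

# Demo: a file written once becomes quiescent almost immediately.
fd, path = tempfile.mkstemp()
os.write(fd, b"re-exported from Lightroom")
os.close(fd)
ready = wait_until_quiescent(path)
print(ready)  # True
```

An fswatch-style notification would replace the polling loop itself, but the settle check is still useful either way, since a change event can fire while the exporting application is still writing.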