Docker on UNRAID hard crash

I don’t see much in the logs (or I may not know what to look for).

But this looks bad:

{"ts":1628892902511,"l":"error","ctx":"DbRetries","msg":"Caught db error. Retrying in 513ms.","meta":"SqliteError: code SQLITE_BUSY: database is locked\nSqliteError: database is locked\n    at /ps/app/bin/web.js:9:708001\n    at Function.<anonymous> (/ps/app/bin/web.js:9:729247)\n    at Function.sqliteTransaction (/ps/app/node_modules/better-sqlite3/lib/methods/transaction.js:65:24)\n    at /ps/app/bin/web.js:9:729257\n    at Object.t.handleDbRetries (/ps/app/bin/web.js:9:710016)"}

This also looks to be related, just in the sync log:

 {"ts":1628892979630,"l":"error","ctx":"DbRetries","msg":"Caught db error. Retrying in 939ms.","meta":"SqliteError: code SQLITE_BUSY: database is locked\nSqliteError: database is locked\n    at /ps/app/bin/sync.js:9:706357\n    at Function.<anonymous> (/ps/app/bin/sync.js:9:727603)\n    at Function.sqliteTransaction (/ps/app/node_modules/better-sqlite3/lib/methods/transaction.js:65:24)\n    at /ps/app/bin/sync.js:9:727613\n    at Object.t.handleDbRetries (/ps/app/bin/sync.js:9:708372)"}
{"ts":1628893126122,"l":"error","ctx":"Error","msg":"onError()","meta":{"event":"fatal","message":"Failed to scan system volumes.¹⁵"}}
{"ts":1628893126122,"l":"error","ctx":"Service(sync)","msg":"exit()","meta":{"status":12,"reason":"Failed to scan system volumes.¹⁵","waitForJobs":false,"ending":false}}
{"ts":1628893126126,"l":"error","ctx":"Error","msg":"onError()","meta":{"event":"fatal","message":"Failed to scan system volumes.¹⁵"}}
{"ts":1628893126126,"l":"error","ctx":"Service(sync)","msg":"exit()","meta":{"status":12,"reason":"Failed to scan system volumes.¹⁵","waitForJobs":false,"ending":false}}

If the database is locked, can that sometimes mean it’s out of space? Is it possible my SQLite DB is unable to grow any further? I don’t see anything on disk that would limit its size.

241M Aug 13 16:18 db.sqlite3
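
For the record, nothing looks full. Here’s roughly how I checked (the mount points are from my setup; adjust to yours):

  df -h /mnt/cache /mnt/user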

Reading further, it seems to be unable to scan a system volume… no idea how that would be the case. The array is not having issues (or has reported none). Does PhotoStructure have a crazy low timeout? I’m running everything on 5400 RPM drives… no SSD (there is a plan to add some, but honestly 8 TB of SSD is pricey).

I might have figured it out…let me play with it before you spend time on it:

Aug 13 09:04:14 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5414.json
Aug 13 09:04:14 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5520.json
Aug 13 09:04:14 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5211.json
Aug 13 09:04:15 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5333.json
Aug 13 09:04:15 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5464.json
Aug 13 09:04:15 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5532.json
Aug 13 09:04:15 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/8667.json
Aug 13 09:04:15 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/8671.json
Aug 13 09:04:15 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/3709.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5213.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5318.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5381.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/2678.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5496.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/4434.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5243.json
Aug 13 09:04:16 Tower move: error: move, 397: No such file or directory (2): lstat: /mnt/cache/appdata/photostructure/config/pids/5394.json

Ok for future googlers and duckduck…goosers:

Disable appdata share caching if you have it enabled on that share. Also, disable caching on any share that PhotoStructure is writing to.

Almost every other Docker image on my Unraid system uses a cached appdata for logs / config / thumbnails / etc. I set PS up the same way (modified the current Docker image). I don’t get the errors above, but I am having issues with image import.

Even though I am not getting the error above, would you recommend just not having PS use any caching whatsoever? I too am using 5400 RPM drives.

If it were possible, I’d have it use a read cache and have Unraid read ahead as PS worked through a folder of images. But with a write cache, I think the mover was moving files while PS was still using them. This is probably only an issue when appdata doesn’t use a path like /mnt/cache/; for me it’s /mnt/appdata. I also think the database would benefit from living on an SSD permanently.

My long term goal is an expansion card with 4 m.2 ssds and all of my photos on that. But it’s expensive so I haven’t done that yet :).

For what it’s worth, my appdata share is set to Cache: Prefer. My photos are all on my main array. The cache drive is SSD; the main array is 5400 RPM HDD.

I’ve not experienced the error described. I don’t know if that’s helpful, but it’s another data point at least!


That is interesting. Mine was set to Yes; I didn’t try Prefer. That is a good data point and something to try.

Thanks!

It’s a little confusing, but you definitely want “Prefer” for your appdata share.

The cache options go like this:

Yes: New files go on the cache drive, but are moved off of the cache and onto the array each time the mover runs (default once a day).

Prefer: New files will go on the cache and STAY on the cache, unless the cache runs out of space, at which point new files will go on the array instead.

No: New files go on the array and stay there.

There is an Only option as well now.
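
If you want to double-check what each share is actually set to, you can grep the share configs from the Unraid console. (On stock Unraid the share configs live in /boot/config/shares; the path and key name here are from my box, so verify on yours.)

  grep -H shareUseCache /boot/config/shares/*.cfg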

So for some reason my PS Docker container won’t start at all now. Do you still have your memory settings set to the defaults? If so, what are they?

Which memory settings are you referring to?

Also: yes, I forgot about the Only setting, and it’s a decent candidate for an appdata share as well. The way I see it, the difference between Only and Prefer is that if I run out of space on my cache drive, Docker will keep working with Prefer. If I use Only and run out of space, containers will start failing.

I have the cache set to Prefer on both my appdata share and the PhotoStructure library share, and it’s been solid. I had random issues with SQLite early on (which may or may not have had anything to do with the cache), which went away completely when I set PS_FORCE_LOCAL_DB_REPLICA to false.


(Make sure you only do this if your library is actually stored on a local volume!)
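
For anyone wondering where that setting goes: it’s just an environment variable on the container. A minimal docker run sketch, using the default mount and port from the docs (adjust to your own template):

  docker run -d --name photostructure \
    -e PS_FORCE_LOCAL_DB_REPLICA=false \
    -v /mnt/user/Photos:/ps/library \
    -p 1787:1787 \
    photostructure/server

On Unraid you’d add it as an extra “Variable” in the container template instead of using docker run directly.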

Argh! It’s back!


ls * | egrep -v "gz" | xargs -n1 cat | egrep "error" | egrep "locked" | wc -l
5627
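
(For anyone copying this: the same count with a slightly tidier pipeline, assuming the plain-text logs are the *.log files in the current directory:

  grep -h "database is locked" *.log | grep -c error

)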

5627 database locked errors in a 12-hour period. Does that seem high?

Is there a way for me to see why it thinks it’s locked?

SQLite normally handles concurrent writes from different processes just fine, but PhotoStructure sometimes needs to run updates that touch many rows (like tag asset counts) or database maintenance tasks (like VACUUM and OPTIMIZE).

PhotoStructure tries to coordinate this across different processes to reduce write contention and busy lock timeouts, but it also retries these sorts of errors automatically: normally the second attempt works fine.
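
If you want to see who actually has the database open while it’s locked, something like this from inside the container can help, assuming lsof is available and the standard library layout (the db normally lives under .photostructure/models in your library):

  lsof /ps/library/.photostructure/models/db.sqlite3*

A -wal file sitting next to the db is normal if SQLite is running in write-ahead-logging mode; it isn’t a sign of corruption.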

If you want to send me your recent logs, I can verify whether the DB errors are from this expected situation or from something more nefarious.

I can do that, but here is what I’ve done in the meantime, and it’s running (much, much faster too). Can you tell me if I’ve maybe done something that has broken it in a new and interesting way?

  1. Moved previews to SSD per this post: Troubleshooting slow performance
  2. I set up my Docker mappings like this (rough sketch below):
  3. I moved the entire app folder into /mnt/cache/appdata/photostructure/.photostructure (everything that was in /ps/library/.photostructure, which lives in /mnt/user/Photos/PS).
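
Roughly, the mappings now look like this (a sketch of the idea, not my exact template):

  docker run -d --name photostructure \
    -p 1787:1787 \
    -v /mnt/user/Photos/PS:/ps/library \
    -v /mnt/cache/appdata/photostructure/.photostructure:/ps/library/.photostructure \
    photostructure/server

The second -v deliberately mounts inside the first, so the database and previews land on the SSD cache while the originals stay on the array.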

It doesn’t seem to be scanning though, even when I tell it to. It made some logs, but not much beyond the following:

{"ts":1629231478694,"l":"error","ctx":"DbRetries","msg":"Caught db error. Retrying in 1231ms.","meta":"SqliteError: code SQLITE_BUSY: database is locked\nSqliteError: database is locked\n    at /ps/app/bin/sync.js:9:706357\n    at Function.<anonymous> (/ps/app/bin/sync.js:9:727603)\n    at Function.sqliteTransaction (/ps/app/node_modules/better-sqlite3/lib/methods/transaction.js:65:24)\n    at /ps/app/bin/sync.js:9:727613\n    at Object.t.handleDbRetries (/ps/app/bin/sync.js:9:708372)"}
{"ts":1629231664586,"l":"error","ctx":"EventStore","msg":"maybeSendEvent(): ACCEPT","meta":{"event":{"timestamp":1629231478.963,"exception":{"values":[{"stacktrace":{"frames":[{"colno":843842,"filename":"async /ps/app/bin/sync.js","function":"null.<anonymous>","lineno":9,"in_app":false,"module":"sync"},{"colno":727502,"filename":"/ps/app/bin/sync.js","function":"async Object.t.tx","lineno":9,"in_app":true,"module":"sync","pre_context":["","/*"," * Copyright © 2021, PhotoStructure Inc."," * By using this software, you accept all of the terms in <https://photostructure.com/eula>."," * IF YOU DO NOT ACCEPT THESE TERMS, DO NOT USE THIS SOFTWARE"," */",""],"context_line":"'{snip} (\"Timeout while waiting for external vacuum to complete\")),await o.retryOnReject((async()=>e.inTransaction?t(e.db):h.handleDbRetries((()=>e. {snip}","post_context":["//# sourceMappingURL=sync.js.map"]},{"colno":608825,"filename":"/ps/app/bin/sync.js","function":"async c","lineno":9,"in_app":true,"module":"sync","pre_context":["","/*"," * Copyright © 2021, PhotoStructure Inc."," * By using this software, you accept all of the terms in <https://photostructure.com/eula>."," * IF YOU DO NOT ACCEPT THESE TERMS, DO NOT USE THIS SOFTWARE"," */",""],"context_line":"'{snip} orElse(e,1))}));let u=0;const c=async()=>{var e;try{return await i()}catch(i){if(!1===await(null===(e=t.errorIsRetriable)||void 0===e?void 0 {snip}","post_context":["//# sourceMappingURL=sync.js.map"]},{"colno":7,"filename":"node:internal/timers","function":"processTimers","lineno":500,"in_app":false,"module":"timers"},{"colno":9,"filename":"node:internal/timers","function":"listOnTimeout","lineno":526,"in_app":false,"module":"timers"},{"colno":5,"filename":"node:internal/process/task_queues","function":"runNextTicks","lineno":61,"in_app":false,"module":"task_queues"},{"colno":608485,"filename":"/ps/app/bin/sync.js","function":"null.<anonymous>","lineno":9,"in_app":true,"module":"sync","pre_context":["","/*"," * Copyright © 2021, PhotoStructure Inc."," * By using this software, you accept all of the terms in <https://photostructure.com/eula>."," * IF YOU DO NOT ACCEPT THESE TERMS, DO NOT USE THIS SOFTWARE"," */",""],"context_line":"'{snip} :void 0)),n.unrefDelay(t).then((()=>{if(null==i)throw i=!1,new Error(\"timeout\")}))])}t.thenOrTimeout=o,t.retryOnReject=function(e,t){const i {snip}","post_context":["//# sourceMappingURL=sync.js.map"]}]},"type":"Error","value":"timeout: unhandledRejection","mechanism":{"handled":true,"type":"generic"}}]},"event_id":"2dadd9c28e904d94a19802736180f818","platform":"node","environment":"production","release":"1.0.0+20210812145208","sdk":{"integrations":["InboundFilters","FunctionToString","Console","Http","OnUncaughtException","OnUnhandledRejection","LinkedErrors"]},"message":"timeout: unhandledRejection: Error: timeout: unhandledRejection","user":{"email":"whoopn@gmail.com"},"extra":{"pid":203,"serviceName":"sync","serviceEnding":false,"runtimeMs":217266,"version":"1.0.0","os":"Alpine Linux v3.13 on x64","isDocker":true,"nodeVersion":"16.6.2","locale":"en","cpus":"24 × Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz","memoryUsageMb":32,"memoryUsageRssMb":140,"systemMemory":"1.33 GB / 33.8 GB","ffmpeg":"version 4.3.1","vlc":"(not found)","argv":"[\"/usr/local/bin/node\",\"/ps/app/bin/sync.js\"]"}},"pleaseSend":false,"recentEventCount":0,"maxErrorsPerDay":3}}
{"ts":1629231664629,"l":"error","ctx":"EventStore","msg":"maybeSendEvent(): ACCEPT","meta":{"event":{"timestamp":1629231478.98,"exception":{"values":[{"stacktrace":{"frames":[{"colno":843842,"filename":"async /ps/app/bin/sync.js","function":"null.<anonymous>","lineno":9,"in_app":false,"module":"sync"},{"colno":727502,"filename":"/ps/app/bin/sync.js","function":"async Object.t.tx","lineno":9,"in_app":true,"module":"sync","pre_context":["","/*"," * Copyright © 2021, PhotoStructure Inc."," * By using this software, you accept all of the terms in <https://photostructure.com/eula>."," * IF YOU DO NOT ACCEPT THESE TERMS, DO NOT USE THIS SOFTWARE"," */",""],"context_line":"'{snip} (\"Timeout while waiting for external vacuum to complete\")),await o.retryOnReject((async()=>e.inTransaction?t(e.db):h.handleDbRetries((()=>e. {snip}","post_context":["//# sourceMappingURL=sync.js.map"]},{"colno":608825,"filename":"/ps/app/bin/sync.js","function":"async c","lineno":9,"in_app":true,"module":"sync","pre_context":["","/*"," * Copyright © 2021, PhotoStructure Inc."," * By using this software, you accept all of the terms in <https://photostructure.com/eula>."," * IF YOU DO NOT ACCEPT THESE TERMS, DO NOT USE THIS SOFTWARE"," */",""],"context_line":"'{snip} orElse(e,1))}));let u=0;const c=async()=>{var e;try{return await i()}catch(i){if(!1===await(null===(e=t.errorIsRetriable)||void 0===e?void 0 {snip}","post_context":["//# sourceMappingURL=sync.js.map"]},{"colno":7,"filename":"node:internal/timers","function":"processTimers","lineno":500,"in_app":false,"module":"timers"},{"colno":9,"filename":"node:internal/timers","function":"listOnTimeout","lineno":526,"in_app":false,"module":"timers"},{"colno":5,"filename":"node:internal/process/task_queues","function":"runNextTicks","lineno":61,"in_app":false,"module":"task_queues"},{"colno":608485,"filename":"/ps/app/bin/sync.js","function":"null.<anonymous>","lineno":9,"in_app":true,"module":"sync","pre_context":["","/*"," * Copyright © 2021, PhotoStructure Inc."," * By using this software, you accept all of the terms in <https://photostructure.com/eula>."," * IF YOU DO NOT ACCEPT THESE TERMS, DO NOT USE THIS SOFTWARE"," */",""],"context_line":"'{snip} :void 0)),n.unrefDelay(t).then((()=>{if(null==i)throw i=!1,new Error(\"timeout\")}))])}t.thenOrTimeout=o,t.retryOnReject=function(e,t){const i {snip}","post_context":["//# sourceMappingURL=sync.js.map"]}]},"type":"Error","value":"timeout: unhandledRejection","mechanism":{"handled":false,"type":"onunhandledrejection"}}]},"event_id":"da41be4101b246628b8e329e517bf191","platform":"node","environment":"production","release":"1.0.0+20210812145208","sdk":{"integrations":["InboundFilters","FunctionToString","Console","Http","OnUncaughtException","OnUnhandledRejection","LinkedErrors"]},"extra":{"unhandledPromiseRejection":true,"pid":203,"serviceName":"sync","serviceEnding":false,"runtimeMs":217280,"version":"1.0.0","os":"Alpine Linux v3.13 on x64","isDocker":true,"nodeVersion":"16.6.2","locale":"en","cpus":"24 × Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz","memoryUsageMb":32,"memoryUsageRssMb":140,"systemMemory":"1.33 GB / 33.8 GB","ffmpeg":"version 4.3.1","vlc":"(not found)","argv":"[\"/usr/local/bin/node\",\"/ps/app/bin/sync.js\"]"},"message":"timeout: unhandledRejection: [object Promise]","user":{"email":"whoopn@gmail.com"}},"pleaseSend":false,"recentEventCount":0,"maxErrorsPerDay":3}}

I don’t have enough SSD for my sorted library to live on SSD, but I can keep the previews and database on SSD. That is my goal, as my theory is that the DB is starved for I/O sitting on the same drives being hit so hard by scanning/ingest. I’m just not sure the links the Docker container inevitably makes are supported, since now the /ps/library/.photostructure folder isn’t TECHNICALLY right under the /ps/library folder. Linux usually has no issues with that, but I’ve seen Docker have issues.

Should I be doing this another way?
This way perhaps? Hybrid Library

@mrm This is something I didn’t consider, as I have auto-organize turned off, but does the library path house both the auto-organized library and the SQLite database?

If so, what do you think about creating a new path (perhaps optional) for the auto-organized photos? Obviously the auto-organized photos are going to take up a lot of space, which will discourage Unraid users from putting that on an SSD cache drive. However, users may really appreciate the performance boost from having the SQLite database on the SSD cache.

I don’t know if that’s causing any issues in this case, but I thought it was worth mentioning at least!

Yes: read more details here: PhotoStructure | What is a “PhotoStructure library”?

Good idea, and good news: that exists! You’re looking for the originalsDir setting.

Changing this setting is a bit tricky: read the docs, the hybrid library page, and if you have questions, feel free to ask.

# +----------------+
# |  originalsDir  |
# +----------------+
#
# This is the directory that PhotoStructure uses to store original images when
# "copyAssetsToLibrary" is enabled. Absolute paths are supported. Relative
# paths are evaluated from your libraryDir. This setting defaults to ".",
# which is the same as your PhotoStructure library directory.
#
# If you open your PhotoStructure library on a different computer, and that
# computer doesn't have access to your originals volume, full-screen zoom
# won't work, and non-transcoded videos will not play.
#
# This system setting needs to be set appropriately on different computers (it
# won't be set automatically!)
#
# If you have a large library and want to use an SSD, we recommend you set
# your libraryDir to your SSD, and use this setting to store your originals on
# a larger volume, rather than using the "previewsDir" setting.
#
# See
# <https://forum.photostructure.com/t/hybrid-photostructure-libraries/775>.
#
# environment: "PS_ORIGINALS_DIR"
#
# originalsDir = "."
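
For the Docker/Unraid case, a minimal sketch of that recommended split (the host paths here are examples, not anything from this thread):

  # library (db + previews) on the SSD cache; originals on the array
  docker run -d --name photostructure \
    -e PS_ORIGINALS_DIR=/ps/originals \
    -v /mnt/cache/appdata/photostructure/library:/ps/library \
    -v /mnt/user/Photos:/ps/originals \
    -p 1787:1787 \
    photostructure/server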

Well shoot, now it thinks it’s a fresh install.

What did I do wrong?

Do you want to hop on Discord so I can help debug?