Import and resync faster on Docker NAS?

nuk · November 6, 2022, 2:45pm

Any tips on getting my import/resync to use more disk/CPU/RAM, or figure out what’s actually limiting it? It’s using only ~20% of CPU and ~1 MB/s of the hard drives.

I’m trying out alpha.7 on a Synology DS920+ with the Docker GUI. So far I have:

configured the Docker container with “High” CPU priority
configured the Docker container to be able to use all 4GB of my RAM
set environment variable PS_PROCESS_PRIORITY=Normal
set environment variable PS_CPU_LOAD_PERCENT=200
set environment variable PS_FORCE_LOCAL_DB_REPLICA=false since my library is on a local volume as required for this setting

Any other ideas? Perhaps it’s a bug that PhotoStructure is only doing 1 concurrent import and 1 gfx/process (whatever that is), even though it’s detecting 4 cores and targeting 200% usage.

mrm · November 6, 2022, 4:14pm

The code that determines concurrency treats available memory as a hard limit, and won’t exceed it. I guess I could apply that clamp before I apply the cpuPercent setting, though, because if someone sets it higher than 100%, that’s certainly a signal that they’re OK with higher load.

I’ll take a look at this. Thanks for the heads up!

nuk · November 6, 2022, 5:10pm

PhotoStructure is the single most important application on my NAS, so when it’s rebuilding I want it done ASAP.

(I got the NAS to prevent data rot and centralize my storage, and while theoretically I want to do Plex/Jellyfin/other stuff sometime, PhotoStructure is the only application I actually care about at this time).

mrm · November 6, 2022, 6:52pm

I forgot I’d already added a manual override: try setting maxConcurrentImports=3 in your system settings.toml, or set the equivalent environment variable in your docker run command. (Note that PhotoStructure looks for an environment variable override using both the PS_MAX_CONCURRENT_IMPORTS and maxConcurrentImports keys. I made the SHOUTY_CASE variant to make env pedants happy.)
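To make the dual-key lookup concrete, here is a minimal sketch in Python. It is not PhotoStructure’s actual code: `shouty_case` and `env_override` are hypothetical helpers, and checking the SHOUTY_CASE key before the camelCase one is an assumption, since the real precedence isn’t stated in this thread.

```python
import re
from typing import Mapping, Optional


def shouty_case(setting: str, prefix: str = "PS_") -> str:
    """Convert a camelCase setting name to a prefixed SHOUTY_CASE key."""
    # Insert "_" at each boundary between a lowercase letter/digit and an
    # uppercase letter, then uppercase the whole thing.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", setting)
    return prefix + snake.upper()


def env_override(setting: str, env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty override found for `setting`, or None.

    ASSUMPTION: the SHOUTY_CASE key is tried first, then the bare
    camelCase key. The real lookup order may differ.
    """
    for key in (shouty_case(setting), setting):
        value = env.get(key)
        if value:
            return value
    return None
```

For example, `env_override("maxConcurrentImports", os.environ)` would pick up either `PS_MAX_CONCURRENT_IMPORTS=3` or `maxConcurrentImports=3` from the container’s environment.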

I think I’d try 2 or 3 first: using 4 will almost certainly be slower than 3, and may make your NAS unresponsive.

I’d also set your cpuLoadPercent to only 100. Although some “CPU percent utilization” displays (like top on Linux) count one busy CPU as 100%, and 4 busy CPUs as 400%, PhotoStructure adds up all the busy “cpu” percents and divides by the number of “cpus” (the number of CPUs the system reports, which may be 2 ⨉ core_count, like on your hardware).
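The difference between the two conventions is just a division by the CPU count. A small illustrative sketch in Python (the function names are made up for this example, and the busy percentages are invented sample values, not measurements):

```python
from typing import Sequence


def summed_percent(busy: Sequence[float]) -> float:
    """Per-CPU convention: each fully busy CPU contributes 100."""
    return sum(busy)


def normalized_percent(busy: Sequence[float]) -> float:
    """Sum the per-CPU busy percentages, then divide by the CPU count."""
    if not busy:
        raise ValueError("need at least one CPU reading")
    return sum(busy) / len(busy)


# Two of four reported CPUs fully busy (sample values).
busy = [100.0, 100.0, 0.0, 0.0]
print(summed_percent(busy))      # 200.0
print(normalized_percent(busy))  # 50.0
```

Under the normalized convention, 100 already means “every reported CPU is busy”, which is why a target above 100 buys little.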

So, here’s the final recommendation:

PS_PROCESS_PRIORITY=BelowNormal
PS_CPU_LOAD_PERCENT=100
PS_MAX_CONCURRENT_IMPORTS=3
PS_FFMPEG_THREADS=3
PS_SHARP_THREADS_PER_PROCESS=2
PS_FORCE_LOCAL_DB_REPLICA=false

Here are the relevant bits of your settings.toml:

# +------------------+
# |  cpuLoadPercent  |
# +------------------+
#
# This setting is a rough goal for PhotoStructure to load the system during
# library synchronization. A higher value here will allow PhotoStructure to
# run more tasks in parallel, but may impact your system's responsiveness.
# Setting this value to 0 will still allow 1 import to run.
#
# This setting is ignored if "maxConcurrentImports" and
# "sharpThreadsPerProcess" are set.
#
# environment keys: "PS_CPU_LOAD_PERCENT"
# minValue: 0
# maxValue: 200
#
# cpuLoadPercent = 75


# +------------------------+
# |  maxConcurrentImports  |
# +------------------------+
#
# How many imports can PhotoStructure schedule concurrently? This will be
# clamped between 1 and 32.
#
# If not set, a sensible value will be computed based on "cpuLoadPercent".
#
# If set explicitly, this and "sharpThreadsPerProcess" will override
# "cpuLoadPercent" and "maxConcurrentImportsWhenRemote" settings.
#
# aliases: "maxSyncFileJobs"
# environment keys: "PS_MAX_CONCURRENT_IMPORTS" or "PS_MAX_SYNC_FILE_JOBS"
#
# maxConcurrentImports = undefined


# +----------------------------------+
# |  maxConcurrentImportsWhenRemote  |
# +----------------------------------+
#
# How many concurrent files can be imported if the library is on a remote
# volume? This defaults to 2 to try to avoid overwhelming HDD I/O on the
# remote NAS. If this is larger than (cpus.length * cpuLoadPercent) or max
# child processes given available memory, this value will be ignored.
#
# aliases: "maxSyncFileJobsWhenRemote"
# environment keys: "PS_MAX_CONCURRENT_IMPORTS_WHEN_REMOTE" or
# "PS_MAX_SYNC_FILE_JOBS_WHEN_REMOTE"
#
# maxConcurrentImportsWhenRemote = 2

(Given this, I won’t change the current code).
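For anyone trying to follow how these settings interact, here is a rough sketch in Python of the behavior as described in this thread. It is not PhotoStructure’s implementation. The function name, the `memory_limit` parameter (how many import processes fit in available RAM), and the `cpu_count * cpu_load_percent / 100` formula are all assumptions made for illustration; only the 1 to 32 clamp, the “0 still allows 1 import” rule, and the explicit override come from the settings text above.

```python
import math
from typing import Optional

MIN_IMPORTS = 1   # "Setting this value to 0 will still allow 1 import to run"
MAX_IMPORTS = 32  # "clamped between 1 and 32"


def resolve_max_concurrent_imports(
    explicit: Optional[int],
    cpu_count: int,
    cpu_load_percent: float,
    memory_limit: int,
) -> int:
    """Pick an import concurrency from the settings (illustrative only)."""
    if explicit is not None:
        # An explicit maxConcurrentImports overrides cpuLoadPercent.
        # ASSUMPTION: it also bypasses the memory-based cap.
        return max(MIN_IMPORTS, min(MAX_IMPORTS, explicit))
    # ASSUMPTION: the computed default is cpus * load fraction, rounded down,
    # and memory acts as a hard upper bound on it.
    cpu_target = math.floor(cpu_count * cpu_load_percent / 100)
    return max(MIN_IMPORTS, min(MAX_IMPORTS, cpu_target, memory_limit))
```

With 4 reported CPUs, a 200% target, and memory for only one import process, this sketch returns 1, which matches what nuk saw; passing `explicit=3` returns 3 regardless.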

If you ever see any settings documentation that’s confusing or wrong, or think of ways to improve the verbiage, please share!

mrm · November 6, 2022, 7:12pm

Make sure you enable snapshots and periodic data scrubs! Those weren’t enabled by default (at least on my NAS).

Also know that btrfs, which Synology uses, doesn’t do automatic data scrub recovery unless your data redundancy via RAID is sufficient. Unfortunately, I don’t believe RAID on btrfs is considered stable.

So, as far as I understand, btrfs will detect bitrot, but can’t automatically repair it.

(I’d also rsync everything important to an external drive every quarter or year or so, and trade drives with a relative when you see them for holidays, just to have an offsite backup.)

nuk · November 7, 2022, 2:06pm

Thank you for the concern, and for writing “PhotoStructure | How do I safely store my files?”. I think I’m actually okay: my Synology uses btrfs with SHR (Synology Hybrid RAID) across two hard drives, with snapshots and periodic data scrubs enabled, and with data healing enabled per “How do I enable File self-healing on DSM?” in the Synology Knowledge Center. It can’t be enabled for the whole NAS, only on “shared folders”, but that’s the root folder you create for files during setup anyway, so no big deal. It’s even enabled on my docker photostructure and Plex folders, too, though idk whether that’s necessary. Looks like it’s new since DSM 6.1, so data healing hasn’t always been a thing. And it’s goofy to have to specifically enable it, and the UI doesn’t call it “data healing”.

The lack of ECC RAM does concern me, though. It would be a bummer if a cosmic ray flipped some bits while writing some photo edits to disk. But there’s still a hefty price premium for ECC and I expect to do very little editing, so the risk is low enough for now to be acceptable.

adept · January 17, 2023, 1:11am

Btrfs’s implementation of RAID5/6 is indeed not considered stable; however, btrfs on top of dmraid’s RAID5/6 is perfectly stable.