Support for S3 storage

[copied from the subreddit]

I see on PhotoStructure’s “What’s Next” page that “Non-local imports (via S3 or other URL-accessible sites)” is coming, someday.

That would be AWESOME. I self-host and currently pay about $40/month for volume space for my assets on Digital Ocean. If PhotoStructure could address those assets via DO’s Spaces, I’d only be spending like $5/month.

I realize it’s a feature you, yourself, want, so it’ll likely come someday. I can wait. I’m just giving you feedback that I want it, too. :slight_smile:

Howdy!

Will the new originalsDir and something like rclone mount suffice, do you think?

https://photostructure.com/about/2020-release-notes/#more-storage-flexibility
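
Something like this, roughly; the remote name “spaces”, bucket “my-photos”, and mountpoint are placeholders, and it assumes you’ve already run rclone config with an S3-compatible provider:

    # Mount the bucket with write caching so PhotoStructure can create files:
    rclone mount spaces:my-photos /mnt/spaces \
      --vfs-cache-mode writes \
      --daemon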

Since you’re suggesting it, I bet it works and solves my need. I’ll report the results back here if I’m smart enough to get it working.


I think what you’re wanting to set up here will be a common setup: if you get stuck or see issues, I’d be happy to help.

How much space do you need? Not sure if you’re willing to switch from DigitalOcean, but BuyVM provide block storage at $5/TB/month, in 256 GB increments (so $1.25 per 256 GB), and the VPS just sees it as another regular hard drive. No S3 API needed, just mount it like any other drive. You can use whatever file system you like (ext4, zfs, btrfs).
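
Setting one up is the usual block-device routine; a sketch, assuming the volume shows up as /dev/sdb (check lsblk first):

    # One-time format; destroys anything already on the volume:
    sudo mkfs.ext4 /dev/sdb
    sudo mkdir -p /mnt/photos
    sudo mount /dev/sdb /mnt/photos
    # To remount at boot, add a line like this to /etc/fstab:
    # /dev/sdb  /mnt/photos  ext4  defaults,nofail  0  2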

The downside is that it can only be used with BuyVM’s VPSes. Their $15/month (4 GB RAM, 80 GB SSD, unmetered bandwidth @ 1 Gb/s) and above plans include dedicated CPU usage, so you can use the CPU 100% with no problems. DigitalOcean don’t provide dedicated CPU unless you get their “CPU-Optimized Droplets” which start at $40/month.

I swear I don’t work for them; I’m just a happy customer :stuck_out_tongue:

I think I did figure out how to do this, but PhotoStructure doesn’t seem to like it.

I got rclone to mount an Amazon S3 bucket containing a few pics. PhotoStructure launches, shows me the initial Settings and EULA screen, and even creates a .photostructure dir in the S3 bucket, but then it crashes and kills the mount when I click the “Save” button on the EULA screen (i.e., the PhotoStructure container shuts down and the rclone mount dies with “Transport endpoint is not connected”).
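
(Side note for anyone who hits the same thing: “Transport endpoint is not connected” is the usual symptom of the FUSE process dying while the mountpoint stays registered, so you have to clear the stale mount before retrying; the path is wherever you mounted the bucket.)

    fusermount -u /mnt/spaces    # or: sudo umount -l /mnt/spaces if that fails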

FYI, I’m using this to set up PhotoStructure on a DigitalOcean droplet.

Here’s the PhotoStructure container log from startup to “crash”:

Upgrading your library database...
PhotoStructure is ready: <http://localhost:1787/>
2021-01-03T14:40:28.726Z main-13 error Caught error from file write stream Error: EBADF: bad file descriptor, write
2021-01-03T14:40:32.282Z sync-33 error Caught error from file write stream Error: ENOTCONN: socket is not connected, close
WatchedChild.onError() {
ctx: {
src: 'ChildService(sync).onStdout()',
fatal: true,
ignorable: false,
errToS: 'sync-file: internal error: Error: onStderr({"error":"Health checks failed¹","problems":["Cannot write to /ps/library: undefined: _nativeCopyFile(/ps/library/.tmp-swr2ku/write-test.jpg.gz): {\\"src\\":\\"/ps/app/public/images/splashbg02-1024w.jpg.gz\\",\\"dest…'
},
src: 'ChildService(sync).onStdout()',
error: l [Error]: ChildService(sync).onStdout()sync-file: internal error: Error: onStderr({"error":"Health checks failed","problems":["Cannot write to /ps/library: undefined: _nativeCopyFile(/ps/library/.tmp-swr2ku/write-test.jpg.gz): {\"src\":\"/ps/app/public/images/splashbg02-1024w.jpg.gz\",\"dest…¹⁶
at C.onError (/ps/app/bin/main.js:3:133290)
at A.onStdout (/ps/app/bin/main.js:3:130238)
at s.onData (/ps/app/bin/main.js:3:137340)
at /ps/app/bin/main.js:3:203449
at Array.forEach (<anonymous>)
at s.onChunk (/ps/app/bin/main.js:3:203396)
at Socket.<anonymous> (/ps/app/bin/main.js:3:203615)
at Socket.emit (events.js:315:20)
at Socket.EventEmitter.emit (domain.js:486:12)
at addChunk (_stream_readable.js:309:12) {
cause: undefined,
retriable: true,
fatal: false
}
}
2021-01-03T14:40:32.961Z web-27 error Caught error from file write stream Error: ENOTCONN: socket is not connected, close
WatchedChild.onError() {
ctx: {
src: 'ChildService(web).onStdout()',
fatal: true,
ignorable: false,
errToS: 'Cannot write to /ps/library: undefined: Cannot mkdirp /ps/library/.tmp-7rx9wu¹⁶⁵undefined'
},
src: 'ChildService(web).onStdout()',
error: l [Error]: ChildService(web).onStdout()Cannot write to /ps/library: undefined: Cannot mkdirp /ps/library/.tmp-7rx9wuundefined¹⁶
at C.onError (/ps/app/bin/main.js:3:133290)
at A.onStdout (/ps/app/bin/main.js:3:130238)
at s.onData (/ps/app/bin/main.js:3:137340)
at /ps/app/bin/main.js:3:203449
at Array.forEach (<anonymous>)
at s.onChunk (/ps/app/bin/main.js:3:203396)
at Socket.<anonymous> (/ps/app/bin/main.js:3:203615)
at Socket.emit (events.js:315:20)
at Socket.EventEmitter.emit (domain.js:486:12)
at addChunk (_stream_readable.js:309:12) {
cause: undefined,
retriable: true,
fatal: false
}
}
{"fatal":true,"exit":true,"status":12,"pid":13,"ppid":6,"error":"ChildService(web).onStdout(): Error: ChildService(web).onStdout()Cannot write to /ps/library: undefined: Cannot mkdirp /ps/library/.tmp-7rx9wuundefined¹⁶"}

Shutting down PhotoStructure...
Verifying & backing up /ps/tmp/local-db/models/db.sqlite3...

I’m not fishing for free tech support here, just letting you know my results. Though if anyone does want to help me figure this out, I’d appreciate it. I’d love to get my asset library on S3. I may check out Daniel’s suggestion of BuyVM (below), but then I’d have to figure out how to set up everything on BuyVM (“everything” = PhotoStructure, Traefik, Syncthing, an SFTP server). I had a friend create Psodo for me, so I’d have to more or less start from scratch on a non-DigitalOcean platform, though I think cloud-init is pretty universal.

Until originalsDir is a thing, I think this may not work well for a library directory proper: I expect rclone mount to be overwhelmed by the I/O needed for files inside the .photostructure directory.
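
If you do experiment with it anyway, rclone’s VFS cache options should soften that I/O somewhat. An untested sketch; remote, mountpoint, and cache size are placeholders:

    # Cache reads and writes on local disk, capped at 10 GB:
    rclone mount spaces:my-photos /mnt/spaces \
      --vfs-cache-mode full \
      --vfs-cache-max-size 10G \
      --daemon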

I guess I’m not understanding what originalsDir will do. In my current setup, I have one directory containing my asset library, so whether I choose “yes, please copy my photos into my PhotoStructure library” or “No thanks, I like my photos and videos where they already are,” the results are the same.

Will originalsDir allow a single backup of the asset library, or will it merely separate the .photostructure directory from the asset library?

:point_up: It does the latter: it separates the .photostructure directory from your originals.

Another solution could be to make the PhotoStructure library a local directory, and then make the year subdirectories symlinks to the bind mount, but that’s a pretty grotty hack.
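
Roughly like this (all paths are placeholders; if you run in Docker, the symlink targets must be bind-mounted into the container too, or the links will dangle):

    # Library itself lives on fast local disk:
    mkdir -p /ps/library
    # Year subdirectories point at the remote mount:
    ln -s /mnt/spaces/2019 /ps/library/2019
    ln -s /mnt/spaces/2020 /ps/library/2020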

This feature is really important to me, so I wanted to bump the feature request again now that there’s more activity on the forum and now that some features have been added and their votes have been released.

Thanks for the bump.

It will be a non-trivial chunk of code to build out another filesystem iterator that uses a non-POSIX-like API, but it’s certainly something I’d personally like (I want a .tgz importer for Google Takeouts). Until I build that out, I’ve been limping along with FUSE plugins, like ratarmount. You may want to try s3fs-fuse (https://github.com/s3fs-fuse/s3fs-fuse); it’s actually packaged into Ubuntu proper!
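
For DigitalOcean Spaces the incantation is roughly this; bucket name, mountpoint, and region endpoint are placeholders:

    # Credentials file must contain ACCESS_KEY:SECRET_KEY and be private:
    echo 'ACCESS_KEY:SECRET_KEY' > ~/.passwd-s3fs
    chmod 600 ~/.passwd-s3fs
    # Mount the Space (the endpoint varies by region):
    s3fs my-photos /mnt/ps_library \
      -o passwd_file=${HOME}/.passwd-s3fs \
      -o url=https://nyc3.digitaloceanspaces.com \
      -o use_path_request_style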

Thanks for the suggestion. I got it to work, tested with one image. I’m running dockerized PhotoStructure. I mounted my S3-compatible storage (DigitalOcean Spaces) via s3fs on the host and mapped it to a volume in the PhotoStructure container in docker-compose.yaml:

...
    volumes:
      - /opt/containers/photostructure/config:/ps/config:rw
      - /storage/photos:/ps/library:rw
      - /storage/photos/.photostructure/logs:/ps/logs:rw
      - /home/psuser/tmp/photostructure-docker:/ps/tmp:rw
      - /mnt/ps_library:/ps_library
...
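
One caveat for anyone copying this: FUSE mounts are visible only to the mounting user by default, so if the container process runs as a different UID, you may need to mount with allow_other (non-root mounts also need user_allow_other enabled in /etc/fuse.conf):

    s3fs my-photos /mnt/ps_library \
      -o allow_other \
      -o passwd_file=${HOME}/.passwd-s3fs \
      -o url=https://nyc3.digitaloceanspaces.com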

I restarted the container, and in PhotoStructure’s UI Settings I added the mapped volume as a directory to scan for images (note in the screenshot below that “/ps/library” is my PhotoStructure library and “/ps_library” is my mapped S3 storage; sorry for the confusing names):


Then I told PhotoStructure to “Restart Sync” and, when it finished, it had found and properly processed the image in the S3 storage. Yay!

However, when I restart PhotoStructure again, it seems to forget my custom scan paths:

I then manually entered the mapped storage into config/settings.toml:

# +–––––––––––––+
# |  scanPaths  |
# +–––––––––––––+
#
scanPaths = [
  "/ps/library",
  "/ps_library"
]

After another reboot, settings.toml still has my path, but PhotoStructure doesn’t display it as a path to scan.

So there’s that weird issue to figure out, but also now I’m going to copy a large quantity of images and videos into S3 storage, ask PhotoStructure to sync, and see how it performs. I’ll report back.

Sorry that the settings didn’t seem to stick. If you could send me your logs, I can take a look tomorrow (I’m taking today off: Happy Father’s Day!)

Are you sure the /opt/containers bind mount is writable by the user in the docker container that runs PhotoStructure (if you’re using a custom UID/GID)?

Also, if you set PS_SCAN_PATHS as an environment variable, it will always override whatever is in your system settings.toml.
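
So if you want the scan path pinned regardless of what the UI writes out, you could set it in your compose file instead; a sketch, reusing the /ps_library mount from your example:

    ...
    environment:
      # overrides scanPaths in settings.toml:
      PS_SCAN_PATHS: /ps_library
    ...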

Also, libraries are always scanned, so you don’t need to include /ps/library.

I think so.

From docker-compose.yaml:

    ...
    environment:
      UID: 1000
      GID: 1000
    ...
    volumes:
      - /opt/containers/photostructure/config:/ps/config:rw
    ...

and on the host:

/path/to/.photostructure$ ls -l settings.toml
 -rw-r--r--  1 psuser psuser 61064 Jun 20 21:07 settings.toml
/path/to/.photostructure$ cat /etc/passwd | grep psuser
psuser:x:1000:1000:Photostructure User,,,:/home/psuser:/bin/bash

Actually, all files in /opt/containers were psuser:psuser except for settings.toml, which was root:root. But I changed it and rebooted, same results.

So, while the container is running, do these touches work?

docker exec -it PHOTOSTRUCTURE sh   # substitute your container name
su - node                           # the user PhotoStructure runs as inside the container
touch /ps/config/settings.toml
touch /ps/library/.photostructure/settings.toml

(I can’t tell you the number of times I’ve thought all the permissions were OK but three directories up was read-only, or something like that)
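
A quick way to check every level of a path at once, if the image includes util-linux:

    # prints owner and permissions for each directory component:
    namei -l /ps/config/settings.toml
    namei -l /ps/library/.photostructure/settings.toml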

Yes, both the touch commands worked.

So I copied maybe 10 GB of assets from my Google Photos into an S3 space and told PhotoStructure to “Rebuild (slow)” overnight, and it did find the new assets (the new tags showed up). My PhotoStructure library is still on local block storage, but I also added my S3 space as a directory to scan, as described above.

So now I’m copying the rest of my 150 GB of assets, but my question is: should I change my PhotoStructure library to the S3 space, or just keep my S3 space as another directory to scan? Is there even a difference?

One concern I have is something you said here:

Make sure /ps/tmp is a fast local disk (SSD would be great).

Will PhotoStructure be able to address an S3 space as the main library fast enough, or will the interface be unusably slow, or will it crash outright? I guess we’ll see! :slight_smile:

It depends on what you mean by “fast enough.” If this is on a VPS with great upstream and downstream bandwidth, sure. If it’s on your DSL line at home with constrained bandwidth, not so much.

PhotoStructure should be fine syncing from storage that can sustain roughly 10 MB/s reads and writes without tripping processing timeouts (the timeouts are sized for Raspberry Pis, which are glacially slow). FWIW, 10 MB/s is more than 10x slower than a decade-old NAS with decade-old HDDs on consumer networking.

Your user experience will be painfully slow with that storage, though. It’ll take seconds to render the home page, for example.

Copying the previews directory to fast local disk will help, but when you view an asset at full screen, if there isn’t a pre-built preview image big enough for your browser’s canvas, PhotoStructure will stream the original asset to your browser, which may take a while if the original is slow to fetch.
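
With your Docker setup, one way to keep previews local is a second, more specific bind mount that shadows just the previews tree; an untested sketch, assuming previews live under .photostructure/previews in your library (paths are placeholders):

    volumes:
      - /mnt/ps_library:/ps/library:rw
      # the more specific mount overlays previews onto local SSD:
      - /fast-ssd/ps-previews:/ps/library/.photostructure/previews:rw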

I think I understand. Thanks!

The thing I’d be concerned about with S3 is that bandwidth is relatively expensive… I think if you want to store photos externally, it’d be better to use storage that either doesn’t have a monthly transfer limit, or has a fixed price with a fixed amount of transfer included per month.
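
For rough numbers: at a typical ~$0.09/GB S3 egress rate, streaming a 150 GB library down just once is already about $13.50, before per-request fees; DigitalOcean Spaces’ $5/month base plan, by contrast, bundles in 1 TB of outbound transfer.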