✅ Hardware-accelerated encoding/transcoding

I typically prefer not to use transcoding which is great that i’m able to disable it but in the rare case where i’m forced to, i’d like ffmpeg to use my GPU for help doing so. NVenc, etc.

:+1: Anything to make imports faster!

Here’s the current function for transcoding:

export async function ffmpegTranscode(args: {
  src: PosixFile
  dest: PosixFile
  width: Maybe<number>
  height: Maybe<number>
  videoBitrateKbps?: number
  timeoutMs: number
}) {
  // https://trac.ffmpeg.org/wiki/Limiting%20the%20output%20bitrate
  const bitrate = opt(args.videoBitrateKbps)
    .filter(gt0)
    .map(ea => sigFigs(ea, 2))
    .map(m => [
      "-b:v",
      m + "k",
      "-maxrate",
      m + "k",
      "-bufsize",
      sigFigs(m / 2, 2) + "k"
    ])
    .getOrElse(() => [])

  return stdoutResult(
    Settings.ffmpegPath.valueOrDefault,
    [
      "-loglevel",
      "error",
      "-threads",
      toS(ffmpegThreads()),
      "-i",
      args.src.nativePath,
      "-c:v",
      "libx264",
      // pix_fmt and profile are required by firefox (!!)
      "-pix_fmt",
      "yuv420p",
      "-profile:v",
      "high",
      ...bitrate,
      "-c:a",
      "aac",
      args.dest.nativePath
    ],
    {
      timeout: args.timeoutMs,
      isIgnorableError
    }
  )
}

I could just dump most of these into a setting (basically everything between -i and the dynamic bit rate). I think that would cover it (considering this): what do you think?

Also, I’d assume this would be a library setting. Hardware encoding would work on my workstation but would break on all my other computers, so I think it’ll need to be a system setting.

1 Like

I’ve just implemented this setting, which will be available in v1.0:

# +-----------------------+
# |  ffmpegTranscodeArgs  |
# +-----------------------+
#
# The following are the default arguments added to transcode requests made to
# ffmpeg (when ffmpeg is available). The following arguments will proceed the
# command: "-loglevel error -threads T -i INPUT_FILE_PATH" (where T is
# replaced by ~half the available CPU threads, and INPUT_FILE_PATH is the full
# native pathname to the source video). The following arguments will follow
# the arguments in this setting: "-b:v VIDEO_BITRATE_KBPS OUTPUT_FILE_PATH".
#
# CAUTION: this is an advanced setting. Editing this may cause videos that
# require transcoding to not be imported, or not be viewable on all browsers
# and platforms. See
# <https://forum.photostructure.com/t/hardware-accelerated-encoding-transcoding/166>
# for more details.
# (env: "PS_FFMPEG_TRANSCODE_ARGS")
#
# ffmpegTranscodeArgs = [
#   "-c:a",
#   "aac",
#   "-c:v",
#   "libx264",
#   "-pix_fmt",
#   "yuv420p",
#   "-profile:v",
#   "high"
# ]
2 Likes

that would be perfect! allowing the ability to fine-tune it and set it up for GPU/CPU workloads

1 Like

Hey, would that be as simple as adding the -hwaccel cuda flag to the ffmpeg arguments per this doc?

Thanks.

Almost! ffmpeg is picky (the argument needs to go right in the front), so I had to add a new setting:

# +-----------------+
# |  ffmpegHwaccel  |
# +-----------------+
#
# FFmpeg supports both software and hardware encoders. Valid values include
# "auto" which should work for everyone, "cuda" for NVIDIA GPUs, or use
# "disable", "no", "false", or "" to disable. Run "ffmpeg -hwaccels" to see
# supported acceleration methods. See
# <https://forum.photostructure.com/t/hardware-accelerated-encoding-transcoding/166>
# and <https://trac.ffmpeg.org/wiki/HWAccelIntro> for more details.
#
# environment: "PS_FFMPEG_HWACCEL"
#
# ffmpegHwaccel = "auto"

Is that available in beta? I didn’t see it in the .env file defaults here or here. I was combing through those earlier.

Thanks!

It’ll drop in beta.8:

1 Like

I did a bit more testing with this, and although I think PhotoStructure is making the correct incantation for ffmpeg, you’ll need hardware and drivers set up properly to get hardware acceleration to kick in.

I have an older GeForce GTX 970 in my workstation, and it seems to not play nicely with the latest 465 drivers on Ubuntu 20.04. You may have better luck on a newer Ubuntu and newer GPU.

You can test by running this command:

time ffmpeg -i /path/to/file.mov \
  -c:a aac -c:v libx264 -pix_fmt yuv420p -color_trc smpte2084 -color_primaries bt2020\
  -profile:v high -b:v 5000k -maxrate 5000k -bufsize 2500k \
  /tmp/out-$RANDOM.mp4

and then add -hwaccel (with either “auto” or “cuda” or whatever is applicable: run ffmpeg -hwaccels to see what you can try):

time ffmpeg -hwaccel auto \
  -i /path/to/file.mov \
  -c:a aac -c:v libx264 -pix_fmt yuv420p -color_trc smpte2084 -color_primaries bt2020\
  -profile:v high -b:v 5000k -maxrate 5000k -bufsize 2500k \
  /tmp/out-$RANDOM.mp4

I’m getting an odd error running that, though I’m now passing through the requisite pieces (same as my Plex container) to access the GPU.

    runtime: nvidia

and

      - NVIDIA_VISIBLE_DEVICES=all

Here is the error:

/ps/app # time ffmpeg -i /photos/icloud/adam/2021/06/14/IMG_3100.MOV -hwaccel auto -c:a aac -c:v libx264 -pix_fmt yuv420p -color_trc smpte2084 -color_primaries bt2020 -profile:v high -b:v 5000k -maxrate 5000k -bufsize 2500k /ps/tmp/test3.mp4
ffmpeg version 4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 10.2.1 (Alpine 10.2.1_pre1) 20201203
  configuration: --prefix=/usr --enable-avresample --enable-avfilter --enable-gnutls --enable-gpl --enable-libass --enable-libmp3lame --enable-libvorbis --enable-libvpx --enable-libxvid --enable-libx264 --enable-libx265 --enable-libtheora --enable-libv4l2 --enable-libdav1d --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-libxcb --enable-libsrt --enable-libssh --enable-libvidstab --disable-stripping --disable-static --disable-librtmp --enable-vaapi --enable-vdpau --enable-libopus --enable-vulkan --enable-libsoxr --enable-libaom --disable-debug
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/photos/icloud/adam/2021/06/14/IMG_3100.MOV':
  Metadata:
    major_brand     : qt  
    minor_version   : 0
    compatible_brands: qt  
    creation_time   : 2021-06-14T20:12:26.000000Z
    com.apple.quicktime.location.accuracy.horizontal: 65.000000
    com.apple.quicktime.location.ISO6709: REMOVED
    com.apple.quicktime.make: Apple
    com.apple.quicktime.model: iPhone 11 Pro
    com.apple.quicktime.software: 14.6
    com.apple.quicktime.creationdate: 2021-06-14T13:12:26-0700
  Duration: 00:00:07.59, start: 0.000000, bitrate: 48264 kb/s
    Stream #0:0(und): Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709), 3840x2160, 47997 kb/s, 59.96 fps, 59.94 tbr, 600 tbn, 600 tbc (default)
    Metadata:
      creation_time   : 2021-06-14T20:12:26.000000Z
      handler_name    : Core Media Video
      encoder         : HEVC
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 179 kb/s (default)
    Metadata:
      creation_time   : 2021-06-14T20:12:26.000000Z
      handler_name    : Core Media Audio
    Stream #0:2(und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
    Metadata:
      creation_time   : 2021-06-14T20:12:26.000000Z
      handler_name    : Core Media Metadata
    Stream #0:3(und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
    Metadata:
      creation_time   : 2021-06-14T20:12:26.000000Z
      handler_name    : Core Media Metadata
    Stream #0:4(und): Data: none (mebx / 0x7862656D), 69 kb/s (default)
    Metadata:
      creation_time   : 2021-06-14T20:12:26.000000Z
      handler_name    : Core Media Metadata
Option hwaccel (use HW accelerated decoding) cannot be applied to output url /ps/tmp/test3.mp4 -- you are trying to apply an input option to an output file or vice versa. Move this option before the file it belongs to.
Error parsing options for output file /ps/tmp/test3.mp4.
Error opening output files: Invalid argument
Command exited with non-zero status 1
real    0m 0.04s
user    0m 0.03s
sys     0m 0.01s

Ugh, sorry, the -hwaccel was in the wrong place. I just edited the instructions.

Looks better now. I need to find out why the container thinks my device isn’t there because I’m seeing this:

Device creation failed: -542398533.
[AVHWDeviceContext @ 0x7f8dde62bb00] Cannot open the X11 display .
Device creation failed: -1313558101.
[hevc @ 0x7f8dda09d6c0] Auto hwaccel disabled: no device found.

But I think that’s a me problem. As far as I know multiple containers should be able to share a GPU from within the same host machine. I’ll need to poke around.

The alpine docker image that I use for PhotoStructure is certainly not going to have the drivers necessary to do GPU passthrough.

I switched to Alpine specifically to make the image and container smaller (it’s 75% smaller than the ubuntu image), but that size reduction comes at a (substantial) reduction in functionality.

You should be able to use the PhotoStructure for Node version for this, or modify the Dockerfile to pull in an nvidia image, but that path leads to unexplored forests most likely filled with monsters.

It might be fun though.

Do the drivers need to be part of the base image? I thought by leveraging the nvidia container runtime, the host OS mounts them into the container automatically. At least, that’s what the docs for the LSIO plex image seem to indicate.

I’ve gone through these instructions (GitHub - NVIDIA/nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs) and have HW acceleration enabled for Plex transcodes.

If I go into /dev in Photostructure and do an ls now, this is what I see:

drwxr-xr-x 5 root root      420 Jun 16 21:53 .
drwxr-xr-x 1 root root     4.0K Jun 16 21:53 ..
lrwxrwxrwx 1 root root       11 Jun 16 21:53 core -> /proc/kcore
lrwxrwxrwx 1 root root       13 Jun 16 21:53 fd -> /proc/self/fd
crw-rw-rw- 1 root root   1,   7 Jun 16 21:53 full
drwxrwxrwt 2 root root       40 Jun 16 21:53 mqueue
crw-rw-rw- 1 root root   1,   3 Jun 16 21:53 null
crw-rw-rw- 1 root root 236,   0 Jun 15 21:56 nvidia-uvm
crw-rw-rw- 1 root root 236,   1 Jun 15 21:56 nvidia-uvm-tools
crw-rw-rw- 1 root root 195,   0 Jun 15 21:56 nvidia0
crw-rw-rw- 1 root root 195, 255 Jun 15 21:56 nvidiactl
lrwxrwxrwx 1 root root        8 Jun 16 21:53 ptmx -> pts/ptmx
drwxr-xr-x 2 root root        0 Jun 16 21:53 pts
crw-rw-rw- 1 root root   1,   8 Jun 16 21:53 random
drwxrwxrwt 2 root root       40 Jun 16 21:53 shm
lrwxrwxrwx 1 root root       15 Jun 16 21:53 stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root       15 Jun 16 21:53 stdin -> /proc/self/fd/0
lrwxrwxrwx 1 root root       15 Jun 16 21:53 stdout -> /proc/self/fd/1
crw-rw-rw- 1 root root   5,   0 Jun 16 21:53 tty
crw-rw-rw- 1 root root   1,   9 Jun 16 21:53 urandom
crw-rw-rw- 1 root root   1,   5 Jun 16 21:53 zero

@adamf, that very well may be sufficient to make hwaccel happy, as long as the drivers are also present.

In any event, I actually chickened out and switched the default for the new ffmpegHwaccel from auto to disabled due to failures on macOS, and the fact that it doesn’t seem to be a simple apt install something-something-cuda and have things work (at least on Ubuntu).

I’m closing this now to release the votes associated to this feature request. If you’d like to discuss this further, feel free to open a new topic.

1 Like