I typically prefer not to use transcoding at all (so it’s great that I’m able to disable it), but in the rare case where I’m forced to, I’d like ffmpeg to use my GPU to help do so: NVENC, etc.
Anything to make imports faster!
Here’s the current function for transcoding:
export async function ffmpegTranscode(args: {
  src: PosixFile
  dest: PosixFile
  width: Maybe<number>
  height: Maybe<number>
  videoBitrateKbps?: number
  timeoutMs: number
}) {
  // https://trac.ffmpeg.org/wiki/Limiting%20the%20output%20bitrate
  const bitrate = opt(args.videoBitrateKbps)
    .filter(gt0)
    .map(ea => sigFigs(ea, 2))
    .map(m => [
      "-b:v",
      m + "k",
      "-maxrate",
      m + "k",
      "-bufsize",
      sigFigs(m / 2, 2) + "k"
    ])
    .getOrElse(() => [])
  return stdoutResult(
    Settings.ffmpegPath.valueOrDefault,
    [
      "-loglevel",
      "error",
      "-threads",
      toS(ffmpegThreads()),
      "-i",
      args.src.nativePath,
      "-c:v",
      "libx264",
      // pix_fmt and profile are required by firefox (!!)
      "-pix_fmt",
      "yuv420p",
      "-profile:v",
      "high",
      ...bitrate,
      "-c:a",
      "aac",
      args.dest.nativePath
    ],
    {
      timeout: args.timeoutMs,
      isIgnorableError
    }
  )
}
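For context, this is roughly the ffmpeg invocation that function ends up building; the thread count, bitrate, and file paths below are just placeholders:

ffmpeg -loglevel error -threads 4 \
  -i /path/to/source.mov \
  -c:v libx264 -pix_fmt yuv420p -profile:v high \
  -b:v 5000k -maxrate 5000k -bufsize 2500k \
  -c:a aac \
  /path/to/dest.mp4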
I could just dump most of these into a setting (basically everything between -i and the dynamic bitrate arguments). I think that would cover it (considering this): what do you think?
Also, I’d normally assume this would be a library setting, but hardware encoding would work on my workstation and break on all my other computers, so I think it’ll need to be a system setting.
I’ve just implemented this setting, which will be available in v1.0:
# +-----------------------+
# | ffmpegTranscodeArgs |
# +-----------------------+
#
# The following are the default arguments added to transcode requests made to
# ffmpeg (when ffmpeg is available). These arguments will be preceded by
# "-loglevel error -threads T -i INPUT_FILE_PATH" (where T is replaced by
# ~half the available CPU threads, and INPUT_FILE_PATH is the full native
# pathname to the source video), and will be followed by
# "-b:v VIDEO_BITRATE_KBPS OUTPUT_FILE_PATH".
#
# CAUTION: this is an advanced setting. Editing this may cause videos that
# require transcoding to not be imported, or not be viewable on all browsers
# and platforms. See
# <https://forum.photostructure.com/t/hardware-accelerated-encoding-transcoding/166>
# for more details.
# (env: "PS_FFMPEG_TRANSCODE_ARGS")
#
# ffmpegTranscodeArgs = [
# "-c:a",
# "aac",
# "-c:v",
# "libx264",
# "-pix_fmt",
# "yuv420p",
# "-profile:v",
# "high"
# ]
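If you want to point ffmpegTranscodeArgs at a hardware encoder, first check whether your ffmpeg build actually includes one. For NVIDIA, a check along these lines should work (h264_nvenc only exists in builds compiled with NVENC support):

ffmpeg -hide_banner -encoders | grep -i nvenc

If h264_nvenc shows up there, swapping it in for libx264 above is exactly the kind of fine-tuning this setting is meant to allow, but whether it actually works still depends on your hardware and drivers.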
That would be perfect! It would allow fine-tuning it for either GPU or CPU workloads.
Hey, would that be as simple as adding the -hwaccel cuda flag to the ffmpeg arguments per this doc?
Thanks.
Almost! ffmpeg is picky (the argument needs to go right at the front), so I had to add a new setting:
# +-----------------+
# | ffmpegHwaccel |
# +-----------------+
#
# FFmpeg supports both software and hardware encoders. Valid values include
# "auto" (which should work for everyone), "cuda" for NVIDIA GPUs, or
# "disable", "no", "false", or "" to disable hardware acceleration. Run
# "ffmpeg -hwaccels" to see supported acceleration methods. See
# <https://forum.photostructure.com/t/hardware-accelerated-encoding-transcoding/166>
# and <https://trac.ffmpeg.org/wiki/HWAccelIntro> for more details.
#
# environment: "PS_FFMPEG_HWACCEL"
#
# ffmpegHwaccel = "auto"
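If you’d rather drive this via the environment, the same setting maps to PS_FFMPEG_HWACCEL; a sketch (the docker command is illustrative, not a complete invocation):

export PS_FFMPEG_HWACCEL=cuda
# or, when running the container:
docker run -e PS_FFMPEG_HWACCEL=cuda ...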
Is that available in beta? I didn’t see it in the .env file defaults here or here. I was combing through those earlier.
Thanks!
It’ll drop in beta.8.
I did a bit more testing with this, and although I think PhotoStructure is making the correct incantation for ffmpeg, you’ll need hardware and drivers set up properly to get hardware acceleration to kick in.
I have an older GeForce GTX 970 in my workstation, and it seems to not play nicely with the latest 465 drivers on Ubuntu 20.04. You may have better luck on a newer Ubuntu and newer GPU.
You can test by running this command:
time ffmpeg -i /path/to/file.mov \
-c:a aac -c:v libx264 -pix_fmt yuv420p -color_trc smpte2084 -color_primaries bt2020 \
-profile:v high -b:v 5000k -maxrate 5000k -bufsize 2500k \
/tmp/out-$RANDOM.mp4
and then add -hwaccel (with either “auto” or “cuda” or whatever is applicable: run ffmpeg -hwaccels to see what you can try):
time ffmpeg -hwaccel auto \
-i /path/to/file.mov \
-c:a aac -c:v libx264 -pix_fmt yuv420p -color_trc smpte2084 -color_primaries bt2020 \
-profile:v high -b:v 5000k -maxrate 5000k -bufsize 2500k \
/tmp/out-$RANDOM.mp4
I’m getting an odd error running that, though I’m now passing through the requisite pieces (same as my Plex container) to access the GPU.
runtime: nvidia
and
- NVIDIA_VISIBLE_DEVICES=all
Here is the error:
/ps/app # time ffmpeg -i /photos/icloud/adam/2021/06/14/IMG_3100.MOV -hwaccel auto -c:a aac -c:v libx264 -pix_fmt yuv420p -color_trc smpte2084 -color_primaries bt2020 -profile:v high -b:v 5000k -maxrate 5000k -bufsize 2500k /ps/tmp/test3.mp4
ffmpeg version 4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 10.2.1 (Alpine 10.2.1_pre1) 20201203
configuration: --prefix=/usr --enable-avresample --enable-avfilter --enable-gnutls --enable-gpl --enable-libass --enable-libmp3lame --enable-libvorbis --enable-libvpx --enable-libxvid --enable-libx264 --enable-libx265 --enable-libtheora --enable-libv4l2 --enable-libdav1d --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-libxcb --enable-libsrt --enable-libssh --enable-libvidstab --disable-stripping --disable-static --disable-librtmp --enable-vaapi --enable-vdpau --enable-libopus --enable-vulkan --enable-libsoxr --enable-libaom --disable-debug
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/photos/icloud/adam/2021/06/14/IMG_3100.MOV':
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
creation_time : 2021-06-14T20:12:26.000000Z
com.apple.quicktime.location.accuracy.horizontal: 65.000000
com.apple.quicktime.location.ISO6709: REMOVED
com.apple.quicktime.make: Apple
com.apple.quicktime.model: iPhone 11 Pro
com.apple.quicktime.software: 14.6
com.apple.quicktime.creationdate: 2021-06-14T13:12:26-0700
Duration: 00:00:07.59, start: 0.000000, bitrate: 48264 kb/s
Stream #0:0(und): Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709), 3840x2160, 47997 kb/s, 59.96 fps, 59.94 tbr, 600 tbn, 600 tbc (default)
Metadata:
creation_time : 2021-06-14T20:12:26.000000Z
handler_name : Core Media Video
encoder : HEVC
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 179 kb/s (default)
Metadata:
creation_time : 2021-06-14T20:12:26.000000Z
handler_name : Core Media Audio
Stream #0:2(und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
Metadata:
creation_time : 2021-06-14T20:12:26.000000Z
handler_name : Core Media Metadata
Stream #0:3(und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
Metadata:
creation_time : 2021-06-14T20:12:26.000000Z
handler_name : Core Media Metadata
Stream #0:4(und): Data: none (mebx / 0x7862656D), 69 kb/s (default)
Metadata:
creation_time : 2021-06-14T20:12:26.000000Z
handler_name : Core Media Metadata
Option hwaccel (use HW accelerated decoding) cannot be applied to output url /ps/tmp/test3.mp4 -- you are trying to apply an input option to an output file or vice versa. Move this option before the file it belongs to.
Error parsing options for output file /ps/tmp/test3.mp4.
Error opening output files: Invalid argument
Command exited with non-zero status 1
real 0m 0.04s
user 0m 0.03s
sys 0m 0.01s
Ugh, sorry, the -hwaccel was in the wrong place. I just edited the instructions.
Looks better now. I need to find out why the container thinks my device isn’t there because I’m seeing this:
Device creation failed: -542398533.
[AVHWDeviceContext @ 0x7f8dde62bb00] Cannot open the X11 display .
Device creation failed: -1313558101.
[hevc @ 0x7f8dda09d6c0] Auto hwaccel disabled: no device found.
But I think that’s a me problem. As far as I know, multiple containers should be able to share a GPU on the same host machine. I’ll need to poke around.
The Alpine Docker image that I use for PhotoStructure is certainly not going to have the drivers necessary to do GPU passthrough.
I switched to Alpine specifically to make the image and container smaller (it’s 75% smaller than the Ubuntu image), but that size reduction comes at the cost of a (substantial) reduction in functionality.
You should be able to use the PhotoStructure for Node version for this, or modify the Dockerfile to pull in an nvidia image, but that path leads to unexplored forests most likely filled with monsters.
It might be fun though.
Do the drivers need to be part of the base image? I thought that, by leveraging the NVIDIA container runtime, the host OS mounts them into the container automatically. At least, that’s what the docs for the LSIO Plex image seem to indicate.
I’ve gone through these instructions (GitHub - NVIDIA/nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs) and have HW acceleration enabled for Plex transcodes.
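(For reference, the basic smoke test from those instructions is roughly the following; the exact CUDA image tag depends on which version of the docs you followed:)

docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi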
If I go into /dev in PhotoStructure and do an ls now, this is what I see:
drwxr-xr-x 5 root root 420 Jun 16 21:53 .
drwxr-xr-x 1 root root 4.0K Jun 16 21:53 ..
lrwxrwxrwx 1 root root 11 Jun 16 21:53 core -> /proc/kcore
lrwxrwxrwx 1 root root 13 Jun 16 21:53 fd -> /proc/self/fd
crw-rw-rw- 1 root root 1, 7 Jun 16 21:53 full
drwxrwxrwt 2 root root 40 Jun 16 21:53 mqueue
crw-rw-rw- 1 root root 1, 3 Jun 16 21:53 null
crw-rw-rw- 1 root root 236, 0 Jun 15 21:56 nvidia-uvm
crw-rw-rw- 1 root root 236, 1 Jun 15 21:56 nvidia-uvm-tools
crw-rw-rw- 1 root root 195, 0 Jun 15 21:56 nvidia0
crw-rw-rw- 1 root root 195, 255 Jun 15 21:56 nvidiactl
lrwxrwxrwx 1 root root 8 Jun 16 21:53 ptmx -> pts/ptmx
drwxr-xr-x 2 root root 0 Jun 16 21:53 pts
crw-rw-rw- 1 root root 1, 8 Jun 16 21:53 random
drwxrwxrwt 2 root root 40 Jun 16 21:53 shm
lrwxrwxrwx 1 root root 15 Jun 16 21:53 stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root 15 Jun 16 21:53 stdin -> /proc/self/fd/0
lrwxrwxrwx 1 root root 15 Jun 16 21:53 stdout -> /proc/self/fd/1
crw-rw-rw- 1 root root 5, 0 Jun 16 21:53 tty
crw-rw-rw- 1 root root 1, 9 Jun 16 21:53 urandom
crw-rw-rw- 1 root root 1, 5 Jun 16 21:53 zero
@adamf, that very well may be sufficient to make hwaccel happy, as long as the drivers are also present.
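One way to check whether the user-space driver bits actually made it into the container is to look for nvidia-smi inside it, with something like this (the container name is just an example, and this may still fail on the Alpine image for the reasons above):

docker exec -it photostructure sh -c "which nvidia-smi && nvidia-smi"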
In any event, I actually chickened out and switched the default for the new ffmpegHwaccel from auto to disabled, due to failures on macOS, and because it doesn’t seem to be as simple as an apt install something-something-cuda to get things working (at least on Ubuntu).
I’m closing this now to release the votes associated with this feature request. If you’d like to discuss this further, feel free to open a new topic.