What You Will Learn
- How NVENC hardware encoding works and its prerequisites
- The basic usage of h264_nvenc and hevc_nvenc
- How to choose preset and quality options (-rc, -cq, -b:v)
- A speed and quality comparison against CPU software encoding
- Common errors and how to fix them
Tested with: FFmpeg 6.1 (NVIDIA GPU environment)
Target OS: Windows / Linux (a CUDA driver is required)
What Is NVENC
NVENC is the hardware video encoder built into NVIDIA GPUs. Compared to CPU software encoding, it is dramatically faster and consumes virtually no CPU resources. The trade-off is that it tends to be slightly less efficient at compression than software encoding.
| Item | NVENC Hardware | libx264 Software |
|---|---|---|
| Speed | 5–10× faster | Baseline |
| CPU usage | Low (GPU processing) | High |
| Compression efficiency | Slightly worse | Excellent |
| Supported codecs | H.264, H.265, AV1 (RTX 40 and later) | H.264 |
Prerequisites
To use NVENC, you need the following:
- An NVIDIA GPU: Kepler generation (GTX 600) or later
- NVIDIA driver: the latest version is recommended (Linux: 520 or higher)
- An NVENC-enabled FFmpeg build: confirm that the H.264/HEVC NVENC encoders appear in the output of ffmpeg -encoders | grep nvenc
Checking whether NVENC is available:
※ This command requires an NVIDIA GPU environment
ffmpeg -encoders | grep nvenc
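For use in scripts, the same check can be wrapped in a small helper. This is a minimal sketch, assuming the usual ffmpeg -encoders layout in which the encoder name is the second whitespace-separated column. The name has_encoder is a hypothetical helper, not part of FFmpeg.

```shell
# has_encoder NAME  (hypothetical helper, not an FFmpeg command)
# Reads `ffmpeg -encoders` style text on stdin and succeeds (exit 0)
# only if NAME appears as an encoder name (assumed to be column 2).
has_encoder() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Typical use (needs an FFmpeg binary on PATH):
#   ffmpeg -hide_banner -encoders 2>/dev/null | has_encoder h264_nvenc \
#     && echo "h264_nvenc is compiled in"
```

A match only confirms that the encoder is compiled into the build. An actual encode can still fail if the driver or GPU is missing.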
Basic h264_nvenc Command
The simplest NVENC encode:
※ This command requires an NVIDIA GPU environment
ffmpeg -i input.mp4 -c:v h264_nvenc -c:a copy output_nvenc.mp4
To set a quality target with -cq (1–51; lower values mean higher quality, and 0 selects automatic):
※ This command requires an NVIDIA GPU environment
ffmpeg -i input.mp4 -c:v h264_nvenc -rc vbr -cq 23 -c:a aac -b:a 128k output_nvenc.mp4
hevc_nvenc (H.265)
HEVC typically produces smaller files than H.264 at equivalent quality:
※ This command requires an NVIDIA GPU environment
ffmpeg -i input.mp4 -c:v hevc_nvenc -rc vbr -cq 28 -c:a aac -b:a 128k output_hevc_nvenc.mp4
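The two quality-targeted commands above differ only in the encoder name and the -cq value, so a script can generate both from one place. The following is a sketch under stated assumptions: nvenc_quality_args is a hypothetical helper, the defaults of 23 and 28 are simply the values used in the commands above, and the helper restricts -cq to whole numbers for simplicity.

```shell
# nvenc_quality_args CODEC [CQ]  (hypothetical helper)
# Prints encoder arguments for quality-targeted VBR.
# CODEC is "h264" or "hevc"; CQ defaults to 23 (H.264) or 28 (HEVC),
# the values used in the commands above.
nvenc_quality_args() {
  codec=$1
  case "$codec" in
    h264) default_cq=23 ;;
    hevc) default_cq=28 ;;
    *) echo "unknown codec: $codec" >&2; return 1 ;;
  esac
  cq=${2:-$default_cq}
  # Reject anything that is not a whole number (simplification in this sketch).
  case "$cq" in
    ''|*[!0-9]*) echo "cq must be a whole number: $cq" >&2; return 1 ;;
  esac
  # Keep the value inside the 1..51 range.
  if [ "$cq" -lt 1 ] || [ "$cq" -gt 51 ]; then
    echo "cq out of range (1-51): $cq" >&2
    return 1
  fi
  printf '%s\n' "-c:v ${codec}_nvenc -rc vbr -cq $cq"
}

# Typical use (word splitting of the substitution is intentional):
#   ffmpeg -i input.mp4 $(nvenc_quality_args hevc) -c:a aac -b:a 128k output.mp4
```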
Presets
NVENC presets control the balance between speed and quality:
※ This command requires an NVIDIA GPU environment
ffmpeg -i input.mp4 -c:v h264_nvenc -preset p4 -rc vbr -cq 23 output.mp4
With a current driver and FFmpeg build, use -preset p1 (fastest) through -preset p7 (highest quality). Older setups only offer the legacy names such as fast, medium, and slow, which current builds still accept as aliases.
| Preset | Speed | Quality |
|---|---|---|
| p1 / fast | Fastest | Lower |
| p4 / medium | Balanced | Standard |
| p7 / slow | Slow | High |
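If existing scripts still pass the legacy names, the pairing in the table can be applied mechanically. This is a small sketch that only covers the three legacy names listed in the table. The name to_p_preset is a hypothetical helper, and any other preset name is rejected here rather than guessed.

```shell
# to_p_preset NAME  (hypothetical helper)
# Maps a legacy preset name to its p-preset counterpart from the table.
# p1..p7 pass through unchanged; anything else is an error.
to_p_preset() {
  case "$1" in
    fast)   echo p1 ;;
    medium) echo p4 ;;
    slow)   echo p7 ;;
    p[1-7]) echo "$1" ;;
    *) echo "unsupported preset: $1" >&2; return 1 ;;
  esac
}

# Typical use:
#   ffmpeg -i input.mp4 -c:v h264_nvenc -preset "$(to_p_preset slow)" -rc vbr -cq 23 output.mp4
```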
Combining GPU Decoding + NVENC Encoding
By performing decoding on the GPU as well, you can lower CPU load even further:
※ This command requires an NVIDIA GPU environment
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -c:v h264_nvenc -preset p4 -rc vbr -cq 23 output.mp4
Specifying -hwaccel_output_format cuda lets you pass the decoded frames in GPU memory directly to NVENC.
Bitrate-Targeted Encoding
When you want to target a specific bitrate:
※ This command requires an NVIDIA GPU environment
ffmpeg -i input.mp4 -c:v h264_nvenc -rc cbr -b:v 4M -maxrate 4M -bufsize 8M output.mp4
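The command above sets -maxrate equal to the target and -bufsize to twice the target. The sketch below generalizes that pattern to other bitrates. The 2× buffer ratio is an assumption taken from the 4M / 8M pair in the example, not a rule from FFmpeg, and cbr_args is a hypothetical helper name.

```shell
# cbr_args MBPS  (hypothetical helper)
# Prints CBR arguments for a target of MBPS megabits per second.
# Assumption: bufsize = 2 x bitrate, mirroring the 4M / 8M pair above.
cbr_args() {
  mbps=$1
  # Accept only positive whole numbers without leading zeros.
  case "$mbps" in
    ''|0*|*[!0-9]*)
      echo "bitrate must be a positive whole number of Mbit/s: $mbps" >&2
      return 1 ;;
  esac
  printf '%s\n' "-rc cbr -b:v ${mbps}M -maxrate ${mbps}M -bufsize $((mbps * 2))M"
}

# Typical use (word splitting of the substitution is intentional):
#   ffmpeg -i input.mp4 -c:v h264_nvenc $(cbr_args 6) output.mp4
```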
Common Errors and Fixes
No NVENC capable devices found
- The NVIDIA driver is outdated or not installed
- Check whether the GPU is recognized with nvidia-smi
NVENC: OpenEncodeSessionEx failed: out of memory (10)
- Another process has used up all available NVENC sessions
- NVIDIA caps concurrent NVENC sessions per board; the limit varies by GPU, driver, and product line, so check NVIDIA’s official support matrix for your card
Cannot load libcuda.so.1 (Linux)
- The CUDA library path is not set
- Verify with ldconfig -v | grep cuda, or check whether it exists in /usr/lib/x86_64-linux-gnu/
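When encodes run unattended, the three messages above can be matched in the FFmpeg log to print the corresponding hint. This is a rough sketch: nvenc_hint is a hypothetical helper, it matches only the message fragments quoted in this section, and the hint text is a summary of the fixes listed above.

```shell
# nvenc_hint  (hypothetical helper)
# Reads FFmpeg log text on stdin and prints one hint per known error.
# Exits non-zero when none of the three messages is present.
nvenc_hint() {
  log=$(cat)
  found=1
  case "$log" in
    *"No NVENC capable devices found"*)
      echo "driver: install or update the NVIDIA driver, then check nvidia-smi"
      found=0 ;;
  esac
  case "$log" in
    *"OpenEncodeSessionEx failed"*)
      echo "sessions: NVENC session limit reached, reduce parallel encodes"
      found=0 ;;
  esac
  case "$log" in
    *"Cannot load libcuda.so.1"*)
      echo "library: libcuda.so.1 not on the library path, check ldconfig"
      found=0 ;;
  esac
  return "$found"
}

# Typical use:
#   ffmpeg -i input.mp4 -c:v h264_nvenc output.mp4 2>&1 | nvenc_hint
```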
Choosing Between Software and Hardware Encoding
| Use Case | Recommendation |
|---|---|
| Archiving / high quality | libx264 / libx265 (software) |
| Real-time / live streaming | NVENC |
| Batch conversion / time-critical | NVENC |
| Environment without a GPU | Software encoding |
Speed and Quality Ballparks (Typical Ranges)
These figures vary widely with GPU generation, input resolution, and preset, so treat them as the typical ranges observed in public benchmarks rather than a single measured result.
- Encode speed: An RTX 3060-class GPU often encodes 1080p H.264 at roughly 8–12× realtime (several hundred fps), several times faster than libx264 -preset medium (CPU), which typically runs around 1.5–3× realtime. At 4K, expect NVENC to drop to roughly 2–4×. Exact numbers depend on GPU, driver, content, preset, and resolution.
- Quality trade-off: At a matched quality target, NVENC generally needs a higher bitrate than libx264 to reach the same VMAF; the exact gap depends on the GPU generation and the content. It narrows with newer generations, and Ada Lovelace (RTX 40) is reported to close much of it.
- Where it wins: Because NVENC is several times faster, streaming, large batch jobs, and deadline-driven work strongly favor it. For one-time archival encodes you keep long term, the better compression efficiency of libx265 is worth the extra time.
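To turn the realtime multipliers above into wall-clock time, divide the clip duration by the multiplier. The helper below is a minimal sketch of that arithmetic. estimate_encode_seconds is a hypothetical name, and the result is only as reliable as the ballpark multiplier you feed it.

```shell
# estimate_encode_seconds DURATION_S SPEED  (hypothetical helper)
# Prints the wall-clock seconds needed for a clip of DURATION_S seconds
# encoded at SPEED x realtime, rounded to the nearest second.
estimate_encode_seconds() {
  awk -v d="$1" -v s="$2" 'BEGIN {
    if (s + 0 <= 0) exit 1      # reject zero, negative, or non-numeric speed
    printf "%.0f\n", d / s
  }'
}

# Typical use, for a one-hour (3600 s) source:
#   estimate_encode_seconds 3600 8     # NVENC, low end of the 1080p range
#   estimate_encode_seconds 3600 1.5   # libx264 medium, low end of the range
```

With the ranges quoted above, a one-hour 1080p clip works out to roughly 5 to 7.5 minutes on NVENC (8–12×) and roughly 20 to 40 minutes with libx264 -preset medium (1.5–3×).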
Common Pitfalls
-preset slow has no effect / presets seem ignored
- Symptom: Specifying -preset slow doesn’t give the speed/quality balance you expected.
- Cause: Both naming schemes are valid in modern FFmpeg. The old fast/medium/slow names remain as compatibility aliases, but the p1–p7 scheme is the current, more predictable control.
- Fix: Prefer the new presets for reproducibility, e.g. -preset p7 -tune hq. Use p6–p7 for high quality, p4 for balanced, and p1–p2 for low latency.
B-frames added but compression doesn’t improve
- Symptom: Adding -bf 3 barely reduces output size, or is ignored.
- Cause: HEVC B-frame support is generation-dependent. It was added with Turing (RTX 20), so older GPUs may not apply HEVC B-frames. (H.264 B-frames are broadly supported across generations.)
- Fix: Check your GPU generation and combine -b_ref_mode middle -bf 3. If HEVC B-frames don’t take effect on an older GPU, lean on bitrate or use H.264 to hold quality.
Encoding two or more streams at once fails with out of memory (10)
- Symptom: Running several parallel NVENC processes in a batch fails with OpenEncodeSessionEx failed once you exceed the board’s session limit.
- Cause: NVIDIA imposes a per-board cap on concurrent NVENC sessions that varies by GPU, driver, and product line (consumer GeForce currently around a dozen; professional cards vary, some unrestricted). The exact number is not fixed across the lineup.
- Fix: Check NVIDIA’s official support matrix for your specific GPU/driver/product line to find the concurrent-session cap, then limit parallelism accordingly or update the driver. Producing multiple outputs within a single process can also be more stable.
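One way to limit parallelism is to let xargs cap the number of concurrent FFmpeg processes. The following is a sketch, assuming an xargs that supports -P (GNU and BSD versions do). The function name encode_batch, the FFMPEG override variable, and the .nvenc.mp4 output suffix are all hypothetical choices for illustration, and the right MAX_JOBS value depends on your card's session cap.

```shell
# encode_batch MAX_JOBS  (hypothetical helper)
# Reads one input path per line on stdin and runs at most MAX_JOBS
# NVENC encodes at a time. Each output is written next to its input
# as <input>.nvenc.mp4; -n makes FFmpeg skip outputs that already exist.
# Set FFMPEG=echo for a dry run that only prints the arguments.
encode_batch() {
  max_jobs=$1
  xargs -P "$max_jobs" -I '{}' "${FFMPEG:-ffmpeg}" -nostdin -n -i '{}' \
    -c:v h264_nvenc -preset p4 -rc vbr -cq 23 -c:a copy '{}.nvenc.mp4'
}

# Typical use, keeping two sessions busy at once:
#   printf '%s\n' *.mp4 | encode_batch 2
```

Starting with a low MAX_JOBS and raising it until throughput stops improving keeps the batch below the session limit without having to know the exact cap in advance.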
CPU-decode → GPU-encode isn’t as fast as expected
- Symptom: You switched to NVENC but CPU usage stays high and speed plateaus.
- Cause: Decoding is still done in software (CPU), and the CPU→GPU frame transfer becomes the bottleneck.
- Fix: Add -hwaccel cuda -hwaccel_output_format cuda on the input side so decoding also happens on the GPU and frames are handed to NVENC in GPU memory. This removes the transfer cost and lowers CPU load.
Related Articles
- VAAPI (Linux) Hardware Encoding
- VideoToolbox (macOS) Hardware Encoding
- Compress Video — CRF and Target Bitrate
Primary sources: trac.ffmpeg.org/wiki/HWAccelIntro / ffmpeg.org/ffmpeg-codecs.html
FAQ
How does NVENC compare to libx264 in image quality?
At the same bitrate, NVENC is slightly inferior. However, the latest NVENC (Ada Lovelace generation and later) comes close to the level of libx264 medium, and given the speed it is more than sufficient in practice.
NVENC isn’t working
Make sure the GPU is made by NVIDIA, the driver is up to date, and your FFmpeg build was configured with --enable-nvenc. You can verify the build with ffmpeg -encoders | grep nvenc.
Should I use CRF or CQ?
With NVENC you use -cq (-crf is for libx264). The value ranges from 0 to 51, where 0 means automatic; as with libx264’s CRF, smaller values mean higher quality. 19–23 is typical.
Should I use B-frames?
Yes, in most cases. Enabling B-frames with -b_ref_mode middle -bf 3 generally improves compression efficiency. Note that HEVC B-frame support is generation-dependent (added with Turing), so on older GPUs HEVC B-frames may not apply.
Can I do streaming delivery with NVENC?
Yes, it is ideal for low-latency streaming. You can minimize latency by combining -tune ll or -tune ull (ultra-low-latency) with -preset p1.