This commit implements ROI (Region of Interest) encoding support for D3D12VA hardware encoders, enabling spatially-adaptive quality control for H.264, HEVC, and AV1 encoders.
Query for `D3D12_VIDEO_ENCODER_RATE_CONTROL_FLAG_ENABLE_DELTA_QP` support during initialization to check whether the hardware support delta QP. If delta QP is supported, then process `AV_FRAME_DATA_REGIONS_OF_INTEREST` side data and generate delta QP maps for each frame.
Sample command line:
ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4 -vf addroi=x=480:y=270:w=960:h=540:qoffset=-1/5 -c:v hevc_d3d12va output.mp4
By default, the D3D12 video encoder uses MAXIMUM, which means no restriction—it uses the highest precision supported by the driver.
Applications may want to reduce precision to improve speed or reduce power consumption. This requires the encoder to support user-defined motion estimation precision modes.
D3D12_VIDEO_ENCODER_MOTION_ESTIMATION_PRECISION_MODE defines several precision modes:
maximum: No restriction, uses the maximum precision supported by the driver.
full_pixel: Allows only full-pixel precision.
half_pixel: Allows half-pixel precision.
quarter-pixel: Allows quarter-pixel precision.
eighth-pixel: Allows eighth-pixel precision (introduced in Windows 11).
Sample Command Line:
ffmpeg -hwaccel d3d12va -hwaccel_output_format d3d12 -extra_hw_frames 20 -i input.mp4 -an -c:v h264_d3d12va -me_precision half_pixel out.mp4
Intra refresh is a technique that gradually refreshes the video by encoding rows or regions as intra macroblocks/CTUs spread over multiple frames, rather than using periodic I-frames.
This provides better error resilience for video streaming while maintaining more consistent bitrate.
Disable Intra Refresh (This is the default)
ffmpeg -init_hw_device d3d12va -hwaccel d3d12va -hwaccel_output_format d3d12 \
-i input.mp4 \
-c:v h264_d3d12va \
-intra_refresh_mode none \
-intra_refresh_duration 30 \
-g 60 \
output.h264
Enable Intra Refresh
ffmpeg -init_hw_device d3d12va -hwaccel d3d12va -hwaccel_output_format d3d12 \
-i input.mp4 \
-c:v h264_d3d12va \
-intra_refresh_mode row_based \
-intra_refresh_duration 30 \
-g 60 \
output.h264
Parameters
- `-intra_refresh_mode`: Set to `row_based` to enable row-based intra refresh, or `NONE` to disable
- `-intra_refresh_duration`: Number of frames over which to spread the intra refresh (default: 0 = use GOP size)
- `-g`: GOP size (should typically be larger than intra refresh duration)
This patch adds support for the texture array feature
used by AMD boards in the D3D12 HEVC encoder.
In texture array mode, a single texture array is shared for all
reference and reconstructed pictures using different subresources.
The implementation ensures compatibility
and has been successfully tested on AMD, Intel, and NVIDIA GPUs.
Add the max_frame_size option to support setting max frame size in
bytes. Max frame size is the maximum cap in the bitrate algorithm per
each encoded frame.
Signed-off-by: Tong Wu <wutong1208@outlook.com>
This commit cleans up and refactors the mess of private state upon
private state that used to be.
Now, FFHWBaseEncodePicture is fully initialized upon call-time,
and, most importantly, this lets APIs which require initialization
data for frames (VkImageViews) to initialize this for both the
input image, and the reconstruction (DPB) image.
Signed-off-by: Tong Wu <wutong1208@outlook.com>
It is d3d12va's requirement that the FrameStartOffset must be aligned as
per hardware limitation. However, we could trim this alignment at output
to reduce coded size. A aligned_header_size is added to
D3D12VAEncodePicture.
Signed-off-by: Tong Wu <wutong1208@outlook.com>
This patch is to make FFHWBaseEncodeContext a standalone component
and avoid getting FFHWBaseEncodeContext from avctx->priv_data.
This patch also removes some unnecessary AVCodecContext arguments.
For receive_packet call, a small wrapper is introduced.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
This implementation is based on D3D12 Video Encoding Spec:
https://microsoft.github.io/DirectX-Specs/d3d/D3D12VideoEncoding.html
Sample command line for transcoding:
ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4
-c:v hevc_d3d12va output.mp4
Signed-off-by: Tong Wu <tong1.wu@intel.com>