The function violated the ABI requirement not to write below SP
(this breaks asynchronous signal handling). On RV32, it also
did not align SP to 16 bytes and did not restore it correctly.
No changes to benchmarks as this patch only changes a few immediate
offsets.
Fix several AMF-related issues.
Check the return value of amf_init_frames_context() correctly in amfdec,
as it returns int rather than AMF_RESULT.
Handle possible NULL surfaces returned from QueryInterface() in
vf_amf_common to avoid passing invalid data to amf_amfsurface_to_avframe().
Remove FILTER_SINGLE_PIXFMT from vf_sr_amf since it must not be used
together with a query formats function.
aom_codec_control() takes control id as int. It could be AV1E_ or common
AV1_ enum in encoder, and AV1D_ for decoder.
While upstream provides AOM_CODEC_CONTROL_TYPECHECKED() macro to check
the provided enum value, we wrap those calls in codecctl_ functions,
which makes it not feasible to use.
To avoid complicating this needlessly, just use int.
Fixes: warning: implicit conversion from enumeration type 'enum aom_com_control_id' to different enumeration type 'enum aome_enc_control_id'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
By default, the D3D12 video encoder uses MAXIMUM, which means no restriction—it uses the highest precision supported by the driver.
Applications may want to reduce precision to improve speed or reduce power consumption. This requires the encoder to support user-defined motion estimation precision modes.
D3D12_VIDEO_ENCODER_MOTION_ESTIMATION_PRECISION_MODE defines several precision modes:
maximum: No restriction, uses the maximum precision supported by the driver.
full_pixel: Allows only full-pixel precision.
half_pixel: Allows half-pixel precision.
quarter_pixel: Allows quarter-pixel precision.
eighth_pixel: Allows eighth-pixel precision (introduced in Windows 11).
Sample Command Line:
ffmpeg -hwaccel d3d12va -hwaccel_output_format d3d12 -extra_hw_frames 20 -i input.mp4 -an -c:v h264_d3d12va -me_precision half_pixel out.mp4
Most EXIF metadata is in IFD0 and most EXIF payloads only contain
one IFD, but it is possible for there to be more IFDs after the
existing trailing one. exiftool and similar software report these IFDs
as IFD1, IFD2, etc. This commit reads those additional IFDs and attaches
them as dummy entries in the top-level IFD ranging from 0xFFFC down to
0xFFED, which are unused by the EXIF spec. The EXIF API is only able to
return and work with a single IFD, so by attaching it as a subdirectory
this metadata can be preserved.
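
The tag range named above leaves room for 16 additional IFDs. A minimal
sketch of one possible index-to-tag mapping follows. It assumes that IFD1
takes 0xFFFC and each further IFD takes the next lower tag; the helper name,
the macros and the exact ordering are illustrative assumptions, not taken
from the patch.

```c
#include <stdint.h>

#define EXTRA_IFD_TAG_FIRST 0xFFFC /* highest dummy tag named above */
#define EXTRA_IFD_TAG_LAST  0xFFED /* lowest dummy tag named above */

/* Hypothetical helper: returns the dummy tag for additional IFD number
 * `index` (1 for IFD1, 2 for IFD2, ...), or 0 if the index does not fit
 * into the reserved range. */
static uint16_t extra_ifd_tag(unsigned index)
{
    unsigned slots = EXTRA_IFD_TAG_FIRST - EXTRA_IFD_TAG_LAST + 1;

    if (index < 1 || index > slots)
        return 0;
    return (uint16_t)(EXTRA_IFD_TAG_FIRST - (index - 1));
}
```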
This is done transparently through the read/write process. When an
additional IFD1 is parsed, it is attached as a dummy entry, but
av_exif_write writes it after IFD0 rather than as a subdirectory, as intended.
Existing files without more than one IFD, i.e. most files, will be unaffected
by this change, as well as API clients looking to parse specific fields, but
now more metadata is parsed and written, rather than simply being discarded
as trailing data.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Before this commit, exif_parse_ifd_list didn't free *ifd upon failure,
relying on the caller to do so instead. We only guarded some of the
calls against this function, not all of them, so sometimes it leaked.
This commit fixes this, so exif_parse_ifd_list frees *ifd upon failure
so callers do not have to guard its invocation with a free wrapper.
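
The ownership rule described here (the callee cleans up its own output on
failure) can be sketched as follows. DemoIFD, demo_ifd_free() and
demo_parse_ifd() are hypothetical stand-ins for illustration only; they are
not the real FFmpeg structures or functions.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an IFD; not the real FFmpeg structure. */
typedef struct DemoIFD {
    int *entries;
    int  count;
} DemoIFD;

static void demo_ifd_free(DemoIFD *ifd)
{
    free(ifd->entries);
    ifd->entries = NULL;
    ifd->count   = 0;
}

/* Copies `n` values into *ifd. On any failure the function releases what
 * it allocated itself, so callers need no free wrapper around the call. */
static int demo_parse_ifd(DemoIFD *ifd, const int *values, int n)
{
    ifd->entries = NULL;
    ifd->count   = 0;

    if (n <= 0)
        return -EINVAL;

    ifd->entries = malloc((size_t)n * sizeof(*ifd->entries));
    if (!ifd->entries)
        return -ENOMEM;

    for (int i = 0; i < n; i++) {
        if (values[i] < 0) {      /* pretend negative input is invalid */
            demo_ifd_free(ifd);   /* callee-side cleanup on failure */
            return -EINVAL;
        }
        ifd->entries[ifd->count++] = values[i];
    }
    return 0;
}
```

With this contract a caller only checks the return value; after a negative
return, *ifd holds nothing that still has to be released.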
Fixes: ossfuzz 440747118: Integer-overflow in av_strerror
Signed-off-by: Leo Izen <leo.izen@gmail.com>
For AVX2, movdqu is as fast as movdqa when used on aligned addresses,
so don't instantiate aligned/unaligned versions.
(The check was btw overly strict: The AVX2 code only uses 16 byte
stores, so it would be enough for dst to be 16-byte aligned.)
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These functions are only used on Conroe (they are overwritten
by SSSE3 functions using xmm registers if the SSSE3SLOW flag is not set)
which is very old (introduced in 2006), so remove them.
Btw: The checkasm test (which uses declare_func and not
declare_func_emms since cd8a33bcce)
would fail on a Conroe, yet no one ever reported any such failure.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If Zbb is enabled at compilation (e.g. Ubuntu), the compiler should
compile the new C mid_pred() function correctly. But if Zbb is *not*
enabled (e.g. Debian), then we can at least fall back at run-time.
On SiFive-U74, before:
sub_median_pred_c: 1331.9 ( 1.00x)
sub_median_pred_rvb_b: 881.8 ( 1.51x)
After:
sub_median_pred_c: 1133.1 ( 1.00x)
sub_median_pred_rvb_b: 875.7 ( 1.29x)
This reduces the minimum instruction emission for mid_pred()
(i.e. median of 3) down to:
- 3 comparisons and 4 conditional moves, or
- 4 min/max.
With that the compiler can eliminate any branch. This optimal
situation is attainable with Clang 21 on Arm64, RVA22 and x86,
with GCC 15 on Arm64 and x86 (RVA22 goes from 2 to 1 branch).
These optimisations also work on Arm32 and LoongArch.
The same algorithm is already implemented via inline assembler for some
architectures such as x86 and Arm32, but notably not Arm64 and RVA22.
Besides, using C code allows the compiler to schedule instructions
properly.
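
For illustration, a median of 3 built from exactly 4 min/max operations can
be written as below. This is a minimal sketch and not necessarily the exact
form used in the patch; the demo_ names are made up for this example.

```c
static inline int demo_min(int a, int b) { return a < b ? a : b; }
static inline int demo_max(int a, int b) { return a > b ? a : b; }

/* Median of three values using 4 min/max operations and no branch in the
 * source: clamp c into the range [min(a, b), max(a, b)]. */
static inline int demo_mid_pred(int a, int b, int c)
{
    int lo = demo_min(a, b);
    int hi = demo_max(a, b);

    return demo_max(lo, demo_min(hi, c));
}
```

Whether the ternaries actually become min/max or conditional-move
instructions depends on the compiler and the target, as described above.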
Even on architectures with neither conditional moves nor min/max, this
leads to a visible performance improvement for C code, as seen here for
RVA20 code running on SiFive-U74:
Before:
sub_median_pred_c: 1657.5 ( 1.00x)
sub_median_pred_rvb_b: 875.9 ( 1.89x)
After:
sub_median_pred_c: 1331.9 ( 1.00x)
sub_median_pred_rvb_b: 881.8 ( 1.51x)
Note that this commit leaves the x86 and Arm32 code intact so it has
no effect on those ISAs.
The VP8 encoder can be configured to drop frames, e.g. when a bitrate
overshoot is detected. At present the code responsible for
managing an internal fifo assumes that we will get one output frame for
each frame fed into the encoder. That is not the case if the encoder can
decide to drop frames.
Running:
ffmpeg -stream_loop 100 -i dash_video3.webm -c:v libvpx -b:v 50k
-drop-threshold 20 -screen-content-mode 2 output.webm
results in lots of warnings like:
[libvpx @ 0x563fd8aba100] Mismatching timestamps: libvpx 2187 queued
2185; this is a bug, please report it
[libvpx @ 0x563fd8aba100] Mismatching timestamps: libvpx 2189 queued
2186; this is a bug, please report it
followed by:
[vost#0:0/libvpx @ 0x563fd8ab9b40] [enc:libvpx @ 0x563fd8aba080] Error
submitting video frame to the encoder
[vost#0:0/libvpx @ 0x563fd8ab9b40] [enc:libvpx @ 0x563fd8aba080] Error
encoding a frame: No space left on device
[vost#0:0/libvpx @ 0x563fd8ab9b40] Task finished with error code: -28
(No space left on device)
[vost#0:0/libvpx @ 0x563fd8ab9b40] Terminating thread with return code
-28 (No space left on device)
The reason for the above error is that each dropped frame leaves an
extra item in the fifo, which eventually overflows.
The proposed fix is to keep popping elements from the fifo until the
one with the matching pts is found. A side effect of this change is that
the code no longer considers pts mismatch to be a bug.
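
The idea of the fix can be sketched with a small fixed-size queue of pts
values. DemoFifo and its functions are hypothetical and stand in for the
real fifo used by the wrapper; only the "pop until the pts matches" loop
mirrors the description above.

```c
#include <stdint.h>

#define DEMO_FIFO_CAP 8

/* Hypothetical ring buffer of queued pts values. */
typedef struct DemoFifo {
    int64_t pts[DEMO_FIFO_CAP];
    int     head;  /* index of the oldest entry */
    int     count; /* number of queued entries */
} DemoFifo;

/* Returns 0 on success, -1 if the queue is full (the overflow that the
 * leftover entries of dropped frames eventually caused). */
static int demo_fifo_push(DemoFifo *f, int64_t pts)
{
    if (f->count == DEMO_FIFO_CAP)
        return -1;
    f->pts[(f->head + f->count) % DEMO_FIFO_CAP] = pts;
    f->count++;
    return 0;
}

/* Pops entries until one with the given pts is found. Returns the number
 * of entries skipped on the way (frames the encoder dropped), or -1 if
 * the queue ran empty without a match. */
static int demo_fifo_pop_until(DemoFifo *f, int64_t pts)
{
    int skipped = 0;

    while (f->count > 0) {
        int64_t queued = f->pts[f->head];

        f->head = (f->head + 1) % DEMO_FIFO_CAP;
        f->count--;
        if (queued == pts)
            return skipped;
        skipped++;
    }
    return -1;
}
```

Skipped entries are simply discarded here, which matches the note that a
pts mismatch is no longer treated as a bug.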
This has likely regressed around 5bda4ec6c3,
when the fifo started to be used universally.
Signed-off-by: Dariusz Marcinkiewicz <darekm@google.com>
JPEG-XS streams can have the bytes corresponding to certain markers as part of
slice data, and no considerations were made for it, so we need to add checks
for false positives.
This fixes assembling several samples.
Signed-off-by: James Almer <jamrial@gmail.com>
The code assumed that the destination buffer was zeroed, a misbehaviour
with which checkasm is bug-compatible as it zeroes the destination
buffer. The fixed code is even faster:
SpacemiT X60:
sub_left_predict_c: 51792.5 ( 1.00x)
sub_left_predict_rvv_i32: 3504.4 (14.78x)
The shader needs ~3 loads per DCT coeff.
This data was not observed to be stored efficiently
in the upper cache levels; loading it explicitly into
shared memory fixes that.
Also reduce code size by moving the bitstream
initialization outside of the switch/case.
(Like the old code, the new code limits the number of threads to 64,
even when the user explicitly set a higher thread count. I don't know
whether it is intentional to apply this limit even when the user
explicitly supplied the number of threads.)
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The pixel format has already been checked generically.
This also fixes the bug that the earlier code ignored
the return value of set_pix_fmt().
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
sub_median_pred_mmxext() calculates a predictor from the left, top
and topleft pixel values. The topleft values need to be initialized
differently for the first loop iteration than for the others
in order to avoid reading ptr[-1]. So it has been initialized before
the loop and then read again at the end of the loop, so that the last
value read was never used. Yet this can lead to reads beyond the end
of the buffer, e.g. with
ffmpeg -cpuflags mmx+mmxext -f lavfi -i "color=size=64x4,format=yuv420p" \
-vf vflip -c:v ffvhuff -pred median -frames 1 -f null -
Fix this by not reading the value at the end of the loop.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
sub_median_pred_mmxext() calculates a predictor from the left, top
and topleft pixel values. The left value is simply read via
ptr[-1], although this is not guaranteed to be inside the buffer
in case of negative strides. This happens e.g. with
ffmpeg -i fate-suite/mpeg2/dvd_single_frame.vob -vf vflip \
-c:v magicyuv -pred median -f null -
Fix this by reading the first value like the topleft value.
Also change the documentation of sub_median_pred to reflect this
change (and the one from 791b5954bc).
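
For reference, the intended behaviour can be sketched in scalar C. The
function below is an illustrative sketch, not the assembly from this patch
and not a verbatim copy of the C reference; the demo_ names are made up.
The point is that the left and topleft neighbours of the first pixel come
from *left and *left_top, so neither src1[-1] nor src2[-1] is ever read.

```c
#include <stddef.h>
#include <stdint.h>

static int demo_median3(int a, int b, int c)
{
    int lo = a < b ? a : b;
    int hi = a < b ? b : a;
    int m  = hi < c ? hi : c;

    return lo > m ? lo : m;
}

/* src1 is the row above, src2 the current row. The initial left and
 * topleft values are passed in and the final ones are passed back, so
 * the function only touches indices 0 .. w - 1 of both rows. */
static void demo_sub_median_pred(uint8_t *dst, const uint8_t *src1,
                                 const uint8_t *src2, ptrdiff_t w,
                                 int *left, int *left_top)
{
    uint8_t l  = (uint8_t)*left;
    uint8_t lt = (uint8_t)*left_top;

    for (ptrdiff_t i = 0; i < w; i++) {
        int pred = demo_median3(l, src1[i], (l + src1[i] - lt) & 0xFF);

        lt     = src1[i];
        l      = src2[i];
        dst[i] = (uint8_t)(l - pred);
    }

    *left     = l;
    *left_top = lt;
}
```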
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>