Commit graph

124606 commits

Author SHA1 Message Date
Daniil Cherednik
816d74e0bb avcodec/opus/enc: fix CELT psy analysis and packet sizing
The CELT psychoacoustic path was effectively broken: analysis could use
the wrong queued audio and stale scratch samples, and raw band scores were
folded into the frame bit budget, where they could overflow instead of
only driving alloc_boost.

On top of that, c3aea7628c changed avctx->frame_size from fixed
120-sample steps to a configuration-derived value, while the CELT input
and psy paths still treated queue entries as 120-sample steps. That could
misalign psy analysis, read before a short overlap frame, stall silent
flushes, poison rate control with zero-bit silent frames, and overrun the
range coder on EOF or short tails.

This commit fixes these cases by using avctx->frame_size for psy step
accounting, aligning bufqueue analysis with actual audio, padding short
overlaps, and avoiding invalid bit-budget updates for silent or EOF
packets. This lets CELT produce valid packets again.
2026-05-23 10:38:28 +00:00
Niklas Haas
aa08cf8112 swscale/options: add missing option value for SWS_STRICT
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 11:31:54 +02:00
Niklas Haas
03dfac5630 fftools/ffmpeg_sched: allow throttling decoder outputs
This is a departure from the conventional idea of decoders always outputting
data as fast as possible. Instead, this allows decoders to be throttled in the
same way filter graphs can be.

This comes into play when e.g. a demuxer is feeding into two decoders, but
only one of the two decoders is actually currently needed (e.g. due to
A/V misalignment). In that case, what typically happens is that the unneeded
decoder alse decodes all frames, and then piles them up on the "buffersrc"
filter's downstream link (growing indefinitely).

Another issue this solves manifests when e.g. a single demuxer is feeding many
decoders that all try to feed frames to the same filter graph. In this case,
all decoders run as fast as posssible, leading to lock contention on the
filter graph input queue; resulting in (again) many frames piling up on the
buffersrc (or downstream filters) for the unneeded inputs that are not actually
the bottleneck, while the input that's actually undersatisfied can end up
starved for CPU time, possibly for long enough to exhaust memory limits. The
normal rate limiting fails to apply in this scenario because all decoders share
a single demuxer, and are hence rate-limited only by the demuxer speed; whereas
the demuxer is not choked because from the PoV of the scheduler, the filter
graph is simply not getting enough frames.

In a more general sense, there's a philosophical argument to be made here.
Since a decoder is typically also a decompressor, it produces more data than
it consumes. So, it a sense, it's acting like a type of producer also - in
the same way that a filter graph can produce more input that outputs.

Solve all of these issues by allowing decoders to be output-choked, which
gives the scheduler control over when decoders are allowed to output frames.
This does mean we have to add some sort of internal packet queue, because the
decoder thread may need to continue *accepting* upstream packets from the
demuxer (or else we risk stalling the demuxer), but defer the actual decoding
by placing them inside an internal "overflow" queue.

This effectively simulates a sort of "filter graph"-type semantics but
for the decoder queue.

This overflow logic is fairly self-contained inside `sch_dec_receive`, though
it is quite nontrivial. I have added as much documentation as is hopefully
needed to understand the logic.

Importantly, we cannot simply unlimit the decoder input thread queue because
the demuxer relies on backpressure from the decoder to rate limit itself. (Note
that demuxers may only be active if there is at least one downstream decoder
that is alse active, so we always have at least one decoder providing
backpressure)

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Niklas Haas
2b72d5243c fftools/ffmpeg_sched: drain incoming frames before blocking filters
When a filter is choked, but upstream threads are trying to write to its input,
this can result in the filter's input queue getting stuck. Normally, the
unchoke_downstream() logic would prevent this from happening, since the
filter would itself get unchoked as a result of upstream decoders receiving
pressure from the demuxer.

However, upcoming changes to this logic will require weakening this upstream
unchoking logic, so preventing the deadlock in a more elegant way helps with
making the code more robust.

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Niklas Haas
95391352b5 fftools/thread_queue: add THREAD_QUEUE_FLAG_NO_BLOCK
Exactly what it says on the tin. There is some ambiguity as to whether this
should also prevent reading from *choked*, as opposed to empty queue, but
I think it makes sense to consider them equivalent, as I struggle to think
of a use case where it would be beneficial to allow draining a queue that
was explicitly choked by the upstream (to e.g. prevent further reads).

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Niklas Haas
321b0e36a3 fftools/thread_queue: add flags parameter to tq_receive()
I want to use this to allow a non-blocking use of this function.

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Marton Balint
f48e5d4db4 fftools/ffmpeg_filter: rate control all filter graphs
It was never reliable to detect if a filtergraph have sources, because a filter
can act as a source only after some time, for example the loop filter.

So it is better to remove the source detection entirely and always give the
scheduler an oppurtunity to stop processing.

Fixes ticket #11604.

Signed-off-by: Marton Balint <cus@passwd.hu>
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Niklas Haas
6a563dab71 fftools/ffmpeg_sched: allow choosing nodes to unchoke
This level of granularity will help for the upcoming patch.

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Niklas Haas
04888287b3 fftools/ffmpeg_sched: fix sch_stop() and schedule_update_locked() race
schedule_update_locked() is supposed to be a no-op when `sch->terminate`
is 1. However, there is a TOCTOU error here, where a different thread may
currently be executing schedule_update_locked(), having successfully passed
the sch->terminate check but without actually updating the choke status.

This does not matter for the current code, but will matter with the following
commit, where it creates the theoretical possibility of a race where sch_stop()
is trying to choke the demuxers (and unchoke the decoders) while
schedule_update_locked() is simultaneously trying to choke the decoders,
leading to a deadlock if the last decoder is left choked and unable to
propagate EOF downstream.

The cleanest solution is to just take the scheduler lock while updating the
choke status here. This ensures that any other schedule_update_locked() calls
will have completed.

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Niklas Haas
0d123a3c23 fftools/ffmpeg_sched: use macros for schedule_update_locked() loops
Instead of awkwardly looping over the type, just split this up into
multiple loops. The loss in complexity seems worth the loss in conciseness
to me, and more importantly, this allows us to easily add more waiter types.

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00
Michael Niedermayer
4492ad7228 avformat/ty: check rec_size
Fixes: ada-4-poc.ty

change is based on the suggested fix

Found-by: Claude and Ada Logics. This issue was found by Anthropic from using agents to study security of open source projects
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-23 00:04:46 +00:00
James Almer
3a042a5ab8 avcodec/libfdk-aacdec: support streams with more than six channels
Before this change, the decoder was forcing downmixing everything to a max of
six channels.
Layouts 6.1(back), 7.1(wide), 7.1 and 5.1.2 (Channel Configurations 11, 7, 12,
and 14 respectively, as well as the equivalent PCE version) should be supported
now.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-22 23:10:05 +00:00
James Almer
7755d264bd avcodec/libfdk-aacdec: fix the check for downmix layout order
The code reads a mask afer this check, meaning it expects NATIVE order, not
others.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-22 23:10:05 +00:00
Zhao Zhili
268c37fdae avformat/iamf_writer: fix stream id log argument
Signed-off-by: Zhao Zhili <quinkblack@foxmail.com>
2026-05-23 02:41:31 +08:00
Lynne
5b38e6eafb
apv_decode: add update_thread_context callback
An earliear commit added hwaccel hooks and a pix_fmt internal state
but did not synchronize its state when using frame threading with
a hwaccel.
2026-05-22 23:06:42 +09:00
Lynne
dd0ac641a5
vulkan: properly check if the internally synchronized queue extension is supported
libplacebo enables the extension string but does not enable the extension flag
if it's unsupported.
2026-05-22 23:06:38 +09:00
Dale Curtis
6e0e13b0bf avformat/ogg: Fix overflow and stale oggvorbis_private values
- Prevent integer overflow when summing header lengths; add bounds check.
- Re-initialize priv->vp with the new stream's extradata once all chained
  stream headers are collected.

Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
2026-05-22 13:27:18 +00:00
Romain Beauxis
cd02463dc6 lavf: wire id3v2 fdebug option
Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-05-22 08:20:13 -05:00
Romain Beauxis
274bc5f9ae avformat: deprecate FF_FDEBUG_TS in favor of AV_FDEBUG_TS, add AV_FDEBUG_ID3V2
Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-05-22 08:20:13 -05:00
DROOdotFOO
2e142e52ae swscale/aarch64: add NEON yuv->rgb16 fast paths
Add NEON unscaled converters for {yuv420p, yuv422p, yuva420p, nv12, nv21}
to {rgb565le, bgr565le, rgb555le, bgr555le}.

The 16bpp packing uses v8/v9 as the output accumulator. Since AAPCS-64
requires d8-d15 to be callee-saved, declare_func now wraps a
stp d8, d9 / ldp d8, d9 around 16bpp paths only (gated by .ifc on the
output format). Pattern matches libswscale/aarch64/hscale.S.

yuva420p -> 16bpp drops alpha and routes through the yuv420p wrappers,
mirroring how yuva420p -> rgb24/bgr24 already work in tree.

Speedup vs C at width=1920 on Apple M1 (checkasm --bench):

  | input    | rgb565le | bgr565le | rgb555le | bgr555le |
  |----------|----------|----------|----------|----------|
  | yuv420p  | 3.69x    | 3.68x    | 3.28x    | 3.31x    |
  | yuv422p  | 4.70x    | 4.70x    | 4.32x    | 4.35x    |
  | yuva420p | 3.67x    | 3.66x    | 3.32x    | 3.27x    |

NEON cycles are ~48 for planar and ~50.5 for semi-planar across all
four outputs. yuv422p shows the biggest speedup because its C
reference is the most expensive. 555 ratios trail 565 because the C
reference is faster for 555 (one fewer mask bit); NEON cycles are the
same. nv12/nv21 are bench-only (see the preceding checkasm commit) and
run at the same ~50.5 cycles.

This only handles the little endian forms of the 16 bit RGB formats.

Verified with checkasm --test=sw_yuv2rgb (110/110) and the full
checkasm regression (7657/7657) on Apple M1.

Signed-off-by: DROOdotFOO <drew@axol.io>
2026-05-22 10:03:07 +00:00
DROOdotFOO
34501921fd tests/checkasm/sw_yuv2rgb: cover nv12 and nv21
The previous chroma stride formula (width >> log2_chroma_w) is correct
for planar yuv but wrong for semi-planar nv12/nv21, where the UV plane
is interleaved at width bytes per row (width/2 UV pairs of 2 bytes
each). Use av_image_get_linesize() so the test feeds a valid stride to
libswscale regardless of input format; for the existing planar suites
the value is unchanged.

With the stride fixed, add nv12 and nv21 to check_yuv2rgb() so the
upcoming NEON 16bpp paths get bench coverage. ff_get_unscaled_swscale
does not wire a C yuv2rgb fast path for these inputs, so the suites
report bench-only (no correctness reference); they still run clobber
detection and cycle counts.

Signed-off-by: DROOdotFOO <drew@axol.io>
2026-05-22 10:03:07 +00:00
Marvin Scholz
61a0b8fb41 avcodec: remove CELT decoder
libcelt, which it depends on, was not updated in a very long time and is
considered deprecated, as Opus exists which has a CELT mode. Therefore
remove standalone CELT decoding support.

It was already broken since b8604a9761,
11 years ago, and no one noticed and complained.
2026-05-22 09:33:52 +00:00
Lynne
f17c8db820
swscale/vulkan: add a non-bitexact version of OP_LINEAR
Uses matrix*vector + vector multiplication.

Sponsored-by: Sovereign Tech Fund
2026-05-22 15:27:08 +09:00
Lynne
6d57426b6a
swscale/vulkan: create a constant matrix from linear op constants
Sponsored-by: Sovereign Tech Fund
2026-05-22 15:27:07 +09:00
Lynne
2423a719e0
swscale/vulkan: put entire linear matrix+vector as constant data
Rather than only using what we need.
The driver will remove any unused constants.

Sponsored-by: Sovereign Tech Fund
2026-05-22 15:27:07 +09:00
Lynne
198991372c
swscale/vulkan: move linear op handling to a separate function
Sponsored-by: Sovereign Tech Fund
2026-05-22 15:27:03 +09:00
Lynne
437aa4d208
lavfi/blackdetect_vulkan: use integer timestamps and fix crash 2026-05-22 14:06:03 +09:00
Lynne
a189413832
vf_blackdetect_vulkan: port to compile-time SPIR-V generation 2026-05-22 14:06:03 +09:00
Lynne
2e25da3121
vf_nlmeans_vulkan: port to compile-time SPIR-V generation 2026-05-22 14:06:01 +09:00
Lynne
c40ac0f03a
swscale/vulkan: add support for filtering on SWS_OP_READ
Sponsored-by: Sovereign Tech Fund
2026-05-22 14:05:27 +09:00
Lynne
448e08aa80
swscale/vulkan: allocate buffers for scaling filters
Simply allocates buffers to hold filter data.

Sponsored-by: Sovereign Tech Fund
2026-05-22 14:05:26 +09:00
Lynne
b7ccdaa018
swscale/vulkan: make buffer descriptor generation generic
Again, simple rename.

Sponsored-by: Sovereign Tech Fund
2026-05-22 14:05:26 +09:00
Lynne
d0af60afa8
swscale/vulkan: make dither buffer allocation path generic
Just a simple rename.

Sponsored-by: Sovereign Tech Fund
2026-05-22 14:05:26 +09:00
Lynne
c8ddaa97db
swscale/vulkan: base dispatch size on output image size, rather than input
Sponsored-by: Sovereign Tech Fund
2026-05-22 14:05:26 +09:00
Lynne
51d4406e07
swscale/graph: support allocating hardware intermediate frames
Sponsored-by: Sovereign Tech Fund
2026-05-22 14:05:21 +09:00
James Almer
b67570b765 avformat/mov_chan: don't share a function that's only used where it's defined
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-21 18:04:02 -03:00
Franciszek Kalinowski
99e1ecca36 avformat/icecast: reject CR/LF in metadata header values
Reported by Franciszek Kalinowski (isec.pl / striga.ai) and Bartosz Smigielski.
2026-05-21 18:05:39 +00:00
Franciszek Kalinowski
87439ed619 avformat/oggparsecelt: bound extra_headers to avoid an effectively infinite loop
celt_header() reads a uint32 `extra_headers` field from the CELT identification
header and stores `1 + extra_headers` into the signed int extra_headers_left.
With extra_headers = 0x7FFFFFFE this becomes INT_MAX and the OGG parser
consumes every subsequent page as a CELT "extra header" without ever reaching
audio data, hanging on any streaming input. A value of 0xFFFFFFFE wraps the
signed addition negative, with the same family of consequences.

Reject any extra_headers count above a small fixed cap (16, well above any
real CELT-over-Ogg stream).

Verified with the audit PoC (a crafted file plus an infinite-page FIFO):
without the patch, ffmpeg consumes pages forever; with the patch it logs
"Too many CELT extra headers (...)" and exits in ~70 ms with
AVERROR_INVALIDDATA.

Reported by Franciszek Kalinowski (isec.pl / striga.ai) and Bartosz Smigielski.
2026-05-21 14:19:54 +00:00
Franciszek Kalinowski
0857141823 avformat/rtpenc_xiph: bail out when the max payload size underflows
Reported by Franciszek Kalinowski (isec.pl / striga.ai) and Bartosz Smigielski.
2026-05-21 01:15:30 +00:00
Franciszek Kalinowski
a44cae9025 avformat/rtpenc_aac: reject packets smaller than the ADTS header
When extradata_size == 0, ff_rtp_send_aac() does `size -= 7` to skip the
ADTS header without checking size >= 7. A short packet makes size negative,
and the value is later passed to memcpy() as size_t, reading past the buffer
end. Bail out instead.

The vulnerable branch is not reached when using the built-in AAC encoder
(which always emits extradata), but an application that feeds raw
ADTS-stripped AAC packets through the libavformat RTP muxer can hit it. The
fix is a one-line lower-bound check and compiles/runs cleanly; see audit
PoC for the static analysis and reachable-by-API write-up.

Reported by Franciszek Kalinowski (isec.pl / striga.ai) and Bartosz Smigielski.
2026-05-21 01:09:37 +00:00
Michael Niedermayer
59367afc3d avcodec/jpeg2000dec: compute mask in decode_clnpass() like in decode_sigpass()
Fixes: integer overflow

Found by: Jiale Yao
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-21 01:08:30 +00:00
Niklas Haas
d94c293e62 swscale/ops_dispatch: prevent float over-read when horizontal filtering
The code made the fundamental assumption that over-read into the padding
bytes is okay to do; because the most that can happen is that those pixel
values end up corrupted, which doesn't affect any adjacent pixels.

However, this is not true for SWS_OP_FILTER_H, because this operation
fundamentally mixes together horizontal pixels. Normally, this was fine,
because the filter weights for those pixels are set to 0, and 0 * x = 0.

However, that is not true for floating point inputs, which can contain
Infinity; and 0 * Infinity = NaN, thus corrupting the entire pixel.

Solve it by specifically preventing over-read when it would be unsafe.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-20 21:45:28 +00:00
Niklas Haas
6bc0f9517c swscale/ops_dispatch: rename filter_size to filter_size_h
Since this is not set for vertical filters.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-20 21:45:28 +00:00
James Almer
79e30c18e6 avcodec/aacdec_tab: add side channels to ff_aac_channel_map
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-20 13:05:57 -03:00
James Almer
b2ba26a694 avformat/adtsenc: don't accept channeConfiguration values that can't be coded
the ADTS header only has 3 bits for it, so anything higher than 7 can't be
used.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-20 10:59:02 -03:00
James Almer
2bad226f89 avcodec/aacenc: add missing default Channel Configurations
And stop writing 7.1 as 7.1(wide) (channel conf 7). Lets not create any more
non-spec compliant files that the native decoder needs to work around with now
that we can use PCE configuration for it, getting rid of the ambiguity.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-20 10:59:02 -03:00
James Almer
c92c6cbf19 avcodec/aacenc: fix PCE configs for known native layouts
Many of the entries were downright wrong, like mistagging LFE elements as
SCE, as well as trying to match the native channel ordering in the PCE
by placing CPE elements before SCE ones in some cases (like with FRONT
elements), which is not spec compliant and results in unparseable streams.

Remove the three layouts that define top channels. It's not clear how they
should be signaled in PCE.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-20 10:59:01 -03:00
Damiano Galassi
1aefacfda4 avformat/movenc: fix mov_create_dvd_sub_decoder_specific_info
It has been broken since 3064487.
2026-05-20 13:53:37 +00:00
Tim Walker
9e71ea2d60 avformat/movenc: write stsd v2 atom when nb_channels > 2
Required by the QuickTime File Format specification.
2026-05-20 12:31:15 +00:00
Tim Walker
6181195d1f avformat/movenc: use sample_rate to determine audio stsd version.
timescale should be equal sample_rate, but the latter is
the actual source field used when writing the stsd atom
since 3b5d4428ac
2026-05-20 12:31:15 +00:00