52905 commits

Author SHA1 Message Date
Andreas Rheinhardt
31f0749cd4 avcodec/vp3: Optimize alignment check away when possible
Check only on arches that need said check.

(Btw: I do not see how h_loop_filter benefits from alignment
at all and why h_loop_filter_unaligned exists.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-13 18:59:49 +02:00
Andreas Rheinhardt
5823ab347a avcodec/vp3dsp: Remove unused flags parameter from ff_vp3dsp_init()
No longer necessary now that the x86 loop filter functions are
bitexact.

Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-13 18:59:24 +02:00
Andreas Rheinhardt
e3ca57ae8f avcodec/x86/vp3dsp: Port loop filters to SSE2
The old code operated on bytes and did lots of tricks
due to their limited range; it did not completely succeed,
which is why the old versions were not used when bitexact
output was requested.

In contrast, the new version is much simpler: It operates
on signed 16 bit words whose range is more than sufficient.
This means that these functions don't need a check for bitexactness
(and can be used in FATE).

Old benchmarks (for this, the AV_CODEC_FLAG_BITEXACT check has been
removed from checkasm):
h_loop_filter_c:                                        29.8 ( 1.00x)
h_loop_filter_mmxext:                                   32.2 ( 0.93x)
h_loop_filter_unaligned_c:                              29.9 ( 1.00x)
h_loop_filter_unaligned_mmxext:                         31.4 ( 0.95x)
v_loop_filter_c:                                        39.3 ( 1.00x)
v_loop_filter_mmxext:                                   14.2 ( 2.78x)
v_loop_filter_unaligned_c:                              38.9 ( 1.00x)
v_loop_filter_unaligned_mmxext:                         14.3 ( 2.72x)

New benchmarks:
h_loop_filter_c:                                        29.2 ( 1.00x)
h_loop_filter_sse2:                                     28.6 ( 1.02x)
h_loop_filter_unaligned_c:                              29.0 ( 1.00x)
h_loop_filter_unaligned_sse2:                           26.9 ( 1.08x)
v_loop_filter_c:                                        38.3 ( 1.00x)
v_loop_filter_sse2:                                     11.0 ( 3.47x)
v_loop_filter_unaligned_c:                              35.5 ( 1.00x)
v_loop_filter_unaligned_sse2:                           11.2 ( 3.18x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-13 18:58:50 +02:00
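The range argument in e3ca57ae8f can be checked with a small scalar sketch. It assumes the raw filter value has the form (p[-2] - p[1]) + 3 * (p[0] - p[-1]), which the commit message does not state; the function names are hypothetical, and the rounding and bounding-value lookup that follow in a real filter are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: raw loop-filter value for four neighbouring pixels
 * (assumed formula), computed wide and returned as a signed 16-bit word. */
static int16_t raw_filter_value(uint8_t p_m2, uint8_t p_m1,
                                uint8_t p0, uint8_t p1)
{
    int wide = (p_m2 - p1) + 3 * (p0 - p_m1);
    /* The value must be representable in a signed 16-bit lane. */
    assert(wide >= INT16_MIN && wide <= INT16_MAX);
    return (int16_t)wide;
}

/* Extremes of the raw value over all 8-bit inputs. */
static int16_t raw_filter_min(void) { return raw_filter_value(0, 255, 0, 255); }
static int16_t raw_filter_max(void) { return raw_filter_value(255, 0, 255, 0); }
```

Under that assumption the raw value spans -1020..1020 for 8-bit pixels: too wide for a byte, but far inside the range of a signed 16-bit word.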
Tong Wu
10e9672a8c avcodec/d3d12va_encode: use macros to set QP range and max frame size
Signed-off-by: Tong Wu <wutong1208@outlook.com>
2025-10-12 01:50:57 +00:00
Andreas Rheinhardt
36f92206bb avcodec/x86/hpeldsp: Improve ff_{avg,put}_pixels8_xy2_ssse3()
This SSSE3 function uses MMX registers (of course without emms
at the end) and processes eight bytes of input by unpacking
it into two MMX registers. This is very suboptimal given
that one can just use XMM registers to process eight words.
This commit switches them to using XMM registers.

Old benchmarks:
avg_pixels_tab[1][3]_c:                                114.5 ( 1.00x)
avg_pixels_tab[1][3]_ssse3:                             43.6 ( 2.62x)
put_pixels_tab[1][3]_c:                                 83.6 ( 1.00x)
put_pixels_tab[1][3]_ssse3:                             34.0 ( 2.46x)

New benchmarks:
avg_pixels_tab[1][3]_c:                                115.3 ( 1.00x)
avg_pixels_tab[1][3]_ssse3:                             24.6 ( 4.69x)
put_pixels_tab[1][3]_c:                                 83.8 ( 1.00x)
put_pixels_tab[1][3]_ssse3:                             19.7 ( 4.24x)

Reviewed-by: Kieran Kunhya <kieran@kunhya.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-12 02:45:37 +02:00
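As a scalar illustration of what the xy2 functions in 36f92206bb compute, the following sketch averages each 2x2 neighbourhood using a 16-bit intermediate, which is the word size the commit says the XMM version works on. The function name and the `rnd` parameter are hypothetical; the rounding constants (2 for the rounding variant, 1 for no_rnd) are an assumption about the usual half-pel definition, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of an 8-pixel-wide xy2 half-pel "put".
 * Reads h + 1 rows of 9 pixels from src and writes h rows of 8 to dst. */
static void put_pixels8_xy2_model(uint8_t *dst, const uint8_t *src,
                                  ptrdiff_t stride, int h, int rnd)
{
    assert(rnd == 1 || rnd == 2);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 8; x++) {
            /* Four bytes plus the rounding term sum to at most
             * 4 * 255 + 2 = 1022, so a 16-bit word is sufficient. */
            uint16_t sum = (uint16_t)(src[x] + src[x + 1] +
                                      src[x + stride] + src[x + stride + 1] +
                                      rnd);
            dst[x] = (uint8_t)(sum >> 2);
        }
        src += stride;
        dst += stride;
    }
}
```

Eight such 16-bit sums fill one 128-bit register, which is why a single XMM register can replace the pair of MMX registers mentioned above.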
Andreas Rheinhardt
4c55724da8 avcodec/x86/hpeldsp: Add ff_put_no_rnd_pixels8_xy2_ssse3()
Given that one has to deal with 16 byte intermediates it is
unsurprising that SSE2 wins against MMX; the MMX version has
therefore been removed (as well as the now unused inline_asm.h).
The new function is even 32B smaller than the old MMX one.

Old benchmarks:
put_no_rnd_pixels_tab[1][3]_c:                          84.1 ( 1.00x)
put_no_rnd_pixels_tab[1][3]_mmx:                        41.1 ( 2.05x)

New benchmarks:
put_no_rnd_pixels_tab[1][3]_c:                          84.0 ( 1.00x)
put_no_rnd_pixels_tab[1][3]_ssse3:                      22.1 ( 3.80x)

Reviewed-by: Kieran Kunhya <kieran@kunhya.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-12 02:45:25 +02:00
Andreas Rheinhardt
f84e06026a avcodec/x86/hpeldsp: Add SSE2 of {avg,put} no_rnd xy2 with blocksize 16
Also remove the now superseded MMX versions (the new functions have the
exact same codesize as the removed ones).

Old benchmarks:
avg_no_rnd_pixels_tab[0][3]_c:                         233.7 ( 1.00x)
avg_no_rnd_pixels_tab[0][3]_mmx:                       121.5 ( 1.92x)
put_no_rnd_pixels_tab[0][3]_c:                         171.4 ( 1.00x)
put_no_rnd_pixels_tab[0][3]_mmx:                        82.6 ( 2.08x)

New benchmarks:
avg_no_rnd_pixels_tab[0][3]_c:                         233.3 ( 1.00x)
avg_no_rnd_pixels_tab[0][3]_sse2:                       45.0 ( 5.18x)
put_no_rnd_pixels_tab[0][3]_c:                         172.1 ( 1.00x)
put_no_rnd_pixels_tab[0][3]_sse2:                       40.9 ( 4.21x)

Reviewed-by: Kieran Kunhya <kieran@kunhya.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-12 02:43:29 +02:00
Andreas Rheinhardt
ce9d181444 avcodec/mjpegdec: Remove unnecessary reloads
Hint: The parts of this patch in decode_block_progressive()
and decode_block_refinement() rely on the fact that GET_VLC
returns -1 on error, so that it enters the codepaths for
actually coded block coefficients.

Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-11 08:20:42 +02:00
Andreas Rheinhardt
dad06a445f avcodec/Makefile: Remove h263 decoder->mpeg4videodec.o dependency
Also prefer using #if CONFIG_MPEG4_DECODER checks in order not
to rely on DCE.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-11 07:51:01 +02:00
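The difference between a preprocessor check and relying on dead-code elimination (DCE), as mentioned in dad06a445f, can be sketched as follows. The fallback macro definition and both function names are hypothetical stand-ins; in a real build the macro would come from the generated configuration header.

```c
#include <assert.h>

/* Hypothetical stand-in for a configure-generated macro (0 or 1). */
#ifndef CONFIG_MPEG4_DECODER
#define CONFIG_MPEG4_DECODER 0
#endif

#if CONFIG_MPEG4_DECODER
/* Only compiled when the decoder is enabled. */
static int mpeg4_only_step(int x) { return x + 1; }
#endif

static int decode_step(int x)
{
    assert(CONFIG_MPEG4_DECODER == 0 || CONFIG_MPEG4_DECODER == 1);
#if CONFIG_MPEG4_DECODER
    /* Preprocessor check: in a disabled build the call is removed before
     * compilation, so no reference to mpeg4_only_step() can remain. */
    x = mpeg4_only_step(x);
#endif
    /* By contrast, `if (CONFIG_MPEG4_DECODER) x = mpeg4_only_step(x);`
     * needs the function to be declared and depends on the compiler
     * dropping the dead call, which the language does not guarantee. */
    return x * 2;
}
```

With the `#if` form, the object file that would define the MPEG-4-only function does not have to be built or linked at all, which is what allows the Makefile dependency to go away.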
Andreas Rheinhardt
10d3479da0 avcodec/h263dec: Avoid redundant branch
Only the MPEG-4 decoder can have partitioned frames here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-11 07:51:01 +02:00
Andreas Rheinhardt
d96f8d32ad avcodec/x86/h264_qpel: Don't instantiate unused functions
The v_lowpass wrappers (which are instantiated by this macro)
are only used in the put (and not the avg) form for SSSE3
(the avg form is only used for mc02, which doesn't exist
for SSSE3). Clang warns about the unused functions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-10 16:27:57 +02:00
Leo Izen
eab3b68237 avcodec/exif: avoid printing errors for makernote non-IFD parsing
When we parse a MakerNote, we first try to parse it as an IFD and if
that fails, we try to re-parse it as a binary blob. This is because
MakerNote is not well-documented in its nature.

However, if we fail to parse it the first time, we should not av_log
error messages about the parse failure, so instead we log these as
AV_LOG_DEBUG.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Ramiro Polla <ramiro.polla@gmail.com>
2025-10-09 12:40:41 -04:00
James Almer
41c168444e avcodec/hevc/sei: don't attempt to use stale values in HEVCSEITimeCode
Invalidate the whole struct on SEI reset.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-10-09 12:09:35 -03:00
James Almer
8e01bff774 avcodec/hevc/sei: don't attempt to use stale values in HEVCSEITDRDI
Invalidate the whole struct on SEI reset.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-10-09 12:09:35 -03:00
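"Invalidate the whole struct on SEI reset" in the two commits above amounts to clearing every field rather than only a presence flag. A minimal sketch, using a hypothetical struct that is not the real HEVCSEITimeCode or HEVCSEITDRDI layout:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a parsed SEI message. */
typedef struct ExampleSEI {
    int present;
    int num_entries;
    int values[4];
} ExampleSEI;

/* Clear every field, so that a later message which fills only some of
 * them cannot expose values left over from an earlier message. */
static void example_sei_reset(ExampleSEI *sei)
{
    assert(sei);
    memset(sei, 0, sizeof(*sei));
}
```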
James Almer
d448d6d1a0 avcodec/hevc/sei: prevent storing a potentially bogus num_ref_displays value in HEVCSEITDRDI
Fixes: 439711052/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-4956250308935680
Fixes: out of array access

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: James Almer <jamrial@gmail.com>
2025-10-09 12:09:35 -03:00
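The pattern behind d448d6d1a0, validating a parsed count before it is stored where later code will trust it, can be sketched like this. The struct, the limit of 32 and the function name are all hypothetical; the actual limit and field layout are not given in the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_REF_DISPLAYS 32 /* hypothetical array capacity */

typedef struct ExampleTDRDI {
    int     num_ref_displays;
    uint8_t left_view_id[EXAMPLE_MAX_REF_DISPLAYS];
} ExampleTDRDI;

/* Returns 0 on success, -1 if the coded count does not fit the arrays. */
static int example_set_num_ref_displays(ExampleTDRDI *t, unsigned coded)
{
    assert(t);
    if (coded > EXAMPLE_MAX_REF_DISPLAYS)
        return -1;            /* reject: the stored count stays untouched */
    t->num_ref_displays = (int)coded;
    return 0;
}
```

Because the bogus value never reaches the context, code that later loops up to `num_ref_displays` cannot index past the arrays.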
Manuel Lauss
aa91ae25b8 avcodec/sanm: minor comment and size detection changes
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:02 +02:00
Manuel Lauss
c46c1cb0db avcodec/sanm: remove rotate_code context member
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:02 +02:00
Manuel Lauss
72df8f271d avcodec/sanm: implement 3 blits for codec37/47/48
The various game engines implement the following blit types, from the decoded
result to the main canvas:
- normal (opaque) blit (c37/c47/c48)
- masked blit (c37/c48)
- interpolated-frame blit (c48)
  Here an artificial frame is generated by looking up the pixels
  from both buffers and picking a color from the interpolation table
  for the artificial frame.
  This is only supported in the decoder of "Making Magic".

Implement and hook up these 3 schemes for each of the 3 compression types,
and switch codec20 to a call to the opaque blit function.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:02 +02:00
Manuel Lauss
d7d97ea32c avcodec/sanm: partially fix codec48 for Making Magic
Making Magic makes use of codec48 flag bit 0, which, when set,
means NOT to swap both buffers on even sequence numbers.

This fixes most of the artifacts in the Making Magic videos.
It's not complete though, bits 1 and 4 still need to be handled.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:02 +02:00
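The flag handling described in d7d97ea32c can be written as a small predicate. This sketch assumes that, without the flag, the two buffers are swapped on even sequence numbers (the commit message only implies this); the macro and function names are hypothetical.

```c
#include <assert.h>

#define C48_FLAG_NO_SWAP 0x01 /* hypothetical name for codec48 flag bit 0 */

/* Returns 1 if the two buffers should be swapped for this frame. */
static int c48_swap_buffers(unsigned seq, unsigned flags)
{
    int even_seq = (seq & 1) == 0;
    return even_seq && !(flags & C48_FLAG_NO_SWAP);
}
```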
Manuel Lauss
9e72b2f2d0 avcodec/sanm: codec37/47/48 updates
- align the incoming widths to 4(c37) / 8(c47/48) pixels. LucasArts
  game videos have these aligned.
- since these codecs use their 2/3 buffers for themselves, adjust the
  stride to the aligned width, keeping it even, which gets rid of
  an unaligned store in c48_4to8() found by the fuzzer with an
  odd stride.
- clear the whole diff buffer, not just the area described by w/h.
- adjust the RLE "decoded_size" to the product of the aligned width
  and reported height.

These changes are the result of various fuzzer-found issues; all my
test videos still work fine.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:02 +02:00
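The width alignment and the "decoded_size" computation listed in 9e72b2f2d0 can be sketched as below. The helper names are hypothetical, and the mapping of codec number to alignment (4 for codec37, 8 for codec47 and codec48) is read from the first bullet of the commit message.

```c
#include <assert.h>

/* Round x up to a multiple of a, where a is a power of two. */
static int align_up(int x, int a)
{
    assert(a > 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* codec37 aligns to 4 pixels, codec47 and codec48 to 8 (assumed mapping). */
static int aligned_width(int codec, int width)
{
    return align_up(width, codec == 37 ? 4 : 8);
}

/* RLE "decoded_size": aligned width times the reported height. */
static long decoded_size(int codec, int width, int height)
{
    return (long)aligned_width(codec, width) * height;
}
```

Since both alignments are even, a stride equal to the aligned width is automatically even as well, which matches the second bullet.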
Manuel Lauss
0eb58e40cb avcodec/sanm: codec20 left/top offset support
Add left/top offsets and clipping to codec20 (raw images),
use it for the copying of codec37/47/48 images to main buffer.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
a108be2ba3 avcodec/sanm: for ANIM codecs with own buffers, really check dimensions
Codec37/47/48 have their own buffers; left/top are applied after
the decoding is done when copying to the main buffer.  Don't add left/top
to their width/height when doing checks against the established buffer sizes.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
b6a9c4671a avcodec/sanm: reimplement XPAL algorithm identical to DOS smush engine
This implements XPAL the same way the DOS/Windows players do, with an
additional 768-entry table holding the palette left-shifted by 7 bits,
and adding the deltapal values to this.

This results in a perfectly smooth day-to-night transition in the last
30 seconds of the Outlaws RAE.SAN (ending) video, while before there
were visible brightness "pulses" when a new palette was loaded.
It also fixes color banding in The Dig intro (sq1.san), in the
scene showing the shuttle launch pad and the night sky.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
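The XPAL scheme described in b6a9c4671a keeps a second palette table with 7 extra fractional bits and accumulates the deltas there. A minimal sketch, assuming 16-bit signed deltas and clamping of the visible value to 0..255 (neither is stated in the commit message); all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAL_COMPONENTS 768 /* 256 colours x 3 components */

/* Load: store each 8-bit component left-shifted by 7 bits. */
static void xpal_load(int32_t shifted[PAL_COMPONENTS],
                      const uint8_t pal[PAL_COMPONENTS])
{
    for (int i = 0; i < PAL_COMPONENTS; i++)
        shifted[i] = (int32_t)pal[i] << 7;
}

/* Step: add the deltas to the shifted table, then derive the visible
 * palette by dropping the 7 fractional bits (clamped, by assumption). */
static void xpal_step(int32_t shifted[PAL_COMPONENTS],
                      const int16_t delta[PAL_COMPONENTS],
                      uint8_t pal[PAL_COMPONENTS])
{
    assert(shifted && delta && pal);
    for (int i = 0; i < PAL_COMPONENTS; i++) {
        shifted[i] += delta[i];
        int32_t v = shifted[i] < 0 ? 0 : shifted[i] >> 7;
        pal[i] = (uint8_t)(v > 255 ? 255 : v);
    }
}
```

A delta of 64 raises a component by one visible level only every second step, because the fraction is carried in the shifted table instead of being lost at each step.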
Manuel Lauss
4316914b39 avcodec/sanm: codec37: comp1: guard against invalid mv index
The c37 mvtable has only 255 pairs; change index 255 to zero to
avoid reading outside the table boundaries.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
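The guard from 4316914b39 is a one-line remap. In this sketch the macro and function names are hypothetical, and the index is assumed to be a single byte read from the stream.

```c
#include <assert.h>

#define C37_MV_PAIRS 255 /* table size given in the commit message */

/* Map a coded index to one that is valid for a 255-entry table. */
static unsigned c37_mv_index(unsigned coded)
{
    assert(coded <= 255); /* assumption: the index is one byte */
    return coded >= C37_MV_PAIRS ? 0 : coded;
}
```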
Manuel Lauss
5f1f5dd2d4 avcodec/sanm: guard against image area growing larger than buffer
When checking for oversized frames, check not only for the width
and height being larger, but also the area not outgrowing the
allocated buffer.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
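The combined check from 5f1f5dd2d4 (width, height, and area) might look like the following. How the limits and the buffer size relate inside the decoder is not stated in the commit message, so they are independent parameters here, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if a w x h frame is acceptable given the per-dimension
 * limits and a buffer that holds buf_pixels pixels in total. */
static int frame_fits(unsigned w, unsigned h,
                      unsigned max_w, unsigned max_h, size_t buf_pixels)
{
    assert(buf_pixels > 0);
    if (w > max_w || h > max_h)
        return 0;
    /* 64-bit product, so the area computation itself cannot overflow. */
    return (uint64_t)w * h <= (uint64_t)buf_pixels;
}
```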
Manuel Lauss
556cef27d9 avcodec/sanm: enforce SANM min and max sizes at decode_init()
Enforce at least 8x8 and at max 800x600 for SANM/BL16.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
70b04717d0 avcodec/sanm: distrust dimensions for ANIM in decode_init
When decode_init() is called for ANIM content, zero the dimensions
set in avctx width/height. Only SANM files have image dimensions in
their header, while ANIM do not.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
0802044d81 avcodec/sanm: codec48: reimplement block scaling
Reimplement opcodes 0xFF and 0xFD the same way the c48 decoder
in the "Mysteries of the Sith" game engine does it:
The source pixel(s) and various pixels from inside the same and above
block of the second to last image rendered to the destination buffer
are used together with the interpolation table to generate a 4x4 pattern,
which is then expanded by doubling each pixel horizontally and vertically
to produce the final 8x8 block.

This fixes visible artifacts in frames 25-50 of the S1L1OCS.SAN
video of Mysteries of the Sith.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
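The last step described in 0802044d81, expanding a 4x4 pattern to an 8x8 block by doubling each pixel horizontally and vertically, can be sketched as below. How the 4x4 pattern itself is derived from the interpolation table is not reproduced here; the function name and the row-major pattern layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand a row-major 4x4 pattern into an 8x8 block at dst:
 * every source pixel becomes a 2x2 square in the destination. */
static void expand_4x4_to_8x8(uint8_t *dst, ptrdiff_t stride,
                              const uint8_t pattern[16])
{
    assert(dst && pattern && stride >= 8);
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            dst[y * stride + x] = pattern[(y >> 1) * 4 + (x >> 1)];
}
```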
Manuel Lauss
b99b7a6f90 avcodec/sanm: change codec37 opcode FE to 4 2x2 blocks
It was initially implemented as 4 4x1 blocks, reimplement it as 4 2x2 blocks.
Fixes a few The Dig videos, esp. black dots on the asteroid in the
intro scene.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
d618577747 avcodec/sanm: check codec48 subblock mv index
Codec48 opcodes F9 and FC take per-subblock indices into the motion vector
table from the source stream, however the table has only 255 entries.
Luckily, index 255 is index 0 of the following table, which means no
motion vector, the same as index 0 of the current table.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
dd875f56b7 avcodec/sanm: invalidate STOR data when subversion changes
since the STOR data is a different format.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
b7e55ef8a1 avcodec/sanm: per-fobj GetByteContext
Create a separate GetByteContext from the general one, to be able
to limit the size of the FOBJ to the size described in the tag size.
Otherwise each fobj could theoretically use all the remaining data
in the FRME (which also contains audio, subtitles, ...).

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
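The idea in b7e55ef8a1 of giving each FOBJ its own size-limited reader can be illustrated with a tiny byte reader. `ByteReader` and its helpers are hypothetical stand-ins written for this sketch; they are not the GetByteContext API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal reader: a cursor and an end pointer. */
typedef struct ByteReader {
    const uint8_t *cur;
    const uint8_t *end;
} ByteReader;

static size_t reader_left(const ByteReader *r)
{
    return (size_t)(r->end - r->cur);
}

/* Carve out a child reader of at most `size` bytes (the tag size) and
 * advance the parent past it, so the child cannot read beyond its chunk. */
static ByteReader reader_sub(ByteReader *parent, size_t size)
{
    assert(parent && parent->cur <= parent->end);
    size_t left = reader_left(parent);
    if (size > left)
        size = left;
    ByteReader child = { parent->cur, parent->cur + size };
    parent->cur += size;
    return child;
}
```

A decoder working on the child reader then runs out of data at the end of its own chunk instead of consuming the audio or subtitle chunks that follow in the same frame.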
Manuel Lauss
a0c4dfc63b avcodec/sanm: handle FTCH on video start
Some videos have an FTCH at the start of the video, to restore the
last image produced by the previous game file.  This leads to
ugly messages like these:

[sanm @ 0x7f18cc001980] [IMGUTILS @ 0x7f18d7ffe8e0] Picture size 0x0 is invalid
[sanm @ 0x7f18cc001980] video_get_buffer: image parameters invalid
[sanm @ 0x7f18cc001980] get_buffer() failed

Fix this by not setting the got_frame_ptr when there is nothing to
restore/fetch.  Seen with a lot of RA1 and the RA2 Level 11/12 videos.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:01 +02:00
Manuel Lauss
7c205b5397 avcodec/sanm: rename motion_vectors[] table to c47_mv[]
Rename the generic motion_vectors to c47_mv, as this vector table
was initially introduced with codec47 which predates bl16 by 1-2 years,
and bl16 is a development of codec47 (with a bit of c48 thrown in).

No functional change.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
3945d100ef avcodec/sanm: remove unused SANMFrameHeader
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
2ef26c30eb avcodec/sanm: implement BL16 subcodecs 1 and 7
Both of these encode a quarter-sized keyframe, with missing pixels
interpolated from the immediate neighbours.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
b1a7f8b7cf avcodec/sanm: factor out the ANIM decoding into separate function
Mainly for readability. No functional changes.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
49c552d066 avcodec/sanm: restructure SANM like the other block codecs
Restructure the SANM (or BL16 as LucasArts calls it) decoder to make it
look like the others, as it is basically a development of old_codec47
for rgb565 values.

No functional changes.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
4d5e87eaa4 Revert "avcodec/sanm: Check w,h,left,top"
This reverts commit 134fbfd1dc.

As it breaks valid uses of this in Rebel Assault 1 videos.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
75b6937527 avcodec/sanm: reset rotate_code every iteration
and eliminate the explicit reset in the other decoders that
don't need it.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
de7db62acc avcodec/sanm: rename process_block to codec47_block
The new name better indicates where it belongs.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
043dafc4c2 avcodec/sanm: codec37/47/48 size checks
Add more size checks to old_codec37/47/48, esp. the headers.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
f98cd66b4b avcodec/sanm: codec47: read the small codebook
codec47 carries a 4-byte small codebook in its header. Read those
4 bytes into a context member instead of awkwardly redirecting the
bytestream pointer every time it needs to be accessed.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
72e6206c88 avcodec/sanm: partially fix codec48
The mv check introduced with d5bdb0b705 broke MotS videos:
- their height (300 lines) is 37.5 blocks; unfortunately the videos try to
  access up to 1 block more.
  Extend the mv check to the aligned_height, which fixes most artifacts.
- don't return an error when an mv is invalid; rather skip the (sub)block.
  Gets rid of almost all artifacts.

Some artifacts still remain, esp in space scenes where the original
encoder apparently fetched black pixels from outside of the aligned
height.  An increase of the buffer size by 8 lines will fix that later.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
24ce42b406 avcodec/sanm: codec4 improvements
- don't draw outside the buffers
- don't wrap around when coordinates go over the edge

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
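The two rules in the codec4 commit above (the same two bullets appear in the codec1, codec21, codec23 and codec31 commits) can be sketched as a clipped pixel write. The function name and signature are hypothetical; the real decoders clip whole runs and blocks rather than single pixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write one pixel only if (x, y) lies inside the w x h image.
 * Returns 1 if the pixel was written, 0 if it was clipped. */
static int put_pixel_clipped(uint8_t *buf, int w, int h, ptrdiff_t stride,
                             int x, int y, uint8_t colour)
{
    assert(buf && w > 0 && h > 0 && stride >= w);
    if (x < 0 || x >= w || y < 0 || y >= h)
        return 0; /* outside: neither drawn nor wrapped to another row */
    buf[y * stride + x] = colour;
    return 1;
}
```

Without the horizontal test, a write at x == w with stride == w would land on the first pixel of the next row, which is the wrap-around effect the commits remove.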
Manuel Lauss
dfe4a0626f avcodec/sanm: codec31 improvements
- don't draw outside the buffers
- don't wrap around when coordinates go over the edge
  this is especially noticeable in e.g. the O1OPEN.ANM and C1C3PO.ANM
  RA1 files with planets wrapping around.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:29:00 +02:00
Manuel Lauss
da4b88494c avcodec/sanm: codec1 improvements
- don't draw outside the buffers
- don't wrap around when coordinates go over the edge
  this is especially noticeable in e.g. the O1OPEN.ANM and C1C3PO.ANM
  RA1 files with planets wrapping around.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:28:59 +02:00
Manuel Lauss
d18c25f1a9 avcodec/sanm: codec21 improvements
- don't draw outside the buffers
- don't wrap around when coordinates go over the edge

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:28:59 +02:00
Manuel Lauss
67b28acba3 avcodec/sanm: codec23 improvements
- don't draw outside the buffers
- don't wrap around when coordinates go over the edge

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2025-10-09 10:28:59 +02:00
James Almer
4377affc28 avcodec/hevc/refs: don't unconditionally discard non-IRAP frames if no IRAP frame was seen before
Should fix issue #20661

Signed-off-by: James Almer <jamrial@gmail.com>
2025-10-09 02:52:46 +00:00