ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-04-18 16:40:23 +00:00

Author	SHA1	Message	Date
Lynne	ca591e6b50	vulkan_decode: force layered_dpb to 0 when dedicated_dpb is 0 layered_dpb only makes sense when dedicated_dpb is set to 1. For some mysterious reason, some Nvidia drivers stopped indicating SEPARATE_REFERENCES, but kept the COINCIDE flag, which broke the code.	2024-08-11 05:13:14 +02:00
Lynne	6757cdb535	vulkan_video: remove NIH pooled buffer implementation The code predates ff_vk_get_pooled_buffer().	2024-08-11 05:13:10 +02:00
Osamu Watanabe	d88a988d3d	avcodec/jpeg2000dec: Fix HT decoding Fixes incorrect handling of MAGB_P value in Ccap15. Fixes bugs in HT block decoding. Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>	2024-08-10 09:22:51 -07:00
Osamu Watanabe	48b14732d8	avcodec/jpeg2000dec: Add support for placeholder passes See Rec. ITU-T T.814 | ISO/IEC 15444-15, Annex B. Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>	2024-08-10 09:22:44 -07:00
Osamu Watanabe	fe1b196499	avcodec/jpeg2000dec: Add support for CAP and CPF markers Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>	2024-08-10 09:20:15 -07:00
Fei Wang	cda5f5c5ed	lavc/qsv: Use vendor id to create device New kernel driver "xe" will be supported from Lunar Lake instead of "i915". "xe" kernel driver: https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/xe Signed-off-by: Fei Wang <fei.w.wang@intel.com>	2024-08-09 13:40:26 +08:00
Kacper Michajłow	9876158ee2	avcodec/wmavoice: use av_clipd for double values Fixes Clang warning. Signed-off-by: Kacper Michajłow <kasper93@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-08-07 00:59:18 +02:00
Kacper Michajłow	1165c14444	avcodec/vp9mvs: fix misaligned access when clearing VP9mv Fixes runtime error: member access within misaligned address <addr> for type 'av_alias64', which requires 8 byte alignment. VP9mv is aligned to 4 bytes, so instead of doing one 8-byte clear, do two 4-byte clears. Signed-off-by: Kacper Michajłow <kasper93@gmail.com> Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-08-07 00:59:18 +02:00
Andreas Rheinhardt	bfcee368e2	avcodec/cbs_sei: Always zero-initialize SEI payload Fixes: Use-of-uninitialized value Fixes: clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_METADATA_fuzzer-5458626041413632 Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-08-06 20:25:23 +02:00
Kacper Michajłow	5dfc0cc841	avcodec/parser: ensure input padding is zeroed Fixes use of uninitialized value, reported by MSAN. Found by OSS-Fuzz. Signed-off-by: Kacper Michajłow <kasper93@gmail.com> Fixes: 70852/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5179190066872320 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2024-08-05 23:17:46 +02:00
Rémi Denis-Courmont	616fdeaea3	lavc/riscv: depend on RVB and simplify accordingly There is no known (real) hardware with V and without the complete B extension. B was indeed required in the RISC-V application profile from 2022, earlier than V. There should not be any relevant hardware in the future either. In practice, different R-V Vector optimisations in FFmpeg already depend on every constituent of the B extension anyhow, so it would not work well.	2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont	4edfc11a28	lavc/h264dsp: R-V V idct4_add8 (all depths) These are really just wrappers for idct4_add16intra functions, which are in turn mostly wrappers for idct4_add and idct4_dc_add functions. For benchmarks, refer to the latter two sets.	2024-08-05 21:16:26 +03:00
Timo Rothenpieler	9a2171318d	avcodec/nvenc: fix signedness of timing fields	2024-08-03 20:04:31 +02:00
James Almer	4a56b5f3d8	avcodec/cbs_h265: don't attempt to read 0 length elements in sei_3d_reference_displays_info Fixes: 70458/clusterfuzz-testcase-minimized-ffmpeg_BSF_TRACE_HEADERS_fuzzer-5259339779080192 Fixes: Assertion width > 0 && width <= 32 failed at libavcodec/cbs.c:608 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: James Almer <jamrial@gmail.com>	2024-08-03 11:59:14 -03:00
Rémi Denis-Courmont	de7f999481	lavc/videodsp: work-around LLVM-as For some reason, it can't handle the normal syntax for an address operand without an offset, so add a dummy zero offset.	2024-08-02 21:24:01 +03:00
Rémi Denis-Courmont	677f28b310	lavc/h264dsp: stick R-V V weight to 16-bit precision T-Head C908 (ns): h264_weight2_8_c: 1607.8 h264_weight2_8_rvv_i32: 515.0 (before) h264_weight2_8_rvv_i32: 348.5 (after) h264_weight4_8_c: 2255.8 h264_weight4_8_rvv_i32: 1015.0 (before) h264_weight4_8_rvv_i32: 691.0 (after) h264_weight8_8_c: 3857.5 h264_weight8_8_rvv_i32: 2218.8 (before) h264_weight8_8_rvv_i32: 1561.3 (after) h264_weight16_8_c: 7431.5 h264_weight16_8_rvv_i32: 2737.3 (before) h264_weight16_8_rvv_i32: 1848.3 (after) SpacemiT X60 (ns): h264_weight2_8_c: 1624.1 h264_weight2_8_rvv_i32: 352.6 (before) h264_weight2_8_rvv_i32: 259.3 (after) h264_weight4_8_c: 2259.3 h264_weight4_8_rvv_i32: 685.8 (before) h264_weight4_8_rvv_i32: 530.3 (after) h264_weight8_8_c: 4103.3 h264_weight8_8_rvv_i32: 1581.8 (before) h264_weight8_8_rvv_i32: 1238.6 (after) h264_weight16_8_c: 7624.3 h264_weight16_8_rvv_i32: 2738.1 (before) h264_weight16_8_rvv_i32: 1853.3 (after)	2024-08-02 21:24:01 +03:00
Rémi Denis-Courmont	afd45c7ff7	lavc/h264dsp: stick R-V V biweight to 16-bit T-Head C908 (ns): h264_biweight2_8_c: 2414.5 h264_biweight2_8_rvv_i32: 701.8 (before) h264_biweight2_8_rvv_i32: 468.5 (after) h264_biweight4_8_c: 4655.3 h264_biweight4_8_rvv_i32: 1377.5 (before) h264_biweight4_8_rvv_i32: 931.8 (after) h264_biweight8_8_c: 9701.5 h264_biweight8_8_rvv_i32: 2896.0 (before) h264_biweight8_8_rvv_i32: 2070.5 (after) h264_biweight16_8_c: 18025.0 h264_biweight16_8_rvv_i32: 3460.8 (before) h264_biweight16_8_rvv_i32: 1978.0 (after) SpacemiT X60 (ns): h264_biweight2_8_c: 2415.5 h264_biweight2_8_rvv_i32: 478.2 (before) h264_biweight2_8_rvv_i32: 362.8 (after) h264_biweight4_8_c: 4655.3 h264_biweight4_8_rvv_i32: 946.7 (before) h264_biweight4_8_rvv_i32: 727.3 (after) h264_biweight8_8_c: 9061.8 h264_biweight8_8_rvv_i32: 2071.7 (before) h264_biweight8_8_rvv_i32: 1685.8 (after) h264_biweight16_8_c: 18020.5 h264_biweight16_8_rvv_i32: 3457.2 (before) h264_biweight16_8_rvv_i32: 1935.8 (after)	2024-08-02 21:24:01 +03:00
Zhao Zhili	670ff6c7ce	avcodec/nvenc: rework on DTS generation Before the patch, the method used to generate DTS only worked with a timebase equal to 1/fps. With a timebase such as 1/1000: ./ffmpeg -i foo.mp4 -an -c:v h264_nvenc -enc_time_base 1/1000 bar.mp4 pts 0 dts -3 pts 160 dts 37 pts 80 dts 77 pts 40 dts 117 <-- invalid pts 120 dts 157 pts 320 dts 197 pts 240 dts 237 pts 200 dts 277 <-- invalid pts 280 dts 317 <-- invalid The generated DTS can be larger than the PTS, because the code only reorders the input PTS and subtracts the number of frames of delay, which does not take the timebase into account. It should subtract the "time" of the frame delay instead. `9a245bd` tried to fix the issue, but the implementation was incomplete because it only used time_base.num, and it was then reverted by `ac7c265b33`. After this patch: pts 0 dts -120 pts 160 dts -80 pts 80 dts -40 pts 40 dts 0 pts 120 dts 40 pts 320 dts 80 pts 240 dts 120 pts 200 dts 160 pts 280 dts 200 Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2024-08-02 17:57:19 +02:00
Roman Arzumanyan	bcea693f75	avcodec/cuviddec: more accurately guess probed sw pixel format Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2024-08-02 17:38:46 +02:00
Rémi Denis-Courmont	2f083fd581	lavc/audiodsp: drop R-V F vector_clipf This is now firmly slower than C. SiFive-U74 (cycles): audiodsp.vector_clipf_c: 31.2 audiodsp.vector_clipf_rvf: 39.5	2024-08-01 19:29:40 +03:00
Rémi Denis-Courmont	c48213b2dc	lavc/audiodsp: drop opposite sign optimisation This was added alongside the original SSE(one) DSP function in `0a68cd876e` without rationale. This was presumably faster on x87, which is no longer relevant since we pretty much assume SSE2 or later on x86. Meanwhile this function is ~2.5x slower than the normal floating point one on SiFive-U74.	2024-08-01 19:29:40 +03:00
Rémi Denis-Courmont	d86b6767ce	lavc/audiodsp: properly unroll vector_clipf Given that source and destination can alias, the compiler was forced to perform each read-modify-write sequentially. We cannot use the `restrict` qualifier to avoid this here because the AC-3 encoder uses the function in-place. Instead this commit provides an explicit guarantee to the compiler that batches of 8 elements will not overlap, so that it can interleave calculations. In practice contemporary optimising compilers are able to unroll and keep the temporary array in FPU registers (without spilling). On SiFive-U74, this speeds the same signs branch by 4x, and the opposite signs branch 1.5x.	2024-08-01 19:29:40 +03:00
Rémi Denis-Courmont	d527d23872	lavc/pixblockdsp: specialise aligned 16-bit get_pixels The current code assumes that we have unaligned rows, which hurts on platforms with slower unaligned accesses. (Also, this lets the compiler unroll manually, which it seems to do in practice.)	2024-08-01 18:44:01 +03:00
Rémi Denis-Courmont	54ae270213	lavc/rv34dsp: use saturating add/sub for R-V V DC add T-Head C908 (cycles): rv34_idct_dc_add_c: 113.2 rv34_idct_dc_add_rvv_i32: 48.5 (before) rv34_idct_dc_add_rvv_i32: 39.5 (after)	2024-08-01 18:43:04 +03:00
Rémi Denis-Courmont	952b426f3b	lavc/bswapdsp: add RV Zvbb bswap16 and bswap32	2024-08-01 18:43:04 +03:00
James Almer	f4daf633b2	avcodec/aacps_tablegen_template: don't redefine CONFIG_HARDCODED_TABLES Fixes relevant warnings when compiling with --enable-hardcoded-tables Signed-off-by: James Almer <jamrial@gmail.com>	2024-08-01 12:13:53 -03:00
Anton Khirnov	bcf08c1171	lavc/ffv1: change FFV1SliceContext.plane into a RefStruct object Frame threading in the FFV1 decoder works in a very unusual way - the state that needs to be propagated from the previous frame is not decoded pixels(¹), but each slice's entropy coder state after decoding the slice. For that purpose, the decoder's update_thread_context() callback stores a pointer to the previous frame thread's private data. Then, when decoding each slice, the frame thread uses the standard progress mechanism to wait for the corresponding slice in the previous frame to be completed, then copies the entropy coder state from the previously-stored pointer. This approach is highly dubious, as update_thread_context() should be the only point where frame-thread contexts come into direct contact. There are no guarantees that the stored pointer will be valid at all, or will contain any particular data after update_thread_context() finishes. More specifically, this code can break due to the fact that keyframes reset entropy coder state and thus do not need to wait for the previous frame. As an example, consider a decoder process with 2 frame threads - thread 0 with its context 0, and thread 1 with context 1 - decoding a previous frame P, current frame F, followed by a keyframe K. 
Then consider concurrent execution consistent with the following sequence of events: * thread 0 starts decoding P * thread 0 reads P's slice header, then calls ff_thread_finish_setup() allowing next frame thread to start * main thread calls update_thread_context() to transfer state from context 0 to context 1; context 1 stores a pointer to context 0's private data * thread 1 starts decoding F * thread 1 reads F's slice header, then calls ff_thread_finish_setup() allowing the next frame thread to start decoding * thread 0 finishes decoding P * thread 0 starts decoding K; since K is a keyframe, it does not wait for F and reallocates the arrays holding entropy coder state * thread 0 finishes decoding K * thread 1 reads entropy coder state from its stored pointer to context 0, however it finds state from K rather than from P This execution is currently prevented by special-casing FFV1 in the generic frame threading code, however that is supremely ugly. It also involves unnecessary copies of the state arrays, when in fact they can only be used by one thread at a time. This commit addresses these deficiencies by changing the array of PlaneContext (each of which contains the allocated state arrays) embedded in FFV1SliceContext into a RefStruct object. This object can then be propagated across frame threads in standard manner. Since the code structure guarantees only one thread accesses it at a time, no copies are necessary. It is also re-created for keyframes, solving the above issue cleanly. Special-casing of FFV1 in the generic frame threading code will be removed in a later commit. (¹) except in the case of a damaged slice, when previous frame's pixels are used directly	2024-08-01 10:09:26 +02:00
Anton Khirnov	c335218a81	lavc/ffv1dec: inline copy_fields() into update_thread_context() It is now only called from a single place, so there is no point in it being a separate function.	2024-08-01 10:09:26 +02:00
Anton Khirnov	d44812f7cf	lavc/ffv1dec: stop using per-slice FFV1Context All remaining accesses to them are for fields that have the same value in the main encoder context. Drop now-unused FFV1Context.slice_contexts.	2024-08-01 10:09:26 +02:00
Anton Khirnov	2b21cdff6e	lavc/ffv1dec: move slice_damaged to per-slice context	2024-08-01 10:09:26 +02:00
Anton Khirnov	f2aeba56c4	lavc/ffv1dec: move slice_reset_contexts to per-slice context	2024-08-01 10:09:26 +02:00
Anton Khirnov	84dda32202	lavc/ffv1enc: stop using per-slice FFV1Context All remaining accesses to them are for fields that have the same value in the main encoder context.	2024-08-01 10:09:26 +02:00
Anton Khirnov	96e8af6c4d	lavc/ffv1: move ac_byte_count to per-slice context	2024-08-01 10:09:26 +02:00
Anton Khirnov	e7d0f44138	lavc/ffv1enc: store per-slice rc_stat(2?) in FFV1SliceContext Instead of the per-slice FFV1Context, which will be removed in future commits.	2024-08-01 10:09:26 +02:00
Anton Khirnov	7b2bfba55d	lavc/ffv1: move RangeCoder to per-slice context	2024-08-01 10:09:26 +02:00
Anton Khirnov	28769f6bc1	lavc/ffv1: move FFV1Context.plane to per-slice context	2024-08-01 10:09:26 +02:00
Anton Khirnov	9b86ba5a92	lavc/ffv1: always use the main context values of ac It cannot change between slices.	2024-08-01 10:09:26 +02:00
Anton Khirnov	a57c88d67b	lavc/ffv1: move FFV1Context.slice_{coding_mode,rct_.y_coef} to per-slice context	2024-08-01 10:09:26 +02:00
Anton Khirnov	39486a2b29	lavc/ffv1: always use the main context values of plane_count/transparency They cannot change between slices.	2024-08-01 10:09:26 +02:00
Anton Khirnov	492df65201	lavc/ffv1: drop write-only PlaneContext.interlace_bit_state	2024-08-01 10:09:26 +02:00
Anton Khirnov	a411fc5a84	lavc/ffv1: drop redundant PlaneContext.quant_table It is a copy of FFV1Context.quant_tables[quant_table_index].	2024-08-01 10:09:26 +02:00
Anton Khirnov	4b9f7c7e3a	lavc/ffv1: drop redundant FFV1Context.quant_table In all cases except decoding version 1 it's either not used, or contains a copy of a table from quant_tables, which we can just as well use directly. When decoding version 1, we can just as well decode into quant_tables[0], which would otherwise be unused.	2024-08-01 10:09:26 +02:00
Anton Khirnov	d2f507233a	lavc/ffv1enc: move bit writer to per-slice context	2024-08-01 10:09:26 +02:00
Anton Khirnov	889faedd26	lavc/ffv1dec: move the bitreader to stack There is no reason to place it in persistent state.	2024-08-01 10:09:25 +02:00
Anton Khirnov	19e9f3d5f2	lavc/ffv1: move run_index to the per-slice context	2024-08-01 10:09:25 +02:00
Anton Khirnov	91d3c1ac47	lavc/ffv1: move sample_buffer to the per-slice context	2024-08-01 10:09:25 +02:00
Anton Khirnov	54aa33f116	lavc/ffv1: add a per-slice context FFV1 decoder and encoder currently use the same struct - FFV1Context - both as codec private data and per-slice context. For this purpose FFV1Context contains an array of pointers to per-slice FFV1Context instances. This pattern is highly confusing, as it is not clear which fields are per-slice and which per-codec. Address this by adding a new struct storing only per-slice data. Start by moving slice_{x,y,width,height} to it.	2024-08-01 10:09:25 +02:00
Anton Khirnov	d845ea49c5	lavc/ffv1dec: move copy_fields() under HAVE_THREADS It is unused otherwise	2024-08-01 10:09:25 +02:00
Anton Khirnov	3a5c814b19	lavc/ffv1dec: drop a pointless variable in decode_slice() fsdst is by construction always equal to fs, there is even an av_assert1() checking that. Just use fs directly.	2024-08-01 10:09:25 +02:00
Anton Khirnov	4da146ba83	lavc/ffv1dec: drop FFV1Context.cur It is merely a pointer to FFV1Context.picture.f, which can just as well be used directly.	2024-08-01 10:09:25 +02:00
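
The DTS arithmetic described in commit 670ff6c7ce above can be illustrated with a small sketch. This is not the nvenc code: `sketch_gen_dts` and `cmp_i64` are hypothetical names, and sorting the whole PTS array stands in for whatever reordering queue the encoder actually uses. The frame duration of 40 (25 fps in a 1/1000 timebase) and the delay of 3 frames are assumptions inferred from the numbers quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for int64_t timestamps (ascending). */
static int cmp_i64(const void *a, const void *b)
{
    int64_t x = *(const int64_t *)a, y = *(const int64_t *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper, for illustration only.
 * pts[] holds presentation timestamps in output (decode) order.
 * dts[i] becomes the i-th smallest PTS minus the *time* covered by the
 * frame delay, i.e. delay_frames * frame_duration in timebase units.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated.
 */
static int sketch_gen_dts(const int64_t *pts, int64_t *dts, int n,
                          int delay_frames, int64_t frame_duration)
{
    int64_t *sorted;

    assert(n > 0 && delay_frames >= 0 && frame_duration > 0);

    sorted = malloc((size_t)n * sizeof(*sorted));
    if (!sorted)
        return -1;

    for (int i = 0; i < n; i++)
        sorted[i] = pts[i];
    qsort(sorted, (size_t)n, sizeof(*sorted), cmp_i64);

    /* Subtract a duration, not a bare frame count. */
    for (int i = 0; i < n; i++)
        dts[i] = sorted[i] - (int64_t)delay_frames * frame_duration;

    free(sorted);
    return 0;
}
```

Calling the helper with the PTS sequence from the commit message, a delay of 3 and a frame duration of 40 gives the "after" column (-120, -80, -40, 0, ...). Passing a frame duration of 1 subtracts only the frame count and gives the "before" column, including the invalid entries where DTS exceeds PTS.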


50783 commits
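
The batching idea from commit d86b6767ce (lavc/audiodsp: properly unroll vector_clipf) can be sketched as follows. This is a minimal illustration and not the FFmpeg implementation: `sketch_clipf` and `sketch_vector_clipf` are hypothetical names, and the requirement that `len` be a multiple of 8 is an assumption made to keep the example short. Each batch of 8 results goes into a local array before anything is stored, so the compiler can see that the loads in a batch do not depend on the stores of that same batch, even when `dst` and `src` are the same buffer.

```c
#include <assert.h>

/* Clamp one value to [min, max]. */
static inline float sketch_clipf(float v, float min, float max)
{
    return v < min ? min : (v > max ? max : v);
}

/*
 * Hypothetical sketch: clip len floats from src into dst.
 * dst may equal src (in-place use), so `restrict` is not an option.
 * Assumes len is a non-negative multiple of 8.
 */
static void sketch_vector_clipf(float *dst, const float *src, int len,
                                float min, float max)
{
    assert(len >= 0 && len % 8 == 0);

    for (int i = 0; i < len; i += 8) {
        float tmp[8]; /* local batch: cannot alias dst or src */

        /* All 8 loads and clips happen before any store. */
        for (int j = 0; j < 8; j++)
            tmp[j] = sketch_clipf(src[i + j], min, max);

        for (int j = 0; j < 8; j++)
            dst[i + j] = tmp[j];
    }
}
```

Whether the temporary array stays in registers depends on the compiler and target. The commit message reports that contemporary optimising compilers manage this in practice.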