ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2025-12-08 06:09:50 +00:00

Author	SHA1	Message	Date
Andreas Rheinhardt	eccf130fdb	{lib{avcodec,swscale}/x86/,}Makefile: Kill MMX-OBJS Reviewed-by: Kacper Michajłow <kasper93@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-11-30 22:20:13 +01:00
Andreas Rheinhardt	ba94177242	avcodec/x86/Makefile: Only compile ASM init files when X86ASM is enabled To do so, simply add these init files to X86ASM-OBJS instead of OBJS in the Makefile. The former is already used for the actual assembly files, but using them for the C init files just works, because the build system uses file extensions to derive whether it is a C or a NASM file. This avoids compiling unused function stubs and also reduces our reliance on DCE: We don't add %if checks to the asm files except for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4 functions will be available. It also allows to remove HAVE_X86ASM checks in these init files. Reviewed-by: Kacper Michajłow <kasper93@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-11-30 22:20:13 +01:00
Andreas Rheinhardt	65b4feb782	avcodec/x86/Makefile: Remove redundant WebP decoder->vp8dsp dependencies Redundant since `35b02732b9`. Reviewed-by: Kacper Michajłow <kasper93@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-11-30 22:20:13 +01:00
Andreas Rheinhardt	18e08101eb	avcodec/x86/Makefile: Don't use MMX-OBJS for fdct.o MMX has been removed in `d402ec6be9`. Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-11-04 11:41:29 +01:00
Andreas Rheinhardt	74a88c0c11	avcodec/x86/cavsdsp: Add SSE2 mc20 horizontal motion compensation Basically a direct port of the MMXEXT one. The main difference is of course that one can process eight pixels (unpacked to words) at a time, leading to speedups. avg_cavs_qpel_pixels_tab[0][2]_c: 700.1 ( 1.00x) avg_cavs_qpel_pixels_tab[0][2]_mmxext: 158.1 ( 4.43x) avg_cavs_qpel_pixels_tab[0][2]_sse2: 86.0 ( 8.14x) avg_cavs_qpel_pixels_tab[1][2]_c: 171.9 ( 1.00x) avg_cavs_qpel_pixels_tab[1][2]_mmxext: 39.4 ( 4.36x) avg_cavs_qpel_pixels_tab[1][2]_sse2: 21.7 ( 7.92x) put_cavs_qpel_pixels_tab[0][2]_c: 525.7 ( 1.00x) put_cavs_qpel_pixels_tab[0][2]_mmxext: 148.5 ( 3.54x) put_cavs_qpel_pixels_tab[0][2]_sse2: 75.2 ( 6.99x) put_cavs_qpel_pixels_tab[1][2]_c: 129.5 ( 1.00x) put_cavs_qpel_pixels_tab[1][2]_mmxext: 36.7 ( 3.53x) put_cavs_qpel_pixels_tab[1][2]_sse2: 19.0 ( 6.81x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-08 20:40:08 +02:00
Andreas Rheinhardt	92ae9d1ffc	configure: Remove vc1dsp->qpeldsp dependency It only needs it for some x86 fpel functions; instead add a direct dependency for that. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-10-04 07:06:32 +02:00
Henrik Gramner	0b5d46ee1c	vp9: Add 8bpc AVX2 asm for inverse transforms	2025-09-19 23:12:59 +00:00
Kacper Michajłow	5ff2500514	avcodec/x86/Makefile: add missing x86/proresdsp.o for prores raw	2025-08-15 20:45:20 +02:00
Kacper Michajłow	a9e7b5aa07	avcodec/Makefile: add missing dependency for prores raw decoder Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-08-14 04:43:16 +02:00
Andreas Rheinhardt	9b409ea1e6	configure: Factor mpegvideoencdsp out of mpegvideoenc This will allow to relax the dependency on mpegvideoenc for several codecs. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-06-21 22:08:52 +02:00
Henrik Gramner	eda0ac7e5f	avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 10bpc inverse transforms	2025-05-26 15:26:11 +02:00
Henrik Gramner	fd18ae88ae	avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 8bpc inverse transforms	2025-05-19 15:56:27 +02:00
Mark Thompson	d03c99441d	lavc/apv: AVX2 transquant for x86-64 Typical checkasm result on Alder Lake: decode_transquant_8_c: 464.2 ( 1.00x) decode_transquant_8_avx2: 86.2 ( 5.38x) decode_transquant_10_c: 481.6 ( 1.00x) decode_transquant_10_avx2: 83.5 ( 5.77x)	2025-04-27 15:52:30 +01:00
Nuo Mi	0a6388d1da	avcodec/hevcdec: remove hevc prefix for x86 asm files	2024-12-22 21:00:06 +08:00
Andreas Rheinhardt	df2416ca97	Remove remnants of prores_lgpl decoder Forgotten in `5c6a3604f0`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-07 23:53:26 +02:00
James Almer	6b6eb7d74e	x86/Makefile: fix hevc and vvc dependency of h2656dsp.o And remove tabs while at it. Signed-off-by: James Almer <jamrial@gmail.com>	2024-02-01 16:02:50 -03:00
Wu Jianhua	7d9f1f5485	avcodec/x86/hevc_mc: move put/put_uni to h26x/h2656_inter.asm This enables the asm optimization to be reused by VVC. Signed-off-by: Wu Jianhua <toqsxw@outlook.com>	2024-02-01 19:54:28 +08:00
Andreas Rheinhardt	6f7bf64dbc	avcodec: Remove DCT, FFT, MDCT and RDFT They were replaced by TX from libavutil; the tremendous work to get to this point (both creating TX as well as porting the users of the components removed in this commit) was completely performed by Lynne alone. Removing the subsystems from configure may break some command lines, because the --disable-fft etc. options are no longer recognized. Co-authored-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-01 02:25:09 +02:00
Andreas Rheinhardt	d9464f3e34	avcodec/mpegaudiodsp: Init dct32 directly This avoids using dct.c and will allow removing it. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-10-01 01:53:32 +02:00
Andreas Rheinhardt	947d51f32a	avcodec/x86/hpeldsp_vp3: Merge into hpeldsp Once upon a time, `413abbe164` added versions of some put_no_rnd_pixels functions for use in VP3 and Theora (with an explicit check so that they are only used for VP3 and Theora). When this was moved to hpeldsp (from dsputil) in `3ced55d51c`, the check was replaced by a check for the bitexact flag (and a CONFIG_VP3_DECODER compile-time check), so that these functions were now used for other codecs as well. Later commit `1dfc3cf89d` split off the "VP3-specific bits into a separate file", yet these bits were not really VP3-specific bits at all any more. (The error was repeated in commit `0a39c9ac0b`.) This commit has not been reverted, because this would make future changes from Libav (from where it originated) harder, yet Libav is no more, so this commit effectively reverts `1dfc3cf89d`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-09-07 00:24:39 +02:00
Andreas Rheinhardt	262e7439c6	avcodec/x86/Makefile: Don't build empty files simple_idct.asm is 32 bit-only since `bfb28b5ce8`, whereas simple_idct10.asm is x64-only. So don't build the ultimately unneeded and empty files, as some linkers complain about this: "ranlib: file: libavcodec/libavcodec.a(simple_idct.o) has no symbols" (this is from an Xcode toolchain as reported by Ronald S. Bultje). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-12-13 16:16:40 +01:00
Lynne	b85e106d5f	libavcodec: remove mdct15 It's not needed nor used by anything anymore, lavu/tx is faster, and better in every way. RIP.	2022-11-06 14:39:41 +01:00
Andreas Rheinhardt	4209216ee8	avcodec/mpegvideodsp: Make MpegVideoDSP MPEG-4 only It is only used by gmc/gmc1 which is only used by the MPEG-4 decoder, so move it to Mpeg4DecContext and rename it to Mpeg4VideoDSP. Also compile it iff the MPEG-4 decoder is compiled. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-10-20 07:56:17 +02:00
Lynne	3ade6a8644	x86/lpc: implement a new Welch windowing function Old one was written with the assumption only even inputs would be given. This very messy replacement supports even and odd inputs, and supports AVX2 for extra speed. The buffers given are usually quite big (4k samples), so the speedup is worth it. The new SSE version is still faster than the old inline asm version by 33%. Also checkasm is provided to make sure this monstrosity works. This fixes some FATE tests.	2022-09-21 07:12:39 +02:00
Andreas Rheinhardt	6c4595190e	avcodec/flacdsp: Split encoder-only parts into a ctx of its own Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-08-05 03:28:45 +02:00
Paul B Mahol	b69c91bbee	avcodec/x86: add cfhdenc SIMD	2021-02-27 17:09:44 +01:00
Paul B Mahol	389cc142fb	avcodec/cfhd: add x86 SIMD Overall speed changes for 1920x1080, yuv422p10le, 60fps from: 0.19x to 0.343x	2020-08-26 21:13:38 +02:00
James Almer	58d167bcd5	avcodec/Makefile: add missing pngdsp dependency to the lscr decoder Signed-off-by: James Almer <jamrial@gmail.com>	2019-05-14 16:47:56 -03:00
Lynne	605e330310	x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis 58893 decicycles in deemphasis_c, 130548 runs, 524 skips 9475 decicycles in deemphasis_fma3, 130686 runs, 386 skips -> 6.21x speedup 24866 decicycles in postfilter_c, 65386 runs, 150 skips 5268 decicycles in postfilter_fma3, 65505 runs, 31 skips -> 4.72x speedup Total decoder speedup: ~14% Deemphasis SIMD based on the following unrolling: const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1; float state = coeff; for (int i = 0; i < len; i += 4) { y[0] = x[0] + c1*state; y[1] = x[1] + c2*state + c1*x[0]; y[2] = x[2] + c3*state + c1*x[1] + c2*x[0]; y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0]; state = y[3]; y += 4; x += 4; }	2019-04-01 00:22:00 +02:00
Lynne	5468c1d075	celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled The entire function was defined away before.	2019-03-31 23:36:43 +02:00
Lynne	4a2c651620	x86/opus_dsp: rename to celt_pvq It's only used in the encoder and in CELT's PVQ.	2019-03-31 23:35:00 +02:00
Aurelien Jacobs	f1e490b1ad	sbcenc: add MMX optimizations This was originally based on libsbc, and was fully integrated into ffmpeg. Rough speed test: C version: speed= 592x MMX version: speed= 785x	2018-03-07 22:26:53 +01:00
Martin Vignali	9b8c1224d7	libavcodec/exr : add X86 SIMD for reorder_pixels Signed-off-by: James Almer <jamrial@gmail.com>	2017-09-17 17:53:57 -03:00
Ivan Kalvachev	7205513f8f	SIMD opus pvq_search implementation Explanation on the workings and methods used by the Pyramid Vector Quantization Search function could be found in the following Work-In-Progress mail threads: http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212146.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212816.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213030.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213436.html Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>	2017-08-18 17:18:32 +01:00
Paul B Mahol	4ed7c2bbc3	avcodec/utvideodec: add SIMD for restore_rgb_planes Signed-off-by: Paul B Mahol <onemda@gmail.com>	2017-06-27 09:54:10 +02:00
Rostislav Pehlivanov	e1120b1c54	mdct15: add assembly optimizations for the 15-point FFT c: 1802 decicycles in fft15,16774635 runs, 2581 skips avx: 865 decicycles in fft15,16776378 runs, 838 skips Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	2017-06-23 23:45:37 +01:00
Diego Biurrun	fd502f4f5f	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler. (Cherry-picked from libav commit `39e208f4d4`) Signed-off-by: James Almer <jamrial@gmail.com>	2017-06-21 17:00:29 -03:00
James Darnley	8e89f6fd37	avcodec/x86: move simple_idct to external assembly	2017-05-30 13:20:42 +02:00
Ronald S. Bultje	c9d98c5649	cavs: convert idct from inline asm to yasm.	2017-04-06 10:03:27 -04:00
Clément Bœsch	40ac226014	lavc/x86/hevc: rename hevc_res_add to hevc_add_res This will simplify incoming merge.	2017-03-24 11:45:23 +01:00
Clément Bœsch	c66bd8f3ff	Merge commit '`b57e38f52c`' * commit '`b57e38f52c`': ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm Merged-by: Clément Bœsch <u@pkh.me>	2017-03-22 12:49:29 +01:00
James Almer	ca8a3978e5	Merge commit '`1dfc3cf89d`' * commit '`1dfc3cf89d`': x86: hpeldsp: Split off VP3-specific bits into a separate file Merged-by: James Almer <jamrial@gmail.com>	2017-01-31 14:49:29 -03:00
James Almer	cf9ef83960	huffyuvencdsp: move shared functions to a new lossless_videoencdsp context Signed-off-by: James Almer <jamrial@gmail.com>	2017-01-12 22:53:04 -03:00
Rostislav Pehlivanov	d2ae5f77c6	aacenc: add SIMD optimizations for abs_pow34 and quantization Performance improvements: quant_bands: with: 681 decicycles in quant_bands, 8388453 runs, 155 skips without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips Around 42% for the function Twoloop coder: abs_pow34: with/without: 7.82s/8.17s Around 4% for the entire encoder Both: with/without: 7.15s/8.17s Around 12% for the entire encoder Fast coder: abs_pow34: with/without: 3.40s/3.77s Around 10% for the entire encoder Both: with/without: 3.02s/3.77s Around 20% faster for the entire encoder Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> Tested-by: Michael Niedermayer <michael@niedermayer.cc> Reviewed-by: James Almer <jamrial@gmail.com>	2016-10-18 21:41:18 +01:00
Justin Ruggles	b57e38f52c	ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm Adds a wrapper function for downmixing which detects channel count changes and updates the selected downmix function accordingly. Simplification and porting to current x86inc infrastructure by Diego Biurrun. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2016-10-01 00:46:25 +02:00
Anton Khirnov	12004a9a7f	audiodsp/x86: yasmify vector_clipf_sse	2016-09-22 09:47:52 +02:00
Anton Khirnov	89466de4ae	vp9/x86: rename vp9dsp to vp9mc It only contains the MC SIMD, other SIMD will go into different files.	2016-08-03 10:57:50 +02:00
James Almer	efc9d5c4bc	x86/ttaenc: add ff_ttaenc_filter_process_{ssse3,sse4} Signed-off-by: James Almer <jamrial@gmail.com>	2016-08-02 15:48:04 -03:00
Diego Biurrun	1dfc3cf89d	x86: hpeldsp: Split off VP3-specific bits into a separate file	2016-07-20 18:33:25 +02:00
James Almer	fca3c3b619	hevc: Add AVX2 DC IDCT Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>. Integrated to Libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	2016-07-18 15:27:13 +02:00
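
The unrolling quoted in the message of commit 605e330310 (x86/opusdsp deemphasis) can be checked against the plain one-sample recurrence y[n] = x[n] + c1*y[n-1]. The following is only a minimal sketch in Python, not the FFmpeg implementation: the function names are hypothetical, and the coefficient 0.85 is a placeholder assumption, since the commit message does not state the value of CELT_EMPH_COEFF.

```python
import math

# Placeholder assumption: the commit message does not give CELT_EMPH_COEFF.
C1 = 0.85


def deemphasis_scalar(x, state, c1=C1):
    """Reference form: y[n] = x[n] + c1 * y[n-1], one sample per step."""
    y = []
    for sample in x:
        state = sample + c1 * state
        y.append(state)
    return y


def deemphasis_unrolled(x, state, c1=C1):
    """Four samples per step, following the unrolling in the commit message."""
    if len(x) % 4 != 0:
        raise ValueError("this sketch expects a length that is a multiple of 4")
    c2 = c1 * c1
    c3 = c2 * c1
    c4 = c3 * c1
    y = []
    for i in range(0, len(x), 4):
        x0, x1, x2, x3 = x[i:i + 4]
        # Each output depends only on the inputs and the carried-in state,
        # not on the other outputs of the same group of four.
        y0 = x0 + c1 * state
        y1 = x1 + c2 * state + c1 * x0
        y2 = x2 + c3 * state + c1 * x1 + c2 * x0
        y3 = x3 + c4 * state + c1 * x2 + c2 * x1 + c3 * x0
        y.extend((y0, y1, y2, y3))
        state = y3
    return y


def outputs_match(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Element-wise comparison that tolerates floating-point rounding."""
    return len(a) == len(b) and all(
        math.isclose(p, q, rel_tol=rel_tol, abs_tol=abs_tol)
        for p, q in zip(a, b)
    )
```

Because the four outputs in a group no longer depend on one another, they map naturally onto one SIMD vector, which is presumably what makes this form suitable for the FMA3 version. The two forms agree only up to floating-point rounding, so the comparison uses a tolerance rather than exact equality.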


352 commits