ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2025-12-08 06:09:50 +00:00

Author	SHA1	Message	Date
Frank Plowman	1e3dc705df	lavc/x86/videodsp: Drop MMX usage Remove the MMX versions of these functions and modify the SSE implementations to avoid using MMX registers. Signed-off-by: Frank Plowman <post@frankplowman.com> Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>	2024-12-01 13:26:34 +08:00
Lynne	bbe95f7353	x86: replace explicit REP_RETs with RETs From x86inc: > On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either > a branch or a branch target. So switch to a 2-byte form of ret in that case. > We can automatically detect "follows a branch", but not a branch target. > (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.) x86inc can automatically determine whether to use REP_RET rather than REP in most of these cases, so impact is minimal. Additionally, a few REP_RETs were used unnecessary, despite the return being nowhere near a branch. The only CPUs affected were AMD K10s, made between 2007 and 2011, 16 years ago and 12 years ago, respectively. In the future, everyone involved with x86inc should consider dropping REP_RETs altogether.	2023-02-01 04:23:55 +01:00
Andreas Rheinhardt	19abc4c0a9	avcodec/x86/videodsp: Remove obsolete MMX, 3dnow, SSE functions x64 always has MMX, MMXEXT, SSE and SSE2 and this means that some functions for MMX, MMXEXT, SSE and 3dnow are always overridden by other functions (unless one e.g. explicitly disables SSE2). So given that the only systems which benefit from these functions are truely ancient 32bit x86s they are removed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-06-22 13:37:35 +02:00
James Almer	bac44a5020	Merge commit '`b89804da9b`' * commit '`b89804da9b`': x86: videodsp: Add parentheses to expression to work around warning Merged-by: James Almer <jamrial@gmail.com>	2017-03-23 18:35:49 -03:00
Diego Biurrun	b89804da9b	x86: videodsp: Add parentheses to expression to work around warning libavcodec/x86/videodsp.asm:128: warning: signed dword value exceeds bounds	2016-10-19 10:13:34 +02:00
Ronald S. Bultje	0f88b3f82f	videodsp: fix 1-byte overread in top/bottom READ_NUM_BYTES iterations. This can overread (either before start or beyond end) of the buffer in Nx1 (i.e. height=1) images. Fixes mozilla bug 1240080.	2016-01-18 11:12:47 -05:00
Ronald S. Bultje	52f84d82bd	videodsp: don't overread edges in vfix3 emu_edge. Fixes trac ticket 3226. Also see Andreas' analysis in https://bugs.debian.org/801745, which was very helpful.	2015-10-24 14:34:50 -04:00
James Almer	70277d1d23	x86/videodsp: add ff_emu_edge_{hfix,hvar}_avx2 ~15% faster than sse2. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-24 16:12:55 -03:00
Michael Niedermayer	5bca5f87d1	Revert "x86/videodsp: add emulated_edge_mc_mmxext" The commit causes minor out of array reads and was mainly intended for future optimizations which turned out not to be meassurably faster. Itself it was just 1 cpu cycle faster Approved-by: jamrial This reverts commit `057d2704e7`.	2014-06-28 05:39:07 +02:00
James Almer	057d2704e7	x86/videodsp: add emulated_edge_mc_mmxext This also changes hfix8_mmx and above to use mmx regs instead of gprs, and makes emulated_edge_mc_sse and emulated_edge_mc_sse2 use mmxext hfix and hvar functions instead of mmx where possible. This is mostly in preparation for an ssse3 version. Signed-off-by: James Almer <jamrial@gmail.com> code is about 1 cpu cycle faster approximately Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-06-26 17:58:57 +02:00
Ronald S. Bultje	9ee9c679a7	x86: videodsp: Fix a bug in a %if statement where we used '%%' instead of '&&'. Signed-off-by: Janne Grunau <janne-libav@jannau.net>	2014-01-30 15:33:23 +01:00
Ronald S. Bultje	51daafb02e	x86: videodsp: Properly mark sse2 instructions in emulated_edge_mc as such. Should fix crashes or corrupt output on pre-SSE2 CPUs when they were using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in hfix or hvar single-edge (left/right) extension functions. Signed-off-by: Janne Grunau <janne-libav@jannau.net>	2014-01-30 15:30:01 +01:00
Ronald S. Bultje	458446acfa	lavc: Edge emulation with dst/src linesize Allow supporting files for which the image stride is smaller than the maximum block size + number of subpel mc taps, e.g. a 64x64 VP9 file or a 16x16 VP8 file with -fflags +emu_edge.	2013-11-15 10:16:27 +01:00
Ronald S. Bultje	960490c0b2	avcodec/x86/videodsp: Small speedups in ff_emulated_edge_mc x86 SIMD. Don't use word-size multiplications if size == 2, and if we're using SIMD instructions (size >= 8), complete leftover 4byte sets using movd, not mov. Both of these changes lead to minor speedups. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-27 15:02:48 +01:00
Ronald S. Bultje	cd86eb265f	avcodec/x86/videodsp: fix a bug in a %if statement where we used '%%' instead of '&&'. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-27 15:02:48 +01:00
Ronald S. Bultje	1b3a7e1f42	avcodec/x86/videodsp: Properly mark sse2 instructions in emulated_edge_mc x86 simd as such. Should fix crashes or corrupt output on pre-SSE2 CPUs when they were using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in hfix or hvar single-edge (left/right) extension functions. Tested-by: Ingo Brückl <ib@wupperonline.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-24 13:36:55 +02:00
Ronald S. Bultje	20d78a8606	libavcodec/x86: Fix emulated_edge_mc SSE code to not contain SSE2 instructions on x86-32. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-10 13:36:06 +02:00
Ronald S. Bultje	face578d56	Rewrite emu_edge functions to have separate src/dst_stride arguments. This allows supporting files for which the image stride is smaller than the max. block size + number of subpel mc taps, e.g. a 64x64 VP9 file or a 16x16 VP8 file with -fflags +emu_edge.	2013-09-28 20:28:08 -04:00
Michael Niedermayer	63a97d5674	Merge commit '`b6649ab503`' * commit '`b6649ab503`': cosmetics: Remove unnecessary extern keywords from function declarations Conflicts: libswscale/x86/swscale.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-28 11:20:41 +01:00
Diego Biurrun	b6649ab503	cosmetics: Remove unnecessary extern keywords from function declarations	2013-03-27 14:21:45 +01:00
Michael Niedermayer	e16bac7b33	videodsp: Fix project name These are all part of splited out dsp utils from FFmpeg Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-22 00:58:08 +01:00
Ronald S. Bultje	8c53d39e7f	lavc: introduce VideoDSPContext Move some functions from dsputil. The idea is that videodsp contains functions that are useful for a large and varied set of video decoders. Currently, it contains emulated_edge_mc() and prefetch(). Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2012-12-20 13:40:45 +01:00

22 commits