ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-02-09 19:30:19 +00:00

Author	SHA1	Message	Date
Andreas Rheinhardt	947d51f32a	avcodec/x86/hpeldsp_vp3: Merge into hpeldsp Once upon a time, `413abbe164` added versions of some put_no_rnd_pixels functions for use in VP3 and Theora (with an explicit check so that they are only used for VP3 and Theora). When this was moved to hpeldsp (from dsputil) in `3ced55d51c`, the check was replaced by a check for the bitexact flag (and a CONFIG_VP3_DECODER compile-time check), so that these functions were now used for other codecs as well. Later commit `1dfc3cf89d` split off the "VP3-specific bits into a separate file", yet these bits were not really VP3-specific bits at all any more. (The error was repeated in commit `0a39c9ac0b`.) This commit has not been reverted, because this would make future changes from Libav (from where it originated) harder, yet Libav is no more, so this commit effectively reverts `1dfc3cf89d`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-09-07 00:24:39 +02:00
Lynne	bbe95f7353	x86: replace explicit REP_RETs with RETs From x86inc: > On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either > a branch or a branch target. So switch to a 2-byte form of ret in that case. > We can automatically detect "follows a branch", but not a branch target. > (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.) x86inc can automatically determine whether to use REP_RET rather than RET in most of these cases, so impact is minimal. Additionally, a few REP_RETs were used unnecessarily, despite the return being nowhere near a branch. The only CPUs affected were AMD K10s, made between 2007 and 2011, 16 years ago and 12 years ago, respectively. In the future, everyone involved with x86inc should consider dropping REP_RETs altogether.	2023-02-01 04:23:55 +01:00
Andreas Rheinhardt	a51279bbde	avcodec/x86/hpeldsp: Remove obsolete MMX/3dnow functions x64 always has MMX, MMXEXT, SSE and SSE2 and this means that some functions for MMX, MMXEXT and 3dnow are always overridden by other functions (unless one e.g. explicitly disables SSE2) for x64. So given that the only systems that benefit from these functions are truly ancient 32-bit x86s they are removed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-06-22 13:34:58 +02:00
James Almer	ca8a3978e5	Merge commit '`1dfc3cf89d`' * commit '`1dfc3cf89d`': x86: hpeldsp: Split off VP3-specific bits into a separate file Merged-by: James Almer <jamrial@gmail.com>	2017-01-31 14:49:29 -03:00
Diego Biurrun	1dfc3cf89d	x86: hpeldsp: Split off VP3-specific bits into a separate file	2016-07-20 18:33:25 +02:00
Henrik Gramner	ab43beefab	x86inc: Drop SECTION_TEXT macro The .text section is already 16-byte aligned by default on all supported platforms so `SECTION_TEXT` isn't any different from `SECTION .text`. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2015-08-11 11:12:01 +02:00
Henrik Gramner	f0b7882ceb	x86inc: Drop SECTION_TEXT macro The .text section is already 16-byte aligned by default on all supported platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.	2015-08-04 20:13:09 +02:00
Christophe Gisquet	4e128ab0b1	x86: vpx/h264/hevc/mpeg2: share constants Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-08-06 18:36:31 +02:00
Christophe Gisquet	2267003981	x86: hpeldsp: better factorization Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-29 21:47:40 +02:00
Christophe Gisquet	81aa0f4604	x86: hpeldsp: implement SSSE3 version of _xy2 Loading pb_1 rather than pw_8192 was benchmarked to be more efficient. Loading of the 2 yields no advantage. Loading of one saves ~11 cycles. decicycles count: put8: 3223(mmx) -> 2387 avg8: 2863(mmxext) -> 2125 put16: 4356(sse2) -> 3553 avg16: 4481(sse2) -> 3513 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 15:15:56 +02:00
Christophe Gisquet	9722a6a3f3	x86: hpeldsp: implement SSE2 put_pixels16_xy2 This is obviously equivalent to the avg version, without the avg. 3223(mmx) -> 2006(sse2) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 03:45:17 +02:00
Christophe Gisquet	f0aca50e0b	x86: hpeldsp: implement SSE2 versions Those are mostly used in codecs older than H.264, e.g. MPEG-2. put16 versions (mmx / mmx2 / sse2): x2: 1888 / 1185 / 552; y2: 1778 / 1092 / 510. avg16 xy2: 3509(mmx2) -> 2169(sse2) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 03:29:48 +02:00
Christophe Gisquet	c081ca851c	x86: hpeldsp: avg_pixels_xy2 for mmx2&3dnow This is a port of the inline assembly of the mmx version to use the pavg(us|)b instruction. Timings (columns: 8, 16): mmx: 1498, 4355; mmx2: 1242, 3509. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-22 20:17:49 +02:00
Christophe Gisquet	17ac998055	x86: hpeldsp: mark _xy2 versions as approximate Currently, only the mmx version is bitexact, the others (mmxext and 3dnow) are not, in spite of their naming. Therefore, make their name more obvious. Also restore a comment that was removed in `71155d7b`. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-22 20:17:45 +02:00
Christophe Gisquet	f8de35ebc4	x86: hpeldsp: kill hpeldsp_mmx.c before: 1987 decicycles in 8_x2, 262121 runs, 23 skips after: 1902 decicycles in 8_x2, 262112 runs, 32 skips Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-22 20:17:40 +02:00
Michael Niedermayer	4104eb44e6	Merge commit '`55519926ef`' * commit '`55519926ef`': x86: Make function prototype comments in assembly code consistent Conflicts: libavcodec/x86/sbrdsp.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-03-14 00:01:30 +01:00
Michael Niedermayer	1c788eaca9	Merge commit '`831a118078`' * commit '`831a118078`': Update dsputil- and SIMD-related comments to match reality more closely Conflicts: libavcodec/x86/hpeldsp.asm libavutil/arm/float_dsp_init_arm.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-03-13 23:59:56 +01:00
Diego Biurrun	55519926ef	x86: Make function prototype comments in assembly code consistent This helps grepping for functions, among other things.	2014-03-13 05:50:29 -07:00
Diego Biurrun	831a118078	Update dsputil- and SIMD-related comments to match reality more closely	2014-03-13 05:50:29 -07:00
Mikulas Patocka	694d997afe	x86: hpeldsp: Use PAVGB instruction macro where necessary Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-11-04 01:29:23 +01:00
Mikulas Patocka	074155360d	avcodec/x86/hpeldsp: fix crash on AMD K6-3+ There are instructions pavgb and pavgusb. Both instructions do the same operation but they have different encoding. Pavgb exists in SSE (or MMXEXT) instruction set and pavgusb exists in 3D-NOW instruction set. libavcodec uses the macro PAVGB to select the proper instruction. However, the function avg_pixels8_xy2 doesn't use this macro, it uses pavgb directly. As a consequence, the function avg_pixels8_xy2 crashes on AMD K6-2 and K6-3 processors, because they have pavgusb, but not pavgb. This bug seems to be introduced by commit `71155d7b41`, "dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm" Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-03 19:49:11 +01:00
Ronald S. Bultje	610b18e2e3	x86: qpel: Move fullpel and l2 functions to a separate file This way, they can be shared between mpeg4qpel and h264qpel without requiring either one to be compiled unconditionally. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-08 12:38:33 +03:00
Ronald S. Bultje	22cc8a103c	x86/qpel: move fullpel and l2 functions to separate file. This way, they can be shared between mpeg4qpel and h264qpel without requiring either one to be compiled unconditionally. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-09 17:25:30 +01:00
Michael Niedermayer	ede45c4e1d	Merge commit '`25841dfe80`' * commit '`25841dfe80`': Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter. Conflicts: libavcodec/alpha/dsputil_alpha.c libavcodec/dsputil_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-06 12:18:25 +01:00
Diego Biurrun	25841dfe80	Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter. This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic.	2013-02-05 12:59:12 +01:00
Michael Niedermayer	cb573f7fbc	avcodec/x86: Add Daniel's copyright to the recent gcc->yasm conversions he did. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-03 13:50:44 +01:00
Michael Niedermayer	dd87d4a318	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: hpel: Move {avg,put}_pixels16_sse2 to hpeldsp configure: Add a comment indicating why uclibc is checked before glibc Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-31 20:03:36 +01:00
Diego Biurrun	52acd79165	x86: hpel: Move {avg,put}_pixels16_sse2 to hpeldsp	2013-01-31 11:19:23 +01:00
Michael Niedermayer	bb2f4ae434	Merge commit '`05b0998f51`' * commit '`05b0998f51`': dsputil: Fix error by not using redzone and register name swscale: GBRP output support Conflicts: libswscale/output.c libswscale/swscale.c libswscale/swscale_internal.h libswscale/utils.c tests/ref/lavfi/pixdesc tests/ref/lavfi/pixfmts_copy tests/ref/lavfi/pixfmts_null tests/ref/lavfi/pixfmts_scale tests/ref/lavfi/pixfmts_vflip Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-28 14:11:31 +01:00
Michael Niedermayer	834e9fb056	x86: hpeldsp: Fix a typo, use the right register This makes the code actually work. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-01-28 12:49:37 +02:00
Daniel Kang	05b0998f51	dsputil: Fix error by not using redzone and register name Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-28 07:23:20 +01:00
Michael Niedermayer	edde562130	AVG_PIXELS8_XY2: fix typo, make code actually work Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-27 15:50:26 +01:00
Michael Niedermayer	aa3f449955	x86/hpeldsp: Fix author attribution This also fixes the project name Original authors fabrice and nick go back to the initial ffmpeg commit Others for example contributed in: (for a complete list please use git blame / show / log) commit `e9c0a38ff0` Author: Zdenek Kabelac <kabi@informatics.muni.cz> Date: Tue May 28 16:35:58 2002 +0000 * optimized avg_* functions (except xy2) * minor speedup for put_pixels_x2 & cleanup Originally committed as revision 619 to svn://svn.ffmpeg.org/ffmpeg/trunk commit `607dce96c0` Author: Michael Niedermayer <michaelni@gmx.at> Date: Fri May 17 01:04:14 2002 +0000 hopefully faster mmx2&3dnow MC Originally committed as revision 506 to svn://svn.ffmpeg.org/ffmpeg/trunk Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-27 14:47:58 +01:00
Daniel Kang	71155d7b41	dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-27 06:45:31 +01:00

34 commits
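
Several messages above turn on rounding: the "no_rnd" variants, the bitexact flag, and the remark that only the mmx `_xy2` version is bitexact while the pavgb-based ones are approximate. The sketch below illustrates the arithmetic behind that remark. It is written in Python for brevity and is not taken from the FFmpeg sources. The function names are hypothetical, and the nested-average form is only an assumed stand-in for an approximate implementation, since the log does not show the actual instruction sequences.

```python
# Illustrative sketch only; names are hypothetical and not FFmpeg APIs.

def avg2_rnd(a, b):
    """Rounded average of two bytes, the operation pavgb/pavgusb perform."""
    return (a + b + 1) >> 1

def avg2_no_rnd(a, b):
    """Truncating average of two bytes (the 'no_rnd' flavour)."""
    return (a + b) >> 1

def avg4_exact(a, b, c, d):
    """Four-sample average with a single rounding step (bit-exact xy2 form)."""
    return (a + b + c + d + 2) >> 2

def avg4_nested(a, b, c, d):
    """Assumed approximation: two levels of rounded byte averages.
    Rounding twice can overshoot the single-step result by one."""
    return avg2_rnd(avg2_rnd(a, b), avg2_rnd(c, d))

def count_mismatches(step=17):
    """Count sampled inputs where the nested form differs from the exact one."""
    values = range(0, 256, step)
    return sum(
        1
        for a in values for b in values for c in values for d in values
        if avg4_exact(a, b, c, d) != avg4_nested(a, b, c, d)
    )

if __name__ == "__main__":
    print(avg4_exact(0, 0, 0, 1), avg4_nested(0, 0, 0, 1))
    print(count_mismatches())
```

For the input (0, 0, 0, 1) the exact form yields 0 while the nested form yields 1, which is the kind of one-off difference that matters when bit-exact output is requested.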