17 commits

Author SHA1 Message Date
Lynne
bbe95f7353 x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
RET in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessarily, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
James Darnley
6af453ca38 avcodec/x86: add avx512icl function for v210dec
Ice Lake (Xeon Silver 4316): 2.01x faster (1147±36.8 vs. 571±38.2 decicycles) compared with avx2
2022-12-20 15:02:45 +01:00
James Darnley
f30b4c2f47 avcodec/x86/v210: add some comments to the improved avx2 function
2022-12-20 15:02:45 +01:00
James Almer
b41d8ab2e6 x86/v210dec: use named registers
Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03 01:20:18 -03:00
James Almer
abf1aa87ab x86/v210dec: don't reserve more xmm regs than needed
Prevents pointless register saving on win64 for the sse3 and avx
versions of the function.

Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03 01:09:50 -03:00
James Almer
b0e29357ba x86/v210dec: remove duplicate load instruction
Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03 01:08:34 -03:00
James Darnley
46f1718cd9 avcodec/x86/v210: fix operands of vpblendd used in new avx2 code
Assembly failed when using yasm rather than nasm.
2019-05-02 21:20:54 +02:00
Michael Stoner
ebd6fb23c5 libavcodec Adding ff_v210_planar_unpack AVX2
Replaced VSHUFPS with VPBLENDD to relieve port 5 bottleneck
AVX2 is 1.4x faster than AVX
2019-05-02 19:21:37 +02:00
James Almer
844bef578e avcodec/x86: add missing colon to labels
Silences warnings with Nasm

Signed-off-by: James Almer <jamrial@gmail.com>
2015-07-26 02:50:14 -03:00
James Darnley
46ef45ab59 lavc/x86/v210: give cpuflag to INIT macro
This lets the cglobal macro automatically append a suffix to the function name.
This means that INIT_XMM avx must be used rather than INIT_AVX.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-09-05 00:35:07 +02:00
Thilo Borgmann
d814a839ac Reinstate proper FFmpeg license for all files.
2013-08-30 15:47:38 +00:00
Michael Niedermayer
3174616f59 Merge commit '6860b4081d'
* commit '6860b4081d':
  x86: include x86inc.asm in x86util.asm
  cng: Reindent some incorrectly indented lines
  cngdec: Allow flushing the decoder
  cngdec: Make the dbov variable have the right unit
  cngdec: Fix the memset size to cover the full array
  cngdec: Update the LPC coefficients after averaging the reflection coefficients
  configure: fix print_config() with broke awks

Conflicts:
	libavcodec/x86/ac3dsp.asm
	libavcodec/x86/dct32.asm
	libavcodec/x86/deinterlace.asm
	libavcodec/x86/dsputil.asm
	libavcodec/x86/dsputilenc.asm
	libavcodec/x86/fft.asm
	libavcodec/x86/fmtconvert.asm
	libavcodec/x86/h264_chromamc.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_deblock_10bit.asm
	libavcodec/x86/h264_idct.asm
	libavcodec/x86/h264_idct_10bit.asm
	libavcodec/x86/h264_intrapred.asm
	libavcodec/x86/h264_intrapred_10bit.asm
	libavcodec/x86/h264_weight.asm
	libavcodec/x86/vc1dsp.asm
	libavcodec/x86/vp3dsp.asm
	libavcodec/x86/vp56dsp.asm
	libavcodec/x86/vp8dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 13:43:33 +01:00
Michael Niedermayer
47277c4153 x86/v210: fix xmm clobbers
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-09-15 16:26:00 +02:00
Carl Eugen Hoyos
52be5428c0 Add some missing _EXTERNAL suffixes to yasm source files.
2012-08-31 15:39:03 +02:00
Reimar Döffinger
f51a072160 Fix compilation without HAVE_AVX.
%ifdef HAVE_AVX must now be %if HAVE_AVX.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-02-12 21:42:31 +01:00
Carl Eugen Hoyos
ef3a19d595 Fix compilation with yasm-0.6.2
2012-01-12 16:35:49 +01:00
Kieran Kunhya
44d27736fc Add V210 SIMD
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-10-19 20:26:55 +02:00