Commit graph

17 commits

Author SHA1 Message Date
Lynne
bbe95f7353 x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
RET in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessarily, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
Andreas Rheinhardt
4b6ffc2880 avcodec/x86/huffyuvdsp: Remove obsolete MMX functions
The only systems which benefit from these are truly
ancient 32-bit x86s, as all other systems use at least the SSE2 versions
(this includes all x64 CPUs, which is why this code is restricted
to x86-32).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-06-22 13:40:10 +02:00
Martin Vignali
e641c94190 avcodec/huffyuvdsp : add add_int16 AVX2 func
2017-11-21 09:41:58 +01:00
Martin Vignali
6955e8842e avcodec/huffyuvdsp : reorganize add_int16 asm
2017-11-21 09:41:52 +01:00
Martin Vignali
7f9b67bcb6 avcodec/huffyuvdsp(enc) : move duplicate macro to a template file
2017-11-21 09:41:46 +01:00
James Almer
47f212329e huffyuvdsp: move functions only used by huffyuv from lossless_videodsp
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:05 -03:00
James Almer
5ac1dd8e23 lossless_videodsp: move shared functions from huffyuvdsp
Several codecs other than huffyuv use them.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:53:04 -03:00
Henrik Gramner
f0b7882ceb x86inc: Drop SECTION_TEXT macro
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
2015-08-04 20:13:09 +02:00
James Almer
844bef578e avcodec/x86: add missing colon to labels
Silences warnings with Nasm

Signed-off-by: James Almer <jamrial@gmail.com>
2015-07-26 02:50:14 -03:00
Christophe Gisquet
9dc45d1f42 x86: lavc: share more constants
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-06 23:35:02 +01:00
Christophe Gisquet
9107612818 x86util: add and use RSHIFT/LSHIFT macros
Those macros take a byte number as shift argument, as this argument
differs between MMX and SSE2 instructions.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 13:19:27 +02:00
Christophe Gisquet
d136fe6fd7 x86: huffyuvdsp: fewer functions for x86_64
When there are 2 functions that are <= SSE2, only one is needed for x86_64.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 21:39:06 +02:00
Christophe Gisquet
f743fa9c7f x86: huffyuvdsp: add_hfyu_left_pred_bgr32
        C     MMX   SSE2
Cycles: 3092  1053  578

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 15:20:36 +02:00
Christophe Gisquet
884078d2df x86: huffyuvdsp: add SSE2 median prediction
From 5010c to 4566 on lagarith YUY2.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 14:57:57 +02:00
Christophe Gisquet
99a319c4e7 x86: huffyuvdsp: port add_bytes to yasm
        C     MMX  SSE2
Cycles: 2972  587  302

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 21:56:00 +02:00
Michael Niedermayer
e2abc0d5ca Merge commit '0d439fbede'
* commit '0d439fbede':
  dsputil: Split off HuffYUV decoding bits into their own context

Conflicts:
	configure
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/huffyuv.h
	libavcodec/huffyuvdec.c
	libavcodec/lagarith.c
	libavcodec/vble.c
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil.asm
	libavcodec/x86/dsputil_init.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-27 23:16:06 +02:00
Diego Biurrun
0d439fbede dsputil: Split off HuffYUV decoding bits into their own context
Also shorten HuffYUV context member names to avoid clutter.
2014-05-27 08:52:34 -07:00
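
Note on the routines named above: several of these commits port or speed up small HuffYUV DSP helpers (add_bytes, add_hfyu_left_pred_bgr32). As a rough illustration of what such helpers compute, here is a minimal scalar sketch in C. The function names, signatures, and the `left` accumulator layout are hypothetical and chosen for this example only; they are not taken from the FFmpeg sources, and the real SIMD versions process many bytes per instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: add each byte of src into dst.
 * Sums wrap modulo 256, as with packed byte additions. */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, size_t w)
{
    for (size_t i = 0; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Hypothetical scalar reference for left prediction on 4-byte pixels:
 * each output channel is the running (wrapping) sum of the residuals
 * in src for that channel. left[] carries the previous pixel and is
 * updated in place; w is the number of pixels. */
static void add_left_pred_bgr32_ref(uint8_t *dst, const uint8_t *src,
                                    size_t w, uint8_t left[4])
{
    for (size_t i = 0; i < w; i++) {
        for (int c = 0; c < 4; c++) {
            left[c] = (uint8_t)(left[c] + src[4 * i + c]);
            dst[4 * i + c] = left[c];
        }
    }
}
```

Both loops are simple, but the second carries a dependency from one pixel to the next, which is why a vectorized version needs shifts and repeated adds (a prefix sum) rather than one packed addition.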