Commit graph

7 commits

Author SHA1 Message Date
James Almer
0922c6b01b x86/lpc: use fused negative multiply-add instructions where useful
Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-22 18:17:26 -03:00
James Almer
0627e6d74c avcodec/lpc: zero the middle odd sample in the output
Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-22 18:17:26 -03:00
James Almer
c8c4a162fc avcodec/lpc: use ptrdiff_t for length parameters
Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-22 18:17:26 -03:00
Martin Storsjö
c9aa6164d4 x86/lpc: Fix parameter sign extension, unbreaking checkasm-lpc on x86_64 windows
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-09-22 11:13:33 +03:00
Lynne
b67776e12f
x86/lpc: fix even scalar loop overreads/writes
Passes checkasm with valgrind, tested to sizes of more than 4000 samples.
2022-09-22 04:27:19 +02:00
Lynne
dea944b838
x86/lpc: fix odd scalar loop overreads/writes 2022-09-22 03:07:41 +02:00
Lynne
3ade6a8644
x86/lpc: implement a new Welch windowing function
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.

Also checkasm is provided to make sure this monstrosity works.

This fixes some FATE tests.
2022-09-21 07:12:39 +02:00