Commit graph

24 commits

Author SHA1 Message Date
yuanhecai
a87a52ed0b
avcodec/hevc: Add ff_hevc_idct_32x32_lasx asm opt
tests/checkasm/checkasm:

                          C          LSX       LASX
hevc_idct_32x32_8_c:      1243.0     211.7     101.7

Speedup of decoding H265 4K 30FPS 30Mbps on
3A6000 with 8 threads is 1fps(56fps-->57fps).

Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
jinbo
a28eea2a27
avcodec/hevc: Add pel_uni_w_pixels4/6/8/12/16/24/32/48/64 asm opt
tests/checkasm/checkasm:           C       LSX     LASX
put_hevc_pel_uni_w_pixels4_8_c:    2.7     1.0
put_hevc_pel_uni_w_pixels6_8_c:    6.2     2.0     1.5
put_hevc_pel_uni_w_pixels8_8_c:    10.7    2.5     1.7
put_hevc_pel_uni_w_pixels12_8_c:   23.0    5.5     5.0
put_hevc_pel_uni_w_pixels16_8_c:   41.0    8.2     5.0
put_hevc_pel_uni_w_pixels24_8_c:   91.0    19.7    13.2
put_hevc_pel_uni_w_pixels32_8_c:   161.7   32.5    16.2
put_hevc_pel_uni_w_pixels48_8_c:   354.5   73.7    43.0
put_hevc_pel_uni_w_pixels64_8_c:   641.5   130.0   64.2

Speedup of decoding H265 4K 30FPS 30Mbps on 3A6000 with
8 threads is 1fps(47fps-->48fps).

Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
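
Note on the checkasm table in a28eea2a27: the per-function gain over the C version is the C timing divided by the LSX or LASX timing. The following is a minimal Python sketch of that arithmetic and is not part of the commit. The names `TIMINGS` and `speedups` are hypothetical, the numbers are copied from the table above, and the missing LASX entry for pixels4 is recorded as `None`.

```python
# checkasm timings copied from the commit message of a28eea2a27
# (lower is faster). Tuple order: (C, LSX, LASX).
# pixels4 has no LASX column in the table, so it is stored as None.
TIMINGS = {
    "put_hevc_pel_uni_w_pixels4_8_c":  (2.7,   1.0,   None),
    "put_hevc_pel_uni_w_pixels6_8_c":  (6.2,   2.0,   1.5),
    "put_hevc_pel_uni_w_pixels8_8_c":  (10.7,  2.5,   1.7),
    "put_hevc_pel_uni_w_pixels12_8_c": (23.0,  5.5,   5.0),
    "put_hevc_pel_uni_w_pixels16_8_c": (41.0,  8.2,   5.0),
    "put_hevc_pel_uni_w_pixels24_8_c": (91.0,  19.7,  13.2),
    "put_hevc_pel_uni_w_pixels32_8_c": (161.7, 32.5,  16.2),
    "put_hevc_pel_uni_w_pixels48_8_c": (354.5, 73.7,  43.0),
    "put_hevc_pel_uni_w_pixels64_8_c": (641.5, 130.0, 64.2),
}


def speedups(timings):
    """Return {name: (C/LSX, C/LASX or None)}, rounded to one decimal."""
    result = {}
    for name, (c, lsx, lasx) in timings.items():
        lsx_gain = round(c / lsx, 1)
        lasx_gain = round(c / lasx, 1) if lasx is not None else None
        result[name] = (lsx_gain, lasx_gain)
    return result


if __name__ == "__main__":
    for name, (lsx_gain, lasx_gain) in speedups(TIMINGS).items():
        lasx_text = f"{lasx_gain}x" if lasx_gain is not None else "n/a"
        print(f"{name}: LSX {lsx_gain}x, LASX {lasx_text}")
```

These are micro-benchmark ratios for single functions, so they are much larger than the 1fps whole-decoder gain quoted in the same commit message.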
jinbo
cfbdda607d
avcodec/hevc: Add add_residual_4/8/16/32 asm opt
After this patch, the performance of decoding H265 4K 30FPS 30Mbps
on 3A6000 with 8 threads improves by 2fps (45fps-->47fps).

Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-01-12 23:35:40 +01:00
yuanhecai
f6077cc666
avcodec/la: Add LSX optimization for h264 qpel.
./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 214fps
after:  274fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-25 21:05:01 +02:00
Lu Wang
8815a7719e
avcodec/la: Add LSX optimization for h264 chroma and intrapred.
./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 199fps
after:  214fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-25 21:04:56 +02:00
Hao Chen
7845b5ecd6
avcodec/la: Add LSX optimization for loop filter.
Replaced functions (LSX is sufficient for these functions):
ff_h264_v_lpf_chroma_8_lasx
ff_h264_h_lpf_chroma_8_lasx
ff_h264_v_lpf_chroma_intra_8_lasx
ff_h264_h_lpf_chroma_intra_8_lasx
ff_weight_h264_pixels4_8_lasx
ff_biweight_h264_pixels4_8_lasx

./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 161fps
after:  199fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-25 21:04:43 +02:00
Shiyou Yin
e1b6ecd20a
avcodec/la: add LSX optimization for h264 idct.
loongson_asm.S is a LoongArch asm optimization helper.
Add functions:
  ff_h264_idct_add_8_lsx
  ff_h264_idct8_add_8_lsx
  ff_h264_idct_dc_add_8_lsx
  ff_h264_idct8_dc_add_8_lsx
  ff_h264_idct_add16_8_lsx
  ff_h264_idct8_add4_8_lsx
  ff_h264_idct_add8_8_lsx
  ff_h264_idct_add8_422_8_lsx
  ff_h264_idct_add16_intra_8_lsx
  ff_h264_luma_dc_dequant_idct_8_lsx
Replaced functions (LSX is sufficient for these functions):
  ff_h264_idct_add_lasx
  ff_h264_idct4x4_addblk_dc_lasx
  ff_h264_idct_add16_lasx
  ff_h264_idct8_add4_lasx
  ff_h264_idct_add8_lasx
  ff_h264_idct_add8_422_lasx
  ff_h264_idct_add16_intra_lasx
  ff_h264_deq_idct_luma_dc_lasx
Renamed functions:
  ff_h264_idct8_addblk_lasx ==> ff_h264_idct8_add_8_lasx
  ff_h264_idct8_dc_addblk_lasx ==> ff_h264_idct8_dc_add_8_lasx

./configure --disable-lasx
ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before: 155fps
after:  161fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-05-25 21:04:25 +02:00
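
Note on the four LSX H.264 commits dated 2023-05-25 (e1b6ecd20a, 7845b5ecd6, 8815a7719e, f6077cc666): each "before" figure equals the "after" figure of the preceding commit, so the four measurements form one chain. The sketch below is a minimal Python illustration and is not part of any commit. `STEPS`, `chain_is_consistent` and `cumulative_gain` are hypothetical names, and the fps values are copied from the messages above.

```python
# (commit, fps before, fps after), oldest first, copied from the
# commit messages. All were measured with ./configure --disable-lasx
# on 1_h264_1080p_30fps_3Mbps.mp4.
STEPS = [
    ("e1b6ecd20a", 155, 161),  # h264 idct
    ("7845b5ecd6", 161, 199),  # loop filter
    ("8815a7719e", 199, 214),  # chroma and intrapred
    ("f6077cc666", 214, 274),  # qpel
]


def chain_is_consistent(steps):
    """True if every step starts at the fps where the previous one ended."""
    return all(prev[2] == cur[1] for prev, cur in zip(steps, steps[1:]))


def cumulative_gain(steps):
    """Ratio of the final 'after' fps to the first 'before' fps."""
    return steps[-1][2] / steps[0][1]


if __name__ == "__main__":
    print("consistent chain:", chain_is_consistent(STEPS))
    print(f"cumulative gain: {cumulative_gain(STEPS):.2f}x")
```

Under these figures the series as a whole takes the LSX-only build from 155fps to 274fps.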
Lu Wang
72604b10f4 avcodec: [loongarch] Optimize Hevc_mc_uni/w with LSX.
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before: 182fps
after : 191fps

Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-03-01 23:53:40 +01:00
Hao Chen
a70a5b7c62 avcodec: [loongarch] Optimize Hevc_mc_bi with LSX.
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before: 124fps
after : 182fps

Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-03-01 23:53:40 +01:00
Lu Wang
b6ceeee16b avcodec: [loongarch] Optimize Hevc_idct/lpf with LSX.
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before:  110fps
after : 124fps

Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-03-01 23:53:40 +01:00
Lu Wang
20194d573d avcodec: [loongarch] Optimize Hevcdsp with LSX.
ffmpeg -i 5_h265_1080p_60fps_3Mbps.mkv -f rawvideo -y /dev/null -an
before:  94fps
after : 110fps

Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-03-01 23:53:40 +01:00
gxw
8ca7d474c1 avcodec: [loongarch] Optimize prefetch with loongarch.
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:296fps
after :308fps

Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-01-04 15:55:05 +01:00
Hao Chen
555b850bd5 avcodec: [loongarch] Optimize idctdsp with LASX.
./ffmpeg -i 8_mpeg4_1080p_24fps_12Mbps.avi -f rawvideo -y /dev/null -an
before:433fps
after :552fps

Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-01-04 15:55:05 +01:00
Shiyou Yin
5d58355bf1 avcodec: [loongarch] Optimize hpeldsp with LASX.
./ffmpeg -i 8_mpeg4_1080p_24fps_12Mbps.avi -f rawvideo -y /dev/null -an
before:376fps
after :433fps

Reviewed-by: 殷时友 <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2022-01-04 15:55:05 +01:00
Hao Chen
60ead5cd68 avcodec: [loongarch] Optimize vc1dsp with LASX.
./ffmpeg -i 11_wmv3_720p_24fps_7Mbps.wmv -f rawvideo -y /dev/null -an
before:131fps
after :229fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-23 12:28:54 +01:00
Jin Bo
fea299f876 avcodec: [loongarch] Optimize vp9_lpf/idct with LSX.
ffmpeg -i ../10_vp9_1080p_30fps_3Mbps.webm -f rawvideo -y /dev/null -an
before:294fps
after :567fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-23 12:28:54 +01:00
Hao Chen
2fd914e079 avcodec: [loongarch] Optimize vp9_mc/intra with LSX.
ffmpeg -i ../10_vp9_1080p_30fps_3Mbps.webm -f rawvideo -y /dev/null -an
before:170fps
after :294fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-23 12:28:54 +01:00
yuanhecai
72bcbe216e avcodec: [loongarch] Optimize vp8_lpf/mc with LSX.
./ffmpeg -i ../9_vp8_1080p_30fps_2Mbps.webm -f rawvideo -y /dev/null -an
before: 210fps
after : 585fps

Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-23 12:28:54 +01:00
Hao Chen
df46d7cb49 avcodec: [loongarch] Optimize pred16x16_plane with LASX.
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:295fps
after :296fps

Change-Id: I281bc739f708d45f91fc3860150944c0b8a6a5ba
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-15 18:37:40 +01:00
Jin Bo
1ccc458960 avcodec: [loongarch] Optimize h264_deblock with LASX.
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:293fps
after :295fps

Change-Id: I5ff6cba4eaca0c4218c0c97b880ca500e35f9c87
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-15 18:37:40 +01:00
Lu Wang
5ff58b77bb avcodec: [loongarch] Optimize h264idct with LASX.
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:282fps
after :293fps

Change-Id: Ia8889935a6359630dd5dbb61263287f1cb24a0a4
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-15 18:37:40 +01:00
gxw
3f294ec879 avcodec: [loongarch] Optimize h264dsp with LASX.
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:225fps
after :282fps

Change-Id: Ibe245827dcdfe8fc1541c6b172483151bfa9e642
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-15 18:37:40 +01:00
Shiyou Yin
cba7c0267d avcodec: [loongarch] Optimize h264qpel with LASX.
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:183fps
after :225fps

Change-Id: I7c7d2f34cd82ef728aab5ce8f6bfb46dd81f0da4
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-15 18:37:40 +01:00
Shiyou Yin
6038a9eb92 avcodec: [loongarch] Optimize h264_chroma_mc with LASX.
./ffmpeg -i ../1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -y /dev/null -an
before:170fps
after :183fps

Change-Id: I42ff23cc2dc7c32bd1b7e4274da9d9ec87065f20
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-12-15 18:37:40 +01:00