The earlier code only used the counts from the last slice.
The two FATE tests using slices show compression improvements
due to this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary, as we already have the entries sorted by
probability and therefore implicitly by length. All we need
on top of that to build the tree is the number of entries
of a given length.
Doing so gives a 3.6% speedup of ff_mjpeg_encode_huffman_close()
here; it also saves about 640B of .text here.
The new code puts values with higher probability to the left
of the tree. The old code did not and therefore
the FATE checksums needed to be updated. Due to MJPEG's
0xFF unescaping file sizes as well as file checksums
needed to be updated; the decoded picture hashes stayed
the same. Given that codes on the left of the tree have
on average fewer bits set than codes on the right, the
file sizes mostly improve (all except vsynth3-mjpeg-444).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A sample with a particular partitioning structure that could not be read
correctly before 26c5d8cf5d
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This allows catching whether the functions write outside of
the designated rectangle, and if run with "checkasm -v", it also
prints out on which side of the rectangle the overwrite was.
Signed-off-by: Martin Storsjö <martin@martin.st>
this is a good compromise between speed and compression
-rw-r----- 1 michael michael 180987 Feb 6 14:29 lena-def.png
-rw-r----- 1 michael michael 128430 Feb 6 14:36 lena-pavg.png
-rw-r----- 1 michael michael 126269 Feb 6 14:36 lena-pmixed.png
-rw-r----- 1 michael michael 180987 Feb 6 14:35 lena-pnone.png
-rw-r----- 1 michael michael 127758 Feb 6 14:35 lena-ppaeth.png
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since 45eeb1f785
optimal Huffman tables are the default (without slice-threading).
This made the fate-vsynth*-mjpeg-{trell-,}-huffman tests
identical to their corresponding tests without "-huffman".
This is of course wasteful, so switch the two tests with
"-huffman" counterparts back to the default tables.
Also use one of these tests to test slice threaded encoding.
It has so far been untested.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This backports similar functionality from dav1d, from commits
35d1d011fda4a92bcaf42d30ed137583b27d7f6d and
d130da9c315d5a1d3968d278bbee2238ad9051e7.
This allows detecting writes out of bounds, on all 4 sides of
the intended destination rectangle.
The bounds checking also can optionally allow small overwrites
(up to a specified alignment), while still checking for larger
overwrites past the intended allowed region.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes it easier to implement custom error printouts in tests.
This is a port of dav1d's commit
13a7d78655f8747c2cd01e8a48d44dcc7f60a8e5 into ffmpeg's checkasm.
Signed-off-by: Martin Storsjö <martin@martin.st>
This was changed 8 years ago with the introduction of the linux-perf path,
with seemingly no justification at the time. Likely a developer oversight
from testing.
This bug not only made --runs completely ineffective, but also meant that we
didn't actually correctly filter out outliers.
Fixes: e0d56f097f
It is currently not due to endianness. This forced to add
workarounds with sed in fate/mxf.mak (which are removed
in this commit).
This is supposed to fix the enhanced-flv-hevc-hdr10 test
on big endian systems.
Reviewed-by: Zhao Zhili <quinkblack-at-foxmail.com@ffmpeg.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
With this modification the test would have caught the regression
introduced in 72bf3d3c12.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Many of the fields of MpegEncContext (which is also used by decoders)
are actually only used by encoders. Therefore this commit adds
a new encoder-only structure and moves all of the encoder-only
fields to it except for those which require more explicit
synchronisation between the main slice context and the other
slice contexts. This synchronisation is currently mainly provided
by ff_update_thread_context() which simply copies most of
the main slice context over the other slice contexts. Fields
which are moved to the new MPVEncContext no longer participate
in this (which is desired, because it is horrible and for the
fields b) below wasteful) which means that some fields can only
be moved when explicit synchronisation code is added in later commits.
More explicitly, this commit moves the following fields:
a) Fields not copied by ff_update_duplicate_context():
dct_error_sum and dct_count; the former does not need synchronisation,
the latter is synchronised in merge_context_after_encode().
b) Fields which do not change after initialisation (these fields
could also be put into MPVMainEncContext at the cost of
an indirection to access them): lambda_table, adaptive_quant,
{luma,chroma}_elim_threshold, new_pic, fdsp, mpvencdsp, pdsp,
{p,b_forw,b_back,b_bidir_forw,b_bidir_back,b_direct,b_field}_mv_table,
[pb]_field_select_table, mb_{type,var,mean}, mc_mb_var, {min,max}_qcoeff,
{inter,intra}_quant_bias, ac_esc_length, the *_vlc_length fields,
the q_{intra,inter,chroma_intra}_matrix{,16}, dct_offset, mb_info,
mjpeg_ctx, rtp_mode, rtp_payload_size, encode_mb, all function
pointers, mpv_flags, quantizer_noise_shaping,
frame_reconstruction_bitfield, error_rate and intra_penalty.
c) Fields which are already (re)set explicitly: The PutBitContexts
pb, tex_pb, pb2; dquant, skipdct, encoding_error, the statistics
fields {mv,i_tex,p_tex,misc,last}_bits and i_count; last_mv_dir,
esc_pos (reset when writing the header).
d) Fields which are only used by encoders not supporting slice
threading for which synchronisation doesn't matter: esc3_level_length
and the remaining mb_info fields.
e) coded_score: This field is only really used when FF_MPV_FLAG_CBP_RD
is set (which implies trellis) and even then it is only used for
non-intra blocks. For these blocks dct_quantize_trellis_c() either
sets coded_score[n] or returns a last_non_zero value of -1
in which case coded_score will be reset in encode_mb_internal().
Therefore no old values are ever used.
The MotionEstContext has not been moved yet.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It allows the callee to clobber the MMX state,
yet since 1e3dc705df this is no longer
done. So use the stricter declare_func instead.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit adds a 32-bit *integer* planar RGBA format.
Vulkan FFv1 decoding is best performed on separate planes, rather than
packed RGBA (i.e. RGBA128), hence this is useful as an intermediate format.
Encoding was untested before this.
Notice that the filesize degradation is partially due to
mpegvideo no longer using progressive_sequence and
progressive_frame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When there are multiple candidates for macroblock type, the encoder
tries them all. In order to do so, it keeps several sets of states
containing the variables that get modified when encoding
the macroblock and in the end uses the best of these.
Yet one variable was set, but not included in this state:
The current macroblock's qscale value in the current picture's
qscale_table. This may currently be set multiple times in
mpv_reconstruct_mb(), yet it is read when adaptive_quant is true.
Currently, the value read can be the value set by the last attempt
to write the current macroblock and not the initial value.
Fix this by only setting the qscale_table value in one place
outside of mpv_reconstruct_mb() (where it does not belong at all).
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The Main profile of AAC is... terrible.
It enables the use of delta coding across coefficients of two frames
to try to increase compression, and it enabled one more pole for TNS
filters.
What the AAC authors failed to take into account were basic
mathematics, as MDCT leakage (e.g. the spread of each frequency when
represented in a discrete spectrum) is significant in most audio codecs.
This leads to huge variations between each frame, basically rendering
prediction completely pointless.
In fact, enabling AAC-Main prediction does not, in general, even recoup
the metadata losses from signalling the profile and prediction properties
in the first place. So you lose efficiency by using AAC Main.
The rumor is that it was put in the AAC spec for patent reasons, though
patent-wise, it has about as much use as a patent for a bicycle designed
for use by snakes.
The only other thing AAC Main changes is it permits 3-pole TNS filters.
When AAC's bands are absolutely tiny, except for very high frequency bands,
where you're likely to use PNS instead.
Just get rid of it.
When an Info-tag is present, marking initial and trailing samples as
padding, those samples should not be included in the calculation of track
duration.
This solves a surprising user experience where converting a WAV->MP3->WAV,
ffprobe will show the duration of the mp3 as slightly longer than both the
input and the output.
As a result, the estimated duration and imprecise seek-results of some
FATE-tests have been updated.
Previously, we read elements from ff_aac_pow34sf_tab; however
that table is initialized to zero; one needs to call
ff_aac_float_common_init() to make sure that the table is
initialized.
However, given the range of the input values, a large number of
entries in ff_aac_pow34sf_tab would give results outside of the
range for signed 32 bit integers. As the largest aac_cb_maxval
entry is 16, it seems more reasonable to produce values within
an order of mangitude of that value.
(When hitting INT_MIN, implementations may end up with different
results depending on whether the value is negated as a float or
as an int. This corner case is irrelevant in practice as this
is way outside of the expected value range here.)
Coincidentally, this fixes linking checkasm with Apple's older
linker. (In Xcode 15, Apple switched to a new linker. The one in
older toolchains seems to have a bug where it won't figure out to
load object files from a static library, if the only symbol
referenced in the object file is a "common" symbol, i.e. one for
a zero-initialized variable. This issue can also be reproduced with
newer Apple toolchains by passing -Wl,-ld_classic to the linker.)
Signed-off-by: Martin Storsjö <martin@martin.st>