Instead of implicitly testing for NaN values. This is mostly a straightforward
translation, but we need some slight extra boilerplate to ensure the mask
is correctly updated when e.g. commuting past a swizzle.
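
As a rough illustration of the mask bookkeeping, here is a minimal sketch.
It assumes a 4-bit mask in which bit i refers to component i after the
swizzle, and a swizzle defined as out[i] = in[idx[i]]. The function name
and mask layout are hypothetical and not taken from the swscale code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given a mask expressed in terms of the swizzle's
 * outputs, return the equivalent mask in terms of the swizzle's inputs,
 * so that the masked op can be moved in front of the swizzle. */
static uint8_t remap_mask_through_swizzle(uint8_t mask, const uint8_t idx[4])
{
    uint8_t remapped = 0;
    for (int i = 0; i < 4; i++) {
        assert(idx[i] < 4);
        if (mask & (1u << i))
            remapped |= (uint8_t)(1u << idx[i]); /* out[i] came from in[idx[i]] */
    }
    return remapped;
}
```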
Signed-off-by: Niklas Haas <git@haasn.dev>
A failure while preparing a dither buffer leaves the newly allocated
buffer outside the cleanup range, leaking Vulkan resources. Make the
failure path cover the current buffer as well.
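
The general pattern can be sketched as follows. This is not the actual
Vulkan code: Buf, buf_alloc(), buf_prepare() and the live_bufs counter are
hypothetical stand-ins that only show how the cleanup range is extended to
include the buffer that was just allocated.

```c
#include <stdlib.h>

#define NB_BUFS 4

typedef struct Buf { void *mem; } Buf; /* hypothetical stand-in */

static int live_bufs; /* outstanding allocations, for illustration only */

static int buf_alloc(Buf *b)
{
    b->mem = malloc(16);
    if (!b->mem)
        return -1;
    live_bufs++;
    return 0;
}

static void buf_free(Buf *b)
{
    if (b->mem) {
        free(b->mem);
        b->mem = NULL;
        live_bufs--;
    }
}

/* Simulated preparation step that fails for index fail_at */
static int buf_prepare(Buf *b, int idx, int fail_at)
{
    (void)b;
    return idx == fail_at ? -1 : 0;
}

static int init_bufs(Buf bufs[NB_BUFS], int fail_at)
{
    int i;
    for (i = 0; i < NB_BUFS; i++) {
        if (buf_alloc(&bufs[i]) < 0)
            goto fail; /* nothing allocated at index i */
        if (buf_prepare(&bufs[i], i, fail_at) < 0) {
            i++; /* buffer i is already allocated: include it in the cleanup */
            goto fail;
        }
    }
    return 0;

fail:
    while (i-- > 0)
        buf_free(&bufs[i]);
    return -1;
}
```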
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
swscale gets runtime-defined assembly once again!
This commit splits the Vulkan backend into two, SPIR-V and GLSL, which
makes it possible to fall back to the GLSL implementation if an
instruction is unavailable, or simply for testing.
Sponsored-by: Sovereign Tech Fund
This commit adds a SPIR-V assembler header file. It was partially generated
from the SPIR-V header file JSON definition, then edited by hand to template
and reduce its size as much as possible.
It only implements the essentials required for SPIR-V assembly that swscale
requires.
Sponsored-by: Sovereign Tech Fund
Uniform buffers are much simpler to index, and require no work from
the driver compiler to optimize.
In SPIR-V, large 2D shader constants can be spilled into scratch memory,
since you need to create a function variable to index them during runtime.
Sponsored-by: Sovereign Tech Fund
The issue is that very often, hardware has limited support for BGRA
formats.
As this is a limitation of Vulkan itself, we cannot work around this
in a compatible way.
Sponsored-by: Sovereign Tech Fund
The issue is that with multiplane or packed images, what .elems
contains may not match what we need.
Descriptors are cheap, so just always reserve 4.
Sponsored-by: Sovereign Tech Fund
The issue is that the main Vulkan context is shared between possibly
multiple shaders, and registering a new shader requires allocating
descriptors.
Sponsored-by: Sovereign Tech Fund
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead, since this is already a big union.
I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.
Signed-off-by: Niklas Haas <git@haasn.dev>
Just define these directly as integer arrays; there's really no point in
having them re-use SwsSwizzleOp; the only place this was ever even remotely
relevant was in the no-op check, which any decent compiler should already
be capable of optimizing into a single 32-bit comparison.
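
For illustration, a minimal sketch of such a no-op check. It assumes the
swizzle is stored as four uint8_t indices (32 bits in total); the element
type and the function name are assumptions, not the actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical no-op check: the swizzle is an identity mapping when every
 * output component i reads input component i. A fixed-size 4-byte memcmp
 * like this is typically lowered to a single 32-bit comparison. */
static bool swizzle_is_noop(const uint8_t in[4])
{
    static const uint8_t identity[4] = { 0, 1, 2, 3 };
    return memcmp(in, identity, sizeof(identity)) == 0;
}
```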
Signed-off-by: Niklas Haas <git@haasn.dev>
> packed = load all components from a single plane (the index given by order_src[0])
> planar = load one component each from separate planes (the index given by order_src[i])
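
The two modes can be sketched for 8-bit data as follows. Only order_src
comes from the description above; load_pixel(), its parameters and the
interleaved packed layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical loader for pixel x with nb_comps components.
 * packed: all components are interleaved in plane order_src[0].
 * planar: component i is read from plane order_src[i]. */
static void load_pixel(uint8_t out[4], const uint8_t *const planes[4],
                       const uint8_t order_src[4], int nb_comps,
                       int packed, size_t x)
{
    assert(nb_comps >= 1 && nb_comps <= 4);
    for (int i = 0; i < nb_comps; i++) {
        if (packed)
            out[i] = planes[order_src[0]][x * (size_t)nb_comps + (size_t)i];
        else
            out[i] = planes[order_src[i]][x];
    }
}
```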
Sponsored-by: Sovereign Tech Fund
This allows reads to directly embed filter kernels. This is because, in
practice, a filter needs to be combined with a read anyways. To accomplish
this, we define filter ops as their semantic high-level operation types, and
then have the optimizer fuse them with the corresponding read/write ops
(where possible).
Ultimately, something like this will be needed anyways for subsampled formats,
and doing it here is just incredibly clean and beneficial compared to each
of the several alternative designs I explored.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This reverts commit 32554fc107.
Accidentally pushed this commit twice, with the wrong location.
Correct version is 97682155e6.
Signed-off-by: Niklas Haas <git@haasn.dev>
Avoids some unnecessary round-trips through the execution harness, and
removes one unnecessary layer of abstraction (SwsOpExec).
It's a bit unfortunate that we have to cast away the const on the AVFrame,
since the Vulkan functions take non-const everywhere, even though all they're
doing is modifying frame internal metadata, but alas.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
And call it on the read/write ops directly, rather than this awkward loop.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This is a bit more forward-facing than a bare allocation, and importantly,
allows the `swscale/utils.c` code to remain agnostic about how to correctly
uninit this struct.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
AVFrame just really doesn't have the semantics we want. However, there is a
tangible benefit to having SwsFrame act as a carbon copy of a (subset of)
AVFrame.
This partially reverts commit 67f3627267.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>