3 Commits

Author SHA1 Message Date
Niklas Haas
d62102d679 swscale/ops_tmpl_float: actually skip allocation for !size_log2 case
This check was wrong; 1 << 0 = 1. The intent was to skip allocating a 1x1
matrix by assuming it is a constant 0.5. But as written, the check never
actually executed.

This did not affect the runtime performance, nor did it leak memory; but it
did mean we didn't hit the intended `assert`.
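
A minimal sketch of the difference, assuming the check had roughly this shape. The function names are hypothetical stand-ins, not the actual swscale code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the real swscale functions. */

/* Buggy form: tests the matrix size. 1 << 0 == 1, so for any valid
 * size_log2 the size is nonzero and the skip path is never taken. */
static bool skip_alloc_buggy(int size_log2)
{
    const int size = 1 << size_log2;
    return !size;
}

/* Fixed form: tests the exponent itself, so the 1x1 case (size_log2 == 0)
 * is recognized and no allocation is needed. */
static bool skip_alloc_fixed(int size_log2)
{
    return !size_log2;
}
```

With `size_log2 == 0`, only the second form reports that the allocation can be skipped.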
2025-12-15 14:31:58 +00:00
Niklas Haas
58f933798f swscale/ops_tmpl_float: respect specified dither matrix offsets
Since we only need 8 bytes to store the dither matrix pointer, we actually
still have 8 bytes left over. That means we could either store the 8-bit
row offset directly, or alternatively compute 16-bit pointer offsets.

I have chosen to do the former for the C backend, in the interest of
simplicity.

The one downside of this approach is that it would fail on hypothetical
128-bit platforms; although I seriously hope that this code does not live
long enough to see the need for 128-bit addressable memory.
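
The layout described above can be sketched as follows, assuming a 16-byte private-data slot. The struct name, field names, the number of offsets, and the helper `dither_row` are all hypothetical and chosen only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte slot: an 8-byte pointer plus 8-bit row offsets. */
typedef struct DitherPriv {
    const float *matrix;  /* dither matrix pointer, assumed to fit in 8 bytes */
    uint8_t y_offset[8];  /* 8-bit row offsets (assumed: one per component) */
} DitherPriv;

/* This is the assumption that would break on a 128-bit platform. */
static_assert(sizeof(const float *) <= 8, "pointer must fit in 8 bytes");
static_assert(sizeof(DitherPriv) <= 16, "slot must fit in 16 bytes");

/* Return the matrix row for line y of a component, applying its row
 * offset and wrapping within a (1 << size_log2)-row square matrix. */
static const float *dither_row(const DitherPriv *p, int comp, int y,
                               int size_log2)
{
    const int size = 1 << size_log2;
    const int row  = (y + p->y_offset[comp]) & (size - 1);
    return p->matrix + (size_t) row * size;
}
```

Storing the offsets directly keeps the lookup to one addition and one mask per row, at the cost of relying on the pointer size checked by the static assertion.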
2025-12-15 14:31:58 +00:00
Niklas Haas
5aef513fb4 swscale/ops_backend: add reference backend based on C templates
This will serve as a reference for the SIMD backends to come. That said,
with auto-vectorization enabled, the performance of this is not atrocious.
It easily beats the old C code and sometimes even the old SIMD.

In theory, we can dramatically speed it up by using GCC vectors instead of
arrays, but the performance gains from this are too dependent on exact GCC
versions and flags, so in practice it's not a substitute for a SIMD
implementation.
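
To illustrate the distinction between arrays and GCC vectors, here is a small sketch using the `vector_size` attribute supported by GCC and Clang. The type and function names are hypothetical and unrelated to the actual backend:

```c
#include <assert.h>

#define N 4

/* Plain-array form: relies on the compiler to auto-vectorize the loop. */
static void scale_array(float *x, float k)
{
    for (int i = 0; i < N; i++)
        x[i] *= k;
}

/* GCC/Clang vector extension: four floats handled as a single value. */
typedef float f32x4 __attribute__((vector_size(16)));

static f32x4 scale_vec(f32x4 x, float k)
{
    const f32x4 kv = { k, k, k, k };  /* broadcast the scalar explicitly */
    return x * kv;                    /* element-wise multiply */
}
```

Both functions compute the same result; how well either form is compiled depends on the compiler version and flags.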
2025-09-01 19:28:36 +02:00