FFmpeg

breeze/FFmpeg

mirror of https://github.com/FFmpeg/FFmpeg.git synced 2026-01-12 00:06:51 +08:00

Commit Graph

Author

SHA1

Message

Date

Niklas Haas

d62102d679

swscale/ops_tmpl_float: actually skip allocation for !size_log2 case

This check was wrong; 1 << 0 = 1. The intent was to skip allocating a 1x1
matrix by assuming it is a constant 0.5. But as written, the check never
actually executed.

This did not affect the runtime performance, nor did it leak memory; but it
did mean we didn't hit the intended `assert`.

2025-12-15 14:31:58 +00:00
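
The pitfall described in this commit can be illustrated with a minimal sketch. The `DitherDesc` type and all function names below are hypothetical stand-ins, not the actual swscale code; the sketch only assumes that the matrix side length is `1 << size_log2` and that a 1x1 matrix is expected to hold the constant 0.5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dither description; not the swscale type. */
typedef struct DitherDesc {
    int size_log2;        /* matrix side length is 1 << size_log2 */
    const float *matrix;  /* row-major coefficients */
} DitherDesc;

/* Buggy form of the check: 1 << size_log2 is at least 1, so !size is
 * never true and the special case is never taken. */
static int skips_alloc_buggy(const DitherDesc *d)
{
    const int size = 1 << d->size_log2;
    return !size;
}

/* Fixed form: test the exponent itself, so the 1x1 case is detected. */
static int skips_alloc_fixed(const DitherDesc *d)
{
    return !d->size_log2;
}

/* Copies the matrix unless it is 1x1. Returns 0 on success and -1 on
 * allocation failure; *out is NULL for the 1x1 special case. */
static int setup_dither(const DitherDesc *d, float **out)
{
    if (!d->size_log2) {
        /* A 1x1 matrix is assumed to be the constant 0.5. */
        assert(d->matrix[0] == 0.5f);
        *out = NULL;
        return 0;
    }

    const size_t size = (size_t)1 << d->size_log2;
    float *copy = malloc(size * size * sizeof(*copy));
    if (!copy)
        return -1;
    for (size_t i = 0; i < size * size; i++)
        copy[i] = d->matrix[i];
    *out = copy;
    return 0;
}
```

With the buggy check, a 1x1 matrix would simply be allocated and copied like any other, which matches the commit's note that the mistake cost neither performance nor memory but bypassed the `assert`.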

Niklas Haas

58f933798f

swscale/ops_tmpl_float: respect specified dither matrix offsets

Since we only need 8 bytes to store the dither matrix pointer, we actually
still have 8 bytes left-over. That means we could either store the 8-bit
row offset directly, or alternatively compute a 16-bit pointer offset.

I have chosen to do the former for the C backend, in the interest of
simplicity.

The one downside of this approach is that it would fail on hypothetical
128-bit platforms; although I seriously hope that this code does not live
long enough to see the need for 128-bit addressable memory.

2025-12-15 14:31:58 +00:00
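
The space argument in this commit can be sketched as follows. The `Priv16` union, its field names, the per-component indexing, and the wrap-around lookup are all assumptions made for illustration; only the idea of an 8-byte pointer followed by 8-bit row offsets in a 16-byte slot comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte private data slot; not the actual swscale layout. */
typedef union Priv16 {
    uint8_t raw[16];
    struct {
        const float *matrix;    /* at most 8 bytes on current platforms */
        uint8_t row_offset[8];  /* one 8-bit row offset per slot */
    } dither;
} Priv16;

/* The layout only fits if pointers are no wider than 8 bytes, which is
 * the 128-bit caveat mentioned in the commit message. */
_Static_assert(sizeof(void *) <= 8, "pointer too wide for this layout");
_Static_assert(sizeof(Priv16) == 16, "private slot must stay 16 bytes");

/* Looks up the dither value for component `comp` at pixel (x, y),
 * shifting the row by that component's offset and wrapping around the
 * (1 << size_log2) square matrix. */
static float dither_at(const Priv16 *p, int size_log2, int comp, int x, int y)
{
    const int mask = (1 << size_log2) - 1;
    const int row  = (y + p->dither.row_offset[comp]) & mask;
    const int col  = x & mask;
    return p->dither.matrix[(row << size_log2) + col];
}
```

Storing the offsets directly keeps the lookup to one add and one mask per row, at the cost of tying the layout to 8-byte pointers.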

Niklas Haas

5aef513fb4

swscale/ops_backend: add reference backend based on C templates

This will serve as a reference for the SIMD backends to come. That said,
with auto-vectorization enabled, the performance of this is not atrocious.
It easily beats the old C code and sometimes even the old SIMD.

In theory, we can dramatically speed it up by using GCC vectors instead of
arrays, but the performance gains from this are too dependent on exact GCC
versions and flags, so in practice it's not a substitute for a SIMD
implementation.

2025-09-01 19:28:36 +02:00
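
The difference between the array form and the GCC vector form mentioned in this commit can be sketched as below. The kernel, its name, and the block size of 4 floats are hypothetical and unrelated to the actual templates; the second function relies on the `vector_size` attribute, a GCC extension that Clang also accepts.

```c
#include <assert.h>
#include <string.h>

#define BLOCK 4  /* hypothetical block size: 4 floats = 16 bytes */

/* Array form: plain C that depends on the compiler's auto-vectorizer. */
static void scale_add_array(float *restrict dst, const float *restrict src,
                            float scale, float bias)
{
    for (int i = 0; i < BLOCK; i++)
        dst[i] = src[i] * scale + bias;
}

/* GCC vector form: the 16-byte vector type makes the data-parallel
 * intent explicit instead of leaving it to the optimizer. */
typedef float f32x4 __attribute__((vector_size(16)));

static void scale_add_vector(float *dst, const float *src,
                             float scale, float bias)
{
    f32x4 v;
    memcpy(&v, src, sizeof(v));  /* load without alignment assumptions */
    v = v * scale + bias;        /* scalar operands are broadcast */
    memcpy(dst, &v, sizeof(v));  /* store */
}
```

Both functions compute the same result; which one generates better code depends on the compiler version and flags, which is the portability concern the commit message raises.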

3 Commits