Commit Graph

122233 Commits

Author SHA1 Message Date
Lynne
ea14f8a28f vulkan_prores: split up shader creation functions
It's more of a mess than it has to be.
2025-12-22 19:46:26 +01:00
Rémi Denis-Courmont
eb3b632b48 lavc/h264qpel: fix RISC-V stack usage
The function violated the ABI requirement not to write below SP
(this breaks asynchronous signal handling). On RV32, it also
did not align SP to 16 bytes and did not restore it correctly.

No changes to benchmarks as this patch only changes a few immediate
offsets.
2025-12-22 18:55:20 +02:00
Rémi Denis-Courmont
65018b3e83 lavu/float_dsp: fix R-V V scalarproduct_double with ILP32 ABI 2025-12-22 18:55:16 +02:00
Rémi Denis-Courmont
435623cbda lavu/float_dsp: fix R-V V vector_dmul_scalar with ILP32 ABI 2025-12-22 18:55:16 +02:00
Rémi Denis-Courmont
56d933b0a7 lavu/float_dsp: fix R-V V vector_dmac_scalar with ILP32 ABI 2025-12-22 18:55:16 +02:00
Rémi Denis-Courmont
a583639bf0 lavu/fixed_dsp: fix scalarproduct on riscv32
On riscv32, the result must be narrowed from 63 to 32 bit before being
moved to the scalar side.
2025-12-22 18:55:13 +02:00
Araz Iusubov
4479d28103 avcodec/avfilter_amf: correct handling of AMF errors
Fix several AMF-related issues.

Check the return value of amf_init_frames_context() correctly in amfdec,
as it returns int rather than AMF_RESULT.

Handle possible NULL surfaces returned from QueryInterface() in
vf_amf_common to avoid passing invalid data to amf_amfsurface_to_avframe().

Remove FILTER_SINGLE_PIXFMT from vf_sr_amf since it must not be used
together with a query formats function.
2025-12-22 14:58:59 +00:00
Romain Beauxis
b43645b2ef libavformat/id3v2.c: return valid string in decode_str for empty strings
with no bom. Fixes: #YWH-PGM40646-12
2025-12-22 13:44:42 +00:00
Kacper Michajłow
c50e5c7778 avcodec/libaomenc: remove enum type from codecctl_* functions
aom_codec_control() takes control id as int. It could be AV1E_ or common
AV1_ enum in encoder, and AV1D_ for decoder.

While upstream provides AOM_CODEC_CONTROL_TYPECHECKED() macro to check
the provided enum value, we wrap those calls in codecctl_ functions,
which makes it not feasible to use.

To avoid complicating this needlessly, just use int.

Fixes: warning: implicit conversion from enumeration type 'enum aom_com_control_id' to different enumeration type 'enum aome_enc_control_id'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-22 07:05:58 +01:00
Kacper Michajłow
96afe665ef avcodec/libaomenc: remove UENUM1BYTE check
AOM had a short-lived API breakage introduced in commit [1], which was
worked around in commit [2]. The original change, however, was reverted
shortly afterward in commit [3]. Since we require at least v2.0.0, there
is no need to keep this workaround.

[1] https://aomedia.googlesource.com/aom/+/4667aa1a373566e9c124afcd58c71731ab0d7377
[2] aaf9171574
[3] https://aomedia.googlesource.com/aom/+/9b1252eab0616d2c1f6d7990c6256441c0b6483f

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-22 07:05:58 +01:00
stevxiao
64b9be2dc5 avcodec/d3d12va_encode: support motion estimation precision mode
By default, the D3D12 video encoder uses MAXIMUM, which means no restriction—it uses the highest precision supported by the driver.

Applications may want to reduce precision to improve speed or reduce power consumption. This requires the encoder to support user-defined motion estimation precision modes.

D3D12_VIDEO_ENCODER_MOTION_ESTIMATION_PRECISION_MODE defines several precision modes:

maximum: No restriction, uses the maximum precision supported by the driver.
full_pixel: Allows only full-pixel precision.
half_pixel: Allows half-pixel precision.
quarter_pixel: Allows quarter-pixel precision.
eighth_pixel: Allows eighth-pixel precision (introduced in Windows 11).

Sample Command Line:

ffmpeg -hwaccel d3d12va -hwaccel_output_format d3d12 -extra_hw_frames 20 -i input.mp4 -an -c:v h264_d3d12va -me_precision half_pixel out.mp4
2025-12-22 05:35:04 +00:00
Gyan Doshi
1fcb201ab1 configure: update dependencies
These muxers - hls, tee, whip - open other muxers.
In addition, whip requires http protocol.
2025-12-21 04:16:10 +00:00
Leo Izen
784aa09fa8 avcodec/exif: parse additional EXIF IFDs
Most EXIF metadata is in IFD0 and most EXIF payloads only contain
one IFD, but it is possible for there to be more IFDs after the
existing trailing one. exiftool and similar software report these IFDs
as IFD1, IFD2, etc. This commit reads those additional IFDs and attaches
them as dummy entries in the top-level IFD ranging from 0xFFFC down to
0xFFED, which are unused by the EXIF spec. The EXIF API is only able to
return and work with a single IFD, so by attaching it as a subdirectory
this metadata can be preserved.

This is done transparently through the read/write process. Upon parsing
an additional IFD1, it will be attached, but it will be written with
av_exif_write after IFD0 rather than as a subdirectory, as intended.

Existing files without more than one IFD, i.e. most files, will be unaffected
by this change, as well as API clients looking to parse specific fields, but
now more metadata is parsed and written, rather than simply being discarded
as trailing data.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2025-12-20 11:53:23 -05:00
Leo Izen
105b6fcd9c avcodec/exif: avoid leaking EXIF metadata upon parse failure
Before this commit, exif_parse_ifd_list didn't free *ifd upon failure,
relying on the caller to do so instead. We only guarded some of the
calls against this function, not all of them, so sometimes it leaked.

This commit fixes this, so exif_parse_ifd_list frees *ifd upon failure
so callers do not have to guard its invocation with a free wrapper.

Fixes: ossfuzz 440747118: Integer-overflow in av_strerror

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2025-12-20 11:53:21 -05:00
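The ownership rule described in 105b6fcd9c (the parser frees its own output on failure, so callers need no cleanup wrapper) can be sketched as follows. This is a minimal illustration only, not the FFmpeg code: the `Ifd` type, `ifd_free()`, `parse_ifd()` and the `fail` parameter are hypothetical names introduced for the example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a parsed directory; not the FFmpeg type. */
typedef struct Ifd {
    int    *entries;
    size_t  count;
} Ifd;

/* Frees the directory and resets the caller's pointer to NULL. */
static void ifd_free(Ifd **ifd)
{
    if (!ifd || !*ifd)
        return;
    free((*ifd)->entries);
    free(*ifd);
    *ifd = NULL;
}

/*
 * Hypothetical parser. On any failure it releases *ifd itself, so the
 * caller never has to free a half-built result. `fail` simulates a
 * parse error that occurs after the allocations succeeded.
 */
static int parse_ifd(Ifd **ifd, size_t count, int fail)
{
    *ifd = calloc(1, sizeof(**ifd));
    if (!*ifd)
        return -1;

    (*ifd)->entries = calloc(count ? count : 1, sizeof(int));
    if (!(*ifd)->entries || fail) {
        ifd_free(ifd);   /* cleanup happens here, not in the caller */
        return -1;
    }

    (*ifd)->count = count;
    return 0;
}
```

With this shape, a caller can simply return the error code; after a failed call the output pointer is guaranteed to be NULL.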
Niklas Haas
7505264b6a swscale/ops: update comment on SWS_COMP_EXACT
That the integer is "in-range" is implied by the min/max range tracking,
not the flag itself.
2025-12-20 13:52:45 +00:00
Niklas Haas
1d0fd7fabf swscale/ops: categorize ops by type compatibility
This is a more useful grouping than the previous, somewhat arbitrary one.
2025-12-20 13:52:45 +00:00
Niklas Haas
94777ed2eb swscale/ops_chain: fix comment 2025-12-20 13:52:45 +00:00
Niklas Haas
75ba2bf457 swscale/ops: correctly truncate on ff_sws_apply_op_q(SWS_OP_RSHIFT)
Instead of using a "precise" division, simulate the actual truncation.

Note that the division by `den` is unneeded in principle because the
denominator *should* always be 1 for an integer, but this way we don't
explode if the user should happen to pass `4/2` or something.

Fixes a lot of unnecessary clamps w.r.t. xv36, e.g.:

 xv36be -> yuv444p12be:
   [u16 XXXX -> ++++] SWS_OP_READ         : 4 elem(s) packed >> 0
   [u16 ...X -> ++++] SWS_OP_SWAP_BYTES
   [u16 ...X -> ++++] SWS_OP_SWIZZLE      : 1023
   [u16 ...X -> ++++] SWS_OP_RSHIFT       : >> 4
-  [u16 ...X -> ++++] SWS_OP_CONVERT      : u16 -> f32
-  [f32 ...X -> ++++] SWS_OP_MIN          : x <= {4095 4095 4095 _}
-  [f32 ...X -> ++++] SWS_OP_CONVERT      : f32 -> u16
   [u16 ...X -> ++++] SWS_OP_SWAP_BYTES
   [u16 ...X -> ++++] SWS_OP_WRITE        : 3 elem(s) planar >> 0
     (X = unused, + = exact, 0 = zero)
2025-12-20 13:52:45 +00:00
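The difference between a "precise" division and the truncation described in 75ba2bf457 can be sketched as below. This is an illustration under stated assumptions, not the swscale implementation: the `Rational` type and `rshift_truncating()` are hypothetical, and values are assumed to be non-negative with a non-zero denominator.

```c
#include <stdint.h>

/* Hypothetical rational type for illustration; not the FFmpeg one. */
typedef struct Rational {
    int64_t num;
    int64_t den;
} Rational;

/*
 * Simulate an integer right shift on a rational value.
 * The value is first reduced to an integer (so an input such as 4/2
 * is still handled), then shifted, which truncates just as the real
 * operation would. Assumes x.num >= 0 and x.den > 0.
 */
static Rational rshift_truncating(Rational x, unsigned shift)
{
    Rational r;
    r.num = (x.num / x.den) >> shift;
    r.den = 1;
    return r;
}
```

For a 16-bit maximum of 65535 and a shift of 4, exact division gives 65535/16 = 4095.9375, while the truncating form gives 4095. Under that reading, a tracked maximum of 4095 makes a later clamp to 4095 redundant.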
Niklas Haas
d1eaea1a03 swscale/ops: add type assertions to ff_sws_apply_op_q() 2025-12-20 13:52:45 +00:00
Niklas Haas
258dbfdbc9 swscale/format: only generate SHIFT ops when needed
Otherwise, we may spuriously generate illegal combinations like
SWS_OP_LSHIFT on SWS_PIXEL_F32.
2025-12-20 13:52:45 +00:00
Niklas Haas
c31f3926d1 swscale/ops_optimizer: simplify loop slightly (cosmetic)
We always `goto retry` whenever an optimization case is hit, so we don't
need to defer the increment of `n`.
2025-12-20 13:52:45 +00:00
Niklas Haas
900d91b541 swscale/ops_optimizer: apply optimizations in a more predictable order
Instead of blindly interleaving re-ordering and minimizing optimizations,
separate this loop into several passes - the first pass will minimize the
operation list in-place as much as possible, and the second pass will apply any
desired re-orderings. (We also want to try pushing clear back before any other
re-orderings, as this can trigger more phase 1 optimizations)

This restructuring leads to significantly more predictable and stable behavior,
especially when introducing more operation types going forwards. Does not
actually affect the current results, but matters with some upcoming changes
I have planned.
2025-12-20 13:52:45 +00:00
Niklas Haas
c51c63058c swscale/ops_optimizer: don't commute clear with itself
These would normally be merged, not swapped.
2025-12-20 13:52:45 +00:00
Andreas Rheinhardt
b934dd1d4b avformat/whip: Fix leak of dtls_fingerprint
Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Reviewed-by: Jack Lau <jacklau1222gm@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 21:45:24 +01:00
Andreas Rheinhardt
34bc95ddfc avformat/whip: Check number of audio/video streams generically
Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Reviewed-by: Jack Lau <jacklau1222gm@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 21:45:21 +01:00
Andreas Rheinhardt
e252bc0c3d avformat/whip: Remove dead code
This is checked generically after
6070ea29de.
Also set AVOutputFormat.subtitle_codec explicitly in order
not to rely on AV_CODEC_ID_NONE to be zero.

Reviewed-by: Steven Liu <lingjiujianke@gmail.com>
Reviewed-by: Jack Lau <jacklau1222gm@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 21:42:27 +01:00
Andreas Rheinhardt
6177af5acc avcodec/x86/lossless_videodsp: Avoid unnecessary reg push,pop
Happens on Win64.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 20:56:09 +01:00
Andreas Rheinhardt
9314d5cae8 avcodec/x86/lossless_videodsp: Avoid aligned/unaligned versions
For AVX2, movdqu is as fast as movdqa when used on aligned addresses,
so don't instantiate aligned/unaligned versions.

(The check was btw overly strict: The AVX2 code only uses 16 byte
stores, so it would be enough for dst to be 16-byte aligned.)

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 20:55:53 +01:00
Andreas Rheinhardt
6368d2baae avcodec/x86/lossless_videodsp: Don't store in eight byte chunks
Use movu (movdqu) instead of movq+movhps.

Old benchmarks:
add_left_pred_int16_c:                                2265.5 ( 1.00x)
add_left_pred_int16_ssse3:                             595.4 ( 3.81x)
add_left_pred_rnd_acc_c:                              1255.0 ( 1.00x)
add_left_pred_rnd_acc_ssse3:                           326.2 ( 3.85x)
add_left_pred_rnd_acc_avx2:                            279.0 ( 4.50x)
add_left_pred_zero_c:                                 1249.5 ( 1.00x)
add_left_pred_zero_ssse3:                              326.1 ( 3.83x)
add_left_pred_zero_avx2:                               277.0 ( 4.51x)

New benchmarks:
add_left_pred_int16_c:                                2266.9 ( 1.00x)
add_left_pred_int16_ssse3:                             509.9 ( 4.45x)
add_left_pred_rnd_acc_c:                              1251.4 ( 1.00x)
add_left_pred_rnd_acc_ssse3:                           282.6 ( 4.43x)
add_left_pred_rnd_acc_avx2:                            208.9 ( 5.99x)
add_left_pred_zero_c:                                 1253.7 ( 1.00x)
add_left_pred_zero_ssse3:                              280.0 ( 4.48x)
add_left_pred_zero_avx2:                               206.8 ( 6.06x)

The checkasm test has been modified to use an unaligned destination
for this test.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 20:55:37 +01:00
Andreas Rheinhardt
bb523a2d3f tests/checkasm/llviddsp: Reindent after the previous commit
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 20:55:33 +01:00
Andreas Rheinhardt
b2dea09de1 tests/checkasm/llviddsp: Avoid unnecessary initializations, allocs
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 20:55:20 +01:00
Andreas Rheinhardt
a6b8939e1e avcodec/x86/lossless_videodsp: Remove SSSE3 functions using MMX regs
These functions are only used on Conroe (they are overwritten
by SSSE3 functions using xmm registers if the SSSE3SLOW is not set)
which is very old (introduced in 2006), so remove them.

Btw: The checkasm test (which uses declare_func and not
declare_func_emms since cd8a33bcce)
would fail on a Conroe, yet no one ever reported any such failure.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 20:54:44 +01:00
Martin Storsjö
cf608f6b65 checkasm: Use av_strlcatf for appending SME info after SVE
If we had SVE enabled and formatted info about its vector lengths,
it would be overwritten by the SME info.
2025-12-19 18:42:10 +00:00
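The overwrite described in cf608f6b65 is the usual pitfall of formatting twice into the start of the same buffer. The sketch below reproduces it with plain `snprintf()` rather than `av_strlcatf()`, so that it stays self-contained; the function names and message strings are hypothetical, and `size` is assumed to be greater than zero.

```c
#include <stdio.h>
#include <string.h>

/* Buggy shape: the second call starts at buf[0] again and replaces
 * whatever the first call wrote. */
static void describe_overwrite(char *buf, size_t size, int sve_len, int sme_len)
{
    snprintf(buf, size, "SVE %d bits", sve_len);
    snprintf(buf, size, ", SME %d bits", sme_len);
}

/* Appending shape: continue writing at the current end of the string,
 * passing only the space that remains. */
static void describe_append(char *buf, size_t size, int sve_len, int sme_len)
{
    size_t used;

    snprintf(buf, size, "SVE %d bits", sve_len);
    used = strlen(buf);
    if (used < size)
        snprintf(buf + used, size - used, ", SME %d bits", sme_len);
}
```

An appending helper such as `av_strlcatf()` wraps the second pattern, so the SVE text survives when the SME text is added.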
Martin Storsjö
ec2ceefcfa fate.sh: Allow specifying --ar through a separate variable
This avoids needing to use the extra_conf variable. That variable
is problematic for setting a value that contains spaces.

This adds options for another tool in the same fashion as other
tools were added in 523d688c2b.
2025-12-19 18:41:23 +00:00
Martin Storsjö
06a17fdafc tests: Fix fate-run.sh to handle busybox-w32 absolute paths
Busybox-w32 uses regular Windows style paths with drive letters,
but with forward slashes; thus an absolute path starts with "c:/".

Make the target_path() function in fate-run.sh (which converts a
potentially relative path to an absolute one, under the target_path
prefix) handle this case.

With this in place, running fate tests almost works in
busybox-w32 - only one issue remains. A patch [1] has been sent to
upstream busybox for fixing that issue (which also is present if
running fate tests on busybox on Linux), but it hasn't been
responded to yet.

[1] https://lists.busybox.net/pipermail/busybox/2025-December/091851.html
2025-12-19 18:38:33 +00:00
Martin Storsjö
6149ceadeb configure: Recognize uname "Windows_NT" as using an .exe suffix
Busybox-w32 [1] works for building ffmpeg on Windows (as an
alternative to msys2, cygwin or WSL).

On busybox-w32, "uname" returns "Windows_NT"; recognize this
in exesuf() as having an .exe suffix.

If building in this environment with a mingw toolchain, one has
to explicitly set --target-os=mingw32. (We probably don't
want to imply that this uname, set as target_os_default, would
default to mingw?) But despite what is set with --target-os,
one can't override the configure variable "host_os", which
exesuf() has to recognize.

[1] https://github.com/rmyorston/busybox-w32
2025-12-19 18:38:33 +00:00
Rémi Denis-Courmont
55200f999c lavc/mathops: R-V B optimisation for mid_pred
If Zbb is enabled at compilation (e.g. Ubuntu), the compiler should
compile the new C mid_pred() function correctly. But if Zbb is *not*
enabled (e.g. Debian), then we can at least fall back at run time.

On SiFive-U74, before:
sub_median_pred_c:                                    1331.9 ( 1.00x)
sub_median_pred_rvb_b:                                 881.8 ( 1.51x)

After:
sub_median_pred_c:                                    1133.1 ( 1.00x)
sub_median_pred_rvb_b:                                 875.7 ( 1.29x)
2025-12-19 19:56:13 +02:00
Rémi Denis-Courmont
ccd7e66f9e lavc/mathops: remove bespoke Arm mid_pred()
The C codegen is as good as, if not slightly better than, the assembler at
this point.
2025-12-19 19:56:13 +02:00
Rémi Denis-Courmont
8dccb380cf lavc/mathops: simplify mid_pred()
This reduces the minimum instruction emission for mid_pred()
(i.e. median of 3) down to:
- 3 comparisons and 4 conditional moves, or
- 4 min/max.

With that the compiler can eliminate any branch. This optimal
situation is attainable with Clang 21 on Arm64, RVA22 and x86,
with GCC 15 on Arm64 and x86 (RVA22 goes from 2 to 1 branch).
These optimisations also work on Arm32 and LoongArch.

The same algorithm is already implemented via inline assembler for some
architectures such as x86 and Arm32, but notably not Arm64 and RVA22.
Besides, using C code allows the compiler to schedule instructions
properly.

Even on architectures with neither conditional moves nor min/max, this
leads to a visible performance improvement for C code, as seen here for
RVA20 code running on SiFive-U74:

Before:
sub_median_pred_c:                                    1657.5 ( 1.00x)
sub_median_pred_rvb_b:                                 875.9 ( 1.89x)

After:
sub_median_pred_c:                                    1331.9 ( 1.00x)
sub_median_pred_rvb_b:                                 881.8 ( 1.51x)

Note that this commit leaves the x86 and Arm32 code intact so it has
no effect on those ISAs.
2025-12-19 19:50:56 +02:00
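The "4 min/max" form of the median of three mentioned in 8dccb380cf can be written as shown below. This is a minimal sketch of the general technique, not a copy of the FFmpeg function: `min_int()`, `max_int()` and `median3()` are hypothetical names.

```c
/* Branch-friendly helpers; compilers typically lower these to
 * min/max or conditional-move instructions where available. */
static inline int min_int(int a, int b) { return a < b ? a : b; }
static inline int max_int(int a, int b) { return a > b ? a : b; }

/*
 * Median of three values using exactly four min/max operations:
 * order a and b, then clamp c into the range [lo, hi].
 */
static inline int median3(int a, int b, int c)
{
    int lo = min_int(a, b);
    int hi = max_int(a, b);

    return max_int(lo, min_int(hi, c));
}
```

Because every step is a min or a max, no data-dependent branch is needed, which is what lets the compiler emit straight-line code.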
Harishmcw
5946d2eadc compat: Fix .def file generation for ARM64EC builds on Windows
When building DLLs on ARM64EC, the default use of `dumpbin
-linkermember:1` fails because ARM64EC static libraries use a
different linker member format. Use `-linkermember:32` for ARM64EC
to correctly extract symbols.

Additionally, MSVC inserts $exit_thunk and $entry_thunk symbols
for ARM64EC to handle x64 ↔ ARM64 transitions. These are internal
thunks and must not be exported. Filter them out when generating
the .def file to avoid unresolved symbols or invalid exports.

Trim out the leading '#' on ARM64EC function symbols. This is only
relevant on ARM64EC, but it is benign to do that filtering on
all architectures (such symbols aren't expected on other
architectures).

Simplify the sed command by removing the symbol address with a
sed expression instead of a later "cut" command.

This ensures correct symbol extraction and stable DLL generation
on ARM64EC targets, while keeping behavior unchanged for other
Windows architectures.
2025-12-19 10:01:16 +00:00
Martin Storsjö
087f46674a doc: Document our stance on Windows ARM64EC
Explicitly spell it out that we are not going to modify the
individual libraries for the purposes of improving conformance
to ARM64EC.

We may (or may not) accept build system patches for making such
a build succeed, provided that it does not require changes to
the actual library code.
2025-12-19 10:01:16 +00:00
Andreas Rheinhardt
1c35a1b79b avformat/flvdec: Fix leak of channel layout map
Fixes: memleak
Fixes: 418396714/clusterfuzz-testcase-minimized-ffmpeg_dem_KUX_fuzzer-4595253332213760

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 08:02:52 +01:00
Andreas Rheinhardt
4aed9db83c avformat/flac_picture: Correct check
Since af97c9865f,
the return value of avio_read() has been compared against
a uint32_t, so that the int is promoted to uint32_t for
the comparison (on common systems with 32-bit ints). The upshot was
that errors returned from avio_read() were ignored, so that
the buffer could be uninitialized on success.

Fix this by using ffio_read_size() instead.

Fixes: MemorySanitizer: use-of-uninitialized-value
Fixes: 443923343/clusterfuzz-testcase-minimized-ffmpeg_dem_FLAC_fuzzer-5458132865449984

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-19 07:32:01 +01:00
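The signed/unsigned comparison pitfall described in 4aed9db83c can be reproduced in isolation. The sketch below assumes a less-than check and a 32-bit `int`; the function names are hypothetical, and the actual fix in the commit is to call ffio_read_size() rather than to rewrite the comparison by hand.

```c
#include <stdint.h>

/*
 * Buggy shape: `ret` is converted to uint32_t before the comparison,
 * so a negative error code becomes a huge positive value and the
 * "short read" test is false.
 */
static int short_read_buggy(int ret, uint32_t len)
{
    return ret < len;
}

/* Safer shape: handle the error case before comparing sizes. */
static int short_read_fixed(int ret, uint32_t len)
{
    return ret < 0 || (uint32_t)ret < len;
}
```

With `ret = -5` and `len = 10`, the buggy form compares 4294967291 against 10 and reports no problem, whereas the second form flags the error.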
Gyan Doshi
6070ea29de lavf/whip: add flag for default codecs only
The muxer does not accept any other codecs.
2025-12-19 04:13:34 +00:00
Gyan Doshi
fdce17953c lavf/supenc: add flag for default codecs only
The muxer does not accept any other codecs.
2025-12-19 04:13:34 +00:00
Gyan Doshi
20b671f651 lavf/dvenc: add flag for default codecs only
The muxer does not accept any other codecs.
2025-12-19 04:13:34 +00:00
James Almer
78c75d546a avcodec/apv_parser: add support for AU assembly
Signed-off-by: James Almer <jamrial@gmail.com>
2025-12-18 01:24:35 +00:00
Timo Rothenpieler
0d7b8d8913 forgejo/workflows: fix error handling of configure result 2025-12-17 13:28:21 +00:00
Timo Rothenpieler
0be989edcb forgejo/workflows: cat .err files after running fate 2025-12-17 13:28:21 +00:00
Timo Rothenpieler
5e8dcd6db1 forgejo/workflows: run windows fate tests through wine 2025-12-17 13:28:21 +00:00