The function violated the ABI requirement not to write below SP
(this breaks asynchronous signal handling). On RV32, it also
did not align SP to 16 bytes and did not restore it correctly.
No changes to benchmarks as this patch only changes a few immediate
offsets.
Fix several AMF-related issues.
Check the return value of amf_init_frames_context() correctly in amfdec,
as it returns int rather than AMF_RESULT.
Handle possible NULL surfaces returned from QueryInterface() in
vf_amf_common to avoid passing invalid data to amf_amfsurface_to_avframe().
Remove FILTER_SINGLE_PIXFMT from vf_sr_amf since it must not be used
together with a query formats function.
aom_codec_control() takes control id as int. It could be AV1E_ or common
AV1_ enum in encoder, and AV1D_ for decoder.
While upstream provides AOM_CODEC_CONTROL_TYPECHECKED() macro to check
the provided enum value, we wrap those calls in codecctl_ functions,
which makes it not feasible to use.
To avoid complicating this needlessly, just use int.
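A minimal sketch of the idea, using made-up stand-ins rather than the real libaom enums or the actual codecctl_ wrappers: when the wrapper takes the id as int, constants from either enum can be passed without an enum-to-enum conversion.

```c
/* Hypothetical stand-ins for two unrelated control-id enums. */
enum common_control_id { COMMON_SET_FOO = 1 };
enum enc_control_id    { ENC_SET_BAR    = 20 };

/* Stand-in for a library entry point that takes the control id as int. */
static int lib_control(int ctrl_id, int value)
{
    return ctrl_id * 1000 + value;
}

/* The wrapper also takes int, so ids from either enum convert to int
 * implicitly and no enum-to-enum conversion warning is triggered. */
static int codecctl_int_sketch(int id, int value)
{
    return lib_control(id, value);
}
```

Had the wrapper's parameter been declared as one of the enum types, passing a constant from the other enum would produce the warning quoted below.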
Fixes: warning: implicit conversion from enumeration type 'enum aom_com_control_id' to different enumeration type 'enum aome_enc_control_id'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
By default, the D3D12 video encoder uses MAXIMUM, which means no restriction: it uses the highest precision supported by the driver.
Applications may want to reduce precision to improve speed or reduce power consumption. This requires the encoder to support user-defined motion estimation precision modes.
D3D12_VIDEO_ENCODER_MOTION_ESTIMATION_PRECISION_MODE defines several precision modes:
maximum: No restriction, uses the maximum precision supported by the driver.
full_pixel: Allows only full-pixel precision.
half_pixel: Allows half-pixel precision.
quarter_pixel: Allows quarter-pixel precision.
eighth_pixel: Allows eighth-pixel precision (introduced in Windows 11).
Sample Command Line:
ffmpeg -hwaccel d3d12va -hwaccel_output_format d3d12 -extra_hw_frames 20 -i input.mp4 -an -c:v h264_d3d12va -me_precision half_pixel out.mp4
Most EXIF metadata is in IFD0 and most EXIF payloads only contain
one IFD, but it is possible for there to be more IFDs after the
existing trailing one. exiftool and similar software report these IFDs
as IFD1, IFD2, etc. This commit reads those additional IFDs and attaches
them as dummy entries in the top-level IFD ranging from 0xFFFC down to
0xFFED, which are unused by the EXIF spec. The EXIF API is only able to
return and work with a single IFD, so by attaching it as a subdirectory
this metadata can be preserved.
This is done transparently through the read/write process. Upon parsing
an additional IFD1, it will be attached, but it will be written with
av_exif_write after IFD0 rather than as a subdirectory, as intended.
Existing files without more than one IFD, i.e. most files, will be unaffected
by this change, as well as API clients looking to parse specific fields, but
now more metadata is parsed and written, rather than simply being discarded
as trailing data.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Before this commit, exif_parse_ifd_list didn't free *ifd upon failure,
relying on the caller to do so instead. We only guarded some of the
calls against this function, not all of them, so sometimes it leaked.
This commit fixes this, so exif_parse_ifd_list frees *ifd upon failure
so callers do not have to guard its invocation with a free wrapper.
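The ownership rule can be sketched as follows. The Ifd type and both functions are hypothetical illustrations, not the actual exif_parse_ifd_list code: the parser releases whatever it allocated before returning an error, so no caller needs its own cleanup path.

```c
#include <stdlib.h>

/* Hypothetical container, standing in for the real IFD structure. */
typedef struct Ifd {
    int  count;
    int *entries;
} Ifd;

static void ifd_free(Ifd *ifd)
{
    free(ifd->entries);
    ifd->entries = NULL;
    ifd->count   = 0;
}

/* Returns 0 on success and a negative value on failure.
 * On failure, *ifd has already been freed by this function.
 * fail_at simulates a parse error at that entry index (-1: never fail). */
static int parse_ifd_sketch(Ifd *ifd, int count, int fail_at)
{
    if (count <= 0)
        return -1;
    ifd->entries = calloc((size_t)count, sizeof(*ifd->entries));
    if (!ifd->entries)
        return -1;
    ifd->count = count;
    for (int i = 0; i < count; i++) {
        if (i == fail_at) {
            ifd_free(ifd);   /* callee cleans up, callers need no guard */
            return -2;
        }
        ifd->entries[i] = i;
    }
    return 0;
}
```

With this contract, a caller can simply propagate the error code.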
Fixes: ossfuzz 440747118: Integer-overflow in av_strerror
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Instead of blindly interleaving re-ordering and minimizing optimizations,
separate this loop into several passes - the first pass will minimize the
operation list in-place as much as possible, and the second pass will apply any
desired re-orderings. (We also want to try pushing clear back before any other
re-orderings, as this can trigger more first-pass optimizations.)
This restructuring leads to significantly more predictable and stable behavior,
especially when introducing more operation types going forwards. Does not
actually affect the current results, but matters with some upcoming changes
I have planned.
For AVX2, movdqu is as fast as movdqa when used on aligned addresses,
so don't instantiate aligned/unaligned versions.
(The check was btw overly strict: The AVX2 code only uses 16 byte
stores, so it would be enough for dst to be 16-byte aligned.)
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These functions are only used on Conroe (they are overwritten
by SSSE3 functions using xmm registers if the SSSE3SLOW flag is not set)
which is very old (introduced in 2006), so remove them.
Btw: The checkasm test (which uses declare_func and not
declare_func_emms since cd8a33bcce)
would fail on a Conroe, yet no one ever reported any such failure.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids needing to use the extra_conf variable. That variable
is problematic for setting a value that contains spaces.
This adds options for another tool in the same fashion as other
tools were added in 523d688c2b.
Busybox-w32 uses regular Windows style paths with drive letters,
but with forward slashes; thus an absolute path starts with "c:/".
Make the target_path() function in fate-run.sh (which converts a
potentially relative path to an absolute one, under the target_path
prefix) handle this case.
With this in place, running fate tests almost works in
busybox-w32 - only one issue remains. A patch [1] has been sent to
upstream busybox for fixing that issue (which also is present if
running fate tests on busybox on Linux), but it hasn't been
responded to yet.
[1] https://lists.busybox.net/pipermail/busybox/2025-December/091851.html
Busybox-w32 [1] works for building ffmpeg on Windows (as an
alternative to msys2, cygwin or WSL).
On busybox-w32, "uname" returns "Windows_NT"; recognize this
in exesuf() as having an .exe suffix.
If building in this environment with a mingw toolchain, one has
to explicitly set --target-os=mingw32. (We probably don't
want to imply that this uname, set as target_os_default, would
default to mingw?) But despite what is set with --target-os,
one can't override the configure variable "host_os", which
exesuf() has to recognize.
[1] https://github.com/rmyorston/busybox-w32
If Zbb is enabled at compilation (e.g. Ubuntu), the compiler should
compile the new C mid_pred() function correctly. But if Zbb is *not*
enabled (e.g. Debian), then we can at least fall back at run-time.
On SiFive-U74, before:
sub_median_pred_c: 1331.9 ( 1.00x)
sub_median_pred_rvb_b: 881.8 ( 1.51x)
After:
sub_median_pred_c: 1133.1 ( 1.00x)
sub_median_pred_rvb_b: 875.7 ( 1.29x)
This reduces the minimum instruction emission for mid_pred()
(i.e. median of 3) down to:
- 3 comparisons and 4 conditional moves, or
- 4 min/max.
With that the compiler can eliminate any branch. This optimal
situation is attainable with Clang 21 on Arm64, RVA22 and x86,
with GCC 15 on Arm64 and x86 (RVA22 goes from 2 to 1 branch).
These optimisations also work on Arm32 and LoongArch.
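For illustration, the four min/max form of a median of 3 can be written as below. This is a sketch with hypothetical helper names, not necessarily the exact code added to mid_pred(); whether it compiles to branch-free code depends on the compiler and target.

```c
/* Hypothetical helpers; a compiler can typically lower these to
 * min/max instructions or conditional moves where available. */
static inline int min_int(int a, int b) { return a < b ? a : b; }
static inline int max_int(int a, int b) { return a > b ? a : b; }

/* Median of three values using exactly four min/max operations:
 * order a and b, clamp c from above by the larger one, then clamp
 * the result from below by the smaller one. */
static inline int median3_minmax(int a, int b, int c)
{
    int lo = min_int(a, b);
    int hi = max_int(a, b);
    return max_int(lo, min_int(hi, c));
}
```

For example, median3_minmax(7, 1, 4) evaluates lo = 1, hi = 7, and returns max(1, min(7, 4)) = 4.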
The same algorithm is already implemented via inline assembler for some
architectures such as x86 and Arm32, but notably not Arm64 and RVA22.
Besides, using C code allows the compiler to schedule instructions
properly.
Even on architectures with neither conditional moves nor min/max, this
leads to a visible performance improvement for C code, as seen here for
RVA20 code running on SiFive-U74:
Before:
sub_median_pred_c: 1657.5 ( 1.00x)
sub_median_pred_rvb_b: 875.9 ( 1.89x)
After:
sub_median_pred_c: 1331.9 ( 1.00x)
sub_median_pred_rvb_b: 881.8 ( 1.51x)
Note that this commit leaves the x86 and Arm32 code intact so it has
no effect on those ISAs.
When building DLLs on ARM64EC, the default use of `dumpbin
-linkermember:1` fails because ARM64EC static libraries use a
different linker member format. Use `-linkermember:32` for ARM64EC
to correctly extract symbols.
Additionally, MSVC inserts $exit_thunk and $entry_thunk symbols
for ARM64EC to handle x64 ↔ ARM64 transitions. These are internal
thunks and must not be exported. Filter them out when generating
the .def file to avoid unresolved symbols or invalid exports.
Trim out the leading '#' on ARM64EC function symbols. This is only
relevant on ARM64EC, but it is benign to do that filtering on
all architectures (such symbols aren't expected on other
architectures).
Simplify the sed command by removing the symbol address with a
sed expression instead of a later "cut" command.
This ensures correct symbol extraction and stable DLL generation
on ARM64EC targets, while keeping behavior unchanged for other
Windows architectures.
Explicitly spell it out that we are not going to modify the
individual libraries for the purposes of improving conformance
to ARM64EC.
We may (or may not) accept build system patches for making such
a build succeed, provided that it does not require changes to
the actual library code.
Since af97c9865f,
the return value of avio_read() has been compared against
a uint32_t, so that the int is converted to uint32_t for
the comparison (on common systems with 32-bit ints). The upshot was
that errors returned from avio_read() were ignored, so that
the buffer could be left uninitialized while the read was treated as successful.
Fix this by using ffio_read_size() instead.
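The pitfall can be reproduced in isolation. The sketch below assumes a less-than comparison and 32-bit int; the function names are hypothetical and this is not the demuxer code itself. The cast in the first function spells out what the implicit conversion does.

```c
#include <stdint.h>

/* Flawed check: with 32-bit int, `ret < size` is evaluated as an
 * unsigned comparison. The conversion is written out explicitly here.
 * A negative error code becomes a huge unsigned value and is therefore
 * never reported as a short read. */
static int is_short_read_buggy(int ret, uint32_t size)
{
    return (uint32_t)ret < size;
}

/* Corrected check: test for a negative error code before comparing
 * the byte count against the requested size. */
static int is_short_read_fixed(int ret, uint32_t size)
{
    return ret < 0 || (uint32_t)ret < size;
}
```

With an error return of -5 and a requested size of 16, the first function reports no problem while the second one does.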
Fixes: MemorySanitizer: use-of-uninitialized-value
Fixes: 443923343/clusterfuzz-testcase-minimized-ffmpeg_dem_FLAC_fuzzer-5458132865449984
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>