The faiure handling code is C and requires correct stack, global and
thread pointers. This restores them before returning to C. At the same
time, we no longer need to abort() afterwards.
Implement NEON optimization for compute_weights_line.
Also update the function signature to use ptrdiff_t for stack arguments
(max_meaningful_diff, startx, endx). This is done to unify the stack
layout between Apple platforms (which pack 32-bit stack arguments tightly)
and the generic AAPCS64 ABI (which requires 8-byte stack slots for 32-bit
arguments). Using ptrdiff_t ensures 8-byte slots are used on all AArch64
platforms, avoiding ABI mismatches with the assembly implementation.
The x86 AVX2 prototype is updated to match the new signature.
Performance benchmark (AArch64) in MacOS M4:
./tests/checkasm/checkasm --test=vf_nlmeans --bench
compute_weights_line_c: 151.1 ( 1.00x)
compute_weights_line_neon: 62.6 ( 2.42x)
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
If an input packet results in several output packets, the side data will be
exported only by the first output packet, and be missing from the rest.
Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
Signed-off-by: James Almer <jamrial@gmail.com>
Many parsers will request data until they find what will be the start of the
next assembled packet in order to decide where to cut the current one. If this
happens, the loop in demux.c will, in case the demuxer exports already fully
assembled packets as is sometimes the case for MPEG-TS, discard the already
handled first input packet before it tries to move its side data to the output.
The affected FATE tests reflect this change by no longer dropping the side data
from the first input packet, nor exporting every other side data in the wrong
output packet.
Signed-off-by: James Almer <jamrial@gmail.com>
This is in preparation for adding checkasm support for av_crc(),
which will always call the same function, but uses different CRC
tables to distinguish different implementations.
This reuses checkasm_check_func() for this; one could also add
a new function or use unions. This would allow to avoid casting
const away in the crc test to be added. It would also allow
to avoid converting function pointers to void* (which ISO C
does not allow).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To be able to reuse colors from the original frame, the last value returned by
`p()` is tracked in the eval state, and if it is assigned to a variable, the
original color components are copied to `color_vars`. Thus, commands like
`setcolor` and `colorstop` can use those variables:
setvar pixel (p(0, 0))
...
setcolor pixel
`fate-filter-drawvg-video` now also verifies the `p()` function.
Signed-off-by: Ayose <ayosec@gmail.com>
The arguments for `setvar` and `call` commands can be colors (like `#rrggbb`).
This replaces the previous trick of using `0xRRGGBBAA` values to use colors as
procedure arguments.
The parser stores colors as the value expected by Cairo (a `double[4]`). This
array is allocated on the heap so the size of the union in `VGSArgument` is
not increased (i.e. it is still 8 bytes, instead of 32).
Signed-off-by: Ayose <ayosec@gmail.com>
If AVOptionArrayDef.def is NULL, av_opt_is_set_to_default() should return true
when the field in the object is NULL.
Signed-off-by: James Almer <jamrial@gmail.com>
Some tools like v4l2-ctl dump data without skip padding. If the
padding size is not an integer multiple of the number of pixel
bytes, we cannot handle it by using a larger width, e.g.,, for
RGB24 image with 32 bytes padding in each line.
This function was assuming that the bits are MSB-aligned, but they are
LSB-aligned in both practice (and in the actual backend).
Also update the documentation of SwsPackOp to make this clearer.
Fixes an incorrect omission of a clamp after decoding e.g. rgb4, since
the max value range was incorrectly determined as 0 as a result of unpacking
the MSB bits instead of the LSB bits:
bgr4 -> gray:
[ u8 XXXX -> +XXX] SWS_OP_READ : 1 elem(s) packed >> 1
[ u8 .XXX -> +++X] SWS_OP_UNPACK : {1 2 1 0}
[ u8 ...X -> +++X] SWS_OP_SWIZZLE : 2103
[ u8 ...X -> +++X] SWS_OP_CONVERT : u8 -> f32
[f32 ...X -> .++X] SWS_OP_LINEAR : dot3 [...]
[f32 .XXX -> .++X] SWS_OP_DITHER : 16x16 matrix + {0 3 2 5}
+ [f32 .XXX -> .++X] SWS_OP_MIN : x <= {255 _ _ _}
[f32 .XXX -> +++X] SWS_OP_CONVERT : f32 -> u8
[ u8 .XXX -> +++X] SWS_OP_WRITE : 1 elem(s) planar >> 0
(X = unused, + = exact, 0 = zero)
Most EXIF metadata is in IFD0 and most EXIF payloads only contain
one IFD, but it is possible for there to be more IFDs after the
existing trailing one. exiftool and similar software report these IFDs
as IFD1, IFD2, etc. This commit reads those additional IFDs and attaches
them as dummy entries in the top-level IFD ranging from 0xFFFC down to
0xFFED, which are unused by the EXIF spec. The EXIF API is only able to
return and work with a single IFD, so by attaching it as a subdirectory
this metadata can be preserved.
This is done transparently through the read/write process. Upon parsing
an additional IFD1, it will be attached, but it will be written with
av_exif_write after IFD0 rather than as a subdirectory, as intended.
Existing files without more than one IFD, i.e. most files, will be unaffected
by this change, as well as API clients looking to parse specific fields, but
now more metadata is parsed and written, rather than simply being discarded
as trailing data.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
This avoids needing to use the extra_conf variable. That variable
is problematic for setting a value that contains spaces.
This adds options for another tool in the same fashion as other
tools were added in 523d688c2b.
Busybox-w32 uses regular Windows style paths with drive letters,
but with forward slashes; thus an absolute path starts with "c:/".
Make the target_path() function in fate-run.sh (which converts a
potentially relative path to an absolute one, under the target_path
prefix) handle this case.
With this in place, running fate tests almost works in
busybox-w32 - only one issue remains. A patch [1] has been sent to
upstream busybox for fixing that issue (which also is present if
running fate tests on busybox on Linux), but it hasn't been
responded to yet.
[1] https://lists.busybox.net/pipermail/busybox/2025-December/091851.html
Use the concat protocol, to test the parser's capabilities to differentiate between
EOC maker before SOC marker, on top of false EOC marker positives and EOC maker on EOF.
Signed-off-by: James Almer <jamrial@gmail.com>
This requires updating the order of the dither matrix offsets as well. In
particular, this can only be done cleanly if the dither matrix offsets are
compatible between duplicated channels; but this should be guaranteed after
the previous commits.
As a side note, this also fixes a bug where we pushed SWS_OP_SWIZZLE past
SWS_OP_DITHER even for very low bit depth output (e.g. rgb4), which led to
a big loss in accuracy due to loss of per-channel dither noise.
Improves loss of e.g. gray -> rgb444 from 0.00358659 to 0.00261414,
now beating legacy swscale in these cases as well.
On the surface, this trades a tiny bit of PSNR for not introducing chroma
noise into grayscale images. However, the main reason for this change is
actually motivated by a desire to avoid regressing the status quo of
duplicating swizzles being able to be commuted past dither ops.
To improve decorrelation between components, we offset the dither matrix
slightly for each component. This is currently done by adding a hard-coded
offset of {0, 3, 2, 5} to each of the four components, respectively.
However, this represents a serious challenge when re-ordering SwsDitherOp
past a swizzle, or when splitting an SwsOpList into multiple sub-operations
(e.g. for decoupling luma from subsampled chroma when they are independent).
To fix this on a fundamental level, we have to keep track of the offset per
channel as part of the SwsDitherOp metadata, and respect those values at
runtime.
This commit merely adds the metadata; the update to the underlying backends
will come in a follow-up commit. The FATE change is merely due to the
added offsets in the op list print-out.
The format does not contain audio timestamps and the calculated audio pts
values were only correct for compressed audio. It is better to remove PTS
calculation entirely and let the generic code handle it.
Fixes ticket #8595.
Fixes#20983.
Fate test changes are because of the different audio time base which is now
always 1/sample_rate.
Signed-off-by: Marton Balint <cus@passwd.hu>
After the full ffmpeg CLI multithreading changes went in, this
test started depending on how far the input side read and decoded
the input compared to how quickly the output encoded things, causing
spurious failures on the CI.
To my knowledge all of the failures have so far been valid correct
results, but unfortunately FATE's built in checks mostly consist of
whether there is a difference against an exact result.
This way we still get the CI and valgrind running of the code,
but stop its comparison. Reference file is left around so that
the previous reference is still available.