Mirror of https://github.com/FFmpeg/FFmpeg.git (synced 2026-02-04 14:30:55 +08:00)
This cannot beat the Zbb implementation, and it is unlikely that any realistic CPU design would support V but not Zbb. The best loop rewrite that I could come up with (4 shifts, 2 ANDs, 3 ORs) is still ~40% slower than Zbb. A genuinely faster vector implementation should be feasible with the cryptographic vector extensions, but that is a story for another time.
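For illustration only: the message does not name the operation, but the quoted cost (4 shifts, 2 ANDs, 3 ORs) matches the classic shift-and-mask byte swap of a 32-bit word. The scalar C sketch below assumes that is the operation in question. The names `bswap32_shift_mask` and `bswap32_buf_sketch` are hypothetical and are not taken from the FFmpeg tree. A vector version without a dedicated byte-reverse instruction would perform the same sequence per element.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (assumption: the operation is a 32-bit byte swap).
 * Uses exactly 4 shifts, 2 ANDs and 3 ORs. */
static inline uint32_t bswap32_shift_mask(uint32_t x)
{
    return (x << 24)                   /* byte 0 -> byte 3 */
         | ((x << 8) & 0x00FF0000u)    /* byte 1 -> byte 2 */
         | ((x >> 8) & 0x0000FF00u)    /* byte 2 -> byte 1 */
         | (x >> 24);                  /* byte 3 -> byte 0 */
}

/* Hypothetical buffer loop applying the swap to n 32-bit words. */
static void bswap32_buf_sketch(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = bswap32_shift_mask(src[i]);
}
```

With Zbb, the `rev8` instruction reverses the bytes of a whole register in one step, which is why a nine-operation sequence per element has a hard time competing.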