Refactor x86 JITing of LDX, STX, CALL instructions into separate helper
functions. No functional changes in LDX and STX helpers. There is a minor
change in CALL helper. It will populate target address correctly on the first
pass of JIT instead of second pass. That won't reduce total number of JIT
passes though.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-3-ast@kernel.org
In preparation for static_call and variable size jump_label support,
teach text_poke_bp() to emulate instructions, namely:
JMP32, JMP8, CALL, NOP2, NOP_ATOMIC5, INT3
The current text_poke_bp() takes a @handler argument which is used as
a jump target when the temporary INT3 is hit by a different CPU.
When patching CALL instructions, this doesn't work because we'd miss
the PUSH of the return address. Instead, teach poke_int3_handler() to
emulate an instruction, typically the instruction we're patching in.
This fits almost all text_poke_bp() users, except
arch_unoptimize_kprobe() which restores random text, and for that site
we have to build an explicit emulate instruction.
Tested-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191111132457.529086974@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 8c7eebc10687af45ac8e40ad1bac0cf7893dba9f)
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The upcoming s390 branch length extension patches rely on "passes do
not increase code size" property in order to consistently choose between
short and long branches. Currently this property does not hold between
the first and the second passes for register save/restore sequences, as
well as various code fragments that depend on SEEN_* flags.
Generate the code during the first pass conservatively: assume register
save/restore sequences have the maximum possible length, and that all
SEEN_* flags are set.
Also refuse to JIT if this happens anyway (e.g. due to a bug), as this
might lead to verifier bypass once long branches are introduced.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191114151820.53222-1-iii@linux.ibm.com
Currently passing alignment greater than 4 to bpf_jit_binary_alloc does
not work: in such cases it silently aligns only to 4 bytes.
On s390, in order to load a constant from memory in a large (>512k) BPF
program, one must use lgrl instruction, whose memory operand must be
aligned on an 8-byte boundary.
This patch makes it possible to request 8-byte alignment from
bpf_jit_binary_alloc, and also makes it issue a warning when an
unsupported alignment is requested.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191115123722.58462-1-iii@linux.ibm.com
When installing kselftests to its own directory and run the
test_lwt_ip_encap.sh it will complain that test_lwt_ip_encap.o can't be
found. Same with the test_tc_edt.sh test it will complain that
test_tc_edt.o can't be found.
$ ./test_lwt_ip_encap.sh
starting egress IPv4 encap test
Error opening object test_lwt_ip_encap.o: No such file or directory
Object hashing failed!
Cannot initialize ELF context!
Failed to parse eBPF program: Invalid argument
Rework to add test_lwt_ip_encap.o and test_tc_edt.o to TEST_FILES so the
object file gets installed when installing kselftest.
Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191111161728.8854-1-anders.roxell@linaro.org
With latest llvm compiler, running test_progs will have the following
verifier failure for test_sysctl_loop1.o:
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
invalid indirect read from stack var_off (0x0; 0xff)+196 size 7
...
libbpf: -- END LOG --
libbpf: failed to load program 'cgroup/sysctl'
libbpf: failed to load object 'test_sysctl_loop1.o'
The related bytecode looks as below:
0000000000000308 LBB0_8:
97: r4 = r10
98: r4 += -288
99: r4 += r7
100: w8 &= 255
101: r1 = r10
102: r1 += -488
103: r1 += r8
104: r2 = 7
105: r3 = 0
106: call 106
107: w1 = w0
108: w1 += -1
109: if w1 > 6 goto -24 <LBB0_5>
110: w0 += w8
111: r7 += 8
112: w8 = w0
113: if r7 != 224 goto -17 <LBB0_8>
And source code:
for (i = 0; i < ARRAY_SIZE(tcp_mem); ++i) {
ret = bpf_strtoul(value + off, MAX_ULONG_STR_LEN, 0,
tcp_mem + i);
if (ret <= 0 || ret > MAX_ULONG_STR_LEN)
return 0;
off += ret & MAX_ULONG_STR_LEN;
}
Current verifier is not able to conclude that register w0 before '+'
at insn 110 has a range of 1 to 7 and thinks it is from 0 - 255. This
leads to more conservative range for w8 at insn 112, and later verifier
complaint.
Let us workaround this issue until we found a compiler and/or verifier
solution. The workaround in this patch is to make variable 'ret' volatile,
which will force a reload and then '&' operation to ensure better value
range. With this patch, I got the below byte code for the loop:
0000000000000328 LBB0_9:
101: r4 = r10
102: r4 += -288
103: r4 += r7
104: w8 &= 255
105: r1 = r10
106: r1 += -488
107: r1 += r8
108: r2 = 7
109: r3 = 0
110: call 106
111: *(u32 *)(r10 - 64) = r0
112: r1 = *(u32 *)(r10 - 64)
113: if w1 s< 1 goto -28 <LBB0_5>
114: r1 = *(u32 *)(r10 - 64)
115: if w1 s> 7 goto -30 <LBB0_5>
116: r1 = *(u32 *)(r10 - 64)
117: w1 &= 7
118: w1 += w8
119: r7 += 8
120: w8 = w1
121: if r7 != 224 goto -21 <LBB0_9>
Insn 117 did the '&' operation and we got more precise value range
for 'w8' at insn 120. The test is happy then:
#3/17 test_sysctl_loop1.o:OK
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191107170045.2503480-1-yhs@fb.com
Magnus Karlsson says:
====================
This patch set extends libbpf and the xdpsock sample program to
demonstrate the shared umem mode (XDP_SHARED_UMEM) as well as Rx-only
and Tx-only sockets. This in order for users to have an example to use
as a blue print and also so that these modes will be exercised more
frequently.
Note that the user needs to supply an XDP program with the
XDP_SHARED_UMEM mode that distributes the packets over the sockets
according to some policy. There is an example supplied with the
xdpsock program, but there is no default one in libbpf similarly to
when XDP_SHARED_UMEM is not used. The reason for this is that I felt
that supplying one that would work for all users in this mode is
futile. There are just tons of ways to distribute packets, so whatever
I come up with and build into libbpf would be wrong in most cases.
This patch has been applied against commit 30ee348c12 ("Merge branch 'bpf-libbpf-fixes'")
Structure of the patch set:
Patch 1: Adds shared umem support to libbpf
Patch 2: Shared umem support and example XPD program added to xdpsock sample
Patch 3: Adds Rx-only and Tx-only support to libbpf
Patch 4: Uses Rx-only sockets for rxdrop and Tx-only sockets for txpush in
the xdpsock sample
Patch 5: Add documentation entries for these two features
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Toke Høiland-Jørgensen says:
====================
This series fixes a few bugs in libbpf that I discovered while playing around
with the new auto-pinning code, and writing the first utility in xdp-tools[0]:
- If object loading fails, libbpf does not clean up the pinnings created by the
auto-pinning mechanism.
- EPERM is not propagated to the caller on program load
- Netlink functions write error messages directly to stderr
In addition, libbpf currently only has a somewhat limited getter function for
XDP link info, which makes it impossible to discover whether an attached program
is in SKB mode or not. So the last patch in the series adds a new getter for XDP
link info which returns all the information returned via netlink (and which can
be extended later).
Finally, add a getter for BPF program size, which can be used by the caller to
estimate the amount of locked memory needed to load a program.
A selftest is added for the pinning change, while the other features were tested
in the xdp-filter tool from the xdp-tools repo. The 'new-libbpf-features' branch
contains the commits that make use of the new XDP getter and the corrected EPERM
error code.
[0] https://github.com/xdp-project/xdp-tools
Changelog:
v4:
- Don't do any size checks on struct xdp_info, just copy (and/or zero)
whatever size the caller supplied.
v3:
- Pass through all kernel error codes on program load (instead of just EPERM).
- No new bpf_object__unload() variant, just do the loop at the caller
- Don't reject struct xdp_info sizes that are bigger than what we expect.
- Add a comment noting that bpf_program__size() returns the size in bytes
v2:
- Keep function names in libbpf.map sorted properly
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, libbpf only provides a function to get a single ID for the XDP
program attached to the interface. However, it can be useful to get the
full set of program IDs attached, along with the attachment mode, in one
go. Add a new getter function to support this, using an extendible
structure to carry the information. Express the old bpf_get_link_id()
function in terms of the new function.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157333185164.88376.7520653040667637246.stgit@toke.dk
When loading an eBPF program, libbpf overrides the return code for EPERM
errors instead of returning it to the caller. This makes it hard to figure
out what went wrong on load.
In particular, EPERM is returned when the system rlimit is too low to lock
the memory required for the BPF program. Previously, this was somewhat
obscured because the rlimit error would be hit on map creation (which does
return it correctly). However, since maps can now be reused, object load
can proceed all the way to loading programs without hitting the error;
propagating it even in this case makes it possible for the caller to react
appropriately (and, e.g., attempt to raise the rlimit before retrying).
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk
Since the automatic map-pinning happens during load, it will leave pinned
maps around if the load fails at a later stage. Fix this by unpinning any
pinned maps on cleanup. To avoid unpinning pinned maps that were reused
rather than newly pinned, add a new boolean property on struct bpf_map to
keep track of whether that map was reused or not; and only unpin those maps
that were not reused.
Fixes: 57a00f4164 ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk
Since, the new syntax of BTF-defined map has been introduced,
the syntax for using maps under samples directory are mixed up.
For example, some are already using the new syntax, and some are using
existing syntax by calling them as 'legacy'.
As stated at commit abd29c9314 ("libbpf: allow specifying map
definitions using BTF"), the BTF-defined map has more compatablility
with extending supported map definition features.
The commit doesn't replace all of the map to new BTF-defined map,
because some of the samples still use bpf_load instead of libbpf, which
can't properly create BTF-defined map.
This will only updates the samples which uses libbpf API for loading bpf
program. (ex. bpf_prog_load_xattr)
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, under samples, several methods are being used to load bpf
program.
Since using libbpf is preferred solution, lots of previously used
'load_bpf_file' from bpf_load are replaced with 'bpf_prog_load_xattr'
from libbpf.
But some of the error messages still show up as 'load_bpf_file' instead
of 'bpf_prog_load_xattr'.
This commit fixes outdated errror messages under samples and fixes some
code style issues.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191107005153.31541-2-danieltimlee@gmail.com
This patch adds array support to btf_struct_access().
It supports array of int, array of struct and multidimensional
array.
It also allows using u8[] as a scratch space. For example,
it allows access the "char cb[48]" with size larger than
the array's element "char". Another potential use case is
"u64 icsk_ca_priv[]" in the tcp congestion control.
btf_resolve_size() is added to resolve the size of any type.
It will follow the modifier if there is any. Please
see the function comment for details.
This patch also adds the "off < moff" check at the beginning
of the for loop. It is to reject cases when "off" is pointing
to a "hole" in a struct.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191107180903.4097702-1-kafai@fb.com
Andrii Nakryiko says:
====================
Github's mirror of libbpf got LGTM and Coverity statis analysis running
against it and spotted few real bugs and few potential issues. This patch
series fixes found issues.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If we get ELF file with "maps" section, but no symbols pointing to it, we'll
end up with division by zero. Add check against this situation and exit early
with error. Found by Coverity scan against Github libbpf sources.
Fixes: bf82927125 ("libbpf: refactor map initialization")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com
Coverity scan against Github libbpf code found the issue of not freeing memory and
leaving already freed memory still referenced from bpf_program. Fix it by
re-assigning successfully reallocated memory sooner.
Fixes: 2993e0515b ("tools/bpf: add support to read .BTF.ext sections")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107020855.3834758-2-andriin@fb.com
Fix issue reported by static analysis (Coverity). If bpf_prog_get_fd_by_id()
fails, xsk_lookup_bpf_maps() will fail as well and clean-up code will attempt
close() with fd=-1. Fix by checking bpf_prog_get_fd_by_id() return result and
exiting early.
Fixes: 10a13bb40e ("libbpf: remove qidconf and better support external bpf programs.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107054059.313884-1-andriin@fb.com
A BPF program may consist of 1m instructions, which means JIT
instruction-address mapping can be as large as 4m. s390 has
FORCE_MAX_ZONEORDER=9 (for memory hotplug reasons), which means maximum
kmalloc size is 1m. This makes it impossible to JIT programs with more
than 256k instructions.
Fix by using kvcalloc, which falls back to vmalloc for larger
allocations. An alternative would be to use a radix tree, but that is
not supported by bpf_prog_fill_jited_linfo.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107141838.92202-1-iii@linux.ibm.com
When compiling larger programs with bpf_asm, it's possible to
accidentally exceed jt/jf range, in which case it won't complain, but
rather silently emit a truncated offset, leading to a "happy debugging"
situation.
Add a warning to help detecting such issues. It could be made an error
instead, but this might break compilation of existing code (which might
be working by accident).
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191107100349.88976-1-iii@linux.ibm.com
Streamline BPF_CORE_READ_BITFIELD_PROBED interface to follow
BPF_CORE_READ_BITFIELD (direct) and BPF_CORE_READ, in general, i.e., just
return read result or 0, if underlying bpf_probe_read() failed.
In practice, real applications rarely check bpf_probe_read() result, because
it has to always work or otherwise it's a bug. So propagating internal
bpf_probe_read() error from this macro hurts usability without providing real
benefits in practice. This patch fixes the issue and simplifies usage,
noticeable even in selftest itself.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191106201500.2582438-1-andriin@fb.com
As part of 42765ede5c ("selftests/bpf: Remove too strict field offset relo
test cases"), few ints relocations negative (supposed to fail) tests were
removed, but not completely. Due to them being negative, some leftovers in
prog_tests/core_reloc.c went unnoticed. Clean them up.
Fixes: 42765ede5c ("selftests/bpf: Remove too strict field offset relo test cases")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191106173659.1978131-1-andriin@fb.com
Andrii Nakryiko says:
====================
This patch set adds support for reading bitfields in a relocatable manner
through a set of relocations emitted by Clang, corresponding libbpf support
for those relocations, as well as abstracting details into
BPF_CORE_READ_BITFIELD/BPF_CORE_READ_BITFIELD_PROBED macro.
We also add support for capturing relocatable field size, so that BPF program
code can adjust its logic to actual amount of data it needs to operate on,
even if it changes between kernels. New convenience macro is added to
bpf_core_read.h (bpf_core_field_size(), in the same family of macro as
bpf_core_read() and bpf_core_field_exists()). Corresponding set of selftests
are added to excercise this logic and validate correctness in a variety of
scenarios.
Some of the overly strict logic of matching fields is relaxed to support wider
variety of scenarios. See patch #1 for that.
Patch #1 removes few overly strict test cases.
Patch #2 adds support for bitfield-related relocations.
Patch #3 adds some further adjustments to support generic field size
relocations and introduces bpf_core_field_size() macro.
Patch #4 tests bitfield reading.
Patch #5 tests field size relocations.
v1 -> v2:
- added direct memory read-based macro and tests for bitfield reads.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add a bunch of selftests verifying correctness of relocatable bitfield reading
support in libbpf. Both bpf_probe_read()-based and direct read-based bitfield
macros are tested. core_reloc.c "test_harness" is extended to support raw
tracepoint and new typed raw tracepoints as test BPF program types.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-5-andriin@fb.com
Add support for the new field relocation kinds, necessary to support
relocatable bitfield reads. Provide macro for abstracting necessary code doing
full relocatable bitfield extraction into u64 value. Two separate macros are
provided:
- BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs
(e.g., typed raw tracepoints). It uses direct memory dereference to extract
bitfield backing integer value.
- BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs
to be used to extract same backing integer value.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com
drivers/isdn/hardware/mISDN/mISDNisar.c:30:17:
warning: faxmodulation_s defined but not used [-Wunused-const-variable=]
It is never used, so can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IDT ClockMatrix (TM) family includes integrated devices that provide
eight PLL channels. Each PLL channel can be independently configured as a
frequency synthesizer, jitter attenuator, digitally controlled
oscillator (DCO), or a digital phase lock loop (DPLL). Typically
these devices are used as timing references and clock sources for PTP
applications. This patch adds support for the device.
Co-developed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>