POST_INC is a code that's only supposed to be valid in an address, so
its cost should only be calculated through the TARGET_ADDRESS_COST hook,
not by the TARGET_RTX_COSTS hook. But, because rtx_cost does not
special-case MEM costs by calling TARGET_ADDRESS_COST, we get here as
part of e.g. the auto-inc-dec and combine passes, so deal with it for
the time being. Without this, the cost is the value of size_factor *
COSTS_N_INSNS (1), i.e. 4 per word. There's no obvious observable
effect for generated code (coremark, libgcc and newlib-libc checked
for -march=v10), but it may make a difference in the future, so be
safe and correct the cost.
Tested at r16-6493-ge77ba7ef8c75 for cris-elf. That the cost actually
changes is most simply observed by passing -dp when compiling
int incref(int n, char *p)
{
  int sum = 0;
  while (n--)
    sum += *p++;
  return sum;
}
and seeing that the cost for the single autoincrement is changed from e.g.
adds.b [$r11+],$r10 ;# 15 [c=12 l=2] *addsqisi_swap/1
to
adds.b [$r11+],$r10 ;# 15 [c=8 l=2] *addsqisi_swap/1
gcc:
* config/cris/cris.cc (cris_rtx_costs) <POST_INC>: Handle POST_INC
as ZERO_EXTEND and SIGN_EXTEND, i.e. as an operator without cost.
This header is not used any more and its inclusion is problematic
when building against Helix Cert as it might end up dragging LLVM-specific
headers from spinLockLib.h.
libgcc/
* config/gthr-vxworks.h: Remove #include of tickLib.h.
PR libfortran/123012
libgfortran/ChangeLog:
* io/list_read.c (read_character): Add new check when no
quote is provided and the character string is digits only.
gcc/testsuite/ChangeLog:
* gfortran.dg/namelist_100.f90: New test.
Filip's recent change to re-enable switch conversion at -Og triggered a
regression on the mcore-elf target.
If we look at tree-switch-conversion.cc we have this:
  if (flag_pic)
    return false;
The mcore-elf port defines a dummy ASM_OUTPUT_ADDR_DIFF_ELT which is designed
to trigger an assembler syntax error and thus fail loudly. That definition
comes from a time when it appears we had to define that macro in every port,
even if it wasn't being used.
These days we do not need to define that macro unless it's really needed. And
a definition like the one for mcore-elf will cause problems
(compile/pr69102.c). That definition has also been the cause of a long
standing failure in the port (gcc.dg/pr47446-2.c).
Naturally this has been through a round of testing where it fixes the two
issues noted above without any regressions.
gcc/
* config/mcore/mcore.h (ASM_OUTPUT_ADDR_DIFF_ELT): Remove.
This patch adds support for _Float16. At the time of writing, there is
no hardware _Float16 support on s390. Therefore, _Float16 operations
have to be extended and truncated, which is supported via soft-fp.
The ABI demands that _Float16 values are left-aligned in FP registers,
as is already the case for 32-bit FP values. If vector
extensions are available, copying between left-aligned FPRs and
right-aligned GPRs is natively supported. Without vector extensions,
the alignment has to be taken care of manually. For target z10,
instructions lgdr/ldgr can be used in conjunction with shifts. Copying
via lgdr from an FPR into a GPR is the easy case since for the shift the
target GPR can be utilized. However, copying via ldgr from a GPR into a
FPR requires a secondary reload register which is used for the shift
result and is then copied into the FPR. Prior to z10, there is no hardware
support for copying directly between FPRs and GPRs. Therefore, in
order to copy from a GPR into an FPR we would require a secondary reload
register for the shift and secondary memory for copying the aligned
value. Since this is not supported, _Float16 support starts with z10.
As a consequence, for all targets older than z10 the test
libstdc++-abi/abi_check fails.
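To illustrate the alignment issue, here is a minimal sketch in plain
integer arithmetic. It assumes 64-bit register images; the helper names
are hypothetical and not part of the patch, and the shift count of 48
simply follows from placing a 16-bit value in the leftmost bits of a
64-bit register.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only (not taken from the patch).
// A 16-bit value that is right-aligned in a 64-bit GPR occupies the low
// 16 bits; left-aligned in a 64-bit FPR it occupies the high 16 bits.
constexpr unsigned hf_shift = 64 - 16;

// GPR image -> FPR image: shift the value up before the register copy.
constexpr std::uint64_t gpr_to_fpr_image(std::uint64_t gpr)
{
  return (gpr & 0xffffu) << hf_shift;
}

// FPR image -> GPR image: shift the value down after the register copy.
constexpr std::uint64_t fpr_to_gpr_image(std::uint64_t fpr)
{
  return fpr >> hf_shift;
}
```

In the GPR-to-FPR direction the shifted value needs somewhere to live
before the copy, which is where the secondary reload register mentioned
above comes in.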
gcc/ChangeLog:
* config/s390/s390-modes.def (FLOAT_MODE): Add HF mode.
(VECTOR_MODE): Add V{1,2,4,8,16}HF modes.
* config/s390/s390.cc (s390_scalar_mode_supported_p): For 64-bit
targets z10 and newer support HF mode.
(s390_vector_mode_supported_p): Add HF mode.
(s390_register_move_cost): Keep HF mode operands in registers.
(s390_legitimate_constant_p): Support zero constant.
(s390_secondary_reload): For GPR to FPR moves a secondary reload
register is required.
(s390_secondary_memory_needed): GPR<->FPR moves don't require
secondary memory.
(s390_libgcc_floating_mode_supported_p): For 64-bit targets z10
and newer support HF mode.
(s390_hard_regno_mode_ok): Allow HF mode for FPRs and VRs.
(s390_function_arg_float): Consider HF mode, too.
(s390_excess_precision): For EXCESS_PRECISION_TYPE_FLOAT16
return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16.
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Define.
* config/s390/s390.md (movhf): Define.
(reload_half_gprtofpr_z10): Define.
(signbithf2): Define.
* config/s390/vector.md: Add new vector modes to various
iterators.
libgcc/ChangeLog:
* config.host: Include s390/t-float16.
* config/s390/libgcc-glibc.ver: Export symbols
__trunc{sf,df,tf}hf2, __extendhf{sf,df,tf}2, __fix{,uns}hfti,
__float{,un}tihf, __floatbitinthf.
* config/s390/t-softfp: Add to softfp_extras instead of setting
it.
* configure: Regenerate.
* configure.ac: Support float16 only for 64-bit targets z10 and
newer.
* config/s390/_dpd_dd_to_hf.c: New file.
* config/s390/_dpd_hf_to_dd.c: New file.
* config/s390/_dpd_hf_to_sd.c: New file.
* config/s390/_dpd_hf_to_td.c: New file.
* config/s390/_dpd_sd_to_hf.c: New file.
* config/s390/_dpd_td_to_hf.c: New file.
* config/s390/t-float16: New file.
libstdc++-v3/ChangeLog:
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Add
names {,P,K}DF16.
gcc/testsuite/ChangeLog:
* g++.target/s390/float16-1.C: New test.
* g++.target/s390/float16-2.C: New test.
* gcc.target/s390/float16-1-2.h: New test.
* gcc.target/s390/float16-1.c: New test.
* gcc.target/s390/float16-10.c: New test.
* gcc.target/s390/float16-2.c: New test.
* gcc.target/s390/float16-3.c: New test.
* gcc.target/s390/float16-4.c: New test.
* gcc.target/s390/float16-5.c: New test.
* gcc.target/s390/float16-6.c: New test.
* gcc.target/s390/float16-7.c: New test.
* gcc.target/s390/float16-8.c: New test.
* gcc.target/s390/float16-9.c: New test.
* gcc.target/s390/float16-signbit.h: New test.
* gcc.target/s390/vector/vec-extract-4.c: New test.
* gcc.target/s390/vector/vec-float16-1.c: New test.
Similar to the changes in r16-6620, the improved gnatwu warning finds a 'use'
clause that is not needed in s-osinte__darwin.adb, leading to a bootstrap
failure when building the libraries.
Fixed by removing the extraneous 'use' clause.
gcc/ada/ChangeLog:
* libgnarl/s-osinte__darwin.adb: Delete unneeded use clause.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
On the mingw32 target, std::system_category().message(int) uses the
FormatMessage API to format error messages. When the error message
contains insert sequences, it is unsafe not to use the
FORMAT_MESSAGE_IGNORE_INSERTS flag, as seen at:
https://devblogs.microsoft.com/oldnewthing/20071128-00/?p=24353
The output of FormatMessage ends with "\r\n" and includes a full stop
character used by the current thread's UI language. Now, we will remove
"\r\n" and any trailing '.' from the output in any language environment.
In the testsuite for std::system_category().message(int), we first
switch the thread UI language to en-US to meet expectations in any
language environment.
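The trimming step can be sketched portably as follows. This is only an
illustration under assumptions: trim_message is a hypothetical helper,
not the code in system_error.cc, and it strips any run of trailing
'\r' and '\n' characters followed by any run of trailing '.' characters.

```cpp
#include <string>

// Hypothetical helper, for illustration only: remove trailing line-end
// characters and then any trailing '.' from a formatted message.
std::string trim_message(std::string msg)
{
  while (!msg.empty() && (msg.back() == '\r' || msg.back() == '\n'))
    msg.pop_back();
  while (!msg.empty() && msg.back() == '.')
    msg.pop_back();
  return msg;
}
```

Only the ASCII '.' is handled here; a full stop that is a different
character in some UI language is left alone by this sketch.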
libstdc++-v3/ChangeLog:
* src/c++11/system_error.cc (system_error_category) [_WIN32]:
Use FormatMessageA function instead of FormatMessage macro.
* testsuite/19_diagnostics/error_category/system_category.cc:
Fix typo in __MINGW32__ macro name. Adjust behavior on the
mingw32 target.
So it turns out LOOPS_MAY_HAVE_MULTIPLE_LATCHES is set in various places
during compilation. Setting it only means there might be multiple
latches currently. It does not mean let's go in and delete them
all, which is what remove_forwarder_block does currently. This
was happening before my set of patches too, but since it was
only happening in the merge_phi pass, latches were not cleared away
all of the time and then recreated.
This solves the problem by protecting latches all of the time
instead of being dependent on LOOPS_MAY_HAVE_MULTIPLE_LATCHES not being set.
vect-uncounted_7.c needs to be xfailed here because we no longer
vectorize the code. Note the IR between GCC 15 and after this patch
is the same, so I think this was just a case where the testcase
was added after the remove forwarder changes and should not have
vectorized (or should have vectorized differently).
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/123417
gcc/ChangeLog:
* tree-cfgcleanup.cc (maybe_remove_forwarder_block): Always
protect latches.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-uncounted_7.c: xfail vect test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
As written earlier, the config-ml.in change from the
--with-multi-buildlist patch broke the build of Ada. Ada uses
RTSDIR = rts$(subst /,_,$(MULTISUBDIR))
and expects that the primary multilib will result in rts
rather than in the rts_. that it results in after the
--with-multi-buildlist changes.
The following patch fixes it by restoring previous behavior for
ml_subdir / MULTISUBDIR such that for primary multilib it is
still empty rather than /.
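The effect of the RTSDIR definition can be modelled with a short sketch.
The function below is a hypothetical C++ stand-in for the make
expression, not code from the tree; the inputs "" and "/." are the two
MULTISUBDIR values discussed above, and "/32" is an assumed example of a
non-primary multilib.

```cpp
#include <algorithm>
#include <string>

// Hypothetical model of: RTSDIR = rts$(subst /,_,$(MULTISUBDIR))
std::string rtsdir(std::string multisubdir)
{
  // $(subst /,_,...) replaces every '/' with '_'.
  std::replace(multisubdir.begin(), multisubdir.end(), '/', '_');
  return "rts" + multisubdir;
}
```

With an empty MULTISUBDIR this yields rts, while "/." yields rts_.,
which is the directory name Ada did not expect.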
2026-01-10 Jakub Jelinek <jakub@redhat.com>
PR ada/123490
* config-ml.in: Restore ml_subdir being empty instead of /.
for the primary multilib.
While gimple_call_combined_fn already calls
gimple_builtin_call_types_compatible_p and for most builtins ensures
the right types of arguments, for type-generic builtins it does not;
from the POV of that function those functions are rettype (...).
Now, while the FE does some checking of the number of arguments for the
type-generic builtins, as the testcase below shows, it can be gamed.
So, this patch checks the number of arguments for type-generic builtins
and does nothing if they have an unexpected number of arguments.
Also, for returns_arg it verifies that it can access the first argument.
2026-01-10 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/123431
* gimple-range-op.cc (gimple_range_op_handler::maybe_builtin_call):
Punt if type-generic builtins with a single argument don't have
exactly one argument. For returns_arg punt if call doesn't have
at least one argument.
* gcc.dg/pr123431.c: New test.
We accept a mismatch in qualifiers for enumerations and integers
because we switch to the underlying type before checking that qualifiers
match.
PR c/123435
PR c/123463
gcc/c/ChangeLog:
* c-typeck.cc (comptypes_internal): Test for qualifiers first.
gcc/testsuite/ChangeLog:
* gcc.dg/pr123435-1.c: New test.
* gcc.dg/pr123435-2.c: New test.
* gcc.dg/pr123463.c: New test.
RVV's vectors can get very large with LMUL8. In the PR we have
256-element char vectors which get permuted. For permuting them
we use a mask vectype that is deduced from the element type
without checking if the permute indices fit this type.
That leads to an invalid permute mask which gets optimized away.
This patch uses ssizetype as masktype instead.
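Why an element-sized mask type can be too narrow is easy to see with a
small sketch. It assumes, purely for illustration, an 8-bit unsigned
mask element and a permute selecting from two 256-element inputs (so
indices up to 511); the exact index range and signedness in the PR are
not restated here, and the function names are hypothetical.

```cpp
#include <cstdint>

// Illustration only: store a permute index in an 8-bit element and
// read it back.  Values above 255 wrap modulo 256.
constexpr unsigned index_as_8bit(unsigned index)
{
  return static_cast<std::uint8_t>(index);
}

// Does every index in [0, count) survive the round trip?
constexpr bool indices_fit_8bit(unsigned count)
{
  for (unsigned i = 0; i < count; ++i)
    if (index_as_8bit(i) != i)
      return false;
  return true;
}
```

An index such as 300 silently becomes 44 in this model, which is the
kind of invalid mask that a wider mask type avoids.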
PR tree-optimization/123414
gcc/ChangeLog:
* tree-ssa-forwprop.cc (simplify_vector_constructor):
Use ssizetype as mask type.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr123414.c: New test.
As analyzed by Steve, on FreeBSD __gthread_t is a pointer type.
I thought it the cleanest solution to remove the #ifdef in gfc_unit,
make the "self" member an intptr_t and cast the return value of
__gthread_self () to that type.
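A minimal sketch of the idea, under stated assumptions: the types below
are hypothetical stand-ins, not the gthr API, and the integer-typed
handle is only an assumed example of a target where the thread ID is not
a pointer. In both cases the value fits a single intptr_t member.

```cpp
#include <cstdint>

// Hypothetical stand-ins for a thread handle type; not libgcc's gthr API.
struct opaque_thread;
using pointer_thread_t = opaque_thread *;  // pointer-typed handle
using integer_thread_t = unsigned long;    // assumed integer-typed handle

// Either flavour of handle is stored as the same integer type.
inline std::intptr_t thread_key(pointer_thread_t t)
{
  return reinterpret_cast<std::intptr_t>(t);
}

inline std::intptr_t thread_key(integer_thread_t t)
{
  return static_cast<std::intptr_t>(t);
}
```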
PR fortran/123512
libgfortran/ChangeLog:
* io/io.h: Change type of self to intptr_t.
* io/async.h (LOCK_UNIT): Cast __gthread_self () to intptr_t.
(TRYLOCK_UNIT): Likewise.
(OWN_THREAD_ID): Likewise.
On Fri, Jan 09, 2026 at 05:54:47PM +0000, Joseph Myers wrote:
> I think updates to gcc/config/loongarch/genopts/gen-evolution.awk's calls
> to copyright_header are needed as well. At present, building for
> loongarch can result in files in the source tree being reverted to older
> copyright dates because the generation hasn't been updated (discovered via
> my glibc bot with GCC mainline stopping updating its GCC source tree
> because such modifications appeared in the sources). Of course this also
> shows up missing entries in contrib/gcc_update for the three files
> generated by gen-evolution.awk.
gen-evolution.awk was explicitly blacklisted,
and so was gen-cxxapi-file.py, both because update-copyright.py
also matched a Copyright line within the printing code, but that line
wasn't in the expected form.
Fixed by making sure the printing code doesn't match it by using
print " Copy" "right (C) " ... in the awk case and
Copy{:s}right in the python case (with "" arg added).
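The trick relies on the word never appearing contiguously in the
generator's own source, even though the generated output still contains
it. A rough sketch, assuming a scanner that simply searches for the
contiguous word (the real pattern used by update-copyright.py is not
reproduced here, and mentions_copyright is a hypothetical name):

```cpp
#include <string>

// Illustration only: look for the contiguous word "Copyright" in one
// line of source text.
bool mentions_copyright(const std::string &source_line)
{
  return source_line.find("Copyright") != std::string::npos;
}
```

A real header line matches, while a source line of the form
print " Copy" "right (C) " does not, although awk concatenates the two
adjacent strings when the script runs.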
2026-01-09 Jakub Jelinek <jakub@redhat.com>
contrib/
* update-copyright.py (GCCFilter): Don't filter out
gen-evolution.awk and gen-cxxapi-file.py.
gcc/
* config/loongarch/genopts/gen-evolution.awk: Update
copyright year.
(copyright_header): Separate parts of Copyright word
with " " so that it doesn't get matched by update-copyright.py.
(gen_full_header, gen_full_source, gen_full_def): Include
2026 year in the ranges.
gcc/cp/
* gen-cxxapi-file.py: Update copyright year. Separate
parts of Copyright word with {:s} so that it doesn't get matched
by update-copyright.py.
More simplification/consolidation of some callback logic in analyzer in
favor of using the analyzer pub/sub channel.
No functional change intended.
gcc/analyzer/ChangeLog:
* common.h (struct on_frame_popped): New.
(subscriber::on_message): New vfunc for on_frame_popped.
* region-model.cc: Include "context.h" and "channels.h".
(region_model::pop_frame_callbacks): Delete.
(region_model::pop_frame): Port from notify_on_pop_frame to
using pub/sub channel.
* region-model.h (pop_frame_callback): Delete typedef.
(region_model::register_pop_frame_callback): Delete.
(region_model::pop_frame_callbacks): Delete.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_cpython_plugin.cc
(cpython_analyzer_events_subscriber::on_message): Implement for
on_frame_popped.
(plugin_init): Drop call to
region_model::register_pop_frame_callback in favor of the above
pub/sub handler.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Simplification/consolidation of some callback logic in analyzer in
favor of using the analyzer pub/sub channel.
No functional change intended.
gcc/analyzer/ChangeLog:
* analyzer-language.cc: Include "context.h" and "channels.h".
(finish_translation_unit_callbacks): Delete.
(register_finish_translation_unit_callback): Delete.
(run_callbacks): Delete.
(on_finish_translation_unit): Port from run_callbacks to pub/sub.
* analyzer-language.h (finish_translation_unit_callback): Delete
typedef.
(register_finish_translation_unit_callback): Delete decl.
* common.h (class translation_unit): New forward decl.
(struct analyzer_events::on_tu_finished): New.
(analyzer_events::subscriber::on_message): Add vfunc for
on_tu_finished messages.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_cpython_plugin.cc
(cpython_analyzer_events_subscriber::on_message): New.
(plugin_init): Port stashing of named types and global vars to
pub/sub framework.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch eliminates the PLUGIN_ANALYZER_INIT event in favor of a new
analyzer_events_channel that can be subscribed to, and ports all the
in-tree analyzer plugins to using it.
The PLUGIN_* approach isn't typesafe, and the name suggests it's only
meant to be used for plugins, whereas the pub/sub approach is typesafe,
and treats the publish/subscribe network as orthogonal to whether the
code is built into the executable or is a plugin.
gcc/analyzer/ChangeLog:
* common.h: Define INCLUDE_LIST.
(class plugin_analyzer_init_iface): Replace with...
(gcc::topics::analyzer_events::on_ana_init): ...this.
(gcc::topics::analyzer_events::subscriber): New.
* engine.cc: Include "context.h" and "channels.h".
(class plugin_analyzer_init_impl): Replace with...
(class impl_on_ana_init): ...this. Fix some overlong lines.
(impl_run_checkers): Port from PLUGIN_ANALYZER_INIT to using
publish/subscribe framework.
gcc/ChangeLog:
* channels.h (gcc::topics::analyzer_events::subscriber): New
forward decl.
(compiler_channels::analyzer_events_channel): New field.
* doc/plugins.texi (PLUGIN_ANALYZER_INIT): Delete.
* plugin.cc (register_callback): Delete PLUGIN_ANALYZER_INIT.
(invoke_plugin_callbacks_full): Likewise.
* plugin.def (PLUGIN_ANALYZER_INIT): Delete this event.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_cpython_plugin.cc: Port from
PLUGIN_ANALYZER_INIT to subscribing to analyzer_events_channel.
* gcc.dg/plugin/analyzer_gil_plugin.cc: Likewise.
* gcc.dg/plugin/analyzer_kernel_plugin.cc: Likewise.
* gcc.dg/plugin/analyzer_known_fns_plugin.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch adds a new key/value pair "cfgs={yes,no}" to diagnostics
sinks, "no" by default.
If set to "yes" for a SARIF sink, then GCC will add the internal state
of the CFG for all functions after each pertinent optimization pass in
graph form to theRun.graphs in the SARIF output.
If set to "yes" for an HTML sink, the generated HTML will contain SVG
displaying the graphs, adapted from code in graph.cc.
Text sinks ignore it.
The SARIF output is thus a machine-readable serialization of (some of)
GCC's intermediate representation (as JSON), but it's much less than
GCC-XML used to provide. The precise form of the information is
documented as subject to change without notice.
Currently it shows both gimple statements and RTL instructions,
depending on the pass. My hope is that it should be possible to write a
"cfg-grep" tool that can read the SARIF and automatically identify
in which pass a particular piece of our IR appeared or disappeared,
for tracking down bugs in our optimization passes.
Implementation-wise:
* this uses the publish-subscribe mechanism from the earlier patch, by
having the diagnostics sink subscribe to pass_events::after_pass
messages from the pass_events_channel.
* the patch adds a new hook to cfghooks.h for dumping a basic block
into a SARIF property bag
gcc/ChangeLog:
* Makefile.in (OBJS): Add tree-diagnostic-cfg.o.
(OBJS-libcommon): Add custom-sarif-properties/cfg.o,
diagnostics/digraphs-to-dot.o, and
diagnostics/digraphs-to-dot-from-cfg.o.
* cfghooks.cc: Define INCLUDE_VECTOR. Add includes of
"diagnostics/sarif-sink.h" and "custom-sarif-properties/cfg.h".
(dump_bb_as_sarif_properties): New.
* cfghooks.h (diagnostics::sarif_builder): New forward decl.
(json::object): New forward decl.
(cfg_hooks::dump_bb_as_sarif_properties): New callback field.
(dump_bb_as_sarif_properties): New decl.
* cfgrtl.cc (rtl_cfg_hooks): Populate the new callback
field with rtl_dump_bb_as_sarif_properties.
(cfg_layout_rtl_cfg_hooks): Likewise.
* custom-sarif-properties/cfg.cc: New file.
* custom-sarif-properties/cfg.h: New file.
* diagnostics/digraphs-to-dot-from-cfg.cc: New file, partly
adapted from gcc/graph.cc.
* diagnostics/digraphs-to-dot.cc: New file.
* diagnostics/digraphs-to-dot.h: New file, based on material in...
* diagnostics/digraphs.cc: Include
"diagnostics/digraphs-to-dot.h".
(class conversion_to_dot): Rework and move to above.
(make_dot_graph_from_diagnostic_graph): Likewise.
(make_dot_node_from_digraph_node): Likewise.
(make_dot_edge_from_digraph_edge): Likewise.
(conversion_to_dot::get_dot_id_for_node): Likewise.
(conversion_to_dot::has_edges_p): Likewise.
(digraph::make_dot_graph): Use to_dot::converter::make and invoke
the result to make the dot graph.
* diagnostics/digraphs.h (digraph::get_all_nodes): New accessor.
* diagnostics/html-sink.cc
(html_builder::m_per_logical_loc_graphs): New field.
(html_builder::add_graph_for_logical_loc): New.
(html_sink::report_digraph_for_logical_location): New.
* diagnostics/sarif-sink.cc (sarif_array_of_unique::get_element):
New.
(sarif_builder::report_digraph_for_logical_location): New.
(sarif_sink::report_digraph_for_logical_location): New.
* diagnostics/sink.h: Include "diagnostics/logical-locations.h".
(sink::report_digraph_for_logical_location): New vfunc.
* diagnostics/text-sink.h
(text_sink::report_digraph_for_logical_location): New.
* doc/invoke.texi (fdiagnostics-add-output): Clarify wording.
Distinguish between scheme-specific vs GCC-specific keys, and add
"cfgs" as the first example of the latter.
* gimple-pretty-print.cc: Include "cfghooks.h", "json.h", and
"custom-sarif-properties/cfg.h".
(gimple_dump_bb_as_sarif_properties): New.
* gimple-pretty-print.h (diagnostics::sarif_builder): New forward
decl.
(json::object): Likewise.
(gimple_dump_bb_as_sarif_properties): New.
* graphviz.cc (get_compass_pt_from_string): New.
* graphviz.h (get_compass_pt_from_string): New decl.
* libsarifreplay.cc (sarif_replayer::handle_graph_object): Fix
overlong line.
* opts-common.cc: Define INCLUDE_VECTOR.
* opts-diagnostic.cc: Define INCLUDE_LIST. Include
"diagnostics/sarif-sink.h", "tree-diagnostic-sink-extensions.h",
"opts-diagnostic.h", and "pub-sub.h".
(class gcc_extra_keys): New class.
(opt_spec_context::opt_spec_context): Add "client_keys" param and
pass to dc_spec_context.
(handle_gcc_specific_keys): New.
(try_to_make_sink): New.
(gcc_extension_factory::singleton): New.
(handle_OPT_fdiagnostics_add_output_): Rework to use
try_to_make_sink.
(handle_OPT_fdiagnostics_set_output_): Likewise.
* opts-diagnostic.h: Include "diagnostics/sink.h".
(class gcc_extension_factory): New.
* opts.cc: Define INCLUDE_LIST.
* print-rtl.cc: Include "dumpfile.h", "cfghooks.h", "json.h", and
"custom-sarif-properties/cfg.h".
(rtl_dump_bb_as_sarif_properties): New.
* print-rtl.h (diagnostics::sarif_builder): New forward decl.
(json::object): Likewise.
(rtl_dump_bb_as_sarif_properties): New decl.
* tree-cfg.cc (gimple_cfg_hooks): Use
gimple_dump_bb_as_sarif_properties for new callback field.
* tree-diagnostic-cfg.cc: New file, based on material in graph.cc.
* tree-diagnostic-sink-extensions.h: New file.
* tree-diagnostic.cc: Define INCLUDE_LIST. Include
"tree-diagnostic-sink-extensions.h".
(compiler_ext_factory): New.
(tree_diagnostics_defaults): Set gcc_extension_factory::singleton
to be compiler_ext_factory.
gcc/testsuite/ChangeLog:
* gcc.dg/diagnostic-cfgs-html.py: New test.
* gcc.dg/diagnostic-cfgs-sarif.py: New test.
* gcc.dg/diagnostic-cfgs.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch adds a new "struct compiler_channels" to hold channels
relating to the compiler that plugins (or diagnostic sinks) might want
to subscribe to events for, accessed from the global gcc::context
object, along with a new gcc/topics/ source subdirectory to hold
strongly-typed publish/subscribe topics relating to the compiler.
For now, there is just one: pass_events_channel, which, if there are any
subscribers, issues notifications about passes starting/stopping on
particular functions, using topics::pass_events, declared in
topics/pass-events.h, but followup patches add more kinds of
notification channel.
A toy plugin in the testsuite shows how this could be used to build a
progress notification UI for the compiler, and a followup patch uses the
channel to (optionally) capture CFG information at each stage of
optimization in machine-readable form into a SARIF sink.
gcc/ChangeLog:
* channels.h: New file.
* context.cc: Define INCLUDE_LIST. Include "channels.h".
(gcc::context::context): Create m_channels.
(gcc::context::~context): Delete it.
* context.h (struct compiler_channels): New forward decl.
(gcc::context::get_channels): New accessor.
(gcc::context::m_channels): New field.
* passes.cc: Define INCLUDE_LIST. Include "topics/pass-events.h"
and "channels.h".
(execute_one_pass): If the global context's pass_events_channel
has subscribers, publish before_pass and after_pass events to it.
* topics/pass-events.h: New file.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/plugin.exp: Add progress_notifications_plugin.cc.
* gcc.dg/plugin/progress_notifications_plugin.cc: New test plugin.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch introduces a publish/subscribe mechanism, allowing for
loosely-coupled senders and receivers, with strongly-typed messages
passing between them. For example, a GCC subsystem could publish
messages about events, and a plugin could subscribe to them.
An example can be seen in the selftests.
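For readers without the tree at hand, the general shape of such a
mechanism can be sketched as below. This is a hypothetical minimal
design and not the interface in pub-sub.h: every name here (channel,
subscriber, pass_started, recorder) is an assumption made for
illustration.

```cpp
#include <list>
#include <string>

// Hypothetical sketch; the names below are not the ones in pub-sub.h.
struct pass_started { std::string pass_name; };

// Subscribers implement one virtual function per message type, so a
// message of the wrong type cannot be delivered by mistake.
struct subscriber
{
  virtual ~subscriber() = default;
  virtual void on_message(const pass_started &msg) = 0;
};

class channel
{
public:
  void add_subscriber(subscriber &s) { m_subscribers.push_back(&s); }
  bool has_subscribers() const { return !m_subscribers.empty(); }

  // Deliver MSG to every subscriber, in subscription order.
  template <typename Message>
  void publish(const Message &msg) const
  {
    for (subscriber *s : m_subscribers)
      s->on_message(msg);
  }

private:
  std::list<subscriber *> m_subscribers;
};

// Example subscriber that records the names it is told about.
struct recorder : subscriber
{
  std::string seen;
  void on_message(const pass_started &msg) override
  {
    seen += msg.pass_name + ";";
  }
};
```

The publisher only knows about the channel and the message type, so the
sender and receivers stay loosely coupled; checking has_subscribers
first lets a publisher skip building messages nobody will receive.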
gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add pub-sub.o.
* pub-sub.cc: New file.
* pub-sub.h: New file.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::pub_sub_cc_tests.
* selftest.h (selftest::pub_sub_cc_tests): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The problem here is function_table was not in the GGC memory space and not
streamed out. So even though the builtins were reloaded, function_table was
a nullptr as it was not reloaded.
Also noticed initial_indexes should be marked with GTY so it is reloaded correctly
from PCH.
Built and tested for aarch64-linux-gnu.
PR target/123457
gcc/ChangeLog:
* config/aarch64/aarch64-sve-builtins.cc (struct registered_function_hasher):
Change base class to ggc_ptr_hash.
(initial_indexes): Mark with GTY.
(function_table): Likewise.
(handle_arm_sve_h): Allocate function_table from ggc instead of heap.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
On the following testcase we emit a false positive warning that
a temporary (TARGET_EXPR slot) is used uninitialized, from the early_uninit
pass.
This regressed with my change to instrument for
-ftrivial-auto-var-init={zero,pattern} not just DECL_EXPRs, but also
TARGET_EXPR initializations if the TARGET_EXPR_INITIALIZER has void type.
Those cases are where the initializer doesn't necessarily have to initialize
the whole TARGET_EXPR slot, or might use parts or the whole slot before
those are initialized; this is how e.g. various C++ temporary objects are
constructed.
The problem is in pass interaction. The FE creates a TARGET_EXPR with
void type initializer because the initializer is originally
__atomic_load (&expr, &tmp, SEQ_CST); but it is folded instantly into
(void) (tmp = (type) __atomic_load_N (&expr, SEQ_CST)). The FE also
marks the TARGET_EXPR slot as TREE_ADDRESSABLE, because it would be
if libatomic were used, but nothing in the IL then takes its address.
Now, since my r16-4212 change, which was mainly for C++26 compliance,
we see the TARGET_EXPR and, because it has a void type TARGET_EXPR_INITIALIZER,
we start with tmp = .DEFERRED_INIT (...); just in case the initialization
would attempt to use the slot before initialization or not initialize fully.
Because tmp is TREE_ADDRESSABLE and has gimple reg type, it is actually not
gimplified as tmp = .DEFERRED_INIT (...); but as _1 = .DEFERRED_INIT (...);
tmp = _1; but because it is not actually address taken in the IL, already
the ssa pass turns it into SSA_NAME (dead one), so we have
_1 = .DEFERRED_INIT (...); _2 = _1; and _2 is unused. Next comes
early_uninit and warns on the dead SSA_NAME copy that it uses uninitialized
var.
The following patch attempts to fix that by checking if
c_build_function_call_vec has optimized the call right away into pure
assignment to the TARGET_EXPR slot without the slot being used anywhere
else in the expression and 1) clearing again TREE_ADDRESSABLE on the slot,
because it isn't really addressable 2) optimizing the TARGET_EXPR, so that
it doesn't have void type TARGET_EXPR_INITIALIZER by changing it to the rhs
of the MODIFY_EXPR. That way gimplifier doesn't bother creating
.DEFERRED_INIT for it at all.
Or should something like this be done instead in the TARGET_EXPR
gimplification? I mean not the TREE_ADDRESSABLE clearing, that can't be
done without knowing what we know in the FE, but the rest: generally, a
TARGET_EXPR with initializer (void) (TARGET_EXPR_SLOT = something),
where something doesn't refer to TARGET_EXPR_SLOT, can be optimized so
that its TARGET_EXPR_INITIALIZER is just something.
2026-01-09 Jakub Jelinek <jakub@redhat.com>
PR c/123475
* c-typeck.cc (c_find_var_r): New function.
(convert_lvalue_to_rvalue): If c_build_function_call_vec
folded __atomic_load (&expr, &tmp, SEQ_CST); into
(void) (tmp = __atomic_load_<N> (&expr, SEQ_CST)), drop
TREE_ADDRESSABLE flag from tmp and set TARGET_EXPR
initializer just to the rhs of the MODIFY_EXPR.
* gcc.dg/pr123475.c: New test.
We miss quite a few -x option arguments that can be specified.
2026-01-09 Jakub Jelinek <jakub@redhat.com>
* doc/invoke.texi (-x): Add c++-system-module, objc-cpp-output,
objc++-cpp-output, adascil, adawhy, modula-2, modula-2-cpp-output,
rust, algol68 and lto as further possible option arguments.
The __wait_args::_M_setup_proxy_wait function must only be called when
_M_obj == addr is true, so it's redundant for _M_setup_proxy_wait to
pass addr to use_proxy_wait. That address is already passed as
args._M_old anyway.
libstdc++-v3/ChangeLog:
* src/c++20/atomic.cc (use_proxy_wait): Remove unused second
parameter.
(__wait_args::_M_setup_proxy_wait): Remove second argument.
(__notify_impl): Likewise.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
A failed assertion was observed with std::atomic<bool>::wait when the
loop in __atomic_wait_address is entered and calls _M_setup_wait a
second time, after waking from __wait_impl. When the first call to
_M_setup_wait makes a call to _M_setup_proxy_wait that function decides
that a proxy wait is needed for an object of type bool, and it updates
the _M_obj and _M_obj_size members to refer to the futex in the proxy
state, instead of referring to the bool object itself. The next time
_M_setup_wait is called it calls _M_setup_proxy_wait again but now it
sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
check.
The problem is that _M_setup_proxy_wait wasn't correctly handling the
case where it's called a second time, after the decision to use a proxy
wait has already been made. That can be fixed in _M_setup_proxy_wait by
checking if _M_obj != addr, which implies that a proxy wait has already
been set up by a previous call. In that case, _M_setup_proxy_wait should
only update _M_old to the latest value of the proxy _M_ver.
This change means that _M_setup_proxy_wait is safe to call repeatedly
for a proxy wait, and will only update _M_wait_state, _M_obj, and
_M_obj_size on the first call. On the second and subsequent calls, those
variables are already correctly set for the proxy wait so don't need to
be set again.
For non-proxy waits, calling _M_setup_proxy_wait more than once is safe,
but pessimizes performance. The caller shouldn't make a second call to
_M_setup_proxy_wait because we don't need to check again if a proxy wait
should be used (the answer won't change) and we don't need to load a
value from the proxy _M_ver.
However, it was difficult to detect the case of a non-proxy wait,
because _M_setup_wait doesn't know if it's being called the first time
(when _M_setup_proxy_wait is called to make the initial decision) or a
subsequent time (in which case _M_obj == addr implies a non-proxy wait
was already decided on). As a result, _M_setup_proxy_wait was being used
every time to see if it's a proxy wait. We can resolve this by splitting
the _M_setup_wait function into _M_setup_wait and _M_on_wake, where the
former is only called once to do the initial setup and the latter is
called after __wait_impl returns, to prepare to check the predicate and
possibly wait again. The new _M_on_wake function can avoid unnecessary
calls to _M_setup_proxy_wait by checking _M_obj == addr to identify a
non-proxy wait.
The three callers of _M_setup_wait are updated to use _M_on_wake instead
of _M_setup_wait after waking from a waiting function. This change
revealed a latent performance bug in __atomic_wait_address_for which was
not passing __res to _M_setup_wait, so a new value was always loaded
even when __res._M_has_val was true. By splitting _M_on_wake out of
_M_setup_wait this problem became more obvious, because we no longer
have _M_setup_wait doing two different jobs, depending on whether it was
passed the optional third argument or not.
libstdc++-v3/ChangeLog:
* include/bits/atomic_timed_wait.h (__atomic_wait_address_until):
Use _M_on_wake instead of _M_setup_wait after waking.
(__atomic_wait_address_for): Likewise.
* include/bits/atomic_wait.h (__atomic_wait_address): Likewise.
(__wait_args::_M_setup_wait): Remove third parameter and move
code to update _M_old to ...
(__wait_args::_M_on_wake): New member function to update _M_old
after waking, only calling _M_setup_proxy_wait if needed.
(__wait_args::_M_store): New member function to update _M_old
from a value, for non-proxy waits.
* src/c++20/atomic.cc (__wait_args::_M_setup_proxy_wait): If
_M_obj is not addr, only load a new value and return true.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
As noted in Bug 122878 comment 2, the _M_try_acquire_for implementation
doesn't reduce the remaining timeout each time it returns from an atomic
waiting function. This means that it can wait longer than requested, or
even loop forever. If there is a spurious wake from the timed waiting
function (__wait_until_impl) it will return indicating no timeout
occurred, which means the caller will check the value and potentially
sleep again. If spurious wakes happen every time, it will just keep
sleeping in a loop forever. This is observed to actually happen on
FreeBSD 14.0-STABLE where pthread_cond_timedwait gets a spurious wake
and so never times out.
The solution in this commit is to replace the implementation of
_M_try_acquire_for with a call to _M_try_acquire_until, converting the
relative timeout to an absolute timeout against the steady clock. This
is what ends up happening anyway, because we only have a
__wait_until_impl entry point into the library internals, so
__atomic_wait_address_for already converts the relative timeout to an
absolute timeout (except for the special case of a zero-value duration,
which only checks for an update while spinning for a finite number of
iterations, and doesn't sleep).
As noted in comment 4 of the PR, this requires some changes to
_M_try_acquire which was relying on the behaviour of _M_try_acquire_for
for zero-value durations. That behaviour is desirable for
_M_try_acquire so that it can handle short-lived contention without
failing immediately. To preserve that behaviour of _M_try_acquire it is
changed to do its own loop and to call __atomic_wait_address_for
directly with a zero duration, to do the spinloop.
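The reason a fixed deadline bounds the total wait can be sketched
outside the library. The helper below is a minimal illustration, not
the semaphore code: wait_with_deadline and its pred and timed_wait
parameters are hypothetical names, and timed_wait stands for any
primitive that may return early without a timeout.

```cpp
#include <chrono>

// Hypothetical helper, not libstdc++ code.  PRED reports whether the
// awaited condition holds; TIMED_WAIT blocks for at most the given time
// but may return early (a spurious wake).
template<typename Pred, typename TimedWait>
bool wait_with_deadline(Pred pred, TimedWait timed_wait,
                        std::chrono::steady_clock::duration rel)
{
  using clock = std::chrono::steady_clock;
  // Convert the relative timeout to an absolute one exactly once.
  const clock::time_point deadline = clock::now() + rel;
  while (!pred())
    {
      const clock::time_point now = clock::now();
      if (now >= deadline)
        return false;              // timed out, however often we woke
      timed_wait(deadline - now);  // only the remaining time, never REL
    }
  return true;
}
```

Even if timed_wait returns spuriously on every call, the loop ends once
the steady clock passes the deadline. Passing rel to timed_wait on each
iteration instead would restart the timer after every wake.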
libstdc++-v3/ChangeLog:
PR libstdc++/122878
* include/bits/semaphore_base.h (_M_try_acquire): Replace
_M_try_acquire_for call with explicit loop and call to
__atomic_wait_address_for.
(_M_try_acquire_for): Replace loop with call to
_M_try_acquire_until.
Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>
A duplicated call to a finalizer occurred in cases where a derived type
has components, one or more of which are allocatable, and one or more
of which are finalizable. (The bug occurred only if the derived type
is an extension of another type, which has defined assignment.)
New test case derived from the original report by Paul Thomas.
PR fortran/123483
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_deallocate_alloc_comp): Add the new
finalization argument and pass it to structure_alloc_comps.
* trans-array.h (gfc_deallocate_alloc_comp): Add a finalization
flag that can be passed by gfc_conv_procedure_call.
* trans-expr.cc (gfc_conv_procedure_call): Use the new
finalization flag.
gcc/testsuite/ChangeLog:
* gfortran.dg/finalize_61.f90: New test.
Signed-off-by: Andrew Benson <abensonca@gcc.gnu.org>
Existing toolchain builds rely on the similarity between picolibc and
newlib when building libstdc++ and use --with-newlib.
Switch to the picolibc 16-bit _ctype_wide array which provides
separate values for ctype_base::blank and ctype_base::space.
This fixes a bug where libstdc++ was including '\f', '\n', '\r' and
'\v' in the set of 'blank' chars. Afterwards, only ' ' and '\t' are in
this set, as specified by C++11.
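The expected classification can be checked against the classic "C"
locale with the standard std::ctype<char> facet. The helper name
classic_is is only an illustration and is not part of the patch.

```cpp
#include <locale>

// Hypothetical helper: classify C with mask M in the classic "C" locale.
inline bool classic_is(std::ctype_base::mask m, char c)
{
  const std::ctype<char>& ct
    = std::use_facet<std::ctype<char>>(std::locale::classic());
  return ct.is(m, c);
}
```

On a conforming library, classic_is (std::ctype_base::blank, c) holds
only for ' ' and '\t', while classic_is (std::ctype_base::space, c) also
holds for '\f', '\n', '\r' and '\v'.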
libstdc++-v3/ChangeLog:
* acinclude.m4 (GLIBCXX_CONFIGURE): Add --with-picolibc.
* configure: Regenerate.
* configure.ac: Add handling for with_picolibc=yes.
* config/os/picolibc/ctype_base.h: New file.
* config/os/picolibc/ctype_configure_char.cc: New file.
* config/os/picolibc/ctype_inline.h: New file.
* config/os/picolibc/os_defines.h: New file.
Signed-off-by: Keith Packard <keithp@keithp.com>
In the test case, LRA rematerializes a div/mod insn whose div result
is not used. The div result still requires ax, which is used by
different pseudos at the point of rematerialization, so the pseudo
value is clobbered. The patch solves the problem by restricting
rematerialization to single-set insns, as we always rematerialize only
one pseudo value. Also, there is no sense in rematerializing div/mod,
as their latency is usually higher than that of loading a value from
the CPU cache. The patch explicitly excludes such insns from
rematerialization.
gcc/ChangeLog:
PR rtl-optimization/123121
* lra-remat.cc (bad_for_rematerialization_p): Consider div/mod ops.
(operand_to_remat): Exclude rematerialization of insns with
multiple sets.
gcc/testsuite/ChangeLog:
PR rtl-optimization/123121
* gcc.target/i386/pr123121.c: New.
The Ascalon core implements the full RVA23 profile plus a few other optional
extensions. However, the -mcpu=tt-ascalon-d8 option doesn't enable them all.
Add the missing extensions.
2026-01-08 Peter Bergner <bergner@tenstorrent.com>
gcc/
PR target/123492
* config/riscv/riscv-cores.def (RISCV_CORE)<tt-ascalon-d8>: Add missing
extensions via use of rva23s64 profile and adding zkr, smaia, smmpm,
smnpm, smrnmi, smstateen, ssaia, ssstrict, svadu.
Signed-off-by: Peter Bergner <bergner@tenstorrent.com>
Since the IPA-CP lattices for value ranges cannot hold more values and
don't have any "variable" flag, we initialize them to bottom for
non-local nodes. However, that means we don't make use of known
information gathered in jump functions when the corresponding node is
cloned for some other reason. This patch allows the information to be
collected and merely does not use it for the original non-local
nodes, while making sure that we do not propagate information through
such non-local nodes, as there may be unknown calls.
gcc/ChangeLog:
2026-01-06 Martin Jambor <mjambor@suse.cz>
* ipa-cp.h (class ipcp_bits_lattice): New members set_recipient_only,
recipient_only_p and m_recipient_only.
(class ipcp_vr_lattice): Likewise.
(ipcp_vr_lattice::init): Initialize also m_recipient_only.
* ipa-cp.cc (ipcp_bits_lattice::print): Adjust printing to also
print the new flag.
(ipcp_vr_lattice::print): Likewise.
(ipcp_vr_lattice::set_recipient_only): New function.
(ipcp_bits_lattice::set_recipient_only): Likewise.
(set_all_contains_variable): New parameter MAKE_SIMPLE_RECIPIENTS, set
bits and vr lattices to recipient only instead of bottom when it is
true.
(initialize_node_lattices): Pass true to the second parameter of
set_all_contains_variable.
(propagate_bits_across_jump_function): Treat recipient_only source
lattices like bottom.
(propagate_vr_across_jump_function): Likewise.
(ipcp_store_vr_results): Skip non-local nodes.
This modifies the decision making stage of IPA-CP in two ways:
Previously, local effects of the cloning were estimated only for the
constant that was being considered, even though the calls which bring
it also carry other constants. With this patch, all known constants
for the given subset of caller edges are considered and the heuristics
should therefore have more information and generally work better.
Also, when evaluating the opportunities for a given node, IPA-CP
previously just iterated over the parameters starting with the first
one and if any opportunity looked profitable, it was carried out and
associated calling edges were redirected, even if this precluded some
even better opportunity. The patch tries to mitigate this by first
using the initial estimates to sort all cloning candidates and then
iterating in that order.
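The pre-sorting step can be sketched as follows. This is an
illustration under assumptions, not the IPA-CP code: the candidate
struct, its fields and rank_candidates are hypothetical, and the real
ranking criteria may differ from a single benefit estimate.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one (parameter, value) cloning candidate.
struct candidate
{
  int param_index;     // which formal parameter the value belongs to
  int value_id;        // opaque id of the constant or context
  double est_benefit;  // initial estimate, larger is better
};

// Order candidates so that the most promising ones are evaluated first.
// stable_sort keeps the original parameter order among equal estimates.
inline void rank_candidates(std::vector<candidate>& cands)
{
  std::stable_sort(cands.begin(), cands.end(),
                   [](const candidate& a, const candidate& b)
                   { return a.est_benefit > b.est_benefit; });
}
```

Evaluating in this order means a merely acceptable candidate for the
first parameter no longer pre-empts a better one for a later parameter.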
The one difference from the version I posted before is that I have
extended the checking assert making sure the value we clone for is
indeed used to also work for non-aggregate constants and polymorphic
contexts.
gcc/ChangeLog:
2025-12-01 Martin Jambor <mjambor@suse.cz>
* ipa-cp.cc (good_cloning_opportunity_p): Dump a message when
bailing out early too.
(find_more_scalar_values_for_callers_subset): Rename to
find_scalar_values_for_callers_subset, collect constants regardless of
what is already in the vector. Remove dumping.
(find_more_contexts_for_caller_subset): Rename to
find_contexts_for_caller_subset, collect contexts regardless of what
is already in the vector. Remove dumping.
(find_aggregate_values_for_callers_subset): Rename to
find_aggregate_values_for_callers_subset_gc, implement using new
functions.
(find_aggregate_values_for_callers_subset_1): New function.
(find_aggregate_values_for_callers_subset): Likewise.
(copy_known_vectors_add_val): Removed.
(dump_reestimation_message): New function.
(decide_about_value): Remove formal parameter avals, compute it
independently, and use it to estimate local cloning effects.
(struct cloning_opportunity_ranking): New type.
(compare_cloning_opportunities): New function.
(cloning_opportunity_ranking_evaluation): Likewise.
(decide_whether_version_node): Pre-sort candidates for cloning before
really evaluating them. Calculate context independent values only
when considering versioning for all contexts.
(ipcp_val_agg_replacement_ok_p): Renamed to
ipcp_val_replacement_ok_p, check also non-aggregate values.
gcc/testsuite/ChangeLog:
2026-01-08 Martin Jambor <mjambor@suse.cz>
* gcc.dg/ipa/ipcp-agg-2.c: Adjust dump test.
* gcc.dg/ipa/ipcp-agg-3.c: Likewise.
* gcc.dg/ipa/ipcp-agg-4.c: Likewise.
* gcc.dg/ipa/ipcp-agg-14.c: New test.
* gcc.dg/vect/pr101145_1.c: Compile with -fno-ipa-cp.
* gcc.dg/vect/pr101145_2.c: Likewise.
* gcc.dg/vect/pr101145_3.c: Likewise.
When a function call uses up all argument registers, and needs IP for
the static chain, there aren't any call-clobbered registers left for
reload to assign as the sibcall target, when -mlong-calls is enabled.
Use the same logic that does the job for indirect calls to prevent
tail calls in this case.
With this change, it is possible to bootstrap armv7a-linux-gnu with
both -O3 and lto, but only with both -mlong-calls and
-ffunction-sections.
Without -mlong-calls, linker veneer thunks may clobber the static
chain register set up by callers in one lto unit, preventing them from
reaching the callee in a separate lto unit. -ffunction-sections is
required for -mlong-calls to be effective, because both caller and
callee are in the same section, and that disables long-calls when
!flag_reorder_blocks_and_partition.
gcc/ChangeLog
PR target/119430
* config/arm/arm.cc (arm_function_ok_for_sibcall): Disable
sibcalls for long-calls that use all call-clobbered
general-purpose registers, including the static chain.
Currently, operand modifier c truncates and extends any integer constant
to a signed 8-bit constant whereas the common code implementation just
prints the constant unmodified. The modifier was introduced in
r0-87728-g963fc8d00baeca matching the new constraint C which ensures
that a constant is an 8-bit signed integer.
In the machine description, operand modifier c is only used for operands
with constraint C. Therefore, there is no immediate need for some
special constant printing.
Since print_operand() is also used by output_asm_insn(), inline asm is
also affected by this. Note, in output_asm_insn() we cannot utilize
output_addr_const() since not every CONST_INT is a valid address, i.e.,
we have up to 32-bit immediates and at most 20-bit (long) displacements.
In fact, %cN should behave the same as %N for any CONST_INT operand N,
although this means that the output modifier accepts and prints
immediates which might be larger than any instruction accepts.
However, accepting or rejecting immediates is what constraints et al.
are for. Therefore, align %cN and %N.
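The difference between the two behaviours can be modelled on plain
integers. The functions below are a sketch of the description above
(truncate to 8 bits and sign-extend versus print unmodified); their
names are hypothetical and they are not taken from s390.cc.

```cpp
// Model of the old %c behaviour as described: keep the low 8 bits,
// then sign-extend them.
constexpr long long old_c_modifier(long long v)
{
  return ((v & 0xff) ^ 0x80) - 0x80;
}

// Model of the new behaviour: the constant is printed unmodified.
constexpr long long new_c_modifier(long long v)
{
  return v;
}
```

For values that satisfy constraint C (-128 to 127) both functions agree,
which is why the machine description is unaffected. For other values,
for example 300 or 200 in inline asm, the old model yields 44 and -56
respectively, whereas the new one keeps the value.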
gcc/ChangeLog:
* config/s390/s390.cc (print_operand): Align %cN with %N.
* config/s390/s390.md: Remove comment.
gcc/testsuite/ChangeLog:
* gcc.target/s390/asm-constant-1.c: New test.
Since GCC 15, bit test and jump table lowering was disabled for both -O0
and -Og to save compile time. On -Og, compile time isn't *that*
critical, so this patch enables bit tests and jump tables on -Og once
again.
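As an illustration of the kind of code affected, a switch with a few
outcomes over a small range of case values is a typical candidate for
bit-test lowering. The function below is only an example; whether GCC
actually chooses bit tests or a jump table for it depends on the
target and on the heuristics of the GCC version in use.

```cpp
// Example input only: five case values in a narrow range, one shared
// outcome.  Such a switch may be lowered to a shift and mask test.
inline bool is_vowel(char c)
{
  switch (c)
    {
    case 'a':
    case 'e':
    case 'i':
    case 'o':
    case 'u':
      return true;
    default:
      return false;
    }
}
```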
PR c/123212
gcc/ChangeLog:
* opts.cc: Enable -fbit-tests and -fjump-tables at -Og.
Signed-off-by: Filip Kastl <fkastl@suse.cz>
Adds support for the AArch64 2024 fmmla extensions.
Note this includes a workaround in the testsuite for spurious warnings
from binutils with movprfx and fmmla instructions (PR gas/33562).
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc
(aarch64_expand_pragma_builtin): Add case for FMMLA.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
Add new __ARM_FEATURE_X macros.
* config/aarch64/aarch64-simd-pragma-builtins.def
(vmmlaq_f16_mf8): New intrinsic.
(vmmlaq_f32_mf8): Likewise.
* config/aarch64/aarch64-simd.md
(@aarch64_<insn><VDQ_HSF_FMMLA:mode>): New instruction.
* config/aarch64/aarch64-sve-builtins-base.cc: Update mmla_impl
for new instructions.
* config/aarch64/aarch64-sve-builtins-shapes.cc
(struct mmla_def): Add support for the new widening forms.
* config/aarch64/aarch64-sve-builtins-sve2.def (svmmla): Add new
intrinsics.
* config/aarch64/aarch64-sve-builtins.cc (TYPES_cvt_narrow_s):
Fix comment.
* config/aarch64/aarch64-sve2.md
(@aarch64_sve2_<sve_fp_op><SVE_FULL_HSF_FMMLA:mode><VNx16QI_ONLY:mode>): New instruction.
(@aarch64_sve2_<sve_fp_op><VNx4SF_ONLY:mode><VNx8HF_ONLY:mode>): Likewise.
* config/aarch64/aarch64.h (TARGET_F8F32MM): New macro.
(TARGET_F8F16MM): Likewise.
(TARGET_SVE_F16F32MM): Likewise.
* config/aarch64/iterators.md (insn): Add fmmla entry.
(VDQ_HSF_FMMLA): New iterator.
(SVE_FULL_HSF_FMMLA): Likewise.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp:
* gcc.target/aarch64/acle/vmmlaq_f16_mf8.c: New test.
* gcc.target/aarch64/acle/vmmlaq_f32_mf8.c: New test.
* gcc.target/aarch64/sve2/acle/asm/fmmla_f8f16mm_sve2.c: New test.
* gcc.target/aarch64/sve2/acle/asm/fmmla_f8f32mm_sve2.c: New test.
* gcc.target/aarch64/sve2/acle/asm/fmmla_sve_f16f32mm.c: New test.
* gcc.target/aarch64/sve/acle/general-c/mmla_1.c: Update error messages.
We get lots of error messages when compiling arm_neon.h under
e.g. -mcpu=cortex-m55, because Neon builtins are enabled only when
!TARGET_HAVE_MVE. This has been the case since MVE support was
introduced.
This patch uses an approach similar to what we do on aarch64, but only
partially since Neon intrinsics do not use the "new" framework.
We register all types and Neon intrinsics, whether MVE is enabled or
not, which makes it possible to compile arm_neon.h. However, we need to
introduce a "switcher" similar to aarch64's to avoid ICEs when LTO is
enabled: in that case, since we have to enable the MVE intrinsics, we
temporarily change arm_active_target.isa to enable MVE bits. This
enables hooks like arm_vector_mode_supported_p and arm_array_mode to
behave as expected by the MVE intrinsics framework. We switch back
to the previous arm_active_target.isa immediately after.
With a toolchain targeting e.g. cortex-m55,
gcc.target/arm/attr-neon3.c now compiles successfully, with only one
failure to be fixed separately:
FAIL: gcc.target/arm/attr-neon3.c check-function-bodies my1
Besides that, gcc.log is no longer full of error messages when trying
to compile arm_neon.h if MVE is forced somehow.
gcc/ChangeLog:
* config/arm/arm-builtins.cc (arm_init_simd_builtin_types): Remove
TARGET_HAVE_MVE condition.
(class arm_target_switcher): New.
(arm_init_mve_builtins): Remove calls to
arm_init_simd_builtin_types and
arm_init_simd_builtin_scalar_types. Switch to MVE isa flags.
(arm_init_neon_builtins): Remove calls to
arm_init_simd_builtin_types and
arm_init_simd_builtin_scalar_types.
(arm_need_mve_mode_regs): New.
(arm_need_neon_mode_regs): New.
(arm_target_switcher::arm_target_switcher): New.
(arm_target_switcher::~arm_target_switcher): New.
(arm_init_builtins): Call arm_init_simd_builtin_scalar_types and
arm_init_simd_builtin_types. Always call arm_init_mve_builtins
and arm_init_neon_builtins.
Two tests currently XPASS on 64-bit Solaris/SPARC:
XPASS: gcc.dg/vect/pr33804.c scan-tree-dump-times vect "vectorized 1 loops" 1
XPASS: gcc.dg/vect/pr33804.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
XPASS: gcc.dg/vect/slp-multitypes-3.c scan-tree-dump-times vect "vectorized 1 loops" 1
XPASS: gcc.dg/vect/slp-multitypes-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2
Both tests are currently xfail'ed on sparc*-*-*. The following patch
restricts that to 32-bit SPARC instead.
2026-01-05 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc/testsuite:
PR tree-optimization/102954
* gcc.dg/vect/pr33804.c (scan-tree-dump-times): Only
xfail on 32-bit SPARC.
* gcc.dg/vect/slp-multitypes-3.c: Likewise.
This happens for a discriminated record type with default discriminants, for
which GNAT allocates mutable objects with the maximum size, while trying not
to copy padding bits unnecessarily. When the padded size is small enough to
be copied efficiently, it should nevertheless be profitable to copy them in
order to avoid a call to memcpy with a dynamic size.
This version makes sure that it is safe to read the padded size on the RHS,
which is not the case for example when the LHS is an unconstrained variable
but the RHS is a constrained object.
gcc/ada
* gcc-interface/trans.cc (gnat_to_gnu): Add comment explaining why
it is necessary to remove the padding for an object of a type with
self-referential size when it is not converted to the result type.
* gcc-interface/utils2.cc (build_binary_op) <MODIFY_EXPR>: For an
assignment between small padded objects of the same type with self-
referential size, and which have the same (constant) size, use the
padded view of the objects.
The following 4 builtins have corresponding insns guarded with TARGET_64BIT
and are only used in #ifdef __x86_64__ ... #endif section of an intrin
header, so when used by hand with -m32 they ICE.
Fixed thusly.
I've additionally verified all the #ifdef __x86_64__ ... #endif guarded
builtins used in intrinsic headers and checked whether they have
OPTION_MASK_ISA_64BIT, the only other exception was __builtin_ia32_prefetchi
but I think that one is fine, as expansion in that case has
  if (TARGET_64BIT && TARGET_PREFETCHI
      && local_func_symbolic_operand (op0, GET_MODE (op0)))
    emit_insn (gen_prefetchi (op0, op2));
  else
    {
      warning (0, "instruction prefetch applies when in 64-bit mode"
               " with RIP-relative addressing and"
               " option %<-mprefetchi%>;"
               " they stay NOPs otherwise");
      emit_insn (gen_nop ());
    }
2026-01-09 Jakub Jelinek <jakub@redhat.com>
PR target/123489
* config/i386/i386-builtin.def (__builtin_ia32_cvttsd2sis64_round,
__builtin_ia32_cvttsd2usis64_round, __builtin_ia32_cvttss2sis64_round,
__builtin_ia32_cvttss2usis64_round): Require OPTION_MASK_ISA_64BIT.
* gcc.target/i386/pr123489.c: New test.
This happens for a discriminated record type with default discriminants, for
which GNAT allocates mutable objects with the maximum size, while trying not
to copy padding bits unnecessarily. When the padded size is small enough to
be copied efficiently, it should nevertheless be profitable to copy them in
order to avoid a call to memcpy with a dynamic size.
gcc/ada/ChangeLog:
* gcc-interface/trans.cc (gnat_to_gnu): For the LHS of an assignment
or an actual parameter of a call, do not remove the padding even for
a type of self-referential size when the padded size is small enough
to be copied efficiently.
Improve the previous fix to handle an object that has an address clause
and is initialized by a call to an imported C++ constructor.
gcc/ada/ChangeLog:
* exp_ch3.adb (Expand_N_Object_Declaration): Remove previous patch
and place the call to the constructor into a compound statement
attached to the object; the compound statement will be moved to
the freezing actions of the object if the object has an address
clause.
This patch fixes a crash occurring during the legality check of the Initialize
aspect when the constructor is implicitly created by the compiler, e.g., the
default copy constructor. In such a case, Corresponding_Spec is not available;
the Specification field must be used instead.
gcc/ada/ChangeLog:
* sem_ch13.adb (Check_Constructor_Initialization_Expression): The first
parameter of an implicit constructor comes from Specification, not
Corresponding_Spec.