222016 Commits

Author SHA1 Message Date
48479558b5 fortran: Fix indentation
Move a block of code two spaces to the left.  Committing as obvious.

gcc/fortran/ChangeLog:

	* resolve.cc (resolve_select_type): Fix indentation.

Signed-off-by: Filip Kastl <fkastl@suse.cz>
2025-07-15 08:39:00 +02:00
a55b5691a8 Daily bump. 2025-07-15 00:18:55 +00:00
c1be1d7512 cobol: Eliminate cppcheck warnings in gcc/cobol .cc files.
These changes eliminate various cppcheck warnings, mostly involving C-Style
casting and applying "const" to various variables and formal parameters.
Some tab characters were eliminated, and some lines were trimmed to
seventy-nine characters.

gcc/cobol/ChangeLog:

	* cobol1.cc (cobol_langhook_handle_option): Eliminate cppcheck warnings.
	* dts.h: Likewise.
	* except.cc (cbl_enabled_exceptions_t::dump): Likewise.
	* gcobolspec.cc (lang_specific_driver): Likewise.
	* genapi.cc (parser_file_merge): Likewise.
	* gengen.cc (gg_unique_in_function): Likewise.
	(gg_declare_variable): Likewise.
	(gg_peek_fn_decl): Likewise.
	(gg_define_function): Likewise.
	* genmath.cc (set_up_on_exception_label): Likewise.
	(set_up_compute_error_label): Likewise.
	(arithmetic_operation): Likewise.
	(fast_divide): Likewise.
	* genutil.cc (get_and_check_refstart_and_reflen): Likewise.
	(get_depending_on_value_from_odo): Likewise.
	(get_data_offset): Likewise.
	(get_binary_value): Likewise.
	(process_this_exception): Likewise.
	(copy_little_endian_into_place): Likewise.
	(refer_is_clean): Likewise.
	(refer_fill_depends): Likewise.
	* genutil.h (process_this_exception): Likewise.
	(copy_little_endian_into_place): Likewise.
	(refer_is_clean): Likewise.
	* lexio.cc (check_push_pop_directive): Likewise.
	(check_source_format_directive): Likewise.
	(location_in): Likewise.
	(lexer_input): Likewise.
	(cdftext::lex_open): Likewise.
	(lexio_dialect_mf): Likewise.
	(valid_sequence_area): Likewise.
	(cdftext::free_form_reference_format): Likewise.
	(cdftext::segment_line): Likewise.
	* lexio.h (struct span_t): Likewise.
	* scan_ante.h (trim_location): Likewise.
	* symbols.cc (symbol_elem_cmp): Likewise.
	(symbol_alphabet): Likewise.
	(end_of_group): Likewise.
	(cbl_field_t::attr_str): Likewise.
	(symbols_update): Likewise.
	(symbol_typedef_add): Likewise.
	(symbol_field_add): Likewise.
	(new_temporary_impl): Likewise.
	(symbol_label_section_exists): Likewise.
	(symbol_program_callables): Likewise.
	(file_status_status_of): Likewise.
	* symfind.cc (is_data_field): Likewise.
	(finalize_symbol_map2): Likewise.
	(class in_scope): Likewise.
	(symbol_match2): Likewise.
	* util.cc (get_current_dir_name): Likewise.
	(gb4): Likewise.
	(class cdf_directives_t): Likewise.
	(cbl_field_t::report_invalid_initial_value): Likewise.
	(literal_subscript_oob): Likewise.
	(cbl_refer_t::str): Likewise.
	(date_time_fmt): Likewise.
	(class unique_stack): Likewise.
	(cobol_set_pp_option): Likewise.
	(cobol_filename): Likewise.
	(cobol_filename_restore): Likewise.
	(gcc_location_set_impl): Likewise.
	(ydferror): Likewise.
	(error_msg_direct): Likewise.
	(yyerror): Likewise.
	(cbl_unimplemented_at): Likewise.
2025-07-14 17:05:47 -04:00
9840a1db02 libstdc++: Add comments to deleted std::swap overloads for LWG 2766
We pre-emptively implemented part of LWG 2766, which still hasn't been
approved. Add comments to the deleted swap overloads saying why they're
there, because the standard doesn't require them.

libstdc++-v3/ChangeLog:

	* include/bits/stl_pair.h (swap): Add comment to deleted
	overload.
	* include/bits/unique_ptr.h (swap): Likewise.
	* include/std/array (swap): Likewise.
	* include/std/optional (swap): Likewise.
	* include/std/tuple (swap): Likewise.
	* include/std/variant (swap): Likewise.
	* testsuite/23_containers/array/tuple_interface/get_neg.cc:
	Adjust dg-error line numbers.
2025-07-14 21:41:37 +01:00
d8680bac95 amdgcn: fix vec_ucmp infinite recursion
I suppose this pattern doesn't get used much! The unsigned compare was meant to
be defined using the signed compare pattern, but actually ended up trying to
recursively call itself.  This patch fixes the issue in the obvious way.

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (vec_cmpu<mode>di_exec): Call gen_vec_cmp*,
	not gen_vec_cmpu*.
2025-07-14 15:46:57 +00:00
66c0c3b0b1 Revert "tree-optimization/121059 - record loop mask when required"
This reverts commit 66346b6d80.
2025-07-14 17:18:12 +02:00
e91b8e0449 s390: Implement reduction optabs
Implementation and tests for the standard reduction optabs.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>

gcc/ChangeLog:

	* config/s390/vector.md (reduc_plus_scal_<mode>): Implement.
	(reduc_plus_scal_v2df): Implement.
	(reduc_plus_scal_v4sf): Implement.
	(REDUC_FMINMAX): New int iterator.
	(reduc_fminmax_name): New int attribute.
	(reduc_minmax): New code iterator.
	(reduc_minmax_name): New code attribute.
	(reduc_<reduc_fminmax_name>_scal_v2df): Implement.
	(reduc_<reduc_fminmax_name>_scal_v4sf): Implement.
	(reduc_<reduc_minmax_name>_scal_v2df): Implement.
	(reduc_<reduc_minmax_name>_scal_v4sf): Implement.
	(REDUCBIN): New code iterator.
	(reduc_bin_insn): New code attribute.
	(reduc_<reduc_bin_insn>_scal_v2di): Implement.
	(reduc_<reduc_bin_insn>_scal_v4si): Implement.
	(reduc_<reduc_bin_insn>_scal_v8hi): Implement.
	(reduc_<reduc_bin_insn>_scal_v16qi): Implement.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp: Add s390 to vect_logical_reduc targets.
	* gcc.target/s390/vector/reduc-binops-1.c: New test.
	* gcc.target/s390/vector/reduc-minmax-1.c: New test.
	* gcc.target/s390/vector/reduc-plus-1.c: New test.
2025-07-14 17:16:53 +02:00
383ec62349 s390: Remove min-vect-loop-bound override
The default setting of s390 for the parameter min-vect-loop-bound was
set to 2 to prevent certain epilogue loop vectorizations in the past.
Reevaluation of this parameter shows that this setting is no longer
needed and is sometimes even harmful.  Remove the override to
align s390 with other backends.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>

gcc/ChangeLog:

	* config/s390/s390.cc (s390_option_override_internal): Remove override.
2025-07-14 17:16:53 +02:00
0eee2dd286 amdgcn: Don't clobber VCC if we don't need to
This is a hold-over from GCN3 where v_add always wrote to the condition
register, whether you wanted it or not.  This hasn't been true since GCN5, and
we dropped support for GCN3 a little while ago, so let's fix it.

There was actually a latent bug here because some other post-reload splitters
were generating v_add instructions without declaring the VCC clobber (at least
mul did this), so this should fix some wrong-code bugs also.

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (add<mode>3<exec_clobber>): Rename ...
	(add<mode>3<exec>): ... to this, remove the clobber, and change the
	instruction from v_add_co_u32 to v_add_u32.
	(add<mode>3_dup<exec_clobber>): Rename ...
	(add<mode>3_dup<exec>): ... to this, and likewise.
	(sub<mode>3<exec_clobber>): Rename ...
	(sub<mode>3<exec>): ... to this, and likewise.
	* config/gcn/gcn.md (addsi3): Remove the DI clobber, and change the
	instruction from v_add_co_u32 to v_add_u32.
	(addsi3_scc): Likewise.
	(subsi3): Likewise, but for v_sub_co_u32.
	(muldi3): Likewise.
2025-07-14 13:59:01 +00:00
66346b6d80 tree-optimization/121059 - record loop mask when required
For loop masking we need to mask a mask AND operation with the loop
mask.  The following makes sure we have a corresponding mask
available.  There's no good way to distinguish loop masking from
len masking here, so assume we have recorded a mask for the operands'
mask producers.

	PR tree-optimization/121059
	* tree-vect-stmts.cc (vectorizable_operation): Record a
	loop mask for mask AND operations.

	* gcc.dg/vect/pr121059.c: New testcase.
2025-07-14 15:38:13 +02:00
f7f0539ae9 RISC-V: Add testcase for rv32 SAT_MUL from uint64
Add the run and asm testcases for rv32 SAT_MUL, widening mul from
uint8_t, uint16_t, and uint32_t to uint64_t.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat/sat_u_mul-1-u16-from-u64.c: New test.
	* gcc.target/riscv/sat/sat_u_mul-1-u32-from-u64.c: New test.
	* gcc.target/riscv/sat/sat_u_mul-1-u8-from-u64.c: New test.
	* gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u64.c: New test.
	* gcc.target/riscv/sat/sat_u_mul-run-1-u32-from-u64.c: New test.
	* gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u64.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
2025-07-14 21:03:41 +08:00
f01216a0b7 Match: Refine the widen mul check for SAT_MUL pattern
A widening mul takes an N-bit source type to a 2N-bit
dest type.  The previous check only focused on
HOST_WIDE_INT and did not work for QI => HI, HI => SI,
and SI => DI.  Thus, refine the widen mul precision
check to require that the dest has twice the bits of the input.
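
As an illustration only, the condition above can be sketched at the source
level.  The helper name sat_u_mul and its template parameters are
hypothetical and are not taken from match.pd or the testsuite; the sketch
merely shows an unsigned saturating multiply whose intermediate product
has exactly twice the bits of its inputs.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not GCC code): unsigned saturating multiply computed
// through a widening multiply from an N-bit type to a 2N-bit type.
template <typename Narrow, typename Wide>
constexpr Narrow sat_u_mul(Narrow a, Narrow b) {
  // Same shape as the refined check: the destination of the widening
  // multiply must have exactly twice as many bits as its inputs.
  static_assert(std::numeric_limits<Wide>::digits
                    == 2 * std::numeric_limits<Narrow>::digits,
                "wide type must have twice the bits of the narrow type");
  // The 2N-bit product of two N-bit unsigned values cannot overflow.
  const Wide product =
      static_cast<Wide>(static_cast<Wide>(a) * static_cast<Wide>(b));
  const Wide narrow_max = std::numeric_limits<Narrow>::max();
  // Clamp to the largest N-bit value when the product does not fit.
  return product > narrow_max ? static_cast<Narrow>(narrow_max)
                              : static_cast<Narrow>(product);
}

// QI => HI, HI => SI and SI => DI instances of the same helper.
static_assert(sat_u_mul<std::uint8_t, std::uint16_t>(10, 20) == 200, "fits");
static_assert(sat_u_mul<std::uint8_t, std::uint16_t>(16, 16) == 255, "clamps");
static_assert(sat_u_mul<std::uint16_t, std::uint32_t>(256, 256) == 65535, "clamps");
```

Instantiating the helper with a wide type that is not exactly twice as
wide, for example uint8_t with uint32_t, is rejected at compile time by the
static_assert.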

gcc/ChangeLog:

	* match.pd: Make sure widen mul has twice the bitsize
	of the inputs in SAT_MUL pattern.

Signed-off-by: Pan Li <pan2.li@intel.com>
2025-07-14 21:03:41 +08:00
dc07752af0 x86: Check all 0s/1s vectors with standard_sse_constant_p
commit 77473a27ba
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Jun 26 06:08:51 2025 +0800

    x86: Also handle all 1s float vector constant

replaces

(insn 29 28 30 5 (set (reg:V2SF 107)
        (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S8 A64])) 2031 {*movv2sf_internal}
     (expr_list:REG_EQUAL (const_vector:V2SF [
                (const_double:SF -QNaN [-QNaN]) repeated x2
            ])
        (nil)))

with

(insn 98 13 14 3 (set (reg:V8QI 112)
        (const_vector:V8QI [
                (const_int -1 [0xffffffffffffffff]) repeated x8
            ])) -1
     (nil))
...
(insn 29 28 30 5 (set (reg:V2SF 107)
        (subreg:V2SF (reg:V8QI 112) 0)) 2031 {*movv2sf_internal}
     (expr_list:REG_EQUAL (const_vector:V2SF [
                (const_double:SF -QNaN [-QNaN]) repeated x2
            ])
        (nil)))

which leads to

pr121015.c: In function ‘render_result_from_bake_h’:
pr121015.c:34:1: error: unrecognizable insn:
   34 | }
      | ^
(insn 98 13 14 3 (set (reg:V8QI 112)
        (const_vector:V8QI [
                (const_int -1 [0xffffffffffffffff]) repeated x8
            ])) -1
     (expr_list:REG_EQUIV (const_vector:V8QI [
                (const_int -1 [0xffffffffffffffff]) repeated x8
            ])
        (nil)))
during RTL pass: ira

Check all 0s/1s vectors with standard_sse_constant_p to avoid unsupported
all 1s vectors.

Co-Developed-by: H.J. Lu <hjl.tools@gmail.com>

gcc/

	PR target/121015
	* config/i386/i386-features.cc (ix86_broadcast_inner): Check all
	0s/1s vectors with standard_sse_constant_p.

gcc/testsuite/

	PR target/121015
	* gcc.target/i386/pr121015.c: New test.
2025-07-14 20:55:58 +08:00
07d8de9174 x86-64: Add --enable-x86-64-mfentry
When profiling is enabled with shrink wrapping, the mcount call may not
be placed at the function entry after

	pushq %rbp
	movq %rsp,%rbp

As a result, the profile data may be skewed, which makes PGO less
effective.

Add --enable-x86-64-mfentry to enable -mfentry by default to use
__fentry__, added to glibc in 2010 by:

commit d22e4cc9397ed41534c9422d0b0ffef8c77bfa53
Author: Andi Kleen <ak@linux.intel.com>
Date:   Sat Aug 7 21:24:05 2010 -0700

    x86: Add support for frame pointer less mcount

instead of mcount, which is placed before the prologue so that -pg can
be used with -fshrink-wrap-separate enabled at -O1.  This option is
64-bit only because __fentry__ doesn't support PIC in 32-bit mode.  The
default is to enable -mfentry when targeting glibc.

Also warn about -pg without -mfentry when shrink wrapping is enabled.  The
warning is disabled for PIC in 32-bit mode.

gcc/

	PR target/120881
	* config.in: Regenerated.
	* configure: Likewise.
	* configure.ac: Add --enable-x86-64-mfentry.
	* config/i386/i386-options.cc (ix86_option_override_internal):
	Enable __fentry__ in 64-bit mode if ENABLE_X86_64_MFENTRY is set
	to 1.  Warn -pg without -mfentry with shrink wrapping enabled.
	* doc/install.texi: Document --enable-x86-64-mfentry.

gcc/testsuite/

	PR target/120881
	* gcc.dg/20021014-1.c: Add additional -mfentry -fno-pic options
	for x86.
	* gcc.dg/aru-2.c: Likewise.
	* gcc.dg/nest.c: Likewise.
	* gcc.dg/pr32450.c: Likewise.
	* gcc.dg/pr43643.c: Likewise.
	* gcc.target/i386/pr104447.c: Likewise.
	* gcc.target/i386/pr113122-3.c: Likewise.
	* gcc.target/i386/pr119386-1.c: Add additional -mfentry if not
	ia32.
	* gcc.target/i386/pr119386-2.c: Likewise.
	* gcc.target/i386/pr120881-1a.c: New test.
	* gcc.target/i386/pr120881-1b.c: Likewise.
	* gcc.target/i386/pr120881-1c.c: Likewise.
	* gcc.target/i386/pr120881-1d.c: Likewise.
	* gcc.target/i386/pr120881-2a.c: Likewise.
	* gcc.target/i386/pr120881-2b.c: Likewise.
	* gcc.target/i386/pr82699-1.c: Add additional -mfentry.
	* lib/target-supports.exp (check_effective_target_fentry): New.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-07-14 20:24:12 +08:00
cc4f339733 Darwin: account for macOS 26
darwin25 will be named macOS 26 (codename Tahoe). This is a change from
darwin24, which was macOS 15. We need to adapt the driver to this new
numbering scheme.
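
A minimal sketch of the version mapping described above.  The function
name macos_major_from_darwin is hypothetical and this is not the code in
darwin-driver.cc.  It assumes Darwin major versions of 20 or later, and it
assumes that releases after darwin25 keep the "Darwin major + 1" pattern,
which the text above states only for darwin25.

```cpp
// Hypothetical helper, not the GCC driver code: map a Darwin kernel major
// version (assumed to be 20 or later) to the macOS major version.
constexpr int macos_major_from_darwin(int darwin_major) {
  // darwin20 .. darwin24 correspond to macOS 11 .. macOS 15.
  // darwin25 is macOS 26; later releases are ASSUMED to follow "+ 1".
  return darwin_major <= 24 ? darwin_major - 9 : darwin_major + 1;
}

static_assert(macos_major_from_darwin(24) == 15, "darwin24 is macOS 15");
static_assert(macos_major_from_darwin(25) == 26, "darwin25 is macOS 26");
```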

2025-07-14  François-Xavier Coudert  <fxcoudert@gcc.gnu.org>

gcc/ChangeLog:

	PR target/120645
	* config/darwin-driver.cc: Account for latest macOS numbering
	scheme.

gcc/testsuite/ChangeLog:

	* gcc.dg/darwin-minversion-link.c: Account for macOS 26.
2025-07-14 14:23:19 +02:00
99a3c71db6 [PATCH v2] RISC-V: Vector-scalar widening multiply-(subtract-)accumulate [PR119100]
This pattern enables the combine pass (or late-combine, depending on the case)
to merge a float_extend'ed vec_duplicate into a plus-mult or minus-mult RTL
instruction.

Before this patch, we have three instructions, e.g.:
  fcvt.s.h       fa5,fa5
  vfmv.v.f       v24,fa5
  vfmadd.vv      v8,v24,v16

After, we get only one:
  vfwmacc.vf     v8,fa5,v16

	PR target/119100

gcc/ChangeLog:

	* config/riscv/autovec-opt.md (*vfwmacc_vf_<mode>): New pattern to
	handle both vfwmacc and vfwmsac.
	(*extend_vf_<mode>): New pattern that serves as an intermediate combine
	step.
	* config/riscv/vector-iterators.md (vsubel): New mode attribute. This is
	just the lower-case version of VSUBEL.
	* config/riscv/vector.md (@pred_widen_mul_<optab><mode>_scalar): Reorder
	and swap operands to match the RTL emitted by expand, i.e. first
	float_extend then vec_duplicate.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfwmacc and
	vfwmsac.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Likewise.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise. Also check
	for fcvt and vfmv.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Add vfwmacc and
	vfwmsac.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise. Also check
	for fcvt and vfmv.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop.h: Add support for
	widening variants.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_widen_run.h: New test
	helper.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmacc-run-1-f16.c: New test.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmacc-run-1-f32.c: New test.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmsac-run-1-f16.c: New test.
	* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmsac-run-1-f32.c: New test.
2025-07-14 06:10:44 -06:00
9c75032b40 libstdc++: Protect PSTL headers against overloaded commas
Reported upstream: https://github.com/uxlfoundation/oneDPL/issues/2342
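
For illustration, a small sketch of the problem class and of the usual
idiom for avoiding it.  The type counting_iter and both copy functions are
hypothetical and are not taken from the PSTL headers; casting one operand
to void is assumed here as the protection, since an overloaded comma
operator cannot accept a void operand.

```cpp
#include <cassert>

// Counts how often the user-defined comma operator below is called.
static int comma_calls = 0;

// Hypothetical iterator type that overloads the comma operator.
struct counting_iter {
  int* p;
  int& operator*() const { return *p; }
  counting_iter& operator++() { ++p; return *this; }
  friend bool operator!=(counting_iter a, counting_iter b) { return a.p != b.p; }
  template <typename T>
  friend void operator,(const counting_iter&, T&&) { ++comma_calls; }
};

// Unprotected loop: `++first, ++out` selects the user-defined operator,.
template <typename In, typename Out>
Out copy_unprotected(In first, In last, Out out) {
  for (; first != last; ++first, ++out)
    *out = *first;
  return out;
}

// Protected loop: the void cast forces the built-in comma operator.
template <typename In, typename Out>
Out copy_protected(In first, In last, Out out) {
  for (; first != last; ++first, (void) ++out)
    *out = *first;
  return out;
}

// Runs both loops over three elements and checks how often the overload ran.
void check_comma_protection() {
  int src[3] = {1, 2, 3};
  int dst[3] = {};
  comma_calls = 0;
  copy_protected(counting_iter{src}, counting_iter{src + 3}, dst);
  assert(comma_calls == 0);  // built-in comma was used
  copy_unprotected(counting_iter{src}, counting_iter{src + 3}, dst);
  assert(comma_calls == 3);  // overload ran once per iteration
}
```

In this sketch the overload is harmless and only counts calls; an overload
that is deleted or has side effects would make the unprotected loop fail to
compile or misbehave.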

libstdc++-v3/ChangeLog:

	* include/pstl/algorithm_impl.h (__for_each_n_it_serial):
	Protect against overloaded comma operator.
	(__brick_walk2): Likewise.
	(__brick_walk2_n): Likewise.
	(__brick_walk3): Likewise.
	(__brick_move_destroy::operator()): Likewise.
	(__brick_calc_mask_1): Likewise.
	(__brick_copy_by_mask): Likewise.
	(__brick_partition_by_mask): Likewise.
	(__brick_calc_mask_2): Likewise.
	(__brick_reverse): Likewise.
	(__pattern_partial_sort_copy): Likewise.
	* include/pstl/memory_impl.h (__brick_uninitialized_move):
	Likewise.
	(__brick_uninitialized_copy): Likewise.
	* include/pstl/numeric_impl.h (__brick_transform_scan):
	Likewise.
2025-07-14 12:54:42 +01:00
9b6b7fed78 libstdc++: Correct value of __cpp_lib_constexpr_exceptions [PR117785]
Only P3068R6 (Allowing exception throwing in constant-evaluation) is
implemented in the library so far, so the value of the
constexpr_exceptions feature test macro should be 202411L. Once we
support the library changes in P3378R2 (constexpr exception types) then
we can set the value to 202502L again.
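
As a sketch of how user code might distinguish the two values mentioned
above: the helper names below are hypothetical, and only the macro name and
the values 202411L and 202502L come from this message.

```cpp
// <version> carries the library feature-test macros where it is available.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

// Hypothetical helper: true if the library reports at least P3068R6
// (throwing exceptions during constant evaluation).
constexpr bool has_constexpr_throw_support() {
#if defined(__cpp_lib_constexpr_exceptions) \
    && __cpp_lib_constexpr_exceptions >= 202411L
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: true if the library also reports P3378R2
// (constexpr exception types).
constexpr bool has_constexpr_exception_types() {
#if defined(__cpp_lib_constexpr_exceptions) \
    && __cpp_lib_constexpr_exceptions >= 202502L
  return true;
#else
  return false;
#endif
}

// The larger value implies the smaller one.
static_assert(!has_constexpr_exception_types() || has_constexpr_throw_support(),
              "202502L implies 202411L");
```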

libstdc++-v3/ChangeLog:

	PR libstdc++/117785
	* include/bits/version.def (constexpr_exceptions): Define
	correct value.
	* include/bits/version.h: Regenerate.
	* libsupc++/exception: Check correct value.
	* testsuite/18_support/exception/version.cc: New test.
2025-07-14 12:53:22 +01:00
8aff55e259 libstdc++: Fix constexpr exceptions for -fno-exceptions
The if-consteval branches in std::make_exception_ptr and
std::exception_ptr_cast use a try-catch block, which gives an error for
-fno-exceptions. Just make them return a null pointer at compile-time
when -fno-exceptions is used, because there's no way to get an active
exception with -fno-exceptions.

For both functions we have a runtime-only branch that depends on RTTI,
and a fallback using try-catch which works for runtime and consteval.
Rearrange both functions to express this logic more clearly.

Also adjust some formatting and whitespace elsewhere in the file.

libstdc++-v3/ChangeLog:

	* libsupc++/exception_ptr.h (make_exception_ptr): Return null
	for consteval when -fno-exceptions is used.
	(exception_ptr_cast): Likewise. Allow consteval path to work
	with -fno-rtti.

Reviewed-by: Jakub Jelinek <jakub@redhat.com>
2025-07-14 12:53:04 +01:00
b513e4f3e0 Ada: Add missing guard before accessing the Underlying_Record_View field
It is necessary when GNAT extensions are enabled (-gnatX switch).

gcc/ada/
	PR ada/121056
	* sem_ch4.adb (Try_Object_Operation.Try_Primitive_Operation): Add
	test on Is_Record_Type before accessing Underlying_Record_View.

gcc/testsuite/
	* gnat.dg/deref4.adb: New test.
	* gnat.dg/deref4_pkg.ads: New helper.
2025-07-14 12:16:09 +02:00
3a1067c8b8 aarch64: Implement sme2+faminmax extension.
Implements the sme2+faminmax svamin and svamax intrinsics.

gcc/ChangeLog:

	* config/aarch64/aarch64-sme.md (@aarch64_sme_<faminmax_uns_op><mode>):
	New patterns.
	* config/aarch64/aarch64-sve-builtins-sme.def (svamin): New intrinsics.
	(svamax): New intrinsics.
	* config/aarch64/aarch64-sve-builtins-sve2.cc (class faminmaximpl): New
	class.
	(svamin): New function.
	(svamax): New function.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c: New test.
2025-07-14 06:48:44 +00:00
e69b78c9e2 i386: Remove KEYLOCKER related feature since Panther Lake and Clearwater Forest
According to the July 2025 SDM, Key Locker will no longer be supported on
hardware from 2025 onwards. This means for Panther Lake and Clearwater Forest,
the feature will not be enabled. Remove it from those two platforms.

gcc/ChangeLog:

	* config/i386/i386.h (PTA_PANTHERLAKE): Remove KL and WIDEKL.
	(PTA_CLEARWATERFOREST): Ditto.
	* doc/invoke.texi: Revise documentation.
2025-07-14 14:28:44 +08:00
db3afff48f RISC-V: Add testcases for unsigned vector SAT_SUB form 11 and form 12
This patch adds testcases for form 11 and form 12, as shown below:

void __attribute__((noinline))                                       \
vec_sat_u_sub_##T##_fmt_11 (T *out, T *op_1, T *op_2, unsigned limit) \
{                                                                    \
  unsigned i;                                                        \
  for (i = 0; i < limit; i++)                                        \
    {                                                                \
      T x = op_1[i];                                                 \
      T y = op_2[i];                                                 \
      T ret;                                                         \
      T overflow = __builtin_sub_overflow (x, y, &ret);           \
      out[i] = overflow ? 0 : ret;                                   \
    }                                                                \
}

void __attribute__((noinline))                                        \
vec_sat_u_sub_##T##_fmt_12 (T *out, T *op_1, T *op_2, unsigned limit) \
{                                                                     \
  unsigned i;                                                         \
  for (i = 0; i < limit; i++)                                         \
    {                                                                 \
      T x = op_1[i];                                                  \
      T y = op_2[i];                                                  \
      T ret;                                                          \
      T overflow = __builtin_sub_overflow (x, y, &ret);            \
      out[i] = !overflow ? ret : 0;                                   \
    }                                                                 \
}

Passed the rv64gcv regression test.

Signed-off-by: Ciyan Pan <panciyan@eswincomputing.com>
gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: Unsigned vector SAT_SUB form11 form12.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_data.h: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u16.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u32.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u64.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u8.c: Use ussub instead of usub.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u16.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u32.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u64.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u8.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u16.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u32.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u64.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u8.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u16.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u32.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u64.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u8.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u16.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u32.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u64.c: New test.
	* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u8.c: New test.
2025-07-14 01:45:06 +00:00
c7798b53bd Daily bump. 2025-07-14 00:16:48 +00:00
4d7baa94a4 tree: Add include to tm_p.h to tree.cc [PR120866]
After r16-1738-g0337e3c2743ca0, a call to ASM_GENERATE_INTERNAL_LABEL
was done without including tm_p.h. This does not break most targets,
as the ASM_GENERATE_INTERNAL_LABEL macro does not call target-specific
functions there; it is mostly just sprintf. It does, however,
break pdp11-aout and powerpc-aix* because those two call a
target-specific function to create the internal label.

Pushed as obvious after a build of gcc for pdp11-aout and x86_64-linux-gnu.

	PR middle-end/120866
gcc/ChangeLog:

	* tree.cc: Add include to tm_p.h.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-13 12:07:30 -07:00
356250630a middle-end: Fix typo in gimple.h
gcc/ChangeLog:

	* gimple.h (GTMA_DOES_GO_IRREVOCABLE): Fix typo.
2025-07-13 17:26:06 +01:00
9b9753718e cobol: Minor changes to genapi.cc to eliminate CPPCHECK warnings.
Several hundred cppcheck warnings were eliminated.

Most of these changes were replacing C-style casts, checking for NULL
pointers, establishing some variables and formal parameters as const,
and moving some variables around to tidy up their scopes.

One memory leak was found and eliminated as a result of the cppcheck.

gcc/cobol/ChangeLog:

	* Make-lang.in: Eliminate the .cc.o override.
	* genapi.cc (level_88_helper): Eliminate cppcheck warning.
	(get_level_88_domain): Likewise.
	(get_class_condition_string): Likewise.
	(parser_call_targets_dump): Likewise.
	(parser_compile_ecs): Likewise.
	(initialize_variable_internal): Likewise.
	(move_tree): Likewise.
	(combined_name): Likewise.
	(assembler_label): Likewise.
	(find_procedure): Likewise.
	(parser_perform): Likewise.
	(parser_perform_times): Likewise.
	(internal_perform_through): Likewise.
	(internal_perform_through_times): Likewise.
	(psa_FldLiteralN): Likewise.
	(psa_FldBlob): Likewise.
	(parser_accept): Likewise.
	(parser_accept_exception): Likewise.
	(parser_accept_exception_end): Likewise.
	(parser_accept_command_line): Likewise.
	(parser_accept_envar): Likewise.
	(parser_display_internal): Likewise.
	(parser_display): Likewise.
	(parser_assign): Likewise.
	(parser_initialize_table): Likewise.
	(parser_arith_error): Likewise.
	(parser_arith_error_end): Likewise.
	(parser_division): Likewise.
	(label_fetch): Likewise.
	(parser_label_label): Likewise.
	(parser_label_goto): Likewise.
	(parser_perform_start): Likewise.
	(parser_perform_conditional): Likewise.
	(parser_perform_conditional_end): Likewise.
	(parser_perform_until): Likewise.
	(parser_file_delete): Likewise.
	(parser_intrinsic_subst): Likewise.
	(create_lsearch_address_pairs): Likewise.
	(parser_bsearch_start): Likewise.
	(is_ascending_key): Likewise.
	(parser_sort): Likewise.
	(parser_file_sort): Likewise.
	(parser_return_start): Likewise.
	(parser_file_merge): Likewise.
	(parser_string_overflow): Likewise.
	(parser_unstring): Likewise.
	(parser_string): Likewise.
	(parser_call_exception): Likewise.
	(create_and_call): Likewise.
	(mh_identical): Likewise.
	(move_helper): Likewise.
	(binary_initial_from_float128): Likewise.
	(initial_from_initial): Likewise.
	(psa_FldLiteralA): Likewise.
	(parser_local_add): Likewise.
	(parser_symbol_add): Likewise.
	* genapi.h (parser_display): Likewise.
	* gengen.cc (gg_call_expr): Explict check for NULL_TREE.
	(gg_call): Likewise.
	* show_parse.h (SHOW_PARSE_LABEL_OK): Likewise.
	(TRACE1_FIELD_VALUE): Likewise.
	(CHECK_FIELD): Likewise.
	(CHECK_FIELD2): Likewise.
	(CHECK_LABEL): Likewise.
	* util.cc (cbl_internal_error): Apply [[noreturn]] attribute.
	* util.h (cbl_internal_error): Likewise.

libgcobol/ChangeLog:

	* common-defs.h (PTRCAST): Moved here from libgcobol.h.
	* libgcobol.h (PTRCAST): Deleted.
2025-07-13 10:05:17 -04:00
598455fd73 Daily bump. 2025-07-13 00:22:33 +00:00
f3186568d0 Fix some auto-profile issues
This patch fixes minor things that have accumulated in my tree.  Apart from
formatting fixes, an important change is that the seen set is now kept up to
date.  The original code first populated it for all strings in the string
table, but now gimple matching may introduce new ones that need to be checked
for a match with the symbol table as well.

This makes ImageMagick of spec2017 faster with auto-fdo than without, at
least when trained with the ref run.  The train run has a problem since it
does not train the innermost loop at all, so even with normal PGO it is slower
than without.

autoprofiledbootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

	* auto-profile.cc (function_instance::~function_instance):
	Move down in source.
	(string_table::get_cgraph_node): New member function with
	logic broken out from ...
	(function_instance::get_cgraph_node): ... here.
	(match_with_target): Fix formatting.
	(function_instance::match): Fix formatting; do not use iterators
	after modifying map; remove incorrect set of warned flag.
	(autofdo_source_profile::offline_external_functions): Keep
	seen set up to date.
	(function_instance::read_function_instance): Fix formatting.
2025-07-12 17:58:03 +02:00
8f304b3873 i386: Robustify MMX move patterns
MMX allows only direct moves from zero, so correct V_32:mode and v2qi
move patterns to allow only nonimm_or_0_operand as their input operand.

gcc/ChangeLog:

	* config/i386/mmx.md (mov<V_32:mode>):
	Use nonimm_or_0_operand predicate for operand 1.
	(*mov<V_32:mode>_internal): Ditto.
	(movv2qi): Ditto.
	(*movv2qi_internal): Ditto.  Use ix86_hardreg_mov_ok
	in insn condition.
2025-07-12 17:34:18 +02:00
e6d3c88e7b lra: Reallow reloading user hard registers if the insn is not asm [PR 120983]
The PR 87600 fix has disallowed reloading user hard registers to resolve
earlyclobber-induced conflict.

However, before reload, recog completely ignores the constraints of
insns, so the RTL passes may produce insns where some user hard
registers violate an earlyclobber.  Without reloading them we then get
an ICE, as we have recently been seeing in the LoongArch test
suite.

IIUC "recog does not look at constraints until reload" has been a
well-established rule in GCC for years and I don't have enough skill to
challenge it.  So reallow reloading user hard registers (but still
disallow doing so for asm) to fix the ICE.

gcc/ChangeLog:

	PR rtl-optimization/120983
	* lra-constraints.cc (process_alt_operands): Allow reloading
	user hard registers unless the insn is an asm.
2025-07-12 16:45:20 +08:00
651845ceaa testsuite: Enable the PR 87600 tests for LoongArch
I'm going to refine a part of the PR 87600 fix which seems to trigger
PR 120983, from which LoongArch suffers in particular.  Enable the
PR 87600 tests so I'll not regress PR 87600.

gcc/testsuite/ChangeLog:

	PR rtl-optimization/87600
	PR rtl-optimization/120983
	* gcc.dg/pr87600.h [__loongarch__]: Define REG0 and REG1.
	* gcc.dg/pr87600-1.c (dg-do): Add loongarch.
	* gcc.dg/pr87600-2.c (dg-do): Likewise.
2025-07-12 16:45:17 +08:00
451b6dbf47 Fortran/OpenACC: Permit PARAMETER as 'var' in clauses (+ ignore)
It turned out that other compilers permit (require?) named constants
to appear in clauses - and programs actually use this. OpenACC 3.4
therefore added the following:
  In this spec, a _var_ (in italics) is one of the following:
  ...
  * a named constant in Fortran.
plus
  If during an optimization phase _var_ is removed by the compiler,
  appearances of var in data clauses are ignored.

Thus, all errors related to PARAMETER are now downgraded, most
to a -Wsurprising warning, but for 'acc declare device_resident'
(which kind of makes sense), no warning is printed.

In trans-openmp.cc, those are ignored, unless I missed some code
path. (If so, I hope the middle end removes them; but before
removing them for the covered cases, the program just compiled &
linked fine.)

Note that 'ignore PARAMETER inside clauses' in trans-openmp.cc
would in principle also apply to expressions ('if (var)') but
those should be evaluated during 'resolve.cc' + 'openmp.cc' to
their (numeric, logical, string) value such that there should
be no issue.

gcc/fortran/ChangeLog:

	* invoke.texi (-Wsurprising): Note about OpenACC warning
	related to PARAMETER.
	* openmp.cc (resolve_omp_clauses, gfc_resolve_oacc_declare):
	Accept PARAMETER for OpenACC but add surprising warning.
	* trans-openmp.cc (gfc_trans_omp_variable_list,
	gfc_trans_omp_clauses): Ignore PARAMETER inside clauses.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/parameter.f95: Add -Wsurprising flag
	and update expected diagnostic.
	* gfortran.dg/goacc/parameter-3.f90: New test.
	* gfortran.dg/goacc/parameter-4.f90: New test.
2025-07-12 07:18:06 +02:00
2a73ddc6e3 Daily bump. 2025-07-12 00:19:25 +00:00
a5d9debedd diagnostics: add support for directed graphs; use them for state graphs
In r16-1631-g2334d30cd8feac I added support for capturing state
information from -fanalyzer in XML form, and adding a way to visualize
these states in HTML output.  The data was optionally captured in SARIF
output (with "xml-state=yes"), stashing the XML in string form in
a property bag.

This worked, but there was no way to round-trip the stored data back
from SARIF without adding an XML parser to GCC, which I don't want to
do.

SARIF supports capturing directed graphs, so this patch:

(a) adds a new namespace diagnostics::digraphs, with classes digraph,
node, and edge, representing directed graphs in a form similar to
what SARIF can serialize

(b) adds support to GCC's diagnostic subsystem for reporting graphs,
either "globally" or as part of a diagnostic.  An example in a testsuite
plugin emits an error that has a couple of dummy graphs associated with
it, and captures the optimization passes as a digraph "globally".
Graphs are ignored by text sinks, but are captured by sarif sinks,
and the "experimental-html" sink gains SVG-based rendering of any graphs
using dot.  This HTML output is rather crude; an example can be seen
here:
  https://dmalcolm.fedorapeople.org/gcc/2025-07-10/diagnostic-test-graphs-html.c.html

(c) adds support to libgdiagnostics for the above

(d) adds support to sarif-replay for the above (round-tripping any
graph information)

(e) replaces the XML representation of state with a representation
based on the above directed graphs, using property bags to stash
additional information (e.g. "this is an on-stack buffer")

(f) implements round-tripping of this information in sarif-replay

To summarize:
- previously we could generate HTML diagrams for debugging
  -fanalyzer directly from gcc, but not from stored .sarif output.
- with this patch, we can generate such HTML diagrams both directly
  *and* from stored .sarif output (provided the SARIF sink was created
  with "state-graphs=yes")

Examples of HTML output can be seen here:
  https://dmalcolm.fedorapeople.org/gcc/2025-07-10/
where, as before, j/k can be used to cycle through the events.
This is almost identical to the output from the old XML-based
implementation seen at:
  https://dmalcolm.fedorapeople.org/gcc/2025-06-23/
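
The directed-graph shape described in (a) can be sketched in Python as
follows.  This is only an illustration: it assumes the SARIF 2.1.0 property
names "nodes", "edges", "id", "label", "sourceNodeId" and "targetNodeId", and
make_graph and roundtrip are hypothetical helpers, not part of GCC,
libgdiagnostics or sarif-replay.  The two checks correspond to the
duplicate-node-id and unrecognized-node-id cases covered by the new
sarif-replay tests.

```python
import json


def make_graph(description, nodes, edges):
    """Build a SARIF-style graph object (hypothetical helper).

    nodes: list of (node_id, label) pairs.
    edges: list of (edge_id, source_node_id, target_node_id) triples.
    """
    node_ids = [node_id for node_id, _ in nodes]
    # Node ids must be unique within the graph.
    if len(set(node_ids)) != len(node_ids):
        raise ValueError("duplicate node id")
    known = set(node_ids)
    # Every edge endpoint must name a node that exists in the graph.
    for edge_id, source, target in edges:
        for end in (source, target):
            if end not in known:
                raise ValueError(
                    f"edge {edge_id!r}: unrecognized node id {end!r}")
    return {
        "description": {"text": description},
        "nodes": [{"id": n, "label": {"text": label}} for n, label in nodes],
        "edges": [
            {"id": e, "sourceNodeId": s, "targetNodeId": t}
            for e, s, t in edges
        ],
    }


def roundtrip(graph):
    """Serialize the graph to JSON text and parse it back."""
    return json.loads(json.dumps(graph))
```

Because the graph is plain JSON, roundtrip(g) == g holds for any graph built
this way, which is the property that the XML-in-a-string approach lacked
without an XML parser on the reading side.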

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add diagnostic-digraphs.o and
	diagnostic-state-graphs.o.

gcc/ChangeLog:
	* diagnostic-format-html.cc: Include "diagnostic-format-sarif.h",
	Replace include of "diagnostic-state.h" with includes of
	"diagnostic-digraphs.h" and "diagnostic-state-graphs.h".
	(html_generation_options::html_generation_options): Update for
	field renaming.
	(html_builder::m_body_element): New field.
	(html_builder::html_builder): Initialize m_body_element.
	(html_builder::maybe_make_state_diagram): Port from XML
	implementation to state graph implementation.
	(html_builder::make_element_for_diagnostic): Add any
	per-diagnostic graphs.
	(html_builder::add_graph): New.
	(html_builder::emit_global_graph): New.
	(html_output_format::report_global_digraph): New.
	* diagnostic-format-html.h
	(html_generation_options::m_show_state_diagram_xml): Replace
	with...
	(html_generation_options::m_show_state_diagrams_sarif): ...this.
	(html_generation_options::m_show_state_diagram_dot_src): Rename
	to...
	(html_generation_options::m_show_state_diagrams_dot_src): ...this.
	* diagnostic-format-sarif.cc: Include "diagnostic-digraphs.h" and
	"diagnostic-state-graphs.h".
	(sarif_builder::m_run_graphs): New field.
	(sarif_result::on_nested_diagnostic): Update call to
	make_location_object to pass arg by pointer.
	(sarif_builder::sarif_builder): Initialize m_run_graphs.
	(sarif_builder::report_global_digraph): New.
	(sarif_builder::make_result_object): Add any graphs to
	the result object.
	(sarif_builder::make_locations_arr): Update call to
	make_location_object to pass arg by pointer.
	(sarif_builder::make_location_object): Pass param "loc_mgr" by
	pointer rather than by reference so that it can be null, and
	handle this case.
	(copy_any_property_bag): New.
	(make_sarif_graph): New.
	(make_sarif_node): New.
	(make_sarif_edge): New.
	(sarif_property_bag::set_graph): New.
	(populate_thread_flow_location_object): Port from XML
	implementation to state graph implementation.
	(make_run_object): Store any graphs.
	(sarif_output_format::report_global_digraph): New.
	(sarif_generation_options::sarif_generation_options): Rename
	m_xml_state to m_state_graph.
	(selftest::test_make_location_object): Update for change to
	make_location_object.
	* diagnostic-format-sarif.h:
	(sarif_generation_options::m_xml_state): Replace with...
	(sarif_generation_options::m_state_graph): ...this.
	(class sarif_location_manager): Add forward decl.
	(diagnostics::digraphs::digraph): New forward decl.
	(diagnostics::digraphs::node): New forward decl.
	(diagnostics::digraphs::edge): New forward decl.
	(sarif_property_bag::set_graph): New decl.
	(class sarif_graph): New.
	(class sarif_node): New.
	(class sarif_edge): New.
	(make_sarif_graph): New decl.
	(make_sarif_node): New decl.
	(make_sarif_edge): New decl.
	* diagnostic-format-text.h
	(diagnostic_text_output_format::report_global_digraph): New.
	* diagnostic-format.h
	(diagnostic_output_format::report_global_digraph): New vfunc.
	* diagnostic-digraphs.cc: New file.
	* diagnostic-digraphs.h: New file.
	* diagnostic-metadata.h (diagnostics::digraphs::lazy_digraphs):
	New forward decl.
	(diagnostic_metadata::diagnostic_metadata): Initialize
	m_lazy_digraphs.
	(diagnostic_metadata::set_lazy_digraphs): New.
	(diagnostic_metadata::get_lazy_digraphs): New.
	(diagnostic_metadata::m_lazy_digraphs): New field.
	* diagnostic-output-spec.cc (sarif_scheme_handler::make_sink):
	Update for XML to state graph changes.
	(sarif_scheme_handler::make_sarif_gen_opts): Likewise.
	(html_scheme_handler::make_sink): Rename "show-state-diagram-xml"
	to "show-state-diagrams-sarif" and use pluralization consistently.
	* diagnostic-path.cc: Replace include of "xml.h" with
	"diagnostic-state-graphs.h".
	(diagnostic_event::maybe_make_xml_state): Replace with...
	(diagnostic_event::maybe_make_diagnostic_state_graph): ...this.
	* diagnostic-path.h (diagnostics::digraphs::digraph): New forward
	decl.
	(diagnostic_event::maybe_make_xml_state): Replace with...
	(diagnostic_event::maybe_make_diagnostic_state_graph): ...this.
	* diagnostic-state-graphs.cc: New file.
	* diagnostic-state-graphs.h: New file.
	* diagnostic-state-to-dot.cc: Port implementation from XML to
	state graphs.
	* diagnostic-state.h: Deleted file.
	* diagnostic.cc (diagnostic_context::report_global_digraph): New.
	* diagnostic.h (diagnostics::digraphs::lazy_digraph): New forward
	decl.
	(diagnostic_context::report_global_digraph): New decl.
	* doc/analyzer.texi (Debugging the Analyzer): Update to reflect
	change from XML to state graphs.
	* doc/invoke.texi ("sarif" diagnostics sink): Replace "xml-state"
	with "state-graphs".
	("experimental-html" diagnostics sink): Replace
	"show-state-diagrams-xml" with "show-state-diagrams-sarif"
	* doc/libgdiagnostics/topics/compatibility.rst
	(LIBGDIAGNOSTICS_ABI_3): New.
	* doc/libgdiagnostics/topics/graphs.rst: New file.
	* doc/libgdiagnostics/topics/index.rst: Add graphs.rst.
	* graphviz.h (node_id::operator=): New.
	* json.h (json::value::dyn_cast_string): New.
	(json::object::get_num_keys): New accessor.
	(json::object::get_key): New accessor.
	(json::string::dyn_cast_string): New.
	* libgdiagnostics++.h (class libgdiagnostics::graph): New.
	(class libgdiagnostics::node): New.
	(class libgdiagnostics::edge): New.
	(class libgdiagnostics::diagnostic::take_graph): New.
	(class libgdiagnostics::manager::take_global_graph): New.
	(class libgdiagnostics::graph::set_description): New.
	(class libgdiagnostics::graph::get_node_by_id): New.
	(class libgdiagnostics::graph::get_edge_by_id): New.
	(class libgdiagnostics::graph::add_edge): New.
	(class libgdiagnostics::node::set_label): New.
	(class libgdiagnostics::node::set_location): New.
	(class libgdiagnostics::node::set_logical_location): New.
	* libgdiagnostics-private.h: New file.
	* libgdiagnostics.cc: Define INCLUDE_STRING.  Include
	"diagnostic-digraphs.h", "diagnostic-state-graphs.h", and
	"libgdiagnostics-private.h".
	(struct diagnostic_graph): New.
	(struct diagnostic_node): New.
	(struct diagnostic_edge): New.
	(libgdiagnostics_path_event::libgdiagnostics_path_event): Add
	state_graph param.
	(libgdiagnostics_path_event::maybe_make_diagnostic_state_graph):
	New.
	(libgdiagnostics_path_event::m_state_graph): New field.
	(diagnostic_execution_path::add_event_va): Add state_graph param.
	(class prebuilt_digraphs): New.
	(diagnostic::diagnostic): Use m_graphs in m_metadata.
	(diagnostic::take_graph): New.
	(diagnostic::get_graphs): New accessor.
	(diagnostic::m_graphs): New field.
	(diagnostic_manager::take_global_graph): New.
	(diagnostic_execution_path_add_event): Update for new param to
	add_event_va.
	(diagnostic_execution_path_add_event_va): Likewise.
	(diagnostic_graph::add_node_with_id): New public entrypoint.
	(diagnostic_graph::add_edge_with_label): New public entrypoint.
	(diagnostic_manager_new_graph): New public entrypoint.
	(diagnostic_manager_take_global_graph): New public entrypoint.
	(diagnostic_take_graph): New public entrypoint.
	(diagnostic_graph_release): New public entrypoint.
	(diagnostic_graph_set_description): New public entrypoint.
	(diagnostic_graph_add_node): New public entrypoint.
	(diagnostic_graph_add_edge): New public entrypoint.
	(diagnostic_graph_get_node_by_id): New public entrypoint.
	(diagnostic_graph_get_edge_by_id): New public entrypoint.
	(diagnostic_node_set_location): New public entrypoint.
	(diagnostic_node_set_label): New public entrypoint.
	(diagnostic_node_set_logical_location): New public entrypoint.
	(private_diagnostic_execution_path_add_event_2): New private
	entrypoint.
	(private_diagnostic_graph_set_property_bag): New private
	entrypoint.
	(private_diagnostic_node_set_property_bag): New private
	entrypoint.
	(private_diagnostic_edge_set_property_bag): New private
	entrypoint.
	* libgdiagnostics.h (diagnostic_graph): New typedef.
	(diagnostic_node): New typedef.
	(diagnostic_edge): New typedef.
	(diagnostic_manager_new_graph): New decl.
	(diagnostic_manager_take_global_graph): New decl.
	(diagnostic_take_graph): New decl.
	(diagnostic_graph_release): New decl.
	(diagnostic_graph_set_description): New decl.
	(diagnostic_graph_add_node): New decl.
	(diagnostic_graph_add_edge): New decl.
	(diagnostic_graph_get_node_by_id): New decl.
	(diagnostic_graph_get_edge_by_id): New decl.
	(diagnostic_node_set_label): New decl.
	(diagnostic_node_set_location): New decl.
	(diagnostic_node_set_logical_location): New decl.
	* libgdiagnostics.map (LIBGDIAGNOSTICS_ABI_3): New.
	* libsarifreplay.cc: Include "libgdiagnostics-private.h".
	(id_map): New "using".
	(sarif_replayer::report_invalid_sarif): Update for change to
	report_problem params.
	(sarif_replayer::report_unhandled_sarif): Likewise.
	(sarif_replayer::report_note): New.
	(sarif_replayer::report_problem): Pass param "ref" by
	pointer rather than reference and handle it being null.
	(sarif_replayer::maybe_get_property_bag): New.
	(sarif_replayer::maybe_get_property_bag_value): New.
	(sarif_replayer::handle_run_obj): Handle run-level "graphs" as per
	§3.14.20.
	(sarif_replayer::handle_result_obj): Handle result-level "graphs"
	as per §3.27.19.
	(handle_thread_flow_location_object): Optionally handle graphs
	stored in property "gcc/diagnostic_event/state_graph" as state
	graphs.
	(sarif_replayer::handle_graph_object): New.
	(sarif_replayer::handle_node_object): New.
	(sarif_replayer::handle_edge_object): New.
	(sarif_replayer::get_graph_node_by_id_property): New.
	* selftest-run-tests.cc (selftest::run_tests): Call
	selftest::diagnostic_graph_cc_tests and
	selftest::diagnostic_state_graph_cc_tests.
	* selftest.h (selftest::diagnostic_graph_cc_tests): New decl.
	(selftest::diagnostic_state_graph_cc_tests): New decl.

gcc/analyzer/ChangeLog:
	* ana-state-to-diagnostic-state.cc: Reimplement, replacing
	XML-based implementation with one based on state graphs.
	* ana-state-to-diagnostic-state.h: Likewise.
	* checker-event.cc: Replace include of "xml.h" with include of
	"diagnostic-state-graphs.h".
	(checker_event::maybe_make_xml_state): Replace with...
	(checker_event::maybe_make_diagnostic_state_graph): ...this.
	* checker-event.h: Add include of "diagnostic-digraphs.h".
	(checker_event::maybe_make_xml_state): Replace decl with...
	(checker_event::maybe_make_diagnostic_state_graph): ...this.
	* engine.cc (exploded_node::on_stmt_pre): Replace
	"_analyzer_dump_xml" with "__analyzer_dump_sarif".
	* program-state.cc: Replace include of "diagnostic-state.h" with
	"diagnostic-state-graphs.h".
	(program_state::dump_dot): Port from XML to state graphs.
	* program-state.h: Drop redundant forward decl of xml::document.
	(program_state::make_xml): Replace decl with...
	(program_state::make_diagnostic_state_graph): ...this.
	(program_state::dump_xml_to_pp): Drop decl.
	(program_state::dump_xml_to_file): Drop decl.
	(program_state::dump_xml): Drop decl.
	(program_state::dump_dump_sarif): New decl.
	* sm-malloc.cc (get_dynalloc_state_for_state): New.
	(malloc_state_machine::add_state_to_xml): Replace with...
	(malloc_state_machine::add_state_to_state_graph): ...this.
	* sm.cc (state_machine::add_state_to_xml): Replace with...
	(state_machine::add_state_to_state_graph): ...this.
	(state_machine::add_global_state_to_xml): Replace with...
	(state_machine::add_global_state_to_state_graph): ...this.
	* sm.h (class xml_state): Drop forward decl.
	(class analyzer_state_graph): New forward decl.
	(state_machine::add_state_to_xml): Replace decl with...
	(state_machine::add_state_to_state_graph): ...this.
	(state_machine::add_global_state_to_xml): Replace decl with...
	(state_machine::add_global_state_to_state_graph): ...this.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/state-diagram-1-sarif.py (test_xml_state):
	Rename to...
	(test_state_graph): ...this.  Port from XML to SARIF graphs.
	* gcc.dg/analyzer/state-diagram-1.c: Update sink option
	from "sarif:xml-state=yes" to "sarif:state-graphs=yes".
	* gcc.dg/analyzer/state-diagram-5-sarif.c: Likewise.
	* gcc.dg/analyzer/state-diagram-5-sarif.py: Drop import of ET.
	(test_nested_types_in_xml_state): Rename to...
	(test_nested_types_in_state_graph): ...this.  Port from XML to
	SARIF graphs.
	* gcc.dg/plugin/diagnostic-test-graphs-html.c: New test.
	* gcc.dg/plugin/diagnostic-test-graphs-html.py: New test script.
	* gcc.dg/plugin/diagnostic-test-graphs-sarif.c: New test.
	* gcc.dg/plugin/diagnostic-test-graphs-sarif.py: New test script.
	* gcc.dg/plugin/diagnostic-test-graphs.c: New test.
	* gcc.dg/plugin/diagnostic_plugin_test_graphs.cc: New test plugin.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
	* lib/sarif.py (get_xml_state): Delete.
	(get_state_graph): New.
	(def get_state_node_attr): New.
	(get_state_node_kind): New.
	(get_state_node_name): New.
	(get_state_node_type): New.
	(get_state_node_value): New.
	* sarif-replay.dg/2.1.0-invalid/3.40.2-duplicate-node-id.sarif:
	New test.
	* sarif-replay.dg/2.1.0-invalid/3.41.4-unrecognized-node-id.sarif:
	New test.
	* sarif-replay.dg/2.1.0-valid/graphs-check-html.py: New test
	script.
	* sarif-replay.dg/2.1.0-valid/graphs-check-sarif-roundtrip.py: New
	test script.
	* sarif-replay.dg/2.1.0-valid/graphs.sarif: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-07-11 14:58:21 -04:00
d7c1e9b37c json: add json::value::clone
gcc/ChangeLog:
	* json.cc (json::object::clone): New.
	(json::object::clone_as_object): New.
	(json::array::clone): New.
	(json::float_number::clone): New.
	(json::integer_number::clone): New.
	(json::string::clone): New.
	(json::literal::clone): New.
	(selftest::test_cloning): New test.
	(selftest::json_cc_tests): Call it.
	* json.h (json::value::clone): New vfunc.
	(json::object::clone): New decl.
	(json::object::clone_as_object): New decl.
	(json::array::clone): New decl.
	(json::float_number::clone): New decl.
	(json::integer_number::clone): New decl.
	(json::string::clone): New decl.
	(json::literal::clone): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-07-11 14:58:20 -04:00
1ea72a1503 json: fix null-termination of json::string
gcc/ChangeLog:
	* json.cc (string::string): When constructing from pointer and
	length, ensure the new buffer is null-terminated.
	(selftest::test_strcmp): New.
	(selftest::json_cc_tests): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-07-11 14:58:20 -04:00
457464edf1 libgdiagnostics: doc fixes
gcc/ChangeLog:
	* doc/libgdiagnostics/topics/compatibility.rst
	(_LIBGDIAGNOSTICS_ABI_2): Add missing anchor.
	* doc/libgdiagnostics/topics/diagnostic-manager.rst
	(diagnostic_manager_add_sink_from_spec): Add links to GCC's
	documentation of "-fdiagnostics-add-output=".  Fix parameter
	markup.
	(diagnostic_manager_set_analysis_target): Fix parameter markup.
	Add link to SARIF spec.
	* doc/libgdiagnostics/topics/logical-locations.rst: Markup fix.
	* doc/libgdiagnostics/tutorial/02-physical-locations.rst: Clarify
	wording of what "the source file" means, and that a range can't
	have multiple files.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-07-11 14:58:20 -04:00
06c41504bd [PR121007, LRA]: Fall back to reload of whole inner address in PR case and constrain iteration number of address reloads
gcc/ChangeLog:

	* lra-constraints.cc (process_address_1): When changing base reg
	on a reg of the base class, fall back to reload of whole inner address.
	(process_address): Constrain the iteration number.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr121007.c: New.
2025-07-11 14:26:53 -04:00
981bd3e62c c++: Implement C++26 P2786R13 - Trivial Relocatability [PR119064]
The following patch implements the compiler side of the C++26 paper.
Based on the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119064#c3
feedback, the patch enables the new conditional keywords
trivially_relocatable_if_eligible and replaceable_if_eligible only
for C++26, for older versions those conditional keywords yield
-Wc++26-compat warning and are treated as normal identifiers.
Plus __trivially_relocatable_if_eligible and __replaceable_if_eligible
are handled as conditional keywords always without diagnostics (similarly
to __final in C++98).
The patch uses __builtin_ prefix on the new traits (but unlike clang,
which for some weird reason chose to name one __builtin_is_replaceable
and another __builtin_is_cpp_trivially_relocatable, this one uses
__builtin_is_replaceable and __builtin_is_trivially_relocatable).
I'll try to convince clang to change, they've only implemented it
recently.
The patch computes these properties on demand, only when something needs
them (at the expense of eating 2 more bits per lang_type, but I've recently
saved 64 bits and a patch to save another 64 bits is pending; and even
4 bits wouldn't fit).
The patch doesn't add the __builtin_trivially_relocate builtin that clang has;
std::trivially_relocate is not constexpr and I think we don't need it for
now.  At least until we implement some kind of vtable pointer signing,
__builtin_memmove should do the job.  Especially if libstdc++ will for clang
compatibility use the builtin if available and __builtin_memmove otherwise,
we can switch any time.
I've cross-tested all testcases also against the clang++ trunk
implementation, and both compilers agreed in everything except for
https://github.com/llvm/llvm-project/issues/143599
where clang++ was changed already and
https://github.com/llvm/llvm-project/issues/144232
where I believe clang++ got it wrong too.
The first testcase comes from
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2786r13.html#simple-worked-examples
just tweaked so that the classes are named differently each time and that it
compiles.  There are 3 differences between the paper and the g++ and
clang++ implementations; I've added comments to
trivially-relocatable1.C, but I think either that part of the paper wasn't
updated through the later changes or it got it wrong.

2025-07-11  Jakub Jelinek  <jakub@redhat.com>

	PR c++/119064
gcc/
	* doc/invoke.texi (Wc++26-compat): Document.
gcc/c-family/
	* c.opt (Wc++26-compat): New option.
	* c.opt.urls: Regenerate.
	* c-opts.cc (c_common_post_options): Clear warn_cxx26_compat for
	C++26 or later.
	* c-cppbuiltin.cc (c_cpp_builtins): For C++26 predefine
	__cpp_trivial_relocatability=202502L.
gcc/cp/
	* cp-tree.h: Implement C++26 P2786R13 - Trivial Relocatability.
	(struct lang_type): Add trivially_relocatable,
	trivially_relocatable_computed, replaceable and replaceable_computed
	bitfields.  Change width of dummy from 2 to 30.
	(CLASSTYPE_TRIVIALLY_RELOCATABLE_BIT,
	CLASSTYPE_TRIVIALLY_RELOCATABLE_COMPUTED, CLASSTYPE_REPLACEABLE_BIT,
	CLASSTYPE_REPLACEABLE_COMPUTED): Define.
	(enum virt_specifier): Add VIRT_SPEC_TRIVIALLY_RELOCATABLE_IF_ELIGIBLE
	and VIRT_SPEC_REPLACEABLE_IF_ELIGIBLE enumerators.
	(trivially_relocatable_type_p, replaceable_type_p): Declare.
	* cp-trait.def (IS_NOTHROW_RELOCATABLE, IS_REPLACEABLE,
	IS_TRIVIALLY_RELOCATABLE): New traits.
	* parser.cc (cp_parser_class_property_specifier_seq_opt): Handle
	trivially_relocatable_if_eligible,
	__trivially_relocatable_if_eligible, replaceable_if_eligible and
	__replaceable_if_eligible.
	(cp_parser_class_head): Set CLASSTYPE_REPLACEABLE_BIT
	and/or CLASSTYPE_TRIVIALLY_RELOCATABLE_BIT if corresponding
	conditional keywords were parsed and assert corresponding *_COMPUTED
	macro is false.
	* pt.cc (instantiate_class_template): Copy over also
	CLASSTYPE_TRIVIALLY_RELOCATABLE_{BIT,COMPUTED} and
	CLASSTYPE_REPLACEABLE_{BIT,COMPUTED} bits.
	* semantics.cc (referenceable_type_p): Move definition earlier.
	(trait_expr_value): Handle CPTK_IS_NOTHROW_RELOCATABLE,
	CPTK_IS_REPLACEABLE and CPTK_IS_TRIVIALLY_RELOCATABLE.
	(finish_trait_expr): Likewise.
	* tree.cc (default_movable_type_p): New function.
	(union_with_no_declared_special_member_fns): Likewise.
	(trivially_relocatable_type_p): Likewise.
	(replaceable_type_p): Likewise.
	* constraint.cc (diagnose_trait_expr): Handle
	CPTK_IS_NOTHROW_RELOCATABLE, CPTK_IS_REPLACEABLE and
	CPTK_IS_TRIVIALLY_RELOCATABLE.
gcc/testsuite/
	* g++.dg/cpp26/feat-cxx26.C: Add test for
	__cpp_trivial_relocatability.
	* g++.dg/cpp26/trivially-relocatable1.C: New test.
	* g++.dg/cpp26/trivially-relocatable2.C: New test.
	* g++.dg/cpp26/trivially-relocatable3.C: New test.
	* g++.dg/cpp26/trivially-relocatable4.C: New test.
	* g++.dg/cpp26/trivially-relocatable5.C: New test.
	* g++.dg/cpp26/trivially-relocatable6.C: New test.
	* g++.dg/cpp26/trivially-relocatable7.C: New test.
	* g++.dg/cpp26/trivially-relocatable8.C: New test.
	* g++.dg/cpp26/trivially-relocatable9.C: New test.
	* g++.dg/cpp26/trivially-relocatable10.C: New test.
	* g++.dg/cpp26/trivially-relocatable11.C: New test.
2025-07-11 19:05:38 +02:00
1f52396c6f aarch64: Tweak handling of general SVE permutes [PR121027]
This PR is partly about a code quality regression that was triggered
by g:caa7a99a052929d5970677c5b639e1fa5166e334.  That patch taught the
gimple optimisers to fold two VEC_PERM_EXPRs into one, conditional
upon either (a) the original permutations not being "native" operations
or (b) the combined permutation being a "native" operation.

Whether something is a "native" operation is tested by calling
can_vec_perm_const_p with allow_variable_p set to false.  This requires
the permutation to be supported directly by TARGET_VECTORIZE_VEC_PERM_CONST,
rather than falling back to the general vec_perm optab.

This exposed a problem with the way that we handled general 2-input
permutations for SVE.  Unlike Advanced SIMD, base SVE does not have
an instruction to do general 2-input permutations.  We do still implement
the vec_perm optab for SVE, but only when the vector length is known at
compile time.  The general expansion is pretty expensive: an AND, a SUB,
two TBLs, and an ORR.  It certainly couldn't be considered a "native"
operation.

However, if a VEC_PERM_EXPR has a constant selector, the indices can
be wider than the elements being permuted.  This is not true for the
vec_perm optab, where the indices and permuted elements must have the
same precision.

This leads to one case where we cannot leave a general 2-input permutation
to be handled by the vec_perm optab: when permuting bytes on a target
with 2048-bit vectors.  In that case, the indices of the elements in
the second vector are in the range [256, 511], which cannot be stored
in a byte index.
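
The arithmetic behind this case can be sketched in Python.  This is an
illustration only: the helper names are hypothetical, and the list of vector
lengths is an assumption (the power-of-two sizes from 128 to 2048 bits).

```python
# Assumed fixed SVE vector lengths and element widths, in bits.
VECTOR_BITS = (128, 256, 512, 1024, 2048)
ELEMENT_BITS = (8, 16, 32, 64)


def second_input_indices(vector_bits, element_bits):
    """Return the (lowest, highest) selector index naming an element of
    the second input of a 2-input permutation."""
    nelts = vector_bits // element_bits
    return nelts, 2 * nelts - 1


def index_fits_in_element(vector_bits, element_bits):
    """True if every selector index can be stored in an index that is as
    wide as the permuted elements, as the vec_perm optab requires."""
    _, highest = second_input_indices(vector_bits, element_bits)
    return highest < (1 << element_bits)


def problem_cases():
    """List the (vector_bits, element_bits) pairs whose indices overflow."""
    return [(v, e) for v in VECTOR_BITS for e in ELEMENT_BITS
            if not index_fits_in_element(v, e)]
```

Under these assumptions, second_input_indices(2048, 8) gives (256, 511) and
problem_cases() contains only (2048, 8): with 1024-bit vectors the highest
byte index is 255, which still fits.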

TARGET_VECTORIZE_VEC_PERM_CONST therefore has to handle 2-input SVE
permutations for one specific case.  Rather than check for that
specific case, the code went ahead and used the vec_perm expansion
whenever it worked.  But that undermines the !allow_variable_p
handling in can_vec_perm_const_p; it becomes impossible for
target-independent code to distinguish "native" operations from
the worst-case fallback.

This patch instead limits TARGET_VECTORIZE_VEC_PERM_CONST to the
cases that it has to handle.  It fixes the PR for all vector lengths
except 2048 bits.

A better fix would be to introduce some sort of costing mechanism,
which would allow us to reject the new VEC_PERM_EXPR even for
2048-bit targets.  But that would be a significant amount of work
and would not be backportable.

gcc/
	PR target/121027
	* config/aarch64/aarch64.cc (aarch64_evpc_sve_tbl): Punt on 2-input
	operations that can be handled by vec_perm.

gcc/testsuite/
	PR target/121027
	* gcc.target/aarch64/sve/acle/general/perm_1.c: New test.
2025-07-11 16:48:41 +01:00
cfa827188d aarch64: Use EOR3 for DImode values
Similar to BCAX, we can use EOR3 for DImode, but we have to be careful
not to force GP<->SIMD moves unnecessarily, so add a splitter for that case.

So for input:
uint64_t eor3_d_gp (uint64_t a, uint64_t b, uint64_t c) { return EOR3 (a, b, c); }
uint64x1_t eor3_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return EOR3 (a, b, c); }

We generate the desired:
eor3_d_gp:
        eor     x1, x1, x2
        eor     x0, x1, x0
        ret

eor3_d:
        eor3    v0.16b, v0.16b, v1.16b, v2.16b
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/

	* config/aarch64/aarch64-simd.md (*eor3qdi4): New
	define_insn_and_split.

gcc/testsuite/

	* gcc.target/aarch64/simd/eor3_d.c: Add tests for DImode operands.
2025-07-11 16:10:03 +02:00
1b7bcac032 aarch64: Handle DImode BCAX operations
To handle DImode BCAX operations we want to do them on the SIMD side only if
the incoming arguments don't require a cross-bank move.
This means we need to split back the combination to separate GP BIC+EOR
instructions if the operands are expected to be in GP regs through reload.
The split happens pre-reload if we already know that the destination will be
a GP reg.  Otherwise, if reload decides to use the "=r,r" alternative, we ensure
operand 0 is early-clobber.
This scheme is similar to how we handle the BSL operations elsewhere in
aarch64-simd.md.

Thus, for the functions:
uint64_t bcax_d_gp (uint64_t a, uint64_t b, uint64_t c) { return BCAX (a, b, c); }
uint64x1_t bcax_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return BCAX (a, b, c); }

we now generate the desired:
bcax_d_gp:
        bic     x1, x1, x2
        eor     x0, x1, x0
        ret

bcax_d:
        bcax    v0.16b, v0.16b, v1.16b, v2.16b
        ret

When the inputs are in SIMD regs we use BCAX and when they are in GP regs we
don't force them to SIMD with extra moves.
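
As a sanity check of the GP sequence above, here is a small Python model.
It is a sketch under stated assumptions: the BCAX macro is not defined in this
message, so its meaning a ^ (b & ~c) is inferred from the BIC+EOR sequence,
and all helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers with Python integers


def bcax(a, b, c):
    """Assumed meaning of BCAX (a, b, c): a ^ (b & ~c)."""
    return (a ^ (b & ~c)) & MASK64


def bic(x, y):
    """Bit clear: x & ~y."""
    return (x & ~y) & MASK64


def eor(x, y):
    """Exclusive or."""
    return (x ^ y) & MASK64


def bcax_d_gp(a, b, c):
    """Replay the two-instruction GP sequence, with a, b, c in x0, x1, x2."""
    x0, x1, x2 = a, b, c
    x1 = bic(x1, x2)  # bic x1, x1, x2
    x0 = eor(x1, x0)  # eor x0, x1, x0
    return x0
```

For any 64-bit inputs, bcax_d_gp(a, b, c) == bcax(a, b, c), so the split form
computes the same value without moving the operands to SIMD registers.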

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/

	* config/aarch64/aarch64-simd.md (*bcaxqdi4): New
	define_insn_and_split.

gcc/testsuite/

	* gcc.target/aarch64/simd/bcax_d.c: Add tests for DImode arguments.
2025-07-11 16:09:40 +02:00
4da7ba8617 aarch64: Use EOR3 for 64-bit vector modes
Similar to the BCAX patch, we can also use EOR3 for 64-bit modes,
just by adjusting the mode iterator used.
Thus for input:

uint32x2_t
bcax_s (uint32x2_t a, uint32x2_t b, uint32x2_t c)
{
  return EOR3 (a, b, c);
}

we now generate:
bcax_s:
        eor3    v0.16b, v0.16b, v1.16b, v2.16b
        ret

instead of:
bcax_s:
        eor     v1.8b, v1.8b, v2.8b
        eor     v0.8b, v1.8b, v0.8b
        ret

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/

	* config/aarch64/aarch64-simd.md (eor3q<mode>4): Use VDQ_I mode
	iterator.

gcc/testsuite/

	* gcc.target/aarch64/simd/eor3_d.c: New test.
2025-07-11 16:09:22 +02:00
5300e2bda9 aarch64: Allow 64-bit vector modes in pattern for BCAX instruction
The BCAX instruction from TARGET_SHA3 only operates on the full .16b form
of the inputs but as it's a pure bitwise operation we can use it for the 64-bit
modes as well as there we don't care about the upper 64 bits.  This patch extends
the relevant pattern in aarch64-simd.md to accept the 64-bit vector modes.

Thus, for the input:
uint32x2_t
bcax_s (uint32x2_t a, uint32x2_t b, uint32x2_t c)
{
  return BCAX (a, b, c);
}

we can now generate:
bcax_s:
        bcax    v0.16b, v0.16b, v1.16b, v2.16b
        ret

instead of the current:
bcax_s:
        bic     v1.8b, v1.8b, v2.8b
        eor     v0.8b, v1.8b, v0.8b
        ret

This patch doesn't cover the DI/V1DI modes as that would require extending
the bcaxqdi4 pattern with =r,r alternatives and adding splitting logic to
handle the cases where the operands arrive in GP regs.  It is doable, but can
be a separate patch.  As is, this patch should always be a straightforward
improvement.
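
The argument that the upper 64 bits do not matter can be sketched with a
plain model of a 128-bit register as two 64-bit halves.  This is an
illustration only: the type Q128 and the function names are hypothetical, and
BCAX is assumed to compute a ^ (b & ~c) as in the BIC+EOR sequence shown
earlier in this message.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a 128-bit SIMD register: [0] is the low 64 bits,
// [1] is the upper 64 bits.
using Q128 = std::array<std::uint64_t, 2>;

// Full-width bitwise BCAX: every bit of the result depends only on the
// same bit position of the three inputs.
inline Q128 bcax_q(const Q128& a, const Q128& b, const Q128& c) {
  return Q128{a[0] ^ (b[0] & ~c[0]), a[1] ^ (b[1] & ~c[1])};
}

// A 64-bit operation carried out with the full-width form.  Whatever sits
// in the upper halves ("junk") cannot leak into the low half.
inline std::uint64_t bcax_d_via_q(std::uint64_t a, std::uint64_t b,
                                  std::uint64_t c, std::uint64_t junk) {
  Q128 r = bcax_q(Q128{a, junk}, Q128{b, junk}, Q128{c, ~junk});
  return r[0];
}

// Self-check: two different junk values give the same low half.
inline void check_upper_half_ignored() {
  assert(bcax_d_via_q(0xF0u, 0xFFu, 0x0Fu, 0u) ==
         bcax_d_via_q(0xF0u, 0xFFu, 0x0Fu, 0xDEADBEEFu));
}
```

The same reasoning would not hold for an operation with cross-lane carries
or shuffles, which is why it is tied here to a pure bitwise operation.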

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/

	* config/aarch64/aarch64-simd.md (bcaxq<mode>4): Use VDQ_I mode
	iterator.

gcc/testsuite/

	* gcc.target/aarch64/simd/bcax_d.c: New test.
2025-07-11 16:09:14 +02:00
f451ef41bd tree-optimization/121034 - fix reduction vectorization
The following fixes the loop following the reduction chain to
properly visit all SLP nodes involved and makes the stmt info
and the SLP node we track match.

	PR tree-optimization/121034
	* tree-vect-loop.cc (vectorizable_reduction): Cleanup
	reduction chain following code.

	* gcc.dg/vect/pr121034.c: New testcase.
2025-07-11 14:38:16 +02:00
dc503631a5 libstdc++: Implement C++26 P3748R0 - Inspecting exception_ptr should be constexpr
The following patch makes std::exception_ptr_cast constexpr.
The paper suggests using dynamic_cast, but that only works for
polymorphic exceptions; it does not work for scalars or non-polymorphic
classes.

Furthermore, the patch adds some static_asserts for
"Mandates: E is a cv-unqualified complete object type. E is not an array type.
E is not a pointer or pointer-to-member type."
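
The rethrow-and-catch technique mentioned in the ChangeLog below can be
sketched at run time as follows.  This is not the libstdc++ implementation:
the name exception_ptr_cast_sketch is hypothetical, the sketch is not
constexpr, and it assumes an implementation (such as libstdc++) where
std::rethrow_exception does not copy the exception object, so the returned
pointer stays valid while the exception_ptr is alive.

```cpp
#include <cassert>
#include <exception>
#include <type_traits>

// Hypothetical run-time model: rethrow the stored exception, catch it by
// reference as E, and return its address (or nullptr on a type mismatch).
template <typename E>
const E* exception_ptr_cast_sketch(const std::exception_ptr& p) noexcept {
  // Static checks modelled on the quoted Mandates.
  static_assert(std::is_object<E>::value && !std::is_const<E>::value &&
                    !std::is_volatile<E>::value,
                "E must be a cv-unqualified object type");
  static_assert(sizeof(E) > 0, "E must be a complete type");
  static_assert(!std::is_array<E>::value, "E must not be an array type");
  static_assert(!std::is_pointer<E>::value &&
                    !std::is_member_pointer<E>::value,
                "E must not be a pointer or pointer-to-member type");

  if (!p)
    return nullptr;  // rethrowing a null exception_ptr is undefined
  try {
    std::rethrow_exception(p);
  } catch (const E& e) {
    return &e;  // works for scalars and non-polymorphic classes too
  } catch (...) {
    return nullptr;
  }
}

// Usage sketch: a scalar exception, which dynamic_cast could not handle.
inline void check_scalar_exception() {
  std::exception_ptr p = std::make_exception_ptr(42);
  const int* q = exception_ptr_cast_sketch<int>(p);
  assert(q != nullptr && *q == 42);
}
```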

2025-07-11  Jakub Jelinek  <jakub@redhat.com>

	* libsupc++/exception_ptr.h: Implement C++26 P3748R0 - Inspecting
	exception_ptr should be constexpr.
	(std::exception_ptr_cast): Make constexpr, remove inline keyword.  Add
	static_asserts for Mandates.  For if consteval use std::rethrow_exception,
	catch it and return its address or nullptr.
	* testsuite/18_support/exception_ptr/exception_ptr_cast.cc (E::~E): Add
	constexpr.
	(G::G): Likewise.
	(test01): Likewise.  Return bool and take bool argument, throw if the
	argument is true.  Add static_assert(test01(false)).
	(main): Call test01(true) in try.
2025-07-11 13:50:07 +02:00
9eea49825e testsuite: Add testcase for already fixed PR [PR120954]
This was a regression introduced by r16-1893 (and its backports) for C++,
though for C the false positive warning had been present for years.  Fixed by
r16-2000 (and its backports).

2025-07-11  Jakub Jelinek  <jakub@redhat.com>

	PR c++/120954
	* c-c++-common/Warray-bounds-11.c: New test.
2025-07-11 13:47:23 +02:00
385d9937f0 Rewrite assign_discriminators
To assign debug locations to corresponding statements auto-fdo uses
discriminators.  Documentation says that if a given statement belongs to
multiple basic blocks, the discriminator distinguishes them.

The current implementation, however, only works for statements that expand into
a linear sequence of gimple statements, since it essentially tracks a current
location and renews it each time a new BB is found.
This is commonly not true for C++ code as in:

  <bb 25> :
  [simulator/csimplemodule.cc:379:85] _40 = std::__cxx11::basic_string<char>::c_str ([simulator/csimplemodule.cc:379:85] &D.80680);
  [simulator/csimplemodule.cc:379:85 discrim 13] _41 = [simulator/csimplemodule.cc:379:85] &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:379:85 discrim 13] _42 = &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:377:45] _43 = this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782._vptr.cObject;
  [simulator/csimplemodule.cc:377:45] _44 = _43 + 40;
  [simulator/csimplemodule.cc:377:45] _45 = [simulator/csimplemodule.cc:377:45] *_44;
  [simulator/csimplemodule.cc:379:85] D.89001 = OBJ_TYPE_REF(_45;(const struct cObject)_42->5B) (_41);

This is a fragment of code that is expanded from:

371         if (this!=simulation.getContextModule())
372             throw cRuntimeError("send()/sendDelayed() of module (%s)%s called in the context of "
373                                 "module (%s)%s: method called from the latter module "
374                                 "lacks Enter_Method() or Enter_Method_Silent()? "
375                                 "Also, if message to be sent is passed from that module, "
376                                 "you'll need to call take(msg) after Enter_Method() as well",
377                                 getClassName(), getFullPath().c_str(),
378                                 simulation.getContextModule()->getClassName(),
379                                 simulation.getContextModule()->getFullPath().c_str());

Notice that 379:85 is interleaved with 377:45 and the pass does not assign a new discriminator.
With the patch we get:

  <bb 25> :
  [simulator/csimplemodule.cc:379:85 discrim 7] _40 = std::__cxx11::basic_string<char>::c_str ([simulator/csimplemodule.cc:379:85] &D.80680);
  [simulator/csimplemodule.cc:379:85 discrim 8] _41 = [simulator/csimplemodule.cc:379:85] &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:379:85 discrim 8] _42 = &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:377:45 discrim 1] _43 = this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782._vptr.cObject;
  [simulator/csimplemodule.cc:377:45 discrim 1] _44 = _43 + 40;
  [simulator/csimplemodule.cc:377:45 discrim 1] _45 = [simulator/csimplemodule.cc:377:45] *_44;
  [simulator/csimplemodule.cc:379:85 discrim 8] D.89001 = OBJ_TYPE_REF(_45;(const struct cObject)_42->5B) (_41);

There are earlier statements with line number 379, which is why the call gets discriminator 7.
After the call the discriminator is increased.  There are two reasons for this:
 1) AFDO requires every callsite to have a unique lineno:discriminator pair
 2) a call may not terminate and thus the profile of the first statement
    may be higher than the rest.
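
The behaviour visible in the dump above can be modelled roughly as below.
This is a simplified sketch, not the code in tree-cfg.cc: the types Stmt,
BasicBlock and LineState and the function assign_discriminators_sketch are
hypothetical, and the real pass may differ in details.  The model keeps one
counter per source line, lets a line keep its discriminator when it is
interleaved with another line inside one block, and advances the counter
after a call and when the line reappears in a different block.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for a gimple statement and a basic block.
struct Stmt {
  int line;
  bool is_call;
  unsigned discriminator;
};
using BasicBlock = std::vector<Stmt>;

// Per-line bookkeeping: current discriminator, last block that used the
// line, and whether the current value is still unused.
struct LineState {
  unsigned cur = 0;
  int last_bb = -1;
  bool fresh = true;
};

inline void assign_discriminators_sketch(std::vector<BasicBlock>& blocks) {
  std::unordered_map<int, LineState> state;
  for (int bb = 0; bb < static_cast<int>(blocks.size()); ++bb) {
    for (Stmt& s : blocks[bb]) {
      LineState& ls = state[s.line];
      // Same line in a different block: take a new discriminator unless
      // the current one has not been handed out yet.
      if (ls.last_bb != bb && !ls.fresh)
        ++ls.cur;
      ls.last_bb = bb;
      ls.fresh = false;
      s.discriminator = ls.cur;
      // After a call, later statements on this line get the next value,
      // so no two calls share a line:discriminator pair.
      if (s.is_call) {
        ++ls.cur;
        ls.fresh = true;
      }
    }
  }
}

// Self-check on the interleaving pattern: call, two statements on the same
// line, a statement on another line, then a call on the first line again.
inline void check_interleaving() {
  std::vector<BasicBlock> blocks = {
      {{379, true, 0}, {379, false, 0}, {377, false, 0}, {379, true, 0}}};
  assign_discriminators_sketch(blocks);
  assert(blocks[0][1].discriminator == blocks[0][3].discriminator);
  assert(blocks[0][0].discriminator != blocks[0][3].discriminator);
}
```

In this model the interleaved line 377 does not disturb the discriminator of
line 379, which matches the "discrim 8" entries before and after the 377:45
statements in the dump.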

The old pass also contained logic to skip debug statements.  This is not a good
idea since we output them to the debug output and if the AFDO tool picks these
locations up they will be misplaced in basic blocks.

Debug statements are naturally quite useful to track back the AFDO profiles
and in the meantime LLVM folks implemented something similar called pseudoprobe.
I think it makes sense to enable debug statements with -fauto-profile even if
debug info is off and make use of them as done in this patch.

Sadly the AFDO tool is quite broken and built around the assumption that every
address has at most one debug location assigned to it (i.e. debug info before
debug statements were introduced).  I have a WIP patch fixing this.

Note that LLVM also has -fdebug-info-for-auto-profile (on by default it seems)
that controls discriminator production and some other little bits.  I wonder if
we want to have something similar.  Should it be -gdebug-info-for-auto-profile
instead?

gcc/ChangeLog:

	* opts.cc (finish_options): Enable debug_nonbind_markers_p for
	auto-profile.
	* tree-cfg.cc (struct locus_discrim_map): Remove.
	(struct locus_discrim_hasher): Remove.
	(locus_discrim_hasher::hash): Remove.
	(locus_discrim_hasher::equal): Remove.
	(first_non_label_nondebug_stmt): Remove.
	(build_gimple_cfg): Do not allocate discriminator tables.
	(next_discriminator_for_locus): Remove.
	(same_line_p): Remove.
	(struct discrim_entry): New structure.
	(assign_discriminator): Rewrite.
	(assign_discriminators): Rewrite.
2025-07-11 13:01:13 +02:00
52d9c2272f Fix ICE in speculative devirtualization
This patch fixes an ICE building lto1 with autoprofiledbootstrap and in pr114790.
What happens is that auto-fdo speculatively devirtualizes to a wrong target.
This is due to a bug where it mixes up dwarf names and linkage names of inline
functions, which I need to fix as well.

Later we clone at WPA time.  At ltrans time the clone is materialized and the
call is turned into a direct call (this optimization is missed by ipa-cp
propagation).  At this time we should resolve speculation but we don't.  As a
result we get an error from the verifier after inlining complaining that there
is a speculative call with a corresponding direct call lacking the speculative
flag.

This seems to be a long-standing problem in
cgraph_update_edges_for_call_stmt_node but I suppose it does not trigger since
we usually speculate correctly or notice the direct call at WPA time already.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

	PR ipa/114790
	* cgraph.cc (cgraph_update_edges_for_call_stmt_node): Resolve devirtualization
	if call statement was optimized out or turned to direct call.

gcc/testsuite/ChangeLog:

	* g++.dg/lto/pr114790_0.C: New test.
	* g++.dg/lto/pr114790_1.C: New test.
2025-07-11 12:38:00 +02:00