From patchwork Wed Nov 1 09:45:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 160495
From: Uros Bizjak
Date: Wed, 1 Nov 2023 10:45:37 +0100
Subject: [PUSHED] i386: Improve stack protector patterns and peephole2s
To: "gcc-patches@gcc.gnu.org"

Improve stack protector patterns and peephole2s to substitute stack
protector scratch register clear with unrelated subsequent register
initialization in several ways:

a. Explicitly generate scratch register as named pseudo.  This allows
optimizers to eventually reuse the zero value in the register.

b. Allow scratch register in different mode (SWI48) than PTR mode:

    d000:  65 48 8b 04 25 28 00   mov    %gs:0x28,%rax
    d007:  00 00
    d009:  48 89 44 24 08         mov    %rax,0x8(%rsp)
    d00e:  8b 87 e0 01 00 00      mov    0x1e0(%rdi),%eax

SImode moves on x86 zero-extend to the whole DImode register, so
stack protector paranoia is not compromised.

c. Relax peephole2 constraint that stack protector scratch register
must match new initialized register.
This relaxation substantially improves peephole2 opportunities, and
generates sequences like:

    a310:  65 4c 8b 34 25 28 00   mov    %gs:0x28,%r14
    a317:  00 00
    a319:  4c 89 74 24 08         mov    %r14,0x8(%rsp)
    a31e:  4c 8b b7 98 00 00 00   mov    0x98(%rdi),%r14

We have to ensure the new scratch is dead before the sequence.

The patch also fixes omission of earlyclobbers for all alternatives
of new initialized register in *stack_protect_set_3, avoiding the need
for reg_overlap_mentioned_p constraint.  Earlyclobbers are per
alternative, not per operand.

Also, instructions are already valid in peephole2 pass, so we don't
have to explicitly re-check their operands for validity.

gcc/ChangeLog:

	* config/i386/i386.md (stack_protect_set): Explicitly
	generate scratch register in word mode.
	(@stack_protect_set_1_): Rename to ...
	(@stack_protect_set_1__): ... this.  Use SWI48 mode
	iterator to match scratch register.
	(stack_protect_set_1 peephole2): Use PTR, W and SWI48 mode
	iterators to match peephole sequence.  Use general_operand
	predicate for operand 4.  Allow different operand 2 and operand 3
	registers and use peep2_reg_dead_p to ensure new scratch
	register is dead before peephole sequence.  Use peep2_reg_dead_p
	to ensure old scratch register is dead after peephole sequence.
	(*stack_protect_set_2_): Rename to ...
	(*stack_protect_set_2__si): ... this.
	(*stack_protect_set_3): Rename to ...
	(*stack_protect_set_2__di): ... this.  Use PTR mode iterator
	to match stack protector memory move.  Use earlyclobber for
	all alternatives of operand 1.
	(stack_protect_set_2 peephole2): Use PTR, W and SWI48 mode
	iterators to match peephole sequence.  Use general_operand
	predicate for operand 4.  Allow different operand 2 and operand 3
	registers and use peep2_reg_dead_p to ensure new scratch
	register is dead before peephole sequence.  Use peep2_reg_dead_p
	to ensure old scratch register is dead after peephole sequence.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
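The reasoning in points b and c above can be modelled outside the compiler.
What follows is only a minimal sketch in Python, not part of the patch: the
helper names (write_reg, protect_then_init, can_fuse), the dictionary-based
register file and the liveness sets are all hypothetical.  It assumes a
64-bit target where a 32-bit register write zero-extends into the full
register, and it reads the two peep2_reg_dead_p checks as "the new register
is dead before the stack protector insn" and "the old scratch is dead before
the following initialization".

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def write_reg(regs, name, value, bits):
    """Model a register write; a 32-bit write zero-extends to 64 bits."""
    if bits == 32:
        regs[name] = value & MASK32
    elif bits == 64:
        regs[name] = value & MASK64
    else:
        raise ValueError("only 32- and 64-bit writes are modelled")


def protect_then_init(regs, stack, canary, reg, value, bits):
    """Model the fused sequence: load canary, store it, overwrite the register.

    The final write plays the role of the xor that it replaces: afterwards
    the register must not hold any part of the canary.
    """
    write_reg(regs, reg, canary, 64)   # mov %gs:0x28, reg
    stack["guard"] = regs[reg]         # mov reg, 0x8(%rsp)
    write_reg(regs, reg, value, bits)  # unrelated initialization of reg


def can_fuse(live_before_protect, live_before_init, old_scratch, new_reg):
    """Simplified reading of the peephole2 condition (an assumption).

    live_before_protect: registers whose value is still needed on entry
    to the stack protector insn; live_before_init: the same for the insn
    that follows it.
    """
    return (new_reg not in live_before_protect
            and old_scratch not in live_before_init)


def demo():
    canary = 0xDEADBEEFCAFEF00D  # made-up guard value
    regs, stack = {}, {}
    # A 32-bit load into the same register that carried the canary.
    protect_then_init(regs, stack, canary, "rax", 0x12345678, 32)
    return regs, stack, canary
```

Calling demo() leaves the guard slot holding the canary while the upper half
of the modelled register is zero, which is the property point b relies on.
can_fuse rejects the substitution when the new register is still live before
the sequence or when the zero in the old scratch is still needed afterwards.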
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 75dd4b4061f..35d073c9a21 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -25653,40 +25653,58 @@ (define_expand "stack_protect_set"
    (match_operand 1 "memory_operand")]
   ""
 {
+  rtx scratch = gen_reg_rtx (word_mode);
+
   emit_insn (gen_stack_protect_set_1
-             (ptr_mode, operands[0], operands[1]));
+             (ptr_mode, word_mode, operands[0], operands[1], scratch));
   DONE;
 })

-(define_insn "@stack_protect_set_1_"
+(define_insn "@stack_protect_set_1__"
   [(set (match_operand:PTR 0 "memory_operand" "=m")
         (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")]
                     UNSPEC_SP_SET))
-   (set (match_scratch:PTR 2 "=&r") (const_int 0))
+   (set (match_operand:SWI48 2 "register_operand" "=&r") (const_int 0))
    (clobber (reg:CC FLAGS_REG))]
   ""
 {
-  output_asm_insn ("mov{}\t{%1, %2|%2, %1}", operands);
-  output_asm_insn ("mov{}\t{%2, %0|%0, %2}", operands);
+  output_asm_insn ("mov{}\t{%1, %2|%2, %1}",
+                   operands);
+  output_asm_insn ("mov{}\t{%2, %0|%0, %2}",
+                   operands);
   return "xor{l}\t%k2, %k2";
 }
   [(set_attr "type" "multi")])

 ;; Patterns and peephole2s to optimize stack_protect_set_1_
-;; immediately followed by *mov{s,d}i_internal to the same register,
-;; where we can avoid the xor{l} above. We don't split this, so that
-;; scheduling or anything else doesn't separate the *stack_protect_set*
-;; pattern from the set of the register that overwrites the register
-;; with a new value.
-(define_insn "*stack_protect_set_2_"
+;; immediately followed by *mov{s,d}i_internal, where we can avoid
+;; the xor{l} above. We don't split this, so that scheduling or
+;; anything else doesn't separate the *stack_protect_set* pattern from
+;; the set of the register that overwrites the register with a new value.
+
+(define_peephole2
+  [(parallel [(set (match_operand:PTR 0 "memory_operand")
+                   (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
+                               UNSPEC_SP_SET))
+              (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+              (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_operand:SWI48 3 "general_reg_operand")
+                   (match_operand:SWI48 4 "const0_operand"))
+              (clobber (reg:CC FLAGS_REG))])]
+  "peep2_reg_dead_p (0, operands[3])
+   && peep2_reg_dead_p (1, operands[2])"
+  [(parallel [(set (match_dup 0)
+                   (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
+              (set (match_dup 3) (const_int 0))
+              (clobber (reg:CC FLAGS_REG))])])
+
+(define_insn "*stack_protect_set_2__si"
   [(set (match_operand:PTR 0 "memory_operand" "=m")
         (unspec:PTR [(match_operand:PTR 3 "memory_operand" "m")]
                     UNSPEC_SP_SET))
    (set (match_operand:SI 1 "register_operand" "=&r")
-        (match_operand:SI 2 "general_operand" "g"))
-   (clobber (reg:CC FLAGS_REG))]
-  "reload_completed
-   && !reg_overlap_mentioned_p (operands[1], operands[2])"
+        (match_operand:SI 2 "general_operand" "g"))]
+  "reload_completed"
 {
   output_asm_insn ("mov{}\t{%3, %1|%1, %3}", operands);
   output_asm_insn ("mov{}\t{%1, %0|%0, %1}", operands);
@@ -25699,38 +25717,16 @@ (define_insn "*stack_protect_set_2_"
   [(set_attr "type" "multi")
    (set_attr "length" "24")])

-(define_peephole2
- [(parallel [(set (match_operand:PTR 0 "memory_operand")
-                  (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
-                              UNSPEC_SP_SET))
-             (set (match_operand:PTR 2 "general_reg_operand") (const_int 0))
-             (clobber (reg:CC FLAGS_REG))])
-  (set (match_operand:SI 3 "general_reg_operand")
-       (match_operand:SI 4))]
- "REGNO (operands[2]) == REGNO (operands[3])
-  && general_operand (operands[4], SImode)
-  && (general_reg_operand (operands[4], SImode)
-      || memory_operand (operands[4], SImode)
-      || immediate_operand (operands[4], SImode))
-  && !reg_overlap_mentioned_p (operands[3], operands[4])"
- [(parallel [(set (match_dup 0)
-                  (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
-             (set (match_dup 3) (match_dup 4))
-             (clobber (reg:CC FLAGS_REG))])])
-
-(define_insn "*stack_protect_set_3"
-  [(set (match_operand:DI 0 "memory_operand" "=m,m,m")
-        (unspec:DI [(match_operand:DI 3 "memory_operand" "m,m,m")]
-                   UNSPEC_SP_SET))
-   (set (match_operand:DI 1 "register_operand" "=&r,r,r")
-        (match_operand:DI 2 "general_operand" "Z,rem,i"))
-   (clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT
-   && reload_completed
-   && !reg_overlap_mentioned_p (operands[1], operands[2])"
+(define_insn "*stack_protect_set_2__di"
+  [(set (match_operand:PTR 0 "memory_operand" "=m,m,m")
+        (unspec:PTR [(match_operand:PTR 3 "memory_operand" "m,m,m")]
+                    UNSPEC_SP_SET))
+   (set (match_operand:DI 1 "register_operand" "=&r,&r,&r")
+        (match_operand:DI 2 "general_operand" "Z,rem,i"))]
+  "TARGET_64BIT && reload_completed"
 {
-  output_asm_insn ("mov{q}\t{%3, %1|%1, %3}", operands);
-  output_asm_insn ("mov{q}\t{%1, %0|%0, %1}", operands);
+  output_asm_insn ("mov{}\t{%3, %1|%1, %3}", operands);
+  output_asm_insn ("mov{}\t{%1, %0|%0, %1}", operands);
   if (pic_32bit_operand (operands[2], DImode))
     return "lea{q}\t{%E2, %1|%1, %E2}";
   else if (which_alternative == 0)
@@ -25746,25 +25742,18 @@ (define_insn "*stack_protect_set_3"
    (set_attr "length" "24")])

 (define_peephole2
- [(parallel [(set (match_operand:DI 0 "memory_operand")
-                  (unspec:DI [(match_operand:DI 1 "memory_operand")]
-                             UNSPEC_SP_SET))
-             (set (match_operand:DI 2 "general_reg_operand") (const_int 0))
-             (clobber (reg:CC FLAGS_REG))])
-  (set (match_dup 2) (match_operand:DI 3))]
- "TARGET_64BIT
-  && general_operand (operands[3], DImode)
-  && (general_reg_operand (operands[3], DImode)
-      || memory_operand (operands[3], DImode)
-      || x86_64_zext_immediate_operand (operands[3], DImode)
-      || x86_64_immediate_operand (operands[3], DImode)
-      || (CONSTANT_P (operands[3])
-          && (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[3]))))
-  && !reg_overlap_mentioned_p (operands[2], operands[3])"
- [(parallel [(set (match_dup 0)
-                  (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
-             (set (match_dup 2) (match_dup 3))
-             (clobber (reg:CC FLAGS_REG))])])
+  [(parallel [(set (match_operand:PTR 0 "memory_operand")
+                   (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
+                               UNSPEC_SP_SET))
+              (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+              (clobber (reg:CC FLAGS_REG))])
+   (set (match_operand:SWI48 3 "general_reg_operand")
+        (match_operand:SWI48 4 "general_operand"))]
+  "peep2_reg_dead_p (0, operands[3])
+   && peep2_reg_dead_p (1, operands[2])"
+  [(parallel [(set (match_dup 0)
+                   (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
+              (set (match_dup 3) (match_dup 4))])])

 (define_expand "stack_protect_test"
   [(match_operand 0 "memory_operand")