From patchwork Mon Dec 19 22:02:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34792 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638473wrn; Mon, 19 Dec 2022 14:06:09 -0800 (PST) X-Google-Smtp-Source: AA0mqf7ggt2iUjJuhz/XQ8i3VhfKkcb/t2YAYv+KLnUfBWEnj84vp5uHSQTc4Es7PqNGAbWiHb8e X-Received: by 2002:a62:1781:0:b0:574:58d1:cf9f with SMTP id 123-20020a621781000000b0057458d1cf9fmr42280321pfx.27.1671487566966; Mon, 19 Dec 2022 14:06:06 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487566; cv=none; d=google.com; s=arc-20160816; b=uKesEZTj0wNB2JpoBLuLgRg8Yss+myV586X14f0mOYMiqFbB4vM05IVakk8AFlnuNH zyZMe3F9VXzUCvsBOY6KUSZO02O2NClA+8zz+N28sbDEgEg6U4P2bYddAB5tR1m0+dhH 9fI9jcR7Vn74T0aK8JGGUF0TkawRj85mL0vT+tFJ3uiz0U6npupQDArXo38wxKrzmCgf aiKqwV4l0fS9ZmvW7gbD1d60XLLAaoggoUjNiwIx1QUX0NtCNfEL4CRYn1TuDnaWKAi+ Tyb2nnrh86dULZ3FOBaFYYOAyKZA7cV567/W6KWhtsKWnsJ8n/3gKYjJOlCfDi+rPgLU Pdvg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=I6XKT/Pcf2+1kXWaQqbl3UWdCnYWXaqzJLkuul+7CVE=; b=Vk0+AweEGQNTVBSlnZd4VOWsIJrMK1Cy2FX/QqWRRv9owHPdQ2R8d4FY7nf1nMnu+B GFW0ruY4jM2DGEn/8d3fW653grkOtd3WYDjcDiDVLHO8nIG3+PILOcybgGe+Cx7+wPPu G62WAXiJvMhZiqPvjj3LZTEVJeNwcmL+p4+E8V5whaGUmaNG7TePhJ001PU2oIL06Oe5 O8JjOCGZ3/k4Qrq0UFBTdSfLpBqr0AeOoPk+G+lN6lhsT34Ighv842zmi+foVc0J47Zi af2wvKt0i6ZwkuC5byGWaTfsswL9UBIYLGlehgVbKqSJhMhAWN1/jXIwGjvlcckL2R7S 5jIw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=D4P6AjoZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE 
dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id g18-20020a056a001a1200b0054764d871b5si12381949pfv.230.2022.12.19.14.05.53; Mon, 19 Dec 2022 14:06:06 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=D4P6AjoZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232747AbiLSWDx (ORCPT + 99 others); Mon, 19 Dec 2022 17:03:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232852AbiLSWDU (ORCPT ); Mon, 19 Dec 2022 17:03:20 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9045A140C7; Mon, 19 Dec 2022 14:03:19 -0800 (PST) Received: from pps.filterd (m0134422.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJKsZRw018294; Mon, 19 Dec 2022 22:02:44 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=I6XKT/Pcf2+1kXWaQqbl3UWdCnYWXaqzJLkuul+7CVE=; b=D4P6AjoZHDeS94X51/5MPD7RQ8nRac65Iyn+ZTKFVfPanU0/oMOsd8ixWba7jamocAzi y38XUKZzvJpszmLjyQ3uQknffSNFQlIrbCvWzuarR/mHjvVQWyQDWo64Dq9+sB1pQtJM KjegFI2S18Hbkq05ByYKyB8cSPSYNpD6REA7POOabUZOqP/0fcB+NN2oDkCFWHKo2ClF k4l6lEIOgUnMWkeA/mHJfKSYojyU6z5W3JCfvuJ3qr8rZ6DL6p9BDoEhN0xQiyn5RlC1 
MJnbgQtCYZ185A54VHaNlBDGtkUR0y62l8o4iIJtlHEBEgCbsRkK7ZvDa2E/Do+B40cg Ew== Received: from p1lg14880.it.hpe.com (p1lg14880.it.hpe.com [16.230.97.201]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3mjyd9rckj-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:02:44 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14880.it.hpe.com (Postfix) with ESMTPS id C5288807116; Mon, 19 Dec 2022 22:02:43 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id EC3FA805634; Mon, 19 Dec 2022 22:02:42 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 01/13] x86: protect simd.h header file Date: Mon, 19 Dec 2022 16:02:11 -0600 Message-Id: <20221219220223.3982176-2-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: k_vQFl1c6KrTHEMMA5Ho4JlnNYZ_Q4NU X-Proofpoint-GUID: k_vQFl1c6KrTHEMMA5Ho4JlnNYZ_Q4NU X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 mlxlogscore=926 clxscore=1015 bulkscore=0 adultscore=0 
malwarescore=0 spamscore=0 impostorscore=0 mlxscore=0 lowpriorityscore=0 phishscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681747194956151?= X-GMAIL-MSGID: =?utf-8?q?1752681747194956151?= Add the usual #ifndef/#define construct around the contents of simd.h so it doesn't confuse the C pre-processor if included by multiple include files. Fixes: 801201aa2564 ("crypto: move x86 to the generic version of ablk_helper") Signed-off-by: Robert Elliott --- arch/x86/include/asm/simd.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/include/asm/simd.h b/arch/x86/include/asm/simd.h index a341c878e977..bd9c672a2792 100644 --- a/arch/x86/include/asm/simd.h +++ b/arch/x86/include/asm/simd.h @@ -1,4 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_SIMD_H +#define _ASM_X86_SIMD_H #include @@ -10,3 +12,5 @@ static __must_check inline bool may_use_simd(void) { return irq_fpu_usable(); } + +#endif /* _ASM_X86_SIMD_H */ From patchwork Mon Dec 19 22:02:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34794 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638479wrn; Mon, 19 Dec 2022 14:06:10 -0800 (PST) X-Google-Smtp-Source: AMrXdXuwvIwkiWWW+nUJO2vZAQ9ID3FAS9uGcIX3g1dxQI/Lt6TCbYGAFBnkYWkZ+LbYg//TFRtE X-Received: by 2002:a17:902:ba89:b0:189:c62e:ac34 with SMTP id 
k9-20020a170902ba8900b00189c62eac34mr11052961pls.47.1671487570364; Mon, 19 Dec 2022 14:06:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487570; cv=none; d=google.com; s=arc-20160816; b=C0RSkYa912rALr37qD23fabre/S5t9GBD3uZAzpj8mdNbO/pRNrSib1B211qsXUgS5 ADTeiqmx2akVmJx67b6isaqZszHkTCNhLMu/Oo9PA29SKdLeVrNRqWCcPnSd23d2Thls 0sgmgt5T/HJjW4e8CQjgaq60nOpr/+wBlGOlmlJNEBWcOY2Osc0p8qFJAROGSscdBOvX OFHVhYqpdkGSfM2G5IXxisLG1x9ANj/4TszLj5bAlZo+37+ZevztAfTz+lfppMKj4S5J SmK6pbIamBb+dBSBmXXf5OCv1J/vTYGxfPIcoArm6RCpCazgczZ0mJtZnZWUHk2R5ubl wNuw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DUz+WsoLUu9ZHA+z71TaCHVliVtTYVa80yxAm1HKF4g=; b=F6Un5FsyhprB7Je8xPc1GdOTahA7fiOkCtCqIXxfWXDwCSdHriGJdYR08oRuNpL7Ju FoLeiQbxWZOs48iWsqPXbcfbUYPWQx9SaQcjokHQKpWK/hVmbzZopsY/c/M6KmAnzKnj unEcpMI1/4WqjlyG7h2vGDCte1bPfY5MtvyDGPJDa49jalAvbKmtD4S2Tcf2lONscJbs pPxYUzOpBHhiHPlZqoXUq48/2NCIwSS+fX51q1ZHuJLZsUpcR1y+N5irm9kmv2CAiYuA XXpQvIzJlMLzMcE/nSyC5yBabwP5jY8vyKLUB/6i5Yrgqj2+Q79q0fche1f04filU98t NkeA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=RW6sKc9B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id f15-20020a170902ab8f00b001782ecb617dsi11014317plr.412.2022.12.19.14.05.57; Mon, 19 Dec 2022 14:06:10 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=RW6sKc9B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232923AbiLSWD4 (ORCPT + 99 others); Mon, 19 Dec 2022 17:03:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60758 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232840AbiLSWDU (ORCPT ); Mon, 19 Dec 2022 17:03:20 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7D42F140C6; Mon, 19 Dec 2022 14:03:19 -0800 (PST) Received: from pps.filterd (m0148663.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJL5BJ4004423; Mon, 19 Dec 2022 22:02:47 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=DUz+WsoLUu9ZHA+z71TaCHVliVtTYVa80yxAm1HKF4g=; b=RW6sKc9BEWBI+rYS8eckIaK7NhIc+uCvrxlgKBPBVyU17Ir5g1ATwZVhFr8RV4hy2IO1 pNjJM+Kkt1DlUldQ3RS3phZwXdOUnoRSP2yxdAloCK0r5p3BhHgYhIakYGlriIvCUVuG hYKeXIivWtOcwMRDjovRrbjxQv3P3qRNTp7HFo6GK+smWP5+aaPh8lRwWG+O6cPG/9Kh YBUq2metVJc6XBwcbOmp4ASPSh4IonJD7FVC3DmsiFE7aMxcb3T11YvD7gsrQekRZHli A9KcA1NuAHJVzUho6WNoSqc6Dozkwlw3OLwLcVNgbFJdRmwpjJxiotcUtvghs8rDjdv7 Lg== Received: from 
p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3mjyd5rcwj-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:02:47 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 3735731099; Mon, 19 Dec 2022 22:02:46 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 5F9DF8061BF; Mon, 19 Dec 2022 22:02:45 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 02/13] x86: add yield FPU context utility function Date: Mon, 19 Dec 2022 16:02:12 -0600 Message-Id: <20221219220223.3982176-3-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: Gvjh4KDpMzFEWSzZRj7nTCTBt22lX88l X-Proofpoint-ORIG-GUID: Gvjh4KDpMzFEWSzZRj7nTCTBt22lX88l X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 phishscore=0 malwarescore=0 clxscore=1015 mlxlogscore=709 bulkscore=0 priorityscore=1501 adultscore=0 impostorscore=0 
mlxscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681750529467953?= X-GMAIL-MSGID: =?utf-8?q?1752681750529467953?= Add a function that may be called to avoid hogging the CPU between kernel_fpu_begin() and kernel_fpu_end() calls. Signed-off-by: Robert Elliott --- arch/x86/include/asm/simd.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/arch/x86/include/asm/simd.h b/arch/x86/include/asm/simd.h index bd9c672a2792..2c887dec95a2 100644 --- a/arch/x86/include/asm/simd.h +++ b/arch/x86/include/asm/simd.h @@ -3,6 +3,7 @@ #define _ASM_X86_SIMD_H #include +#include /* * may_use_simd - whether it is allowable at this time to issue SIMD @@ -13,4 +14,22 @@ static __must_check inline bool may_use_simd(void) return irq_fpu_usable(); } +/** + * kernel_fpu_yield - yield the kernel FPU context if the scheduler wants the CPU + * + * Call this periodically during long loops between kernel_fpu_begin() + * and kernel_fpu_end() calls to avoid hogging the CPU if the + * scheduler wants to use the CPU for another thread. + * + * Return: none + */ +static inline void kernel_fpu_yield(void) +{ + if (need_resched()) { + kernel_fpu_end(); + cond_resched(); + kernel_fpu_begin(); + } +} + #endif /* _ASM_X86_SIMD_H */ From patchwork Mon Dec 19 22:02:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34800 Return-Path: Delivered-To: ouuuleilei@gmail.com 
Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2639543wrn; Mon, 19 Dec 2022 14:08:36 -0800 (PST) X-Google-Smtp-Source: AA0mqf6sBIqeZLCw0bOZy9809Sd95R5m1JchPKA1gxzKU/w/zLCMPammt/VrcK1t7OONW2J/59VE X-Received: by 2002:a05:6a00:2255:b0:578:3592:6eb7 with SMTP id i21-20020a056a00225500b0057835926eb7mr40495849pfu.25.1671487716286; Mon, 19 Dec 2022 14:08:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487716; cv=none; d=google.com; s=arc-20160816; b=PkYXI4+KUQL2irjUnjZHGcBhBiB7I/4KFbB+P6fRX7qBZZjr4SsyQyokp+4uKzX0b8 twR9+u5y6iaAS67SQTMg5u4PFbXUr/ct2OeNMFJdB1IivwxBllopEaxZRDIj1mbBQ6Zk PK9eqwxk4A8ykLerI79RJfcRSeZF1O+mjXrupsgqqPzOjcnazeMsyf+SPC1vXtTHo8Ij s8jzOZNPgbZb8/7qhBfweqH+VMefQFmnzpacWBMYIxtCXBERPfY/tZMrpBrgwYqUJjnQ dds0t2nEdCVi3jm06a3y/XBP2f7OjVO6Lm2278XUPFMya6sL5ZlQJKwU1ywvWgiLBSj2 QQYg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:content-transfer-encoding :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=z/RHuqgk6c33JWJbpAkTO+kVc65c62TXc795SLcxXos=; b=WZsTaIhos2H5YpWw77IhPKaRCKQCkWj3kDYVeQy6XSYGDPP7bpIoAU29K49Lrb0uh7 GMOdRuMO3RiiCTlJ12dhfCXOsWouzIe5a3G3bXGloLaMItJtoBlsDrlqLbgqAtTV538c TgMIR8awm0oWNecz8/9ZMrSvaLN1JcfJaOo1y63bJ/pa2tkHR6vkujRRGDQEer6W8BpT 6j11kjDw0DGoRA7Di7kLyMPSMj8njeTpNow0P3mhnfy4/rZ/zh27Z/EPeM2Xyqz7gqdm UKTnmmeDuAq2NoC6yHZ9kbZm5wqmkO3ZB/8H8XjbeI/5T74ZhzUJ2PBhKBzW+QZQNau1 5CWg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=XDIm92Eb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id h12-20020a056a00170c00b0057edc69e935si12669598pfc.247.2022.12.19.14.08.23; Mon, 19 Dec 2022 14:08:36 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=XDIm92Eb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232947AbiLSWEn (ORCPT + 99 others); Mon, 19 Dec 2022 17:04:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232885AbiLSWDf (ORCPT ); Mon, 19 Dec 2022 17:03:35 -0500 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 904AB140D4; Mon, 19 Dec 2022 14:03:20 -0800 (PST) Received: from pps.filterd (m0148664.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJLIRAa027251; Mon, 19 Dec 2022 22:02:49 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : mime-version; s=pps0720; bh=z/RHuqgk6c33JWJbpAkTO+kVc65c62TXc795SLcxXos=; b=XDIm92EbVglvc35vMp9ryw8mKvdIdK2A4WZ5BtwLCpSHIHdOM2ef1cJDc0MpKRNbQvI1 UBIlwwkzlyVzP38kQ+uywu1qZAQxWEuKQgee86BKIOpSO7a/jNdGD/iUCBO8GfTZzaXV ekrj9WNaDzP+LCUKoK7kIsF3M07LU0vSLU0qB2gw2uxTGyOPB9hfK8WjdEG39h95+NOF O/JkD2/WC/yq3PGSWfyHMhbs6blyGuriC8ltJT51XgZmuyCpdnTrRmtl2xI0ekogbTNH fI/L6LbYI7Kw6/S8sR78FpbUCkaUTd3OcGaFYi4VA/RffeGP45+US4Tue79pElaZbGZq YA== Received: from 
p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3mjyrar6gy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:02:48 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id DB234310BD; Mon, 19 Dec 2022 22:02:47 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 16A9F8052C3; Mon, 19 Dec 2022 22:02:47 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 03/13] crypto: x86/sha - yield FPU context during long loops Date: Mon, 19 Dec 2022 16:02:13 -0600 Message-Id: <20221219220223.3982176-4-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> X-Proofpoint-GUID: MsXbanzF6xtEWQ2FuoUBYmcrUNTB_Ceu X-Proofpoint-ORIG-GUID: MsXbanzF6xtEWQ2FuoUBYmcrUNTB_Ceu X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 lowpriorityscore=0 adultscore=0 suspectscore=0 spamscore=0 clxscore=1015 
mlxscore=0 impostorscore=0 malwarescore=0 bulkscore=0 mlxlogscore=999 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190194 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681903566057860?= X-GMAIL-MSGID: =?utf-8?q?1752681903566057860?= The x86 SIMD assembly language implementations process data between kernel_fpu_begin() and kernel_fpu_end() calls. That disables preemption, which prevents the scheduler from running other threads on the CPU core. The update() and finup() functions might be called to process large quantities of data, which can result in RCU stalls and soft lockups. Periodically check whether the kernel scheduler wants to run something else on the CPU. If so, yield the kernel FPU context and let the scheduler intervene. 
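The chunking idea described above can be sketched in user space as follows. This is only an illustration, not the code from this patch: fake_fpu_begin(), fake_fpu_end(), fake_fpu_yield(), fake_transform() and chunked_transform() are hypothetical stand-ins for kernel_fpu_begin(), kernel_fpu_end(), kernel_fpu_yield(), an assembly transform routine and a glue wrapper, and the 4096-byte limit per pass is assumed to match the limit used in the glue code.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 64                           /* SHA-1/SHA-256 block size in bytes */
#define MAX_BLOCKS_PER_PASS (4096 / BLOCK_SIZE) /* at most 4 KiB per pass */

/* Hypothetical user-space stand-ins for the kernel FPU helpers. */
static int fpu_depth;     /* 1 while inside a begin/end section */
static int yield_checks;  /* how often the loop offered to yield */

static void fake_fpu_begin(void) { fpu_depth++; }
static void fake_fpu_end(void)   { fpu_depth--; }
static void fake_fpu_yield(void) { yield_checks++; }

/* Hypothetical stand-in for an assembly transform routine. */
static size_t bytes_hashed;

static void fake_transform(const unsigned char *data, int blocks)
{
	(void)data;
	assert(fpu_depth == 1); /* must only run inside the FPU section */
	bytes_hashed += (size_t)blocks * BLOCK_SIZE;
}

/* Process the input in bounded chunks, offering to yield between chunks. */
static void chunked_transform(const unsigned char *data, int blocks)
{
	if (blocks <= 0)
		return;

	fake_fpu_begin();
	for (;;) {
		int chunk = blocks < MAX_BLOCKS_PER_PASS ? blocks : MAX_BLOCKS_PER_PASS;

		fake_transform(data, chunk);
		data += (size_t)chunk * BLOCK_SIZE;
		blocks -= chunk;

		if (blocks <= 0)
			break;

		/* Only reached when more data remains. */
		fake_fpu_yield();
	}
	fake_fpu_end();
}
```

With 130 blocks of input, the sketch runs three passes (64, 64 and 2 blocks) and offers to yield twice; no yield check is made after the final pass because the FPU section ends immediately afterwards.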
Fixes: 66be89515888 ("crypto: sha1 - SSSE3 based SHA1 implementation for x86-64") Fixes: 8275d1aa6422 ("crypto: sha256 - Create module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions.") Fixes: 87de4579f92d ("crypto: sha512 - Create module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions.") Fixes: aa031b8f702e ("crypto: x86/sha512 - load based on CPU features") Suggested-by: Herbert Xu Signed-off-by: Robert Elliott --- arch/x86/crypto/sha1_avx2_x86_64_asm.S | 6 +- arch/x86/crypto/sha1_ni_asm.S | 8 +- arch/x86/crypto/sha1_ssse3_glue.c | 120 ++++++++++++++++++++----- arch/x86/crypto/sha256_ni_asm.S | 8 +- arch/x86/crypto/sha256_ssse3_glue.c | 115 +++++++++++++++++++----- arch/x86/crypto/sha512_ssse3_glue.c | 89 ++++++++++++++---- 6 files changed, 277 insertions(+), 69 deletions(-) diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S index c3ee9334cb0f..df03fbb2c42c 100644 --- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S +++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S @@ -58,9 +58,9 @@ /* * SHA-1 implementation with Intel(R) AVX2 instruction set extensions. 
* - *This implementation is based on the previous SSSE3 release: - *Visit http://software.intel.com/en-us/articles/ - *and refer to improving-the-performance-of-the-secure-hash-algorithm-1/ + * This implementation is based on the previous SSSE3 release: + * Visit http://software.intel.com/en-us/articles/ + * and refer to improving-the-performance-of-the-secure-hash-algorithm-1/ * */ diff --git a/arch/x86/crypto/sha1_ni_asm.S b/arch/x86/crypto/sha1_ni_asm.S index a69595b033c8..d513b85e242c 100644 --- a/arch/x86/crypto/sha1_ni_asm.S +++ b/arch/x86/crypto/sha1_ni_asm.S @@ -75,7 +75,7 @@ .text /** - * sha1_ni_transform - Calculate SHA1 hash using the x86 SHA-NI feature set + * sha1_transform_ni - Calculate SHA1 hash using the x86 SHA-NI feature set * @digest: address of current 20-byte hash value (%rdi, DIGEST_PTR macro) * @data: address of data (%rsi, DATA_PTR macro); * data size must be a multiple of 64 bytes @@ -94,9 +94,9 @@ * The non-indented lines are instructions related to the message schedule. 
* * Return: none - * Prototype: asmlinkage void sha1_ni_transform(u32 *digest, const u8 *data, int blocks) + * Prototype: asmlinkage void sha1_transform_ni(u32 *digest, const u8 *data, int blocks) */ -SYM_TYPED_FUNC_START(sha1_ni_transform) +SYM_TYPED_FUNC_START(sha1_transform_ni) push %rbp mov %rsp, %rbp sub $FRAME_SIZE, %rsp @@ -294,7 +294,7 @@ SYM_TYPED_FUNC_START(sha1_ni_transform) pop %rbp RET -SYM_FUNC_END(sha1_ni_transform) +SYM_FUNC_END(sha1_transform_ni) .section .rodata.cst16.PSHUFFLE_BYTE_FLIP_MASK, "aM", @progbits, 16 .align 16 diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index 44340a1139e0..b269b455fbbe 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -41,9 +41,7 @@ static int sha1_update(struct shash_desc *desc, const u8 *data, */ BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0); - kernel_fpu_begin(); sha1_base_do_update(desc, data, len, sha1_xform); - kernel_fpu_end(); return 0; } @@ -54,28 +52,46 @@ static int sha1_finup(struct shash_desc *desc, const u8 *data, if (!crypto_simd_usable()) return crypto_sha1_finup(desc, data, len, out); - kernel_fpu_begin(); if (len) sha1_base_do_update(desc, data, len, sha1_xform); sha1_base_do_finalize(desc, sha1_xform); - kernel_fpu_end(); return sha1_base_finish(desc, out); } -asmlinkage void sha1_transform_ssse3(struct sha1_state *state, - const u8 *data, int blocks); +asmlinkage void sha1_transform_ssse3(u32 *digest, const u8 *data, int blocks); + +void __sha1_transform_ssse3(struct sha1_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA1_BLOCK_SIZE); + + sha1_transform_ssse3(state->state, data, chunks); + data += chunks * SHA1_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha1_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { 
- return sha1_update(desc, data, len, sha1_transform_ssse3); + return sha1_update(desc, data, len, __sha1_transform_ssse3); } static int sha1_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha1_finup(desc, data, len, out, sha1_transform_ssse3); + return sha1_finup(desc, data, len, out, __sha1_transform_ssse3); } /* Add padding and return the message digest. */ @@ -113,19 +129,39 @@ static void unregister_sha1_ssse3(void) crypto_unregister_shash(&sha1_ssse3_alg); } -asmlinkage void sha1_transform_avx(struct sha1_state *state, - const u8 *data, int blocks); +asmlinkage void sha1_transform_avx(u32 *digest, const u8 *data, int blocks); + +void __sha1_transform_avx(struct sha1_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA1_BLOCK_SIZE); + + sha1_transform_avx(state->state, data, chunks); + data += chunks * SHA1_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha1_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return sha1_update(desc, data, len, sha1_transform_avx); + return sha1_update(desc, data, len, __sha1_transform_avx); } static int sha1_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha1_finup(desc, data, len, out, sha1_transform_avx); + return sha1_finup(desc, data, len, out, __sha1_transform_avx); } static int sha1_avx_final(struct shash_desc *desc, u8 *out) @@ -175,8 +211,28 @@ static void unregister_sha1_avx(void) #define SHA1_AVX2_BLOCK_OPTSIZE 4 /* optimal 4*64 bytes of SHA1 blocks */ -asmlinkage void sha1_transform_avx2(struct sha1_state *state, - const u8 *data, int blocks); +asmlinkage void sha1_transform_avx2(u32 *digest, const u8 *data, int blocks); + +void __sha1_transform_avx2(struct sha1_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + 
return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA1_BLOCK_SIZE); + + sha1_transform_avx2(state->state, data, chunks); + data += chunks * SHA1_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static bool avx2_usable(void) { @@ -193,9 +249,9 @@ static void sha1_apply_transform_avx2(struct sha1_state *state, { /* Select the optimal transform based on data block size */ if (blocks >= SHA1_AVX2_BLOCK_OPTSIZE) - sha1_transform_avx2(state, data, blocks); + __sha1_transform_avx2(state, data, blocks); else - sha1_transform_avx(state, data, blocks); + __sha1_transform_avx(state, data, blocks); } static int sha1_avx2_update(struct shash_desc *desc, const u8 *data, @@ -245,19 +301,39 @@ static void unregister_sha1_avx2(void) } #ifdef CONFIG_AS_SHA1_NI -asmlinkage void sha1_ni_transform(struct sha1_state *digest, const u8 *data, - int rounds); +asmlinkage void sha1_transform_ni(u32 *digest, const u8 *data, int rounds); + +void __sha1_transform_ni(struct sha1_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA1_BLOCK_SIZE); + + sha1_transform_ni(state->state, data, chunks); + data += chunks * SHA1_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha1_ni_update(struct shash_desc *desc, const u8 *data, - unsigned int len) + unsigned int len) { - return sha1_update(desc, data, len, sha1_ni_transform); + return sha1_update(desc, data, len, __sha1_transform_ni); } static int sha1_ni_finup(struct shash_desc *desc, const u8 *data, - unsigned int len, u8 *out) + unsigned int len, u8 *out) { - return sha1_finup(desc, data, len, out, sha1_ni_transform); + return sha1_finup(desc, data, len, out, __sha1_transform_ni); } static int sha1_ni_final(struct shash_desc *desc, u8 *out) diff --git 
a/arch/x86/crypto/sha256_ni_asm.S b/arch/x86/crypto/sha256_ni_asm.S index e7a3b9939327..29458ec970a9 100644 --- a/arch/x86/crypto/sha256_ni_asm.S +++ b/arch/x86/crypto/sha256_ni_asm.S @@ -79,7 +79,7 @@ .text /** - * sha256_ni_transform - Calculate SHA256 hash using the x86 SHA-NI feature set + * sha256_transform_ni - Calculate SHA256 hash using the x86 SHA-NI feature set * @digest: address of current 32-byte hash value (%rdi, DIGEST_PTR macro) * @data: address of data (%rsi, DATA_PTR macro); * data size must be a multiple of 64 bytes @@ -98,9 +98,9 @@ * The non-indented lines are instructions related to the message schedule. * * Return: none - * Prototype: asmlinkage void sha256_ni_transform(u32 *digest, const u8 *data, int blocks) + * Prototype: asmlinkage void sha256_transform_ni(u32 *digest, const u8 *data, int blocks) */ -SYM_TYPED_FUNC_START(sha256_ni_transform) +SYM_TYPED_FUNC_START(sha256_transform_ni) shl $6, NUM_BLKS /* convert to bytes */ jz .Ldone_hash add DATA_PTR, NUM_BLKS /* pointer to end of data */ @@ -329,7 +329,7 @@ SYM_TYPED_FUNC_START(sha256_ni_transform) .Ldone_hash: RET -SYM_FUNC_END(sha256_ni_transform) +SYM_FUNC_END(sha256_transform_ni) .section .rodata.cst256.K256, "aM", @progbits, 256 .align 64 diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 3a5f6be7dbba..43927cf3d06e 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -40,8 +40,28 @@ #include #include -asmlinkage void sha256_transform_ssse3(struct sha256_state *state, - const u8 *data, int blocks); +asmlinkage void sha256_transform_ssse3(u32 *digest, const u8 *data, int blocks); + +void __sha256_transform_ssse3(struct sha256_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA256_BLOCK_SIZE); + + sha256_transform_ssse3(state->state, data, chunks); + data += chunks * SHA256_BLOCK_SIZE; + blocks -= 
chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int _sha256_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha256_block_fn *sha256_xform) @@ -58,9 +78,7 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data, */ BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0); - kernel_fpu_begin(); sha256_base_do_update(desc, data, len, sha256_xform); - kernel_fpu_end(); return 0; } @@ -71,11 +89,9 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data, if (!crypto_simd_usable()) return crypto_sha256_finup(desc, data, len, out); - kernel_fpu_begin(); if (len) sha256_base_do_update(desc, data, len, sha256_xform); sha256_base_do_finalize(desc, sha256_xform); - kernel_fpu_end(); return sha256_base_finish(desc, out); } @@ -83,13 +99,13 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data, static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return _sha256_update(desc, data, len, sha256_transform_ssse3); + return _sha256_update(desc, data, len, __sha256_transform_ssse3); } static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha256_finup(desc, data, len, out, sha256_transform_ssse3); + return sha256_finup(desc, data, len, out, __sha256_transform_ssse3); } /* Add padding and return the message digest. 
*/ @@ -143,19 +159,39 @@ static void unregister_sha256_ssse3(void) ARRAY_SIZE(sha256_ssse3_algs)); } -asmlinkage void sha256_transform_avx(struct sha256_state *state, - const u8 *data, int blocks); +asmlinkage void sha256_transform_avx(u32 *digest, const u8 *data, int blocks); + +void __sha256_transform_avx(struct sha256_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA256_BLOCK_SIZE); + + sha256_transform_avx(state->state, data, chunks); + data += chunks * SHA256_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha256_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return _sha256_update(desc, data, len, sha256_transform_avx); + return _sha256_update(desc, data, len, __sha256_transform_avx); } static int sha256_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha256_finup(desc, data, len, out, sha256_transform_avx); + return sha256_finup(desc, data, len, out, __sha256_transform_avx); } static int sha256_avx_final(struct shash_desc *desc, u8 *out) @@ -219,19 +255,39 @@ static void unregister_sha256_avx(void) ARRAY_SIZE(sha256_avx_algs)); } -asmlinkage void sha256_transform_rorx(struct sha256_state *state, - const u8 *data, int blocks); +asmlinkage void sha256_transform_rorx(u32 *state, const u8 *data, int blocks); + +void __sha256_transform_avx2(struct sha256_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA256_BLOCK_SIZE); + + sha256_transform_rorx(state->state, data, chunks); + data += chunks * SHA256_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha256_avx2_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return 
_sha256_update(desc, data, len, sha256_transform_rorx); + return _sha256_update(desc, data, len, __sha256_transform_avx2); } static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha256_finup(desc, data, len, out, sha256_transform_rorx); + return sha256_finup(desc, data, len, out, __sha256_transform_avx2); } static int sha256_avx2_final(struct shash_desc *desc, u8 *out) @@ -294,19 +350,38 @@ static void unregister_sha256_avx2(void) } #ifdef CONFIG_AS_SHA256_NI -asmlinkage void sha256_ni_transform(struct sha256_state *digest, - const u8 *data, int rounds); +asmlinkage void sha256_transform_ni(u32 *digest, const u8 *data, int rounds); + +void __sha256_transform_ni(struct sha256_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA256_BLOCK_SIZE); + sha256_transform_ni(state->state, data, chunks); + data += chunks * SHA256_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha256_ni_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return _sha256_update(desc, data, len, sha256_ni_transform); + return _sha256_update(desc, data, len, __sha256_transform_ni); } static int sha256_ni_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha256_finup(desc, data, len, out, sha256_ni_transform); + return sha256_finup(desc, data, len, out, __sha256_transform_ni); } static int sha256_ni_final(struct shash_desc *desc, u8 *out) diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c index 6d3b85e53d0e..cb6aad9d5052 100644 --- a/arch/x86/crypto/sha512_ssse3_glue.c +++ b/arch/x86/crypto/sha512_ssse3_glue.c @@ -39,8 +39,28 @@ #include #include -asmlinkage void sha512_transform_ssse3(struct sha512_state *state, - const u8 *data, int blocks); +asmlinkage void 
sha512_transform_ssse3(u64 *digest, const u8 *data, int blocks); + +void __sha512_transform_ssse3(struct sha512_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA512_BLOCK_SIZE); + + sha512_transform_ssse3(&state->state[0], data, chunks); + data += chunks * SHA512_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha512_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha512_block_fn *sha512_xform) @@ -57,9 +77,7 @@ static int sha512_update(struct shash_desc *desc, const u8 *data, */ BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0); - kernel_fpu_begin(); sha512_base_do_update(desc, data, len, sha512_xform); - kernel_fpu_end(); return 0; } @@ -70,11 +88,9 @@ static int sha512_finup(struct shash_desc *desc, const u8 *data, if (!crypto_simd_usable()) return crypto_sha512_finup(desc, data, len, out); - kernel_fpu_begin(); if (len) sha512_base_do_update(desc, data, len, sha512_xform); sha512_base_do_finalize(desc, sha512_xform); - kernel_fpu_end(); return sha512_base_finish(desc, out); } @@ -82,13 +98,13 @@ static int sha512_finup(struct shash_desc *desc, const u8 *data, static int sha512_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return sha512_update(desc, data, len, sha512_transform_ssse3); + return sha512_update(desc, data, len, __sha512_transform_ssse3); } static int sha512_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha512_finup(desc, data, len, out, sha512_transform_ssse3); + return sha512_finup(desc, data, len, out, __sha512_transform_ssse3); } /* Add padding and return the message digest. 
*/ @@ -142,8 +158,29 @@ static void unregister_sha512_ssse3(void) ARRAY_SIZE(sha512_ssse3_algs)); } -asmlinkage void sha512_transform_avx(struct sha512_state *state, - const u8 *data, int blocks); +asmlinkage void sha512_transform_avx(u64 *digest, const u8 *data, int blocks); + +void __sha512_transform_avx(struct sha512_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA512_BLOCK_SIZE); + + sha512_transform_avx(state->state, data, chunks); + data += chunks * SHA512_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} + static bool avx_usable(void) { if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { @@ -158,13 +195,13 @@ static bool avx_usable(void) static int sha512_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return sha512_update(desc, data, len, sha512_transform_avx); + return sha512_update(desc, data, len, __sha512_transform_avx); } static int sha512_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha512_finup(desc, data, len, out, sha512_transform_avx); + return sha512_finup(desc, data, len, out, __sha512_transform_avx); } /* Add padding and return the message digest. 
*/ @@ -218,19 +255,39 @@ static void unregister_sha512_avx(void) ARRAY_SIZE(sha512_avx_algs)); } -asmlinkage void sha512_transform_rorx(struct sha512_state *state, - const u8 *data, int blocks); +asmlinkage void sha512_transform_rorx(u64 *digest, const u8 *data, int blocks); + +void __sha512_transform_avx2(struct sha512_state *state, const u8 *data, int blocks) +{ + if (blocks <= 0) + return; + + kernel_fpu_begin(); + for (;;) { + const int chunks = min(blocks, 4096 / SHA512_BLOCK_SIZE); + + sha512_transform_rorx(state->state, data, chunks); + data += chunks * SHA512_BLOCK_SIZE; + blocks -= chunks; + + if (blocks <= 0) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); +} static int sha512_avx2_update(struct shash_desc *desc, const u8 *data, unsigned int len) { - return sha512_update(desc, data, len, sha512_transform_rorx); + return sha512_update(desc, data, len, __sha512_transform_avx2); } static int sha512_avx2_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return sha512_finup(desc, data, len, out, sha512_transform_rorx); + return sha512_finup(desc, data, len, out, __sha512_transform_avx2); } /* Add padding and return the message digest. 
*/

From patchwork Mon Dec 19 22:02:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 34795
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott
Subject: [PATCH 04/13] crypto: x86/crc - yield FPU context during long loops
Date: Mon, 19 Dec 2022 16:02:14 -0600
Message-Id: <20221219220223.3982176-5-elliott@hpe.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com>
References: <20221219220223.3982176-1-elliott@hpe.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The x86 assembly language implementations using SIMD process data between kernel_fpu_begin() and kernel_fpu_end() calls. That disables scheduler preemption, which prevents the CPU core from being used by other threads.

The update() and finup() functions might be called to process large quantities of data, which can result in RCU stalls and soft lockups.

Periodically check if the kernel scheduler wants to run something else on the CPU. If so, yield the kernel FPU context and let the scheduler intervene.

For crc32c, add a pre-alignment step so the assembly language function is not repeatedly called with an unaligned starting address.
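The chunk-and-yield pattern and the pre-alignment step described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patch: the three `fpu_*_stub()` functions are hypothetical stand-ins for kernel_fpu_begin(), kernel_fpu_yield() and kernel_fpu_end(), and `sum_generic()` / `sum_simd_stub()` are made-up placeholders (a plain byte sum) for the table-based and accelerated CRC routines. The 4096-byte chunk size and the 16-byte alignment are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_BYTES 4096U /* bytes processed between yield points, as in the patch */
#define ALIGN_BYTES 16U   /* alignment assumed to suit the accelerated routine */

static unsigned int yield_count; /* counts simulated yield points */

/* Hypothetical stand-ins for kernel_fpu_begin/yield/end (assumption). */
static void fpu_begin_stub(void) { }
static void fpu_yield_stub(void) { yield_count++; }
static void fpu_end_stub(void) { }

/* Placeholder for the generic (non-SIMD) routine: a simple byte sum. */
static uint32_t sum_generic(uint32_t acc, const uint8_t *p, size_t len)
{
	while (len--)
		acc += *p++;
	return acc;
}

/* Placeholder for the accelerated routine; same result as the generic one. */
static uint32_t sum_simd_stub(uint32_t acc, const uint8_t *p, size_t len)
{
	return sum_generic(acc, p, len);
}

static uint32_t sum_chunked(uint32_t acc, const uint8_t *p, size_t len)
{
	size_t misalign = (uintptr_t)p & (ALIGN_BYTES - 1);

	/* Handle the unaligned prefix once, outside the FPU section. */
	if (misalign) {
		size_t prealign = ALIGN_BYTES - misalign;

		if (prealign > len)
			prealign = len;
		acc = sum_generic(acc, p, prealign);
		p += prealign;
		len -= prealign;
	}

	if (!len)
		return acc;

	fpu_begin_stub();
	for (;;) {
		size_t chunk = len < CHUNK_BYTES ? len : CHUNK_BYTES;

		acc = sum_simd_stub(acc, p, chunk);
		p += chunk;
		len -= chunk;

		if (!len)
			break;

		/* Between chunks, give the scheduler a chance to run. */
		fpu_yield_stub();
	}
	fpu_end_stub();

	return acc;
}
```

Calling `sum_chunked(0, buf, len)` returns the same value as `sum_generic(0, buf, len)`. The only difference is that the work inside the begin/end section is bounded to 4096 bytes per pass, with one yield point between consecutive passes.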
Fixes: 78c37d191dd6 ("crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation") Fixes: 6a8ce1ef3940 ("crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction") Fixes: 0b95a7f85718 ("crypto: crct10dif - Glue code to cast accelerated CRCT10DIF assembly as a crypto transform") Suggested-by: Herbert Xu Signed-off-by: Robert Elliott --- arch/x86/crypto/crc32-pclmul_glue.c | 49 +++++----- arch/x86/crypto/crc32c-intel_glue.c | 118 +++++++++++++++++------- arch/x86/crypto/crct10dif-pclmul_glue.c | 65 ++++++++++--- 3 files changed, 165 insertions(+), 67 deletions(-) diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index 98cf3b4e4c9f..3692b50faf1c 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -41,41 +41,50 @@ #define CHKSUM_BLOCK_SIZE 1 #define CHKSUM_DIGEST_SIZE 4 -#define PCLMUL_MIN_LEN 64L /* minimum size of buffer - * for crc32_pclmul_le_16 */ -#define SCALE_F 16L /* size of xmm register */ +#define PCLMUL_MIN_LEN 64U /* minimum size of buffer for crc32_pclmul_le_16 */ +#define SCALE_F 16U /* size of xmm register */ #define SCALE_F_MASK (SCALE_F - 1) -u32 crc32_pclmul_le_16(unsigned char const *buffer, size_t len, u32 crc32); +asmlinkage u32 crc32_pclmul_le_16(const u8 *buffer, unsigned int len, u32 crc32); -static u32 __attribute__((pure)) - crc32_pclmul_le(u32 crc, unsigned char const *p, size_t len) +static u32 crc32_pclmul_le(u32 crc, const u8 *p, unsigned int len) { unsigned int iquotient; unsigned int iremainder; - unsigned int prealign; if (len < PCLMUL_MIN_LEN + SCALE_F_MASK || !crypto_simd_usable()) return crc32_le(crc, p, len); - if ((long)p & SCALE_F_MASK) { + if ((unsigned long)p & SCALE_F_MASK) { /* align p to 16 byte */ - prealign = SCALE_F - ((long)p & SCALE_F_MASK); + unsigned int prealign = SCALE_F - ((unsigned long)p & SCALE_F_MASK); crc = crc32_le(crc, p, prealign); len -= prealign; - p = (unsigned char *)(((unsigned 
long)p + SCALE_F_MASK) & - ~SCALE_F_MASK); + p += prealign; } - iquotient = len & (~SCALE_F_MASK); + iquotient = len & ~SCALE_F_MASK; iremainder = len & SCALE_F_MASK; - kernel_fpu_begin(); - crc = crc32_pclmul_le_16(p, iquotient, crc); - kernel_fpu_end(); + if (iquotient) { + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(iquotient, 4096U); - if (iremainder) - crc = crc32_le(crc, p + iquotient, iremainder); + crc = crc32_pclmul_le_16(p, chunk, crc); + iquotient -= chunk; + p += chunk; + + if (iquotient < PCLMUL_MIN_LEN) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); + } + + if (iquotient || iremainder) + crc = crc32_le(crc, p, iquotient + iremainder); return crc; } @@ -120,8 +129,7 @@ static int crc32_pclmul_update(struct shash_desc *desc, const u8 *data, } /* No final XOR 0xFFFFFFFF, like crc32_le */ -static int __crc32_pclmul_finup(u32 *crcp, const u8 *data, unsigned int len, - u8 *out) +static int __crc32_pclmul_finup(u32 *crcp, const u8 *data, unsigned int len, u8 *out) { *(__le32 *)out = cpu_to_le32(crc32_pclmul_le(*crcp, data, len)); return 0; @@ -144,8 +152,7 @@ static int crc32_pclmul_final(struct shash_desc *desc, u8 *out) static int crc32_pclmul_digest(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return __crc32_pclmul_finup(crypto_shash_ctx(desc->tfm), data, len, - out); + return __crc32_pclmul_finup(crypto_shash_ctx(desc->tfm), data, len, out); } static struct shash_alg alg = { diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c index feccb5254c7e..932574661ef5 100644 --- a/arch/x86/crypto/crc32c-intel_glue.c +++ b/arch/x86/crypto/crc32c-intel_glue.c @@ -35,19 +35,24 @@ #ifdef CONFIG_X86_64 /* - * use carryless multiply version of crc32c when buffer - * size is >= 512 to account - * for fpu state save/restore overhead. + * only use crc_pcl() (carryless multiply version of crc32c) when buffer + * size is >= 512 to account for fpu state save/restore overhead. 
*/ #define CRC32C_PCL_BREAKEVEN 512 -asmlinkage unsigned int crc_pcl(const u8 *buffer, int len, - unsigned int crc_init); +/* + * only pass aligned buffers to crc_pcl() to avoid special handling + * in each pass + */ +#define ALIGN_CRCPCL 16U +#define ALIGN_CRCPCL_MASK (ALIGN_CRCPCL - 1) + +asmlinkage u32 crc_pcl(const u8 *buffer, u64 len, u32 crc_init); #endif /* CONFIG_X86_64 */ -static u32 crc32c_intel_le_hw_byte(u32 crc, unsigned char const *data, size_t length) +static u32 crc32c_intel_le_hw_byte(u32 crc, const u8 *data, unsigned int len) { - while (length--) { + while (len--) { asm("crc32b %1, %0" : "+r" (crc) : "rm" (*data)); data++; @@ -56,7 +61,7 @@ static u32 crc32c_intel_le_hw_byte(u32 crc, unsigned char const *data, size_t le return crc; } -static u32 __pure crc32c_intel_le_hw(u32 crc, unsigned char const *p, size_t len) +static u32 __pure crc32c_intel_le_hw(u32 crc, const u8 *p, unsigned int len) { unsigned int iquotient = len / SCALE_F; unsigned int iremainder = len % SCALE_F; @@ -69,8 +74,7 @@ static u32 __pure crc32c_intel_le_hw(u32 crc, unsigned char const *p, size_t len } if (iremainder) - crc = crc32c_intel_le_hw_byte(crc, (unsigned char *)ptmp, - iremainder); + crc = crc32c_intel_le_hw_byte(crc, (u8 *)ptmp, iremainder); return crc; } @@ -110,8 +114,8 @@ static int crc32c_intel_update(struct shash_desc *desc, const u8 *data, return 0; } -static int __crc32c_intel_finup(u32 *crcp, const u8 *data, unsigned int len, - u8 *out) +static int __crc32c_intel_finup(const u32 *crcp, const u8 *data, + unsigned int len, u8 *out) { *(__le32 *)out = ~cpu_to_le32(crc32c_intel_le_hw(*crcp, data, len)); return 0; @@ -134,8 +138,7 @@ static int crc32c_intel_final(struct shash_desc *desc, u8 *out) static int crc32c_intel_digest(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { - return __crc32c_intel_finup(crypto_shash_ctx(desc->tfm), data, len, - out); + return __crc32c_intel_finup(crypto_shash_ctx(desc->tfm), data, len, out); } static int 
crc32c_intel_cra_init(struct crypto_tfm *tfm) @@ -149,47 +152,96 @@ static int crc32c_intel_cra_init(struct crypto_tfm *tfm) #ifdef CONFIG_X86_64 static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data, - unsigned int len) + unsigned int len) { u32 *crcp = shash_desc_ctx(desc); + u32 crc; + + BUILD_BUG_ON(CRC32C_PCL_BREAKEVEN > 4096U); /* * use faster PCL version if datasize is large enough to * overcome kernel fpu state save/restore overhead */ - if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) { - kernel_fpu_begin(); - *crcp = crc_pcl(data, len, *crcp); - kernel_fpu_end(); - } else + if (len < CRC32C_PCL_BREAKEVEN + ALIGN_CRCPCL_MASK || !crypto_simd_usable()) { *crcp = crc32c_intel_le_hw(*crcp, data, len); + return 0; + } + + crc = *crcp; + /* + * Although crc_pcl() supports unaligned buffers, it is more efficient + * handling a 16-byte aligned buffer. + */ + if ((unsigned long)data & ALIGN_CRCPCL_MASK) { + unsigned int prealign = ALIGN_CRCPCL - ((unsigned long)data & ALIGN_CRCPCL_MASK); + + crc = crc32c_intel_le_hw(crc, data, prealign); + len -= prealign; + data += prealign; + } + + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(len, 4096U); + + crc = crc_pcl(data, chunk, crc); + len -= chunk; + + if (!len) + break; + + data += chunk; + kernel_fpu_yield(); + } + kernel_fpu_end(); + + *crcp = crc; return 0; } -static int __crc32c_pcl_intel_finup(u32 *crcp, const u8 *data, unsigned int len, - u8 *out) +static int __crc32c_pcl_intel_finup(const u32 *crcp, const u8 *data, + unsigned int len, u8 *out) { - if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) { - kernel_fpu_begin(); - *(__le32 *)out = ~cpu_to_le32(crc_pcl(data, len, *crcp)); - kernel_fpu_end(); - } else - *(__le32 *)out = - ~cpu_to_le32(crc32c_intel_le_hw(*crcp, data, len)); + u32 crc; + + BUILD_BUG_ON(CRC32C_PCL_BREAKEVEN > 4096U); + + if (len < CRC32C_PCL_BREAKEVEN || !crypto_simd_usable()) { + *(__le32 *)out = ~cpu_to_le32(crc32c_intel_le_hw(*crcp, 
data, len)); + return 0; + } + + crc = *crcp; + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(len, 4096U); + + crc = crc_pcl(data, chunk, crc); + len -= chunk; + + if (!len) + break; + + data += chunk; + kernel_fpu_yield(); + } + kernel_fpu_end(); + + *(__le32 *)out = ~cpu_to_le32(crc); return 0; } static int crc32c_pcl_intel_finup(struct shash_desc *desc, const u8 *data, - unsigned int len, u8 *out) + unsigned int len, u8 *out) { return __crc32c_pcl_intel_finup(shash_desc_ctx(desc), data, len, out); } static int crc32c_pcl_intel_digest(struct shash_desc *desc, const u8 *data, - unsigned int len, u8 *out) + unsigned int len, u8 *out) { - return __crc32c_pcl_intel_finup(crypto_shash_ctx(desc->tfm), data, len, - out); + return __crc32c_pcl_intel_finup(crypto_shash_ctx(desc->tfm), data, len, out); } #endif /* CONFIG_X86_64 */ diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c index 71291d5af9f4..4d39eac94289 100644 --- a/arch/x86/crypto/crct10dif-pclmul_glue.c +++ b/arch/x86/crypto/crct10dif-pclmul_glue.c @@ -34,6 +34,8 @@ #include #include +#define PCLMUL_MIN_LEN 16U /* minimum size of buffer for crc_t10dif_pcl */ + asmlinkage u16 crc_t10dif_pcl(u16 init_crc, const u8 *buf, size_t len); struct chksum_desc_ctx { @@ -49,17 +51,36 @@ static int chksum_init(struct shash_desc *desc) return 0; } -static int chksum_update(struct shash_desc *desc, const u8 *data, - unsigned int length) +static int chksum_update(struct shash_desc *desc, const u8 *data, unsigned int len) { struct chksum_desc_ctx *ctx = shash_desc_ctx(desc); + u16 crc; + + if (len < PCLMUL_MIN_LEN || !crypto_simd_usable()) { + ctx->crc = crc_t10dif_generic(ctx->crc, data, len); + return 0; + } + + crc = ctx->crc; + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(len, 4096U); + + crc = crc_t10dif_pcl(crc, data, chunk); + len -= chunk; + data += chunk; + + if (len < PCLMUL_MIN_LEN) + break; + + kernel_fpu_yield(); + } + 
kernel_fpu_end(); + + if (len) + crc = crc_t10dif_generic(crc, data, len); - if (length >= 16 && crypto_simd_usable()) { - kernel_fpu_begin(); - ctx->crc = crc_t10dif_pcl(ctx->crc, data, length); - kernel_fpu_end(); - } else - ctx->crc = crc_t10dif_generic(ctx->crc, data, length); + ctx->crc = crc; return 0; } @@ -73,12 +94,30 @@ static int chksum_final(struct shash_desc *desc, u8 *out) static int __chksum_finup(__u16 crc, const u8 *data, unsigned int len, u8 *out) { - if (len >= 16 && crypto_simd_usable()) { - kernel_fpu_begin(); - *(__u16 *)out = crc_t10dif_pcl(crc, data, len); - kernel_fpu_end(); - } else + if (len < PCLMUL_MIN_LEN || !crypto_simd_usable()) { *(__u16 *)out = crc_t10dif_generic(crc, data, len); + return 0; + } + + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(len, 4096U); + + crc = crc_t10dif_pcl(crc, data, chunk); + len -= chunk; + data += chunk; + + if (len < PCLMUL_MIN_LEN) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); + + if (len) + crc = crc_t10dif_generic(crc, data, len); + + *(__u16 *)out = crc; return 0; }

From patchwork Mon Dec 19 22:02:15 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 34799
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott
Subject: [PATCH 05/13] crypto: x86/sm3 - yield FPU context during long loops
Date: Mon, 19 Dec 2022 16:02:15 -0600
Message-Id: <20221219220223.3982176-6-elliott@hpe.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com>
References: <20221219220223.3982176-1-elliott@hpe.com>
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 suspectscore=0 malwarescore=0 bulkscore=0 phishscore=0 adultscore=0 clxscore=1015 impostorscore=0 mlxlogscore=999 mlxscore=0
spamscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190194 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681879631529017?= X-GMAIL-MSGID: =?utf-8?q?1752681879631529017?=

The x86 assembly language implementations using SIMD process data between kernel_fpu_begin() and kernel_fpu_end() calls. That disables scheduler preemption, which prevents the CPU core from being used by other threads.

The update() and finup() functions might be called to process large quantities of data, which can result in RCU stalls and soft lockups.

Periodically check if the kernel scheduler wants to run something else on the CPU. If so, yield the kernel FPU context and let the scheduler intervene.

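The loop shape this patch uses can be tried outside the kernel. The following is only a user-space sketch under stated assumptions: process() and fpu_yield_stub() are hypothetical stand-ins for sm3_base_do_update() and kernel_fpu_yield(), both of which are defined outside this excerpt, and the counters exist purely so the behaviour can be observed. The 4096-byte limit is the value used in the diff below.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_MAX 4096U	/* same per-iteration limit as the patch */

static unsigned int calls;	/* number of process() invocations */
static unsigned int yields;	/* number of yield points reached */
static size_t total;		/* total bytes handed to process() */

/* Hypothetical stand-in for sm3_base_do_update(); only counts work. */
static void process(const unsigned char *data, unsigned int len)
{
	(void)data;
	calls++;
	total += len;
}

/* Hypothetical stand-in for kernel_fpu_yield(); only counts calls. */
static void fpu_yield_stub(void)
{
	yields++;
}

/*
 * Process len bytes in chunks of at most CHUNK_MAX bytes, offering
 * to yield between chunks but not after the final one.
 */
static void update_chunked(const unsigned char *data, unsigned int len)
{
	for (;;) {
		const unsigned int chunk = len < CHUNK_MAX ? len : CHUNK_MAX;

		assert(chunk <= CHUNK_MAX);
		process(data, chunk);

		len -= chunk;
		if (!len)
			break;

		data += chunk;
		fpu_yield_stub();
	}
}
```

With a 10000-byte buffer this sketch makes three process() calls (4096, 4096 and 1808 bytes) and reaches the yield point twice, so no yield is attempted once the data is exhausted.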
Fixes: 930ab34d906d ("crypto: x86/sm3 - add AVX assembly implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sm3_avx_glue.c | 34 +++++++++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c
index 661b6f22ffcd..9e4b21c0e748 100644
--- a/arch/x86/crypto/sm3_avx_glue.c
+++ b/arch/x86/crypto/sm3_avx_glue.c
@@ -25,8 +25,7 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 {
 	struct sm3_state *sctx = shash_desc_ctx(desc);
 
-	if (!crypto_simd_usable() ||
-	   (sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) {
+	if (((sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) || !crypto_simd_usable()) {
 		sm3_update(sctx, data, len);
 		return 0;
 	}
@@ -38,7 +37,19 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 	BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
 
 	kernel_fpu_begin();
-	sm3_base_do_update(desc, data, len, sm3_transform_avx);
+	for (;;) {
+		const unsigned int chunk = min(len, 4096U);
+
+		sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+
+		len -= chunk;
+
+		if (!len)
+			break;
+
+		data += chunk;
+		kernel_fpu_yield();
+	}
 	kernel_fpu_end();
 
 	return 0;
@@ -58,8 +69,21 @@ static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
 	}
 
 	kernel_fpu_begin();
-	if (len)
-		sm3_base_do_update(desc, data, len, sm3_transform_avx);
+	if (len) {
+		for (;;) {
+			const unsigned int chunk = min(len, 4096U);
+
+			sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+			len -= chunk;
+
+			if (!len)
+				break;
+
+			data += chunk;
+			kernel_fpu_yield();
+		}
+	}
+
 	sm3_base_do_finalize(desc, sm3_transform_avx);
 	kernel_fpu_end();
 

From patchwork Mon Dec 19 22:02:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34796 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by
2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638950wrn; Mon, 19 Dec 2022 14:07:17 -0800 (PST) X-Google-Smtp-Source: AMrXdXsOM0MtEmtjTwczrnQtwMxQXE5pWkTmt5QA2LeO9nE7X15pxrqpshzvvDQQRKxOEuG6RG7P X-Received: by 2002:a05:6a21:595:b0:a5:7700:2a4a with SMTP id lw21-20020a056a21059500b000a577002a4amr12319698pzb.51.1671487636993; Mon, 19 Dec 2022 14:07:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487636; cv=none; d=google.com; s=arc-20160816; b=FjhYIYGj9flOhCIo+Z/aeAx2FZtkLqpksL2wTwCeX6uMALVHjPsQNnArhRJU41tJnl MRQ42cBiOn/MktGFuL8lTIAm65h4pUaNtmrgzUN1qhBDtD8oBS9CcL0gGlm/Gaw+Hh+B W7sB7FpSis48j5prK4wk9Nnk+tWhWVUHYMJ/Kp4sUUQzJEqNMcXn+tYCVjiEuNrUc/T5 o/q/GAUaC0/nRK4FmrDWvVYZpRYu/WLoA1mbNh6yYv08kKNH05ab4fiV6ormrnz363T3 uYOjQdiM1wQLzo0fa0UmUe6VXX71S7yltHYXZIWxpyKZwH6FKTFLF8cfaGkhGDQVkXDt PDbQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=AEloZssES2RC7BH2BiqVkRNNWZvMUr3nP3cZeI400jg=; b=Wy8zx3OwCxPv2k+JjEjrWwhqwBOVi39s55qMh5u5lKHA+0C4fe4DFPU2xBorLzd80F ZT0S3ZXq+LADIjvvxjmSCNOxDRc45scrD8LsyLZjExti4W9Yv5oYN1ilgEjermI9wEFN c2jSHYWYzfmeo9b9cuXS26Q1YDKVLn8HFJsKayZyH4rDISR/jDdPCFFVYIJrAU7Xnuma 4MmC8UkH5Z73XvdOhTKsNXaqoKPoI6gKJdT1eoqWIP1Kx7B6hQJc33MmAXJg6fcGGS+1 B9yYzFujS35k1fce2+Tna+kvAgQZ3HKhk5xnbzCzlhRLamqNtPDfE6XnxOFS6+/gMxVB 3VxQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=cAbaJLlr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id f21-20020a635555000000b004750a252941si12472283pgm.774.2022.12.19.14.07.04; Mon, 19 Dec 2022 14:07:16 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=cAbaJLlr; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232432AbiLSWEY (ORCPT + 99 others); Mon, 19 Dec 2022 17:04:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60982 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232883AbiLSWDf (ORCPT ); Mon, 19 Dec 2022 17:03:35 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B140140F8; Mon, 19 Dec 2022 14:03:22 -0800 (PST) Received: from pps.filterd (m0150241.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJKW2xn028221; Mon, 19 Dec 2022 22:02:54 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=AEloZssES2RC7BH2BiqVkRNNWZvMUr3nP3cZeI400jg=; b=cAbaJLlrHwY+t2yJT1Q1k41/eKiApkIZBwnKcdK3YSq0J8bem9gPVeNq07fTUbM4UY23 QG6eQO3MKwq+E+s20Qp0ZJig3POddIlluStUgdYoMQxRnDn+eoVKqAsWhitkXVRPVpLd Fkal101DFUx+grt3BpN0HarCpfevjef0ZCtvneWpTntgD7LMt39D8dFfGRFuNG7p7GdV DQkrzFhSMzAlSof4dAxCHoq+pXUvYfEtbTkFVE2EPzmo8sJVNjs4MwBb7ydPGMi4u3bs ih2VpcZi/P8LhwF4Qxc7VXffzWHiebuiuDuUPKoVWuRyBRMYQtn9GidvuqzdxPH7Ijka WQ== Received: from 
p1lg14881.it.hpe.com (p1lg14881.it.hpe.com [16.230.97.202]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3mjx3b10xe-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:02:54 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14881.it.hpe.com (Postfix) with ESMTPS id F270C801722; Mon, 19 Dec 2022 22:02:52 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 31411805E9F; Mon, 19 Dec 2022 22:02:52 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 06/13] crypto: x86/ghash - use u8 rather than char Date: Mon, 19 Dec 2022 16:02:16 -0600 Message-Id: <20221219220223.3982176-7-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: W0daaAMd-lf2JY2jF3r3OO2XyX_3vs9G X-Proofpoint-GUID: W0daaAMd-lf2JY2jF3r3OO2XyX_3vs9G X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 spamscore=0 clxscore=1015 impostorscore=0 priorityscore=1501 bulkscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 suspectscore=0 
malwarescore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681820079898894?= X-GMAIL-MSGID: =?utf-8?q?1752681820079898894?=

Use consistent, unambiguous types (u8 rather than char) for the source and destination buffer pointer arguments of the asm functions. Declare their prototypes with "asmlinkage" as well.

Signed-off-by: Robert Elliott --- arch/x86/crypto/ghash-clmulni-intel_asm.S | 6 +++--- arch/x86/crypto/ghash-clmulni-intel_glue.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S index 09cf9271b83a..ad860836f75b 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_asm.S +++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S @@ -96,7 +96,7 @@ SYM_FUNC_END(__clmul_gf128mul_ble) * This supports 64-bit CPUs. * * Return: none (but @dst is updated) - * Prototype: asmlinkage void clmul_ghash_mul(char *dst, const u128 *shash) + * Prototype: asmlinkage void clmul_ghash_mul(u8 *dst, const u128 *shash) */ SYM_FUNC_START(clmul_ghash_mul) FRAME_BEGIN @@ -122,8 +122,8 @@ SYM_FUNC_END(clmul_ghash_mul) * This supports 64-bit CPUs.
* * Return: none (but @dst is updated) - * Prototype: asmlinkage clmul_ghash_update(char *dst, const char *src, - * unsigned int srclen, const u128 *shash); + * Prototype: asmlinkage void clmul_ghash_update(u8 *dst, const u8 *src, + * unsigned int srclen, const u128 *shash); */ SYM_FUNC_START(clmul_ghash_update) FRAME_BEGIN diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c index 1f1a95f3dd0c..beac4b2eddf6 100644 --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c @@ -23,10 +23,10 @@ #define GHASH_BLOCK_SIZE 16 #define GHASH_DIGEST_SIZE 16 -void clmul_ghash_mul(char *dst, const u128 *shash); +asmlinkage void clmul_ghash_mul(u8 *dst, const u128 *shash); -void clmul_ghash_update(char *dst, const char *src, unsigned int srclen, - const u128 *shash); +asmlinkage void clmul_ghash_update(u8 *dst, const u8 *src, unsigned int srclen, + const u128 *shash); struct ghash_async_ctx { struct cryptd_ahash *cryptd_tfm; From patchwork Mon Dec 19 22:02:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34793 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638478wrn; Mon, 19 Dec 2022 14:06:10 -0800 (PST) X-Google-Smtp-Source: AA0mqf7oN10U2489OZzV8bEh7o2jXFNFRdsArLex4Jt4xUBB3NK/OYGb4D5XX0ewsWWYJ1hPX/he X-Received: by 2002:a05:6a21:1084:b0:9d:efbf:48bf with SMTP id nl4-20020a056a21108400b0009defbf48bfmr44150593pzb.3.1671487570346; Mon, 19 Dec 2022 14:06:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487570; cv=none; d=google.com; s=arc-20160816; b=pWoVtXB6Z9Snv32LnCRiSby+jFuVmxbGkiSO1nSy2fOdkYtFP64EdX52owAOFKAZY9 aqH/1NIofIsf4LzYKINgjo7MU7r0nSUfhk3JMdS2a39i9LTkTaZLD5pORVT9eEBMvGpP SHDMElKE8o3yubSXJSVCzcRJqGPVd2MFjsz1HnG91udfvHP9/znCnld2ZSpB991xqcxU FDeeSwmG6sJnGdLuhj4GTyo2uY3swXamMlL8ykmz9jXYdk0kn0widRRymhPip4Ow8o6X 
gqcS3uWPYIfTRYaXJtFmQZgTuDQl5l0q5skUlW4Ny/wqJVUs7SrPMsnr2b5KkpgkMpRv AbKw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=v7bR1szvHVL/RPr0ZNg3zP1MrZYuBcDYIV4Z7L2T914=; b=fIyEt2AJ9VGl7yTKHLBt9uZZud7ofaaMqbT8e1jt0q9RYs4CjFian2D5NtDzkcKrPn w/5KzozunkZhA5LdLV8yfMckCDmBSBkjYcJS99ALUN2sHHuS4Zc4sj8WzdMzZUEfNqDf nNNLcL78Gh4RzHU+uKqDbauiDlBrwfssRt6cKr+HuU6tO577sjS3ZI/w0y2cRR5hPal7 cwOUYajqEJ7KNHiN7tCUMiG9JmE/XVeOhc58vBG3Y3CvJBTzcSO380jEHTNqcOrbOCY9 Z7BkF14zzsh0uwSwGfgAwLCdS87imoFegM+ICmlYvem60dKIRBPnBC3BPobF44fOIito cf1w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=ONkH+4LS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id p21-20020a63f455000000b004790510bfe5si11854661pgk.692.2022.12.19.14.05.57; Mon, 19 Dec 2022 14:06:10 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=ONkH+4LS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232840AbiLSWEC (ORCPT + 99 others); Mon, 19 Dec 2022 17:04:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60762 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232853AbiLSWDU (ORCPT ); Mon, 19 Dec 2022 17:03:20 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 99A87140C9; Mon, 19 Dec 2022 14:03:19 -0800 (PST) Received: from pps.filterd (m0134422.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJL1wCX003253; Mon, 19 Dec 2022 22:02:55 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=v7bR1szvHVL/RPr0ZNg3zP1MrZYuBcDYIV4Z7L2T914=; b=ONkH+4LS7SXoAXohpr8SBsVPGhgEtOKMi2uiTZ/y20e/Do1S5koY8UmDXOn3lCXox5Gn so3QBUWBG47i6BhCiN1h9GBDOunOEmqQBNWLC4fVDMmZXYbsS4f75dGWSMnEYgyh7vzQ NlR9wRg+i+GW4T+IoGPEXc94OdVPr4bbNuEabW8S6Y6ggDc42r+GRaCCARK0+v67mXCv qygtL8Ok8JtwYw2oFH+YXQdb/ClYASd/ORpDCqkqSSBgxLRPGV23lOu384M0WACUnMbN jB/1ECLjBKvuYhLZ+1EsHtqIWonj1zY4LmsHc4WdNSXYQdhmGOC6XnpXIuA0XRNLaQRt cg== Received: from 
p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3mjyd9rcn4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:02:55 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id AF7183DE2E; Mon, 19 Dec 2022 22:02:54 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id E37DB80649A; Mon, 19 Dec 2022 22:02:53 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 07/13] crypto: x86/ghash - restructure FPU context saving Date: Mon, 19 Dec 2022 16:02:17 -0600 Message-Id: <20221219220223.3982176-8-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: wEAm2iiXTf33vjlVf3B3bGIxapDJMGAh X-Proofpoint-GUID: wEAm2iiXTf33vjlVf3B3bGIxapDJMGAh X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 mlxlogscore=999 clxscore=1015 bulkscore=0 adultscore=0 malwarescore=0 spamscore=0 impostorscore=0 mlxscore=0 lowpriorityscore=0 
phishscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681750454388392?= X-GMAIL-MSGID: =?utf-8?q?1752681750454388392?=

Wrap each of the calls to clmul_ghash_update() and clmul_ghash_mul() in its own pair of kernel_fpu_begin() and kernel_fpu_end() calls, preparing to limit the amount of data processed by each _update call to avoid RCU stalls.

This is more like how polyval-clmulni_glue is structured.

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index beac4b2eddf6..1bfde099de0f 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -80,7 +80,6 @@ static int ghash_update(struct shash_desc *desc,
 	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
 	u8 *dst = dctx->buffer;
 
-	kernel_fpu_begin();
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
 		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
@@ -91,10 +90,14 @@ static int ghash_update(struct shash_desc *desc,
 		while (n--)
 			*pos++ ^= *src++;
 
-		if (!dctx->bytes)
+		if (!dctx->bytes) {
+			kernel_fpu_begin();
 			clmul_ghash_mul(dst, &ctx->shash);
+			kernel_fpu_end();
+		}
 	}
 
+	kernel_fpu_begin();
 	clmul_ghash_update(dst, src, srclen, &ctx->shash);
 	kernel_fpu_end();
 

From patchwork
Mon Dec 19 22:02:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34788 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2637968wrn; Mon, 19 Dec 2022 14:05:05 -0800 (PST) X-Google-Smtp-Source: AA0mqf6L3diS8FFYjsaHpcR8Dq2U6j1rv7l5WTVou2WQGQBaiv8+Lw2RLiwI2BDNP2lF4oUCHIWx X-Received: by 2002:aa7:8a02:0:b0:573:846c:b88 with SMTP id m2-20020aa78a02000000b00573846c0b88mr42576357pfa.23.1671487505462; Mon, 19 Dec 2022 14:05:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487505; cv=none; d=google.com; s=arc-20160816; b=tS2qzD1IxLQxHOVU2fc+W18FBa1BK5cK3p4PkGS9EZYLyAc7Ma3SZKYTH6VUMD9DqR xyKqDhFdY4O87J5NB4tcXBMdQsHTv1ajzrfMaUi7fTPCWq5+0QLR0K3yVbVLaTmT+2ed iC/3iRemAK3cESPy+QTRuXAA5InYAZsJHXtzX3Oc2ivP0rJd8AwzdpN4M3t61/mLUS0w jT201bdNBxjg5UlayN0UaEzyYF8gCT22hJ6E6iTURPwkpNvUrQ9iCMAwmPbb7/MN7V2D dv7F+txxpmunDoRwmZuOk4eA4oAA9Vtv1qlWAuuwQohqvPMXn/8XfEIx5b/essycKRw5 DMlQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=hrnSS/acWGmvImUo/QDE8X3zDiPvrvWWK9oGzVt4i0M=; b=D6MeS9j6OHtOUxev6mCEF6foEmdUCnAUwlnCU3AaOaV5pwy2CQJ5vLFhe2vuavtzUJ O8YyRFTLesVINjzoaZaDSekkBBMKuo5jz0XzAVjCyO5R5oFKlF1n911yBLoW/N8O3sQA NteSn1wjw4o4VvMOIncpXOJoIYBt9fItbIXc652IGR5jvotunrDnPpc2YTxYbFaRWiI9 bMvaLwGznug3v3yCWZrqzRqbzACw9CWVGiswHANRbHDHXNx0qrN2JJAg0orh9a4TiHz9 BIQL83HXc3B0H4mrz8Dpn5kYGQRKnX4gRC+MMlHHcghjeiUSRqwKfPAc62FxQPkeQ/HW 8fpQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=STcXxQSX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) 
header.from=hpe.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k20-20020a056a00135400b00574a8619855si12073534pfu.364.2022.12.19.14.04.50; Mon, 19 Dec 2022 14:05:05 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=STcXxQSX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232517AbiLSWDS (ORCPT + 99 others); Mon, 19 Dec 2022 17:03:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231344AbiLSWDR (ORCPT ); Mon, 19 Dec 2022 17:03:17 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B59313F97; Mon, 19 Dec 2022 14:03:16 -0800 (PST) Received: from pps.filterd (m0134421.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJKW3DT015099; Mon, 19 Dec 2022 22:02:57 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=hrnSS/acWGmvImUo/QDE8X3zDiPvrvWWK9oGzVt4i0M=; b=STcXxQSXSo0nENAYhUcqOjh7Q66IsWXyNouDI9udlCce7apiUNr0jN4gvb1ifzxSXbGj mgv5IJf3xa1RwT1cyWuAxn8NdS6pCGhv3ZorMrV6hy3edS/3L3hCiOmmn+gtjZny1hzp NdTR9RDEkTh+/bDszmDh3aBpV9+DeCeE9Z2YaKZr7NDumVp4ANYMdX/vuxR4DNcaK1uJ iCNPwbsPWegU6hKLVD4G0rttHtU4xMn7wgO+QG/cTAtd0bP5hgh6P2IRwREbkXdnlHiP 
PQLcH4iff8aLlicxQnvGaEji9MuItOtS+AAPLKIfXvwUtzubn53EZk8DMDxgtzZVpkgG 2A== Received: from p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3mjv4222vs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:02:57 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 634A5310BD; Mon, 19 Dec 2022 22:02:56 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 913C0805634; Mon, 19 Dec 2022 22:02:55 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 08/13] crypto: x86/ghash - yield FPU context during long loops Date: Mon, 19 Dec 2022 16:02:18 -0600 Message-Id: <20221219220223.3982176-9-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: OTJy3vKEdfmzvoxGnRwruc0exfrYOwPG X-Proofpoint-GUID: OTJy3vKEdfmzvoxGnRwruc0exfrYOwPG X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 adultscore=0 phishscore=0 
impostorscore=0 spamscore=0 mlxlogscore=999 malwarescore=0 bulkscore=0 mlxscore=0 lowpriorityscore=0 priorityscore=1501 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681682489614479?= X-GMAIL-MSGID: =?utf-8?q?1752681682489614479?=

The x86 assembly language implementations using SIMD process data between kernel_fpu_begin() and kernel_fpu_end() calls. That disables scheduler preemption, which prevents the CPU core from being used by other threads.

The update() and finup() functions might be called to process large quantities of data, which can result in RCU stalls and soft lockups.

Periodically check if the kernel scheduler wants to run something else on the CPU. If so, yield the kernel FPU context and let the scheduler intervene.

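For GHASH the chunked loop also has to keep a trailing partial block out of the bulk path, so the pointer and the remaining length advance only by the whole 16-byte blocks in each chunk. The following is a user-space sketch of that arithmetic only, under stated assumptions: update_blocks() and yield_stub() are hypothetical stand-ins for clmul_ghash_update() and kernel_fpu_yield(), and update_blocks() is assumed to consume only whole blocks and ignore any trailing partial block it is handed.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 16U		/* GHASH_BLOCK_SIZE in the patch */
#define CHUNK_MAX  4096U	/* per-iteration limit used by the patch */

static unsigned int blocks_done;	/* whole blocks consumed so far */
static unsigned int yields;		/* yield points reached */

/* Hypothetical stand-in for clmul_ghash_update(): whole blocks only. */
static void update_blocks(const unsigned char *src, unsigned int len)
{
	(void)src;
	blocks_done += len / BLOCK_SIZE;
}

/* Hypothetical stand-in for kernel_fpu_yield(); only counts calls. */
static void yield_stub(void)
{
	yields++;
}

/*
 * Consume whole blocks in chunks of at most CHUNK_MAX bytes.
 * Advances *srcp past the consumed blocks and returns the number
 * of leftover bytes, which is always less than BLOCK_SIZE.
 */
static unsigned int ghash_chunked(const unsigned char **srcp,
				  unsigned int srclen)
{
	const unsigned char *src = *srcp;

	if (srclen >= BLOCK_SIZE) {
		for (;;) {
			const unsigned int chunk =
				srclen < CHUNK_MAX ? srclen : CHUNK_MAX;
			/* round the chunk down to a multiple of 16 */
			const unsigned int whole = chunk & ~(BLOCK_SIZE - 1);

			update_blocks(src, chunk);

			srclen -= whole;
			src += whole;

			if (srclen < BLOCK_SIZE)
				break;

			yield_stub();
		}
	}

	assert(srclen < BLOCK_SIZE);
	*srcp = src;
	return srclen;
}
```

For an 8205-byte input the sketch consumes two 4096-byte chunks (512 blocks), reaches the yield point once, and leaves 13 bytes for the caller to buffer, which is the role the final "if (srclen)" branch plays in the diff.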
Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 26 ++++++++++++++++------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 1bfde099de0f..cd44339abdbb 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -82,7 +82,7 @@ static int ghash_update(struct shash_desc *desc,
 
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
-		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
+		u8 *pos = dst + GHASH_BLOCK_SIZE - dctx->bytes;
 
 		dctx->bytes -= n;
 		srclen -= n;
@@ -97,13 +97,25 @@ static int ghash_update(struct shash_desc *desc,
 		}
 	}
 
-	kernel_fpu_begin();
-	clmul_ghash_update(dst, src, srclen, &ctx->shash);
-	kernel_fpu_end();
+	if (srclen >= GHASH_BLOCK_SIZE) {
+		kernel_fpu_begin();
+		for (;;) {
+			const unsigned int chunk = min(srclen, 4096U);
+
+			clmul_ghash_update(dst, src, chunk, &ctx->shash);
+
+			srclen -= chunk & ~(GHASH_BLOCK_SIZE - 1);
+			src += chunk & ~(GHASH_BLOCK_SIZE - 1);
+
+			if (srclen < GHASH_BLOCK_SIZE)
+				break;
+
+			kernel_fpu_yield();
+		}
+		kernel_fpu_end();
+	}
 
-	if (srclen & 0xf) {
-		src += srclen - (srclen & 0xf);
-		srclen &= 0xf;
+	if (srclen) {
 		dctx->bytes = GHASH_BLOCK_SIZE - srclen;
 		while (srclen--)
 			*dst++ ^= *src++;

From patchwork Mon Dec 19 22:02:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34791 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638405wrn; Mon, 19 Dec 2022 14:06:00 -0800 (PST) X-Google-Smtp-Source: AA0mqf4vy0BdbQtSrVjA2MAhjSMvchyLDH0rqDgryZOMPPSb8W1dCW9lSJdHdBV0j64XbPJy+HbT X-Received: by 2002:a05:6a20:7aa3:b0:ad:c694:3fbb with SMTP id
u35-20020a056a207aa300b000adc6943fbbmr31713710pzh.25.1671487559643; Mon, 19 Dec 2022 14:05:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487559; cv=none; d=google.com; s=arc-20160816; b=MGIgwxvvfL6x0UPz3Q7iyNS8YdhmdkIO3C6V04+jFxHydtqRbwyFj+4CU0oiphqfFD vc0QReJ+R2CiXHFGqTKLmm+zC8Ha3ncB54ISmdsItDefUT13moGj66ZpfT3Fj4s51BOg FVwJ3z5xY+ZAm3p4v+T1QAbMMh7plDCPysVg2bgMCJQ+EK0kAXfgqtoBw0qGSHeeNe4T HkIVWajfDiVP24g4lKF9laxHEcA/QZgEjGeCKF216pXjLcBYSzXe7D8TUWYqMuVqhcdS t3DtwsK5ZcMv7tNqr3V9Uobz68xxQLL8+Z0o1kKQ8wen3/t5nt19jhDpRfr5Z83fikGM DK+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=p9JbhkqMPpi3fDjAvwBFoPmQr84CYKUu0cOLpCX7Oos=; b=dz6RZxUqXHLNQOaVrDpyxcGnNwy8gti5MkWch12MobwocQHawne12R9VfynmNQsAht zqqL4g0jAOCsJAHVXOau4NNYftEALIKZC90Ja5W2huhrU6vyawjGzWAxDHdWANolEURG r2BcK1F9LO0+mxHdEc3+yr2upJOfiSYbuWybzyrnfZ5t6yWSM/cQ0fZs8sijW+/WzCUF BnjH+TCBLa5Npi4H2XjtYS+h/I4S419tKw2HDFxfOwIoEx6mR6RyPvS5JYy2dZy1dR2K W6R+/BW4VqEMQ1dooLH1y8rzQGka/hjCPPPxny7zhpJ48n7vsiDkky1dyAos7/+k2hRf 4k1w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=R8vnhzEl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id x6-20020a63fe46000000b00476b15a02cesi11865423pgj.70.2022.12.19.14.05.46; Mon, 19 Dec 2022 14:05:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=R8vnhzEl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232860AbiLSWDX (ORCPT + 99 others); Mon, 19 Dec 2022 17:03:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60726 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232385AbiLSWDS (ORCPT ); Mon, 19 Dec 2022 17:03:18 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62E541409D; Mon, 19 Dec 2022 14:03:17 -0800 (PST) Received: from pps.filterd (m0150241.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJJqpHh022240; Mon, 19 Dec 2022 22:02:58 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=p9JbhkqMPpi3fDjAvwBFoPmQr84CYKUu0cOLpCX7Oos=; b=R8vnhzElii1PpLsbNT17v0Pq13SNkh97OdCOMmYK3gZBqg08UIHuF8KB2TCKL05kimPn cWrDjYCmo1i+csmB9g8RtK1/xDCphuS5drdfTPT4tUjQZJk08kDb/KSre7BqA/gX8Crf 1TVQtBiuLqKbbVZWS95OfUqk+MBBXSesV3eeXM58KL8ioG+bBi6amFu0AeNZUAMhBEKP 2rQvfcM2jrYvZBixCyOWLtKk87bXFExdBs2MFUoYHJCWFGvNmLn207QGxZv9fnSDOS6+ wgol+YeZCK5INUMepxHDXYQs7fCJxoAiNjGwt+o4AHl0mcXuERH+5XhV7MMOaHjmkNPU 0w== Received: from 
p1lg14880.it.hpe.com (p1lg14880.it.hpe.com [16.230.97.201]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3mjx3b10xw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:02:58 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14880.it.hpe.com (Postfix) with ESMTPS id 01CFA807130; Mon, 19 Dec 2022 22:02:57 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 33578805634; Mon, 19 Dec 2022 22:02:57 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 09/13] crypto: x86/poly - yield FPU context only when needed Date: Mon, 19 Dec 2022 16:02:19 -0600 Message-Id: <20221219220223.3982176-10-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: twEmfStn_S24aIZ0bUsjbvnDIt-5xTZA X-Proofpoint-GUID: twEmfStn_S24aIZ0bUsjbvnDIt-5xTZA X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 spamscore=0 clxscore=1015 impostorscore=0 priorityscore=1501 bulkscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 
 suspectscore=0 malwarescore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193
X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,
 SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1752681739253303097?=
X-GMAIL-MSGID: =?utf-8?q?1752681739253303097?=

The x86 assembly language implementations using SIMD process data
between kernel_fpu_begin() and kernel_fpu_end() calls. That disables
scheduler preemption, which prevents the CPU core from being used by
other threads.

The update() and finup() functions might be called to process
large quantities of data, which can result in RCU stalls and
soft lockups.

Rather than unconditionally calling kernel_fpu_end() and
kernel_fpu_begin() after every 4 KiB pass, check between passes
whether the kernel scheduler wants to run something else on the CPU.
If so, yield the kernel FPU context and let the scheduler intervene.
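For illustration only, the loop shape described above can be tried in
user space. This is a minimal sketch, not kernel code: the three
kernel_fpu_*() functions here are local stand-ins that only count calls,
and process_simd(), min_uint() and update_chunked() are hypothetical names
standing in for an assembly routine and its glue function.

```c
#include <assert.h>

/* User-space stand-ins (assumptions): the real kernel functions are not
 * available here, so these only track state for the demonstration. */
static int fpu_depth;             /* 1 while "inside" the FPU context */
static unsigned int yield_checks; /* times the loop offered to yield */
static unsigned int bytes_done;   /* bytes handed to the worker */

static void kernel_fpu_begin(void) { fpu_depth++; }
static void kernel_fpu_end(void) { fpu_depth--; }

/* Stand-in: pretend the scheduler never wants the CPU; just count. */
static void kernel_fpu_yield(void) { yield_checks++; }

/* Hypothetical worker in place of an assembly routine. */
static void process_simd(const unsigned char *src, unsigned int len)
{
	(void)src;
	assert(fpu_depth == 1); /* must only run inside the FPU context */
	bytes_done += len;
}

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* One begin/end pair for the whole buffer; a yield check after every
 * chunk of at most 4096 bytes except the last one. */
static void update_chunked(const unsigned char *src, unsigned int srclen)
{
	kernel_fpu_begin();
	for (;;) {
		const unsigned int chunk = min_uint(srclen, 4096U);

		process_simd(src, chunk);
		srclen -= chunk;

		if (!srclen)
			break;

		src += chunk;
		kernel_fpu_yield();
	}
	kernel_fpu_end();
}
```

With a 10000-byte buffer the worker runs three times (4096, 4096 and
1808 bytes) and the yield check runs twice, once between each pair of
passes; the break before the yield call avoids a pointless check right
before kernel_fpu_end().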
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/nhpoly1305-avx2-glue.c | 22 +++++++-----
 arch/x86/crypto/nhpoly1305-sse2-glue.c | 22 +++++++-----
 arch/x86/crypto/poly1305_glue.c        | 47 ++++++++++++--------------
 arch/x86/crypto/polyval-clmulni_glue.c | 46 +++++++++++++++----------
 4 files changed, 79 insertions(+), 58 deletions(-)

diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 46b036204ed9..4afbfd35afda 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -22,15 +22,21 @@ static int nhpoly1305_avx2_update(struct shash_desc *desc,
 	if (srclen < 64 || !crypto_simd_usable())
 		return crypto_nhpoly1305_update(desc, src, srclen);

-	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+	kernel_fpu_begin();
+	for (;;) {
+		const unsigned int chunk = min(srclen, 4096U);
+
+		crypto_nhpoly1305_update_helper(desc, src, chunk, nh_avx2);
+		srclen -= chunk;
+
+		if (!srclen)
+			break;
+
+		src += chunk;
+		kernel_fpu_yield();
+	}
+	kernel_fpu_end();

-		kernel_fpu_begin();
-		crypto_nhpoly1305_update_helper(desc, src, n, nh_avx2);
-		kernel_fpu_end();
-		src += n;
-		srclen -= n;
-	} while (srclen);
 	return 0;
 }

diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index 4a4970d75107..f5c757f6f781 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -22,15 +22,21 @@ static int nhpoly1305_sse2_update(struct shash_desc *desc,
 	if (srclen < 64 || !crypto_simd_usable())
 		return crypto_nhpoly1305_update(desc, src, srclen);

-	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+	kernel_fpu_begin();
+	for (;;) {
+		const unsigned int chunk = min(srclen, 4096U);
+
+		crypto_nhpoly1305_update_helper(desc, src, chunk, nh_sse2);
+		srclen -= chunk;
+
+		if (!srclen)
+			break;
+
+		src += chunk;
+		kernel_fpu_yield();
+	}
+	kernel_fpu_end();

-		kernel_fpu_begin();
-		
crypto_nhpoly1305_update_helper(desc, src, n, nh_sse2); - kernel_fpu_end(); - src += n; - srclen -= n; - } while (srclen); return 0; } diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index 1dfb8af48a3c..13e2e134b458 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -15,20 +15,13 @@ #include #include -asmlinkage void poly1305_init_x86_64(void *ctx, - const u8 key[POLY1305_BLOCK_SIZE]); -asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp, - const size_t len, const u32 padbit); -asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], - const u32 nonce[4]); -asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], - const u32 nonce[4]); -asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len, - const u32 padbit); -asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len, - const u32 padbit); -asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp, - const size_t len, const u32 padbit); +asmlinkage void poly1305_init_x86_64(void *ctx, const u8 key[POLY1305_BLOCK_SIZE]); +asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp, unsigned int len, u32 padbit); +asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], const u32 nonce[4]); +asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], const u32 nonce[4]); +asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, unsigned int len, const u32 padbit); +asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, unsigned int len, u32 padbit); +asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp, unsigned int len, u32 padbit); static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx); static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx2); @@ -86,7 +79,7 @@ static void poly1305_simd_init(void *ctx, const u8 key[POLY1305_BLOCK_SIZE]) poly1305_init_x86_64(ctx, 
key); } -static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len, +static void poly1305_simd_blocks(void *ctx, const u8 *inp, unsigned int len, const u32 padbit) { struct poly1305_arch_internal *state = ctx; @@ -103,21 +96,25 @@ static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len, return; } - do { - const size_t bytes = min_t(size_t, len, SZ_4K); + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(len, 4096U); - kernel_fpu_begin(); if (IS_ENABLED(CONFIG_AS_AVX512) && static_branch_likely(&poly1305_use_avx512)) - poly1305_blocks_avx512(ctx, inp, bytes, padbit); + poly1305_blocks_avx512(ctx, inp, chunk, padbit); else if (static_branch_likely(&poly1305_use_avx2)) - poly1305_blocks_avx2(ctx, inp, bytes, padbit); + poly1305_blocks_avx2(ctx, inp, chunk, padbit); else - poly1305_blocks_avx(ctx, inp, bytes, padbit); - kernel_fpu_end(); + poly1305_blocks_avx(ctx, inp, chunk, padbit); + len -= chunk; - len -= bytes; - inp += bytes; - } while (len); + if (!len) + break; + + inp += chunk; + kernel_fpu_yield(); + } + kernel_fpu_end(); } static void poly1305_simd_emit(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c index 8fa58b0f3cb3..a3d72e87d58d 100644 --- a/arch/x86/crypto/polyval-clmulni_glue.c +++ b/arch/x86/crypto/polyval-clmulni_glue.c @@ -45,8 +45,8 @@ struct polyval_desc_ctx { u32 bytes; }; -asmlinkage void clmul_polyval_update(const struct polyval_tfm_ctx *keys, - const u8 *in, size_t nblocks, u8 *accumulator); +asmlinkage void clmul_polyval_update(const struct polyval_tfm_ctx *keys, const u8 *in, + unsigned int nblocks, u8 *accumulator); asmlinkage void clmul_polyval_mul(u8 *op1, const u8 *op2); static inline struct polyval_tfm_ctx *polyval_tfm_ctx(struct crypto_shash *tfm) @@ -55,27 +55,40 @@ static inline struct polyval_tfm_ctx *polyval_tfm_ctx(struct crypto_shash *tfm) } static void internal_polyval_update(const struct 
polyval_tfm_ctx *keys, - const u8 *in, size_t nblocks, u8 *accumulator) + const u8 *in, unsigned int nblocks, u8 *accumulator) { - if (likely(crypto_simd_usable())) { - kernel_fpu_begin(); - clmul_polyval_update(keys, in, nblocks, accumulator); - kernel_fpu_end(); - } else { + if (!crypto_simd_usable()) { polyval_update_non4k(keys->key_powers[NUM_KEY_POWERS-1], in, nblocks, accumulator); + return; } + + kernel_fpu_begin(); + for (;;) { + const unsigned int chunks = min(nblocks, 4096U / POLYVAL_BLOCK_SIZE); + + clmul_polyval_update(keys, in, chunks, accumulator); + nblocks -= chunks; + + if (!nblocks) + break; + + in += chunks * POLYVAL_BLOCK_SIZE; + kernel_fpu_yield(); + } + kernel_fpu_end(); } static void internal_polyval_mul(u8 *op1, const u8 *op2) { - if (likely(crypto_simd_usable())) { - kernel_fpu_begin(); - clmul_polyval_mul(op1, op2); - kernel_fpu_end(); - } else { + if (!crypto_simd_usable()) { polyval_mul_non4k(op1, op2); + return; } + + kernel_fpu_begin(); + clmul_polyval_mul(op1, op2); + kernel_fpu_end(); } static int polyval_x86_setkey(struct crypto_shash *tfm, @@ -113,7 +126,6 @@ static int polyval_x86_update(struct shash_desc *desc, struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); const struct polyval_tfm_ctx *tctx = polyval_tfm_ctx(desc->tfm); u8 *pos; - unsigned int nblocks; unsigned int n; if (dctx->bytes) { @@ -131,9 +143,9 @@ static int polyval_x86_update(struct shash_desc *desc, tctx->key_powers[NUM_KEY_POWERS-1]); } - while (srclen >= POLYVAL_BLOCK_SIZE) { - /* Allow rescheduling every 4K bytes. 
*/ - nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE; + if (srclen >= POLYVAL_BLOCK_SIZE) { + const unsigned int nblocks = srclen / POLYVAL_BLOCK_SIZE; + internal_polyval_update(tctx, src, nblocks, dctx->buffer); srclen -= nblocks * POLYVAL_BLOCK_SIZE; src += nblocks * POLYVAL_BLOCK_SIZE; From patchwork Mon Dec 19 22:02:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34789 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638097wrn; Mon, 19 Dec 2022 14:05:20 -0800 (PST) X-Google-Smtp-Source: AA0mqf6EJHeCBx+6Z5oXir7TIuh9NzBAKU0Xoh4YZQvgoU3dRXXxjIt4Wl5wIVK9LpU+qI92MTb7 X-Received: by 2002:a05:6a20:8b18:b0:ac:b2a3:e39c with SMTP id l24-20020a056a208b1800b000acb2a3e39cmr46119372pzh.62.1671487520158; Mon, 19 Dec 2022 14:05:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487520; cv=none; d=google.com; s=arc-20160816; b=aMerAyG4sfEAQXgWONpEd1jVWE7r2yQTM0O5pWmbueFc2/RIwzsn59bYX9XPXaUfX2 Ln553stVWZosxMc5DZUCznFvMyOf/2ZA+bf+LwBvmAJJSf5Ce++UJvTuuhn40WV2I1yD y6QPlX24m65G+soo/bhryzIw1s4GD9bC0uR15cGLMWAu2Zoa4ExdVBqSzdGs/jvAKH11 6Tzt4oShID7yoE1V09q2iEfoXWgGG5LwTsVQMImZy/q4/YWHgr9DQp6MjtLuY6kB7xYz 9xdscVsf6o4OK7RBvmnVSPp4d1YTvSTYPV5fd1tvZN0I4hWv8xHwPLi2IxJK2FDl2Zsu px+Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=cBwySZEgLVUjwgqyHLHULZ/4YxK3sjPT9ThAW/euji8=; b=kZuYA6V9+JdbdqPoJpCR3vUAWiOhmJwovXM3F/kt9F/Lj62FmTfBtQ1D1XswFVm9xA C9wj3CY/l4Fy2cZyjOFCtJZupgEEhsU/9bPu9s1FZ3vgu5+Uy1Qw8Ri6Tzj8sWgIysS6 PYVEUADRTFV41+cxJlGmxNVsFzCsOvk+iNpfxo8iv8qshe5rU+2AkLjt9ACGmfj1xYFo hO8f/fdAPlCsdzzNxmxH71RRhhMWkVyliM4XDHIvMHeyKAioJuq2BT7JQFkbhGhXy0Lw u5jUTmVDOkMn34fpAB8KfkVxE2PQI+VxmCUoCRxNAZoBsc9tNJ+qFGbfxqTVK+3+FT+m sLCQ== 
ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=VJhp6fzD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w190-20020a6382c7000000b004790794f214si11665017pgd.828.2022.12.19.14.05.07; Mon, 19 Dec 2022 14:05:20 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=VJhp6fzD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229895AbiLSWDg (ORCPT + 99 others); Mon, 19 Dec 2022 17:03:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232124AbiLSWDT (ORCPT ); Mon, 19 Dec 2022 17:03:19 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6D02413D6F; Mon, 19 Dec 2022 14:03:18 -0800 (PST) Received: from pps.filterd (m0150241.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJLW1rF025606; Mon, 19 Dec 2022 22:03:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; 
bh=cBwySZEgLVUjwgqyHLHULZ/4YxK3sjPT9ThAW/euji8=; b=VJhp6fzD/f6TR2gTVRg7uEutZGe27LkEmWB5s8L1k+jlyUoxHHvYcL9vrIkf1EDtWINJ KVZCUNNgS8FFjj8RKDt57CO5XbZfRv1WDthh3F3mCN/wsMTIPttwCT49hs3Nwq9XBrke L9EqPi0m7GwQrgoPlM03aTV2crOo2V+N+/lA8k99/R8uCjDM9mlbj+nLKgpj0B7bIWBU jgnKaIBr32VmZop61nUkCCZJa3nMpV3mziCJSV+I9VMVDzFGF26UHkNjBSVn8BBqe34S k/mM8FFPf2FGop10BKXxNgJVxDmYlkZVea9rCScx64SsybkudRUsZ5LR9hbYeVz/ykQW lw== Received: from p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3mjx3b10y3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:03:00 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 7AE703DE29; Mon, 19 Dec 2022 22:02:59 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id AECC2808734; Mon, 19 Dec 2022 22:02:58 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 10/13] crypto: x86/aegis - yield FPU context during long loops Date: Mon, 19 Dec 2022 16:02:20 -0600 Message-Id: <20221219220223.3982176-11-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: MWN4LHZ798tbmusyvRonviknI_w6SdXZ X-Proofpoint-GUID: 
 MWN4LHZ798tbmusyvRonviknI_w6SdXZ
X-HPE-SCL: -1
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1
 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 mlxlogscore=999 spamscore=0 clxscore=1015 impostorscore=0 priorityscore=1501
 bulkscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 suspectscore=0
 malwarescore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2212070000 definitions=main-2212190193
X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,
 SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1752681697553602802?=
X-GMAIL-MSGID: =?utf-8?q?1752681697553602802?=

Call kernel_fpu_begin() and kernel_fpu_end() around each assembly
language function that uses FPU context, rather than around the
entire set (init, ad, crypt, final).

During encryption, periodically check if the kernel scheduler wants
to run something else on the CPU. If so, yield the kernel FPU context
and let the scheduler intervene. The associated data pass is not
split into chunks this way.

Allow the skcipher_walk functions to sleep again, since they are
no longer called inside FPU context.
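As a rough user-space sketch of the chunking arithmetic for the crypt
step: whole 16-byte blocks are processed in passes of at most 4096 bytes
with a yield check between passes, and any sub-block tail gets its own
begin/end pair. The skcipher_walk is flattened to a single byte count
here, which is a simplifying assumption; crypt_blocks(), crypt_tail() and
process_crypt() are hypothetical stand-ins, and the kernel_fpu_*()
functions are local stubs rather than the kernel's.

```c
#include <assert.h>

#define BLOCK_SIZE 16U   /* mirrors AEGIS128_BLOCK_SIZE */
#define CHUNK_MAX  4096U /* upper bound per pass, as in the patch */

static int in_fpu;                /* nonzero inside the stub FPU context */
static unsigned int block_bytes;  /* bytes handled as whole blocks */
static unsigned int tail_bytes;   /* bytes handled by the tail routine */
static unsigned int yield_checks; /* yield opportunities offered */

/* Local stubs (assumptions), not the kernel implementations. */
static void kernel_fpu_begin(void) { assert(!in_fpu); in_fpu = 1; }
static void kernel_fpu_end(void) { assert(in_fpu); in_fpu = 0; }
static void kernel_fpu_yield(void) { assert(in_fpu); yield_checks++; }

/* Hypothetical stand-ins for the assembly block and tail routines. */
static void crypt_blocks(unsigned int len)
{
	assert(in_fpu && len % BLOCK_SIZE == 0);
	block_bytes += len;
}

static void crypt_tail(unsigned int len)
{
	assert(in_fpu && len < BLOCK_SIZE);
	tail_bytes += len;
}

static void process_crypt(unsigned int nbytes)
{
	if (nbytes >= BLOCK_SIZE) {
		kernel_fpu_begin();
		for (;;) {
			unsigned int chunk = nbytes < CHUNK_MAX ? nbytes : CHUNK_MAX;

			chunk -= chunk % BLOCK_SIZE; /* round down to whole blocks */
			crypt_blocks(chunk);
			nbytes -= chunk;

			if (nbytes < BLOCK_SIZE)
				break;

			kernel_fpu_yield();
		}
		kernel_fpu_end();
	}

	if (nbytes) {
		kernel_fpu_begin();
		crypt_tail(nbytes);
		kernel_fpu_end();
	}
}
```

For 8200 bytes this gives two 4096-byte passes with one yield check
between them, then an 8-byte tail in a separate FPU section.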
Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations") Fixes: ba6771c0a0bc ("crypto: x86/aegis - fix handling chunked inputs and MAY_SLEEP") Signed-off-by: Robert Elliott --- arch/x86/crypto/aegis128-aesni-glue.c | 49 ++++++++++++++++++++------- 1 file changed, 36 insertions(+), 13 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index 4623189000d8..f99f3e597b3c 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -12,8 +12,8 @@ #include #include #include -#include #include +#include #define AEGIS128_BLOCK_ALIGN 16 #define AEGIS128_BLOCK_SIZE 16 @@ -85,15 +85,19 @@ static void crypto_aegis128_aesni_process_ad( if (pos > 0) { unsigned int fill = AEGIS128_BLOCK_SIZE - pos; memcpy(buf.bytes + pos, src, fill); - crypto_aegis128_aesni_ad(state, + kernel_fpu_begin(); + crypto_aegis128_aesni_ad(state->blocks, AEGIS128_BLOCK_SIZE, buf.bytes); + kernel_fpu_end(); pos = 0; left -= fill; src += fill; } - crypto_aegis128_aesni_ad(state, left, src); + kernel_fpu_begin(); + crypto_aegis128_aesni_ad(state->blocks, left, src); + kernel_fpu_end(); src += left & ~(AEGIS128_BLOCK_SIZE - 1); left &= AEGIS128_BLOCK_SIZE - 1; @@ -110,7 +114,9 @@ static void crypto_aegis128_aesni_process_ad( if (pos > 0) { memset(buf.bytes + pos, 0, AEGIS128_BLOCK_SIZE - pos); - crypto_aegis128_aesni_ad(state, AEGIS128_BLOCK_SIZE, buf.bytes); + kernel_fpu_begin(); + crypto_aegis128_aesni_ad(state->blocks, AEGIS128_BLOCK_SIZE, buf.bytes); + kernel_fpu_end(); } } @@ -118,16 +124,31 @@ static void crypto_aegis128_aesni_process_crypt( struct aegis_state *state, struct skcipher_walk *walk, const struct aegis_crypt_ops *ops) { - while (walk->nbytes >= AEGIS128_BLOCK_SIZE) { - ops->crypt_blocks(state, - round_down(walk->nbytes, AEGIS128_BLOCK_SIZE), - walk->src.virt.addr, walk->dst.virt.addr); - skcipher_walk_done(walk, walk->nbytes % AEGIS128_BLOCK_SIZE); + if (walk->nbytes >= 
AEGIS128_BLOCK_SIZE) { + kernel_fpu_begin(); + for (;;) { + unsigned int chunk = min(walk->nbytes, 4096U); + + chunk = round_down(chunk, AEGIS128_BLOCK_SIZE); + + ops->crypt_blocks(state->blocks, chunk, + walk->src.virt.addr, walk->dst.virt.addr); + + skcipher_walk_done(walk, walk->nbytes - chunk); + + if (walk->nbytes < AEGIS128_BLOCK_SIZE) + break; + + kernel_fpu_yield(); + } + kernel_fpu_end(); } if (walk->nbytes) { - ops->crypt_tail(state, walk->nbytes, walk->src.virt.addr, + kernel_fpu_begin(); + ops->crypt_tail(state->blocks, walk->nbytes, walk->src.virt.addr, walk->dst.virt.addr); + kernel_fpu_end(); skcipher_walk_done(walk, 0); } } @@ -172,15 +193,17 @@ static void crypto_aegis128_aesni_crypt(struct aead_request *req, struct skcipher_walk walk; struct aegis_state state; - ops->skcipher_walk_init(&walk, req, true); + ops->skcipher_walk_init(&walk, req, false); kernel_fpu_begin(); + crypto_aegis128_aesni_init(&state.blocks, ctx->key.bytes, req->iv); + kernel_fpu_end(); - crypto_aegis128_aesni_init(&state, ctx->key.bytes, req->iv); crypto_aegis128_aesni_process_ad(&state, req->src, req->assoclen); crypto_aegis128_aesni_process_crypt(&state, &walk, ops); - crypto_aegis128_aesni_final(&state, tag_xor, req->assoclen, cryptlen); + kernel_fpu_begin(); + crypto_aegis128_aesni_final(&state.blocks, tag_xor, req->assoclen, cryptlen); kernel_fpu_end(); } From patchwork Mon Dec 19 22:02:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34790 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638324wrn; Mon, 19 Dec 2022 14:05:50 -0800 (PST) X-Google-Smtp-Source: AMrXdXsTF6J6guut8ySfZs2s2RcQpy1qmjJddrn2y3mkFVtH6vkCnuMFX7i/DCH9XMdxQaBll3Ln X-Received: by 2002:a17:90a:6744:b0:219:1d62:9e05 with SMTP id c4-20020a17090a674400b002191d629e05mr10430901pjm.34.1671487549688; Mon, 19 Dec 2022 14:05:49 -0800 (PST) 
ARC-Seal: i=1; a=rsa-sha256; t=1671487549; cv=none; d=google.com; s=arc-20160816; b=hgrgLDAsX7D+K+zBDRSYy0V8zBfKCJrXcNAPYuNEyny0ysRrs85C1J4QpHyCgNgF+T 7GLkys6RRnAtkyxB/Y4Q999eJU+MwDudn0VkLhIKGR8KLi87K4NfEClgOQyw3P43qUH9 xZhO8vauNVG5R2K0MEZk5IDKd6/AH8cMCVdWJwc2IugM3W2wqHNctInwaEfFlvAURNzA GcQ6cr+eESC44l067VqIWxGxoqQ6LBFhcGmYYxLtPpHFu1dAh/TGjJKnsO2lY2EQ+FUR Xg0JAA3OXTKNCpopoOCfBzg/F8moABv3DFhSD1BiaLR13NJx4zsyQb3AG26w9BPPS8jM AC0A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=nSvaflAAMuZJfAIaCIotghXNKYG+XpkRDzCUXcgsa5g=; b=AcnWzRPYAUypfMqtEQI6r0wBzhs/QaFqkNwdTgw4UtGo4mLd/s8ENiAztKl1cPHDdt BZfhyVRj0UWu6jKYsz7reStaIOADlNh13MQNQpJHe+Ef/EFl6166osffCeYLu3dbi/PI 8eWKU2uT7q3xVAxqnxj4K1Zth2OBPej5+R//2tKu4S1se+OZA40f/k6mKzE9ZS0mbTdH /W9RpzHvYsbCIpeWcup5rvGkt6wTN7xjDPhhTt10blPHSIy9UKZWPmZIiKtnHTG4/Kwi d6UwI79nJqOBFbX/2f2m1t612zfdGyexuYOi6EKPM57Tnk1LaG6TgOlCITJ/50BfZ7Jf PdnQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=JtNnDTFq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id ms19-20020a17090b235300b002129a8204d2si7918889pjb.44.2022.12.19.14.05.34; Mon, 19 Dec 2022 14:05:49 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=JtNnDTFq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232559AbiLSWDq (ORCPT + 99 others); Mon, 19 Dec 2022 17:03:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60752 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232747AbiLSWDT (ORCPT ); Mon, 19 Dec 2022 17:03:19 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EBDFA13F97; Mon, 19 Dec 2022 14:03:18 -0800 (PST) Received: from pps.filterd (m0134420.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJKsHLI016291; Mon, 19 Dec 2022 22:03:01 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=nSvaflAAMuZJfAIaCIotghXNKYG+XpkRDzCUXcgsa5g=; b=JtNnDTFqPb0hBmYH8+/JkEsBILIbaFYXXGSZdnc3lzB68YQav0oJHkwYpQFVNbClPyx0 u/Pj0y+ndNK9Nu5Iyw+EEfFJe1QPli4e3N6yhPSeDxP8c189BQ/XGlKTkAGp1OThJANs YhfBAiqZXvtzeCCGdErFar8YZVz4rbKMrVHBwM9wBNa5n3WdFwwOAi8A6DaDCgawEUic QPRCc/EtuC04ZhT7LvEVjVLR4qDuFAqfyEXFrczTRMkj9jbTLQ3swsqQmTSJXhbYsPEA +26v9T3WUkxbEvsHpOCPgWSe1rPjwzaU3a3HC0l/h8u4/9GtYKauJmUhYTuHVMQMlbjW DA== Received: from 
p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3mjy6d0etp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:03:01 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id 3EF413DE2A; Mon, 19 Dec 2022 22:03:01 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 6DAE080649A; Mon, 19 Dec 2022 22:03:00 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 11/13] crypto: x86/blake - yield FPU context only when needed Date: Mon, 19 Dec 2022 16:02:21 -0600 Message-Id: <20221219220223.3982176-12-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: O0UNo8-c_31ZUN3VBFPimjZMmRle62hI X-Proofpoint-GUID: O0UNo8-c_31ZUN3VBFPimjZMmRle62hI X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 lowpriorityscore=0 bulkscore=0 phishscore=0 impostorscore=0 malwarescore=0 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 
 clxscore=1015 priorityscore=1501 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193
X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,
 SPF_HELO_NONE,SPF_NONE,T_FILL_THIS_FORM_SHORT autolearn=ham
 autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1752681729015563856?=
X-GMAIL-MSGID: =?utf-8?q?1752681729015563856?=

The x86 assembly language implementations using SIMD process data
between kernel_fpu_begin() and kernel_fpu_end() calls. That disables
scheduler preemption, which prevents the CPU core from being used by
other threads.

The update() and finup() functions might be called to process
large quantities of data, which can result in RCU stalls and
soft lockups.

Rather than unconditionally calling kernel_fpu_end() and
kernel_fpu_begin() after every 4 KiB pass, check between passes
whether the kernel scheduler wants to run something else on the CPU.
If so, yield the kernel FPU context and let the scheduler intervene.

Adjust the type of the length arguments everywhere to be unsigned int
rather than size_t to avoid typecasts.
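To make the block-count arithmetic concrete, here is a small user-space
sketch, offered as an illustration under stated assumptions rather than
as the kernel code. With 64-byte BLAKE2s blocks, a 4 KiB pass covers 64
blocks, and keeping every operand unsigned int lets the limit be computed
without casts. compress_simd() and compress_chunked() are hypothetical
names, and the kernel_fpu_*() functions are local stubs.

```c
#include <assert.h>

#define BLAKE2S_BLOCK_SIZE 64U /* BLAKE2s block size in bytes */

static unsigned int blocks_done;       /* total blocks handed to the worker */
static unsigned int passes;            /* number of worker invocations */
static unsigned int yield_checks;      /* yield opportunities offered */
static const unsigned char *last_data; /* start pointer of the last pass */

/* Local stubs (assumptions), not the kernel implementations. */
static void kernel_fpu_begin(void) { }
static void kernel_fpu_end(void) { }
static void kernel_fpu_yield(void) { yield_checks++; }

/* Hypothetical stand-in for the SSSE3/AVX-512 compress routines. */
static void compress_simd(const unsigned char *data, unsigned int nblocks)
{
	last_data = data;
	blocks_done += nblocks;
	passes++;
}

/* nblocks counts blocks, not bytes; the per-pass limit is 4096 / 64 = 64. */
static void compress_chunked(const unsigned char *data, unsigned int nblocks)
{
	const unsigned int limit = 4096U / BLAKE2S_BLOCK_SIZE;

	kernel_fpu_begin();
	for (;;) {
		const unsigned int chunks = nblocks < limit ? nblocks : limit;

		compress_simd(data, chunks);
		nblocks -= chunks;

		if (!nblocks)
			break;

		/* Advance in bytes: blocks processed times the block size. */
		data += chunks * BLAKE2S_BLOCK_SIZE;
		kernel_fpu_yield();
	}
	kernel_fpu_end();
}
```

For 150 blocks the passes cover 64, 64 and 22 blocks, and the final pass
starts 128 * 64 = 8192 bytes into the buffer.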
Suggested-by: Herbert Xu Signed-off-by: Robert Elliott --- arch/x86/crypto/blake2s-glue.c | 41 ++++++++++++++++--------------- include/crypto/internal/blake2s.h | 8 +++--- lib/crypto/blake2s-generic.c | 12 ++++----- 3 files changed, 31 insertions(+), 30 deletions(-) diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index aaba21230528..bbb0a67ebb1c 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -12,46 +12,47 @@ #include #include -#include #include #include -asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state, - const u8 *block, const size_t nblocks, - const u32 inc); -asmlinkage void blake2s_compress_avx512(struct blake2s_state *state, - const u8 *block, const size_t nblocks, - const u32 inc); +asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state, const u8 *data, + unsigned int nblocks, u32 inc); +asmlinkage void blake2s_compress_avx512(struct blake2s_state *state, const u8 *data, + unsigned int nblocks, u32 inc); static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_ssse3); static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512); -void blake2s_compress(struct blake2s_state *state, const u8 *block, - size_t nblocks, const u32 inc) +void blake2s_compress(struct blake2s_state *state, const u8 *data, + unsigned int nblocks, const u32 inc) { /* SIMD disables preemption, so relax after processing each page. 
*/ BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8); if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) { - blake2s_compress_generic(state, block, nblocks, inc); + blake2s_compress_generic(state, data, nblocks, inc); return; } - do { - const size_t blocks = min_t(size_t, nblocks, - SZ_4K / BLAKE2S_BLOCK_SIZE); + kernel_fpu_begin(); + for (;;) { + const unsigned int chunks = min(nblocks, 4096U / BLAKE2S_BLOCK_SIZE); - kernel_fpu_begin(); if (IS_ENABLED(CONFIG_AS_AVX512) && static_branch_likely(&blake2s_use_avx512)) - blake2s_compress_avx512(state, block, blocks, inc); + blake2s_compress_avx512(state, data, chunks, inc); else - blake2s_compress_ssse3(state, block, blocks, inc); - kernel_fpu_end(); + blake2s_compress_ssse3(state, data, chunks, inc); - nblocks -= blocks; - block += blocks * BLAKE2S_BLOCK_SIZE; - } while (nblocks); + nblocks -= chunks; + + if (!nblocks) + break; + + data += chunks * BLAKE2S_BLOCK_SIZE; + kernel_fpu_yield(); + } + kernel_fpu_end(); } EXPORT_SYMBOL(blake2s_compress); diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h index 506d56530ca9..d6df791e6148 100644 --- a/include/crypto/internal/blake2s.h +++ b/include/crypto/internal/blake2s.h @@ -10,11 +10,11 @@ #include #include -void blake2s_compress_generic(struct blake2s_state *state, const u8 *block, - size_t nblocks, const u32 inc); +void blake2s_compress_generic(struct blake2s_state *state, const u8 *data, + unsigned int nblocks, u32 inc); -void blake2s_compress(struct blake2s_state *state, const u8 *block, - size_t nblocks, const u32 inc); +void blake2s_compress(struct blake2s_state *state, const u8 *data, + unsigned int nblocks, u32 inc); bool blake2s_selftest(void); diff --git a/lib/crypto/blake2s-generic.c b/lib/crypto/blake2s-generic.c index 75ccb3e633e6..6a1caa702698 100644 --- a/lib/crypto/blake2s-generic.c +++ b/lib/crypto/blake2s-generic.c @@ -37,12 +37,12 @@ static inline void blake2s_increment_counter(struct blake2s_state *state, 
state->t[1] += (state->t[0] < inc); } -void blake2s_compress(struct blake2s_state *state, const u8 *block, - size_t nblocks, const u32 inc) +void blake2s_compress(struct blake2s_state *state, const u8 *data, + unsigned int nblocks, u32 inc) __weak __alias(blake2s_compress_generic); -void blake2s_compress_generic(struct blake2s_state *state, const u8 *block, - size_t nblocks, const u32 inc) +void blake2s_compress_generic(struct blake2s_state *state, const u8 *data, + unsigned int nblocks, u32 inc) { u32 m[16]; u32 v[16]; @@ -53,7 +53,7 @@ void blake2s_compress_generic(struct blake2s_state *state, const u8 *block, while (nblocks > 0) { blake2s_increment_counter(state, inc); - memcpy(m, block, BLAKE2S_BLOCK_SIZE); + memcpy(m, data, BLAKE2S_BLOCK_SIZE); le32_to_cpu_array(m, ARRAY_SIZE(m)); memcpy(v, state->h, 32); v[ 8] = BLAKE2S_IV0; @@ -103,7 +103,7 @@ void blake2s_compress_generic(struct blake2s_state *state, const u8 *block, for (i = 0; i < 8; ++i) state->h[i] ^= v[i] ^ v[i + 8]; - block += BLAKE2S_BLOCK_SIZE; + data += BLAKE2S_BLOCK_SIZE; --nblocks; } } From patchwork Mon Dec 19 22:02:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34797 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2638989wrn; Mon, 19 Dec 2022 14:07:23 -0800 (PST) X-Google-Smtp-Source: AA0mqf5x2Oyg2qZs5GVVyZKQP/t0+iaSLi8tZOhgo1nIoB/YHV6vdD3Rg/qpt9CxYquo+RSGq0h4 X-Received: by 2002:a05:6a00:26e1:b0:576:84ae:59a with SMTP id p33-20020a056a0026e100b0057684ae059amr36812383pfw.3.1671487643055; Mon, 19 Dec 2022 14:07:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487643; cv=none; d=google.com; s=arc-20160816; b=v2BYDC8emkP4sprBto5svx+eDJ7IlguUbh3MTT+b7soAU6ZOtTaadDxcWTQrLkWQaM 15MX2S//h0MAqd+PMcsP1+EEHWrzjadusrfdfx/GoL0lK9lDsIorX8u0Th9T5+bixmXc qpbGLILRp7otzZvCqq3ujO2CAEria1Dot70S0a9V+rintw12xuVoCTYtugYPgB1hPF3b 
MBH3JuQCYysPm1WtyYufpsYZ/BryAEnXwgK13TGwnnfsDZNkHFZRXjZEq5ZkokpVQ0Ly HGwPVN4RqZ+PD/OEcuvr1Ntk5tv/FM6eU+KOoii7ZdcjB/6YQUubIuNekXJhQXVPAXS5 t0KA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ytx49hxAW/5Vofc1aOHCr8Q+x9j/e6AuWZ4zsK8+9Fc=; b=ZTyO9suy/LsRS/1cogJjPrxpLTRj/8oeUTjdG/smm2dcwHkYleueKj34pV/TqbEuKy ptMpWsl8IyAvH+KE+YhCfeKYTqjsi8MxkHxORwXFuu+BkWSDHboZwh4BvElUyV8hmgRz XGHSF8baRbuExTSLG15rl9pxllf9M7dSuCDD/vSNPzsilxkXiRafX2mFHKDI19BDB4kH WloTMFdGssDwlSvb55PyGfj0JLadgTyN3lV2Hs88sUt26CMrrNmhlNd9gaGav4g5QfzS qLTmDxA1+Sym2N3khRS+MrtJrdGralnspFgp63p97DJ0rbpIW2OPKqEWPrahiw8YOUjz pmIg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b="T/CbVSUj"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id h5-20020a056a00230500b0057a6dd45320si12256982pfh.243.2022.12.19.14.07.09; Mon, 19 Dec 2022 14:07:23 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b="T/CbVSUj"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230007AbiLSWEK (ORCPT + 99 others); Mon, 19 Dec 2022 17:04:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232869AbiLSWDd (ORCPT ); Mon, 19 Dec 2022 17:03:33 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1A2A140D8; Mon, 19 Dec 2022 14:03:20 -0800 (PST) Received: from pps.filterd (m0150242.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJKdEeQ002227; Mon, 19 Dec 2022 22:03:03 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=ytx49hxAW/5Vofc1aOHCr8Q+x9j/e6AuWZ4zsK8+9Fc=; b=T/CbVSUjbXTIPnbdEp5IctHSx/fVjKlksErJBnrc/G/sqyWFXQhLUzilM38Zf247E5jt GKvRjMDVq9Xxg/qXAuOxyysQWkh4Hz2rWITY6gNPt10CKx7yYXkBb3gzfcZ3kyWV6pZ+ RZxksUGp30chzOo1BXP/ir5lu7ytBDADYQ2rtEaDRrHJR6uVdc5hvtgwj13YWk/iUIM0 gbnwoJnORqDTpyKpFJo97/TjKEaASY/Ij0d2J45gW6CRFoXI8/KgwHb0oA0uANPVR8+G 1lt8WQqCMEstAg9FqRQeL6ryFpdhlXCeN64F83QqXwtqs+yGNP2zOALvesIp7KCziLEn qw== Received: from 
p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3mjy650gt0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:03:03 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id D89063DE25; Mon, 19 Dec 2022 22:03:02 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 161CF808734; Mon, 19 Dec 2022 22:03:02 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 12/13] crypto: x86/chacha - yield FPU context only when needed Date: Mon, 19 Dec 2022 16:02:22 -0600 Message-Id: <20221219220223.3982176-13-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: j1P-dXNFLU2g8BowGj6h7B-U2Q2W6YtA X-Proofpoint-ORIG-GUID: j1P-dXNFLU2g8BowGj6h7B-U2Q2W6YtA X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 impostorscore=0 mlxscore=0 phishscore=0 clxscore=1015 malwarescore=0 spamscore=0 mlxlogscore=999 adultscore=0 bulkscore=0 
priorityscore=1501 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190194 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681826439956512?= X-GMAIL-MSGID: =?utf-8?q?1752681826439956512?=

The x86 assembly language implementations using SIMD process data between kernel_fpu_begin() and kernel_fpu_end() calls. That disables scheduler preemption, which prevents the CPU core from being used by other threads.

Rather than break the processing into 4 KiB passes, each of which unilaterally calls kernel_fpu_begin() and kernel_fpu_end(), periodically check if the kernel scheduler wants to run something else on the CPU. If so, yield the kernel FPU context and let the scheduler intervene.
Suggested-by: Herbert Xu Signed-off-by: Robert Elliott --- arch/x86/crypto/chacha_glue.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index 7b3a1cf0984b..892cbae958b8 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -146,17 +146,21 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes, bytes <= CHACHA_BLOCK_SIZE) return chacha_crypt_generic(state, dst, src, bytes, nrounds); - do { - unsigned int todo = min_t(unsigned int, bytes, SZ_4K); + kernel_fpu_begin(); + for (;;) { + const unsigned int chunk = min(bytes, 4096U); - kernel_fpu_begin(); - chacha_dosimd(state, dst, src, todo, nrounds); - kernel_fpu_end(); + chacha_dosimd(state, dst, src, chunk, nrounds); - bytes -= todo; - src += todo; - dst += todo; - } while (bytes); + bytes -= chunk; + if (!bytes) + break; + + src += chunk; + dst += chunk; + kernel_fpu_yield(); + } + kernel_fpu_end(); } EXPORT_SYMBOL(chacha_crypt_arch); From patchwork Mon Dec 19 22:02:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert (Servers)" X-Patchwork-Id: 34798 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2639160wrn; Mon, 19 Dec 2022 14:07:44 -0800 (PST) X-Google-Smtp-Source: AA0mqf6A1Xn9PDYPbxEMKCsxLhiBTnkwDTZpgEkm2hKqJoHAG7wDfqiPPL/hUHoF/5jHNxA3Bgt+ X-Received: by 2002:a17:907:d12:b0:7c1:54b9:c688 with SMTP id gn18-20020a1709070d1200b007c154b9c688mr44702296ejc.60.1671487663891; Mon, 19 Dec 2022 14:07:43 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671487663; cv=none; d=google.com; s=arc-20160816; b=Eh9pBU6lDaVnajdeYSXYEPpjX679awHXykZNrPG3t3VU4F6vlSmjQgwFAN2NOvAgMN L0W/OsK023xpd7sqHhg1PW22y/l8ADZKEIU24geqrcJ6SudtH0xQZ+4hD9IoAcw8jHOU uzYGpknviP/KMJWcT0amd+paiDk76GFOwiabcNieUZBZml7vZ1+jWsA5PqZrvQMPXMvV 
Wy79B9s+p9U1egS3CJwc1QxtqcjKW6nSdMmUYee4+Hty5E+lFp05fMPwCGBo0RKSjVCC I5IYsQ0NX6LiA2uU+QcDR3Y1GBjOdIcbGy7mlZpyIVaQ6CEuM6vg388iBc6hqJpuNpeP kIzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=O4GnIk7zd/4MqgzKF2156bsbM+N7CAdELJWrz2MCgdk=; b=WgJrxDOMdw4NOB3nccziLO51N8LqEl1HyirTqrrLKzqeiplAPYF1Tj+Q36fs3zCilj 6FdsbYWL8jvH2bFDOeAMWDuRgjTBBRMWmNqWIUCc7Wq+OAQmzeW4XlEQQI5CdIyvDsV3 593dwGavSu1cfM0E0/PAvWcZrGi2ikvRmmB90JfGhTGE8pWFINWW7KKISdd/CldQEX+x JZsieqYbXHoWdYprqLhDmBMv5+WRxxfW2RorUz6ZVZQ9g5NJ1QK/mtE0k4b6+gX0yXtp BJizMRhrAbDW1bHlrUwHUjQ1lw+v4fNFnbe0r52Ytm0/cKjEOXyjdIBUNRYbt8yfsRAZ A5Fw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=SXjklFqc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id x9-20020a1709065ac900b007c0daa7bc20si8539819ejs.880.2022.12.19.14.07.20; Mon, 19 Dec 2022 14:07:43 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@hpe.com header.s=pps0720 header.b=SXjklFqc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=hpe.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232882AbiLSWEi (ORCPT + 99 others); Mon, 19 Dec 2022 17:04:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60978 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232633AbiLSWDf (ORCPT ); Mon, 19 Dec 2022 17:03:35 -0500 Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80D82140F6; Mon, 19 Dec 2022 14:03:21 -0800 (PST) Received: from pps.filterd (m0150241.ppops.net [127.0.0.1]) by mx0a-002e3701.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BJJr66N024598; Mon, 19 Dec 2022 22:03:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=O4GnIk7zd/4MqgzKF2156bsbM+N7CAdELJWrz2MCgdk=; b=SXjklFqcz0t9iTLogr2jGRxusZQC60LRJIbIqsGnHdNhOeE83d0mLu6phlugRe6kfDdl ujMBmMkL3+aSMn/R5QtS1k8ppkPjA30SDCRTBZjjRBwzMwo5K+1PjoPBD26ysB8lyGd9 of2PLn64zaXCZUBrDiUbra03X5J58JNqmanlAa+mW0NIHQ33QqDfYxADKh0vM7qjFsBH SpVWFE56M6x1iieblh75hPbo7NYyWq5txFHkhqWf/LvJly+c0NXNQ3EMH4LFCyDPiMwz NeyIiMhWVvzLs0BfE7BGpIv6R3CspSX1Sytg5q/vN3npNUgjoDt0DmASHBDpEeVkgWs1 SA== Received: from 
p1lg14879.it.hpe.com (p1lg14879.it.hpe.com [16.230.97.200]) by mx0a-002e3701.pphosted.com (PPS) with ESMTPS id 3mjx3b10yx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Dec 2022 22:03:05 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14879.it.hpe.com (Postfix) with ESMTPS id 75DB84AC45; Mon, 19 Dec 2022 22:03:04 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id A8B1380649A; Mon, 19 Dec 2022 22:03:03 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, Jason@zx2c4.com, ardb@kernel.org, ap420073@gmail.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, tim.c.chen@linux.intel.com, peter@n8pjl.ca, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com Cc: linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Robert Elliott Subject: [PATCH 13/13] crypto: x86/aria - yield FPU context only when needed Date: Mon, 19 Dec 2022 16:02:23 -0600 Message-Id: <20221219220223.3982176-14-elliott@hpe.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221219220223.3982176-1-elliott@hpe.com> References: <20221219220223.3982176-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: l8D768RjsgE5OEabaL3ZFIEKiFHi-Ky- X-Proofpoint-GUID: l8D768RjsgE5OEabaL3ZFIEKiFHi-Ky- X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-19_01,2022-12-15_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 spamscore=0 clxscore=1015 impostorscore=0 priorityscore=1501 bulkscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 
suspectscore=0 malwarescore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212190193 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752681848597498104?= X-GMAIL-MSGID: =?utf-8?q?1752681848597498104?=

The x86 assembly language implementations using SIMD process data between kernel_fpu_begin() and kernel_fpu_end() calls. That disables scheduler preemption, which prevents the CPU core from being used by other threads.

In ctr mode, rather than break the processing into 256-byte passes, each of which unilaterally calls kernel_fpu_begin() and kernel_fpu_end(), periodically check if the kernel scheduler wants to run something else on the CPU. If so, yield the kernel FPU context and let the scheduler intervene.
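As a rough worked count of what this saves, the sketch below compares the number of kernel_fpu_begin()/kernel_fpu_end() pairs under both loop shapes. It assumes the 256-byte pass size stated above and assumes that each yield actually taken costs exactly one extra pair; PASS_BYTES and both function names are hypothetical and do not appear in the patch.

```c
#include <assert.h>

/* Assumed from the commit message: one SIMD pass covers 256 bytes. */
#define PASS_BYTES 256U

/* Old loop shape: one begin/end pair around every full 256-byte pass. */
static unsigned int fpu_pairs_per_pass(unsigned int nbytes)
{
	return nbytes / PASS_BYTES;
}

/*
 * New loop shape: one pair around the whole loop, plus one more pair for
 * each yield that is actually taken (assumption: a taken yield ends the
 * section and begins a new one).
 */
static unsigned int fpu_pairs_with_yield(unsigned int nbytes,
					 unsigned int yields_taken)
{
	unsigned int passes = nbytes / PASS_BYTES;

	/* There is one yield check per pass, so at most that many yields. */
	assert(yields_taken <= passes);
	return 1 + yields_taken;
}
```

For a 4096-byte request this gives 16 pairs with the old loop, against 1 pair when the scheduler never asks for the CPU, or 3 pairs if it asks twice.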
Signed-off-by: Robert Elliott --- arch/x86/crypto/aria_aesni_avx_glue.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c index c561ea4fefa5..6657ce576e6c 100644 --- a/arch/x86/crypto/aria_aesni_avx_glue.c +++ b/arch/x86/crypto/aria_aesni_avx_glue.c @@ -5,6 +5,7 @@ * Copyright (c) 2022 Taehee Yoo */ +#include #include #include #include @@ -85,17 +86,19 @@ static int aria_avx_ctr_encrypt(struct skcipher_request *req) const u8 *src = walk.src.virt.addr; u8 *dst = walk.dst.virt.addr; + kernel_fpu_begin(); while (nbytes >= ARIA_AESNI_PARALLEL_BLOCK_SIZE) { u8 keystream[ARIA_AESNI_PARALLEL_BLOCK_SIZE]; - kernel_fpu_begin(); aria_ops.aria_ctr_crypt_16way(ctx, dst, src, keystream, walk.iv); - kernel_fpu_end(); dst += ARIA_AESNI_PARALLEL_BLOCK_SIZE; src += ARIA_AESNI_PARALLEL_BLOCK_SIZE; nbytes -= ARIA_AESNI_PARALLEL_BLOCK_SIZE; + + kernel_fpu_yield(); } + kernel_fpu_end(); while (nbytes >= ARIA_BLOCK_SIZE) { u8 keystream[ARIA_BLOCK_SIZE];