Message ID | 20240130083007.1876787-2-kirill.shutemov@linux.intel.com |
---|---|
State | New |
Headers |
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com>, x86@kernel.org, "Theodore Ts'o" <tytso@mit.edu>, "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>, Elena Reshetova <elena.reshetova@intel.com>, Jun Nakajima <jun.nakajima@intel.com>, Tom Lendacky <thomas.lendacky@amd.com>, "Kalra, Ashish" <ashish.kalra@amd.com>, Sean Christopherson <seanjc@google.com>, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
Date: Tue, 30 Jan 2024 10:30:07 +0200
Message-ID: <20240130083007.1876787-2-kirill.shutemov@linux.intel.com>
In-Reply-To: <20240130083007.1876787-1-kirill.shutemov@linux.intel.com>
References: <20240130083007.1876787-1-kirill.shutemov@linux.intel.com>
Series | [1/2] x86/random: Retry on RDSEED failure |
Commit Message
Kirill A. Shutemov
Jan. 30, 2024, 8:30 a.m. UTC
RDRAND and RDSEED instructions rarely fail. Ten retries should be
sufficient to account for occasional failures.
If the instruction fails more than ten times, it is likely that the
hardware is broken or someone is attempting to exceed the rate at which
the random number generator hardware can provide random numbers.
Issue a warning if ten retries were not enough.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/include/asm/archrandom.h | 11 +++++++++++
1 file changed, 11 insertions(+)
Comments
Hi Kirill,

Picking up from my last email on patch 1/2:

On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> RDRAND and RDSEED instructions rarely fail. Ten retries should be
> sufficient to account for occasional failures.
>
> If the instruction fails more than ten times, it is likely that the
> hardware is broken or someone is attempting to exceed the rate at which
> the random number generator hardware can provide random numbers.

You're the Intel employee so you can find out about this with much more assurance than me, but I understand the sentence above to be _way more_ true for RDRAND than for RDSEED. If your informed opinion is, "RDRAND failing can only be due to totally broken hardware", then a WARN_ON seems like an appropriate solution, consistent with what other drivers do for totally broken hardware. I'm less convinced that this is the case also for RDSEED, but you know better than me.

However, there's one potentially concerning aspect to consider: if the statement is "RDRAND only fails when the hardware fails", that's fine, but if the statement is "RDRAND only fails when the hardware fails or a user hammers on RDRAND in a busy loop," then this seems like a potential DoS vector from userspace, since RDRAND is not a privileged instruction. Unless there's different pools and rate limiting and hardware and such depending on which ring the instruction is called from? But I've never read about that. What's your feeling on this concern?

And if the DoS thing _is_ a concern, and the use case for this WARN_ON in the first place is the trusted computing scenario, so we basically only care about early boot, then one addendum would be to only warn if we're in early boot, which would work because seeding via RDRAND is attempted pretty early on in init.c.

Jason
> Hi Kirill,
>
> Picking up from my last email on patch 1/2:
>
> On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> > RDRAND and RDSEED instructions rarely fail. Ten retries should be
> > sufficient to account for occasional failures.
> >
> > If the instruction fails more than ten times, it is likely that the
> > hardware is broken or someone is attempting to exceed the rate at which
> > the random number generator hardware can provide random numbers.
>
> You're the Intel employee so you can find out about this with much
> more assurance than me, but I understand the sentence above to be _way
> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> "RDRAND failing can only be due to totally broken hardware"

No, this is not the case per Intel SDM. I think we can live under a simple assumption that both of these instructions can fail not just due to broken HW, but also due to enough pressure put into the whole DRBG construction that supplies random numbers via RDRAND/RDSEED.

> then a
> WARN_ON seems like an appropriate solution, consistent with what other
> drivers do for totally broken hardware. I'm less convinced that this
> is the case also for RDSEED, but you know better than me.

I do agree that due to the internal structure of the DRBG it is easier to create a situation where RDSEED will fail. But for the purpose of the Linux RNG and confidential computing it actually doesn't make a difference whether we get an output from RDRAND or RDSEED, as long as we get either of them. Problems only start imo when both of them are made to fail.

> However, there's one potentially concerning aspect to consider: if the
> statement is "RDRAND only fails when the hardware fails", that's fine,
> but if the statement is "RDRAND only fails when the hardware fails or
> a user hammers on RDRAND in a busy loop," then this seems like a
> potential DoS vector from userspace, since RDRAND is not a privileged
> instruction. Unless there's different pools and rate limiting and
> hardware and such depending on which ring the instruction is called
> from? But I've never read about that. What's your feeling on this
> concern?

RDRAND can fail with enough load as I already said above. I am also not aware of any ring separation or anything like this for the RDRAND/RDSEED instructions.

I guess your concern about DoS here is for the case when we don't trust the host/VMM *and* assume malicious userspace, correct? Because in non-confidential computing case, the Linux RNG in such case will just use non-RDRAND fallbacks, no DoS will happen and we should have enough entropy that is outside of userspace control.

I guess this is indeed a difficult situation because we don't have any other entropy sources anymore (unless we assume some special HW). But you bring a very valid point that in this case we make it easier for userspace to DoS the kernel if we require RDRAND/RDSEED to succeed, which is not acceptable (with the exception of early boot, when we don't have the userspace problem).

> And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> in the first place is the trusted computing scenario, so we basically
> only care about early boot, then one addendum would be to only warn if
> we're in early boot, which would work because seeding via RDRAND is
> attempted pretty early on in init.c.

I don't think we are only concerned with initial early boot and initial seeding. What about periodic reseeding of ChaCha CSPRNG? If you don't get RDRAND/RDSEED output during this step, don't we formally lose the forward prediction resistance property of Linux RNG assuming this is the only source of entropy that is outside of attacker control?

Best Regards,
Elena.

> Jason
On Tue, Jan 30, 2024 at 2:45 PM Reshetova, Elena <elena.reshetova@intel.com> wrote:
> No, this is not the case per Intel SDM. I think we can live under a simple
> assumption that both of these instructions can fail not just due to broken
> HW, but also due to enough pressure put into the whole DRBG construction
> that supplies random numbers via RDRAND/RDSEED.

Yea, thought so.

> I guess your concern about DoS here is for the case when we don't
> trust the host/VMM *and* assume malicious userspace, correct?
> Because in non-confidential computing case, the Linux RNG in such
> case will just use non-RDRAND fallbacks, no DoS will happen and we
> should have enough entropy that is outside of userspace control.

Don't think about the RNG for just one second. The basic principle is simpler: if you have a `WARN_ON(unprivd_userspace_triggerable_condition)`, that's usually considered a DoS - panic_on_warn and such.

> > And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> > in the first place is the trusted computing scenario, so we basically
> > only care about early boot, then one addendum would be to only warn if
> > we're in early boot, which would work because seeding via RDRAND is
> > attempted pretty early on in init.c.
>
> I don't think we are only concerned with initial early boot and initial seeding.
> What about periodic reseeding of ChaCha CSPRNG? If you don't get
> RDRAND/RDSEED output during this step, don't we formally lose the forward
> prediction resistance property of Linux RNG assuming this is the only source
> of entropy that is outside of attacker control?

If you never add new material, and you have the initial seed, then it's deterministic. But you still mostly can't backtrack if the state leaks at some future point in time.

Jason
> On Tue, Jan 30, 2024 at 2:45 PM Reshetova, Elena
> <elena.reshetova@intel.com> wrote:
> > No, this is not the case per Intel SDM. I think we can live under a simple
> > assumption that both of these instructions can fail not just due to broken
> > HW, but also due to enough pressure put into the whole DRBG construction
> > that supplies random numbers via RDRAND/RDSEED.
>
> Yea, thought so.
>
> > I guess your concern about DoS here is for the case when we don't
> > trust the host/VMM *and* assume malicious userspace, correct?
> > Because in non-confidential computing case, the Linux RNG in such
> > case will just use non-RDRAND fallbacks, no DoS will happen and we
> > should have enough entropy that is outside of userspace control.
>
> Don't think about the RNG for just one second. The basic principle is
> simpler: if you have a
> `WARN_ON(unprivd_userspace_triggerable_condition)`, that's usually
> considered a DoS - panic_on_warn and such.

Ok, agree, you do bring a valid point that we should not create new DoS attack vectors from userspace in such cases.

> > > And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> > > in the first place is the trusted computing scenario, so we basically
> > > only care about early boot, then one addendum would be to only warn if
> > > we're in early boot, which would work because seeding via RDRAND is
> > > attempted pretty early on in init.c.
> >
> > I don't think we are only concerned with initial early boot and initial seeding.
> > What about periodic reseeding of ChaCha CSPRNG? If you don't get
> > RDRAND/RDSEED output during this step, don't we formally lose the forward
> > prediction resistance property of Linux RNG assuming this is the only source
> > of entropy that is outside of attacker control?
>
> If you never add new material, and you have the initial seed, then
> it's deterministic. But you still mostly can't backtrack if the state
> leaks at some future point in time.

I am not talking about backtrack resistance, i.e. when attacker learns about RNG state and then can recover the past output. I was talking about an attacker learning the RNG state at some point of time (RNG compromise) and then the RNG being able to recover over time from this state to a secure state using fresh entropy input that is outside of attacker control/observance. Does Linux RNG aim to provide this property? Do people care about this? If no one cares about this one and Linux RNG doesn't aim to provide it anyhow, then I agree that we should just ensure that early entropy collection includes RDRAND/RDSEED input for confidential VMs one way or another.

Best Regards,
Elena.

> > Jason
On Tue, Jan 30, 2024 at 02:55:08PM +0000, Reshetova, Elena wrote:
>
> > On Tue, Jan 30, 2024 at 2:45 PM Reshetova, Elena
> > <elena.reshetova@intel.com> wrote:
> > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > assumption that both of these instructions can fail not just due to broken
> > > HW, but also due to enough pressure put into the whole DRBG construction
> > > that supplies random numbers via RDRAND/RDSEED.
> >
> > Yea, thought so.
> >
> > > I guess your concern about DoS here is for the case when we don't
> > > trust the host/VMM *and* assume malicious userspace, correct?
> > > Because in non-confidential computing case, the Linux RNG in such
> > > case will just use non-RDRAND fallbacks, no DoS will happen and we
> > > should have enough entropy that is outside of userspace control.
> >
> > Don't think about the RNG for just one second. The basic principle is
> > simpler: if you have a
> > `WARN_ON(unprivd_userspace_triggerable_condition)`, that's usually
> > considered a DoS - panic_on_warn and such.
>
> Ok, agree, you do bring a valid point that we should not create new
> DoS attack vectors from userspace in such cases.
>
> > > > And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> > > > in the first place is the trusted computing scenario, so we basically
> > > > only care about early boot, then one addendum would be to only warn if
> > > > we're in early boot, which would work because seeding via RDRAND is
> > > > attempted pretty early on in init.c.
> > >
> > > I don't think we are only concerned with initial early boot and initial seeding.
> > > What about periodic reseeding of ChaCha CSPRNG? If you don't get
> > > RDRAND/RDSEED output during this step, don't we formally lose the forward
> > > prediction resistance property of Linux RNG assuming this is the only source
> > > of entropy that is outside of attacker control?
> >
> > If you never add new material, and you have the initial seed, then
> > it's deterministic. But you still mostly can't backtrack if the state
> > leaks at some future point in time.
>
> I am not talking about backtrack resistance, i.e. when attacker learns about
> RNG state and then can recover the past output. I was talking about an attacker
> learning the RNG state at some point of time (RNG compromise) and
> then the RNG being able to recover over time from this state to a secure state using
> fresh entropy input that is outside of attacker control/observance.
> Does Linux RNG aim to provide this property? Do people care about this?
> If no one cares about this one and Linux RNG doesn't aim to provide it anyhow,
> then I agree that we should just ensure that early entropy collection includes
> RDRAND/RDSEED input for confidential VMs one way or another.

That's the first thing I mentioned -- "If you never add new material, and you have the initial seed, then it's deterministic."

The property you mention is a good one to have and Linux usually has it.

> Best Regards,
> Elena.
>
> > Jason
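A note on the property being discussed: it is often called recovery from compromise, or post-compromise security. A minimal way to state it, in illustrative notation rather than the actual random.c construction: each reseed computes

    K_{n+1} = H(K_n || E_n)

where K_n is the current (possibly attacker-known) generator state, E_n is freshly gathered entropy, and H is a preimage-resistant hash. If E_n carries enough entropy outside the attacker's view, K_{n+1} is unpredictable again and the compromise heals; if E_n is absent or fully attacker-observable, every later state remains computable from K_n, which is exactly the failure mode Elena describes.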
On 1/30/24 12:30 AM, Kirill A. Shutemov wrote:
> RDRAND and RDSEED instructions rarely fail. Ten retries should be
> sufficient to account for occasional failures.
>
> If the instruction fails more than ten times, it is likely that the
> hardware is broken or someone is attempting to exceed the rate at which
> the random number generator hardware can provide random numbers.
>
> Issue a warning if ten retries were not enough.

Did you come across a case where it fails? Wondering why add this warning now?

> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  arch/x86/include/asm/archrandom.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
> index 918c5880de9e..fc8d837fb3b9 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -13,6 +13,12 @@
>  #include <asm/processor.h>
>  #include <asm/cpufeature.h>
>
> +#ifdef KASLR_COMPRESSED_BOOT
> +#define rd_warn(msg) warn(msg)

Why not use warn_once in both cases?

> +#else
> +#define rd_warn(msg) WARN_ONCE(1, msg)
> +#endif
> +
>  #define RDRAND_RETRY_LOOPS 10
>
>  /* Unconditional execution of RDRAND and RDSEED */
> @@ -28,6 +34,9 @@ static inline bool __must_check rdrand_long(unsigned long *v)
>  		if (ok)
>  			return true;
>  	} while (--retry);
> +
> +	rd_warn("RDRAND failed\n");
> +
>  	return false;
>  }
>
> @@ -45,6 +54,8 @@ static inline bool __must_check rdseed_long(unsigned long *v)
>  			return true;
>  	} while (--retry);
>
> +	rd_warn("RDSEED failed\n");
> +
>  	return false;
>  }
On 1/30/24 05:45, Reshetova, Elena wrote:
>> You're the Intel employee so you can find out about this with much
>> more assurance than me, but I understand the sentence above to be _way
>> more_ true for RDRAND than for RDSEED. If your informed opinion is,
>> "RDRAND failing can only be due to totally broken hardware"
> No, this is not the case per Intel SDM. I think we can live under a simple
> assumption that both of these instructions can fail not just due to broken
> HW, but also due to enough pressure put into the whole DRBG construction
> that supplies random numbers via RDRAND/RDSEED.

I don't think the SDM is the right thing to look at for guidance here.

Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures to be exceedingly rare by design. If they're not, we're going to get our trusty torches and pitchforks and go after the folks who built the broken hardware.

Repeat after me:

Regular RDRAND/RDSEED failures only occur on broken hardware

If it's nice hardware that's gone bad, then we WARN() and try to make the best of it. If it turns out that WARN() was because of a broken hardware _design_ then we go sharpen the pitchforks.

Anybody disagree?
On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 1/30/24 05:45, Reshetova, Elena wrote:
> >> You're the Intel employee so you can find out about this with much
> >> more assurance than me, but I understand the sentence above to be _way
> >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> >> "RDRAND failing can only be due to totally broken hardware"
> > No, this is not the case per Intel SDM. I think we can live under a simple
> > assumption that both of these instructions can fail not just due to broken
> > HW, but also due to enough pressure put into the whole DRBG construction
> > that supplies random numbers via RDRAND/RDSEED.
>
> I don't think the SDM is the right thing to look at for guidance here.
>
> Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> to be exceedingly rare by design. If they're not, we're going to get
> our trusty torches and pitchforks and go after the folks who built the
> broken hardware.
>
> Repeat after me:
>
> Regular RDRAND/RDSEED failures only occur on broken hardware
>
> If it's nice hardware that's gone bad, then we WARN() and try to make
> the best of it. If it turns out that WARN() was because of a broken
> hardware _design_ then we go sharpen the pitchforks.
>
> Anybody disagree?

Yes, I disagree. I made a trivial test that shows RDSEED breaks easily in a busy loop. So at the very least, your statement holds true only for RDRAND.

But, anyway, if the statement "RDRAND failures only occur on broken hardware" is true, then a WARN() in the failure path there presents no DoS potential of any kind, and so that's a straightforward conclusion to this discussion. However, that really hinges on "RDRAND failures only occur on broken hardware" being a true statement.

Also, I don't know how much heavy lifting the word "regular" was doing in your original statement. Because my example shows that irregular RDSEED usage from malicious users can hinder regular users. If that also applies to RDRAND, the "regular" makes the statement not as useful for taking conclusive action here.
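Jason's test program appears at the lore link cited a few messages below. As a rough illustration of its shape, a busy loop like the following hypothetical reconstruction (not the actual program) is enough to observe RDSEED failures on many parts:

/* rdseed-busy.c: count how often RDSEED reports failure (CF=0) under
 * sustained demand. A hypothetical reconstruction of the kind of test
 * described above, not the original program.
 * Build: gcc -O2 -mrdseed rdseed-busy.c */
#include <immintrin.h>
#include <stdio.h>

int main(void)
{
	unsigned long long v, i, ok = 0, n = 10000000;

	for (i = 0; i < n; i++)
		ok += _rdseed64_step(&v);	/* returns 1 on success, 0 when CF=0 */

	printf("RDSEED success: %llu/%llu (%.1f%%)\n", ok, n, 100.0 * ok / n);
	return 0;
}

Running several instances pinned to different cores (e.g. with taskset) reproduces the multi-CPU starvation reported in the following messages.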
On 1/30/24 09:49, Jason A. Donenfeld wrote:
>> Anybody disagree?
> Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> in a busy loop. So at the very least, your statement holds true only
> for RDRAND.

Well, darn. :)

Any chance you could share some more information about the environment where you're seeing this? It'd be good to reconcile what you're seeing with how the hardware is expected to behave.
On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> > On 1/30/24 05:45, Reshetova, Elena wrote:
> > >> You're the Intel employee so you can find out about this with much
> > >> more assurance than me, but I understand the sentence above to be _way
> > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > >> "RDRAND failing can only be due to totally broken hardware"
> > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > assumption that both of these instructions can fail not just due to broken
> > > HW, but also due to enough pressure put into the whole DRBG construction
> > > that supplies random numbers via RDRAND/RDSEED.
> >
> > I don't think the SDM is the right thing to look at for guidance here.
> >
> > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > to be exceedingly rare by design. If they're not, we're going to get
> > our trusty torches and pitchforks and go after the folks who built the
> > broken hardware.
> >
> > Repeat after me:
> >
> > Regular RDRAND/RDSEED failures only occur on broken hardware
> >
> > If it's nice hardware that's gone bad, then we WARN() and try to make
> > the best of it. If it turns out that WARN() was because of a broken
> > hardware _design_ then we go sharpen the pitchforks.
> >
> > Anybody disagree?
>
> Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> in a busy loop. So at the very least, your statement holds true only
> for RDRAND.
>
> But, anyway, if the statement "RDRAND failures only occur on broken
> hardware" is true, then a WARN() in the failure path there presents no
> DoS potential of any kind, and so that's a straightforward conclusion
> to this discussion. However, that really hinges on "RDRAND failures
> only occur on broken hardware" being a true statement.

There's a useful comment here from an Intel engineer

https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed

  "RDRAND is, indeed, faster than RDSEED because it comes
   from a hardware-based pseudorandom number generator.
   One seed value (effectively, the output from one RDSEED
   command) can provide up to 511 128-bit random values
   before forcing a reseed"

We know we can exhaust RDSEED directly pretty trivially. Making your test program run in parallel across 20 cpus, I got a mere 3% success rate from RDSEED.

If RDRAND is reseeding every 511 values, RDRAND output would have to be consumed significantly faster than RDSEED in order that the reseed will happen frequently enough to exhaust the seeds.

This looks pretty hard, but maybe with a large enough CPU count this will be possible in extreme load?

So I'm not convinced we can blindly wave away RDRAND failures as guaranteed to mean broken hardware.

With regards,
Daniel
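To put rough numbers on the quoted figure: one seed yields up to 511 * 128 = 65,408 bits, i.e. about 8 KiB of RDRAND output, or 1,022 64-bit RDRAND values, before a reseed is forced (the same "1022x" amplification factor cited later in the thread). So starving RDRAND requires draining roughly three orders of magnitude more data through it than is needed to exhaust RDSEED directly, which is consistent with RDSEED failing trivially while RDRAND holds up under the same load.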
On January 30, 2024 9:58:09 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
> On 1/30/24 09:49, Jason A. Donenfeld wrote:
> >> Anybody disagree?
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
>
> Well, darn. :)
>
> Any chance you could share some more information about the environment
> where you're seeing this? It'd be good to reconcile what you're seeing
> with how the hardware is expected to behave.

What CPU is this, and could you clarify exactly how you run your busy loop?
On Tue, Jan 30, 2024 at 6:58 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 1/30/24 09:49, Jason A. Donenfeld wrote:
> >> Anybody disagree?
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
>
> Well, darn. :)
>
> Any chance you could share some more information about the environment
> where you're seeing this? It'd be good to reconcile what you're seeing
> with how the hardware is expected to behave.

That is already in this thread. Maybe catch up on the whole thing and then jump back in?

https://lore.kernel.org/all/Zbjw5hRHr_E6k18r@zx2c4.com/
On Tue, Jan 30, 2024 at 7:16 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On January 30, 2024 9:58:09 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
> > On 1/30/24 09:49, Jason A. Donenfeld wrote:
> > >> Anybody disagree?
> > > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > > in a busy loop. So at the very least, your statement holds true only
> > > for RDRAND.
> >
> > Well, darn. :)
> >
> > Any chance you could share some more information about the environment
> > where you're seeing this? It'd be good to reconcile what you're seeing
> > with how the hardware is expected to behave.
>
> What CPU is this and could you clarify exactly how you run your busy loop?

That is already in this thread. Maybe catch up on the whole thing and then jump back in?

https://lore.kernel.org/all/Zbjw5hRHr_E6k18r@zx2c4.com/
On Tue, Jan 30, 2024 at 7:06 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> So I'm not convinced we can blindly wave away RDRAND failures as
> guaranteed to mean broken hardware.

Indeed, and now I'm further disturbed by the @intel.com people on the thread making these claims that are demonstrably false.
On Tue, Jan 30, 2024 at 7:06 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > >> You're the Intel employee so you can find out about this with much
> > > >> more assurance than me, but I understand the sentence above to be _way
> > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > assumption that both of these instructions can fail not just due to broken
> > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > that supplies random numbers via RDRAND/RDSEED.
> > >
> > > I don't think the SDM is the right thing to look at for guidance here.
> > >
> > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > to be exceedingly rare by design. If they're not, we're going to get
> > > our trusty torches and pitchforks and go after the folks who built the
> > > broken hardware.
> > >
> > > Repeat after me:
> > >
> > > Regular RDRAND/RDSEED failures only occur on broken hardware
> > >
> > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > the best of it. If it turns out that WARN() was because of a broken
> > > hardware _design_ then we go sharpen the pitchforks.
> > >
> > > Anybody disagree?
> >
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
> >
> > But, anyway, if the statement "RDRAND failures only occur on broken
> > hardware" is true, then a WARN() in the failure path there presents no
> > DoS potential of any kind, and so that's a straightforward conclusion
> > to this discussion. However, that really hinges on "RDRAND failures
> > only occur on broken hardware" being a true statement.
>
> There's a useful comment here from an Intel engineer
>
> https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
>
>   "RDRAND is, indeed, faster than RDSEED because it comes
>    from a hardware-based pseudorandom number generator.
>    One seed value (effectively, the output from one RDSEED
>    command) can provide up to 511 128-bit random values
>    before forcing a reseed"
>
> We know we can exhaust RDSEED directly pretty trivially. Making your
> test program run in parallel across 20 cpus, I got a mere 3% success
> rate from RDSEED.
>
> If RDRAND is reseeding every 511 values, RDRAND output would have
> to be consumed significantly faster than RDSEED in order that the
> reseed will happen frequently enough to exhaust the seeds.
>
> This looks pretty hard, but maybe with a large enough CPU count
> this will be possible in extreme load?

So what this suggests is that the guest-guest DoS caused by looping forever (or panic-on-warn'ing) is at least possible on large enough hardware for some non-zero amount of time, depending on whatever hard-to-hit environmental factors.

Another approach would be to treat this as a hardware flaw, in that RDRAND does not provide a universally reliable interface, and so something like CoCo doesn't work with the current design, and so Intel should issue some microcode updates that give separated pools and separated rate limiting on a per-VMX ring 0 basis. Or something like that. I dunno; maybe it's unrealistic to hope Intel will repair their interface. But I think we've got to acknowledge that it's sort of broken/unreliable.

Jason
On 1/30/24 10:23, Jason A. Donenfeld wrote:
>> Any chance you could share some more information about the environment
>> where you're seeing this? It'd be good to reconcile what you're seeing
>> with how the hardware is expected to behave.
> That is already in this thread. Maybe catch up on the whole
> thing and then jump back in?
> https://lore.kernel.org/all/Zbjw5hRHr_E6k18r@zx2c4.com/

Gah, sorry about that. I can reproduce what you're seeing, and it does seem widespread. Let me do some digging and see where we got our wires crossed.
On January 30, 2024 10:05:59 AM PST, "Daniel P. Berrangé" <berrange@redhat.com> wrote:
> On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > >> You're the Intel employee so you can find out about this with much
> > > >> more assurance than me, but I understand the sentence above to be _way
> > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > assumption that both of these instructions can fail not just due to broken
> > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > that supplies random numbers via RDRAND/RDSEED.
> > >
> > > I don't think the SDM is the right thing to look at for guidance here.
> > >
> > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > to be exceedingly rare by design. If they're not, we're going to get
> > > our trusty torches and pitchforks and go after the folks who built the
> > > broken hardware.
> > >
> > > Repeat after me:
> > >
> > > Regular RDRAND/RDSEED failures only occur on broken hardware
> > >
> > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > the best of it. If it turns out that WARN() was because of a broken
> > > hardware _design_ then we go sharpen the pitchforks.
> > >
> > > Anybody disagree?
> >
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
> >
> > But, anyway, if the statement "RDRAND failures only occur on broken
> > hardware" is true, then a WARN() in the failure path there presents no
> > DoS potential of any kind, and so that's a straightforward conclusion
> > to this discussion. However, that really hinges on "RDRAND failures
> > only occur on broken hardware" being a true statement.
>
> There's a useful comment here from an Intel engineer
>
> https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
>
>   "RDRAND is, indeed, faster than RDSEED because it comes
>    from a hardware-based pseudorandom number generator.
>    One seed value (effectively, the output from one RDSEED
>    command) can provide up to 511 128-bit random values
>    before forcing a reseed"
>
> We know we can exhaust RDSEED directly pretty trivially. Making your
> test program run in parallel across 20 cpus, I got a mere 3% success
> rate from RDSEED.
>
> If RDRAND is reseeding every 511 values, RDRAND output would have
> to be consumed significantly faster than RDSEED in order that the
> reseed will happen frequently enough to exhaust the seeds.
>
> This looks pretty hard, but maybe with a large enough CPU count
> this will be possible in extreme load?
>
> So I'm not convinced we can blindly wave away RDRAND failures as
> guaranteed to mean broken hardware.
>
> With regards,
> Daniel

The general approach has been "don't credit entropy and try again on the next interrupt." We can, of course, be much more aggressive during boot.

We only need 256-512 bits for the kernel random pool to be equivalent to breaking mainstream crypto primitives, even if it is a PRNG only from that point on (which is extremely unlikely.) The Linux PRNG has a very large state, which helps buffer entropy variations.

Again, applications *should* be using /dev/[u]random as appropriate, and if they opt to use lower level primitives in user space they need to implement them correctly – there is literally nothing the kernel can do at that point.

If the probability of success is 3% per your CPU, that is still 2 bits of true entropy per invocation. However, the probability of failure after 16 loops is over 60%. I think this validates the concept of continuing to poll periodically rather than looping in time critical paths.
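Spelling out the arithmetic behind those two figures, taking the reported 3% success rate for 64-bit RDSEED invocations:

    expected entropy per invocation ~ 0.03 * 64 bits ~ 1.9 bits, i.e. about 2 bits
    P(all 16 attempts fail) = (1 - 0.03)^16 = 0.97^16 ~ 0.61, i.e. over 60%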
> On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > >> You're the Intel employee so you can find out about this with much
> > > >> more assurance than me, but I understand the sentence above to be _way
> > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > assumption that both of these instructions can fail not just due to broken
> > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > that supplies random numbers via RDRAND/RDSEED.
> > >
> > > I don't think the SDM is the right thing to look at for guidance here.
> > >
> > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > to be exceedingly rare by design. If they're not, we're going to get
> > > our trusty torches and pitchforks and go after the folks who built the
> > > broken hardware.
> > >
> > > Repeat after me:
> > >
> > > Regular RDRAND/RDSEED failures only occur on broken hardware
> > >
> > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > the best of it. If it turns out that WARN() was because of a broken
> > > hardware _design_ then we go sharpen the pitchforks.
> > >
> > > Anybody disagree?
> >
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
> >
> > But, anyway, if the statement "RDRAND failures only occur on broken
> > hardware" is true, then a WARN() in the failure path there presents no
> > DoS potential of any kind, and so that's a straightforward conclusion
> > to this discussion. However, that really hinges on "RDRAND failures
> > only occur on broken hardware" being a true statement.
>
> There's a useful comment here from an Intel engineer
>
> https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
>
>   "RDRAND is, indeed, faster than RDSEED because it comes
>    from a hardware-based pseudorandom number generator.
>    One seed value (effectively, the output from one RDSEED
>    command) can provide up to 511 128-bit random values
>    before forcing a reseed"
>
> We know we can exhaust RDSEED directly pretty trivially. Making your
> test program run in parallel across 20 cpus, I got a mere 3% success
> rate from RDSEED.
>
> If RDRAND is reseeding every 511 values, RDRAND output would have
> to be consumed significantly faster than RDSEED in order that the
> reseed will happen frequently enough to exhaust the seeds.
>
> This looks pretty hard, but maybe with a large enough CPU count
> this will be possible in extreme load?
>
> So I'm not convinced we can blindly wave away RDRAND failures as
> guaranteed to mean broken hardware.

This matches both my understanding (I do have cryptography background and understanding how cryptographic RNGs work) and official public docs that Intel published on this matter. Given that the physical entropy source is limited anyhow, and by giving enough pressure on the whole construction you should be able to make RDRAND fail because if the intermediate AES-CBC MAC extractor/conditioner is not getting its min entropy input rate, it won't produce a proper seed for AES CTR DRBG. Of course exact details/numbers can vary between different generations of Intel DRNG implementation, and the platforms where it is running on, so be careful about sticking to concrete numbers.

That said, I have taken an AR to follow up internally on what can be done to improve our situation with RDRAND/RDSEED. But I would still like to finish the discussion on what people think should be done in the meanwhile, keeping in mind that the problem is not Intel specific, despite us Intel people bringing it up for public discussion first. The old saying is still here: "Please don't shoot the messenger" )) We are actually trying to be open about these things and create a public discussion.

Best Regards,
Elena.

> With regards,
> Daniel
> --
> |: https://berrange.com      -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org       -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
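For reference, the pipeline Elena describes, simplified from Intel's DRNG software implementation guide (linked later in the thread):

    entropy source (thermal noise) -> health tests
        -> AES-CBC-MAC extractor/conditioner
            -> ENRNG output buffer -> RDSEED
            -> seeds AES-CTR DRBG  -> RDRAND

RDSEED consumers compete directly for conditioner output, while RDRAND amortizes each seed across many DRBG outputs, which is consistent with RDSEED being far easier to exhaust than RDRAND.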
On Wed, Jan 31, 2024 at 08:16:56AM +0000, Reshetova, Elena wrote:

Good morning, I hope the week is going well for everyone.

> > On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > > On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > > >
> > > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > > >> You're the Intel employee so you can find out about this with much
> > > > >> more assurance than me, but I understand the sentence above to be _way
> > > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > > assumption that both of these instructions can fail not just due to broken
> > > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > > that supplies random numbers via RDRAND/RDSEED.
> > > >
> > > > I don't think the SDM is the right thing to look at for guidance here.
> > > >
> > > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > > to be exceedingly rare by design. If they're not, we're going to get
> > > > our trusty torches and pitchforks and go after the folks who built the
> > > > broken hardware.
> > > >
> > > > Repeat after me:
> > > >
> > > > Regular RDRAND/RDSEED failures only occur on broken hardware
> > > >
> > > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > > the best of it. If it turns out that WARN() was because of a broken
> > > > hardware _design_ then we go sharpen the pitchforks.
> > > >
> > > > Anybody disagree?
> > >
> > > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > > in a busy loop. So at the very least, your statement holds true only
> > > for RDRAND.
> > >
> > > But, anyway, if the statement "RDRAND failures only occur on broken
> > > hardware" is true, then a WARN() in the failure path there presents no
> > > DoS potential of any kind, and so that's a straightforward conclusion
> > > to this discussion. However, that really hinges on "RDRAND failures
> > > only occur on broken hardware" being a true statement.
> >
> > There's a useful comment here from an Intel engineer
> >
> > https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
> >
> >   "RDRAND is, indeed, faster than RDSEED because it comes
> >    from a hardware-based pseudorandom number generator.
> >    One seed value (effectively, the output from one RDSEED
> >    command) can provide up to 511 128-bit random values
> >    before forcing a reseed"
> >
> > We know we can exhaust RDSEED directly pretty trivially. Making your
> > test program run in parallel across 20 cpus, I got a mere 3% success
> > rate from RDSEED.
> >
> > If RDRAND is reseeding every 511 values, RDRAND output would have
> > to be consumed significantly faster than RDSEED in order that the
> > reseed will happen frequently enough to exhaust the seeds.
> >
> > This looks pretty hard, but maybe with a large enough CPU count
> > this will be possible in extreme load?
> >
> > So I'm not convinced we can blindly wave away RDRAND failures as
> > guaranteed to mean broken hardware.

> This matches both my understanding (I do have cryptography
> background and understanding how cryptographic RNGs work) and
> official public docs that Intel published on this matter. Given
> that the physical entropy source is limited anyhow, and by giving
> enough pressure on the whole construction you should be able to make
> RDRAND fail because if the intermediate AES-CBC MAC extractor/
> conditioner is not getting its min entropy input rate, it won't
> produce a proper seed for AES CTR DRBG. Of course exact
> details/numbers can vary between different generations of Intel DRNG
> implementation, and the platforms where it is running on, so be
> careful about sticking to concrete numbers.
>
> That said, I have taken an AR to follow up internally on what can be
> done to improve our situation with RDRAND/RDSEED. But I would still
> like to finish the discussion on what people think should be done in
> the meanwhile, keeping in mind that the problem is not Intel
> specific, despite us Intel people bringing it up for public discussion
> first. The old saying is still here: "Please don't shoot the
> messenger" )) We are actually trying to be open about these things
> and create a public discussion.

Based on Dave Hansen's comments above, it appears that the COCO community needs to break out the oil and whetstones and hone the tips of their pitchforks.. :-)

The positive issue in all of this is that, to date, TDX hardware has not seen significant public availability. I suspect that when that happens, if this problem isn't corrected, there will be the usual flood of papers demonstrating quasi-practical lab attacks that stem from the fruits of a poisonable random number source.

The problem reproduces pretty easily, albeit on somewhat dated hardware. One of our lab machines, which reports a model name of 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz', shows a consistent failure rate of 65% for RDSEED on a single-threaded test. It gets worse when more simultaneous demand is placed on the hardware randomness source, as was demonstrated elsewhere.

Corrupted randomness has the potential to strike pretty deeply into the TDX attestation architecture, given the need to generate signed attestation reports and the SGX/TDX key generation architecture that requires 'wearout protection'. Beyond that, there is the subsequent need to conduct userspace attestation, currently by IMA as I believe is the intent, that in turn requires cryptography with undeniable integrity.

At this point, given that all this is about confidentiality, which in turn implies a trusted platform, there is only one option: panic and hard fail the boot if there is any indication that the hardware has not been able to provide sound instruction based randomness. Doing anything else breaks the 'contract' that a user is pushing a workload into a trusted/confidential environment.

RDSEED is the root of hardware instruction based randomness and its randomness comes from quantum noise across a diode junction (simplistically). The output of RDSEED drives the AES derived RDRAND randomness.

Additional clarification from inside of Intel on this is needed, but the problem would appear to be that the above infrastructure (RDSEED/RDRAND) is part of the 'Uncore' architecture, rather than being core specific. This creates an incestuous relationship across all of the cores sharing a resource that, as in the past, creates security issues. This issue was easily anticipated and foreshadowed by the demonstration of the CVE-2020-0543/CrossTalk vulnerability.

If the above interpretation is correct, a full fix should be 'straight forward', for some definition of 'straight forward'... :-)

On TDX capable hardware, the RDSEED noise source needs to come from a piece of core specific silicon. If the boot of a TDX VM is core locked, this would create an environment where a socket based sibling adversary would be unable to compromise the root of the randomness source. Once the Linux random number generator is properly seeded, the issue should be over, given that by now, everyone has agreed that a properly initialized Linux RNG cannot 'run out' of randomness.

Given that it seems pretty clear that timing and other 'noise' information in a COCO environment can't be trusted, having core specific randomness would be a win for the overall cryptographic health of VM's that are running in a COCO environment.

Once again, an attack doesn't need to be practical, only demonstrable. Once demonstrated, faith is lost in the technology; SGX clearly demonstrated that, as did the micro-architectural attacks. Both SGX and TDX operate from the notion of 'you trust Intel and the silicon', so the fix is for Intel to implement a secure silicon based source of randomness. AMD will probably need the same thing.

> Best Regards,
> Elena.

Hopefully the above is helpful in furthering these discussions.

Have a good remainder of the week.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
https://github.com/Quixote-Project
On Wed, Jan 31, 2024 at 9:17 AM Reshetova, Elena <elena.reshetova@intel.com> wrote:
> This matches both my understanding (I do have cryptography background
> and understanding how cryptographic RNGs work)
> and official public docs that Intel published on this matter.
> Given that the physical entropy source is limited anyhow, and by giving
> enough pressure on the whole construction you should be able to
> make RDRAND fail because if the intermediate AES-CBC MAC extractor/
> conditioner is not getting its min entropy input rate, it won't
> produce a proper seed for AES CTR DRBG.
> Of course exact details/numbers can vary between different generations of
> Intel DRNG implementation, and the platforms where it is running on,
> so be careful about sticking to concrete numbers.

Alright, so RDRAND is not reliable. The question for us now is: do we want RDRAND unreliability to translate to another form of unreliability elsewhere, e.g. DoS/infiniteloop/latency/WARN_ON()? Or would it be better to declare the hardware simply broken and ask Intel to fix it? (I don't know the answer to that question.)

> That said, I have taken an AR to follow up internally on what can be done
> to improve our situation with RDRAND/RDSEED.

Specifying this is an interesting question. What exactly might our requirements be for a "non-broken" RDRAND? It seems like we have two basic ones:

- One VMX (or host) context can't DoS another one.
- Ring 3 can't DoS ring 0.

I don't know whether that'd be implemented with context-tied rate limiting or more state or what. But I think, short of just making RDRAND never fail, that's basically what's needed.

Jason
> On Wed, Jan 31, 2024 at 9:17 AM Reshetova, Elena
> <elena.reshetova@intel.com> wrote:
> > This matches both my understanding (I do have cryptography background
> > and understanding how cryptographic RNGs work)
> > and official public docs that Intel published on this matter.
> > Given that the physical entropy source is limited anyhow, and by giving
> > enough pressure on the whole construction you should be able to
> > make RDRAND fail because if the intermediate AES-CBC MAC extractor/
> > conditioner is not getting its min entropy input rate, it won't
> > produce a proper seed for AES CTR DRBG.
> > Of course exact details/numbers can vary between different generations of
> > Intel DRNG implementation, and the platforms where it is running on,
> > so be careful about sticking to concrete numbers.
>
> Alright, so RDRAND is not reliable.

Correction here: "... not reliable *in theory*". Because in practice it all depends on the amount of pressure you are able to put on the overall construction, which goes into the concrete numbers I warned about. That would depend on the number of available cores, and some other platform specific factors. I will work on getting this clarified externally so that there is no confusion.

> The question for us now is: do we want RDRAND unreliability to
> translate to another form of unreliability elsewhere, e.g.
> DoS/infiniteloop/latency/WARN_ON()? Or would it be better to declare
> the hardware simply broken and ask Intel to fix it? (I don't know the
> answer to that question.)
>
> > That said, I have taken an AR to follow up internally on what can be done
> > to improve our situation with RDRAND/RDSEED.
>
> Specifying this is an interesting question. What exactly might our
> requirements be for a "non-broken" RDRAND? It seems like we have two
> basic ones:
>
> - One VMX (or host) context can't DoS another one.
> - Ring 3 can't DoS ring 0.
>
> I don't know whether that'd be implemented with context-tied rate
> limiting or more state or what. But I think, short of just making
> RDRAND never fail, that's basically what's needed.

I agree.

Best Regards,
Elena.

> Jason
On Wed, Jan 31, 2024 at 02:06:13PM +0100, Jason A. Donenfeld wrote: Hi again to everyone, beautiful day here in North Dakota. > On Wed, Jan 31, 2024 at 9:17 AM Reshetova, Elena > <elena.reshetova@intel.com> wrote: > > This matches both my understanding (I do have a cryptography background > > and an understanding of how cryptographic RNGs work) > > and the official public docs that Intel published on this matter. > > Given that the physical entropy source is limited anyhow, by putting > > enough pressure on the whole construction you should be able to > > make RDRAND fail, because if the intermediate AES-CBC MAC extractor/ > > conditioner is not getting its min-entropy input rate, it won't > > produce a proper seed for the AES CTR DRBG. > > Of course the exact details/numbers can vary between different generations of > > the Intel DRNG implementation and the platforms it is running on, > > so be careful about sticking to concrete numbers. > Alright, so RDRAND is not reliable. The question for us now is: do > we want RDRAND unreliability to translate to another form of > unreliability elsewhere, e.g. DoS/infinite loop/latency/WARN_ON()? Or > would it be better to declare the hardware simply broken and ask > Intel to fix it? (I don't know the answer to that question.) I think it would demonstrate a lack of appropriate engineering diligence on the part of our community to declare RDRAND 'busted' at this point. While it appears to be trivially easy to force RDSEED into depletion, there does not seem to be a suggestion, at least in the open literature, that this directly or easily translates into stalling output from RDRAND in any type of relevant adversarial fashion. If this were the case, given what CVEs seem to be worth on a resume, someone would have rented a cloud machine and come up with a POC against RDRAND in a multi-tenant environment and then promptly put up a website called 'Random Starve' or something equally ominous. This is no doubt secondary to the 1022x amplification factor inherent in the 'Bull Mountain' architecture. I'm a bit surprised that no one from the Intel side of this conversation pitched this over the wall as soon as this conversation came up, but I would suggest that everyone concerned about this issue give the following a thorough read: https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html Relevant highlights: - As I suggested in my earlier e-mail, random number generation is a socket-based resource, hence an adversarial domain limited to only the cores on a common socket. - There is a maximum randomness throughput rate of 800 MB/s over all cores sharing common random number infrastructure. Single thread throughput rates of 70-200 MB/s are demonstrable. - The chance of RDRAND failing over 10 re-tries is 'astronomically' small. With no definition of astronomical provided, one would assume really small, given that they are using the word astronomical. > > That said, I have taken an AR to follow up internally on what can be done > > to improve our situation with RDRAND/RDSEED. I think I can save you some time, Elena. > Specifying this is an interesting question. What exactly might our > requirements be for a "non-broken" RDRAND? It seems like we have two > basic ones: > > - One VMX (or host) context can't DoS another one. > - Ring 3 can't DoS ring 0. > > I don't know whether that'd be implemented with context-tied rate > limiting or more state or what.
> But I think, short of just making > RDRAND never fail, that's basically what's needed. I think we probably have that, for all intents and purposes, given that we embrace the following methodology: - Use RDRAND exclusively. - Be willing to take 10 swings at the plate. - Given the somewhat demanding requirements for TDX/COCO, fail and either deadlock or panic after 10 swings, since that would seem to suggest the hardware is broken, i.e. RMA time. Either deadlock or panic would be appropriate. The objective in the COCO environment is to get the person who clicked on the 'Enable Azure Confidential' checkbox, or its equivalent, on their cloud dashboard, to call the HelpDesk and ask them why their confidential application won't come up. After the user confirms to the HelpDesk that their computer is plugged in, the problem will get fixed. Either the broken hardware will be identified and idled out or the mighty sword of vengeance will be summoned down on whoever has all of the other cores on the socket pegged. Final thoughts: - RDSEED is probably a poor thing to be using. - There may be a reasonable argument that RDSEED shouldn't have been exposed above ring 0, but that ship has sailed. Brownie points moving forward for an RDsomething that is ring 0 and has guaranteed access to some amount of functionally reasonable entropy. - Intel and AMD are already doing a lot of 'special' stuff with their COCO hardware in order to defy the long-standing adage of: 'You can't have security without physical security'. Access to per-core thermal noise, as I suggested, is probably a big lift but clever engineers can probably cook up some type of fairness doctrine for randomness in TDX or SEV-SNP, given the particular importance of instruction-based randomness in COCO. - Perfection is the enemy of good. > Jason Have a good day. As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity https://github.com/Quixote-Project
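The "ten swings at the plate" policy above, reduced to a minimal userspace sketch. The function name, the abort() standing in for a kernel panic()/BUG(), and the build line are illustrative assumptions, not code from the thread:

#include <stdio.h>
#include <stdlib.h>
#include <immintrin.h>

/*
 * Rely on RDRAND alone, retry up to 10 times, and treat persistent
 * failure as broken hardware. Build with: gcc -mrdrnd rdrand10.c
 */
static unsigned long long rdrand_or_die(void)
{
	unsigned long long v;
	int i;

	for (i = 0; i < 10; i++) {
		if (_rdrand64_step(&v))	/* intrinsic returns 1 on success */
			return v;
	}
	/* Ten consecutive failures: per the discussion, assume the
	 * hardware is broken rather than limp along without entropy. */
	fprintf(stderr, "RDRAND failed 10 times in a row\n");
	abort();
}

int main(void)
{
	printf("%016llx\n", rdrand_or_die());
	return 0;
}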
On Wed, Jan 31, 2024 at 02:35:32PM -0600, Dr. Greg wrote: > I think it would demonstrate a lack of appropriate engineering > diligence on the part of our community to declare RDRAND 'busted' at > this point. > > While it appears to be trivially easy to force RDSEED into depletion, > there does not seem to be a suggestion, at least in the open > literature, that this directly or easily translates into stalling > output from RDRAND in any type of relevant adversarial fashion. > > If this were the case, given what CVEs seem to be worth on a resume, > someone would have rented a cloud machine and come up with a POC > against RDRAND in a multi-tenant environment and then promptly put up > a website called 'Random Starve' or something equally ominous. I suspect the reason why DoS attacks aren't happening in practice is because of concerns over the ability to trust RDRAND (how do you prove that the NSA didn't put a backdoor into the hardware with Intel's acquiescence --- after all, the NSA absolutely positively didn't encourage the kneecapping of WEP and absolutely didn't put a trapdoor into DUAL_EC_DRBG...) since it cannot be externally audited and verified by a third party, in contrast to the source code for the /dev/random driver or the RNG used in OpenSSL. As a result, most random number generators use RDRAND in combination with other techniques. If RDRAND is absolutely trustworthy, the extra sources won't hurt --- and if it isn't trustworthy, mixing in other sources will likely make things harder for Fort Meade. And even if these other sources might be observable for someone who can listen in on the inter-packet arrival times on the LAN (for example), it might not be so easy for an analyst sitting at their desk in Fort Meade. And once you do _that_, you don't need to necessarily loop on RDRAND, because it's one of multiple sources of entropy that are getting mixed together. Hence, even if someone drives RDRAND into depletion, if they are using getrandom(2), it's not a big deal. There's a special case with Confidential Compute VMs, since the assumption is that you want to protect against even a malicious hypervisor who could theoretically control all other sources of timing uncertainty. And so, yes, in that case, the only thing we can do is panic if RDRAND fails. - Ted
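Ted's "multiple sources" argument as a toy sketch: RDRAND is folded in opportunistically next to other inputs, so its failure degrades the pool instead of stalling it. The hash_combine-style mixer and all names here are illustrative assumptions; real generators (the kernel's input pool, OpenSSL's DRBG) hash their inputs cryptographically. Build with -mrdrnd:

#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <immintrin.h>

static uint64_t pool;

/* Illustrative diffusion step only; real pools hash their inputs. */
static void mix(uint64_t v)
{
	pool ^= v + 0x9e3779b97f4a7c15ULL + (pool << 6) + (pool >> 2);
}

int main(void)
{
	unsigned long long hw;
	struct timespec ts;

	/* Best effort: a failed RDRAND simply contributes nothing. */
	if (_rdrand64_step(&hw))
		mix(hw);

	/* Other sources of timing uncertainty keep the pool moving. */
	clock_gettime(CLOCK_MONOTONIC, &ts);
	mix((uint64_t)ts.tv_sec);
	mix((uint64_t)ts.tv_nsec);

	printf("pool: %016llx\n", (unsigned long long)pool);
	return 0;
}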
> On Wed, Jan 31, 2024 at 02:06:13PM +0100, Jason A. Donenfeld wrote: > > Hi again to everyone, beautiful day here in North Dakota. > > > On Wed, Jan 31, 2024 at 9:17 AM Reshetova, Elena > > <elena.reshetova@intel.com> wrote: > > > This matches both my understanding (I do have a cryptography background > > > and an understanding of how cryptographic RNGs work) > > > and the official public docs that Intel published on this matter. > > > Given that the physical entropy source is limited anyhow, by putting > > > enough pressure on the whole construction you should be able to > > > make RDRAND fail, because if the intermediate AES-CBC MAC extractor/ > > > conditioner is not getting its min-entropy input rate, it won't > > > produce a proper seed for the AES CTR DRBG. > > > Of course the exact details/numbers can vary between different generations of > > > the Intel DRNG implementation and the platforms it is running on, > > > so be careful about sticking to concrete numbers. > > Alright, so RDRAND is not reliable. The question for us now is: do > > we want RDRAND unreliability to translate to another form of > > unreliability elsewhere, e.g. DoS/infinite loop/latency/WARN_ON()? Or > > would it be better to declare the hardware simply broken and ask > > Intel to fix it? (I don't know the answer to that question.) > > I think it would demonstrate a lack of appropriate engineering > diligence on the part of our community to declare RDRAND 'busted' at > this point. > > While it appears to be trivially easy to force RDSEED into depletion, > there does not seem to be a suggestion, at least in the open > literature, that this directly or easily translates into stalling > output from RDRAND in any type of relevant adversarial fashion. > > If this were the case, given what CVEs seem to be worth on a resume, > someone would have rented a cloud machine and come up with a POC > against RDRAND in a multi-tenant environment and then promptly put up > a website called 'Random Starve' or something equally ominous. > > This is no doubt secondary to the 1022x amplification factor inherent in > the 'Bull Mountain' architecture. > > I'm a bit surprised that no one from the Intel side of this > conversation pitched this over the wall as soon as this > conversation came up, but I would suggest that everyone concerned > about this issue give the following a thorough read: > > https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html > > Relevant highlights: > > - As I suggested in my earlier e-mail, random number generation is a > socket-based resource, hence an adversarial domain limited to only > the cores on a common socket. > > - There is a maximum randomness throughput rate of 800 MB/s over all > cores sharing common random number infrastructure. Single thread > throughput rates of 70-200 MB/s are demonstrable. > > - The chance of RDRAND failing over 10 re-tries is 'astronomically' small. With > no definition of astronomical provided, one would assume really > small, given that they are using the word astronomical. As I said, I want to investigate this properly before stating anything. In a CoCo VM we cannot guarantee that a victim guest is able to execute this 10 re-try loop (there is also a tightness requirement listed in the official guide that is not further specified) without interruption, since all guest scheduling is under host control.
Again, this is the angle that was not present before and I want to make sure we are protected against this case. > > > > That said, I have taken an AR to follow up internally on what can be done > > > to improve our situation with RDRAND/RDSEED. > > I think I can save you some time, Elena. > > > Specifying this is an interesting question. What exactly might our > > requirements be for a "non-broken" RDRAND? It seems like we have two > > basic ones: > > > > - One VMX (or host) context can't DoS another one. > > - Ring 3 can't DoS ring 0. > > > > I don't know whether that'd be implemented with context-tied rate > > limiting or more state or what. But I think, short of just making > > RDRAND never fail, that's basically what's needed. > > I think we probably have that, for all intents and purposes, given > that we embrace the following methodology: > > - Use RDRAND exclusively. > > - Be willing to take 10 swings at the plate. > > - Given the somewhat demanding requirements for TDX/COCO, fail and > either deadlock or panic after 10 swings, since that would seem to > suggest the hardware is broken, i.e. RMA time. Again, my worry here is that a CoCo guest is not in control of its own scheduling and this might have an impact on the above statement, i.e. it might theoretically be possible to cause this without physically broken HW. Best Regards, Elena. > > Either deadlock or panic would be appropriate. The objective in the > COCO environment is to get the person who clicked on the 'Enable Azure > Confidential' checkbox, or its equivalent, on their cloud dashboard, > to call the HelpDesk and ask them why their confidential application > won't come up. > > After the user confirms to the HelpDesk that their computer is plugged > in, the problem will get fixed. Either the broken hardware will be > identified and idled out or the mighty sword of vengeance will be > summoned down on whoever has all of the other cores on the socket > pegged. > > Final thoughts: > > - RDSEED is probably a poor thing to be using. > > - There may be a reasonable argument that RDSEED shouldn't have been > exposed above ring 0, but that ship has sailed. Brownie points > moving forward for an RDsomething that is ring 0 and has guaranteed > access to some amount of functionally reasonable entropy. > > - Intel and AMD are already doing a lot of 'special' stuff with their > COCO hardware in order to defy the long-standing adage of: 'You > can't have security without physical security'. Access to per-core thermal > noise, as I suggested, is probably a big lift but clever engineers can > probably cook up some type of fairness doctrine for randomness in > TDX or SEV-SNP, given the particular importance of instruction-based > randomness in COCO. > > - Perfection is the enemy of good. > > > Jason > > Have a good day. > > As always, > Dr. Greg > > The Quixote Project - Flailing at the Travails of Cybersecurity > https://github.com/Quixote-Project
On Wed, Jan 31, 2024 at 11:47:35PM -0500, Theodore Ts'o wrote: > On Wed, Jan 31, 2024 at 02:35:32PM -0600, Dr. Greg wrote: > > I think it would demonstrate a lack of appropriate engineering > > diligence on the part of our community to declare RDRAND 'busted' at > > this point. > > > > While it appears to be trivially easy to force RDSEED into depletion, > > there does not seem to be a suggestion, at least in the open > > literature, that this directly or easily translates into stalling > > output from RDRAND in any type of relevant adversarial fashion. > > > > If this were the case, given what CVEs seem to be worth on a resume, > > someone would have rented a cloud machine and come up with a POC > > against RDRAND in a multi-tenant environment and then promptly put up > > a website called 'Random Starve' or something equally ominous. > I suspect the reason why DoS attacks aren't happening in practice is > because of concerns over the ability to trust RDRAND (how do you > prove that the NSA didn't put a backdoor into the hardware with > Intel's acquiescence --- after all, the NSA absolutely positively didn't > encourage the kneecapping of WEP and absolutely didn't put a trapdoor > into DUAL_EC_DRBG...) since it cannot be externally audited and verified > by a third party, in contrast to the source code for the /dev/random > driver or the RNG used in OpenSSL. > > As a result, most random number generators use RDRAND in combination > with other techniques. If RDRAND is absolutely trustworthy, the extra > sources won't hurt --- and if it isn't trustworthy, mixing in other > sources will likely make things harder for Fort Meade. And even if > these other sources might be observable for someone who can listen in > on the inter-packet arrival times on the LAN (for example), it might > not be so easy for an analyst sitting at their desk in Fort Meade. > > And once you do _that_, you don't need to necessarily loop on RDRAND, > because it's one of multiple sources of entropy that are getting > mixed together. Hence, even if someone drives RDRAND into depletion, > if they are using getrandom(2), it's not a big deal. All well-taken points; the Linux RNG and associated community has benefited from your and Jason's concerns and work on all of this. However, whether or not DoS attacks based on RNG depletion are happening in the wild, and the reasons they are not, are orthogonal to whether or not they can be proven to exist, which was the point I was trying to make. Demonstrating a vulnerability in something as critical as Intel's RNG implementation would be a big motivation for some research group. The fact that this hasn't occurred would seem to suggest that the RDRAND resource depletion we are concerned with is not adversarially exploitable. I suspect that the achievable socket core count cannot effectively overwhelm the 1022x amplification factor inherent in the design of the RDSEED-based seeding of RDRAND. We will see if Elena can come up with what Intel engineering's definition of 'astronomical' is... :-) > There's a special case with Confidential Compute VMs, since the > assumption is that you want to protect against even a malicious > hypervisor who could theoretically control all other sources of > timing uncertainty. And so, yes, in that case, the only thing we > can do is panic if RDRAND fails. Indeed. The bigger question, which I will respond to Elena with, is how much this issue calls the entire premise of confidential computing into question.
> - Ted Have a good day. It has been a long time since you and I were standing around with Phil Hughes in the Galleria in Atlanta arguing about whether or not Active Directory was going to dominate enterprise computing. It does seem to have gained some significant amount of traction... :-) As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity https://github.com/Quixote-Project
On Thu, Feb 01, 2024 at 07:26:15AM +0000, Reshetova, Elena wrote: Good morning to everyone. > > On Wed, Jan 31, 2024 at 02:06:13PM +0100, Jason A. Donenfeld wrote: > > > > Hi again to everyone, beautiful day here in North Dakota. > > > > > On Wed, Jan 31, 2024 at 9:17 AM Reshetova, Elena > > > <elena.reshetova@intel.com> wrote: > > > > This matches both my understanding (I do have a cryptography background > > > > and an understanding of how cryptographic RNGs work) > > > > and the official public docs that Intel published on this matter. > > > > Given that the physical entropy source is limited anyhow, by putting > > > > enough pressure on the whole construction you should be able to > > > > make RDRAND fail, because if the intermediate AES-CBC MAC extractor/ > > > > conditioner is not getting its min-entropy input rate, it won't > > > > produce a proper seed for the AES CTR DRBG. > > > > Of course the exact details/numbers can vary between different generations of > > > > the Intel DRNG implementation and the platforms it is running on, > > > > so be careful about sticking to concrete numbers. > > > > > Alright, so RDRAND is not reliable. The question for us now is: do > > > we want RDRAND unreliability to translate to another form of > > > unreliability elsewhere, e.g. DoS/infinite loop/latency/WARN_ON()? Or > > > would it be better to declare the hardware simply broken and ask > > > Intel to fix it? (I don't know the answer to that question.) > > > > I think it would demonstrate a lack of appropriate engineering > > diligence on the part of our community to declare RDRAND 'busted' at > > this point. > > > > While it appears to be trivially easy to force RDSEED into depletion, > > there does not seem to be a suggestion, at least in the open > > literature, that this directly or easily translates into stalling > > output from RDRAND in any type of relevant adversarial fashion. > > > > If this were the case, given what CVEs seem to be worth on a resume, > > someone would have rented a cloud machine and come up with a POC > > against RDRAND in a multi-tenant environment and then promptly put up > > a website called 'Random Starve' or something equally ominous. > > > > This is no doubt secondary to the 1022x amplification factor inherent in > > the 'Bull Mountain' architecture. > > > > I'm a bit surprised that no one from the Intel side of this > > conversation pitched this over the wall as soon as this > > conversation came up, but I would suggest that everyone concerned > > about this issue give the following a thorough read: > > > > https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html > > > > Relevant highlights: > > > > - As I suggested in my earlier e-mail, random number generation is a > > socket-based resource, hence an adversarial domain limited to only > > the cores on a common socket. > > > > - There is a maximum randomness throughput rate of 800 MB/s over all > > cores sharing common random number infrastructure. Single thread > > throughput rates of 70-200 MB/s are demonstrable. > > > > - The chance of RDRAND failing over 10 re-tries is 'astronomically' small. With > > no definition of astronomical provided, one would assume really > > small, given that they are using the word astronomical. > As I said, I want to investigate this properly before stating > anything.
In a CoCo VM we cannot guarantee that a victim guest is > able to execute this 10 re-try loop (there is also a tightness > requirement listed in the official guide that is not further specified) > without interruption, since all guest scheduling is under host > control. Again, this is the angle that was not present before and I > want to make sure we are protected against this case. I suspect that all of this may be the source of interesting discussions inside of Intel; see my closing question below. If nothing else, we will wait with bated breath for a definition of astronomical, if, of course, the definition of that value is unprivileged and you would be free to forward it along... :-) > > > > That said, I have taken an AR to follow up internally on what can be done > > > > to improve our situation with RDRAND/RDSEED. > > > > I think I can save you some time, Elena. > > > > > Specifying this is an interesting question. What exactly might our > > > requirements be for a "non-broken" RDRAND? It seems like we have two > > > basic ones: > > > > > > - One VMX (or host) context can't DoS another one. > > > - Ring 3 can't DoS ring 0. > > > > > > I don't know whether that'd be implemented with context-tied rate > > > limiting or more state or what. But I think, short of just making > > > RDRAND never fail, that's basically what's needed. > > > > I think we probably have that, for all intents and purposes, given > > that we embrace the following methodology: > > > > - Use RDRAND exclusively. > > > > - Be willing to take 10 swings at the plate. > > > > - Given the somewhat demanding requirements for TDX/COCO, fail and > > either deadlock or panic after 10 swings, since that would seem to > > suggest the hardware is broken, i.e. RMA time. > Again, my worry here is that a CoCo guest is not in control of its own > scheduling and this might have an impact on the above statement, > i.e. it might theoretically be possible to cause this without > physically broken HW. So all of this leaves open a very significant question that would seem to be worthy of further enlightenment from inside the bowels of Intel engineering. Our discussion has now led us to a point where there appears to be a legitimate concern that the hypervisor has such significant control over a confidential VM that the integrity of a simple re-try loop is an open question. Let us posit, for argument's sake, that confidential computing resolves down to the implementation of a trusted computing platform that in turn resolves to a requirement for competent and robust cryptography for initial and ongoing attestation, let alone confidentiality in the face of possible side-channel and timing attacks. I'm sure there would be a great deal of interest in any information that can be provided on whether this scenario is possible, given the level of control it is being suggested a hypervisor would enjoy over an ostensibly confidential and trusted guest. > Best Regards, > Elena. Have a good day. As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity https://github.com/Quixote-Project
On Thu, Feb 01, 2024 at 03:54:51AM -0600, Dr. Greg wrote: > I suspect that the achievable socket core count cannot effectively > overwhelm the 1022x amplification factor inherent in the design of the > RDSEED-based seeding of RDRAND. In testing I could get RDSEED down to < 3% success rate when running on 20 cores in parallel on a laptop-class i7. If that failure rate can be improved by a little more than one order of magnitude to 0.1%, we're starting to get to the point where it might be enough to make RDRAND re-seed fail. Intel's Sierra Forest CPUs are said to have a variant with 288 cores per socket, which is an order of magnitude larger. It is conceivable this might be large enough to demonstrate RDRAND failure under extreme load. Then again, who knows what else has changed that might alter the equation; maybe the DRBG is also better / faster. Only real-world testing can say for sure. One thing is certain though: core counts per socket keep going up, so the potential worst-case load on RDSEED will increase... > We will see if Elena can come up with what Intel engineering's > definition of 'astronomical' is... :-) > > > There's a special case with Confidential Compute VMs, since the > > assumption is that you want to protect against even a malicious > > hypervisor who could theoretically control all other sources of > > timing uncertainty. And so, yes, in that case, the only thing we > > can do is panic if RDRAND fails. > > Indeed. > > The bigger question, which I will respond to Elena with, is how much > this issue calls the entire premise of confidential computing into > question. A denial of service (from a panic on RDRAND fail) doesn't undermine confidential computing. Guest data confidentiality is maintained by panicking on RDRAND failure, and DoS protection isn't a threat that CC claims to be able to mitigate in general. With regards, Daniel
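For reference, the "1022x amplification factor" cited earlier in the thread falls straight out of the figure Daniel quotes (up to 511 128-bit values per seed), under the assumption of 64-bit RDRAND consumers:

\[
\frac{511 \times 128\ \text{bits}}{64\ \text{bits per output}} = 1022\ \text{64-bit RDRAND outputs per successful reseed,}
\]

so, as a back-of-envelope bound, aggregate RDRAND draw has to outrun successful reseeds by roughly three orders of magnitude before the DRBG can be starved.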
On Thu, Feb 01, 2024 at 11:08:09AM +0000, Daniel P. Berrangé wrote: Hi Dan, thanks for the thoughts. > On Thu, Feb 01, 2024 at 03:54:51AM -0600, Dr. Greg wrote: > > I suspect that the achievable socket core count cannot effectively > > overwhelm the 1022x amplification factor inherent in the design of the > > RDSEED-based seeding of RDRAND. > In testing I could get RDSEED down to < 3% success rate when > running on 20 cores in parallel on a laptop-class i7. If that > failure rate can be improved by a little more than one order > of magnitude to 0.1%, we're starting to get to the point where > it might be enough to make RDRAND re-seed fail. > > Intel's Sierra Forest CPUs are said to have a variant with 288 > cores per socket, which is an order of magnitude larger. It is > conceivable this might be large enough to demonstrate RDRAND > failure under extreme load. Then again, who knows what else has > changed that might alter the equation; maybe the DRBG is also > better / faster. Only real-world testing can say for sure. > One thing is certain though: core counts per socket keep going > up, so the potential worst-case load on RDSEED will increase... Indeed, that would seem to be the important and operative question that Intel could answer; maybe Dave and Elena will be able to provide some guidance. Until someone can actually demonstrate a sustained RDRAND depletion attack we don't have an issue, only a lot of wringing of hands and other handwaving on what we should do. The thing that intrigues me is that we have two AMD engineers following this; do you guys have any comments or reflections? Unless I misunderstand, SEV-SNP has the same challenges and issues. As of late you guys have been delivering higher core counts that would make your platform more susceptible. Does your hardware design not have a socket-common RNG architecture that makes RDSEED vulnerable to socket adversarial depletion? Is this a complete non-issue in practice? Big opportunity here to proclaim: "Just buy AMD"... :-) > > We will see if Elena can come up with what Intel engineering's > > definition of 'astronomical' is... :-) > > > > > There's a special case with Confidential Compute VMs, since the > > > assumption is that you want to protect against even a malicious > > > hypervisor who could theoretically control all other sources of > > > timing uncertainty. And so, yes, in that case, the only thing we > > > can do is panic if RDRAND fails. > > > > Indeed. > > > > The bigger question, which I will respond to Elena with, is how much > > this issue calls the entire premise of confidential computing into > > question. > A denial of service (from a panic on RDRAND fail) doesn't undermine > confidential computing. Guest data confidentiality is maintained by > panicking on RDRAND failure, and DoS protection isn't a threat that CC > claims to be able to mitigate in general. Yes, if there is a problem with RDRAND we have a CoCo solution, full stop. The issue that I was raising with Elena is more generic, to wit: Her expressed concern is that a code construct looking something like this, rdrand() returning 0 on success:

	for (i = 0; i < 10; ++i) {
		if (!rdrand(&seed))
			break;
		sleep(some_time);
	}
	if (i == 10)
		BUG("No entropy");

	do_something_with(seed);

could be sufficiently manipulated by a malicious hypervisor in a TDX environment so as to compromise its functionality.
If this level of control is indeed possible, given the long history of timing and side-channel attacks against cryptography, this would seem to pose significant questions as to whether or not CoCo can deliver on its stated goals. > With regards, > Daniel Have a good evening. As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity https://github.com/Quixote-Project
> > > The bigger question, which I will respond to Elena with, is how much > > > this issue calls the entire premise of confidential computing into > > > question. > > > A denial of service (from a panic on RDRAND fail) doesn't undermine > > confidential computing. Guest data confidentiality is maintained by > > panicking on RDRAND failure, and DoS protection isn't a threat that CC > > claims to be able to mitigate in general. > > Yes, if there is a problem with RDRAND we have a CoCo solution, full > stop. > > The issue that I was raising with Elena is more generic, to wit: > > Her expressed concern is that a code construct looking something like this, > rdrand() returning 0 on success: >
> for (i = 0; i < 10; ++i) {
>         if (!rdrand(&seed))
>                 break;
>         sleep(some_time);
> }
> if (i == 10)
>         BUG("No entropy");
>
> do_something_with(seed);
>
> could be sufficiently manipulated by a malicious hypervisor in a TDX > environment so as to compromise its functionality. This is not what I had in mind. How can the above be manipulated by a malicious hypervisor? If the above construction can be logically manipulated, we have other issues than rdrand; this is imo already a control-flow manipulation attack that you are stating here. What a malicious hypervisor can *in theory* do is to insert execution delays and make the above loop fail, even if we assume that the probability of failing the 10-retry loop is negligible in normal cases (assuming tightness or other timing requirements). But again, this is theoretical at this point. But if the SW refuses to proceed and panics in such cases, we have a DoS, as we already discussed. Best Regards, Elena. > > If this level of control is indeed possible, given the long history of > timing and side-channel attacks against cryptography, this would seem > to pose significant questions as to whether or not CoCo can deliver on > its stated goals. > > > With regards, > > Daniel > > Have a good evening. > > As always, > Dr. Greg > > The Quixote Project - Flailing at the Travails of Cybersecurity > https://github.com/Quixote-Project
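One illustrative reading of that "tightness" idea (the Intel guide leaves it unspecified, so the threshold and logic below are invented for illustration, and in a CoCo guest the trustworthiness of the clocksource is itself part of the question):

#include <stdbool.h>
#include <time.h>
#include <immintrin.h>

/* Count only "tight" retries: if one attempt took implausibly long,
 * assume we were descheduled and don't charge it against the budget.
 */
static bool rdrand_tight(unsigned long long *out)
{
	struct timespec a, b;
	int tries = 0;

	while (tries < 10) {
		clock_gettime(CLOCK_MONOTONIC, &a);
		if (_rdrand64_step(out))
			return true;
		clock_gettime(CLOCK_MONOTONIC, &b);
		if ((b.tv_sec - a.tv_sec) * 1000000000L +
		    (b.tv_nsec - a.tv_nsec) > 1000000L)	/* > 1 ms: not tight */
			continue;	/* retry without counting it */
		tries++;
	}
	return false;	/* 10 genuinely back-to-back failures */
}

Note the trade-off this sketch makes explicit: refusing to count stretched retries converts a spurious panic into a potentially unbounded loop under sustained host interference, which is exactly the DoS-versus-panic dilemma under discussion.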
On Wed, Jan 31, 2024 at 08:16:56AM +0000, Reshetova, Elena wrote: Good evening, I hope the week has started well for everyone. > > On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote: > > > On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote: > > > > > > > > On 1/30/24 05:45, Reshetova, Elena wrote: > > > > >> You're the Intel employee so you can find out about this with much > > > > >> more assurance than me, but I understand the sentence above to be _way > > > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is, > > > > >> "RDRAND failing can only be due to totally broken hardware" > > > > > No, this is not the case per the Intel SDM. I think we can live under a simple > > > > > assumption that both of these instructions can fail not just due to broken > > > > > HW, but also due to enough pressure put on the whole DRBG construction > > > > > that supplies random numbers via RDRAND/RDSEED. > > > > > > > > I don't think the SDM is the right thing to look at for guidance here. > > > > > > > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures > > > > to be exceedingly rare by design. If they're not, we're going to get > > > > our trusty torches and pitchforks and go after the folks who built the > > > > broken hardware. > > > > > > > > Repeat after me: > > > > > > > > Regular RDRAND/RDSEED failures only occur on broken hardware > > > > > > > > If it's nice hardware that's gone bad, then we WARN() and try to make > > > > the best of it. If it turns out that WARN() was because of a broken > > > > hardware _design_ then we go sharpen the pitchforks. > > > > > > > > Anybody disagree? > > > > > > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily > > > in a busy loop. So at the very least, your statement holds true only > > > for RDRAND. > > > > > > But, anyway, if the statement "RDRAND failures only occur on broken > > > hardware" is true, then a WARN() in the failure path there presents no > > > DoS potential of any kind, and so that's a straightforward conclusion > > > to this discussion. However, that really hinges on "RDRAND failures > > > only occur on broken hardware" being a true statement. > > > > There's a useful comment here from an Intel engineer > > > > https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed > > > > "RDRAND is, indeed, faster than RDSEED because it comes > > from a hardware-based pseudorandom number generator. > > One seed value (effectively, the output from one RDSEED > > command) can provide up to 511 128-bit random values > > before forcing a reseed" > > > > We know we can exhaust RDSEED directly pretty trivially. Making your > > test program run in parallel across 20 CPUs, I got a mere 3% success > > rate from RDSEED. > > > > If RDRAND is reseeding every 511 values, RDRAND output would have > > to be consumed significantly faster than RDSEED in order for the > > reseed to happen frequently enough to exhaust the seeds. > > > > This looks pretty hard, but maybe with a large enough CPU count > > this will be possible under extreme load? > > > > So I'm not convinced we can blindly wave away RDRAND failures as > > guaranteed to mean broken hardware. > This matches both my understanding (I do have a cryptography > background and an understanding of how cryptographic RNGs work) and > the official public docs that Intel published on this matter.
Given > that the physical entropy source is limited anyhow, by putting > enough pressure on the whole construction you should be able to make > RDRAND fail, because if the intermediate AES-CBC MAC extractor/ > conditioner is not getting its min-entropy input rate, it won't > produce a proper seed for the AES CTR DRBG. Of course the exact > details/numbers can vary between different generations of the Intel DRNG > implementation and the platforms it is running on, so be > careful about sticking to concrete numbers. In the spirit of that philosophy we proffer the response below. > That said, I have taken an AR to follow up internally on what can be > done to improve our situation with RDRAND/RDSEED. But I would still > like to finish the discussion on what people think should be done in > the meantime, keeping in mind that the problem is not Intel-specific, > despite us Intel people bringing it up for public discussion > first. The old saying still applies: "Please don't shoot the > messenger" )) We are actually trying to be open about these things > and create a public discussion. Actually, I now believe there is clear evidence that the problem is indeed Intel-specific. In light of our testing, it will be interesting to see what your 'AR' returns with respect to an official response from Intel engineering on this issue. One of the very bright young engineers collaborating on Quixote, who has been following this conversation, took it upon himself to do some very methodical engineering analysis on this issue. I'm the messenger but this is very much his work product. Executive summary is as follows: - No RDRAND depletion failures were observable with either the Intel or AMD hardware that was load tested. - RDSEED depletion is an Intel-specific issue; AMD's RDSEED implementation could not be provoked into failure. - AMD's RDRAND/RDSEED implementation is significantly slower than Intel's. Here are the engineer's lab notes verbatim:

---------------------------------------------------------------------------
I tested both the single-threaded and OMP-multithreaded versions of the RDSEED/RDRAND depletion loop on each of the machines below.

AMD: 2X AMD EPYC 7713 (Milan) 64-Core Processor @ 2.0 GHz, 128 physical cores total
Intel: 2X Intel Xeon Gold 6140 (Skylake) 18-Core Processor @ 2.3 GHz, 36 physical cores total

Single-threaded results:

Test case: 1,000,000 iterations each for RDRAND and RDSEED, n=100 tests, single-threaded.

AMD: 100% success rate for both RDRAND and RDSEED for all tests, runtime 0.909-1.055s (min-max).
Intel: 100% success rate for RDRAND for all tests, 20.01-20.12% (min-max) success rate for RDSEED, runtime 0.256-0.281s (min-max)

OMP multithreaded results:

Test case: 1,000,000 iterations per thread, for both RDRAND and RDSEED, n=100 tests, OMP multithreaded with OMP_NUM_THREADS=<total physical cores> (i.e. 128 for AMD and 36 for Intel)

AMD: 100% success rate for both RDRAND and RDSEED for all tests, runtime 47.229-47.603s (min-max).
Intel: 100% success rate for RDRAND for all tests, 1.77-5.62% (min-max) success rate for RDSEED, runtime 0.562-0.595s (min-max)

CONCLUSION

RDSEED failure was reproducibly induced on the Intel Skylake platform, for both single- and multithreaded tests, whereas RDSEED failure could not be induced on the AMD platform for either test. RDRAND did not fail on either platform for either test. AMD execution time was roughly 4x slower than Intel (1s vs 0.25s) for the single-threaded test, and almost 100x slower than Intel (47s vs 0.5s) for the multithreaded test.
The difference in clock rates (2.0 GHz for AMD vs 2.3 GHz for Intel) is not sufficient to explain these runtime differences. So it seems likely that AMD is gating the rate at which a new RDSEED value can be requested.
---------------------------------------------------------------------------

Speaking now with my voice: Unless additional information shows up, despite our collective handwringing, as long as the RDRAND instruction is used as the cryptographic primitive, there appears to be little likelihood of a DoS randomness attack against a TDX-based CoCo virtual machine. While it is highly unlikely we will ever get an 'official' readout on this issue, I suspect there is a high probability that Intel engineering favored performance with their RDSEED/RDRAND implementation. AMD 'appears', and without engineering feedback from AMD I would emphasize the notion of 'appears', to have embraced the principle of taking steps to eliminate the possibility of a socket-based adversary attack against their RNG infrastructure. > Elena. Hopefully the above is useful for everyone interested in this issue. Once again, a thank you to our version of 'Sancho' for his legwork on this, who has also read Cervantes at length... :-) Have a good remainder of the week. As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity https://github.com/Quixote-Project
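The lab notes do not include their test source; a minimal single-threaded loop of the kind they describe might look like the following (the iteration count is from the notes; everything else is an assumed reconstruction, not the engineer's actual code). Build with gcc -mrdrnd -mrdseed:

#include <stdio.h>
#include <immintrin.h>

#define ITERS 1000000	/* "1,000,000 iterations each", per the notes */

int main(void)
{
	unsigned long long v;
	unsigned int i, ok_seed = 0, ok_rand = 0;

	for (i = 0; i < ITERS; i++)
		ok_seed += !!_rdseed64_step(&v);	/* 1 on success */
	for (i = 0; i < ITERS; i++)
		ok_rand += !!_rdrand64_step(&v);

	printf("RDSEED: %.2f%%  RDRAND: %.2f%%\n",
	       ok_seed * 100.0 / ITERS, ok_rand * 100.0 / ITERS);
	return 0;
}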
On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote: > > Actually, I now believe there is clear evidence that the problem is > indeed Intel-specific. In light of our testing, it will be > interesting to see what your 'AR' returns with respect to an official > response from Intel engineering on this issue. > > One of the very bright young engineers collaborating on Quixote, who > has been following this conversation, took it upon himself to do some > very methodical engineering analysis on this issue. I'm the messenger > but this is very much his work product. > > Executive summary is as follows: > > - No RDRAND depletion failures were observable with either the Intel > or AMD hardware that was load tested. > > - RDSEED depletion is an Intel-specific issue; AMD's RDSEED > implementation could not be provoked into failure. My colleague ran a multithreaded parallel stress test program on his 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in RDSEED. > - AMD's RDRAND/RDSEED implementation is significantly slower than > Intel's. Yes, we also noticed the AMD impl is horribly slow compared to Intel, had to cut test iterations x100 With regards, Daniel
On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrangé wrote: Good morning to everyone. > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote: > > > > Actually, I now believe there is clear evidence that the problem is > > indeed Intel-specific. In light of our testing, it will be > > interesting to see what your 'AR' returns with respect to an official > > response from Intel engineering on this issue. > > > > One of the very bright young engineers collaborating on Quixote, who > > has been following this conversation, took it upon himself to do some > > very methodical engineering analysis on this issue. I'm the messenger > > but this is very much his work product. > > > > Executive summary is as follows: > > > > - No RDRAND depletion failures were observable with either the Intel > > or AMD hardware that was load tested. > > > > - RDSEED depletion is an Intel-specific issue; AMD's RDSEED > > implementation could not be provoked into failure. > My colleague ran a multithreaded parallel stress test program on his > 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in > RDSEED. Interesting datapoint, thanks for forwarding it along; so the issue shows up on at least some AMD platforms as well. On the 18 core/socket Intel Skylake platform, the parallelized depletion test forces RDSEED success rates down to around 2%. It would appear that your tests suggest that the AMD platform fares better than the Intel platform. So this is turning into even more of a morass, given that RDSEED depletion on AMD may be a function of the micro-architecture the platform is based on. The other variable is that our AMD test platform had a substantially higher core count per socket; one would assume that would result in higher depletion rates, if the operative theory of socket-common RNG infrastructure is valid. Unless AMD engineering understands the problem and has taken some type of action on higher core count systems to address the issue. Of course, the other variable may be how the parallelized stress test is conducted. If you would like to share your implementation source we could give it a twirl on the systems we have access to. The continuing operative question is whether or not any of this ever leads to an RDRAND failure. We've conducted some additional tests on the Intel platform where RDSEED depletion was driven as low as possible, ~1-2% success rates, while RDRAND depletion tests were being run simultaneously. No RDRAND failures have been noted. So the operative question remains: why worry about this if RDRAND is used as the randomness primitive? We haven't seen anything out of Intel yet on this; maybe AMD has a quantifying definition for 'astronomical' when it comes to RDRAND failures. The silence appears to be deafening out of the respective engineering camps... :-) > > - AMD's RDRAND/RDSEED implementation is significantly slower than > > Intel's. > Yes, we also noticed the AMD impl is horribly slow compared to > Intel, had to cut test iterations x100. The operative question is the impact of 'slow', in the absence of artificial stress tests. It would seem that a major question is what are or were the engineering thought processes on the throughput of the hardware randomness instructions. Intel documents the following randomness throughput rates: RDSEED: 3 Gbit/second RDRAND: 6.4 Gbit/second If there is the possibility of over-harvesting randomness, why not design the implementations to be clamped at some per-core value such as a megabit/second.
In the case of the documented RDSEED generation rates, that would allow the servicing of 3222 cores, if my math at 0530 in the morning is correct. Would a core need more than 128 kilobytes of randomness, i.e. one second of output, to effectively seed a random number generator? A cynical conclusion would suggest engineering acquiescing to marketing demands... :-) > With regards, > Daniel Have a good day. As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity https://github.com/Quixote-Project
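As a units check on that 0530 arithmetic (the binary-versus-decimal prefix convention is the assumption here): the 3,222-core figure corresponds to reading the spec'd RDSEED rate in binary gigabits,

\[
\frac{3 \times 2^{30}\ \text{bit/s}}{10^{6}\ \text{bit/s per core}} \approx 3221, \qquad \text{versus} \qquad \frac{3 \times 10^{9}}{10^{6}} = 3000
\]

in purely decimal units. Either way, a 1 Mbit/s clamp is roughly the quoted 128 kilobytes of output per second per core (\(2^{20}\) bits = 128 KiB).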
On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote: > On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrangé wrote: > > Good morning to everyone. > > > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote: > > > > > > Actually, I now believe there is clear evidence that the problem is > > > indeed Intel-specific. In light of our testing, it will be > > > interesting to see what your 'AR' returns with respect to an official > > > response from Intel engineering on this issue. > > > > > > One of the very bright young engineers collaborating on Quixote, who > > > has been following this conversation, took it upon himself to do some > > > very methodical engineering analysis on this issue. I'm the messenger > > > but this is very much his work product. > > > > > > Executive summary is as follows: > > > > > > - No RDRAND depletion failures were observable with either the Intel > > > or AMD hardware that was load tested. > > > > > > - RDSEED depletion is an Intel-specific issue; AMD's RDSEED > > > implementation could not be provoked into failure. > > > My colleague ran a multithreaded parallel stress test program on his > > 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in > > RDSEED. > > Interesting datapoint, thanks for forwarding it along; so the issue > shows up on at least some AMD platforms as well. > > On the 18 core/socket Intel Skylake platform, the parallelized > depletion test forces RDSEED success rates down to around 2%. It > would appear that your tests suggest that the AMD platform fares > better than the Intel platform. Yes, given the speed of the AMD RDRAND/RDSEED ops, compared to my Intel test platforms, their DRBG looks better able to keep up with the demand for bits. > Of course, the other variable may be how the parallelized stress test > is conducted. If you would like to share your implementation source > we could give it a twirl on the systems we have access to. It is just Jason's earlier test program, but moved into one thread for each core....

$ cat cpurngstress.c
#include <stdio.h>
#include <immintrin.h>
#include <pthread.h>
#include <unistd.h>

/*
 * Gives about 25 seconds wallclock time on my Alderlake CPU
 *
 * Probably want to reduce this x10, or possibly even x100
 * on AMD due to much slower ops.
 */
#define MAX_ITER 10000000

#define MAX_CPUS 4096

void *doit(void *f)
{
    unsigned long long rand;
    unsigned int i, success_rand = 0, success_seed = 0;

    for (i = 0; i < MAX_ITER; ++i) {
        success_seed += !!_rdseed64_step(&rand);
    }
    for (i = 0; i < MAX_ITER; ++i) {
        success_rand += !!_rdrand64_step(&rand);
    }

    fprintf(stderr,
            "RDRAND: %.2f%%, RDSEED: %.2f%%\n",
            success_rand * 100.0 / MAX_ITER,
            success_seed * 100.0 / MAX_ITER);

    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t th[MAX_CPUS];
    int nproc = sysconf(_SC_NPROCESSORS_ONLN);
    if (nproc > MAX_CPUS) {
        nproc = MAX_CPUS;
    }
    fprintf(stderr, "Stressing RDRAND/RDSEED across %d CPUs\n", nproc);

    for (int i = 0; i < nproc; i++) {
        pthread_create(&th[i], NULL, doit, NULL);
    }

    for (int i = 0; i < nproc; i++) {
        pthread_join(th[i], NULL);
    }

    return 0;
}

$ gcc -march=native -o cpurngstress cpurngstress.c

> If there is the possibility of over-harvesting randomness, why not > design the implementations to be clamped at some per-core value such > as a megabit/second. In the case of the documented RDSEED generation > rates, that would allow the servicing of 3222 cores, if my math at > 0530 in the morning is correct. > > Would a core need more than 128 kilobytes of randomness, i.e.
one second of output, to effectively seed a random number generator? > > A cynical conclusion would suggest engineering acquiescing to marketing > demands... :-) My assumption is that it was simply easier to not implement a rate-limiting feature at the CPU level and punt the starvation problem to software :-) With regards, Daniel
On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote: > On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrangé wrote: > > Good morning to everyone. > > > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote: > > > > > > Actually, I now believe there is clear evidence that the problem is > > > indeed Intel-specific. In light of our testing, it will be > > > interesting to see what your 'AR' returns with respect to an official > > > response from Intel engineering on this issue. > > > > > > One of the very bright young engineers collaborating on Quixote, who > > > has been following this conversation, took it upon himself to do some > > > very methodical engineering analysis on this issue. I'm the messenger > > > but this is very much his work product. > > > > > > Executive summary is as follows: > > > > > > - No RDRAND depletion failures were observable with either the Intel > > > or AMD hardware that was load tested. > > > > > > - RDSEED depletion is an Intel-specific issue; AMD's RDSEED > > > implementation could not be provoked into failure. > > > My colleague ran a multithreaded parallel stress test program on his > > 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in > > RDSEED. > > Interesting datapoint, thanks for forwarding it along; so the issue > shows up on at least some AMD platforms as well. I got access to a couple more AMD machines. An EPYC 24core/2HT (Zen-1 uarch) and an EPYC 2socket/16core/2HT (Zen-3 uarch). Both of these show 100% success with RDSEED. So there's clearly some variance across AMD SKUs. Perhaps this is an EPYC vs Ryzen distinction, with the server-focused EPYCs able to sustain RDSEED. With regards, Daniel
On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote: > The silence appears to be deafening out of the respective engineering > camps... :-) I usually wait for those threads to "relax" themselves first. :) So, what do you wanna know?
On February 6, 2024 4:04:45 AM PST, "Dr. Greg" <greg@enjellic.com> wrote: >On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrangé wrote: > >Good morning to everyone. > >> On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote: >> > >> > Actually, I now believe there is clear evidence that the problem is >> > indeed Intel-specific. In light of our testing, it will be >> > interesting to see what your 'AR' returns with respect to an official >> > response from Intel engineering on this issue. >> > >> > One of the very bright young engineers collaborating on Quixote, who >> > has been following this conversation, took it upon himself to do some >> > very methodical engineering analysis on this issue. I'm the messenger >> > but this is very much his work product. >> > >> > Executive summary is as follows: >> > >> > - No RDRAND depletion failures were observable with either the Intel >> > or AMD hardware that was load tested. >> > >> > - RDSEED depletion is an Intel-specific issue; AMD's RDSEED >> > implementation could not be provoked into failure. > >> My colleague ran a multithreaded parallel stress test program on his >> 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in >> RDSEED. > >Interesting datapoint, thanks for forwarding it along; so the issue >shows up on at least some AMD platforms as well. > >On the 18 core/socket Intel Skylake platform, the parallelized >depletion test forces RDSEED success rates down to around 2%. It >would appear that your tests suggest that the AMD platform fares >better than the Intel platform. > >So this is turning into even more of a morass, given that RDSEED >depletion on AMD may be a function of the micro-architecture the >platform is based on. The other variable is that our AMD test >platform had a substantially higher core count per socket; one would >assume that would result in higher depletion rates, if the operative >theory of socket-common RNG infrastructure is valid. > >Unless AMD engineering understands the problem and has taken some type >of action on higher core count systems to address the issue. > >Of course, the other variable may be how the parallelized stress test >is conducted. If you would like to share your implementation source >we could give it a twirl on the systems we have access to. > >The continuing operative question is whether or not any of this ever >leads to an RDRAND failure. > >We've conducted some additional tests on the Intel platform where >RDSEED depletion was driven as low as possible, ~1-2% success rates, >while RDRAND depletion tests were being run simultaneously. No RDRAND >failures have been noted. > >So the operative question remains: why worry about this if RDRAND is >used as the randomness primitive? > >We haven't seen anything out of Intel yet on this; maybe AMD has a >quantifying definition for 'astronomical' when it comes to RDRAND >failures. > >The silence appears to be deafening out of the respective engineering >camps... :-) > >> > - AMD's RDRAND/RDSEED implementation is significantly slower than >> > Intel's. > >> Yes, we also noticed the AMD impl is horribly slow compared to >> Intel, had to cut test iterations x100. > >The operative question is the impact of 'slow', in the absence of >artificial stress tests. > >It would seem that a major question is what are or were the >engineering thought processes on the throughput of the hardware >randomness instructions.
> >Intel documents the following randomness throughput rates: > >RDSEED: 3 Gbit/second >RDRAND: 6.4 Gbit/second > >If there is the possibility of over-harvesting randomness, why not >design the implementations to be clamped at some per-core value such >as a megabit/second. In the case of the documented RDSEED generation >rates, that would allow the servicing of 3222 cores, if my math at >0530 in the morning is correct. > >Would a core need more than 128 kilobytes of randomness, i.e. one >second of output, to effectively seed a random number generator? > >A cynical conclusion would suggest engineering acquiescing to marketing >demands... :-) > >> With regards, >> Daniel > >Have a good day. > >As always, >Dr. Greg > >The Quixote Project - Flailing at the Travails of Cybersecurity > https://github.com/Quixote-Project You do realize, right, that the "deafening silence" is due to the need for research and discussions on our part, and presumably AMD's. In addition, quite frankly, your rather abusive language isn't exactly encouraging people to speak publicly based on immediately available and therefore inherently incomplete and/or dated information, meaning that we have had to take even what discussions we might have been able to have in public without IP concerns behind the scenes. Yes, we work for Intel. No, we don't know every detail about every Intel chip ever created off the top of our heads, nor do we necessarily know the exact person that is *currently* in charge of the architecture of a particular unit, nor is it necessarily true that even *that* person knows all the exact properties of the behavior of their unit when integrated into a particular SoC, as units are modular by design.
On Tue, Feb 06, 2024 at 01:00:03PM +0000, Daniel P. Berrangé wrote: Good morning. > On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote: > > On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrangé wrote: > > > > Good morning to everyone. > > > > > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote: > > > > > > > > Actually, I now believe there is clear evidence that the problem is > > > > indeed Intel-specific. In light of our testing, it will be > > > > interesting to see what your 'AR' returns with respect to an official > > > > response from Intel engineering on this issue. > > > > > > > > One of the very bright young engineers collaborating on Quixote, who > > > > has been following this conversation, took it upon himself to do some > > > > very methodical engineering analysis on this issue. I'm the messenger > > > > but this is very much his work product. > > > > > > > > Executive summary is as follows: > > > > > > > > - No RDRAND depletion failures were observable with either the Intel > > > > or AMD hardware that was load tested. > > > > > > > > - RDSEED depletion is an Intel-specific issue; AMD's RDSEED > > > > implementation could not be provoked into failure. > > > > > My colleague ran a multithreaded parallel stress test program on his > > > 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in > > > RDSEED. > > > > Interesting datapoint, thanks for forwarding it along; so the issue > > shows up on at least some AMD platforms as well. > > > > On the 18 core/socket Intel Skylake platform, the parallelized > > depletion test forces RDSEED success rates down to around 2%. It > > would appear that your tests suggest that the AMD platform fares > > better than the Intel platform. > Yes, given the speed of the AMD RDRAND/RDSEED ops, compared to my > Intel test platforms, their DRBG looks better able to keep up with > the demand for bits. We now believe the observed resiliency of AMD's RNG infrastructure comes down to the fact that the completion times of their RNG instructions are significantly slower than Intel's. SkyLake and KabyLake instruction completion times are documented at 463 clock cycles, regardless of operand size. AMD Ryzen documents variable completion times based on operand size. 16- and 32-bit transfers complete in 1200 clock cycles, with 64-bit requests completing in 2500 clock cycles. Given that Jason's test program was issuing 64-bit RNG requests, the AMD platforms are going to be approximately 5.4 times slower than Intel platforms on a cycle-count basis, before the results are corrected for CPU clock rates. AMD's entropy source is execution jitter time over a bank of inverter-based ring oscillators, presumably sampled by a constant-clock-rate sampler. Slower instruction retirement times consume less of the constant-rate entropy production. Intel uses thermal/quantum noise across a diode junction retrieved by a self-clocked sampler. Faster instruction retirement translates into increased bandwidth demands on the sampler. > > Of course, the other variable may be how the parallelized stress test > > is conducted. If you would like to share your implementation source > > we could give it a twirl on the systems we have access to. > > It is just Jason's earlier test program, but moved into one thread > for each core....
>
> $ cat cpurngstress.c
> #include <stdio.h>
> #include <immintrin.h>
> #include <pthread.h>
> #include <unistd.h>
>
> /*
>  * Gives about 25 seconds wallclock time on my Alderlake CPU
>  *
>  * Probably want to reduce this x10, or possibly even x100
>  * on AMD due to much slower ops.
>  */
> #define MAX_ITER 10000000
>
> #define MAX_CPUS 4096
>
> void *doit(void *f) {
>     unsigned long long rand;
>     unsigned int i, success_rand = 0, success_seed = 0;
>
>     for (i = 0; i < MAX_ITER; ++i) {
>         success_seed += !!_rdseed64_step(&rand);
>     }
>     for (i = 0; i < MAX_ITER; ++i) {
>         success_rand += !!_rdrand64_step(&rand);
>     }
>
>     fprintf(stderr,
>             "RDRAND: %.2f%%, RDSEED: %.2f%%\n",
>             success_rand * 100.0 / MAX_ITER,
>             success_seed * 100.0 / MAX_ITER);
>
>     return NULL;
> }
>
> int main(int argc, char *argv[])
> {
>     pthread_t th[MAX_CPUS];
>     int nproc = sysconf(_SC_NPROCESSORS_ONLN);
>     if (nproc > MAX_CPUS) {
>         nproc = MAX_CPUS;
>     }
>     fprintf(stderr, "Stressing RDRAND/RDSEED across %d CPUs\n", nproc);
>
>     for (int i = 0; i < nproc; i++) {
>         pthread_create(&th[i], NULL, doit, NULL);
>     }
>
>     for (int i = 0; i < nproc; i++) {
>         pthread_join(th[i], NULL);
>     }
>
>     return 0;
> }
>
> $ gcc -march=native -o cpurngstress cpurngstress.c

Thanks for forwarding your test code along, we've added it to our tests for comparison.

> > If there is the possibility of over-harvesting randomness, why not
> > design the implementations to be clamped at some per core value such
> > as a megabit/second. In the case of the documented RDSEED generation
> > rates, that would allow the servicing of 3222 cores, if my math at
> > 0530 in the morning is correct.
> >
> > Would a core need more than 128 kilobytes of randomness, ie. one
> > second of output, to effectively seed a random number generator?
> >
> > A cynical conclusion would suggest engineering acquiescing to marketing
> > demands... :-)

> My assumption is that it was simply easier to not implement a rate
> limiting feature at the CPU level and punt the starvation problem to
> software :-)

Could be, it does seem unlikely that random number generation speed would be seen as fertile ground for marketing types.

Punting to software is certainly rational, perhaps problematic in a CoCo environment depending on the definition of 'astronomical'. See my response to Borislav who was kind enough to respond to all of this.

> With regards,
> Daniel

Have a good day.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
https://github.com/Quixote-Project
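[Aside: one possible refinement to the program above, not part of Daniel's posting: pinning thread i to CPU i removes the scheduler as a variable, so each core demonstrably contributes to the depletion load. A GNU/Linux-only sketch using pthread_setaffinity_np(3):]

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin a thread to one CPU; call right after pthread_create(),
 * e.g. pin_to_cpu(th[i], i). */
static void pin_to_cpu(pthread_t th, int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	pthread_setaffinity_np(th, sizeof(set), &set);
}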
On Tue, Feb 06, 2024 at 04:35:29PM +0100, Borislav Petkov wrote:

Good morning, or perhaps afternoon, thanks for taking the time to reply.

> On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote:
> > The silence appears to be deafening out of the respective engineering
> > camps... :-)

> I usually wait for those threads to "relax" themselves first. :)

Indeed, my standard practice is to wait 24 hours before replying to any public e-mail, hence the delay in my response.

> So, what do you wanna know?

I guess a useful starting point would be if AMD would like to offer any type of quantification for 'astronomically small' when it comes to the probability of failure over 10 RDRAND attempts... :-)

Secondly, given our test findings and those of Red Hat, would it be safe to assume that EPYC has engineering that prevents RDSEED failures that Ryzen does not?

Given HPA's response in this thread, I do appreciate that all of this may be shrouded in trade secrets and other issues. With an acknowledgement to that fact, let me see if I can extend the discussion in a generic manner that may prove useful to the community without being 'abusive'.

Both AMD and Intel designs start with a hardware based entropy source. Intel samples thermal/quantum junction noise, AMD samples execution jitter over a bank of inverter based oscillators. An assumption of constant clocked sampling implies a maximum randomness bandwidth limit.

None of this implies that randomness is a finite resource; it will always become available, with the caveat that a core may have to stand in line, cup in hand, waiting for a dollop.

So this leaves the fundamental question of what an RDRAND or RDSEED failure return actually implies.

Silicon is an expensive resource, which would imply a queue depth limitation for access to the socket common RNG infrastructure. If the queue is full when an instruction issues, it would be a logical response to signal an instruction failure quickly and let software try again.

An alternate theory would be a requirement for constant instruction completion time. In that case a 'buffer' of cycles would be included in the RNG instruction cycle allocation count. If the instruction would need to 'sleep', waiting for randomness, beyond this cycle buffer, a failure would be returned.

Absent broken hardware, 'astronomical' then becomes the probability of a core being unlucky enough to run into these or alternate implementation scenarios 10 times in a row, particularly given the recommendation to sleep between attempts, which implies getting scheduled onto different cores for the attempts.

Any enlightenment along these lines would seem to be useful in facilitating an understanding of the issues at hand.

Given the time and engineering invested in the engineering behind both TDX and SEV-SNP, it would seem unlikely that really smart engineers at both Intel and AMD didn't anticipate this issue and its proper resolution for CoCo environments.

> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

All the best from the Upper Midwest.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
https://github.com/Quixote-Project
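[Aside: for readers following along, the 10-try recommendation under discussion amounts to a loop like the following userspace sketch. The count mirrors the SDM guidance and the kernel's RDRAND_RETRY_LOOPS; assumes gcc with RDRAND support, e.g. -march=native.]

#include <immintrin.h>
#include <stdbool.h>

#define RETRY_LOOPS 10

/* Returns true with a random value in *out, or false only after 10
 * consecutive CF=0 results, the case the vendors describe as
 * astronomically unlikely on working hardware. */
static bool rdrand64_retry(unsigned long long *out)
{
	int i;

	for (i = 0; i < RETRY_LOOPS; i++) {
		if (_rdrand64_step(out))
			return true;
	}
	return false;
}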
On Tue, Feb 06, 2024 at 10:49:59AM -0800, H. Peter Anvin wrote:

Good morning HPA, I hope your week is going well, thanks for taking the time to extend comments.

> On February 6, 2024 4:04:45 AM PST, "Dr. Greg" <greg@enjellic.com> wrote:
> >On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrangé wrote:
> >
> >Good morning to everyone.
> >
> >> On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> >> >
> >> > Actually, I now believe there is clear evidence that the problem is
> >> > indeed Intel specific. In light of our testing, it will be
> >> > interesting to see what your 'AR' returns with respect to an official
> >> > response from Intel engineering on this issue.
> >> >
> >> > One of the very bright young engineers collaborating on Quixote, who
> >> > has been following this conversation, took it upon himself to do some
> >> > very methodical engineering analysis on this issue. I'm the messenger
> >> > but this is very much his work product.
> >> >
> >> > Executive summary is as follows:
> >> >
> >> > - No RDRAND depletion failures were observable with either the Intel
> >> >   or AMD hardware that was load tested.
> >> >
> >> > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
> >> >   implementation could not be provoked into failure.
> >
> >> My colleague ran a multithread parallel stress test program on his
> >> 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in
> >> RDSEED.
> >
> >Interesting datapoint, thanks for forwarding it along, so the issue
> >shows up on at least some AMD platforms as well.
> >
> >On the 18 core/socket Intel Skylake platform, the parallelized
> >depletion test forces RDSEED success rates down to around 2%. It
> >would appear that your tests suggest that the AMD platform fares
> >better than the Intel platform.
> >
> >So this is turning into even more of a morass, given that RDSEED
> >depletion on AMD may be a function of the micro-architecture the
> >platform is based on. The other variable is that our AMD test
> >platform had a substantially higher core count per socket, one would
> >assume that would result in higher depletion rates, if the operative
> >theory of socket common RNG infrastructure is valid.
> >
> >Unless AMD engineering understands the problem and has taken some type
> >of action on higher core count systems to address the issue.
> >
> >Of course, the other variable may be how the parallelized stress test
> >is conducted. If you would like to share your implementation source
> >we could give it a twirl on the systems we have access to.
> >
> >The continuing operative question is whether or not any of this ever
> >leads to an RDRAND failure.
> >
> >We've conducted some additional tests on the Intel platform where
> >RDSEED depletion was driven as low as possible, ~1-2% success rates,
> >while RDRAND depletion tests were being run simultaneously. No RDRAND
> >failures have been noted.
> >
> >So the operative question remains, why worry about this if RDRAND is
> >used as the randomness primitive.
> >
> >We haven't seen anything out of Intel yet on this, maybe AMD has a
> >quantifying definition for 'astronomical' when it comes to RDRAND
> >failures.
> >
> >The silence appears to be deafening out of the respective engineering
> >camps... :-)
> >
> >> > - AMD's RDRAND/RDSEED implementation is significantly slower than
> >> >   Intel's.
> >
> >> Yes, we also noticed the AMD impl is horribly slow compared to
> >> Intel, had to cut test iterations x100.
> >
> >The operative question is the impact of 'slow', in the absence of
> >artificial stress tests.
> >
> >It would seem that a major question is what are or were the
> >engineering thought processes on the throughput of the hardware
> >randomness instructions.
> >
> >Intel documents the following randomness throughput rates:
> >
> >RDSEED: 3 Gbit/second
> >RDRAND: 6.4 Gbit/second
> >
> >If there is the possibility of over-harvesting randomness, why not
> >design the implementations to be clamped at some per core value such
> >as a megabit/second. In the case of the documented RDSEED generation
> >rates, that would allow the servicing of 3222 cores, if my math at
> >0530 in the morning is correct.
> >
> >Would a core need more than 128 kilobytes of randomness, ie. one
> >second of output, to effectively seed a random number generator?
> >
> >A cynical conclusion would suggest engineering acquiescing to marketing
> >demands... :-)
> >
> >> With regards,
> >> Daniel
> >
> >Have a good day.
> >
> >As always,
> >Dr. Greg
> >
> >The Quixote Project - Flailing at the Travails of Cybersecurity
> > https://github.com/Quixote-Project

> You do realize, right, that the "deafening silence" is due to the
> need for research and discussions on our part, and presumably AMD's.

That would certainly be anticipated if not embraced; while those discussions ensue, let me explain where we are coming from on this issue.

I have a long time friend and valued personal consigliere who is a tremendous attorney and legal scholar. She has long advised me that two basic concepts are instilled in law school: how to make sense out of legal writing, and don't ask any questions you don't know the answer to.

CoCo is an engineering endeavor to defy the long held, and difficult to deny, premise in information technology that if you don't have physical security of a platform you don't have security.

We value ourselves as a good engineering team with considerable interest and experience in all of this. Any suggestion that there may be some type of, even subtle, concern over the behavior of fundamental hardware security primitives causes us to start asking questions and testing things. In this case the testing, quickly and easily, caused even more questions to emerge.

> In addition, quite frankly, your rather abusive language isn't
> exactly encouraging people to speak publicly based on immediately
> available and therefore inherently incomplete and/or dated
> information, meaning that we have had to take even what discussions we
> might have been able to have in public without IP concerns behind the
> scenes.

Abusive? I will freely don the moniker of being a practitioner and purveyor of rapier cynicism and wit; however, everyone who knows me beyond e-mail would tell you that abusive would be the last word they would use in describing my intent and character.

I had the opportunity to sit next to Jim Gordon at dinner, who at the time ran the Platform Security Division for Intel, at the SGX Development Outreach Meeting that was held in Tel Aviv. That was after a pretty direct, but productive, technical exchange with the SGX hardware engineers from Haifa.

I told him that I didn't mean to put the engineers on the spot but we had been asked to voice our concerns as SGX infrastructure developers. He told me the purpose of the meeting was for Intel to get tough and demanding questions on issues of concern to developers, so Intel could deliver better and more relevant technology.

Just for the record and to close the abusive issue.
A review of this thread will show that I never threw out accusations that hardware was busted or backdoored, nor did I advocate that the solution was to find a better hardware vendor.

You fix engineering problems with engineering facts, hence our interest in seeing how the question that got asked, perhaps inadvertently, gets answered, so appropriate engineering changes can be made in security dependent products.

> Yes, we work for Intel. No, we don't know every detail about every
> Intel chip ever created off the top of my head, nor do we
> necessarily know the exact person that is *currently* in charge of
> the architecture of a particular unit, nor is it necessarily true
> that even *that* person knows all the exact properties of the
> behavior of their unit when integrated into a particular SoC, as
> units are modular by design.

Interesting. From the outside looking in, as engineers, this raises the obvious question of whether the 'bus factor' for Bull Mountain has been exceeded.

Let me toss out, hopefully as a positive contribution, a 'Gedanken' dilemma that the Intel team can take into their internal deliberations.

One of the other products of this thread was the suggestion that a CoCo hypervisor/hardware combination could exert sufficient timing or scheduling control so as to defeat the SDM's 10 try RDRAND recommendation and induce a denial-of-service condition.

If that is indeed a possibility, given the long history of timing based observation attacks on confidentiality, what guidance can be offered to consumers of the relevant technologies that CoCo is indeed a valid concept? Particularly given the fact that the hardware that consumers are trusting is physically in the hands of highly skilled personnel, who have both the skills and physical control of the hardware needed to mount such an attack.

This is obviously a somewhat larger question than whether RDRAND depletion can be practically induced, so no need to rush the deliberations on our behalf.

We will stand by in a quiet and decidedly non-abusive and non-threatening posture, waiting to see what reflections Borislav might have on all of this... :-)

Have a good weekend.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
https://github.com/Quixote-Project
On Thu, Feb 08, 2024 at 05:44:44AM -0600, Dr. Greg wrote:
> I guess a useful starting point would be if AMD would like to offer
> any type of quantification for 'astronomically small' when it comes to
> the probability of failure over 10 RDRAND attempts... :-)

Right, let's establish the common ground first: please have a look at this, albeit a bit outdated, whitepaper:

https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/white-papers/amd-random-number-generator.pdf

in case you haven't seen it yet.

Now, considering that this is a finite resource, you can imagine that there can be scenarios where that source can be depleted.

And newer Zen generations perform significantly better. So much so that on Zen3 and later, 10 retries should never observe a failure unless it is bad hardware.

Also, I agree with hpa's note that any and all retries should be time based.

> Secondly, given our test findings and those of Red Hat, would it be
> safe to assume that EPYC has engineering that prevents RDSEED failures
> that Ryzen does not?

Well, roughly speaking, client is a less beefy and less performant version of server. You can extrapolate that to the topic at hand.

But at least on AMD, any potential DoSing of RDRAND on client doesn't matter for CoCo because client doesn't enable SEV*.

> Both AMD and Intel designs start with a hardware based entropy source.
> Intel samples thermal/quantum junction noise, AMD samples execution
> jitter over a bank of inverter based oscillators.

See the above paper for the AMD side.

> An assumption of constant clocked sampling implies a maximum
> randomness bandwidth limit.

You said it.

> None of this implies that randomness is a finite resource

Huh? This contradicts what you just said in the above sentence. Or maybe I'm reading this wrong...

> So this leaves the fundamental question of what an RDRAND or
> RDSEED failure return actually implies.

Simple: if no random data is ready at the time the insn executes, it says "invalid". Because the generator is a finite resource, as you said above, software pulling random data faster than it can be generated is the main case for CF=0.

> Silicon is an expensive resource, which would imply a queue depth
> limitation for access to the socket common RNG infrastructure. If the
> queue is full when an instruction issues, it would be a logical
> response to signal an instruction failure quickly and let software try
> again.

That's actually in the APM documenting RDRAND:

"If the returned value is invalid, software must execute the instruction again."

> Given the time and engineering invested in the engineering behind both
> TDX and SEV-SNP, it would seem unlikely that really smart engineers at
> both Intel and AMD didn't anticipate this issue and its proper
> resolution for CoCo environments.

You can probably imagine that no one can do a fully secure system in one single attempt but rather needs to do an iterative process. And I don't know how much you've followed those technologies, but they *are* the perfect example of such an iterative improvement process.

I hope this answers at least some of your questions.

Thx.
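[Aside: a sketch of what "time based" retries might look like in userspace, as opposed to a fixed count, with a PAUSE between attempts. The budget parameter and the overall shape are illustrative assumptions, not anything prescribed in the thread.]

#include <immintrin.h>
#include <stdbool.h>
#include <time.h>

/* Retry RDSEED until it succeeds or a wall-clock budget expires. */
static bool rdseed64_timed(unsigned long long *out, long budget_ns)
{
	struct timespec t0, t;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	do {
		if (_rdseed64_step(out))
			return true;
		_mm_pause();		/* brief back-off between attempts */
		clock_gettime(CLOCK_MONOTONIC, &t);
	} while ((t.tv_sec - t0.tv_sec) * 1000000000L +
		 (t.tv_nsec - t0.tv_nsec) < budget_ns);

	return false;
}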
Hey Boris,

While you're here, I was wondering if you could comment on one related thing:

On Fri, Feb 9, 2024 at 6:31 PM Borislav Petkov <bp@alien8.de> wrote:
> Now, considering that this is a finite resource, you can imagine that
> there can be scenarios where that source can be depleted.

Yea, this makes sense.

[As an aside, I would like to note that a different construction of RDRAND could keep outputting good random numbers for a reeeeeallly long time without needing to reseed, or without penalty if RDSEED is depleted, and so could be made to actually never fail. But given the design goals of RDRAND, this kind of crypto is highly likely to never be implemented, so I'm not even moving to suggest that AMD/Intel just 'fix' the crypto design goals of the instruction. It's not gonna happen for lots of reasons.]

So, assuming that RDSEED and hence RDRAND can never be made to never fail, the options are:

1. A finite resource that refills faster than any instruction issuance latency, so it's never observably empty. (Seems unlikely.)

2. More secure sharing of the finite resource.

It's this second option I wanted to ask you about. I wrote down what I thought "secure sharing" meant here [1]:

> - One VMX (or host) context can't DoS another one.
> - Ring 3 can't DoS ring 0.

It's a bit of a scheduling/queueing thing, where different security contexts shouldn't be able to starve others out of the finite resource indefinitely.

What I'm wondering is if that kind of fairness is even possible to achieve in the hardware or the microcode. I don't really know how that all works under the covers and what sorts of "policies" and such are feasible to implement. In suggesting it, I feel like a bit of a presumptuous kernel developer talking to hardware people, not fully appreciating their domain and its challenges. For, if this were just a C program, I know exactly what I'd do, but we're talking about a CPU here.

Is it actually possible to make RDRAND usage "fair" between different security contexts? Or am I totally delusional and this is not how the hardware works or can ever work?

Jason

[1] https://lore.kernel.org/all/CAHmME9ps6W5snQrYeNVMFgfhMKFKciky=-UxxGFbAx_RrxSHoA@mail.gmail.com/
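[Aside: to make the "if this were just a C program" remark concrete, the software analogue of the fairness being asked about is a round-robin arbiter over the conditioned output. Everything below is a hypothetical illustration of the policy only, not a model of any actual DRNG microarchitecture; a real arbiter would also skip contexts with no request pending.]

#include <stdbool.h>

#define NCTX 4	/* hypothetical security contexts: host + guests */

struct entropy_pool {
	unsigned long long words[32];
	int avail;
	int next_ctx;	/* round-robin cursor */
};

/* Hand out one word per turn.  A context that is not at the head of
 * the rotation sees a failure (modeling CF=0) and must retry, but the
 * cursor advances on every grant, so no context holds the pool. */
static bool arbiter_take(struct entropy_pool *p, int ctx,
			 unsigned long long *out)
{
	if (ctx != p->next_ctx || p->avail == 0)
		return false;			/* models CF=0 */
	*out = p->words[--p->avail];
	p->next_ctx = (p->next_ctx + 1) % NCTX;
	return true;
}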
On 2/9/24 11:49, Jason A. Donenfeld wrote:
> [As an aside, I would like to note that a different construction of
> RDRAND could keep outputting good random numbers for a reeeeeallly
> long time without needing to reseed, or without penalty if RDSEED is
> depleted, and so could be made to actually never fail. But given the
> design goals of RDRAND, this kind of crypto is highly likely to never
> be implemented, so I'm not even moving to suggest that AMD/Intel just
> 'fix' the crypto design goals of the instruction. It's not gonna
> happen for lots of reasons.]

Intel's RDRAND reseeding behavior is spelled out here:

> https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html

in the "Guaranteeing DRBG Reseeding" section.

> It's a bit of a scheduling/queueing thing, where different security
> contexts shouldn't be able to starve others out of the finite resource
> indefinitely.
>
> What I'm wondering is if that kind of fairness is even possible to
> achieve in the hardware or the microcode.

Even ignoring different security contexts, Intel's whitepaper claims that no starvation happens with RDRAND:

> If multiple threads are invoking RDRAND simultaneously, total RDRAND
> throughput (across all threads) scales approximately linearly with
> the number of threads until no more hardware threads remain, the bus
> limits of the processor are reached, or the DRNG interface is fully
> saturated. Past this point, the maximum throughput is divided equally
> among the active threads. No threads get starved.

800 MB/sec of total RDRAND throughput across all threads, guaranteed reseeding, and no starvation sounds pretty good to me. Does that need improving?
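[Aside, as a back-of-the-envelope check on those figures, not from the thread: even if a hypothetical 256 hardware threads hammered RDRAND simultaneously, an equal split of 800 MB/sec still leaves roughly 3 MB/sec per thread, several orders of magnitude more than the few hundred bytes a DRBG needs to seed itself.]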
On Fri, Feb 09, 2024 at 08:49:40PM +0100, Jason A. Donenfeld wrote:
> While you're here,

I was here the whole time, lurking in the shadows. :)

> Is it actually possible to make RDRAND usage "fair" between different
> security contexts? Or am I totally delusional and this is not how the
> hardware works or can ever work?

Yeah, I know exactly what you mean and I won't go into details for obvious reasons. Two things:

* Starting with Zen3, provided properly configured hw, RDRAND will never fail. It is also fair when feeding the different contexts.

* My hardware engineers tell me that this is tough to do for RDSEED.

Thx.
diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 918c5880de9e..fc8d837fb3b9 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -13,6 +13,12 @@
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
 
+#ifdef KASLR_COMPRESSED_BOOT
+#define rd_warn(msg) warn(msg)
+#else
+#define rd_warn(msg) WARN_ONCE(1, msg)
+#endif
+
 #define RDRAND_RETRY_LOOPS 10
 
 /* Unconditional execution of RDRAND and RDSEED */
@@ -28,6 +34,9 @@ static inline bool __must_check rdrand_long(unsigned long *v)
 		if (ok)
 			return true;
 	} while (--retry);
+
+	rd_warn("RDRAND failed\n");
+
 	return false;
 }
 
@@ -45,6 +54,8 @@ static inline bool __must_check rdseed_long(unsigned long *v)
 			return true;
 	} while (--retry);
 
+	rd_warn("RDSEED failed\n");
+
 	return false;
 }
 
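[Aside on the hunks above: rd_warn() resolves to the decompressor's warn() when built under KASLR_COMPRESSED_BOOT, where the ordinary WARN_ONCE() machinery is unavailable, and to WARN_ONCE(1, msg) in normal kernel context. The effect is that exhausting all RDRAND_RETRY_LOOPS attempts is reported rather than silently returning false.]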