Message ID | 20240220123615.963916-1-geissonator@gmail.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-73267-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:693c:2685:b0:108:e6aa:91d0 with SMTP id mn5csp484780dyc; Tue, 20 Feb 2024 07:43:22 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCWHiJiZ/Q4b1RzO/hPziQLfwf15CfeskCbH8O/8Nq7GEkUsNK5wvj1RxHW1uBJrIzgfNuiGPwi5imrdwPbVOnQ9C0rg4g== X-Google-Smtp-Source: AGHT+IFIWSmqRe9K+ZNDsUFACR1hNy+RCA87IqJKlfkCu7eM+6CbIsgd4lq1lMytHElsHrGiqcZc X-Received: by 2002:a05:622a:3d4:b0:42c:6e6b:2d60 with SMTP id k20-20020a05622a03d400b0042c6e6b2d60mr20082262qtx.48.1708443801766; Tue, 20 Feb 2024 07:43:21 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708443801; cv=pass; d=google.com; s=arc-20160816; b=OeaygaLU9xAlwMgid+mdO33Dc6iGSOXDKIfn7sJ5JE+xjcjGM7EQj4sJLIyw6RqFz7 VJ5hmOVIRxwB6EOko6pmevwDyiY3jWUf/pBe6LytjCvtIJ9QcbL3wr6EevxEpvujF08F Pg+oRuxsNHuGymv1BK/yaJ31wIG9q7I1l5i7wEjeawPdnSVgXl0IM5cn1+QGyBrBaT93 DorTmV3dNB9fr7zni7ep9URowRr2bYl0Oj2WrUCn1aqtFKiOHY3TlwJtX9bTXvYpcyAp Y/+uD5T/IUWIT+fVr4ed+Mj/nwiVSaKBaKHhtnUgV2hkdiHgyCtfrEk7RzBzDiHiCT7L nGFg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:date:subject:cc:to :from:dkim-signature; bh=Ba+FZT/y8DAA+ic+9F75Kom97L71NReEfdPzSuOXjjQ=; fh=Ovt/Xw2iB6PjxVZCCF+o3ryYpu1MwqOaJHXra+zgPEs=; b=VfiC9//87J3W19NSlpgXKrlXrCj1GiEpVY/DbyZBo1fH2n2LfUBxLDvjmi/rh2v3DU qHiTLz3GoUia4OMoBRP4zwgUdPCxyM+JUO8Rore19LfCAetx5DhlwEk+Y1xAocPqWV5D D3kgqxAppfT83vFCF9j9xM9gkUFYvaM6jfCzK9Qz9DAhuq3gKHbeS/CggkUGGonS24q6 tr/C1+wZBeNrER1pJdD/Cgw4O+b4AuVNr1q/XeOI5sPR+LptNN3cV2eHO3OL3My8ow+w n9lCZeICkTx4Ppt05yC3lFPHMPbXXrXp2jQbK4/XDncUHtdyoPqscOomkhorR+QFfSwX xZiw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=DL4j15bR; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-73267-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-73267-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id a18-20020ac85b92000000b0042c75a66b95si9138041qta.799.2024.02.20.07.43.21 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Feb 2024 07:43:21 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-73267-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=DL4j15bR; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-73267-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-73267-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 8E2D21C21557 for <ouuuleilei@gmail.com>; Tue, 20 Feb 2024 15:43:21 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 3723F76056; Tue, 20 Feb 2024 15:43:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="DL4j15bR" Received: from mail-oo1-f50.google.com (mail-oo1-f50.google.com [209.85.161.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3C93D627EA for <linux-kernel@vger.kernel.org>; Tue, 20 Feb 2024 15:43:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.161.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708443788; cv=none; b=UaTZxwQ23E5YA2F7YbrHuk6vKwkB6m4oqGTiPhDeECHytk4ppJ0f6D8wiFAPABUW8PZjVFq+5aMpi65pkl3FeK828QjuW43RGCfx2Fpfq6TWAjBlH+qfDfP8RC0ypU04qqc1Cf0ahV6YRIf3IRX7gF7M8AtkJnJlkhGR5JIcyis= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708443788; c=relaxed/simple; bh=BHwUKr8biU6qwSoYRciff715r/Wq10MJ8jrCOgZ4EeI=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=VcZy380SU6gf233ivH/T3yQgzbF7AP+9+JgltTzj5IlPDNcGK8xlmOxB1fNJt3eUn2HnYjpIzEV0kwzmoRjEutDNZdGenOgHJrh1BqiQ1m8LQvTrkwf/B7cgYh0g0X9vOALKDTx/3jsipNtGMUSc+XY+yaD+ZP4BZkHRgBOLVt4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=DL4j15bR; arc=none smtp.client-ip=209.85.161.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-oo1-f50.google.com with SMTP id 006d021491bc7-59d78deb469so3513849eaf.1 for <linux-kernel@vger.kernel.org>; Tue, 20 Feb 2024 07:43:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1708443786; x=1709048586; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=Ba+FZT/y8DAA+ic+9F75Kom97L71NReEfdPzSuOXjjQ=; b=DL4j15bRVw8JX0Ue+Ro2pio4rRhekzaUMSW1tEC7sQOMlxAiPh+xP330Y+nZm9Hlhz Wk8IoeF0J8sC69vf7kGBNng3CHShuVURcTx34I0z4ynj/VCzw20J+FJiovGZAMjMINLV T+QL2QgJhi4Gw41wbWkyuAQqlGVVqxInrfzyndyOqAuhuor/65HEnOqnSMDK90F3Er3I T3eDTA1IxWlKiUca0YgoHAZazT4/HIzS4XjTcyjSs+UAxA1NGaATlG3Bsf3+/1pym/Ro fgFld9PggqtwvfRmydvbmhjINQ1Uw2tyopRfRpr5nKD8Y1FyLzkkFYmLWywzzaNCeUvG G6sA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708443786; x=1709048586; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Ba+FZT/y8DAA+ic+9F75Kom97L71NReEfdPzSuOXjjQ=; b=LktsoA6HqA6Te1VklGP2MaC4euxQstPkxKBzZ8s4J8rKl78yxoCPSw0r4DsNz0EIXJ HK338ASiRJ4BfDTzR6hUEwzdL+glQGNGLNyPWbijUr1dVE+JcwCRYjsy2ILNjghHfT2b exrO/RmX2eEDiRXyFYbuZRuZtdDLjaprsI0gZ3OIk3c9ZEHIZG3SfQNb4JHCaKbeOWFg mkC3tnX7LPVPiciitJmeADykqfUsD9ZcM0CmPNNNvVPpTviPWw29n1eQn0d6xV5PEL5f 17ULJbx5mQ+OKNrkaEzKqYqkCQbzCS+bq4GvyozJ1VRaquNonV8c56oFTTT40zO4/PYe DQ3Q== X-Forwarded-Encrypted: i=1; AJvYcCW/ndu4QExu3rKkl83EjY03rgLO8iam8PyQLnHPlmlLu54CQ8To12lFOR2m8Y7v3c+sbv4rGAxOcoPC4H/aqomvmGVg07+hs7AhZazi X-Gm-Message-State: AOJu0Yy6+0M2OXbDgv8wopZDHKnWFoE1K/5b9IBL/ELhErVTciyz0u9A xppQK5RQj1OeBIT9WaGPOnO8moA9hKMZEcjmjozRZxW7ohQn5+FN X-Received: by 2002:a4a:355a:0:b0:59f:fc30:d3aa with SMTP id w26-20020a4a355a000000b0059ffc30d3aamr2191859oog.3.1708443786272; Tue, 20 Feb 2024 07:43:06 -0800 (PST) Received: from localhost.localdomain ([129.41.86.2]) by smtp.gmail.com with ESMTPSA id p3-20020a4adfc3000000b0059aaa2bebb6sm1388768ood.48.2024.02.20.07.43.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Feb 2024 07:43:05 -0800 (PST) From: Andrew Geissler <geissonator@gmail.com> To: minyard@acm.org, joel@jms.id.au, andrew@codeconstruct.com.au, openipmi-developer@lists.sourceforge.net, linux-arm-kernel@lists.infradead.org, linux-aspeed@lists.ozlabs.org, linux-kernel@vger.kernel.org Cc: Andrew Geissler <geissonator@yahoo.com>, openbmc@lists.ozlabs.org Subject: [PATCH] ipmi: kcs: Update OBF poll timeout to reduce latency Date: Tue, 20 Feb 2024 06:36:15 -0600 Message-Id: <20240220123615.963916-1-geissonator@gmail.com> X-Mailer: git-send-email 2.39.2 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791433167857592004 X-GMAIL-MSGID: 1791433167857592004 |
Series |
ipmi: kcs: Update OBF poll timeout to reduce latency
|
|
Commit Message
Andrew Geissler
Feb. 20, 2024, 12:36 p.m. UTC
From: Andrew Geissler <geissonator@yahoo.com> Commit f90bc0f97f2b ("ipmi: kcs: Poll OBF briefly to reduce OBE latency") introduced an optimization to poll when the host has read the output data register (ODR). Testing has shown that the 100us timeout was not always enough. When we miss that 100us window, it results in 10x the time to get the next message from the BMC to the host. When you're sending 100's of messages between the BMC and Host, this results in a server boot taking 50% longer for IBM P10 machines. Started with 1000 and worked it down until the issue started to reoccur. 200 was the sweet spot in my testing. 150 showed the issue intermittently. Signed-off-by: Andrew Geissler <geissonator@yahoo.com> --- drivers/char/ipmi/kcs_bmc_aspeed.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Dear Andrew, Thank you for your patch. Some style suggestions. Am 20.02.24 um 13:36 schrieb Andrew Geissler: > From: Andrew Geissler <geissonator@yahoo.com> (Oh no, Yahoo. (ignore)) You could be more specific in the git commit message by using *Double*: > ipmi: kcs: Double OBF poll timeout to reduce latency > ipmi: kcs: Double OBF poll timeout to 200 us to reduce latency > Commit f90bc0f97f2b ("ipmi: kcs: Poll OBF briefly to reduce OBE > latency") introduced an optimization to poll when the host has > read the output data register (ODR). Testing has shown that the 100us > timeout was not always enough. When we miss that 100us window, it > results in 10x the time to get the next message from the BMC to the > host. When you're sending 100's of messages between the BMC and Host, I do not understand, how this poll timeout can result in such an increase, and why a quite big timeout hurts, but I do not know the implementation. > this results in a server boot taking 50% longer for IBM P10 machines. > > Started with 1000 and worked it down until the issue started to reoccur. > 200 was the sweet spot in my testing. 150 showed the issue > intermittently. I’d add a blank line here. > Signed-off-by: Andrew Geissler <geissonator@yahoo.com> > --- > drivers/char/ipmi/kcs_bmc_aspeed.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/char/ipmi/kcs_bmc_aspeed.c b/drivers/char/ipmi/kcs_bmc_aspeed.c > index 72640da55380..af1eae6153f6 100644 > --- a/drivers/char/ipmi/kcs_bmc_aspeed.c > +++ b/drivers/char/ipmi/kcs_bmc_aspeed.c > @@ -422,7 +422,7 @@ static void aspeed_kcs_irq_mask_update(struct kcs_bmc_device *kcs_bmc, u8 mask, > * missed the event. > */ > rc = read_poll_timeout_atomic(aspeed_kcs_inb, str, > - !(str & KCS_BMC_STR_OBF), 1, 100, false, > + !(str & KCS_BMC_STR_OBF), 1, 200, false, > &priv->kcs_bmc, priv->kcs_bmc.ioreg.str); > /* Time for the slow path? */ > if (rc == -ETIMEDOUT) Kind regards, Paul
On Tue, Feb 20, 2024 at 04:51:21PM +0100, Paul Menzel wrote: > Dear Andrew, > > > Thank you for your patch. Some style suggestions. > > Am 20.02.24 um 13:36 schrieb Andrew Geissler: > > From: Andrew Geissler <geissonator@yahoo.com> > > (Oh no, Yahoo. (ignore)) > > You could be more specific in the git commit message by using *Double*: > > > ipmi: kcs: Double OBF poll timeout to reduce latency > > > ipmi: kcs: Double OBF poll timeout to 200 us to reduce latency > > > Commit f90bc0f97f2b ("ipmi: kcs: Poll OBF briefly to reduce OBE > > latency") introduced an optimization to poll when the host has I assume that removing that patch doesn't fix the issue, it would only make it worse, right? > > read the output data register (ODR). Testing has shown that the 100us > > timeout was not always enough. When we miss that 100us window, it > > results in 10x the time to get the next message from the BMC to the > > host. When you're sending 100's of messages between the BMC and Host, > > I do not understand, how this poll timeout can result in such an increase, > and why a quite big timeout hurts, but I do not know the implementation. It's because increasing that number causes it to poll longer for the event, the host takes longer than 100us to generate the event, and if the event is missed the time when it is checked again is very long. Polling for 100us is already pretty extreme. 200us is really too long. The real problem is that there is no interrupt for this. I'd also guess there is no interrupt on the host side, because that would solve this problem, too, as it would certainly get around to handling the interupt in 100us. I'm assuming the host driver is not the Linux driver, as it should also handle this in a timely manner, even when polling. If people want hardware to perform well, they ought to design it and not expect software to fix all the problems. The right way to fix this is probably to do the same thing the host side Linux driver does. It has a kernel thread that is kicked off to do this. Unfortunately, that's more complicated to implement, but it avoids polling in this location (which causes latency issues on the BMC side) and lets you poll longer without causing issues. I'll let the people who maintain that code comment. -corey > > > this results in a server boot taking 50% longer for IBM P10 machines. > > > > Started with 1000 and worked it down until the issue started to reoccur. > > 200 was the sweet spot in my testing. 150 showed the issue > > intermittently. > > I’d add a blank line here. > > > Signed-off-by: Andrew Geissler <geissonator@yahoo.com> > > --- > > drivers/char/ipmi/kcs_bmc_aspeed.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/char/ipmi/kcs_bmc_aspeed.c b/drivers/char/ipmi/kcs_bmc_aspeed.c > > index 72640da55380..af1eae6153f6 100644 > > --- a/drivers/char/ipmi/kcs_bmc_aspeed.c > > +++ b/drivers/char/ipmi/kcs_bmc_aspeed.c > > @@ -422,7 +422,7 @@ static void aspeed_kcs_irq_mask_update(struct kcs_bmc_device *kcs_bmc, u8 mask, > > * missed the event. > > */ > > rc = read_poll_timeout_atomic(aspeed_kcs_inb, str, > > - !(str & KCS_BMC_STR_OBF), 1, 100, false, > > + !(str & KCS_BMC_STR_OBF), 1, 200, false, > > &priv->kcs_bmc, priv->kcs_bmc.ioreg.str); > > /* Time for the slow path? */ > > if (rc == -ETIMEDOUT) > > > Kind regards, > > Paul
On Tue, 2024-02-20 at 13:33 -0600, Corey Minyard wrote: > On Tue, Feb 20, 2024 at 04:51:21PM +0100, Paul Menzel wrote: > > Dear Andrew, > > > > > > Thank you for your patch. Some style suggestions. > > > > Am 20.02.24 um 13:36 schrieb Andrew Geissler: > > > From: Andrew Geissler <geissonator@yahoo.com> > > > > (Oh no, Yahoo. (ignore)) > > > > You could be more specific in the git commit message by using *Double*: > > > > > ipmi: kcs: Double OBF poll timeout to reduce latency > > > > > ipmi: kcs: Double OBF poll timeout to 200 us to reduce latency > > > > > Commit f90bc0f97f2b ("ipmi: kcs: Poll OBF briefly to reduce OBE > > > latency") introduced an optimization to poll when the host has > > I assume that removing that patch doesn't fix the issue, it would only > make it worse, right? Yep. > > > > read the output data register (ODR). Testing has shown that the 100us > > > timeout was not always enough. When we miss that 100us window, it > > > results in 10x the time to get the next message from the BMC to the > > > host. When you're sending 100's of messages between the BMC and Host, > > > > I do not understand, how this poll timeout can result in such an increase, > > and why a quite big timeout hurts, but I do not know the implementation. > > It's because increasing that number causes it to poll longer for the > event, the host takes longer than 100us to generate the event, and if > the event is missed the time when it is checked again is very long. > > Polling for 100us is already pretty extreme. 200us is really too long. > > The real problem is that there is no interrupt for this. I'd also guess > there is no interrupt on the host side, because that would solve this > problem, too, as it would certainly get around to handling the interupt > in 100us. I'm assuming the host driver is not the Linux driver, as it > should also handle this in a timely manner, even when polling. I expect the issues Andrew G is observing are with the Power10 boot firmware. The boot firmware only polls. The runtime firmware enables interrupts. > > If people want hardware to perform well, they ought to design it and not > expect software to fix all the problems. +1 > > The right way to fix this is probably to do the same thing the host side > Linux driver does. It has a kernel thread that is kicked off to do > this. Unfortunately, that's more complicated to implement, but it > avoids polling in this location (which causes latency issues on the BMC > side) and lets you poll longer without causing issues. In Andrew G's case he's talking MCTP over KCS using a vendor-defined transport binding (that also leverages LPC FWH cycles for bulk data transfers)[1]. I think it could have taken more inspiration from the IPMI KCS protocol: It might be worth an experiment to write the dummy command value to IDR from the host side after each ODR read to signal the host's clearing of OBF (no interrupt for the BMC) with an IBF (which does interrupt the BMC). And doing the obverse for the BMC. Some brief thought suggests that if the dummy value is read there's no need to send a dummy value in reply (as it's an indicator to read the status register). With that the need for the spin here (or on the host side) is reduced at the cost of some constant protocol overhead. [1]: https://github.com/openbmc/libmctp/blob/master/docs/bindings/vendor-ibm-astlpc.md Andrew J
> On Feb 20, 2024, at 4:36 PM, Andrew Jeffery <andrew@codeconstruct.com.au> wrote: > > On Tue, 2024-02-20 at 13:33 -0600, Corey Minyard wrote: >> On Tue, Feb 20, 2024 at 04:51:21PM +0100, Paul Menzel wrote: >>> Dear Andrew, >> >> It's because increasing that number causes it to poll longer for the >> event, the host takes longer than 100us to generate the event, and if >> the event is missed the time when it is checked again is very long. >> >> Polling for 100us is already pretty extreme. 200us is really too long. >> >> The real problem is that there is no interrupt for this. I'd also guess >> there is no interrupt on the host side, because that would solve this >> problem, too, as it would certainly get around to handling the interupt >> in 100us. I'm assuming the host driver is not the Linux driver, as it >> should also handle this in a timely manner, even when polling. > > I expect the issues Andrew G is observing are with the Power10 boot > firmware. The boot firmware only polls. The runtime firmware enables > interrupts. Yep, this is with the low level host boot firmware. Also, further testing over night showed that 200us wasn’t enough for our larger Everest P10 machines, I needed to go to 300us. As we were struggling to allow 200us, I assume 300us is going to be a no-go. >> > >> >> The right way to fix this is probably to do the same thing the host side >> Linux driver does. It has a kernel thread that is kicked off to do >> this. Unfortunately, that's more complicated to implement, but it >> avoids polling in this location (which causes latency issues on the BMC >> side) and lets you poll longer without causing issues. > > In Andrew G's case he's talking MCTP over KCS using a vendor-defined > transport binding (that also leverages LPC FWH cycles for bulk data > transfers)[1]. I think it could have taken more inspiration from the > IPMI KCS protocol: It might be worth an experiment to write the dummy > command value to IDR from the host side after each ODR read to signal > the host's clearing of OBF (no interrupt for the BMC) with an IBF > (which does interrupt the BMC). And doing the obverse for the BMC. Some > brief thought suggests that if the dummy value is read there's no need > to send a dummy value in reply (as it's an indicator to read the status > register). With that the need for the spin here (or on the host side) > is reduced at the cost of some constant protocol overhead. > Thanks for the quick reviews and ideas. I’ll see if I can find someone on the team to help out with Andrew J’s thoughts and if that doesn’t work, look into the kernel thread idea. > > > Andrew J
On Wed, Feb 21, 2024 at 10:57:38AM -0600, Andrew Geissler wrote: > > > > On Feb 20, 2024, at 4:36 PM, Andrew Jeffery <andrew@codeconstruct.com.au> wrote: > > > > On Tue, 2024-02-20 at 13:33 -0600, Corey Minyard wrote: > >> On Tue, Feb 20, 2024 at 04:51:21PM +0100, Paul Menzel wrote: > >>> Dear Andrew, > >> > >> It's because increasing that number causes it to poll longer for the > >> event, the host takes longer than 100us to generate the event, and if > >> the event is missed the time when it is checked again is very long. > >> > >> Polling for 100us is already pretty extreme. 200us is really too long. > >> > >> The real problem is that there is no interrupt for this. I'd also guess > >> there is no interrupt on the host side, because that would solve this > >> problem, too, as it would certainly get around to handling the interupt > >> in 100us. I'm assuming the host driver is not the Linux driver, as it > >> should also handle this in a timely manner, even when polling. > > > > I expect the issues Andrew G is observing are with the Power10 boot > > firmware. The boot firmware only polls. The runtime firmware enables > > interrupts. > > Yep, this is with the low level host boot firmware. > Also, further testing over night showed that 200us wasn’t enough for > our larger Everest P10 machines, I needed to go to 300us. As we > were struggling to allow 200us, I assume 300us is going to be a no-go. It seems odd to me that firmware polling would be an issue. Usually, with firmware, you have it just spinning waiting for something. At least in the firmware I worked with. I'm not familiar with this firmware, though, maybe it has timers and such and parallel execution. Can this be fixed on the firmware side? > > >> > > > >> > >> The right way to fix this is probably to do the same thing the host side > >> Linux driver does. It has a kernel thread that is kicked off to do > >> this. Unfortunately, that's more complicated to implement, but it > >> avoids polling in this location (which causes latency issues on the BMC > >> side) and lets you poll longer without causing issues. > > > > In Andrew G's case he's talking MCTP over KCS using a vendor-defined > > transport binding (that also leverages LPC FWH cycles for bulk data > > transfers)[1]. I think it could have taken more inspiration from the > > IPMI KCS protocol: It might be worth an experiment to write the dummy > > command value to IDR from the host side after each ODR read to signal > > the host's clearing of OBF (no interrupt for the BMC) with an IBF > > (which does interrupt the BMC). And doing the obverse for the BMC. Some > > brief thought suggests that if the dummy value is read there's no need > > to send a dummy value in reply (as it's an indicator to read the status > > register). With that the need for the spin here (or on the host side) > > is reduced at the cost of some constant protocol overhead. > > > > Thanks for the quick reviews and ideas. > I’ll see if I can find someone on the team to help out with Andrew J’s > thoughts and if that doesn’t work, look into the kernel thread idea. I don't really understand Andrew J's ideas very well, but hopefully they help. The kernel thread idea is fairly complicated to implement, and there has been an impetus in the kernel to not create new kernel threads. But there just has to be a good reason, and this probably is one. We worked on it a lot in the IPMI host driver to tune it and got it to a point where it provided decent performance without causing power management issues. When I first read the title I was worried it was talking about this code; I'm lothe to touch it for fear of breaking things. -corey
diff --git a/drivers/char/ipmi/kcs_bmc_aspeed.c b/drivers/char/ipmi/kcs_bmc_aspeed.c index 72640da55380..af1eae6153f6 100644 --- a/drivers/char/ipmi/kcs_bmc_aspeed.c +++ b/drivers/char/ipmi/kcs_bmc_aspeed.c @@ -422,7 +422,7 @@ static void aspeed_kcs_irq_mask_update(struct kcs_bmc_device *kcs_bmc, u8 mask, * missed the event. */ rc = read_poll_timeout_atomic(aspeed_kcs_inb, str, - !(str & KCS_BMC_STR_OBF), 1, 100, false, + !(str & KCS_BMC_STR_OBF), 1, 200, false, &priv->kcs_bmc, priv->kcs_bmc.ioreg.str); /* Time for the slow path? */ if (rc == -ETIMEDOUT)