Message ID | 20240222032803.2177856-4-maobibo@loongson.cn |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-75852-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:aa16:b0:108:e6aa:91d0 with SMTP id by22csp9029dyb; Wed, 21 Feb 2024 19:29:38 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVaCEH5vngN9DTAMMD+51l5nTs/H2eWbwCmMowdsC4SqU8e3c3xQmLmJYksfmLl9VnoQJyhqIqZqC0hKd00WRbp4DZz4A== X-Google-Smtp-Source: AGHT+IGA9qOC+CF0eQjIVJdK8DT2nKNIehTPLnZffwQxscIs7OqiR+2z1p6H6goXW9PDv5FMCe+h X-Received: by 2002:a05:6a20:d80e:b0:19c:a16c:8ab4 with SMTP id iv14-20020a056a20d80e00b0019ca16c8ab4mr29477003pzb.47.1708572578231; Wed, 21 Feb 2024 19:29:38 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708572578; cv=pass; d=google.com; s=arc-20160816; b=FUeah05leBug7pz3cxpH0dsYSkqzssvn+cYnfW4k/81yXlIpun5ilHTTx2QnxkZBE7 mdhgvDCWv6Tipvlm0uq8dEZx52/pindclWkKRUT2x/yk7TCabEmXWRIhUxVUwdwVYxVU 8KD59XBfeUjd12tnb3y/+i2BFDSNmGW8WH4X0PhgskPB43Jj5q8emDJZVZS7P/sR0Iao k8o7zZvQMHmWgq95cQVr0D7hHxj38Njmrb6Q9BCraSnEo0DNE/UPku8KL36Xd8NS27o1 j+5de0UdxQazPLOadPeNMdfqpIsJCZcT4OGMQIbULjYGq4XYBjOZRWYMpLvDkSsiqufu rzpA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=NzZLPztwdrKRfVVcfuQOUwJ1xzJ6fbu3/u6Wl5gG4Go=; fh=QPRkXqumSCsPIO8Xwon8471RgDncjW8zGe30eUGTd0E=; b=fWp9KQWKVt8ivgYuiIXx76B96eKDJPbQqCcaVw72VtegtbtFXcUBButhIwYEGxD/ba 63Y7zaujMU8Ws5aI7fDdWiRomeS2QrnPVOIGqNoYlBhpMbIq5DghlGYib8qoXXs6y2fj /yABs4zMGSbbmXLlDR3A7Kjly17W/rYZ36b1JBfxNuw8jat+NzlV1eKFlUPewpVzx0OY FeXXpj4Eh5T/mrH94aPQJCSuqXLKbYgVQTMnMAvAcJZJyQaSuMw5jQ2kG0zLIow5Y2E3 YigQnpVzgssyDce6hkO3WkeQ94nGjIoPXM1tQYFrjlOprFCasNOFVsO6s5uJNxn8wQ8B DLxw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=loongson.cn); spf=pass (google.com: domain of linux-kernel+bounces-75852-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-75852-ouuuleilei=gmail.com@vger.kernel.org" Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id q15-20020a170902eb8f00b001dbc70ef010si8905504plg.362.2024.02.21.19.29.38 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Feb 2024 19:29:38 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-75852-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=loongson.cn); spf=pass (google.com: domain of linux-kernel+bounces-75852-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-75852-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id 0FB612846C4 for <ouuuleilei@gmail.com>; Thu, 22 Feb 2024 03:29:38 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 343AD224FB; Thu, 22 Feb 2024 03:28:14 +0000 (UTC) Received: from mail.loongson.cn (mail.loongson.cn [114.242.206.163]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0A0061755B; Thu, 22 Feb 2024 03:28:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=114.242.206.163 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708572492; cv=none; b=rDEXo1jllGG1+XgVK9yQ2Zx8R7dmLinrfJVs/I4aJInv1vegSBusNzPHNGXUORauduvOeKYK1oxYL6dbJkU1Twr3W4lIlLh6vJPLqXh/TKbI2Ra21SsTAr7IfcnHyHmyAQw6YWiGqyrID7Qc0pr1PUqllbVdHBQ9RG4MR8jMunA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708572492; c=relaxed/simple; bh=SAcM9fY/E3zA0R7+PHS0Z3hmSaLbgnUcRPKZ6FV4PGI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=vFwwLJFI669dqWGcX+PI1opKmvz05GqcWrYyyL2jYs9so9CezGjvTEDsvlRyagz07Hky7sC2NkHY46nLc/es7c5yyZuAdAMPHAUi0b9AMU4rUs6LGd0tP36ZMdzpbDzqJpn3P+F41SCiCvD7y7O9VP4mWAS3W4reJUjK515thow= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=loongson.cn; spf=pass smtp.mailfrom=loongson.cn; arc=none smtp.client-ip=114.242.206.163 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=loongson.cn Received: from loongson.cn (unknown [10.2.5.213]) by gateway (Coremail) with SMTP id _____8CxmehIv9ZlYwoQAA--.22075S3; Thu, 22 Feb 2024 11:28:08 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.213]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxrhNDv9Zl+nM+AA--.41033S5; Thu, 22 Feb 2024 11:28:06 +0800 (CST) From: Bibo Mao <maobibo@loongson.cn> To: Huacai Chen <chenhuacai@kernel.org>, Tianrui Zhao <zhaotianrui@loongson.cn>, Juergen Gross <jgross@suse.com>, Paolo Bonzini <pbonzini@redhat.com> Cc: loongarch@lists.linux.dev, linux-kernel@vger.kernel.org, virtualization@lists.linux.dev, kvm@vger.kernel.org Subject: [PATCH v5 3/6] LoongArch: KVM: Add cpucfg area for kvm hypervisor Date: Thu, 22 Feb 2024 11:28:00 +0800 Message-Id: <20240222032803.2177856-4-maobibo@loongson.cn> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20240222032803.2177856-1-maobibo@loongson.cn> References: <20240222032803.2177856-1-maobibo@loongson.cn> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-CM-TRANSID: AQAAf8CxrhNDv9Zl+nM+AA--.41033S5 X-CM-SenderInfo: xpdruxter6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBj93XoWxGr13ZF15ZryftrW5WF48uFX_yoWrJrW3pF Z7Zr1kWr48GryfA39rt3y5Wrs8uF4kGr12qFW3t3yUCF42qryrZr4vkrZFyFyDKws5C3WI qF15tr1aqF4DAabCm3ZEXasCq-sJn29KB7ZKAUJUUUU7529EdanIXcx71UUUUU7KY7ZEXa sCq-sGcSsGvfJ3Ic02F40EFcxC0VAKzVAqx4xG6I80ebIjqfuFe4nvWSU5nxnvy29KBjDU 0xBIdaVrnRJUUUBab4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k26cxKx2 IYs7xG6rWj6s0DM7CIcVAFz4kK6r1Y6r17M28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48v e4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI 0_Cr0_Gr1UM28EF7xvwVC2z280aVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIEc7CjxVAF wI0_Gr1j6F4UJwAaw2AFwI0_Jrv_JF1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqjxCEc2 xF0cIa020Ex4CE44I27wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_ Jw0_WrylYx0Ex4A2jsIE14v26r4j6F4UMcvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0xvY0x 0EwIxGrwCY1x0262kKe7AKxVWUAVWUtwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwCFI7km07C267AKxVWUXVWUAwC20s026c02F40E14v26r1j6r18MI8I3I0E74 80Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0 I7IYx2IY67AKxVW8JVW5JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04 k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r4j6F4UMIIF0xvEx4A2jsIEc7Cj xVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x07j2MKZUUUUU= X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791568199525034962 X-GMAIL-MSGID: 1791568199525034962 |
Series |
[v5,1/6] LoongArch/smp: Refine some ipi functions on LoongArch platform
|
|
Commit Message
maobibo
Feb. 22, 2024, 3:28 a.m. UTC
Instruction cpucfg can be used to get processor features. And there
is trap exception when it is executed in VM mode, and also it is
to provide cpu features to VM. On real hardware cpucfg area 0 - 20
is used. Here one specified area 0x40000000 -- 0x400000ff is used
for KVM hypervisor to privide PV features, and the area can be extended
for other hypervisors in future. This area will never be used for
real HW, it is only used by software.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
arch/loongarch/include/asm/inst.h | 1 +
arch/loongarch/include/asm/loongarch.h | 10 ++++++
arch/loongarch/kvm/exit.c | 46 +++++++++++++++++---------
3 files changed, 41 insertions(+), 16 deletions(-)
Comments
Hi, Bibo, On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: > > Instruction cpucfg can be used to get processor features. And there > is trap exception when it is executed in VM mode, and also it is > to provide cpu features to VM. On real hardware cpucfg area 0 - 20 > is used. Here one specified area 0x40000000 -- 0x400000ff is used > for KVM hypervisor to privide PV features, and the area can be extended > for other hypervisors in future. This area will never be used for > real HW, it is only used by software. After reading and thinking, I find that the hypercall method which is used in our productive kernel is better than this cpucfg method. Because hypercall is more simple and straightforward, plus we don't worry about conflicting with the real hardware. Huacai > > Signed-off-by: Bibo Mao <maobibo@loongson.cn> > --- > arch/loongarch/include/asm/inst.h | 1 + > arch/loongarch/include/asm/loongarch.h | 10 ++++++ > arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- > 3 files changed, 41 insertions(+), 16 deletions(-) > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h > index d8f637f9e400..ad120f924905 100644 > --- a/arch/loongarch/include/asm/inst.h > +++ b/arch/loongarch/include/asm/inst.h > @@ -67,6 +67,7 @@ enum reg2_op { > revhd_op = 0x11, > extwh_op = 0x16, > extwb_op = 0x17, > + cpucfg_op = 0x1b, > iocsrrdb_op = 0x19200, > iocsrrdh_op = 0x19201, > iocsrrdw_op = 0x19202, > diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h > index 46366e783c84..a1d22e8b6f94 100644 > --- a/arch/loongarch/include/asm/loongarch.h > +++ b/arch/loongarch/include/asm/loongarch.h > @@ -158,6 +158,16 @@ > #define CPUCFG48_VFPU_CG BIT(2) > #define CPUCFG48_RAM_CG BIT(3) > > +/* > + * cpucfg index area: 0x40000000 -- 0x400000ff > + * SW emulation for KVM hypervirsor > + */ > +#define CPUCFG_KVM_BASE 0x40000000UL > +#define CPUCFG_KVM_SIZE 0x100 > +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE > +#define KVM_SIGNATURE "KVM\0" > +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) > + > #ifndef __ASSEMBLY__ > > /* CSR */ > diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c > index 923bbca9bd22..6a38fd59d86d 100644 > --- a/arch/loongarch/kvm/exit.c > +++ b/arch/loongarch/kvm/exit.c > @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) > return EMULATE_DONE; > } > > -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) > +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) > { > int rd, rj; > unsigned int index; > + > + rd = inst.reg2_format.rd; > + rj = inst.reg2_format.rj; > + ++vcpu->stat.cpucfg_exits; > + index = vcpu->arch.gprs[rj]; > + > + /* > + * By LoongArch Reference Manual 2.2.10.5 > + * Return value is 0 for undefined cpucfg index > + */ > + switch (index) { > + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): > + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; > + break; > + case CPUCFG_KVM_SIG: > + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; > + break; > + default: > + vcpu->arch.gprs[rd] = 0; > + break; > + } > + > + return EMULATE_DONE; > +} > + > +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) > +{ > unsigned long curr_pc; > larch_inst inst; > enum emulation_result er = EMULATE_DONE; > @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) > er = EMULATE_FAIL; > switch (((inst.word >> 24) & 0xff)) { > case 0x0: /* CPUCFG GSPR */ > - if (inst.reg2_format.opcode == 0x1B) { > - rd = inst.reg2_format.rd; > - rj = inst.reg2_format.rj; > - ++vcpu->stat.cpucfg_exits; > - index = vcpu->arch.gprs[rj]; > - er = EMULATE_DONE; > - /* > - * By LoongArch Reference Manual 2.2.10.5 > - * return value is 0 for undefined cpucfg index > - */ > - if (index < KVM_MAX_CPUCFG_REGS) > - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; > - else > - vcpu->arch.gprs[rd] = 0; > - } > + if (inst.reg2_format.opcode == cpucfg_op) > + er = kvm_emu_cpucfg(vcpu, inst); > break; > case 0x4: /* CSR{RD,WR,XCHG} GSPR */ > er = kvm_handle_csr(vcpu, inst); > -- > 2.39.3 >
On 2024/2/24 下午5:13, Huacai Chen wrote: > Hi, Bibo, > > On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >> >> Instruction cpucfg can be used to get processor features. And there >> is trap exception when it is executed in VM mode, and also it is >> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >> is used. Here one specified area 0x40000000 -- 0x400000ff is used >> for KVM hypervisor to privide PV features, and the area can be extended >> for other hypervisors in future. This area will never be used for >> real HW, it is only used by software. > After reading and thinking, I find that the hypercall method which is > used in our productive kernel is better than this cpucfg method. > Because hypercall is more simple and straightforward, plus we don't > worry about conflicting with the real hardware. No, I do not think so. cpucfg is simper than hypercall, hypercall can be in effect when system runs in guest mode. In some scenario like TCG mode, hypercall is illegal intruction, however cpucfg can work. Extioi virtualization extension will be added later, cpucfg can be used to get extioi features. It is unlikely that extioi driver depends on PARA_VIRT macro if hypercall is used to get features. Regards Bibo Mao > > Huacai > >> >> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >> --- >> arch/loongarch/include/asm/inst.h | 1 + >> arch/loongarch/include/asm/loongarch.h | 10 ++++++ >> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- >> 3 files changed, 41 insertions(+), 16 deletions(-) >> >> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h >> index d8f637f9e400..ad120f924905 100644 >> --- a/arch/loongarch/include/asm/inst.h >> +++ b/arch/loongarch/include/asm/inst.h >> @@ -67,6 +67,7 @@ enum reg2_op { >> revhd_op = 0x11, >> extwh_op = 0x16, >> extwb_op = 0x17, >> + cpucfg_op = 0x1b, >> iocsrrdb_op = 0x19200, >> iocsrrdh_op = 0x19201, >> iocsrrdw_op = 0x19202, >> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h >> index 46366e783c84..a1d22e8b6f94 100644 >> --- a/arch/loongarch/include/asm/loongarch.h >> +++ b/arch/loongarch/include/asm/loongarch.h >> @@ -158,6 +158,16 @@ >> #define CPUCFG48_VFPU_CG BIT(2) >> #define CPUCFG48_RAM_CG BIT(3) >> >> +/* >> + * cpucfg index area: 0x40000000 -- 0x400000ff >> + * SW emulation for KVM hypervirsor >> + */ >> +#define CPUCFG_KVM_BASE 0x40000000UL >> +#define CPUCFG_KVM_SIZE 0x100 >> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE >> +#define KVM_SIGNATURE "KVM\0" >> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) >> + >> #ifndef __ASSEMBLY__ >> >> /* CSR */ >> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c >> index 923bbca9bd22..6a38fd59d86d 100644 >> --- a/arch/loongarch/kvm/exit.c >> +++ b/arch/loongarch/kvm/exit.c >> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) >> return EMULATE_DONE; >> } >> >> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) >> { >> int rd, rj; >> unsigned int index; >> + >> + rd = inst.reg2_format.rd; >> + rj = inst.reg2_format.rj; >> + ++vcpu->stat.cpucfg_exits; >> + index = vcpu->arch.gprs[rj]; >> + >> + /* >> + * By LoongArch Reference Manual 2.2.10.5 >> + * Return value is 0 for undefined cpucfg index >> + */ >> + switch (index) { >> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): >> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >> + break; >> + case CPUCFG_KVM_SIG: >> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; >> + break; >> + default: >> + vcpu->arch.gprs[rd] = 0; >> + break; >> + } >> + >> + return EMULATE_DONE; >> +} >> + >> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >> +{ >> unsigned long curr_pc; >> larch_inst inst; >> enum emulation_result er = EMULATE_DONE; >> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >> er = EMULATE_FAIL; >> switch (((inst.word >> 24) & 0xff)) { >> case 0x0: /* CPUCFG GSPR */ >> - if (inst.reg2_format.opcode == 0x1B) { >> - rd = inst.reg2_format.rd; >> - rj = inst.reg2_format.rj; >> - ++vcpu->stat.cpucfg_exits; >> - index = vcpu->arch.gprs[rj]; >> - er = EMULATE_DONE; >> - /* >> - * By LoongArch Reference Manual 2.2.10.5 >> - * return value is 0 for undefined cpucfg index >> - */ >> - if (index < KVM_MAX_CPUCFG_REGS) >> - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >> - else >> - vcpu->arch.gprs[rd] = 0; >> - } >> + if (inst.reg2_format.opcode == cpucfg_op) >> + er = kvm_emu_cpucfg(vcpu, inst); >> break; >> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ >> er = kvm_handle_csr(vcpu, inst); >> -- >> 2.39.3 >>
On 2024/2/24 下午5:13, Huacai Chen wrote: > Hi, Bibo, > > On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >> >> Instruction cpucfg can be used to get processor features. And there >> is trap exception when it is executed in VM mode, and also it is >> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >> is used. Here one specified area 0x40000000 -- 0x400000ff is used >> for KVM hypervisor to privide PV features, and the area can be extended >> for other hypervisors in future. This area will never be used for >> real HW, it is only used by software. > After reading and thinking, I find that the hypercall method which is > used in our productive kernel is better than this cpucfg method. > Because hypercall is more simple and straightforward, plus we don't > worry about conflicting with the real hardware. About cpucfg area 0x40000000 -- 0x400000ff, I have negotiated with chip designer. This area will never be used with real hardware. Regards Bibo Mao > > Huacai > >> >> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >> --- >> arch/loongarch/include/asm/inst.h | 1 + >> arch/loongarch/include/asm/loongarch.h | 10 ++++++ >> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- >> 3 files changed, 41 insertions(+), 16 deletions(-) >> >> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h >> index d8f637f9e400..ad120f924905 100644 >> --- a/arch/loongarch/include/asm/inst.h >> +++ b/arch/loongarch/include/asm/inst.h >> @@ -67,6 +67,7 @@ enum reg2_op { >> revhd_op = 0x11, >> extwh_op = 0x16, >> extwb_op = 0x17, >> + cpucfg_op = 0x1b, >> iocsrrdb_op = 0x19200, >> iocsrrdh_op = 0x19201, >> iocsrrdw_op = 0x19202, >> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h >> index 46366e783c84..a1d22e8b6f94 100644 >> --- a/arch/loongarch/include/asm/loongarch.h >> +++ b/arch/loongarch/include/asm/loongarch.h >> @@ -158,6 +158,16 @@ >> #define CPUCFG48_VFPU_CG BIT(2) >> #define CPUCFG48_RAM_CG BIT(3) >> >> +/* >> + * cpucfg index area: 0x40000000 -- 0x400000ff >> + * SW emulation for KVM hypervirsor >> + */ >> +#define CPUCFG_KVM_BASE 0x40000000UL >> +#define CPUCFG_KVM_SIZE 0x100 >> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE >> +#define KVM_SIGNATURE "KVM\0" >> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) >> + >> #ifndef __ASSEMBLY__ >> >> /* CSR */ >> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c >> index 923bbca9bd22..6a38fd59d86d 100644 >> --- a/arch/loongarch/kvm/exit.c >> +++ b/arch/loongarch/kvm/exit.c >> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) >> return EMULATE_DONE; >> } >> >> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) >> { >> int rd, rj; >> unsigned int index; >> + >> + rd = inst.reg2_format.rd; >> + rj = inst.reg2_format.rj; >> + ++vcpu->stat.cpucfg_exits; >> + index = vcpu->arch.gprs[rj]; >> + >> + /* >> + * By LoongArch Reference Manual 2.2.10.5 >> + * Return value is 0 for undefined cpucfg index >> + */ >> + switch (index) { >> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): >> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >> + break; >> + case CPUCFG_KVM_SIG: >> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; >> + break; >> + default: >> + vcpu->arch.gprs[rd] = 0; >> + break; >> + } >> + >> + return EMULATE_DONE; >> +} >> + >> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >> +{ >> unsigned long curr_pc; >> larch_inst inst; >> enum emulation_result er = EMULATE_DONE; >> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >> er = EMULATE_FAIL; >> switch (((inst.word >> 24) & 0xff)) { >> case 0x0: /* CPUCFG GSPR */ >> - if (inst.reg2_format.opcode == 0x1B) { >> - rd = inst.reg2_format.rd; >> - rj = inst.reg2_format.rj; >> - ++vcpu->stat.cpucfg_exits; >> - index = vcpu->arch.gprs[rj]; >> - er = EMULATE_DONE; >> - /* >> - * By LoongArch Reference Manual 2.2.10.5 >> - * return value is 0 for undefined cpucfg index >> - */ >> - if (index < KVM_MAX_CPUCFG_REGS) >> - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >> - else >> - vcpu->arch.gprs[rd] = 0; >> - } >> + if (inst.reg2_format.opcode == cpucfg_op) >> + er = kvm_emu_cpucfg(vcpu, inst); >> break; >> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ >> er = kvm_handle_csr(vcpu, inst); >> -- >> 2.39.3 >>
Hi, On 2/26/24 10:04, maobibo wrote: > On 2024/2/24 下午5:13, Huacai Chen wrote: >> Hi, Bibo, >> >> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >>> >>> Instruction cpucfg can be used to get processor features. And there >>> is trap exception when it is executed in VM mode, and also it is >>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>> for KVM hypervisor to privide PV features, and the area can be extended >>> for other hypervisors in future. This area will never be used for >>> real HW, it is only used by software. >> After reading and thinking, I find that the hypercall method which is >> used in our productive kernel is better than this cpucfg method. >> Because hypercall is more simple and straightforward, plus we don't >> worry about conflicting with the real hardware. > No, I do not think so. cpucfg is simper than hypercall, hypercall can > be in effect when system runs in guest mode. In some scenario like TCG > mode, hypercall is illegal intruction, however cpucfg can work. While the CPUCFG instruction is universally available, it's also unprivileged, so any additional CPUCFG behavior also automatically becomes UAPI, which likely isn't what you expect. Hypervisor implementation details shouldn't be leaked to userland because it has no reason to care -- even though userland learns about the capabilities, it cannot actually access the resources, because relevant CSRs and/or instructions are privileged. Worse, the unnecessary exposure of information could be a problem security-wise. A possible way to preserve the unprivileged CPUCFG behavior would be acting differently based on guest CSR.CRMD.PLV: only returning data for the new configuration space when guest is not in PLV3. But this behavior isn't explicitly allowed nor disallowed in the LoongArch manuals, and is in my opinion unnecessarily complex. And regarding the lack of hypcall support from QEMU system mode emulation on TCG, I'd argue it's simply a matter of adding support in target/loongarch64. This would be attractive because it will enable easy development and testing of hypervisor software with QEMU -- both locally and in CI. > Extioi virtualization extension will be added later, cpucfg can be > used to get extioi features. It is unlikely that extioi driver depends > on PARA_VIRT macro if hypercall is used to get features. And the EXTIOI feature too isn't something usable from unprivileged code, so I don't think it will affect the conclusions above.
On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: > > > > On 2024/2/24 下午5:13, Huacai Chen wrote: > > Hi, Bibo, > > > > On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: > >> > >> Instruction cpucfg can be used to get processor features. And there > >> is trap exception when it is executed in VM mode, and also it is > >> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 > >> is used. Here one specified area 0x40000000 -- 0x400000ff is used > >> for KVM hypervisor to privide PV features, and the area can be extended > >> for other hypervisors in future. This area will never be used for > >> real HW, it is only used by software. > > After reading and thinking, I find that the hypercall method which is > > used in our productive kernel is better than this cpucfg method. > > Because hypercall is more simple and straightforward, plus we don't > > worry about conflicting with the real hardware. > No, I do not think so. cpucfg is simper than hypercall, hypercall can > be in effect when system runs in guest mode. In some scenario like TCG > mode, hypercall is illegal intruction, however cpucfg can work. Nearly all architectures use hypercall except x86 for its historical reasons. If we use CPUCFG, then the hypervisor information is unnecessarily leaked to userspace, and this may be a security issue. Meanwhile, I don't think TCG mode needs PV features. I consulted with Jiaxun before, and maybe he can give some more comments. > > Extioi virtualization extension will be added later, cpucfg can be used > to get extioi features. It is unlikely that extioi driver depends on > PARA_VIRT macro if hypercall is used to get features. CPUCFG is per-core information, if we really need something about extioi, it should be in iocsr (LOONGARCH_IOCSR_FEATURES). Huacai > > Regards > Bibo Mao > > > > > Huacai > > > >> > >> Signed-off-by: Bibo Mao <maobibo@loongson.cn> > >> --- > >> arch/loongarch/include/asm/inst.h | 1 + > >> arch/loongarch/include/asm/loongarch.h | 10 ++++++ > >> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- > >> 3 files changed, 41 insertions(+), 16 deletions(-) > >> > >> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h > >> index d8f637f9e400..ad120f924905 100644 > >> --- a/arch/loongarch/include/asm/inst.h > >> +++ b/arch/loongarch/include/asm/inst.h > >> @@ -67,6 +67,7 @@ enum reg2_op { > >> revhd_op = 0x11, > >> extwh_op = 0x16, > >> extwb_op = 0x17, > >> + cpucfg_op = 0x1b, > >> iocsrrdb_op = 0x19200, > >> iocsrrdh_op = 0x19201, > >> iocsrrdw_op = 0x19202, > >> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h > >> index 46366e783c84..a1d22e8b6f94 100644 > >> --- a/arch/loongarch/include/asm/loongarch.h > >> +++ b/arch/loongarch/include/asm/loongarch.h > >> @@ -158,6 +158,16 @@ > >> #define CPUCFG48_VFPU_CG BIT(2) > >> #define CPUCFG48_RAM_CG BIT(3) > >> > >> +/* > >> + * cpucfg index area: 0x40000000 -- 0x400000ff > >> + * SW emulation for KVM hypervirsor > >> + */ > >> +#define CPUCFG_KVM_BASE 0x40000000UL > >> +#define CPUCFG_KVM_SIZE 0x100 > >> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE > >> +#define KVM_SIGNATURE "KVM\0" > >> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) > >> + > >> #ifndef __ASSEMBLY__ > >> > >> /* CSR */ > >> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c > >> index 923bbca9bd22..6a38fd59d86d 100644 > >> --- a/arch/loongarch/kvm/exit.c > >> +++ b/arch/loongarch/kvm/exit.c > >> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) > >> return EMULATE_DONE; > >> } > >> > >> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) > >> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) > >> { > >> int rd, rj; > >> unsigned int index; > >> + > >> + rd = inst.reg2_format.rd; > >> + rj = inst.reg2_format.rj; > >> + ++vcpu->stat.cpucfg_exits; > >> + index = vcpu->arch.gprs[rj]; > >> + > >> + /* > >> + * By LoongArch Reference Manual 2.2.10.5 > >> + * Return value is 0 for undefined cpucfg index > >> + */ > >> + switch (index) { > >> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): > >> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; > >> + break; > >> + case CPUCFG_KVM_SIG: > >> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; > >> + break; > >> + default: > >> + vcpu->arch.gprs[rd] = 0; > >> + break; > >> + } > >> + > >> + return EMULATE_DONE; > >> +} > >> + > >> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) > >> +{ > >> unsigned long curr_pc; > >> larch_inst inst; > >> enum emulation_result er = EMULATE_DONE; > >> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) > >> er = EMULATE_FAIL; > >> switch (((inst.word >> 24) & 0xff)) { > >> case 0x0: /* CPUCFG GSPR */ > >> - if (inst.reg2_format.opcode == 0x1B) { > >> - rd = inst.reg2_format.rd; > >> - rj = inst.reg2_format.rj; > >> - ++vcpu->stat.cpucfg_exits; > >> - index = vcpu->arch.gprs[rj]; > >> - er = EMULATE_DONE; > >> - /* > >> - * By LoongArch Reference Manual 2.2.10.5 > >> - * return value is 0 for undefined cpucfg index > >> - */ > >> - if (index < KVM_MAX_CPUCFG_REGS) > >> - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; > >> - else > >> - vcpu->arch.gprs[rd] = 0; > >> - } > >> + if (inst.reg2_format.opcode == cpucfg_op) > >> + er = kvm_emu_cpucfg(vcpu, inst); > >> break; > >> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ > >> er = kvm_handle_csr(vcpu, inst); > >> -- > >> 2.39.3 > >> > >
On 2024/2/26 下午1:25, WANG Xuerui wrote: > Hi, > > On 2/26/24 10:04, maobibo wrote: >> On 2024/2/24 下午5:13, Huacai Chen wrote: >>> Hi, Bibo, >>> >>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >>>> >>>> Instruction cpucfg can be used to get processor features. And there >>>> is trap exception when it is executed in VM mode, and also it is >>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>> for KVM hypervisor to privide PV features, and the area can be extended >>>> for other hypervisors in future. This area will never be used for >>>> real HW, it is only used by software. >>> After reading and thinking, I find that the hypercall method which is >>> used in our productive kernel is better than this cpucfg method. >>> Because hypercall is more simple and straightforward, plus we don't >>> worry about conflicting with the real hardware. >> No, I do not think so. cpucfg is simper than hypercall, hypercall can >> be in effect when system runs in guest mode. In some scenario like TCG >> mode, hypercall is illegal intruction, however cpucfg can work. > > While the CPUCFG instruction is universally available, it's also > unprivileged, so any additional CPUCFG behavior also automatically > becomes UAPI, which likely isn't what you expect. Hypervisor > implementation details shouldn't be leaked to userland because it has no > reason to care -- even though userland learns about the capabilities, it > cannot actually access the resources, because relevant CSRs and/or > instructions are privileged. Worse, the unnecessary exposure of > information could be a problem security-wise. cpucfg is read-only and used to represent current hw cpu features, why do you think there is security issue? Is there security issue about cpucfg2 and cpucfg6 since it can be accessed in user space also? PMU feature is defined in cpucfg6, PMU driver is written in kernel mode. > > A possible way to preserve the unprivileged CPUCFG behavior would be > acting differently based on guest CSR.CRMD.PLV: only returning data for > the new configuration space when guest is not in PLV3. But this behavior > isn't explicitly allowed nor disallowed in the LoongArch manuals, and is > in my opinion unnecessarily complex. > > And regarding the lack of hypcall support from QEMU system mode > emulation on TCG, I'd argue it's simply a matter of adding support in > target/loongarch64. This would be attractive because it will enable easy > development and testing of hypervisor software with QEMU -- both locally > and in CI. Hypercall is part of hardware assisted virtualization LVZ, do you think only adding hypercall instruction withou LVZ is possible? > >> Extioi virtualization extension will be added later, cpucfg can be >> used to get extioi features. It is unlikely that extioi driver depends >> on PARA_VIRT macro if hypercall is used to get features. > And the EXTIOI feature too isn't something usable from unprivileged > code, so I don't think it will affect the conclusions above. Sorry, I do not know what do you mean. Regards Bibo Mao >
On 2024/2/26 下午2:12, Huacai Chen wrote: > On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >> >> >> >> On 2024/2/24 下午5:13, Huacai Chen wrote: >>> Hi, Bibo, >>> >>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >>>> >>>> Instruction cpucfg can be used to get processor features. And there >>>> is trap exception when it is executed in VM mode, and also it is >>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>> for KVM hypervisor to privide PV features, and the area can be extended >>>> for other hypervisors in future. This area will never be used for >>>> real HW, it is only used by software. >>> After reading and thinking, I find that the hypercall method which is >>> used in our productive kernel is better than this cpucfg method. >>> Because hypercall is more simple and straightforward, plus we don't >>> worry about conflicting with the real hardware. >> No, I do not think so. cpucfg is simper than hypercall, hypercall can >> be in effect when system runs in guest mode. In some scenario like TCG >> mode, hypercall is illegal intruction, however cpucfg can work. > Nearly all architectures use hypercall except x86 for its historical Only x86 support multiple hypervisors and there is multiple hypervisor in x86 only. It is an advantage, not historical reason. > reasons. If we use CPUCFG, then the hypervisor information is > unnecessarily leaked to userspace, and this may be a security issue. > Meanwhile, I don't think TCG mode needs PV features. Besides PV features, there is other features different with real hw such as virtio device, virtual interrupt controller. Regards Bibo Mao > > I consulted with Jiaxun before, and maybe he can give some more comments. > >> >> Extioi virtualization extension will be added later, cpucfg can be used >> to get extioi features. It is unlikely that extioi driver depends on >> PARA_VIRT macro if hypercall is used to get features. > CPUCFG is per-core information, if we really need something about > extioi, it should be in iocsr (LOONGARCH_IOCSR_FEATURES). > > > Huacai > >> >> Regards >> Bibo Mao >> >>> >>> Huacai >>> >>>> >>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>> --- >>>> arch/loongarch/include/asm/inst.h | 1 + >>>> arch/loongarch/include/asm/loongarch.h | 10 ++++++ >>>> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- >>>> 3 files changed, 41 insertions(+), 16 deletions(-) >>>> >>>> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h >>>> index d8f637f9e400..ad120f924905 100644 >>>> --- a/arch/loongarch/include/asm/inst.h >>>> +++ b/arch/loongarch/include/asm/inst.h >>>> @@ -67,6 +67,7 @@ enum reg2_op { >>>> revhd_op = 0x11, >>>> extwh_op = 0x16, >>>> extwb_op = 0x17, >>>> + cpucfg_op = 0x1b, >>>> iocsrrdb_op = 0x19200, >>>> iocsrrdh_op = 0x19201, >>>> iocsrrdw_op = 0x19202, >>>> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h >>>> index 46366e783c84..a1d22e8b6f94 100644 >>>> --- a/arch/loongarch/include/asm/loongarch.h >>>> +++ b/arch/loongarch/include/asm/loongarch.h >>>> @@ -158,6 +158,16 @@ >>>> #define CPUCFG48_VFPU_CG BIT(2) >>>> #define CPUCFG48_RAM_CG BIT(3) >>>> >>>> +/* >>>> + * cpucfg index area: 0x40000000 -- 0x400000ff >>>> + * SW emulation for KVM hypervirsor >>>> + */ >>>> +#define CPUCFG_KVM_BASE 0x40000000UL >>>> +#define CPUCFG_KVM_SIZE 0x100 >>>> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE >>>> +#define KVM_SIGNATURE "KVM\0" >>>> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) >>>> + >>>> #ifndef __ASSEMBLY__ >>>> >>>> /* CSR */ >>>> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c >>>> index 923bbca9bd22..6a38fd59d86d 100644 >>>> --- a/arch/loongarch/kvm/exit.c >>>> +++ b/arch/loongarch/kvm/exit.c >>>> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) >>>> return EMULATE_DONE; >>>> } >>>> >>>> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) >>>> { >>>> int rd, rj; >>>> unsigned int index; >>>> + >>>> + rd = inst.reg2_format.rd; >>>> + rj = inst.reg2_format.rj; >>>> + ++vcpu->stat.cpucfg_exits; >>>> + index = vcpu->arch.gprs[rj]; >>>> + >>>> + /* >>>> + * By LoongArch Reference Manual 2.2.10.5 >>>> + * Return value is 0 for undefined cpucfg index >>>> + */ >>>> + switch (index) { >>>> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): >>>> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>> + break; >>>> + case CPUCFG_KVM_SIG: >>>> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; >>>> + break; >>>> + default: >>>> + vcpu->arch.gprs[rd] = 0; >>>> + break; >>>> + } >>>> + >>>> + return EMULATE_DONE; >>>> +} >>>> + >>>> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>> +{ >>>> unsigned long curr_pc; >>>> larch_inst inst; >>>> enum emulation_result er = EMULATE_DONE; >>>> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>> er = EMULATE_FAIL; >>>> switch (((inst.word >> 24) & 0xff)) { >>>> case 0x0: /* CPUCFG GSPR */ >>>> - if (inst.reg2_format.opcode == 0x1B) { >>>> - rd = inst.reg2_format.rd; >>>> - rj = inst.reg2_format.rj; >>>> - ++vcpu->stat.cpucfg_exits; >>>> - index = vcpu->arch.gprs[rj]; >>>> - er = EMULATE_DONE; >>>> - /* >>>> - * By LoongArch Reference Manual 2.2.10.5 >>>> - * return value is 0 for undefined cpucfg index >>>> - */ >>>> - if (index < KVM_MAX_CPUCFG_REGS) >>>> - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>> - else >>>> - vcpu->arch.gprs[rd] = 0; >>>> - } >>>> + if (inst.reg2_format.opcode == cpucfg_op) >>>> + er = kvm_emu_cpucfg(vcpu, inst); >>>> break; >>>> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ >>>> er = kvm_handle_csr(vcpu, inst); >>>> -- >>>> 2.39.3 >>>> >> >>
On 2/26/24 16:00, maobibo wrote: > > > On 2024/2/26 下午1:25, WANG Xuerui wrote: >> Hi, >> >> On 2/26/24 10:04, maobibo wrote: >>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>> Hi, Bibo, >>>> >>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >>>>> >>>>> Instruction cpucfg can be used to get processor features. And there >>>>> is trap exception when it is executed in VM mode, and also it is >>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>> for KVM hypervisor to privide PV features, and the area can be >>>>> extended >>>>> for other hypervisors in future. This area will never be used for >>>>> real HW, it is only used by software. >>>> After reading and thinking, I find that the hypercall method which is >>>> used in our productive kernel is better than this cpucfg method. >>>> Because hypercall is more simple and straightforward, plus we don't >>>> worry about conflicting with the real hardware. >>> No, I do not think so. cpucfg is simper than hypercall, hypercall can >>> be in effect when system runs in guest mode. In some scenario like >>> TCG mode, hypercall is illegal intruction, however cpucfg can work. >> >> While the CPUCFG instruction is universally available, it's also >> unprivileged, so any additional CPUCFG behavior also automatically >> becomes UAPI, which likely isn't what you expect. Hypervisor >> implementation details shouldn't be leaked to userland because it has >> no reason to care -- even though userland learns about the >> capabilities, it cannot actually access the resources, because >> relevant CSRs and/or instructions are privileged. Worse, the >> unnecessary exposure of information could be a problem security-wise. > cpucfg is read-only and used to represent current hw cpu features, > why do you think there is security issue? Is there security issue about > cpucfg2 and cpucfg6 since it can be accessed in user space also? > > PMU feature is defined in cpucfg6, PMU driver is written in kernel mode. These CPUCFG leaves were added before existence of LoongArch were publicized, without community review. If early drafts of the manual were available to community reviewers, at least I would strongly NAK it. >> >> A possible way to preserve the unprivileged CPUCFG behavior would be >> acting differently based on guest CSR.CRMD.PLV: only returning data >> for the new configuration space when guest is not in PLV3. But this >> behavior isn't explicitly allowed nor disallowed in the LoongArch >> manuals, and is in my opinion unnecessarily complex. >> >> And regarding the lack of hypcall support from QEMU system mode >> emulation on TCG, I'd argue it's simply a matter of adding support in >> target/loongarch64. This would be attractive because it will enable >> easy development and testing of hypervisor software with QEMU -- both >> locally and in CI. > Hypercall is part of hardware assisted virtualization LVZ, do you think > only adding hypercall instruction withou LVZ is possible? I cannot comment on the actual feasibility of doing so, because I don't have access to the LVZ manuals which *still* isn't publicly available. But from my intuition it should be a more-or-less trivial processor mode transition like with syscall -- whether that's indeed the case I can't (dis)prove. >>> Extioi virtualization extension will be added later, cpucfg can be >>> used to get extioi features. It is unlikely that extioi driver >>> depends on PARA_VIRT macro if hypercall is used to get features. >> And the EXTIOI feature too isn't something usable from unprivileged >> code, so I don't think it will affect the conclusions above. > Sorry, I do not know what do you mean. I was just saying this example provided no additional information at least for me -- while it's appreciated that you informed the community of your intended future use case, like what I stated in the first paragraph in my reply, it looked essentially the same because both PV and EXTIOI are privileged things.
在2024年2月26日二月 上午8:04,maobibo写道: > On 2024/2/26 下午2:12, Huacai Chen wrote: >> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >>> >>> >>> >>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>> Hi, Bibo, >>>> >>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongsoncn> wrote: >>>>> >>>>> Instruction cpucfg can be used to get processor features. And there >>>>> is trap exception when it is executed in VM mode, and also it is >>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>> for KVM hypervisor to privide PV features, and the area can be extended >>>>> for other hypervisors in future. This area will never be used for >>>>> real HW, it is only used by software. >>>> After reading and thinking, I find that the hypercall method which is >>>> used in our productive kernel is better than this cpucfg method. >>>> Because hypercall is more simple and straightforward, plus we don't >>>> worry about conflicting with the real hardware. >>> No, I do not think so. cpucfg is simper than hypercall, hypercall can >>> be in effect when system runs in guest mode. In some scenario like TCG >>> mode, hypercall is illegal intruction, however cpucfg can work. >> Nearly all architectures use hypercall except x86 for its historical > Only x86 support multiple hypervisors and there is multiple hypervisor > in x86 only. It is an advantage, not historical reason. I do believe that all those stuff should not be exposed to guest user space for security reasons. Also for different implementations of hypervisors they may have different PV features behavior, using hypcall to perform feature detection can pass more information to help us cope with hypervisor diversity. > >> reasons. If we use CPUCFG, then the hypervisor information is >> unnecessarily leaked to userspace, and this may be a security issue. >> Meanwhile, I don't think TCG mode needs PV features. > Besides PV features, there is other features different with real hw such > as virtio device, virtual interrupt controller. Those are *device* level information, they must be passed in firmware interfaces to keep processor emulation sane. Thanks > > Regards > Bibo Mao > >> >> I consulted with Jiaxun before, and maybe he can give some more comments. >> >>> >>> Extioi virtualization extension will be added later, cpucfg can be used >>> to get extioi features. It is unlikely that extioi driver depends on >>> PARA_VIRT macro if hypercall is used to get features. >> CPUCFG is per-core information, if we really need something about >> extioi, it should be in iocsr (LOONGARCH_IOCSR_FEATURES). >> >> >> Huacai >> >>> >>> Regards >>> Bibo Mao >>> >>>> >>>> Huacai >>>> >>>>> >>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>>> --- >>>>> arch/loongarch/include/asm/inst.h | 1 + >>>>> arch/loongarch/include/asm/loongarch.h | 10 ++++++ >>>>> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- >>>>> 3 files changed, 41 insertions(+), 16 deletions(-) >>>>> >>>>> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h >>>>> index d8f637f9e400..ad120f924905 100644 >>>>> --- a/arch/loongarch/include/asm/inst.h >>>>> +++ b/arch/loongarch/include/asm/inst.h >>>>> @@ -67,6 +67,7 @@ enum reg2_op { >>>>> revhd_op = 0x11, >>>>> extwh_op = 0x16, >>>>> extwb_op = 0x17, >>>>> + cpucfg_op = 0x1b, >>>>> iocsrrdb_op = 0x19200, >>>>> iocsrrdh_op = 0x19201, >>>>> iocsrrdw_op = 0x19202, >>>>> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h >>>>> index 46366e783c84..a1d22e8b6f94 100644 >>>>> --- a/arch/loongarch/include/asm/loongarch.h >>>>> +++ b/arch/loongarch/include/asm/loongarch.h >>>>> @@ -158,6 +158,16 @@ >>>>> #define CPUCFG48_VFPU_CG BIT(2) >>>>> #define CPUCFG48_RAM_CG BIT(3) >>>>> >>>>> +/* >>>>> + * cpucfg index area: 0x40000000 -- 0x400000ff >>>>> + * SW emulation for KVM hypervirsor >>>>> + */ >>>>> +#define CPUCFG_KVM_BASE 0x40000000UL >>>>> +#define CPUCFG_KVM_SIZE 0x100 >>>>> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE >>>>> +#define KVM_SIGNATURE "KVM\0" >>>>> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) >>>>> + >>>>> #ifndef __ASSEMBLY__ >>>>> >>>>> /* CSR */ >>>>> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c >>>>> index 923bbca9bd22..6a38fd59d86d 100644 >>>>> --- a/arch/loongarch/kvm/exit.c >>>>> +++ b/arch/loongarch/kvm/exit.c >>>>> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) >>>>> return EMULATE_DONE; >>>>> } >>>>> >>>>> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) >>>>> { >>>>> int rd, rj; >>>>> unsigned int index; >>>>> + >>>>> + rd = inst.reg2_format.rd; >>>>> + rj = inst.reg2_format.rj; >>>>> + ++vcpu->stat.cpucfg_exits; >>>>> + index = vcpu->arch.gprs[rj]; >>>>> + >>>>> + /* >>>>> + * By LoongArch Reference Manual 2.2.10.5 >>>>> + * Return value is 0 for undefined cpucfg index >>>>> + */ >>>>> + switch (index) { >>>>> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): >>>>> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>>> + break; >>>>> + case CPUCFG_KVM_SIG: >>>>> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; >>>>> + break; >>>>> + default: >>>>> + vcpu->arch.gprs[rd] = 0; >>>>> + break; >>>>> + } >>>>> + >>>>> + return EMULATE_DONE; >>>>> +} >>>>> + >>>>> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>> +{ >>>>> unsigned long curr_pc; >>>>> larch_inst inst; >>>>> enum emulation_result er = EMULATE_DONE; >>>>> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>> er = EMULATE_FAIL; >>>>> switch (((inst.word >> 24) & 0xff)) { >>>>> case 0x0: /* CPUCFG GSPR */ >>>>> - if (inst.reg2_format.opcode == 0x1B) { >>>>> - rd = inst.reg2_format.rd; >>>>> - rj = inst.reg2_format.rj; >>>>> - ++vcpu->stat.cpucfg_exits; >>>>> - index = vcpu->arch.gprs[rj]; >>>>> - er = EMULATE_DONE; >>>>> - /* >>>>> - * By LoongArch Reference Manual 2.2.10.5 >>>>> - * return value is 0 for undefined cpucfg index >>>>> - */ >>>>> - if (index < KVM_MAX_CPUCFG_REGS) >>>>> - vcpu->arch.gprs[rd] = vcpu->archcpucfg[index]; >>>>> - else >>>>> - vcpu->arch.gprs[rd] = 0; >>>>> - } >>>>> + if (inst.reg2_format.opcode == cpucfg_op) >>>>> + er = kvm_emu_cpucfg(vcpu, inst); >>>>> break; >>>>> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ >>>>> er = kvm_handle_csr(vcpu, inst); >>>>> -- >>>>> 2.39.3 >>>>> >>> >>>
On 2024/2/27 上午4:02, Jiaxun Yang wrote: > > > 在2024年2月26日二月 上午8:04,maobibo写道: >> On 2024/2/26 下午2:12, Huacai Chen wrote: >>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >>>> >>>> >>>> >>>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>>> Hi, Bibo, >>>>> >>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >>>>>> >>>>>> Instruction cpucfg can be used to get processor features. And there >>>>>> is trap exception when it is executed in VM mode, and also it is >>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>>> for KVM hypervisor to privide PV features, and the area can be extended >>>>>> for other hypervisors in future. This area will never be used for >>>>>> real HW, it is only used by software. >>>>> After reading and thinking, I find that the hypercall method which is >>>>> used in our productive kernel is better than this cpucfg method. >>>>> Because hypercall is more simple and straightforward, plus we don't >>>>> worry about conflicting with the real hardware. >>>> No, I do not think so. cpucfg is simper than hypercall, hypercall can >>>> be in effect when system runs in guest mode. In some scenario like TCG >>>> mode, hypercall is illegal intruction, however cpucfg can work. >>> Nearly all architectures use hypercall except x86 for its historical >> Only x86 support multiple hypervisors and there is multiple hypervisor >> in x86 only. It is an advantage, not historical reason. > > I do believe that all those stuff should not be exposed to guest user space > for security reasons. Can you add PLV checking when cpucfg 0x40000000-0x400000FF is emulated? if it is user mode return value is zero and it is kernel mode emulated value will be returned. It can avoid information leaking. > > Also for different implementations of hypervisors they may have different > PV features behavior, using hypcall to perform feature detection > can pass more information to help us cope with hypervisor diversity. How do different hypervisors can be detected firstly? On x86 MSR is used for all hypervisors detection and on ARM64 hyperv used acpi_gbl_FADT and kvm use smc forcely, host mode can execute smc instruction without exception on ARM64. I do not know why hypercall is better than cpucfg on LoongArch, cpucfg is basic intruction however hypercall is not, it is part of LVZ feature. >> >>> reasons. If we use CPUCFG, then the hypervisor information is >>> unnecessarily leaked to userspace, and this may be a security issue. >>> Meanwhile, I don't think TCG mode needs PV features. >> Besides PV features, there is other features different with real hw such >> as virtio device, virtual interrupt controller. > > Those are *device* level information, they must be passed in firmware > interfaces to keep processor emulation sane. File arch/x86/hyperv/hv_apic.c can be referenced, apic features comes from ms_hyperv.hints and HYPERV_CPUID_ENLIGHTMENT_INFO cpuid info, not must be passed by firmware interface. Regards Bibo Mao > > Thanks > >> >> Regards >> Bibo Mao >> >>> >>> I consulted with Jiaxun before, and maybe he can give some more comments. >>> >>>> >>>> Extioi virtualization extension will be added later, cpucfg can be used >>>> to get extioi features. It is unlikely that extioi driver depends on >>>> PARA_VIRT macro if hypercall is used to get features. >>> CPUCFG is per-core information, if we really need something about >>> extioi, it should be in iocsr (LOONGARCH_IOCSR_FEATURES). >>> >>> >>> Huacai >>> >>>> >>>> Regards >>>> Bibo Mao >>>> >>>>> >>>>> Huacai >>>>> >>>>>> >>>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>>>> --- >>>>>> arch/loongarch/include/asm/inst.h | 1 + >>>>>> arch/loongarch/include/asm/loongarch.h | 10 ++++++ >>>>>> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- >>>>>> 3 files changed, 41 insertions(+), 16 deletions(-) >>>>>> >>>>>> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h >>>>>> index d8f637f9e400..ad120f924905 100644 >>>>>> --- a/arch/loongarch/include/asm/inst.h >>>>>> +++ b/arch/loongarch/include/asm/inst.h >>>>>> @@ -67,6 +67,7 @@ enum reg2_op { >>>>>> revhd_op = 0x11, >>>>>> extwh_op = 0x16, >>>>>> extwb_op = 0x17, >>>>>> + cpucfg_op = 0x1b, >>>>>> iocsrrdb_op = 0x19200, >>>>>> iocsrrdh_op = 0x19201, >>>>>> iocsrrdw_op = 0x19202, >>>>>> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h >>>>>> index 46366e783c84..a1d22e8b6f94 100644 >>>>>> --- a/arch/loongarch/include/asm/loongarch.h >>>>>> +++ b/arch/loongarch/include/asm/loongarch.h >>>>>> @@ -158,6 +158,16 @@ >>>>>> #define CPUCFG48_VFPU_CG BIT(2) >>>>>> #define CPUCFG48_RAM_CG BIT(3) >>>>>> >>>>>> +/* >>>>>> + * cpucfg index area: 0x40000000 -- 0x400000ff >>>>>> + * SW emulation for KVM hypervirsor >>>>>> + */ >>>>>> +#define CPUCFG_KVM_BASE 0x40000000UL >>>>>> +#define CPUCFG_KVM_SIZE 0x100 >>>>>> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE >>>>>> +#define KVM_SIGNATURE "KVM\0" >>>>>> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) >>>>>> + >>>>>> #ifndef __ASSEMBLY__ >>>>>> >>>>>> /* CSR */ >>>>>> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c >>>>>> index 923bbca9bd22..6a38fd59d86d 100644 >>>>>> --- a/arch/loongarch/kvm/exit.c >>>>>> +++ b/arch/loongarch/kvm/exit.c >>>>>> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) >>>>>> return EMULATE_DONE; >>>>>> } >>>>>> >>>>>> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) >>>>>> { >>>>>> int rd, rj; >>>>>> unsigned int index; >>>>>> + >>>>>> + rd = inst.reg2_format.rd; >>>>>> + rj = inst.reg2_format.rj; >>>>>> + ++vcpu->stat.cpucfg_exits; >>>>>> + index = vcpu->arch.gprs[rj]; >>>>>> + >>>>>> + /* >>>>>> + * By LoongArch Reference Manual 2.2.10.5 >>>>>> + * Return value is 0 for undefined cpucfg index >>>>>> + */ >>>>>> + switch (index) { >>>>>> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): >>>>>> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>>>> + break; >>>>>> + case CPUCFG_KVM_SIG: >>>>>> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; >>>>>> + break; >>>>>> + default: >>>>>> + vcpu->arch.gprs[rd] = 0; >>>>>> + break; >>>>>> + } >>>>>> + >>>>>> + return EMULATE_DONE; >>>>>> +} >>>>>> + >>>>>> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>> +{ >>>>>> unsigned long curr_pc; >>>>>> larch_inst inst; >>>>>> enum emulation_result er = EMULATE_DONE; >>>>>> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>> er = EMULATE_FAIL; >>>>>> switch (((inst.word >> 24) & 0xff)) { >>>>>> case 0x0: /* CPUCFG GSPR */ >>>>>> - if (inst.reg2_format.opcode == 0x1B) { >>>>>> - rd = inst.reg2_format.rd; >>>>>> - rj = inst.reg2_format.rj; >>>>>> - ++vcpu->stat.cpucfg_exits; >>>>>> - index = vcpu->arch.gprs[rj]; >>>>>> - er = EMULATE_DONE; >>>>>> - /* >>>>>> - * By LoongArch Reference Manual 2.2.10.5 >>>>>> - * return value is 0 for undefined cpucfg index >>>>>> - */ >>>>>> - if (index < KVM_MAX_CPUCFG_REGS) >>>>>> - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>>>> - else >>>>>> - vcpu->arch.gprs[rd] = 0; >>>>>> - } >>>>>> + if (inst.reg2_format.opcode == cpucfg_op) >>>>>> + er = kvm_emu_cpucfg(vcpu, inst); >>>>>> break; >>>>>> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ >>>>>> er = kvm_handle_csr(vcpu, inst); >>>>>> -- >>>>>> 2.39.3 >>>>>> >>>> >>>> >
在2024年2月27日二月 上午3:14,maobibo写道: > On 2024/2/27 上午4:02, Jiaxun Yang wrote: >> >> >> 在2024年2月26日二月 上午8:04,maobibo写道: >>> On 2024/2/26 下午2:12, Huacai Chen wrote: >>>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >>>>> >>>>> >>>>> >>>>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>>>> Hi, Bibo, >>>>>> >>>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >>>>>>> >>>>>>> Instruction cpucfg can be used to get processor features. And there >>>>>>> is trap exception when it is executed in VM mode, and also it is >>>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>>>> for KVM hypervisor to privide PV features, and the area can be extended >>>>>>> for other hypervisors in future. This area will never be used for >>>>>>> real HW, it is only used by software. >>>>>> After reading and thinking, I find that the hypercall method which is >>>>>> used in our productive kernel is better than this cpucfg method. >>>>>> Because hypercall is more simple and straightforward, plus we don't >>>>>> worry about conflicting with the real hardware. >>>>> No, I do not think so. cpucfg is simper than hypercall, hypercall can >>>>> be in effect when system runs in guest mode. In some scenario like TCG >>>>> mode, hypercall is illegal intruction, however cpucfg can work. >>>> Nearly all architectures use hypercall except x86 for its historical >>> Only x86 support multiple hypervisors and there is multiple hypervisor >>> in x86 only. It is an advantage, not historical reason. >> >> I do believe that all those stuff should not be exposed to guest user space >> for security reasons. > Can you add PLV checking when cpucfg 0x40000000-0x400000FF is emulated? > if it is user mode return value is zero and it is kernel mode emulated > value will be returned. It can avoid information leaking. Please don’t do insane stuff here, applications are not expecting exception from cpucfg. > >> >> Also for different implementations of hypervisors they may have different >> PV features behavior, using hypcall to perform feature detection >> can pass more information to help us cope with hypervisor diversity. > How do different hypervisors can be detected firstly? On x86 MSR is > used for all hypervisors detection and on ARM64 hyperv used > acpi_gbl_FADT and kvm use smc forcely, host mode can execute smc > instruction without exception on ARM64. That’s hypcall ABI design choices, those information can come from firmware or privileged CSRs on LoongArch. > > I do not know why hypercall is better than cpucfg on LoongArch, cpucfg > is basic intruction however hypercall is not, it is part of LVZ feature. KVM can only work with LVZ right? > >>> >>>> reasons. If we use CPUCFG, then the hypervisor information is >>>> unnecessarily leaked to userspace, and this may be a security issue. >>>> Meanwhile, I don't think TCG mode needs PV features. >>> Besides PV features, there is other features different with real hw such >>> as virtio device, virtual interrupt controller. >> >> Those are *device* level information, they must be passed in firmware >> interfaces to keep processor emulation sane. > File arch/x86/hyperv/hv_apic.c can be referenced, apic features comes > from ms_hyperv.hints and HYPERV_CPUID_ENLIGHTMENT_INFO cpuid info, not > must be passed by firmware interface. That’s not KVM, that’s Hyper V. At Linux Kernel we enjoy the benefits of better modularity on device abstractions, please don’t break it. Thanks > > Regards > Bibo Mao >> >> Thanks >> >>> >>> Regards >>> Bibo Mao >>> >>>> >>>> I consulted with Jiaxun before, and maybe he can give some more comments. >>>> >>>>> >>>>> Extioi virtualization extension will be added later, cpucfg can be used >>>>> to get extioi features. It is unlikely that extioi driver depends on >>>>> PARA_VIRT macro if hypercall is used to get features. >>>> CPUCFG is per-core information, if we really need something about >>>> extioi, it should be in iocsr (LOONGARCH_IOCSR_FEATURES). >>>> >>>> >>>> Huacai >>>> >>>>> >>>>> Regards >>>>> Bibo Mao >>>>> >>>>>> >>>>>> Huacai >>>>>> >>>>>>> >>>>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>>>>> --- >>>>>>> arch/loongarch/include/asm/inst.h | 1 + >>>>>>> arch/loongarch/include/asm/loongarch.h | 10 ++++++ >>>>>>> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- >>>>>>> 3 files changed, 41 insertions(+), 16 deletions(-) >>>>>>> >>>>>>> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h >>>>>>> index d8f637f9e400..ad120f924905 100644 >>>>>>> --- a/arch/loongarch/include/asm/inst.h >>>>>>> +++ b/arch/loongarch/include/asm/inst.h >>>>>>> @@ -67,6 +67,7 @@ enum reg2_op { >>>>>>> revhd_op = 0x11, >>>>>>> extwh_op = 0x16, >>>>>>> extwb_op = 0x17, >>>>>>> + cpucfg_op = 0x1b, >>>>>>> iocsrrdb_op = 0x19200, >>>>>>> iocsrrdh_op = 0x19201, >>>>>>> iocsrrdw_op = 0x19202, >>>>>>> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h >>>>>>> index 46366e783c84..a1d22e8b6f94 100644 >>>>>>> --- a/arch/loongarch/include/asm/loongarch.h >>>>>>> +++ b/arch/loongarch/include/asm/loongarch.h >>>>>>> @@ -158,6 +158,16 @@ >>>>>>> #define CPUCFG48_VFPU_CG BIT(2) >>>>>>> #define CPUCFG48_RAM_CG BIT(3) >>>>>>> >>>>>>> +/* >>>>>>> + * cpucfg index area: 0x40000000 -- 0x400000ff >>>>>>> + * SW emulation for KVM hypervirsor >>>>>>> + */ >>>>>>> +#define CPUCFG_KVM_BASE 0x40000000UL >>>>>>> +#define CPUCFG_KVM_SIZE 0x100 >>>>>>> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE >>>>>>> +#define KVM_SIGNATURE "KVM\0" >>>>>>> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) >>>>>>> + >>>>>>> #ifndef __ASSEMBLY__ >>>>>>> >>>>>>> /* CSR */ >>>>>>> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exitc >>>>>>> index 923bbca9bd22..6a38fd59d86d 100644 >>>>>>> --- a/arch/loongarch/kvm/exit.c >>>>>>> +++ b/arch/loongarch/kvm/exit.c >>>>>>> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) >>>>>>> return EMULATE_DONE; >>>>>>> } >>>>>>> >>>>>>> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>>> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) >>>>>>> { >>>>>>> int rd, rj; >>>>>>> unsigned int index; >>>>>>> + >>>>>>> + rd = inst.reg2_format.rd; >>>>>>> + rj = inst.reg2_format.rj; >>>>>>> + ++vcpu->stat.cpucfg_exits; >>>>>>> + index = vcpu->arch.gprs[rj]; >>>>>>> + >>>>>>> + /* >>>>>>> + * By LoongArch Reference Manual 2.2.10.5 >>>>>>> + * Return value is 0 for undefined cpucfg index >>>>>>> + */ >>>>>>> + switch (index) { >>>>>>> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): >>>>>>> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>>>>> + break; >>>>>>> + case CPUCFG_KVM_SIG: >>>>>>> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; >>>>>>> + break; >>>>>>> + default: >>>>>>> + vcpu->arch.gprs[rd] = 0; >>>>>>> + break; >>>>>>> + } >>>>>>> + >>>>>>> + return EMULATE_DONE; >>>>>>> +} >>>>>>> + >>>>>>> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>>> +{ >>>>>>> unsigned long curr_pc; >>>>>>> larch_inst inst; >>>>>>> enum emulation_result er = EMULATE_DONE; >>>>>>> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>>> er = EMULATE_FAIL; >>>>>>> switch (((inst.word >> 24) & 0xff)) { >>>>>>> case 0x0: /* CPUCFG GSPR */ >>>>>>> - if (inst.reg2_format.opcode == 0x1B) { >>>>>>> - rd = inst.reg2_format.rd; >>>>>>> - rj = inst.reg2_format.rj; >>>>>>> - ++vcpu->stat.cpucfg_exits; >>>>>>> - index = vcpu->arch.gprs[rj]; >>>>>>> - er = EMULATE_DONE; >>>>>>> - /* >>>>>>> - * By LoongArch Reference Manual 2.2.10.5 >>>>>>> - * return value is 0 for undefined cpucfg index >>>>>>> - */ >>>>>>> - if (index < KVM_MAX_CPUCFG_REGS) >>>>>>> - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>>>>> - else >>>>>>> - vcpu->arch.gprs[rd] = 0; >>>>>>> - } >>>>>>> + if (inst.reg2_format.opcode == cpucfg_op) >>>>>>> + er = kvm_emu_cpucfg(vcpu, inst); >>>>>>> break; >>>>>>> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ >>>>>>> er = kvm_handle_csr(vcpu, inst); >>>>>>> -- >>>>>>> 2.39.3 >>>>>>> >>>>> >>>>> >>
On 2024/2/27 下午1:23, Jiaxun Yang wrote: > > > 在2024年2月27日二月 上午3:14,maobibo写道: >> On 2024/2/27 上午4:02, Jiaxun Yang wrote: >>> >>> >>> 在2024年2月26日二月 上午8:04,maobibo写道: >>>> On 2024/2/26 下午2:12, Huacai Chen wrote: >>>>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>>>>> Hi, Bibo, >>>>>>> >>>>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> wrote: >>>>>>>> >>>>>>>> Instruction cpucfg can be used to get processor features. And there >>>>>>>> is trap exception when it is executed in VM mode, and also it is >>>>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>>>>> for KVM hypervisor to privide PV features, and the area can be extended >>>>>>>> for other hypervisors in future. This area will never be used for >>>>>>>> real HW, it is only used by software. >>>>>>> After reading and thinking, I find that the hypercall method which is >>>>>>> used in our productive kernel is better than this cpucfg method. >>>>>>> Because hypercall is more simple and straightforward, plus we don't >>>>>>> worry about conflicting with the real hardware. >>>>>> No, I do not think so. cpucfg is simper than hypercall, hypercall can >>>>>> be in effect when system runs in guest mode. In some scenario like TCG >>>>>> mode, hypercall is illegal intruction, however cpucfg can work. >>>>> Nearly all architectures use hypercall except x86 for its historical >>>> Only x86 support multiple hypervisors and there is multiple hypervisor >>>> in x86 only. It is an advantage, not historical reason. >>> >>> I do believe that all those stuff should not be exposed to guest user space >>> for security reasons. >> Can you add PLV checking when cpucfg 0x40000000-0x400000FF is emulated? >> if it is user mode return value is zero and it is kernel mode emulated >> value will be returned. It can avoid information leaking. > > Please don’t do insane stuff here, applications are not expecting exception from > cpucfg. Sorry, I do not understand. Can you describe the behavior about cpucfg instruction from applications? Why is there no exception for cpucfg. > >> >>> >>> Also for different implementations of hypervisors they may have different >>> PV features behavior, using hypcall to perform feature detection >>> can pass more information to help us cope with hypervisor diversity. >> How do different hypervisors can be detected firstly? On x86 MSR is >> used for all hypervisors detection and on ARM64 hyperv used >> acpi_gbl_FADT and kvm use smc forcely, host mode can execute smc >> instruction without exception on ARM64. > > That’s hypcall ABI design choices, those information can come from firmware > or privileged CSRs on LoongArch. Firstly the firmware or privileged CSRs is not relative with hypcall ABI design choices. With CSR instruction, CSR ID is encoded in CSR instruction, range about CSR ID is 16K; for cpucfg instruction, cpucfg area is passed with register, range is UINT_MAX at least. It is difficult to find an area unused by HW for CSR method since the small CSR ID range. However it is easy for cpucfg. Here I doubt whether you really know about LoongArch LVZ. > >> >> I do not know why hypercall is better than cpucfg on LoongArch, cpucfg >> is basic intruction however hypercall is not, it is part of LVZ feature. > > KVM can only work with LVZ right? Linux kernel need boot well with TCG and KVM mode, hypercall is illegal instruction with TCG mode. Regards Bibo Mao > >> >>>> >>>>> reasons. If we use CPUCFG, then the hypervisor information is >>>>> unnecessarily leaked to userspace, and this may be a security issue. >>>>> Meanwhile, I don't think TCG mode needs PV features. >>>> Besides PV features, there is other features different with real hw such >>>> as virtio device, virtual interrupt controller. >>> >>> Those are *device* level information, they must be passed in firmware >>> interfaces to keep processor emulation sane. >> File arch/x86/hyperv/hv_apic.c can be referenced, apic features comes >> from ms_hyperv.hints and HYPERV_CPUID_ENLIGHTMENT_INFO cpuid info, not >> must be passed by firmware interface. > > That’s not KVM, that’s Hyper V. At Linux Kernel we enjoy the benefits of better > modularity on device abstractions, please don’t break it. > > Thanks > >> >> Regards >> Bibo Mao >>> >>> Thanks >>> >>>> >>>> Regards >>>> Bibo Mao >>>> >>>>> >>>>> I consulted with Jiaxun before, and maybe he can give some more comments. >>>>> >>>>>> >>>>>> Extioi virtualization extension will be added later, cpucfg can be used >>>>>> to get extioi features. It is unlikely that extioi driver depends on >>>>>> PARA_VIRT macro if hypercall is used to get features. >>>>> CPUCFG is per-core information, if we really need something about >>>>> extioi, it should be in iocsr (LOONGARCH_IOCSR_FEATURES). >>>>> >>>>> >>>>> Huacai >>>>> >>>>>> >>>>>> Regards >>>>>> Bibo Mao >>>>>> >>>>>>> >>>>>>> Huacai >>>>>>> >>>>>>>> >>>>>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>>>>>> --- >>>>>>>> arch/loongarch/include/asm/inst.h | 1 + >>>>>>>> arch/loongarch/include/asm/loongarch.h | 10 ++++++ >>>>>>>> arch/loongarch/kvm/exit.c | 46 +++++++++++++++++--------- >>>>>>>> 3 files changed, 41 insertions(+), 16 deletions(-) >>>>>>>> >>>>>>>> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h >>>>>>>> index d8f637f9e400..ad120f924905 100644 >>>>>>>> --- a/arch/loongarch/include/asm/inst.h >>>>>>>> +++ b/arch/loongarch/include/asm/inst.h >>>>>>>> @@ -67,6 +67,7 @@ enum reg2_op { >>>>>>>> revhd_op = 0x11, >>>>>>>> extwh_op = 0x16, >>>>>>>> extwb_op = 0x17, >>>>>>>> + cpucfg_op = 0x1b, >>>>>>>> iocsrrdb_op = 0x19200, >>>>>>>> iocsrrdh_op = 0x19201, >>>>>>>> iocsrrdw_op = 0x19202, >>>>>>>> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h >>>>>>>> index 46366e783c84..a1d22e8b6f94 100644 >>>>>>>> --- a/arch/loongarch/include/asm/loongarch.h >>>>>>>> +++ b/arch/loongarch/include/asm/loongarch.h >>>>>>>> @@ -158,6 +158,16 @@ >>>>>>>> #define CPUCFG48_VFPU_CG BIT(2) >>>>>>>> #define CPUCFG48_RAM_CG BIT(3) >>>>>>>> >>>>>>>> +/* >>>>>>>> + * cpucfg index area: 0x40000000 -- 0x400000ff >>>>>>>> + * SW emulation for KVM hypervirsor >>>>>>>> + */ >>>>>>>> +#define CPUCFG_KVM_BASE 0x40000000UL >>>>>>>> +#define CPUCFG_KVM_SIZE 0x100 >>>>>>>> +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE >>>>>>>> +#define KVM_SIGNATURE "KVM\0" >>>>>>>> +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) >>>>>>>> + >>>>>>>> #ifndef __ASSEMBLY__ >>>>>>>> >>>>>>>> /* CSR */ >>>>>>>> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c >>>>>>>> index 923bbca9bd22..6a38fd59d86d 100644 >>>>>>>> --- a/arch/loongarch/kvm/exit.c >>>>>>>> +++ b/arch/loongarch/kvm/exit.c >>>>>>>> @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) >>>>>>>> return EMULATE_DONE; >>>>>>>> } >>>>>>>> >>>>>>>> -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>>>> +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) >>>>>>>> { >>>>>>>> int rd, rj; >>>>>>>> unsigned int index; >>>>>>>> + >>>>>>>> + rd = inst.reg2_format.rd; >>>>>>>> + rj = inst.reg2_format.rj; >>>>>>>> + ++vcpu->stat.cpucfg_exits; >>>>>>>> + index = vcpu->arch.gprs[rj]; >>>>>>>> + >>>>>>>> + /* >>>>>>>> + * By LoongArch Reference Manual 2.2.10.5 >>>>>>>> + * Return value is 0 for undefined cpucfg index >>>>>>>> + */ >>>>>>>> + switch (index) { >>>>>>>> + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): >>>>>>>> + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>>>>>> + break; >>>>>>>> + case CPUCFG_KVM_SIG: >>>>>>>> + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; >>>>>>>> + break; >>>>>>>> + default: >>>>>>>> + vcpu->arch.gprs[rd] = 0; >>>>>>>> + break; >>>>>>>> + } >>>>>>>> + >>>>>>>> + return EMULATE_DONE; >>>>>>>> +} >>>>>>>> + >>>>>>>> +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>>>> +{ >>>>>>>> unsigned long curr_pc; >>>>>>>> larch_inst inst; >>>>>>>> enum emulation_result er = EMULATE_DONE; >>>>>>>> @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) >>>>>>>> er = EMULATE_FAIL; >>>>>>>> switch (((inst.word >> 24) & 0xff)) { >>>>>>>> case 0x0: /* CPUCFG GSPR */ >>>>>>>> - if (inst.reg2_format.opcode == 0x1B) { >>>>>>>> - rd = inst.reg2_format.rd; >>>>>>>> - rj = inst.reg2_format.rj; >>>>>>>> - ++vcpu->stat.cpucfg_exits; >>>>>>>> - index = vcpu->arch.gprs[rj]; >>>>>>>> - er = EMULATE_DONE; >>>>>>>> - /* >>>>>>>> - * By LoongArch Reference Manual 2.2.10.5 >>>>>>>> - * return value is 0 for undefined cpucfg index >>>>>>>> - */ >>>>>>>> - if (index < KVM_MAX_CPUCFG_REGS) >>>>>>>> - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; >>>>>>>> - else >>>>>>>> - vcpu->arch.gprs[rd] = 0; >>>>>>>> - } >>>>>>>> + if (inst.reg2_format.opcode == cpucfg_op) >>>>>>>> + er = kvm_emu_cpucfg(vcpu, inst); >>>>>>>> break; >>>>>>>> case 0x4: /* CSR{RD,WR,XCHG} GSPR */ >>>>>>>> er = kvm_handle_csr(vcpu, inst); >>>>>>>> -- >>>>>>>> 2.39.3 >>>>>>>> >>>>>> >>>>>> >>> >
On Tue, 2024-02-27 at 15:09 +0800, maobibo wrote: > It is difficult to find an area unused by HW for CSR method since the > small CSR ID range. We may use IOCSR instead. In kernel/cpu-probe.c there are already some IOCSR reads. > It is difficult to find an area unused by HW for CSR method since the > small CSR ID range. However it is easy for cpucfg. Here I doubt whether > you really know about LoongArch LVZ. It's unfair to accuse the others for not knowing about LVZ considering the lack of public documentation.
On 2/27/24 11:14, maobibo wrote: > > > On 2024/2/27 上午4:02, Jiaxun Yang wrote: >> >> >> 在2024年2月26日二月 上午8:04,maobibo写道: >>> On 2024/2/26 下午2:12, Huacai Chen wrote: >>>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >>>>> >>>>> >>>>> >>>>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>>>> Hi, Bibo, >>>>>> >>>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> >>>>>> wrote: >>>>>>> >>>>>>> Instruction cpucfg can be used to get processor features. And there >>>>>>> is trap exception when it is executed in VM mode, and also it is >>>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>>>> for KVM hypervisor to privide PV features, and the area can be >>>>>>> extended >>>>>>> for other hypervisors in future. This area will never be used for >>>>>>> real HW, it is only used by software. >>>>>> After reading and thinking, I find that the hypercall method which is >>>>>> used in our productive kernel is better than this cpucfg method. >>>>>> Because hypercall is more simple and straightforward, plus we don't >>>>>> worry about conflicting with the real hardware. >>>>> No, I do not think so. cpucfg is simper than hypercall, hypercall can >>>>> be in effect when system runs in guest mode. In some scenario like TCG >>>>> mode, hypercall is illegal intruction, however cpucfg can work. >>>> Nearly all architectures use hypercall except x86 for its historical >>> Only x86 support multiple hypervisors and there is multiple hypervisor >>> in x86 only. It is an advantage, not historical reason. >> >> I do believe that all those stuff should not be exposed to guest user >> space >> for security reasons. > Can you add PLV checking when cpucfg 0x40000000-0x400000FF is emulated? > if it is user mode return value is zero and it is kernel mode emulated > value will be returned. It can avoid information leaking. I've suggested this approach in another reply [1], but I've rechecked the manual, and it turns out this behavior is not permitted by the current wording. See LoongArch Reference Manual v1.10, Volume 1, Section 2.2.10.5 "CPUCFG": > CPUCFG 访问未定义的配置字将读回全 0 值。 > > Reads of undefined CPUCFG configuration words shall return all-zeroes. This sentence mentions no distinction based on privilege modes, so it can only mean the behavior applies universally regardless of privilege modes. I think if you want to make CPUCFG behavior PLV-dependent, you may have to ask the LoongArch spec editors, internally or in public, for a new spec revision. (There are already multiple third-party LoongArch implementers as of late 2023, so any ISA-level change like this would best be coordinated, to minimize surprises.) [1]: https://lore.kernel.org/loongarch/d8994f0f-d789-46d2-bc4d-f9b37fb396ff@xen0n.name/
On 2024/2/27 下午5:05, Xi Ruoyao wrote: > On Tue, 2024-02-27 at 15:09 +0800, maobibo wrote: > >> It is difficult to find an area unused by HW for CSR method since the >> small CSR ID range. > > We may use IOCSR instead. In kernel/cpu-probe.c there are already some > IOCSR reads. yes, IOCSR can be used and one IOCSR area will be reserved for software. In general CSR/CPUCFG is to describe CPU features and IOCSR is to describe board/device features. CSR area is limited to 16K, it is difficult for HW guys to reserve special area for software usage. IOCSR/CPUCFG is possible however. Regards Bibo Mao > >> It is difficult to find an area unused by HW for CSR method since the >> small CSR ID range. However it is easy for cpucfg. Here I doubt whether >> you really know about LoongArch LVZ. > > It's unfair to accuse the others for not knowing about LVZ considering > the lack of public documentation. >
On 2024/2/27 下午5:10, WANG Xuerui wrote: > On 2/27/24 11:14, maobibo wrote: >> >> >> On 2024/2/27 上午4:02, Jiaxun Yang wrote: >>> >>> >>> 在2024年2月26日二月 上午8:04,maobibo写道: >>>> On 2024/2/26 下午2:12, Huacai Chen wrote: >>>>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>>>>> Hi, Bibo, >>>>>>> >>>>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> >>>>>>> wrote: >>>>>>>> >>>>>>>> Instruction cpucfg can be used to get processor features. And there >>>>>>>> is trap exception when it is executed in VM mode, and also it is >>>>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>>>>> for KVM hypervisor to privide PV features, and the area can be >>>>>>>> extended >>>>>>>> for other hypervisors in future. This area will never be used for >>>>>>>> real HW, it is only used by software. >>>>>>> After reading and thinking, I find that the hypercall method >>>>>>> which is >>>>>>> used in our productive kernel is better than this cpucfg method. >>>>>>> Because hypercall is more simple and straightforward, plus we don't >>>>>>> worry about conflicting with the real hardware. >>>>>> No, I do not think so. cpucfg is simper than hypercall, hypercall can >>>>>> be in effect when system runs in guest mode. In some scenario like >>>>>> TCG >>>>>> mode, hypercall is illegal intruction, however cpucfg can work. >>>>> Nearly all architectures use hypercall except x86 for its historical >>>> Only x86 support multiple hypervisors and there is multiple hypervisor >>>> in x86 only. It is an advantage, not historical reason. >>> >>> I do believe that all those stuff should not be exposed to guest user >>> space >>> for security reasons. >> Can you add PLV checking when cpucfg 0x40000000-0x400000FF is >> emulated? if it is user mode return value is zero and it is kernel >> mode emulated value will be returned. It can avoid information leaking. > > I've suggested this approach in another reply [1], but I've rechecked > the manual, and it turns out this behavior is not permitted by the > current wording. See LoongArch Reference Manual v1.10, Volume 1, Section > 2.2.10.5 "CPUCFG": > > > CPUCFG 访问未定义的配置字将读回全 0 值。 > > > > Reads of undefined CPUCFG configuration words shall return all-zeroes. > > This sentence mentions no distinction based on privilege modes, so it > can only mean the behavior applies universally regardless of privilege > modes. > > I think if you want to make CPUCFG behavior PLV-dependent, you may have > to ask the LoongArch spec editors, internally or in public, for a new > spec revision. No, CPUCFG behavior between CPUCFG0-CPUCFG21 is unchanged, only that it can be defined by software since CPUCFG 0x400000000 is used by software. > > (There are already multiple third-party LoongArch implementers as of > late 2023, so any ISA-level change like this would best be coordinated, > to minimize surprises.) With document Vol 4-23 https://www.intel.com/content/dam/develop/external/us/en/documents/335592-sdm-vol-4.pdf There is one line "MSR address range between 40000000H - 400000FFH is marked as a specially reserved range. All existing and future processors will not implement any features using any MSR in this range." It only says that it is reserved, it does not say detailed software behavior. Software behavior is defined in hypervisor such as: https://github.com/MicrosoftDocs/Virtualization-Documentation/blob/main/tlfs/Requirements%20for%20Implementing%20the%20Microsoft%20Hypervisor%20Interface.pdf https://kb.vmware.com/s/article/1009458 If hypercall method is used, there should be ABI also like aarch64: https://documentation-service.arm.com/static/6013e5faeee5236980d08619 Regards Bibo Mao > > [1]: > https://lore.kernel.org/loongarch/d8994f0f-d789-46d2-bc4d-f9b37fb396ff@xen0n.name/ > >
On 2/27/24 18:12, maobibo wrote: > > > On 2024/2/27 下午5:10, WANG Xuerui wrote: >> On 2/27/24 11:14, maobibo wrote: >>> >>> >>> On 2024/2/27 上午4:02, Jiaxun Yang wrote: >>>> >>>> >>>> 在2024年2月26日二月 上午8:04,maobibo写道: >>>>> On 2024/2/26 下午2:12, Huacai Chen wrote: >>>>>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>>>>>> Hi, Bibo, >>>>>>>> >>>>>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Instruction cpucfg can be used to get processor features. And >>>>>>>>> there >>>>>>>>> is trap exception when it is executed in VM mode, and also it is >>>>>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 >>>>>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used >>>>>>>>> for KVM hypervisor to privide PV features, and the area can be >>>>>>>>> extended >>>>>>>>> for other hypervisors in future. This area will never be used for >>>>>>>>> real HW, it is only used by software. >>>>>>>> After reading and thinking, I find that the hypercall method >>>>>>>> which is >>>>>>>> used in our productive kernel is better than this cpucfg method. >>>>>>>> Because hypercall is more simple and straightforward, plus we don't >>>>>>>> worry about conflicting with the real hardware. >>>>>>> No, I do not think so. cpucfg is simper than hypercall, hypercall >>>>>>> can >>>>>>> be in effect when system runs in guest mode. In some scenario >>>>>>> like TCG >>>>>>> mode, hypercall is illegal intruction, however cpucfg can work. >>>>>> Nearly all architectures use hypercall except x86 for its historical >>>>> Only x86 support multiple hypervisors and there is multiple hypervisor >>>>> in x86 only. It is an advantage, not historical reason. >>>> >>>> I do believe that all those stuff should not be exposed to guest >>>> user space >>>> for security reasons. >>> Can you add PLV checking when cpucfg 0x40000000-0x400000FF is >>> emulated? if it is user mode return value is zero and it is kernel >>> mode emulated value will be returned. It can avoid information leaking. >> >> I've suggested this approach in another reply [1], but I've rechecked >> the manual, and it turns out this behavior is not permitted by the >> current wording. See LoongArch Reference Manual v1.10, Volume 1, >> Section 2.2.10.5 "CPUCFG": >> >> > CPUCFG 访问未定义的配置字将读回全 0 值。 >> > >> > Reads of undefined CPUCFG configuration words shall return all-zeroes. >> >> This sentence mentions no distinction based on privilege modes, so it >> can only mean the behavior applies universally regardless of privilege >> modes. >> >> I think if you want to make CPUCFG behavior PLV-dependent, you may >> have to ask the LoongArch spec editors, internally or in public, for a >> new spec revision. > No, CPUCFG behavior between CPUCFG0-CPUCFG21 is unchanged, only that it > can be defined by software since CPUCFG 0x400000000 is used by software. The 0x40000000 range is not mentioned in the manuals. I know you've confirmed privately with HW team but this needs to be properly documented for public projects to properly rely on. >> (There are already multiple third-party LoongArch implementers as of >> late 2023, so any ISA-level change like this would best be >> coordinated, to minimize surprises.) > With document Vol 4-23 > https://www.intel.com/content/dam/develop/external/us/en/documents/335592-sdm-vol-4.pdf > > There is one line "MSR address range between 40000000H - 400000FFH is > marked as a specially reserved range. All existing and > future processors will not implement any features using any MSR in this > range." Thanks for providing this info, now at least we know why it's this specific range of 0x400000XX that's chosen. > > It only says that it is reserved, it does not say detailed software > behavior. Software behavior is defined in hypervisor such as: > https://github.com/MicrosoftDocs/Virtualization-Documentation/blob/main/tlfs/Requirements%20for%20Implementing%20the%20Microsoft%20Hypervisor%20Interface.pdf > https://kb.vmware.com/s/article/1009458 > > If hypercall method is used, there should be ABI also like aarch64: > https://documentation-service.arm.com/static/6013e5faeee5236980d08619 Yes proper documentation of public API surface is always necessary *before* doing real work. Because right now the hypercall provider is Linux KVM, maybe we can document the existing and planned hypercall usage and ABI in the kernel docs along with code changes.
Hi, Paolo, I'm very sorry to bother you, now we have a controversy about how to query hypervisor features (PV IPI, PV timer and maybe PV spinlock) on LoongArch, and we have two choices: 1, Get from CPUCFG registers; 2, Get from a hypercall. CPUCFG are unprivileged registers which can be read from user space, so the first method can unnecessarily leak information to userspace, but this method is more or less similar to x86's CPUID solution. Hypercall is of course a privileged method (so no info leak issues), and this method is used by ARM/ARM64 and most other architectures. Besides, LoongArch's CPUCFG is supposed to provide per-core features, while not all hypervisor features are per-core information (Of course PV IPI is, but others may be or may not be, while 'extioi' is obviously not). Bibo thinks that only CPUCFG has enough space to contain all hypervisor features (CSR and IOCSR are not enough). However it is obvious that we don't need any register space to hold these features, because they are held in the hypervisor's memory. The only information that needs register space is "whether I am in a VM" (in this case we really cannot use hypercall), but this feature is already in IOCSR (LOONGARCH_IOCSR_FEATURES). Now my question is: for a new architecture, which method is preferable, maintainable and extensible? Of course "they both OK" for the current purpose in this patch, but I think you can give us more useful information from a professor's view. More details are available in this thread, about the 3rd patch. Any suggestions are welcome. Huacai On Tue, Feb 27, 2024 at 6:19 PM WANG Xuerui <kernel@xen0n.name> wrote: > > On 2/27/24 18:12, maobibo wrote: > > > > > > On 2024/2/27 下午5:10, WANG Xuerui wrote: > >> On 2/27/24 11:14, maobibo wrote: > >>> > >>> > >>> On 2024/2/27 上午4:02, Jiaxun Yang wrote: > >>>> > >>>> > >>>> 在2024年2月26日二月 上午8:04,maobibo写道: > >>>>> On 2024/2/26 下午2:12, Huacai Chen wrote: > >>>>>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongsoncn> wrote: > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On 2024/2/24 下午5:13, Huacai Chen wrote: > >>>>>>>> Hi, Bibo, > >>>>>>>> > >>>>>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> > >>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> Instruction cpucfg can be used to get processor features. And > >>>>>>>>> there > >>>>>>>>> is trap exception when it is executed in VM mode, and also it is > >>>>>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 - 20 > >>>>>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is used > >>>>>>>>> for KVM hypervisor to privide PV features, and the area can be > >>>>>>>>> extended > >>>>>>>>> for other hypervisors in future. This area will never be used for > >>>>>>>>> real HW, it is only used by software. > >>>>>>>> After reading and thinking, I find that the hypercall method > >>>>>>>> which is > >>>>>>>> used in our productive kernel is better than this cpucfg method. > >>>>>>>> Because hypercall is more simple and straightforward, plus we don't > >>>>>>>> worry about conflicting with the real hardware. > >>>>>>> No, I do not think so. cpucfg is simper than hypercall, hypercall > >>>>>>> can > >>>>>>> be in effect when system runs in guest mode. In some scenario > >>>>>>> like TCG > >>>>>>> mode, hypercall is illegal intruction, however cpucfg can work. > >>>>>> Nearly all architectures use hypercall except x86 for its historical > >>>>> Only x86 support multiple hypervisors and there is multiple hypervisor > >>>>> in x86 only. It is an advantage, not historical reason. > >>>> > >>>> I do believe that all those stuff should not be exposed to guest > >>>> user space > >>>> for security reasons. > >>> Can you add PLV checking when cpucfg 0x40000000-0x400000FF is > >>> emulated? if it is user mode return value is zero and it is kernel > >>> mode emulated value will be returned. It can avoid information leaking. > >> > >> I've suggested this approach in another reply [1], but I've rechecked > >> the manual, and it turns out this behavior is not permitted by the > >> current wording. See LoongArch Reference Manual v1.10, Volume 1, > >> Section 2.2.10.5 "CPUCFG": > >> > >> > CPUCFG 访问未定义的配置字将读回全 0 值。 > >> > > >> > Reads of undefined CPUCFG configuration words shall return all-zeroes. > >> > >> This sentence mentions no distinction based on privilege modes, so it > >> can only mean the behavior applies universally regardless of privilege > >> modes. > >> > >> I think if you want to make CPUCFG behavior PLV-dependent, you may > >> have to ask the LoongArch spec editors, internally or in public, for a > >> new spec revision. > > No, CPUCFG behavior between CPUCFG0-CPUCFG21 is unchanged, only that it > > can be defined by software since CPUCFG 0x400000000 is used by software. > > The 0x40000000 range is not mentioned in the manuals. I know you've > confirmed privately with HW team but this needs to be properly > documented for public projects to properly rely on. > > >> (There are already multiple third-party LoongArch implementers as of > >> late 2023, so any ISA-level change like this would best be > >> coordinated, to minimize surprises.) > > With document Vol 4-23 > > https://www.intel.com/content/dam/develop/external/us/en/documents/335592-sdm-vol-4.pdf > > > > There is one line "MSR address range between 40000000H - 400000FFH is > > marked as a specially reserved range. All existing and > > future processors will not implement any features using any MSR in this > > range." > > Thanks for providing this info, now at least we know why it's this > specific range of 0x400000XX that's chosen. > > > > > It only says that it is reserved, it does not say detailed software > > behavior. Software behavior is defined in hypervisor such as: > > https://github.com/MicrosoftDocs/Virtualization-Documentation/blob/main/tlfs/Requirements%20for%20Implementing%20the%20Microsoft%20Hypervisor%20Interface.pdf > > https://kb.vmware.com/s/article/1009458 > > > > If hypercall method is used, there should be ABI also like aarch64: > > https://documentation-service.arm.com/static/6013e5faeee5236980d08619 > > Yes proper documentation of public API surface is always necessary > *before* doing real work. Because right now the hypercall provider is > Linux KVM, maybe we can document the existing and planned hypercall > usage and ABI in the kernel docs along with code changes. > > -- > WANG "xen0n" Xuerui > > Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/ >
On 2024/2/27 下午6:19, WANG Xuerui wrote: > On 2/27/24 18:12, maobibo wrote: >> >> >> On 2024/2/27 下午5:10, WANG Xuerui wrote: >>> On 2/27/24 11:14, maobibo wrote: >>>> >>>> >>>> On 2024/2/27 上午4:02, Jiaxun Yang wrote: >>>>> >>>>> >>>>> 在2024年2月26日二月 上午8:04,maobibo写道: >>>>>> On 2024/2/26 下午2:12, Huacai Chen wrote: >>>>>>> On Mon, Feb 26, 2024 at 10:04 AM maobibo <maobibo@loongson.cn> >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 2024/2/24 下午5:13, Huacai Chen wrote: >>>>>>>>> Hi, Bibo, >>>>>>>>> >>>>>>>>> On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao <maobibo@loongson.cn> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Instruction cpucfg can be used to get processor features. And >>>>>>>>>> there >>>>>>>>>> is trap exception when it is executed in VM mode, and also it is >>>>>>>>>> to provide cpu features to VM. On real hardware cpucfg area 0 >>>>>>>>>> - 20 >>>>>>>>>> is used. Here one specified area 0x40000000 -- 0x400000ff is >>>>>>>>>> used >>>>>>>>>> for KVM hypervisor to privide PV features, and the area can be >>>>>>>>>> extended >>>>>>>>>> for other hypervisors in future. This area will never be used for >>>>>>>>>> real HW, it is only used by software. >>>>>>>>> After reading and thinking, I find that the hypercall method >>>>>>>>> which is >>>>>>>>> used in our productive kernel is better than this cpucfg method. >>>>>>>>> Because hypercall is more simple and straightforward, plus we >>>>>>>>> don't >>>>>>>>> worry about conflicting with the real hardware. >>>>>>>> No, I do not think so. cpucfg is simper than hypercall, >>>>>>>> hypercall can >>>>>>>> be in effect when system runs in guest mode. In some scenario >>>>>>>> like TCG >>>>>>>> mode, hypercall is illegal intruction, however cpucfg can work. >>>>>>> Nearly all architectures use hypercall except x86 for its historical >>>>>> Only x86 support multiple hypervisors and there is multiple >>>>>> hypervisor >>>>>> in x86 only. It is an advantage, not historical reason. >>>>> >>>>> I do believe that all those stuff should not be exposed to guest >>>>> user space >>>>> for security reasons. >>>> Can you add PLV checking when cpucfg 0x40000000-0x400000FF is >>>> emulated? if it is user mode return value is zero and it is kernel >>>> mode emulated value will be returned. It can avoid information leaking. >>> >>> I've suggested this approach in another reply [1], but I've rechecked >>> the manual, and it turns out this behavior is not permitted by the >>> current wording. See LoongArch Reference Manual v1.10, Volume 1, >>> Section 2.2.10.5 "CPUCFG": >>> >>> > CPUCFG 访问未定义的配置字将读回全 0 值。 >>> > >>> > Reads of undefined CPUCFG configuration words shall return >>> all-zeroes. >>> >>> This sentence mentions no distinction based on privilege modes, so it >>> can only mean the behavior applies universally regardless of >>> privilege modes. >>> >>> I think if you want to make CPUCFG behavior PLV-dependent, you may >>> have to ask the LoongArch spec editors, internally or in public, for >>> a new spec revision. >> No, CPUCFG behavior between CPUCFG0-CPUCFG21 is unchanged, only that >> it can be defined by software since CPUCFG 0x400000000 is used by >> software. > > The 0x40000000 range is not mentioned in the manuals. I know you've > confirmed privately with HW team but this needs to be properly > documented for public projects to properly rely on. > >>> (There are already multiple third-party LoongArch implementers as of >>> late 2023, so any ISA-level change like this would best be >>> coordinated, to minimize surprises.) >> With document Vol 4-23 >> https://www.intel.com/content/dam/develop/external/us/en/documents/335592-sdm-vol-4.pdf >> >> >> There is one line "MSR address range between 40000000H - 400000FFH is >> marked as a specially reserved range. All existing and >> future processors will not implement any features using any MSR in >> this range." > > Thanks for providing this info, now at least we know why it's this > specific range of 0x400000XX that's chosen. > >> >> It only says that it is reserved, it does not say detailed software >> behavior. Software behavior is defined in hypervisor such as: >> https://github.com/MicrosoftDocs/Virtualization-Documentation/blob/main/tlfs/Requirements%20for%20Implementing%20the%20Microsoft%20Hypervisor%20Interface.pdf >> >> https://kb.vmware.com/s/article/1009458 >> >> If hypercall method is used, there should be ABI also like aarch64: >> https://documentation-service.arm.com/static/6013e5faeee5236980d08619 > > Yes proper documentation of public API surface is always necessary > *before* doing real work. Because right now the hypercall provider is > Linux KVM, maybe we can document the existing and planned hypercall > usage and ABI in the kernel docs along with code changes. > yes, I will add hypercall in kernel docs. SMC calling convention is ABI between OS and secure firmware, LoongArch has no secure mode, it not necessary to has such doc.The hypercall calling convention is relative with hypervisor SW, not relative with CPU HW vendor. Each hypervisor maybe has its separate hypercall calling convention just like syscall ABIs. Regards Bibo Mao
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h index d8f637f9e400..ad120f924905 100644 --- a/arch/loongarch/include/asm/inst.h +++ b/arch/loongarch/include/asm/inst.h @@ -67,6 +67,7 @@ enum reg2_op { revhd_op = 0x11, extwh_op = 0x16, extwb_op = 0x17, + cpucfg_op = 0x1b, iocsrrdb_op = 0x19200, iocsrrdh_op = 0x19201, iocsrrdw_op = 0x19202, diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h index 46366e783c84..a1d22e8b6f94 100644 --- a/arch/loongarch/include/asm/loongarch.h +++ b/arch/loongarch/include/asm/loongarch.h @@ -158,6 +158,16 @@ #define CPUCFG48_VFPU_CG BIT(2) #define CPUCFG48_RAM_CG BIT(3) +/* + * cpucfg index area: 0x40000000 -- 0x400000ff + * SW emulation for KVM hypervirsor + */ +#define CPUCFG_KVM_BASE 0x40000000UL +#define CPUCFG_KVM_SIZE 0x100 +#define CPUCFG_KVM_SIG CPUCFG_KVM_BASE +#define KVM_SIGNATURE "KVM\0" +#define CPUCFG_KVM_FEATURE (CPUCFG_KVM_BASE + 4) + #ifndef __ASSEMBLY__ /* CSR */ diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c index 923bbca9bd22..6a38fd59d86d 100644 --- a/arch/loongarch/kvm/exit.c +++ b/arch/loongarch/kvm/exit.c @@ -206,10 +206,37 @@ int kvm_emu_idle(struct kvm_vcpu *vcpu) return EMULATE_DONE; } -static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) +static int kvm_emu_cpucfg(struct kvm_vcpu *vcpu, larch_inst inst) { int rd, rj; unsigned int index; + + rd = inst.reg2_format.rd; + rj = inst.reg2_format.rj; + ++vcpu->stat.cpucfg_exits; + index = vcpu->arch.gprs[rj]; + + /* + * By LoongArch Reference Manual 2.2.10.5 + * Return value is 0 for undefined cpucfg index + */ + switch (index) { + case 0 ... (KVM_MAX_CPUCFG_REGS - 1): + vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; + break; + case CPUCFG_KVM_SIG: + vcpu->arch.gprs[rd] = *(unsigned int *)KVM_SIGNATURE; + break; + default: + vcpu->arch.gprs[rd] = 0; + break; + } + + return EMULATE_DONE; +} + +static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) +{ unsigned long curr_pc; larch_inst inst; enum emulation_result er = EMULATE_DONE; @@ -224,21 +251,8 @@ static int kvm_trap_handle_gspr(struct kvm_vcpu *vcpu) er = EMULATE_FAIL; switch (((inst.word >> 24) & 0xff)) { case 0x0: /* CPUCFG GSPR */ - if (inst.reg2_format.opcode == 0x1B) { - rd = inst.reg2_format.rd; - rj = inst.reg2_format.rj; - ++vcpu->stat.cpucfg_exits; - index = vcpu->arch.gprs[rj]; - er = EMULATE_DONE; - /* - * By LoongArch Reference Manual 2.2.10.5 - * return value is 0 for undefined cpucfg index - */ - if (index < KVM_MAX_CPUCFG_REGS) - vcpu->arch.gprs[rd] = vcpu->arch.cpucfg[index]; - else - vcpu->arch.gprs[rd] = 0; - } + if (inst.reg2_format.opcode == cpucfg_op) + er = kvm_emu_cpucfg(vcpu, inst); break; case 0x4: /* CSR{RD,WR,XCHG} GSPR */ er = kvm_handle_csr(vcpu, inst);