Message ID | 20230417113219.1354078-1-suijingfeng@loongson.cn |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp2071068vqo; Mon, 17 Apr 2023 05:05:01 -0700 (PDT) X-Google-Smtp-Source: AKy350awi/zWAypb26zxmayv4cE/Xl9elDWop1WkMTESvFdih761oytrMjc11cNUfYgo4JSF0Kvt X-Received: by 2002:a9d:7587:0:b0:6a4:388e:eac6 with SMTP id s7-20020a9d7587000000b006a4388eeac6mr5128862otk.24.1681733101132; Mon, 17 Apr 2023 05:05:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681733101; cv=none; d=google.com; s=arc-20160816; b=ul2Wm9Jg14TEmBcIUyhaTJalQCTHjHViH2BiDLi+4qt95MIm+wIDF6enMyOioj9t5e /1wBS74BGzjWOVV037hm+rXSNezy7eCFEDVDa/S0V/NhU0oT1vzWa2Hvno+y8UjOCfTU Utr2y38uMYGj8XLFwn6v6GAWPAyImGSgZW+56ftg0sLoTwZ97rwtVw5eif28KlPnhfcg yiNdySbjzz8M7ZyZKqC+qYsLGbOKAoKIBoar4M6jBHcklFAdz+yV+fOSf/xMp+8u4w2Y Bo6Maw7ZUj6b7XFAF4IYta4ibqvm1YEcPxps2n7Z6ky3xEFmo3hMVi1gsE9/8okckANX W9+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=jIm81PgDyHJQx34z7b73CsxJwKVi4Up7ibu5QwcwyVo=; b=ujqfU9we/2ngVkP46PXVP21mNnYWnwhbZcgsD3ChkLFCfZFZ5p0yqbJ/gi1s8lmuLH +RRM/+/hHw89Dn/fSWHXA10bwWdq7rAm9BVnRX6FQjGWM+3KFySL+sk7+9LGUwYvgckc GTzc2iRbryDh2wri0ZZmWjk0Kk4h7plB8Ka6NAUKLPS/t3yhB6OjYJt0TphARJjAwR0x Nv/0NoNtsvxqI7mJ9+C75XRmJzGuF2QNNqUMUdQMlZ8c8AKi/gRfI32c+Jm6aphhANQJ 2+ILVBcuu7TGAugX+I4K0HW1386u1YPpULGNaPqt5hLbV4eE8xpny2j2w9ZPAEVWq7Tn 7SIg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x27-20020a9d705b000000b006a5ec740c48si1280523otj.228.2023.04.17.05.04.47; Mon, 17 Apr 2023 05:05:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229883AbjDQLkK (ORCPT <rfc822;leviz.kernel.dev@gmail.com> + 99 others); Mon, 17 Apr 2023 07:40:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59936 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229498AbjDQLkI (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 17 Apr 2023 07:40:08 -0400 Received: from loongson.cn (mail.loongson.cn [114.242.206.163]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 205741706; Mon, 17 Apr 2023 04:39:18 -0700 (PDT) Received: from loongson.cn (unknown [10.20.42.43]) by gateway (Coremail) with SMTP id _____8AxEk5ILj1kOO0dAA--.34881S3; Mon, 17 Apr 2023 19:32:24 +0800 (CST) Received: from openarena.loongson.cn (unknown [10.20.42.43]) by localhost.localdomain (Coremail) with SMTP id AQAAf8Cx97xDLj1kB8AqAA--.48515S2; Mon, 17 Apr 2023 19:32:21 +0800 (CST) From: Sui Jingfeng <suijingfeng@loongson.cn> To: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>, Maxime Ripard <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de>, David Airlie <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>, Sui Jingfeng <suijingfeng@loongson.cn>, Li Yi <liyi@loongson.cn>, Helge Deller <deller@gmx.de>, Lucas De Marchi <lucas.demarchi@intel.com> Cc: linux-kernel@vger.kernel.org, linux-fbdev@vger.kernel.org, dri-devel@lists.freedesktop.org, loongson-kernel@lists.loongnix.cn Subject: [PATCH v3] drm/fbdev-generic: prohibit potential out-of-bounds access Date: Mon, 17 Apr 2023 19:32:19 +0800 Message-Id: <20230417113219.1354078-1-suijingfeng@loongson.cn> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-CM-TRANSID: AQAAf8Cx97xDLj1kB8AqAA--.48515S2 X-CM-SenderInfo: xvxlyxpqjiv03j6o00pqjv00gofq/ X-Coremail-Antispam: 1Uk129KBjvJXoWxCF17tr43CFWDJF1UGw17Wrg_yoW5ZrWxpF WfKFWUKr4kJFn8Xr47A3WUJw1UAanrZFWxurWxKryjyFyYy3429ryjyrWUWFy5Gr18Jr13 trn093W0kr1qyaUanT9S1TB71UUUUjJqnTZGkaVYY2UrUUUUj1kv1TuYvTs0mT0YCTnIWj qI5I8CrVACY4xI64kE6c02F40Ex7xfYxn0WfASr-VFAUDa7-sFnT9fnUUIcSsGvfJTRUUU bf8YFVCjjxCrM7AC8VAFwI0_Jr0_Gr1l1xkIjI8I6I8E6xAIw20EY4v20xvaj40_Wr0E3s 1l1IIY67AEw4v_Jrv_JF1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xv wVC0I7IYx2IY67AKxVW5JVW7JwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwA2z4 x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAaw2AF wI0_JF0_Jw1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqjxCEc2xF0cIa020Ex4CE44I27w Aqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_JF0_Jw1lYx0Ex4A2jsIE 14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0xvY0x0EwIxGrwCY1x0262kKe7 AKxVWUAVWUtwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwCFI7km07C2 67AKxVWUAVWUtwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI 8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8 JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Jr0_Gr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r 1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Jr0_GrUvcSsG vfC2KfnxnUUI43ZEXa7IU8Dl1DUUUUU== X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1763424968485802670?= X-GMAIL-MSGID: =?utf-8?q?1763424968485802670?= |
Series |
[v3] drm/fbdev-generic: prohibit potential out-of-bounds access
|
|
Commit Message
Sui Jingfeng
April 17, 2023, 11:32 a.m. UTC
The fbdev test of IGT may write after EOF, which lead to out-of-bound
access for the drm drivers using fbdev-generic. For example, on a x86
+ aspeed bmc card platform, with a 1680x1050 resolution display, running
fbdev test if IGT will cause the linux kernel hang with the following
call trace:
Oops: 0000 [#1] PREEMPT SMP PTI
[IGT] fbdev: starting subtest eof
Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
[IGT] fbdev: starting subtest nullptr
RIP: 0010:memcpy_erms+0xa/0x20
RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246
RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0
RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000
RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0
R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80
R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30
FS: 0000000000000000(0000) GS:ffff895257380000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0
Call Trace:
<TASK>
? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper]
drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper]
process_one_work+0x21f/0x430
worker_thread+0x4e/0x3c0
? __pfx_worker_thread+0x10/0x10
kthread+0xf4/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
CR2: ffffa17d40e0b000
---[ end trace 0000000000000000 ]---
The direct reason is that damage rectange computed by
drm_fb_helper_memory_range_to_clip() does not guaranteed to be in-bound.
It is already results in workaround code populate to elsewhere. Another
reason is that exposing a larger buffer size than the actual needed help
to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip().
Others fbdev emulation solutions write to the GEM buffer directly, they
won't reproduce this bug because the .fb_dirty function callback do not
being hooked, so no chance is given to drm_fb_helper_memory_range_to_clip()
to generate a out-of-bound when drm_fb_helper_sys_write() is called.
This patch break the trigger condition of this bug by shrinking the shadow
buffer size to sizes->surface_height * buffer->fb->pitches[0].
Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of GEM
buffer")'
Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
---
drivers/gpu/drm/drm_fbdev_generic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Hi Am 17.04.23 um 13:32 schrieb Sui Jingfeng: > The fbdev test of IGT may write after EOF, which lead to out-of-bound > access for the drm drivers using fbdev-generic. For example, on a x86 > + aspeed bmc card platform, with a 1680x1050 resolution display, running > fbdev test if IGT will cause the linux kernel hang with the following > call trace: > > Oops: 0000 [#1] PREEMPT SMP PTI > [IGT] fbdev: starting subtest eof > Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] > [IGT] fbdev: starting subtest nullptr > > RIP: 0010:memcpy_erms+0xa/0x20 > RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 > RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 > RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 > RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 > R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 > R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 > FS: 0000000000000000(0000) GS:ffff895257380000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 > Call Trace: > <TASK> > ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] > drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] > process_one_work+0x21f/0x430 > worker_thread+0x4e/0x3c0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0xf4/0x120 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x2c/0x50 > </TASK> > CR2: ffffa17d40e0b000 > ---[ end trace 0000000000000000 ]--- > > The direct reason is that damage rectange computed by > drm_fb_helper_memory_range_to_clip() does not guaranteed to be in-bound. > It is already results in workaround code populate to elsewhere. Another > reason is that exposing a larger buffer size than the actual needed help > to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). > > Others fbdev emulation solutions write to the GEM buffer directly, they > won't reproduce this bug because the .fb_dirty function callback do not > being hooked, so no chance is given to drm_fb_helper_memory_range_to_clip() > to generate a out-of-bound when drm_fb_helper_sys_write() is called. > > This patch break the trigger condition of this bug by shrinking the shadow > buffer size to sizes->surface_height * buffer->fb->pitches[0]. > > Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of GEM > buffer")' > > Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> > --- > drivers/gpu/drm/drm_fbdev_generic.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c > index 8e5148bf40bb..b057cfbba938 100644 > --- a/drivers/gpu/drm/drm_fbdev_generic.c > +++ b/drivers/gpu/drm/drm_fbdev_generic.c > @@ -94,7 +94,7 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, > fb_helper->buffer = buffer; > fb_helper->fb = buffer->fb; > > - screen_size = buffer->gem->size; > + screen_size = sizes->surface_height * buffer->fb->pitches[0]; Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Thanks a lot for the bugfix. Best regards Thomas > screen_buffer = vzalloc(screen_size); > if (!screen_buffer) { > ret = -ENOMEM; -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
On Mon, Apr 17, 2023 at 1:45 PM Sui Jingfeng <suijingfeng@loongson.cn> wrote: > The fbdev test of IGT may write after EOF, which lead to out-of-bound > access for the drm drivers using fbdev-generic. For example, on a x86 > + aspeed bmc card platform, with a 1680x1050 resolution display, running > fbdev test if IGT will cause the linux kernel hang with the following > call trace: > > Oops: 0000 [#1] PREEMPT SMP PTI > [IGT] fbdev: starting subtest eof > Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] > [IGT] fbdev: starting subtest nullptr > > RIP: 0010:memcpy_erms+0xa/0x20 > RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 > RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 > RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 > RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 > R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 > R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 > FS: 0000000000000000(0000) GS:ffff895257380000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 > Call Trace: > <TASK> > ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] > drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] > process_one_work+0x21f/0x430 > worker_thread+0x4e/0x3c0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0xf4/0x120 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x2c/0x50 > </TASK> > CR2: ffffa17d40e0b000 > ---[ end trace 0000000000000000 ]--- > > The direct reason is that damage rectange computed by > drm_fb_helper_memory_range_to_clip() does not guaranteed to be in-bound. > It is already results in workaround code populate to elsewhere. Another > reason is that exposing a larger buffer size than the actual needed help > to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). > > Others fbdev emulation solutions write to the GEM buffer directly, they > won't reproduce this bug because the .fb_dirty function callback do not > being hooked, so no chance is given to drm_fb_helper_memory_range_to_clip() > to generate a out-of-bound when drm_fb_helper_sys_write() is called. > > This patch break the trigger condition of this bug by shrinking the shadow > buffer size to sizes->surface_height * buffer->fb->pitches[0]. > > Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of GEM > buffer")' > > Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Thanks, this fixes the crashes when running fbtest on shmob-drm. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Gr{oetje,eeting}s, Geert
On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: > The fbdev test of IGT may write after EOF, which lead to out-of-bound > access for the drm drivers using fbdev-generic. For example, on a x86 > + aspeed bmc card platform, with a 1680x1050 resolution display, running > fbdev test if IGT will cause the linux kernel hang with the following > call trace: > > Oops: 0000 [#1] PREEMPT SMP PTI > [IGT] fbdev: starting subtest eof > Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] > [IGT] fbdev: starting subtest nullptr > > RIP: 0010:memcpy_erms+0xa/0x20 > RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 > RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 > RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 > RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 > R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 > R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 > FS: 0000000000000000(0000) GS:ffff895257380000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 > Call Trace: > <TASK> > ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] > drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] > process_one_work+0x21f/0x430 > worker_thread+0x4e/0x3c0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0xf4/0x120 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x2c/0x50 > </TASK> > CR2: ffffa17d40e0b000 > ---[ end trace 0000000000000000 ]--- > > The direct reason is that damage rectange computed by > drm_fb_helper_memory_range_to_clip() does not guaranteed to be in-bound. > It is already results in workaround code populate to elsewhere. Another > reason is that exposing a larger buffer size than the actual needed help > to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). > > Others fbdev emulation solutions write to the GEM buffer directly, they > won't reproduce this bug because the .fb_dirty function callback do not > being hooked, so no chance is given to drm_fb_helper_memory_range_to_clip() > to generate a out-of-bound when drm_fb_helper_sys_write() is called. > > This patch break the trigger condition of this bug by shrinking the shadow > buffer size to sizes->surface_height * buffer->fb->pitches[0]. > > Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of GEM > buffer")' > > Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> > --- > drivers/gpu/drm/drm_fbdev_generic.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c > index 8e5148bf40bb..b057cfbba938 100644 > --- a/drivers/gpu/drm/drm_fbdev_generic.c > +++ b/drivers/gpu/drm/drm_fbdev_generic.c > @@ -94,7 +94,7 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, > fb_helper->buffer = buffer; > fb_helper->fb = buffer->fb; > > - screen_size = buffer->gem->size; > + screen_size = sizes->surface_height * buffer->fb->pitches[0]; So I read core some more and stumbled over drm_fb_helper_deferred_io(). Which has all the code and comments about this, including limiting. I think it would be clearer if we fix the issue there, instead of passing limits around in obscure places that then again get broken? The thing is, Thomas both authored the limit checks in drm_fb_helper_deferred_io() and the patch which broken them again, so clearly this isn't very obvious. I'm thinking of something like this: diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index ef4eb8b12766..726dab67c359 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info *info, struct list_head *pagerefli * of the screen and account for non-existing scanlines. Hence, * keep the covered memory area within the screen buffer. */ - if (info->screen_size) - total_size = info->screen_size; - else - total_size = info->fix.smem_len; + total_size = helper->fb->height * helper->fb->pitches[0]; max_off = min(max_off, total_size); if (min_off < max_off) { I think that would make it utmost clear on what we're doing and why. Otherwise we're just going to re-create the same bug again, like we've done already :-) -Daniel > screen_buffer = vzalloc(screen_size); > if (!screen_buffer) { > ret = -ENOMEM; > -- > 2.25.1 >
Hi Am 18.04.23 um 10:32 schrieb Daniel Vetter: > On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >> The fbdev test of IGT may write after EOF, which lead to out-of-bound >> access for the drm drivers using fbdev-generic. For example, on a x86 >> + aspeed bmc card platform, with a 1680x1050 resolution display, running >> fbdev test if IGT will cause the linux kernel hang with the following >> call trace: >> >> Oops: 0000 [#1] PREEMPT SMP PTI >> [IGT] fbdev: starting subtest eof >> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >> [IGT] fbdev: starting subtest nullptr >> >> RIP: 0010:memcpy_erms+0xa/0x20 >> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 >> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 >> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 >> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 >> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 >> FS: 0000000000000000(0000) GS:ffff895257380000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 >> Call Trace: >> <TASK> >> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] >> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >> process_one_work+0x21f/0x430 >> worker_thread+0x4e/0x3c0 >> ? __pfx_worker_thread+0x10/0x10 >> kthread+0xf4/0x120 >> ? __pfx_kthread+0x10/0x10 >> ret_from_fork+0x2c/0x50 >> </TASK> >> CR2: ffffa17d40e0b000 >> ---[ end trace 0000000000000000 ]--- >> >> The direct reason is that damage rectange computed by >> drm_fb_helper_memory_range_to_clip() does not guaranteed to be in-bound. >> It is already results in workaround code populate to elsewhere. Another >> reason is that exposing a larger buffer size than the actual needed help >> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). >> >> Others fbdev emulation solutions write to the GEM buffer directly, they >> won't reproduce this bug because the .fb_dirty function callback do not >> being hooked, so no chance is given to drm_fb_helper_memory_range_to_clip() >> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >> >> This patch break the trigger condition of this bug by shrinking the shadow >> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >> >> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of GEM >> buffer")' >> >> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >> --- >> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c >> index 8e5148bf40bb..b057cfbba938 100644 >> --- a/drivers/gpu/drm/drm_fbdev_generic.c >> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >> @@ -94,7 +94,7 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >> fb_helper->buffer = buffer; >> fb_helper->fb = buffer->fb; >> >> - screen_size = buffer->gem->size; >> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; > > So I read core some more and stumbled over drm_fb_helper_deferred_io(). > Which has all the code and comments about this, including limiting. > > I think it would be clearer if we fix the issue there, instead of passing > limits around in obscure places that then again get broken? The thing is, > Thomas both authored the limit checks in drm_fb_helper_deferred_io() and > the patch which broken them again, so clearly this isn't very obvious. I'm > thinking of something like this: > > > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c > index ef4eb8b12766..726dab67c359 100644 > --- a/drivers/gpu/drm/drm_fb_helper.c > +++ b/drivers/gpu/drm/drm_fb_helper.c > @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info *info, struct list_head *pagerefli > * of the screen and account for non-existing scanlines. Hence, > * keep the covered memory area within the screen buffer. > */ > - if (info->screen_size) > - total_size = info->screen_size; > - else > - total_size = info->fix.smem_len; > + total_size = helper->fb->height * helper->fb->pitches[0]; But we now have two places where we compute the size of the buffer. That violates the SPOT rule. Can we at least add a warning if total_size is larger than either of the info fields? Yesterday, I've been thinking about disconnecting the size of the DRM framebuffer from the overall size of the GEM buffer. That's probably the only way to fully resolve those problems. It's just that it's a huge can of worms. :/ Best regards Thomas > max_off = min(max_off, total_size); > > if (min_off < max_off) { > > > I think that would make it utmost clear on what we're doing and why. > Otherwise we're just going to re-create the same bug again, like we've > done already :-) > -Daniel > >> screen_buffer = vzalloc(screen_size); >> if (!screen_buffer) { >> ret = -ENOMEM; >> -- >> 2.25.1 >> > -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
Hi, On 2023/4/18 16:32, Daniel Vetter wrote: > On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >> The fbdev test of IGT may write after EOF, which lead to out-of-bound >> access for the drm drivers using fbdev-generic. For example, on a x86 >> + aspeed bmc card platform, with a 1680x1050 resolution display, running >> fbdev test if IGT will cause the linux kernel hang with the following >> call trace: >> >> Oops: 0000 [#1] PREEMPT SMP PTI >> [IGT] fbdev: starting subtest eof >> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >> [IGT] fbdev: starting subtest nullptr >> >> RIP: 0010:memcpy_erms+0xa/0x20 >> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 >> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 >> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 >> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 >> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 >> FS: 0000000000000000(0000) GS:ffff895257380000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 >> Call Trace: >> <TASK> >> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] >> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >> process_one_work+0x21f/0x430 >> worker_thread+0x4e/0x3c0 >> ? __pfx_worker_thread+0x10/0x10 >> kthread+0xf4/0x120 >> ? __pfx_kthread+0x10/0x10 >> ret_from_fork+0x2c/0x50 >> </TASK> >> CR2: ffffa17d40e0b000 >> ---[ end trace 0000000000000000 ]--- >> >> The direct reason is that damage rectange computed by >> drm_fb_helper_memory_range_to_clip() does not guaranteed to be in-bound. >> It is already results in workaround code populate to elsewhere. Another >> reason is that exposing a larger buffer size than the actual needed help >> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). >> >> Others fbdev emulation solutions write to the GEM buffer directly, they >> won't reproduce this bug because the .fb_dirty function callback do not >> being hooked, so no chance is given to drm_fb_helper_memory_range_to_clip() >> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >> >> This patch break the trigger condition of this bug by shrinking the shadow >> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >> >> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of GEM >> buffer")' >> >> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >> --- >> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c >> index 8e5148bf40bb..b057cfbba938 100644 >> --- a/drivers/gpu/drm/drm_fbdev_generic.c >> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >> @@ -94,7 +94,7 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >> fb_helper->buffer = buffer; >> fb_helper->fb = buffer->fb; >> >> - screen_size = buffer->gem->size; >> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; > So I read core some more and stumbled over drm_fb_helper_deferred_io(). > Which has all the code and comments about this, including limiting. > > I think it would be clearer if we fix the issue there, instead of passing > limits around in obscure places that then again get broken? No, it is more obscure doing that way... As the size of the shadow screen buffer will be exposed to userspace. The size 'helper->fb->height * helper->fb->pitches[0]' is a exactly(best) fit, You are guaranteed to waste at lease one byte by increasing one byte, and can not store all pixels by decreasing one byte (In the case where `helper->fb->pitches[0] = helper->fb->width * 4`). It implicitly tell the userspace do not go beyond that boundary. although userspace program can still choose to write after EOF, But it is for test purpose, to test the kernel if it can return a -EFBIG or not. > The thing is, > Thomas both authored the limit checks in drm_fb_helper_deferred_io() and > the patch which broken them again, so clearly this isn't very obvious. I'm > thinking of something like this: > > > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c > index ef4eb8b12766..726dab67c359 100644 > --- a/drivers/gpu/drm/drm_fb_helper.c > +++ b/drivers/gpu/drm/drm_fb_helper.c > @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info *info, struct list_head *pagerefli > * of the screen and account for non-existing scanlines. Hence, > * keep the covered memory area within the screen buffer. > */ > - if (info->screen_size) > - total_size = info->screen_size; > - else > - total_size = info->fix.smem_len; > + total_size = helper->fb->height * helper->fb->pitches[0]; This is just to mitigate the mistakes already has been made, because it do not do a good splitting between the *clip* part and the *damage update* part. An ideal clipping do not obscure its updating backend with a out-of-bound damage rectangle. Why did the drm_fb_helper_memory_range_to_clip() can not do a good job in all case to pass its backend a always meaningful damage rect ? > max_off = min(max_off, total_size); > > if (min_off < max_off) { > > > I think that would make it utmost clear on what we're doing and why. > Otherwise we're just going to re-create the same bug again, like we've > done already :-) No, we create no bugs, we fix one. Thanks. > -Daniel > >> screen_buffer = vzalloc(screen_size); >> if (!screen_buffer) { >> ret = -ENOMEM; >> -- >> 2.25.1 >>
Hi, On 2023/4/19 01:52, Sui Jingfeng wrote: > Hi, > > On 2023/4/18 16:32, Daniel Vetter wrote: >> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >>> The fbdev test of IGT may write after EOF, which lead to out-of-bound >>> access for the drm drivers using fbdev-generic. For example, on a x86 >>> + aspeed bmc card platform, with a 1680x1050 resolution display, >>> running >>> fbdev test if IGT will cause the linux kernel hang with the following >>> call trace: >>> >>> Oops: 0000 [#1] PREEMPT SMP PTI >>> [IGT] fbdev: starting subtest eof >>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >>> [IGT] fbdev: starting subtest nullptr >>> >>> RIP: 0010:memcpy_erms+0xa/0x20 >>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 >>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 >>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 >>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 >>> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 >>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) >>> knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 >>> Call Trace: >>> <TASK> >>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] >>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >>> process_one_work+0x21f/0x430 >>> worker_thread+0x4e/0x3c0 >>> ? __pfx_worker_thread+0x10/0x10 >>> kthread+0xf4/0x120 >>> ? __pfx_kthread+0x10/0x10 >>> ret_from_fork+0x2c/0x50 >>> </TASK> >>> CR2: ffffa17d40e0b000 >>> ---[ end trace 0000000000000000 ]--- >>> >>> The direct reason is that damage rectange computed by >>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be >>> in-bound. >>> It is already results in workaround code populate to elsewhere. Another >>> reason is that exposing a larger buffer size than the actual needed >>> help >>> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). >>> >>> Others fbdev emulation solutions write to the GEM buffer directly, they >>> won't reproduce this bug because the .fb_dirty function callback do not >>> being hooked, so no chance is given to >>> drm_fb_helper_memory_range_to_clip() >>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >>> >>> This patch break the trigger condition of this bug by shrinking the >>> shadow >>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >>> >>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of >>> GEM >>> buffer")' >>> >>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >>> --- >>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c >>> b/drivers/gpu/drm/drm_fbdev_generic.c >>> index 8e5148bf40bb..b057cfbba938 100644 >>> --- a/drivers/gpu/drm/drm_fbdev_generic.c >>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >>> @@ -94,7 +94,7 @@ static int >>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >>> fb_helper->buffer = buffer; >>> fb_helper->fb = buffer->fb; >>> - screen_size = buffer->gem->size; >>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; >> So I read core some more and stumbled over drm_fb_helper_deferred_io(). >> Which has all the code and comments about this, including limiting. >> >> I think it would be clearer if we fix the issue there, instead of >> passing >> limits around in obscure places that then again get broken? > > No, it is more obscure doing that way... > > > As the size of the shadow screen buffer will be exposed to userspace. > > The size 'helper->fb->height * helper->fb->pitches[0]' is a > exactly(best) fit, > > You are guaranteed to waste at lease one byte by increasing one byte, > > and can not store all pixels by decreasing one byte (In the case where > `helper->fb->pitches[0] = helper->fb->width * 4`). > > It implicitly tell the userspace do not go beyond that boundary. > > although userspace program can still choose to write after EOF, > > But it is for test purpose, to test the kernel if it can return a > -EFBIG or not. > >> The thing is, >> Thomas both authored the limit checks in drm_fb_helper_deferred_io() and >> the patch which broken them again, so clearly this isn't very >> obvious. I'm >> thinking of something like this: >> >> >> diff --git a/drivers/gpu/drm/drm_fb_helper.c >> b/drivers/gpu/drm/drm_fb_helper.c >> index ef4eb8b12766..726dab67c359 100644 >> --- a/drivers/gpu/drm/drm_fb_helper.c >> +++ b/drivers/gpu/drm/drm_fb_helper.c >> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info >> *info, struct list_head *pagerefli >> * of the screen and account for non-existing scanlines. Hence, >> * keep the covered memory area within the screen buffer. >> */ >> - if (info->screen_size) >> - total_size = info->screen_size; >> - else >> - total_size = info->fix.smem_len; >> + total_size = helper->fb->height * helper->fb->pitches[0]; > > This is just to mitigate the mistakes already has been made, > > because it do not do a good splitting between the *clip* part and the > *damage update* part. > > An ideal clipping do not obscure its updating backend with a > out-of-bound damage rectangle. > > Why did the drm_fb_helper_memory_range_to_clip() can not do a good job > in all case > > to pass its backend a always meaningful damage rect ? > >> max_off = min(max_off, total_size); >> if (min_off < max_off) { >> >> >> I think that would make it utmost clear on what we're doing and why. >> Otherwise we're just going to re-create the same bug again, like we've >> done already :-) > > No, we create no bugs, we fix one. > > Thanks. > But honestly I do not have strong feel toward this, I just type what I'm understand without seeing you resend a V3. It's OK in overall, I will help to test this tomorrow. :-) >> -Daniel >> >>> screen_buffer = vzalloc(screen_size); >>> if (!screen_buffer) { >>> ret = -ENOMEM; >>> -- >>> 2.25.1 >>>
On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: > > Hi, > > On 2023/4/19 01:52, Sui Jingfeng wrote: > > Hi, > > > > On 2023/4/18 16:32, Daniel Vetter wrote: > >> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: > >>> The fbdev test of IGT may write after EOF, which lead to out-of-bound > >>> access for the drm drivers using fbdev-generic. For example, on a x86 > >>> + aspeed bmc card platform, with a 1680x1050 resolution display, > >>> running > >>> fbdev test if IGT will cause the linux kernel hang with the following > >>> call trace: > >>> > >>> Oops: 0000 [#1] PREEMPT SMP PTI > >>> [IGT] fbdev: starting subtest eof > >>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] > >>> [IGT] fbdev: starting subtest nullptr > >>> > >>> RIP: 0010:memcpy_erms+0xa/0x20 > >>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 > >>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 > >>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 > >>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 > >>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 > >>> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 > >>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) > >>> knlGS:0000000000000000 > >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 > >>> Call Trace: > >>> <TASK> > >>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] > >>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] > >>> process_one_work+0x21f/0x430 > >>> worker_thread+0x4e/0x3c0 > >>> ? __pfx_worker_thread+0x10/0x10 > >>> kthread+0xf4/0x120 > >>> ? __pfx_kthread+0x10/0x10 > >>> ret_from_fork+0x2c/0x50 > >>> </TASK> > >>> CR2: ffffa17d40e0b000 > >>> ---[ end trace 0000000000000000 ]--- > >>> > >>> The direct reason is that damage rectange computed by > >>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be > >>> in-bound. > >>> It is already results in workaround code populate to elsewhere. Another > >>> reason is that exposing a larger buffer size than the actual needed > >>> help > >>> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). > >>> > >>> Others fbdev emulation solutions write to the GEM buffer directly, they > >>> won't reproduce this bug because the .fb_dirty function callback do not > >>> being hooked, so no chance is given to > >>> drm_fb_helper_memory_range_to_clip() > >>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. > >>> > >>> This patch break the trigger condition of this bug by shrinking the > >>> shadow > >>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. > >>> > >>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of > >>> GEM > >>> buffer")' > >>> > >>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> > >>> --- > >>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c > >>> b/drivers/gpu/drm/drm_fbdev_generic.c > >>> index 8e5148bf40bb..b057cfbba938 100644 > >>> --- a/drivers/gpu/drm/drm_fbdev_generic.c > >>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c > >>> @@ -94,7 +94,7 @@ static int > >>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, > >>> fb_helper->buffer = buffer; > >>> fb_helper->fb = buffer->fb; > >>> - screen_size = buffer->gem->size; > >>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; > >> So I read core some more and stumbled over drm_fb_helper_deferred_io(). > >> Which has all the code and comments about this, including limiting. > >> > >> I think it would be clearer if we fix the issue there, instead of > >> passing > >> limits around in obscure places that then again get broken? > > > > No, it is more obscure doing that way... > > > > > > As the size of the shadow screen buffer will be exposed to userspace. > > > > The size 'helper->fb->height * helper->fb->pitches[0]' is a > > exactly(best) fit, > > > > You are guaranteed to waste at lease one byte by increasing one byte, > > > > and can not store all pixels by decreasing one byte (In the case where > > `helper->fb->pitches[0] = helper->fb->width * 4`). > > > > It implicitly tell the userspace do not go beyond that boundary. > > > > although userspace program can still choose to write after EOF, > > > > But it is for test purpose, to test the kernel if it can return a > > -EFBIG or not. > > > >> The thing is, > >> Thomas both authored the limit checks in drm_fb_helper_deferred_io() and > >> the patch which broken them again, so clearly this isn't very > >> obvious. I'm > >> thinking of something like this: > >> > >> > >> diff --git a/drivers/gpu/drm/drm_fb_helper.c > >> b/drivers/gpu/drm/drm_fb_helper.c > >> index ef4eb8b12766..726dab67c359 100644 > >> --- a/drivers/gpu/drm/drm_fb_helper.c > >> +++ b/drivers/gpu/drm/drm_fb_helper.c > >> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info > >> *info, struct list_head *pagerefli > >> * of the screen and account for non-existing scanlines. Hence, > >> * keep the covered memory area within the screen buffer. > >> */ > >> - if (info->screen_size) > >> - total_size = info->screen_size; > >> - else > >> - total_size = info->fix.smem_len; > >> + total_size = helper->fb->height * helper->fb->pitches[0]; > > > > This is just to mitigate the mistakes already has been made, > > > > because it do not do a good splitting between the *clip* part and the > > *damage update* part. > > > > An ideal clipping do not obscure its updating backend with a > > out-of-bound damage rectangle. > > > > Why did the drm_fb_helper_memory_range_to_clip() can not do a good job > > in all case > > > > to pass its backend a always meaningful damage rect ? > > > >> max_off = min(max_off, total_size); > >> if (min_off < max_off) { > >> > >> > >> I think that would make it utmost clear on what we're doing and why. > >> Otherwise we're just going to re-create the same bug again, like we've > >> done already :-) > > > > No, we create no bugs, we fix one. > > > > Thanks. > > > But honestly I do not have strong feel toward this, I just type what I'm > understand without seeing you resend a V3. > > It's OK in overall, I will help to test this tomorrow. :-) Apologies for making you jump around all the time and doing different versions of the same bugfix :-/ I think this one here is ok to merge, I just thought when looking at the history that we revert the exact patch without any other changes or comments, and usually that means someone will come up with the same cleanup idea again, and then we'll have a bug again. So maybe a comment or a WARN_ON or something else would be good. I guess we could also do your patch, but put a WARN_ON that the computed total_size is never bigger than the drm_fb size into drm_fb_helper_deferred_io()? That would also make sure that this bug doesn't get resurrected again. -Daniel
Hi Am 19.04.23 um 17:09 schrieb Daniel Vetter: > On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: >> >> Hi, >> >> On 2023/4/19 01:52, Sui Jingfeng wrote: >>> Hi, >>> >>> On 2023/4/18 16:32, Daniel Vetter wrote: >>>> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >>>>> The fbdev test of IGT may write after EOF, which lead to out-of-bound >>>>> access for the drm drivers using fbdev-generic. For example, on a x86 >>>>> + aspeed bmc card platform, with a 1680x1050 resolution display, >>>>> running >>>>> fbdev test if IGT will cause the linux kernel hang with the following >>>>> call trace: >>>>> >>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>> [IGT] fbdev: starting subtest eof >>>>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >>>>> [IGT] fbdev: starting subtest nullptr >>>>> >>>>> RIP: 0010:memcpy_erms+0xa/0x20 >>>>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >>>>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 >>>>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 >>>>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 >>>>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 >>>>> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 >>>>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) >>>>> knlGS:0000000000000000 >>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 >>>>> Call Trace: >>>>> <TASK> >>>>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] >>>>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >>>>> process_one_work+0x21f/0x430 >>>>> worker_thread+0x4e/0x3c0 >>>>> ? __pfx_worker_thread+0x10/0x10 >>>>> kthread+0xf4/0x120 >>>>> ? __pfx_kthread+0x10/0x10 >>>>> ret_from_fork+0x2c/0x50 >>>>> </TASK> >>>>> CR2: ffffa17d40e0b000 >>>>> ---[ end trace 0000000000000000 ]--- >>>>> >>>>> The direct reason is that damage rectange computed by >>>>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be >>>>> in-bound. >>>>> It is already results in workaround code populate to elsewhere. Another >>>>> reason is that exposing a larger buffer size than the actual needed >>>>> help >>>>> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). >>>>> >>>>> Others fbdev emulation solutions write to the GEM buffer directly, they >>>>> won't reproduce this bug because the .fb_dirty function callback do not >>>>> being hooked, so no chance is given to >>>>> drm_fb_helper_memory_range_to_clip() >>>>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >>>>> >>>>> This patch break the trigger condition of this bug by shrinking the >>>>> shadow >>>>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >>>>> >>>>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of >>>>> GEM >>>>> buffer")' >>>>> >>>>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >>>>> --- >>>>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c >>>>> b/drivers/gpu/drm/drm_fbdev_generic.c >>>>> index 8e5148bf40bb..b057cfbba938 100644 >>>>> --- a/drivers/gpu/drm/drm_fbdev_generic.c >>>>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >>>>> @@ -94,7 +94,7 @@ static int >>>>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >>>>> fb_helper->buffer = buffer; >>>>> fb_helper->fb = buffer->fb; >>>>> - screen_size = buffer->gem->size; >>>>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; >>>> So I read core some more and stumbled over drm_fb_helper_deferred_io(). >>>> Which has all the code and comments about this, including limiting. >>>> >>>> I think it would be clearer if we fix the issue there, instead of >>>> passing >>>> limits around in obscure places that then again get broken? >>> >>> No, it is more obscure doing that way... >>> >>> >>> As the size of the shadow screen buffer will be exposed to userspace. >>> >>> The size 'helper->fb->height * helper->fb->pitches[0]' is a >>> exactly(best) fit, >>> >>> You are guaranteed to waste at lease one byte by increasing one byte, >>> >>> and can not store all pixels by decreasing one byte (In the case where >>> `helper->fb->pitches[0] = helper->fb->width * 4`). >>> >>> It implicitly tell the userspace do not go beyond that boundary. >>> >>> although userspace program can still choose to write after EOF, >>> >>> But it is for test purpose, to test the kernel if it can return a >>> -EFBIG or not. >>> >>>> The thing is, >>>> Thomas both authored the limit checks in drm_fb_helper_deferred_io() and >>>> the patch which broken them again, so clearly this isn't very >>>> obvious. I'm >>>> thinking of something like this: >>>> >>>> >>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c >>>> b/drivers/gpu/drm/drm_fb_helper.c >>>> index ef4eb8b12766..726dab67c359 100644 >>>> --- a/drivers/gpu/drm/drm_fb_helper.c >>>> +++ b/drivers/gpu/drm/drm_fb_helper.c >>>> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info >>>> *info, struct list_head *pagerefli >>>> * of the screen and account for non-existing scanlines. Hence, >>>> * keep the covered memory area within the screen buffer. >>>> */ >>>> - if (info->screen_size) >>>> - total_size = info->screen_size; >>>> - else >>>> - total_size = info->fix.smem_len; >>>> + total_size = helper->fb->height * helper->fb->pitches[0]; >>> >>> This is just to mitigate the mistakes already has been made, >>> >>> because it do not do a good splitting between the *clip* part and the >>> *damage update* part. >>> >>> An ideal clipping do not obscure its updating backend with a >>> out-of-bound damage rectangle. >>> >>> Why did the drm_fb_helper_memory_range_to_clip() can not do a good job >>> in all case >>> >>> to pass its backend a always meaningful damage rect ? >>> >>>> max_off = min(max_off, total_size); >>>> if (min_off < max_off) { >>>> >>>> >>>> I think that would make it utmost clear on what we're doing and why. >>>> Otherwise we're just going to re-create the same bug again, like we've >>>> done already :-) >>> >>> No, we create no bugs, we fix one. >>> >>> Thanks. >>> >> But honestly I do not have strong feel toward this, I just type what I'm >> understand without seeing you resend a V3. >> >> It's OK in overall, I will help to test this tomorrow. :-) > > Apologies for making you jump around all the time and doing different > versions of the same bugfix :-/ > > I think this one here is ok to merge, I just thought when looking at > the history that we revert the exact patch without any other changes > or comments, and usually that means someone will come up with the same > cleanup idea again, and then we'll have a bug again. So maybe a > comment or a WARN_ON or something else would be good. > > I guess we could also do your patch, but put a WARN_ON that the > computed total_size is never bigger than the drm_fb size into > drm_fb_helper_deferred_io()? That would also make sure that this bug > doesn't get resurrected again. We'd have to put this test into drm_fbdev_generic.c. Otherwise we'll break i915, which also uses deferred I/O, but without shadow buffering.. Maybe test in drm_fbdev_generic_helper_fb_dirty() if the clip rectangle extends the framebuffer size. Best regards Thomas > -Daniel -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
Hi, Sorry about reply to you so late, our downstream (product kernel side) userspace GPU/DC driver has been tested out a few bugs, I'm asking to fulfill my duty to that part all days. I may slow to reply, but I really love to reply. On 2023/4/19 23:09, Daniel Vetter wrote: > On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: >> Hi, >> >> On 2023/4/19 01:52, Sui Jingfeng wrote: >>> Hi, >>> >>> On 2023/4/18 16:32, Daniel Vetter wrote: >>>> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >>>>> The fbdev test of IGT may write after EOF, which lead to out-of-bound >>>>> access for the drm drivers using fbdev-generic. For example, on a x86 >>>>> + aspeed bmc card platform, with a 1680x1050 resolution display, >>>>> running >>>>> fbdev test if IGT will cause the linux kernel hang with the following >>>>> call trace: >>>>> >>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>> [IGT] fbdev: starting subtest eof >>>>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >>>>> [IGT] fbdev: starting subtest nullptr >>>>> >>>>> RIP: 0010:memcpy_erms+0xa/0x20 >>>>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >>>>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 >>>>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 >>>>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 >>>>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 >>>>> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 >>>>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) >>>>> knlGS:0000000000000000 >>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 >>>>> Call Trace: >>>>> <TASK> >>>>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] >>>>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >>>>> process_one_work+0x21f/0x430 >>>>> worker_thread+0x4e/0x3c0 >>>>> ? __pfx_worker_thread+0x10/0x10 >>>>> kthread+0xf4/0x120 >>>>> ? __pfx_kthread+0x10/0x10 >>>>> ret_from_fork+0x2c/0x50 >>>>> </TASK> >>>>> CR2: ffffa17d40e0b000 >>>>> ---[ end trace 0000000000000000 ]--- >>>>> >>>>> The direct reason is that damage rectange computed by >>>>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be >>>>> in-bound. >>>>> It is already results in workaround code populate to elsewhere. Another >>>>> reason is that exposing a larger buffer size than the actual needed >>>>> help >>>>> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). >>>>> >>>>> Others fbdev emulation solutions write to the GEM buffer directly, they >>>>> won't reproduce this bug because the .fb_dirty function callback do not >>>>> being hooked, so no chance is given to >>>>> drm_fb_helper_memory_range_to_clip() >>>>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >>>>> >>>>> This patch break the trigger condition of this bug by shrinking the >>>>> shadow >>>>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >>>>> >>>>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of >>>>> GEM >>>>> buffer")' >>>>> >>>>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >>>>> --- >>>>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c >>>>> b/drivers/gpu/drm/drm_fbdev_generic.c >>>>> index 8e5148bf40bb..b057cfbba938 100644 >>>>> --- a/drivers/gpu/drm/drm_fbdev_generic.c >>>>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >>>>> @@ -94,7 +94,7 @@ static int >>>>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >>>>> fb_helper->buffer = buffer; >>>>> fb_helper->fb = buffer->fb; >>>>> - screen_size = buffer->gem->size; >>>>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; >>>> So I read core some more and stumbled over drm_fb_helper_deferred_io(). >>>> Which has all the code and comments about this, including limiting. >>>> >>>> I think it would be clearer if we fix the issue there, instead of >>>> passing >>>> limits around in obscure places that then again get broken? >>> No, it is more obscure doing that way... >>> >>> >>> As the size of the shadow screen buffer will be exposed to userspace. >>> >>> The size 'helper->fb->height * helper->fb->pitches[0]' is a >>> exactly(best) fit, >>> >>> You are guaranteed to waste at lease one byte by increasing one byte, >>> >>> and can not store all pixels by decreasing one byte (In the case where >>> `helper->fb->pitches[0] = helper->fb->width * 4`). >>> >>> It implicitly tell the userspace do not go beyond that boundary. >>> >>> although userspace program can still choose to write after EOF, >>> >>> But it is for test purpose, to test the kernel if it can return a >>> -EFBIG or not. >>> >>>> The thing is, >>>> Thomas both authored the limit checks in drm_fb_helper_deferred_io() and >>>> the patch which broken them again, so clearly this isn't very >>>> obvious. I'm >>>> thinking of something like this: >>>> >>>> >>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c >>>> b/drivers/gpu/drm/drm_fb_helper.c >>>> index ef4eb8b12766..726dab67c359 100644 >>>> --- a/drivers/gpu/drm/drm_fb_helper.c >>>> +++ b/drivers/gpu/drm/drm_fb_helper.c >>>> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info >>>> *info, struct list_head *pagerefli >>>> * of the screen and account for non-existing scanlines. Hence, >>>> * keep the covered memory area within the screen buffer. >>>> */ >>>> - if (info->screen_size) >>>> - total_size = info->screen_size; >>>> - else >>>> - total_size = info->fix.smem_len; >>>> + total_size = helper->fb->height * helper->fb->pitches[0]; >>> This is just to mitigate the mistakes already has been made, >>> >>> because it do not do a good splitting between the *clip* part and the >>> *damage update* part. >>> >>> An ideal clipping do not obscure its updating backend with a >>> out-of-bound damage rectangle. >>> >>> Why did the drm_fb_helper_memory_range_to_clip() can not do a good job >>> in all case >>> >>> to pass its backend a always meaningful damage rect ? >>> >>>> max_off = min(max_off, total_size); >>>> if (min_off < max_off) { >>>> >>>> >>>> I think that would make it utmost clear on what we're doing and why. >>>> Otherwise we're just going to re-create the same bug again, like we've >>>> done already :-) >>> No, we create no bugs, we fix one. >>> >>> Thanks. >>> >> But honestly I do not have strong feel toward this, I just type what I'm >> understand without seeing you resend a V3. >> >> It's OK in overall, I will help to test this tomorrow. :-) > Apologies for making you jump around all the time and doing different > versions of the same bugfix :-/ No, I do not mind. I'm wondering if you are testing me. > I think this one here is ok to merge, NO, to be honest, this version is not ok. I have just tested it on LoongArch . It does not prevent out-of-bound on LoongArch. bellow is the call trace when running the fbdev test of IGT. [ 369.628841] Console: switching to colour dummy device 80x25 [ 369.634440] [IGT] fbdev: executing [ 369.654684] [IGT] fbdev: starting subtest info [ 369.659173] [IGT] fbdev: starting subtest pan [ 369.722093] [IGT] fbdev: starting subtest read [ 369.737795] [IGT] fbdev: starting subtest unaligned-read [ 369.745695] [IGT] fbdev: starting subtest write [ 369.753154] CPU 3 Unable to handle kernel paging request at virtual address ffff800034bd0000, era == 9000000000223d5c, ra == ffff8000020d75a8 [ 369.774570] [IGT] fbdev: starting subtest unaligned-write [ 369.779960] Oops[#1]: [ 369.782215] CPU: 3 PID: 504 Comm: kworker/3:3 Not tainted 6.3.0-rc5+ #377 [ 369.782219] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-EVB/Loongson-LS3A5000-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05383-beta10 1 [ 369.782223] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] [ 369.802706] $ 0 : 0000000000000000 ffff8000020d75a8 90000001367fc000 90000001367ffcf0 [ 369.809368] $ 4 : ffff80003a3ec000 ffff800034bcee00 0000000000001e00 9000000001160000 [ 369.825275] $ 8 : ffff80003a3ebfff 0000000000000040 0000000000000000 ffff80000214f20c [ 369.825280] $12 : ffff80003a3ecc00 ffff800034bd0000 0000000000000000 0000000000001e00 [ 369.841188] $16 : 9000000107490800 9000000001a24000 9000000001a28000 00000e00207ec000 [ 369.849152] $20 : fffffffffe000000 900000010684cec0 0000000000000439 90000001367ffd90 [ 369.849165] $24 : 9000000106bc5200 ffff8000020de000 ffff800034bcee00 0000000000001e00 [ 369.865072] $28 : 90000001076d4400 0000000000000001 900000010553c000 ffff80003a3eae00 [ 369.865078] era : 9000000000223d5c __memcpy_toio+0x4c/0x90 [ 369.878651] ra : ffff8000020d75a8 drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] [ 369.878762] [IGT] fbdev: starting subtest eof [ 369.887679] CSR crmd: 000000b0 [ 369.887680] CSR prmd: 00000004 [ 369.887682] CSR euen: 00000000 [ 369.887683] CSR ecfg: 00071c1c [ 369.901340] CSR estat: 00010000 [ 369.901342] ExcCode : 1 (SubCode 0) [ 369.901344] BadVA : ffff800034bd0000 [ 369.914659] PrId : 0014c011 (Loongson-64bit) [ 369.914660] Modules linked in: uas usb_storage snd_seq_dummy snd_seq snd_seq_device ip_set rfkill nf_tables nfnetlink vfat fat loongson ttm acpi_ipmi drm_kms_helper syscopyarea sysfillrect ipmi_si ipmi_devintf sysimgblt ipmi_msghandler fuse efivarfs [ 369.919013] Process kworker/3:3 (pid: 504, threadinfo=00000000a1234af0, task=000000004e2cde6f) [ 369.949519] Stack : 9000000106bc5318 900000010684cec0 9000000107afd470 ffff800039c00000 [ 369.949535] [IGT] fbdev: starting subtest nullptr [ 369.957476] 0000000000000001 0000000000000000 0000000000000000 76e0ff420f8eaeab [ 369.957481] 9000000001712000 fffffffffffffffb 9000000100161080 0000000000000003 [ 369.970105] 900000000980ba05 900000000980ba00 9000000106bc5200 9000000106bc52d4 [ 369.978062] ffff8000020de000 9000000106bc52d8 9000000110664b40 ffff8000020d78e4 [ 369.986018] 043a078000000000 76e0ff420f8eaeab 900000000980be00 9000000009807400 [ 370.001926] 0000000000000000 9000000110664b40 9000000106bc52d8 9000000000256dbc [ 370.001931] 900000000170e000 9000000107afce00 0000000000000001 900000000170e000 [ 370.009888] 9000000009807428 9000000110664b70 9000000009807400 900000000025737c [ 370.025797] 9000000100161080 9000000001148080 9000000107afce00 0000000000000003 [ 370.026543] [IGT] fbdev: exiting, ret=0 [ 370.033753] ... [ 370.033756] Call Trace: [ 370.033757] [<9000000000223d5c>] __memcpy_toio+0x4c/0x90 [ 370.047681] [<ffff8000020d75a8>] drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] [ 370.056354] [<ffff8000020d78e4>] drm_fb_helper_damage_work+0xa4/0x1d0 [drm_kms_helper] [ 370.056381] [<9000000000256dbc>] process_one_work+0x1ec/0x35c [ 370.056385] [<900000000025737c>] worker_thread+0x88/0x428 [ 370.056387] [<900000000025f4bc>] kthread+0x114/0x120 [ 370.056392] [<90000000002215a8>] ret_from_kernel_thread+0xc/0xa4 [ 370.056395] [ 370.056396] Code: 00410def 0010bc8c 001500ad <260001ae> 02c02084 02c021ad 29ffe08e 5ffff184 03401cc6 [ 370.056406] [ 370.056421] fbcon_init: detected unhandled fb_set_par error, error code -16 [ 370.056482] ---[ end trace 0000000000000000 ]--- [ 370.066620] Console: switching to colour frame buffer device 240x67 > I just thought when looking at > the history that we revert the exact patch without any other changes > or comments, Other part of that patch(except this line) may still useful, at least for cleanup purpose. > and usually that means someone will come up with the same > cleanup idea again, and then we'll have a bug again. So maybe a > comment or a WARN_ON or something else would be good. A WARN_ON is acceptable. > I guess we could also do your patch, but put a WARN_ON that the > computed total_size is never bigger than the drm_fb size into > drm_fb_helper_deferred_io()? That would also make sure that this bug > doesn't get resurrected again. Best to merge V2 [1] of this series, that is what I am really fixed. Maybe somebody can help to refine it, to add a better description about this question and so on. [1] https://patchwork.freedesktop.org/patch/532143/?series=116454&rev=1 > -Daniel
On Thu, Apr 20, 2023 at 12:00:41AM +0800, Sui Jingfeng wrote: > Hi, > > Sorry about reply to you so late, > > our downstream (product kernel side) userspace GPU/DC driver > > has been tested out a few bugs, I'm asking to fulfill my duty to that part > all days. > > I may slow to reply, but I really love to reply. > > > On 2023/4/19 23:09, Daniel Vetter wrote: > > On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: > > > Hi, > > > > > > On 2023/4/19 01:52, Sui Jingfeng wrote: > > > > Hi, > > > > > > > > On 2023/4/18 16:32, Daniel Vetter wrote: > > > > > On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: > > > > > > The fbdev test of IGT may write after EOF, which lead to out-of-bound > > > > > > access for the drm drivers using fbdev-generic. For example, on a x86 > > > > > > + aspeed bmc card platform, with a 1680x1050 resolution display, > > > > > > running > > > > > > fbdev test if IGT will cause the linux kernel hang with the following > > > > > > call trace: > > > > > > > > > > > > Oops: 0000 [#1] PREEMPT SMP PTI > > > > > > [IGT] fbdev: starting subtest eof > > > > > > Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] > > > > > > [IGT] fbdev: starting subtest nullptr > > > > > > > > > > > > RIP: 0010:memcpy_erms+0xa/0x20 > > > > > > RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 > > > > > > RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 > > > > > > RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 > > > > > > RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 > > > > > > R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 > > > > > > R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 > > > > > > FS: 0000000000000000(0000) GS:ffff895257380000(0000) > > > > > > knlGS:0000000000000000 > > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > > > CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 > > > > > > Call Trace: > > > > > > <TASK> > > > > > > ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] > > > > > > drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] > > > > > > process_one_work+0x21f/0x430 > > > > > > worker_thread+0x4e/0x3c0 > > > > > > ? __pfx_worker_thread+0x10/0x10 > > > > > > kthread+0xf4/0x120 > > > > > > ? __pfx_kthread+0x10/0x10 > > > > > > ret_from_fork+0x2c/0x50 > > > > > > </TASK> > > > > > > CR2: ffffa17d40e0b000 > > > > > > ---[ end trace 0000000000000000 ]--- > > > > > > > > > > > > The direct reason is that damage rectange computed by > > > > > > drm_fb_helper_memory_range_to_clip() does not guaranteed to be > > > > > > in-bound. > > > > > > It is already results in workaround code populate to elsewhere. Another > > > > > > reason is that exposing a larger buffer size than the actual needed > > > > > > help > > > > > > to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). > > > > > > > > > > > > Others fbdev emulation solutions write to the GEM buffer directly, they > > > > > > won't reproduce this bug because the .fb_dirty function callback do not > > > > > > being hooked, so no chance is given to > > > > > > drm_fb_helper_memory_range_to_clip() > > > > > > to generate a out-of-bound when drm_fb_helper_sys_write() is called. > > > > > > > > > > > > This patch break the trigger condition of this bug by shrinking the > > > > > > shadow > > > > > > buffer size to sizes->surface_height * buffer->fb->pitches[0]. > > > > > > > > > > > > Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of > > > > > > GEM > > > > > > buffer")' > > > > > > > > > > > > Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> > > > > > > --- > > > > > > drivers/gpu/drm/drm_fbdev_generic.c | 2 +- > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/drm_fbdev_generic.c > > > > > > b/drivers/gpu/drm/drm_fbdev_generic.c > > > > > > index 8e5148bf40bb..b057cfbba938 100644 > > > > > > --- a/drivers/gpu/drm/drm_fbdev_generic.c > > > > > > +++ b/drivers/gpu/drm/drm_fbdev_generic.c > > > > > > @@ -94,7 +94,7 @@ static int > > > > > > drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, > > > > > > fb_helper->buffer = buffer; > > > > > > fb_helper->fb = buffer->fb; > > > > > > - screen_size = buffer->gem->size; > > > > > > + screen_size = sizes->surface_height * buffer->fb->pitches[0]; > > > > > So I read core some more and stumbled over drm_fb_helper_deferred_io(). > > > > > Which has all the code and comments about this, including limiting. > > > > > > > > > > I think it would be clearer if we fix the issue there, instead of > > > > > passing > > > > > limits around in obscure places that then again get broken? > > > > No, it is more obscure doing that way... > > > > > > > > > > > > As the size of the shadow screen buffer will be exposed to userspace. > > > > > > > > The size 'helper->fb->height * helper->fb->pitches[0]' is a > > > > exactly(best) fit, > > > > > > > > You are guaranteed to waste at lease one byte by increasing one byte, > > > > > > > > and can not store all pixels by decreasing one byte (In the case where > > > > `helper->fb->pitches[0] = helper->fb->width * 4`). > > > > > > > > It implicitly tell the userspace do not go beyond that boundary. > > > > > > > > although userspace program can still choose to write after EOF, > > > > > > > > But it is for test purpose, to test the kernel if it can return a > > > > -EFBIG or not. > > > > > > > > > The thing is, > > > > > Thomas both authored the limit checks in drm_fb_helper_deferred_io() and > > > > > the patch which broken them again, so clearly this isn't very > > > > > obvious. I'm > > > > > thinking of something like this: > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/drm_fb_helper.c > > > > > b/drivers/gpu/drm/drm_fb_helper.c > > > > > index ef4eb8b12766..726dab67c359 100644 > > > > > --- a/drivers/gpu/drm/drm_fb_helper.c > > > > > +++ b/drivers/gpu/drm/drm_fb_helper.c > > > > > @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info > > > > > *info, struct list_head *pagerefli > > > > > * of the screen and account for non-existing scanlines. Hence, > > > > > * keep the covered memory area within the screen buffer. > > > > > */ > > > > > - if (info->screen_size) > > > > > - total_size = info->screen_size; > > > > > - else > > > > > - total_size = info->fix.smem_len; > > > > > + total_size = helper->fb->height * helper->fb->pitches[0]; > > > > This is just to mitigate the mistakes already has been made, > > > > > > > > because it do not do a good splitting between the *clip* part and the > > > > *damage update* part. > > > > > > > > An ideal clipping do not obscure its updating backend with a > > > > out-of-bound damage rectangle. > > > > > > > > Why did the drm_fb_helper_memory_range_to_clip() can not do a good job > > > > in all case > > > > > > > > to pass its backend a always meaningful damage rect ? > > > > > > > > > max_off = min(max_off, total_size); > > > > > if (min_off < max_off) { > > > > > > > > > > > > > > > I think that would make it utmost clear on what we're doing and why. > > > > > Otherwise we're just going to re-create the same bug again, like we've > > > > > done already :-) > > > > No, we create no bugs, we fix one. > > > > > > > > Thanks. > > > > > > > But honestly I do not have strong feel toward this, I just type what I'm > > > understand without seeing you resend a V3. > > > > > > It's OK in overall, I will help to test this tomorrow. :-) > > Apologies for making you jump around all the time and doing different > > versions of the same bugfix :-/ > > No, I do not mind. I'm wondering if you are testing me. Nah I'm really not any clearer on this than you :-/ > > I think this one here is ok to merge, > > NO, to be honest, this version is not ok. > > I have just tested it on LoongArch . It does not prevent out-of-bound on > LoongArch. > > bellow is the call trace when running the fbdev test of IGT. > > > [ 369.628841] Console: switching to colour dummy device 80x25 > [ 369.634440] [IGT] fbdev: executing > [ 369.654684] [IGT] fbdev: starting subtest info > [ 369.659173] [IGT] fbdev: starting subtest pan > [ 369.722093] [IGT] fbdev: starting subtest read > [ 369.737795] [IGT] fbdev: starting subtest unaligned-read > [ 369.745695] [IGT] fbdev: starting subtest write > [ 369.753154] CPU 3 Unable to handle kernel paging request at virtual address ffff800034bd0000, era == 9000000000223d5c, ra == ffff8000020d75a8 > [ 369.774570] [IGT] fbdev: starting subtest unaligned-write > [ 369.779960] Oops[#1]: > [ 369.782215] CPU: 3 PID: 504 Comm: kworker/3:3 Not tainted 6.3.0-rc5+ #377 > [ 369.782219] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-EVB/Loongson-LS3A5000-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05383-beta10 1 > [ 369.782223] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] > [ 369.802706] $ 0 : 0000000000000000 ffff8000020d75a8 90000001367fc000 90000001367ffcf0 > [ 369.809368] $ 4 : ffff80003a3ec000 ffff800034bcee00 0000000000001e00 9000000001160000 > [ 369.825275] $ 8 : ffff80003a3ebfff 0000000000000040 0000000000000000 ffff80000214f20c > [ 369.825280] $12 : ffff80003a3ecc00 ffff800034bd0000 0000000000000000 0000000000001e00 > [ 369.841188] $16 : 9000000107490800 9000000001a24000 9000000001a28000 00000e00207ec000 > [ 369.849152] $20 : fffffffffe000000 900000010684cec0 0000000000000439 90000001367ffd90 > [ 369.849165] $24 : 9000000106bc5200 ffff8000020de000 ffff800034bcee00 0000000000001e00 > [ 369.865072] $28 : 90000001076d4400 0000000000000001 900000010553c000 ffff80003a3eae00 > [ 369.865078] era : 9000000000223d5c __memcpy_toio+0x4c/0x90 > [ 369.878651] ra : ffff8000020d75a8 drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] > [ 369.878762] [IGT] fbdev: starting subtest eof > [ 369.887679] CSR crmd: 000000b0 > [ 369.887680] CSR prmd: 00000004 > [ 369.887682] CSR euen: 00000000 > [ 369.887683] CSR ecfg: 00071c1c > [ 369.901340] CSR estat: 00010000 > [ 369.901342] ExcCode : 1 (SubCode 0) > [ 369.901344] BadVA : ffff800034bd0000 > [ 369.914659] PrId : 0014c011 (Loongson-64bit) > [ 369.914660] Modules linked in: uas usb_storage snd_seq_dummy snd_seq snd_seq_device ip_set rfkill nf_tables nfnetlink vfat fat loongson ttm acpi_ipmi drm_kms_helper syscopyarea sysfillrect ipmi_si ipmi_devintf sysimgblt ipmi_msghandler fuse efivarfs > [ 369.919013] Process kworker/3:3 (pid: 504, threadinfo=00000000a1234af0, task=000000004e2cde6f) > [ 369.949519] Stack : 9000000106bc5318 900000010684cec0 9000000107afd470 ffff800039c00000 > [ 369.949535] [IGT] fbdev: starting subtest nullptr > [ 369.957476] 0000000000000001 0000000000000000 0000000000000000 76e0ff420f8eaeab > [ 369.957481] 9000000001712000 fffffffffffffffb 9000000100161080 0000000000000003 > [ 369.970105] 900000000980ba05 900000000980ba00 9000000106bc5200 9000000106bc52d4 > [ 369.978062] ffff8000020de000 9000000106bc52d8 9000000110664b40 ffff8000020d78e4 > [ 369.986018] 043a078000000000 76e0ff420f8eaeab 900000000980be00 9000000009807400 > [ 370.001926] 0000000000000000 9000000110664b40 9000000106bc52d8 9000000000256dbc > [ 370.001931] 900000000170e000 9000000107afce00 0000000000000001 900000000170e000 > [ 370.009888] 9000000009807428 9000000110664b70 9000000009807400 900000000025737c > [ 370.025797] 9000000100161080 9000000001148080 9000000107afce00 0000000000000003 > [ 370.026543] [IGT] fbdev: exiting, ret=0 > [ 370.033753] ... > [ 370.033756] Call Trace: > [ 370.033757] [<9000000000223d5c>] __memcpy_toio+0x4c/0x90 > [ 370.047681] [<ffff8000020d75a8>] drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] > [ 370.056354] [<ffff8000020d78e4>] drm_fb_helper_damage_work+0xa4/0x1d0 [drm_kms_helper] > [ 370.056381] [<9000000000256dbc>] process_one_work+0x1ec/0x35c > [ 370.056385] [<900000000025737c>] worker_thread+0x88/0x428 > [ 370.056387] [<900000000025f4bc>] kthread+0x114/0x120 > [ 370.056392] [<90000000002215a8>] ret_from_kernel_thread+0xc/0xa4 > [ 370.056395] > [ 370.056396] Code: 00410def 0010bc8c 001500ad <260001ae> 02c02084 02c021ad 29ffe08e 5ffff184 03401cc6 > [ 370.056406] > [ 370.056421] fbcon_init: detected unhandled fb_set_par error, error code -16 > [ 370.056482] ---[ end trace 0000000000000000 ]--- > [ 370.066620] Console: switching to colour frame buffer device 240x67 > > > > I just thought when looking at > > the history that we revert the exact patch without any other changes > > or comments, > Other part of that patch(except this line) may still useful, at least for > cleanup purpose. > > and usually that means someone will come up with the same > > cleanup idea again, and then we'll have a bug again. So maybe a > > comment or a WARN_ON or something else would be good. > > A WARN_ON is acceptable. > > > I guess we could also do your patch, but put a WARN_ON that the > > computed total_size is never bigger than the drm_fb size into > > drm_fb_helper_deferred_io()? That would also make sure that this bug > > doesn't get resurrected again. > > Best to merge V2 [1] of this series, that is what I am really fixed. > > Maybe somebody can help to refine it, to add a better description about this > question and so on. > > [1] https://patchwork.freedesktop.org/patch/532143/?series=116454&rev=1 Ok, I guess this really is the safest one. For that patch, do we need the change to screen_size in drm_fbdev_generic_helper_fb_probe()? I'm still not entirely clear. If it works without that change I think that's clearer for a minimal bugfix, if so can you send that out as v4 please? Also please Cc: Geert on whatever you're resubmitting, so he can test too and we can make sure it's still fixing the shmob issue he's seeing. And finally please include a link to this discussion here with a note that just reverting the screen_size changes is not enough: https://lore.kernel.org/dri-devel/ad44df29-3241-0d9e-e708-b0338bf3c623@189.cn/ Thanks a lot! -Daniel
Hi, On 2023/4/20 00:31, Daniel Vetter wrote: > On Thu, Apr 20, 2023 at 12:00:41AM +0800, Sui Jingfeng wrote: >> Hi, >> >> Sorry about reply to you so late, >> >> our downstream (product kernel side) userspace GPU/DC driver >> >> has been tested out a few bugs, I'm asking to fulfill my duty to that part >> all days. >> >> I may slow to reply, but I really love to reply. >> >> >> On 2023/4/19 23:09, Daniel Vetter wrote: >>> On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: >>>> Hi, >>>> >>>> On 2023/4/19 01:52, Sui Jingfeng wrote: >>>>> Hi, >>>>> >>>>> On 2023/4/18 16:32, Daniel Vetter wrote: >>>>>> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >>>>>>> The fbdev test of IGT may write after EOF, which lead to out-of-bound >>>>>>> access for the drm drivers using fbdev-generic. For example, on a x86 >>>>>>> + aspeed bmc card platform, with a 1680x1050 resolution display, >>>>>>> running >>>>>>> fbdev test if IGT will cause the linux kernel hang with the following >>>>>>> call trace: >>>>>>> >>>>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>>>> [IGT] fbdev: starting subtest eof >>>>>>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >>>>>>> [IGT] fbdev: starting subtest nullptr >>>>>>> >>>>>>> RIP: 0010:memcpy_erms+0xa/0x20 >>>>>>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >>>>>>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 >>>>>>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 >>>>>>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 >>>>>>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 >>>>>>> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 >>>>>>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) >>>>>>> knlGS:0000000000000000 >>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>>>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 >>>>>>> Call Trace: >>>>>>> <TASK> >>>>>>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] >>>>>>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >>>>>>> process_one_work+0x21f/0x430 >>>>>>> worker_thread+0x4e/0x3c0 >>>>>>> ? __pfx_worker_thread+0x10/0x10 >>>>>>> kthread+0xf4/0x120 >>>>>>> ? __pfx_kthread+0x10/0x10 >>>>>>> ret_from_fork+0x2c/0x50 >>>>>>> </TASK> >>>>>>> CR2: ffffa17d40e0b000 >>>>>>> ---[ end trace 0000000000000000 ]--- >>>>>>> >>>>>>> The direct reason is that damage rectange computed by >>>>>>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be >>>>>>> in-bound. >>>>>>> It is already results in workaround code populate to elsewhere. Another >>>>>>> reason is that exposing a larger buffer size than the actual needed >>>>>>> help >>>>>>> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). >>>>>>> >>>>>>> Others fbdev emulation solutions write to the GEM buffer directly, they >>>>>>> won't reproduce this bug because the .fb_dirty function callback do not >>>>>>> being hooked, so no chance is given to >>>>>>> drm_fb_helper_memory_range_to_clip() >>>>>>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >>>>>>> >>>>>>> This patch break the trigger condition of this bug by shrinking the >>>>>>> shadow >>>>>>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >>>>>>> >>>>>>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of >>>>>>> GEM >>>>>>> buffer")' >>>>>>> >>>>>>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >>>>>>> --- >>>>>>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> index 8e5148bf40bb..b057cfbba938 100644 >>>>>>> --- a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> @@ -94,7 +94,7 @@ static int >>>>>>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >>>>>>> fb_helper->buffer = buffer; >>>>>>> fb_helper->fb = buffer->fb; >>>>>>> - screen_size = buffer->gem->size; >>>>>>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; >>>>>> So I read core some more and stumbled over drm_fb_helper_deferred_io(). >>>>>> Which has all the code and comments about this, including limiting. >>>>>> >>>>>> I think it would be clearer if we fix the issue there, instead of >>>>>> passing >>>>>> limits around in obscure places that then again get broken? >>>>> No, it is more obscure doing that way... >>>>> >>>>> >>>>> As the size of the shadow screen buffer will be exposed to userspace. >>>>> >>>>> The size 'helper->fb->height * helper->fb->pitches[0]' is a >>>>> exactly(best) fit, >>>>> >>>>> You are guaranteed to waste at lease one byte by increasing one byte, >>>>> >>>>> and can not store all pixels by decreasing one byte (In the case where >>>>> `helper->fb->pitches[0] = helper->fb->width * 4`). >>>>> >>>>> It implicitly tell the userspace do not go beyond that boundary. >>>>> >>>>> although userspace program can still choose to write after EOF, >>>>> >>>>> But it is for test purpose, to test the kernel if it can return a >>>>> -EFBIG or not. >>>>> >>>>>> The thing is, >>>>>> Thomas both authored the limit checks in drm_fb_helper_deferred_io() and >>>>>> the patch which broken them again, so clearly this isn't very >>>>>> obvious. I'm >>>>>> thinking of something like this: >>>>>> >>>>>> >>>>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c >>>>>> b/drivers/gpu/drm/drm_fb_helper.c >>>>>> index ef4eb8b12766..726dab67c359 100644 >>>>>> --- a/drivers/gpu/drm/drm_fb_helper.c >>>>>> +++ b/drivers/gpu/drm/drm_fb_helper.c >>>>>> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info >>>>>> *info, struct list_head *pagerefli >>>>>> * of the screen and account for non-existing scanlines. Hence, >>>>>> * keep the covered memory area within the screen buffer. >>>>>> */ >>>>>> - if (info->screen_size) >>>>>> - total_size = info->screen_size; >>>>>> - else >>>>>> - total_size = info->fix.smem_len; >>>>>> + total_size = helper->fb->height * helper->fb->pitches[0]; >>>>> This is just to mitigate the mistakes already has been made, >>>>> >>>>> because it do not do a good splitting between the *clip* part and the >>>>> *damage update* part. >>>>> >>>>> An ideal clipping do not obscure its updating backend with a >>>>> out-of-bound damage rectangle. >>>>> >>>>> Why did the drm_fb_helper_memory_range_to_clip() can not do a good job >>>>> in all case >>>>> >>>>> to pass its backend a always meaningful damage rect ? >>>>> >>>>>> max_off = min(max_off, total_size); >>>>>> if (min_off < max_off) { >>>>>> >>>>>> >>>>>> I think that would make it utmost clear on what we're doing and why. >>>>>> Otherwise we're just going to re-create the same bug again, like we've >>>>>> done already :-) >>>>> No, we create no bugs, we fix one. >>>>> >>>>> Thanks. >>>>> >>>> But honestly I do not have strong feel toward this, I just type what I'm >>>> understand without seeing you resend a V3. >>>> >>>> It's OK in overall, I will help to test this tomorrow. :-) >>> Apologies for making you jump around all the time and doing different >>> versions of the same bugfix :-/ >> No, I do not mind. I'm wondering if you are testing me. > Nah I'm really not any clearer on this than you :-/ > >>> I think this one here is ok to merge, >> NO, to be honest, this version is not ok. >> >> I have just tested it on LoongArch . It does not prevent out-of-bound on >> LoongArch. >> >> bellow is the call trace when running the fbdev test of IGT. >> >> >> [ 369.628841] Console: switching to colour dummy device 80x25 >> [ 369.634440] [IGT] fbdev: executing >> [ 369.654684] [IGT] fbdev: starting subtest info >> [ 369.659173] [IGT] fbdev: starting subtest pan >> [ 369.722093] [IGT] fbdev: starting subtest read >> [ 369.737795] [IGT] fbdev: starting subtest unaligned-read >> [ 369.745695] [IGT] fbdev: starting subtest write >> [ 369.753154] CPU 3 Unable to handle kernel paging request at virtual address ffff800034bd0000, era == 9000000000223d5c, ra == ffff8000020d75a8 >> [ 369.774570] [IGT] fbdev: starting subtest unaligned-write >> [ 369.779960] Oops[#1]: >> [ 369.782215] CPU: 3 PID: 504 Comm: kworker/3:3 Not tainted 6.3.0-rc5+ #377 >> [ 369.782219] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-EVB/Loongson-LS3A5000-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05383-beta10 1 >> [ 369.782223] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >> [ 369.802706] $ 0 : 0000000000000000 ffff8000020d75a8 90000001367fc000 90000001367ffcf0 >> [ 369.809368] $ 4 : ffff80003a3ec000 ffff800034bcee00 0000000000001e00 9000000001160000 >> [ 369.825275] $ 8 : ffff80003a3ebfff 0000000000000040 0000000000000000 ffff80000214f20c >> [ 369.825280] $12 : ffff80003a3ecc00 ffff800034bd0000 0000000000000000 0000000000001e00 >> [ 369.841188] $16 : 9000000107490800 9000000001a24000 9000000001a28000 00000e00207ec000 >> [ 369.849152] $20 : fffffffffe000000 900000010684cec0 0000000000000439 90000001367ffd90 >> [ 369.849165] $24 : 9000000106bc5200 ffff8000020de000 ffff800034bcee00 0000000000001e00 >> [ 369.865072] $28 : 90000001076d4400 0000000000000001 900000010553c000 ffff80003a3eae00 >> [ 369.865078] era : 9000000000223d5c __memcpy_toio+0x4c/0x90 >> [ 369.878651] ra : ffff8000020d75a8 drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] >> [ 369.878762] [IGT] fbdev: starting subtest eof >> [ 369.887679] CSR crmd: 000000b0 >> [ 369.887680] CSR prmd: 00000004 >> [ 369.887682] CSR euen: 00000000 >> [ 369.887683] CSR ecfg: 00071c1c >> [ 369.901340] CSR estat: 00010000 >> [ 369.901342] ExcCode : 1 (SubCode 0) >> [ 369.901344] BadVA : ffff800034bd0000 >> [ 369.914659] PrId : 0014c011 (Loongson-64bit) >> [ 369.914660] Modules linked in: uas usb_storage snd_seq_dummy snd_seq snd_seq_device ip_set rfkill nf_tables nfnetlink vfat fat loongson ttm acpi_ipmi drm_kms_helper syscopyarea sysfillrect ipmi_si ipmi_devintf sysimgblt ipmi_msghandler fuse efivarfs >> [ 369.919013] Process kworker/3:3 (pid: 504, threadinfo=00000000a1234af0, task=000000004e2cde6f) >> [ 369.949519] Stack : 9000000106bc5318 900000010684cec0 9000000107afd470 ffff800039c00000 >> [ 369.949535] [IGT] fbdev: starting subtest nullptr >> [ 369.957476] 0000000000000001 0000000000000000 0000000000000000 76e0ff420f8eaeab >> [ 369.957481] 9000000001712000 fffffffffffffffb 9000000100161080 0000000000000003 >> [ 369.970105] 900000000980ba05 900000000980ba00 9000000106bc5200 9000000106bc52d4 >> [ 369.978062] ffff8000020de000 9000000106bc52d8 9000000110664b40 ffff8000020d78e4 >> [ 369.986018] 043a078000000000 76e0ff420f8eaeab 900000000980be00 9000000009807400 >> [ 370.001926] 0000000000000000 9000000110664b40 9000000106bc52d8 9000000000256dbc >> [ 370.001931] 900000000170e000 9000000107afce00 0000000000000001 900000000170e000 >> [ 370.009888] 9000000009807428 9000000110664b70 9000000009807400 900000000025737c >> [ 370.025797] 9000000100161080 9000000001148080 9000000107afce00 0000000000000003 >> [ 370.026543] [IGT] fbdev: exiting, ret=0 >> [ 370.033753] ... >> [ 370.033756] Call Trace: >> [ 370.033757] [<9000000000223d5c>] __memcpy_toio+0x4c/0x90 >> [ 370.047681] [<ffff8000020d75a8>] drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] >> [ 370.056354] [<ffff8000020d78e4>] drm_fb_helper_damage_work+0xa4/0x1d0 [drm_kms_helper] >> [ 370.056381] [<9000000000256dbc>] process_one_work+0x1ec/0x35c >> [ 370.056385] [<900000000025737c>] worker_thread+0x88/0x428 >> [ 370.056387] [<900000000025f4bc>] kthread+0x114/0x120 >> [ 370.056392] [<90000000002215a8>] ret_from_kernel_thread+0xc/0xa4 >> [ 370.056395] >> [ 370.056396] Code: 00410def 0010bc8c 001500ad <260001ae> 02c02084 02c021ad 29ffe08e 5ffff184 03401cc6 >> [ 370.056406] >> [ 370.056421] fbcon_init: detected unhandled fb_set_par error, error code -16 >> [ 370.056482] ---[ end trace 0000000000000000 ]--- >> [ 370.066620] Console: switching to colour frame buffer device 240x67 >> >> >>> I just thought when looking at >>> the history that we revert the exact patch without any other changes >>> or comments, >> Other part of that patch(except this line) may still useful, at least for >> cleanup purpose. >>> and usually that means someone will come up with the same >>> cleanup idea again, and then we'll have a bug again. So maybe a >>> comment or a WARN_ON or something else would be good. >> A WARN_ON is acceptable. >> >>> I guess we could also do your patch, but put a WARN_ON that the >>> computed total_size is never bigger than the drm_fb size into >>> drm_fb_helper_deferred_io()? That would also make sure that this bug >>> doesn't get resurrected again. >> Best to merge V2 [1] of this series, that is what I am really fixed. >> >> Maybe somebody can help to refine it, to add a better description about this >> question and so on. >> >> [1] https://patchwork.freedesktop.org/patch/532143/?series=116454&rev=1 > Ok, I guess this really is the safest one. For that patch, do we need the > change to screen_size in drm_fbdev_generic_helper_fb_probe()? With that patch, we can expose a larger screen_size to userspace, change screen_size to `helper->fb->height * helper->fb->pitches[0]` is *NOT* necessary. But I though it maybe better to keep it line with the counter part in drm_fbdev_dma_helper_fb_probe(). make it double security. > I'm still > not entirely clear. Because drm_fb_helper_memory_range_to_clip() function will also be called by drm_fb_helper_sys_write() function(in drm_fb_helper.c) when running fbdev test. Put restriction in drm_fb_helper_deferred_io() function solely is not enough, because drm_fb_helper_deferred_io() is for mmap code path to writing the shadow buffer. It relative fast, because it does not need another copy from user. However , drm_fb_helper_sys_write() is another code path to write to the shadow screen buffer. it need copy from user to the kernel first, then copy to the shadow screen buffer, and finally copy from the shadow to the real gem buffer. Every time you want to write the screen buffer, you have to issue a system call. But for the mmap code path, I only need map it to userspace address space once. Yeah, only once, with write-combine page table caching property for video ram. but for the shadow buffer in system ram, using cached is more fast and reliable because of cache coherent related concerns... So, we have two ways to writing the shadow(screen) buffer in the system ram. > If it works without that change I think that's clearer > for a minimal bugfix, if so can you send that out as v4 please? OK. But I still want to fix the two by one shot. It makes me feel comfortable. > Also please Cc: Geert on whatever you're resubmitting, so he can test too > and we can make sure it's still fixing the shmob issue he's seeing. > > And finally please include a link to this discussion here with a note that > just reverting the screen_size changes is not enough: > > https://lore.kernel.org/dri-devel/ad44df29-3241-0d9e-e708-b0338bf3c623@189.cn/ > Thanks a lot! > -Daniel
Hi, On 2023/4/19 23:46, Thomas Zimmermann wrote: > Hi > > Am 19.04.23 um 17:09 schrieb Daniel Vetter: >> On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: >>> >>> Hi, >>> >>> On 2023/4/19 01:52, Sui Jingfeng wrote: >>>> Hi, >>>> >>>> On 2023/4/18 16:32, Daniel Vetter wrote: >>>>> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >>>>>> The fbdev test of IGT may write after EOF, which lead to >>>>>> out-of-bound >>>>>> access for the drm drivers using fbdev-generic. For example, on a >>>>>> x86 >>>>>> + aspeed bmc card platform, with a 1680x1050 resolution display, >>>>>> running >>>>>> fbdev test if IGT will cause the linux kernel hang with the >>>>>> following >>>>>> call trace: >>>>>> >>>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>>> [IGT] fbdev: starting subtest eof >>>>>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >>>>>> [IGT] fbdev: starting subtest nullptr >>>>>> >>>>>> RIP: 0010:memcpy_erms+0xa/0x20 >>>>>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >>>>>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: >>>>>> 00000000000014c0 >>>>>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: >>>>>> ffffa17d4eb80000 >>>>>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: >>>>>> ffff89522ecff8c0 >>>>>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: >>>>>> ffffa17d4eb7fa80 >>>>>> R13: 0000000000001a40 R14: 000000000000041a R15: >>>>>> ffffa17d40167e30 >>>>>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) >>>>>> knlGS:0000000000000000 >>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: >>>>>> 00000000001706e0 >>>>>> Call Trace: >>>>>> <TASK> >>>>>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 >>>>>> [drm_kms_helper] >>>>>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >>>>>> process_one_work+0x21f/0x430 >>>>>> worker_thread+0x4e/0x3c0 >>>>>> ? __pfx_worker_thread+0x10/0x10 >>>>>> kthread+0xf4/0x120 >>>>>> ? __pfx_kthread+0x10/0x10 >>>>>> ret_from_fork+0x2c/0x50 >>>>>> </TASK> >>>>>> CR2: ffffa17d40e0b000 >>>>>> ---[ end trace 0000000000000000 ]--- >>>>>> >>>>>> The direct reason is that damage rectange computed by >>>>>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be >>>>>> in-bound. >>>>>> It is already results in workaround code populate to elsewhere. >>>>>> Another >>>>>> reason is that exposing a larger buffer size than the actual needed >>>>>> help >>>>>> to trigger this bug intrinsic in >>>>>> drm_fb_helper_memory_range_to_clip(). >>>>>> >>>>>> Others fbdev emulation solutions write to the GEM buffer >>>>>> directly, they >>>>>> won't reproduce this bug because the .fb_dirty function callback >>>>>> do not >>>>>> being hooked, so no chance is given to >>>>>> drm_fb_helper_memory_range_to_clip() >>>>>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >>>>>> >>>>>> This patch break the trigger condition of this bug by shrinking the >>>>>> shadow >>>>>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >>>>>> >>>>>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of >>>>>> GEM >>>>>> buffer")' >>>>>> >>>>>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >>>>>> --- >>>>>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>> b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>> index 8e5148bf40bb..b057cfbba938 100644 >>>>>> --- a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>> @@ -94,7 +94,7 @@ static int >>>>>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >>>>>> fb_helper->buffer = buffer; >>>>>> fb_helper->fb = buffer->fb; >>>>>> - screen_size = buffer->gem->size; >>>>>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; >>>>> So I read core some more and stumbled over >>>>> drm_fb_helper_deferred_io(). >>>>> Which has all the code and comments about this, including limiting. >>>>> >>>>> I think it would be clearer if we fix the issue there, instead of >>>>> passing >>>>> limits around in obscure places that then again get broken? >>>> >>>> No, it is more obscure doing that way... >>>> >>>> >>>> As the size of the shadow screen buffer will be exposed to userspace. >>>> >>>> The size 'helper->fb->height * helper->fb->pitches[0]' is a >>>> exactly(best) fit, >>>> >>>> You are guaranteed to waste at lease one byte by increasing one byte, >>>> >>>> and can not store all pixels by decreasing one byte (In the case where >>>> `helper->fb->pitches[0] = helper->fb->width * 4`). >>>> >>>> It implicitly tell the userspace do not go beyond that boundary. >>>> >>>> although userspace program can still choose to write after EOF, >>>> >>>> But it is for test purpose, to test the kernel if it can return a >>>> -EFBIG or not. >>>> >>>>> The thing is, >>>>> Thomas both authored the limit checks in >>>>> drm_fb_helper_deferred_io() and >>>>> the patch which broken them again, so clearly this isn't very >>>>> obvious. I'm >>>>> thinking of something like this: >>>>> >>>>> >>>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c >>>>> b/drivers/gpu/drm/drm_fb_helper.c >>>>> index ef4eb8b12766..726dab67c359 100644 >>>>> --- a/drivers/gpu/drm/drm_fb_helper.c >>>>> +++ b/drivers/gpu/drm/drm_fb_helper.c >>>>> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info >>>>> *info, struct list_head *pagerefli >>>>> * of the screen and account for non-existing scanlines. >>>>> Hence, >>>>> * keep the covered memory area within the screen buffer. >>>>> */ >>>>> - if (info->screen_size) >>>>> - total_size = info->screen_size; >>>>> - else >>>>> - total_size = info->fix.smem_len; >>>>> + total_size = helper->fb->height * helper->fb->pitches[0]; >>>> >>>> This is just to mitigate the mistakes already has been made, >>>> >>>> because it do not do a good splitting between the *clip* part and the >>>> *damage update* part. >>>> >>>> An ideal clipping do not obscure its updating backend with a >>>> out-of-bound damage rectangle. >>>> >>>> Why did the drm_fb_helper_memory_range_to_clip() can not do a good job >>>> in all case >>>> >>>> to pass its backend a always meaningful damage rect ? >>>> >>>>> max_off = min(max_off, total_size); >>>>> if (min_off < max_off) { >>>>> >>>>> >>>>> I think that would make it utmost clear on what we're doing and why. >>>>> Otherwise we're just going to re-create the same bug again, like >>>>> we've >>>>> done already :-) >>>> >>>> No, we create no bugs, we fix one. >>>> >>>> Thanks. >>>> >>> But honestly I do not have strong feel toward this, I just type what >>> I'm >>> understand without seeing you resend a V3. >>> >>> It's OK in overall, I will help to test this tomorrow. :-) >> >> Apologies for making you jump around all the time and doing different >> versions of the same bugfix :-/ >> >> I think this one here is ok to merge, I just thought when looking at >> the history that we revert the exact patch without any other changes >> or comments, and usually that means someone will come up with the same >> cleanup idea again, and then we'll have a bug again. So maybe a >> comment or a WARN_ON or something else would be good. >> >> I guess we could also do your patch, but put a WARN_ON that the >> computed total_size is never bigger than the drm_fb size into >> drm_fb_helper_deferred_io()? That would also make sure that this bug >> doesn't get resurrected again. > > We'd have to put this test into drm_fbdev_generic.c. Otherwise we'll > break i915, which also uses deferred I/O, but without shadow > buffering.. Maybe test in drm_fbdev_generic_helper_fb_dirty() if the > clip rectangle extends the framebuffer size. > Yeah, i915 carve out part of system ram as video memory, it is also called stolen memory. I just learned it recently from i915 related document. But from what I'm understanding, It's still RAM in its nature, just reserved by firmware. Its bandwidth is extremely high, why not write to the GEM buffer directly? why deferred I/O pay off? > Best regards > Thomas > >> -Daniel >
Hi, On 2023/4/20 00:31, Daniel Vetter wrote: > On Thu, Apr 20, 2023 at 12:00:41AM +0800, Sui Jingfeng wrote: >> Hi, >> >> Sorry about reply to you so late, >> >> our downstream (product kernel side) userspace GPU/DC driver >> >> has been tested out a few bugs, I'm asking to fulfill my duty to that part >> all days. >> >> I may slow to reply, but I really love to reply. >> >> >> On 2023/4/19 23:09, Daniel Vetter wrote: >>> On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: >>>> Hi, >>>> >>>> On 2023/4/19 01:52, Sui Jingfeng wrote: >>>>> Hi, >>>>> >>>>> On 2023/4/18 16:32, Daniel Vetter wrote: >>>>>> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >>>>>>> The fbdev test of IGT may write after EOF, which lead to out-of-bound >>>>>>> access for the drm drivers using fbdev-generic. For example, on a x86 >>>>>>> + aspeed bmc card platform, with a 1680x1050 resolution display, >>>>>>> running >>>>>>> fbdev test if IGT will cause the linux kernel hang with the following >>>>>>> call trace: >>>>>>> >>>>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>>>> [IGT] fbdev: starting subtest eof >>>>>>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >>>>>>> [IGT] fbdev: starting subtest nullptr >>>>>>> >>>>>>> RIP: 0010:memcpy_erms+0xa/0x20 >>>>>>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >>>>>>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0 >>>>>>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000 >>>>>>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0 >>>>>>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80 >>>>>>> R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30 >>>>>>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) >>>>>>> knlGS:0000000000000000 >>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>>>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0 >>>>>>> Call Trace: >>>>>>> <TASK> >>>>>>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper] >>>>>>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >>>>>>> process_one_work+0x21f/0x430 >>>>>>> worker_thread+0x4e/0x3c0 >>>>>>> ? __pfx_worker_thread+0x10/0x10 >>>>>>> kthread+0xf4/0x120 >>>>>>> ? __pfx_kthread+0x10/0x10 >>>>>>> ret_from_fork+0x2c/0x50 >>>>>>> </TASK> >>>>>>> CR2: ffffa17d40e0b000 >>>>>>> ---[ end trace 0000000000000000 ]--- >>>>>>> >>>>>>> The direct reason is that damage rectange computed by >>>>>>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be >>>>>>> in-bound. >>>>>>> It is already results in workaround code populate to elsewhere. Another >>>>>>> reason is that exposing a larger buffer size than the actual needed >>>>>>> help >>>>>>> to trigger this bug intrinsic in drm_fb_helper_memory_range_to_clip(). >>>>>>> >>>>>>> Others fbdev emulation solutions write to the GEM buffer directly, they >>>>>>> won't reproduce this bug because the .fb_dirty function callback do not >>>>>>> being hooked, so no chance is given to >>>>>>> drm_fb_helper_memory_range_to_clip() >>>>>>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >>>>>>> >>>>>>> This patch break the trigger condition of this bug by shrinking the >>>>>>> shadow >>>>>>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >>>>>>> >>>>>>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of >>>>>>> GEM >>>>>>> buffer")' >>>>>>> >>>>>>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >>>>>>> --- >>>>>>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> index 8e5148bf40bb..b057cfbba938 100644 >>>>>>> --- a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> @@ -94,7 +94,7 @@ static int >>>>>>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >>>>>>> fb_helper->buffer = buffer; >>>>>>> fb_helper->fb = buffer->fb; >>>>>>> - screen_size = buffer->gem->size; >>>>>>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; >>>>>> So I read core some more and stumbled over drm_fb_helper_deferred_io(). >>>>>> Which has all the code and comments about this, including limiting. >>>>>> >>>>>> I think it would be clearer if we fix the issue there, instead of >>>>>> passing >>>>>> limits around in obscure places that then again get broken? >>>>> No, it is more obscure doing that way... >>>>> >>>>> >>>>> As the size of the shadow screen buffer will be exposed to userspace. >>>>> >>>>> The size 'helper->fb->height * helper->fb->pitches[0]' is a >>>>> exactly(best) fit, >>>>> >>>>> You are guaranteed to waste at lease one byte by increasing one byte, >>>>> >>>>> and can not store all pixels by decreasing one byte (In the case where >>>>> `helper->fb->pitches[0] = helper->fb->width * 4`). >>>>> >>>>> It implicitly tell the userspace do not go beyond that boundary. >>>>> >>>>> although userspace program can still choose to write after EOF, >>>>> >>>>> But it is for test purpose, to test the kernel if it can return a >>>>> -EFBIG or not. >>>>> >>>>>> The thing is, >>>>>> Thomas both authored the limit checks in drm_fb_helper_deferred_io() and >>>>>> the patch which broken them again, so clearly this isn't very >>>>>> obvious. I'm >>>>>> thinking of something like this: >>>>>> >>>>>> >>>>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c >>>>>> b/drivers/gpu/drm/drm_fb_helper.c >>>>>> index ef4eb8b12766..726dab67c359 100644 >>>>>> --- a/drivers/gpu/drm/drm_fb_helper.c >>>>>> +++ b/drivers/gpu/drm/drm_fb_helper.c >>>>>> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info >>>>>> *info, struct list_head *pagerefli >>>>>> * of the screen and account for non-existing scanlines. Hence, >>>>>> * keep the covered memory area within the screen buffer. >>>>>> */ >>>>>> - if (info->screen_size) >>>>>> - total_size = info->screen_size; >>>>>> - else >>>>>> - total_size = info->fix.smem_len; >>>>>> + total_size = helper->fb->height * helper->fb->pitches[0]; >>>>> This is just to mitigate the mistakes already has been made, >>>>> >>>>> because it do not do a good splitting between the *clip* part and the >>>>> *damage update* part. >>>>> >>>>> An ideal clipping do not obscure its updating backend with a >>>>> out-of-bound damage rectangle. >>>>> >>>>> Why did the drm_fb_helper_memory_range_to_clip() can not do a good job >>>>> in all case >>>>> >>>>> to pass its backend a always meaningful damage rect ? >>>>> >>>>>> max_off = min(max_off, total_size); >>>>>> if (min_off < max_off) { >>>>>> >>>>>> >>>>>> I think that would make it utmost clear on what we're doing and why. >>>>>> Otherwise we're just going to re-create the same bug again, like we've >>>>>> done already :-) >>>>> No, we create no bugs, we fix one. >>>>> >>>>> Thanks. >>>>> >>>> But honestly I do not have strong feel toward this, I just type what I'm >>>> understand without seeing you resend a V3. >>>> >>>> It's OK in overall, I will help to test this tomorrow. :-) >>> Apologies for making you jump around all the time and doing different >>> versions of the same bugfix :-/ >> No, I do not mind. I'm wondering if you are testing me. > Nah I'm really not any clearer on this than you :-/ > >>> I think this one here is ok to merge, >> NO, to be honest, this version is not ok. >> >> I have just tested it on LoongArch . It does not prevent out-of-bound on >> LoongArch. >> >> bellow is the call trace when running the fbdev test of IGT. >> >> >> [ 369.628841] Console: switching to colour dummy device 80x25 >> [ 369.634440] [IGT] fbdev: executing >> [ 369.654684] [IGT] fbdev: starting subtest info >> [ 369.659173] [IGT] fbdev: starting subtest pan >> [ 369.722093] [IGT] fbdev: starting subtest read >> [ 369.737795] [IGT] fbdev: starting subtest unaligned-read >> [ 369.745695] [IGT] fbdev: starting subtest write >> [ 369.753154] CPU 3 Unable to handle kernel paging request at virtual address ffff800034bd0000, era == 9000000000223d5c, ra == ffff8000020d75a8 >> [ 369.774570] [IGT] fbdev: starting subtest unaligned-write >> [ 369.779960] Oops[#1]: >> [ 369.782215] CPU: 3 PID: 504 Comm: kworker/3:3 Not tainted 6.3.0-rc5+ #377 >> [ 369.782219] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-EVB/Loongson-LS3A5000-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05383-beta10 1 >> [ 369.782223] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >> [ 369.802706] $ 0 : 0000000000000000 ffff8000020d75a8 90000001367fc000 90000001367ffcf0 >> [ 369.809368] $ 4 : ffff80003a3ec000 ffff800034bcee00 0000000000001e00 9000000001160000 >> [ 369.825275] $ 8 : ffff80003a3ebfff 0000000000000040 0000000000000000 ffff80000214f20c >> [ 369.825280] $12 : ffff80003a3ecc00 ffff800034bd0000 0000000000000000 0000000000001e00 >> [ 369.841188] $16 : 9000000107490800 9000000001a24000 9000000001a28000 00000e00207ec000 >> [ 369.849152] $20 : fffffffffe000000 900000010684cec0 0000000000000439 90000001367ffd90 >> [ 369.849165] $24 : 9000000106bc5200 ffff8000020de000 ffff800034bcee00 0000000000001e00 >> [ 369.865072] $28 : 90000001076d4400 0000000000000001 900000010553c000 ffff80003a3eae00 >> [ 369.865078] era : 9000000000223d5c __memcpy_toio+0x4c/0x90 >> [ 369.878651] ra : ffff8000020d75a8 drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] >> [ 369.878762] [IGT] fbdev: starting subtest eof >> [ 369.887679] CSR crmd: 000000b0 >> [ 369.887680] CSR prmd: 00000004 >> [ 369.887682] CSR euen: 00000000 >> [ 369.887683] CSR ecfg: 00071c1c >> [ 369.901340] CSR estat: 00010000 >> [ 369.901342] ExcCode : 1 (SubCode 0) >> [ 369.901344] BadVA : ffff800034bd0000 >> [ 369.914659] PrId : 0014c011 (Loongson-64bit) >> [ 369.914660] Modules linked in: uas usb_storage snd_seq_dummy snd_seq snd_seq_device ip_set rfkill nf_tables nfnetlink vfat fat loongson ttm acpi_ipmi drm_kms_helper syscopyarea sysfillrect ipmi_si ipmi_devintf sysimgblt ipmi_msghandler fuse efivarfs >> [ 369.919013] Process kworker/3:3 (pid: 504, threadinfo=00000000a1234af0, task=000000004e2cde6f) >> [ 369.949519] Stack : 9000000106bc5318 900000010684cec0 9000000107afd470 ffff800039c00000 >> [ 369.949535] [IGT] fbdev: starting subtest nullptr >> [ 369.957476] 0000000000000001 0000000000000000 0000000000000000 76e0ff420f8eaeab >> [ 369.957481] 9000000001712000 fffffffffffffffb 9000000100161080 0000000000000003 >> [ 369.970105] 900000000980ba05 900000000980ba00 9000000106bc5200 9000000106bc52d4 >> [ 369.978062] ffff8000020de000 9000000106bc52d8 9000000110664b40 ffff8000020d78e4 >> [ 369.986018] 043a078000000000 76e0ff420f8eaeab 900000000980be00 9000000009807400 >> [ 370.001926] 0000000000000000 9000000110664b40 9000000106bc52d8 9000000000256dbc >> [ 370.001931] 900000000170e000 9000000107afce00 0000000000000001 900000000170e000 >> [ 370.009888] 9000000009807428 9000000110664b70 9000000009807400 900000000025737c >> [ 370.025797] 9000000100161080 9000000001148080 9000000107afce00 0000000000000003 >> [ 370.026543] [IGT] fbdev: exiting, ret=0 >> [ 370.033753] ... >> [ 370.033756] Call Trace: >> [ 370.033757] [<9000000000223d5c>] __memcpy_toio+0x4c/0x90 >> [ 370.047681] [<ffff8000020d75a8>] drm_fbdev_generic_helper_fb_dirty+0x1cc/0x300 [drm_kms_helper] >> [ 370.056354] [<ffff8000020d78e4>] drm_fb_helper_damage_work+0xa4/0x1d0 [drm_kms_helper] >> [ 370.056381] [<9000000000256dbc>] process_one_work+0x1ec/0x35c >> [ 370.056385] [<900000000025737c>] worker_thread+0x88/0x428 >> [ 370.056387] [<900000000025f4bc>] kthread+0x114/0x120 >> [ 370.056392] [<90000000002215a8>] ret_from_kernel_thread+0xc/0xa4 >> [ 370.056395] >> [ 370.056396] Code: 00410def 0010bc8c 001500ad <260001ae> 02c02084 02c021ad 29ffe08e 5ffff184 03401cc6 >> [ 370.056406] >> [ 370.056421] fbcon_init: detected unhandled fb_set_par error, error code -16 >> [ 370.056482] ---[ end trace 0000000000000000 ]--- >> [ 370.066620] Console: switching to colour frame buffer device 240x67 >> >> >>> I just thought when looking at >>> the history that we revert the exact patch without any other changes >>> or comments, >> Other part of that patch(except this line) may still useful, at least for >> cleanup purpose. >>> and usually that means someone will come up with the same >>> cleanup idea again, and then we'll have a bug again. So maybe a >>> comment or a WARN_ON or something else would be good. >> A WARN_ON is acceptable. >> >>> I guess we could also do your patch, but put a WARN_ON that the >>> computed total_size is never bigger than the drm_fb size into >>> drm_fb_helper_deferred_io()? That would also make sure that this bug >>> doesn't get resurrected again. >> Best to merge V2 [1] of this series, that is what I am really fixed. >> >> Maybe somebody can help to refine it, to add a better description about this >> question and so on. >> >> [1] https://patchwork.freedesktop.org/patch/532143/?series=116454&rev=1 > Ok, I guess this really is the safest one. For that patch, do we need the > change to screen_size in drm_fbdev_generic_helper_fb_probe()? I'm still > not entirely clear. If it works without that change I think that's clearer > for a minimal bugfix, if so can you send that out as v4 please? After those interesting discussion, I think I'm understand this question more thoroughly. See V5 directly please, its commit message description is more clear. hoping that not feel embarrassing about my broken english. I left the screen_size unchanged, it is already page size aligned. It should be page size aligned or not still deserve thinking. > Also please Cc: Geert on whatever you're resubmitting, so he can test too > and we can make sure it's still fixing the shmob issue he's seeing. > > And finally please include a link to this discussion here with a note that > just reverting the screen_size changes is not enough: > > https://lore.kernel.org/dri-devel/ad44df29-3241-0d9e-e708-b0338bf3c623@189.cn/ Done. > Thanks a lot! > -Daniel
Hi Am 19.04.23 um 20:30 schrieb Sui Jingfeng: > Hi, > > On 2023/4/19 23:46, Thomas Zimmermann wrote: >> Hi >> >> Am 19.04.23 um 17:09 schrieb Daniel Vetter: >>> On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273260@189.cn> wrote: >>>> >>>> Hi, >>>> >>>> On 2023/4/19 01:52, Sui Jingfeng wrote: >>>>> Hi, >>>>> >>>>> On 2023/4/18 16:32, Daniel Vetter wrote: >>>>>> On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote: >>>>>>> The fbdev test of IGT may write after EOF, which lead to >>>>>>> out-of-bound >>>>>>> access for the drm drivers using fbdev-generic. For example, on a >>>>>>> x86 >>>>>>> + aspeed bmc card platform, with a 1680x1050 resolution display, >>>>>>> running >>>>>>> fbdev test if IGT will cause the linux kernel hang with the >>>>>>> following >>>>>>> call trace: >>>>>>> >>>>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>>>> [IGT] fbdev: starting subtest eof >>>>>>> Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] >>>>>>> [IGT] fbdev: starting subtest nullptr >>>>>>> >>>>>>> RIP: 0010:memcpy_erms+0xa/0x20 >>>>>>> RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246 >>>>>>> RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: >>>>>>> 00000000000014c0 >>>>>>> RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: >>>>>>> ffffa17d4eb80000 >>>>>>> RBP: ffffa17d40167e20 R08: 0000000000000000 R09: >>>>>>> ffff89522ecff8c0 >>>>>>> R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: >>>>>>> ffffa17d4eb7fa80 >>>>>>> R13: 0000000000001a40 R14: 000000000000041a R15: >>>>>>> ffffa17d40167e30 >>>>>>> FS: 0000000000000000(0000) GS:ffff895257380000(0000) >>>>>>> knlGS:0000000000000000 >>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>>>> CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: >>>>>>> 00000000001706e0 >>>>>>> Call Trace: >>>>>>> <TASK> >>>>>>> ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 >>>>>>> [drm_kms_helper] >>>>>>> drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper] >>>>>>> process_one_work+0x21f/0x430 >>>>>>> worker_thread+0x4e/0x3c0 >>>>>>> ? __pfx_worker_thread+0x10/0x10 >>>>>>> kthread+0xf4/0x120 >>>>>>> ? __pfx_kthread+0x10/0x10 >>>>>>> ret_from_fork+0x2c/0x50 >>>>>>> </TASK> >>>>>>> CR2: ffffa17d40e0b000 >>>>>>> ---[ end trace 0000000000000000 ]--- >>>>>>> >>>>>>> The direct reason is that damage rectange computed by >>>>>>> drm_fb_helper_memory_range_to_clip() does not guaranteed to be >>>>>>> in-bound. >>>>>>> It is already results in workaround code populate to elsewhere. >>>>>>> Another >>>>>>> reason is that exposing a larger buffer size than the actual needed >>>>>>> help >>>>>>> to trigger this bug intrinsic in >>>>>>> drm_fb_helper_memory_range_to_clip(). >>>>>>> >>>>>>> Others fbdev emulation solutions write to the GEM buffer >>>>>>> directly, they >>>>>>> won't reproduce this bug because the .fb_dirty function callback >>>>>>> do not >>>>>>> being hooked, so no chance is given to >>>>>>> drm_fb_helper_memory_range_to_clip() >>>>>>> to generate a out-of-bound when drm_fb_helper_sys_write() is called. >>>>>>> >>>>>>> This patch break the trigger condition of this bug by shrinking the >>>>>>> shadow >>>>>>> buffer size to sizes->surface_height * buffer->fb->pitches[0]. >>>>>>> >>>>>>> Fixes: '8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of >>>>>>> GEM >>>>>>> buffer")' >>>>>>> >>>>>>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> >>>>>>> --- >>>>>>> drivers/gpu/drm/drm_fbdev_generic.c | 2 +- >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> index 8e5148bf40bb..b057cfbba938 100644 >>>>>>> --- a/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c >>>>>>> @@ -94,7 +94,7 @@ static int >>>>>>> drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, >>>>>>> fb_helper->buffer = buffer; >>>>>>> fb_helper->fb = buffer->fb; >>>>>>> - screen_size = buffer->gem->size; >>>>>>> + screen_size = sizes->surface_height * buffer->fb->pitches[0]; >>>>>> So I read core some more and stumbled over >>>>>> drm_fb_helper_deferred_io(). >>>>>> Which has all the code and comments about this, including limiting. >>>>>> >>>>>> I think it would be clearer if we fix the issue there, instead of >>>>>> passing >>>>>> limits around in obscure places that then again get broken? >>>>> >>>>> No, it is more obscure doing that way... >>>>> >>>>> >>>>> As the size of the shadow screen buffer will be exposed to userspace. >>>>> >>>>> The size 'helper->fb->height * helper->fb->pitches[0]' is a >>>>> exactly(best) fit, >>>>> >>>>> You are guaranteed to waste at lease one byte by increasing one byte, >>>>> >>>>> and can not store all pixels by decreasing one byte (In the case where >>>>> `helper->fb->pitches[0] = helper->fb->width * 4`). >>>>> >>>>> It implicitly tell the userspace do not go beyond that boundary. >>>>> >>>>> although userspace program can still choose to write after EOF, >>>>> >>>>> But it is for test purpose, to test the kernel if it can return a >>>>> -EFBIG or not. >>>>> >>>>>> The thing is, >>>>>> Thomas both authored the limit checks in >>>>>> drm_fb_helper_deferred_io() and >>>>>> the patch which broken them again, so clearly this isn't very >>>>>> obvious. I'm >>>>>> thinking of something like this: >>>>>> >>>>>> >>>>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c >>>>>> b/drivers/gpu/drm/drm_fb_helper.c >>>>>> index ef4eb8b12766..726dab67c359 100644 >>>>>> --- a/drivers/gpu/drm/drm_fb_helper.c >>>>>> +++ b/drivers/gpu/drm/drm_fb_helper.c >>>>>> @@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info >>>>>> *info, struct list_head *pagerefli >>>>>> * of the screen and account for non-existing scanlines. >>>>>> Hence, >>>>>> * keep the covered memory area within the screen buffer. >>>>>> */ >>>>>> - if (info->screen_size) >>>>>> - total_size = info->screen_size; >>>>>> - else >>>>>> - total_size = info->fix.smem_len; >>>>>> + total_size = helper->fb->height * helper->fb->pitches[0]; >>>>> >>>>> This is just to mitigate the mistakes already has been made, >>>>> >>>>> because it do not do a good splitting between the *clip* part and the >>>>> *damage update* part. >>>>> >>>>> An ideal clipping do not obscure its updating backend with a >>>>> out-of-bound damage rectangle. >>>>> >>>>> Why did the drm_fb_helper_memory_range_to_clip() can not do a good job >>>>> in all case >>>>> >>>>> to pass its backend a always meaningful damage rect ? >>>>> >>>>>> max_off = min(max_off, total_size); >>>>>> if (min_off < max_off) { >>>>>> >>>>>> >>>>>> I think that would make it utmost clear on what we're doing and why. >>>>>> Otherwise we're just going to re-create the same bug again, like >>>>>> we've >>>>>> done already :-) >>>>> >>>>> No, we create no bugs, we fix one. >>>>> >>>>> Thanks. >>>>> >>>> But honestly I do not have strong feel toward this, I just type what >>>> I'm >>>> understand without seeing you resend a V3. >>>> >>>> It's OK in overall, I will help to test this tomorrow. :-) >>> >>> Apologies for making you jump around all the time and doing different >>> versions of the same bugfix :-/ >>> >>> I think this one here is ok to merge, I just thought when looking at >>> the history that we revert the exact patch without any other changes >>> or comments, and usually that means someone will come up with the same >>> cleanup idea again, and then we'll have a bug again. So maybe a >>> comment or a WARN_ON or something else would be good. >>> >>> I guess we could also do your patch, but put a WARN_ON that the >>> computed total_size is never bigger than the drm_fb size into >>> drm_fb_helper_deferred_io()? That would also make sure that this bug >>> doesn't get resurrected again. >> >> We'd have to put this test into drm_fbdev_generic.c. Otherwise we'll >> break i915, which also uses deferred I/O, but without shadow >> buffering.. Maybe test in drm_fbdev_generic_helper_fb_dirty() if the >> clip rectangle extends the framebuffer size. >> > Yeah, i915 carve out part of system ram as video memory, it is also > called stolen memory. > > I just learned it recently from i915 related document. > > > But from what I'm understanding, It's still RAM in its nature, just > reserved by firmware. > > Its bandwidth is extremely high, why not write to the GEM buffer directly? > > why deferred I/O pay off? i915 doesn't use shadow buffering. I only flushes its caches in regular intervals: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/i915/display/intel_fb.c#L1865 . The actual clipping rectangle is not relevant. Best regards Thomas > > >> Best regards >> Thomas >> >>> -Daniel >> -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c index 8e5148bf40bb..b057cfbba938 100644 --- a/drivers/gpu/drm/drm_fbdev_generic.c +++ b/drivers/gpu/drm/drm_fbdev_generic.c @@ -94,7 +94,7 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper, fb_helper->buffer = buffer; fb_helper->fb = buffer->fb; - screen_size = buffer->gem->size; + screen_size = sizes->surface_height * buffer->fb->pitches[0]; screen_buffer = vzalloc(screen_size); if (!screen_buffer) { ret = -ENOMEM;