From patchwork Tue Sep 5 02:13:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eric Feng X-Patchwork-Id: 137471 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ab0a:0:b0:3f2:4152:657d with SMTP id m10csp1419549vqo; Mon, 4 Sep 2023 19:15:01 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFBY/llBaQR9CYgofzm0T4CpiHaJd0uaiTKi39d7Op5AIASm8dQjN31HmoH+P/kjnj6P2k5 X-Received: by 2002:a05:6512:108d:b0:500:b14a:cc63 with SMTP id j13-20020a056512108d00b00500b14acc63mr9733761lfg.12.1693880100937; Mon, 04 Sep 2023 19:15:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693880100; cv=none; d=google.com; s=arc-20160816; b=cn8FtVbJM2qF/Y44RV41drDLrQHAioZn5qRXHmMEZPA5+1R3dCMHGYpdWxn5to5oo9 3O7kgEJ7PxH8ECu5UScE8BzOj1OZNCvxAokA2EffEW2dMyeH/SFG2nRcH2iON5k8WtoG g0gva4spIVejnd3oPpqmaA/IoJ8G/RFUfO0WJWTj6Ad9V4curWe1NSsXNPvhyAdX6ntC EJsUr+xJw2hh31uhfOKQUbgwa8Q3Pzrtnq+bzHlYrHLgt+cgxA6w8gmAuZn1/w6UmXRd IBTnjuwi7CUeorNg7v/3HwVSmZCC4yhGncy8+4kBCyZusFq/P0eBiAMo/6MkqpasZBqz Hfhw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=w0NglmNUATyldKsCVn65Wm73ApkgRTDn0jPA7G9dw3Y=; fh=gxFngV3OvQ81rR5RIjigetJQt2loWvRVyayVFZ0KEk4=; b=IlApD2G/BcSKBGYSTqAI0RoirS7gIpF7io6ofH8573Wclxn/zwOU6tW+MV0e7n/mNR IDmwXkEbNWURY7elFJROueGYEL63eeK67Uz4UO/8kWFOsgamF1gzmiXQWSROkD9P2W4A WCXVrFo7dO1/ot9oKy09+H1lumjyviU/AydwEB9xeLDJIKSTKNAm8iL51o7nBolb/Wpi xPCjQSTVmzfGTMb3e5vc4NJ1uxEm28uGti1CBOx6XVTzBODRIucefywC4Udd0UjMWC3p XHBCRMgBVdmTdu2B5O/0rXixTSiuIKqj6KGkp/EOb6x4shrIj7pDRqjPJZc35TU/kbKA t6Yw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=Tr+JWXrh; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id u21-20020a05640207d500b0052563912c39si6834390edy.125.2023.09.04.19.15.00 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Sep 2023 19:15:00 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=Tr+JWXrh; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 89B853856DFB for ; Tue, 5 Sep 2023 02:14:37 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 89B853856DFB DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693880077; bh=w0NglmNUATyldKsCVn65Wm73ApkgRTDn0jPA7G9dw3Y=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=Tr+JWXrhTn7otVRTu39BzcnSCJ8rcgZv59KaaNyOs6hLbO5ICcy3YP4J8pDLQ5aS5 UaMh85swW7ZFTI5FjUK8TQaEDBjqQjvLxxDl0vdE4xqgJwHqgl13E4VQyFn0JalFOW HUrQP+149T7aRR+uH/0ca8FHDVnXbXgy/Yn/WgVc= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-00364e01.pphosted.com (mx0a-00364e01.pphosted.com [148.163.135.74]) by sourceware.org (Postfix) with ESMTPS id C1CD43858D32 for ; Tue, 5 Sep 2023 02:13:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C1CD43858D32 Received: from pps.filterd (m0167070.ppops.net [127.0.0.1]) by mx0a-00364e01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3850vOoF022234 for ; Mon, 4 Sep 2023 22:13:47 -0400 Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) by mx0a-00364e01.pphosted.com (PPS) with ESMTPS id 3sv143qtks-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Mon, 04 Sep 2023 22:13:47 -0400 Received: by mail-qk1-f200.google.com with SMTP id af79cd13be357-76f1e2d6ea4so217809185a.1 for ; Mon, 04 Sep 2023 19:13:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693880026; x=1694484826; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=w0NglmNUATyldKsCVn65Wm73ApkgRTDn0jPA7G9dw3Y=; b=K4NpdVCDdt6Ts5oN09qLFR9k/TXUkEfpd0XY36kQUB5pzIvQ8CkcO3Jaloia2HltFB 4fYSZMPaTjL2TiCgfA1fukovVR75hpy1yaCUOiKTPf7Dz2YJB3OajJQIT+UzGFw9cCvt mxb928mkQ8ZhxTGASEykyHiJgb3LwBjn2y+1nw/wzxBdaQDjwQQ+3UIrxLg3Xm6O+7pj A2kMkHIoMNTEuzHZKNZOvTOjnmi+H4abQneIxG/9TenzQ2OdKMX5+Bgh783SLr1iPfis a5YuuRtaNEdeLkj73hTKioxCPPtAI+v5VyN9iaB3iV3DeBIqgIcMzNAdJhAmoQikTkfP 7vWQ== X-Gm-Message-State: AOJu0Yw3Jfxpdq3PNaIXZVA95B6ghQuOM06uWJ/4LLPwR4uoEY2bHns3 XHMQKc2z/4dOwb/B0VDVoH8PdblV+nIoQ44+pv8UvJuWYp3RvhySYmoMp0Stcu5UyvOpp3yN/JA 9niHLV8xYOYLRq/8= X-Received: by 2002:a05:620a:44d2:b0:76f:135d:ce08 with SMTP id y18-20020a05620a44d200b0076f135dce08mr15071480qkp.57.1693880025848; Mon, 04 Sep 2023 19:13:45 -0700 (PDT) X-Received: by 2002:a05:620a:44d2:b0:76f:135d:ce08 with SMTP id y18-20020a05620a44d200b0076f135dce08mr15071469qkp.57.1693880025544; Mon, 04 Sep 2023 19:13:45 -0700 (PDT) Received: from Ericcs-MBP.cable.rcn.com (207-38-164-63.s9254.c3-0.43d-cbr1.qens-43d.ny.cable.rcncustomer.com. [207.38.164.63]) by smtp.gmail.com with ESMTPSA id m6-20020a0c9d06000000b0063d643643b0sm4088886qvf.38.2023.09.04.19.13.45 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Mon, 04 Sep 2023 19:13:45 -0700 (PDT) To: dmalcolm@redhat.com Cc: gcc@gcc.gnu.org, gcc-patches@gcc.gnu.org, Eric Feng Subject: [PATCH] analyzer: implement symbolic value support for CPython plugin's refcnt checker [PR107646] Date: Mon, 4 Sep 2023 22:13:34 -0400 Message-Id: <20230905021334.39124-1-ef2648@columbia.edu> X-Mailer: git-send-email 2.32.0 (Apple Git-132) In-Reply-To: References: MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: LuK7kKU3ffqEpCaOyv2BlqCnqd1ikf7Y X-Proofpoint-GUID: LuK7kKU3ffqEpCaOyv2BlqCnqd1ikf7Y X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.957,Hydra:6.0.601,FMLib:17.11.176.26 definitions=2023-09-05_01,2023-08-31_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=10 mlxscore=0 lowpriorityscore=10 phishscore=0 impostorscore=10 clxscore=1011 suspectscore=0 mlxlogscore=605 spamscore=0 priorityscore=1501 adultscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2308100000 definitions=main-2309050017 X-Spam-Status: No, score=-14.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Eric Feng via Gcc-patches From: Eric Feng Reply-To: Eric Feng Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1776162020881462443 X-GMAIL-MSGID: 1776162020881462443 Hi Dave, Recently I've been working on symbolic value support for the reference count checker. I've attached a patch for it below; let me know it looks OK for trunk. Thanks! Best, Eric --- This patch enhances the reference count checker in the CPython plugin by adding support for symbolic values. Whereas previously we were only able to check the reference count of PyObject* objects created in the scope of the function; we are now able to emit diagnostics on reference count mismatch of objects that were, for example, passed in as a function parameter. rc6.c:6:10: warning: expected ‘obj’ to have reference count: N + ‘1’ but ob_refcnt field is N + ‘2’ 6 | return obj; | ^~~ ‘create_py_object2’: event 1 | | 6 | return obj; | | ^~~ | | | | | (1) here | gcc/testsuite/ChangeLog: PR analyzer/107646 * gcc.dg/plugin/analyzer_cpython_plugin.c: Support reference count checking of symbolic values. * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: New test. * gcc.dg/plugin/plugin.exp: New test. * gcc.dg/plugin/cpython-plugin-test-refcnt.c: New test. Signed-off-by: Eric Feng --- .../gcc.dg/plugin/analyzer_cpython_plugin.c | 133 +++++++++++------- .../cpython-plugin-test-PyList_Append.c | 21 ++- .../plugin/cpython-plugin-test-refcnt.c | 18 +++ gcc/testsuite/gcc.dg/plugin/plugin.exp | 1 + 4 files changed, 118 insertions(+), 55 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt.c diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c index bf1982e79c3..d7ecd7fce09 100644 --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c @@ -314,17 +314,20 @@ public: { diagnostic_metadata m; bool warned; - // just assuming constants for now - auto actual_refcnt - = m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant (); - auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant (); - warned = warning_meta (rich_loc, m, get_controlling_option (), - "expected %qE to have " - "reference count: %qE but ob_refcnt field is: %qE", - m_reg_tree, actual_refcnt, ob_refcnt); - - // location_t loc = rich_loc->get_loc (); - // foo (loc); + + const auto *actual_refcnt_constant + = m_actual_refcnt->dyn_cast_constant_svalue (); + const auto *ob_refcnt_constant = m_ob_refcnt->dyn_cast_constant_svalue (); + if (!actual_refcnt_constant || !ob_refcnt_constant) + return false; + + auto actual_refcnt = actual_refcnt_constant->get_constant (); + auto ob_refcnt = ob_refcnt_constant->get_constant (); + warned = warning_meta ( + rich_loc, m, get_controlling_option (), + "expected %qE to have " + "reference count: N + %qE but ob_refcnt field is N + %qE", + m_reg_tree, actual_refcnt, ob_refcnt); return warned; } @@ -336,10 +339,6 @@ public: private: - void foo(location_t loc) const - { - inform(loc, "something is up right here"); - } const region *m_base_region; const svalue *m_ob_refcnt; const svalue *m_actual_refcnt; @@ -369,6 +368,19 @@ increment_region_refcnt (hash_map &map, const region *key) refcnt = existed ? refcnt + 1 : 1; } +static const region * +get_region_from_svalue (const svalue *sval, region_model_manager *mgr) +{ + const auto *region_sval = sval->dyn_cast_region_svalue (); + if (region_sval) + return region_sval->get_pointee (); + + const auto *initial_sval = sval->dyn_cast_initial_svalue (); + if (initial_sval) + return mgr->get_symbolic_region (initial_sval); + + return nullptr; +} /* Recursively fills in region_to_refcnt with the references owned by pyobj_ptr_sval. */ @@ -381,20 +393,9 @@ count_pyobj_references (const region_model *model, if (!pyobj_ptr_sval) return; - const auto *pyobj_region_sval = pyobj_ptr_sval->dyn_cast_region_svalue (); - const auto *pyobj_initial_sval = pyobj_ptr_sval->dyn_cast_initial_svalue (); - if (!pyobj_region_sval && !pyobj_initial_sval) - return; - - // todo: support initial sval (e.g passed in as parameter) - if (pyobj_initial_sval) - { - // increment_region_refcnt (region_to_refcnt, - // pyobj_initial_sval->get_region ()); - return; - } + region_model_manager *mgr = model->get_manager (); - const region *pyobj_region = pyobj_region_sval->get_pointee (); + const region *pyobj_region = get_region_from_svalue (pyobj_ptr_sval, mgr); if (!pyobj_region || seen.contains (pyobj_region)) return; @@ -409,49 +410,75 @@ count_pyobj_references (const region_model *model, return; const auto &retval_binding_map = retval_cluster->get_map (); - for (const auto &binding : retval_binding_map) { - const svalue *binding_sval = binding.second; - const svalue *unwrapped_sval = binding_sval->unwrap_any_unmergeable (); - const region *pointee = unwrapped_sval->maybe_get_region (); - - if (pointee && pointee->get_kind () == RK_HEAP_ALLOCATED) + const svalue *binding_sval = binding.second->unwrap_any_unmergeable (); + if (get_region_from_svalue (binding_sval, mgr)) count_pyobj_references (model, region_to_refcnt, binding_sval, seen); } } +static void +unwrap_any_ob_refcnt_sval (const svalue *&ob_refcnt_sval) +{ + if (ob_refcnt_sval->get_kind () != SK_CONSTANT) + { + auto unwrap_cast = ob_refcnt_sval->maybe_undo_cast (); + if (!unwrap_cast) + unwrap_cast = ob_refcnt_sval; + + if (unwrap_cast->get_kind () == SK_BINOP) + ob_refcnt_sval = unwrap_cast->dyn_cast_binop_svalue ()->get_arg1 (); + } +} + +static void +handle_refcnt_mismatch (const region_model *old_model, + const ana::region *curr_region, + const svalue *ob_refcnt_sval, + const svalue *actual_refcnt_sval, + region_model_context *ctxt) +{ + region_model_manager *mgr = old_model->get_manager (); + const svalue *curr_reg_sval + = mgr->get_ptr_svalue (pyobj_ptr_tree, curr_region); + tree reg_tree = old_model->get_representative_tree (curr_reg_sval); + + if (!reg_tree) + return; + + const auto &eg = ctxt->get_eg (); + refcnt_stmt_finder finder (*eg, reg_tree); + auto pd = make_unique (curr_region, ob_refcnt_sval, + actual_refcnt_sval, reg_tree); + if (pd && eg) + ctxt->warn (std::move (pd), &finder); +} + /* Compare ob_refcnt field vs the actual reference count of a region */ static void check_refcnt (const region_model *model, const region_model *old_model, region_model_context *ctxt, const hash_map::iterator::reference_pair region_refcnt) + int>::iterator::reference_pair ®ion_refcnt) { region_model_manager *mgr = model->get_manager (); const auto &curr_region = region_refcnt.first; const auto &actual_refcnt = region_refcnt.second; + const svalue *ob_refcnt_sval = retrieve_ob_refcnt_sval (curr_region, model, ctxt); + if (!ob_refcnt_sval) + return; + + unwrap_any_ob_refcnt_sval (ob_refcnt_sval); + const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst ( ob_refcnt_sval->get_type (), actual_refcnt); - if (ob_refcnt_sval != actual_refcnt_sval) - { - const svalue *curr_reg_sval - = mgr->get_ptr_svalue (pyobj_ptr_tree, curr_region); - tree reg_tree = old_model->get_representative_tree (curr_reg_sval); - if (!reg_tree) - return; - - const auto &eg = ctxt->get_eg (); - refcnt_stmt_finder finder (*eg, reg_tree); - auto pd = make_unique (curr_region, ob_refcnt_sval, - actual_refcnt_sval, reg_tree); - if (pd && eg) - ctxt->warn (std::move (pd), &finder); - } + handle_refcnt_mismatch (old_model, curr_region, ob_refcnt_sval, + actual_refcnt_sval, ctxt); } static void @@ -493,8 +520,6 @@ count_all_references (const region_model *model, for (const auto &cluster : *model->get_store ()) { auto curr_region = cluster.first; - if (curr_region->get_kind () != RK_HEAP_ALLOCATED) - continue; increment_region_refcnt (region_to_refcnt, curr_region); @@ -505,8 +530,8 @@ count_all_references (const region_model *model, const svalue *unwrapped_sval = binding_sval->unwrap_any_unmergeable (); - // if (unwrapped_sval->get_type () != pyobj_ptr_tree) - // continue; + if (unwrapped_sval->get_type () != pyobj_ptr_tree) + continue; const region *pointee = unwrapped_sval->maybe_get_region (); if (!pointee || pointee->get_kind () != RK_HEAP_ALLOCATED) diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c index e1efd9efda5..46daf2f8975 100644 --- a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c @@ -75,4 +75,23 @@ test_PyListAppend_6 () PyObject *list = NULL; PyList_Append(list, item); return list; -} \ No newline at end of file +} + +PyObject * +test_PyListAppend_7 (PyObject *item) +{ + PyObject *list = PyList_New (0); + Py_INCREF(item); + PyList_Append(list, item); + return list; + /* { dg-warning "expected 'item' to have reference count" "" { target *-*-* } .-1 } */ +} + +PyObject * +test_PyListAppend_8 (PyObject *item, PyObject *list) +{ + Py_INCREF(item); + Py_INCREF(item); + PyList_Append(list, item); + return list; +} diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt.c new file mode 100644 index 00000000000..a7f39509d6d --- /dev/null +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target analyzer } */ +/* { dg-options "-fanalyzer" } */ +/* { dg-require-python-h "" } */ + + +#define PY_SSIZE_T_CLEAN +#include +#include "../analyzer/analyzer-decls.h" + +PyObject * +test_refcnt_1 (PyObject *obj) +{ + Py_INCREF(obj); + Py_INCREF(obj); + return obj; + /* { dg-warning "expected 'obj' to have reference count" "" { target *-*-* } .-1 } */ +} diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp index ed72912309c..87862b4ca00 100644 --- a/gcc/testsuite/gcc.dg/plugin/plugin.exp +++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp @@ -162,6 +162,7 @@ set plugin_test_list [list \ taint-antipatterns-1.c } \ { analyzer_cpython_plugin.c \ cpython-plugin-test-no-Python-h.c \ + cpython-plugin-test-refcnt.c \ cpython-plugin-test-PyList_Append.c \ cpython-plugin-test-PyList_New.c \ cpython-plugin-test-PyLong_FromLong.c } \