From patchwork Mon Oct 23 19:21:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157081 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1510919vqx; Mon, 23 Oct 2023 12:36:40 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFKO++3vN0HloQ35kfSlfo4H/ZvjQBASbPsLkAHEk4GPiWTZFXfvBGniaVVAdz8bqhDRSum X-Received: by 2002:a17:90b:1490:b0:27d:e1c:5345 with SMTP id js16-20020a17090b149000b0027d0e1c5345mr10226040pjb.15.1698089800319; Mon, 23 Oct 2023 12:36:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089800; cv=none; d=google.com; s=arc-20160816; b=em/lgceRc6Oee8Hl0WYBZxmQLZEV/a6hCA7V/ErNIheFEVidsdY5WwP4kPlRqKrW3K DZlK9Ky2cLWuSgZ66UCWPbbdahyrwgSiiyd0t4KgBQy/uJUp6avGaNBpEbBcjNmvKTAA ue8yzEBvQClbs6BPOuIbhleJCRZUPyG2IqJykxYl8s0d0dzvrQTcdKtUA1h5hcnva+8z Rx82lOB+2q3GH9Aq3FPgTzZzNwdmZDXC37hMIbCxmKj7hOlDAVZgA7ecBtPyRw8sxQyP wu/hrpvBKNyKQ4fk8GPbmSwfV9poMhFsz21OqQSKagposN8F1Wp0zk6vvQypg46w4AB8 9sVQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=HuG0d8CkorkH1aAZ8NN+T0MJXlWDSn9YYHrGTo8lBxA=; fh=NnmJ8AOnlmXs+UlGry6dQ1shMzdd9RS6An44MzU3b8I=; b=kvJKoXNNMFFpgrkB2Q1YRcOYcVifqJ5O6KAaWFpiWYHdULW7BGlLtGLZIknyY1mIU6 RXmT/C0MIKOUYm5QRgTh03ziGdoBEEBMsSM7XeLbE+UwRDVEcqAEK4N56iMQmqDuFr0f UX2UpKDcAOF+ZzCIedOLu4To86u+LUFsaRakxq7jUQOEGWQVGc+BD7KR8sD5PDuCKC6Q uWu8BzG2KJPkb4B23cHW3VsjCeCfnbYZCHIj4/jk2mIdTmwWmNma51dTxiLhavZyFjNK JtAkJS78vMjVdQxbe3XEDZooLVbxktzA0EWunxbFxDnAxncASbfB3jq19jg+MpYe34b8 0wQw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="NhgexVl/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from morse.vger.email (morse.vger.email. [23.128.96.31]) by mx.google.com with ESMTPS id kb1-20020a17090ae7c100b0027761a3a4b0si7415624pjb.0.2023.10.23.12.36.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:36:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) client-ip=23.128.96.31; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="NhgexVl/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 351BD8099881; Mon, 23 Oct 2023 12:36:26 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230081AbjJWTf4 (ORCPT + 27 others); Mon, 23 Oct 2023 15:35:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45080 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230489AbjJWTWe (ORCPT ); Mon, 23 Oct 2023 15:22:34 -0400 Received: from mail-lj1-x22b.google.com (mail-lj1-x22b.google.com [IPv6:2a00:1450:4864:20::22b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 752C194 for ; Mon, 23 Oct 2023 12:22:29 -0700 (PDT) Received: by mail-lj1-x22b.google.com with SMTP id 38308e7fff4ca-2b9338e4695so51320971fa.2 for ; Mon, 23 Oct 2023 12:22:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088948; x=1698693748; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HuG0d8CkorkH1aAZ8NN+T0MJXlWDSn9YYHrGTo8lBxA=; b=NhgexVl/XQcGTZ0DqoNqNOUEC9uzmP/z8qyOBkW6zXjNjrUSdE5GgLnuybklKklaWF 4/eC9U4vJ0u7Rn0rafredhC7CQLiVXcz2O7GrOX7vtxXIGPb6Hkuvrn3S5RuYN5NJaiE PuBMkH4WfHuGhkiix5X4Uby4jB+jwwbLpsf7Sqn0FX3mtU6EDaSsuQDLBWGOfO8ql5LI 1n3Hd0tolaC9cwvn+e5ITwSfKovg0mA6W91BcO58qykP6WPruz8QiwrVcfUiS/tVfTOo ocG8MPeEaIVMHeEWFg3HtXoL4LsRJ+DGP6xucB6KtoEWwAYeTAKhjrCMSDmJDyuOBgyD LJDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088948; x=1698693748; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HuG0d8CkorkH1aAZ8NN+T0MJXlWDSn9YYHrGTo8lBxA=; b=vUOdXOM8Y6N8ioobKxsGG5MmkQaUkyJefs7HLheKoZzWpGqgSOrZz0wnYgmJIxKNUo HBkwacSAH+ImePySj7Qd/FlEC7mwdd6AUaXuhUvB1qcQC/Z7cOYGesBv7DsgB/ASiy/I FMAiXkT/ttN8GByIIfFcJEBJpXZ986/eQvG1YEzihlLPqw0CwwLf2L11k2Q+tENCu7Ed B+b2qnaaDTBK9D1eVMTfqam/DyDmE7qymSG2oS4TlXMPYb40uYrMu5eWTGv8FyFINmPc TtJcwxmgVLtbY41v/LxrDak2ihWua1vNCJBe8d18H/Xy/rCmGvDtCl0S+F4uExp2Eg6H htLQ== X-Gm-Message-State: AOJu0Ywke4NslLK14r9nSkIiKxN/IIVdCS5NEkfoqgIibp3Y6hsJdvca vBfj+rU9IaMpLXIw6NBcPDdk6g== X-Received: by 2002:a05:651c:c8a:b0:2bf:fa62:5d0e with SMTP id bz10-20020a05651c0c8a00b002bffa625d0emr8398807ljb.2.1698088947462; Mon, 23 Oct 2023 12:22:27 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:26 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org, Steen Hegelund Subject: [PATCH v16 net-next 01/23] net/tcp: Prepare tcp_md5sig_pool for TCP-AO Date: Mon, 23 Oct 2023 20:21:53 +0100 Message-ID: <20231023192217.426455-2-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:36:26 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780576210595544702 X-GMAIL-MSGID: 1780576210595544702 TCP-AO, similarly to TCP-MD5, needs to allocate tfms on a slow-path, which is setsockopt() and use crypto ahash requests on fast paths, which are RX/TX softirqs. Also, it needs a temporary/scratch buffer for preparing the hash. Rework tcp_md5sig_pool in order to support other hashing algorithms than MD5. It will make it possible to share pre-allocated crypto_ahash descriptors and scratch area between all TCP hash users. Internally tcp_sigpool calls crypto_clone_ahash() API over pre-allocated crypto ahash tfm. Kudos to Herbert, who provided this new crypto API. I was a little concerned over GFP_ATOMIC allocations of ahash and crypto_request in RX/TX (see tcp_sigpool_start()), so I benchmarked both "backends" with different algorithms, using patched version of iperf3[2]. On my laptop with i7-7600U @ 2.80GHz: clone-tfm per-CPU-requests TCP-MD5 2.25 Gbits/sec 2.30 Gbits/sec TCP-AO(hmac(sha1)) 2.53 Gbits/sec 2.54 Gbits/sec TCP-AO(hmac(sha512)) 1.67 Gbits/sec 1.64 Gbits/sec TCP-AO(hmac(sha384)) 1.77 Gbits/sec 1.80 Gbits/sec TCP-AO(hmac(sha224)) 1.29 Gbits/sec 1.30 Gbits/sec TCP-AO(hmac(sha3-512)) 481 Mbits/sec 480 Mbits/sec TCP-AO(hmac(md5)) 2.07 Gbits/sec 2.12 Gbits/sec TCP-AO(hmac(rmd160)) 1.01 Gbits/sec 995 Mbits/sec TCP-AO(cmac(aes128)) [not supporetd yet] 2.11 Gbits/sec So, it seems that my concerns don't have strong grounds and per-CPU crypto_request allocation can be dropped/removed from tcp_sigpool once ciphers get crypto_clone_ahash() support. [1]: https://lore.kernel.org/all/ZDefxOq6Ax0JeTRH@gondor.apana.org.au/T/#u [2]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao Signed-off-by: Dmitry Safonov Reviewed-by: Steen Hegelund Acked-by: David Ahern --- include/net/tcp.h | 50 ++++-- net/ipv4/Kconfig | 4 + net/ipv4/Makefile | 1 + net/ipv4/tcp.c | 145 +++------------- net/ipv4/tcp_ipv4.c | 97 ++++++----- net/ipv4/tcp_minisocks.c | 21 ++- net/ipv4/tcp_sigpool.c | 358 +++++++++++++++++++++++++++++++++++++++ net/ipv6/tcp_ipv6.c | 60 +++---- 8 files changed, 525 insertions(+), 211 deletions(-) create mode 100644 net/ipv4/tcp_sigpool.c diff --git a/include/net/tcp.h b/include/net/tcp.h index 39b731c900dd..1c52cf2e3f63 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1735,12 +1735,39 @@ union tcp_md5sum_block { #endif }; -/* - pool: digest algorithm, hash description and scratch buffer */ -struct tcp_md5sig_pool { - struct ahash_request *md5_req; - void *scratch; +/* + * struct tcp_sigpool - per-CPU pool of ahash_requests + * @scratch: per-CPU temporary area, that can be used between + * tcp_sigpool_start() and tcp_sigpool_end() to perform + * crypto request + * @req: pre-allocated ahash request + */ +struct tcp_sigpool { + void *scratch; + struct ahash_request *req; }; +int tcp_sigpool_alloc_ahash(const char *alg, size_t scratch_size); +void tcp_sigpool_get(unsigned int id); +void tcp_sigpool_release(unsigned int id); +int tcp_sigpool_hash_skb_data(struct tcp_sigpool *hp, + const struct sk_buff *skb, + unsigned int header_len); + +/** + * tcp_sigpool_start - disable bh and start using tcp_sigpool_ahash + * @id: tcp_sigpool that was previously allocated by tcp_sigpool_alloc_ahash() + * @c: returned tcp_sigpool for usage (uninitialized on failure) + * + * Returns 0 on success, error otherwise. + */ +int tcp_sigpool_start(unsigned int id, struct tcp_sigpool *c); +/** + * tcp_sigpool_end - enable bh and stop using tcp_sigpool + * @c: tcp_sigpool context that was returned by tcp_sigpool_start() + */ +void tcp_sigpool_end(struct tcp_sigpool *c); +size_t tcp_sigpool_algo(unsigned int id, char *buf, size_t buf_len); /* - functions */ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, const struct sock *sk, const struct sk_buff *skb); @@ -1796,17 +1823,12 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, #define tcp_twsk_md5_key(twsk) NULL #endif -bool tcp_alloc_md5sig_pool(void); +int tcp_md5_alloc_sigpool(void); +void tcp_md5_release_sigpool(void); +void tcp_md5_add_sigpool(void); +extern int tcp_md5_sigpool_id; -struct tcp_md5sig_pool *tcp_get_md5sig_pool(void); -static inline void tcp_put_md5sig_pool(void) -{ - local_bh_enable(); -} - -int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *, const struct sk_buff *, - unsigned int header_len); -int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, +int tcp_md5_hash_key(struct tcp_sigpool *hp, const struct tcp_md5sig_key *key); /* From tcp_fastopen.c */ diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig index 2dfb12230f08..89e2ab023272 100644 --- a/net/ipv4/Kconfig +++ b/net/ipv4/Kconfig @@ -741,10 +741,14 @@ config DEFAULT_TCP_CONG default "bbr" if DEFAULT_BBR default "cubic" +config TCP_SIGPOOL + tristate + config TCP_MD5SIG bool "TCP: MD5 Signature Option support (RFC2385)" select CRYPTO select CRYPTO_MD5 + select TCP_SIGPOOL help RFC2385 specifies a method of giving MD5 protection to TCP sessions. Its main (only?) use is to protect BGP sessions between core routers diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index b18ba8ef93ad..cd760793cfcb 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -62,6 +62,7 @@ obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o +obj-$(CONFIG_TCP_SIGPOOL) += tcp_sigpool.o obj-$(CONFIG_NET_SOCK_MSG) += tcp_bpf.o obj-$(CONFIG_BPF_SYSCALL) += udp_bpf.o obj-$(CONFIG_NETLABEL) += cipso_ipv4.o diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index a86d8200a1e8..4db46389a4bd 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4303,141 +4303,52 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval, EXPORT_SYMBOL(tcp_getsockopt); #ifdef CONFIG_TCP_MD5SIG -static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool); -static DEFINE_MUTEX(tcp_md5sig_mutex); -static bool tcp_md5sig_pool_populated = false; +int tcp_md5_sigpool_id = -1; +EXPORT_SYMBOL_GPL(tcp_md5_sigpool_id); -static void __tcp_alloc_md5sig_pool(void) +int tcp_md5_alloc_sigpool(void) { - struct crypto_ahash *hash; - int cpu; + size_t scratch_size; + int ret; - hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC); - if (IS_ERR(hash)) - return; - - for_each_possible_cpu(cpu) { - void *scratch = per_cpu(tcp_md5sig_pool, cpu).scratch; - struct ahash_request *req; - - if (!scratch) { - scratch = kmalloc_node(sizeof(union tcp_md5sum_block) + - sizeof(struct tcphdr), - GFP_KERNEL, - cpu_to_node(cpu)); - if (!scratch) - return; - per_cpu(tcp_md5sig_pool, cpu).scratch = scratch; - } - if (per_cpu(tcp_md5sig_pool, cpu).md5_req) - continue; - - req = ahash_request_alloc(hash, GFP_KERNEL); - if (!req) - return; - - ahash_request_set_callback(req, 0, NULL, NULL); - - per_cpu(tcp_md5sig_pool, cpu).md5_req = req; + scratch_size = sizeof(union tcp_md5sum_block) + sizeof(struct tcphdr); + ret = tcp_sigpool_alloc_ahash("md5", scratch_size); + if (ret >= 0) { + /* As long as any md5 sigpool was allocated, the return + * id would stay the same. Re-write the id only for the case + * when previously all MD5 keys were deleted and this call + * allocates the first MD5 key, which may return a different + * sigpool id than was used previously. + */ + WRITE_ONCE(tcp_md5_sigpool_id, ret); /* Avoids the compiler potentially being smart here */ + return 0; } - /* before setting tcp_md5sig_pool_populated, we must commit all writes - * to memory. See smp_rmb() in tcp_get_md5sig_pool() - */ - smp_wmb(); - /* Paired with READ_ONCE() from tcp_alloc_md5sig_pool() - * and tcp_get_md5sig_pool(). - */ - WRITE_ONCE(tcp_md5sig_pool_populated, true); + return ret; } -bool tcp_alloc_md5sig_pool(void) +void tcp_md5_release_sigpool(void) { - /* Paired with WRITE_ONCE() from __tcp_alloc_md5sig_pool() */ - if (unlikely(!READ_ONCE(tcp_md5sig_pool_populated))) { - mutex_lock(&tcp_md5sig_mutex); - - if (!tcp_md5sig_pool_populated) - __tcp_alloc_md5sig_pool(); - - mutex_unlock(&tcp_md5sig_mutex); - } - /* Paired with WRITE_ONCE() from __tcp_alloc_md5sig_pool() */ - return READ_ONCE(tcp_md5sig_pool_populated); + tcp_sigpool_release(READ_ONCE(tcp_md5_sigpool_id)); } -EXPORT_SYMBOL(tcp_alloc_md5sig_pool); - -/** - * tcp_get_md5sig_pool - get md5sig_pool for this user - * - * We use percpu structure, so if we succeed, we exit with preemption - * and BH disabled, to make sure another thread or softirq handling - * wont try to get same context. - */ -struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) +void tcp_md5_add_sigpool(void) { - local_bh_disable(); - - /* Paired with WRITE_ONCE() from __tcp_alloc_md5sig_pool() */ - if (READ_ONCE(tcp_md5sig_pool_populated)) { - /* coupled with smp_wmb() in __tcp_alloc_md5sig_pool() */ - smp_rmb(); - return this_cpu_ptr(&tcp_md5sig_pool); - } - local_bh_enable(); - return NULL; + tcp_sigpool_get(READ_ONCE(tcp_md5_sigpool_id)); } -EXPORT_SYMBOL(tcp_get_md5sig_pool); -int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp, - const struct sk_buff *skb, unsigned int header_len) -{ - struct scatterlist sg; - const struct tcphdr *tp = tcp_hdr(skb); - struct ahash_request *req = hp->md5_req; - unsigned int i; - const unsigned int head_data_len = skb_headlen(skb) > header_len ? - skb_headlen(skb) - header_len : 0; - const struct skb_shared_info *shi = skb_shinfo(skb); - struct sk_buff *frag_iter; - - sg_init_table(&sg, 1); - - sg_set_buf(&sg, ((u8 *) tp) + header_len, head_data_len); - ahash_request_set_crypt(req, &sg, NULL, head_data_len); - if (crypto_ahash_update(req)) - return 1; - - for (i = 0; i < shi->nr_frags; ++i) { - const skb_frag_t *f = &shi->frags[i]; - unsigned int offset = skb_frag_off(f); - struct page *page = skb_frag_page(f) + (offset >> PAGE_SHIFT); - - sg_set_page(&sg, page, skb_frag_size(f), - offset_in_page(offset)); - ahash_request_set_crypt(req, &sg, NULL, skb_frag_size(f)); - if (crypto_ahash_update(req)) - return 1; - } - - skb_walk_frags(skb, frag_iter) - if (tcp_md5_hash_skb_data(hp, frag_iter, 0)) - return 1; - - return 0; -} -EXPORT_SYMBOL(tcp_md5_hash_skb_data); - -int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key) +int tcp_md5_hash_key(struct tcp_sigpool *hp, + const struct tcp_md5sig_key *key) { u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in tcp_md5_do_add */ struct scatterlist sg; sg_init_one(&sg, key->key, keylen); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen); + ahash_request_set_crypt(hp->req, &sg, NULL, keylen); - /* We use data_race() because tcp_md5_do_add() might change key->key under us */ - return data_race(crypto_ahash_update(hp->md5_req)); + /* We use data_race() because tcp_md5_do_add() might change + * key->key under us + */ + return data_race(crypto_ahash_update(hp->req)); } EXPORT_SYMBOL(tcp_md5_hash_key); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 7583d4e34c8c..7d81e90b6f5c 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1221,10 +1221,6 @@ static int __tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO); if (!key) return -ENOMEM; - if (!tcp_alloc_md5sig_pool()) { - sock_kfree_s(sk, key, sizeof(*key)); - return -ENOMEM; - } memcpy(key->key, newkey, newkeylen); key->keylen = newkeylen; @@ -1246,15 +1242,21 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr, struct tcp_sock *tp = tcp_sk(sk); if (!rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk))) { - if (tcp_md5sig_info_add(sk, GFP_KERNEL)) + if (tcp_md5_alloc_sigpool()) return -ENOMEM; + if (tcp_md5sig_info_add(sk, GFP_KERNEL)) { + tcp_md5_release_sigpool(); + return -ENOMEM; + } + if (!static_branch_inc(&tcp_md5_needed.key)) { struct tcp_md5sig_info *md5sig; md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk)); rcu_assign_pointer(tp->md5sig_info, NULL); kfree_rcu(md5sig, rcu); + tcp_md5_release_sigpool(); return -EUSERS; } } @@ -1271,8 +1273,12 @@ int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr, struct tcp_sock *tp = tcp_sk(sk); if (!rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk))) { - if (tcp_md5sig_info_add(sk, sk_gfp_mask(sk, GFP_ATOMIC))) + tcp_md5_add_sigpool(); + + if (tcp_md5sig_info_add(sk, sk_gfp_mask(sk, GFP_ATOMIC))) { + tcp_md5_release_sigpool(); return -ENOMEM; + } if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) { struct tcp_md5sig_info *md5sig; @@ -1281,6 +1287,7 @@ int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr, net_warn_ratelimited("Too many TCP-MD5 keys in the system\n"); rcu_assign_pointer(tp->md5sig_info, NULL); kfree_rcu(md5sig, rcu); + tcp_md5_release_sigpool(); return -EUSERS; } } @@ -1380,7 +1387,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, cmd.tcpm_key, cmd.tcpm_keylen); } -static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp, +static int tcp_v4_md5_hash_headers(struct tcp_sigpool *hp, __be32 daddr, __be32 saddr, const struct tcphdr *th, int nbytes) { @@ -1400,38 +1407,35 @@ static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp, _th->check = 0; sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, + ahash_request_set_crypt(hp->req, &sg, NULL, sizeof(*bp) + sizeof(*th)); - return crypto_ahash_update(hp->md5_req); + return crypto_ahash_update(hp->req); } static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, __be32 daddr, __be32 saddr, const struct tcphdr *th) { - struct tcp_md5sig_pool *hp; - struct ahash_request *req; + struct tcp_sigpool hp; - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; + if (tcp_sigpool_start(tcp_md5_sigpool_id, &hp)) + goto clear_hash_nostart; - if (crypto_ahash_init(req)) + if (crypto_ahash_init(hp.req)) goto clear_hash; - if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) + if (tcp_v4_md5_hash_headers(&hp, daddr, saddr, th, th->doff << 2)) goto clear_hash; - if (tcp_md5_hash_key(hp, key)) + if (tcp_md5_hash_key(&hp, key)) goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) + ahash_request_set_crypt(hp.req, NULL, md5_hash, 0); + if (crypto_ahash_final(hp.req)) goto clear_hash; - tcp_put_md5sig_pool(); + tcp_sigpool_end(&hp); return 0; clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: + tcp_sigpool_end(&hp); +clear_hash_nostart: memset(md5_hash, 0, 16); return 1; } @@ -1440,9 +1444,8 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, const struct sock *sk, const struct sk_buff *skb) { - struct tcp_md5sig_pool *hp; - struct ahash_request *req; const struct tcphdr *th = tcp_hdr(skb); + struct tcp_sigpool hp; __be32 saddr, daddr; if (sk) { /* valid for establish/request sockets */ @@ -1454,30 +1457,28 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key, daddr = iph->daddr; } - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; + if (tcp_sigpool_start(tcp_md5_sigpool_id, &hp)) + goto clear_hash_nostart; - if (crypto_ahash_init(req)) + if (crypto_ahash_init(hp.req)) goto clear_hash; - if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len)) + if (tcp_v4_md5_hash_headers(&hp, daddr, saddr, th, skb->len)) goto clear_hash; - if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2)) + if (tcp_sigpool_hash_skb_data(&hp, skb, th->doff << 2)) goto clear_hash; - if (tcp_md5_hash_key(hp, key)) + if (tcp_md5_hash_key(&hp, key)) goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) + ahash_request_set_crypt(hp.req, NULL, md5_hash, 0); + if (crypto_ahash_final(hp.req)) goto clear_hash; - tcp_put_md5sig_pool(); + tcp_sigpool_end(&hp); return 0; clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: + tcp_sigpool_end(&hp); +clear_hash_nostart: memset(md5_hash, 0, 16); return 1; } @@ -2296,6 +2297,18 @@ static int tcp_v4_init_sock(struct sock *sk) return 0; } +#ifdef CONFIG_TCP_MD5SIG +static void tcp_md5sig_info_free_rcu(struct rcu_head *head) +{ + struct tcp_md5sig_info *md5sig; + + md5sig = container_of(head, struct tcp_md5sig_info, rcu); + kfree(md5sig); + static_branch_slow_dec_deferred(&tcp_md5_needed); + tcp_md5_release_sigpool(); +} +#endif + void tcp_v4_destroy_sock(struct sock *sk) { struct tcp_sock *tp = tcp_sk(sk); @@ -2320,10 +2333,12 @@ void tcp_v4_destroy_sock(struct sock *sk) #ifdef CONFIG_TCP_MD5SIG /* Clean up the MD5 key list, if any */ if (tp->md5sig_info) { + struct tcp_md5sig_info *md5sig; + + md5sig = rcu_dereference_protected(tp->md5sig_info, 1); tcp_clear_md5_list(sk); - kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu); - tp->md5sig_info = NULL; - static_branch_slow_dec_deferred(&tcp_md5_needed); + call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu); + rcu_assign_pointer(tp->md5sig_info, NULL); } #endif diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index ace806c5bd0c..3dcb3fc36e64 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -261,10 +261,9 @@ static void tcp_time_wait_init(struct sock *sk, struct tcp_timewait_sock *tcptw) tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); if (!tcptw->tw_md5_key) return; - if (!tcp_alloc_md5sig_pool()) - goto out_free; if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) goto out_free; + tcp_md5_add_sigpool(); } return; out_free: @@ -349,16 +348,26 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) } EXPORT_SYMBOL(tcp_time_wait); +#ifdef CONFIG_TCP_MD5SIG +static void tcp_md5_twsk_free_rcu(struct rcu_head *head) +{ + struct tcp_md5sig_key *key; + + key = container_of(head, struct tcp_md5sig_key, rcu); + kfree(key); + static_branch_slow_dec_deferred(&tcp_md5_needed); + tcp_md5_release_sigpool(); +} +#endif + void tcp_twsk_destructor(struct sock *sk) { #ifdef CONFIG_TCP_MD5SIG if (static_branch_unlikely(&tcp_md5_needed.key)) { struct tcp_timewait_sock *twsk = tcp_twsk(sk); - if (twsk->tw_md5_key) { - kfree_rcu(twsk->tw_md5_key, rcu); - static_branch_slow_dec_deferred(&tcp_md5_needed); - } + if (twsk->tw_md5_key) + call_rcu(&twsk->tw_md5_key->rcu, tcp_md5_twsk_free_rcu); } #endif } diff --git a/net/ipv4/tcp_sigpool.c b/net/ipv4/tcp_sigpool.c new file mode 100644 index 000000000000..65a8eaae2fec --- /dev/null +++ b/net/ipv4/tcp_sigpool.c @@ -0,0 +1,358 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include +#include +#include +#include +#include +#include +#include +#include + +static size_t __scratch_size; +static DEFINE_PER_CPU(void __rcu *, sigpool_scratch); + +struct sigpool_entry { + struct crypto_ahash *hash; + const char *alg; + struct kref kref; + uint16_t needs_key:1, + reserved:15; +}; + +#define CPOOL_SIZE (PAGE_SIZE / sizeof(struct sigpool_entry)) +static struct sigpool_entry cpool[CPOOL_SIZE]; +static unsigned int cpool_populated; +static DEFINE_MUTEX(cpool_mutex); + +/* Slow-path */ +struct scratches_to_free { + struct rcu_head rcu; + unsigned int cnt; + void *scratches[]; +}; + +static void free_old_scratches(struct rcu_head *head) +{ + struct scratches_to_free *stf; + + stf = container_of(head, struct scratches_to_free, rcu); + while (stf->cnt--) + kfree(stf->scratches[stf->cnt]); + kfree(stf); +} + +/** + * sigpool_reserve_scratch - re-allocates scratch buffer, slow-path + * @size: request size for the scratch/temp buffer + */ +static int sigpool_reserve_scratch(size_t size) +{ + struct scratches_to_free *stf; + size_t stf_sz = struct_size(stf, scratches, num_possible_cpus()); + int cpu, err = 0; + + lockdep_assert_held(&cpool_mutex); + if (__scratch_size >= size) + return 0; + + stf = kmalloc(stf_sz, GFP_KERNEL); + if (!stf) + return -ENOMEM; + stf->cnt = 0; + + size = max(size, __scratch_size); + cpus_read_lock(); + for_each_possible_cpu(cpu) { + void *scratch, *old_scratch; + + scratch = kmalloc_node(size, GFP_KERNEL, cpu_to_node(cpu)); + if (!scratch) { + err = -ENOMEM; + break; + } + + old_scratch = rcu_replace_pointer(per_cpu(sigpool_scratch, cpu), + scratch, lockdep_is_held(&cpool_mutex)); + if (!cpu_online(cpu) || !old_scratch) { + kfree(old_scratch); + continue; + } + stf->scratches[stf->cnt++] = old_scratch; + } + cpus_read_unlock(); + if (!err) + __scratch_size = size; + + call_rcu(&stf->rcu, free_old_scratches); + return err; +} + +static void sigpool_scratch_free(void) +{ + int cpu; + + for_each_possible_cpu(cpu) + kfree(rcu_replace_pointer(per_cpu(sigpool_scratch, cpu), + NULL, lockdep_is_held(&cpool_mutex))); + __scratch_size = 0; +} + +static int __cpool_try_clone(struct crypto_ahash *hash) +{ + struct crypto_ahash *tmp; + + tmp = crypto_clone_ahash(hash); + if (IS_ERR(tmp)) + return PTR_ERR(tmp); + + crypto_free_ahash(tmp); + return 0; +} + +static int __cpool_alloc_ahash(struct sigpool_entry *e, const char *alg) +{ + struct crypto_ahash *cpu0_hash; + int ret; + + e->alg = kstrdup(alg, GFP_KERNEL); + if (!e->alg) + return -ENOMEM; + + cpu0_hash = crypto_alloc_ahash(alg, 0, CRYPTO_ALG_ASYNC); + if (IS_ERR(cpu0_hash)) { + ret = PTR_ERR(cpu0_hash); + goto out_free_alg; + } + + e->needs_key = crypto_ahash_get_flags(cpu0_hash) & CRYPTO_TFM_NEED_KEY; + + ret = __cpool_try_clone(cpu0_hash); + if (ret) + goto out_free_cpu0_hash; + e->hash = cpu0_hash; + kref_init(&e->kref); + return 0; + +out_free_cpu0_hash: + crypto_free_ahash(cpu0_hash); +out_free_alg: + kfree(e->alg); + e->alg = NULL; + return ret; +} + +/** + * tcp_sigpool_alloc_ahash - allocates pool for ahash requests + * @alg: name of async hash algorithm + * @scratch_size: reserve a tcp_sigpool::scratch buffer of this size + */ +int tcp_sigpool_alloc_ahash(const char *alg, size_t scratch_size) +{ + int i, ret; + + /* slow-path */ + mutex_lock(&cpool_mutex); + ret = sigpool_reserve_scratch(scratch_size); + if (ret) + goto out; + for (i = 0; i < cpool_populated; i++) { + if (!cpool[i].alg) + continue; + if (strcmp(cpool[i].alg, alg)) + continue; + + if (kref_read(&cpool[i].kref) > 0) + kref_get(&cpool[i].kref); + else + kref_init(&cpool[i].kref); + ret = i; + goto out; + } + + for (i = 0; i < cpool_populated; i++) { + if (!cpool[i].alg) + break; + } + if (i >= CPOOL_SIZE) { + ret = -ENOSPC; + goto out; + } + + ret = __cpool_alloc_ahash(&cpool[i], alg); + if (!ret) { + ret = i; + if (i == cpool_populated) + cpool_populated++; + } +out: + mutex_unlock(&cpool_mutex); + return ret; +} +EXPORT_SYMBOL_GPL(tcp_sigpool_alloc_ahash); + +static void __cpool_free_entry(struct sigpool_entry *e) +{ + crypto_free_ahash(e->hash); + kfree(e->alg); + memset(e, 0, sizeof(*e)); +} + +static void cpool_cleanup_work_cb(struct work_struct *work) +{ + bool free_scratch = true; + unsigned int i; + + mutex_lock(&cpool_mutex); + for (i = 0; i < cpool_populated; i++) { + if (kref_read(&cpool[i].kref) > 0) { + free_scratch = false; + continue; + } + if (!cpool[i].alg) + continue; + __cpool_free_entry(&cpool[i]); + } + if (free_scratch) + sigpool_scratch_free(); + mutex_unlock(&cpool_mutex); +} + +static DECLARE_WORK(cpool_cleanup_work, cpool_cleanup_work_cb); +static void cpool_schedule_cleanup(struct kref *kref) +{ + schedule_work(&cpool_cleanup_work); +} + +/** + * tcp_sigpool_release - decreases number of users for a pool. If it was + * the last user of the pool, releases any memory that was consumed. + * @id: tcp_sigpool that was previously allocated by tcp_sigpool_alloc_ahash() + */ +void tcp_sigpool_release(unsigned int id) +{ + if (WARN_ON_ONCE(id > cpool_populated || !cpool[id].alg)) + return; + + /* slow-path */ + kref_put(&cpool[id].kref, cpool_schedule_cleanup); +} +EXPORT_SYMBOL_GPL(tcp_sigpool_release); + +/** + * tcp_sigpool_get - increases number of users (refcounter) for a pool + * @id: tcp_sigpool that was previously allocated by tcp_sigpool_alloc_ahash() + */ +void tcp_sigpool_get(unsigned int id) +{ + if (WARN_ON_ONCE(id > cpool_populated || !cpool[id].alg)) + return; + kref_get(&cpool[id].kref); +} +EXPORT_SYMBOL_GPL(tcp_sigpool_get); + +int tcp_sigpool_start(unsigned int id, struct tcp_sigpool *c) __cond_acquires(RCU_BH) +{ + struct crypto_ahash *hash; + + rcu_read_lock_bh(); + if (WARN_ON_ONCE(id > cpool_populated || !cpool[id].alg)) { + rcu_read_unlock_bh(); + return -EINVAL; + } + + hash = crypto_clone_ahash(cpool[id].hash); + if (IS_ERR(hash)) { + rcu_read_unlock_bh(); + return PTR_ERR(hash); + } + + c->req = ahash_request_alloc(hash, GFP_ATOMIC); + if (!c->req) { + crypto_free_ahash(hash); + rcu_read_unlock_bh(); + return -ENOMEM; + } + ahash_request_set_callback(c->req, 0, NULL, NULL); + + /* Pairs with tcp_sigpool_reserve_scratch(), scratch area is + * valid (allocated) until tcp_sigpool_end(). + */ + c->scratch = rcu_dereference_bh(*this_cpu_ptr(&sigpool_scratch)); + return 0; +} +EXPORT_SYMBOL_GPL(tcp_sigpool_start); + +void tcp_sigpool_end(struct tcp_sigpool *c) __releases(RCU_BH) +{ + struct crypto_ahash *hash = crypto_ahash_reqtfm(c->req); + + rcu_read_unlock_bh(); + ahash_request_free(c->req); + crypto_free_ahash(hash); +} +EXPORT_SYMBOL_GPL(tcp_sigpool_end); + +/** + * tcp_sigpool_algo - return algorithm of tcp_sigpool + * @id: tcp_sigpool that was previously allocated by tcp_sigpool_alloc_ahash() + * @buf: buffer to return name of algorithm + * @buf_len: size of @buf + */ +size_t tcp_sigpool_algo(unsigned int id, char *buf, size_t buf_len) +{ + if (WARN_ON_ONCE(id > cpool_populated || !cpool[id].alg)) + return -EINVAL; + + return strscpy(buf, cpool[id].alg, buf_len); +} +EXPORT_SYMBOL_GPL(tcp_sigpool_algo); + +/** + * tcp_sigpool_hash_skb_data - hash data in skb with initialized tcp_sigpool + * @hp: tcp_sigpool pointer + * @skb: buffer to add sign for + * @header_len: TCP header length for this segment + */ +int tcp_sigpool_hash_skb_data(struct tcp_sigpool *hp, + const struct sk_buff *skb, + unsigned int header_len) +{ + const unsigned int head_data_len = skb_headlen(skb) > header_len ? + skb_headlen(skb) - header_len : 0; + const struct skb_shared_info *shi = skb_shinfo(skb); + const struct tcphdr *tp = tcp_hdr(skb); + struct ahash_request *req = hp->req; + struct sk_buff *frag_iter; + struct scatterlist sg; + unsigned int i; + + sg_init_table(&sg, 1); + + sg_set_buf(&sg, ((u8 *)tp) + header_len, head_data_len); + ahash_request_set_crypt(req, &sg, NULL, head_data_len); + if (crypto_ahash_update(req)) + return 1; + + for (i = 0; i < shi->nr_frags; ++i) { + const skb_frag_t *f = &shi->frags[i]; + unsigned int offset = skb_frag_off(f); + struct page *page; + + page = skb_frag_page(f) + (offset >> PAGE_SHIFT); + sg_set_page(&sg, page, skb_frag_size(f), offset_in_page(offset)); + ahash_request_set_crypt(req, &sg, NULL, skb_frag_size(f)); + if (crypto_ahash_update(req)) + return 1; + } + + skb_walk_frags(skb, frag_iter) + if (tcp_sigpool_hash_skb_data(hp, frag_iter, 0)) + return 1; + + return 0; +} +EXPORT_SYMBOL(tcp_sigpool_hash_skb_data); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("Per-CPU pool of crypto requests"); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 0c8a14ba104f..9bdc4e979943 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -671,7 +671,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, cmd.tcpm_key, cmd.tcpm_keylen); } -static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp, +static int tcp_v6_md5_hash_headers(struct tcp_sigpool *hp, const struct in6_addr *daddr, const struct in6_addr *saddr, const struct tcphdr *th, int nbytes) @@ -692,39 +692,36 @@ static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp, _th->check = 0; sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th)); - ahash_request_set_crypt(hp->md5_req, &sg, NULL, + ahash_request_set_crypt(hp->req, &sg, NULL, sizeof(*bp) + sizeof(*th)); - return crypto_ahash_update(hp->md5_req); + return crypto_ahash_update(hp->req); } static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key, const struct in6_addr *daddr, struct in6_addr *saddr, const struct tcphdr *th) { - struct tcp_md5sig_pool *hp; - struct ahash_request *req; + struct tcp_sigpool hp; - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; + if (tcp_sigpool_start(tcp_md5_sigpool_id, &hp)) + goto clear_hash_nostart; - if (crypto_ahash_init(req)) + if (crypto_ahash_init(hp.req)) goto clear_hash; - if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2)) + if (tcp_v6_md5_hash_headers(&hp, daddr, saddr, th, th->doff << 2)) goto clear_hash; - if (tcp_md5_hash_key(hp, key)) + if (tcp_md5_hash_key(&hp, key)) goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) + ahash_request_set_crypt(hp.req, NULL, md5_hash, 0); + if (crypto_ahash_final(hp.req)) goto clear_hash; - tcp_put_md5sig_pool(); + tcp_sigpool_end(&hp); return 0; clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: + tcp_sigpool_end(&hp); +clear_hash_nostart: memset(md5_hash, 0, 16); return 1; } @@ -734,10 +731,9 @@ static int tcp_v6_md5_hash_skb(char *md5_hash, const struct sock *sk, const struct sk_buff *skb) { - const struct in6_addr *saddr, *daddr; - struct tcp_md5sig_pool *hp; - struct ahash_request *req; const struct tcphdr *th = tcp_hdr(skb); + const struct in6_addr *saddr, *daddr; + struct tcp_sigpool hp; if (sk) { /* valid for establish/request sockets */ saddr = &sk->sk_v6_rcv_saddr; @@ -748,30 +744,28 @@ static int tcp_v6_md5_hash_skb(char *md5_hash, daddr = &ip6h->daddr; } - hp = tcp_get_md5sig_pool(); - if (!hp) - goto clear_hash_noput; - req = hp->md5_req; + if (tcp_sigpool_start(tcp_md5_sigpool_id, &hp)) + goto clear_hash_nostart; - if (crypto_ahash_init(req)) + if (crypto_ahash_init(hp.req)) goto clear_hash; - if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len)) + if (tcp_v6_md5_hash_headers(&hp, daddr, saddr, th, skb->len)) goto clear_hash; - if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2)) + if (tcp_sigpool_hash_skb_data(&hp, skb, th->doff << 2)) goto clear_hash; - if (tcp_md5_hash_key(hp, key)) + if (tcp_md5_hash_key(&hp, key)) goto clear_hash; - ahash_request_set_crypt(req, NULL, md5_hash, 0); - if (crypto_ahash_final(req)) + ahash_request_set_crypt(hp.req, NULL, md5_hash, 0); + if (crypto_ahash_final(hp.req)) goto clear_hash; - tcp_put_md5sig_pool(); + tcp_sigpool_end(&hp); return 0; clear_hash: - tcp_put_md5sig_pool(); -clear_hash_noput: + tcp_sigpool_end(&hp); +clear_hash_nostart: memset(md5_hash, 0, 16); return 1; } From patchwork Mon Oct 23 19:21:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157080 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1510855vqx; Mon, 23 Oct 2023 12:36:33 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHcHnNlKeefOLD3ryJxoHS459Xc5H50HDt+9LCLEDxxfyyYVsQcj0NLKGYJT0cM74nWgzkJ X-Received: by 2002:a17:902:fa4c:b0:1c9:e48c:727a with SMTP id lb12-20020a170902fa4c00b001c9e48c727amr11120216plb.14.1698089793350; Mon, 23 Oct 2023 12:36:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089793; cv=none; d=google.com; s=arc-20160816; b=AVHz/w8tcgR54+mc5kW2V+KVh4bdpNu9Tai/HQNPPhFVnzsfhdChRXyTSDTuKZT3jS NgD/zPQsCa0eGXAwbMbTgELoXnbYEQGH6D66Gu62/NjeaAkJAsN05BO+GhCHh4tDweuG YlC1zjrGTwEHYO4C/Wyj3KRy6wZHohemkfIvvnFfM8eTPA0dLbmNiXm1De2Uz/fKLqOw 3SCineKThPrSvI/mdW52hzI3moPK71IEeQyBTymvvB5/AeV0H+dEGW1MGXThV92gQ7dp QeYrn5uLK35oB8lTy7mSnmiUKLF/Wp2OUSnjjF6/MbVKD0MB/qRfVRzFK6xrEjMVm7ok PYmQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Tnf1iX1DkpNXzc0ZSSfJXKOAq0XBa5++Rmv1hxBzDY8=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=yDmNQDqp3Ckp/WDnc4JGqA8JhuJ5skoR3ysjQCJb3X0SkhsTTqhOJgSfNvP792Clms hr1a7nn4kXqXXu22h4McvXK/ul/sbqxobltA2aWDf8O2gtpgRkrge0l+y+jbDseRnHN2 HzBLjG2igNvRZdaBC1id18qfVHk29IcKXxmCp+FuRGT97SI3jOmQSQ5imgAcHP48AzaU gQgTca/0s9/XfR3yoj9FMHX4sOAmjqP7VEF6gC16fwivowEgUB/wCph1X14nxig0ABsP hUd/cQoiyAFSPSJjG+t90ifWlYW060WtGtS2tE+RBUZkpok2APe27hCuOpt6Bb/qV3r3 DoGg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=bUsxq9Us; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from pete.vger.email (pete.vger.email. [23.128.96.36]) by mx.google.com with ESMTPS id d6-20020a170903230600b001ca6abecb27si7398913plh.498.2023.10.23.12.36.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:36:33 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) client-ip=23.128.96.36; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=bUsxq9Us; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 519B680621C2; Mon, 23 Oct 2023 12:36:27 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230033AbjJWTfx (ORCPT + 27 others); Mon, 23 Oct 2023 15:35:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230432AbjJWTWe (ORCPT ); Mon, 23 Oct 2023 15:22:34 -0400 Received: from mail-lj1-x22d.google.com (mail-lj1-x22d.google.com [IPv6:2a00:1450:4864:20::22d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 67343FD for ; Mon, 23 Oct 2023 12:22:31 -0700 (PDT) Received: by mail-lj1-x22d.google.com with SMTP id 38308e7fff4ca-2c16757987fso53846391fa.3 for ; Mon, 23 Oct 2023 12:22:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088949; x=1698693749; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Tnf1iX1DkpNXzc0ZSSfJXKOAq0XBa5++Rmv1hxBzDY8=; b=bUsxq9UsEX+DhwdbUUKIqGeqqh7eMK2mWmlVX7UlGsc2fervEA7reB+MmmPxeSvcw1 26XUhHq3Jt7G5tROZ5ujX0SV7vnU9gqTxGl9SX58RqCFtKynpzuWx5hPFh1Ef5+cUqeW ULLv2di2OZsvkR2oXv+vAUVQODwiUVpd/AbGypxaEVZdwH3u50WtAidBWt3AYaSNJcgt PpANnMUp4sK2wQ3OATS0qp2cRJ5RoKInI5BiVnid1kttYHZ8UEzYc7xNUReKQaO32L0l +xUqMCY6XQGrl8CE2vCOhlpiQB7Srq0z7dSjWa+BfP3nELmzHf4x9BBXvE0NTYcEQvY0 /odQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088949; x=1698693749; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Tnf1iX1DkpNXzc0ZSSfJXKOAq0XBa5++Rmv1hxBzDY8=; b=gOGxCnMv2FVouWEtc3f54AthngJ1klpKGJ8uehp3rKooGUFtsb1gKaNuFUV5du5LKF FEmbzvdM33IlGVN7V3YFkK5S1S4OojhFppvyLrrTR4H+5yMlud3FclfSdcd6P3O/RM+H XapGF/jogYk1mV+zmtaFIKQ9MDmhJtqKGrmYoRP/bsZidl3atzw85/lqwaOcCAcDGvKF rBcDEDySghxW/3e16Dxu1ev+73yDQRjdwv4RnSz8+h4btAmQbxVs5D78HObRc1oKMGrW TORBiFGGzLbsBupUfyd/SjhiXwK6ny4LoH7zpBrFcI7qfaNzQyiA1sWZiAIbr6V0oQer FzWg== X-Gm-Message-State: AOJu0YxnpKV1ZGJOxYSuQ+wvjHCHJCtcdOdg/EL+XF4wP5KlwlK8DXNW IXSNA8H/ArF6NA8mmKwgU9ZRkQ== X-Received: by 2002:a2e:be2b:0:b0:2c5:1e3f:7379 with SMTP id z43-20020a2ebe2b000000b002c51e3f7379mr7013992ljq.48.1698088949604; Mon, 23 Oct 2023 12:22:29 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:29 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 02/23] net/tcp: Add TCP-AO config and structures Date: Mon, 23 Oct 2023 20:21:54 +0100 Message-ID: <20231023192217.426455-3-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:36:27 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780576202962837978 X-GMAIL-MSGID: 1780576202962837978 Introduce new kernel config option and common structures as well as helpers to be used by TCP-AO code. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/linux/tcp.h | 9 +++- include/net/tcp.h | 8 +--- include/net/tcp_ao.h | 90 ++++++++++++++++++++++++++++++++++++++++ include/uapi/linux/tcp.h | 2 + net/ipv4/Kconfig | 13 ++++++ 5 files changed, 114 insertions(+), 8 deletions(-) create mode 100644 include/net/tcp_ao.h diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 6df715b6e51d..64e7b560fa79 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -447,13 +447,18 @@ struct tcp_sock { bool syn_smc; /* SYN includes SMC */ #endif -#ifdef CONFIG_TCP_MD5SIG -/* TCP AF-Specific parts; only used by MD5 Signature support so far */ +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) +/* TCP AF-Specific parts; only used by TCP-AO/MD5 Signature support so far */ const struct tcp_sock_af_ops *af_specific; +#ifdef CONFIG_TCP_MD5SIG /* TCP MD5 Signature Option information */ struct tcp_md5sig_info __rcu *md5sig_info; #endif +#ifdef CONFIG_TCP_AO + struct tcp_ao_info __rcu *ao_info; +#endif +#endif /* TCP fastopen related information */ struct tcp_fastopen_request *fastopen_req; diff --git a/include/net/tcp.h b/include/net/tcp.h index 1c52cf2e3f63..200e4b2e3898 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -37,6 +37,7 @@ #include #include #include +#include #include #include #include @@ -1686,12 +1687,7 @@ static inline void tcp_clear_all_retrans_hints(struct tcp_sock *tp) tp->retransmit_skb_hint = NULL; } -union tcp_md5_addr { - struct in_addr a4; -#if IS_ENABLED(CONFIG_IPV6) - struct in6_addr a6; -#endif -}; +#define tcp_md5_addr tcp_ao_addr /* - key database */ struct tcp_md5sig_key { diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h new file mode 100644 index 000000000000..af76e1c47bea --- /dev/null +++ b/include/net/tcp_ao.h @@ -0,0 +1,90 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _TCP_AO_H +#define _TCP_AO_H + +#define TCP_AO_KEY_ALIGN 1 +#define __tcp_ao_key_align __aligned(TCP_AO_KEY_ALIGN) + +union tcp_ao_addr { + struct in_addr a4; +#if IS_ENABLED(CONFIG_IPV6) + struct in6_addr a6; +#endif +}; + +struct tcp_ao_hdr { + u8 kind; + u8 length; + u8 keyid; + u8 rnext_keyid; +}; + +struct tcp_ao_key { + struct hlist_node node; + union tcp_ao_addr addr; + u8 key[TCP_AO_MAXKEYLEN] __tcp_ao_key_align; + unsigned int tcp_sigpool_id; + unsigned int digest_size; + u8 prefixlen; + u8 family; + u8 keylen; + u8 keyflags; + u8 sndid; + u8 rcvid; + u8 maclen; + struct rcu_head rcu; + u8 traffic_keys[]; +}; + +static inline u8 *rcv_other_key(struct tcp_ao_key *key) +{ + return key->traffic_keys; +} + +static inline u8 *snd_other_key(struct tcp_ao_key *key) +{ + return key->traffic_keys + key->digest_size; +} + +static inline int tcp_ao_maclen(const struct tcp_ao_key *key) +{ + return key->maclen; +} + +static inline int tcp_ao_len(const struct tcp_ao_key *key) +{ + return tcp_ao_maclen(key) + sizeof(struct tcp_ao_hdr); +} + +static inline unsigned int tcp_ao_digest_size(struct tcp_ao_key *key) +{ + return key->digest_size; +} + +static inline int tcp_ao_sizeof_key(const struct tcp_ao_key *key) +{ + return sizeof(struct tcp_ao_key) + (key->digest_size << 1); +} + +struct tcp_ao_info { + /* List of tcp_ao_key's */ + struct hlist_head head; + /* current_key and rnext_key aren't maintained on listen sockets. + * Their purpose is to cache keys on established connections, + * saving needless lookups. Never dereference any of them from + * listen sockets. + * ::current_key may change in RX to the key that was requested by + * the peer, please use READ_ONCE()/WRITE_ONCE() in order to avoid + * load/store tearing. + * Do the same for ::rnext_key, if you don't hold socket lock + * (it's changed only by userspace request in setsockopt()). + */ + struct tcp_ao_key *current_key; + struct tcp_ao_key *rnext_key; + u32 flags; + __be32 lisn; + __be32 risn; + struct rcu_head rcu; +}; + +#endif /* _TCP_AO_H */ diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 8aa3916e14f6..74a1267baac5 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -361,6 +361,8 @@ struct tcp_diag_md5sig { __u8 tcpm_key[TCP_MD5SIG_MAXKEYLEN]; }; +#define TCP_AO_MAXKEYLEN 80 + /* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */ #define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1 diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig index 89e2ab023272..8e94ed7c56a0 100644 --- a/net/ipv4/Kconfig +++ b/net/ipv4/Kconfig @@ -744,6 +744,19 @@ config DEFAULT_TCP_CONG config TCP_SIGPOOL tristate +config TCP_AO + bool "TCP: Authentication Option (RFC5925)" + select CRYPTO + select TCP_SIGPOOL + depends on 64BIT && IPV6 != m # seq-number extension needs WRITE_ONCE(u64) + help + TCP-AO specifies the use of stronger Message Authentication Codes (MACs), + protects against replays for long-lived TCP connections, and + provides more details on the association of security with TCP + connections than TCP MD5 (See RFC5925) + + If unsure, say N. + config TCP_MD5SIG bool "TCP: MD5 Signature Option support (RFC2385)" select CRYPTO From patchwork Mon Oct 23 19:21:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157051 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504358vqx; Mon, 23 Oct 2023 12:23:00 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFGaKixvWLaEpZCFBjhOJJqqdAlqWROhwY0enE4tpfx1+nbYdomm+nYouODkFo9K6Vy4W1a X-Received: by 2002:a17:90b:30c7:b0:27d:66a9:3466 with SMTP id hi7-20020a17090b30c700b0027d66a93466mr9814543pjb.7.1698088980224; Mon, 23 Oct 2023 12:23:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698088980; cv=none; d=google.com; s=arc-20160816; b=YkTXwn7t9yIcytdSEjNxn1lToFpXgVJVclY1kHI3vrdUm+efhLJnc/YNCBhi11Hmia +NZxQgw5rIcekqhhqDhCjqsod8ZESzE25x54CYAaS+wvmD9UpTpwa3a2VXOtCjyA3ZVx yv+ZZEP/ozfvZz9LnxrpSv3VYln6o2HDcxOfvevEVq8dzhwVjgVWxk8buFmuTPQOkBIp Jjlii+nH7Wm8vu7yubmcludXD6VdarOqx8Cvg+oLeP0gUNmsm1FwksYostao7dCvpNaG vU4PrbR8E6CKBzlOu175yAuRYA/GPKBKmcubR+klDaoJjMKIB0PcVza2mUAHkp3a+DWT ARCw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=F87w+I3hla58H3Mc6Z6cjLdoPvguFJ7r3A4YVMcs3HQ=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=U9SiDSCSz5YLBb9SUstXfsa+Ikt8UjF+tjQwd4qghwFUeT54XpNMWlOGx0kAj6gEyB i1pjqmeyUjZbklxen/G0MZw0p2qbUrcEtOS7/Q4sKpbIiCS8e1mkB+587ArsZG8NTJLK tUAhnzzoDZIAMZKGdcb+1PmOIL2G/BwAOc/+pmQw8np7inBZc0flOoWpRdNewPPz+CZO yHZeAcUMXOgJPHrhAQF1ZLFC36nZo4m2qvHunion5IgD5YPp3s57WuSv76kdEEPCRLRY S2RWKhiAmw5OoUaEzVxZ2q3VHkiBW1by7B0Y1BaTpoXrexGjgt0RPBkNuKh7UdENanbw XHwg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="OKC+/GH8"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id u15-20020a170903124f00b001c589ba4a08si7049049plh.1.2023.10.23.12.22.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="OKC+/GH8"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 842CB80AEB1D; Mon, 23 Oct 2023 12:22:59 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231416AbjJWTWp (ORCPT + 27 others); Mon, 23 Oct 2023 15:22:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37414 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229499AbjJWTWi (ORCPT ); Mon, 23 Oct 2023 15:22:38 -0400 Received: from mail-lj1-x231.google.com (mail-lj1-x231.google.com [IPv6:2a00:1450:4864:20::231]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0B14C5 for ; Mon, 23 Oct 2023 12:22:33 -0700 (PDT) Received: by mail-lj1-x231.google.com with SMTP id 38308e7fff4ca-2c504a5e1deso57959411fa.2 for ; Mon, 23 Oct 2023 12:22:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088952; x=1698693752; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=F87w+I3hla58H3Mc6Z6cjLdoPvguFJ7r3A4YVMcs3HQ=; b=OKC+/GH8scyQS1qoZBbJZMR3U8qifQ3gANBilF5oCCsQwQadzZhVwZGxEgrCF5RqX+ ax59j5f1C4pUzDq7SXM/JuXEjoUyj67g/ghMPHkiA8QJmfuXBP9KGREoqhJF18UAJeRI +9mE6mGN33NunUmoz66XpNwTF2GJplQeTGV3xf5G+K3f78Uy7ICyLDpaTF2wPJysCC5X 4UafsbN/B/MCEk1tIRnHJrpjVvGmXuanLqR+E9DIQtQH0g7r+LNR8dKTNmlbAzjHv2Ht 07pk6s+/hHS32hE7ibO/LgZK/EX/vgy7jiJEyaAsXjdDqnf4zVRK+1IsaLnghZ1DipVA 3o3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088952; x=1698693752; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=F87w+I3hla58H3Mc6Z6cjLdoPvguFJ7r3A4YVMcs3HQ=; b=BVNtbEFiNJ9fk0s+uybKKwL6527D1NjUmnf+RfnwwtxDHPnPcZTDQapJyPr8NLWYlk MoRSnuuGFgR19/ws+nMQWuGrRWVCNw4EkQDUgUYqpExTbc0nsbukkgzySc4bnD0YJp0t DJlzqQjFwf6cSWtgJkjvRnkr/KnB9MY5JutVFUNlF75FUCc7l3SOVXF9e6IvCCxZGGxd cT71yqEoc/RRGfpuBNwAUjKoK2SKCDfqnsvTSJLHcDtzbn+bfIblnXXOtRqlUI9kr6PI dBbmIXRjr9d20+GRCEXgjnmJplTp9CgWHjdLr83rs3jajd3vzN9vPcY2zW1zFq/wqWgh /q6w== X-Gm-Message-State: AOJu0Yzvgdmt6KO8ktJ2mThOnCsvC6ej82/EAI5Wmn5wCaR/611HMt0e IqxgRgKluLX9Dr+VYTpCZM7DFw== X-Received: by 2002:a2e:b8c4:0:b0:2c5:e0:1634 with SMTP id s4-20020a2eb8c4000000b002c500e01634mr6614532ljp.35.1698088951791; Mon, 23 Oct 2023 12:22:31 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:31 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 03/23] net/tcp: Introduce TCP_AO setsockopt()s Date: Mon, 23 Oct 2023 20:21:55 +0100 Message-ID: <20231023192217.426455-4-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:22:59 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575350604486729 X-GMAIL-MSGID: 1780575350604486729 Add 3 setsockopt()s: 1. TCP_AO_ADD_KEY to add a new Master Key Tuple (MKT) on a socket 2. TCP_AO_DEL_KEY to delete present MKT from a socket 3. TCP_AO_INFO to change flags, Current_key/RNext_key on a TCP-AO sk Userspace has to introduce keys on every socket it wants to use TCP-AO option on, similarly to TCP_MD5SIG/TCP_MD5SIG_EXT. RFC5925 prohibits definition of MKTs that would match the same peer, so do sanity checks on the data provided by userspace. Be as conservative as possible, including refusal of defining MKT on an established connection with no AO, removing the key in-use and etc. (1) and (2) are to be used by userspace key manager to add/remove keys. (3) main purpose is to set RNext_key, which (as prescribed by RFC5925) is the KeyID that will be requested in TCP-AO header from the peer to sign their segments with. At this moment the life of ao_info ends in tcp_v4_destroy_sock(). Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/linux/sockptr.h | 23 ++ include/net/tcp.h | 3 + include/net/tcp_ao.h | 17 +- include/uapi/linux/tcp.h | 46 +++ net/ipv4/Makefile | 1 + net/ipv4/tcp.c | 17 + net/ipv4/tcp_ao.c | 794 +++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 10 +- net/ipv6/Makefile | 1 + net/ipv6/tcp_ao.c | 19 + net/ipv6/tcp_ipv6.c | 39 +- 11 files changed, 952 insertions(+), 18 deletions(-) create mode 100644 net/ipv4/tcp_ao.c create mode 100644 net/ipv6/tcp_ao.c diff --git a/include/linux/sockptr.h b/include/linux/sockptr.h index bae5e2369b4f..307961b41541 100644 --- a/include/linux/sockptr.h +++ b/include/linux/sockptr.h @@ -55,6 +55,29 @@ static inline int copy_from_sockptr(void *dst, sockptr_t src, size_t size) return copy_from_sockptr_offset(dst, src, 0, size); } +static inline int copy_struct_from_sockptr(void *dst, size_t ksize, + sockptr_t src, size_t usize) +{ + size_t size = min(ksize, usize); + size_t rest = max(ksize, usize) - size; + + if (!sockptr_is_kernel(src)) + return copy_struct_from_user(dst, ksize, src.user, size); + + if (usize < ksize) { + memset(dst + size, 0, rest); + } else if (usize > ksize) { + char *p = src.kernel; + + while (rest--) { + if (*p++) + return -E2BIG; + } + } + memcpy(dst, src.kernel, size); + return 0; +} + static inline int copy_to_sockptr_offset(sockptr_t dst, size_t offset, const void *src, size_t size) { diff --git a/include/net/tcp.h b/include/net/tcp.h index 200e4b2e3898..6c5dbfeb6121 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2173,6 +2173,9 @@ struct tcp_sock_af_ops { sockptr_t optval, int optlen); #endif +#ifdef CONFIG_TCP_AO + int (*ao_parse)(struct sock *sk, int optname, sockptr_t optval, int optlen); +#endif }; struct tcp_request_sock_ops { diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index af76e1c47bea..a81e40fd255a 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -81,10 +81,25 @@ struct tcp_ao_info { */ struct tcp_ao_key *current_key; struct tcp_ao_key *rnext_key; - u32 flags; + u32 ao_required :1, + __unused :31; __be32 lisn; __be32 risn; struct rcu_head rcu; }; +#ifdef CONFIG_TCP_AO +int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family, + sockptr_t optval, int optlen); +void tcp_ao_destroy_sock(struct sock *sk); +/* ipv4 specific functions */ +int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); +/* ipv6 specific functions */ +int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); +#else +static inline void tcp_ao_destroy_sock(struct sock *sk) +{ +} +#endif + #endif /* _TCP_AO_H */ diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 74a1267baac5..fa49f03e62fe 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -129,6 +129,9 @@ enum { #define TCP_TX_DELAY 37 /* delay outgoing packets by XX usec */ +#define TCP_AO_ADD_KEY 38 /* Add/Set MKT */ +#define TCP_AO_DEL_KEY 39 /* Delete MKT */ +#define TCP_AO_INFO 40 /* Modify TCP-AO per-socket options */ #define TCP_REPAIR_ON 1 #define TCP_REPAIR_OFF 0 @@ -363,6 +366,49 @@ struct tcp_diag_md5sig { #define TCP_AO_MAXKEYLEN 80 +#define TCP_AO_KEYF_IFINDEX (1 << 0) /* L3 ifindex for VRF */ + +struct tcp_ao_add { /* setsockopt(TCP_AO_ADD_KEY) */ + struct __kernel_sockaddr_storage addr; /* peer's address for the key */ + char alg_name[64]; /* crypto hash algorithm to use */ + __s32 ifindex; /* L3 dev index for VRF */ + __u32 set_current :1, /* set key as Current_key at once */ + set_rnext :1, /* request it from peer with RNext_key */ + reserved :30; /* must be 0 */ + __u16 reserved2; /* padding, must be 0 */ + __u8 prefix; /* peer's address prefix */ + __u8 sndid; /* SendID for outgoing segments */ + __u8 rcvid; /* RecvID to match for incoming seg */ + __u8 maclen; /* length of authentication code (hash) */ + __u8 keyflags; /* see TCP_AO_KEYF_ */ + __u8 keylen; /* length of ::key */ + __u8 key[TCP_AO_MAXKEYLEN]; +} __attribute__((aligned(8))); + +struct tcp_ao_del { /* setsockopt(TCP_AO_DEL_KEY) */ + struct __kernel_sockaddr_storage addr; /* peer's address for the key */ + __s32 ifindex; /* L3 dev index for VRF */ + __u32 set_current :1, /* corresponding ::current_key */ + set_rnext :1, /* corresponding ::rnext */ + reserved :30; /* must be 0 */ + __u16 reserved2; /* padding, must be 0 */ + __u8 prefix; /* peer's address prefix */ + __u8 sndid; /* SendID for outgoing segments */ + __u8 rcvid; /* RecvID to match for incoming seg */ + __u8 current_key; /* KeyID to set as Current_key */ + __u8 rnext; /* KeyID to set as Rnext_key */ + __u8 keyflags; /* see TCP_AO_KEYF_ */ +} __attribute__((aligned(8))); + +struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */ + __u32 set_current :1, /* corresponding ::current_key */ + set_rnext :1, /* corresponding ::rnext */ + ao_required :1, /* don't accept non-AO connects */ + reserved :29; /* must be 0 */ + __u8 current_key; /* KeyID to set as Current_key */ + __u8 rnext; /* KeyID to set as Rnext_key */ +} __attribute__((aligned(8))); + /* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */ #define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1 diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index cd760793cfcb..e144a02a6a61 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -69,6 +69,7 @@ obj-$(CONFIG_NETLABEL) += cipso_ipv4.o obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \ xfrm4_output.o xfrm4_protocol.o +obj-$(CONFIG_TCP_AO) += tcp_ao.o ifeq ($(CONFIG_BPF_JIT),y) obj-$(CONFIG_BPF_SYSCALL) += bpf_tcp_ca.o diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 4db46389a4bd..0e3372265cf4 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3591,6 +3591,23 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, __tcp_sock_set_quickack(sk, val); break; +#ifdef CONFIG_TCP_AO + case TCP_AO_ADD_KEY: + case TCP_AO_DEL_KEY: + case TCP_AO_INFO: { + /* If this is the first TCP-AO setsockopt() on the socket, + * sk_state has to be LISTEN or CLOSE + */ + if (((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) || + rcu_dereference_protected(tcp_sk(sk)->ao_info, + lockdep_sock_is_held(sk))) + err = tp->af_specific->ao_parse(sk, optname, optval, + optlen); + else + err = -EISCONN; + break; + } +#endif #ifdef CONFIG_TCP_MD5SIG case TCP_MD5SIG: case TCP_MD5SIG_EXT: diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c new file mode 100644 index 000000000000..3c2d005a37ce --- /dev/null +++ b/net/ipv4/tcp_ao.c @@ -0,0 +1,794 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * INET An implementation of the TCP Authentication Option (TCP-AO). + * See RFC5925. + * + * Authors: Dmitry Safonov + * Francesco Ruggeri + * Salam Noureddine + */ +#define pr_fmt(fmt) "TCP: " fmt + +#include +#include +#include + +#include +#include + +/* Optimized version of tcp_ao_do_lookup(): only for sockets for which + * it's known that the keys in ao_info are matching peer's + * family/address/VRF/etc. + */ +static struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao, + int sndid, int rcvid) +{ + struct tcp_ao_key *key; + + hlist_for_each_entry_rcu(key, &ao->head, node) { + if ((sndid >= 0 && key->sndid != sndid) || + (rcvid >= 0 && key->rcvid != rcvid)) + continue; + return key; + } + + return NULL; +} + +static int ipv4_prefix_cmp(const struct in_addr *addr1, + const struct in_addr *addr2, + unsigned int prefixlen) +{ + __be32 mask = inet_make_mask(prefixlen); + __be32 a1 = addr1->s_addr & mask; + __be32 a2 = addr2->s_addr & mask; + + if (a1 == a2) + return 0; + return memcmp(&a1, &a2, sizeof(a1)); +} + +static int __tcp_ao_key_cmp(const struct tcp_ao_key *key, + const union tcp_ao_addr *addr, u8 prefixlen, + int family, int sndid, int rcvid) +{ + if (sndid >= 0 && key->sndid != sndid) + return (key->sndid > sndid) ? 1 : -1; + if (rcvid >= 0 && key->rcvid != rcvid) + return (key->rcvid > rcvid) ? 1 : -1; + + if (family == AF_UNSPEC) + return 0; + if (key->family != family) + return (key->family > family) ? 1 : -1; + + if (family == AF_INET) { + if (ntohl(key->addr.a4.s_addr) == INADDR_ANY) + return 0; + if (ntohl(addr->a4.s_addr) == INADDR_ANY) + return 0; + return ipv4_prefix_cmp(&key->addr.a4, &addr->a4, prefixlen); +#if IS_ENABLED(CONFIG_IPV6) + } else { + if (ipv6_addr_any(&key->addr.a6) || ipv6_addr_any(&addr->a6)) + return 0; + if (ipv6_prefix_equal(&key->addr.a6, &addr->a6, prefixlen)) + return 0; + return memcmp(&key->addr.a6, &addr->a6, sizeof(addr->a6)); +#endif + } + return -1; +} + +static int tcp_ao_key_cmp(const struct tcp_ao_key *key, + const union tcp_ao_addr *addr, u8 prefixlen, + int family, int sndid, int rcvid) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (family == AF_INET6 && ipv6_addr_v4mapped(&addr->a6)) { + __be32 addr4 = addr->a6.s6_addr32[3]; + + return __tcp_ao_key_cmp(key, (union tcp_ao_addr *)&addr4, + prefixlen, AF_INET, sndid, rcvid); + } +#endif + return __tcp_ao_key_cmp(key, addr, prefixlen, family, sndid, rcvid); +} + +static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk, + const union tcp_ao_addr *addr, int family, u8 prefix, + int sndid, int rcvid) +{ + struct tcp_ao_key *key; + struct tcp_ao_info *ao; + + ao = rcu_dereference_check(tcp_sk(sk)->ao_info, + lockdep_sock_is_held(sk)); + if (!ao) + return NULL; + + hlist_for_each_entry_rcu(key, &ao->head, node) { + u8 prefixlen = min(prefix, key->prefixlen); + + if (!tcp_ao_key_cmp(key, addr, prefixlen, family, sndid, rcvid)) + return key; + } + return NULL; +} + +static struct tcp_ao_info *tcp_ao_alloc_info(gfp_t flags) +{ + struct tcp_ao_info *ao; + + ao = kzalloc(sizeof(*ao), flags); + if (!ao) + return NULL; + INIT_HLIST_HEAD(&ao->head); + + return ao; +} + +static void tcp_ao_link_mkt(struct tcp_ao_info *ao, struct tcp_ao_key *mkt) +{ + hlist_add_head_rcu(&mkt->node, &ao->head); +} + +static void tcp_ao_key_free_rcu(struct rcu_head *head) +{ + struct tcp_ao_key *key = container_of(head, struct tcp_ao_key, rcu); + + tcp_sigpool_release(key->tcp_sigpool_id); + kfree_sensitive(key); +} + +void tcp_ao_destroy_sock(struct sock *sk) +{ + struct tcp_ao_info *ao; + struct tcp_ao_key *key; + struct hlist_node *n; + + ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, 1); + tcp_sk(sk)->ao_info = NULL; + + if (!ao) + return; + + hlist_for_each_entry_safe(key, n, &ao->head, node) { + hlist_del_rcu(&key->node); + atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc); + call_rcu(&key->rcu, tcp_ao_key_free_rcu); + } + + kfree_rcu(ao, rcu); +} + +static bool tcp_ao_can_set_current_rnext(struct sock *sk) +{ + /* There aren't current/rnext keys on TCP_LISTEN sockets */ + if (sk->sk_state == TCP_LISTEN) + return false; + return true; +} + +static int tcp_ao_verify_ipv4(struct sock *sk, struct tcp_ao_add *cmd, + union tcp_ao_addr **addr) +{ + struct sockaddr_in *sin = (struct sockaddr_in *)&cmd->addr; + struct inet_sock *inet = inet_sk(sk); + + if (sin->sin_family != AF_INET) + return -EINVAL; + + /* Currently matching is not performed on port (or port ranges) */ + if (sin->sin_port != 0) + return -EINVAL; + + /* Check prefix and trailing 0's in addr */ + if (cmd->prefix != 0) { + __be32 mask; + + if (ntohl(sin->sin_addr.s_addr) == INADDR_ANY) + return -EINVAL; + if (cmd->prefix > 32) + return -EINVAL; + + mask = inet_make_mask(cmd->prefix); + if (sin->sin_addr.s_addr & ~mask) + return -EINVAL; + + /* Check that MKT address is consistent with socket */ + if (ntohl(inet->inet_daddr) != INADDR_ANY && + (inet->inet_daddr & mask) != sin->sin_addr.s_addr) + return -EINVAL; + } else { + if (ntohl(sin->sin_addr.s_addr) != INADDR_ANY) + return -EINVAL; + } + + *addr = (union tcp_ao_addr *)&sin->sin_addr; + return 0; +} + +static int tcp_ao_parse_crypto(struct tcp_ao_add *cmd, struct tcp_ao_key *key) +{ + unsigned int syn_tcp_option_space; + bool is_kdf_aes_128_cmac = false; + struct crypto_ahash *tfm; + struct tcp_sigpool hp; + void *tmp_key = NULL; + int err; + + /* RFC5926, 3.1.1.2. KDF_AES_128_CMAC */ + if (!strcmp("cmac(aes128)", cmd->alg_name)) { + strscpy(cmd->alg_name, "cmac(aes)", sizeof(cmd->alg_name)); + is_kdf_aes_128_cmac = (cmd->keylen != 16); + tmp_key = kmalloc(cmd->keylen, GFP_KERNEL); + if (!tmp_key) + return -ENOMEM; + } + + key->maclen = cmd->maclen ?: 12; /* 12 is the default in RFC5925 */ + + /* Check: maclen + tcp-ao header <= (MAX_TCP_OPTION_SPACE - mss + * - tstamp - wscale - sackperm), + * see tcp_syn_options(), tcp_synack_options(), commit 33ad798c924b. + * + * In order to allow D-SACK with TCP-AO, the header size should be: + * (MAX_TCP_OPTION_SPACE - TCPOLEN_TSTAMP_ALIGNED + * - TCPOLEN_SACK_BASE_ALIGNED + * - 2 * TCPOLEN_SACK_PERBLOCK) = 8 (maclen = 4), + * see tcp_established_options(). + * + * RFC5925, 2.2: + * Typical MACs are 96-128 bits (12-16 bytes), but any length + * that fits in the header of the segment being authenticated + * is allowed. + * + * RFC5925, 7.6: + * TCP-AO continues to consume 16 bytes in non-SYN segments, + * leaving a total of 24 bytes for other options, of which + * the timestamp consumes 10. This leaves 14 bytes, of which 10 + * are used for a single SACK block. When two SACK blocks are used, + * such as to handle D-SACK, a smaller TCP-AO MAC would be required + * to make room for the additional SACK block (i.e., to leave 18 + * bytes for the D-SACK variant of the SACK option) [RFC2883]. + * Note that D-SACK is not supportable in TCP MD5 in the presence + * of timestamps, because TCP MD5’s MAC length is fixed and too + * large to leave sufficient option space. + */ + syn_tcp_option_space = MAX_TCP_OPTION_SPACE; + syn_tcp_option_space -= TCPOLEN_TSTAMP_ALIGNED; + syn_tcp_option_space -= TCPOLEN_WSCALE_ALIGNED; + syn_tcp_option_space -= TCPOLEN_SACKPERM_ALIGNED; + if (tcp_ao_len(key) > syn_tcp_option_space) { + err = -EMSGSIZE; + goto err_kfree; + } + + key->keylen = cmd->keylen; + memcpy(key->key, cmd->key, cmd->keylen); + + err = tcp_sigpool_start(key->tcp_sigpool_id, &hp); + if (err) + goto err_kfree; + + tfm = crypto_ahash_reqtfm(hp.req); + if (is_kdf_aes_128_cmac) { + void *scratch = hp.scratch; + struct scatterlist sg; + + memcpy(tmp_key, cmd->key, cmd->keylen); + sg_init_one(&sg, tmp_key, cmd->keylen); + + /* Using zero-key of 16 bytes as described in RFC5926 */ + memset(scratch, 0, 16); + err = crypto_ahash_setkey(tfm, scratch, 16); + if (err) + goto err_pool_end; + + err = crypto_ahash_init(hp.req); + if (err) + goto err_pool_end; + + ahash_request_set_crypt(hp.req, &sg, key->key, cmd->keylen); + err = crypto_ahash_update(hp.req); + if (err) + goto err_pool_end; + + err |= crypto_ahash_final(hp.req); + if (err) + goto err_pool_end; + key->keylen = 16; + } + + err = crypto_ahash_setkey(tfm, key->key, key->keylen); + if (err) + goto err_pool_end; + + tcp_sigpool_end(&hp); + kfree_sensitive(tmp_key); + + if (tcp_ao_maclen(key) > key->digest_size) + return -EINVAL; + + return 0; + +err_pool_end: + tcp_sigpool_end(&hp); +err_kfree: + kfree_sensitive(tmp_key); + return err; +} + +#if IS_ENABLED(CONFIG_IPV6) +static int tcp_ao_verify_ipv6(struct sock *sk, struct tcp_ao_add *cmd, + union tcp_ao_addr **paddr, + unsigned short int *family) +{ + struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd->addr; + struct in6_addr *addr = &sin6->sin6_addr; + u8 prefix = cmd->prefix; + + if (sin6->sin6_family != AF_INET6) + return -EINVAL; + + /* Currently matching is not performed on port (or port ranges) */ + if (sin6->sin6_port != 0) + return -EINVAL; + + /* Check prefix and trailing 0's in addr */ + if (cmd->prefix != 0 && ipv6_addr_v4mapped(addr)) { + __be32 addr4 = addr->s6_addr32[3]; + __be32 mask; + + if (prefix > 32 || ntohl(addr4) == INADDR_ANY) + return -EINVAL; + + mask = inet_make_mask(prefix); + if (addr4 & ~mask) + return -EINVAL; + + /* Check that MKT address is consistent with socket */ + if (!ipv6_addr_any(&sk->sk_v6_daddr)) { + __be32 daddr4 = sk->sk_v6_daddr.s6_addr32[3]; + + if (!ipv6_addr_v4mapped(&sk->sk_v6_daddr)) + return -EINVAL; + if ((daddr4 & mask) != addr4) + return -EINVAL; + } + + *paddr = (union tcp_ao_addr *)&addr->s6_addr32[3]; + *family = AF_INET; + return 0; + } else if (cmd->prefix != 0) { + struct in6_addr pfx; + + if (ipv6_addr_any(addr) || prefix > 128) + return -EINVAL; + + ipv6_addr_prefix(&pfx, addr, prefix); + if (ipv6_addr_cmp(&pfx, addr)) + return -EINVAL; + + /* Check that MKT address is consistent with socket */ + if (!ipv6_addr_any(&sk->sk_v6_daddr) && + !ipv6_prefix_equal(&sk->sk_v6_daddr, addr, prefix)) + + return -EINVAL; + } else { + if (!ipv6_addr_any(addr)) + return -EINVAL; + } + + *paddr = (union tcp_ao_addr *)addr; + return 0; +} +#else +static int tcp_ao_verify_ipv6(struct sock *sk, struct tcp_ao_add *cmd, + union tcp_ao_addr **paddr, + unsigned short int *family) +{ + return -EOPNOTSUPP; +} +#endif + +static struct tcp_ao_info *setsockopt_ao_info(struct sock *sk) +{ + if (sk_fullsock(sk)) { + return rcu_dereference_protected(tcp_sk(sk)->ao_info, + lockdep_sock_is_held(sk)); + } + return ERR_PTR(-ESOCKTNOSUPPORT); +} + +#define TCP_AO_KEYF_ALL (0) + +static struct tcp_ao_key *tcp_ao_key_alloc(struct sock *sk, + struct tcp_ao_add *cmd) +{ + const char *algo = cmd->alg_name; + unsigned int digest_size; + struct crypto_ahash *tfm; + struct tcp_ao_key *key; + struct tcp_sigpool hp; + int err, pool_id; + size_t size; + + /* Force null-termination of alg_name */ + cmd->alg_name[ARRAY_SIZE(cmd->alg_name) - 1] = '\0'; + + /* RFC5926, 3.1.1.2. KDF_AES_128_CMAC */ + if (!strcmp("cmac(aes128)", algo)) + algo = "cmac(aes)"; + + /* Full TCP header (th->doff << 2) should fit into scratch area, + * see tcp_ao_hash_header(). + */ + pool_id = tcp_sigpool_alloc_ahash(algo, 60); + if (pool_id < 0) + return ERR_PTR(pool_id); + + err = tcp_sigpool_start(pool_id, &hp); + if (err) + goto err_free_pool; + + tfm = crypto_ahash_reqtfm(hp.req); + if (crypto_ahash_alignmask(tfm) > TCP_AO_KEY_ALIGN) { + err = -EOPNOTSUPP; + goto err_pool_end; + } + digest_size = crypto_ahash_digestsize(tfm); + tcp_sigpool_end(&hp); + + size = sizeof(struct tcp_ao_key) + (digest_size << 1); + key = sock_kmalloc(sk, size, GFP_KERNEL); + if (!key) { + err = -ENOMEM; + goto err_free_pool; + } + + key->tcp_sigpool_id = pool_id; + key->digest_size = digest_size; + return key; + +err_pool_end: + tcp_sigpool_end(&hp); +err_free_pool: + tcp_sigpool_release(pool_id); + return ERR_PTR(err); +} + +static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, + sockptr_t optval, int optlen) +{ + struct tcp_ao_info *ao_info; + union tcp_ao_addr *addr; + struct tcp_ao_key *key; + struct tcp_ao_add cmd; + bool first = false; + int ret; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + ret = copy_struct_from_sockptr(&cmd, sizeof(cmd), optval, optlen); + if (ret) + return ret; + + if (cmd.keylen > TCP_AO_MAXKEYLEN) + return -EINVAL; + + if (cmd.reserved != 0 || cmd.reserved2 != 0) + return -EINVAL; + + if (family == AF_INET) + ret = tcp_ao_verify_ipv4(sk, &cmd, &addr); + else + ret = tcp_ao_verify_ipv6(sk, &cmd, &addr, &family); + if (ret) + return ret; + + if (cmd.keyflags & ~TCP_AO_KEYF_ALL) + return -EINVAL; + + if (cmd.set_current || cmd.set_rnext) { + if (!tcp_ao_can_set_current_rnext(sk)) + return -EINVAL; + } + + ao_info = setsockopt_ao_info(sk); + if (IS_ERR(ao_info)) + return PTR_ERR(ao_info); + + if (!ao_info) { + ao_info = tcp_ao_alloc_info(GFP_KERNEL); + if (!ao_info) + return -ENOMEM; + first = true; + } else { + /* Check that neither RecvID nor SendID match any + * existing key for the peer, RFC5925 3.1: + * > The IDs of MKTs MUST NOT overlap where their + * > TCP connection identifiers overlap. + */ + if (__tcp_ao_do_lookup(sk, addr, family, + cmd.prefix, -1, cmd.rcvid)) + return -EEXIST; + if (__tcp_ao_do_lookup(sk, addr, family, + cmd.prefix, cmd.sndid, -1)) + return -EEXIST; + } + + key = tcp_ao_key_alloc(sk, &cmd); + if (IS_ERR(key)) { + ret = PTR_ERR(key); + goto err_free_ao; + } + + INIT_HLIST_NODE(&key->node); + memcpy(&key->addr, addr, (family == AF_INET) ? sizeof(struct in_addr) : + sizeof(struct in6_addr)); + key->prefixlen = cmd.prefix; + key->family = family; + key->keyflags = cmd.keyflags; + key->sndid = cmd.sndid; + key->rcvid = cmd.rcvid; + + ret = tcp_ao_parse_crypto(&cmd, key); + if (ret < 0) + goto err_free_sock; + + tcp_ao_link_mkt(ao_info, key); + if (first) { + sk_gso_disable(sk); + rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info); + } + + if (cmd.set_current) + WRITE_ONCE(ao_info->current_key, key); + if (cmd.set_rnext) + WRITE_ONCE(ao_info->rnext_key, key); + return 0; + +err_free_sock: + atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc); + tcp_sigpool_release(key->tcp_sigpool_id); + kfree_sensitive(key); +err_free_ao: + if (first) + kfree(ao_info); + return ret; +} + +static int tcp_ao_delete_key(struct sock *sk, struct tcp_ao_info *ao_info, + struct tcp_ao_key *key, + struct tcp_ao_key *new_current, + struct tcp_ao_key *new_rnext) +{ + int err; + + hlist_del_rcu(&key->node); + + /* At this moment another CPU could have looked this key up + * while it was unlinked from the list. Wait for RCU grace period, + * after which the key is off-list and can't be looked up again; + * the rx path [just before RCU came] might have used it and set it + * as current_key (very unlikely). + */ + synchronize_rcu(); + if (new_current) + WRITE_ONCE(ao_info->current_key, new_current); + if (new_rnext) + WRITE_ONCE(ao_info->rnext_key, new_rnext); + + if (unlikely(READ_ONCE(ao_info->current_key) == key || + READ_ONCE(ao_info->rnext_key) == key)) { + err = -EBUSY; + goto add_key; + } + + atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc); + call_rcu(&key->rcu, tcp_ao_key_free_rcu); + + return 0; +add_key: + hlist_add_head_rcu(&key->node, &ao_info->head); + return err; +} + +static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family, + sockptr_t optval, int optlen) +{ + struct tcp_ao_key *key, *new_current = NULL, *new_rnext = NULL; + struct tcp_ao_info *ao_info; + union tcp_ao_addr *addr; + struct tcp_ao_del cmd; + int addr_len; + __u8 prefix; + u16 port; + int err; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + err = copy_struct_from_sockptr(&cmd, sizeof(cmd), optval, optlen); + if (err) + return err; + + if (cmd.reserved != 0 || cmd.reserved2 != 0) + return -EINVAL; + + if (cmd.set_current || cmd.set_rnext) { + if (!tcp_ao_can_set_current_rnext(sk)) + return -EINVAL; + } + + ao_info = setsockopt_ao_info(sk); + if (IS_ERR(ao_info)) + return PTR_ERR(ao_info); + if (!ao_info) + return -ENOENT; + + /* For sockets in TCP_CLOSED it's possible set keys that aren't + * matching the future peer (address/VRF/etc), + * tcp_ao_connect_init() will choose a correct matching MKT + * if there's any. + */ + if (cmd.set_current) { + new_current = tcp_ao_established_key(ao_info, cmd.current_key, -1); + if (!new_current) + return -ENOENT; + } + if (cmd.set_rnext) { + new_rnext = tcp_ao_established_key(ao_info, -1, cmd.rnext); + if (!new_rnext) + return -ENOENT; + } + + if (family == AF_INET) { + struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.addr; + + addr = (union tcp_ao_addr *)&sin->sin_addr; + addr_len = sizeof(struct in_addr); + port = ntohs(sin->sin_port); + } else { + struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.addr; + struct in6_addr *addr6 = &sin6->sin6_addr; + + if (ipv6_addr_v4mapped(addr6)) { + addr = (union tcp_ao_addr *)&addr6->s6_addr32[3]; + addr_len = sizeof(struct in_addr); + family = AF_INET; + } else { + addr = (union tcp_ao_addr *)addr6; + addr_len = sizeof(struct in6_addr); + } + port = ntohs(sin6->sin6_port); + } + prefix = cmd.prefix; + + /* Currently matching is not performed on port (or port ranges) */ + if (port != 0) + return -EINVAL; + + /* We could choose random present key here for current/rnext + * but that's less predictable. Let's be strict and don't + * allow removing a key that's in use. RFC5925 doesn't + * specify how-to coordinate key removal, but says: + * "It is presumed that an MKT affecting a particular + * connection cannot be destroyed during an active connection" + */ + hlist_for_each_entry_rcu(key, &ao_info->head, node) { + if (cmd.sndid != key->sndid || + cmd.rcvid != key->rcvid) + continue; + + if (family != key->family || + prefix != key->prefixlen || + memcmp(addr, &key->addr, addr_len)) + continue; + + if (key == new_current || key == new_rnext) + continue; + + return tcp_ao_delete_key(sk, ao_info, key, + new_current, new_rnext); + } + return -ENOENT; +} + +static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, + sockptr_t optval, int optlen) +{ + struct tcp_ao_key *new_current = NULL, *new_rnext = NULL; + struct tcp_ao_info *ao_info; + struct tcp_ao_info_opt cmd; + bool first = false; + int err; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + err = copy_struct_from_sockptr(&cmd, sizeof(cmd), optval, optlen); + if (err) + return err; + + if (cmd.set_current || cmd.set_rnext) { + if (!tcp_ao_can_set_current_rnext(sk)) + return -EINVAL; + } + + if (cmd.reserved != 0) + return -EINVAL; + + ao_info = setsockopt_ao_info(sk); + if (IS_ERR(ao_info)) + return PTR_ERR(ao_info); + if (!ao_info) { + ao_info = tcp_ao_alloc_info(GFP_KERNEL); + if (!ao_info) + return -ENOMEM; + first = true; + } + + /* For sockets in TCP_CLOSED it's possible set keys that aren't + * matching the future peer (address/port/VRF/etc), + * tcp_ao_connect_init() will choose a correct matching MKT + * if there's any. + */ + if (cmd.set_current) { + new_current = tcp_ao_established_key(ao_info, cmd.current_key, -1); + if (!new_current) { + err = -ENOENT; + goto out; + } + } + if (cmd.set_rnext) { + new_rnext = tcp_ao_established_key(ao_info, -1, cmd.rnext); + if (!new_rnext) { + err = -ENOENT; + goto out; + } + } + + ao_info->ao_required = cmd.ao_required; + if (new_current) + WRITE_ONCE(ao_info->current_key, new_current); + if (new_rnext) + WRITE_ONCE(ao_info->rnext_key, new_rnext); + if (first) { + sk_gso_disable(sk); + rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info); + } + return 0; +out: + if (first) + kfree(ao_info); + return err; +} + +int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family, + sockptr_t optval, int optlen) +{ + if (WARN_ON_ONCE(family != AF_INET && family != AF_INET6)) + return -EAFNOSUPPORT; + + switch (cmd) { + case TCP_AO_ADD_KEY: + return tcp_ao_add_cmd(sk, family, optval, optlen); + case TCP_AO_DEL_KEY: + return tcp_ao_del_cmd(sk, family, optval, optlen); + case TCP_AO_INFO: + return tcp_ao_info_cmd(sk, family, optval, optlen); + default: + WARN_ON_ONCE(1); + return -EINVAL; + } +} + +int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen) +{ + return tcp_parse_ao(sk, cmd, AF_INET, optval, optlen); +} + diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 7d81e90b6f5c..a4746c27f38a 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2271,11 +2271,16 @@ const struct inet_connection_sock_af_ops ipv4_specific = { }; EXPORT_SYMBOL(ipv4_specific); -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { +#ifdef CONFIG_TCP_MD5SIG .md5_lookup = tcp_v4_md5_lookup, .calc_md5_hash = tcp_v4_md5_hash_skb, .md5_parse = tcp_v4_parse_md5_keys, +#endif +#ifdef CONFIG_TCP_AO + .ao_parse = tcp_v4_parse_ao, +#endif }; #endif @@ -2290,7 +2295,7 @@ static int tcp_v4_init_sock(struct sock *sk) icsk->icsk_af_ops = &ipv4_specific; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) tcp_sk(sk)->af_specific = &tcp_sock_ipv4_specific; #endif @@ -2341,6 +2346,7 @@ void tcp_v4_destroy_sock(struct sock *sk) rcu_assign_pointer(tp->md5sig_info, NULL); } #endif + tcp_ao_destroy_sock(sk); /* Clean up a referenced TCP bind bucket. */ if (inet_csk(sk)->icsk_bind_hash) diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile index 3036a45e8a1e..d283c59df4c1 100644 --- a/net/ipv6/Makefile +++ b/net/ipv6/Makefile @@ -52,4 +52,5 @@ obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o ifneq ($(CONFIG_IPV6),) obj-$(CONFIG_NET_UDP_TUNNEL) += ip6_udp_tunnel.o obj-y += mcast_snoop.o +obj-$(CONFIG_TCP_AO) += tcp_ao.o endif diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c new file mode 100644 index 000000000000..049ddbabe049 --- /dev/null +++ b/net/ipv6/tcp_ao.c @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * INET An implementation of the TCP Authentication Option (TCP-AO). + * See RFC5925. + * + * Authors: Dmitry Safonov + * Francesco Ruggeri + * Salam Noureddine + */ +#include + +#include +#include + +int tcp_v6_parse_ao(struct sock *sk, int cmd, + sockptr_t optval, int optlen) +{ + return tcp_parse_ao(sk, cmd, AF_INET6, optval, optlen); +} diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 9bdc4e979943..c266ec76d763 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -76,16 +76,9 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb); static const struct inet_connection_sock_af_ops ipv6_mapped; const struct inet_connection_sock_af_ops ipv6_specific; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) static const struct tcp_sock_af_ops tcp_sock_ipv6_specific; static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific; -#else -static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, - const struct in6_addr *addr, - int l3index) -{ - return NULL; -} #endif /* Helper returning the inet6 address from a given tcp socket. @@ -239,7 +232,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, if (sk_is_mptcp(sk)) mptcpv6_handle_mapped(sk, true); sk->sk_backlog_rcv = tcp_v4_do_rcv; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) tp->af_specific = &tcp_sock_ipv6_mapped_specific; #endif @@ -252,7 +245,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, if (sk_is_mptcp(sk)) mptcpv6_handle_mapped(sk, false); sk->sk_backlog_rcv = tcp_v6_do_rcv; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) tp->af_specific = &tcp_sock_ipv6_specific; #endif goto failure; @@ -769,7 +762,13 @@ static int tcp_v6_md5_hash_skb(char *md5_hash, memset(md5_hash, 0, 16); return 1; } - +#else /* CONFIG_TCP_MD5SIG */ +static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, + const struct in6_addr *addr, + int l3index) +{ + return NULL; +} #endif static void tcp_v6_init_req(struct request_sock *req, @@ -1228,7 +1227,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * if (sk_is_mptcp(newsk)) mptcpv6_handle_mapped(newsk, true); newsk->sk_backlog_rcv = tcp_v4_do_rcv; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) newtp->af_specific = &tcp_sock_ipv6_mapped_specific; #endif @@ -1897,11 +1896,16 @@ const struct inet_connection_sock_af_ops ipv6_specific = { .mtu_reduced = tcp_v6_mtu_reduced, }; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { +#ifdef CONFIG_TCP_MD5SIG .md5_lookup = tcp_v6_md5_lookup, .calc_md5_hash = tcp_v6_md5_hash_skb, .md5_parse = tcp_v6_parse_md5_keys, +#endif +#ifdef CONFIG_TCP_AO + .ao_parse = tcp_v6_parse_ao, +#endif }; #endif @@ -1923,11 +1927,16 @@ static const struct inet_connection_sock_af_ops ipv6_mapped = { .mtu_reduced = tcp_v4_mtu_reduced, }; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { +#ifdef CONFIG_TCP_MD5SIG .md5_lookup = tcp_v4_md5_lookup, .calc_md5_hash = tcp_v4_md5_hash_skb, .md5_parse = tcp_v6_parse_md5_keys, +#endif +#ifdef CONFIG_TCP_AO + .ao_parse = tcp_v6_parse_ao, +#endif }; #endif @@ -1942,7 +1951,7 @@ static int tcp_v6_init_sock(struct sock *sk) icsk->icsk_af_ops = &ipv6_specific; -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) tcp_sk(sk)->af_specific = &tcp_sock_ipv6_specific; #endif From patchwork Mon Oct 23 19:21:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157065 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1505059vqx; Mon, 23 Oct 2023 12:24:31 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGSN9PmWOmD3uqNngu5Q/g7gzgWXQoQwFUprRPL4uDcxUzJhdW6e49///tz5nUqAQ3/G6gB X-Received: by 2002:a05:6359:1a8d:b0:168:e8e6:b91f with SMTP id rv13-20020a0563591a8d00b00168e8e6b91fmr1550818rwb.18.1698089071321; Mon, 23 Oct 2023 12:24:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089071; cv=none; d=google.com; s=arc-20160816; b=QIvOcJzfvy8TZfjmspnCJQ/CtOS7YJ6vYiNDl9MG7H37gQzI+8gt35hloy3kpzNTQ1 jxJASloHfeTctH/TOzUhubcdBijDnDaCTUblzMv6gBbyjNXMTtFNck6rn8CCTxNymI/F 4vC+1ZujF+78YoY75FHL2bc6kWH+BskSiVDCKZJ0JxOE7wxYY0gD/iKzP0JKUuEYMDlg 9GNqV5/tB7Q97wRiHlANecq2AYDkHCq6GvVY0OMOvhpOAe88+5vmFnmTUNi/YgJ35nuq FkQaxVgD9Sx6EoGQbgQPbntvywoSJz8iTpnIOd94ohIqX66gYy0/kNJyKiqLXB5gf9wK vJuw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=8LBWoeyMNeqLWs0mfq03ZhUmDYJFCicsjvbit/aSeKo=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=Esm+MlnN3yH8LJ4BeJ8HuTz9bC6vULxD6Il1XFvKBQFreN477K/zUjA0KjBijM8x/D GJ5HXQCwu6m0JlvCot0/vt6BK+KQ9lU/CdncDTgro6H7TgzEhd7VxVfTmnhJ3881wKfg MzXv0L5yahSSxd+KrvHpru88Lce2xwmNwmtrNkxxbQZRl7wHCi0Y3O849JDVUu6eJs76 vO480UwK+IJttlfbYO914nysFwve+G/t49Fk9MD8FSd5aFHk8orGF4WZpexSRbIc3msZ 2PojEHd60HW7cEUzSF0vISHtAnfpSHuXmu8ab6lIB57adzlVSHw9JZ4NdQYgQzymoEq7 E/Og== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=csb0lX0r; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from morse.vger.email (morse.vger.email. [2620:137:e000::3:1]) by mx.google.com with ESMTPS id s132-20020a63778a000000b005ab74d58e01si6626082pgc.439.2023.10.23.12.24.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:24:31 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) client-ip=2620:137:e000::3:1; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=csb0lX0r; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 4C42580B0285; Mon, 23 Oct 2023 12:24:21 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230386AbjJWTWm (ORCPT + 27 others); Mon, 23 Oct 2023 15:22:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37426 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229557AbjJWTWk (ORCPT ); Mon, 23 Oct 2023 15:22:40 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7366394 for ; Mon, 23 Oct 2023 12:22:35 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id 5b1f17b1804b1-4083f613272so31354655e9.1 for ; Mon, 23 Oct 2023 12:22:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088954; x=1698693754; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8LBWoeyMNeqLWs0mfq03ZhUmDYJFCicsjvbit/aSeKo=; b=csb0lX0rHeZmtKcu0iY0Ek9OLA0CxIIfyGF+yv21qN3Llig8XKUgjAKCd11oyboUyL kVFp5kG/lvFqklE05g1l82RaJA9tyI+Bz8ziql/9ZUJU6DX7iICX2PUS8RWkOFU3ymy/ OPiXLKLI5QIKw98XTynD8MEdtiV7dP67rsygBl92avVIboD92waAY0HBfIIbSa+W94gP +A0Vbns4jFOrXN69ijLMKwPMNne7GHjgju+RdhTRJQQ/X/tAS2xuk55dJZxUTFxf3dr1 3RBrC1z1tUmEVni+UdUm2XDTho6Fam5Soeob6xFsziwQIFkU6VIR246J9GGr7iY2cYyu I8Uw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088954; x=1698693754; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8LBWoeyMNeqLWs0mfq03ZhUmDYJFCicsjvbit/aSeKo=; b=s2k7hrRsfu+JEzDa9iJ8QBj9R1hSto09dvQAxq0vjrCKgbS8iMdSfSTZZ8rulqgQ2w VCaCY9jm9Us27ABtLXzT7gFn/V4vCIqg1zU1ncM9N6Fera7NTd+0nSshFGhFnVgUqQQb p3ZTls+h2mcWdQgRDx7VXbAFpmObOJRz3JzlSSbBIIXXCP43ui33hN2KRhExtgsbURn6 LrqYgFU3mqy85oxT0Tp10GJ4lL1ps4wtxyuWmJegNs1lpj4EGpbUJGp6qVK9Lwj4ZU52 HtqIMLDuqTKjyejZowQRREfqKkTwo2+0WPivcduztVVOsVG9L1Bvv6Yr0xSINB+0kLgO hjLw== X-Gm-Message-State: AOJu0Ywy3yCXugwEUUcZgNHlEmhRL8Hxi3u7F4MuydCmb6ARxss6np8T W8LbxmERqZNeIDI0hPpSIkR5WA== X-Received: by 2002:a05:600c:3582:b0:408:3804:2a20 with SMTP id p2-20020a05600c358200b0040838042a20mr7988579wmq.22.1698088953873; Mon, 23 Oct 2023 12:22:33 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:33 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 04/23] net/tcp: Prevent TCP-MD5 with TCP-AO being set Date: Mon, 23 Oct 2023 20:21:56 +0100 Message-ID: <20231023192217.426455-5-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:24:21 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575446350459020 X-GMAIL-MSGID: 1780575446350459020 Be as conservative as possible: if there is TCP-MD5 key for a given peer regardless of L3 interface - don't allow setting TCP-AO key for the same peer. According to RFC5925, TCP-AO is supposed to replace TCP-MD5 and there can't be any switch between both on any connected tuple. Later it can be relaxed, if there's a use, but in the beginning restrict any intersection. Note: it's still should be possible to set both TCP-MD5 and TCP-AO keys on a listening socket for *different* peers. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 43 +++++++++++++++++++++++++++++++++++++-- include/net/tcp_ao.h | 13 ++++++++++++ net/ipv4/tcp_ao.c | 47 +++++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 14 ++++++++++--- net/ipv4/tcp_output.c | 47 +++++++++++++++++++++++++++++++++++++++++++ net/ipv6/tcp_ao.c | 17 ++++++++++++++++ net/ipv6/tcp_ipv6.c | 26 ++++++++++++++++++++---- 7 files changed, 198 insertions(+), 9 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 6c5dbfeb6121..c2de8b8e540a 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1776,6 +1776,7 @@ int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr, int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family, u8 prefixlen, int l3index, u8 flags); +void tcp_clear_md5_list(struct sock *sk); struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, const struct sock *addr_sk); @@ -1784,14 +1785,23 @@ struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk, extern struct static_key_false_deferred tcp_md5_needed; struct tcp_md5sig_key *__tcp_md5_do_lookup(const struct sock *sk, int l3index, const union tcp_md5_addr *addr, - int family); + int family, bool any_l3index); static inline struct tcp_md5sig_key * tcp_md5_do_lookup(const struct sock *sk, int l3index, const union tcp_md5_addr *addr, int family) { if (!static_branch_unlikely(&tcp_md5_needed.key)) return NULL; - return __tcp_md5_do_lookup(sk, l3index, addr, family); + return __tcp_md5_do_lookup(sk, l3index, addr, family, false); +} + +static inline struct tcp_md5sig_key * +tcp_md5_do_lookup_any_l3index(const struct sock *sk, + const union tcp_md5_addr *addr, int family) +{ + if (!static_branch_unlikely(&tcp_md5_needed.key)) + return NULL; + return __tcp_md5_do_lookup(sk, 0, addr, family, true); } enum skb_drop_reason @@ -1809,6 +1819,13 @@ tcp_md5_do_lookup(const struct sock *sk, int l3index, return NULL; } +static inline struct tcp_md5sig_key * +tcp_md5_do_lookup_any_l3index(const struct sock *sk, + const union tcp_md5_addr *addr, int family) +{ + return NULL; +} + static inline enum skb_drop_reason tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, const void *saddr, const void *daddr, @@ -2175,6 +2192,9 @@ struct tcp_sock_af_ops { #endif #ifdef CONFIG_TCP_AO int (*ao_parse)(struct sock *sk, int optname, sockptr_t optval, int optlen); + struct tcp_ao_key *(*ao_lookup)(const struct sock *sk, + struct sock *addr_sk, + int sndid, int rcvid); #endif }; @@ -2586,4 +2606,23 @@ static inline u64 tcp_transmit_time(const struct sock *sk) return 0; } +static inline bool tcp_ao_required(struct sock *sk, const void *saddr, + int family) +{ +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao_info; + struct tcp_ao_key *ao_key; + + ao_info = rcu_dereference_check(tcp_sk(sk)->ao_info, + lockdep_sock_is_held(sk)); + if (!ao_info) + return false; + + ao_key = tcp_ao_do_lookup(sk, saddr, family, -1, -1); + if (ao_info->ao_required || ao_key) + return true; +#endif + return false; +} + #endif /* _TCP_H */ diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index a81e40fd255a..3c7f576376f9 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -92,11 +92,24 @@ struct tcp_ao_info { int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family, sockptr_t optval, int optlen); void tcp_ao_destroy_sock(struct sock *sk); +struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, + const union tcp_ao_addr *addr, + int family, int sndid, int rcvid); /* ipv4 specific functions */ int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); +struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, + int sndid, int rcvid); /* ipv6 specific functions */ int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); +struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, + struct sock *addr_sk, int sndid, int rcvid); #else +static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, + const union tcp_ao_addr *addr, int family, int sndid, int rcvid) +{ + return NULL; +} + static inline void tcp_ao_destroy_sock(struct sock *sk) { } diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 3c2d005a37ce..ee23356101f4 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -116,6 +116,13 @@ static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk, return NULL; } +struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, + const union tcp_ao_addr *addr, + int family, int sndid, int rcvid) +{ + return __tcp_ao_do_lookup(sk, addr, family, U8_MAX, sndid, rcvid); +} + static struct tcp_ao_info *tcp_ao_alloc_info(gfp_t flags) { struct tcp_ao_info *ao; @@ -162,6 +169,14 @@ void tcp_ao_destroy_sock(struct sock *sk) kfree_rcu(ao, rcu); } +struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, + int sndid, int rcvid) +{ + union tcp_ao_addr *addr = (union tcp_ao_addr *)&addr_sk->sk_daddr; + + return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid); +} + static bool tcp_ao_can_set_current_rnext(struct sock *sk) { /* There aren't current/rnext keys on TCP_LISTEN sockets */ @@ -497,6 +512,10 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, return -EINVAL; } + /* Don't allow keys for peers that have a matching TCP-MD5 key */ + if (tcp_md5_do_lookup_any_l3index(sk, addr, family)) + return -EKEYREJECTED; + ao_info = setsockopt_ao_info(sk); if (IS_ERR(ao_info)) return PTR_ERR(ao_info); @@ -698,6 +717,31 @@ static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family, return -ENOENT; } +/* cmd.ao_required makes a socket TCP-AO only. + * Don't allow any md5 keys for any l3intf on the socket together with it. + * Restricting it early in setsockopt() removes a check for + * ao_info->ao_required on inbound tcp segment fast-path. + */ +static int tcp_ao_required_verify(struct sock *sk) +{ +#ifdef CONFIG_TCP_MD5SIG + const struct tcp_md5sig_info *md5sig; + + if (!static_branch_unlikely(&tcp_md5_needed.key)) + return 0; + + md5sig = rcu_dereference_check(tcp_sk(sk)->md5sig_info, + lockdep_sock_is_held(sk)); + if (!md5sig) + return 0; + + if (rcu_dereference_check(hlist_first_rcu(&md5sig->head), + lockdep_sock_is_held(sk))) + return 1; +#endif + return 0; +} + static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, sockptr_t optval, int optlen) { @@ -732,6 +776,9 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, first = true; } + if (cmd.ao_required && tcp_ao_required_verify(sk)) + return -EKEYREJECTED; + /* For sockets in TCP_CLOSED it's possible set keys that aren't * matching the future peer (address/port/VRF/etc), * tcp_ao_connect_init() will choose a correct matching MKT diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index a4746c27f38a..698e58a3ccec 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1082,7 +1082,7 @@ static bool better_md5_match(struct tcp_md5sig_key *old, struct tcp_md5sig_key * /* Find the Key structure for an address. */ struct tcp_md5sig_key *__tcp_md5_do_lookup(const struct sock *sk, int l3index, const union tcp_md5_addr *addr, - int family) + int family, bool any_l3index) { const struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *key; @@ -1101,7 +1101,8 @@ struct tcp_md5sig_key *__tcp_md5_do_lookup(const struct sock *sk, int l3index, lockdep_sock_is_held(sk)) { if (key->family != family) continue; - if (key->flags & TCP_MD5SIG_FLAG_IFINDEX && key->l3index != l3index) + if (!any_l3index && key->flags & TCP_MD5SIG_FLAG_IFINDEX && + key->l3index != l3index) continue; if (family == AF_INET) { mask = inet_make_mask(key->prefixlen); @@ -1313,7 +1314,7 @@ int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family, } EXPORT_SYMBOL(tcp_md5_do_del); -static void tcp_clear_md5_list(struct sock *sk) +void tcp_clear_md5_list(struct sock *sk) { struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *key; @@ -1383,6 +1384,12 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) return -EINVAL; + /* Don't allow keys for peers that have a matching TCP-AO key. + * See the comment in tcp_ao_add_cmd() + */ + if (tcp_ao_required(sk, addr, AF_INET)) + return -EKEYREJECTED; + return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags, cmd.tcpm_key, cmd.tcpm_keylen); } @@ -2279,6 +2286,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { .md5_parse = tcp_v4_parse_md5_keys, #endif #ifdef CONFIG_TCP_AO + .ao_lookup = tcp_v4_ao_lookup, .ao_parse = tcp_v4_parse_ao, #endif }; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 2866ccbccde0..4db906ac8e48 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3949,6 +3949,53 @@ int tcp_connect(struct sock *sk) tcp_call_bpf(sk, BPF_SOCK_OPS_TCP_CONNECT_CB, 0, NULL); +#if defined(CONFIG_TCP_MD5SIG) && defined(CONFIG_TCP_AO) + /* Has to be checked late, after setting daddr/saddr/ops. + * Return error if the peer has both a md5 and a tcp-ao key + * configured as this is ambiguous. + */ + if (unlikely(rcu_dereference_protected(tp->md5sig_info, + lockdep_sock_is_held(sk)))) { + bool needs_ao = !!tp->af_specific->ao_lookup(sk, sk, -1, -1); + bool needs_md5 = !!tp->af_specific->md5_lookup(sk, sk); + struct tcp_ao_info *ao_info; + + ao_info = rcu_dereference_check(tp->ao_info, + lockdep_sock_is_held(sk)); + if (ao_info) { + /* This is an extra check: tcp_ao_required() in + * tcp_v{4,6}_parse_md5_keys() should prevent adding + * md5 keys on ao_required socket. + */ + needs_ao |= ao_info->ao_required; + WARN_ON_ONCE(ao_info->ao_required && needs_md5); + } + if (needs_md5 && needs_ao) + return -EKEYREJECTED; + + /* If we have a matching md5 key and no matching tcp-ao key + * then free up ao_info if allocated. + */ + if (needs_md5) { + tcp_ao_destroy_sock(sk); + } else if (needs_ao) { + tcp_clear_md5_list(sk); + kfree(rcu_replace_pointer(tp->md5sig_info, NULL, + lockdep_sock_is_held(sk))); + } + } +#endif +#ifdef CONFIG_TCP_AO + if (unlikely(rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held(sk)))) { + /* Don't allow connecting if ao is configured but no + * matching key is found. + */ + if (!tp->af_specific->ao_lookup(sk, sk, -1, -1)) + return -EKEYREJECTED; + } +#endif + if (inet_csk(sk)->icsk_af_ops->rebuild_header(sk)) return -EHOSTUNREACH; /* Routing failure or similar. */ diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c index 049ddbabe049..0640acaee67b 100644 --- a/net/ipv6/tcp_ao.c +++ b/net/ipv6/tcp_ao.c @@ -12,6 +12,23 @@ #include #include +static struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk, + const struct in6_addr *addr, + int sndid, int rcvid) +{ + return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)addr, AF_INET6, + sndid, rcvid); +} + +struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, + struct sock *addr_sk, + int sndid, int rcvid) +{ + struct in6_addr *addr = &addr_sk->sk_v6_daddr; + + return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid); +} + int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen) { diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index c266ec76d763..d1a62bbc8a3b 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -600,6 +600,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, { struct tcp_md5sig cmd; struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.tcpm_addr; + union tcp_ao_addr *addr; int l3index = 0; u8 prefixlen; u8 flags; @@ -654,13 +655,28 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN) return -EINVAL; - if (ipv6_addr_v4mapped(&sin6->sin6_addr)) - return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3], + if (ipv6_addr_v4mapped(&sin6->sin6_addr)) { + addr = (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3]; + + /* Don't allow keys for peers that have a matching TCP-AO key. + * See the comment in tcp_ao_add_cmd() + */ + if (tcp_ao_required(sk, addr, AF_INET)) + return -EKEYREJECTED; + return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags, cmd.tcpm_key, cmd.tcpm_keylen); + } - return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr, - AF_INET6, prefixlen, l3index, flags, + addr = (union tcp_md5_addr *)&sin6->sin6_addr; + + /* Don't allow keys for peers that have a matching TCP-AO key. + * See the comment in tcp_ao_add_cmd() + */ + if (tcp_ao_required(sk, addr, AF_INET6)) + return -EKEYREJECTED; + + return tcp_md5_do_add(sk, addr, AF_INET6, prefixlen, l3index, flags, cmd.tcpm_key, cmd.tcpm_keylen); } @@ -1904,6 +1920,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { .md5_parse = tcp_v6_parse_md5_keys, #endif #ifdef CONFIG_TCP_AO + .ao_lookup = tcp_v6_ao_lookup, .ao_parse = tcp_v6_parse_ao, #endif }; @@ -1935,6 +1952,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { .md5_parse = tcp_v6_parse_md5_keys, #endif #ifdef CONFIG_TCP_AO + .ao_lookup = tcp_v6_ao_lookup, .ao_parse = tcp_v6_parse_ao, #endif }; From patchwork Mon Oct 23 19:21:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157053 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504393vqx; Mon, 23 Oct 2023 12:23:04 -0700 (PDT) X-Google-Smtp-Source: AGHT+IH+P6tiDav4TSoxppY76NrJAS1pLNDzMu84EWt/m3xtuEoZCjzffRop4Q6vJZVSWm4hwuWg X-Received: by 2002:a17:902:f542:b0:1ca:1c89:9acf with SMTP id h2-20020a170902f54200b001ca1c899acfmr9085282plf.53.1698088984007; Mon, 23 Oct 2023 12:23:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698088983; cv=none; d=google.com; s=arc-20160816; b=g595mt6AU5TjNM6CFlAFTzxmBO2keEPJlLSBJKYHWx3s2p98Bf2uoxx1TrLLUbzFfD p7A+/BWsOcVIUDTn+5XUGB/PoRS3bX8k7oC14gUX3uru+JswlkOKOcBBjlgGIMQ1MUSK 0kjiJTO77YlOLZK2Zn2xkzlMYXechGC/Nyp0zsE0KjKkshA+6T9IRr838/x8sq4Iokhl 7pTQafZgztWpktBQ24t8DoHicutE/mq/Ngfza8k+mVT3tVyEFBiI18BUP7J+oDUJm2pW s0DsvtTrEDekguqHCh2rWbs8Tpjp1zXbRX1k7+wz5OQqfG73TrN2UwJn0psGLoWWc+fX Dg+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=tZEh8mx7QRMwhC/04gavLEz/S6pSyMASGVXqpQQDh3M=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=yWohJDgHoKwvkK2SC7qa9qHQsoVP9wfjq1OT2VoQ7ryZlh35mjCGk7z3C5Or8Xi0rE MDuVW5sjagbqzBxrrnLwLTEncd+oE5MBkh31p5/AGX6vV9ir1Xo5h1z6+OTaxn9d64gz hvkSlThi2rssTTTeARiSD83LEMvRtsy8ZX9G1o3JwdWUwlXlOeP0t7UdU2gL4C5IWpoP frg/L+hqAXpKD3+RwM2jggAQ0tS1pWZ7Yy3bU5oaWXteRELRQrm4kyHkEq+gjsYu+Zkr m9YUcZR+pR35l046WsYgDHi7EeG3PxMkLMqTTC59iFNAwVuUbD91zuj+UyjEnsE50Y9O LRWw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=i0mjK371; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id c8-20020a170903234800b001c0727658c3si7333139plh.259.2023.10.23.12.23.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:03 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=i0mjK371; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id EEE1580BBC79; Mon, 23 Oct 2023 12:23:02 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231506AbjJWTWs (ORCPT + 27 others); Mon, 23 Oct 2023 15:22:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37434 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229628AbjJWTWk (ORCPT ); Mon, 23 Oct 2023 15:22:40 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 978C9E6 for ; Mon, 23 Oct 2023 12:22:37 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id 5b1f17b1804b1-4083f61322fso28768975e9.1 for ; Mon, 23 Oct 2023 12:22:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088956; x=1698693756; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tZEh8mx7QRMwhC/04gavLEz/S6pSyMASGVXqpQQDh3M=; b=i0mjK371bcmKRPJIgodPfLp7k9I6yO3+vuZlDd50qD8AIBepYQCRi9cnjSzBC9q6M8 2iOR368/fRtI+OlPIimzbKRMwxEbpmtSegk7Xvo/ofeVRQRtP3Hrzxr8o1yd+edpu/2t Yc1zR4mJbnp0gupvpEziDMCsxwZy2XqJY+Q9asj6OhRY8e3x7b+czBqVAqB96uGyOTNL ytPZ2VEGilUsX0FKQMOQMDzHCu7ToffVvcfsKQVkTB7cq5n9WauZXIXCT1gYdfJbsFWk Z+Q1/qXPkWQAOFwpnp3GhLz06KZ/OvCc/kWDiQJ8Ym5brnpvI59K/XPvawyGfd2wmf85 BbEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088956; x=1698693756; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tZEh8mx7QRMwhC/04gavLEz/S6pSyMASGVXqpQQDh3M=; b=wtRC9roFqJmmUPkZEIb5nm8rKPWdwj1+5dFFbWfF5wTBE5NgOcnjMaqMOAR17mKoZp +aw4iKlbO8r0xGJvc4RKz6a+ClPwSOiFL+3GNB8eFvEN+dndpVeaG4UjgJhb+PQcJmiN sRCePactkiYxLg5Un3iOCUslX5P6+QR/ufSeUpC/c9/6wTqRLcYbSDRWskqd2/SyzMDB FXkWO7ExkFyIOoFxpqcuhJntUytDnK7mWOuynFXr7Z+lCZ68k6uSkzWafohDWh1VCDU3 NGrYyDLaXPpHBGuH6iSUn35qDj6xcX9BTani1prcrytle6rd4J3em9owxys0LQ4XLCaD ewxg== X-Gm-Message-State: AOJu0YymJPwCAiRXun97hzPcFQgGQ9ONUC96dZs6CiSWlqgn50ibck5i GfFA/N92cajVw78jW7Flge+xrA== X-Received: by 2002:a05:600c:4446:b0:407:7e5f:ffb9 with SMTP id v6-20020a05600c444600b004077e5fffb9mr7338298wmn.9.1698088955961; Mon, 23 Oct 2023 12:22:35 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:35 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 05/23] net/tcp: Calculate TCP-AO traffic keys Date: Mon, 23 Oct 2023 20:21:57 +0100 Message-ID: <20231023192217.426455-6-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:03 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575354340757626 X-GMAIL-MSGID: 1780575354340757626 Add traffic key calculation the way it's described in RFC5926. Wire it up to tcp_finish_connect() and cache the new keys straight away on already established TCP connections. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 3 + include/net/tcp_ao.h | 51 ++++++++++- net/ipv4/tcp_ao.c | 206 ++++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_input.c | 2 + net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_output.c | 2 + net/ipv6/tcp_ao.c | 50 ++++++++++ net/ipv6/tcp_ipv6.c | 1 + 8 files changed, 314 insertions(+), 2 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index c2de8b8e540a..4bf13c429a2f 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2195,6 +2195,9 @@ struct tcp_sock_af_ops { struct tcp_ao_key *(*ao_lookup)(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid); + int (*ao_calc_key_sk)(struct tcp_ao_key *mkt, u8 *key, + const struct sock *sk, + __be32 sisn, __be32 disn, bool send); #endif }; diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 3c7f576376f9..b021a811511b 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -89,8 +89,32 @@ struct tcp_ao_info { }; #ifdef CONFIG_TCP_AO +/* TCP-AO structures and functions */ + +struct tcp4_ao_context { + __be32 saddr; + __be32 daddr; + __be16 sport; + __be16 dport; + __be32 sisn; + __be32 disn; +}; + +struct tcp6_ao_context { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + __be32 sisn; + __be32 disn; +}; + +struct tcp_sigpool; + int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family, sockptr_t optval, int optlen); +int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, + unsigned int len, struct tcp_sigpool *hp); void tcp_ao_destroy_sock(struct sock *sk); struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, @@ -99,11 +123,22 @@ struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid); +int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, + const struct sock *sk, + __be32 sisn, __be32 disn, bool send); /* ipv6 specific functions */ -int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); +int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, + const struct sock *sk, __be32 sisn, + __be32 disn, bool send); struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid); -#else +int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); +void tcp_ao_established(struct sock *sk); +void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb); +void tcp_ao_connect_init(struct sock *sk); + +#else /* CONFIG_TCP_AO */ + static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid) { @@ -113,6 +148,18 @@ static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, static inline void tcp_ao_destroy_sock(struct sock *sk) { } + +static inline void tcp_ao_established(struct sock *sk) +{ +} + +static inline void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb) +{ +} + +static inline void tcp_ao_connect_init(struct sock *sk) +{ +} #endif #endif /* _TCP_AO_H */ diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index ee23356101f4..e478341fc336 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -16,6 +16,34 @@ #include #include +int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, + unsigned int len, struct tcp_sigpool *hp) +{ + struct scatterlist sg; + int ret; + + if (crypto_ahash_setkey(crypto_ahash_reqtfm(hp->req), + mkt->key, mkt->keylen)) + goto clear_hash; + + ret = crypto_ahash_init(hp->req); + if (ret) + goto clear_hash; + + sg_init_one(&sg, ctx, len); + ahash_request_set_crypt(hp->req, &sg, key, len); + crypto_ahash_update(hp->req); + + ret = crypto_ahash_final(hp->req); + if (ret) + goto clear_hash; + + return 0; +clear_hash: + memset(key, 0, tcp_ao_digest_size(mkt)); + return 1; +} + /* Optimized version of tcp_ao_do_lookup(): only for sockets for which * it's known that the keys in ao_info are matching peer's * family/address/VRF/etc. @@ -169,6 +197,71 @@ void tcp_ao_destroy_sock(struct sock *sk) kfree_rcu(ao, rcu); } +/* 4 tuple and ISNs are expected in NBO */ +static int tcp_v4_ao_calc_key(struct tcp_ao_key *mkt, u8 *key, + __be32 saddr, __be32 daddr, + __be16 sport, __be16 dport, + __be32 sisn, __be32 disn) +{ + /* See RFC5926 3.1.1 */ + struct kdf_input_block { + u8 counter; + u8 label[6]; + struct tcp4_ao_context ctx; + __be16 outlen; + } __packed * tmp; + struct tcp_sigpool hp; + int err; + + err = tcp_sigpool_start(mkt->tcp_sigpool_id, &hp); + if (err) + return err; + + tmp = hp.scratch; + tmp->counter = 1; + memcpy(tmp->label, "TCP-AO", 6); + tmp->ctx.saddr = saddr; + tmp->ctx.daddr = daddr; + tmp->ctx.sport = sport; + tmp->ctx.dport = dport; + tmp->ctx.sisn = sisn; + tmp->ctx.disn = disn; + tmp->outlen = htons(tcp_ao_digest_size(mkt) * 8); /* in bits */ + + err = tcp_ao_calc_traffic_key(mkt, key, tmp, sizeof(*tmp), &hp); + tcp_sigpool_end(&hp); + + return err; +} + +int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, + const struct sock *sk, + __be32 sisn, __be32 disn, bool send) +{ + if (send) + return tcp_v4_ao_calc_key(mkt, key, sk->sk_rcv_saddr, + sk->sk_daddr, htons(sk->sk_num), + sk->sk_dport, sisn, disn); + else + return tcp_v4_ao_calc_key(mkt, key, sk->sk_daddr, + sk->sk_rcv_saddr, sk->sk_dport, + htons(sk->sk_num), disn, sisn); +} + +static int tcp_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, + const struct sock *sk, + __be32 sisn, __be32 disn, bool send) +{ + if (mkt->family == AF_INET) + return tcp_v4_ao_calc_key_sk(mkt, key, sk, sisn, disn, send); +#if IS_ENABLED(CONFIG_IPV6) + else if (mkt->family == AF_INET6) + return tcp_v6_ao_calc_key_sk(mkt, key, sk, sisn, disn, send); +#endif + else + return -EOPNOTSUPP; +} + struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid) { @@ -177,6 +270,113 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid); } +static int tcp_ao_cache_traffic_keys(const struct sock *sk, + struct tcp_ao_info *ao, + struct tcp_ao_key *ao_key) +{ + u8 *traffic_key = snd_other_key(ao_key); + int ret; + + ret = tcp_ao_calc_key_sk(ao_key, traffic_key, sk, + ao->lisn, ao->risn, true); + if (ret) + return ret; + + traffic_key = rcv_other_key(ao_key); + ret = tcp_ao_calc_key_sk(ao_key, traffic_key, sk, + ao->lisn, ao->risn, false); + return ret; +} + +void tcp_ao_connect_init(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_ao_info *ao_info; + union tcp_ao_addr *addr; + struct tcp_ao_key *key; + int family; + + ao_info = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held(sk)); + if (!ao_info) + return; + + /* Remove all keys that don't match the peer */ + family = sk->sk_family; + if (family == AF_INET) + addr = (union tcp_ao_addr *)&sk->sk_daddr; +#if IS_ENABLED(CONFIG_IPV6) + else if (family == AF_INET6) + addr = (union tcp_ao_addr *)&sk->sk_v6_daddr; +#endif + else + return; + + hlist_for_each_entry_rcu(key, &ao_info->head, node) { + if (!tcp_ao_key_cmp(key, addr, key->prefixlen, family, -1, -1)) + continue; + + if (key == ao_info->current_key) + ao_info->current_key = NULL; + if (key == ao_info->rnext_key) + ao_info->rnext_key = NULL; + hlist_del_rcu(&key->node); + atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc); + call_rcu(&key->rcu, tcp_ao_key_free_rcu); + } + + key = tp->af_specific->ao_lookup(sk, sk, -1, -1); + if (key) { + /* if current_key or rnext_key were not provided, + * use the first key matching the peer + */ + if (!ao_info->current_key) + ao_info->current_key = key; + if (!ao_info->rnext_key) + ao_info->rnext_key = key; + tp->tcp_header_len += tcp_ao_len(key); + + ao_info->lisn = htonl(tp->write_seq); + } else { + /* Can't happen: tcp_connect() verifies that there's + * at least one tcp-ao key that matches the remote peer. + */ + WARN_ON_ONCE(1); + rcu_assign_pointer(tp->ao_info, NULL); + kfree(ao_info); + } +} + +void tcp_ao_established(struct sock *sk) +{ + struct tcp_ao_info *ao; + struct tcp_ao_key *key; + + ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, + lockdep_sock_is_held(sk)); + if (!ao) + return; + + hlist_for_each_entry_rcu(key, &ao->head, node) + tcp_ao_cache_traffic_keys(sk, ao, key); +} + +void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb) +{ + struct tcp_ao_info *ao; + struct tcp_ao_key *key; + + ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, + lockdep_sock_is_held(sk)); + if (!ao) + return; + + WRITE_ONCE(ao->risn, tcp_hdr(skb)->seq); + + hlist_for_each_entry_rcu(key, &ao->head, node) + tcp_ao_cache_traffic_keys(sk, ao, key); +} + static bool tcp_ao_can_set_current_rnext(struct sock *sk) { /* There aren't current/rnext keys on TCP_LISTEN sockets */ @@ -558,6 +758,12 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, if (ret < 0) goto err_free_sock; + /* Change this condition if we allow adding keys in states + * like close_wait, syn_sent or fin_wait... + */ + if (sk->sk_state == TCP_ESTABLISHED) + tcp_ao_cache_traffic_keys(sk, ao_info, key); + tcp_ao_link_mkt(ao_info, key); if (first) { sk_gso_disable(sk); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 18b858597af4..64795a79a4a4 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6150,6 +6150,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb) struct tcp_sock *tp = tcp_sk(sk); struct inet_connection_sock *icsk = inet_csk(sk); + tcp_ao_finish_connect(sk, skb); tcp_set_state(sk, TCP_ESTABLISHED); icsk->icsk_ack.lrcvtime = tcp_jiffies32; @@ -6647,6 +6648,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) skb); WRITE_ONCE(tp->copied_seq, tp->rcv_nxt); } + tcp_ao_established(sk); smp_mb(); tcp_set_state(sk, TCP_ESTABLISHED); sk->sk_state_change(sk); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 698e58a3ccec..3c73b5829377 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2288,6 +2288,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { #ifdef CONFIG_TCP_AO .ao_lookup = tcp_v4_ao_lookup, .ao_parse = tcp_v4_parse_ao, + .ao_calc_key_sk = tcp_v4_ao_calc_key_sk, #endif }; #endif diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 4db906ac8e48..930700217a03 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3767,6 +3767,8 @@ static void tcp_connect_init(struct sock *sk) if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps)) tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED; + tcp_ao_connect_init(sk); + /* If user gave his TCP_MAXSEG, record it to clamp */ if (tp->rx_opt.user_mss) tp->rx_opt.mss_clamp = tp->rx_opt.user_mss; diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c index 0640acaee67b..9ab594fadbd9 100644 --- a/net/ipv6/tcp_ao.c +++ b/net/ipv6/tcp_ao.c @@ -12,6 +12,56 @@ #include #include +static int tcp_v6_ao_calc_key(struct tcp_ao_key *mkt, u8 *key, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + __be16 sport, __be16 dport, + __be32 sisn, __be32 disn) +{ + struct kdf_input_block { + u8 counter; + u8 label[6]; + struct tcp6_ao_context ctx; + __be16 outlen; + } __packed * tmp; + struct tcp_sigpool hp; + int err; + + err = tcp_sigpool_start(mkt->tcp_sigpool_id, &hp); + if (err) + return err; + + tmp = hp.scratch; + tmp->counter = 1; + memcpy(tmp->label, "TCP-AO", 6); + tmp->ctx.saddr = *saddr; + tmp->ctx.daddr = *daddr; + tmp->ctx.sport = sport; + tmp->ctx.dport = dport; + tmp->ctx.sisn = sisn; + tmp->ctx.disn = disn; + tmp->outlen = htons(tcp_ao_digest_size(mkt) * 8); /* in bits */ + + err = tcp_ao_calc_traffic_key(mkt, key, tmp, sizeof(*tmp), &hp); + tcp_sigpool_end(&hp); + + return err; +} + +int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, + const struct sock *sk, __be32 sisn, + __be32 disn, bool send) +{ + if (send) + return tcp_v6_ao_calc_key(mkt, key, &sk->sk_v6_rcv_saddr, + &sk->sk_v6_daddr, htons(sk->sk_num), + sk->sk_dport, sisn, disn); + else + return tcp_v6_ao_calc_key(mkt, key, &sk->sk_v6_daddr, + &sk->sk_v6_rcv_saddr, sk->sk_dport, + htons(sk->sk_num), disn, sisn); +} + static struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk, const struct in6_addr *addr, int sndid, int rcvid) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index d1a62bbc8a3b..2eda55cbd9ff 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1922,6 +1922,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { #ifdef CONFIG_TCP_AO .ao_lookup = tcp_v6_ao_lookup, .ao_parse = tcp_v6_parse_ao, + .ao_calc_key_sk = tcp_v6_ao_calc_key_sk, #endif }; #endif From patchwork Mon Oct 23 19:21:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157054 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504443vqx; Mon, 23 Oct 2023 12:23:10 -0700 (PDT) X-Google-Smtp-Source: AGHT+IF3ZuJHuaMbzrUdHv3FEiXq0D5tA3WRWpgvkydqnll6vUBRnmHb8LpUKNGUP/G4zFgTQbrc X-Received: by 2002:a05:6a00:1705:b0:6be:bc50:b5e6 with SMTP id h5-20020a056a00170500b006bebc50b5e6mr10085275pfc.0.1698088989751; Mon, 23 Oct 2023 12:23:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698088989; cv=none; d=google.com; s=arc-20160816; b=TGhkYJbAqTKibxCK0m/4Zamn8Z5GXqikajXSlACr7KgCgImFGzFVOwTz1cIBN0jTlq V1tK/ZFTLMkPGNJMIf3ith9Fz/tJn+pnSELTqu3lBB/N0zzCmcpAPv1W/zdK8TDwkkdr RFYXxVZPwNOBwjWK3S4tg5FUuMFPvY7qzvcdFSKSR1P4MaJiaONQ0DVy5PYafgqzI81h R1ITxpSzwxMvO85e6nGYnKRhcSoO25jt53BHPHArbvjYu1qge9VWlivQG5M0PGnR3Es+ WT6N9KSKc+WgUe1FkUC3r4uhaZDIlNbQ5tFjynJqStP/RpQKjx0W/SM89WjTkU+QgEo7 q8gQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ElCBay5AXpKjlxXf7p5Mfe3CueK9385XKkU1G8CCMY8=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=fE/wvXGK6hZx4WKidCCThbF6qACtChkS1AOfMuAdQ+3YshNzERHnkf0RWe4hxnzlVH rAeQDA7kgSkclsAM8HjgFgKrD2MHkk/7F8Qm9pNaC8J+DZYAGf3/cPslJpoXCL0BpdxW klnbm0rHAD3y7+V6D2ZpgxoQQTHrF9t/ck7b0KBG9WVr7Rsmh1IrFF1FhjySU5o1AZ7+ EAHPb1oaWf+GNxCpCPdXDCocS+4cby4jUTkfW+NR9fhJIeR1aooniq1zupQNMqxISo0F sCi6d8BzS0B3KH0MzdIYH/6fT0LRFUSJay0+xfGgDYc8gnQw7d2fj/bQn95UGWBRHt3q g4EQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=kTQb1w66; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id h10-20020a056a00230a00b006bb0aafd563si7060375pfh.237.2023.10.23.12.23.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=kTQb1w66; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 374D280BBE8F; Mon, 23 Oct 2023 12:23:08 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231864AbjJWTW4 (ORCPT + 27 others); Mon, 23 Oct 2023 15:22:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231402AbjJWTWo (ORCPT ); Mon, 23 Oct 2023 15:22:44 -0400 Received: from mail-lj1-x235.google.com (mail-lj1-x235.google.com [IPv6:2a00:1450:4864:20::235]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BFF40102 for ; Mon, 23 Oct 2023 12:22:39 -0700 (PDT) Received: by mail-lj1-x235.google.com with SMTP id 38308e7fff4ca-2b9d07a8d84so48810281fa.3 for ; Mon, 23 Oct 2023 12:22:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088958; x=1698693758; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ElCBay5AXpKjlxXf7p5Mfe3CueK9385XKkU1G8CCMY8=; b=kTQb1w662S+354J5qllQl/2+CCaaepdRFL0tF6q3I08JtS9u4WFFcMIBf4hEUam+va C2l6NsluTtfH4N7D3QmC0RXJOazWuaTDAg6S2wbXWvfx9vqtbWVlkIkv367sNjK/Z3Ly fU7IyEDQOG0rcXWnUzVL8JWVca3egZvPyXkZ3EqDG9KdVnmYW/uOwAonkR2oM/qYgd73 dnD+BNptDNc0HQurC5Z27MdYV75S2irMJCI6+t1PdZC92MAQlSQb1kYTsm2I4uwbg4t8 ckGWuY7/m38H4K7pEpirL/OOn8V3deB2ITp1OtzheENopBrOmpT4jKDozkayj4ZqNLiD qaOQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088958; x=1698693758; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ElCBay5AXpKjlxXf7p5Mfe3CueK9385XKkU1G8CCMY8=; b=WJJpgdPqItt8FFPY3eElcYuPX8fi2Y++KDM8ejjDiA4ZLOngwkgGbyNJuDU0EO1/aP WKrqk0WBk1mhwpq00WQhTNpRnhoVIgRylFDvIx0L9YHG0jhTGf7XiuYb0+77SExO/dT4 VS5Fa7IMOcNbP1YDFM8aClO0DTmI3qX/UR31NX2GAIGJLLBEzG/H/ze7sH6kZh3jVRfc IxNaPe1+7Phf+qTJeqExOlHgakDriliei5uwZjszLWxFGOdoGD3TUmLNWWFvm0DcrFD+ zSLXMkDUhVKYxT2S42kwCzKBI6dJtHiXjw7t8+bR9eH1D/HEDpAQkgzyijDVZh2geAbG VlEw== X-Gm-Message-State: AOJu0Yx87l3FFeg+HN5EryrnBIC9FVAzHqJpayijCNbZnhWLRH/abhQa Q7PPnSvVb1YKe7KKyDbFHRvdMA== X-Received: by 2002:a2e:b0c7:0:b0:2bf:ff17:8122 with SMTP id g7-20020a2eb0c7000000b002bfff178122mr6842757ljl.17.1698088957943; Mon, 23 Oct 2023 12:22:37 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:37 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 06/23] net/tcp: Add TCP-AO sign to outgoing packets Date: Mon, 23 Oct 2023 20:21:58 +0100 Message-ID: <20231023192217.426455-7-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:08 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575360781439240 X-GMAIL-MSGID: 1780575360781439240 Using precalculated traffic keys, sign TCP segments as prescribed by RFC5925. Per RFC, TCP header options are included in sign calculation: "The TCP header, by default including options, and where the TCP checksum and TCP-AO MAC fields are set to zero, all in network- byte order." (5.1.3) tcp_ao_hash_header() has exclude_options parameter to optionally exclude TCP header from hash calculation, as described in RFC5925 (9.1), this is needed for interaction with middleboxes that may change "some TCP options". This is wired up to AO key flags and setsockopt() later. Similarly to TCP-MD5 hash TCP segment fragments. From this moment a user can start sending TCP-AO signed segments with one of crypto ahash algorithms from supported by Linux kernel. It can have a user-specified MAC length, to either save TCP option header space or provide higher protection using a longer signature. The inbound segments are not yet verified, TCP-AO option is ignored and they are accepted. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 64 ++++++++++++++ include/net/tcp_ao.h | 23 +++++ net/ipv4/tcp_ao.c | 199 ++++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_output.c | 112 ++++++++++++++++-------- net/ipv6/tcp_ao.c | 28 ++++++ net/ipv6/tcp_ipv6.c | 2 + 7 files changed, 391 insertions(+), 38 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 4bf13c429a2f..9a996834699d 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -195,6 +195,7 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX); #define TCPOPT_SACK 5 /* SACK Block */ #define TCPOPT_TIMESTAMP 8 /* Better RTT estimations/PAWS */ #define TCPOPT_MD5SIG 19 /* MD5 Signature (RFC2385) */ +#define TCPOPT_AO 29 /* Authentication Option (RFC5925) */ #define TCPOPT_MPTCP 30 /* Multipath TCP (RFC6824) */ #define TCPOPT_FASTOPEN 34 /* Fast open (RFC7413) */ #define TCPOPT_EXP 254 /* Experimental */ @@ -2198,6 +2199,9 @@ struct tcp_sock_af_ops { int (*ao_calc_key_sk)(struct tcp_ao_key *mkt, u8 *key, const struct sock *sk, __be32 sisn, __be32 disn, bool send); + int (*calc_ao_hash)(char *location, struct tcp_ao_key *ao, + const struct sock *sk, const struct sk_buff *skb, + const u8 *tkey, int hash_offset, u32 sne); #endif }; @@ -2251,6 +2255,66 @@ static inline __u32 cookie_init_sequence(const struct tcp_request_sock_ops *ops, } #endif +struct tcp_key { + union { + struct tcp_ao_key *ao_key; + struct tcp_md5sig_key *md5_key; + }; + enum { + TCP_KEY_NONE = 0, + TCP_KEY_MD5, + TCP_KEY_AO, + } type; +}; + +static inline void tcp_get_current_key(const struct sock *sk, + struct tcp_key *out) +{ +#if defined(CONFIG_TCP_AO) || defined(CONFIG_TCP_MD5SIG) + const struct tcp_sock *tp = tcp_sk(sk); +#endif +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; + + ao = rcu_dereference_protected(tp->ao_info, lockdep_sock_is_held(sk)); + if (ao) { + out->ao_key = READ_ONCE(ao->current_key); + out->type = TCP_KEY_AO; + return; + } +#endif +#ifdef CONFIG_TCP_MD5SIG + if (static_branch_unlikely(&tcp_md5_needed.key) && + rcu_access_pointer(tp->md5sig_info)) { + out->md5_key = tp->af_specific->md5_lookup(sk, sk); + if (out->md5_key) { + out->type = TCP_KEY_MD5; + return; + } + } +#endif + out->type = TCP_KEY_NONE; +} + +static inline bool tcp_key_is_md5(const struct tcp_key *key) +{ +#ifdef CONFIG_TCP_MD5SIG + if (static_branch_unlikely(&tcp_md5_needed.key) && + key->type == TCP_KEY_MD5) + return true; +#endif + return false; +} + +static inline bool tcp_key_is_ao(const struct tcp_key *key) +{ +#ifdef CONFIG_TCP_AO + if (key->type == TCP_KEY_AO) + return true; +#endif + return false; +} + int tcpv4_offload_init(void); void tcp_v4_init(void); diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index b021a811511b..0b86bc05d8cf 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -111,6 +111,13 @@ struct tcp6_ao_context { struct tcp_sigpool; +int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, + struct tcp_ao_key *key, struct tcphdr *th, + __u8 *hash_location); +int tcp_ao_hash_skb(unsigned short int family, + char *ao_hash, struct tcp_ao_key *key, + const struct sock *sk, const struct sk_buff *skb, + const u8 *tkey, int hash_offset, u32 sne); int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family, sockptr_t optval, int optlen); int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, @@ -126,12 +133,21 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, const struct sock *sk, __be32 sisn, __be32 disn, bool send); +int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, + const struct sock *sk, const struct sk_buff *skb, + const u8 *tkey, int hash_offset, u32 sne); /* ipv6 specific functions */ +int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp, + const struct in6_addr *daddr, + const struct in6_addr *saddr, int nbytes); int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, const struct sock *sk, __be32 sisn, __be32 disn, bool send); struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid); +int tcp_v6_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, + const struct sock *sk, const struct sk_buff *skb, + const u8 *tkey, int hash_offset, u32 sne); int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); void tcp_ao_established(struct sock *sk); void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb); @@ -139,6 +155,13 @@ void tcp_ao_connect_init(struct sock *sk); #else /* CONFIG_TCP_AO */ +static inline int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, + struct tcp_ao_key *key, struct tcphdr *th, + __u8 *hash_location) +{ + return 0; +} + static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid) { diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index e478341fc336..007f29a2531f 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -262,6 +262,171 @@ static int tcp_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, return -EOPNOTSUPP; } +static int tcp_v4_ao_hash_pseudoheader(struct tcp_sigpool *hp, + __be32 daddr, __be32 saddr, + int nbytes) +{ + struct tcp4_pseudohdr *bp; + struct scatterlist sg; + + bp = hp->scratch; + bp->saddr = saddr; + bp->daddr = daddr; + bp->pad = 0; + bp->protocol = IPPROTO_TCP; + bp->len = cpu_to_be16(nbytes); + + sg_init_one(&sg, bp, sizeof(*bp)); + ahash_request_set_crypt(hp->req, &sg, NULL, sizeof(*bp)); + return crypto_ahash_update(hp->req); +} + +static int tcp_ao_hash_pseudoheader(unsigned short int family, + const struct sock *sk, + const struct sk_buff *skb, + struct tcp_sigpool *hp, int nbytes) +{ + const struct tcphdr *th = tcp_hdr(skb); + + /* TODO: Can we rely on checksum being zero to mean outbound pkt? */ + if (!th->check) { + if (family == AF_INET) + return tcp_v4_ao_hash_pseudoheader(hp, sk->sk_daddr, + sk->sk_rcv_saddr, skb->len); +#if IS_ENABLED(CONFIG_IPV6) + else if (family == AF_INET6) + return tcp_v6_ao_hash_pseudoheader(hp, &sk->sk_v6_daddr, + &sk->sk_v6_rcv_saddr, skb->len); +#endif + else + return -EAFNOSUPPORT; + } + + if (family == AF_INET) { + const struct iphdr *iph = ip_hdr(skb); + + return tcp_v4_ao_hash_pseudoheader(hp, iph->daddr, + iph->saddr, skb->len); +#if IS_ENABLED(CONFIG_IPV6) + } else if (family == AF_INET6) { + const struct ipv6hdr *iph = ipv6_hdr(skb); + + return tcp_v6_ao_hash_pseudoheader(hp, &iph->daddr, + &iph->saddr, skb->len); +#endif + } + return -EAFNOSUPPORT; +} + +/* tcp_ao_hash_sne(struct tcp_sigpool *hp) + * @hp - used for hashing + * @sne - sne value + */ +static int tcp_ao_hash_sne(struct tcp_sigpool *hp, u32 sne) +{ + struct scatterlist sg; + __be32 *bp; + + bp = (__be32 *)hp->scratch; + *bp = htonl(sne); + + sg_init_one(&sg, bp, sizeof(*bp)); + ahash_request_set_crypt(hp->req, &sg, NULL, sizeof(*bp)); + return crypto_ahash_update(hp->req); +} + +static int tcp_ao_hash_header(struct tcp_sigpool *hp, + const struct tcphdr *th, + bool exclude_options, u8 *hash, + int hash_offset, int hash_len) +{ + int err, len = th->doff << 2; + struct scatterlist sg; + u8 *hdr = hp->scratch; + + /* We are not allowed to change tcphdr, make a local copy */ + if (exclude_options) { + len = sizeof(*th) + sizeof(struct tcp_ao_hdr) + hash_len; + memcpy(hdr, th, sizeof(*th)); + memcpy(hdr + sizeof(*th), + (u8 *)th + hash_offset - sizeof(struct tcp_ao_hdr), + sizeof(struct tcp_ao_hdr)); + memset(hdr + sizeof(*th) + sizeof(struct tcp_ao_hdr), + 0, hash_len); + ((struct tcphdr *)hdr)->check = 0; + } else { + len = th->doff << 2; + memcpy(hdr, th, len); + /* zero out tcp-ao hash */ + ((struct tcphdr *)hdr)->check = 0; + memset(hdr + hash_offset, 0, hash_len); + } + + sg_init_one(&sg, hdr, len); + ahash_request_set_crypt(hp->req, &sg, NULL, len); + err = crypto_ahash_update(hp->req); + WARN_ON_ONCE(err != 0); + return err; +} + +int tcp_ao_hash_skb(unsigned short int family, + char *ao_hash, struct tcp_ao_key *key, + const struct sock *sk, const struct sk_buff *skb, + const u8 *tkey, int hash_offset, u32 sne) +{ + const struct tcphdr *th = tcp_hdr(skb); + int tkey_len = tcp_ao_digest_size(key); + struct tcp_sigpool hp; + void *hash_buf = NULL; + + hash_buf = kmalloc(tkey_len, GFP_ATOMIC); + if (!hash_buf) + goto clear_hash_noput; + + if (tcp_sigpool_start(key->tcp_sigpool_id, &hp)) + goto clear_hash_noput; + + if (crypto_ahash_setkey(crypto_ahash_reqtfm(hp.req), tkey, tkey_len)) + goto clear_hash; + + /* For now use sha1 by default. Depends on alg in tcp_ao_key */ + if (crypto_ahash_init(hp.req)) + goto clear_hash; + + if (tcp_ao_hash_sne(&hp, sne)) + goto clear_hash; + if (tcp_ao_hash_pseudoheader(family, sk, skb, &hp, skb->len)) + goto clear_hash; + if (tcp_ao_hash_header(&hp, th, false, + ao_hash, hash_offset, tcp_ao_maclen(key))) + goto clear_hash; + if (tcp_sigpool_hash_skb_data(&hp, skb, th->doff << 2)) + goto clear_hash; + ahash_request_set_crypt(hp.req, NULL, hash_buf, 0); + if (crypto_ahash_final(hp.req)) + goto clear_hash; + + memcpy(ao_hash, hash_buf, tcp_ao_maclen(key)); + tcp_sigpool_end(&hp); + kfree(hash_buf); + return 0; + +clear_hash: + tcp_sigpool_end(&hp); +clear_hash_noput: + memset(ao_hash, 0, tcp_ao_maclen(key)); + kfree(hash_buf); + return 1; +} + +int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, + const struct sock *sk, const struct sk_buff *skb, + const u8 *tkey, int hash_offset, u32 sne) +{ + return tcp_ao_hash_skb(AF_INET, ao_hash, key, sk, skb, + tkey, hash_offset, sne); +} + struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid) { @@ -270,6 +435,40 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid); } +int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, + struct tcp_ao_key *key, struct tcphdr *th, + __u8 *hash_location) +{ + struct tcp_skb_cb *tcb = TCP_SKB_CB(skb); + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_ao_info *ao; + void *tkey_buf = NULL; + u8 *traffic_key; + + ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, + lockdep_sock_is_held(sk)); + traffic_key = snd_other_key(key); + if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) { + __be32 disn; + + if (!(tcb->tcp_flags & TCPHDR_ACK)) { + disn = 0; + tkey_buf = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC); + if (!tkey_buf) + return -ENOMEM; + traffic_key = tkey_buf; + } else { + disn = ao->risn; + } + tp->af_specific->ao_calc_key_sk(key, traffic_key, + sk, ao->lisn, disn, true); + } + tp->af_specific->calc_ao_hash(hash_location, key, sk, skb, traffic_key, + hash_location - (u8 *)th, 0); + kfree(tkey_buf); + return 0; +} + static int tcp_ao_cache_traffic_keys(const struct sock *sk, struct tcp_ao_info *ao, struct tcp_ao_key *ao_key) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 3c73b5829377..b002f6497d19 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2287,6 +2287,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = { #endif #ifdef CONFIG_TCP_AO .ao_lookup = tcp_v4_ao_lookup, + .calc_ao_hash = tcp_v4_ao_hash_skb, .ao_parse = tcp_v4_parse_ao, .ao_calc_key_sk = tcp_v4_ao_calc_key_sk, #endif diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 930700217a03..5eb81fb4d163 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -422,6 +422,7 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp) #define OPTION_FAST_OPEN_COOKIE BIT(8) #define OPTION_SMC BIT(9) #define OPTION_MPTCP BIT(10) +#define OPTION_AO BIT(11) static void smc_options_write(__be32 *ptr, u16 *options) { @@ -614,19 +615,43 @@ static void bpf_skops_write_hdr_opt(struct sock *sk, struct sk_buff *skb, * (but it may well be that other scenarios fail similarly). */ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp, - struct tcp_out_options *opts) + struct tcp_out_options *opts, + struct tcp_key *key) { __be32 *ptr = (__be32 *)(th + 1); u16 options = opts->options; /* mungable copy */ - if (unlikely(OPTION_MD5 & options)) { + if (tcp_key_is_md5(key)) { *ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); /* overload cookie hash location */ opts->hash_location = (__u8 *)ptr; ptr += 4; - } + } else if (tcp_key_is_ao(key)) { +#ifdef CONFIG_TCP_AO + struct tcp_ao_key *rnext_key; + struct tcp_ao_info *ao_info; + u8 maclen; + ao_info = rcu_dereference_check(tp->ao_info, + lockdep_sock_is_held(&tp->inet_conn.icsk_inet.sk)); + rnext_key = READ_ONCE(ao_info->rnext_key); + if (WARN_ON_ONCE(!rnext_key)) + goto out_ao; + maclen = tcp_ao_maclen(key->ao_key); + *ptr++ = htonl((TCPOPT_AO << 24) | + (tcp_ao_len(key->ao_key) << 16) | + (key->ao_key->sndid << 8) | + (rnext_key->rcvid)); + opts->hash_location = (__u8 *)ptr; + ptr += maclen / sizeof(*ptr); + if (unlikely(maclen % sizeof(*ptr))) { + memset(ptr, TCPOPT_NOP, sizeof(*ptr)); + ptr++; + } +out_ao: +#endif + } if (unlikely(opts->mss)) { *ptr++ = htonl((TCPOPT_MSS << 24) | (TCPOLEN_MSS << 16) | @@ -767,23 +792,25 @@ static void mptcp_set_option_cond(const struct request_sock *req, */ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, struct tcp_out_options *opts, - struct tcp_md5sig_key **md5) + struct tcp_key *key) { struct tcp_sock *tp = tcp_sk(sk); unsigned int remaining = MAX_TCP_OPTION_SPACE; struct tcp_fastopen_request *fastopen = tp->fastopen_req; + bool timestamps; - *md5 = NULL; -#ifdef CONFIG_TCP_MD5SIG - if (static_branch_unlikely(&tcp_md5_needed.key) && - rcu_access_pointer(tp->md5sig_info)) { - *md5 = tp->af_specific->md5_lookup(sk, sk); - if (*md5) { - opts->options |= OPTION_MD5; - remaining -= TCPOLEN_MD5SIG_ALIGNED; + /* Better than switch (key.type) as it has static branches */ + if (tcp_key_is_md5(key)) { + timestamps = false; + opts->options |= OPTION_MD5; + remaining -= TCPOLEN_MD5SIG_ALIGNED; + } else { + timestamps = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps); + if (tcp_key_is_ao(key)) { + opts->options |= OPTION_AO; + remaining -= tcp_ao_len(key->ao_key); } } -#endif /* We always get an MSS option. The option bytes which will be seen in * normal data packets should timestamps be used, must be in the MSS @@ -797,7 +824,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, opts->mss = tcp_advertise_mss(sk); remaining -= TCPOLEN_MSS_ALIGNED; - if (likely(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps) && !*md5)) { + if (likely(timestamps)) { opts->options |= OPTION_TS; opts->tsval = tcp_skb_timestamp_ts(tp->tcp_usec_ts, skb) + tp->tsoffset; opts->tsecr = tp->rx_opt.ts_recent; @@ -922,7 +949,7 @@ static unsigned int tcp_synack_options(const struct sock *sk, */ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb, struct tcp_out_options *opts, - struct tcp_md5sig_key **md5) + struct tcp_key *key) { struct tcp_sock *tp = tcp_sk(sk); unsigned int size = 0; @@ -930,17 +957,14 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb opts->options = 0; - *md5 = NULL; -#ifdef CONFIG_TCP_MD5SIG - if (static_branch_unlikely(&tcp_md5_needed.key) && - rcu_access_pointer(tp->md5sig_info)) { - *md5 = tp->af_specific->md5_lookup(sk, sk); - if (*md5) { - opts->options |= OPTION_MD5; - size += TCPOLEN_MD5SIG_ALIGNED; - } + /* Better than switch (key.type) as it has static branches */ + if (tcp_key_is_md5(key)) { + opts->options |= OPTION_MD5; + size += TCPOLEN_MD5SIG_ALIGNED; + } else if (tcp_key_is_ao(key)) { + opts->options |= OPTION_AO; + size += tcp_ao_len(key->ao_key); } -#endif if (likely(tp->rx_opt.tstamp_ok)) { opts->options |= OPTION_TS; @@ -1245,7 +1269,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, struct tcp_out_options opts; unsigned int tcp_options_size, tcp_header_size; struct sk_buff *oskb = NULL; - struct tcp_md5sig_key *md5; + struct tcp_key key; struct tcphdr *th; u64 prior_wstamp; int err; @@ -1277,11 +1301,11 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, tcb = TCP_SKB_CB(skb); memset(&opts, 0, sizeof(opts)); + tcp_get_current_key(sk, &key); if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) { - tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5); + tcp_options_size = tcp_syn_options(sk, skb, &opts, &key); } else { - tcp_options_size = tcp_established_options(sk, skb, &opts, - &md5); + tcp_options_size = tcp_established_options(sk, skb, &opts, &key); /* Force a PSH flag on all (GSO) packets to expedite GRO flush * at receiver : This slightly improve GRO performance. * Note that we do not force the PSH flag for non GSO packets, @@ -1362,16 +1386,25 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, th->window = htons(min(tp->rcv_wnd, 65535U)); } - tcp_options_write(th, tp, &opts); + tcp_options_write(th, tp, &opts, &key); + if (tcp_key_is_md5(&key)) { #ifdef CONFIG_TCP_MD5SIG - /* Calculate the MD5 hash, as we have all we need now */ - if (md5) { + /* Calculate the MD5 hash, as we have all we need now */ sk_gso_disable(sk); tp->af_specific->calc_md5_hash(opts.hash_location, - md5, sk, skb); - } + key.md5_key, sk, skb); #endif + } else if (tcp_key_is_ao(&key)) { + int err; + + err = tcp_ao_transmit_skb(sk, skb, key.ao_key, th, + opts.hash_location); + if (err) { + kfree_skb_reason(skb, SKB_DROP_REASON_NOT_SPECIFIED); + return -ENOMEM; + } + } /* BPF prog is the last one writing header option */ bpf_skops_write_hdr_opt(sk, skb, NULL, NULL, 0, &opts); @@ -1822,7 +1855,7 @@ unsigned int tcp_current_mss(struct sock *sk) u32 mss_now; unsigned int header_len; struct tcp_out_options opts; - struct tcp_md5sig_key *md5; + struct tcp_key key; mss_now = tp->mss_cache; @@ -1831,8 +1864,8 @@ unsigned int tcp_current_mss(struct sock *sk) if (mtu != inet_csk(sk)->icsk_pmtu_cookie) mss_now = tcp_sync_mss(sk, mtu); } - - header_len = tcp_established_options(sk, NULL, &opts, &md5) + + tcp_get_current_key(sk, &key); + header_len = tcp_established_options(sk, NULL, &opts, &key) + sizeof(struct tcphdr); /* The mss_cache is sized based on tp->tcp_header_len, which assumes * some common options. If this is an odd packet (because we have SACK @@ -3631,6 +3664,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, const struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *md5 = NULL; struct tcp_out_options opts; + struct tcp_key key = {}; struct sk_buff *skb; int tcp_header_size; struct tcphdr *th; @@ -3685,6 +3719,8 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, #ifdef CONFIG_TCP_MD5SIG rcu_read_lock(); md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req)); + if (md5) + key.type = TCP_KEY_MD5; #endif skb_set_hash(skb, READ_ONCE(tcp_rsk(req)->txhash), PKT_HASH_TYPE_L4); /* bpf program will be interested in the tcp_flags */ @@ -3711,7 +3747,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, /* RFC1323: The window in SYN & SYN/ACK segments is never scaled. */ th->window = htons(min(req->rsk_rcv_wnd, 65535U)); - tcp_options_write(th, NULL, &opts); + tcp_options_write(th, NULL, &opts, &key); th->doff = (tcp_header_size >> 2); TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS); diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c index 9ab594fadbd9..d08735b6f3c5 100644 --- a/net/ipv6/tcp_ao.c +++ b/net/ipv6/tcp_ao.c @@ -7,6 +7,7 @@ * Francesco Ruggeri * Salam Noureddine */ +#include #include #include @@ -79,6 +80,33 @@ struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid); } +int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp, + const struct in6_addr *daddr, + const struct in6_addr *saddr, int nbytes) +{ + struct tcp6_pseudohdr *bp; + struct scatterlist sg; + + bp = hp->scratch; + /* 1. TCP pseudo-header (RFC2460) */ + bp->saddr = *saddr; + bp->daddr = *daddr; + bp->len = cpu_to_be32(nbytes); + bp->protocol = cpu_to_be32(IPPROTO_TCP); + + sg_init_one(&sg, bp, sizeof(*bp)); + ahash_request_set_crypt(hp->req, &sg, NULL, sizeof(*bp)); + return crypto_ahash_update(hp->req); +} + +int tcp_v6_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, + const struct sock *sk, const struct sk_buff *skb, + const u8 *tkey, int hash_offset, u32 sne) +{ + return tcp_ao_hash_skb(AF_INET6, ao_hash, key, sk, skb, tkey, + hash_offset, sne); +} + int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen) { diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 2eda55cbd9ff..f52656bdf13a 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1921,6 +1921,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = { #endif #ifdef CONFIG_TCP_AO .ao_lookup = tcp_v6_ao_lookup, + .calc_ao_hash = tcp_v6_ao_hash_skb, .ao_parse = tcp_v6_parse_ao, .ao_calc_key_sk = tcp_v6_ao_calc_key_sk, #endif @@ -1954,6 +1955,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { #endif #ifdef CONFIG_TCP_AO .ao_lookup = tcp_v6_ao_lookup, + .calc_ao_hash = tcp_v4_ao_hash_skb, .ao_parse = tcp_v6_parse_ao, #endif }; From patchwork Mon Oct 23 19:21:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157055 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504456vqx; Mon, 23 Oct 2023 12:23:12 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFGKJpisiHSE3ydUdGxn3flAxuKg5+XKwVGIcSjQs0U5iLaHC45LbNkHFA4O13gUea93yuD X-Received: by 2002:a17:903:27d0:b0:1c9:ccb3:2352 with SMTP id km16-20020a17090327d000b001c9ccb32352mr9754624plb.12.1698088991742; Mon, 23 Oct 2023 12:23:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698088991; cv=none; d=google.com; s=arc-20160816; b=FgYNOkaGskg0fYhLFZKBx9zUzWXLncTfmuOVa0Cr9aYqpo94MV+xId1qIrMz5LaXIB y0GF+TonSvv3MPfocGx2M9VhO53Bwaof/d6yOgq4GnH2QeO5i/O6y+S5gKoc96gf+DXU XILrHKK4zmw0iuq7hF3lV/d7tBoBnDWkiYrap6A+nrOatUJhdxfyn/EQJRJjpE9S8SDF HwW+Y6mG8oGQbq63D4fgU0zTjNKCaQjWg70ggyTlvhLy8hO4ZQjeQWSEqF3/ITf9Bsht Zek2j0GswQLHAITAPewom0pVt1gL4g3MdOkjCeddUOCghF8IYIAEg1ragxpDwFHrKBva MIwg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=NTqNWRBS0zcUrotNdMsrwLpg176RJ9k1BahaqjErmhM=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=k2q/MXOy7iDe2rh0hIqCmAJ5xnPrrUjhHC4ecCGFzIVaJ4C340O7tHR5/Gwai8bCe3 YcwiPCfhq62Ka2I1X4PRgTYoqWEDGHy3EpAm15El6KtEZUAPcKbTc1gBv1g6QSAa5S2o B3Kf8GOYzX2TidGw9aMQccYnJYHF7GIz/gT/TkUPyL9MH3so+cKJ/LhNlXEzAL+9Ee6S b5bRAJXojVj1uj1GEHVmARqHCUsYgVungwwiIgPAYvdPU+W6ZwpZt4mOqwQUwJG1SMeY nCz7cZs6aQDdFZ88B7QX8hG6Z8TOvqwUTQ3Kf7w8+otAOx5D5pl7MGsXz/dx1i4EZC4m H3mA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=EXPqXgPS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id s2-20020a170902ea0200b001b829a32f2dsi7196523plg.457.2023.10.23.12.23.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=EXPqXgPS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id E97B380BBE8E; Mon, 23 Oct 2023 12:23:10 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231451AbjJWTW7 (ORCPT + 27 others); Mon, 23 Oct 2023 15:22:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230237AbjJWTWy (ORCPT ); Mon, 23 Oct 2023 15:22:54 -0400 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F5A6103 for ; Mon, 23 Oct 2023 12:22:41 -0700 (PDT) Received: by mail-wm1-x331.google.com with SMTP id 5b1f17b1804b1-40806e4106dso21276015e9.1 for ; Mon, 23 Oct 2023 12:22:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088960; x=1698693760; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=NTqNWRBS0zcUrotNdMsrwLpg176RJ9k1BahaqjErmhM=; b=EXPqXgPSaGlT6KJwMy8WwTY/A5uLT+vDoJXQ/x0reVzmPMxARuxstHfPJ/QSOEjDYs 5jnNaOGhGtsCUkEFPzAmzu/6eY3apotBGp3RlTwcYEKZGTcNgLfC4+ZTBHMSZbwQiN0p a388uhQLDQfr+zvyD5BA+61vKiVPpKLLieXYuj/CKkYP9HNyzrJR1m4F5TMUtMU0a+Zw tDUcg/tT5Qim8WhRrYl6s6Vn2k+X5K6ih+0pFmCouGNG40CLRmjAmvtcQpdazBIFO10i 6x10OD/N5YBb7/6i74q0Kgyp99s1kBGl4U0yDkNR1RnM5l3PBZ3yb+ndNsB7Zzz/GTQv pSgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088960; x=1698693760; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NTqNWRBS0zcUrotNdMsrwLpg176RJ9k1BahaqjErmhM=; b=DBV+IjDsRvdyJ2hx0jWjKa9tWKp88E2Xu1VnVKuiL8EvRmy3TPWWMCXUshB8/+lM63 2fUauHVfxz+W/NNDqwIaSecIOKyUAEF5o9kvsQFD1+scr86ijdjRK5vcbnxqLEo3Vkb3 nig541C116m3MR1QsQTHXjvi1S/kcJ5PuS0iSQaoZWHu5q88CWPxoQmFQ+1JrEZ4QroP gMZNtwxBPGf+4EQ7BujM4zQRjecHHMpwSgOlplYrxTQCe8zYyeaDvpC7GtWi9z8cJIyx m2J1ffsnvinnlEbHIDOzsfmBMjOr2dUpi+kJCXBuCDA5lnmVMrzURWouO7b1lHSHAZ/K CYPw== X-Gm-Message-State: AOJu0Yy4v/BHpvTeANZTj2x83aGvAdzw9mwmZHSrTA3Ou1jw26m5hAMT WVbST/gR9/1dZaBTWNov7gJKBQ== X-Received: by 2002:a05:600c:3414:b0:401:c7ec:b930 with SMTP id y20-20020a05600c341400b00401c7ecb930mr12126542wmp.10.1698088959788; Mon, 23 Oct 2023 12:22:39 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:39 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 07/23] net/tcp: Add tcp_parse_auth_options() Date: Mon, 23 Oct 2023 20:21:59 +0100 Message-ID: <20231023192217.426455-8-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:11 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575362459280960 X-GMAIL-MSGID: 1780575362459280960 Introduce a helper that: (1) shares the common code with TCP-MD5 header options parsing (2) looks for hash signature only once for both TCP-MD5 and TCP-AO (3) fails with -EEXIST if any TCP sign option is present twice, see RFC5925 (2.2): ">> A single TCP segment MUST NOT have more than one TCP-AO in its options sequence. When multiple TCP-AOs appear, TCP MUST discard the segment." Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/dropreason-core.h | 6 ++++++ include/net/tcp.h | 24 ++++++++++++++++++++- include/net/tcp_ao.h | 17 ++++++++++++++- net/ipv4/tcp.c | 3 ++- net/ipv4/tcp_input.c | 39 ++++++++++++++++++++++++++--------- net/ipv4/tcp_ipv4.c | 15 +++++++++----- net/ipv6/tcp_ipv6.c | 11 ++++++---- 7 files changed, 93 insertions(+), 22 deletions(-) diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h index 845dce805de7..3af4464a9c5b 100644 --- a/include/net/dropreason-core.h +++ b/include/net/dropreason-core.h @@ -20,6 +20,7 @@ FN(IP_NOPROTO) \ FN(SOCKET_RCVBUFF) \ FN(PROTO_MEM) \ + FN(TCP_AUTH_HDR) \ FN(TCP_MD5NOTFOUND) \ FN(TCP_MD5UNEXPECTED) \ FN(TCP_MD5FAILURE) \ @@ -142,6 +143,11 @@ enum skb_drop_reason { * drop out of udp_memory_allocated. */ SKB_DROP_REASON_PROTO_MEM, + /** + * @SKB_DROP_REASON_TCP_AUTH_HDR: TCP-MD5 or TCP-AO hashes are met + * twice or set incorrectly. + */ + SKB_DROP_REASON_TCP_AUTH_HDR, /** * @SKB_DROP_REASON_TCP_MD5NOTFOUND: no MD5 hash and one expected, * corresponding to LINUX_MIB_TCPMD5NOTFOUND diff --git a/include/net/tcp.h b/include/net/tcp.h index 9a996834699d..ba30fcc89830 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -438,7 +438,6 @@ int tcp_mmap(struct file *file, struct socket *sock, void tcp_parse_options(const struct net *net, const struct sk_buff *skb, struct tcp_options_received *opt_rx, int estab, struct tcp_fastopen_cookie *foc); -const u8 *tcp_parse_md5sig_option(const struct tcphdr *th); /* * BPF SKB-less helpers @@ -2673,6 +2672,29 @@ static inline u64 tcp_transmit_time(const struct sock *sk) return 0; } +static inline int tcp_parse_auth_options(const struct tcphdr *th, + const u8 **md5_hash, const struct tcp_ao_hdr **aoh) +{ + const u8 *md5_tmp, *ao_tmp; + int ret; + + ret = tcp_do_parse_auth_options(th, &md5_tmp, &ao_tmp); + if (ret) + return ret; + + if (md5_hash) + *md5_hash = md5_tmp; + + if (aoh) { + if (!ao_tmp) + *aoh = NULL; + else + *aoh = (struct tcp_ao_hdr *)(ao_tmp - 2); + } + + return 0; +} + static inline bool tcp_ao_required(struct sock *sk, const void *saddr, int family) { diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 0b86bc05d8cf..fdd2f5091b98 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -152,7 +152,9 @@ int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); void tcp_ao_established(struct sock *sk); void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb); void tcp_ao_connect_init(struct sock *sk); - +void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, + struct tcp_request_sock *treq, + unsigned short int family); #else /* CONFIG_TCP_AO */ static inline int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, @@ -185,4 +187,17 @@ static inline void tcp_ao_connect_init(struct sock *sk) } #endif +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) +int tcp_do_parse_auth_options(const struct tcphdr *th, + const u8 **md5_hash, const u8 **ao_hash); +#else +static inline int tcp_do_parse_auth_options(const struct tcphdr *th, + const u8 **md5_hash, const u8 **ao_hash) +{ + *md5_hash = NULL; + *ao_hash = NULL; + return 0; +} +#endif + #endif /* _TCP_AO_H */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0e3372265cf4..b44b9ef99e60 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4396,7 +4396,8 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, l3index = sdif ? dif : 0; hash_expected = tcp_md5_do_lookup(sk, l3index, saddr, family); - hash_location = tcp_parse_md5sig_option(th); + if (tcp_parse_auth_options(th, &hash_location, NULL)) + return SKB_DROP_REASON_TCP_AUTH_HDR; /* We've parsed the options - do we have a hash? */ if (!hash_expected && !hash_location) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 64795a79a4a4..7e5e16d2e65c 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4254,39 +4254,58 @@ static bool tcp_fast_parse_options(const struct net *net, return true; } -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) /* - * Parse MD5 Signature option + * Parse Signature options */ -const u8 *tcp_parse_md5sig_option(const struct tcphdr *th) +int tcp_do_parse_auth_options(const struct tcphdr *th, + const u8 **md5_hash, const u8 **ao_hash) { int length = (th->doff << 2) - sizeof(*th); const u8 *ptr = (const u8 *)(th + 1); + unsigned int minlen = TCPOLEN_MD5SIG; + + if (IS_ENABLED(CONFIG_TCP_AO)) + minlen = sizeof(struct tcp_ao_hdr) + 1; + + *md5_hash = NULL; + *ao_hash = NULL; /* If not enough data remaining, we can short cut */ - while (length >= TCPOLEN_MD5SIG) { + while (length >= minlen) { int opcode = *ptr++; int opsize; switch (opcode) { case TCPOPT_EOL: - return NULL; + return 0; case TCPOPT_NOP: length--; continue; default: opsize = *ptr++; if (opsize < 2 || opsize > length) - return NULL; - if (opcode == TCPOPT_MD5SIG) - return opsize == TCPOLEN_MD5SIG ? ptr : NULL; + return -EINVAL; + if (opcode == TCPOPT_MD5SIG) { + if (opsize != TCPOLEN_MD5SIG) + return -EINVAL; + if (unlikely(*md5_hash || *ao_hash)) + return -EEXIST; + *md5_hash = ptr; + } else if (opcode == TCPOPT_AO) { + if (opsize <= sizeof(struct tcp_ao_hdr)) + return -EINVAL; + if (unlikely(*md5_hash || *ao_hash)) + return -EEXIST; + *ao_hash = ptr; + } } ptr += opsize - 2; length -= opsize; } - return NULL; + return 0; } -EXPORT_SYMBOL(tcp_parse_md5sig_option); +EXPORT_SYMBOL(tcp_do_parse_auth_options); #endif /* Sorry, PAWS as specified is broken wrt. pure-ACKs -DaveM diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index b002f6497d19..83e069d0f778 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -670,7 +670,9 @@ EXPORT_SYMBOL(tcp_v4_send_check); * Exception: precedence violation. We do not implement it in any case. */ -#ifdef CONFIG_TCP_MD5SIG +#ifdef CONFIG_TCP_AO +#define OPTION_BYTES MAX_TCP_OPTION_SPACE +#elif defined(CONFIG_TCP_MD5SIG) #define OPTION_BYTES TCPOLEN_MD5SIG_ALIGNED #else #define OPTION_BYTES sizeof(__be32) @@ -685,8 +687,8 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) } rep; struct ip_reply_arg arg; #ifdef CONFIG_TCP_MD5SIG + const __u8 *md5_hash_location = NULL; struct tcp_md5sig_key *key = NULL; - const __u8 *hash_location = NULL; unsigned char newhash[16]; int genhash; struct sock *sk1 = NULL; @@ -727,8 +729,11 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev); #ifdef CONFIG_TCP_MD5SIG + /* Invalid TCP option size or twice included auth */ + if (tcp_parse_auth_options(tcp_hdr(skb), &md5_hash_location, NULL)) + return; + rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); if (sk && sk_fullsock(sk)) { const union tcp_md5_addr *addr; int l3index; @@ -739,7 +744,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) l3index = tcp_v4_sdif(skb) ? inet_iif(skb) : 0; addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr; key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET); - } else if (hash_location) { + } else if (md5_hash_location) { const union tcp_md5_addr *addr; int sdif = tcp_v4_sdif(skb); int dif = inet_iif(skb); @@ -771,7 +776,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) genhash = tcp_v4_md5_hash_skb(newhash, key, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) + if (genhash || memcmp(md5_hash_location, newhash, 16) != 0) goto out; } diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index f52656bdf13a..309f5993b1ca 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -990,7 +990,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) u32 seq = 0, ack_seq = 0; struct tcp_md5sig_key *key = NULL; #ifdef CONFIG_TCP_MD5SIG - const __u8 *hash_location = NULL; + const __u8 *md5_hash_location = NULL; unsigned char newhash[16]; int genhash; struct sock *sk1 = NULL; @@ -1012,8 +1012,11 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev); #ifdef CONFIG_TCP_MD5SIG + /* Invalid TCP option size or twice included auth */ + if (tcp_parse_auth_options(th, &md5_hash_location, NULL)) + return; + rcu_read_lock(); - hash_location = tcp_parse_md5sig_option(th); if (sk && sk_fullsock(sk)) { int l3index; @@ -1022,7 +1025,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) */ l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr, l3index); - } else if (hash_location) { + } else if (md5_hash_location) { int dif = tcp_v6_iif_l3_slave(skb); int sdif = tcp_v6_sdif(skb); int l3index; @@ -1051,7 +1054,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) goto out; genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) + if (genhash || memcmp(md5_hash_location, newhash, 16) != 0) goto out; } #endif From patchwork Mon Oct 23 19:22:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157057 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504528vqx; Mon, 23 Oct 2023 12:23:22 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEd7jbNh0Y4OHsGk3QYoDRuaUw2KaM0qi4Fn6Qkgs8E8j4bqi+kGVpaKCiXs0xH8V5amXiP X-Received: by 2002:a17:903:184:b0:1c6:1bf9:d88d with SMTP id z4-20020a170903018400b001c61bf9d88dmr12639979plg.44.1698089002184; Mon, 23 Oct 2023 12:23:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089002; cv=none; d=google.com; s=arc-20160816; b=ewxPqtEdbbmRLA/qYEeTzTa49Nx36O+4eyRO181MxrciBemBW3Vd3wokF0RzsILGWG qXKDAutVtQ8F6VvXlmCBZ13mJPqlrG2vinATdME6LGkAio6Me34zDKzWhgEVpnodyIOW DEBp/yXLh1q6SSFzyzgKkQf46uT20DXdCJkyNwLEH8QtL5LtLfC3MMLZa74Z64YHAw+p LTQFkfpVo3B+u8Mos+H4YXc269ARU48RZDWEDxcLZxzZsKCkyC+O/2KS6F+bY2hIenaf GnD4IcroYp/igo6ddHZH1tbJbMSmgFUj+4bx77ZXLaGlUnCtaxcVISS48CNYzgfTIvBD EGBA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=5gMLmKu0HZn84y99cjkYSyI3EMp9URdewcf/7FSN70Q=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=vuWcsALTcxG/X+NfFsnf3RIg2t5ETa8R/KH5SNDkeKdedscIWxVmWbJWQqOc7TV7g7 ztwpuMEQUdWgsoR6T6o6mR2SXgJ+nx25cFVNRMG8rYFC9XE9qgLXI0BKkKW1WD1y7igo 3Ch4vJaUe9VPLCTKcnfm+Yker/6GZvSP9IQxZGrHgb8brB/g9dsssCsXSzIEEuTJ6O8/ YQebMfplmYSLxJEatPNDzBpPMDL5TWFHDD4gB07qOVLneBMt8QqnhEvlXv8ca2oAWoQp CSrn3GozQ+KZJdwxeVvase+x3klsu1Noz+VIK+ofijv3qUbD4HA7uMQnVdzRLXL9Nxgd g9Tw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=McjBPN6U; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id g4-20020a170902c38400b001c72c258f82si6955369plg.99.2023.10.23.12.23.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=McjBPN6U; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 179AA80BE2C6; Mon, 23 Oct 2023 12:23:18 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232530AbjJWTXJ (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38246 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231761AbjJWTWy (ORCPT ); Mon, 23 Oct 2023 15:22:54 -0400 Received: from mail-lj1-x22b.google.com (mail-lj1-x22b.google.com [IPv6:2a00:1450:4864:20::22b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BBB78D68 for ; Mon, 23 Oct 2023 12:22:43 -0700 (PDT) Received: by mail-lj1-x22b.google.com with SMTP id 38308e7fff4ca-2bfed7c4e6dso54923351fa.1 for ; Mon, 23 Oct 2023 12:22:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088962; x=1698693762; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5gMLmKu0HZn84y99cjkYSyI3EMp9URdewcf/7FSN70Q=; b=McjBPN6UuiXs2Cq2itI/cJHRU8yeBcmxliMTFYG/YP7iGw15kN3YvKoasYdpFA8uPx a7yqq0J9vENUg8ID0wTavmjEDlArKHGWLeSA2yEF2oJRCCOunIUOnmgWA61xvI9TvKNq fqAQAwmeseitiNIuq4mUuLSgkGRWU4s5g+rNavwnWSgXTUinx9FVFSzbgn3B8AYFvoyM CHjU/1FyDBsTUqFhWQpbiY04+FAY67EYLubkmqHLJCxsJbhoWQeaMm/RC6uuHmsFdyE7 cRuyFQqYMnTkc8YIxHRidP+i2W7RzHD8TnuweKFCbKnOa3OqeG6vi1o1J6ERlZ5RRocl Zj5g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088962; x=1698693762; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5gMLmKu0HZn84y99cjkYSyI3EMp9URdewcf/7FSN70Q=; b=KYs0saYAVVhsdOQmJb0iLekFANFeC9ejUvz8kzTNOtp2vbe+ac+BSReezmmZVMSUMS kTpsOUHwXIUZiryiPnmRMoadbrQzEjhZfjyr0DSXe/ZQPn4YcJD/YHDl5eBO0te8YO+6 PTCMu5H7MN5j/hcEXmQ1sqxrtHnxfQJmdpv8qugTaEFN08J7c4fWDrOuxz5Htj1xnQiW zvg1ZpcUdZY+gPwR+BxbTENblxfMj99K0PZHvk/+5XjFSIwh5nHlzWZ2jnQk8TPslvBa LG6+cpNhmG1FOEa9EfdE8+9lRZBzdzYLJWLsXZjsXoHyZUlrwRG0e57+4dISaIOg82zM zDeQ== X-Gm-Message-State: AOJu0Yz8SQBvsAouNC5fn8JxcmJl6Ic9aur3uic3wvAwl6rkNOt16nVJ jXCpE2RWg43UiCO/C5sk+pLpRQ== X-Received: by 2002:a2e:be1c:0:b0:2c5:13e8:d54e with SMTP id z28-20020a2ebe1c000000b002c513e8d54emr8317704ljq.38.1698088961756; Mon, 23 Oct 2023 12:22:41 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:41 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 08/23] net/tcp: Add AO sign to RST packets Date: Mon, 23 Oct 2023 20:22:00 +0100 Message-ID: <20231023192217.426455-9-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:18 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575373674160650 X-GMAIL-MSGID: 1780575373674160650 Wire up sending resets to TCP-AO hashing. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 7 ++- include/net/tcp_ao.h | 12 +++++ net/ipv4/tcp_ao.c | 102 ++++++++++++++++++++++++++++++++++++++++++- net/ipv4/tcp_ipv4.c | 69 +++++++++++++++++++++++------ net/ipv6/tcp_ipv6.c | 98 +++++++++++++++++++++++++++++------------ 5 files changed, 245 insertions(+), 43 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index ba30fcc89830..ec78ec6ab723 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2256,7 +2256,12 @@ static inline __u32 cookie_init_sequence(const struct tcp_request_sock_ops *ops, struct tcp_key { union { - struct tcp_ao_key *ao_key; + struct { + struct tcp_ao_key *ao_key; + char *traffic_key; + u32 sne; + u8 rcv_next; + }; struct tcp_md5sig_key *md5_key; }; enum { diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index fdd2f5091b98..629ab0365b83 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -120,12 +120,24 @@ int tcp_ao_hash_skb(unsigned short int family, const u8 *tkey, int hash_offset, u32 sne); int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family, sockptr_t optval, int optlen); +struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao, + int sndid, int rcvid); int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, unsigned int len, struct tcp_sigpool *hp); void tcp_ao_destroy_sock(struct sock *sk); struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid); +int tcp_ao_hash_hdr(unsigned short family, char *ao_hash, + struct tcp_ao_key *key, const u8 *tkey, + const union tcp_ao_addr *daddr, + const union tcp_ao_addr *saddr, + const struct tcphdr *th, u32 sne); +int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, + const struct tcp_ao_hdr *aoh, int l3index, + struct tcp_ao_key **key, char **traffic_key, + bool *allocated_traffic_key, u8 *keyid, u32 *sne); + /* ipv4 specific functions */ int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 007f29a2531f..b8afe78ff057 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -48,8 +48,8 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, * it's known that the keys in ao_info are matching peer's * family/address/VRF/etc. */ -static struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao, - int sndid, int rcvid) +struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao, + int sndid, int rcvid) { struct tcp_ao_key *key; @@ -369,6 +369,66 @@ static int tcp_ao_hash_header(struct tcp_sigpool *hp, return err; } +int tcp_ao_hash_hdr(unsigned short int family, char *ao_hash, + struct tcp_ao_key *key, const u8 *tkey, + const union tcp_ao_addr *daddr, + const union tcp_ao_addr *saddr, + const struct tcphdr *th, u32 sne) +{ + int tkey_len = tcp_ao_digest_size(key); + int hash_offset = ao_hash - (char *)th; + struct tcp_sigpool hp; + void *hash_buf = NULL; + + hash_buf = kmalloc(tkey_len, GFP_ATOMIC); + if (!hash_buf) + goto clear_hash_noput; + + if (tcp_sigpool_start(key->tcp_sigpool_id, &hp)) + goto clear_hash_noput; + + if (crypto_ahash_setkey(crypto_ahash_reqtfm(hp.req), tkey, tkey_len)) + goto clear_hash; + + if (crypto_ahash_init(hp.req)) + goto clear_hash; + + if (tcp_ao_hash_sne(&hp, sne)) + goto clear_hash; + if (family == AF_INET) { + if (tcp_v4_ao_hash_pseudoheader(&hp, daddr->a4.s_addr, + saddr->a4.s_addr, th->doff * 4)) + goto clear_hash; +#if IS_ENABLED(CONFIG_IPV6) + } else if (family == AF_INET6) { + if (tcp_v6_ao_hash_pseudoheader(&hp, &daddr->a6, + &saddr->a6, th->doff * 4)) + goto clear_hash; +#endif + } else { + WARN_ON_ONCE(1); + goto clear_hash; + } + if (tcp_ao_hash_header(&hp, th, false, + ao_hash, hash_offset, tcp_ao_maclen(key))) + goto clear_hash; + ahash_request_set_crypt(hp.req, NULL, hash_buf, 0); + if (crypto_ahash_final(hp.req)) + goto clear_hash; + + memcpy(ao_hash, hash_buf, tcp_ao_maclen(key)); + tcp_sigpool_end(&hp); + kfree(hash_buf); + return 0; + +clear_hash: + tcp_sigpool_end(&hp); +clear_hash_noput: + memset(ao_hash, 0, tcp_ao_maclen(key)); + kfree(hash_buf); + return 1; +} + int tcp_ao_hash_skb(unsigned short int family, char *ao_hash, struct tcp_ao_key *key, const struct sock *sk, const struct sk_buff *skb, @@ -435,6 +495,44 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid); } +int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, + const struct tcp_ao_hdr *aoh, int l3index, + struct tcp_ao_key **key, char **traffic_key, + bool *allocated_traffic_key, u8 *keyid, u32 *sne) +{ + struct tcp_ao_key *rnext_key; + struct tcp_ao_info *ao_info; + + *allocated_traffic_key = false; + /* If there's no socket - than initial sisn/disn are unknown. + * Drop the segment. RFC5925 (7.7) advises to require graceful + * restart [RFC4724]. Alternatively, the RFC5925 advises to + * save/restore traffic keys before/after reboot. + * Linux TCP-AO support provides TCP_AO_ADD_KEY and TCP_AO_REPAIR + * options to restore a socket post-reboot. + */ + if (!sk) + return -ENOTCONN; + + if ((1 << sk->sk_state) & + (TCPF_LISTEN | TCPF_NEW_SYN_RECV | TCPF_TIME_WAIT)) + return -1; + + ao_info = rcu_dereference(tcp_sk(sk)->ao_info); + if (!ao_info) + return -ENOENT; + + *key = tcp_ao_established_key(ao_info, aoh->rnext_keyid, -1); + if (!*key) + return -ENOENT; + *traffic_key = snd_other_key(*key); + rnext_key = READ_ONCE(ao_info->rnext_key); + *keyid = rnext_key->rcvid; + *sne = 0; + + return 0; +} + int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, struct tcp_ao_key *key, struct tcphdr *th, __u8 *hash_location) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 83e069d0f778..71e1cbb0020b 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -657,6 +657,52 @@ void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb) } EXPORT_SYMBOL(tcp_v4_send_check); +#define REPLY_OPTIONS_LEN (MAX_TCP_OPTION_SPACE / sizeof(__be32)) + +static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb, + const struct tcp_ao_hdr *aoh, + struct ip_reply_arg *arg, struct tcphdr *reply, + __be32 reply_options[REPLY_OPTIONS_LEN]) +{ +#ifdef CONFIG_TCP_AO + int sdif = tcp_v4_sdif(skb); + int dif = inet_iif(skb); + int l3index = sdif ? dif : 0; + bool allocated_traffic_key; + struct tcp_ao_key *key; + char *traffic_key; + bool drop = true; + u32 ao_sne = 0; + u8 keyid; + + rcu_read_lock(); + if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, + &key, &traffic_key, &allocated_traffic_key, + &keyid, &ao_sne)) + goto out; + + reply_options[0] = htonl((TCPOPT_AO << 24) | (tcp_ao_len(key) << 16) | + (aoh->rnext_keyid << 8) | keyid); + arg->iov[0].iov_len += round_up(tcp_ao_len(key), 4); + reply->doff = arg->iov[0].iov_len / 4; + + if (tcp_ao_hash_hdr(AF_INET, (char *)&reply_options[1], + key, traffic_key, + (union tcp_ao_addr *)&ip_hdr(skb)->saddr, + (union tcp_ao_addr *)&ip_hdr(skb)->daddr, + reply, ao_sne)) + goto out; + drop = false; +out: + rcu_read_unlock(); + if (allocated_traffic_key) + kfree(traffic_key); + return drop; +#else + return true; +#endif +} + /* * This routine will send an RST to the other tcp. * @@ -670,28 +716,21 @@ EXPORT_SYMBOL(tcp_v4_send_check); * Exception: precedence violation. We do not implement it in any case. */ -#ifdef CONFIG_TCP_AO -#define OPTION_BYTES MAX_TCP_OPTION_SPACE -#elif defined(CONFIG_TCP_MD5SIG) -#define OPTION_BYTES TCPOLEN_MD5SIG_ALIGNED -#else -#define OPTION_BYTES sizeof(__be32) -#endif - static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) { const struct tcphdr *th = tcp_hdr(skb); struct { struct tcphdr th; - __be32 opt[OPTION_BYTES / sizeof(__be32)]; + __be32 opt[REPLY_OPTIONS_LEN]; } rep; + const __u8 *md5_hash_location = NULL; + const struct tcp_ao_hdr *aoh; struct ip_reply_arg arg; #ifdef CONFIG_TCP_MD5SIG - const __u8 *md5_hash_location = NULL; struct tcp_md5sig_key *key = NULL; unsigned char newhash[16]; - int genhash; struct sock *sk1 = NULL; + int genhash; #endif u64 transmit_time = 0; struct sock *ctl_sk; @@ -728,11 +767,15 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) arg.iov[0].iov_len = sizeof(rep.th); net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev); -#ifdef CONFIG_TCP_MD5SIG + /* Invalid TCP option size or twice included auth */ - if (tcp_parse_auth_options(tcp_hdr(skb), &md5_hash_location, NULL)) + if (tcp_parse_auth_options(tcp_hdr(skb), &md5_hash_location, &aoh)) return; + if (aoh && tcp_v4_ao_sign_reset(sk, skb, aoh, &arg, &rep.th, rep.opt)) + return; + +#ifdef CONFIG_TCP_MD5SIG rcu_read_lock(); if (sk && sk_fullsock(sk)) { const union tcp_md5_addr *addr; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 309f5993b1ca..4c888d66f858 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -854,8 +854,8 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = { static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, - int oif, struct tcp_md5sig_key *key, int rst, - u8 tclass, __be32 label, u32 priority, u32 txhash) + int oif, int rst, u8 tclass, __be32 label, + u32 priority, u32 txhash, struct tcp_key *key) { const struct tcphdr *th = tcp_hdr(skb); struct tcphdr *t1; @@ -870,13 +870,13 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 if (tsecr) tot_len += TCPOLEN_TSTAMP_ALIGNED; -#ifdef CONFIG_TCP_MD5SIG - if (key) + if (tcp_key_is_md5(key)) tot_len += TCPOLEN_MD5SIG_ALIGNED; -#endif + if (tcp_key_is_ao(key)) + tot_len += tcp_ao_len(key->ao_key); #ifdef CONFIG_MPTCP - if (rst && !key) { + if (rst && !tcp_key_is_md5(key)) { mrst = mptcp_reset_option(skb); if (mrst) @@ -917,14 +917,28 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 *topt++ = mrst; #ifdef CONFIG_TCP_MD5SIG - if (key) { + if (tcp_key_is_md5(key)) { *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_MD5SIG << 8) | TCPOLEN_MD5SIG); - tcp_v6_md5_hash_hdr((__u8 *)topt, key, + tcp_v6_md5_hash_hdr((__u8 *)topt, key->md5_key, &ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr, t1); } #endif +#ifdef CONFIG_TCP_AO + if (tcp_key_is_ao(key)) { + *topt++ = htonl((TCPOPT_AO << 24) | + (tcp_ao_len(key->ao_key) << 16) | + (key->ao_key->sndid << 8) | + (key->rcv_next)); + + tcp_ao_hash_hdr(AF_INET6, (char *)topt, key->ao_key, + key->traffic_key, + (union tcp_ao_addr *)&ipv6_hdr(skb)->saddr, + (union tcp_ao_addr *)&ipv6_hdr(skb)->daddr, + t1, key->sne); + } +#endif memset(&fl6, 0, sizeof(fl6)); fl6.daddr = ipv6_hdr(skb)->saddr; @@ -987,19 +1001,23 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) { const struct tcphdr *th = tcp_hdr(skb); struct ipv6hdr *ipv6h = ipv6_hdr(skb); - u32 seq = 0, ack_seq = 0; - struct tcp_md5sig_key *key = NULL; -#ifdef CONFIG_TCP_MD5SIG const __u8 *md5_hash_location = NULL; - unsigned char newhash[16]; - int genhash; - struct sock *sk1 = NULL; +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) + bool allocated_traffic_key = false; #endif + const struct tcp_ao_hdr *aoh; + struct tcp_key key = {}; + u32 seq = 0, ack_seq = 0; __be32 label = 0; u32 priority = 0; struct net *net; u32 txhash = 0; int oif = 0; +#ifdef CONFIG_TCP_MD5SIG + unsigned char newhash[16]; + int genhash; + struct sock *sk1 = NULL; +#endif if (th->rst) return; @@ -1011,12 +1029,13 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) return; net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev); -#ifdef CONFIG_TCP_MD5SIG /* Invalid TCP option size or twice included auth */ - if (tcp_parse_auth_options(th, &md5_hash_location, NULL)) + if (tcp_parse_auth_options(th, &md5_hash_location, &aoh)) return; - +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) rcu_read_lock(); +#endif +#ifdef CONFIG_TCP_MD5SIG if (sk && sk_fullsock(sk)) { int l3index; @@ -1024,7 +1043,9 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) * in an L3 domain and inet_iif is set to it. */ l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; - key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr, l3index); + key.md5_key = tcp_v6_md5_do_lookup(sk, &ipv6h->saddr, l3index); + if (key.md5_key) + key.type = TCP_KEY_MD5; } else if (md5_hash_location) { int dif = tcp_v6_iif_l3_slave(skb); int sdif = tcp_v6_sdif(skb); @@ -1049,11 +1070,12 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) */ l3index = tcp_v6_sdif(skb) ? dif : 0; - key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr, l3index); - if (!key) + key.md5_key = tcp_v6_md5_do_lookup(sk1, &ipv6h->saddr, l3index); + if (!key.md5_key) goto out; + key.type = TCP_KEY_MD5; - genhash = tcp_v6_md5_hash_skb(newhash, key, NULL, skb); + genhash = tcp_v6_md5_hash_skb(newhash, key.md5_key, NULL, skb); if (genhash || memcmp(md5_hash_location, newhash, 16) != 0) goto out; } @@ -1065,6 +1087,20 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) ack_seq = ntohl(th->seq) + th->syn + th->fin + skb->len - (th->doff << 2); +#ifdef CONFIG_TCP_AO + if (aoh) { + int l3index; + + l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; + if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, + &key.ao_key, &key.traffic_key, + &allocated_traffic_key, + &key.rcv_next, &key.sne)) + goto out; + key.type = TCP_KEY_AO; + } +#endif + if (sk) { oif = sk->sk_bound_dev_if; if (sk_fullsock(sk)) { @@ -1084,22 +1120,30 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) label = ip6_flowlabel(ipv6h); } - tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, key, 1, - ipv6_get_dsfield(ipv6h), label, priority, txhash); + tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, 1, + ipv6_get_dsfield(ipv6h), label, priority, txhash, + &key); -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) out: + if (allocated_traffic_key) + kfree(key.traffic_key); rcu_read_unlock(); #endif } static void tcp_v6_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, - struct tcp_md5sig_key *key, u8 tclass, + struct tcp_md5sig_key *md5_key, u8 tclass, __be32 label, u32 priority, u32 txhash) { - tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, key, 0, - tclass, label, priority, txhash); + struct tcp_key key = { + .md5_key = md5_key, + .type = md5_key ? TCP_KEY_MD5 : TCP_KEY_NONE, + }; + + tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, 0, + tclass, label, priority, txhash, &key); } static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) From patchwork Mon Oct 23 19:22:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157056 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504490vqx; Mon, 23 Oct 2023 12:23:15 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGiuDjZe6Sd2+wVq+wpAJBHChtiueL02O4RNOLIcI0FN+jm7ImoHi7tDh51HjGDtdY2V95g X-Received: by 2002:a17:902:c411:b0:1cb:e412:c3fd with SMTP id k17-20020a170902c41100b001cbe412c3fdmr58381plk.63.1698088995224; Mon, 23 Oct 2023 12:23:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698088995; cv=none; d=google.com; s=arc-20160816; b=MRTaEcYaHGZq+it+XRB1/4FX+iZGnq5Rmn3AbZvqKQWgfH2oy8CUTQ42KWmSrK2A6+ /TPlA2THZZ9CMFtJE89AFprIdc/+U0F1v0WY8Ofv/5cePE4vJW7kPpgrrZ7JnWYx3edH GijJL4kZZyY5FXMGeHZsSy9ThZkYb2hFaZws8luoG3QknK6zPt+leFqeiqPwtYbKW2p0 7CExEZANzrFEFEBfSbD/x7LWCo+PumysrHhvuZ+RZegdtYUQ4nrkJrRT4iag2WehKf30 4gy/uA2kvMvYr5OQ/+Jk+CMZI6RHk/OoONqeggjsKtIBv+FPhcjeeGOOPAF4pdQ/roUP 4Xjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=HJ+k18h0GERZTVwa41L7hV+JW3Ec8aNzLELydD0F+Fo=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=nro7k+pLwK+A8CAc9RRdGl7neNf3gxS1fOEl6LX6P+ms7K//VHodmph9CLcRRQSCLL +CbccIP5Rbm3Pslc+SI4NcS9pb+DBXzcghOUdZJOHIFpDBHxpXED1j5Zr7LDwCbDEn0+ DXwEa0dyZgIN6MKzqgtoxgMriCmDd1z+VHLF/igkeKHqwnc09ueXdMABdKT/NkdDRo+t oExiLqrXjQD+UJXarANf+70YDnZSJY5W+knqC6+g6cFZfheRIEAchl6djJ06TMzsNmE3 5V+594x2JJi3Bi51qVhDttEW4jP8b2lpGGUmP51/JzNqhNFvdT+B+3s5OKNdvYxXw8rT AFQA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=GFDjMPqE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id t23-20020a1709028c9700b001c60783985esi7040393plo.93.2023.10.23.12.23.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=GFDjMPqE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 43D0280BE2C4; Mon, 23 Oct 2023 12:23:14 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232382AbjJWTXD (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38228 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231994AbjJWTWz (ORCPT ); Mon, 23 Oct 2023 15:22:55 -0400 Received: from mail-lj1-x22f.google.com (mail-lj1-x22f.google.com [IPv6:2a00:1450:4864:20::22f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D036010C3 for ; Mon, 23 Oct 2023 12:22:45 -0700 (PDT) Received: by mail-lj1-x22f.google.com with SMTP id 38308e7fff4ca-2c4fdf94666so48840761fa.2 for ; Mon, 23 Oct 2023 12:22:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088964; x=1698693764; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HJ+k18h0GERZTVwa41L7hV+JW3Ec8aNzLELydD0F+Fo=; b=GFDjMPqEjYTORfzMCk2gVp7fvMpcO+ygSBxPPeKKKq4aLqEwBYnHeGTxp/9W+fJKOZ wNtFR1tlTleMeBfYqOB3hb7Qlci5DUK7gvFmbezNvTfmW5Vu9FCFJqDOT7sTCuAbng0y KhyDAcgNTAN31Zhc2TcWAi982jwF62u9xjCq3Y4OlAtML0kS8dG2S+a/p4Kzd40I5shJ 2izPhMZ94i1izeoQED0nMgr91xk7ZidCM8MvP4Ivl5vgH1i38qucDWDEanCzcpG7QMNg zeA9uBXyvopQkiPAmiwZBn+KD3XAPwQLW2v9CvoFt1RtEZhk4/+6hRqArg1cjJ71T1yy kBEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088964; x=1698693764; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HJ+k18h0GERZTVwa41L7hV+JW3Ec8aNzLELydD0F+Fo=; b=UKfcF9M8kKfvBFnPFhLpgCxDdGsd6sYDy6kypiE+ZPbm1JsyQejHOB6n9HPVu7nBsB 1JciYkC236rdZiA17ECaGUYUJHSc29xiZon0uZmkTDoCEIPKK4aJC8eYZFhsqeGorRNu hCIuuttFe0FpBSukpZxEUkJ9BA8p4B0LMRe3OjRc/cGKGkK8xaBPG+hA655YwkG3TDx8 OXqbyrLyJT0UNu5k9+3bxXDfCr/8IZ7AkyJOHAQSU1KLip5H7USwey8jy1VUEpTJZ7j8 c2nTjQhmvld+bRRDF6iMob4DTFi9BhpMNqaNDi2BnrNXRFHqU7ztfeccrNDwc3U/rpbA LV0Q== X-Gm-Message-State: AOJu0YxRO5W7C5OuzhtPBhDx0i0+BZngA7n7++Ew8VD/A5f7M3FLruSt TulVkA0z6Hg7kgSElXswR9IVFg== X-Received: by 2002:a05:651c:2123:b0:2c4:ff2e:d6cd with SMTP id a35-20020a05651c212300b002c4ff2ed6cdmr8950587ljq.2.1698088963612; Mon, 23 Oct 2023 12:22:43 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:43 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 09/23] net/tcp: Add TCP-AO sign to twsk Date: Mon, 23 Oct 2023 20:22:01 +0100 Message-ID: <20231023192217.426455-10-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:14 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575365971814499 X-GMAIL-MSGID: 1780575365971814499 Add support for sockets in time-wait state. ao_info as well as all keys are inherited on transition to time-wait socket. The lifetime of ao_info is now protected by ref counter, so that tcp_ao_destroy_sock() will destruct it only when the last user is gone. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/linux/tcp.h | 3 ++ include/net/tcp_ao.h | 11 ++++- net/ipv4/tcp_ao.c | 49 +++++++++++++++++---- net/ipv4/tcp_ipv4.c | 92 +++++++++++++++++++++++++++++++--------- net/ipv4/tcp_minisocks.c | 4 +- net/ipv4/tcp_output.c | 2 +- net/ipv6/tcp_ipv6.c | 72 ++++++++++++++++++++++--------- 7 files changed, 183 insertions(+), 50 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 64e7b560fa79..eec6e7e5312e 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -514,6 +514,9 @@ struct tcp_timewait_sock { #ifdef CONFIG_TCP_MD5SIG struct tcp_md5sig_key *tw_md5_key; #endif +#ifdef CONFIG_TCP_AO + struct tcp_ao_info __rcu *ao_info; +#endif }; static inline struct tcp_timewait_sock *tcp_twsk(const struct sock *sk) diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 629ab0365b83..971d7edcda9c 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -85,6 +85,7 @@ struct tcp_ao_info { __unused :31; __be32 lisn; __be32 risn; + refcount_t refcnt; /* Protects twsk destruction */ struct rcu_head rcu; }; @@ -124,7 +125,8 @@ struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao, int sndid, int rcvid); int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, unsigned int len, struct tcp_sigpool *hp); -void tcp_ao_destroy_sock(struct sock *sk); +void tcp_ao_destroy_sock(struct sock *sk, bool twsk); +void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp); struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid); @@ -182,7 +184,7 @@ static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, return NULL; } -static inline void tcp_ao_destroy_sock(struct sock *sk) +static inline void tcp_ao_destroy_sock(struct sock *sk, bool twsk) { } @@ -194,6 +196,11 @@ static inline void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb) { } +static inline void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, + struct tcp_sock *tp) +{ +} + static inline void tcp_ao_connect_init(struct sock *sk) { } diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index b8afe78ff057..7c4e2f42845a 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -159,6 +159,7 @@ static struct tcp_ao_info *tcp_ao_alloc_info(gfp_t flags) if (!ao) return NULL; INIT_HLIST_HEAD(&ao->head); + refcount_set(&ao->refcnt, 1); return ao; } @@ -176,27 +177,54 @@ static void tcp_ao_key_free_rcu(struct rcu_head *head) kfree_sensitive(key); } -void tcp_ao_destroy_sock(struct sock *sk) +void tcp_ao_destroy_sock(struct sock *sk, bool twsk) { struct tcp_ao_info *ao; struct tcp_ao_key *key; struct hlist_node *n; - ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, 1); - tcp_sk(sk)->ao_info = NULL; + if (twsk) { + ao = rcu_dereference_protected(tcp_twsk(sk)->ao_info, 1); + tcp_twsk(sk)->ao_info = NULL; + } else { + ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, 1); + tcp_sk(sk)->ao_info = NULL; + } - if (!ao) + if (!ao || !refcount_dec_and_test(&ao->refcnt)) return; hlist_for_each_entry_safe(key, n, &ao->head, node) { hlist_del_rcu(&key->node); - atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc); + if (!twsk) + atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc); call_rcu(&key->rcu, tcp_ao_key_free_rcu); } kfree_rcu(ao, rcu); } +void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp) +{ + struct tcp_ao_info *ao_info = rcu_dereference_protected(tp->ao_info, 1); + + if (ao_info) { + struct tcp_ao_key *key; + struct hlist_node *n; + int omem = 0; + + hlist_for_each_entry_safe(key, n, &ao_info->head, node) { + omem += tcp_ao_sizeof_key(key); + } + + refcount_inc(&ao_info->refcnt); + atomic_sub(omem, &(((struct sock *)tp)->sk_omem_alloc)); + rcu_assign_pointer(tcptw->ao_info, ao_info); + } else { + tcptw->ao_info = NULL; + } +} + /* 4 tuple and ISNs are expected in NBO */ static int tcp_v4_ao_calc_key(struct tcp_ao_key *mkt, u8 *key, __be32 saddr, __be32 daddr, @@ -514,11 +542,13 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, if (!sk) return -ENOTCONN; - if ((1 << sk->sk_state) & - (TCPF_LISTEN | TCPF_NEW_SYN_RECV | TCPF_TIME_WAIT)) + if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) { return -1; - ao_info = rcu_dereference(tcp_sk(sk)->ao_info); + if (sk->sk_state == TCP_TIME_WAIT) + ao_info = rcu_dereference(tcp_twsk(sk)->ao_info); + else + ao_info = rcu_dereference(tcp_sk(sk)->ao_info); if (!ao_info) return -ENOENT; @@ -910,6 +940,9 @@ static struct tcp_ao_info *setsockopt_ao_info(struct sock *sk) if (sk_fullsock(sk)) { return rcu_dereference_protected(tcp_sk(sk)->ao_info, lockdep_sock_is_held(sk)); + } else if (sk->sk_state == TCP_TIME_WAIT) { + return rcu_dereference_protected(tcp_twsk(sk)->ao_info, + lockdep_sock_is_held(sk)); } return ERR_PTR(-ESOCKTNOSUPPORT); } diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 71e1cbb0020b..a78112d78d06 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -911,17 +911,13 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) static void tcp_v4_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, - struct tcp_md5sig_key *key, + struct tcp_key *key, int reply_flags, u8 tos, u32 txhash) { const struct tcphdr *th = tcp_hdr(skb); struct { struct tcphdr th; - __be32 opt[(TCPOLEN_TSTAMP_ALIGNED >> 2) -#ifdef CONFIG_TCP_MD5SIG - + (TCPOLEN_MD5SIG_ALIGNED >> 2) -#endif - ]; + __be32 opt[(MAX_TCP_OPTION_SPACE >> 2)]; } rep; struct net *net = sock_net(sk); struct ip_reply_arg arg; @@ -952,7 +948,7 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.window = htons(win); #ifdef CONFIG_TCP_MD5SIG - if (key) { + if (tcp_key_is_md5(key)) { int offset = (tsecr) ? 3 : 0; rep.opt[offset++] = htonl((TCPOPT_NOP << 24) | @@ -963,9 +959,27 @@ static void tcp_v4_send_ack(const struct sock *sk, rep.th.doff = arg.iov[0].iov_len/4; tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[offset], - key, ip_hdr(skb)->saddr, + key->md5_key, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, &rep.th); } +#endif +#ifdef CONFIG_TCP_AO + if (tcp_key_is_ao(key)) { + int offset = (tsecr) ? 3 : 0; + + rep.opt[offset++] = htonl((TCPOPT_AO << 24) | + (tcp_ao_len(key->ao_key) << 16) | + (key->ao_key->sndid << 8) | + key->rcv_next); + arg.iov[0].iov_len += round_up(tcp_ao_len(key->ao_key), 4); + rep.th.doff = arg.iov[0].iov_len / 4; + + tcp_ao_hash_hdr(AF_INET, (char *)&rep.opt[offset], + key->ao_key, key->traffic_key, + (union tcp_ao_addr *)&ip_hdr(skb)->saddr, + (union tcp_ao_addr *)&ip_hdr(skb)->daddr, + &rep.th, key->sne); + } #endif arg.flags = reply_flags; arg.csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr, @@ -999,18 +1013,50 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb) { struct inet_timewait_sock *tw = inet_twsk(sk); struct tcp_timewait_sock *tcptw = tcp_twsk(sk); + struct tcp_key key = {}; +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao_info; + + /* FIXME: the segment to-be-acked is not verified yet */ + ao_info = rcu_dereference(tcptw->ao_info); + if (ao_info) { + const struct tcp_ao_hdr *aoh; + + if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) { + inet_twsk_put(tw); + return; + } + + if (aoh) + key.ao_key = tcp_ao_established_key(ao_info, aoh->rnext_keyid, -1); + } + if (key.ao_key) { + struct tcp_ao_key *rnext_key; + + key.traffic_key = snd_other_key(key.ao_key); + rnext_key = READ_ONCE(ao_info->rnext_key); + key.rcv_next = rnext_key->rcvid; + key.type = TCP_KEY_AO; +#else + if (0) { +#endif +#ifdef CONFIG_TCP_MD5SIG + } else if (static_branch_unlikely(&tcp_md5_needed.key)) { + key.md5_key = tcp_twsk_md5_key(tcptw); + if (key.md5_key) + key.type = TCP_KEY_MD5; +#endif + } tcp_v4_send_ack(sk, skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt, tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale, tcp_tw_tsval(tcptw), tcptw->tw_ts_recent, - tw->tw_bound_dev_if, - tcp_twsk_md5_key(tcptw), + tw->tw_bound_dev_if, &key, tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0, tw->tw_tos, - tw->tw_txhash - ); + tw->tw_txhash); inet_twsk_put(tw); } @@ -1018,8 +1064,7 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb) static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, struct request_sock *req) { - const union tcp_md5_addr *addr; - int l3index; + struct tcp_key key = {}; /* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV * sk->sk_state == TCP_SYN_RECV -> for Fast Open. @@ -1032,15 +1077,24 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, * exception of segments, MUST be right-shifted by * Rcv.Wind.Shift bits: */ - addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr; - l3index = tcp_v4_sdif(skb) ? inet_iif(skb) : 0; +#ifdef CONFIG_TCP_MD5SIG + if (static_branch_unlikely(&tcp_md5_needed.key)) { + const union tcp_md5_addr *addr; + int l3index; + + addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr; + l3index = tcp_v4_sdif(skb) ? inet_iif(skb) : 0; + key.md5_key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET); + if (key.md5_key) + key.type = TCP_KEY_MD5; + } +#endif tcp_v4_send_ack(sk, skb, seq, tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, tcp_rsk_tsval(tcp_rsk(req)), READ_ONCE(req->ts_recent), - 0, - tcp_md5_do_lookup(sk, l3index, addr, AF_INET), + 0, &key, inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0, ip_hdr(skb)->tos, READ_ONCE(tcp_rsk(req)->txhash)); @@ -2404,7 +2458,7 @@ void tcp_v4_destroy_sock(struct sock *sk) rcu_assign_pointer(tp->md5sig_info, NULL); } #endif - tcp_ao_destroy_sock(sk); + tcp_ao_destroy_sock(sk, false); /* Clean up a referenced TCP bind bucket. */ if (inet_csk(sk)->icsk_bind_hash) diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 3dcb3fc36e64..6810cf65a322 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -279,7 +279,7 @@ static void tcp_time_wait_init(struct sock *sk, struct tcp_timewait_sock *tcptw) void tcp_time_wait(struct sock *sk, int state, int timeo) { const struct inet_connection_sock *icsk = inet_csk(sk); - const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct net *net = sock_net(sk); struct inet_timewait_sock *tw; @@ -316,6 +316,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) #endif tcp_time_wait_init(sk, tcptw); + tcp_ao_time_wait(tcptw, tp); /* Get the TIME_WAIT timeout firing. */ if (timeo < rto) @@ -370,6 +371,7 @@ void tcp_twsk_destructor(struct sock *sk) call_rcu(&twsk->tw_md5_key->rcu, tcp_md5_twsk_free_rcu); } #endif + tcp_ao_destroy_sock(sk, true); } EXPORT_SYMBOL_GPL(tcp_twsk_destructor); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 5eb81fb4d163..f1e6b63122f6 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -4015,7 +4015,7 @@ int tcp_connect(struct sock *sk) * then free up ao_info if allocated. */ if (needs_md5) { - tcp_ao_destroy_sock(sk); + tcp_ao_destroy_sock(sk, false); } else if (needs_ao) { tcp_clear_md5_list(sk); kfree(rcu_replace_pointer(tp->md5sig_info, NULL, diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 4c888d66f858..4ee24a95f542 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -778,13 +778,6 @@ static int tcp_v6_md5_hash_skb(char *md5_hash, memset(md5_hash, 0, 16); return 1; } -#else /* CONFIG_TCP_MD5SIG */ -static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk, - const struct in6_addr *addr, - int l3index) -{ - return NULL; -} #endif static void tcp_v6_init_req(struct request_sock *req, @@ -1134,39 +1127,81 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) static void tcp_v6_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, - struct tcp_md5sig_key *md5_key, u8 tclass, + struct tcp_key *key, u8 tclass, __be32 label, u32 priority, u32 txhash) { - struct tcp_key key = { - .md5_key = md5_key, - .type = md5_key ? TCP_KEY_MD5 : TCP_KEY_NONE, - }; - tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, 0, - tclass, label, priority, txhash, &key); + tclass, label, priority, txhash, key); } static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) { struct inet_timewait_sock *tw = inet_twsk(sk); struct tcp_timewait_sock *tcptw = tcp_twsk(sk); + struct tcp_key key = {}; +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao_info; + + /* FIXME: the segment to-be-acked is not verified yet */ + ao_info = rcu_dereference(tcptw->ao_info); + if (ao_info) { + const struct tcp_ao_hdr *aoh; + + /* Invalid TCP option size or twice included auth */ + if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) + goto out; + if (aoh) { + key.ao_key = tcp_ao_established_key(ao_info, + aoh->rnext_keyid, -1); + } + } + if (key.ao_key) { + struct tcp_ao_key *rnext_key; + + key.traffic_key = snd_other_key(key.ao_key); + /* rcv_next switches to our rcv_next */ + rnext_key = READ_ONCE(ao_info->rnext_key); + key.rcv_next = rnext_key->rcvid; + key.type = TCP_KEY_AO; +#else + if (0) { +#endif +#ifdef CONFIG_TCP_MD5SIG + } else if (static_branch_unlikely(&tcp_md5_needed.key)) { + key.md5_key = tcp_twsk_md5_key(tcptw); + if (key.md5_key) + key.type = TCP_KEY_MD5; +#endif + } tcp_v6_send_ack(sk, skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt, tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale, tcp_tw_tsval(tcptw), - tcptw->tw_ts_recent, tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw), + tcptw->tw_ts_recent, tw->tw_bound_dev_if, &key, tw->tw_tclass, cpu_to_be32(tw->tw_flowlabel), tw->tw_priority, tw->tw_txhash); +#ifdef CONFIG_TCP_AO +out: +#endif inet_twsk_put(tw); } static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, struct request_sock *req) { - int l3index; + struct tcp_key key = {}; - l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; +#ifdef CONFIG_TCP_MD5SIG + if (static_branch_unlikely(&tcp_md5_needed.key)) { + int l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; + + key.md5_key = tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, + l3index); + if (key.md5_key) + key.type = TCP_KEY_MD5; + } +#endif /* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV * sk->sk_state == TCP_SYN_RECV -> for Fast Open. @@ -1182,8 +1217,7 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, tcp_rsk_tsval(tcp_rsk(req)), READ_ONCE(req->ts_recent), sk->sk_bound_dev_if, - tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index), - ipv6_get_dsfield(ipv6_hdr(skb)), 0, + &key, ipv6_get_dsfield(ipv6_hdr(skb)), 0, READ_ONCE(sk->sk_priority), READ_ONCE(tcp_rsk(req)->txhash)); } From patchwork Mon Oct 23 19:22:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157059 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504599vqx; Mon, 23 Oct 2023 12:23:31 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFQZYID4SUh51CqjyV6CxhwJUX0+pGsF0ALc69Bu8Y5kElUCnomHDLQffft0kLQdm5pwezj X-Received: by 2002:aa7:8424:0:b0:6bc:d1ad:d1c5 with SMTP id q4-20020aa78424000000b006bcd1add1c5mr7121631pfn.17.1698089011322; Mon, 23 Oct 2023 12:23:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089011; cv=none; d=google.com; s=arc-20160816; b=p7mm02z+z/5ABPxcPlR+o3YxpXFM10dOgcFeRvQ88wvVziCYKg0yRt2r3/VC3cZKbx cEkQNn2AZCknxaopdXZb9YPf/ciRxcuxIHNp5ZUo1RN9YA0lrU60vWa4kVG9MO/bCuey 0Qdh8A8Z7EzJ00lBwFZLDdG/03csHm7AvjILKHbmvGZZfOVH4AqJJpGTOHMND4Zi+Xah 33YMEzmyqSHfK2SbTAdpSuLZ6G4iRbTCvUcOie+qaI//ds1/MuExRDZy8uyZpIE4U0WP G6JouVb7a0TzwA2RpBGi4ap6PHtpmJ+awfUjvdAeKdmPAuU7LRGCnKDcql+i+qt6EW5Y 7cQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=bnWT9kOK2NemtMmue46RwqsrjF2gS9E6v093/01P7uk=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=QgLIm5qiYGYcMpTydyKdS2eq70AGh/H8V1Yn9Egmc+wL6fHG6TjN0ksc9GIUqty4qu iPKHmqAwNZh1B6EmcUz7gd7IpUK7Bi03nLrUfpdLMN3FAf4HxpeaPsDcLbkzhkW9bpEz Xz5ZUKQ1+ZVdMz3Q+UKApMcMa9/+s/711Y+sOU/vHYyYYEIW1xDeEy0u/+22zFuRKeOu e9otDd652Sc1eVRPpPBlQ7vPviuj0r6Cwt4slqXlE8vrwf8P5eAxPN9H4X9Jnx0Q6ppz RVze8IMfS8CAQqLWmeslT91V515AkwURhIa65yTt63Kka+h81Ppt0aJLKfs9Plxu/iwa QDFQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=a6C8VYGi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id t64-20020a638143000000b005a9f776c59csi7058779pgd.468.2023.10.23.12.23.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:31 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=a6C8VYGi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 6706B80C0343; Mon, 23 Oct 2023 12:23:30 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233353AbjJWTXU (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38346 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229628AbjJWTW4 (ORCPT ); Mon, 23 Oct 2023 15:22:56 -0400 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DCD15D7E for ; Mon, 23 Oct 2023 12:22:47 -0700 (PDT) Received: by mail-wm1-x335.google.com with SMTP id 5b1f17b1804b1-407c3adef8eso31345815e9.2 for ; Mon, 23 Oct 2023 12:22:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088966; x=1698693766; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bnWT9kOK2NemtMmue46RwqsrjF2gS9E6v093/01P7uk=; b=a6C8VYGiFkkxCaWXkOLl3qEkbAnLk9gqSf3CNgJgxEdeTtFsaOYhHjiWvvucLyg2Po Xm2PxuXXzXkzQRX0QhUwepuy04M/bQyUaf/Z/0ubel/T6PjlpvbfGk9Ipb6BroKQ3ai+ tsnooEnmFNOvJBDadqbsABdjb5jDMTiVZcgDvWJ9BpKEo4bgXWISLmhEa+D3XyNpJJUY /5tGW1gFnzwD6XTiXWjttFY8N6Ua89nxM68m0Rgx9rdXI0mkLyULVlHf4UuwzmRXyZ0p s177sJptpBwtfBdOnsKBRN4qTabflKXw0IgQcnYK4NzfL0Ve7Wps+BE2r3pJThLUvN0U 5UGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088966; x=1698693766; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bnWT9kOK2NemtMmue46RwqsrjF2gS9E6v093/01P7uk=; b=AAAiBmWiQAdI6GS1xvCWaKGTjLFnkKF3rdX/NB7SWjFu4+paZpOaQMAyEfIb6avZa9 EEkv6pm2neZj0g8895nwI5npjXno/Qk5nccVBwyz/PZQnhLeDDfKCpT8liTOakibz8yC NV7ZBtSdSK4yEF24Qvkh1sFei774RslrCXRJw5bnWCYGpstjzftFovVpknb8Q3UxfJpu Un0r0P0JWd+PN0AfVYebkdGoF5NxLj5ZoIQnDjIv08rnoGZUgVX35qTiM8BqXi22XOGF z9/5FgB2/EHRuJ+vFEugczyl/TH1EbqX12OfKxvKCHDRLFB1bzy7a6ZXQJxB6PSNI8PD 1t+A== X-Gm-Message-State: AOJu0Ywdfn+mOIq4Z2Js/rwmiCbNjwpT9O2oOTlJ6AqTnyDsh0vxxUU+ U9xDtGycqKb09Ke4EoCERO/uNA== X-Received: by 2002:a05:600c:3109:b0:408:5bc6:a7d with SMTP id g9-20020a05600c310900b004085bc60a7dmr5170367wmo.19.1698088965643; Mon, 23 Oct 2023 12:22:45 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:45 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 10/23] net/tcp: Wire TCP-AO to request sockets Date: Mon, 23 Oct 2023 20:22:02 +0100 Message-ID: <20231023192217.426455-11-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:30 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575383169084175 X-GMAIL-MSGID: 1780575383169084175 Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for checking if TCP-AO option was used to create the request socket. tcp_ao_copy_all_matching() will copy all keys that match the peer on the request socket, as well as preparing them for the usage (creating traffic keys). Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/linux/tcp.h | 18 +++ include/net/tcp.h | 6 + include/net/tcp_ao.h | 24 ++++ net/ipv4/syncookies.c | 2 + net/ipv4/tcp_ao.c | 264 ++++++++++++++++++++++++++++++++++++--- net/ipv4/tcp_input.c | 15 +++ net/ipv4/tcp_ipv4.c | 66 ++++++++-- net/ipv4/tcp_minisocks.c | 10 ++ net/ipv4/tcp_output.c | 37 +++--- net/ipv6/syncookies.c | 2 + net/ipv6/tcp_ao.c | 38 +++++- net/ipv6/tcp_ipv6.c | 73 +++++++++-- 12 files changed, 505 insertions(+), 50 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index eec6e7e5312e..ec4e9367f5b0 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -166,6 +166,11 @@ struct tcp_request_sock { * after data-in-SYN. */ u8 syn_tos; +#ifdef CONFIG_TCP_AO + u8 ao_keyid; + u8 ao_rcv_next; + u8 maclen; +#endif }; static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req) @@ -173,6 +178,19 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req) return (struct tcp_request_sock *)req; } +static inline bool tcp_rsk_used_ao(const struct request_sock *req) +{ + /* The real length of MAC is saved in the request socket, + * signing anything with zero-length makes no sense, so here is + * a little hack.. + */ +#ifndef CONFIG_TCP_AO + return false; +#else + return tcp_rsk(req)->maclen != 0; +#endif +} + #define TCP_RMEM_TO_WIN_SCALE 8 struct tcp_sock { diff --git a/include/net/tcp.h b/include/net/tcp.h index ec78ec6ab723..f9df97b3a767 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2214,6 +2214,12 @@ struct tcp_request_sock_ops { const struct sock *sk, const struct sk_buff *skb); #endif +#ifdef CONFIG_TCP_AO + struct tcp_ao_key *(*ao_lookup)(const struct sock *sk, + struct request_sock *req, + int sndid, int rcvid); + int (*ao_calc_key)(struct tcp_ao_key *mkt, u8 *key, struct request_sock *sk); +#endif #ifdef CONFIG_SYN_COOKIES __u32 (*cookie_init_seq)(const struct sk_buff *skb, __u16 *mss); diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 971d7edcda9c..d2c1ee8bf7b0 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -123,6 +123,9 @@ int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family, sockptr_t optval, int optlen); struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao, int sndid, int rcvid); +int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk, + struct request_sock *req, struct sk_buff *skb, + int family); int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, unsigned int len, struct tcp_sigpool *hp); void tcp_ao_destroy_sock(struct sock *sk, bool twsk); @@ -147,6 +150,11 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, const struct sock *sk, __be32 sisn, __be32 disn, bool send); +int tcp_v4_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key, + struct request_sock *req); +struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk, + struct request_sock *req, + int sndid, int rcvid); int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, const struct sock *sk, const struct sk_buff *skb, const u8 *tkey, int hash_offset, u32 sne); @@ -154,11 +162,21 @@ int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp, const struct in6_addr *daddr, const struct in6_addr *saddr, int nbytes); +int tcp_v6_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key, + const struct sk_buff *skb, __be32 sisn, __be32 disn); int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, const struct sock *sk, __be32 sisn, __be32 disn, bool send); +int tcp_v6_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key, + struct request_sock *req); +struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk, + const struct in6_addr *addr, + int sndid, int rcvid); struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid); +struct tcp_ao_key *tcp_v6_ao_lookup_rsk(const struct sock *sk, + struct request_sock *req, + int sndid, int rcvid); int tcp_v6_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, const struct sock *sk, const struct sk_buff *skb, const u8 *tkey, int hash_offset, u32 sne); @@ -178,6 +196,12 @@ static inline int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, return 0; } +static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, + struct tcp_request_sock *treq, + unsigned short int family) +{ +} + static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid) { diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index c64334363230..0681d3e82b11 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -400,6 +400,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) treq->snt_synack = 0; treq->tfo_listener = false; + tcp_ao_syncookie(sk, skb, treq, AF_INET); + if (IS_ENABLED(CONFIG_SMC)) ireq->smc_ok = 0; diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 7c4e2f42845a..68d81704e14e 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -169,6 +169,23 @@ static void tcp_ao_link_mkt(struct tcp_ao_info *ao, struct tcp_ao_key *mkt) hlist_add_head_rcu(&mkt->node, &ao->head); } +static struct tcp_ao_key *tcp_ao_copy_key(struct sock *sk, + struct tcp_ao_key *key) +{ + struct tcp_ao_key *new_key; + + new_key = sock_kmalloc(sk, tcp_ao_sizeof_key(key), + GFP_ATOMIC); + if (!new_key) + return NULL; + + *new_key = *key; + INIT_HLIST_NODE(&new_key->node); + tcp_sigpool_get(new_key->tcp_sigpool_id); + + return new_key; +} + static void tcp_ao_key_free_rcu(struct rcu_head *head) { struct tcp_ao_key *key = container_of(head, struct tcp_ao_key, rcu); @@ -290,6 +307,42 @@ static int tcp_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, return -EOPNOTSUPP; } +int tcp_v4_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key, + struct request_sock *req) +{ + struct inet_request_sock *ireq = inet_rsk(req); + + return tcp_v4_ao_calc_key(mkt, key, + ireq->ir_loc_addr, ireq->ir_rmt_addr, + htons(ireq->ir_num), ireq->ir_rmt_port, + htonl(tcp_rsk(req)->snt_isn), + htonl(tcp_rsk(req)->rcv_isn)); +} + +static int tcp_v4_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key, + const struct sk_buff *skb, + __be32 sisn, __be32 disn) +{ + const struct iphdr *iph = ip_hdr(skb); + const struct tcphdr *th = tcp_hdr(skb); + + return tcp_v4_ao_calc_key(mkt, key, iph->saddr, iph->daddr, + th->source, th->dest, sisn, disn); +} + +static int tcp_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key, + const struct sk_buff *skb, + __be32 sisn, __be32 disn, int family) +{ + if (family == AF_INET) + return tcp_v4_ao_calc_key_skb(mkt, key, skb, sisn, disn); +#if IS_ENABLED(CONFIG_IPV6) + else if (family == AF_INET6) + return tcp_v6_ao_calc_key_skb(mkt, key, skb, sisn, disn); +#endif + return -EAFNOSUPPORT; +} + static int tcp_v4_ao_hash_pseudoheader(struct tcp_sigpool *hp, __be32 daddr, __be32 saddr, int nbytes) @@ -515,6 +568,16 @@ int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, tkey, hash_offset, sne); } +struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk, + struct request_sock *req, + int sndid, int rcvid) +{ + union tcp_ao_addr *addr = + (union tcp_ao_addr *)&inet_rsk(req)->ir_rmt_addr; + + return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid); +} + struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid) { @@ -528,7 +591,7 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, struct tcp_ao_key **key, char **traffic_key, bool *allocated_traffic_key, u8 *keyid, u32 *sne) { - struct tcp_ao_key *rnext_key; + const struct tcphdr *th = tcp_hdr(skb); struct tcp_ao_info *ao_info; *allocated_traffic_key = false; @@ -543,23 +606,62 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, return -ENOTCONN; if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) { - return -1; + unsigned int family = READ_ONCE(sk->sk_family); + union tcp_ao_addr *addr; + __be32 disn, sisn; - if (sk->sk_state == TCP_TIME_WAIT) - ao_info = rcu_dereference(tcp_twsk(sk)->ao_info); - else + if (sk->sk_state == TCP_NEW_SYN_RECV) { + struct request_sock *req = inet_reqsk(sk); + + sisn = htonl(tcp_rsk(req)->rcv_isn); + disn = htonl(tcp_rsk(req)->snt_isn); + *sne = 0; + } else { + sisn = th->seq; + disn = 0; + } + if (IS_ENABLED(CONFIG_IPV6) && family == AF_INET6) + addr = (union tcp_md5_addr *)&ipv6_hdr(skb)->saddr; + else + addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr; +#if IS_ENABLED(CONFIG_IPV6) + if (family == AF_INET6 && ipv6_addr_v4mapped(&sk->sk_v6_daddr)) + family = AF_INET; +#endif + + sk = sk_const_to_full_sk(sk); ao_info = rcu_dereference(tcp_sk(sk)->ao_info); - if (!ao_info) - return -ENOENT; + if (!ao_info) + return -ENOENT; + *key = tcp_ao_do_lookup(sk, addr, family, -1, aoh->rnext_keyid); + if (!*key) + return -ENOENT; + *traffic_key = kmalloc(tcp_ao_digest_size(*key), GFP_ATOMIC); + if (!*traffic_key) + return -ENOMEM; + *allocated_traffic_key = true; + if (tcp_ao_calc_key_skb(*key, *traffic_key, skb, + sisn, disn, family)) + return -1; + *keyid = (*key)->rcvid; + } else { + struct tcp_ao_key *rnext_key; - *key = tcp_ao_established_key(ao_info, aoh->rnext_keyid, -1); - if (!*key) - return -ENOENT; - *traffic_key = snd_other_key(*key); - rnext_key = READ_ONCE(ao_info->rnext_key); - *keyid = rnext_key->rcvid; - *sne = 0; + if (sk->sk_state == TCP_TIME_WAIT) + ao_info = rcu_dereference(tcp_twsk(sk)->ao_info); + else + ao_info = rcu_dereference(tcp_sk(sk)->ao_info); + if (!ao_info) + return -ENOENT; + *key = tcp_ao_established_key(ao_info, aoh->rnext_keyid, -1); + if (!*key) + return -ENOENT; + *traffic_key = snd_other_key(*key); + rnext_key = READ_ONCE(ao_info->rnext_key); + *keyid = rnext_key->rcvid; + *sne = 0; + } return 0; } @@ -597,6 +699,46 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, return 0; } +static struct tcp_ao_key *tcp_ao_inbound_lookup(unsigned short int family, + const struct sock *sk, const struct sk_buff *skb, + int sndid, int rcvid) +{ + if (family == AF_INET) { + const struct iphdr *iph = ip_hdr(skb); + + return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)&iph->saddr, + AF_INET, sndid, rcvid); + } else { + const struct ipv6hdr *iph = ipv6_hdr(skb); + + return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)&iph->saddr, + AF_INET6, sndid, rcvid); + } +} + +void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, + struct tcp_request_sock *treq, + unsigned short int family) +{ + const struct tcphdr *th = tcp_hdr(skb); + const struct tcp_ao_hdr *aoh; + struct tcp_ao_key *key; + + treq->maclen = 0; + + if (tcp_parse_auth_options(th, NULL, &aoh) || !aoh) + return; + + key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid); + if (!key) + /* Key not found, continue without TCP-AO */ + return; + + treq->ao_rcv_next = aoh->keyid; + treq->ao_keyid = aoh->rnext_keyid; + treq->maclen = tcp_ao_maclen(key); +} + static int tcp_ao_cache_traffic_keys(const struct sock *sk, struct tcp_ao_info *ao, struct tcp_ao_key *ao_key) @@ -704,6 +846,100 @@ void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb) tcp_ao_cache_traffic_keys(sk, ao, key); } +int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk, + struct request_sock *req, struct sk_buff *skb, + int family) +{ + struct tcp_ao_key *key, *new_key, *first_key; + struct tcp_ao_info *new_ao, *ao; + struct hlist_node *key_head; + union tcp_ao_addr *addr; + bool match = false; + int ret = -ENOMEM; + + ao = rcu_dereference(tcp_sk(sk)->ao_info); + if (!ao) + return 0; + + /* New socket without TCP-AO on it */ + if (!tcp_rsk_used_ao(req)) + return 0; + + new_ao = tcp_ao_alloc_info(GFP_ATOMIC); + if (!new_ao) + return -ENOMEM; + new_ao->lisn = htonl(tcp_rsk(req)->snt_isn); + new_ao->risn = htonl(tcp_rsk(req)->rcv_isn); + new_ao->ao_required = ao->ao_required; + + if (family == AF_INET) { + addr = (union tcp_ao_addr *)&newsk->sk_daddr; +#if IS_ENABLED(CONFIG_IPV6) + } else if (family == AF_INET6) { + addr = (union tcp_ao_addr *)&newsk->sk_v6_daddr; +#endif + } else { + ret = -EAFNOSUPPORT; + goto free_ao; + } + + hlist_for_each_entry_rcu(key, &ao->head, node) { + if (tcp_ao_key_cmp(key, addr, key->prefixlen, family, -1, -1)) + continue; + + new_key = tcp_ao_copy_key(newsk, key); + if (!new_key) + goto free_and_exit; + + tcp_ao_cache_traffic_keys(newsk, new_ao, new_key); + tcp_ao_link_mkt(new_ao, new_key); + match = true; + } + + if (!match) { + /* RFC5925 (7.4.1) specifies that the TCP-AO status + * of a connection is determined on the initial SYN. + * At this point the connection was TCP-AO enabled, so + * it can't switch to being unsigned if peer's key + * disappears on the listening socket. + */ + ret = -EKEYREJECTED; + goto free_and_exit; + } + + key_head = rcu_dereference(hlist_first_rcu(&new_ao->head)); + first_key = hlist_entry_safe(key_head, struct tcp_ao_key, node); + + key = tcp_ao_established_key(new_ao, tcp_rsk(req)->ao_keyid, -1); + if (key) + new_ao->current_key = key; + else + new_ao->current_key = first_key; + + /* set rnext_key */ + key = tcp_ao_established_key(new_ao, -1, tcp_rsk(req)->ao_rcv_next); + if (key) + new_ao->rnext_key = key; + else + new_ao->rnext_key = first_key; + + sk_gso_disable(newsk); + rcu_assign_pointer(tcp_sk(newsk)->ao_info, new_ao); + + return 0; + +free_and_exit: + hlist_for_each_entry_safe(key, key_head, &new_ao->head, node) { + hlist_del(&key->node); + tcp_sigpool_release(key->tcp_sigpool_id); + atomic_sub(tcp_ao_sizeof_key(key), &newsk->sk_omem_alloc); + kfree_sensitive(key); + } +free_ao: + kfree(new_ao); + return ret; +} + static bool tcp_ao_can_set_current_rnext(struct sock *sk) { /* There aren't current/rnext keys on TCP_LISTEN sockets */ diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 7e5e16d2e65c..475af9767615 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -7044,6 +7044,10 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, struct flowi fl; u8 syncookies; +#ifdef CONFIG_TCP_AO + const struct tcp_ao_hdr *aoh; +#endif + syncookies = READ_ONCE(net->ipv4.sysctl_tcp_syncookies); /* TW buckets are converted to open requests without @@ -7130,6 +7134,17 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, inet_rsk(req)->ecn_ok = 0; } +#ifdef CONFIG_TCP_AO + if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) + goto drop_and_release; /* Invalid TCP options */ + if (aoh) { + tcp_rsk(req)->maclen = aoh->length - sizeof(struct tcp_ao_hdr); + tcp_rsk(req)->ao_rcv_next = aoh->keyid; + tcp_rsk(req)->ao_keyid = aoh->rnext_keyid; + } else { + tcp_rsk(req)->maclen = 0; + } +#endif tcp_rsk(req)->snt_isn = isn; tcp_rsk(req)->txhash = net_tx_rndhash(); tcp_rsk(req)->syn_tos = TCP_SKB_CB(skb)->ip_dsfield; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index a78112d78d06..2bd56b57f3c9 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1072,13 +1072,47 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq = (sk->sk_state == TCP_LISTEN) ? tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt; - /* RFC 7323 2.3 - * The window field (SEG.WND) of every outgoing segment, with the - * exception of segments, MUST be right-shifted by - * Rcv.Wind.Shift bits: - */ +#ifdef CONFIG_TCP_AO + if (tcp_rsk_used_ao(req)) { + const union tcp_md5_addr *addr; + const struct tcp_ao_hdr *aoh; + + /* Invalid TCP option size or twice included auth */ + if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) + return; + if (!aoh) + return; + + addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr; + key.ao_key = tcp_ao_do_lookup(sk, addr, AF_INET, + aoh->rnext_keyid, -1); + if (unlikely(!key.ao_key)) { + /* Send ACK with any matching MKT for the peer */ + key.ao_key = tcp_ao_do_lookup(sk, addr, AF_INET, -1, -1); + /* Matching key disappeared (user removed the key?) + * let the handshake timeout. + */ + if (!key.ao_key) { + net_info_ratelimited("TCP-AO key for (%pI4, %d)->(%pI4, %d) suddenly disappeared, won't ACK new connection\n", + addr, + ntohs(tcp_hdr(skb)->source), + &ip_hdr(skb)->daddr, + ntohs(tcp_hdr(skb)->dest)); + return; + } + } + key.traffic_key = kmalloc(tcp_ao_digest_size(key.ao_key), GFP_ATOMIC); + if (!key.traffic_key) + return; + + key.type = TCP_KEY_AO; + key.rcv_next = aoh->keyid; + tcp_v4_ao_calc_key_rsk(key.ao_key, key.traffic_key, req); +#else + if (0) { +#endif #ifdef CONFIG_TCP_MD5SIG - if (static_branch_unlikely(&tcp_md5_needed.key)) { + } else if (static_branch_unlikely(&tcp_md5_needed.key)) { const union tcp_md5_addr *addr; int l3index; @@ -1087,8 +1121,14 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, key.md5_key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET); if (key.md5_key) key.type = TCP_KEY_MD5; - } #endif + } + + /* RFC 7323 2.3 + * The window field (SEG.WND) of every outgoing segment, with the + * exception of segments, MUST be right-shifted by + * Rcv.Wind.Shift bits: + */ tcp_v4_send_ack(sk, skb, seq, tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, @@ -1098,6 +1138,8 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0, ip_hdr(skb)->tos, READ_ONCE(tcp_rsk(req)->txhash)); + if (tcp_key_is_ao(&key)) + kfree(key.traffic_key); } /* @@ -1636,6 +1678,10 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = { .req_md5_lookup = tcp_v4_md5_lookup, .calc_md5_hash = tcp_v4_md5_hash_skb, #endif +#ifdef CONFIG_TCP_AO + .ao_lookup = tcp_v4_ao_lookup_rsk, + .ao_calc_key = tcp_v4_ao_calc_key_rsk, +#endif #ifdef CONFIG_SYN_COOKIES .cookie_init_seq = cookie_v4_init_sequence, #endif @@ -1737,12 +1783,16 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb, /* Copy over the MD5 key from the original socket */ addr = (union tcp_md5_addr *)&newinet->inet_daddr; key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET); - if (key) { + if (key && !tcp_rsk_used_ao(req)) { if (tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key)) goto put_and_exit; sk_gso_disable(newsk); } #endif +#ifdef CONFIG_TCP_AO + if (tcp_ao_copy_all_matching(sk, newsk, req, skb, AF_INET)) + goto put_and_exit; /* OOM, release back memory */ +#endif if (__inet_inherit_port(sk, newsk) < 0) goto put_and_exit; diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 6810cf65a322..8d941a6e066b 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -506,6 +506,9 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, const struct tcp_sock *oldtp; struct tcp_sock *newtp; u32 seq; +#ifdef CONFIG_TCP_AO + struct tcp_ao_key *ao_key; +#endif if (!newsk) return NULL; @@ -594,6 +597,13 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, #ifdef CONFIG_TCP_MD5SIG newtp->md5sig_info = NULL; /*XXX*/ #endif +#ifdef CONFIG_TCP_AO + newtp->ao_info = NULL; + ao_key = treq->af_specific->ao_lookup(sk, req, + tcp_rsk(req)->ao_keyid, -1); + if (ao_key) + newtp->tcp_header_len += tcp_ao_len(ao_key); + #endif if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len) newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len; newtp->rx_opt.mss_clamp = req->mss; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index f1e6b63122f6..63a690107ed4 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -615,6 +615,7 @@ static void bpf_skops_write_hdr_opt(struct sock *sk, struct sk_buff *skb, * (but it may well be that other scenarios fail similarly). */ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp, + const struct tcp_request_sock *tcprsk, struct tcp_out_options *opts, struct tcp_key *key) { @@ -629,20 +630,28 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp, ptr += 4; } else if (tcp_key_is_ao(key)) { #ifdef CONFIG_TCP_AO - struct tcp_ao_key *rnext_key; - struct tcp_ao_info *ao_info; - u8 maclen; + u8 maclen = tcp_ao_maclen(key->ao_key); - ao_info = rcu_dereference_check(tp->ao_info, + if (tcprsk) { + u8 aolen = maclen + sizeof(struct tcp_ao_hdr); + + *ptr++ = htonl((TCPOPT_AO << 24) | (aolen << 16) | + (tcprsk->ao_keyid << 8) | + (tcprsk->ao_rcv_next)); + } else { + struct tcp_ao_key *rnext_key; + struct tcp_ao_info *ao_info; + + ao_info = rcu_dereference_check(tp->ao_info, lockdep_sock_is_held(&tp->inet_conn.icsk_inet.sk)); - rnext_key = READ_ONCE(ao_info->rnext_key); - if (WARN_ON_ONCE(!rnext_key)) - goto out_ao; - maclen = tcp_ao_maclen(key->ao_key); - *ptr++ = htonl((TCPOPT_AO << 24) | - (tcp_ao_len(key->ao_key) << 16) | - (key->ao_key->sndid << 8) | - (rnext_key->rcvid)); + rnext_key = READ_ONCE(ao_info->rnext_key); + if (WARN_ON_ONCE(!rnext_key)) + goto out_ao; + *ptr++ = htonl((TCPOPT_AO << 24) | + (tcp_ao_len(key->ao_key) << 16) | + (key->ao_key->sndid << 8) | + (rnext_key->rcvid)); + } opts->hash_location = (__u8 *)ptr; ptr += maclen / sizeof(*ptr); if (unlikely(maclen % sizeof(*ptr))) { @@ -1386,7 +1395,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, th->window = htons(min(tp->rcv_wnd, 65535U)); } - tcp_options_write(th, tp, &opts, &key); + tcp_options_write(th, tp, NULL, &opts, &key); if (tcp_key_is_md5(&key)) { #ifdef CONFIG_TCP_MD5SIG @@ -3747,7 +3756,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, /* RFC1323: The window in SYN & SYN/ACK segments is never scaled. */ th->window = htons(min(req->rsk_rcv_wnd, 65535U)); - tcp_options_write(th, NULL, &opts, &key); + tcp_options_write(th, NULL, NULL, &opts, &key); th->doff = (tcp_header_size >> 2); TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS); diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index 5014aa663452..ad7a8caa7b2a 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -214,6 +214,8 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) treq->snt_isn = cookie; treq->ts_off = 0; treq->txhash = net_tx_rndhash(); + tcp_ao_syncookie(sk, skb, treq, AF_INET6); + if (IS_ENABLED(CONFIG_SMC)) ireq->smc_ok = 0; diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c index d08735b6f3c5..c9a6fa84f6ce 100644 --- a/net/ipv6/tcp_ao.c +++ b/net/ipv6/tcp_ao.c @@ -49,6 +49,17 @@ static int tcp_v6_ao_calc_key(struct tcp_ao_key *mkt, u8 *key, return err; } +int tcp_v6_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key, + const struct sk_buff *skb, + __be32 sisn, __be32 disn) +{ + const struct ipv6hdr *iph = ipv6_hdr(skb); + const struct tcphdr *th = tcp_hdr(skb); + + return tcp_v6_ao_calc_key(mkt, key, &iph->saddr, &iph->daddr, + th->source, th->dest, sisn, disn); +} + int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, const struct sock *sk, __be32 sisn, __be32 disn, bool send) @@ -63,9 +74,21 @@ int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, htons(sk->sk_num), disn, sisn); } -static struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk, - const struct in6_addr *addr, - int sndid, int rcvid) +int tcp_v6_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key, + struct request_sock *req) +{ + struct inet_request_sock *ireq = inet_rsk(req); + + return tcp_v6_ao_calc_key(mkt, key, + &ireq->ir_v6_loc_addr, &ireq->ir_v6_rmt_addr, + htons(ireq->ir_num), ireq->ir_rmt_port, + htonl(tcp_rsk(req)->snt_isn), + htonl(tcp_rsk(req)->rcv_isn)); +} + +struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk, + const struct in6_addr *addr, + int sndid, int rcvid) { return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)addr, AF_INET6, sndid, rcvid); @@ -80,6 +103,15 @@ struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid); } +struct tcp_ao_key *tcp_v6_ao_lookup_rsk(const struct sock *sk, + struct request_sock *req, + int sndid, int rcvid) +{ + struct in6_addr *addr = &inet_rsk(req)->ir_v6_rmt_addr; + + return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid); +} + int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp, const struct in6_addr *daddr, const struct in6_addr *saddr, int nbytes) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 4ee24a95f542..e2fee6c464bf 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -836,6 +836,10 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = { .req_md5_lookup = tcp_v6_md5_lookup, .calc_md5_hash = tcp_v6_md5_hash_skb, #endif +#ifdef CONFIG_TCP_AO + .ao_lookup = tcp_v6_ao_lookup_rsk, + .ao_calc_key = tcp_v6_ao_calc_key_rsk, +#endif #ifdef CONFIG_SYN_COOKIES .cookie_init_seq = cookie_v6_init_sequence, #endif @@ -1192,16 +1196,54 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, { struct tcp_key key = {}; +#ifdef CONFIG_TCP_AO + if (tcp_rsk_used_ao(req)) { + const struct in6_addr *addr = &ipv6_hdr(skb)->saddr; + const struct tcp_ao_hdr *aoh; + int l3index; + + l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; + /* Invalid TCP option size or twice included auth */ + if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) + return; + if (!aoh) + return; + key.ao_key = tcp_v6_ao_do_lookup(sk, addr, aoh->rnext_keyid, -1); + if (unlikely(!key.ao_key)) { + /* Send ACK with any matching MKT for the peer */ + key.ao_key = tcp_v6_ao_do_lookup(sk, addr, -1, -1); + /* Matching key disappeared (user removed the key?) + * let the handshake timeout. + */ + if (!key.ao_key) { + net_info_ratelimited("TCP-AO key for (%pI6, %d)->(%pI6, %d) suddenly disappeared, won't ACK new connection\n", + addr, + ntohs(tcp_hdr(skb)->source), + &ipv6_hdr(skb)->daddr, + ntohs(tcp_hdr(skb)->dest)); + return; + } + } + key.traffic_key = kmalloc(tcp_ao_digest_size(key.ao_key), GFP_ATOMIC); + if (!key.traffic_key) + return; + + key.type = TCP_KEY_AO; + key.rcv_next = aoh->keyid; + tcp_v6_ao_calc_key_rsk(key.ao_key, key.traffic_key, req); +#else + if (0) { +#endif #ifdef CONFIG_TCP_MD5SIG - if (static_branch_unlikely(&tcp_md5_needed.key)) { + } else if (static_branch_unlikely(&tcp_md5_needed.key)) { int l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; key.md5_key = tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index); if (key.md5_key) key.type = TCP_KEY_MD5; - } #endif + } /* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV * sk->sk_state == TCP_SYN_RECV -> for Fast Open. @@ -1220,6 +1262,8 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, &key, ipv6_get_dsfield(ipv6_hdr(skb)), 0, READ_ONCE(sk->sk_priority), READ_ONCE(tcp_rsk(req)->txhash)); + if (tcp_key_is_ao(&key)) + kfree(key.traffic_key); } @@ -1449,19 +1493,26 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * #ifdef CONFIG_TCP_MD5SIG l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif); - /* Copy over the MD5 key from the original socket */ - key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr, l3index); - if (key) { - const union tcp_md5_addr *addr; + if (!tcp_rsk_used_ao(req)) { + /* Copy over the MD5 key from the original socket */ + key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr, l3index); + if (key) { + const union tcp_md5_addr *addr; - addr = (union tcp_md5_addr *)&newsk->sk_v6_daddr; - if (tcp_md5_key_copy(newsk, addr, AF_INET6, 128, l3index, key)) { - inet_csk_prepare_forced_close(newsk); - tcp_done(newsk); - goto out; + addr = (union tcp_md5_addr *)&newsk->sk_v6_daddr; + if (tcp_md5_key_copy(newsk, addr, AF_INET6, 128, l3index, key)) { + inet_csk_prepare_forced_close(newsk); + tcp_done(newsk); + goto out; + } } } #endif +#ifdef CONFIG_TCP_AO + /* Copy over tcp_ao_info if any */ + if (tcp_ao_copy_all_matching(sk, newsk, req, skb, AF_INET6)) + goto out; /* OOM */ +#endif if (__inet_inherit_port(sk, newsk) < 0) { inet_csk_prepare_forced_close(newsk); From patchwork Mon Oct 23 19:22:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157058 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504544vqx; Mon, 23 Oct 2023 12:23:25 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGtUcyZ0Wf/KMYSc1G+9/s0mKLnQaXk+Ufx7Mk0Ts+mHh81b1ELq77mD7ZqpX4Bm2EIxMs6 X-Received: by 2002:a17:902:d0ca:b0:1ca:3e40:be9c with SMTP id n10-20020a170902d0ca00b001ca3e40be9cmr8455049pln.34.1698089005065; Mon, 23 Oct 2023 12:23:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089005; cv=none; d=google.com; s=arc-20160816; b=1D58Hg21mGeUjPJnzA00MxEMnLSqmTWLyRIDaJapCqplVAaWWomZdC9tfbtce1HLoO 7cgTYZZyMoj86o5ZLgCUW45nXA3q4L0mbpuBTqhMbVxGNfrUqjJZk+dvSIhus5WUZvLN YJajkUPnob51kRj6LCzQg+gk7xEN7AjZfNIAD+hWraXg15yD50kHLfR+t4pznojZH1f9 1KtbP7CM5dgHkFUlJiywBxf6js2Nz4qEAdnz2XPZOYL1uLoxj9wcG5wzNiQtKHL5Yu63 iPaGYxNK73UAbse/Wgl4DNML2pmDOZ2/1Z8vmwdxMWML/chhy/9tWIJBaotwYJYyKX3W qcUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=l0aaSc1IOtw0MVJOdbhyug9jf6AOQ6WtzoETwRHLe9Q=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=DsjKmFxWk9eIOImzZ1ZxledSYf6vMyN9zZOs2I1Q5kYuaZ6l4ce1pUEAynFjdriqTE ieD/AnlNsLpsduehYogUY2R4weECK2U9pZjDw6wUbQ/jLogRflsdE22Ft9drpKC12ccK b96gsaDSyDPf8fx8QOJkzZ040hjHDllTOlKKn3QrfoC7oA0L/OBOZUbojx1RzMwO+Etx 2YU695gApxWy5Ituxr5jQP+iJgdUjtl/4umfhFXMDGANebx13Wp+PfmjpnPePDF8BraC aSkEpDUStyboSFtmF9HwGxwx/L30uOQ20CCrnXCPwAZO83EGUHqSo8G6PfDpL6Td0IBo 7J9g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=EFQ7kxS6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id m11-20020a1709026bcb00b001c9abd736fbsi6644213plt.305.2023.10.23.12.23.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:25 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=EFQ7kxS6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 26AD880BE2E3; Mon, 23 Oct 2023 12:23:24 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232572AbjJWTXP (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231849AbjJWTW4 (ORCPT ); Mon, 23 Oct 2023 15:22:56 -0400 Received: from mail-lf1-x132.google.com (mail-lf1-x132.google.com [IPv6:2a00:1450:4864:20::132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 787D910E5 for ; Mon, 23 Oct 2023 12:22:49 -0700 (PDT) Received: by mail-lf1-x132.google.com with SMTP id 2adb3069b0e04-50797cf5b69so5046991e87.2 for ; Mon, 23 Oct 2023 12:22:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088967; x=1698693767; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=l0aaSc1IOtw0MVJOdbhyug9jf6AOQ6WtzoETwRHLe9Q=; b=EFQ7kxS6D1P+imrgE38kTYTe8n9WWl1G4VNW9jDN0KSoGFOb+GPpjOtFx5fpDF91ec Ei9HOCexsrLsH/Vdoq9ZL4OyGgv5z6swS2x4Z4Ojg2gumgKvwuQC9mCkaovDrLxncv4d u7+swcI++K9MTAwCHmTcvKV6H6CMBEt766pm7cFCXWhgGcutlJcfBUp6ySOmPUGMbOwM PRhiTU5cCwZrR4zs1SesM3mXBV7p8KYJtiNCkjd7YykEtoJvoWi7kHjq60LKGJ3Rkj/B EFtCsuj0yHDvRDBuA+H63In3b9o0I5Dw7CgFDcJrhQbD4M+1b61nTX0Zi4KKgC+tO3WI GRiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088967; x=1698693767; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=l0aaSc1IOtw0MVJOdbhyug9jf6AOQ6WtzoETwRHLe9Q=; b=banaoq4JwflI+KhwRvXXsWtCSPKSR9qCBbF/3vgA7Fj7PKXupj/rWziS4WHB4ydsnI PPCUmtpnkPDZWromNPN0dozo/Qsun0dhbbN5+5Blm62DMFG9RKP0NlbGBoCgdpCcD/9L Axg+L2u64Ce/naB93I9r4z1Upv/QZdNzZbxZ6an3mgpzuL62C8FMlTFpTR6g4zcnojuw YMWBbyMuhNDRUHpIv8CSZNKNyB0wwcbudwLJtqIYP/pihe7u+EE/esuwBT7OvA5xFlOp Yo9T2d9G/phyX7RxAaKDe2/OL/MNyTDM/YiuaM8ONbOV5j6x1+j7ne9aGiSnEoub5H3P n/8g== X-Gm-Message-State: AOJu0Yy8ZMOJEEFs8unaf/fEqQ49GNclJqsQ8CxU1VWqmcX6s0KGJgYD cxCPtznY06A07241yDgSfvU1QQ== X-Received: by 2002:a2e:b059:0:b0:2c5:f54:2497 with SMTP id d25-20020a2eb059000000b002c50f542497mr7026429ljl.39.1698088967520; Mon, 23 Oct 2023 12:22:47 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:46 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 11/23] net/tcp: Sign SYN-ACK segments with TCP-AO Date: Mon, 23 Oct 2023 20:22:03 +0100 Message-ID: <20231023192217.426455-12-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:24 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575376535504580 X-GMAIL-MSGID: 1780575376535504580 Similarly to RST segments, wire SYN-ACKs to TCP-AO. tcp_rsk_used_ao() is handy here to check if the request socket used AO and needs a signature on the outgoing segments. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 3 ++ include/net/tcp_ao.h | 6 ++++ net/ipv4/tcp_ao.c | 22 +++++++++++++ net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_output.c | 72 +++++++++++++++++++++++++++++++++---------- net/ipv6/tcp_ao.c | 22 +++++++++++++ net/ipv6/tcp_ipv6.c | 1 + 7 files changed, 111 insertions(+), 16 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index f9df97b3a767..9ef9fcf88c5b 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2219,6 +2219,9 @@ struct tcp_request_sock_ops { struct request_sock *req, int sndid, int rcvid); int (*ao_calc_key)(struct tcp_ao_key *mkt, u8 *key, struct request_sock *sk); + int (*ao_synack_hash)(char *ao_hash, struct tcp_ao_key *mkt, + struct request_sock *req, const struct sk_buff *skb, + int hash_offset, u32 sne); #endif #ifdef CONFIG_SYN_COOKIES __u32 (*cookie_init_seq)(const struct sk_buff *skb, diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index d2c1ee8bf7b0..1d69978e349a 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -147,6 +147,9 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid); +int tcp_v4_ao_synack_hash(char *ao_hash, struct tcp_ao_key *mkt, + struct request_sock *req, const struct sk_buff *skb, + int hash_offset, u32 sne); int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, const struct sock *sk, __be32 sisn, __be32 disn, bool send); @@ -181,6 +184,9 @@ int tcp_v6_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, const struct sock *sk, const struct sk_buff *skb, const u8 *tkey, int hash_offset, u32 sne); int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen); +int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key, + struct request_sock *req, const struct sk_buff *skb, + int hash_offset, u32 sne); void tcp_ao_established(struct sock *sk); void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb); void tcp_ao_connect_init(struct sock *sk); diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 68d81704e14e..de3710758d55 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -568,6 +568,28 @@ int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key, tkey, hash_offset, sne); } +int tcp_v4_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key, + struct request_sock *req, const struct sk_buff *skb, + int hash_offset, u32 sne) +{ + void *hash_buf = NULL; + int err; + + hash_buf = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC); + if (!hash_buf) + return -ENOMEM; + + err = tcp_v4_ao_calc_key_rsk(ao_key, hash_buf, req); + if (err) + goto out; + + err = tcp_ao_hash_skb(AF_INET, ao_hash, ao_key, req_to_sk(req), skb, + hash_buf, hash_offset, sne); +out: + kfree(hash_buf); + return err; +} + struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk, struct request_sock *req, int sndid, int rcvid) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 2bd56b57f3c9..bdf0224ae827 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1681,6 +1681,7 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = { #ifdef CONFIG_TCP_AO .ao_lookup = tcp_v4_ao_lookup_rsk, .ao_calc_key = tcp_v4_ao_calc_key_rsk, + .ao_synack_hash = tcp_v4_ao_synack_hash, #endif #ifdef CONFIG_SYN_COOKIES .cookie_init_seq = cookie_v4_init_sequence, diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 63a690107ed4..e8ccad50f1c5 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -886,7 +886,7 @@ static unsigned int tcp_synack_options(const struct sock *sk, struct request_sock *req, unsigned int mss, struct sk_buff *skb, struct tcp_out_options *opts, - const struct tcp_md5sig_key *md5, + const struct tcp_key *key, struct tcp_fastopen_cookie *foc, enum tcp_synack_type synack_type, struct sk_buff *syn_skb) @@ -894,8 +894,7 @@ static unsigned int tcp_synack_options(const struct sock *sk, struct inet_request_sock *ireq = inet_rsk(req); unsigned int remaining = MAX_TCP_OPTION_SPACE; -#ifdef CONFIG_TCP_MD5SIG - if (md5) { + if (tcp_key_is_md5(key)) { opts->options |= OPTION_MD5; remaining -= TCPOLEN_MD5SIG_ALIGNED; @@ -906,8 +905,11 @@ static unsigned int tcp_synack_options(const struct sock *sk, */ if (synack_type != TCP_SYNACK_COOKIE) ireq->tstamp_ok &= !ireq->sack_ok; + } else if (tcp_key_is_ao(key)) { + opts->options |= OPTION_AO; + remaining -= tcp_ao_len(key->ao_key); + ireq->tstamp_ok &= !ireq->sack_ok; } -#endif /* We always send an MSS option. */ opts->mss = mss; @@ -3671,7 +3673,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, { struct inet_request_sock *ireq = inet_rsk(req); const struct tcp_sock *tp = tcp_sk(sk); - struct tcp_md5sig_key *md5 = NULL; struct tcp_out_options opts; struct tcp_key key = {}; struct sk_buff *skb; @@ -3725,18 +3726,48 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, tcp_rsk(req)->snt_synack = tcp_skb_timestamp_us(skb); } -#ifdef CONFIG_TCP_MD5SIG +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) rcu_read_lock(); - md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req)); - if (md5) - key.type = TCP_KEY_MD5; #endif + if (tcp_rsk_used_ao(req)) { +#ifdef CONFIG_TCP_AO + struct tcp_ao_key *ao_key = NULL; + u8 maclen = tcp_rsk(req)->maclen; + u8 keyid = tcp_rsk(req)->ao_keyid; + + ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req), + keyid, -1); + /* If there is no matching key - avoid sending anything, + * especially usigned segments. It could try harder and lookup + * for another peer-matching key, but the peer has requested + * ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here. + */ + if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) { + u8 key_maclen = ao_key ? tcp_ao_maclen(ao_key) : 0; + + rcu_read_unlock(); + kfree_skb(skb); + net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n", + keyid, maclen, key_maclen); + return NULL; + } + key.ao_key = ao_key; + key.type = TCP_KEY_AO; +#endif + } else { +#ifdef CONFIG_TCP_MD5SIG + key.md5_key = tcp_rsk(req)->af_specific->req_md5_lookup(sk, + req_to_sk(req)); + if (key.md5_key) + key.type = TCP_KEY_MD5; +#endif + } skb_set_hash(skb, READ_ONCE(tcp_rsk(req)->txhash), PKT_HASH_TYPE_L4); /* bpf program will be interested in the tcp_flags */ TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK; - tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5, - foc, synack_type, - syn_skb) + sizeof(*th); + tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, + &key, foc, synack_type, syn_skb) + + sizeof(*th); skb_push(skb, tcp_header_size); skb_reset_transport_header(skb); @@ -3756,15 +3787,24 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, /* RFC1323: The window in SYN & SYN/ACK segments is never scaled. */ th->window = htons(min(req->rsk_rcv_wnd, 65535U)); - tcp_options_write(th, NULL, NULL, &opts, &key); + tcp_options_write(th, NULL, tcp_rsk(req), &opts, &key); th->doff = (tcp_header_size >> 2); TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS); -#ifdef CONFIG_TCP_MD5SIG /* Okay, we have all we need - do the md5 hash if needed */ - if (md5) + if (tcp_key_is_md5(&key)) { +#ifdef CONFIG_TCP_MD5SIG tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location, - md5, req_to_sk(req), skb); + key.md5_key, req_to_sk(req), skb); +#endif + } else if (tcp_key_is_ao(&key)) { +#ifdef CONFIG_TCP_AO + tcp_rsk(req)->af_specific->ao_synack_hash(opts.hash_location, + key.ao_key, req, skb, + opts.hash_location - (u8 *)th, 0); +#endif + } +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) rcu_read_unlock(); #endif diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c index c9a6fa84f6ce..99753e12c08c 100644 --- a/net/ipv6/tcp_ao.c +++ b/net/ipv6/tcp_ao.c @@ -144,3 +144,25 @@ int tcp_v6_parse_ao(struct sock *sk, int cmd, { return tcp_parse_ao(sk, cmd, AF_INET6, optval, optlen); } + +int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key, + struct request_sock *req, const struct sk_buff *skb, + int hash_offset, u32 sne) +{ + void *hash_buf = NULL; + int err; + + hash_buf = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC); + if (!hash_buf) + return -ENOMEM; + + err = tcp_v6_ao_calc_key_rsk(ao_key, hash_buf, req); + if (err) + goto out; + + err = tcp_ao_hash_skb(AF_INET6, ao_hash, ao_key, req_to_sk(req), skb, + hash_buf, hash_offset, sne); +out: + kfree(hash_buf); + return err; +} diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index e2fee6c464bf..c88b90552c47 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -839,6 +839,7 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = { #ifdef CONFIG_TCP_AO .ao_lookup = tcp_v6_ao_lookup_rsk, .ao_calc_key = tcp_v6_ao_calc_key_rsk, + .ao_synack_hash = tcp_v6_ao_synack_hash, #endif #ifdef CONFIG_SYN_COOKIES .cookie_init_seq = cookie_v6_init_sequence, From patchwork Mon Oct 23 19:22:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157061 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504743vqx; Mon, 23 Oct 2023 12:23:52 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG648KVEMcbD6nbXc9K5m72b5Bj0VNEqr91+E798Ynt0C0fjPtNkLFsI8k7G5EC0K8Cye7L X-Received: by 2002:a17:90b:51d0:b0:27d:3f0c:f087 with SMTP id sf16-20020a17090b51d000b0027d3f0cf087mr9951199pjb.25.1698089031911; Mon, 23 Oct 2023 12:23:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089031; cv=none; d=google.com; s=arc-20160816; b=JFWYM21sr8KBym2vc1oPPhhyMLjkQTNreL3h1QvPU7iBe9n1ttiF7UX3B/RS7ZLdkC QIER24vRX545nO9jpQ5EKg+nlBld60kua7tgisGJnG2Z9H2kCTB5wRnGhRXLJSTbenTo mNIcGCiNt/UvdAdkl2hXeijCddROuA2mo0bVKYGr8Q1tVuSVuJGmskJTOBeQTqXu7LDn iSOMhvrpO4kaOKi8m+A91rqgRJsjg5JF5aVCs4wz787AGzxU5lewfgLRSnYJkpHyFbbd IweWR8z+f2QPwkLaRGqOLm6GcUHQVPkfRdfrPYYWeRmQYMrrzJdr6hI3BG1y7t8LqTaS Z8Tw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=aZCEUJUz8kRpC/FIlrZlhkkwx7ktUe11E9Onr7Z+biI=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=ncyJpcTMeOjCDB2a3nGgdUALA/M5shO5bY2Vm/DlXnKJ20JMDys0GMU9O5B89P9dd0 J8uju5NrBiOUAPdsdn+9ExN158FYfWAtoZ+PKDOSrub5h5zih+PxAxqC9fzFLEVYxDNM OTFuSXFVosrbR9bhsN1YEJEIeNU5Twt2OnhOo04xm3izLIqlfuWb/PtZPwxqrnvthL9a a6ZcmOjVBebF2MSAxJuvQArw2aldwYc8p5ndVyCoekxleAjlygCxk07tnnCAXeRBfYAx ZaGj+tqdWXG0px0FsdV3ZLTGDRufd36eI+tbCoDmtLLJhfFPyhnvgfm9Nm3XRXGzk1Jr AhZA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=AFyqvnyO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id lk7-20020a17090b33c700b0026b4d5844ccsi7042202pjb.27.2023.10.23.12.23.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=AFyqvnyO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id C59F780BC44F; Mon, 23 Oct 2023 12:23:50 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233316AbjJWTXk (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231849AbjJWTXS (ORCPT ); Mon, 23 Oct 2023 15:23:18 -0400 Received: from mail-lj1-x232.google.com (mail-lj1-x232.google.com [IPv6:2a00:1450:4864:20::232]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2CFF2170A for ; Mon, 23 Oct 2023 12:22:54 -0700 (PDT) Received: by mail-lj1-x232.google.com with SMTP id 38308e7fff4ca-2c5028e5b88so52849491fa.3 for ; Mon, 23 Oct 2023 12:22:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088969; x=1698693769; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=aZCEUJUz8kRpC/FIlrZlhkkwx7ktUe11E9Onr7Z+biI=; b=AFyqvnyOrx95aWiO8IiSGZhLtBoOJ+bEIRZJSalU9aScguJXVjrqLQQAG+jjdYQv3G h2GMjWRAy0abLzobHGzTFuFbXKKV40jQEe16BLPi743NT7TNCL64SMK502zKAcNfzs/t GJQB5JeMmKudo7MTamRtcBCIhO5RbHA80M8npDf2qdMGcCOhGCjB/WYq8m8LuZWAqsmq A2iZkFrydcafP/5C5nCZ4ruC8+fLo77ktGegnvBZYD7I6dfMna73qji7K781GuC8NMw1 X0Y8jG+OFBnW5EhQMIATAT+Jf3DBwHYvVAJGzPH3LczjW7009admZqyAZ4qSHYu1Eayl liNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088969; x=1698693769; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=aZCEUJUz8kRpC/FIlrZlhkkwx7ktUe11E9Onr7Z+biI=; b=fAkL1zGTXFLT95hKvgNYqfj7nasuCQOUnUive6lz0gwK8pkH8uiIVHGy0qDGqymCg3 Mxdp8WsgjOLDTgRWqD1sjjrGI9irHNyZK6IzE2M9tqSsqmNZLU4LNI9MWDS7wyyjb5As Qw4CDHvvYLDzxf3/ZEKyVETO1XA2xctEe5Lw1lt5bsI+hAWZKU6rgKBDsrmEXJ2jgbFc AWEBhf9uBg73uDT/P5dlpU+WR4IB110gNegqRmZ6L7321CvWOSKjVmZV0nqnNdRgL7I1 TE0ijnUdmXSouEC+JFlGCfuuFcEAREZPI46v43dheGjRayzOfPSlQTo+xB0jGe8afA+x RQnw== X-Gm-Message-State: AOJu0Yw/mqTeDc4purjcljCL17G4BcmNXbqdeh1uLIKUmY0QuN/R4WTp BpKoF88NM5oFhmIDevTDY1257w== X-Received: by 2002:a05:651c:b20:b0:2c5:2298:d44e with SMTP id b32-20020a05651c0b2000b002c52298d44emr8189374ljr.5.1698088969425; Mon, 23 Oct 2023 12:22:49 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:48 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 12/23] net/tcp: Verify inbound TCP-AO signed segments Date: Mon, 23 Oct 2023 20:22:04 +0100 Message-ID: <20231023192217.426455-13-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:23:50 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575404797842643 X-GMAIL-MSGID: 1780575404797842643 Now there is a common function to verify signature on TCP segments: tcp_inbound_hash(). It has checks for all possible cross-interactions with MD5 signs as well as with unsigned segments. The rules from RFC5925 are: (1) Any TCP segment can have at max only one signature. (2) TCP connections can't switch between using TCP-MD5 and TCP-AO. (3) TCP-AO connections can't stop using AO, as well as unsigned connections can't suddenly start using AO. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/dropreason-core.h | 17 ++++ include/net/tcp.h | 53 ++++++++++++- include/net/tcp_ao.h | 14 ++++ net/ipv4/tcp.c | 39 ++-------- net/ipv4/tcp_ao.c | 142 ++++++++++++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 10 +-- net/ipv6/tcp_ao.c | 9 ++- net/ipv6/tcp_ipv6.c | 11 +-- 8 files changed, 248 insertions(+), 47 deletions(-) diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h index 3af4464a9c5b..7637137ae33e 100644 --- a/include/net/dropreason-core.h +++ b/include/net/dropreason-core.h @@ -24,6 +24,10 @@ FN(TCP_MD5NOTFOUND) \ FN(TCP_MD5UNEXPECTED) \ FN(TCP_MD5FAILURE) \ + FN(TCP_AONOTFOUND) \ + FN(TCP_AOUNEXPECTED) \ + FN(TCP_AOKEYNOTFOUND) \ + FN(TCP_AOFAILURE) \ FN(SOCKET_BACKLOG) \ FN(TCP_FLAGS) \ FN(TCP_ZEROWINDOW) \ @@ -163,6 +167,19 @@ enum skb_drop_reason { * to LINUX_MIB_TCPMD5FAILURE */ SKB_DROP_REASON_TCP_MD5FAILURE, + /** + * @SKB_DROP_REASON_TCP_AONOTFOUND: no TCP-AO hash and one was expected + */ + SKB_DROP_REASON_TCP_AONOTFOUND, + /** + * @SKB_DROP_REASON_TCP_AOUNEXPECTED: TCP-AO hash is present and it + * was not expected. + */ + SKB_DROP_REASON_TCP_AOUNEXPECTED, + /** @SKB_DROP_REASON_TCP_AOKEYNOTFOUND: TCP-AO key is unknown */ + SKB_DROP_REASON_TCP_AOKEYNOTFOUND, + /** @SKB_DROP_REASON_TCP_AOFAILURE: TCP-AO hash is wrong */ + SKB_DROP_REASON_TCP_AOFAILURE, /** * @SKB_DROP_REASON_SOCKET_BACKLOG: failed to add skb to socket backlog ( * see LINUX_MIB_TCPBACKLOGDROP) diff --git a/include/net/tcp.h b/include/net/tcp.h index 9ef9fcf88c5b..a703be0767f6 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1807,7 +1807,7 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk, enum skb_drop_reason tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, const void *saddr, const void *daddr, - int family, int dif, int sdif); + int family, int l3index, const __u8 *hash_location); #define tcp_twsk_md5_key(twsk) ((twsk)->tw_md5_key) @@ -1829,7 +1829,7 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk, static inline enum skb_drop_reason tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, const void *saddr, const void *daddr, - int family, int dif, int sdif) + int family, int l3index, const __u8 *hash_location) { return SKB_NOT_DROPPED_YET; } @@ -2728,4 +2728,53 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr, return false; } +/* Called with rcu_read_lock() */ +static inline enum skb_drop_reason +tcp_inbound_hash(struct sock *sk, const struct request_sock *req, + const struct sk_buff *skb, + const void *saddr, const void *daddr, + int family, int dif, int sdif) +{ + const struct tcphdr *th = tcp_hdr(skb); + const struct tcp_ao_hdr *aoh; + const __u8 *md5_location; + int l3index; + + /* Invalid option or two times meet any of auth options */ + if (tcp_parse_auth_options(th, &md5_location, &aoh)) + return SKB_DROP_REASON_TCP_AUTH_HDR; + + if (req) { + if (tcp_rsk_used_ao(req) != !!aoh) + return SKB_DROP_REASON_TCP_AOFAILURE; + } + + /* sdif set, means packet ingressed via a device + * in an L3 domain and dif is set to the l3mdev + */ + l3index = sdif ? dif : 0; + + /* Fast path: unsigned segments */ + if (likely(!md5_location && !aoh)) { + /* Drop if there's TCP-MD5 or TCP-AO key with any rcvid/sndid + * for the remote peer. On TCP-AO established connection + * the last key is impossible to remove, so there's + * always at least one current_key. + */ + if (tcp_ao_required(sk, saddr, family)) + return SKB_DROP_REASON_TCP_AONOTFOUND; + if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); + return SKB_DROP_REASON_TCP_MD5NOTFOUND; + } + return SKB_NOT_DROPPED_YET; + } + + if (aoh) + return tcp_inbound_ao_hash(sk, skb, family, req, aoh); + + return tcp_inbound_md5_hash(sk, skb, saddr, daddr, family, + l3index, md5_location); +} + #endif /* _TCP_H */ diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 1d69978e349a..1c7c0a5d1877 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -111,6 +111,9 @@ struct tcp6_ao_context { }; struct tcp_sigpool; +#define TCP_AO_ESTABLISHED (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | \ + TCPF_CLOSE | TCPF_CLOSE_WAIT | \ + TCPF_LAST_ACK | TCPF_CLOSING) int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, struct tcp_ao_key *key, struct tcphdr *th, @@ -130,6 +133,10 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, unsigned int len, struct tcp_sigpool *hp); void tcp_ao_destroy_sock(struct sock *sk, bool twsk); void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp); +enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, + const struct sk_buff *skb, unsigned short int family, + const struct request_sock *req, + const struct tcp_ao_hdr *aoh); struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid); @@ -208,6 +215,13 @@ static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, { } +static inline enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, + const struct sk_buff *skb, unsigned short int family, + const struct request_sock *req, const struct tcp_ao_hdr *aoh) +{ + return SKB_NOT_DROPPED_YET; +} + static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid) { diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index b44b9ef99e60..846ddc023ac1 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4373,42 +4373,23 @@ EXPORT_SYMBOL(tcp_md5_hash_key); enum skb_drop_reason tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, const void *saddr, const void *daddr, - int family, int dif, int sdif) + int family, int l3index, const __u8 *hash_location) { - /* - * This gets called for each TCP segment that arrives - * so we want to be efficient. + /* This gets called for each TCP segment that has TCP-MD5 option. * We have 3 drop cases: * o No MD5 hash and one expected. * o MD5 hash and we're not expecting one. * o MD5 hash and its wrong. */ - const __u8 *hash_location = NULL; - struct tcp_md5sig_key *hash_expected; const struct tcphdr *th = tcp_hdr(skb); const struct tcp_sock *tp = tcp_sk(sk); - int genhash, l3index; + struct tcp_md5sig_key *key; u8 newhash[16]; + int genhash; - /* sdif set, means packet ingressed via a device - * in an L3 domain and dif is set to the l3mdev - */ - l3index = sdif ? dif : 0; + key = tcp_md5_do_lookup(sk, l3index, saddr, family); - hash_expected = tcp_md5_do_lookup(sk, l3index, saddr, family); - if (tcp_parse_auth_options(th, &hash_location, NULL)) - return SKB_DROP_REASON_TCP_AUTH_HDR; - - /* We've parsed the options - do we have a hash? */ - if (!hash_expected && !hash_location) - return SKB_NOT_DROPPED_YET; - - if (hash_expected && !hash_location) { - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); - return SKB_DROP_REASON_TCP_MD5NOTFOUND; - } - - if (!hash_expected && hash_location) { + if (!key && hash_location) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); return SKB_DROP_REASON_TCP_MD5UNEXPECTED; } @@ -4418,14 +4399,10 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, * IPv4-mapped case. */ if (family == AF_INET) - genhash = tcp_v4_md5_hash_skb(newhash, - hash_expected, - NULL, skb); + genhash = tcp_v4_md5_hash_skb(newhash, key, NULL, skb); else - genhash = tp->af_specific->calc_md5_hash(newhash, - hash_expected, + genhash = tp->af_specific->calc_md5_hash(newhash, key, NULL, skb); - if (genhash || memcmp(hash_location, newhash, 16) != 0) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); if (family == AF_INET) { diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index de3710758d55..6c5815713b73 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -761,6 +761,148 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, treq->maclen = tcp_ao_maclen(key); } +static enum skb_drop_reason +tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, + unsigned short int family, struct tcp_ao_info *info, + const struct tcp_ao_hdr *aoh, struct tcp_ao_key *key, + u8 *traffic_key, u8 *phash, u32 sne) +{ + u8 maclen = aoh->length - sizeof(struct tcp_ao_hdr); + const struct tcphdr *th = tcp_hdr(skb); + void *hash_buf = NULL; + + if (maclen != tcp_ao_maclen(key)) + return SKB_DROP_REASON_TCP_AOFAILURE; + + hash_buf = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC); + if (!hash_buf) + return SKB_DROP_REASON_NOT_SPECIFIED; + + /* XXX: make it per-AF callback? */ + tcp_ao_hash_skb(family, hash_buf, key, sk, skb, traffic_key, + (phash - (u8 *)th), sne); + if (memcmp(phash, hash_buf, maclen)) { + kfree(hash_buf); + return SKB_DROP_REASON_TCP_AOFAILURE; + } + kfree(hash_buf); + return SKB_NOT_DROPPED_YET; +} + +enum skb_drop_reason +tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, + unsigned short int family, const struct request_sock *req, + const struct tcp_ao_hdr *aoh) +{ + const struct tcphdr *th = tcp_hdr(skb); + u8 *phash = (u8 *)(aoh + 1); /* hash goes just after the header */ + struct tcp_ao_info *info; + enum skb_drop_reason ret; + struct tcp_ao_key *key; + __be32 sisn, disn; + u8 *traffic_key; + u32 sne = 0; + + info = rcu_dereference(tcp_sk(sk)->ao_info); + if (!info) + return SKB_DROP_REASON_TCP_AOUNEXPECTED; + + if (unlikely(th->syn)) { + sisn = th->seq; + disn = 0; + } + + /* Fast-path */ + if (likely((1 << sk->sk_state) & TCP_AO_ESTABLISHED)) { + enum skb_drop_reason err; + struct tcp_ao_key *current_key; + + /* Check if this socket's rnext_key matches the keyid in the + * packet. If not we lookup the key based on the keyid + * matching the rcvid in the mkt. + */ + key = READ_ONCE(info->rnext_key); + if (key->rcvid != aoh->keyid) { + key = tcp_ao_established_key(info, -1, aoh->keyid); + if (!key) + goto key_not_found; + } + + /* Delayed retransmitted SYN */ + if (unlikely(th->syn && !th->ack)) + goto verify_hash; + + sne = 0; + /* Established socket, traffic key are cached */ + traffic_key = rcv_other_key(key); + err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key, + traffic_key, phash, sne); + if (err) + return err; + current_key = READ_ONCE(info->current_key); + /* Key rotation: the peer asks us to use new key (RNext) */ + if (unlikely(aoh->rnext_keyid != current_key->sndid)) { + /* If the key is not found we do nothing. */ + key = tcp_ao_established_key(info, aoh->rnext_keyid, -1); + if (key) + /* pairs with tcp_ao_del_cmd */ + WRITE_ONCE(info->current_key, key); + } + return SKB_NOT_DROPPED_YET; + } + + /* Lookup key based on peer address and keyid. + * current_key and rnext_key must not be used on tcp listen + * sockets as otherwise: + * - request sockets would race on those key pointers + * - tcp_ao_del_cmd() allows async key removal + */ + key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid); + if (!key) + goto key_not_found; + + if (th->syn && !th->ack) + goto verify_hash; + + if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) { + /* Make the initial syn the likely case here */ + if (unlikely(req)) { + sne = 0; + sisn = htonl(tcp_rsk(req)->rcv_isn); + disn = htonl(tcp_rsk(req)->snt_isn); + } else if (unlikely(th->ack && !th->syn)) { + /* Possible syncookie packet */ + sisn = htonl(ntohl(th->seq) - 1); + disn = htonl(ntohl(th->ack_seq) - 1); + sne = 0; + } else if (unlikely(!th->syn)) { + /* no way to figure out initial sisn/disn - drop */ + return SKB_DROP_REASON_TCP_FLAGS; + } + } else if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) { + disn = info->lisn; + if (th->syn || th->rst) + sisn = th->seq; + else + sisn = info->risn; + } else { + WARN_ONCE(1, "TCP-AO: Unexpected sk_state %d", sk->sk_state); + return SKB_DROP_REASON_TCP_AOFAILURE; + } +verify_hash: + traffic_key = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC); + if (!traffic_key) + return SKB_DROP_REASON_NOT_SPECIFIED; + tcp_ao_calc_key_skb(key, traffic_key, skb, sisn, disn, family); + ret = tcp_ao_verify_hash(sk, skb, family, info, aoh, key, + traffic_key, phash, sne); + kfree(traffic_key); + return ret; + +key_not_found: + return SKB_DROP_REASON_TCP_AOKEYNOTFOUND; +} + static int tcp_ao_cache_traffic_keys(const struct sock *sk, struct tcp_ao_info *ao, struct tcp_ao_key *ao_key) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index bdf0224ae827..f39ccefa78dc 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2204,9 +2204,9 @@ int tcp_v4_rcv(struct sk_buff *skb) if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb)) drop_reason = SKB_DROP_REASON_XFRM_POLICY; else - drop_reason = tcp_inbound_md5_hash(sk, skb, - &iph->saddr, &iph->daddr, - AF_INET, dif, sdif); + drop_reason = tcp_inbound_hash(sk, req, skb, + &iph->saddr, &iph->daddr, + AF_INET, dif, sdif); if (unlikely(drop_reason)) { sk_drops_add(sk, skb); reqsk_put(req); @@ -2283,8 +2283,8 @@ int tcp_v4_rcv(struct sk_buff *skb) goto discard_and_relse; } - drop_reason = tcp_inbound_md5_hash(sk, skb, &iph->saddr, - &iph->daddr, AF_INET, dif, sdif); + drop_reason = tcp_inbound_hash(sk, NULL, skb, &iph->saddr, &iph->daddr, + AF_INET, dif, sdif); if (drop_reason) goto discard_and_relse; diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c index 99753e12c08c..8b04611c9078 100644 --- a/net/ipv6/tcp_ao.c +++ b/net/ipv6/tcp_ao.c @@ -53,11 +53,12 @@ int tcp_v6_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key, const struct sk_buff *skb, __be32 sisn, __be32 disn) { - const struct ipv6hdr *iph = ipv6_hdr(skb); - const struct tcphdr *th = tcp_hdr(skb); + const struct ipv6hdr *iph = ipv6_hdr(skb); + const struct tcphdr *th = tcp_hdr(skb); - return tcp_v6_ao_calc_key(mkt, key, &iph->saddr, &iph->daddr, - th->source, th->dest, sisn, disn); + return tcp_v6_ao_calc_key(mkt, key, &iph->saddr, + &iph->daddr, th->source, + th->dest, sisn, disn); } int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index c88b90552c47..d740928c043f 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1785,9 +1785,9 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) drop_reason = SKB_DROP_REASON_XFRM_POLICY; else - drop_reason = tcp_inbound_md5_hash(sk, skb, - &hdr->saddr, &hdr->daddr, - AF_INET6, dif, sdif); + drop_reason = tcp_inbound_hash(sk, req, skb, + &hdr->saddr, &hdr->daddr, + AF_INET6, dif, sdif); if (drop_reason) { sk_drops_add(sk, skb); reqsk_put(req); @@ -1861,8 +1861,8 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) goto discard_and_relse; } - drop_reason = tcp_inbound_md5_hash(sk, skb, &hdr->saddr, &hdr->daddr, - AF_INET6, dif, sdif); + drop_reason = tcp_inbound_hash(sk, NULL, skb, &hdr->saddr, &hdr->daddr, + AF_INET6, dif, sdif); if (drop_reason) goto discard_and_relse; @@ -2090,6 +2090,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = { .ao_lookup = tcp_v6_ao_lookup, .calc_ao_hash = tcp_v4_ao_hash_skb, .ao_parse = tcp_v6_parse_ao, + .ao_calc_key_sk = tcp_v4_ao_calc_key_sk, #endif }; #endif From patchwork Mon Oct 23 19:22:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157067 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1505858vqx; Mon, 23 Oct 2023 12:26:17 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFtcf3AbiiRfGORXV8vNLK+w9Ic9YjwMFeeW+zm1xHfD8VY8DG9QKviGhZR5e39BaqH1V/J X-Received: by 2002:a17:90b:2311:b0:27e:3ae3:eae0 with SMTP id mt17-20020a17090b231100b0027e3ae3eae0mr3656409pjb.16.1698089177116; Mon, 23 Oct 2023 12:26:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089177; cv=none; d=google.com; s=arc-20160816; b=OKX6CO753TM4p5yF4eHsmpxFl5hL6ok2wTijRHVogL+x+iFeA/qRfG2oCkFdNqbXpr TRCQ7VwSGTL6YrocEbmhCalHEYsGO+H6qJT36qUL9aEcgDX6EEyLFe14B4lgHdW3ifRZ rEubXRM3oLRJYA9lmAA/nOCYg15uEua7ZY2U4w8IyBzc9qWex0tJ5LBim7jYUp0lw+m4 +zJ7KUSGAHT3RZOdDBqqZoMwDEdx/+kjdHu2jagGwPr0aOmrz/+MgQzvWSCP5XrE5C2A 9i2XQdQ7luzi7fzny6lk1b+KOQX6Ojtu/PEd47pAzDo/NUYfIH9NK0guvdvn6Lt1vOV6 br8w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=qIOwXmBuhXHPO4F9TscWzsYp1LaZcxpuDcf3MsSKRzo=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=VN3HzpyLGMyk7b/XXiHzb0SeGVSaXTja6rb/0gk/lC35ZnlT2QuQWYP7jidoAo84lH EM02CbFnpH1VXkHWf166NRMAlpwP3HBENYrhauqHOZjXuzltYyJIo2uz+WHP8ocHZB6j n4Z5RfBrxpKSks0e9+llh09uCKGEahtW7Zd3lS6C/sC2GjcEcvYmMfmSncI4gv/urn3j 42NxgnPjF++69N1BbC+spDbxl44VS4BDpNbwZE+hiueXp/L+jEMIg4YA0dTTvyP0XVan 6zSbx1DrqHD3TVs0Smq9ekYh3Ztr0zfi3ExG6vN0b5igu+uzEJq/l1EhHBu+ZNKeglBl TVEw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=PAyS5OKd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from groat.vger.email (groat.vger.email. [2620:137:e000::3:5]) by mx.google.com with ESMTPS id hk8-20020a17090b224800b00275cffed966si9599011pjb.57.2023.10.23.12.26.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:26:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) client-ip=2620:137:e000::3:5; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=PAyS5OKd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id F260D8079C82; Mon, 23 Oct 2023 12:25:42 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229628AbjJWTXo (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38492 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233303AbjJWTXT (ORCPT ); Mon, 23 Oct 2023 15:23:19 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 588FD10C8 for ; Mon, 23 Oct 2023 12:22:54 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id 5b1f17b1804b1-40906fc54fdso6696725e9.0 for ; Mon, 23 Oct 2023 12:22:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088971; x=1698693771; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=qIOwXmBuhXHPO4F9TscWzsYp1LaZcxpuDcf3MsSKRzo=; b=PAyS5OKdz6LK6ijQw5z1Zc0VRHqLONmj0n1Hp80RnYTu/Eq/A+bmdvDZcBmiCLd8WS QcfjsIAsQS35du8kY6gZb4STS4mrPkXyWXbNnS0uRGeo2JVMEn1biXuMkIYGYFd1Pk4r SDpRF+W7iJLpCYvOhMLh1ikSce8P2TFjol2HkemaBvPE7Sod/jWPdOOVd9T+rluXgQMr Hg6th6aAjYr5BmFwFf86h1Be8tAqSh6OYYxTvReXM6FWldPuCIufjqSv3sC+CcxGWZFd WuWkQfdNqEXHOAEd8nC8CKBez1ickoQ0thBHMwYG3bizvZUjwvZ9wGg3+12DjhlXA/K4 VcBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088971; x=1698693771; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qIOwXmBuhXHPO4F9TscWzsYp1LaZcxpuDcf3MsSKRzo=; b=Yf4mR+C7Hm1yDRLzkeQFaUfHvE68OSomDVx7MsmGYBOZuqErvGqshcKgWyRq6JmxGM ufnBxWoatf3i97e0JL+FlFod1vaoywxnocYH6ocBdxGtAPJr52K9QtUydTN03+h03Ni3 lM1lt6AuRElJOSrbxxh15DEoFAWi4pd/uZWZWHBGayRynZSk9ATofhF+cScc/nX0DvOf 7R78RWPlORpH9jO+o3CJaa2aXJRFXVpeRjxALzdFms4z1qiewpGibC71rUGfswjg8XwV Xs++yR+OUcCvWZKWL5qQC55IQFruxlQrsQWzPCUJhdwL4fVGulBhIkxSBVJ8HzKYzoMp Yo8g== X-Gm-Message-State: AOJu0YznwzU8lEFBhlp7JU50M6cX30FEkNi5dAr644X0HwHrrJjqAa+C deEc+g41FYTWgbGVj1ApDsFX8g== X-Received: by 2002:a05:600c:4fcb:b0:3fe:2b8c:9f0b with SMTP id o11-20020a05600c4fcb00b003fe2b8c9f0bmr7120064wmq.23.1698088971316; Mon, 23 Oct 2023 12:22:51 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:50 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 13/23] net/tcp: Add TCP-AO segments counters Date: Mon, 23 Oct 2023 20:22:05 +0100 Message-ID: <20231023192217.426455-14-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:25:43 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575557231310236 X-GMAIL-MSGID: 1780575557231310236 Introduce segment counters that are useful for troubleshooting/debugging as well as for writing tests. Now there are global snmp counters as well as per-socket and per-key. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/dropreason-core.h | 15 +++++++++++---- include/net/tcp.h | 15 +++++++++++---- include/net/tcp_ao.h | 10 ++++++++++ include/uapi/linux/snmp.h | 4 ++++ include/uapi/linux/tcp.h | 8 +++++++- net/ipv4/proc.c | 4 ++++ net/ipv4/tcp_ao.c | 30 +++++++++++++++++++++++++++--- net/ipv4/tcp_ipv4.c | 2 +- net/ipv6/tcp_ipv6.c | 4 ++-- 9 files changed, 77 insertions(+), 15 deletions(-) diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h index 7637137ae33e..3c70ad53a49c 100644 --- a/include/net/dropreason-core.h +++ b/include/net/dropreason-core.h @@ -168,17 +168,24 @@ enum skb_drop_reason { */ SKB_DROP_REASON_TCP_MD5FAILURE, /** - * @SKB_DROP_REASON_TCP_AONOTFOUND: no TCP-AO hash and one was expected + * @SKB_DROP_REASON_TCP_AONOTFOUND: no TCP-AO hash and one was expected, + * corresponding to LINUX_MIB_TCPAOREQUIRED */ SKB_DROP_REASON_TCP_AONOTFOUND, /** * @SKB_DROP_REASON_TCP_AOUNEXPECTED: TCP-AO hash is present and it - * was not expected. + * was not expected, corresponding to LINUX_MIB_TCPAOKEYNOTFOUND */ SKB_DROP_REASON_TCP_AOUNEXPECTED, - /** @SKB_DROP_REASON_TCP_AOKEYNOTFOUND: TCP-AO key is unknown */ + /** + * @SKB_DROP_REASON_TCP_AOKEYNOTFOUND: TCP-AO key is unknown, + * corresponding to LINUX_MIB_TCPAOKEYNOTFOUND + */ SKB_DROP_REASON_TCP_AOKEYNOTFOUND, - /** @SKB_DROP_REASON_TCP_AOFAILURE: TCP-AO hash is wrong */ + /** + * @SKB_DROP_REASON_TCP_AOFAILURE: TCP-AO hash is wrong, + * corresponding to LINUX_MIB_TCPAOBAD + */ SKB_DROP_REASON_TCP_AOFAILURE, /** * @SKB_DROP_REASON_SOCKET_BACKLOG: failed to add skb to socket backlog ( diff --git a/include/net/tcp.h b/include/net/tcp.h index a703be0767f6..d29c8a867f0e 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2710,7 +2710,7 @@ static inline int tcp_parse_auth_options(const struct tcphdr *th, } static inline bool tcp_ao_required(struct sock *sk, const void *saddr, - int family) + int family, bool stat_inc) { #ifdef CONFIG_TCP_AO struct tcp_ao_info *ao_info; @@ -2722,8 +2722,13 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr, return false; ao_key = tcp_ao_do_lookup(sk, saddr, family, -1, -1); - if (ao_info->ao_required || ao_key) + if (ao_info->ao_required || ao_key) { + if (stat_inc) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOREQUIRED); + atomic64_inc(&ao_info->counters.ao_required); + } return true; + } #endif return false; } @@ -2745,8 +2750,10 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req, return SKB_DROP_REASON_TCP_AUTH_HDR; if (req) { - if (tcp_rsk_used_ao(req) != !!aoh) + if (tcp_rsk_used_ao(req) != !!aoh) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD); return SKB_DROP_REASON_TCP_AOFAILURE; + } } /* sdif set, means packet ingressed via a device @@ -2761,7 +2768,7 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req, * the last key is impossible to remove, so there's * always at least one current_key. */ - if (tcp_ao_required(sk, saddr, family)) + if (tcp_ao_required(sk, saddr, family, true)) return SKB_DROP_REASON_TCP_AONOTFOUND; if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 1c7c0a5d1877..cfb55bd9411b 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -19,6 +19,13 @@ struct tcp_ao_hdr { u8 rnext_keyid; }; +struct tcp_ao_counters { + atomic64_t pkt_good; + atomic64_t pkt_bad; + atomic64_t key_not_found; + atomic64_t ao_required; +}; + struct tcp_ao_key { struct hlist_node node; union tcp_ao_addr addr; @@ -33,6 +40,8 @@ struct tcp_ao_key { u8 rcvid; u8 maclen; struct rcu_head rcu; + atomic64_t pkt_good; + atomic64_t pkt_bad; u8 traffic_keys[]; }; @@ -81,6 +90,7 @@ struct tcp_ao_info { */ struct tcp_ao_key *current_key; struct tcp_ao_key *rnext_key; + struct tcp_ao_counters counters; u32 ao_required :1, __unused :31; __be32 lisn; diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h index b2b72886cb6d..3d5ea841bffe 100644 --- a/include/uapi/linux/snmp.h +++ b/include/uapi/linux/snmp.h @@ -297,6 +297,10 @@ enum LINUX_MIB_TCPMIGRATEREQSUCCESS, /* TCPMigrateReqSuccess */ LINUX_MIB_TCPMIGRATEREQFAILURE, /* TCPMigrateReqFailure */ LINUX_MIB_TCPPLBREHASH, /* TCPPLBRehash */ + LINUX_MIB_TCPAOREQUIRED, /* TCPAORequired */ + LINUX_MIB_TCPAOBAD, /* TCPAOBad */ + LINUX_MIB_TCPAOKEYNOTFOUND, /* TCPAOKeyNotFound */ + LINUX_MIB_TCPAOGOOD, /* TCPAOGood */ __LINUX_MIB_MAX }; diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index fa49f03e62fe..9c48964849d1 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -404,9 +404,15 @@ struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */ __u32 set_current :1, /* corresponding ::current_key */ set_rnext :1, /* corresponding ::rnext */ ao_required :1, /* don't accept non-AO connects */ - reserved :29; /* must be 0 */ + set_counters :1, /* set/clear ::pkt_* counters */ + reserved :28; /* must be 0 */ + __u16 reserved2; /* padding, must be 0 */ __u8 current_key; /* KeyID to set as Current_key */ __u8 rnext; /* KeyID to set as Rnext_key */ + __u64 pkt_good; /* verified segments */ + __u64 pkt_bad; /* failed verification */ + __u64 pkt_key_not_found; /* could not find a key to verify */ + __u64 pkt_ao_required; /* segments missing TCP-AO sign */ } __attribute__((aligned(8))); /* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */ diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c index a85b0aba3646..f5b37ebc18c0 100644 --- a/net/ipv4/proc.c +++ b/net/ipv4/proc.c @@ -299,6 +299,10 @@ static const struct snmp_mib snmp4_net_list[] = { SNMP_MIB_ITEM("TCPMigrateReqSuccess", LINUX_MIB_TCPMIGRATEREQSUCCESS), SNMP_MIB_ITEM("TCPMigrateReqFailure", LINUX_MIB_TCPMIGRATEREQFAILURE), SNMP_MIB_ITEM("TCPPLBRehash", LINUX_MIB_TCPPLBREHASH), + SNMP_MIB_ITEM("TCPAORequired", LINUX_MIB_TCPAOREQUIRED), + SNMP_MIB_ITEM("TCPAOBad", LINUX_MIB_TCPAOBAD), + SNMP_MIB_ITEM("TCPAOKeyNotFound", LINUX_MIB_TCPAOKEYNOTFOUND), + SNMP_MIB_ITEM("TCPAOGood", LINUX_MIB_TCPAOGOOD), SNMP_MIB_SENTINEL }; diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 6c5815713b73..1097e99a9ad6 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -182,6 +182,8 @@ static struct tcp_ao_key *tcp_ao_copy_key(struct sock *sk, *new_key = *key; INIT_HLIST_NODE(&new_key->node); tcp_sigpool_get(new_key->tcp_sigpool_id); + atomic64_set(&new_key->pkt_good, 0); + atomic64_set(&new_key->pkt_bad, 0); return new_key; } @@ -771,8 +773,12 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, const struct tcphdr *th = tcp_hdr(skb); void *hash_buf = NULL; - if (maclen != tcp_ao_maclen(key)) + if (maclen != tcp_ao_maclen(key)) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD); + atomic64_inc(&info->counters.pkt_bad); + atomic64_inc(&key->pkt_bad); return SKB_DROP_REASON_TCP_AOFAILURE; + } hash_buf = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC); if (!hash_buf) @@ -782,9 +788,15 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, tcp_ao_hash_skb(family, hash_buf, key, sk, skb, traffic_key, (phash - (u8 *)th), sne); if (memcmp(phash, hash_buf, maclen)) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD); + atomic64_inc(&info->counters.pkt_bad); + atomic64_inc(&key->pkt_bad); kfree(hash_buf); return SKB_DROP_REASON_TCP_AOFAILURE; } + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOGOOD); + atomic64_inc(&info->counters.pkt_good); + atomic64_inc(&key->pkt_good); kfree(hash_buf); return SKB_NOT_DROPPED_YET; } @@ -804,8 +816,10 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, u32 sne = 0; info = rcu_dereference(tcp_sk(sk)->ao_info); - if (!info) + if (!info) { + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND); return SKB_DROP_REASON_TCP_AOUNEXPECTED; + } if (unlikely(th->syn)) { sisn = th->seq; @@ -900,6 +914,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, return ret; key_not_found: + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND); + atomic64_inc(&info->counters.key_not_found); return SKB_DROP_REASON_TCP_AOKEYNOTFOUND; } @@ -1483,6 +1499,8 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, key->keyflags = cmd.keyflags; key->sndid = cmd.sndid; key->rcvid = cmd.rcvid; + atomic64_set(&key->pkt_good, 0); + atomic64_set(&key->pkt_bad, 0); ret = tcp_ao_parse_crypto(&cmd, key); if (ret < 0) @@ -1699,7 +1717,7 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, return -EINVAL; } - if (cmd.reserved != 0) + if (cmd.reserved != 0 || cmd.reserved2 != 0) return -EINVAL; ao_info = setsockopt_ao_info(sk); @@ -1734,6 +1752,12 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, goto out; } } + if (cmd.set_counters) { + atomic64_set(&ao_info->counters.pkt_good, cmd.pkt_good); + atomic64_set(&ao_info->counters.pkt_bad, cmd.pkt_bad); + atomic64_set(&ao_info->counters.key_not_found, cmd.pkt_key_not_found); + atomic64_set(&ao_info->counters.ao_required, cmd.pkt_ao_required); + } ao_info->ao_required = cmd.ao_required; if (new_current) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index f39ccefa78dc..ece95d5138e1 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1531,7 +1531,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, /* Don't allow keys for peers that have a matching TCP-AO key. * See the comment in tcp_ao_add_cmd() */ - if (tcp_ao_required(sk, addr, AF_INET)) + if (tcp_ao_required(sk, addr, AF_INET, false)) return -EKEYREJECTED; return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags, diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index d740928c043f..9c668bbb4853 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -661,7 +661,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, /* Don't allow keys for peers that have a matching TCP-AO key. * See the comment in tcp_ao_add_cmd() */ - if (tcp_ao_required(sk, addr, AF_INET)) + if (tcp_ao_required(sk, addr, AF_INET, false)) return -EKEYREJECTED; return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags, @@ -673,7 +673,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, /* Don't allow keys for peers that have a matching TCP-AO key. * See the comment in tcp_ao_add_cmd() */ - if (tcp_ao_required(sk, addr, AF_INET6)) + if (tcp_ao_required(sk, addr, AF_INET6, false)) return -EKEYREJECTED; return tcp_md5_do_add(sk, addr, AF_INET6, prefixlen, l3index, flags, From patchwork Mon Oct 23 19:22:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157071 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1506021vqx; Mon, 23 Oct 2023 12:26:29 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHgZY/SI8bOKfUgMOgoAGjJnh5wKoJY/wI3HVjXKcSn5vSrRO5t3LYvUWE3u0oGGQ8jSWUP X-Received: by 2002:a05:6a00:1813:b0:6be:62e:d5a8 with SMTP id y19-20020a056a00181300b006be062ed5a8mr9217052pfa.0.1698089189345; Mon, 23 Oct 2023 12:26:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089189; cv=none; d=google.com; s=arc-20160816; b=EalZHh0C6oe2edXr5l7f43wwFhw6yJofwC2pMckcZa0bbfwMXOvB/jsrlOjxRpPPou GNCIhobteekNej2G8FQhK6dIk9h/LghDbwtOFPmXTq+doHUULTJpne+M9Qu7MIrSiTRA GKCv59HUWNzqKmM+EIEpzmmPjlv65/tgfAuSVIk++pROSv3lYIlOMnZBtWSRbfqVtaRi LZKMRCJGNqCfJpWpv5ldRkiayeb0WzGqNOeCp3dB/WfgdqawJSI/oIcSWuKisqtPOJ6H nISRgccMNbxAmtO5HbKCUAirBoasWp9QDklIeODoG1aK895efwoCaUD/CZxS4BbCQlA5 nDoA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=OQK8HpMXfmvV96FH1MbnWz7hCIl1lQYmXecUy/I3/xU=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=v5o3A1N/OPJd3iaJXL87yYsKyVE/Zd+tNlEGrX0AbvARUjPVq+6fgJ7/flNF6wSuNK ns40+zowbdnLsVkOMbTRS3NcKrR+xqXiP9mVrafGXNzGKJ2HbyEOFdIeolKQ3Mk6g0c2 BlkzGQMQYKmh7WD9nKKRB88HWPStX6FdwAER9bE8Y70eEFpA6VgDAcynEmUC8Y7xJtgM vqRRr0BG6QeESeWzv3XgLszFk8pGhAXcMqKubvi/O0i5rRVUkZfHvgKHF76gQsxJlhkd HRMJv8cJgfH6JJMyOT8DdlCLPk2MiQFR8HfeenbneOW4WAZmmsA4MKAM0ekw3YyfabK8 YN0Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=SOyrRGoq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from lipwig.vger.email (lipwig.vger.email. [23.128.96.33]) by mx.google.com with ESMTPS id fj1-20020a056a003a0100b006b1fc88d095si6824497pfb.71.2023.10.23.12.26.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:26:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) client-ip=23.128.96.33; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=SOyrRGoq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by lipwig.vger.email (Postfix) with ESMTP id 82CCD807C866; Mon, 23 Oct 2023 12:25:25 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at lipwig.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232503AbjJWTXr (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38588 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233395AbjJWTXU (ORCPT ); Mon, 23 Oct 2023 15:23:20 -0400 Received: from mail-lj1-x22c.google.com (mail-lj1-x22c.google.com [IPv6:2a00:1450:4864:20::22c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A5C010E7 for ; Mon, 23 Oct 2023 12:22:57 -0700 (PDT) Received: by mail-lj1-x22c.google.com with SMTP id 38308e7fff4ca-2c518a1d83fso57918261fa.3 for ; Mon, 23 Oct 2023 12:22:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088973; x=1698693773; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=OQK8HpMXfmvV96FH1MbnWz7hCIl1lQYmXecUy/I3/xU=; b=SOyrRGoqi+MZF5++3oIEIhEZpz+9rJuZ/2CpIV14ssYjGndWxDXsLTbpRFTYAp0SUg LgN8GxbAi+gCN6yiUUNDshTSp142pNAdqfHWdsL/iJLzk97UgluLJcto5TbwFTICp5Kv Basg1IYD7QK+Eqx8BNbqO98mtcgkc6mFrhJZmnhhCtr5AEGR7g9uGyD90ffERzI4S/9G 1wCoGSyfMmKfrkNWfolf9RjT2YeyeuOVVeHDB8EExdOEQZ4myb8lkCU0lHZwJp2wZdSV j0mjfbCHpeo6SCIDwSC+8Fb4BRNhmIOGduxWDMQrugB3N3hvuC9QWeI+dHhqGrF9luFA L7Zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088973; x=1698693773; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=OQK8HpMXfmvV96FH1MbnWz7hCIl1lQYmXecUy/I3/xU=; b=bF+fO7aYIx/OUUEhw7hN12oTOYXXFCjaA5ZraO+L55WOtzLNUP2bvj2UBshoIyEVaF 9Nc+o9raaIUGBtjPjQ1kbSqJJM+89skoc6ek4to+Zihex95KvgVm/kVZ2VgSufDIKQUP +MI59qtmiwhlj+2dd8pWfX8ZL6kGtcVrChc33d7LcJAwHjxFEzu9YhH9k2ANohflgtjj 4S/3NrvHPxT65dp2APcqNzjjL12lcqWezSaLi/XFrvCqN6nTfVxfbR+YtaChmnkZXi7D uKjcp7ZXge5oOozl27+eF+/DU5B/jRQsDGt131uQ+ZtIR0+7612X4UYEnQjplQW0SNz7 5J1Q== X-Gm-Message-State: AOJu0YwZY8mN+pTTrxfngYCjiNRBoL6vT2fyw865VNXVqW3tg5jQ4dS6 xmEqERq5ZC5Qo3PhYH+4BE4LxQ== X-Received: by 2002:a05:651c:1a28:b0:2c5:1bd9:f946 with SMTP id by40-20020a05651c1a2800b002c51bd9f946mr8706892ljb.9.1698088973323; Mon, 23 Oct 2023 12:22:53 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:52 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 14/23] net/tcp: Add TCP-AO SNE support Date: Mon, 23 Oct 2023 20:22:06 +0100 Message-ID: <20231023192217.426455-15-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:25:25 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575569408315325 X-GMAIL-MSGID: 1780575569408315325 Add Sequence Number Extension (SNE) for TCP-AO. This is needed to protect long-living TCP-AO connections from replaying attacks after sequence number roll-over, see RFC5925 (6.2). Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp_ao.h | 22 ++++++++++++++++++- net/ipv4/tcp_ao.c | 46 ++++++++++++++++++++++++++++++++-------- net/ipv4/tcp_input.c | 28 ++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 3 ++- net/ipv4/tcp_minisocks.c | 15 ++++++++++++- net/ipv6/tcp_ipv6.c | 3 ++- 6 files changed, 104 insertions(+), 13 deletions(-) diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index cfb55bd9411b..0c3516d1b968 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -95,6 +95,25 @@ struct tcp_ao_info { __unused :31; __be32 lisn; __be32 risn; + /* Sequence Number Extension (SNE) are upper 4 bytes for SEQ, + * that protect TCP-AO connection from replayed old TCP segments. + * See RFC5925 (6.2). + * In order to get correct SNE, there's a helper tcp_ao_compute_sne(). + * It needs SEQ basis to understand whereabouts are lower SEQ numbers. + * According to that basis vector, it can provide incremented SNE + * when SEQ rolls over or provide decremented SNE when there's + * a retransmitted segment from before-rolling over. + * - for request sockets such basis is rcv_isn/snt_isn, which seems + * good enough as it's unexpected to receive 4 Gbytes on reqsk. + * - for full sockets the basis is rcv_nxt/snd_una. snd_una is + * taken instead of snd_nxt as currently it's easier to track + * in tcp_snd_una_update(), rather than updating SNE in all + * WRITE_ONCE(tp->snd_nxt, ...) + * - for time-wait sockets the basis is tw_rcv_nxt/tw_snd_nxt. + * tw_snd_nxt is not expected to change, while tw_rcv_nxt may. + */ + u32 snd_sne; + u32 rcv_sne; refcount_t refcnt; /* Protects twsk destruction */ struct rcu_head rcu; }; @@ -147,6 +166,7 @@ enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, const struct request_sock *req, const struct tcp_ao_hdr *aoh); +u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq); struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid); @@ -156,7 +176,7 @@ int tcp_ao_hash_hdr(unsigned short family, char *ao_hash, const union tcp_ao_addr *saddr, const struct tcphdr *th, u32 sne); int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, - const struct tcp_ao_hdr *aoh, int l3index, + const struct tcp_ao_hdr *aoh, int l3index, u32 seq, struct tcp_ao_key **key, char **traffic_key, bool *allocated_traffic_key, u8 *keyid, u32 *sne); diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 1097e99a9ad6..7e14bcd4dfd4 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -401,6 +401,21 @@ static int tcp_ao_hash_pseudoheader(unsigned short int family, return -EAFNOSUPPORT; } +u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq) +{ + u32 sne = next_sne; + + if (before(seq, next_seq)) { + if (seq > next_seq) + sne--; + } else { + if (seq < next_seq) + sne++; + } + + return sne; +} + /* tcp_ao_hash_sne(struct tcp_sigpool *hp) * @hp - used for hashing * @sne - sne value @@ -611,7 +626,7 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, } int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, - const struct tcp_ao_hdr *aoh, int l3index, + const struct tcp_ao_hdr *aoh, int l3index, u32 seq, struct tcp_ao_key **key, char **traffic_key, bool *allocated_traffic_key, u8 *keyid, u32 *sne) { @@ -639,7 +654,7 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, sisn = htonl(tcp_rsk(req)->rcv_isn); disn = htonl(tcp_rsk(req)->snt_isn); - *sne = 0; + *sne = tcp_ao_compute_sne(0, tcp_rsk(req)->snt_isn, seq); } else { sisn = th->seq; disn = 0; @@ -670,11 +685,15 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, *keyid = (*key)->rcvid; } else { struct tcp_ao_key *rnext_key; + u32 snd_basis; - if (sk->sk_state == TCP_TIME_WAIT) + if (sk->sk_state == TCP_TIME_WAIT) { ao_info = rcu_dereference(tcp_twsk(sk)->ao_info); - else + snd_basis = tcp_twsk(sk)->tw_snd_nxt; + } else { ao_info = rcu_dereference(tcp_sk(sk)->ao_info); + snd_basis = tcp_sk(sk)->snd_una; + } if (!ao_info) return -ENOENT; @@ -684,7 +703,8 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, *traffic_key = snd_other_key(*key); rnext_key = READ_ONCE(ao_info->rnext_key); *keyid = rnext_key->rcvid; - *sne = 0; + *sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne), + snd_basis, seq); } return 0; } @@ -698,6 +718,7 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, struct tcp_ao_info *ao; void *tkey_buf = NULL; u8 *traffic_key; + u32 sne; ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, lockdep_sock_is_held(sk)); @@ -717,8 +738,10 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, tp->af_specific->ao_calc_key_sk(key, traffic_key, sk, ao->lisn, disn, true); } + sne = tcp_ao_compute_sne(READ_ONCE(ao->snd_sne), READ_ONCE(tp->snd_una), + ntohl(th->seq)); tp->af_specific->calc_ao_hash(hash_location, key, sk, skb, traffic_key, - hash_location - (u8 *)th, 0); + hash_location - (u8 *)th, sne); kfree(tkey_buf); return 0; } @@ -846,7 +869,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, if (unlikely(th->syn && !th->ack)) goto verify_hash; - sne = 0; + sne = tcp_ao_compute_sne(info->rcv_sne, tcp_sk(sk)->rcv_nxt, + ntohl(th->seq)); /* Established socket, traffic key are cached */ traffic_key = rcv_other_key(key); err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key, @@ -881,14 +905,16 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) { /* Make the initial syn the likely case here */ if (unlikely(req)) { - sne = 0; + sne = tcp_ao_compute_sne(0, tcp_rsk(req)->rcv_isn, + ntohl(th->seq)); sisn = htonl(tcp_rsk(req)->rcv_isn); disn = htonl(tcp_rsk(req)->snt_isn); } else if (unlikely(th->ack && !th->syn)) { /* Possible syncookie packet */ sisn = htonl(ntohl(th->seq) - 1); disn = htonl(ntohl(th->ack_seq) - 1); - sne = 0; + sne = tcp_ao_compute_sne(0, ntohl(sisn), + ntohl(th->seq)); } else if (unlikely(!th->syn)) { /* no way to figure out initial sisn/disn - drop */ return SKB_DROP_REASON_TCP_FLAGS; @@ -986,6 +1012,7 @@ void tcp_ao_connect_init(struct sock *sk) tp->tcp_header_len += tcp_ao_len(key); ao_info->lisn = htonl(tp->write_seq); + ao_info->snd_sne = 0; } else { /* Can't happen: tcp_connect() verifies that there's * at least one tcp-ao key that matches the remote peer. @@ -1021,6 +1048,7 @@ void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb) return; WRITE_ONCE(ao->risn, tcp_hdr(skb)->seq); + ao->rcv_sne = 0; hlist_for_each_entry_rcu(key, &ao->head, node) tcp_ao_cache_traffic_keys(sk, ao, key); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 475af9767615..8132577bdfba 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3575,9 +3575,18 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp, static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack) { u32 delta = ack - tp->snd_una; +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; +#endif sock_owned_by_me((struct sock *)tp); tp->bytes_acked += delta; +#ifdef CONFIG_TCP_AO + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held((struct sock *)tp)); + if (ao && ack < tp->snd_una) + ao->snd_sne++; +#endif tp->snd_una = ack; } @@ -3585,9 +3594,18 @@ static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack) static void tcp_rcv_nxt_update(struct tcp_sock *tp, u32 seq) { u32 delta = seq - tp->rcv_nxt; +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; +#endif sock_owned_by_me((struct sock *)tp); tp->bytes_received += delta; +#ifdef CONFIG_TCP_AO + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held((struct sock *)tp)); + if (ao && seq < tp->rcv_nxt) + ao->rcv_sne++; +#endif WRITE_ONCE(tp->rcv_nxt, seq); } @@ -6455,6 +6473,16 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, * simultaneous connect with crossed SYNs. * Particularly, it can be connect to self. */ +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; + + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held(sk)); + if (ao) { + WRITE_ONCE(ao->risn, th->seq); + ao->rcv_sne = 0; + } +#endif tcp_set_state(sk, TCP_SYN_RECV); if (tp->rx_opt.saw_tstamp) { diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index ece95d5138e1..bdec99707028 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -676,7 +676,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb, u8 keyid; rcu_read_lock(); - if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, + if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, ntohl(reply->seq), &key, &traffic_key, &allocated_traffic_key, &keyid, &ao_sne)) goto out; @@ -1034,6 +1034,7 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb) struct tcp_ao_key *rnext_key; key.traffic_key = snd_other_key(key.ao_key); + key.sne = READ_ONCE(ao_info->snd_sne); rnext_key = READ_ONCE(ao_info->rnext_key); key.rcv_next = rnext_key->rcvid; key.type = TCP_KEY_AO; diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 8d941a6e066b..a9807eeb311c 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -51,6 +51,18 @@ tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw, return TCP_TW_SUCCESS; } +static void twsk_rcv_nxt_update(struct tcp_timewait_sock *tcptw, u32 seq) +{ +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; + + ao = rcu_dereference(tcptw->ao_info); + if (unlikely(ao && seq < tcptw->tw_rcv_nxt)) + WRITE_ONCE(ao->rcv_sne, ao->rcv_sne + 1); +#endif + tcptw->tw_rcv_nxt = seq; +} + /* * * Main purpose of TIME-WAIT state is to close connection gracefully, * when one of ends sits in LAST-ACK or CLOSING retransmitting FIN @@ -136,7 +148,8 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, /* FIN arrived, enter true time-wait state. */ tw->tw_substate = TCP_TIME_WAIT; - tcptw->tw_rcv_nxt = TCP_SKB_CB(skb)->end_seq; + twsk_rcv_nxt_update(tcptw, TCP_SKB_CB(skb)->end_seq); + if (tmp_opt.saw_tstamp) { tcptw->tw_ts_recent_stamp = ktime_get_seconds(); tcptw->tw_ts_recent = tmp_opt.rcv_tsval; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 9c668bbb4853..97397f57dec1 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1090,7 +1090,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) int l3index; l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0; - if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, + if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, seq, &key.ao_key, &key.traffic_key, &allocated_traffic_key, &key.rcv_next, &key.sne)) @@ -1167,6 +1167,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) /* rcv_next switches to our rcv_next */ rnext_key = READ_ONCE(ao_info->rnext_key); key.rcv_next = rnext_key->rcvid; + key.sne = READ_ONCE(ao_info->snd_sne); key.type = TCP_KEY_AO; #else if (0) { From patchwork Mon Oct 23 19:22:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157068 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1505877vqx; Mon, 23 Oct 2023 12:26:20 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHj7cW8Td4X6ijt8iDQGHOyqgzpkQd/sWtLKKyl/EBVgaysiclSnB/6N3bOnTQYYc7YiYsz X-Received: by 2002:a05:6a20:4294:b0:17c:c278:bcb6 with SMTP id o20-20020a056a20429400b0017cc278bcb6mr591094pzj.29.1698089180124; Mon, 23 Oct 2023 12:26:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089180; cv=none; d=google.com; s=arc-20160816; b=glij+LeiEh8pTFuzuRiH0X/rZwSOnjoaPjk3sVjeeJteWEA6+VMnzPkAPe3ulidoZc 3d9CAVxKLQZEjV01kprmNilN3JCXdcTrmo2U/yfMHqQWz3Wi0IG+m8QtiZe8N+D4yXWI kyaOdOxn6gEnWy4rRF+ZnNbaAe/uoRPkNDofH01PTL/B78YREXHEzLvUv4RO2Ztjy3Jh o4CSlA07OPGGeTM6oJ7grEjXAaYw7+zqogLW940S2pt6vI6qaxj7dbG401rpIGptX/G/ o9ibGKd4FitEezU8XjwY9yjKJjZ2NX942dRQEHN0M/JanjwAi9jdOBE18bTXqFxSpB/8 7NFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=YaLyEMYJtggtWrtStOtLZq/UC24SdFNNRozlG1IgPAc=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=BCMxFSLDIE8h5k3SdNQKXKU8dX5rvVQB+xcquAhd2/3Z03dobt8WcqNmS8gdOAG+ko JJrog1j7UKhqr6/KLpSzSzsng/Wa1iuz5KjB+w8+hPw+PWUv8BQaOtYYbcd3IYhC4nj8 ARCPrOJhsMb+M1vKrBW7WEurPOC12GuDxpAvQyDQ+Klvtg9xFgh+rdDjRwLXQcOK1uMp +LWF143hNaZ8Z4f/hraYDvyYEcBW9/DgKFqGMMUAVxnYEeq2HJ/H08EsySvKtJvyq3tO V0xhp9RZ7wnKNzh1uhfYA51Gkmt9cVhYXmkidI/Tfi4HgbnYAz4UF+6tU2SA/Fi3nG3j lYWQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=M7ptxhDu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from morse.vger.email (morse.vger.email. [23.128.96.31]) by mx.google.com with ESMTPS id 193-20020a6302ca000000b00585a5e9a965si7049991pgc.161.2023.10.23.12.26.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:26:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) client-ip=23.128.96.31; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=M7ptxhDu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 8BD9980AFEB7; Mon, 23 Oct 2023 12:25:07 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233687AbjJWTXv (ORCPT + 27 others); Mon, 23 Oct 2023 15:23:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231382AbjJWTX2 (ORCPT ); Mon, 23 Oct 2023 15:23:28 -0400 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2B061725 for ; Mon, 23 Oct 2023 12:22:59 -0700 (PDT) Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-40859c46447so16637575e9.1 for ; Mon, 23 Oct 2023 12:22:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088975; x=1698693775; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YaLyEMYJtggtWrtStOtLZq/UC24SdFNNRozlG1IgPAc=; b=M7ptxhDuFAQtR21kVMWx2BCV28n880Vx1PMU4eTB2y4dnAAB4SxyzwWLagRRcV9w27 IwQZb3gUdDvC3uhrqCQaF2Z1lpbcnh6F6r0khg42lsaqVjFA8K3p3fDYQ7EHZfrbSRav xo0cPBfae+Yh6HcZe9pdnCZvuiQv7vKreGlBFGptyvGdZMktYPMfOj7C5UCpmVuw+tuP 8xP3+VRJwHdKJ66g1BjzfRkSUHktmzgTCW+2GIjJZJIQrFc2UakIZH0I5c9CBTZfaMqe yugUeoH0nipfuQCreK2pemtlux6RPiM6S9TO19pQJLpEWpvAxksrOoNH1OTVYjYbUOHe PH9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088975; x=1698693775; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YaLyEMYJtggtWrtStOtLZq/UC24SdFNNRozlG1IgPAc=; b=hl8U3A4SomEqQ7SprR/eZFGEEa1duDqy1QzLjbaMRzyJP6/zkPCAmbwqIjciq9G+R1 I8NHXvi78ZGvXUJYGeYndRQXFGrQHdOhJluS7a34oMfsmnmIwqyAdI8VWnk1g/+A/rtK lvrLpQMap0MQZPHQmDeHXLs4qDRunj4KgSXvSz9yss9fDOJm4/E+a/i2RLvkr/PuuA4g ejwpYA+96JYtn1CZ7CEBBYv5DWvEjNk8YVKdxEVnscTZ0cEiYJAWNb1gEN1+iVi15vgJ Th6JtQ5KIg5pTbmbn+65KMaksRLBR6S3StJx5LgfDTrIu1vgyNYDvB0EeRkkGsW2g/sI fWdw== X-Gm-Message-State: AOJu0YyFbZmKQ39mic9TusAhRoSryAz+dl8XIESmD700ZDMCORhWTH/Y TNow1VzmiV/x/pBt1Zs8/z9SW6zxBnp54NWZUMI= X-Received: by 2002:a05:600c:468e:b0:408:500b:a476 with SMTP id p14-20020a05600c468e00b00408500ba476mr8051615wmo.20.1698088975181; Mon, 23 Oct 2023 12:22:55 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:54 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 15/23] net/tcp: Add tcp_hash_fail() ratelimited logs Date: Mon, 23 Oct 2023 20:22:07 +0100 Message-ID: <20231023192217.426455-16-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:25:07 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575559934192811 X-GMAIL-MSGID: 1780575559934192811 Add a helper for logging connection-detailed messages for failed TCP hash verification (both MD5 and AO). Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 14 ++++++++++++-- include/net/tcp_ao.h | 29 +++++++++++++++++++++++++++++ net/ipv4/tcp.c | 23 +++++++++++++---------- net/ipv4/tcp_ao.c | 7 +++++++ 4 files changed, 61 insertions(+), 12 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index d29c8a867f0e..c93ac6cc12c4 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2746,12 +2746,18 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req, int l3index; /* Invalid option or two times meet any of auth options */ - if (tcp_parse_auth_options(th, &md5_location, &aoh)) + if (tcp_parse_auth_options(th, &md5_location, &aoh)) { + tcp_hash_fail("TCP segment has incorrect auth options set", + family, skb, ""); return SKB_DROP_REASON_TCP_AUTH_HDR; + } if (req) { if (tcp_rsk_used_ao(req) != !!aoh) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD); + tcp_hash_fail("TCP connection can't start/end using TCP-AO", + family, skb, "%s", + !aoh ? "missing AO" : "AO signed"); return SKB_DROP_REASON_TCP_AOFAILURE; } } @@ -2768,10 +2774,14 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req, * the last key is impossible to remove, so there's * always at least one current_key. */ - if (tcp_ao_required(sk, saddr, family, true)) + if (tcp_ao_required(sk, saddr, family, true)) { + tcp_hash_fail("AO hash is required, but not found", + family, skb, "L3 index %d", l3index); return SKB_DROP_REASON_TCP_AONOTFOUND; + } if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); + tcp_hash_fail("MD5 Hash not found", family, skb, ""); return SKB_DROP_REASON_TCP_MD5NOTFOUND; } return SKB_NOT_DROPPED_YET; diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 0c3516d1b968..4da6e3657913 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -118,6 +118,35 @@ struct tcp_ao_info { struct rcu_head rcu; }; +#define tcp_hash_fail(msg, family, skb, fmt, ...) \ +do { \ + const struct tcphdr *th = tcp_hdr(skb); \ + char hdr_flags[5] = {}; \ + char *f = hdr_flags; \ + \ + if (th->fin) \ + *f++ = 'F'; \ + if (th->syn) \ + *f++ = 'S'; \ + if (th->rst) \ + *f++ = 'R'; \ + if (th->ack) \ + *f++ = 'A'; \ + if (f != hdr_flags) \ + *f = ' '; \ + if ((family) == AF_INET) { \ + net_info_ratelimited("%s for (%pI4, %d)->(%pI4, %d) %s" fmt "\n", \ + msg, &ip_hdr(skb)->saddr, ntohs(th->source), \ + &ip_hdr(skb)->daddr, ntohs(th->dest), \ + hdr_flags, ##__VA_ARGS__); \ + } else { \ + net_info_ratelimited("%s for [%pI6c]:%u->[%pI6c]:%u %s" fmt "\n", \ + msg, &ipv6_hdr(skb)->saddr, ntohs(th->source), \ + &ipv6_hdr(skb)->daddr, ntohs(th->dest), \ + hdr_flags, ##__VA_ARGS__); \ + } \ +} while (0) + #ifdef CONFIG_TCP_AO /* TCP-AO structures and functions */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 846ddc023ac1..8594a4bf764e 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4381,7 +4381,6 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, * o MD5 hash and we're not expecting one. * o MD5 hash and its wrong. */ - const struct tcphdr *th = tcp_hdr(skb); const struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *key; u8 newhash[16]; @@ -4391,6 +4390,7 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, if (!key && hash_location) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED); + tcp_hash_fail("Unexpected MD5 Hash found", family, skb, ""); return SKB_DROP_REASON_TCP_MD5UNEXPECTED; } @@ -4406,16 +4406,19 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb, if (genhash || memcmp(hash_location, newhash, 16) != 0) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE); if (family == AF_INET) { - net_info_ratelimited("MD5 Hash failed for (%pI4, %d)->(%pI4, %d)%s L3 index %d\n", - saddr, ntohs(th->source), - daddr, ntohs(th->dest), - genhash ? " tcp_v4_calc_md5_hash failed" - : "", l3index); + tcp_hash_fail("MD5 Hash failed", AF_INET, skb, "%s L3 index %d", + genhash ? "tcp_v4_calc_md5_hash failed" + : "", l3index); } else { - net_info_ratelimited("MD5 Hash %s for [%pI6c]:%u->[%pI6c]:%u L3 index %d\n", - genhash ? "failed" : "mismatch", - saddr, ntohs(th->source), - daddr, ntohs(th->dest), l3index); + if (genhash) { + tcp_hash_fail("MD5 Hash failed", + AF_INET6, skb, "L3 index %d", + l3index); + } else { + tcp_hash_fail("MD5 Hash mismatch", + AF_INET6, skb, "L3 index %d", + l3index); + } } return SKB_DROP_REASON_TCP_MD5FAILURE; } diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 7e14bcd4dfd4..f76fcb93499d 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -800,6 +800,8 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD); atomic64_inc(&info->counters.pkt_bad); atomic64_inc(&key->pkt_bad); + tcp_hash_fail("AO hash wrong length", family, skb, + "%u != %d", maclen, tcp_ao_maclen(key)); return SKB_DROP_REASON_TCP_AOFAILURE; } @@ -814,6 +816,7 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD); atomic64_inc(&info->counters.pkt_bad); atomic64_inc(&key->pkt_bad); + tcp_hash_fail("AO hash mismatch", family, skb, ""); kfree(hash_buf); return SKB_DROP_REASON_TCP_AOFAILURE; } @@ -841,6 +844,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, info = rcu_dereference(tcp_sk(sk)->ao_info); if (!info) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND); + tcp_hash_fail("AO key not found", family, skb, + "keyid: %u", aoh->keyid); return SKB_DROP_REASON_TCP_AOUNEXPECTED; } @@ -942,6 +947,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, key_not_found: NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND); atomic64_inc(&info->counters.key_not_found); + tcp_hash_fail("Requested by the peer AO key id not found", + family, skb, ""); return SKB_DROP_REASON_TCP_AOKEYNOTFOUND; } From patchwork Mon Oct 23 19:22:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157062 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504936vqx; Mon, 23 Oct 2023 12:24:17 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHGV9KVHSlST5hXNNPHBSsaLUAP8H6BmXXyzgt08auLfVUBQinpxC2eHBdJpvKAJ06e6N5i X-Received: by 2002:a17:903:1210:b0:1c5:d8a3:8789 with SMTP id l16-20020a170903121000b001c5d8a38789mr9859250plh.4.1698089056751; Mon, 23 Oct 2023 12:24:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089056; cv=none; d=google.com; s=arc-20160816; b=kiagul3x59YztUYnLb9EgbCxVxHSvizan/2DRS/cv+Zl/W1JLMUMgLuvAf6niOpVZ7 Nb9mLJZ9tV1wWzQ9F5xbqzdqEGVddCS1ceQYbiGjcdx8kdRK5tI4nPZQbL025XQIdKkd cEF+ffs3drtjPqp6CiiCOPygsjCbcY7Nt6tTdnKJqNTYWlxDS8ZVzyKSsKtsDmQ65sWK CXQ/eGiDTXCC3ULWCTmYxHuiB5RooMgqr1mtfn2Zfj0XzyuPg/WQ9ECNjU0+/3J/9s97 aQFDhtvLLZ7bsIUyf3uKm9yFzBWIRIpzQp+kH6wEs2ANqxJq7iByStYzfhOELUjmc0jM KTyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=zi8TUQ9RHKTqi8CYUyqs3AjnhbL/JGM1AwjJTbvUmVQ=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=bPCrwI2C54XMXHyxxwnyGCL0Y2368zxOk7WkiSLbENsGgxQLBr/8V0FUZTOxCCgyqL pm0kUxgrEdD2d91S1HqzpEYkWx4ZSlUx4WcgqHcmEf2Hj6nNMsyIKtE0d7ooZuDSaGyX rqS7BRy4uaLG2ZX9RU5DMGa4tcaVtsUF5OSpFefi1jd1vDug7KUxHQjEbUMJGcMc8PGL Vt4Aez7sZkIV87NqWEEqU5d/Vi/GyruBBEY+03rtFcxjyC7IIrG9vrVKDn3BJdeQ8Os+ BLtu/bsV+2bitJY0odpeZR8K8bqVAlmMqmaSaEMnKbR++34Zet/BjKkH4nIgX74yhKzS cKvQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=hDtdGC0s; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id b5-20020a170902650500b001c9fb3b55f0si6685305plk.652.2023.10.23.12.24.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:24:16 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=hDtdGC0s; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 0982E80C0352; Mon, 23 Oct 2023 12:24:16 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232215AbjJWTYK (ORCPT + 27 others); Mon, 23 Oct 2023 15:24:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233121AbjJWTXi (ORCPT ); Mon, 23 Oct 2023 15:23:38 -0400 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6DC351728 for ; Mon, 23 Oct 2023 12:23:00 -0700 (PDT) Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-4083cd39188so28523005e9.2 for ; Mon, 23 Oct 2023 12:23:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088977; x=1698693777; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zi8TUQ9RHKTqi8CYUyqs3AjnhbL/JGM1AwjJTbvUmVQ=; b=hDtdGC0swp+6Wikk3IkhyfUFPev01SoIH0D2Ko50PaHCDROinCFFWOM4yESxBNm/+u Va3fq/zCdX9CtQ9aF7MZSSYdvK61knku90SlSxxer1cjUJZsgs+iOOsiMmW+XE3JrEeu Xl83Wkr7OqaJ2RseNJX6fHle2N8SvCPKM4GD0JnTPzwdsLj9Zth0ScMp1rjefUOfKDGQ 21gCk8zRYKNoDMFmSjFCNxzq5miD7cHKG8q1NS5/XUMQZcTRNFeceZr4zxhX6dhC26Xt TqW6iN6wAfCTpaRNCaCmKtqMIT7/VVjiRQkhl79YVIs6n4LxuwTqxYVaysLR54o2Ev+S ebfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088977; x=1698693777; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zi8TUQ9RHKTqi8CYUyqs3AjnhbL/JGM1AwjJTbvUmVQ=; b=oqWj8T/upknqSZE8rO3xEiKl0k/nVtXqzxFSHKxnWtK3nmRuRhMzeRo7R73sXcm2d6 XRFcNgxwuKadNSE2qdQqNJa2eQahk2SC3HbjfKrx3R7U9ZJPWI82NSa9LEr4p1wOpsc0 UE6w1W2qRJU3Oe2cW199KR5MAOlr5L1LVcJayKRZnWek2g0BuK/fJt9PaMTZE4uNxXz7 /pG/ocE9M1I1KqaoHNItm7lAPzjEyVhzu6BjND/iJ/7TtlRfcaTz41yN/gP5MQBeDHfu cZLEdalJYQoUsKJfB1PEjBCxx9JNgy5nHJx5rC7RvV8HeycZH/HwhRDwBeUAE/ESUcj6 XMEQ== X-Gm-Message-State: AOJu0YwXLhrSSMuAuKy5IQVMhXDCLjJgVY/Bx70vIW725JaR/liAcFj3 ZY8uwcjgjE7uA4Pr0Mpp1S1s5Q== X-Received: by 2002:a05:600c:4fc3:b0:405:1c19:b747 with SMTP id o3-20020a05600c4fc300b004051c19b747mr7536981wmq.15.1698088977088; Mon, 23 Oct 2023 12:22:57 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:56 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 16/23] net/tcp: Ignore specific ICMPs for TCP-AO connections Date: Mon, 23 Oct 2023 20:22:08 +0100 Message-ID: <20231023192217.426455-17-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:24:16 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575430701560253 X-GMAIL-MSGID: 1780575430701560253 Similarly to IPsec, RFC5925 prescribes: ">> A TCP-AO implementation MUST default to ignore incoming ICMPv4 messages of Type 3 (destination unreachable), Codes 2-4 (protocol unreachable, port unreachable, and fragmentation needed -- ’hard errors’), and ICMPv6 Type 1 (destination unreachable), Code 1 (administratively prohibited) and Code 4 (port unreachable) intended for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN- WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs." A selftest (later in patch series) verifies that this attack is not possible in this TCP-AO implementation. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp_ao.h | 11 +++++++- include/uapi/linux/snmp.h | 1 + include/uapi/linux/tcp.h | 4 ++- net/ipv4/proc.c | 1 + net/ipv4/tcp_ao.c | 58 +++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 7 +++++ net/ipv6/tcp_ipv6.c | 7 +++++ 7 files changed, 87 insertions(+), 2 deletions(-) diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 4da6e3657913..a9d38b9e8bcb 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -24,6 +24,7 @@ struct tcp_ao_counters { atomic64_t pkt_bad; atomic64_t key_not_found; atomic64_t ao_required; + atomic64_t dropped_icmp; }; struct tcp_ao_key { @@ -92,7 +93,8 @@ struct tcp_ao_info { struct tcp_ao_key *rnext_key; struct tcp_ao_counters counters; u32 ao_required :1, - __unused :31; + accept_icmps :1, + __unused :30; __be32 lisn; __be32 risn; /* Sequence Number Extension (SNE) are upper 4 bytes for SEQ, @@ -191,6 +193,7 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, unsigned int len, struct tcp_sigpool *hp); void tcp_ao_destroy_sock(struct sock *sk, bool twsk); void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp); +bool tcp_ao_ignore_icmp(const struct sock *sk, int family, int type, int code); enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, const struct request_sock *req, @@ -274,6 +277,12 @@ static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, { } +static inline bool tcp_ao_ignore_icmp(const struct sock *sk, int family, + int type, int code) +{ + return false; +} + static inline enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, const struct request_sock *req, const struct tcp_ao_hdr *aoh) diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h index 3d5ea841bffe..a0819c6a5988 100644 --- a/include/uapi/linux/snmp.h +++ b/include/uapi/linux/snmp.h @@ -301,6 +301,7 @@ enum LINUX_MIB_TCPAOBAD, /* TCPAOBad */ LINUX_MIB_TCPAOKEYNOTFOUND, /* TCPAOKeyNotFound */ LINUX_MIB_TCPAOGOOD, /* TCPAOGood */ + LINUX_MIB_TCPAODROPPEDICMPS, /* TCPAODroppedIcmps */ __LINUX_MIB_MAX }; diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 9c48964849d1..d8b2ea23f12a 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -405,7 +405,8 @@ struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */ set_rnext :1, /* corresponding ::rnext */ ao_required :1, /* don't accept non-AO connects */ set_counters :1, /* set/clear ::pkt_* counters */ - reserved :28; /* must be 0 */ + accept_icmps :1, /* accept incoming ICMPs */ + reserved :27; /* must be 0 */ __u16 reserved2; /* padding, must be 0 */ __u8 current_key; /* KeyID to set as Current_key */ __u8 rnext; /* KeyID to set as Rnext_key */ @@ -413,6 +414,7 @@ struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */ __u64 pkt_bad; /* failed verification */ __u64 pkt_key_not_found; /* could not find a key to verify */ __u64 pkt_ao_required; /* segments missing TCP-AO sign */ + __u64 pkt_dropped_icmp; /* ICMPs that were ignored */ } __attribute__((aligned(8))); /* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */ diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c index f5b37ebc18c0..5f4654ebff48 100644 --- a/net/ipv4/proc.c +++ b/net/ipv4/proc.c @@ -303,6 +303,7 @@ static const struct snmp_mib snmp4_net_list[] = { SNMP_MIB_ITEM("TCPAOBad", LINUX_MIB_TCPAOBAD), SNMP_MIB_ITEM("TCPAOKeyNotFound", LINUX_MIB_TCPAOKEYNOTFOUND), SNMP_MIB_ITEM("TCPAOGood", LINUX_MIB_TCPAOGOOD), + SNMP_MIB_ITEM("TCPAODroppedIcmps", LINUX_MIB_TCPAODROPPEDICMPS), SNMP_MIB_SENTINEL }; diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index f76fcb93499d..223af5c9eaf3 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -15,6 +15,7 @@ #include #include +#include int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, unsigned int len, struct tcp_sigpool *hp) @@ -44,6 +45,60 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, return 1; } +bool tcp_ao_ignore_icmp(const struct sock *sk, int family, int type, int code) +{ + bool ignore_icmp = false; + struct tcp_ao_info *ao; + + /* RFC5925, 7.8: + * >> A TCP-AO implementation MUST default to ignore incoming ICMPv4 + * messages of Type 3 (destination unreachable), Codes 2-4 (protocol + * unreachable, port unreachable, and fragmentation needed -- ’hard + * errors’), and ICMPv6 Type 1 (destination unreachable), Code 1 + * (administratively prohibited) and Code 4 (port unreachable) intended + * for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN- + * WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs. + */ + if (family == AF_INET) { + if (type != ICMP_DEST_UNREACH) + return false; + if (code < ICMP_PROT_UNREACH || code > ICMP_FRAG_NEEDED) + return false; + } else { + if (type != ICMPV6_DEST_UNREACH) + return false; + if (code != ICMPV6_ADM_PROHIBITED && code != ICMPV6_PORT_UNREACH) + return false; + } + + rcu_read_lock(); + switch (sk->sk_state) { + case TCP_TIME_WAIT: + ao = rcu_dereference(tcp_twsk(sk)->ao_info); + break; + case TCP_SYN_SENT: + case TCP_SYN_RECV: + case TCP_LISTEN: + case TCP_NEW_SYN_RECV: + /* RFC5925 specifies to ignore ICMPs *only* on connections + * in synchronized states. + */ + rcu_read_unlock(); + return false; + default: + ao = rcu_dereference(tcp_sk(sk)->ao_info); + } + + if (ao && !ao->accept_icmps) { + ignore_icmp = true; + __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAODROPPEDICMPS); + atomic64_inc(&ao->counters.dropped_icmp); + } + rcu_read_unlock(); + + return ignore_icmp; +} + /* Optimized version of tcp_ao_do_lookup(): only for sockets for which * it's known that the keys in ao_info are matching peer's * family/address/VRF/etc. @@ -1086,6 +1141,7 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk, new_ao->lisn = htonl(tcp_rsk(req)->snt_isn); new_ao->risn = htonl(tcp_rsk(req)->rcv_isn); new_ao->ao_required = ao->ao_required; + new_ao->accept_icmps = ao->accept_icmps; if (family == AF_INET) { addr = (union tcp_ao_addr *)&newsk->sk_daddr; @@ -1792,9 +1848,11 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, atomic64_set(&ao_info->counters.pkt_bad, cmd.pkt_bad); atomic64_set(&ao_info->counters.key_not_found, cmd.pkt_key_not_found); atomic64_set(&ao_info->counters.ao_required, cmd.pkt_ao_required); + atomic64_set(&ao_info->counters.dropped_icmp, cmd.pkt_dropped_icmp); } ao_info->ao_required = cmd.ao_required; + ao_info->accept_icmps = cmd.accept_icmps; if (new_current) WRITE_ONCE(ao_info->current_key, new_current); if (new_rnext) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index bdec99707028..8f98c58e2689 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -494,6 +494,8 @@ int tcp_v4_err(struct sk_buff *skb, u32 info) return -ENOENT; } if (sk->sk_state == TCP_TIME_WAIT) { + /* To increase the counter of ignored icmps for TCP-AO */ + tcp_ao_ignore_icmp(sk, AF_INET, type, code); inet_twsk_put(inet_twsk(sk)); return 0; } @@ -507,6 +509,11 @@ int tcp_v4_err(struct sk_buff *skb, u32 info) return 0; } + if (tcp_ao_ignore_icmp(sk, AF_INET, type, code)) { + sock_put(sk); + return 0; + } + bh_lock_sock(sk); /* If too many ICMPs get dropped on busy * servers this needs to be solved differently. diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 97397f57dec1..2b8e87429e24 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -396,6 +396,8 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, } if (sk->sk_state == TCP_TIME_WAIT) { + /* To increase the counter of ignored icmps for TCP-AO */ + tcp_ao_ignore_icmp(sk, AF_INET6, type, code); inet_twsk_put(inet_twsk(sk)); return 0; } @@ -406,6 +408,11 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, return 0; } + if (tcp_ao_ignore_icmp(sk, AF_INET6, type, code)) { + sock_put(sk); + return 0; + } + bh_lock_sock(sk); if (sock_owned_by_user(sk) && type != ICMPV6_PKT_TOOBIG) __NET_INC_STATS(net, LINUX_MIB_LOCKDROPPEDICMPS); From patchwork Mon Oct 23 19:22:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157063 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1504999vqx; Mon, 23 Oct 2023 12:24:24 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE2jnSghkfg/qyAULl4YoC/cXNzZ3CwX0cgWWqVquFwhV/p53xyEj7LTD9gvzWnVEPq5s4v X-Received: by 2002:a05:6a20:7486:b0:16c:b5be:5f6c with SMTP id p6-20020a056a20748600b0016cb5be5f6cmr543937pzd.54.1698089064201; Mon, 23 Oct 2023 12:24:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089064; cv=none; d=google.com; s=arc-20160816; b=fgaFjon5pBlQZoqTbMP1QJ+oeJRiubCGX+XzGvSCk0bMTzB4JFPn9SyyiIJRqipsx8 8TWg9PPMq0LD9ycKUWTWptXLRGEkcPXbmjIY3WWiuZCwcSaPm8r+wDwwBI+4WEH/358q riFXp3cGd+xrdm6J5lbu3T9CVNOT733sPnrouxp7IwAUpJL/By++XboSFoCqnlZ8k+Is 2o3kEkFRRC/f7SZ5aSboZqSLVt6VE5t/juXJXKuQ8CzHzr1hB9kO97gQSQst+STEsLWT FCDsYOxPP6DRLl3zsgdENk+XpNKF4hssoSLLpcwgTAV9eNVm7HMk8o7OQ2uBd2N7+qaV zT9w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=7u8YCu2AD+GW6MAEiXwWdEdCsazbndy14jHPXm7rwEg=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=DD3Rl9XvxkpLpoiJ/X1xeXpFM7EqtHHTWa2PqL/zTjgq5VAG7YHgQrZ6eLQErtPA/I DNc7gutmk78xUldnzhKX9SsLIoSuwzD7tC5yIwtBxhZvMIaLqc3lfO+zb4/W+prSuXL8 Qld77N+QqKw8SC/ES/kmDOccDEny1WPywO8i29G3+B7zMFjx5E1b2fvhIZ7JcBS4vE2N gsoEbpH2lRZlzFA4mk1sh+sfSKEnH9nLO9ecKNbli1ZVBF5cBlgVut7YVco94rA+5ErZ y7M5oD0CJDKk2GU7Ut7eNk70hqv0TBchfL4v2Br8+KRs9SLEEfbhDS25IdP9H3sQg37m yQTA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=cOttSYca; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id g12-20020aa79dcc000000b006be0a1e095bsi6913944pfq.183.2023.10.23.12.24.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:24:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=cOttSYca; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 12DC480C0364; Mon, 23 Oct 2023 12:24:23 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233280AbjJWTYM (ORCPT + 27 others); Mon, 23 Oct 2023 15:24:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231453AbjJWTXi (ORCPT ); Mon, 23 Oct 2023 15:23:38 -0400 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9F2901739 for ; Mon, 23 Oct 2023 12:23:02 -0700 (PDT) Received: by mail-wm1-x331.google.com with SMTP id 5b1f17b1804b1-40859dee28cso18766965e9.0 for ; Mon, 23 Oct 2023 12:23:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088979; x=1698693779; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7u8YCu2AD+GW6MAEiXwWdEdCsazbndy14jHPXm7rwEg=; b=cOttSYcamX4i+PydWodyMxATiRjX+vbQlUvLNTdHvEIALf5W6xab85A0U9xJfBYVGY hdYeBymYYZItSjjbot4etAP242bdfnoSvD+jXwUUcWBJCNoN3yUSCgHvJ8XgcGw6ISaS 3ayo0VKY7aU+0OJiE37A6SCOd4L4FDcc84Ts3Ch0LGn4UQ3iQPBIzDN7kGm9v4qfL6ud +wmWoL9+GGnqCnjYuLPN+Woh6MkXDNJkFajktuXWfoHgCW7daZU3dLJTC9ixf373wapM VdqB+7o+MsZhRAOm7boWXePBUGqtH3zF9rSJS/ork01187vI3+78wghO+vABikGSgwH7 xhRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088979; x=1698693779; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7u8YCu2AD+GW6MAEiXwWdEdCsazbndy14jHPXm7rwEg=; b=vCI7IM5N1LfU3M046cZwvB9DRXMFRCYjH/7MXIF5jc/PsULGQIYXE0uzukRBuWXg38 6M+meJ/EOYneAGQMfZKA8WQMlsCxOg7mDYAyaXFR22dHOFEnNsbsjzsL4PAn91VBcGgX GvY3k+nyasrhI940KeXlZc3B6hD6AuVcci4fYZ1Xg4HNADVEKAGgXAq5W1r6Mza2z6Jy 43UBZgM2N/eXC+EXWYVxQflhsxvrvWUJLA/Mk7wB0mbyNv7vHh2hb3AZtB8wU6fH94N0 LvFzzB0InjsZPZgEuuynFPx4RK5U6i+wHD5ivU4JH2ihVAmaGt51vXrttee5C8NI8cmE ysBQ== X-Gm-Message-State: AOJu0YyzBqwldpHJBsWjFDEVh84hiVG9L46DtBF3ObPsOOwVwKqV6pAf XxeRLJIsAykPSkM1F8hPsZ+20w== X-Received: by 2002:a05:600c:1d22:b0:408:4475:8cc1 with SMTP id l34-20020a05600c1d2200b0040844758cc1mr8453490wms.35.1698088978886; Mon, 23 Oct 2023 12:22:58 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:22:58 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 17/23] net/tcp: Add option for TCP-AO to (not) hash header Date: Mon, 23 Oct 2023 20:22:09 +0100 Message-ID: <20231023192217.426455-18-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:24:23 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575438424166255 X-GMAIL-MSGID: 1780575438424166255 Provide setsockopt() key flag that makes TCP-AO exclude hashing TCP header for peers that match the key. This is needed for interraction with middleboxes that may change TCP options, see RFC5925 (9.2). Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/uapi/linux/tcp.h | 5 +++++ net/ipv4/tcp_ao.c | 8 +++++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index d8b2ea23f12a..320aab010f9a 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -367,6 +367,11 @@ struct tcp_diag_md5sig { #define TCP_AO_MAXKEYLEN 80 #define TCP_AO_KEYF_IFINDEX (1 << 0) /* L3 ifindex for VRF */ +#define TCP_AO_KEYF_EXCLUDE_OPT (1 << 1) /* "Indicates whether TCP + * options other than TCP-AO + * are included in the MAC + * calculation" + */ struct tcp_ao_add { /* setsockopt(TCP_AO_ADD_KEY) */ struct __kernel_sockaddr_storage addr; /* peer's address for the key */ diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 223af5c9eaf3..10cc6be4d537 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -562,7 +562,8 @@ int tcp_ao_hash_hdr(unsigned short int family, char *ao_hash, WARN_ON_ONCE(1); goto clear_hash; } - if (tcp_ao_hash_header(&hp, th, false, + if (tcp_ao_hash_header(&hp, th, + !!(key->keyflags & TCP_AO_KEYF_EXCLUDE_OPT), ao_hash, hash_offset, tcp_ao_maclen(key))) goto clear_hash; ahash_request_set_crypt(hp.req, NULL, hash_buf, 0); @@ -610,7 +611,8 @@ int tcp_ao_hash_skb(unsigned short int family, goto clear_hash; if (tcp_ao_hash_pseudoheader(family, sk, skb, &hp, skb->len)) goto clear_hash; - if (tcp_ao_hash_header(&hp, th, false, + if (tcp_ao_hash_header(&hp, th, + !!(key->keyflags & TCP_AO_KEYF_EXCLUDE_OPT), ao_hash, hash_offset, tcp_ao_maclen(key))) goto clear_hash; if (tcp_sigpool_hash_skb_data(&hp, skb, th->doff << 2)) @@ -1454,7 +1456,7 @@ static struct tcp_ao_info *setsockopt_ao_info(struct sock *sk) return ERR_PTR(-ESOCKTNOSUPPORT); } -#define TCP_AO_KEYF_ALL (0) +#define TCP_AO_KEYF_ALL (TCP_AO_KEYF_EXCLUDE_OPT) static struct tcp_ao_key *tcp_ao_key_alloc(struct sock *sk, struct tcp_ao_add *cmd) From patchwork Mon Oct 23 19:22:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157078 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1510701vqx; Mon, 23 Oct 2023 12:36:15 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFLcUSGnRBAvlnH7OvoBabghvTLIpSvB11nHHhJQaNh3Uhlso6SK43uX3Tz60VmGvsEh4fV X-Received: by 2002:a17:902:dac6:b0:1c7:66a4:27ba with SMTP id q6-20020a170902dac600b001c766a427bamr14376839plx.48.1698089775073; Mon, 23 Oct 2023 12:36:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089775; cv=none; d=google.com; s=arc-20160816; b=ILgNa2EY7jGvv+h+o/8nl2/Su1m/nYh6QJu1ZEBzN/ZJ1dEDVcilVFvVSl7ENX+D1r 19ElLCGWlHiKGgV/kzfZ56Ay68smW6zf/2rOpkmBHd182U74zXVsMJQC2MnriXwNbH1k E742vb8C/Yq7ybQYTFbbD/7j+y0tEVZclcIvmfTOZvNs7Rd9Ow/b1yz1Z02rLPCLr9bG UuEHI5Zvlmfxramsnkuht1yYXgIy8SvZzwOJvUKtM78yBpxE96HU1xgkQ9/MpdpUh3YW qtLE3q66ZXCXRjeR89uN1pvKrJwc0CL7evRmjeAhiH8f2vdLNZREtZYtuBFfNpmhh62W Mj6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ogTBmilbYwjvOrqdvd5xSwoqZq6DdeNin62x3XORTDI=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=gwKHoRitQ8Hx4PGjMaqGQFtuZvCOpcNMpbREFE3OZo2jOaD/pEl/vaHWMVLqmECGQ2 26wW/I5f1L+8qgEdywrJ/y8/4ftBT0YwPRuPXnTN/pSSRLJkFGfrIJgvSq85KHof0Cr5 RmRjG7upvIj7yS3dum1G9Nn65xTk7OZsDUlcb3yhY7CTso3oYxZbXn4wh5nFY2gT73og e6ecfjMYOvKyXE3NONfGpScPNJKBWOOeYvwlSJih1VGhnku7lPVtmnFNlFrvXuWZrVZq tSNFEx4uad3KMmeQNwNrg4E5QYxWu99LSC5sw07xMkQcmeVVPheN5wCyK6mPTMIgIFBT 3kUA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="kQ/oA2c7"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id h5-20020a170902f54500b001c3a06b4fd7si7059432plf.561.2023.10.23.12.36.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:36:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="kQ/oA2c7"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 84F0880BE2CF; Mon, 23 Oct 2023 12:36:13 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230245AbjJWTf7 (ORCPT + 27 others); Mon, 23 Oct 2023 15:35:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232364AbjJWTXq (ORCPT ); Mon, 23 Oct 2023 15:23:46 -0400 Received: from mail-lj1-x229.google.com (mail-lj1-x229.google.com [IPv6:2a00:1450:4864:20::229]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B0DC19A1 for ; Mon, 23 Oct 2023 12:23:06 -0700 (PDT) Received: by mail-lj1-x229.google.com with SMTP id 38308e7fff4ca-2b95d5ee18dso58019271fa.1 for ; Mon, 23 Oct 2023 12:23:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088981; x=1698693781; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ogTBmilbYwjvOrqdvd5xSwoqZq6DdeNin62x3XORTDI=; b=kQ/oA2c7+lJu16nLRNPBd6K7m3iD2AF/FR/7lGKR4GLQK1CY/KOOdcwBgGA7SrMGCm QE1geKLtiQcvvreWIv5zyyTFL3ltvE1wpjwuOy9+xCvhwnBg7pFFlJ1ypJbw+P69DxwZ 50ZIy/+iSn+XGd0loC3rm9P7G+RbLq8xpo8gTEQCGf6+HjB0ASA3d93CtPt1+3Armyuy tRa/ZEAIaLJr0h1rbEwjTtQwz41UlUO1M/INogQpmwCiMh2UPAvBhi9zTRuaDaGzTVdP et9KvUYGyfNX8FSbcVyAt/SiICxH681CuJyTEtX974OV91WiyUDhQKL5802GE3EtbF/u CqIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088981; x=1698693781; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ogTBmilbYwjvOrqdvd5xSwoqZq6DdeNin62x3XORTDI=; b=M7c+R+sFh0OwX0s3q64dvT5TVdwDVv3PZaD/INKaMK7MeX6u34LaJm46rsAwLMEJ01 uAc157dO6gUpFZsUFoPwzeqAqwJL/jcB4uA4run5yVsgic/wMf2xYUq42Mkou10hLmaA wvzwdGvcmVF/PDItTkH+43bCYezRDTYcuL4VJA9z1cK4Dkrcb8GTa7ih3HE5/vSnH45K xikK+sPQLyfWmKc8ciEn6UXy/ceRcx7rP6L5mdsIYeA8/cE9rhatkN97CpUTXfdRiHHy oRwhHi4yM1gZYxI2YsD/RmrucsjCcIS8qjYIrbyxUEx0b5DvhEQZYAJYZIo0eEilKTbE ecyw== X-Gm-Message-State: AOJu0YylJBzyrJ1rx5Dm4OKR1VM+aENMz+8w0/804bBpkdnOdnBNjpO5 Fc9Q4JRD61Nwp2kCCtuWveqjRA== X-Received: by 2002:a05:651c:1503:b0:2c5:1f20:926c with SMTP id e3-20020a05651c150300b002c51f20926cmr7676700ljf.11.1698088980794; Mon, 23 Oct 2023 12:23:00 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.22.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:00 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 18/23] net/tcp: Add TCP-AO getsockopt()s Date: Mon, 23 Oct 2023 20:22:10 +0100 Message-ID: <20231023192217.426455-19-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:36:13 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780576183863470255 X-GMAIL-MSGID: 1780576183863470255 Introduce getsockopt(TCP_AO_GET_KEYS) that lets a user get TCP-AO keys and their properties from a socket. The user can provide a filter to match the specific key to be dumped or ::get_all = 1 may be used to dump all keys in one syscall. Add another getsockopt(TCP_AO_INFO) for providing per-socket/per-ao_info stats: packet counters, Current_key/RNext_key and flags like ::ao_required and ::accept_icmps. Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp_ao.h | 12 ++ include/uapi/linux/tcp.h | 63 +++++++-- net/ipv4/tcp.c | 13 ++ net/ipv4/tcp_ao.c | 295 +++++++++++++++++++++++++++++++++++++++ 4 files changed, 369 insertions(+), 14 deletions(-) diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index a9d38b9e8bcb..061c358a3c8a 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -194,6 +194,8 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, void tcp_ao_destroy_sock(struct sock *sk, bool twsk); void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp); bool tcp_ao_ignore_icmp(const struct sock *sk, int family, int type, int code); +int tcp_ao_get_mkts(struct sock *sk, sockptr_t optval, sockptr_t optlen); +int tcp_ao_get_sock_info(struct sock *sk, sockptr_t optval, sockptr_t optlen); enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, const struct request_sock *req, @@ -316,6 +318,16 @@ static inline void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, static inline void tcp_ao_connect_init(struct sock *sk) { } + +static inline int tcp_ao_get_mkts(struct sock *sk, sockptr_t optval, sockptr_t optlen) +{ + return -ENOPROTOOPT; +} + +static inline int tcp_ao_get_sock_info(struct sock *sk, sockptr_t optval, sockptr_t optlen) +{ + return -ENOPROTOOPT; +} #endif #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 320aab010f9a..201b3cbd6020 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -131,7 +131,8 @@ enum { #define TCP_AO_ADD_KEY 38 /* Add/Set MKT */ #define TCP_AO_DEL_KEY 39 /* Delete MKT */ -#define TCP_AO_INFO 40 /* Modify TCP-AO per-socket options */ +#define TCP_AO_INFO 40 /* Set/list TCP-AO per-socket options */ +#define TCP_AO_GET_KEYS 41 /* List MKT(s) */ #define TCP_REPAIR_ON 1 #define TCP_REPAIR_OFF 0 @@ -405,21 +406,55 @@ struct tcp_ao_del { /* setsockopt(TCP_AO_DEL_KEY) */ __u8 keyflags; /* see TCP_AO_KEYF_ */ } __attribute__((aligned(8))); -struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */ - __u32 set_current :1, /* corresponding ::current_key */ - set_rnext :1, /* corresponding ::rnext */ - ao_required :1, /* don't accept non-AO connects */ - set_counters :1, /* set/clear ::pkt_* counters */ - accept_icmps :1, /* accept incoming ICMPs */ +struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO), getsockopt(TCP_AO_INFO) */ + /* Here 'in' is for setsockopt(), 'out' is for getsockopt() */ + __u32 set_current :1, /* in/out: corresponding ::current_key */ + set_rnext :1, /* in/out: corresponding ::rnext */ + ao_required :1, /* in/out: don't accept non-AO connects */ + set_counters :1, /* in: set/clear ::pkt_* counters */ + accept_icmps :1, /* in/out: accept incoming ICMPs */ reserved :27; /* must be 0 */ __u16 reserved2; /* padding, must be 0 */ - __u8 current_key; /* KeyID to set as Current_key */ - __u8 rnext; /* KeyID to set as Rnext_key */ - __u64 pkt_good; /* verified segments */ - __u64 pkt_bad; /* failed verification */ - __u64 pkt_key_not_found; /* could not find a key to verify */ - __u64 pkt_ao_required; /* segments missing TCP-AO sign */ - __u64 pkt_dropped_icmp; /* ICMPs that were ignored */ + __u8 current_key; /* in/out: KeyID of Current_key */ + __u8 rnext; /* in/out: keyid of RNext_key */ + __u64 pkt_good; /* in/out: verified segments */ + __u64 pkt_bad; /* in/out: failed verification */ + __u64 pkt_key_not_found; /* in/out: could not find a key to verify */ + __u64 pkt_ao_required; /* in/out: segments missing TCP-AO sign */ + __u64 pkt_dropped_icmp; /* in/out: ICMPs that were ignored */ +} __attribute__((aligned(8))); + +struct tcp_ao_getsockopt { /* getsockopt(TCP_AO_GET_KEYS) */ + struct __kernel_sockaddr_storage addr; /* in/out: dump keys for peer + * with this address/prefix + */ + char alg_name[64]; /* out: crypto hash algorithm */ + __u8 key[TCP_AO_MAXKEYLEN]; + __u32 nkeys; /* in: size of the userspace buffer + * @optval, measured in @optlen - the + * sizeof(struct tcp_ao_getsockopt) + * out: number of keys that matched + */ + __u16 is_current :1, /* in: match and dump Current_key, + * out: the dumped key is Current_key + */ + + is_rnext :1, /* in: match and dump RNext_key, + * out: the dumped key is RNext_key + */ + get_all :1, /* in: dump all keys */ + reserved :13; /* padding, must be 0 */ + __u8 sndid; /* in/out: dump keys with SendID */ + __u8 rcvid; /* in/out: dump keys with RecvID */ + __u8 prefix; /* in/out: dump keys with address/prefix */ + __u8 maclen; /* out: key's length of authentication + * code (hash) + */ + __u8 keyflags; /* in/out: see TCP_AO_KEYF_ */ + __u8 keylen; /* out: length of ::key */ + __s32 ifindex; /* in/out: L3 dev index for VRF */ + __u64 pkt_good; /* out: verified segments */ + __u64 pkt_bad; /* out: segments that failed verification */ } __attribute__((aligned(8))); /* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 8594a4bf764e..a3107c28ea5f 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4282,6 +4282,19 @@ int do_tcp_getsockopt(struct sock *sk, int level, return err; } #endif + case TCP_AO_GET_KEYS: + case TCP_AO_INFO: { + int err; + + sockopt_lock_sock(sk); + if (optname == TCP_AO_GET_KEYS) + err = tcp_ao_get_mkts(sk, optval, optlen); + else + err = tcp_ao_get_sock_info(sk, optval, optlen); + sockopt_release_sock(sk); + + return err; + } default: return -ENOPROTOOPT; } diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 10cc6be4d537..cbc1ee0f5b9a 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -1894,3 +1894,298 @@ int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen) return tcp_parse_ao(sk, cmd, AF_INET, optval, optlen); } +/* tcp_ao_copy_mkts_to_user(ao_info, optval, optlen) + * + * @ao_info: struct tcp_ao_info on the socket that + * socket getsockopt(TCP_AO_GET_KEYS) is executed on + * @optval: pointer to array of tcp_ao_getsockopt structures in user space. + * Must be != NULL. + * @optlen: pointer to size of tcp_ao_getsockopt structure. + * Must be != NULL. + * + * Return value: 0 on success, a negative error number otherwise. + * + * optval points to an array of tcp_ao_getsockopt structures in user space. + * optval[0] is used as both input and output to getsockopt. It determines + * which keys are returned by the kernel. + * optval[0].nkeys is the size of the array in user space. On return it contains + * the number of keys matching the search criteria. + * If tcp_ao_getsockopt::get_all is set, then all keys in the socket are + * returned, otherwise only keys matching + * in optval[0] are returned. + * optlen is also used as both input and output. The user provides the size + * of struct tcp_ao_getsockopt in user space, and the kernel returns the size + * of the structure in kernel space. + * The size of struct tcp_ao_getsockopt may differ between user and kernel. + * There are three cases to consider: + * * If usize == ksize, then keys are copied verbatim. + * * If usize < ksize, then the userspace has passed an old struct to a + * newer kernel. The rest of the trailing bytes in optval[0] + * (ksize - usize) are interpreted as 0 by the kernel. + * * If usize > ksize, then the userspace has passed a new struct to an + * older kernel. The trailing bytes unknown to the kernel (usize - ksize) + * are checked to ensure they are zeroed, otherwise -E2BIG is returned. + * On return the kernel fills in min(usize, ksize) in each entry of the array. + * The layout of the fields in the user and kernel structures is expected to + * be the same (including in the 32bit vs 64bit case). + */ +static int tcp_ao_copy_mkts_to_user(struct tcp_ao_info *ao_info, + sockptr_t optval, sockptr_t optlen) +{ + struct tcp_ao_getsockopt opt_in, opt_out; + struct tcp_ao_key *key, *current_key; + bool do_address_matching = true; + union tcp_ao_addr *addr = NULL; + unsigned int max_keys; /* maximum number of keys to copy to user */ + size_t out_offset = 0; + size_t bytes_to_write; /* number of bytes to write to user level */ + int err, user_len; + u32 matched_keys; /* keys from ao_info matched so far */ + int optlen_out; + __be16 port = 0; + + if (copy_from_sockptr(&user_len, optlen, sizeof(int))) + return -EFAULT; + + if (user_len <= 0) + return -EINVAL; + + memset(&opt_in, 0, sizeof(struct tcp_ao_getsockopt)); + err = copy_struct_from_sockptr(&opt_in, sizeof(opt_in), + optval, user_len); + if (err < 0) + return err; + + if (opt_in.pkt_good || opt_in.pkt_bad) + return -EINVAL; + + if (opt_in.reserved != 0) + return -EINVAL; + + max_keys = opt_in.nkeys; + + if (opt_in.get_all || opt_in.is_current || opt_in.is_rnext) { + if (opt_in.get_all && (opt_in.is_current || opt_in.is_rnext)) + return -EINVAL; + do_address_matching = false; + } + + switch (opt_in.addr.ss_family) { + case AF_INET: { + struct sockaddr_in *sin; + __be32 mask; + + sin = (struct sockaddr_in *)&opt_in.addr; + port = sin->sin_port; + addr = (union tcp_ao_addr *)&sin->sin_addr; + + if (opt_in.prefix > 32) + return -EINVAL; + + if (ntohl(sin->sin_addr.s_addr) == INADDR_ANY && + opt_in.prefix != 0) + return -EINVAL; + + mask = inet_make_mask(opt_in.prefix); + if (sin->sin_addr.s_addr & ~mask) + return -EINVAL; + + break; + } + case AF_INET6: { + struct sockaddr_in6 *sin6; + struct in6_addr *addr6; + + sin6 = (struct sockaddr_in6 *)&opt_in.addr; + addr = (union tcp_ao_addr *)&sin6->sin6_addr; + addr6 = &sin6->sin6_addr; + port = sin6->sin6_port; + + /* We don't have to change family and @addr here if + * ipv6_addr_v4mapped() like in key adding: + * tcp_ao_key_cmp() does it. Do the sanity checks though. + */ + if (opt_in.prefix != 0) { + if (ipv6_addr_v4mapped(addr6)) { + __be32 mask, addr4 = addr6->s6_addr32[3]; + + if (opt_in.prefix > 32 || + ntohl(addr4) == INADDR_ANY) + return -EINVAL; + mask = inet_make_mask(opt_in.prefix); + if (addr4 & ~mask) + return -EINVAL; + } else { + struct in6_addr pfx; + + if (ipv6_addr_any(addr6) || + opt_in.prefix > 128) + return -EINVAL; + + ipv6_addr_prefix(&pfx, addr6, opt_in.prefix); + if (ipv6_addr_cmp(&pfx, addr6)) + return -EINVAL; + } + } else if (!ipv6_addr_any(addr6)) { + return -EINVAL; + } + break; + } + case 0: + if (!do_address_matching) + break; + fallthrough; + default: + return -EAFNOSUPPORT; + } + + if (!do_address_matching) { + /* We could just ignore those, but let's do stricter checks */ + if (addr || port) + return -EINVAL; + if (opt_in.prefix || opt_in.sndid || opt_in.rcvid) + return -EINVAL; + } + + bytes_to_write = min_t(int, user_len, sizeof(struct tcp_ao_getsockopt)); + matched_keys = 0; + /* May change in RX, while we're dumping, pre-fetch it */ + current_key = READ_ONCE(ao_info->current_key); + + hlist_for_each_entry_rcu(key, &ao_info->head, node) { + if (opt_in.get_all) + goto match; + + if (opt_in.is_current || opt_in.is_rnext) { + if (opt_in.is_current && key == current_key) + goto match; + if (opt_in.is_rnext && key == ao_info->rnext_key) + goto match; + continue; + } + + if (tcp_ao_key_cmp(key, addr, opt_in.prefix, + opt_in.addr.ss_family, + opt_in.sndid, opt_in.rcvid) != 0) + continue; +match: + matched_keys++; + if (matched_keys > max_keys) + continue; + + memset(&opt_out, 0, sizeof(struct tcp_ao_getsockopt)); + + if (key->family == AF_INET) { + struct sockaddr_in *sin_out = (struct sockaddr_in *)&opt_out.addr; + + sin_out->sin_family = key->family; + sin_out->sin_port = 0; + memcpy(&sin_out->sin_addr, &key->addr, sizeof(struct in_addr)); + } else { + struct sockaddr_in6 *sin6_out = (struct sockaddr_in6 *)&opt_out.addr; + + sin6_out->sin6_family = key->family; + sin6_out->sin6_port = 0; + memcpy(&sin6_out->sin6_addr, &key->addr, sizeof(struct in6_addr)); + } + opt_out.sndid = key->sndid; + opt_out.rcvid = key->rcvid; + opt_out.prefix = key->prefixlen; + opt_out.keyflags = key->keyflags; + opt_out.is_current = (key == current_key); + opt_out.is_rnext = (key == ao_info->rnext_key); + opt_out.nkeys = 0; + opt_out.maclen = key->maclen; + opt_out.keylen = key->keylen; + opt_out.pkt_good = atomic64_read(&key->pkt_good); + opt_out.pkt_bad = atomic64_read(&key->pkt_bad); + memcpy(&opt_out.key, key->key, key->keylen); + tcp_sigpool_algo(key->tcp_sigpool_id, opt_out.alg_name, 64); + + /* Copy key to user */ + if (copy_to_sockptr_offset(optval, out_offset, + &opt_out, bytes_to_write)) + return -EFAULT; + out_offset += user_len; + } + + optlen_out = (int)sizeof(struct tcp_ao_getsockopt); + if (copy_to_sockptr(optlen, &optlen_out, sizeof(int))) + return -EFAULT; + + out_offset = offsetof(struct tcp_ao_getsockopt, nkeys); + if (copy_to_sockptr_offset(optval, out_offset, + &matched_keys, sizeof(u32))) + return -EFAULT; + + return 0; +} + +int tcp_ao_get_mkts(struct sock *sk, sockptr_t optval, sockptr_t optlen) +{ + struct tcp_ao_info *ao_info; + + ao_info = setsockopt_ao_info(sk); + if (IS_ERR(ao_info)) + return PTR_ERR(ao_info); + if (!ao_info) + return -ENOENT; + + return tcp_ao_copy_mkts_to_user(ao_info, optval, optlen); +} + +int tcp_ao_get_sock_info(struct sock *sk, sockptr_t optval, sockptr_t optlen) +{ + struct tcp_ao_info_opt out, in = {}; + struct tcp_ao_key *current_key; + struct tcp_ao_info *ao; + int err, len; + + if (copy_from_sockptr(&len, optlen, sizeof(int))) + return -EFAULT; + + if (len <= 0) + return -EINVAL; + + /* Copying this "in" only to check ::reserved, ::reserved2, + * that may be needed to extend (struct tcp_ao_info_opt) and + * what getsockopt() provides in future. + */ + err = copy_struct_from_sockptr(&in, sizeof(in), optval, len); + if (err) + return err; + + if (in.reserved != 0 || in.reserved2 != 0) + return -EINVAL; + + ao = setsockopt_ao_info(sk); + if (IS_ERR(ao)) + return PTR_ERR(ao); + if (!ao) + return -ENOENT; + + memset(&out, 0, sizeof(out)); + out.ao_required = ao->ao_required; + out.accept_icmps = ao->accept_icmps; + out.pkt_good = atomic64_read(&ao->counters.pkt_good); + out.pkt_bad = atomic64_read(&ao->counters.pkt_bad); + out.pkt_key_not_found = atomic64_read(&ao->counters.key_not_found); + out.pkt_ao_required = atomic64_read(&ao->counters.ao_required); + out.pkt_dropped_icmp = atomic64_read(&ao->counters.dropped_icmp); + + current_key = READ_ONCE(ao->current_key); + if (current_key) { + out.set_current = 1; + out.current_key = current_key->sndid; + } + if (ao->rnext_key) { + out.set_rnext = 1; + out.rnext = ao->rnext_key->rcvid; + } + + if (copy_to_sockptr(optval, &out, min_t(int, len, sizeof(out)))) + return -EFAULT; + + return 0; +} + From patchwork Mon Oct 23 19:22:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157079 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1510790vqx; Mon, 23 Oct 2023 12:36:25 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGxBgu3QerGH4hQj5prG7QBOH3IDXJr1xmVFi2RbS0tuD2fimrJbOLJxdffmLBzby/sfde9 X-Received: by 2002:a17:90a:31c:b0:27d:68e1:c3d6 with SMTP id 28-20020a17090a031c00b0027d68e1c3d6mr10038391pje.28.1698089785007; Mon, 23 Oct 2023 12:36:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089784; cv=none; d=google.com; s=arc-20160816; b=oaHwC2W1+H+UJzglw1AlOKuBV0hdTnJAUo+IsGimz5Ba+mfzRrnOZN2Rc4eN1fhiBk KS/auanROP9x/OfRD+Q8YXw4gOYCrJka7FdEee4NFCk52wwY9I1uP+oam5YS42xYPPU+ E1RTtFeWaxjnWF6bDvPTaXEEXkXBOgBzk/lNImg9S9sJxfCxtTgniHdfD5C7eSCfVjDz +16rrSfjxEogDBZ1OcVhdgD2DPosX7uKKeGl/K1D+NSQlc1V6L8PSU1ujQQdC5QRXQ1T SRIt4xGt/IKagjdPa6vnyRZK3PKIjMqOeM0Y9+UEhpXm7pa0I4RHoz4DVXVXD+WVMMiG sSfg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=yFuTyTJaPxOlc+JRRqn8TS1+jZrCOoj8k+Va4Di7hSE=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=D+wc7eTdnU5J6PJ3h0sH/i3z8Sa/DGAELF8D2ipjxSh2qEfdDSh2Z9JXx+UhaYI6tb zW4QaJESHYsd6C293pJ0hI1SxwKTkQrIhbtQtmm/mvgceMerOyHerT8WS9e/hqLRjEWq OvpOreQk4eLcfw2Xt0VNhCX3FTKjiWLhOV2Qxda1bB2nj7efVn+fFJY0G6oEzIETdox4 oYnT+eDf6OhzPMldwHQeJ9BgmK7SpekE10YOvYfwQWNEGhC6cuWQTGAjPmiZiy9bQ+B5 v79u+tm2Zqdllok26wXICacknV4NJkK/zhzuw64g0TgCIvWEgvlkS7cpWePhzz4txi/k Y0Rg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="UbS5Tx/F"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from groat.vger.email (groat.vger.email. [2620:137:e000::3:5]) by mx.google.com with ESMTPS id my17-20020a17090b4c9100b00276bf69ac44si7331037pjb.5.2023.10.23.12.36.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:36:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) client-ip=2620:137:e000::3:5; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b="UbS5Tx/F"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id 31A9F8077461; Mon, 23 Oct 2023 12:36:22 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231161AbjJWTgD (ORCPT + 27 others); Mon, 23 Oct 2023 15:36:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60126 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233677AbjJWTXu (ORCPT ); Mon, 23 Oct 2023 15:23:50 -0400 Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 543B119B9 for ; Mon, 23 Oct 2023 12:23:07 -0700 (PDT) Received: by mail-wm1-x32b.google.com with SMTP id 5b1f17b1804b1-4083f613272so31358255e9.1 for ; Mon, 23 Oct 2023 12:23:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088983; x=1698693783; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=yFuTyTJaPxOlc+JRRqn8TS1+jZrCOoj8k+Va4Di7hSE=; b=UbS5Tx/FTwa44h+qaKvnKCeSPT2M013EBqImC5t66Y52QoAXzy2r19gCcxpMfu5QLz J8guOTXOagpwEsNAForVAYKrU4bG6xJTbBc++vay8XjTJgCSUf4m1EgmTfII9RRX8UWx oW8QL0+7Wexy5Z8jj9NIAFu+8+VUjKaCQoGox4IhW24OdoqK9EwjMyAYtoqOE5a8NmbO lEzNDPZ7ry9F26KPKR4bAS3WqFkMF7fyG9UwNlsiGrQXpPtPxurHjd+WxLDGWRi2amBo YrTE1bqyz3KKUpjzg/FElTpKvMjpWFYopaYQFf4YMT0ekqAsKBGdeySVS8U7dfnW9k9P kdcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088983; x=1698693783; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yFuTyTJaPxOlc+JRRqn8TS1+jZrCOoj8k+Va4Di7hSE=; b=I63Ayip7GIydDw09hhbWhbluMbEKiDEZNyFYLB37V7OgAjclc72etiNtKSpn/Mnhlm +s7TmfyHVJgF9fldstkxttmHRt0XJ25ZsJbtw8BIGlLJLg83H2oaaRs282slMx7ALXyw 3D9OQtxNdfT1mo/6i/oVVNYKFfHc/BiIMhzEkPxBPKvVI0lMVjoSMVMWV8EsLB6JE+kV jMBLqfAup+enNq34ob7ZUMQtCnbsmneXvFaCoo3G4d3Q/lZsao3SjA6vqxQyRwMphhem vvA2hPsDmNQRpNTJsT4M2ejS8YerroOw1CnoD37VDAbjMllxPQkkGSXOLfTMNReFBh10 uJTQ== X-Gm-Message-State: AOJu0Yx9cvVllQqAU0ANCTdICeJKrlBIB27Hh13VRPPzgW3qVsIbifmV Zi97VUGBKRTgwpZNOkpzGI4uwg== X-Received: by 2002:a05:600c:4748:b0:409:325:e499 with SMTP id w8-20020a05600c474800b004090325e499mr2352197wmo.32.1698088982727; Mon, 23 Oct 2023 12:23:02 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.23.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:02 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 19/23] net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs) Date: Mon, 23 Oct 2023 20:22:11 +0100 Message-ID: <20231023192217.426455-20-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:36:22 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780576194599865924 X-GMAIL-MSGID: 1780576194599865924 Delete becomes very, very fast - almost free, but after setsockopt() syscall returns, the key is still alive until next RCU grace period. Which is fine for listen sockets as userspace needs to be aware of setsockopt(TCP_AO) and accept() race and resolve it with verification by getsockopt() after TCP connection was accepted. The benchmark results (on non-loaded box, worse with more RCU work pending): > ok 33 Worst case delete 16384 keys: min=5ms max=10ms mean=6.93904ms stddev=0.263421 > ok 34 Add a new key 16384 keys: min=1ms max=4ms mean=2.17751ms stddev=0.147564 > ok 35 Remove random-search 16384 keys: min=5ms max=10ms mean=6.50243ms stddev=0.254999 > ok 36 Remove async 16384 keys: min=0ms max=0ms mean=0.0296107ms stddev=0.0172078 Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/uapi/linux/tcp.h | 3 ++- net/ipv4/tcp_ao.c | 21 ++++++++++++++++++--- 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 201b3cbd6020..be34d7c5c531 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -396,7 +396,8 @@ struct tcp_ao_del { /* setsockopt(TCP_AO_DEL_KEY) */ __s32 ifindex; /* L3 dev index for VRF */ __u32 set_current :1, /* corresponding ::current_key */ set_rnext :1, /* corresponding ::rnext */ - reserved :30; /* must be 0 */ + del_async :1, /* only valid for listen sockets */ + reserved :29; /* must be 0 */ __u16 reserved2; /* padding, must be 0 */ __u8 prefix; /* peer's address prefix */ __u8 sndid; /* SendID for outgoing segments */ diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index cbc1ee0f5b9a..acbeb635fe29 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -1628,7 +1628,7 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, } static int tcp_ao_delete_key(struct sock *sk, struct tcp_ao_info *ao_info, - struct tcp_ao_key *key, + bool del_async, struct tcp_ao_key *key, struct tcp_ao_key *new_current, struct tcp_ao_key *new_rnext) { @@ -1636,11 +1636,24 @@ static int tcp_ao_delete_key(struct sock *sk, struct tcp_ao_info *ao_info, hlist_del_rcu(&key->node); + /* Support for async delete on listening sockets: as they don't + * need current_key/rnext_key maintaining, we don't need to check + * them and we can just free all resources in RCU fashion. + */ + if (del_async) { + atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc); + call_rcu(&key->rcu, tcp_ao_key_free_rcu); + return 0; + } + /* At this moment another CPU could have looked this key up * while it was unlinked from the list. Wait for RCU grace period, * after which the key is off-list and can't be looked up again; * the rx path [just before RCU came] might have used it and set it * as current_key (very unlikely). + * Free the key with next RCU grace period (in case it was + * current_key before tcp_ao_current_rnext() might have + * changed it in forced-delete). */ synchronize_rcu(); if (new_current) @@ -1711,6 +1724,8 @@ static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family, if (!new_rnext) return -ENOENT; } + if (cmd.del_async && sk->sk_state != TCP_LISTEN) + return -EINVAL; if (family == AF_INET) { struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.addr; @@ -1758,8 +1773,8 @@ static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family, if (key == new_current || key == new_rnext) continue; - return tcp_ao_delete_key(sk, ao_info, key, - new_current, new_rnext); + return tcp_ao_delete_key(sk, ao_info, cmd.del_async, key, + new_current, new_rnext); } return -ENOENT; } From patchwork Mon Oct 23 19:22:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157069 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1505923vqx; Mon, 23 Oct 2023 12:26:24 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGjkdc8J/ZSphc9JxHNbJGJyH6D15QMTjFeH9FLkePCr+8vD9MolRQxygW51H84FCq+wNjv X-Received: by 2002:a17:902:e887:b0:1c6:3222:c67c with SMTP id w7-20020a170902e88700b001c63222c67cmr9282901plg.23.1698089184666; Mon, 23 Oct 2023 12:26:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089184; cv=none; d=google.com; s=arc-20160816; b=nkJ+DZ9GOCr79N5n8mWKik/mqQ5DXaapfPguA65QjOvo0LI8N51lFTcUJRlfPMBlps mpu1IyCYi0LAUYoOasUPF/9klLw/e23K9Gepz1DVdHVFB6j+PAbA9bRofmYp4/Aa/LKs VHriGQdBuXre7qf66kb12Bg9u7XN0oV0am1XDxsAbO89ErWwUQWtbH91t2z/2dJ4niCX yRKPOIdJkWTSj4CPWZZMI97Im/RRCUphTNoI/TXA6UJFCnKPltV4D6sUWcTx4yibumI2 IGpsK66s8uvfQ//A67nwTuL0mrXsQuQxGoxBISyAov98RQtdcTiR1ciq0Y/dRRQdkNAr VZrg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=2Z5Vx8wsuYApYg+k2HPemPqD4zXPQfaLOK27W0kQOf8=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=Ug4R+2fQ2xDSbdb6NL18hGb5Z6eukiPQhkDj4anHNXqdpVpYvfBlAaCi1z6H/QRXQ3 qKe2sv6+rohPN/6b0jy9358BsLtKg24cyWq90A2Tb4E6F8iIU1G1tlNGbbFA9cwJBrvX TJFe1fNwcNvaEVYs3k+Ofejfa+G7i9ea+GUzhujZ4oVSkhYHw1+9uADy2wD9dU9ZMpd4 wfa6W3MIGNjT0Vx7Ip2jcZrQ7qv8Wx5THlOc1Ed3NQtCgMTVzksgaFX1C38XrdEHqQnc oHBSodv99ym7iLmT9rdYRmIy68RZqv+wsWITwKf/sQ9RBfryBdUCOxKRPhIizy9sJXoz cMPw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=DADgZLlH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from groat.vger.email (groat.vger.email. [2620:137:e000::3:5]) by mx.google.com with ESMTPS id j16-20020a170902759000b001c6189eaaebsi6619791pll.186.2023.10.23.12.26.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:26:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) client-ip=2620:137:e000::3:5; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=DADgZLlH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:5 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id 989308079C82; Mon, 23 Oct 2023 12:26:20 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229533AbjJWTZA (ORCPT + 27 others); Mon, 23 Oct 2023 15:25:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231686AbjJWTYx (ORCPT ); Mon, 23 Oct 2023 15:24:53 -0400 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C27061BC0 for ; Mon, 23 Oct 2023 12:23:08 -0700 (PDT) Received: by mail-wm1-x333.google.com with SMTP id 5b1f17b1804b1-4083740f92dso30482205e9.3 for ; Mon, 23 Oct 2023 12:23:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088984; x=1698693784; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2Z5Vx8wsuYApYg+k2HPemPqD4zXPQfaLOK27W0kQOf8=; b=DADgZLlHTm+5ehmQ6erOyj8N+sZm7w3By9IdDS2spVbIkEM4A35b8xJwjzNGRXfUqx ODjNUfW3Klo6QxmjgvpSk26fcOSx9evP+FmSF19oJr3/Sve2Q2RGyGYTgN8A6wK8x0wz Lvq/Qgxa2R0ebF8tnbgLYMw2sqd7PNkkcTaK7aPQzYBe12/5xhS9Xj+i77XdYyCP8j6t 5oTHgoEq9RB4fY0Rk8yJ5B3d+JjNYH3J/1u7pSQNqg7xn+5K6DFt6ySGPsXrnUt8C1Vh mcr+BOqIWtR8ellUJofdj5B/FGRbI8JN12gQNSBo1xOH633StaMdPsuHg4YVaDoyOUWP UIEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088984; x=1698693784; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2Z5Vx8wsuYApYg+k2HPemPqD4zXPQfaLOK27W0kQOf8=; b=P7G8znzZk29ehtEiNPZiHt/GxPKPgWzWhDiWyjSfXWl4Zdyn1DFSK1kbSzHapNHBc4 jGTVWc8voBAdWac1f6zFcjXeU3zbs3K400PbXMDf4dAz/9jBJymrurvxPRpZoBAldRy7 Zz4eMAPnPEo7lu3jn2geAqwRaFVc0zqtv8bl6zD2bU6OnDM/YnTH/u5gze5yuO5XV6Fo uskRF0XPwv/chGSGNv+rnOmbJIHsE9VuBKlsrdj1ic4wZAFxTeb4D8piHYeYOAlpv5rj 3SLAfqh97yhCLfdk25yfOQShoV1WC22mU4Pqb0VVERrNYWniR3NUQBM+wSQX6pdDwM1v JqbQ== X-Gm-Message-State: AOJu0YzvD7GtlTH9332GNUhE8p3r9jrmTZSGL1P/W3qk1s52zu5Fmkdf l2lYhnpneUyZdx9T/03DK5XTPg== X-Received: by 2002:a05:600c:a07:b0:405:3251:47a1 with SMTP id z7-20020a05600c0a0700b00405325147a1mr7360481wmp.40.1698088984548; Mon, 23 Oct 2023 12:23:04 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.23.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:04 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 20/23] net/tcp: Add static_key for TCP-AO Date: Mon, 23 Oct 2023 20:22:12 +0100 Message-ID: <20231023192217.426455-21-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:26:20 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575564839866365 X-GMAIL-MSGID: 1780575564839866365 Similarly to TCP-MD5, add a static key to TCP-AO that is patched out when there are no keys on a machine and dynamically enabled with the first setsockopt(TCP_AO) adds a key on any socket. The static key is as well dynamically disabled later when the socket is destructed. The lifetime of enabled static key here is the same as ao_info: it is enabled on allocation, passed over from full socket to twsk and destructed when ao_info is scheduled for destruction. Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 24 +++++++++++++++-------- include/net/tcp_ao.h | 2 ++ net/ipv4/tcp_ao.c | 22 +++++++++++++++++++++ net/ipv4/tcp_input.c | 46 +++++++++++++++++++++++++++++--------------- net/ipv4/tcp_ipv4.c | 25 +++++++++++++----------- net/ipv6/tcp_ipv6.c | 25 +++++++++++++----------- 6 files changed, 98 insertions(+), 46 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index c93ac6cc12c4..1635f0859ab5 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2286,14 +2286,18 @@ static inline void tcp_get_current_key(const struct sock *sk, #if defined(CONFIG_TCP_AO) || defined(CONFIG_TCP_MD5SIG) const struct tcp_sock *tp = tcp_sk(sk); #endif -#ifdef CONFIG_TCP_AO - struct tcp_ao_info *ao; - ao = rcu_dereference_protected(tp->ao_info, lockdep_sock_is_held(sk)); - if (ao) { - out->ao_key = READ_ONCE(ao->current_key); - out->type = TCP_KEY_AO; - return; +#ifdef CONFIG_TCP_AO + if (static_branch_unlikely(&tcp_ao_needed.key)) { + struct tcp_ao_info *ao; + + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held(sk)); + if (ao) { + out->ao_key = READ_ONCE(ao->current_key); + out->type = TCP_KEY_AO; + return; + } } #endif #ifdef CONFIG_TCP_MD5SIG @@ -2322,7 +2326,8 @@ static inline bool tcp_key_is_md5(const struct tcp_key *key) static inline bool tcp_key_is_ao(const struct tcp_key *key) { #ifdef CONFIG_TCP_AO - if (key->type == TCP_KEY_AO) + if (static_branch_unlikely(&tcp_ao_needed.key) && + key->type == TCP_KEY_AO) return true; #endif return false; @@ -2716,6 +2721,9 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr, struct tcp_ao_info *ao_info; struct tcp_ao_key *ao_key; + if (!static_branch_unlikely(&tcp_ao_needed.key)) + return false; + ao_info = rcu_dereference_check(tcp_sk(sk)->ao_info, lockdep_sock_is_held(sk)); if (!ao_info) diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 061c358a3c8a..a38408072ea8 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -151,6 +151,8 @@ do { \ #ifdef CONFIG_TCP_AO /* TCP-AO structures and functions */ +#include +extern struct static_key_false_deferred tcp_ao_needed; struct tcp4_ao_context { __be32 saddr; diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index acbeb635fe29..ffce8ca60ff2 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -17,6 +17,8 @@ #include #include +DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ); + int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx, unsigned int len, struct tcp_sigpool *hp) { @@ -50,6 +52,9 @@ bool tcp_ao_ignore_icmp(const struct sock *sk, int family, int type, int code) bool ignore_icmp = false; struct tcp_ao_info *ao; + if (!static_branch_unlikely(&tcp_ao_needed.key)) + return false; + /* RFC5925, 7.8: * >> A TCP-AO implementation MUST default to ignore incoming ICMPv4 * messages of Type 3 (destination unreachable), Codes 2-4 (protocol @@ -185,6 +190,9 @@ static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk, struct tcp_ao_key *key; struct tcp_ao_info *ao; + if (!static_branch_unlikely(&tcp_ao_needed.key)) + return NULL; + ao = rcu_dereference_check(tcp_sk(sk)->ao_info, lockdep_sock_is_held(sk)); if (!ao) @@ -276,6 +284,7 @@ void tcp_ao_destroy_sock(struct sock *sk, bool twsk) } kfree_rcu(ao, rcu); + static_branch_slow_dec_deferred(&tcp_ao_needed); } void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp) @@ -1180,6 +1189,11 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk, goto free_and_exit; } + if (!static_key_fast_inc_not_disabled(&tcp_ao_needed.key.key)) { + ret = -EUSERS; + goto free_and_exit; + } + key_head = rcu_dereference(hlist_first_rcu(&new_ao->head)); first_key = hlist_entry_safe(key_head, struct tcp_ao_key, node); @@ -1607,6 +1621,10 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, tcp_ao_link_mkt(ao_info, key); if (first) { + if (!static_branch_inc(&tcp_ao_needed.key)) { + ret = -EUSERS; + goto err_free_sock; + } sk_gso_disable(sk); rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info); } @@ -1875,6 +1893,10 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, if (new_rnext) WRITE_ONCE(ao_info->rnext_key, new_rnext); if (first) { + if (!static_branch_inc(&tcp_ao_needed.key)) { + err = -EUSERS; + goto out; + } sk_gso_disable(sk); rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info); } diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 8132577bdfba..232b20e2643c 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3571,41 +3571,55 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp, (ack_seq == tp->snd_wl1 && (nwin > tp->snd_wnd || !nwin)); } -/* If we update tp->snd_una, also update tp->bytes_acked */ -static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack) +static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack) { - u32 delta = ack - tp->snd_una; #ifdef CONFIG_TCP_AO struct tcp_ao_info *ao; -#endif - sock_owned_by_me((struct sock *)tp); - tp->bytes_acked += delta; -#ifdef CONFIG_TCP_AO + if (!static_branch_unlikely(&tcp_ao_needed.key)) + return; + ao = rcu_dereference_protected(tp->ao_info, lockdep_sock_is_held((struct sock *)tp)); if (ao && ack < tp->snd_una) ao->snd_sne++; #endif +} + +/* If we update tp->snd_una, also update tp->bytes_acked */ +static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack) +{ + u32 delta = ack - tp->snd_una; + + sock_owned_by_me((struct sock *)tp); + tp->bytes_acked += delta; + tcp_snd_sne_update(tp, ack); tp->snd_una = ack; } +static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq) +{ +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; + + if (!static_branch_unlikely(&tcp_ao_needed.key)) + return; + + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held((struct sock *)tp)); + if (ao && seq < tp->rcv_nxt) + ao->rcv_sne++; +#endif +} + /* If we update tp->rcv_nxt, also update tp->bytes_received */ static void tcp_rcv_nxt_update(struct tcp_sock *tp, u32 seq) { u32 delta = seq - tp->rcv_nxt; -#ifdef CONFIG_TCP_AO - struct tcp_ao_info *ao; -#endif sock_owned_by_me((struct sock *)tp); tp->bytes_received += delta; -#ifdef CONFIG_TCP_AO - ao = rcu_dereference_protected(tp->ao_info, - lockdep_sock_is_held((struct sock *)tp)); - if (ao && seq < tp->rcv_nxt) - ao->rcv_sne++; -#endif + tcp_rcv_sne_update(tp, seq); WRITE_ONCE(tp->rcv_nxt, seq); } diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 8f98c58e2689..18c5595e3814 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1024,18 +1024,20 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb) #ifdef CONFIG_TCP_AO struct tcp_ao_info *ao_info; - /* FIXME: the segment to-be-acked is not verified yet */ - ao_info = rcu_dereference(tcptw->ao_info); - if (ao_info) { - const struct tcp_ao_hdr *aoh; + if (static_branch_unlikely(&tcp_ao_needed.key)) { + /* FIXME: the segment to-be-acked is not verified yet */ + ao_info = rcu_dereference(tcptw->ao_info); + if (ao_info) { + const struct tcp_ao_hdr *aoh; - if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) { - inet_twsk_put(tw); - return; + if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) { + inet_twsk_put(tw); + return; + } + + if (aoh) + key.ao_key = tcp_ao_established_key(ao_info, aoh->rnext_keyid, -1); } - - if (aoh) - key.ao_key = tcp_ao_established_key(ao_info, aoh->rnext_keyid, -1); } if (key.ao_key) { struct tcp_ao_key *rnext_key; @@ -1081,7 +1083,8 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, tcp_sk(sk)->snd_nxt; #ifdef CONFIG_TCP_AO - if (tcp_rsk_used_ao(req)) { + if (static_branch_unlikely(&tcp_ao_needed.key) && + tcp_rsk_used_ao(req)) { const union tcp_md5_addr *addr; const struct tcp_ao_hdr *aoh; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 2b8e87429e24..40af0a1a228d 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1154,17 +1154,19 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) #ifdef CONFIG_TCP_AO struct tcp_ao_info *ao_info; - /* FIXME: the segment to-be-acked is not verified yet */ - ao_info = rcu_dereference(tcptw->ao_info); - if (ao_info) { - const struct tcp_ao_hdr *aoh; + if (static_branch_unlikely(&tcp_ao_needed.key)) { - /* Invalid TCP option size or twice included auth */ - if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) - goto out; - if (aoh) { - key.ao_key = tcp_ao_established_key(ao_info, - aoh->rnext_keyid, -1); + /* FIXME: the segment to-be-acked is not verified yet */ + ao_info = rcu_dereference(tcptw->ao_info); + if (ao_info) { + const struct tcp_ao_hdr *aoh; + + /* Invalid TCP option size or twice included auth */ + if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) + goto out; + if (aoh) + key.ao_key = tcp_ao_established_key(ao_info, + aoh->rnext_keyid, -1); } } if (key.ao_key) { @@ -1206,7 +1208,8 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, struct tcp_key key = {}; #ifdef CONFIG_TCP_AO - if (tcp_rsk_used_ao(req)) { + if (static_branch_unlikely(&tcp_ao_needed.key) && + tcp_rsk_used_ao(req)) { const struct in6_addr *addr = &ipv6_hdr(skb)->saddr; const struct tcp_ao_hdr *aoh; int l3index; From patchwork Mon Oct 23 19:22:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157064 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1505041vqx; Mon, 23 Oct 2023 12:24:29 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFyWDIHKHEC0I+ckReRnxPDXfomBGAqxNIHWr5G4y7yBIelppfxB6pdvJFdvJAbeKMKQjo0 X-Received: by 2002:a05:6358:6e93:b0:168:e78c:e3a7 with SMTP id q19-20020a0563586e9300b00168e78ce3a7mr1917929rwm.18.1698089069641; Mon, 23 Oct 2023 12:24:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089069; cv=none; d=google.com; s=arc-20160816; b=p4XvYL9l5pwWCSy7iIO3qJ1KQzukhgHPqaT8NlN586gBLu+obmgh5dUFgzZS7sJczr rBS4oIcep3sHHlL+lDX9Gb5LUKxhC/K9tzegi655wai3iVCadbFyOkWpSZDRvbUUdFgM FK7ze0U4Zyfi1soBoNqzRTkx6KBoB0xX6gKiA4q4Zi0mOlunBpZr7QoZ0U84d9brOUpQ /cqQlM2vCpyp0WI148C4GSYqadsPmumMBHI/aoqIWg1qJuoWh/WWqWhqdU65C4AdEdR6 Uv62LlU0LgFHEnpCWH/LQX3WEzDfntz6c7ThFiznwConQm01+CXie7vQKuiOlNAz88X7 0hTg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=rt5Zjq63J/79JACNSxD/qWmiLIkhJTTfMGQLn0zE6eM=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=uM/ddD+7Qz04AhOg13Zew1/DKBcGFW7azDg+o0M4W1pJlu/E5LJHXECCtrDYi52mMH VbH7P6TgSaEDRii743IjuzDKoGt8QHRpWJUqT/cSRcie6c0dN/bX4DhAn225fiHYgU4X bElNDlHISZBIA20r5uPpf6n/mkMOThYN/mNu30INuwRYN64MuOf39Cqu8ucRdmla0X0e pQDNFyx2XjZPieD5uxFtS/e/w+DaNH8MiaJ84qbi4G0ljdKKVRYb3Ndiu5Tjmr5S5DPV 17GSLZi+f0BwqE276X/Ls2nFWQ5b2ph26Py0vkayTbyg3gcAC7fnLRnQ+eLzjo8a4vgB PJmw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=Un7DPegX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id w129-20020a636287000000b0056c0e3c77f7si6935368pgb.805.2023.10.23.12.24.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:24:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=Un7DPegX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 311CF80C0360; Mon, 23 Oct 2023 12:24:24 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233669AbjJWTYU (ORCPT + 27 others); Mon, 23 Oct 2023 15:24:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38532 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233795AbjJWTYA (ORCPT ); Mon, 23 Oct 2023 15:24:00 -0400 Received: from mail-lf1-x132.google.com (mail-lf1-x132.google.com [IPv6:2a00:1450:4864:20::132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDD031BDC for ; Mon, 23 Oct 2023 12:23:11 -0700 (PDT) Received: by mail-lf1-x132.google.com with SMTP id 2adb3069b0e04-507bd64814fso5395602e87.1 for ; Mon, 23 Oct 2023 12:23:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088987; x=1698693787; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=rt5Zjq63J/79JACNSxD/qWmiLIkhJTTfMGQLn0zE6eM=; b=Un7DPegXgzKv32Mg+vueeOVLci8yDVuhtf6P10C0YsLRZeJQUSxop14hodg0pzMVC2 KB+XyWmwt0mOB90YkB/WGwuG+l+WHzv21u1UYC85lWVAJGifQum1fCLbuYc9/tHv0atb znjKureM1P/XMrDvGXOmWd0n4Aq8oiZpo0NJ3z1VE92aBsseWTDXK1I7Udj2yC0TmXGj qLmnOJeZD375cn8JytYF3E8mHi2ZDWXAzdnT2vAUGGBTHmlNOMChjpuIGyeb5JUBNvWj 8IBC7Zops2G3Hx7Z1incgFUASIhZiy5hkjn5CJhhQoMyf6eNf+hKv4UhW0wUcF2VQlWJ WOhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088987; x=1698693787; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rt5Zjq63J/79JACNSxD/qWmiLIkhJTTfMGQLn0zE6eM=; b=YvTEFgcnk8bmAWvGnUx+Q+8BhsjAg7VLNdHVZS5zPvWSMu3zaNIMjso2BfgrMfK7Jk dIWyWK4de8L/if903PK7PEN542rENjWXLATP2KxBZTFyBPIJlqadHbnqiAdR8B6xYubi bp1kClYPGGUs3hCW+T1zQEh4Zbe3qA83wRQ6bjTvk2uLZ69yEs/qJogklNI5BaYefl2c E3VflWdXeNQ3OsmmO+u1U6UP+izELDR5dD5jSYUGLxJstv+sZMU1QBx16UquL+ffXpk7 oOuPfxLUodmZ0Wz3kEoeMhFVgVIHqUAiYNjwv+bWcDvEOOxOvi8u7r0fB+vpy3JaK/1C jSOg== X-Gm-Message-State: AOJu0YxYFEWTLbPUChChMG+4aap4+EZ5rvDy4L15QZQkah5gEeKCN1/d ajIc+zrKgRA6EezZ2Ircd4DuEQ== X-Received: by 2002:ac2:4d1b:0:b0:500:d970:6541 with SMTP id r27-20020ac24d1b000000b00500d9706541mr5873713lfi.39.1698088986627; Mon, 23 Oct 2023 12:23:06 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.23.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:06 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 21/23] net/tcp: Wire up l3index to TCP-AO Date: Mon, 23 Oct 2023 20:22:13 +0100 Message-ID: <20231023192217.426455-22-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:24:24 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575444466053699 X-GMAIL-MSGID: 1780575444466053699 Similarly how TCP_MD5SIG_FLAG_IFINDEX works for TCP-MD5, TCP_AO_KEYF_IFINDEX is an AO-key flag that binds that MKT to a specified by L3 ifinndex. Similarly, without this flag the key will work in the default VRF l3index = 0 for connections. To prevent AO-keys from overlapping, it's restricted to add key B for a socket that has key A, which have the same sndid/rcvid and one of the following is true: - !(A.keyflags & TCP_AO_KEYF_IFINDEX) or !(B.keyflags & TCP_AO_KEYF_IFINDEX) so that any key is non-bound to a VRF - A.l3index == B.l3index both want to work for the same VRF Additionally, it's restricted to match TCP-MD5 keys for the same peer the following way: |--------------|--------------------|----------------|---------------| | | MD5 key without | MD5 key | MD5 key | | | l3index | l3index=0 | l3index=N | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | without | reject | reject | reject | | l3index | | | | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | l3index=0 | reject | reject | allow | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | l3index=N | reject | allow | reject | |--------------|--------------------|----------------|---------------| This is done with the help of tcp_md5_do_lookup_any_l3index() to reject adding AO key without TCP_AO_KEYF_IFINDEX if there's TCP-MD5 in any VRF. This is important for case where sysctl_tcp_l3mdev_accept = 1 Similarly, for TCP-AO lookups tcp_ao_do_lookup() may be used with l3index < 0, so that __tcp_ao_key_cmp() will match TCP-AO key in any VRF. Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp.h | 11 +-- include/net/tcp_ao.h | 18 ++--- net/ipv4/syncookies.c | 6 +- net/ipv4/tcp_ao.c | 170 +++++++++++++++++++++++++++++++----------- net/ipv4/tcp_ipv4.c | 10 ++- net/ipv6/syncookies.c | 5 +- net/ipv6/tcp_ao.c | 21 +++--- net/ipv6/tcp_ipv6.c | 15 +++- 8 files changed, 177 insertions(+), 79 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 1635f0859ab5..f53878342ea9 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2715,7 +2715,7 @@ static inline int tcp_parse_auth_options(const struct tcphdr *th, } static inline bool tcp_ao_required(struct sock *sk, const void *saddr, - int family, bool stat_inc) + int family, int l3index, bool stat_inc) { #ifdef CONFIG_TCP_AO struct tcp_ao_info *ao_info; @@ -2729,7 +2729,7 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr, if (!ao_info) return false; - ao_key = tcp_ao_do_lookup(sk, saddr, family, -1, -1); + ao_key = tcp_ao_do_lookup(sk, l3index, saddr, family, -1, -1); if (ao_info->ao_required || ao_key) { if (stat_inc) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOREQUIRED); @@ -2782,21 +2782,22 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req, * the last key is impossible to remove, so there's * always at least one current_key. */ - if (tcp_ao_required(sk, saddr, family, true)) { + if (tcp_ao_required(sk, saddr, family, l3index, true)) { tcp_hash_fail("AO hash is required, but not found", family, skb, "L3 index %d", l3index); return SKB_DROP_REASON_TCP_AONOTFOUND; } if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND); - tcp_hash_fail("MD5 Hash not found", family, skb, ""); + tcp_hash_fail("MD5 Hash not found", + family, skb, "L3 index %d", l3index); return SKB_DROP_REASON_TCP_MD5NOTFOUND; } return SKB_NOT_DROPPED_YET; } if (aoh) - return tcp_inbound_ao_hash(sk, skb, family, req, aoh); + return tcp_inbound_ao_hash(sk, skb, family, req, l3index, aoh); return tcp_inbound_md5_hash(sk, skb, saddr, daddr, family, l3index, md5_location); diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index a38408072ea8..edd6748b2cfa 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -33,6 +33,7 @@ struct tcp_ao_key { u8 key[TCP_AO_MAXKEYLEN] __tcp_ao_key_align; unsigned int tcp_sigpool_id; unsigned int digest_size; + int l3index; u8 prefixlen; u8 family; u8 keylen; @@ -200,10 +201,10 @@ int tcp_ao_get_mkts(struct sock *sk, sockptr_t optval, sockptr_t optlen); int tcp_ao_get_sock_info(struct sock *sk, sockptr_t optval, sockptr_t optlen); enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, - const struct request_sock *req, + const struct request_sock *req, int l3index, const struct tcp_ao_hdr *aoh); u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq); -struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, +struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, int l3index, const union tcp_ao_addr *addr, int family, int sndid, int rcvid); int tcp_ao_hash_hdr(unsigned short family, char *ao_hash, @@ -245,9 +246,6 @@ int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key, __be32 disn, bool send); int tcp_v6_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key, struct request_sock *req); -struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk, - const struct in6_addr *addr, - int sndid, int rcvid); struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid); struct tcp_ao_key *tcp_v6_ao_lookup_rsk(const struct sock *sk, @@ -265,7 +263,7 @@ void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb); void tcp_ao_connect_init(struct sock *sk); void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, struct tcp_request_sock *treq, - unsigned short int family); + unsigned short int family, int l3index); #else /* CONFIG_TCP_AO */ static inline int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, @@ -277,7 +275,7 @@ static inline int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, struct tcp_request_sock *treq, - unsigned short int family) + unsigned short int family, int l3index) { } @@ -289,13 +287,15 @@ static inline bool tcp_ao_ignore_icmp(const struct sock *sk, int family, static inline enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, - const struct request_sock *req, const struct tcp_ao_hdr *aoh) + const struct request_sock *req, int l3index, + const struct tcp_ao_hdr *aoh) { return SKB_NOT_DROPPED_YET; } static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, - const union tcp_ao_addr *addr, int family, int sndid, int rcvid) + int l3index, const union tcp_ao_addr *addr, + int family, int sndid, int rcvid) { return NULL; } diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 0681d3e82b11..98b25e5d147b 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -344,6 +344,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) __u8 rcv_wscale; struct flowi4 fl4; u32 tsoff = 0; + int l3index; if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) || !th->ack || th->rst) @@ -400,13 +401,14 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) treq->snt_synack = 0; treq->tfo_listener = false; - tcp_ao_syncookie(sk, skb, treq, AF_INET); - if (IS_ENABLED(CONFIG_SMC)) ireq->smc_ok = 0; ireq->ir_iif = inet_request_bound_dev_if(sk, skb); + l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif); + tcp_ao_syncookie(sk, skb, treq, AF_INET, l3index); + /* We throwed the options of the initial SYN away, so we hope * the ACK carries the same options again (see RFC1122 4.2.3.8) */ diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index ffce8ca60ff2..b5ac3e73e1da 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -136,7 +136,7 @@ static int ipv4_prefix_cmp(const struct in_addr *addr1, return memcmp(&a1, &a2, sizeof(a1)); } -static int __tcp_ao_key_cmp(const struct tcp_ao_key *key, +static int __tcp_ao_key_cmp(const struct tcp_ao_key *key, int l3index, const union tcp_ao_addr *addr, u8 prefixlen, int family, int sndid, int rcvid) { @@ -144,6 +144,10 @@ static int __tcp_ao_key_cmp(const struct tcp_ao_key *key, return (key->sndid > sndid) ? 1 : -1; if (rcvid >= 0 && key->rcvid != rcvid) return (key->rcvid > rcvid) ? 1 : -1; + if (l3index >= 0 && (key->keyflags & TCP_AO_KEYF_IFINDEX)) { + if (key->l3index != l3index) + return (key->l3index > l3index) ? 1 : -1; + } if (family == AF_UNSPEC) return 0; @@ -168,7 +172,7 @@ static int __tcp_ao_key_cmp(const struct tcp_ao_key *key, return -1; } -static int tcp_ao_key_cmp(const struct tcp_ao_key *key, +static int tcp_ao_key_cmp(const struct tcp_ao_key *key, int l3index, const union tcp_ao_addr *addr, u8 prefixlen, int family, int sndid, int rcvid) { @@ -176,14 +180,16 @@ static int tcp_ao_key_cmp(const struct tcp_ao_key *key, if (family == AF_INET6 && ipv6_addr_v4mapped(&addr->a6)) { __be32 addr4 = addr->a6.s6_addr32[3]; - return __tcp_ao_key_cmp(key, (union tcp_ao_addr *)&addr4, + return __tcp_ao_key_cmp(key, l3index, + (union tcp_ao_addr *)&addr4, prefixlen, AF_INET, sndid, rcvid); } #endif - return __tcp_ao_key_cmp(key, addr, prefixlen, family, sndid, rcvid); + return __tcp_ao_key_cmp(key, l3index, addr, + prefixlen, family, sndid, rcvid); } -static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk, +static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk, int l3index, const union tcp_ao_addr *addr, int family, u8 prefix, int sndid, int rcvid) { @@ -201,17 +207,18 @@ static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk, hlist_for_each_entry_rcu(key, &ao->head, node) { u8 prefixlen = min(prefix, key->prefixlen); - if (!tcp_ao_key_cmp(key, addr, prefixlen, family, sndid, rcvid)) + if (!tcp_ao_key_cmp(key, l3index, addr, prefixlen, + family, sndid, rcvid)) return key; } return NULL; } -struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, +struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, int l3index, const union tcp_ao_addr *addr, int family, int sndid, int rcvid) { - return __tcp_ao_do_lookup(sk, addr, family, U8_MAX, sndid, rcvid); + return __tcp_ao_do_lookup(sk, l3index, addr, family, U8_MAX, sndid, rcvid); } static struct tcp_ao_info *tcp_ao_alloc_info(gfp_t flags) @@ -677,18 +684,22 @@ struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk, struct request_sock *req, int sndid, int rcvid) { - union tcp_ao_addr *addr = - (union tcp_ao_addr *)&inet_rsk(req)->ir_rmt_addr; + struct inet_request_sock *ireq = inet_rsk(req); + union tcp_ao_addr *addr = (union tcp_ao_addr *)&ireq->ir_rmt_addr; + int l3index; - return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid); + l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif); + return tcp_ao_do_lookup(sk, l3index, addr, AF_INET, sndid, rcvid); } struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid) { + int l3index = l3mdev_master_ifindex_by_index(sock_net(sk), + addr_sk->sk_bound_dev_if); union tcp_ao_addr *addr = (union tcp_ao_addr *)&addr_sk->sk_daddr; - return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid); + return tcp_ao_do_lookup(sk, l3index, addr, AF_INET, sndid, rcvid); } int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, @@ -738,7 +749,8 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, ao_info = rcu_dereference(tcp_sk(sk)->ao_info); if (!ao_info) return -ENOENT; - *key = tcp_ao_do_lookup(sk, addr, family, -1, aoh->rnext_keyid); + *key = tcp_ao_do_lookup(sk, l3index, addr, family, + -1, aoh->rnext_keyid); if (!*key) return -ENOENT; *traffic_key = kmalloc(tcp_ao_digest_size(*key), GFP_ATOMIC); @@ -814,24 +826,26 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb, static struct tcp_ao_key *tcp_ao_inbound_lookup(unsigned short int family, const struct sock *sk, const struct sk_buff *skb, - int sndid, int rcvid) + int sndid, int rcvid, int l3index) { if (family == AF_INET) { const struct iphdr *iph = ip_hdr(skb); - return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)&iph->saddr, - AF_INET, sndid, rcvid); + return tcp_ao_do_lookup(sk, l3index, + (union tcp_ao_addr *)&iph->saddr, + AF_INET, sndid, rcvid); } else { const struct ipv6hdr *iph = ipv6_hdr(skb); - return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)&iph->saddr, - AF_INET6, sndid, rcvid); + return tcp_ao_do_lookup(sk, l3index, + (union tcp_ao_addr *)&iph->saddr, + AF_INET6, sndid, rcvid); } } void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, struct tcp_request_sock *treq, - unsigned short int family) + unsigned short int family, int l3index) { const struct tcphdr *th = tcp_hdr(skb); const struct tcp_ao_hdr *aoh; @@ -842,7 +856,7 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb, if (tcp_parse_auth_options(th, NULL, &aoh) || !aoh) return; - key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid); + key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid, l3index); if (!key) /* Key not found, continue without TCP-AO */ return; @@ -856,7 +870,7 @@ static enum skb_drop_reason tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, unsigned short int family, struct tcp_ao_info *info, const struct tcp_ao_hdr *aoh, struct tcp_ao_key *key, - u8 *traffic_key, u8 *phash, u32 sne) + u8 *traffic_key, u8 *phash, u32 sne, int l3index) { u8 maclen = aoh->length - sizeof(struct tcp_ao_hdr); const struct tcphdr *th = tcp_hdr(skb); @@ -867,7 +881,8 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, atomic64_inc(&info->counters.pkt_bad); atomic64_inc(&key->pkt_bad); tcp_hash_fail("AO hash wrong length", family, skb, - "%u != %d", maclen, tcp_ao_maclen(key)); + "%u != %d L3index: %d", maclen, + tcp_ao_maclen(key), l3index); return SKB_DROP_REASON_TCP_AOFAILURE; } @@ -882,7 +897,8 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD); atomic64_inc(&info->counters.pkt_bad); atomic64_inc(&key->pkt_bad); - tcp_hash_fail("AO hash mismatch", family, skb, ""); + tcp_hash_fail("AO hash mismatch", family, skb, + "L3index: %d", l3index); kfree(hash_buf); return SKB_DROP_REASON_TCP_AOFAILURE; } @@ -896,7 +912,7 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb, enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, const struct request_sock *req, - const struct tcp_ao_hdr *aoh) + int l3index, const struct tcp_ao_hdr *aoh) { const struct tcphdr *th = tcp_hdr(skb); u8 *phash = (u8 *)(aoh + 1); /* hash goes just after the header */ @@ -911,7 +927,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, if (!info) { NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND); tcp_hash_fail("AO key not found", family, skb, - "keyid: %u", aoh->keyid); + "keyid: %u L3index: %d", aoh->keyid, l3index); return SKB_DROP_REASON_TCP_AOUNEXPECTED; } @@ -945,7 +961,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, /* Established socket, traffic key are cached */ traffic_key = rcv_other_key(key); err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key, - traffic_key, phash, sne); + traffic_key, phash, sne, l3index); if (err) return err; current_key = READ_ONCE(info->current_key); @@ -966,7 +982,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, * - request sockets would race on those key pointers * - tcp_ao_del_cmd() allows async key removal */ - key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid); + key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid, l3index); if (!key) goto key_not_found; @@ -1006,7 +1022,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, return SKB_DROP_REASON_NOT_SPECIFIED; tcp_ao_calc_key_skb(key, traffic_key, skb, sisn, disn, family); ret = tcp_ao_verify_hash(sk, skb, family, info, aoh, key, - traffic_key, phash, sne); + traffic_key, phash, sne, l3index); kfree(traffic_key); return ret; @@ -1014,7 +1030,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND); atomic64_inc(&info->counters.key_not_found); tcp_hash_fail("Requested by the peer AO key id not found", - family, skb, ""); + family, skb, "L3index: %d", l3index); return SKB_DROP_REASON_TCP_AOKEYNOTFOUND; } @@ -1042,7 +1058,7 @@ void tcp_ao_connect_init(struct sock *sk) struct tcp_ao_info *ao_info; union tcp_ao_addr *addr; struct tcp_ao_key *key; - int family; + int family, l3index; ao_info = rcu_dereference_protected(tp->ao_info, lockdep_sock_is_held(sk)); @@ -1059,9 +1075,11 @@ void tcp_ao_connect_init(struct sock *sk) #endif else return; + l3index = l3mdev_master_ifindex_by_index(sock_net(sk), + sk->sk_bound_dev_if); hlist_for_each_entry_rcu(key, &ao_info->head, node) { - if (!tcp_ao_key_cmp(key, addr, key->prefixlen, family, -1, -1)) + if (!tcp_ao_key_cmp(key, l3index, addr, key->prefixlen, family, -1, -1)) continue; if (key == ao_info->current_key) @@ -1134,9 +1152,9 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk, struct tcp_ao_key *key, *new_key, *first_key; struct tcp_ao_info *new_ao, *ao; struct hlist_node *key_head; + int l3index, ret = -ENOMEM; union tcp_ao_addr *addr; bool match = false; - int ret = -ENOMEM; ao = rcu_dereference(tcp_sk(sk)->ao_info); if (!ao) @@ -1164,9 +1182,11 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk, ret = -EAFNOSUPPORT; goto free_ao; } + l3index = l3mdev_master_ifindex_by_index(sock_net(newsk), + newsk->sk_bound_dev_if); hlist_for_each_entry_rcu(key, &ao->head, node) { - if (tcp_ao_key_cmp(key, addr, key->prefixlen, family, -1, -1)) + if (tcp_ao_key_cmp(key, l3index, addr, key->prefixlen, family, -1, -1)) continue; new_key = tcp_ao_copy_key(newsk, key); @@ -1470,7 +1490,8 @@ static struct tcp_ao_info *setsockopt_ao_info(struct sock *sk) return ERR_PTR(-ESOCKTNOSUPPORT); } -#define TCP_AO_KEYF_ALL (TCP_AO_KEYF_EXCLUDE_OPT) +#define TCP_AO_KEYF_ALL (TCP_AO_KEYF_IFINDEX | TCP_AO_KEYF_EXCLUDE_OPT) +#define TCP_AO_GET_KEYF_VALID (TCP_AO_KEYF_IFINDEX) static struct tcp_ao_key *tcp_ao_key_alloc(struct sock *sk, struct tcp_ao_add *cmd) @@ -1534,8 +1555,8 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, union tcp_ao_addr *addr; struct tcp_ao_key *key; struct tcp_ao_add cmd; + int ret, l3index = 0; bool first = false; - int ret; if (optlen < sizeof(cmd)) return -EINVAL; @@ -1565,9 +1586,46 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, return -EINVAL; } + if (cmd.ifindex && !(cmd.keyflags & TCP_AO_KEYF_IFINDEX)) + return -EINVAL; + + /* For cmd.tcp_ifindex = 0 the key will apply to the default VRF */ + if (cmd.keyflags & TCP_AO_KEYF_IFINDEX && cmd.ifindex) { + int bound_dev_if = READ_ONCE(sk->sk_bound_dev_if); + struct net_device *dev; + + rcu_read_lock(); + dev = dev_get_by_index_rcu(sock_net(sk), cmd.ifindex); + if (dev && netif_is_l3_master(dev)) + l3index = dev->ifindex; + rcu_read_unlock(); + + if (!dev || !l3index) + return -EINVAL; + + /* It's still possible to bind after adding keys or even + * re-bind to a different dev (with CAP_NET_RAW). + * So, no reason to return error here, rather try to be + * nice and warn the user. + */ + if (bound_dev_if && bound_dev_if != cmd.ifindex) + net_warn_ratelimited("AO key ifindex %d != sk bound ifindex %d\n", + cmd.ifindex, bound_dev_if); + } + /* Don't allow keys for peers that have a matching TCP-MD5 key */ - if (tcp_md5_do_lookup_any_l3index(sk, addr, family)) - return -EKEYREJECTED; + if (cmd.keyflags & TCP_AO_KEYF_IFINDEX) { + /* Non-_exact version of tcp_md5_do_lookup() will + * as well match keys that aren't bound to a specific VRF + * (that will make them match AO key with + * sysctl_tcp_l3dev_accept = 1 + */ + if (tcp_md5_do_lookup(sk, l3index, addr, family)) + return -EKEYREJECTED; + } else { + if (tcp_md5_do_lookup_any_l3index(sk, addr, family)) + return -EKEYREJECTED; + } ao_info = setsockopt_ao_info(sk); if (IS_ERR(ao_info)) @@ -1584,10 +1642,9 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, * > The IDs of MKTs MUST NOT overlap where their * > TCP connection identifiers overlap. */ - if (__tcp_ao_do_lookup(sk, addr, family, - cmd.prefix, -1, cmd.rcvid)) + if (__tcp_ao_do_lookup(sk, l3index, addr, family, cmd.prefix, -1, cmd.rcvid)) return -EEXIST; - if (__tcp_ao_do_lookup(sk, addr, family, + if (__tcp_ao_do_lookup(sk, l3index, addr, family, cmd.prefix, cmd.sndid, -1)) return -EEXIST; } @@ -1606,6 +1663,7 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, key->keyflags = cmd.keyflags; key->sndid = cmd.sndid; key->rcvid = cmd.rcvid; + key->l3index = l3index; atomic64_set(&key->pkt_good, 0); atomic64_set(&key->pkt_bad, 0); @@ -1694,17 +1752,17 @@ static int tcp_ao_delete_key(struct sock *sk, struct tcp_ao_info *ao_info, return err; } +#define TCP_AO_DEL_KEYF_ALL (TCP_AO_KEYF_IFINDEX) static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family, sockptr_t optval, int optlen) { struct tcp_ao_key *key, *new_current = NULL, *new_rnext = NULL; + int err, addr_len, l3index = 0; struct tcp_ao_info *ao_info; union tcp_ao_addr *addr; struct tcp_ao_del cmd; - int addr_len; __u8 prefix; u16 port; - int err; if (optlen < sizeof(cmd)) return -EINVAL; @@ -1721,6 +1779,17 @@ static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family, return -EINVAL; } + if (cmd.keyflags & ~TCP_AO_DEL_KEYF_ALL) + return -EINVAL; + + /* No sanity check for TCP_AO_KEYF_IFINDEX as if a VRF + * was destroyed, there still should be a way to delete keys, + * that were bound to that l3intf. So, fail late at lookup stage + * if there is no key for that ifindex. + */ + if (cmd.ifindex && !(cmd.keyflags & TCP_AO_KEYF_IFINDEX)) + return -EINVAL; + ao_info = setsockopt_ao_info(sk); if (IS_ERR(ao_info)) return PTR_ERR(ao_info); @@ -1788,6 +1857,13 @@ static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family, memcmp(addr, &key->addr, addr_len)) continue; + if ((cmd.keyflags & TCP_AO_KEYF_IFINDEX) != + (key->keyflags & TCP_AO_KEYF_IFINDEX)) + continue; + + if (key->l3index != l3index) + continue; + if (key == new_current || key == new_rnext) continue; @@ -1973,10 +2049,10 @@ static int tcp_ao_copy_mkts_to_user(struct tcp_ao_info *ao_info, struct tcp_ao_key *key, *current_key; bool do_address_matching = true; union tcp_ao_addr *addr = NULL; + int err, l3index, user_len; unsigned int max_keys; /* maximum number of keys to copy to user */ size_t out_offset = 0; size_t bytes_to_write; /* number of bytes to write to user level */ - int err, user_len; u32 matched_keys; /* keys from ao_info matched so far */ int optlen_out; __be16 port = 0; @@ -1995,11 +2071,16 @@ static int tcp_ao_copy_mkts_to_user(struct tcp_ao_info *ao_info, if (opt_in.pkt_good || opt_in.pkt_bad) return -EINVAL; + if (opt_in.keyflags & ~TCP_AO_GET_KEYF_VALID) + return -EINVAL; + if (opt_in.ifindex && !(opt_in.keyflags & TCP_AO_KEYF_IFINDEX)) + return -EINVAL; if (opt_in.reserved != 0) return -EINVAL; max_keys = opt_in.nkeys; + l3index = (opt_in.keyflags & TCP_AO_KEYF_IFINDEX) ? opt_in.ifindex : -1; if (opt_in.get_all || opt_in.is_current || opt_in.is_rnext) { if (opt_in.get_all && (opt_in.is_current || opt_in.is_rnext)) @@ -2101,7 +2182,7 @@ static int tcp_ao_copy_mkts_to_user(struct tcp_ao_info *ao_info, continue; } - if (tcp_ao_key_cmp(key, addr, opt_in.prefix, + if (tcp_ao_key_cmp(key, l3index, addr, opt_in.prefix, opt_in.addr.ss_family, opt_in.sndid, opt_in.rcvid) != 0) continue; @@ -2134,6 +2215,7 @@ static int tcp_ao_copy_mkts_to_user(struct tcp_ao_info *ao_info, opt_out.nkeys = 0; opt_out.maclen = key->maclen; opt_out.keylen = key->keylen; + opt_out.ifindex = key->l3index; opt_out.pkt_good = atomic64_read(&key->pkt_good); opt_out.pkt_bad = atomic64_read(&key->pkt_bad); memcpy(&opt_out.key, key->key, key->keylen); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 18c5595e3814..5f693bbd578d 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1087,6 +1087,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, tcp_rsk_used_ao(req)) { const union tcp_md5_addr *addr; const struct tcp_ao_hdr *aoh; + int l3index; /* Invalid TCP option size or twice included auth */ if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh)) @@ -1095,11 +1096,12 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, return; addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr; - key.ao_key = tcp_ao_do_lookup(sk, addr, AF_INET, + l3index = tcp_v4_sdif(skb) ? inet_iif(skb) : 0; + key.ao_key = tcp_ao_do_lookup(sk, l3index, addr, AF_INET, aoh->rnext_keyid, -1); if (unlikely(!key.ao_key)) { /* Send ACK with any matching MKT for the peer */ - key.ao_key = tcp_ao_do_lookup(sk, addr, AF_INET, -1, -1); + key.ao_key = tcp_ao_do_lookup(sk, l3index, addr, AF_INET, -1, -1); /* Matching key disappeared (user removed the key?) * let the handshake timeout. */ @@ -1493,6 +1495,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, const union tcp_md5_addr *addr; u8 prefixlen = 32; int l3index = 0; + bool l3flag; u8 flags; if (optlen < sizeof(cmd)) @@ -1505,6 +1508,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, return -EINVAL; flags = cmd.tcpm_flags & TCP_MD5SIG_FLAG_IFINDEX; + l3flag = cmd.tcpm_flags & TCP_MD5SIG_FLAG_IFINDEX; if (optname == TCP_MD5SIG_EXT && cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { @@ -1542,7 +1546,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname, /* Don't allow keys for peers that have a matching TCP-AO key. * See the comment in tcp_ao_add_cmd() */ - if (tcp_ao_required(sk, addr, AF_INET, false)) + if (tcp_ao_required(sk, addr, AF_INET, l3flag ? l3index : -1, false)) return -EKEYREJECTED; return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags, diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index ad7a8caa7b2a..500f6ed3b8cf 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -140,6 +140,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) struct dst_entry *dst; __u8 rcv_wscale; u32 tsoff = 0; + int l3index; if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) || !th->ack || th->rst) @@ -214,7 +215,9 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) treq->snt_isn = cookie; treq->ts_off = 0; treq->txhash = net_tx_rndhash(); - tcp_ao_syncookie(sk, skb, treq, AF_INET6); + + l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif); + tcp_ao_syncookie(sk, skb, treq, AF_INET6, l3index); if (IS_ENABLED(CONFIG_SMC)) ireq->smc_ok = 0; diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c index 8b04611c9078..3c09ac26206e 100644 --- a/net/ipv6/tcp_ao.c +++ b/net/ipv6/tcp_ao.c @@ -87,30 +87,29 @@ int tcp_v6_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key, htonl(tcp_rsk(req)->rcv_isn)); } -struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk, - const struct in6_addr *addr, - int sndid, int rcvid) -{ - return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)addr, AF_INET6, - sndid, rcvid); -} - struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk, struct sock *addr_sk, int sndid, int rcvid) { + int l3index = l3mdev_master_ifindex_by_index(sock_net(sk), + addr_sk->sk_bound_dev_if); struct in6_addr *addr = &addr_sk->sk_v6_daddr; - return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid); + return tcp_ao_do_lookup(sk, l3index, (union tcp_ao_addr *)addr, + AF_INET6, sndid, rcvid); } struct tcp_ao_key *tcp_v6_ao_lookup_rsk(const struct sock *sk, struct request_sock *req, int sndid, int rcvid) { - struct in6_addr *addr = &inet_rsk(req)->ir_v6_rmt_addr; + struct inet_request_sock *ireq = inet_rsk(req); + struct in6_addr *addr = &ireq->ir_v6_rmt_addr; + int l3index; - return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid); + l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif); + return tcp_ao_do_lookup(sk, l3index, (union tcp_ao_addr *)addr, + AF_INET6, sndid, rcvid); } int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp, diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 40af0a1a228d..4872d773cad1 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -610,6 +610,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, union tcp_ao_addr *addr; int l3index = 0; u8 prefixlen; + bool l3flag; u8 flags; if (optlen < sizeof(cmd)) @@ -622,6 +623,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, return -EINVAL; flags = cmd.tcpm_flags & TCP_MD5SIG_FLAG_IFINDEX; + l3flag = cmd.tcpm_flags & TCP_MD5SIG_FLAG_IFINDEX; if (optname == TCP_MD5SIG_EXT && cmd.tcpm_flags & TCP_MD5SIG_FLAG_PREFIX) { @@ -668,7 +670,8 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, /* Don't allow keys for peers that have a matching TCP-AO key. * See the comment in tcp_ao_add_cmd() */ - if (tcp_ao_required(sk, addr, AF_INET, false)) + if (tcp_ao_required(sk, addr, AF_INET, + l3flag ? l3index : -1, false)) return -EKEYREJECTED; return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags, @@ -680,7 +683,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname, /* Don't allow keys for peers that have a matching TCP-AO key. * See the comment in tcp_ao_add_cmd() */ - if (tcp_ao_required(sk, addr, AF_INET6, false)) + if (tcp_ao_required(sk, addr, AF_INET6, l3flag ? l3index : -1, false)) return -EKEYREJECTED; return tcp_md5_do_add(sk, addr, AF_INET6, prefixlen, l3index, flags, @@ -1220,10 +1223,14 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, return; if (!aoh) return; - key.ao_key = tcp_v6_ao_do_lookup(sk, addr, aoh->rnext_keyid, -1); + key.ao_key = tcp_ao_do_lookup(sk, l3index, + (union tcp_ao_addr *)addr, + AF_INET6, aoh->rnext_keyid, -1); if (unlikely(!key.ao_key)) { /* Send ACK with any matching MKT for the peer */ - key.ao_key = tcp_v6_ao_do_lookup(sk, addr, -1, -1); + key.ao_key = tcp_ao_do_lookup(sk, l3index, + (union tcp_ao_addr *)addr, + AF_INET6, -1, -1); /* Matching key disappeared (user removed the key?) * let the handshake timeout. */ From patchwork Mon Oct 23 19:22:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157066 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1505252vqx; Mon, 23 Oct 2023 12:25:01 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGGH4qQGFDQxNJ3xBLPP//vkyt5gGuYcJsStl7PynP0Xsix0l9YRTa8xCxVY7RyeZxIjo/Z X-Received: by 2002:a17:903:22c7:b0:1ca:8e79:53af with SMTP id y7-20020a17090322c700b001ca8e7953afmr8495909plg.3.1698089101002; Mon, 23 Oct 2023 12:25:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089100; cv=none; d=google.com; s=arc-20160816; b=ee/QI2DUGoN2WhWhCM2+TzXe7O+aYoMz+RkwJ7li3XBfqNqlQXAkM+0vPEzlvswFsG 7K+7vpIjKktGmzf1R7zChM1DIpkrBC/ssMKBlSUu8qMRgFf4pXCKBeBcCvU8RX/V/Uci JapX9eV3BjYGgSUhOsalmIu71/7MGZaH/E1oKs9hO1ibjRiFZE7RQWX43/2BNB20mGgb 7dD11Xr7/jjp83sHuG0BVZ1/jbqhi717Sg0EstksAuZ7xCaM9IgKH6bKeRKo5Ij3QcT5 3g3PpllsDbuDW+K7LlevwBw5IWpFPjQ7WwPj8vWdI4I6+DwsaO/PQblk7YlgHdP8GShP aB3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=L5EcPvyMvZxkA/tm5aBCZ3L5YrriKsSOSNiPCBUZoqw=; fh=a8JK1LlV6fjkJ2GExjvEHYBwBHtgXArNx6ms6Kac6b4=; b=cca/5KQDAZHQzXTZvvLWey5SbzR+xC6Sb7blNsWuRW3SaaJw/jTztQ8lNFVBe/ROvA zF6w4hvUIWdV1rcOT7wDI2oGYNtv2WBbXO8qE5Rmvtr3qdeYKGCu6FHXSHzbUTMwHSA6 WuFy7OpXtx5TjVdyDetjhIa0UUh1rOXQEkzo4lM70R37nlAG/ETUO5WRfW83ILfPLEsj EsKs1LFBj8eEe98chio6C0BoZj/8vIUx1dI/41EY26bOyREHVfoI36rK0al11bispyh+ WPmAo10YCBIqm9y5HBtsI1V9g6j8IgSrpoo2ROaB0KUFegPDioehcM5VExDz/2C3Rm8s 1oBg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=E+7817jc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from agentk.vger.email (agentk.vger.email. [2620:137:e000::3:2]) by mx.google.com with ESMTPS id k11-20020a170902694b00b001c624cbc92dsi6738920plt.296.2023.10.23.12.25.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:25:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) client-ip=2620:137:e000::3:2; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=E+7817jc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id ABC3080A7CCE; Mon, 23 Oct 2023 12:24:57 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233317AbjJWTYm (ORCPT + 27 others); Mon, 23 Oct 2023 15:24:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47986 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233762AbjJWTY2 (ORCPT ); Mon, 23 Oct 2023 15:24:28 -0400 Received: from mail-lf1-x12d.google.com (mail-lf1-x12d.google.com [IPv6:2a00:1450:4864:20::12d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 251FD1BF2 for ; Mon, 23 Oct 2023 12:23:13 -0700 (PDT) Received: by mail-lf1-x12d.google.com with SMTP id 2adb3069b0e04-5079f3f3d7aso5811287e87.1 for ; Mon, 23 Oct 2023 12:23:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088989; x=1698693789; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=L5EcPvyMvZxkA/tm5aBCZ3L5YrriKsSOSNiPCBUZoqw=; b=E+7817jcL755IPbANif0e8XHy9cHTmblNXF78WYzgp7bPQgpuownpIPvvOfe4AHD7h mQvL1M0ud0UuTePGv+RxPeV2toKAa8IVosmeowra2OlhoJG3efpG5p/SySlcyv+uNtkZ TgRieNfuDKjj2kmTcLQ0NnHaurimmeRNrK7D1Fpw0t1GNLEinLYdDXAvmFyh7530eYiz 9a9avepC6OQIXRcoal+UWqflaLqsSR7Zww8DfwMSr/Whhieru0NvG0VP7Vc9T7w9H5fG sQggg6qxpbKdXsiahp42/WiCzxuJ/BV4+yuj2NJK4S5IxFsjlJJuSPCtERx3oRKEmJUb I9fQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088989; x=1698693789; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=L5EcPvyMvZxkA/tm5aBCZ3L5YrriKsSOSNiPCBUZoqw=; b=mSf/duiU0MWHuq8FbueIlHa1SmiV4xr3TT57rbqkQpZWRsfLn/15ev5+/8H0cqLNhE XA0rcGH5LMut8Danojs1e8deY/DDWymxsFqeJnxOjO2+QBQLw0W0dDUcGvld/rOSPBnM 0HN6x1Udti9qP7rGIC8DTK88MevPg+Bm22zJQOzD8CrZ0IySn+T87tj2Si224tTBJmuG NOpSpFms6SlwUZe6CRJbBX7TZgQQVMl7gEVLWF2cyASMw2excw+GrtchFqRWX3PFN2Pn rhcIrciB7U7EFPnDUZHbWh0odGusXrJrfPdguEv93i2VqZJ4n7hCqlS/vbH6K7EGhtUO O9+w== X-Gm-Message-State: AOJu0YwnOwQoPqhDgoPOkDG03tSgLcGUSHunSt7L5q8yeJ1TORxqkudf y9yAj0enqGJJ13dqHDCwZ+tPZw== X-Received: by 2002:ac2:597a:0:b0:507:9623:8ae8 with SMTP id h26-20020ac2597a000000b0050796238ae8mr7133824lfp.29.1698088988928; Mon, 23 Oct 2023 12:23:08 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.23.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:08 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v16 net-next 22/23] net/tcp: Add TCP_AO_REPAIR Date: Mon, 23 Oct 2023 20:22:14 +0100 Message-ID: <20231023192217.426455-23-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:24:57 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575477375710379 X-GMAIL-MSGID: 1780575477375710379 Add TCP_AO_REPAIR setsockopt(), getsockopt(). They let a user to repair TCP-AO ISNs/SNEs. Also let the user hack around when (tp->repair) is on and add ao_info on a socket in any supported state. As SNEs now can be read/written at any moment, use WRITE_ONCE()/READ_ONCE() to set/read them. Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp_ao.h | 14 +++++++ include/uapi/linux/tcp.h | 8 ++++ net/ipv4/tcp.c | 24 +++++++---- net/ipv4/tcp_ao.c | 90 ++++++++++++++++++++++++++++++++++++++-- 4 files changed, 125 insertions(+), 11 deletions(-) diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index edd6748b2cfa..a375a171ef3c 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -199,6 +199,8 @@ void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp); bool tcp_ao_ignore_icmp(const struct sock *sk, int family, int type, int code); int tcp_ao_get_mkts(struct sock *sk, sockptr_t optval, sockptr_t optlen); int tcp_ao_get_sock_info(struct sock *sk, sockptr_t optval, sockptr_t optlen); +int tcp_ao_get_repair(struct sock *sk, sockptr_t optval, sockptr_t optlen); +int tcp_ao_set_repair(struct sock *sk, sockptr_t optval, unsigned int optlen); enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, const struct request_sock *req, int l3index, @@ -330,6 +332,18 @@ static inline int tcp_ao_get_sock_info(struct sock *sk, sockptr_t optval, sockpt { return -ENOPROTOOPT; } + +static inline int tcp_ao_get_repair(struct sock *sk, + sockptr_t optval, sockptr_t optlen) +{ + return -ENOPROTOOPT; +} + +static inline int tcp_ao_set_repair(struct sock *sk, + sockptr_t optval, unsigned int optlen) +{ + return -ENOPROTOOPT; +} #endif #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO) diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index be34d7c5c531..c07e9f90c084 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -133,6 +133,7 @@ enum { #define TCP_AO_DEL_KEY 39 /* Delete MKT */ #define TCP_AO_INFO 40 /* Set/list TCP-AO per-socket options */ #define TCP_AO_GET_KEYS 41 /* List MKT(s) */ +#define TCP_AO_REPAIR 42 /* Get/Set SNEs and ISNs */ #define TCP_REPAIR_ON 1 #define TCP_REPAIR_OFF 0 @@ -458,6 +459,13 @@ struct tcp_ao_getsockopt { /* getsockopt(TCP_AO_GET_KEYS) */ __u64 pkt_bad; /* out: segments that failed verification */ } __attribute__((aligned(8))); +struct tcp_ao_repair { /* {s,g}etsockopt(TCP_AO_REPAIR) */ + __be32 snt_isn; + __be32 rcv_isn; + __u32 snd_sne; + __u32 rcv_sne; +} __attribute__((aligned(8))); + /* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */ #define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1 diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index a3107c28ea5f..6c0a7fd0bae0 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3591,20 +3591,28 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, __tcp_sock_set_quickack(sk, val); break; + case TCP_AO_REPAIR: + err = tcp_ao_set_repair(sk, optval, optlen); + break; #ifdef CONFIG_TCP_AO case TCP_AO_ADD_KEY: case TCP_AO_DEL_KEY: case TCP_AO_INFO: { /* If this is the first TCP-AO setsockopt() on the socket, - * sk_state has to be LISTEN or CLOSE + * sk_state has to be LISTEN or CLOSE. Allow TCP_REPAIR + * in any state. */ - if (((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) || - rcu_dereference_protected(tcp_sk(sk)->ao_info, + if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) + goto ao_parse; + if (rcu_dereference_protected(tcp_sk(sk)->ao_info, lockdep_sock_is_held(sk))) - err = tp->af_specific->ao_parse(sk, optname, optval, - optlen); - else - err = -EISCONN; + goto ao_parse; + if (tp->repair) + goto ao_parse; + err = -EISCONN; + break; +ao_parse: + err = tp->af_specific->ao_parse(sk, optname, optval, optlen); break; } #endif @@ -4282,6 +4290,8 @@ int do_tcp_getsockopt(struct sock *sk, int level, return err; } #endif + case TCP_AO_REPAIR: + return tcp_ao_get_repair(sk, optval, optlen); case TCP_AO_GET_KEYS: case TCP_AO_INFO: { int err; diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index b5ac3e73e1da..6a845e906a1d 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -1490,6 +1490,16 @@ static struct tcp_ao_info *setsockopt_ao_info(struct sock *sk) return ERR_PTR(-ESOCKTNOSUPPORT); } +static struct tcp_ao_info *getsockopt_ao_info(struct sock *sk) +{ + if (sk_fullsock(sk)) + return rcu_dereference(tcp_sk(sk)->ao_info); + else if (sk->sk_state == TCP_TIME_WAIT) + return rcu_dereference(tcp_twsk(sk)->ao_info); + + return ERR_PTR(-ESOCKTNOSUPPORT); +} + #define TCP_AO_KEYF_ALL (TCP_AO_KEYF_IFINDEX | TCP_AO_KEYF_EXCLUDE_OPT) #define TCP_AO_GET_KEYF_VALID (TCP_AO_KEYF_IFINDEX) @@ -1671,11 +1681,13 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family, if (ret < 0) goto err_free_sock; - /* Change this condition if we allow adding keys in states - * like close_wait, syn_sent or fin_wait... - */ - if (sk->sk_state == TCP_ESTABLISHED) + if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))) { tcp_ao_cache_traffic_keys(sk, ao_info, key); + if (first) { + ao_info->current_key = key; + ao_info->rnext_key = key; + } + } tcp_ao_link_mkt(ao_info, key); if (first) { @@ -1926,6 +1938,8 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family, if (IS_ERR(ao_info)) return PTR_ERR(ao_info); if (!ao_info) { + if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))) + return -EINVAL; ao_info = tcp_ao_alloc_info(GFP_KERNEL); if (!ao_info) return -ENOMEM; @@ -2308,3 +2322,71 @@ int tcp_ao_get_sock_info(struct sock *sk, sockptr_t optval, sockptr_t optlen) return 0; } +int tcp_ao_set_repair(struct sock *sk, sockptr_t optval, unsigned int optlen) +{ + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_ao_repair cmd; + struct tcp_ao_key *key; + struct tcp_ao_info *ao; + int err; + + if (optlen < sizeof(cmd)) + return -EINVAL; + + err = copy_struct_from_sockptr(&cmd, sizeof(cmd), optval, optlen); + if (err) + return err; + + if (!tp->repair) + return -EPERM; + + ao = setsockopt_ao_info(sk); + if (IS_ERR(ao)) + return PTR_ERR(ao); + if (!ao) + return -ENOENT; + + WRITE_ONCE(ao->lisn, cmd.snt_isn); + WRITE_ONCE(ao->risn, cmd.rcv_isn); + WRITE_ONCE(ao->snd_sne, cmd.snd_sne); + WRITE_ONCE(ao->rcv_sne, cmd.rcv_sne); + + hlist_for_each_entry_rcu(key, &ao->head, node) + tcp_ao_cache_traffic_keys(sk, ao, key); + + return 0; +} + +int tcp_ao_get_repair(struct sock *sk, sockptr_t optval, sockptr_t optlen) +{ + struct tcp_sock *tp = tcp_sk(sk); + struct tcp_ao_repair opt; + struct tcp_ao_info *ao; + int len; + + if (copy_from_sockptr(&len, optlen, sizeof(int))) + return -EFAULT; + + if (len <= 0) + return -EINVAL; + + if (!tp->repair) + return -EPERM; + + rcu_read_lock(); + ao = getsockopt_ao_info(sk); + if (IS_ERR_OR_NULL(ao)) { + rcu_read_unlock(); + return ao ? PTR_ERR(ao) : -ENOENT; + } + + opt.snt_isn = ao->lisn; + opt.rcv_isn = ao->risn; + opt.snd_sne = READ_ONCE(ao->snd_sne); + opt.rcv_sne = READ_ONCE(ao->rcv_sne); + rcu_read_unlock(); + + if (copy_to_sockptr(optval, &opt, min_t(int, len, sizeof(opt)))) + return -EFAULT; + return 0; +} From patchwork Mon Oct 23 19:22:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 157070 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1505963vqx; Mon, 23 Oct 2023 12:26:26 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGssodhYUUxY+r1NxI8G3zZY+k3U/jhkMSWtmKPWR8DkT0RDoI9puwBENSqDWDRw1cCoZ+a X-Received: by 2002:a05:6a00:2daa:b0:68f:b5cb:ced0 with SMTP id fb42-20020a056a002daa00b0068fb5cbced0mr12046279pfb.34.1698089186370; Mon, 23 Oct 2023 12:26:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698089186; cv=none; d=google.com; s=arc-20160816; b=K1zZcTGm3IYpGSUSyukN10dnQrkD2ZBRY+nx+2vhcAOIpuy6dWB5iZmSQ2u3/clCbK GebEmMj6mrPueajU+JJrjb18p4E+Q1csX1TiRY8kk0+hTvXd465H+FirGcBtkZOkKbEs u/NsN0K3C+Dj4GWWkOLmXnqa/42+3pNre3AsHnPOBQKum4wV22IbHq9ayZzhLnFtwX6K hsTvQDUzCdk5PuC79oOtGDDO2WIER94QSUsm27gNVSjx4MMClxdHaJ8PXCs4wI9JTy02 dhaRjAE1JApTKg8EOt9nxuA+Cp8Mpqv5xCi4mtT704XMKHMlNy0q/F+piqtMrzRe7NU4 /yiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=i1qaHrqVtSrGJTiCvr5M49PYhV5KV1LtHFCXx9SrGjo=; fh=T7bplergXivl9ro1YkgY8UekcJkAwys3nxmEeI2hQbo=; b=DqTAIL0mrqaNpEj/Ixze01ttALFhg9P6spHcWTOIMLVLGsNkhdW1QLW7Y+yCJ3/Yz9 dVHaStt78pYQ/l2Nd0+/2RfavyGHyLztl0f+J3hehWFv3MWz4AQYzOYmqwlZf0TnoBMQ pKapRPo+UJaYYAbCoAcTNkz/pQlwJpQ325jMiR5XQefVxOd9wSrzZ+7EWS+khUsKljHw xPMpF9QhgJUsxrY0d9K2knhPnNEdH+1bq87aN23GMZZh2fGBvIUJkAMTVepFYm4aV8k0 /vDCUCPe9Ml/sRDNaeofG3bsgyxoY+OPZPOnUZWUDZQngCGW+FG0SMYrL+NUxOyPgDTA CT0g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=Qc4F4ug5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from morse.vger.email (morse.vger.email. [2620:137:e000::3:1]) by mx.google.com with ESMTPS id j12-20020a056a00174c00b006be55174f3fsi7266028pfc.28.2023.10.23.12.26.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:26:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) client-ip=2620:137:e000::3:1; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=google header.b=Qc4F4ug5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=arista.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 6159D80AFE90; Mon, 23 Oct 2023 12:26:22 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231215AbjJWTZj (ORCPT + 27 others); Mon, 23 Oct 2023 15:25:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45730 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229498AbjJWTZg (ORCPT ); Mon, 23 Oct 2023 15:25:36 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7BE3D1BF5 for ; Mon, 23 Oct 2023 12:23:13 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id 5b1f17b1804b1-40837ebba42so27047345e9.0 for ; Mon, 23 Oct 2023 12:23:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1698088991; x=1698693791; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=i1qaHrqVtSrGJTiCvr5M49PYhV5KV1LtHFCXx9SrGjo=; b=Qc4F4ug5aOALnNHY0R4+D88qlJKoO+1eQLIqDWRDSgj4KusHg3JwnhP8M0HKCjk3hA A0/BdI+bDAS7NnUX2vlGecksf9jD4xFPdE4h/WEar0zbC+jm/jPlRWJK1/hN0bC8hkHJ OtuyfIyAwqDoqHhjzNePyq34ba9ie+N0IjRvIi4Uvcdms7+v9hR2bMFO+xamvbfImxkK PUF+n5CEF3l9pZ8Dlbp9v+8PNICnT3S7J3WZhYrhWWuPVUD5LEQK+4/VFJSPsvwBQuxM kaUANYbemuShKsFDYxnA7cuqR7segSao77qFLfsi41ORP6KFe+uwci8X5CLJA7G5GD4T nbeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698088991; x=1698693791; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=i1qaHrqVtSrGJTiCvr5M49PYhV5KV1LtHFCXx9SrGjo=; b=uQyMD8mrN0pWJHz+Cv37Ux2aG2Bmx5sSL+EjTCik78rSxYqOo+OJ4JLV4HM5aViwUp tXuXqV6K2Z5ggyFg4QBVX0SWyTbOwtzJ8GsAEtReaaqiL+uaX5GKfQI7uV98J1ETVDe7 zupCW/xuw1IrXQ9hC1Dee4rCgoWTen090aIc3YoFiT2vslHfvYmjCe4C5kvh5mruqOyU luDGh+idE1E9GTEAgKXlLJI5iSmaCw0xIzIWjJOgx64QMYv+Ar1bsd7PdRruCuZ/a+m6 gnupAAdQ/DKFVOK2Yt2B6W3UqE4ajZAY/xECF8a8w8FIF7iKp6Vh8b02uAnucrdLuhre pouw== X-Gm-Message-State: AOJu0YzT1HN2lP6OMBkM9yLiJ1uEMeTxvRnznl4hc/FdmLFu0uqD+KHg kFjkb7K0uX1CWEb0eVO4ODvtbA== X-Received: by 2002:a05:600c:4c16:b0:401:d947:c8a9 with SMTP id d22-20020a05600c4c1600b00401d947c8a9mr8249626wmp.19.1698088991118; Mon, 23 Oct 2023 12:23:11 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id ay20-20020a05600c1e1400b00407460234f9sm10142088wmb.21.2023.10.23.12.23.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 12:23:10 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org, Jonathan Corbet , linux-doc@vger.kernel.org Subject: [PATCH v16 net-next 23/23] Documentation/tcp: Add TCP-AO documentation Date: Mon, 23 Oct 2023 20:22:15 +0100 Message-ID: <20231023192217.426455-24-dima@arista.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231023192217.426455-1-dima@arista.com> References: <20231023192217.426455-1-dima@arista.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Mon, 23 Oct 2023 12:26:22 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780575566893144548 X-GMAIL-MSGID: 1780575566893144548 It has Frequently Asked Questions (FAQ) on RFC 5925 - I found it very useful answering those before writing the actual code. It provides answers to common questions that arise on a quick read of the RFC, as well as how they were answered. There's also comparison to TCP-MD5 option, evaluation of per-socket vs in-kernel-DB approaches and description of uAPI provided. Hopefully, it will be as useful for reviewing the code as it was for writing. Cc: Jonathan Corbet Cc: linux-doc@vger.kernel.org Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- Documentation/networking/index.rst | 1 + Documentation/networking/tcp_ao.rst | 444 ++++++++++++++++++++++++++++ 2 files changed, 445 insertions(+) create mode 100644 Documentation/networking/tcp_ao.rst diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index 2ffc5ad10295..683eb42309cc 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -106,6 +106,7 @@ Contents: sysfs-tagging tc-actions-env-rules tc-queue-filters + tcp_ao tcp-thin team timestamping diff --git a/Documentation/networking/tcp_ao.rst b/Documentation/networking/tcp_ao.rst new file mode 100644 index 000000000000..cfa5bf1cc542 --- /dev/null +++ b/Documentation/networking/tcp_ao.rst @@ -0,0 +1,444 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================================================== +TCP Authentication Option Linux implementation (RFC5925) +======================================================== + +TCP Authentication Option (TCP-AO) provides a TCP extension aimed at verifying +segments between trusted peers. It adds a new TCP header option with +a Message Authentication Code (MAC). MACs are produced from the content +of a TCP segment using a hashing function with a password known to both peers. +The intent of TCP-AO is to deprecate TCP-MD5 providing better security, +key rotation and support for variety of hashing algorithms. + +1. Introduction +=============== + +.. table:: Short and Limited Comparison of TCP-AO and TCP-MD5 + + +----------------------+------------------------+-----------------------+ + | | TCP-MD5 | TCP-AO | + +======================+========================+=======================+ + |Supported hashing |MD5 |Must support HMAC-SHA1 | + |algorithms |(cryptographically weak)|(chosen-prefix attacks)| + | | |and CMAC-AES-128 (only | + | | |side-channel attacks). | + | | |May support any hashing| + | | |algorithm. | + +----------------------+------------------------+-----------------------+ + |Length of MACs (bytes)|16 |Typically 12-16. | + | | |Other variants that fit| + | | |TCP header permitted. | + +----------------------+------------------------+-----------------------+ + |Number of keys per |1 |Many | + |TCP connection | | | + +----------------------+------------------------+-----------------------+ + |Possibility to change |Non-practical (both |Supported by protocol | + |an active key |peers have to change | | + | |them during MSL) | | + +----------------------+------------------------+-----------------------+ + |Protection against |No |Yes: ignoring them | + |ICMP 'hard errors' | |by default on | + | | |established connections| + +----------------------+------------------------+-----------------------+ + |Protection against |No |Yes: pseudo-header | + |traffic-crossing | |includes TCP ports. | + |attack | | | + +----------------------+------------------------+-----------------------+ + |Protection against |No |Sequence Number | + |replayed TCP segments | |Extension (SNE) and | + | | |Initial Sequence | + | | |Numbers (ISNs) | + +----------------------+------------------------+-----------------------+ + |Supports |Yes |No. ISNs+SNE are needed| + |Connectionless Resets | |to correctly sign RST. | + +----------------------+------------------------+-----------------------+ + |Standards |RFC 2385 |RFC 5925, RFC 5926 | + +----------------------+------------------------+-----------------------+ + + +1.1 Frequently Asked Questions (FAQ) with references to RFC 5925 +---------------------------------------------------------------- + +Q: Can either SendID or RecvID be non-unique for the same 4-tuple +(srcaddr, srcport, dstaddr, dstport)? + +A: No [3.1]:: + + >> The IDs of MKTs MUST NOT overlap where their TCP connection + identifiers overlap. + +Q: Can Master Key Tuple (MKT) for an active connection be removed? + +A: No, unless it's copied to Transport Control Block (TCB) [3.1]:: + + It is presumed that an MKT affecting a particular connection cannot + be destroyed during an active connection -- or, equivalently, that + its parameters are copied to an area local to the connection (i.e., + instantiated) and so changes would affect only new connections. + +Q: If an old MKT needs to be deleted, how should it be done in order +to not remove it for an active connection? (As it can be still in use +at any moment later) + +A: Not specified by RFC 5925, seems to be a problem for key management +to ensure that no one uses such MKT before trying to remove it. + +Q: Can an old MKT exist forever and be used by another peer? + +A: It can, it's a key management task to decide when to remove an old key [6.1]:: + + Deciding when to start using a key is a performance issue. Deciding + when to remove an MKT is a security issue. Invalid MKTs are expected + to be removed. TCP-AO provides no mechanism to coordinate their removal, + as we consider this a key management operation. + +also [6.1]:: + + The only way to avoid reuse of previously used MKTs is to remove the MKT + when it is no longer considered permitted. + +Linux TCP-AO will try its best to prevent you from removing a key that's +being used, considering it a key management failure. But sine keeping +an outdated key may become a security issue and as a peer may +unintentionally prevent the removal of an old key by always setting +it as RNextKeyID - a forced key removal mechanism is provided, where +userspace has to supply KeyID to use instead of the one that's being removed +and the kernel will atomically delete the old key, even if the peer is +still requesting it. There are no guarantees for force-delete as the peer +may yet not have the new key - the TCP connection may just break. +Alternatively, one may choose to shut down the socket. + +Q: What happens when a packet is received on a new connection with no known +MKT's RecvID? + +A: RFC 5925 specifies that by default it is accepted with a warning logged, but +the behaviour can be configured by the user [7.5.1.a]:: + + If the segment is a SYN, then this is the first segment of a new + connection. Find the matching MKT for this segment, using the segment's + socket pair and its TCP-AO KeyID, matched against the MKT's TCP connection + identifier and the MKT's RecvID. + + i. If there is no matching MKT, remove TCP-AO from the segment. + Proceed with further TCP handling of the segment. + NOTE: this presumes that connections that do not match any MKT + should be silently accepted, as noted in Section 7.3. + +[7.3]:: + + >> A TCP-AO implementation MUST allow for configuration of the behavior + of segments with TCP-AO but that do not match an MKT. The initial default + of this configuration SHOULD be to silently accept such connections. + If this is not the desired case, an MKT can be included to match such + connections, or the connection can indicate that TCP-AO is required. + Alternately, the configuration can be changed to discard segments with + the AO option not matching an MKT. + +[10.2.b]:: + + Connections not matching any MKT do not require TCP-AO. Further, incoming + segments with TCP-AO are not discarded solely because they include + the option, provided they do not match any MKT. + +Note that Linux TCP-AO implementation differs in this aspect. Currently, TCP-AO +segments with unknown key signatures are discarded with warnings logged. + +Q: Does the RFC imply centralized kernel key management in any way? +(i.e. that a key on all connections MUST be rotated at the same time?) + +A: Not specified. MKTs can be managed in userspace, the only relevant part to +key changes is [7.3]:: + + >> All TCP segments MUST be checked against the set of MKTs for matching + TCP connection identifiers. + +Q: What happens when RNextKeyID requested by a peer is unknown? Should +the connection be reset? + +A: It should not, no action needs to be performed [7.5.2.e]:: + + ii. If they differ, determine whether the RNextKeyID MKT is ready. + + 1. If the MKT corresponding to the segment’s socket pair and RNextKeyID + is not available, no action is required (RNextKeyID of a received + segment needs to match the MKT’s SendID). + +Q: How current_key is set and when does it change? It is a user-triggered +change, or is it by a request from the remote peer? Is it set by the user +explicitly, or by a matching rule? + +A: current_key is set by RNextKeyID [6.1]:: + + Rnext_key is changed only by manual user intervention or MKT management + protocol operation. It is not manipulated by TCP-AO. Current_key is updated + by TCP-AO when processing received TCP segments as discussed in the segment + processing description in Section 7.5. Note that the algorithm allows + the current_key to change to a new MKT, then change back to a previously + used MKT (known as "backing up"). This can occur during an MKT change when + segments are received out of order, and is considered a feature of TCP-AO, + because reordering does not result in drops. + +[7.5.2.e.ii]:: + + 2. If the matching MKT corresponding to the segment’s socket pair and + RNextKeyID is available: + + a. Set current_key to the RNextKeyID MKT. + +Q: If both peers have multiple MKTs matching the connection's socket pair +(with different KeyIDs), how should the sender/receiver pick KeyID to use? + +A: Some mechanism should pick the "desired" MKT [3.3]:: + + Multiple MKTs may match a single outgoing segment, e.g., when MKTs + are being changed. Those MKTs cannot have conflicting IDs (as noted + elsewhere), and some mechanism must determine which MKT to use for each + given outgoing segment. + + >> An outgoing TCP segment MUST match at most one desired MKT, indicated + by the segment’s socket pair. The segment MAY match multiple MKTs, provided + that exactly one MKT is indicated as desired. Other information in + the segment MAY be used to determine the desired MKT when multiple MKTs + match; such information MUST NOT include values in any TCP option fields. + +Q: Can TCP-MD5 connection migrate to TCP-AO (and vice-versa): + +A: No [1]:: + + TCP MD5-protected connections cannot be migrated to TCP-AO because TCP MD5 + does not support any changes to a connection’s security algorithm + once established. + +Q: If all MKTs are removed on a connection, can it become a non-TCP-AO signed +connection? + +A: [7.5.2] doesn't have the same choice as SYN packet handling in [7.5.1.i] +that would allow accepting segments without a sign (which would be insecure). +While switching to non-TCP-AO connection is not prohibited directly, it seems +what the RFC means. Also, there's a requirement for TCP-AO connections to +always have one current_key [3.3]:: + + TCP-AO requires that every protected TCP segment match exactly one MKT. + +[3.3]:: + + >> An incoming TCP segment including TCP-AO MUST match exactly one MKT, + indicated solely by the segment’s socket pair and its TCP-AO KeyID. + +[4.4]:: + + One or more MKTs. These are the MKTs that match this connection’s + socket pair. + +Q: Can a non-TCP-AO connection become a TCP-AO-enabled one? + +A: No: for already established non-TCP-AO connection it would be impossible +to switch using TCP-AO as the traffic key generation requires the initial +sequence numbers. Paraphrasing, starting using TCP-AO would require +re-establishing the TCP connection. + +2. In-kernel MKTs database vs database in userspace +=================================================== + +Linux TCP-AO support is implemented using ``setsockopt()s``, in a similar way +to TCP-MD5. It means that a userspace application that wants to use TCP-AO +should perform ``setsockopt()`` on a TCP socket when it wants to add, +remove or rotate MKTs. This approach moves the key management responsibility +to userspace as well as decisions on corner cases, i.e. what to do if +the peer doesn't respect RNextKeyID; moving more code to userspace, especially +responsible for the policy decisions. Besides, it's flexible and scales well +(with less locking needed than in the case of an in-kernel database). One also +should keep in mind that mainly intended users are BGP processes, not any +random applications, which means that compared to IPsec tunnels, +no transparency is really needed and modern BGP daemons already have +``setsockopt()s`` for TCP-MD5 support. + +.. table:: Considered pros and cons of the approaches + + +----------------------+------------------------+-----------------------+ + | | ``setsockopt()`` | in-kernel DB | + +======================+========================+=======================+ + | Extendability | ``setsockopt()`` | Netlink messages are | + | | commands should be | simple and extendable | + | | extendable syscalls | | + +----------------------+------------------------+-----------------------+ + | Required userspace | BGP or any application | could be transparent | + | changes | that wants TCP-AO needs| as tunnels, providing | + | | to perform | something like | + | | ``setsockopt()s`` | ``ip tcpao add key`` | + | | and do key management | (delete/show/rotate) | + +----------------------+------------------------+-----------------------+ + |MKTs removal or adding| harder for userspace | harder for kernel | + +----------------------+------------------------+-----------------------+ + | Dump-ability | ``getsockopt()`` | Netlink .dump() | + | | | callback | + +----------------------+------------------------+-----------------------+ + | Limits on kernel | equal | + | resources/memory | | + +----------------------+------------------------+-----------------------+ + | Scalability | contention on | contention on | + | | ``TCP_LISTEN`` sockets | the whole database | + +----------------------+------------------------+-----------------------+ + | Monitoring & warnings| ``TCP_DIAG`` | same Netlink socket | + +----------------------+------------------------+-----------------------+ + | Matching of MKTs | half-problem: only | hard | + | | listen sockets | | + +----------------------+------------------------+-----------------------+ + + +3. uAPI +======= + +Linux provides a set of ``setsockopt()s`` and ``getsockopt()s`` that let +userspace manage TCP-AO on a per-socket basis. In order to add/delete MKTs +``TCP_AO_ADD_KEY`` and ``TCP_AO_DEL_KEY`` TCP socket options must be used +It is not allowed to add a key on an established non-TCP-AO connection +as well as to remove the last key from TCP-AO connection. + +``setsockopt(TCP_AO_DEL_KEY)`` command may specify ``tcp_ao_del::current_key`` ++ ``tcp_ao_del::set_current`` and/or ``tcp_ao_del::rnext`` ++ ``tcp_ao_del::set_rnext`` which makes such delete "forced": it +provides userspace a way to delete a key that's being used and atomically set +another one instead. This is not intended for normal use and should be used +only when the peer ignores RNextKeyID and keeps requesting/using an old key. +It provides a way to force-delete a key that's not trusted but may break +the TCP-AO connection. + +The usual/normal key-rotation can be performed with ``setsockopt(TCP_AO_INFO)``. +It also provides a uAPI to change per-socket TCP-AO settings, such as +ignoring ICMPs, as well as clear per-socket TCP-AO packet counters. +The corresponding ``getsockopt(TCP_AO_INFO)`` can be used to get those +per-socket TCP-AO settings. + +Another useful command is ``getsockopt(TCP_AO_GET_KEYS)``. One can use it +to list all MKTs on a TCP socket or use a filter to get keys for a specific +peer and/or sndid/rcvid, VRF L3 interface or get current_key/rnext_key. + +To repair TCP-AO connections ``setsockopt(TCP_AO_REPAIR)`` is available, +provided that the user previously has checkpointed/dumped the socket with +``getsockopt(TCP_AO_REPAIR)``. + +A tip here for scaled TCP_LISTEN sockets, that may have some thousands TCP-AO +keys, is: use filters in ``getsockopt(TCP_AO_GET_KEYS)`` and asynchronous +delete with ``setsockopt(TCP_AO_DEL_KEY)``. + +Linux TCP-AO also provides a bunch of segment counters that can be helpful +with troubleshooting/debugging issues. Every MKT has good/bad counters +that reflect how many packets passed/failed verification. +Each TCP-AO socket has the following counters: +- for good segments (properly signed) +- for bad segments (failed TCP-AO verification) +- for segments with unknown keys +- for segments where an AO signature was expected, but wasn't found +- for the number of ignored ICMPs + +TCP-AO per-socket counters are also duplicated with per-netns counters, +exposed with SNMP. Those are ``TCPAOGood``, ``TCPAOBad``, ``TCPAOKeyNotFound``, +``TCPAORequired`` and ``TCPAODroppedIcmps``. + +RFC 5925 very permissively specifies how TCP port matching can be done for +MKTs:: + + TCP connection identifier. A TCP socket pair, i.e., a local IP + address, a remote IP address, a TCP local port, and a TCP remote port. + Values can be partially specified using ranges (e.g., 2-30), masks + (e.g., 0xF0), wildcards (e.g., "*"), or any other suitable indication. + +Currently Linux TCP-AO implementation doesn't provide any TCP port matching. +Probably, port ranges are the most flexible for uAPI, but so far +not implemented. + +4. ``setsockopt()`` vs ``accept()`` race +======================================== + +In contrast with TCP-MD5 established connection which has just one key, +TCP-AO connections may have many keys, which means that accepted connections +on a listen socket may have any amount of keys as well. As copying all those +keys on a first properly signed SYN would make the request socket bigger, that +would be undesirable. Currently, the implementation doesn't copy keys +to request sockets, but rather look them up on the "parent" listener socket. + +The result is that when userspace removes TCP-AO keys, that may break +not-yet-established connections on request sockets as well as not removing +keys from sockets that were already established, but not yet ``accept()``'ed, +hanging in the accept queue. + +The reverse is valid as well: if userspace adds a new key for a peer on +a listener socket, the established sockets in accept queue won't +have the new keys. + +At this moment, the resolution for the two races: +``setsockopt(TCP_AO_ADD_KEY)`` vs ``accept()`` +and ``setsockopt(TCP_AO_DEL_KEY)`` vs ``accept()`` is delegated to userspace. +This means that it's expected that userspace would check the MKTs on the socket +that was returned by ``accept()`` to verify that any key rotation that +happened on listen socket is reflected on the newly established connection. + +This is a similar "do-nothing" approach to TCP-MD5 from the kernel side and +may be changed later by introducing new flags to ``tcp_ao_add`` +and ``tcp_ao_del``. + +Note that this race is rare for it needs TCP-AO key rotation to happen +during the 3-way handshake for the new TCP connection. + +5. Interaction with TCP-MD5 +=========================== + +A TCP connection can not migrate between TCP-AO and TCP-MD5 options. The +established sockets that have either AO or MD5 keys are restricted for +adding keys of the other option. + +For listening sockets the picture is different: BGP server may want to receive +both TCP-AO and (deprecated) TCP-MD5 clients. As a result, both types of keys +may be added to TCP_CLOSED or TCP_LISTEN sockets. It's not allowed to add +different types of keys for the same peer. + +6. SNE Linux implementation +=========================== + +RFC 5925 [6.2] describes the algorithm of how to extend TCP sequence numbers +with SNE. In short: TCP has to track the previous sequence numbers and set +sne_flag when the current SEQ number rolls over. The flag is cleared when +both current and previous SEQ numbers cross 0x7fff, which is 32Kb. + +In times when sne_flag is set, the algorithm compares SEQ for each packet with +0x7fff and if it's higher than 32Kb, it assumes that the packet should be +verified with SNE before the increment. As a result, there's +this [0; 32Kb] window, when packets with (SNE - 1) can be accepted. + +Linux implementation simplifies this a bit: as the network stack already tracks +the first SEQ byte that ACK is wanted for (snd_una) and the next SEQ byte that +is wanted (rcv_nxt) - that's enough information for a rough estimation +on where in the 4GB SEQ number space both sender and receiver are. +When they roll over to zero, the corresponding SNE gets incremented. + +tcp_ao_compute_sne() is called for each TCP-AO segment. It compares SEQ numbers +from the segment with snd_una or rcv_nxt and fits the result into a 2GB window around them, +detecting SEQ numbers rolling over. That simplifies the code a lot and only +requires SNE numbers to be stored on every TCP-AO socket. + +The 2GB window at first glance seems much more permissive compared to +RFC 5926. But that is only used to pick the correct SNE before/after +a rollover. It allows more TCP segment replays, but yet all regular +TCP checks in tcp_sequence() are applied on the verified segment. +So, it trades a bit more permissive acceptance of replayed/retransmitted +segments for the simplicity of the algorithm and what seems better behaviour +for large TCP windows. + +7. Links +======== + +RFC 5925 The TCP Authentication Option + https://www.rfc-editor.org/rfc/pdfrfc/rfc5925.txt.pdf + +RFC 5926 Cryptographic Algorithms for the TCP Authentication Option (TCP-AO) + https://www.rfc-editor.org/rfc/pdfrfc/rfc5926.txt.pdf + +Draft "SHA-2 Algorithm for the TCP Authentication Option (TCP-AO)" + https://datatracker.ietf.org/doc/html/draft-nayak-tcp-sha2-03 + +RFC 2385 Protection of BGP Sessions via the TCP MD5 Signature Option + https://www.rfc-editor.org/rfc/pdfrfc/rfc2385.txt.pdf + +:Author: Dmitry Safonov