[v4,1/2] drivers: net: prevent tun_build_skb() to exceed the packet size limit

Message ID 20230801220710.464-1-andrew.kanner@gmail.com
State New
Headers
Series [v4,1/2] drivers: net: prevent tun_build_skb() to exceed the packet size limit |

Commit Message

Andrew Kanner Aug. 1, 2023, 10:07 p.m. UTC
  Using the syzkaller repro with reduced packet size it was discovered
that XDP_PACKET_HEADROOM is not checked in tun_can_build_skb(),
although pad may be incremented in tun_build_skb(). This may end up
with exceeding the PAGE_SIZE limit in tun_build_skb().

Fixes: 7df13219d757 ("tun: reserve extra headroom only when XDP is set")
Link: https://syzkaller.appspot.com/bug?extid=f817490f5bd20541b90a
Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>
---

Notes:
    v3 -> v4:
    * fall back to v1, fixing only missing XDP_PACKET_HEADROOM in pad and
      removing bpf_xdp_adjust_tail() check for frame_sz.
    * added rcu read lock, noted by Jason Wang <jasowang@redhat.com> in v1
    * I decided to leave the packet length check in tun_can_build_skb()
      instead of moving to tun_build_skb() suggested by Jason Wang
      <jasowang@redhat.com>. Otherwise extra packets will be dropped
      without falling back to tun_alloc_skb(). And in the discussion of v3
      Jesper Dangaard Brouer <jbrouer@redhat.com> noticed that XDP is ok
      with a higher order pages if it's a contiguous physical memory
      allocation, so falling to tun_alloc_skb() -> do_xdp_generic() should
      be ok.
    
    v2 -> v3:
    * attach the forgotten changelog
    
    v1 -> v2:
    * merged 2 patches in 1, fixing both issues: WARN_ON_ONCE with
      syzkaller repro and missing XDP_PACKET_HEADROOM in pad
    * changed the title and description of the execution path, suggested
      by Jason Wang <jasowang@redhat.com>
    * move the limit check from tun_can_build_skb() to tun_build_skb() to
      remove duplication and locking issue, and also drop the packet in
      case of a failed check - noted by Jason Wang <jasowang@redhat.com>

 drivers/net/tun.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
  

Comments

David Ahern Aug. 2, 2023, 1:07 a.m. UTC | #1
On 8/1/23 4:07 PM, Andrew Kanner wrote:
> @@ -1594,7 +1597,13 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
>  	if (zerocopy)
>  		return false;
>  
> -	if (SKB_DATA_ALIGN(len + TUN_RX_PAD) +
> +	rcu_read_lock();
> +	xdp_prog = rcu_dereference(tun->xdp_prog);
> +	if (xdp_prog)
> +		pad += XDP_PACKET_HEADROOM;
> +	rcu_read_unlock();


since you do not care about the actual xdp_prog (only that one is set) I
believe you can use rcu_access_pointer here.
  
Andrew Kanner Aug. 2, 2023, 12:16 p.m. UTC | #2
On Tue, Aug 01, 2023 at 07:07:39PM -0600, David Ahern wrote:
> On 8/1/23 4:07 PM, Andrew Kanner wrote:
> > @@ -1594,7 +1597,13 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
> >  	if (zerocopy)
> >  		return false;
> >  
> > -	if (SKB_DATA_ALIGN(len + TUN_RX_PAD) +
> > +	rcu_read_lock();
> > +	xdp_prog = rcu_dereference(tun->xdp_prog);
> > +	if (xdp_prog)
> > +		pad += XDP_PACKET_HEADROOM;
> > +	rcu_read_unlock();
> 
> 
> since you do not care about the actual xdp_prog (only that one is set) I
> believe you can use rcu_access_pointer here.

Good point. Thanks, David.

I'll resend both as v5.

The correct cc-list for PATCH 2/2 is also needed. It fixes
net/core/filter.c instead of drivers/net/tun.c now.

pw-bot: changes-requested
  
Jesper Dangaard Brouer Aug. 2, 2023, 2:13 p.m. UTC | #3
On 02/08/2023 00.07, Andrew Kanner wrote:
> Using the syzkaller repro with reduced packet size it was discovered
> that XDP_PACKET_HEADROOM is not checked in tun_can_build_skb(),
> although pad may be incremented in tun_build_skb(). This may end up
> with exceeding the PAGE_SIZE limit in tun_build_skb().
> 
> Fixes: 7df13219d757 ("tun: reserve extra headroom only when XDP is set")
> Link: https://syzkaller.appspot.com/bug?extid=f817490f5bd20541b90a
> Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>
> ---
> 
> Notes:
>      v3 -> v4:
>      * fall back to v1, fixing only missing XDP_PACKET_HEADROOM in pad and
>        removing bpf_xdp_adjust_tail() check for frame_sz.
>      * added rcu read lock, noted by Jason Wang <jasowang@redhat.com> in v1
>      * I decided to leave the packet length check in tun_can_build_skb()
>        instead of moving to tun_build_skb() suggested by Jason Wang
>        <jasowang@redhat.com>. Otherwise extra packets will be dropped
>        without falling back to tun_alloc_skb(). And in the discussion of v3
>        Jesper Dangaard Brouer <jbrouer@redhat.com> noticed that XDP is ok
>        with a higher order pages if it's a contiguous physical memory
>        allocation, so falling to tun_alloc_skb() -> do_xdp_generic() should
>        be ok.
>      
>      v2 -> v3:
>      * attach the forgotten changelog
>      
>      v1 -> v2:
>      * merged 2 patches in 1, fixing both issues: WARN_ON_ONCE with
>        syzkaller repro and missing XDP_PACKET_HEADROOM in pad
>      * changed the title and description of the execution path, suggested
>        by Jason Wang <jasowang@redhat.com>
>      * move the limit check from tun_can_build_skb() to tun_build_skb() to
>        remove duplication and locking issue, and also drop the packet in
>        case of a failed check - noted by Jason Wang <jasowang@redhat.com>
> 
>   drivers/net/tun.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index d75456adc62a..a1d04bc9485f 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1582,6 +1582,9 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
>   static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
>   			      int len, int noblock, bool zerocopy)
>   {
> +	struct bpf_prog *xdp_prog;
> +	int pad = TUN_RX_PAD;
> +
>   	if ((tun->flags & TUN_TYPE_MASK) != IFF_TAP)
>   		return false;
>   
> @@ -1594,7 +1597,13 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
>   	if (zerocopy)
>   		return false;
>   
> -	if (SKB_DATA_ALIGN(len + TUN_RX_PAD) +
> +	rcu_read_lock();
> +	xdp_prog = rcu_dereference(tun->xdp_prog);
> +	if (xdp_prog)
> +		pad += XDP_PACKET_HEADROOM;
> +	rcu_read_unlock();
> +

Isolated seen, I guess, this is a correct fix to 7df13219d757.

> +	if (SKB_DATA_ALIGN(len + pad) +
>   	    SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) > PAGE_SIZE)
>   		return false;
>   

Question to Jason Wang:
Why fall back (to e.g. tun_alloc_skb()) when size is above PAGE_SIZE?

AFAIK tun_build_skb() *can* create get larger packets than PAGE_SIZE
from it's page_frag.  Is there a reason for this limitation?

(To Andrew, I assume a change in this area is another patch).

--Jesper
  
Jason Wang Aug. 3, 2023, 3:19 a.m. UTC | #4
On Wed, Aug 2, 2023 at 10:14 PM Jesper Dangaard Brouer <hawk@kernel.org> wrote:
>
>
>
> On 02/08/2023 00.07, Andrew Kanner wrote:
> > Using the syzkaller repro with reduced packet size it was discovered
> > that XDP_PACKET_HEADROOM is not checked in tun_can_build_skb(),
> > although pad may be incremented in tun_build_skb(). This may end up
> > with exceeding the PAGE_SIZE limit in tun_build_skb().
> >
> > Fixes: 7df13219d757 ("tun: reserve extra headroom only when XDP is set")
> > Link: https://syzkaller.appspot.com/bug?extid=f817490f5bd20541b90a
> > Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>
> > ---
> >
> > Notes:
> >      v3 -> v4:
> >      * fall back to v1, fixing only missing XDP_PACKET_HEADROOM in pad and
> >        removing bpf_xdp_adjust_tail() check for frame_sz.
> >      * added rcu read lock, noted by Jason Wang <jasowang@redhat.com> in v1
> >      * I decided to leave the packet length check in tun_can_build_skb()
> >        instead of moving to tun_build_skb() suggested by Jason Wang
> >        <jasowang@redhat.com>. Otherwise extra packets will be dropped
> >        without falling back to tun_alloc_skb(). And in the discussion of v3
> >        Jesper Dangaard Brouer <jbrouer@redhat.com> noticed that XDP is ok
> >        with a higher order pages if it's a contiguous physical memory
> >        allocation, so falling to tun_alloc_skb() -> do_xdp_generic() should
> >        be ok.
> >
> >      v2 -> v3:
> >      * attach the forgotten changelog
> >
> >      v1 -> v2:
> >      * merged 2 patches in 1, fixing both issues: WARN_ON_ONCE with
> >        syzkaller repro and missing XDP_PACKET_HEADROOM in pad
> >      * changed the title and description of the execution path, suggested
> >        by Jason Wang <jasowang@redhat.com>
> >      * move the limit check from tun_can_build_skb() to tun_build_skb() to
> >        remove duplication and locking issue, and also drop the packet in
> >        case of a failed check - noted by Jason Wang <jasowang@redhat.com>
> >
> >   drivers/net/tun.c | 11 ++++++++++-
> >   1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index d75456adc62a..a1d04bc9485f 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -1582,6 +1582,9 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
> >   static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
> >                             int len, int noblock, bool zerocopy)
> >   {
> > +     struct bpf_prog *xdp_prog;
> > +     int pad = TUN_RX_PAD;
> > +
> >       if ((tun->flags & TUN_TYPE_MASK) != IFF_TAP)
> >               return false;
> >
> > @@ -1594,7 +1597,13 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
> >       if (zerocopy)
> >               return false;
> >
> > -     if (SKB_DATA_ALIGN(len + TUN_RX_PAD) +
> > +     rcu_read_lock();
> > +     xdp_prog = rcu_dereference(tun->xdp_prog);
> > +     if (xdp_prog)
> > +             pad += XDP_PACKET_HEADROOM;
> > +     rcu_read_unlock();
> > +
>
> Isolated seen, I guess, this is a correct fix to 7df13219d757.

I think so.

Actually, I think we can probably always count XDP_PACKET_HEADROOM
here. Since there's a window that XDP program might be attached in the
middle of tun_can_build_skb() and tun_build_skb().

>
> > +     if (SKB_DATA_ALIGN(len + pad) +
> >           SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) > PAGE_SIZE)
> >               return false;
> >
>
> Question to Jason Wang:
> Why fall back (to e.g. tun_alloc_skb()) when size is above PAGE_SIZE?
>
> AFAIK tun_build_skb() *can* create get larger packets than PAGE_SIZE
> from it's page_frag.  Is there a reason for this limitation?

I couldn't recall but I think we can relax.

Thanks

>
> (To Andrew, I assume a change in this area is another patch).
>
> --Jesper
>
>
>
  
Andrew Kanner Aug. 3, 2023, 5:53 p.m. UTC | #5
On Thu, Aug 03, 2023 at 11:19:47AM +0800, Jason Wang wrote:
> > > @@ -1594,7 +1597,13 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
> > >       if (zerocopy)
> > >               return false;
> > >
> > > -     if (SKB_DATA_ALIGN(len + TUN_RX_PAD) +
> > > +     rcu_read_lock();
> > > +     xdp_prog = rcu_dereference(tun->xdp_prog);
> > > +     if (xdp_prog)
> > > +             pad += XDP_PACKET_HEADROOM;
> > > +     rcu_read_unlock();
> > > +
> >
> > Isolated seen, I guess, this is a correct fix to 7df13219d757.
> 
> I think so.
> 
> Actually, I think we can probably always count XDP_PACKET_HEADROOM
> here. Since there's a window that XDP program might be attached in the
> middle of tun_can_build_skb() and tun_build_skb().


Thanks, that makes sense.
I will do it in v5.


> > Question to Jason Wang:
> > Why fall back (to e.g. tun_alloc_skb()) when size is above PAGE_SIZE?
> >
> > AFAIK tun_build_skb() *can* create get larger packets than PAGE_SIZE
> > from it's page_frag.  Is there a reason for this limitation?
> 
> I couldn't recall but I think we can relax.

Jesper already sent enough info for this idea in v2, I will use it for
the next patch/series.

Jesper, I will add this tag for this next patch/series if you don't
mind:
Suggested-by: Jesper Dangaard Brouer <hawk@kernel.org>
  
Jesper Dangaard Brouer Aug. 4, 2023, 9:30 a.m. UTC | #6
On 03/08/2023 19.53, Andrew Kanner wrote:
>>> Question to Jason Wang:
>>> Why fall back (to e.g. tun_alloc_skb()) when size is above PAGE_SIZE?
>>>
>>> AFAIK tun_build_skb()*can*  create get larger packets than PAGE_SIZE
>>> from it's page_frag.  Is there a reason for this limitation?
>> >> I couldn't recall but I think we can relax.
> > Jesper already sent enough info for this idea in v2, I will use it for
> the next patch/series.
> 

I have some more input and considerations when selecting the new
constant that replace PAGE_SIZE.

Lets see if Eric Dumazet or Alex Duyck disagree?
("inventors" of page_frag scheme)

The function tun_alloc_skb() uses a page_frag scheme for allocation.
The maximim size is 32768 bytes (Order-3), but using something that is
close to this max alloc size can cause memory waste and fragmentation.

My suggestion would be to use the constant SKB_MAX_ALLOC (16KiB).

Maybe Eric or Alex would recommend using something smaller? (e.g. 8192)

page_frag limit comes from:
  #define SKB_FRAG_PAGE_ORDER get_order(32768)


> Jesper, I will add this tag for this next patch/series if you don't
> mind:
> Suggested-by: Jesper Dangaard Brouer <hawk@kernel.org>

ACK
  

Patch

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index d75456adc62a..a1d04bc9485f 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1582,6 +1582,9 @@  static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
 			      int len, int noblock, bool zerocopy)
 {
+	struct bpf_prog *xdp_prog;
+	int pad = TUN_RX_PAD;
+
 	if ((tun->flags & TUN_TYPE_MASK) != IFF_TAP)
 		return false;
 
@@ -1594,7 +1597,13 @@  static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
 	if (zerocopy)
 		return false;
 
-	if (SKB_DATA_ALIGN(len + TUN_RX_PAD) +
+	rcu_read_lock();
+	xdp_prog = rcu_dereference(tun->xdp_prog);
+	if (xdp_prog)
+		pad += XDP_PACKET_HEADROOM;
+	rcu_read_unlock();
+
+	if (SKB_DATA_ALIGN(len + pad) +
 	    SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) > PAGE_SIZE)
 		return false;