duplicate requests on host side while streaming via uvcvideo gadget

Message ID ZcFx7P30Su_Mx4AV@pengutronix.de
State New
Headers
Series duplicate requests on host side while streaming via uvcvideo gadget |

Commit Message

Michael Grzeschik Feb. 5, 2024, 11:40 p.m. UTC
  Hi Thinh

I found some strange situation while streaming via uvc-gadget to some
usb host. It happens when some requests are missed due to higher load on
the gadget machine. In some cases some requests will reach the host
twice. In my special case, I added the following changes [1] for the
host and gadget side.

When applying the patches you will find all requests marked as errors on
the host and gadget side reported. However, the odd thing is that the
error counter on the host side will rise higher than the number of
requests we have actually marked as errornous on the gadget side. You
check the number of errors found on the host by looking in the
statistics and compare it with the numer of requests that are actually
marked with UVC_STREAM_ERR.

[1] applied hunks:

[ 1769.213387] error on uvc package!
[ 1769.213396] PTS: 16
[ 1769.213400] SCR: 64
[ 1769.213402] SCR: 229

[ 1769.461386] error on uvc package!
[ 1769.461394] PTS: 96
[ 1769.461398] SCR: 80
[ 1769.461401] SCR: 33

[ 1769.461405] error on uvc package! <- duplicate
[ 1769.461408] PTS: 96
[ 1769.461410] SCR: 80
[ 1769.461413] SCR: 33

[ 1769.657405] error on uvc package!
[ 1769.657442] PTS: 224
[ 1769.657449] SCR: 64
[ 1769.657453] SCR: 81

[ 1769.657460] error on uvc package! <- duplicate
[ 1769.657465] PTS: 224
[ 1769.657470] SCR: 64
[ 1769.657476] SCR: 81

[ 1779.525368] error on uvc package!
[ 1779.525374] PTS: 128
[ 1779.525378] SCR: 224
[ 1779.525380] SCR: 157

[ 1784.637359] error on uvc package!
[ 1784.637367] PTS: 0
[ 1784.637371] SCR: 208
[ 1784.637374] SCR: 89

[ 1784.825357] error on uvc package!
[ 1784.825394] PTS: 224
[ 1784.825401] SCR: 192
[ 1784.825406] SCR: 63

[ 1784.841362] error on uvc package!
[ 1784.841394] PTS: 144
[ 1784.841403] SCR: 48
[ 1784.841410] SCR: 186

[ 1784.841418] error on uvc package! <- duplicate
[ 1784.841424] PTS: 144
[ 1784.841430] SCR: 48
[ 1784.841436] SCR: 186

host$ grep errors /sys/kernel/debug/usb/uvcvideo/*/stats
/sys/kernel/debug/usb/uvcvideo/4-81-1/stats:errors:  10


-- Gadget:

[  126.826517] dropping frame! 1
[  126.829658] PTS: 16
[  126.831761] SCR: 64
[  126.833858] SCR: 229

[  127.090069] dropping frame! 2
[  127.093059] PTS: 96
[  127.095164] SCR: 80
[  127.097261] SCR: 33

[  127.288045] dropping frame! 3
[  127.291041] PTS: 224
[  127.293233] SCR: 64
[  127.295330] SCR: 81

[  137.153499] dropping frame! 4
[  137.156494] PTS: 128
[  137.158687] SCR: 224
[  137.160871] SCR: 157

[  142.265135] dropping frame! 5
[  142.268131] PTS: 0
[  142.270148] SCR: 208
[  142.272332] SCR: 89

[  142.453636] dropping frame! 6
[  142.456634] PTS: 224
[  142.458825] SCR: 192
[  142.461009] SCR: 63

[  142.469131] dropping frame! 7
[  142.472118] PTS: 144
[  142.474310] SCR: 48
[  142.476407] SCR: 186

Now I am totally unsure what could cause such error, but would expect
the issue to be somewhere in the gadget driver and the mapped trb memory
content. For the uvc_video layer, I compared and tested the enqueued
request list for duplicates but could not find any. I also reverted all
recent patches that changed request handling in the past year. I still
find these request duplicates on the host side show up.

Any Ideas?

Regards,
Michael
  

Comments

Thinh Nguyen Feb. 6, 2024, 3:23 a.m. UTC | #1
On Tue, Feb 06, 2024, Michael Grzeschik wrote:
> Hi Thinh
> 
> I found some strange situation while streaming via uvc-gadget to some
> usb host. It happens when some requests are missed due to higher load on
> the gadget machine. In some cases some requests will reach the host
> twice. In my special case, I added the following changes [1] for the
> host and gadget side.

Does this only happen to some specific hosts?

Are all the data of the duplicate requests matching or just some bits of
the transfer? Were you able to confirm from some usb analyzer/sniffer
that the data out the wire is actually duplicate?

> 
> When applying the patches you will find all requests marked as errors on
> the host and gadget side reported. However, the odd thing is that the
> error counter on the host side will rise higher than the number of
> requests we have actually marked as errornous on the gadget side. You
> check the number of errors found on the host by looking in the
> statistics and compare it with the numer of requests that are actually
> marked with UVC_STREAM_ERR.
> 

[snip]

> 
> -- Host:
> 
> [ 1769.213387] error on uvc package!
> [ 1769.213396] PTS: 16
> [ 1769.213400] SCR: 64
> [ 1769.213402] SCR: 229
> 
> [ 1769.461386] error on uvc package!
> [ 1769.461394] PTS: 96
> [ 1769.461398] SCR: 80
> [ 1769.461401] SCR: 33
> 
> [ 1769.461405] error on uvc package! <- duplicate
> [ 1769.461408] PTS: 96
> [ 1769.461410] SCR: 80
> [ 1769.461413] SCR: 33
> 
> [ 1769.657405] error on uvc package!
> [ 1769.657442] PTS: 224
> [ 1769.657449] SCR: 64
> [ 1769.657453] SCR: 81
> 
> [ 1769.657460] error on uvc package! <- duplicate
> [ 1769.657465] PTS: 224
> [ 1769.657470] SCR: 64
> [ 1769.657476] SCR: 81
> 
> [ 1779.525368] error on uvc package!
> [ 1779.525374] PTS: 128
> [ 1779.525378] SCR: 224
> [ 1779.525380] SCR: 157
> 
> [ 1784.637359] error on uvc package!
> [ 1784.637367] PTS: 0
> [ 1784.637371] SCR: 208
> [ 1784.637374] SCR: 89
> 
> [ 1784.825357] error on uvc package!
> [ 1784.825394] PTS: 224
> [ 1784.825401] SCR: 192
> [ 1784.825406] SCR: 63
> 
> [ 1784.841362] error on uvc package!
> [ 1784.841394] PTS: 144
> [ 1784.841403] SCR: 48
> [ 1784.841410] SCR: 186
> 
> [ 1784.841418] error on uvc package! <- duplicate
> [ 1784.841424] PTS: 144
> [ 1784.841430] SCR: 48
> [ 1784.841436] SCR: 186
> 
> host$ grep errors /sys/kernel/debug/usb/uvcvideo/*/stats
> /sys/kernel/debug/usb/uvcvideo/4-81-1/stats:errors:  10
> 
> 
> -- Gadget:
> 
> [  126.826517] dropping frame! 1
> [  126.829658] PTS: 16
> [  126.831761] SCR: 64
> [  126.833858] SCR: 229
> 
> [  127.090069] dropping frame! 2
> [  127.093059] PTS: 96
> [  127.095164] SCR: 80
> [  127.097261] SCR: 33
> 
> [  127.288045] dropping frame! 3
> [  127.291041] PTS: 224
> [  127.293233] SCR: 64
> [  127.295330] SCR: 81
> 
> [  137.153499] dropping frame! 4
> [  137.156494] PTS: 128
> [  137.158687] SCR: 224
> [  137.160871] SCR: 157
> 
> [  142.265135] dropping frame! 5
> [  142.268131] PTS: 0
> [  142.270148] SCR: 208
> [  142.272332] SCR: 89
> 
> [  142.453636] dropping frame! 6
> [  142.456634] PTS: 224
> [  142.458825] SCR: 192
> [  142.461009] SCR: 63
> 
> [  142.469131] dropping frame! 7
> [  142.472118] PTS: 144
> [  142.474310] SCR: 48
> [  142.476407] SCR: 186
> 
> Now I am totally unsure what could cause such error, but would expect
> the issue to be somewhere in the gadget driver and the mapped trb memory
> content. For the uvc_video layer, I compared and tested the enqueued
> request list for duplicates but could not find any. I also reverted all
> recent patches that changed request handling in the past year. I still
> find these request duplicates on the host side show up.
> 
> Any Ideas?
> 

I'm not familiar with UVC, can you give more context on your validation
and error count? What's PTS/SCR? Is the device sending some error status
back as new request whenever there's -EXDEV?

Thanks,
Thinh
  
Michael Grzeschik Feb. 6, 2024, 9:30 p.m. UTC | #2
On Tue, Feb 06, 2024 at 03:23:17AM +0000, Thinh Nguyen wrote:
>On Tue, Feb 06, 2024, Michael Grzeschik wrote:
>> Hi Thinh
>>
>> I found some strange situation while streaming via uvc-gadget to some
>> usb host. It happens when some requests are missed due to higher load on
>> the gadget machine. In some cases some requests will reach the host
>> twice. In my special case, I added the following changes [1] for the
>> host and gadget side.
>
>Does this only happen to some specific hosts?

>Are all the data of the duplicate requests matching or just some bits of
>the transfer? Were you able to confirm from some usb analyzer/sniffer
>that the data out the wire is actually duplicate?

Turns out, this duplicates are just misinterpretations.

The linux uvcvideo driver will parse the uvc header payload twice. (If
the FID was incremented inbetween). This led to those double misleading
outputs. Although this means that there is a bug in
uvc_video_stats_decode for incrementing the error count.

Anyway, so just ignore this whole thread and be thanked for even having any
thoghts on this.

Regards,
Michael
  
Kieran Bingham Feb. 7, 2024, 1:04 p.m. UTC | #3
Hi Michael,

Quoting Michael Grzeschik (2024-02-06 21:30:17)
> On Tue, Feb 06, 2024 at 03:23:17AM +0000, Thinh Nguyen wrote:
> >On Tue, Feb 06, 2024, Michael Grzeschik wrote:
> >> Hi Thinh
> >>
> >> I found some strange situation while streaming via uvc-gadget to some
> >> usb host. It happens when some requests are missed due to higher load on
> >> the gadget machine. In some cases some requests will reach the host
> >> twice. In my special case, I added the following changes [1] for the
> >> host and gadget side.
> >
> >Does this only happen to some specific hosts?
> 
> >Are all the data of the duplicate requests matching or just some bits of
> >the transfer? Were you able to confirm from some usb analyzer/sniffer
> >that the data out the wire is actually duplicate?
> 
> Turns out, this duplicates are just misinterpretations.

I'm glad there's no deeper issue to worry about here.

> The linux uvcvideo driver will parse the uvc header payload twice. (If
> the FID was incremented inbetween). This led to those double misleading
> outputs. Although this means that there is a bug in
> uvc_video_stats_decode for incrementing the error count.

Do you plan/are you able to submit a patch to fix this?  Hopefully that
would prevent anyone else following the same rabbit-hole.

--
Kieran


> 
> Anyway, so just ignore this whole thread and be thanked for even having any
> thoghts on this.
> 
> Regards,
> Michael
> 
> -- 
> Pengutronix e.K.                           |                             |
> Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
> 31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
  
Michael Grzeschik Feb. 7, 2024, 1:10 p.m. UTC | #4
On Wed, Feb 07, 2024 at 01:04:48PM +0000, Kieran Bingham wrote:
>Hi Michael,
>
>Quoting Michael Grzeschik (2024-02-06 21:30:17)
>> On Tue, Feb 06, 2024 at 03:23:17AM +0000, Thinh Nguyen wrote:
>> >On Tue, Feb 06, 2024, Michael Grzeschik wrote:
>> >> Hi Thinh
>> >>
>> >> I found some strange situation while streaming via uvc-gadget to some
>> >> usb host. It happens when some requests are missed due to higher load on
>> >> the gadget machine. In some cases some requests will reach the host
>> >> twice. In my special case, I added the following changes [1] for the
>> >> host and gadget side.
>> >
>> >Does this only happen to some specific hosts?
>>
>> >Are all the data of the duplicate requests matching or just some bits of
>> >the transfer? Were you able to confirm from some usb analyzer/sniffer
>> >that the data out the wire is actually duplicate?
>>
>> Turns out, this duplicates are just misinterpretations.
>
>I'm glad there's no deeper issue to worry about here.
>
>> The linux uvcvideo driver will parse the uvc header payload twice. (If
>> the FID was incremented inbetween). This led to those double misleading
>> outputs. Although this means that there is a bug in
>> uvc_video_stats_decode for incrementing the error count.
>
>Do you plan/are you able to submit a patch to fix this?  Hopefully that
>would prevent anyone else following the same rabbit-hole.

Sure. For now I came up with this changes. This leaves the errorcounts
in a sane range. However with this compiled kernel I saw some issues
with the host machine to freeze up. I will just ensure that this is not
caused by this particular patch.

Regards,
Michael

diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index 40eafe43d1888..b582698be7f00 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1094,6 +1094,29 @@ static int uvc_video_decode_start(struct uvc_streaming *stream,
                         uvc_video_stats_update(stream);
         }

+       /*
+        * Mark the buffer as done if we're at the beginning of a new frame.
+        * End of frame detection is better implemented by checking the EOF
+        * bit (FID bit toggling is delayed by one frame compared to the EOF
+        * bit), but some devices don't set the bit at end of frame (and the
+        * last payload can be lost anyway). We thus must check if the FID has
+        * been toggled.
+        *
+        * stream->last_fid is initialized to -1, so the first isochronous
+        * frame will never trigger an end of frame detection.
+        *
+        * Empty buffers (bytesused == 0) don't trigger end of frame detection
+        * as it doesn't make sense to return an empty buffer. This also
+        * avoids detecting end of frame conditions at FID toggling if the
+        * previous payload had the EOF bit set.
+        */
+       if (fid != stream->last_fid && buf->bytesused != 0) {
+               uvc_dbg(stream->dev, FRAME,
+                       "Frame complete (FID bit toggled)\n");
+               buf->state = UVC_BUF_STATE_READY;
+               return -EAGAIN;
+       }
+
         uvc_video_clock_decode(stream, buf, data, len);
         uvc_video_stats_decode(stream, data, len);

@@ -1140,29 +1163,6 @@ static int uvc_video_decode_start(struct uvc_streaming *stream,
                 buf->state = UVC_BUF_STATE_ACTIVE;
         }

-       /*
-        * Mark the buffer as done if we're at the beginning of a new frame.
-        * End of frame detection is better implemented by checking the EOF
-        * bit (FID bit toggling is delayed by one frame compared to the EOF
-        * bit), but some devices don't set the bit at end of frame (and the
-        * last payload can be lost anyway). We thus must check if the FID has
-        * been toggled.
-        *
-        * stream->last_fid is initialized to -1, so the first isochronous
-        * frame will never trigger an end of frame detection.
-        *
-        * Empty buffers (bytesused == 0) don't trigger end of frame detection
-        * as it doesn't make sense to return an empty buffer. This also
-        * avoids detecting end of frame conditions at FID toggling if the
-        * previous payload had the EOF bit set.
-        */
-       if (fid != stream->last_fid && buf->bytesused != 0) {
-               uvc_dbg(stream->dev, FRAME,
-                       "Frame complete (FID bit toggled)\n");
-               buf->state = UVC_BUF_STATE_READY;
-               return -EAGAIN;
-       }
-
         stream->last_fid = fid;

         return data[0];
  

Patch

diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h
index 6bdcf3bdd62a9..63515fc949880 100644
--- a/drivers/usb/gadget/function/uvc.h
+++ b/drivers/usb/gadget/function/uvc.h
@@ -91,6 +91,8 @@  struct uvc_video {
         struct work_struct pump;
         struct workqueue_struct *async_wq;

+       int errorcount;
+
         /* Frame parameters */
         u8 bpp;
         u32 fcc;

diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
index eb1f7cee4a0af..f45dd53fde7ef 100644
--- a/drivers/usb/gadget/function/uvc_video.c
+++ b/drivers/usb/gadget/function/uvc_video.c
@@ -35,6 +35,12 @@  uvc_video_encode_header(struct uvc_video *video, struct uvc_buffer *buf,

         data[1] = UVC_STREAM_EOH | video->fid;

+       if (video->queue.flags & UVC_QUEUE_DROP_INCOMPLETE) {
+               video->errorcount++;
+               printk("dropping frame! %d\n", video->errorcount);
+               data[1] |= UVC_STREAM_ERR;
+       }
+
         if (video->queue.buf_used == 0 && ts.tv_sec) {
                 /* dwClockFrequency is 48 MHz */
                 u32 pts = ((u64)ts.tv_sec * USEC_PER_SEC + ts.tv_nsec / NSEC_PER_USEC) * 48;

@@ -47,6 +47,8 @@  uvc_video_encode_header(struct uvc_video *video, struct uvc_buffer *buf,
  
                 data[1] |= UVC_STREAM_PTS;
                 put_unaligned_le32(pts, &data[pos]);
+               if (data[1] & UVC_STREAM_ERR)
+                       trace_printk("PTS: %u\n", data[pos]);
                 pos += 4;
         }
  
@@ -60,6 +62,10 @@  uvc_video_encode_header(struct uvc_video *video, struct uvc_buffer *buf,
                 data[1] |= UVC_STREAM_SCR;
                 put_unaligned_le32(stc, &data[pos]);
                 put_unaligned_le16(sof, &data[pos+4]);
+               if (data[1] & UVC_STREAM_ERR) {
+                       trace_printk("SCR: %u\n", data[pos]);
+                       trace_printk("SCR: %hu\n", data[pos+4]);
+               }
                 pos += 6;
         }
  

@@ -731,6 +734,7 @@  int uvcg_video_enable(struct uvc_video *video)
         struct usb_gadget *gadget = cdev->gadget;
         int ret;

+       video->errorcount = 0;
         if (video->ep == NULL) {
                 uvcg_info(&video->uvc->func,
                           "Video enable failed, device is uninitialized.\n");

diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index 28dde08ec6c5d..40eafe43d1888 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -916,8 +916,24 @@  static void uvc_video_stats_decode(struct uvc_streaming *stream,
         if (len <= header_size)
                 stream->stats.frame.nb_empty++;

-       if (data[1] & UVC_STREAM_ERR)
+       if (data[1] & UVC_STREAM_ERR) {
                 stream->stats.frame.nb_errors++;
+               printk("error on uvc package!\n");
+               if (data[1] & UVC_STREAM_PTS) {
+                       printk("PTS: %u\n", data[2]);
+                       if (data[1] & UVC_STREAM_SCR) {
+                               printk("SCR: %u\n", data[6]);
+                               printk("SCR: %hu\n", data[10]);
+                       }
+               } else {
+                       if (data[1] & UVC_STREAM_SCR) {
+                               printk("SCR: %u\n", data[2]);
+                               printk("SCR: %hu\n", data[6]);
+                       }
+               }
+
+
+       }
  }

-- Host: