kdb: Handle LF in the command parser

Message ID 20230628125612.1.I5cc6c3d916195f5bcfdf5b75d823f2037707f5dc@changeid
State New
Headers
Series kdb: Handle LF in the command parser |

Commit Message

Doug Anderson June 28, 2023, 7:56 p.m. UTC
  The main kdb command parser only handles CR (ASCII 13 AKA '\r') today,
but not LF (ASCII 10 AKA '\n'). That means that the kdb command parser
can handle terminals that send just CR or that send CR+LF but can't
handle terminals that send just LF.

The fact that kdb didn't handle LF in the command parser tripped up a
tool I tried to use with it. Specifically, I was trying to send a
command to my device to resume it from kdb using a ChromeOS tool like:
  dut-control cpu_uart_cmd:"g"
That tool only terminates lines with LF, not CR+LF.

Arguably the ChromeOS tool should be fixed. After all, officially kdb
seems to be designed such that CR+LF is the official line ending
transmitted over the wire and that internally a line ending is just
'\n' (LF). Some evidence:
* uart_poll_put_char(), which is used by kdb, notices a '\n' and
  converts it to '\r\n'.
* kdb functions specifically use '\r' to get a carriage return without
  a newline. You can see this in the pager where kdb will write a '\r'
  and then write over the pager prompt.

However, all that being said there's no real harm in accepting LF as a
command terminator in the kdb parser and doing so seems like it would
improve compatibility. After this, I'd expect that things would work
OK-ish with a remote terminal that used any of CR, CR+LF, or LF as a
line ending. Someone using CR as a line ending might get some ugliness
where kdb wasn't able to overwrite the last line, but basic commands
would work. Someone using just LF as a line ending would probably also
work OK.

A few other notes:
- It can be noted that "bash" running on an "agetty" handles LF as a
  line termination with no complaints.
- Historically, kdb's "pager" actually handled either CR or LF fine. A
  very quick inspection would make one think that kdb's pager actually
  could have paged down two lines instead of one for anyone using
  CR+LF, but this is generally avoided because of kdb_input_flush().
- Conceivably one could argue that some of this special case logic
  belongs in uart_poll_get_char() since uart_poll_put_char() handles
  the '\n' => '\r\n' conversion. I would argue that perhaps we should
  eventually do the opposite and move the '\n' => '\r\n' out of
  uart_poll_put_char(). Having that conversion at such a low level
  could interfere if we ever want to transfer binary data. In
  addition, if we truly made uart_poll_get_char() the inverse of
  uart_poll_put_char() it would convert back to '\n' and (ironically)
  kdb's parser currently only looks for '\r' to find the end of a
  command.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 kernel/debug/kdb/kdb_io.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)
  

Comments

Daniel Thompson June 29, 2023, 4:48 p.m. UTC | #1
On Wed, Jun 28, 2023 at 12:56:17PM -0700, Douglas Anderson wrote:
> The main kdb command parser only handles CR (ASCII 13 AKA '\r') today,
> but not LF (ASCII 10 AKA '\n'). That means that the kdb command parser
> can handle terminals that send just CR or that send CR+LF but can't
> handle terminals that send just LF.
>
> The fact that kdb didn't handle LF in the command parser tripped up a
> tool I tried to use with it. Specifically, I was trying to send a
> command to my device to resume it from kdb using a ChromeOS tool like:
>   dut-control cpu_uart_cmd:"g"
> That tool only terminates lines with LF, not CR+LF.
>
> Arguably the ChromeOS tool should be fixed. After all, officially kdb
> seems to be designed such that CR+LF is the official line ending
> transmitted over the wire and that internally a line ending is just
> '\n' (LF). Some evidence:
> * uart_poll_put_char(), which is used by kdb, notices a '\n' and
>   converts it to '\r\n'.
> * kdb functions specifically use '\r' to get a carriage return without
>   a newline. You can see this in the pager where kdb will write a '\r'
>   and then write over the pager prompt.

I'd view this as simply "what you have to do drive a terminal correctly"
rather than indicating what the official newline protocol for kdb is.
With a human and a terminal emulator I would expect the typical input to
be CR-only (and that's what I setup the test suite to send ;-)).


> However, all that being said there's no real harm in accepting LF as a
> command terminator in the kdb parser and doing so seems like it would
> improve compatibility. After this, I'd expect that things would work
> OK-ish with a remote terminal that used any of CR, CR+LF, or LF as a
> line ending. Someone using CR as a line ending might get some ugliness
> where kdb wasn't able to overwrite the last line, but basic commands
> would work. Someone using just LF as a line ending would probably also
> work OK.
>
> A few other notes:
> - It can be noted that "bash" running on an "agetty" handles LF as a
>   line termination with no complaints.
> - Historically, kdb's "pager" actually handled either CR or LF fine. A
>   very quick inspection would make one think that kdb's pager actually
>   could have paged down two lines instead of one for anyone using
>   CR+LF, but this is generally avoided because of kdb_input_flush().

These are very convincing though!

> - Conceivably one could argue that some of this special case logic
>   belongs in uart_poll_get_char() since uart_poll_put_char() handles
>   the '\n' => '\r\n' conversion. I would argue that perhaps we should
>   eventually do the opposite and move the '\n' => '\r\n' out of
>   uart_poll_put_char(). Having that conversion at such a low level
>   could interfere if we ever want to transfer binary data. In
>   addition, if we truly made uart_poll_get_char() the inverse of
>   uart_poll_put_char() it would convert back to '\n' and (ironically)
>   kdb's parser currently only looks for '\r' to find the end of a
>   command.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

This arrived just as I was gathering up the patches (I know... running
late). I've added a couple of cases to the test suite to cover the
new feature.

The code looks good to me. Are you happy for me to take it this
merge cycle?


Daniel.

> ---
>
>  kernel/debug/kdb/kdb_io.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 5c7e9ba7cd6b..813cb6cf72d6 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -131,6 +131,7 @@ char kdb_getchar(void)
>  	int escape_delay = 0;
>  	get_char_func *f, *f_prev = NULL;
>  	int key;
> +	static bool last_char_was_cr;
>
>  	for (f = &kdb_poll_funcs[0]; ; ++f) {
>  		if (*f == NULL) {
> @@ -149,6 +150,18 @@ char kdb_getchar(void)
>  			continue;
>  		}
>
> +		/*
> +		 * The caller expects that newlines are either CR or LF. However
> +		 * some terminals send _both_ CR and LF. Avoid having to handle
> +		 * this in the caller by stripping the LF if we saw a CR right
> +		 * before.
> +		 */
> +		if (last_char_was_cr && key == '\n') {
> +			last_char_was_cr = false;
> +			continue;
> +		}
> +		last_char_was_cr = (key == '\r');
> +
>  		/*
>  		 * When the first character is received (or we get a change
>  		 * input source) we set ourselves up to handle an escape
> @@ -244,7 +257,8 @@ static char *kdb_read(char *buffer, size_t bufsize)
>  			*cp = tmp;
>  		}
>  		break;
> -	case 13: /* enter */
> +	case 10: /* linefeed */
> +	case 13: /* carriage return */
>  		*lastchar++ = '\n';
>  		*lastchar++ = '\0';
>  		if (!KDB_STATE(KGDB_TRANS)) {
> --
> 2.41.0.162.gfafddb0af9-goog
>
  
Doug Anderson June 29, 2023, 5:49 p.m. UTC | #2
Hi,

On Thu, Jun 29, 2023 at 9:48 AM Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> On Wed, Jun 28, 2023 at 12:56:17PM -0700, Douglas Anderson wrote:
> > The main kdb command parser only handles CR (ASCII 13 AKA '\r') today,
> > but not LF (ASCII 10 AKA '\n'). That means that the kdb command parser
> > can handle terminals that send just CR or that send CR+LF but can't
> > handle terminals that send just LF.
> >
> > The fact that kdb didn't handle LF in the command parser tripped up a
> > tool I tried to use with it. Specifically, I was trying to send a
> > command to my device to resume it from kdb using a ChromeOS tool like:
> >   dut-control cpu_uart_cmd:"g"
> > That tool only terminates lines with LF, not CR+LF.
> >
> > Arguably the ChromeOS tool should be fixed. After all, officially kdb
> > seems to be designed such that CR+LF is the official line ending
> > transmitted over the wire and that internally a line ending is just
> > '\n' (LF). Some evidence:
> > * uart_poll_put_char(), which is used by kdb, notices a '\n' and
> >   converts it to '\r\n'.
> > * kdb functions specifically use '\r' to get a carriage return without
> >   a newline. You can see this in the pager where kdb will write a '\r'
> >   and then write over the pager prompt.
>
> I'd view this as simply "what you have to do drive a terminal correctly"
> rather than indicating what the official newline protocol for kdb is.
> With a human and a terminal emulator I would expect the typical input to
> be CR-only (and that's what I setup the test suite to send ;-)).

Fair enough. I haven't done massive amounts of serial communications
across lots of platforms, but I do remember the history of line
endings in text files and so I wanted to document my findings a bit.
;-)


> > However, all that being said there's no real harm in accepting LF as a
> > command terminator in the kdb parser and doing so seems like it would
> > improve compatibility. After this, I'd expect that things would work
> > OK-ish with a remote terminal that used any of CR, CR+LF, or LF as a
> > line ending. Someone using CR as a line ending might get some ugliness
> > where kdb wasn't able to overwrite the last line, but basic commands
> > would work. Someone using just LF as a line ending would probably also
> > work OK.
> >
> > A few other notes:
> > - It can be noted that "bash" running on an "agetty" handles LF as a
> >   line termination with no complaints.
> > - Historically, kdb's "pager" actually handled either CR or LF fine. A
> >   very quick inspection would make one think that kdb's pager actually
> >   could have paged down two lines instead of one for anyone using
> >   CR+LF, but this is generally avoided because of kdb_input_flush().
>
> These are very convincing though!
>
> > - Conceivably one could argue that some of this special case logic
> >   belongs in uart_poll_get_char() since uart_poll_put_char() handles
> >   the '\n' => '\r\n' conversion. I would argue that perhaps we should
> >   eventually do the opposite and move the '\n' => '\r\n' out of
> >   uart_poll_put_char(). Having that conversion at such a low level
> >   could interfere if we ever want to transfer binary data. In
> >   addition, if we truly made uart_poll_get_char() the inverse of
> >   uart_poll_put_char() it would convert back to '\n' and (ironically)
> >   kdb's parser currently only looks for '\r' to find the end of a
> >   command.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
>
> This arrived just as I was gathering up the patches (I know... running
> late). I've added a couple of cases to the test suite to cover the
> new feature.
>
> The code looks good to me. Are you happy for me to take it this
> merge cycle?

Yeah, it should be OK. It's really pretty straightforward. Thanks!

-Doug
  

Patch

diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 5c7e9ba7cd6b..813cb6cf72d6 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -131,6 +131,7 @@  char kdb_getchar(void)
 	int escape_delay = 0;
 	get_char_func *f, *f_prev = NULL;
 	int key;
+	static bool last_char_was_cr;
 
 	for (f = &kdb_poll_funcs[0]; ; ++f) {
 		if (*f == NULL) {
@@ -149,6 +150,18 @@  char kdb_getchar(void)
 			continue;
 		}
 
+		/*
+		 * The caller expects that newlines are either CR or LF. However
+		 * some terminals send _both_ CR and LF. Avoid having to handle
+		 * this in the caller by stripping the LF if we saw a CR right
+		 * before.
+		 */
+		if (last_char_was_cr && key == '\n') {
+			last_char_was_cr = false;
+			continue;
+		}
+		last_char_was_cr = (key == '\r');
+
 		/*
 		 * When the first character is received (or we get a change
 		 * input source) we set ourselves up to handle an escape
@@ -244,7 +257,8 @@  static char *kdb_read(char *buffer, size_t bufsize)
 			*cp = tmp;
 		}
 		break;
-	case 13: /* enter */
+	case 10: /* linefeed */
+	case 13: /* carriage return */
 		*lastchar++ = '\n';
 		*lastchar++ = '\0';
 		if (!KDB_STATE(KGDB_TRANS)) {