x86/insn_decoder_test: allow longer symbol-names

Message ID 320c4dba-9919-404b-8a26-a8af16be1845@app.fastmail.com
State New
Headers
Series x86/insn_decoder_test: allow longer symbol-names |

Commit Message

David Rheinsberg Jan. 24, 2023, 11:04 a.m. UTC
  Increase the allowed line-length of the insn-decoder-test to 4k to allow
for symbol-names longer than 256 characters.

The insn-decoder-test takes objdump output as input, which may contain
symbol-names as instruction arguments. With rust-code entering the
kernel, those symbol-names will include mangled-symbols which might
exceed the current line-length-limit of the tool.

By bumping the line-length-limit of the tool to 4k, we get a reasonable
buffer for all objdump outputs I have seen so far. Unfortunately, ELF
symbol-names are not restricted in length, so technically this might
still end up failing if we encounter longer names in the future.

My compile-failure looks like this:

    arch/x86/tools/insn_decoder_test: error: malformed line 1152000:
    tBb_+0xf2>

..which overflowed by 10 characters reading this line:

    ffffffff81458193:   74 3d                   je     ffffffff814581d2 <_RNvXse_NtNtNtCshGpAVYOtgW1_4core4iter8adapters7flattenINtB5_13FlattenCompatINtNtB7_3map3MapNtNtNtBb_3str4iter5CharsNtB1v_17CharEscapeDefaultENtNtBb_4char13EscapeDefaultENtNtBb_3fmt5Debug3fmtBb_+0xf2>

Signed-off-by: David Rheinsberg <david@readahead.eu>
---
 arch/x86/tools/insn_decoder_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Ingo Molnar Jan. 25, 2023, 11:30 a.m. UTC | #1
* David Rheinsberg <david@readahead.eu> wrote:

> Increase the allowed line-length of the insn-decoder-test to 4k to allow
> for symbol-names longer than 256 characters.
> 
> The insn-decoder-test takes objdump output as input, which may contain
> symbol-names as instruction arguments. With rust-code entering the
> kernel, those symbol-names will include mangled-symbols which might
> exceed the current line-length-limit of the tool.
> 
> By bumping the line-length-limit of the tool to 4k, we get a reasonable
> buffer for all objdump outputs I have seen so far. Unfortunately, ELF
> symbol-names are not restricted in length, so technically this might
> still end up failing if we encounter longer names in the future.
> 
> My compile-failure looks like this:
> 
>     arch/x86/tools/insn_decoder_test: error: malformed line 1152000:
>     tBb_+0xf2>
> 
> ..which overflowed by 10 characters reading this line:
> 
>     ffffffff81458193:   74 3d                   je     ffffffff814581d2 <_RNvXse_NtNtNtCshGpAVYOtgW1_4core4iter8adapters7flattenINtB5_13FlattenCompatINtNtB7_3map3MapNtNtNtBb_3str4iter5CharsNtB1v_17CharEscapeDefaultENtNtBb_4char13EscapeDefaultENtNtBb_3fmt5Debug3fmtBb_+0xf2>
> 
> Signed-off-by: David Rheinsberg <david@readahead.eu>
> ---
>  arch/x86/tools/insn_decoder_test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/tools/insn_decoder_test.c b/arch/x86/tools/insn_decoder_test.c
> index 472540aeabc2..366e07546344 100644
> --- a/arch/x86/tools/insn_decoder_test.c
> +++ b/arch/x86/tools/insn_decoder_test.c
> @@ -106,7 +106,7 @@ static void parse_args(int argc, char **argv)
>  	}
>  }
>  
> -#define BUFSIZE 256
> +#define BUFSIZE 4096

That hard-coded constant is a bit lame and will cause trouble the minute 
*that* size is exceeded - don't we have some more natural figure, such as 
KSYM_SYMBOL_LEN?

Thanks,

	Ingo
  
David Rheinsberg Jan. 27, 2023, 10:27 a.m. UTC | #2
Hi Ingo!

On Wed, Jan 25, 2023, at 12:30 PM, Ingo Molnar wrote:
> * David Rheinsberg <david@readahead.eu> wrote:
>> diff --git a/arch/x86/tools/insn_decoder_test.c b/arch/x86/tools/insn_decoder_test.c
>> index 472540aeabc2..366e07546344 100644
>> --- a/arch/x86/tools/insn_decoder_test.c
>> +++ b/arch/x86/tools/insn_decoder_test.c
>> @@ -106,7 +106,7 @@ static void parse_args(int argc, char **argv)
>>  	}
>>  }
>>  
>> -#define BUFSIZE 256
>> +#define BUFSIZE 4096
>
> That hard-coded constant is a bit lame and will cause trouble the minute 
> *that* size is exceeded - don't we have some more natural figure, such as 
> KSYM_SYMBOL_LEN?

Thanks for the hint! I tried mentioning this in the commit-message. I am unsure whether a fixed-size limit is the correct thing to do. However, given that this is a compile-time test-tool, I thought bumping the limit for this stack buffer to be sufficient. I am open to other suggestions.

The input to this tool is the output of objdump, so effectively annotated x86-assembly. To get a proper estimate of how long such a line could be, I would have to audit all instructions and know what kind of arguments are possible. Can some of them take multiple symbols as argument? Are there other possibly long encodings of arguments to consider?

Lastly, shouldn't this use KSYM_NAME_LEN rather than KSYM_SYMBOL_LEN? And then how much room for normal instruction-encoding should I calculate? I feel like whatever I do (even with those constants), I would end up with only an estimate and wouldn't actually end up closing this issue.

The current workaround is to just disable CONFIG_X86_DECODER_SELFTEST, which I thought is a sad state. I can gladly use `256 + KSYM_NAME_LEN` and add a comment ala "room for insn-encoding and a symbol name". Would that be acceptable? The alternative would be to try to dyn-alloc a buffer and increase it to the actual line-length?

Open to suggestions!

Thanks
David
  
Miguel Ojeda Feb. 20, 2024, 5:07 p.m. UTC | #3
On Fri, Jan 27, 2023 at 11:39 AM David Rheinsberg <david@readahead.eu> wrote:
>
> The current workaround is to just disable CONFIG_X86_DECODER_SELFTEST, which I thought is a sad state. I can gladly use `256 + KSYM_NAME_LEN` and add a comment ala "room for insn-encoding and a symbol name". Would that be acceptable? The alternative would be to try to dyn-alloc a buffer and increase it to the actual line-length?

John independently hit this issue again. Could we fix this? Going for
the `256 + KSYM_NAME_LEN` sounds good enough for the moment since it
would be a clear improvement, though I agree this could be cleaned up
further.

Sergio took the approach David suggested in a related patch [1], but
perhaps it is best to submit the fix on its own so that it is easier
to put it in. David, would you be so kind as to submit a v2 with that?
Hopefully x86 can pick it up, otherwise with an Acked-by I am happy to
take it too; and then Sergio can submit his patch on top again.

Thanks!

(Cc'ing also Masami who wrote this originally)

[1] https://lore.kernel.org/rust-for-linux/20231119180145.157455-1-sergio.collado@gmail.com/

Cheers,
Miguel
  

Patch

diff --git a/arch/x86/tools/insn_decoder_test.c b/arch/x86/tools/insn_decoder_test.c
index 472540aeabc2..366e07546344 100644
--- a/arch/x86/tools/insn_decoder_test.c
+++ b/arch/x86/tools/insn_decoder_test.c
@@ -106,7 +106,7 @@  static void parse_args(int argc, char **argv)
 	}
 }
 
-#define BUFSIZE 256
+#define BUFSIZE 4096
 
 int main(int argc, char **argv)
 {