[v3,0/2] get_maintainer: correctly parse UTF-8 encoded names in files

Message ID 20231219-get-maintainers-utf8-v3-0-f85a39e2265a@bang-olufsen.dk
Series get_maintainer: correctly parse UTF-8 encoded names in files

Message

Alvin Šipraga Dec. 19, 2023, 1:25 a.m. UTC
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
---
Changes in v3:
- add more rationale for opening everything with UTF-8 encoding
- fix a separate issue identified when introducing UTF-8 names: such
  names were not wrapped in quotes as expected, because Perl's \w
  matches UTF-8 letters by default
- add a second patch to fix an unrelated issue mentioned by Joe whereby
  a mailing list might get the display name '-'
- Link to v2: https://lore.kernel.org/r/20231214-get-maintainers-utf8-v2-1-b188dc7042a4@bang-olufsen.dk
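
The quoting issue in the v3 notes can be illustrated with a minimal
sketch. It is written in Python rather than Perl, purely as an analogy:
Python's \w is also Unicode-aware by default for str patterns. The
pattern and the helper quote_name are hypothetical stand-ins, not the
code in scripts/get_maintainer.pl.

```python
import re

# Hypothetical "must quote" checks: any character that is not a word
# character, a space or a hyphen forces the name to be quoted.
MUST_QUOTE_UNICODE = re.compile(r"[^\w \-]")          # \w matches Unicode letters
MUST_QUOTE_ASCII = re.compile(r"[^\w \-]", re.ASCII)  # \w is [A-Za-z0-9_] only


def quote_name(name: str, pattern: "re.Pattern[str]") -> str:
    """Wrap name in double quotes if the pattern finds a must-quote character."""
    if pattern.search(name):
        return f'"{name}"'
    return name


# With Unicode-aware \w, 'Š' counts as a word character, so nothing is quoted.
print(quote_name("Alvin Šipraga", MUST_QUOTE_UNICODE))
# With ASCII-only \w, 'Š' is a must-quote character, so the name is quoted.
print(quote_name("Alvin Šipraga", MUST_QUOTE_ASCII))
```

Restricting \w to ASCII is only one way to restore the expected
quoting; the sketch does not claim it is the approach taken in the
patch.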

Changes in v2:
- use '\p{L}' rather than '\p{Latin}', so that matching is even more
  inclusive (i.e. also match Greek letters, CJK, etc.)
- fix commit message to refer to tools mailing list, not b4 mailing list
- Link to v1: https://lore.kernel.org/r/20231014-get-maintainers-utf8-v1-1-3af8c7aeb239@bang-olufsen.dk
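
The difference between '\p{L}' and '\p{Latin}' noted for v2 can be
sketched as follows. This is a Python approximation, since the standard
re module has no \p{...} classes: is_letter mirrors \p{L} via the
Unicode general category, while is_latin is a crude, hypothetical
stand-in for \p{Latin} that inspects the character name.

```python
import unicodedata


def is_letter(ch: str) -> bool:
    """Rough equivalent of \\p{L}: general category starts with 'L'."""
    return unicodedata.category(ch).startswith("L")


def is_latin(ch: str) -> bool:
    """Crude stand-in for \\p{Latin}: the Unicode name mentions LATIN."""
    return "LATIN" in unicodedata.name(ch, "")


# 'Š' is Latin, 'λ' is Greek, '中' is CJK; all three are letters.
for ch in "Šλ中":
    print(ch, is_letter(ch), is_latin(ch))
```

Only 'Š' passes the Latin-only check, while all three characters pass
the any-letter check, which is the broader matching described above.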

---
Alvin Šipraga (2):
      get_maintainer: correctly parse UTF-8 encoded names in files
      get_maintainer: remove stray punctuation when cleaning file emails

 scripts/get_maintainer.pl | 48 +++++++++++++++++++++++++++--------------------
 1 file changed, 28 insertions(+), 20 deletions(-)
---
base-commit: 2cf4f94d8e8646803f8fb0facf134b0cd7fb691a
change-id: 20231014-get-maintainers-utf8-32c65c4d6f8a