memcollxfrm: Handle above-Unicode code points #22989

Open
wants to merge 5 commits into
base: blead

khwilliamson
Contributor

As stated in the comments added by this commit, calling strxfrm() on above-Unicode code points is undefined behavior, all the more so when the input is Perl's invented extended UTF-8. This commit makes all such input legal by replacing every above-Unicode code point with the highest permanently unassigned code point, U+10FFFF.
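
For readers unfamiliar with the idea, here is a minimal sketch of that kind of substitution. It is not the code in this PR: `clamp_above_unicode` is a hypothetical helper, it assumes an ASCII (non-EBCDIC) platform, and it deliberately skips full well-formedness checks. Any lead byte above 0xF4, or 0xF4 followed by a continuation byte of 0x90 or higher, is treated as the start of an above-Unicode sequence, and that sequence is replaced with the UTF-8 encoding of U+10FFFF.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT the actual Perl core code.  Returns a freshly
 * malloc'd, NUL-terminated copy of the 'len' bytes at 's' in which every
 * sequence encoding a code point above U+10FFFF has been replaced by the
 * UTF-8 encoding of U+10FFFF (F4 8F BF BF).  The new length is stored in
 * *out_len.  Returns NULL if allocation fails.  Assumes an ASCII platform. */
static unsigned char *
clamp_above_unicode(const unsigned char *s, size_t len, size_t *out_len)
{
    static const unsigned char max_cp[4] = { 0xF4, 0x8F, 0xBF, 0xBF };
    unsigned char *out;
    size_t i = 0, o = 0;

    assert(s != NULL && out_len != NULL);

    /* Worst case: every input byte is a lone above-Unicode lead byte,
     * each of which expands to 4 output bytes. */
    out = malloc(len * 4 + 1);
    if (out == NULL)
        return NULL;

    while (i < len) {
        unsigned char c = s[i];

        /* F4 90 80 80 encodes 0x110000, the first above-Unicode code
         * point; lead bytes F5..FF can only start larger values. */
        int above = c > 0xF4
                 || (c == 0xF4 && i + 1 < len
                     && s[i + 1] >= 0x90 && s[i + 1] <= 0xBF);

        if (above) {
            /* Loosely skip the lead byte and any continuation bytes
             * (10xxxxxx) that follow, without validating the length. */
            i++;
            while (i < len && (s[i] & 0xC0) == 0x80)
                i++;
            memcpy(out + o, max_cp, sizeof max_cp);
            o += sizeof max_cp;
        }
        else {
            out[o++] = s[i++];   /* everything else is copied unchanged */
        }
    }

    out[o] = '\0';
    *out_len = o;
    return out;
}
```

A caller would pass the returned buffer to strxfrm() in place of the original string and free() it afterwards. Allocating a new buffer keeps the sketch simple; code that knows its input is well-formed could substitute in place, since every well-formed above-Unicode sequence is at least four bytes long.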

  • This set of changes may require a perldelta entry; please state your opinion.

This value is not going to be used again.  I put in the ++ out of habit.

This creates an internal macro that skips some error checking for use
when we don't care if it is completely well-formed or not.

The next commit will want to use the results later.