Description
Hello,
Could you please advise what could be used in a KMP project to read a valid `String` from a `Source` while providing an approximate limit in bytes for that `String`?
The use-case is the following:
I have a `Source` that is used for parsing data from a file (the file might contain non-ASCII characters). I need to read a portion of the content, parse it, and if more data is needed, read another portion from the `Source`, and so on.
Right now there is a method `Source.readString(byteCount: Long)` that accepts a limit in bytes, but if the last byte within that limit is only part of a multi-byte code point, it will be substituted with the replacement code point. And I won't get that last code point on a second read attempt either, because its leading bytes have already been consumed.
I wonder if there is a way to solve my use-case without reimplementing UTF-16 decoding on my side (logic from here). For example, in Java I could use the `java.io.Reader#read(char[])` method, and if the last `char` is a high surrogate I could try to read another `char` to check whether the string is ill-formed or not (a real example is the `StreamReader` from SnakeYAML).
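To make the problem concrete, here is a minimal sketch of the kind of boundary check I would otherwise have to maintain myself. It assumes the data is UTF-8 encoded and works on a plain `ByteArray` using only the Kotlin standard library. The function name `completeUtf8PrefixLength` is hypothetical and not part of any library. It only trims a truncated trailing sequence and leaves otherwise malformed input to the decoder.

```kotlin
/**
 * Hypothetical helper: returns the largest length <= [length] such that the
 * prefix of [bytes] does not end in the middle of a multi-byte UTF-8 sequence.
 */
fun completeUtf8PrefixLength(bytes: ByteArray, length: Int = bytes.size): Int {
    require(length in 0..bytes.size) { "length out of range: $length" }

    // Walk back over at most 3 continuation bytes (0b10xxxxxx) to find the last lead byte.
    var i = length - 1
    var steps = 0
    while (i >= 0 && steps < 3 && (bytes[i].toInt() and 0xC0) == 0x80) {
        i--
        steps++
    }
    if (i < 0) return length // empty or only continuation bytes: leave it to the decoder

    val lead = bytes[i].toInt() and 0xFF
    val expected = when {
        (lead and 0x80) == 0x00 -> 1 // ASCII
        (lead and 0xE0) == 0xC0 -> 2 // 110xxxxx
        (lead and 0xF0) == 0xE0 -> 3 // 1110xxxx
        (lead and 0xF8) == 0xF0 -> 4 // 11110xxx
        else -> 1                    // invalid lead byte: leave it to the decoder
    }
    val available = length - i
    // Cut before the lead byte if its sequence is truncated, otherwise keep everything.
    return if (available < expected) i else length
}

fun main() {
    val bytes = "héllo€".encodeToByteArray() // 1 + 2 + 1 + 1 + 1 + 3 = 9 bytes

    val cut1 = completeUtf8PrefixLength(bytes, 2) // limit falls inside 'é'
    println("$cut1 -> ${bytes.decodeToString(0, cut1)}") // prints "1 -> h"

    val cut2 = completeUtf8PrefixLength(bytes, 8) // limit falls inside '€'
    println("$cut2 -> ${bytes.decodeToString(0, cut2)}") // prints "6 -> héllo"
}
```

With something like this I could, in principle, look at the next bytes without consuming them, compute the safe length, and then pass that length to `Source.readString(byteCount: Long)`. That still means owning the boundary logic on my side, which is what I was hoping to avoid.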
Would really appreciate your thoughts and suggestions. Thank you!