writeCodePointValue
Encodes codePoint in UTF-8 and writes it to this sink.
codePoint should represent a valid Unicode code point, meaning that its value should be within the Unicode codespace (U+000000..U+10ffff); otherwise, IllegalArgumentException will be thrown.
Note that, in general, a value retrieved from Char.code cannot be written directly, as it may be one half of a surrogate pair (which can be detected using Char.isSurrogate, or Char.isHighSurrogate and Char.isLowSurrogate). Such a pair of characters needs to be manually converted back to a single code point, which can then be written to a Sink. Without such a conversion, data written to a Sink cannot be converted back to the string from which the surrogate pair was retrieved.
More specifically, all code points mapping to UTF-16 surrogates (U+d800..U+dfff) will be written as ? characters (U+003f).
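The replacement behaviour described above can be observed with a short round trip. This is a minimal sketch, not part of the library's own samples; `roundTrip` is a hypothetical helper introduced only for illustration.

```kotlin
import kotlinx.io.*

// Hypothetical helper for illustration: writes one code point
// and reads the encoded result back as text.
fun roundTrip(codePoint: Int): String {
    val buffer = Buffer()
    buffer.writeCodePointValue(codePoint)
    return buffer.readString()
}

fun main() {
    // a lone high surrogate and a lone low surrogate each come back as "?"
    check(roundTrip(0xD800) == "?")
    check(roundTrip(0xDFFF) == "?")
    // a regular code point survives the round trip unchanged
    check(roundTrip('Y'.code) == "Y")
}
```

Because the substitution is silent, callers that need to detect unpaired surrogates would have to check for them (for example with Char.isSurrogate) before writing.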
Parameters
codePoint: the code point to be written.
Throws
IllegalStateException: when the sink is closed.
IllegalArgumentException: when codePoint value is negative, or greater than U+10ffff.
IOException: when some I/O error occurs.
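The range check can be illustrated as follows. This is a sketch under the assumption that the bounds are exactly as documented above; `isRejected` is a hypothetical helper, not a kotlinx-io function.

```kotlin
import kotlinx.io.*

// Hypothetical helper: returns true when writeCodePointValue
// rejects the value with IllegalArgumentException.
fun isRejected(codePoint: Int): Boolean =
    try {
        Buffer().writeCodePointValue(codePoint)
        false
    } catch (e: IllegalArgumentException) {
        true
    }

fun main() {
    // values outside the Unicode codespace are rejected
    check(isRejected(-1))
    check(isRejected(0x110000))
    // both ends of the codespace are accepted
    check(!isRejected(0))
    check(!isRejected(0x10FFFF))
}
```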
Samples
import kotlinx.io.*
import kotlin.test.*
fun main() {
    //sampleStart
    val buffer = Buffer()
    // Basic Latin (a.k.a. ASCII) characters are encoded with a single byte
    buffer.writeCodePointValue('Y'.code)
    assertContentEquals(byteArrayOf(0x59), buffer.readByteArray())
    // wider characters are encoded into multiple UTF-8 code units
    buffer.writeCodePointValue('Δ'.code)
    assertContentEquals(byteArrayOf(0xce.toByte(), 0x94.toByte()), buffer.readByteArray())
    // note the difference: writeInt won't encode the code point, like writeCodePointValue did
    buffer.writeInt('Δ'.code)
    assertContentEquals(byteArrayOf(0, 0, 0x3, 0x94.toByte()), buffer.readByteArray())
    //sampleEnd
}
import kotlinx.io.*
import kotlin.test.*
fun main() {
    //sampleStart
    val buffer = Buffer()
    // U+1F31E (a.k.a. "sun with face") is too wide to fit in a single UTF-16 character,
    // so it's represented using a surrogate pair.
    val chars = "🌞".toCharArray()
    assertEquals(2, chars.size)
    // such a pair has to be manually converted to a single code point
    assertTrue(chars[0].isHighSurrogate())
    assertTrue(chars[1].isLowSurrogate())
    val highSurrogate = chars[0].code
    val lowSurrogate = chars[1].code
    // see https://en.wikipedia.org/wiki/UTF-16#Code_points_from_U+010000_to_U+10FFFF for details
    val codePoint = 0x10000 + (highSurrogate - 0xD800).shl(10).or(lowSurrogate - 0xDC00)
    assertEquals(0x1F31E, codePoint)
    // now we can write the code point
    buffer.writeCodePointValue(codePoint)
    // and read the correct string back
    assertEquals("🌞", buffer.readString())
    // we won't achieve that by writing the surrogates as they are
    buffer.apply {
        writeCodePointValue(highSurrogate)
        writeCodePointValue(lowSurrogate)
    }
    assertNotEquals("🌞", buffer.readString())
    //sampleEnd
}
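The manual conversion in the sample above can be generalized to a whole string. The following is a minimal sketch using only the Kotlin standard library; `toCodePoints` is a hypothetical extension, not part of kotlinx-io, and it passes unpaired surrogates through unchanged rather than rejecting them.

```kotlin
// Hypothetical helper, not part of kotlinx-io: decodes a String into code points,
// combining each well-formed surrogate pair into a single value.
fun String.toCodePoints(): List<Int> {
    val result = ArrayList<Int>(length)
    var i = 0
    while (i < length) {
        val c = this[i]
        if (c.isHighSurrogate() && i + 1 < length && this[i + 1].isLowSurrogate()) {
            val high = c.code - 0xD800
            val low = this[i + 1].code - 0xDC00
            result.add(0x10000 + ((high shl 10) or low))
            i += 2
        } else {
            // BMP character, or an unpaired surrogate passed through unchanged
            result.add(c.code)
            i += 1
        }
    }
    return result
}

fun main() {
    // 'A' followed by the surrogate pair for U+1F31E
    val codePoints = "A\uD83C\uDF1E".toCodePoints()
    check(codePoints == listOf(0x41, 0x1F31E))
}
```

Each resulting value could then be passed to writeCodePointValue. When the input is already a String, writing it with writeString is likely the simpler route; the helper is mainly useful when code points have to be inspected or filtered one at a time.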