Is there a better way to convert a stream of code points into a string in Kotlin?

Time:02-01

I have a sequence of code points as Sequence<Int>.

I want to get this into a String.

What I currently do is this:

val string = codePoints
    .map { codePoint -> String(intArrayOf(codePoint), 0, 1) }
    .joinToString("") // empty separator; the default is ", "

But it feels extremely hairy to create a string for each code point just to concatenate them immediately after. Is there a more direct way to do this?

So far the best I was able to do was something like this:

val string2 = codePoints.toList().toIntArray()
    .let { codePoints -> String(codePoints, 0, codePoints.size) }

The amount of code isn't really any better, and it has a toList().toIntArray() which I'm not completely fond of. But it at least avoids the packaging of everything into dozens of one-code-point strings, and the logic is still written in the logical order.

CodePudding user response:

You can either go for the simple:

val string = codePoints.joinToString("") { Character.toString(it) }
// or
val string = codePoints.joinToString("", transform = Character::toString)

Or use a string builder:

fun Sequence<Int>.codePointsToString(): String = buildString {
    this@codePointsToString.forEach { cp ->
        appendCodePoint(cp)
    }
}

This second one expresses exactly what you want, and may benefit from future optimizations in the string builder.
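As a quick sanity check, here is a minimal round-trip sketch of the builder approach. It assumes the JVM target (appendCodePoint and String.codePoints() are Java APIs), and the sample string is only an illustration:

```kotlin
// Builder-based conversion: no intermediate one-code-point strings.
fun Sequence<Int>.codePointsToString(): String = buildString {
    // appendCodePoint writes one or two UTF-16 chars per code point (JVM only).
    for (cp in this@codePointsToString) appendCodePoint(cp)
}

fun main() {
    // 'a', U+1F603 (written as a surrogate pair), 'b'
    val original = "a\uD83D\uDE03b"

    // String.codePoints() returns a Java IntStream; toArray() yields an IntArray.
    val codePoints: Sequence<Int> = original.codePoints().toArray().asSequence()

    val rebuilt = codePoints.codePointsToString()
    println(rebuilt == original) // true
    println(rebuilt.length)      // 4 (UTF-16 chars, not code points)
}
```

The length check shows why appending code points as plain chars would not be enough: the supplementary code point occupies two chars in the result.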

"it feels extremely hairy to create a string for each code point just to concatenate them immediately after"

Did you really measure a performance issue with the extra string objects created here? Using toList() would also create a bunch of object arrays behind the scenes (one for each resize), which is a bit less, but not tremendously better. And as you pointed out toIntArray on top of that is yet another array creation.

Unless you know the number of elements in the sequence up front, I don't believe there is much you can do about that (the string builder approach will also likely use a resizable array behind the scenes, but at least you don't need extra array copies).
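If the count does happen to be known up front, one possible sketch is to pre-size the builder. The helper name codePointsToStringSized and its expectedCount parameter are hypothetical, not part of any library; buildString(capacity) and appendCodePoint are the only real API calls used, and the latter is JVM-only:

```kotlin
// Hypothetical helper: expectedCount is an assumption supplied by the caller.
fun Sequence<Int>.codePointsToStringSized(expectedCount: Int): String =
    // Reserve two chars per code point, so even input made up entirely of
    // supplementary code points should not force the builder to grow.
    buildString(expectedCount * 2) {
        for (cp in this@codePointsToStringSized) appendCodePoint(cp)
    }

fun main() {
    val cps = listOf(0x61, 0x1F603, 0x1F604)
    val s = cps.asSequence().codePointsToStringSized(cps.size)
    println(s.codePointCount(0, s.length)) // 3
    println(s.length)                      // 5
}
```

Reserving twice the count trades a little memory for avoiding resizes; whether that is worth it is something to measure rather than assume.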

CodePudding user response:

val result = codePoints.map { Character.toString(it) }.joinToString("")

Edit, based on Joffrey's comment below:

val result = codePoints.joinToString("") { Character.toString(it) }

Additional edit, full example:

val codePoints: Sequence<Int> = sequenceOf(
  'a'.code,
  Character.toCodePoint(0xD83D.toChar(), 0xDE03.toChar()),
  Character.toCodePoint(0xD83D.toChar(), 0xDE04.toChar()),
  Character.toCodePoint(0xD83D.toChar(), 0xDE05.toChar())
)

val result = codePoints.joinToString("") { Character.toString(it) }

println(result)

This will print: a😃😄😅
