Description
UTF8Span instances obtained from a String in the small representation are not valid: they point to the wrong memory. In contrast, getting a UTF8Span from the Span<UInt8> given out by UTF8View works correctly. This is very surprising, given that String.utf8Span is a thin wrapper over UTF8View.span.
Reproduction
let master = String(200) // small string; use "This string is not a small string." for a large one
// Bytes read through String.utf8Span: prints wrong values for a small string.
func smallStringTest1() {
    let s = master
    let utf8 = s.utf8Span
    let span = utf8.span
    for i in span.indices {
        print(span[i], terminator: " ")
    }
    print()
}
smallStringTest1()
// Bytes read through UTF8View.span, then validated into a UTF8Span: prints the expected bytes.
func smallStringTest2() {
    let s = master.utf8
    let view = s.span
    let utf8 = try! UTF8Span(validating: view)
    let span = utf8.span
    for i in span.indices {
        print(span[i], terminator: " ")
    }
    print()
}
smallStringTest2()
// Unicode scalars iterated through String.utf8Span: prints wrong values and sometimes crashes.
func smallStringTest3() {
    let s = master
    let utf8 = s.utf8Span
    var it = utf8.makeUnicodeScalarIterator()
    while let c = it.next() {
        print(c, terminator: " ")
    }
    print()
}
smallStringTest3()
// Unicode scalars iterated through UTF8View.span: prints the expected scalars.
func smallStringTest4() {
    let s = master.utf8
    let view = s.span
    let utf8 = try! UTF8Span(validating: view)
    var it = utf8.makeUnicodeScalarIterator()
    while let c = it.next() {
        print(c, terminator: " ")
    }
    print()
}
smallStringTest4()
(also here: https://swift.godbolt.org/z/zv15z94MG)
The 1st and 3rd functions print garbage instead of the string's contents, and the 3rd sometimes crashes. Sample output:
Program returned: 0
88 189 14
50 48 48
2 ½ �
2 0 0
Expected behavior
Given this code, we'd expect the output of the 1st and 3rd to be identical to the 2nd and 4th, respectively:
Program returned: 0
50 48 48
50 48 48
2 0 0
2 0 0
Note that when master's value is changed to a "large" String, all the output is as expected.
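For anyone trying to tell which case a given value falls into, here is a rough sketch. The helper fitsInSmallForm is hypothetical, not a standard library API. It assumes that on 64-bit platforms the small (inline) representation holds at most 15 UTF-8 code units. It only checks whether a string could be stored small, not whether it actually is. It uses String.utf8 alone, so it does not go through the affected utf8Span path.

```swift
// Hypothetical helper, not part of the standard library.
// Assumption: on 64-bit platforms the small (inline) representation
// holds at most 15 UTF-8 code units. This is a size check only; it does
// not inspect the actual storage of the string.
func fitsInSmallForm(_ s: String) -> Bool {
    return s.utf8.count <= 15
}

let small = String(200)
let large = "This string is not a small string."

// Reference bytes obtained through String.utf8, for comparison with the
// output of smallStringTest1 and smallStringTest2.
let referenceBytes = Array(small.utf8)

print(fitsInSmallForm(small), referenceBytes) // true [50, 48, 48]
print(fitsInSmallForm(large))                 // false
```

The reference bytes match the expected output above (50 48 48), which makes it easy to diff against whatever the utf8Span path prints.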
Environment
swift-DEVELOPMENT-SNAPSHOT-2025-06-03-a
Observed on macOS and Linux.
Code example: https://swift.godbolt.org/z/zv15z94MG
Additional information
Related to: #81931
Also tracked as rdar://152615664