carragom
Deno
Created by carragom on 4/9/2024 in #help
How to go from a JS string[] to a C array of pointers
Great resources, they helped me a lot. Here is the current implementation; if you get a chance, let me know if you spot anything problematic.
const TO_STR_ARRAY_MAP = new WeakMap<Uint8Array, Uint8Array[]>()

export function toCStringArray(strings: string[]): Uint8Array | null {
  if (strings === undefined || strings === null || strings.length === 0) {
    return null
  }

  // Encode strings into Uint8Array appending null byte
  const chunks = strings.map((str) => toCString(str))
  const ptrs = new BigUint64Array(chunks.length)

  for (const [index, chunk] of chunks.entries()) {
    const ptr = Deno.UnsafePointer.of(chunk)

    if (ptr === null) {
      // Should never be possible ??
      throw new Error('Failed to create pointer')
    } else {
      ptrs[index] = BigInt(Deno.UnsafePointer.value(ptr))
    }
  }

  const buf = new Uint8Array(ptrs.buffer)
  // Maps the ptrs buffer to the chunks to prevent GC from collecting them
  TO_STR_ARRAY_MAP.set(buf, chunks)

  // Return the buffer as a Uint8Array
  return buf
}
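The snippet above calls a `toCString` helper that is not shown in the thread. A minimal sketch of what it might look like, assuming it does the same thing as the `enc.encode(str + '\0')` call used in the earlier versions (the name `encoder` and the body are assumptions, not the author's code):

```typescript
// Hypothetical helper, not taken from the thread: encode a JS string as a
// NUL-terminated UTF-8 byte array.
const encoder = new TextEncoder()

function toCString(str: string): Uint8Array {
  // TextEncoder always emits UTF-8; the trailing '\0' becomes one 0x00 byte.
  return encoder.encode(str + '\0')
}

// 'h' is one byte, '\u00e9' (e with acute accent) is two bytes in UTF-8,
// plus the terminator.
const sample = toCString('h\u00e9')
console.log(sample.byteLength) // 4
```

Because the terminator is appended before encoding, the byte length already includes it, which matters when sizing buffers by hand.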
21 replies
I also have a couple of questions: 1. Will the final `new Uint8Array(buf.buffer)` in the return allocate new memory or reuse the underlying buffer? 2. Is there any advantage in returning a Uint8Array instead of just the BigUint64Array?
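On the first question, a typed array constructed from an existing `ArrayBuffer` is a view over that buffer rather than a copy. A small check using only plain typed arrays (no FFI involved; the variable names are illustrative) shows this:

```typescript
const ptrs = new BigUint64Array(2)

// Passing an ArrayBuffer to a typed-array constructor creates a view over
// the same memory; no bytes are copied.
const view = new Uint8Array(ptrs.buffer)

console.log(view.buffer === ptrs.buffer) // true
console.log(view.byteLength) // 16 (2 elements of 8 bytes each)

// A write through one view is visible through the other. All eight bytes
// are set to 0xff, so the result does not depend on endianness.
ptrs[0] = BigInt('0xffffffffffffffff')
console.log(view[0]) // 255
```

Only the small view object itself is allocated; the pointer data stays where it was.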
function toCStringArray(strings: string[]): Uint8Array | null {
  if (strings === undefined || strings === null || strings.length === 0) {
    return null
  }

  // Encode strings into Uint8Array appending null byte
  const chunks = strings.map((str) => enc.encode(str + '\0'))
  const buf = new BigUint64Array(chunks.length)
  const map = new WeakMap<Deno.PointerObject, Uint8Array>()

  for (const [index, chunk] of chunks.entries()) {
    const ptr = Deno.UnsafePointer.of(chunk)

    if (ptr === null) {
      // No idea how this could happen
      throw new Error('Failed to create pointer')
    } else {
      // Map pointers to chunks to prevent GC from collecting them
      map.set(ptr, chunk)
      buf[index] = BigInt(Deno.UnsafePointer.value(ptr))
    }
  }

  // Return the buffer as a Uint8Array
  return new Uint8Array(buf.buffer)
}
Well, that worked; it's even simpler and more readable now. I would appreciate your keen eye on it, especially the part about the WeakMap: I have no idea if that would be enough to prevent garbage collection.
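On the garbage-collection question: a WeakMap created inside the function becomes unreachable as soon as the function returns, so it cannot keep the chunks alive by itself. One alternative, sketched here under assumptions, is to return the encoded chunks together with the pointer table so the caller controls their lifetime explicitly. The names `CStringArray`, `buildCStringArray`, and `addressOf` are hypothetical; `addressOf` stands in for the `Deno.UnsafePointer` calls so the sketch runs without FFI.

```typescript
// Hypothetical wrapper (names are illustrative): the pointer table and the
// encoded chunks live in one object, so keeping the object reachable keeps
// every chunk reachable too.
interface CStringArray {
  ptrs: BigUint64Array
  chunks: Uint8Array[]
}

function buildCStringArray(
  strings: string[],
  // Assumed callback that returns the native address of a chunk.
  addressOf: (chunk: Uint8Array) => bigint,
): CStringArray {
  const encoder = new TextEncoder()
  // Encode each string as NUL-terminated UTF-8.
  const chunks = strings.map((s) => encoder.encode(s + '\0'))
  const ptrs = new BigUint64Array(chunks.length)
  chunks.forEach((chunk, i) => {
    ptrs[i] = addressOf(chunk)
  })
  return { ptrs, chunks }
}
```

In Deno, `addressOf` could presumably be `(c) => BigInt(Deno.UnsafePointer.value(Deno.UnsafePointer.of(c)))`, as in the code above. The caller would then hold on to the returned object for as long as native code may read the pointers.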
Great insight. Extra allocations are definitely something to watch out for. I love the idea of creating pointers to the encoded chunks; it seems to have the best of all worlds. Balancing unsafe memory with a WeakMap, on the other hand, feels like a very delicate procedure, but hey, at this point everything feels a bit brittle. I will give it a try and report back.
In order to understand your code I wrote a function based on it. It should achieve the same result and should handle multi-byte characters, but do mind the possible bugs. If you find anything weird with it, let me know. Here it is:
// Assumed to exist elsewhere in the original module
const enc = new TextEncoder()

function toCStringArray(strings: string[]): Uint8Array | null {
  if (strings === undefined || strings === null || strings.length === 0) {
    return null
  }

  // Encode strings into byte arrays appending null byte
  const encoded = strings.map((str) => enc.encode(str + '\0'))
  // Total bytes required by encoded strings (byteLength already includes the null byte)
  const encodedLength = encoded.reduce((acc, cur) => acc + cur.byteLength, 0)
  // Total bytes required by pointers (8 bytes per 64-bit pointer)
  const pointersLength = encoded.length * 8

  // Single buffer: pointer table first, string data right after it
  const buf = new Uint8Array(pointersLength + encodedLength)
  const ptrSegment = new BigUint64Array(buf.buffer, 0, encoded.length)
  const dataSegment = buf.subarray(pointersLength)

  // Get a pointer to the start of the data segment
  const dataPointer = BigInt(
    Deno.UnsafePointer.value(Deno.UnsafePointer.of(dataSegment)),
  )

  let offset = 0
  for (const [index, data] of encoded.entries()) {
    dataSegment.set(data, offset) // Copy the data to the strings buffer
    ptrSegment[index] = dataPointer + BigInt(offset) // Store the pointer to the inserted data
    offset += data.byteLength
  }

  return buf
}
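The multi-byte handling mentioned above comes down to computing offsets from the encoded byte length rather than from `string.length`. A small sketch of just that arithmetic, with no FFI involved (the function name `cStringOffsets` is hypothetical and the strings are written with escapes to keep the byte counts unambiguous):

```typescript
const utf8 = new TextEncoder()

// Byte offset of each NUL-terminated string inside a packed data segment,
// plus the total number of bytes the segment needs.
function cStringOffsets(strings: string[]): { offsets: number[]; total: number } {
  const offsets: number[] = []
  let total = 0
  for (const s of strings) {
    offsets.push(total)
    // byteLength counts UTF-8 bytes and already includes the terminator.
    total += utf8.encode(s + '\0').byteLength
  }
  return { offsets, total }
}

// 'a' is 1 byte, '\u00e9' is 2 bytes, '\u65e5\u672c' is 6 bytes, each plus 1 for NUL.
const layout = cStringOffsets(['a', '\u00e9', '\u65e5\u672c'])
console.log(layout.offsets) // [0, 2, 5]
console.log(layout.total) // 12
```

Using `string.length` here would give 1, 1, and 2 for the same inputs, so the second and third strings would be placed at the wrong offsets.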
Thanks a lot!!! This was driving me crazy. I think it's safe to say that this is a complex task to get right and there should be a helper function in the std library to deal with this conversion. This looks like a very common data structure to find as a function parameter.