How to go from a JS string[] to a C array of pointers
Hi,
I'm trying to call a C function with FFI but struggling to convert a string[] to an array of pointers. Let's assume the following C function:
And using some JS code I can try to call it:
Is buffer the right type for that parameter, or had I better use pointer? And finally, how do I convert the params array to something that can be sent to the C function?
Appreciate the help.
I wasn't able to find the answer to your question in the official Deno documentation. However, based on the code snippet you provided, I noticed a couple of potential issues that you might want to address:
1. In the JavaScript code, there is a syntax error in the dlopen call. The test object is being initialized with a trailing comma instead of a closing parenthesis. It should be:
2. In the test.symbols.test_multi call, param.length should be params.length to match the name of the params array.
Regarding your question about whether to use buffer or pointer for the FFI type, and how to convert the params array to a format that can be sent to the C function, this is not directly covered in the Deno documentation. However, typically for FFI calls, you would need to allocate a buffer for each string, convert each string to a C-style string (null-terminated), and then create an array of pointers to these strings.
Please note that this advice is not based on the Deno documentation, and you should double-check the accuracy of this advice. For the most accurate and up-to-date information, you may want to consult the Deno FFI documentation or seek help from the Deno community.
Here's an example of how I pass string args: https://github.com/aapoalas/libclang_deno/blob/main/lib/utils.ts#L9-L46
The basic idea is this: Create a buffer that is large enough to hold params.length pointers plus all of the param strings encoded. Then encode the strings into the buffer and write in pointers that point to the start of each string (so essentially self-references; the buffer contains pointers that point within the buffer).
And here's a usage example: https://github.com/aapoalas/libclang_deno/blob/5d6d8ed33627c6f12949efc15c3f7ccf07f26b3a/lib/mod.ts#L364-L380
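The basic idea above can be sketched roughly as follows. This is not the linked libclang_deno code, just a minimal illustration under some assumptions: 64-bit little-endian pointers, and a hypothetical packCStringArray helper that takes a hypothetical addressOf callback so the layout logic can be read (and run) without Deno. It also encodes each string first and then copies it, which is the simplest variant rather than the fastest.

```typescript
const encoder = new TextEncoder();

// Hypothetical helper: packs a pointer table followed by the NUL-terminated
// UTF-8 strings into ONE buffer. `addressOf` must return the native address
// of the buffer; under Deno that could be (assumption):
//   (b) => BigInt(Deno.UnsafePointer.value(Deno.UnsafePointer.of(b)))
function packCStringArray(
  params: string[],
  addressOf: (buf: Uint8Array) => bigint,
): Uint8Array {
  const encoded = params.map((s) => encoder.encode(s));
  const tableSize = params.length * 8; // one 64-bit pointer per string
  // Pointer table + every string + one NUL byte per string.
  const total = encoded.reduce((n, e) => n + e.length + 1, tableSize);

  const buf = new Uint8Array(total);
  const view = new DataView(buf.buffer);
  const base = addressOf(buf);

  let offset = tableSize;
  encoded.forEach((bytes, i) => {
    // Self-reference: pointer i points at the string stored later in `buf`.
    view.setBigUint64(i * 8, base + BigInt(offset), true);
    buf.set(bytes, offset);
    // The buffer is zero-filled, so the byte after the string is already NUL.
    offset += bytes.length + 1;
  });
  return buf;
}
```

The returned Uint8Array would then be passed as the "buffer" parameter, with params.length as the count.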
Thanks a lot!!! This was driving me crazy. I think it's safe to say that this is a complex task to get right, and there should be a helper function in the std library to deal with this conversion. This looks like a very common data structure to find as a function parameter.
In order to understand your code I wrote a function based on it. It should achieve the same and should handle multi-byte characters. But do mind the possible bugs. If you find anything weird with it, let me know. Here it is:
This does suffer from doing two memory copies for the strings instead of one. Also you're doing N+1 memory allocations where N is the number of strings.
You could calculate the encoded length from the string lengths times some constant. Any constant below 3 (IIRC) does mean that it's theoretically possible for the estimated encoded length to be too small, but that is fairly unlikely. This would mean no extra memory copy and only one allocation.
If the buffer ends up being too small, you can detect that based on the encoder's return value (bytes read, bytes written) and create a new, larger buffer to replace the first one. This would be an extra memory copy but it's very, very unlikely to happen.
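A rough sketch of that estimate-then-check approach, using the standard TextEncoder.encodeInto, which reports how many UTF-16 code units it read and how many bytes it wrote. The function name encodeAll and the bytesPerUnit parameter are made up for illustration. The fallback size of 3 bytes per code unit is the worst case for UTF-8, so the buffer grows at most once.

```typescript
const encoder = new TextEncoder();

// Hypothetical helper: encodes all strings as NUL-terminated UTF-8 into one
// buffer sized from an estimate, growing once if the estimate was too small.
function encodeAll(
  params: string[],
  bytesPerUnit = 2,
): { buf: Uint8Array; offsets: number[] } {
  const units = params.reduce((n, s) => n + s.length, 0);
  // One extra byte per string for its NUL terminator.
  let buf = new Uint8Array(Math.ceil(units * bytesPerUnit) + params.length);
  const offsets: number[] = [];
  let offset = 0;

  for (const s of params) {
    let result = encoder.encodeInto(s, buf.subarray(offset));
    // read < s.length: the string did not fit.
    // offset + written >= buf.length: no room left for the NUL byte.
    if (result.read < s.length || offset + result.written >= buf.length) {
      // Worst case: 3 bytes per UTF-16 code unit. This is the extra copy.
      const bigger = new Uint8Array(units * 3 + params.length);
      bigger.set(buf.subarray(0, offset));
      buf = bigger;
      result = encoder.encodeInto(s, buf.subarray(offset));
    }
    offsets.push(offset);
    // Skip one zero byte, which acts as the NUL terminator.
    offset += result.written + 1;
  }
  return { buf: buf.subarray(0, offset), offsets };
}
```

The offsets are relative to the start of buf, so the pointer for string i would be the buffer's address plus offsets[i].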
Alternatively, you can just encode the strings one by one and create pointers to those resulting buffers. Then, allocate an array of pointers and write the pointers into that. Finally, use a WeakMap to bind the string-buffers to stay alive as long as the pointers-buffer is alive.
This gets rid of the extra copy but retains the N+1 allocations.
Great insight. Extra allocations are definitely something to watch out for. I love the idea of creating pointers to the encoded chunks; it seems to have the best of all worlds. Balancing unsafe memory with a WeakMap, on the other hand, feels like a very delicate procedure, but hey, at this point everything feels a bit brittle. I will give it a try and report back.
Deno internally does fun stuff with a WeakMap in Deno.UnsafePointer.of(arrayBuffer). This returns an opaque pointer object and we put it in as a key into a WeakMap with arrayBuffer as the value.
The result is that as long as the pointer object lives, the buffer memory also lives.
Well, that worked, and it's even simpler and more readable now. I would appreciate your keen eye on it, especially the part about the WeakMap; I have no idea if that would be enough to prevent garbage collection.
I also have a couple of questions:
1. Will the return allocate new memory or reuse the underlying buffer?
2. Is there any advantage in using Uint8Array as the return type instead of just the BigUint64Array?
Your WeakMap should be a map from your buf into chunks. Right now you're saying "as long as ptr exists, chunk must exist." Then you immediately let the ptr object go out of scope and potentially get garbage collected. The chunk is then free to be GC'd as well.
Return is reusing the underlying buffer.
And there is indeed an advantage to using Uint8Array as your go-to choice whenever you pass data into FFI as "buffer": Due to reasons, Deno has to choose one buffer type to optimise for, and Uint8Array is the most versatile one we can choose, so that one it is.
As a result, with a trivial benchmark you should see perhaps even a 100x difference in performance between the exact same FFI call but just changing from a BigUint64Array into a Uint8Array.
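To make the "reusing the underlying buffer" point concrete, here is a small standalone illustration using only standard typed-array behaviour (the variable names are arbitrary). Wrapping a BigUint64Array's ArrayBuffer in a Uint8Array creates a second view over the same memory rather than a copy, so writes through one view are visible through the other.

```typescript
// Two 64-bit slots, as if they held two pointers.
const pointers = new BigUint64Array(2);

// A Uint8Array view over the SAME ArrayBuffer: no new backing memory, no copy.
const bytes = new Uint8Array(pointers.buffer);

// Writing through the byte view changes what the 64-bit view sees.
bytes.fill(0xff);

console.log(bytes.buffer === pointers.buffer); // true
console.log(pointers[0] === BigInt("0xffffffffffffffff")); // true
```

So you can keep filling in pointers through the BigUint64Array and hand the Uint8Array view to the FFI call.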
Re: The WeakMap. It should actually be created outside of the function scope, so that it itself cannot be GC'd.
You can then start code-golfing to something like (pseudo-code incoming because I'm on mobile):
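A minimal sketch of the shape this advice points to, not the snippet originally posted: the WeakMap lives at module scope and maps the pointers buffer to its string chunks. The names keepAlive, toPointerArray and addressOf are hypothetical, and 64-bit pointers are assumed. The address lookup is passed in as a callback so the sketch does not depend on Deno to run.

```typescript
const encoder = new TextEncoder();

// Module scope, so the map itself can never be garbage collected.
// Key: the pointers buffer. Value: the string chunks it points into.
const keepAlive = new WeakMap<Uint8Array, Uint8Array[]>();

// Hypothetical helper. `addressOf` returns a chunk's native address;
// under Deno that could be (assumption):
//   (c) => BigInt(Deno.UnsafePointer.value(Deno.UnsafePointer.of(c)))
function toPointerArray(
  params: string[],
  addressOf: (chunk: Uint8Array) => bigint,
): Uint8Array {
  // NUL-terminate each string so C can read it as a `char *`.
  const chunks = params.map((s) => encoder.encode(s + "\0"));
  const pointers = new BigUint64Array(chunks.map((c) => addressOf(c)));
  // Hand FFI a Uint8Array view over the same memory.
  const buf = new Uint8Array(pointers.buffer);
  // As long as `buf` is reachable, the chunks stay reachable too.
  keepAlive.set(buf, chunks);
  return buf;
}
```

The caller only has to hold on to the returned buffer for as long as the C side may read the strings.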
Related: If I did not plug these yet, then take a look at:
* Denonomicon: https://denonomicon.deno.dev/introduction
* WTF is FFI? videos: https://youtu.be/9f9Ujeods5o?si=qV9YeRyI_TiHxYsK
* I also spend most of this seemingly Rust talk talking about Deno FFI: https://youtu.be/QDd4Iu_-MTw?si=VdZm_TF8KnNxl7xW
Great resources, helped me a lot. Here is the current implementation, if you get a chance let me know if you spot anything problematic.
Looks good to me! 🙂