Embedding, snapshotting, and 7-bit ASCII requirement
What are the differences in how a code module is processed when using load_side_module() versus when building an extension and using .esm() and include_js_files!()?
I have an embed scenario working as expected with a JavaScript module, but less efficiently than I'd like (I suppose). I'd like to make it more efficient using snapshotting, but I'm hitting roadblocks that I don't understand. (Details follow...)
My scenario works when I use the deno_runtime crate and the MainWorker API: I use include_str!() to include the string contents of that module in the binary, create a ModuleCode from it, and pass that to load_side_module(). Something like this:
After that completes, the runtime is in a state where I can run scripts that use that side module.
However, what I think is happening here is that every time this embed code runs, it has to create a new module, re-parse the JavaScript, and re-execute the module in order to put the runtime into a state where the module is available for use by some "main module", or by some ad-hoc code that I may later run with execute_script(). (Not to mention that the binary has to include the JavaScript source code as a string, making for a larger binary.)
So what it seems like I should be doing is leveraging the snapshotting feature: do the equivalent of this side module loading work in a build.rs, take a snapshot, and then load that snapshot in the main.rs (or lib.rs) of my crate.
It seems like the recent video live stream addresses exactly my concern, except that it doesn't address the roadblock I'm hitting:
https://www.youtube.com/watch?v=zlJrMGm-XeA
When I attempt to use the snapshotting APIs described in the video, I run into other problems. The method shown there uses an extension, and loads the code using a deno_core::ExtensionBuilder, .esm(), and include_js_files!().
When I load the exact same JavaScript module in an extension like that, I get an error like "Extension code must be 7-bit ASCII...".
So there is evidently some difference in how this JavaScript module is loaded and processed in these two loading scenarios.
The higher-level approach, loading as a "side module" in the JsRuntime that is part of the MainWorker provided by the deno_runtime crate, works as expected, but is less efficient than it might be if I were leveraging snapshotting. The lower-level approach, using an extension with deno_core, does not work because this 7-bit ASCII requirement is violated.
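When a bundle trips the "Extension code must be 7-bit ASCII" check, it can help to see exactly which characters are responsible before deciding how to deal with them. The following is a minimal sketch using only the Rust standard library; the names NonAscii and find_non_ascii are hypothetical and are not part of deno_core or deno_runtime.

```rust
/// Location of one non-ASCII character in a source string.
/// (Hypothetical helper type, not part of any Deno crate.)
#[derive(Debug, PartialEq)]
pub struct NonAscii {
    /// 1-based line number.
    pub line: usize,
    /// 1-based column, counted in characters (not bytes).
    pub column: usize,
    /// The offending character.
    pub ch: char,
}

/// Returns every character in `source` that falls outside 7-bit ASCII,
/// together with its line and column.
pub fn find_non_ascii(source: &str) -> Vec<NonAscii> {
    let mut hits = Vec::new();
    for (line_idx, line) in source.lines().enumerate() {
        for (col_idx, ch) in line.chars().enumerate() {
            if !ch.is_ascii() {
                hits.push(NonAscii {
                    line: line_idx + 1,
                    column: col_idx + 1,
                    ch,
                });
            }
        }
    }
    hits
}
```

Running something like this over the output of include_str!() on the bundled module would list the characters that need attention; an empty result means the source is already pure ASCII.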
What am I missing?
Deno
YouTube
Deno Sessions ep 3: 1.32 and snapshotting in custom JS runtime
Andy and Leo go over recent Deno updates and talk about how to use snapshotting.
00:00 Intro
02:26 Deno 1.32
08:18 What is snapshotting?
13:07 Implementing snapshotting
15:55 Dissecting CreateSnapshotOptions
23:33 Writing build.rs to create snapshot
29:13 Loading snapshot in main.rs
35:11 Q&A
45:00 Answering questions from GitHub Issue on Sna...
It's a constraint we put in place to ensure performance: ASCII strings are waaaay cheaper for V8 to store in a snapshot than UTF-8. If you need to store that module in the snapshot, just replace the non-ASCII characters in the code with escape codes.
Like here:
GitHub
deno/01_console.js at 246569f6d45852aa42d6f7fe6221fe4d9fa69e3c · de...
Ok, that makes sense about why you'd do that optimization, and the possible escape code workaround.
I still wonder why I only hit this ASCII requirement when trying to load it as an extension. If I recall correctly, I was having this same problem with the ASCII requirement when I was just loading the module in an extension, even without creating a snapshot.
Is the higher-level approach (via load_side_module()) replacing the UTF-8 characters automatically for me? Or is it somehow loading the module in a way that does not have an ASCII requirement?
"I still wonder why I only hit this ASCII requirement when trying to load it as an extension. If I recall correctly, I was having this same problem with the ASCII requirement when I was just loading the module in an extension, even without creating a snapshot."
Because extensions are mostly meant for our internal usage, so we imposed these limitations. When loading code using load_side_module() or load_main_module() we are loading user code, and there's no guarantee that it will be ASCII.
"Or is it somehow loading the module in a way that does not have an ASCII requirement?"
Correct, when executing these modules there's no requirement for ASCII because they are not snapshotted and the ES module specification allows UTF-8. Again, we imposed the ASCII limitation for extension! because extensions are usually snapshotted and we wanted to make them as cheap as possible to snapshot.
Ok, thanks.
My conclusion from this, then, is that if I want to use any valid ES module, I should not load it via the extension! macro.
Furthermore, I should not expect to be able to snapshot any valid ES module either.
...which circles back to your initial suggestion: replace any non-ASCII characters with escape codes, if I want to load such a module in an extension, in order to create a snapshot.
That's a valid conclusion for the time being - once Matt is back from OOO we can maybe look into making it possible, but for now I would only use it for the code you fully control. You should be able to revert to using ExtensionBuilder though - IIRC there should be an option there to just load any ES module, not just ASCII. Not sure though
Thinking about it now - we should probably make this configurable somehow for the extension! macro 🤔
Can you circle back next week? I'll discuss with Matt; maybe we can make it work.
@.bartlomieju sure, thanks.
For what it's worth, I can control all of this code. In fact, the module I'm importing is already something that was produced by a custom Rollup configuration. I'll attempt to just replace utf-8 characters with escape codes on that bundle.
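The escape-code replacement discussed in this thread could be automated as a small build step. Below is a rough sketch, again standard library only, under these assumptions: the function name escape_non_ascii is hypothetical, and rewriting a character as \uXXXX is taken to be safe wherever it occurs in the bundle. That holds for ordinary string literals, comments, and identifiers, but can change behavior in places that see raw source text (for example String.raw tagged templates), so the output is worth testing.

```rust
/// Rewrites every non-ASCII character in `source` as one or two
/// JavaScript `\uXXXX` escapes (UTF-16 code units), leaving ASCII untouched.
/// (Hypothetical helper, not part of any Deno crate.)
pub fn escape_non_ascii(source: &str) -> String {
    let mut out = String::with_capacity(source.len());
    for ch in source.chars() {
        if ch.is_ascii() {
            out.push(ch);
        } else {
            // Characters above U+FFFF need a surrogate pair, which
            // `encode_utf16` produces as two code units.
            let mut units = [0u16; 2];
            for unit in ch.encode_utf16(&mut units).iter() {
                out.push_str(&format!("\\u{:04X}", *unit));
            }
        }
    }
    out
}
```

The result is always pure ASCII, so a bundle passed through this step (for instance as a post-processing stage after Rollup) should no longer trigger the 7-bit ASCII check.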