X Tutup
The Wayback Machine - https://web.archive.org/web/20221221145807/https://github.com/nodejs/node/pull/44710
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move ESM loaders off-thread #44710

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

JakobJingleheimer
Copy link
Contributor

@JakobJingleheimer JakobJingleheimer commented Sep 18, 2022

Resolves #43658

To-dos:

  • Address Module serialisation issue (Proxies aren't serialisable)
  • Spawn hooks worker only when custom hooks are supplied
  • Convert ESM defaultResolve to synchronous (no real change there)
  • Convert import.meta.resolve() to synchronous
    • When no custom hooks are supplied, just regular sync
    • When any custom hooks are supplied, leverage worker + atomics to simulate sync
  • Handle very large request/response payload (main ↔︎ worker)
  • Performance check effects to
    • node bootstrap/startup
    • individual tasks (eg a single import)

@JakobJingleheimer JakobJingleheimer added this to In progress in Loaders via automation Sep 18, 2022
@nodejs-github-bot
Copy link
Contributor

nodejs-github-bot commented Sep 18, 2022

Review requested:

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Sep 18, 2022
lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
const lock = new Int32Array(comsChannel, 0, 4);
/**
* The request & response segment of the shared memory. TextEncoder/Decoder (needed to convert
* requests & responses into a format supported by the coms channel) reads and writes with
Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* requests & responses into a format supported by the coms channel) reads and writes with
* requests & responses into a format supported by the comms channel) reads and writes with

Why is the other Int32Array and this is Uint8Array?

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments tell you why 🙂

Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw the comment explaining why Int32Array, but why Uint8Array?

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's lines 15–17 of this file, right above this PR thread (the mobile app won't let me copy+paste). It starts with TextEncoder/Decoder (the reason is because that's the numerical datatype TextEncoder/Decoder can handle).

Maybe it doesn't need to be numerically encoded at all? 🤔 it goes into a SharedArrayBuffer, and the first slice of that is used by the Atomic, but the rest of it isn't. It's probably smaller, but I didn't check. TextEncoder/Decoder + Uint8Array was Guy's suggestion—I can ask him.

Oh, duh, it does need to be numerical because some form of that is the only view option for slicing up a SharedArrayBuffer.

lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
*/
const requestResponseData = new Uint8Array(comsChannel, 4, 2044);

const worker = new Worker('./worker.mjs', {
Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A filename of worker.mjs implies that this is the only worker we’ll ever have under modules/esm. Perhaps user_loaders_worker.mjs? Also the actual file is named worker.js, a CommonJS file.

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand: it is a CJS file.

RE file name: cross that bridge when we come to it?

Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code says worker.mjs. If the file is named worker.js, shouldn’t the code say the same?

Re cross that bridge: renaming a file often messes with git history. It’s better to try to future-proof names as much as we can from the start. It doesn’t hurt to be as specific as possible with the filename, that will encourage future additions to be added into new files rather than bloating existing files.

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH! I think it's copy+pasta from my experiment (which was esm).

One of the major downsides of using this worker is errors get swallowed.

lib/internal/modules/esm/worker.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/worker.js Outdated Show resolved Hide resolved
} = primordials;

const {
ERR_VM_DYNAMIC_IMPORT_CALLBACK_MISSING,
} = require('internal/errors').codes;
const { ESMLoader } = require('internal/modules/esm/loader');
const publicESMLoader = require('internal/modules/esm/public_loader_proxy');
Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does “public” mean here? As in there are two ESMLoader instances, one for Node internals and one for user code, and this is the latter? Is it publicly accessible to user code?

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct: internal is used internally by node for its own code. Public handles userland code. Neither are accessible in userland.

Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it’s not accessible to application code, “public” feels like a poor choice of name. Maybe applicationESMLoader or userCodeESMLoader?

@GeoffreyBooth GeoffreyBooth added esm Issues and PRs related to the ECMAScript Modules implementation. loaders Issues and PRs related to ES module loaders labels Sep 18, 2022

Atomics.wait(lock, 0, 1); // ! Block this module until the worker is ready.

function makeRequest(type, data) {
Copy link
Member

@MoLow MoLow Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function makeRequest(type, data) {
function makeRequest(method, data) {

(new TextDecoder().decode(requestResponseData))
.replaceAll('\x00', '') // strip empty space (not a great solution)
);
const response = await publicESMLoader[type](data);
Copy link
Member

@MoLow MoLow Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const response = await publicESMLoader[type](data);
const response = await publicESMLoader[method](data);

@@ -85,21 +84,21 @@ function loadModulesInIsolation(specifiers, loaders = []) {
// A separate loader instance is necessary to avoid cross-contamination
// between internal Node.js and userland. For example, a module with internal
// state (such as a counter) should be independent.
const internalEsmLoader = new ESMLoader();
internalEsmLoader.addCustomLoaders(loaders);
const internalESMLoader = new ESMLoader();
Copy link
Member

@MoLow MoLow Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should --loader and --import run on the main thread?

Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving --loader off the main thread is the point of this PR, so I’m not sure I understand the question?

--import is user application code so it needs to stay on the main thread.

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Moshe is asking whether both "internal" and "public" loaders should be in the worker thread (at the moment in this PR, only the public is—once I get the public, I'll move internal).

Copy link
Member

@GeoffreyBooth GeoffreyBooth Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we move the “internal” loader onto the worker thread, wouldn’t that break user code that monkey-patches Node stuff?

Perhaps we could do that as a follow-up, if we do it at all. It feels like it might be a semver-major change.

(new TextDecoder().decode(requestResponseData))
.replaceAll('\x00', '') // strip empty space (not a great solution)
);
const response = await publicESMLoader[type](data);
Copy link
Member

@MoLow MoLow Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lacks error handling, I would expect the worker to somehow let the main thread via the proxy about the failure

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a draft because it's not done yet 😉 good shout though, thanks

Copy link
Member

@MoLow MoLow Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'l give another peak when I see it moves to ready for review :)

@JakobJingleheimer JakobJingleheimer force-pushed the feat/esm_off-thread-loaders branch 3 times, most recently from 2265b79 to ce94ba6 Compare Sep 19, 2022
lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/worker.js Show resolved Hide resolved
lib/internal/modules/esm/public_loader_proxy.js Outdated Show resolved Hide resolved
lib/internal/modules/esm/worker.js Outdated Show resolved Hide resolved
Atomics.store(lock, 0, 1); // Send 'ready' signal to main
Atomics.notify(lock, 0); // Notify main of signal

while (true) { // event loop
Copy link
Contributor

@aduh95 aduh95 Sep 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's not a good idea to call that "event loop" when event loop already means something else in JS-land.

Suggested change
while (true) { // event loop
while (true) {

@@ -166,7 +167,7 @@ class Worker extends EventEmitter {
RegExpPrototypeExec(/^\.\.?[\\/]/, filename) !== null) {
filename = path.resolve(filename);
url = pathToFileURL(filename);
} else {
} else if (!isInternal) {
Copy link
Contributor

@dygabo dygabo Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:
considering the above change on Line 141:

  if (options.eval || isInternal)

This check is not needed as isInternal must be false to get here

@cspotcode
Copy link

cspotcode commented Sep 27, 2022

Does each worker thread get its own loader thread? E.g. if an app launches 2x worker threads, there will be 6x threads total: one main, 2x worker, 3x loader?

@JakobJingleheimer
Copy link
Contributor Author

JakobJingleheimer commented Sep 27, 2022

The current design puts all loaders in the same worker thread (eg there are a total of 2 threads: the main and the worker). Internal things are handled via the main thread; userland things are handled by the worker thread.

@cspotcode
Copy link

cspotcode commented Sep 27, 2022

I think that's slightly different than what I'm asking about. If a node app uses the worker threads API to launch worker threads, does each get its own loader thread? (where each loader thread may have one or more loaders running within it)

Not does each loader get a thread, but does each user thread get a loader thread.

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Sep 27, 2022

If a node app uses the worker threads API to launch worker threads, does each get its own loader thread?

I think the current state (before this PR) is that loaders always execute in the main thread, even if they’re “for” user code that is in a worker thread. So I would think that after this PR, loaders would run inside their worker thread, regardless of whether they’re customizing main or worker thread user code.

So in other words, for an app using custom loaders where user code that spawns three workers, there are five threads total: main, loaders, and the three workers spawned by the user code.

This is just my guess, others can please correct me if I’m mistaken.

@targos
Copy link
Member

targos commented Sep 27, 2022

I don't know how it's implemented but I would expect that each Node.js thread (main and workers) has its own separate loaders thread, at least for two reasons:

  • Worker threads are supposed to be as isolated as possible from the main thread and from each other
  • You can spawn a worker thread with a different set of --loader flags.

@bmeck
Copy link
Member

bmeck commented Sep 27, 2022

@targos
Copy link
Member

targos commented Oct 8, 2022

What remains to be done?

@JakobJingleheimer
Copy link
Contributor Author

JakobJingleheimer commented Oct 8, 2022

Something in this implementation is causing node to hang on startup. We suspect it's some kind of circular dependency, and the subsequent dependency smacks into the Atomics lock (preventing the rest of the flow to complete, which would release the initial lock). I think this circular dep is between ESMLoader and Worker (it's rather difficult to troubleshoot as output is swallowed).

There is a working PoC, so we know this works in principle (and that the Atomics otherwise behave appropriately, so the problem lies with integrating it into node).

This was on hold whilst I was on holiday. I'm working on it today and tomorrow (but will be away on a business trip next week).

@JakobJingleheimer
Copy link
Contributor Author

JakobJingleheimer commented Oct 8, 2022

I found the circular dep:

process/esm_loader.js:12modules/esm/public_loader_proxy.js:3worker_thread.js:29/36process/pre_execution.js:562process/esm_loader.js

@JakobJingleheimer
Copy link
Contributor Author

JakobJingleheimer commented Oct 9, 2022

Gah, it wasn't a circular dependency 🤦‍♂️ it was setting url to null here:

node/lib/internal/worker.js

Lines 141 to 149 in 9836c67

if (options.eval || isInternal) {
if (typeof filename !== 'string') {
throw new ERR_INVALID_ARG_VALUE(
'options.eval',
options.eval,
'must be false when \'filename\' is not a string'
);
}
url = null;

which caused the worker to spawn with nothing in it.

(I shouldn't have copied it over from the previous attempt at off-threading, which was later hard-coding the url).

I'm not sure if the internal worker should go through the classic path, or merely get require()ed (I tried both, and it doesn't seem to make a difference).

There is (now was) a circular dependency issue after the empty worker issue is addressed in

initializeESMLoader();

that's now fixed too.

NODE_DEBUG=esm,worker ./out/Release/node -p 1 still doesn't print 1, BUT it does now print debugging output, and we can see it getting fairly far along now, so progress:

ESM 13018: hooking up public ESMLoader
ESM 13018: creating worker for publicESMLoader
[…]
WORKER 15812: instantiating Worker. url: file:///…/nodejs/node/lib/internal/modules/esm/worker.js doEval: classic
WORKER 15812: [0] created Worker with ID 1

esm/worker.js's debug (worker for public ESM running) is not printing, so that's likely why things have stalled (as soon as the worker is running, it releases the lock).

JakobJingleheimer and others added 10 commits Dec 4, 2022
wordsmith comments

Co-authored-by: Geoffrey Booth <webadmin@geoffreybooth.com>
Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>
`__dirname` is not defined? O.o
- Let's not create a debug file, otherwise we create a new instance
  each time instead of re-using the existing one.
- Internal modules are not classic scripts.
- Fix `await` use outside of async function.
@GeoffreyBooth GeoffreyBooth force-pushed the feat/esm_off-thread-loaders branch from a75ed16 to 38bec52 Compare Dec 4, 2022
@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Dec 4, 2022

I just rebased on main. I was able to reproduce the issues cited above. To summarize:

./node --trace-uncaught --test ./test/es-module/test-esm-loader-chaining.mjs
TAP version 13
node:internal/test_runner/harness:37
      throw err;
      ^

Error [ERR_UNHANDLED_ERROR]: Unhandled error. ('This Environment was initialized without a V8::Inspector')
    at new NodeError (node:internal/errors:400:5)
    at InternalWorker.emit (node:events:508:17)
    at [kOnErrorMessage] (node:internal/worker:312:10)
    at [kOnMessage] (node:internal/worker:323:37)
    at MessagePort.<anonymous> (node:internal/worker:218:57)
    at [nodejs.internal.kHybridDispatch] (node:internal/event_target:734:20)
    at exports.emitMessage (node:internal/per_context/messageport:23:28) {
  code: 'ERR_UNHANDLED_ERROR',
  context: 'This Environment was initialized without a V8::Inspector'
}
./node --eval 'console.log(1)'
node:internal/modules/cjs/loader:1165
    throw err;
    ^

TypeError: Cannot read properties of undefined (reading 'set')
    at internalCompileFunction (node:internal/vm:102:17)
    at wrapSafe (node:internal/modules/cjs/loader:1141:20)
    at Module._compile (node:internal/modules/cjs/loader:1182:27)
    at runScript (node:internal/process/execution:82:27)
    at evalScript (node:internal/process/execution:104:10)
    at node:internal/main/eval_string:50:3
./node --input-type=module --eval 'console.log(1)'
^C # Kill hung process

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Dec 4, 2022

@JakobJingleheimer @nodejs/v8 @targos @MoLow I’ve made a small discovery that might point to the cause of the This Environment was initialized without a V8::Inspector error. So per above we were getting that error by running ./node --trace-uncaught --test ./test/es-module/test-esm-loader-chaining.mjs. If you make the following change to src/env-inl.h:

inline bool Environment::should_create_inspector() const {
-  return (flags_ & EnvironmentFlags::kNoCreateInspector) == 0 &&
-         !options_->test_runner && !options_->watch_mode;
+  return (flags_ & EnvironmentFlags::kNoCreateInspector) == 0;
}

Then the error disappears, and the process hangs like it does for ./node --input-type=module --eval 'console.log(1)'. An __error.log file is generated, containing this:

TypeError [ERR_INVALID_ARG_VALUE]: The argument 'filename' must be a file URL object, file URL string, or absolute path string. Received undefined
    at new NodeError (node:internal/errors:400:5)
    at createRequire (node:internal/modules/cjs/loader:1337:11)
    at file:///Users/geoffrey/Sites/node/test/common/index.mjs:3:17
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)

While this kNoCreateInspector flag was added in #35025, the second line about !options_->test_runner && !options_->watch_mode was added in #44366. So it appears that something about --test is causing this error. This might be a bug in general, not specific to this PR, but I don’t have time at the moment to try to reproduce it in a minimal context (cc @MoLow).

@JakobJingleheimer For the purposes of this PR I would just test it using a command other than --test, and hopefully the --test issue can be investigated separately.

@MoLow
Copy link
Member

MoLow commented Dec 4, 2022

While this kNoCreateInspector flag was added in #35025, the second line about !options_->test_runner && !options_->watch_mode was added in #44366. So it appears that something about --test is causing this error. This might be a bug in general, not specific to this PR, but I don’t have time at the moment to try to reproduce it in a minimal context (cc @MoLow).

the objective of the change in #44366 was to make --inspect (with all of its variations) only affect the "inner" process since both --test and --watch have a wrapping process with internal node code, --inspect should not open a debugger session on the external process.

we might want to only pass --trace-uncaught to the inner process and ignore it in the wrapper process

@JakobJingleheimer
Copy link
Contributor Author

JakobJingleheimer commented Dec 6, 2022

./node --trace-uncaught ./test/es-module/test-esm-loader-chaining.mjs

is now passing, and errors from the worker are now surfaced (and the process terminates gracefully)

Something related to the --test trigers such an error

./node --trace-uncaught --test ./test/es-module/test-esm-loader-chaining.mjs

Also, I believe I've found the cause of the v8.serialize() failure for

./node --input-type="module" -e "console.log(1)"

ESMLoader::eval() returns a Module instance, and Module::wrapper is a Proxy, which is not serialisable. I'll dig into that (chatted with Geoffrey, and we have a couple avenues to try).

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Dec 6, 2022

the objective of the change in #44366 was to make --inspect (with all of its variations) only affect the “inner” process since both --test and --watch have a wrapping process with internal node code, --inspect should not open a debugger session on the external process.

we might want to only pass --trace-uncaught to the inner process and ignore it in the wrapper process

@MoLow it seems like this change breaks ./node --trace-uncaught --test ./test/es-module/test-esm-loader-chaining.mjs when worker threads are used, which happens as a result of the changes in this PR. I suspect that node --test some-code-that-spawns-workers.js is probably also broken. Do you have time to look into this? Perhaps there’s a different way to prevent the inspector from covering the “wrapper” process, or we can revert that change for now (to un-break using --test with workers) and add it back once a new method is determined?

@MoLow
Copy link
Member

MoLow commented Dec 8, 2022

@GeoffreyBooth Il take a deep look at this on this weekend

@mcollina mcollina dismissed their stale review Dec 8, 2022

The worker_thread is spawned only if enabled.

@MoLow
Copy link
Member

MoLow commented Dec 9, 2022

@GeoffreyBooth @JakobJingleheimer I tryed reproducing with

// worker.js
'use strict';

const { parentPort } = require('worker_threads');

parentPort.on('message', (msg) => {
  parentPort.postMessage(msg);
  setTimeout(() => { throw new Error('test'); }, 10);
});
import { describe, it } from 'node:test'
import { deepEqual, fail } from 'node:assert'
import { Worker } from 'node:worker_threads'



it('test', async () => {
  const worker = new Worker("./worker.js");
  worker.on('online', console.log);
  worker.postMessage('hello');
  await new Promise((resolve) => {
    worker.on('message',(...args) => {
      deepEqual(args, ['hello']);
      setTimeout(() => {
        worker.terminate();
        resolve();
      }, 100);
    });
  });
});

but this does not seem to reproduce any issues with --trace-uncaught,
what am I missing?

@JakobJingleheimer
Copy link
Contributor Author

JakobJingleheimer commented Dec 9, 2022

@MoLow the issue is with --test (not --trace-uncaught)

@MoLow
Copy link
Member

MoLow commented Dec 10, 2022

I have tried both, and ran test-esm-loader-chaining as well. both with and without --test - was not able to reproduce

@@ -0,0 +1,69 @@
'use strict';
Copy link
Member

@joyeecheung joyeecheung Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is supposed to be a script instead of a module, you can put it in lib/internal/main/ instead.

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's different about that location? I put it here because it's exclusively for the use of ESM (loaders)

*/
const requestResponseData = new Uint8Array(commsChannel, 4, 2044);

debug('creating worker for publicESMLoader');
Copy link
Member

@joyeecheung joyeecheung Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't do debug() at the top-level in a module that's supposed to be require()ed, you can wrap this in a function that gets called by who needs this to happen, on-demand.

const esmLoader = new ESMLoader();
exports.esmLoader = esmLoader;
debug('hooking up public ESMLoader');
const customLoaders = getOptionValue('--experimental-loader');
Copy link
Member

@joyeecheung joyeecheung Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid accessing getOptionValue() at the top level in a module that's supposed to be require()ed. That makes the module immediately runtime-dependent and unsnapshottable. The loader can be created when the loader initialization is actually run for the first time, see #45828. (In general internal modules are not supposed to cause side-effects or query runtime-states at the top-level i.e. when they are require()ed. Most of those things can always be done on-demand in a function instead. It only makes sense to do that at the top-level in main scripts).

@@ -0,0 +1,74 @@
'use strict';
Copy link
Member

@joyeecheung joyeecheung Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be merged into esm/loader or just process/esm_loader too if you wrap the side-effect below into a function that produces a loader proxy and use a boolean to make sure it only gets created once. That reduces one more require() call. (In general, avoid doing side-effects at the top-level in internal modules, that makes the modules themselves unsnapshottable. Most things can be done in functions that get exported and called on-demand. It's an overkill to rely on the internal module loader to implement call-only-once, that can be easily achieved with a if (called) { return; } else { called = true; } pattern.

}

// ! maybe move this outside
const { ESMLoader } = require('internal/modules/esm/loader');
Copy link
Member

@joyeecheung joyeecheung Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file can be merged into esm/loader and gets exported as part of that module.exports, then you no longer need to require that file below. And there would be one less require() call from modules/esm/worker.js as well.

(And to the comment above - that creates a new process/esm_loader -> esm/load_modules_in_isolation -> internal/modules/esm/loader -> internal/modules/esm/* -> process/esm_loader circular dependency if you move it outside, the process/esm_loader -> internal/modules/esm/loader -> process/esm_loader is already inevitable (but should be made lazy, as done in #45828), but let's try not to make the mess worse by adding a new circular path, which might work if you are lucky but makes the code harder to reason about especially when there are TDZs around).

Copy link
Contributor Author

@JakobJingleheimer JakobJingleheimer Dec 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the insight!

I need to pivot approach on this PR, and in the pivot, this can go back to where it was.

@JakobJingleheimer
Copy link
Contributor Author

JakobJingleheimer commented Dec 13, 2022

We realised the serialisation issues are indeed caused by Proxys in Module; one of the Proxies seems maybe unnecessary (a getter might suffice), but the other is very necessary (it may be possible to refactor it to something else, but it would likely be undesirable).

So we're planning to pivot the approach here: Instead of instantiating the whole public ESMLoader within the worker (what we're calling the "heavy" approach), we're going to pursue two others.

A "thin" worker that is basically just a sandbox; the public ESMLoader's relevant methods currently contain loops. With this approach, each iteration of the loop would make a request to the worker, wherein the hook is merely executed and its output returned as the worker's response. The benefits of this approach are that it's very simple (but proves it works); the cons are it will likely have unideal performance (due to many requests/responses, and thus a lot of serialisation and deserialisation).

A "thick" worker wherein we extract hook-related methods off the current ESMLoader class into a separate Hooks class (or other structure). When the worker is needed, it is spawns with the Hooks class inside; then the worker responds with the final result of a hook chain.

The response of both approaches has the same shape (a plain object with a few properties like source: Buffer | string).

We may first go the thin option as a PoC and then land the thick.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. lib / src Issues and PRs related to general changes in the lib or src directory. loaders Issues and PRs related to ES module loaders needs-ci PRs that need a full CI run.
Projects
Loaders
In progress
Development

Successfully merging this pull request may close these issues.

esm, loader: move to own thread
X Tutup