X Tutup
Skip to content

build: add ts support in core modules#62146

Open
marco-ippolito wants to merge 5 commits intonodejs:mainfrom
marco-ippolito:native-ts
Open

build: add ts support in core modules#62146
marco-ippolito wants to merge 5 commits intonodejs:mainfrom
marco-ippolito:native-ts

Conversation

@marco-ippolito
Copy link
Member

@marco-ippolito marco-ippolito commented Mar 7, 2026

Let's try again.
This PR allows Node.js source code to be written in TS.
This is semver major because the build now always requires rust to be installed.
This adds swc as a rust crate dependency and it's only use during build time.
Technically we could replace amaro wasm with this for type stripping but it's not the goal of this PR and in needs other considerations.
I moved an internal from .js and .ts to showcase
Also added a flag so that the transpiled code can be writte on the fs for debugging.
A lot of the addition are vendored crates sorry for the massive PR
I used AI to help me out with things I dont know (rust) so please review carefully
I had to bump the rustc from 1.82 to 1.85, obviously this cannot land if we dont update that on the CI machines nodejs/build#4245

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/build
  • @nodejs/gyp
  • @nodejs/security-wg
  • @nodejs/tsc

@nodejs-github-bot nodejs-github-bot added build Issues and PRs related to build files or the CI. dependencies Pull requests that update a dependency file. doc Issues and PRs related to the documentations. needs-ci PRs that need a full CI run. labels Mar 7, 2026
@marco-ippolito marco-ippolito added semver-major PRs that contain breaking changes and should be released in the next major version. strip-types Issues or PRs related to strip-types support labels Mar 7, 2026
@marco-ippolito marco-ippolito changed the title build: add type ts support in core build: add ts support in core Mar 7, 2026
@marco-ippolito marco-ippolito changed the title build: add ts support in core build: add ts support in core modules Mar 7, 2026
@marco-ippolito marco-ippolito marked this pull request as ready for review March 7, 2026 15:28
@marco-ippolito marco-ippolito added the blocked PRs that are blocked by other issues or PRs. label Mar 7, 2026
@anonrig
Copy link
Member

anonrig commented Mar 7, 2026

@marco-ippolito do you know what steps are needed to run and expose Rust code just like C++?

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Mar 7, 2026

@marco-ippolito do you know what steps are needed to run and expose Rust code just like C++?

its already done in this PR, see d9ee95f

basically add the crate in the deps/crates cargo.toml, create a header file if doesnt come with the crate and add it to the node gyp

@ljharb
Copy link
Member

ljharb commented Mar 7, 2026

Will this type strip at runtime? Or will it do it at build time?

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Mar 7, 2026

Will this type strip at runtime? Or will it do it at build time?

Build time

@ljharb
Copy link
Member

ljharb commented Mar 7, 2026

How does this interact with the --node-builtin-modules-path flag? will it still type strip at runtime in that case?

@marco-ippolito
Copy link
Member Author

How does this interact with the --node-builtin-modules-path flag? will it still type strip at runtime in that case?

Yes, I tested it, it works fine. it plugs in the js2c module

@mcollina
Copy link
Member

mcollina commented Mar 7, 2026

I'm relatively concerned about adding 1 million and 300 hundred thousand lines of code, even if they are just dependencies.

Is there a solution that does not require this much?

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Mar 7, 2026

I dont think so since we are vendoring dependencies. I think the only solution is to have an automation generate them but its kinda impossible to review anyways. (The temporal PR from @legendecas had the same problem)

@legendecas
Copy link
Member

legendecas commented Mar 7, 2026

I think the question should be is it necessary to have every swc dependencies to be included?

I listed the newly added dependencies, and the followings are the biggest deps that I doubt if they are required:

9.6M	deps/crates/vendor/tracing          => Diagnostic purpose
5.1M	deps/crates/vendor/libc             => Indirect dep of crates `cpufeatures` and `num_cpus`
2.8M	deps/crates/vendor/regex-automata   => Regex engine
1.6M	deps/crates/vendor/bitvec           => Indirect dep of swc_sourcemap
1.6M	deps/crates/vendor/zerocopy         => Indirect dep of hashbrown
1.3M	deps/crates/vendor/zerocopy-derive  => Indirect dep of hashbrown
1.0M	deps/crates/vendor/idna             => Indirect dep of crate `url`

Additionally, crates like wasm-bindgen are not technically needed for the PR purpose.

I think technically many of these deps can be stripped down.

@marco-ippolito
Copy link
Member Author

Can I just like delete them and see if it builds?

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Mar 8, 2026

I tried to remove those crates but it seems they are all required to build. Idk if there is a way to know exactly which ones are unused or how to remove them, I'm no rust expert

@targos
Copy link
Member

targos commented Mar 8, 2026

This is blocked by nodejs/build#4245

@marco-ippolito
Copy link
Member Author

Node.js configure: Found Python 3.12.3...
INFO: configure completed successfully
SKIP_XZ=1 supplied, skipping .tar.xz creation
mkdir -p out/doc
cp doc/node-config-schema.json out/doc
mkdir -p out/doc/api
cp -r doc/api out/doc

added 323 packages, and audited 325 packages in 6s

136 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
[09:05:04.899] ERROR: Unknown file extension ".wasm" for /home/runner/work/node/node/tools/doc/node_modules/@minify-html/wasm/index_bg.wasm
TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension ".wasm" for /home/runner/work/node/node/tools/doc/node_modules/@minify-html/wasm/index_bg.wasm
    at Object.getFileProtocolModuleFormat [as file:] (node:internal/modules/esm/get_format:189:9)
    at defaultGetFormat (node:internal/modules/esm/get_format:232:36)
    at defaultLoad (node:internal/modules/esm/load:145:22)
make: *** [Makefile:845: out/doc/api/addons.html] Error 1
    at async ModuleLoader.loadAndTranslate (node:internal/modules/esm/loader:543:45)

this seems unrelated but consistent 🫤

@marco-ippolito marco-ippolito force-pushed the native-ts branch 3 times, most recently from 6fa7a86 to 7f79144 Compare March 8, 2026 09:20
@ShogunPanda
Copy link
Contributor

I'm relatively concerned about adding 1 million and 300 hundred thousand lines of code, even if they are just dependencies.

Is there a solution that does not require this much?

I had she same concern in another PR I was working on. (no spoiler ;P)
I wonder if it would make sense to track such deps in another repo or similar solutions.

@codecov
Copy link

codecov bot commented Mar 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.65%. Comparing base (3725bd2) to head (5c971a2).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #62146      +/-   ##
==========================================
- Coverage   89.65%   89.65%   -0.01%     
==========================================
  Files         676      675       -1     
  Lines      206543   206492      -51     
  Branches    39547    39532      -15     
==========================================
- Hits       185184   185133      -51     
+ Misses      13480    13474       -6     
- Partials     7879     7885       +6     

see 38 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment on lines +235 to +236
* A Rust toolchain (`rustc` >= 1.85, `cargo` >= 1.85) for compiling native
dependencies. See <https://rustup.rs/> for installation instructions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other dependency needs are not only listed at the top level, there are also explicit instructions given for each operating system. These explicit instructions per operating system for Rust are missing here.

Copy link
Contributor

@MikeMcC399 MikeMcC399 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't know how it's possible to review a PR with 3000+ files changed manually.

I don't think that this can be anything more than a proof of concept, unless there is some reliable way to review it automatically.

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Mar 9, 2026

I really don't know how it's possible to review a PR with 3000+ files changed manually.

I don't think that this can be anything more than a proof of concept, unless there is some reliable way to review it automatically.

This is not a proof of concept. We are vendoring dependencies, so unless we are blocking all future rust dependencies (and as other comments suggest, also other contributors are already working on adding more), we have to find a way around. A way to review is by commit. A separate commit adds the vendored dependencies, it is true that its impossible to review that single commit.

@Renegade334
Copy link
Member

I'm assuming that there hasn't been a formal discussion beyond what was previously discussed at #60489?

Aside from the deps question, I do think this warrants a bit more of a think from the design perspective.

What are we hoping to accomplish here? Huge swathes of core aren't really TS-able due to legacy patterns that we can't move away from (the old pseudoconstructors etc), so we would be committing to a divided codebase. Typechecking in TS modules would only be robust if all their JS module dependencies were typed as well, which is currently very limited.

tsserver-compatible IDEs support strict typechecking in JS, as would using tsc in a CI pipeline if we really wanted to. It seems to me that aiming towards a JSDoc-driven internal type system (with phased-in checkJs support) would be far less hassle than shoehorning in TS transpilation, is basically an extension of what some subsystems are already doing, and would be better compatible with the pre-ES6 elements in core.

@marco-ippolito
Copy link
Member Author

The old pseudoconstructor pattern can be refactored to es6, I have done that myself a few times. We are using type stripping so it can be all migrated because it can coesist with javascript. The migration can take place over many years and its not a mass rewrite. Type checking applies only to the files we migrate over time.
The main benefit other than well types, is that at some point we can generate an accurate @types/node.
There are still things we need to figure out in type checking but those can be fixed as we go

@Renegade334
Copy link
Member

Renegade334 commented Mar 10, 2026

The old pseudoconstructor pattern can be refactored to es6, I have done that myself a few times.

Moving to new-only construction of streams, EventEmitters, etc. is one of those "break all the legacy libraries" changes that I somewhat assumed we would never be able to make.

This is one example. The biggest caveat, however, is the module system. tsc has require(...) and module.exports as typechecked operations in JS only, and the typechecked equivalents in TS (import ... = require(...)/ESM-style imports and export =/ESM-style exports) are not erasable syntax. Jake did raise this in the previous PR, but we are going to be continually straining against a TS language service that simply wasn't designed with the internal module environment in mind, and would probably have to add type casts to every imported identifier just to hack around these constraints. (The file renamed in this PR is a case in point: it might "type-check" just fine, but the type of the lazy import is any, as would be the type of require('internal/util/colors') if imported from another core module.)

I'm all for better code quality in core, but this seems like a very nuclear move with more nuance than is maybe being presented.

The main benefit other than well types, is that at some point we can generate an accurate @types/node.

This comes up frequently as a suggestion. For reasons that are very complex and to do with the TS ecosystem rather than Node.js itself, autogenerating @types/node is never really going to work. It was the web team who last sat down with TS a year or so ago to discuss the ins and outs of this, I am happy to do so again, but this should not really be a consideration in weighing up the pros and cons of this change.

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Mar 10, 2026

Moving to new-only construction of streams, EventEmitters, etc. is one of those "break all the legacy libraries" changes that I somewhat assumed we would never be able to make.

It needs to go through a deprecation cycle.

Also just because we cannot typecheck everything, it's not worth to adding it at all seems like a poor argument. The implementation needs to be refined on many levels but still it's worth exploring and not be killed on some speculations.

autogenerating @types/node is never really going to work

You are presenting this as an absolut truth. Can you argument it please

It was the web team who last sat down with TS a year or so ago to discuss the ins and outs of this, I am happy to do so again, but this should not really be a consideration in weighing up the pros and cons of this change.

Yes Im happy to discuss it again

@ChALkeR
Copy link
Member

ChALkeR commented Mar 10, 2026

Why are the new deps needed if amaro which is already vendored can already do ts -> js stripping?

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Mar 10, 2026

Why are the new deps needed if amaro which is already vendored can already do ts -> js stripping?

Amaro needs javascript to run, it would require node as a build requirement to build itself. This is a pure rust/c++ implementation

@ChALkeR
Copy link
Member

ChALkeR commented Mar 10, 2026

Amaro needs javascript to run, it would require node as a build requirement to build itself.

Sure, but that seems like a viable option? That's what many projects do
That would avoid having two different ts implementations in the tree

Also, the underlying lib really needs just wasm not the Node.js js api so a separate runner could be used sharing most of the logic on v8 + wasm but prior to js being built - that won't even need a separate Node.js...

@ChALkeR
Copy link
Member

ChALkeR commented Mar 10, 2026

I just checked there is nothing requiring Node.js process actually in amaro
I.e. nothing that can't be faked with bootstrap js code
WebAssembly support is native in v8 and doesn't need additional actions

chalker@macbook-air _test % cat amaro.cjs
const amaro = require('amaro');
const { code } = amaro.transformSync("const foo: string = 'bar';", { mode: "strip-only" });
console.log(code);

chalker@macbook-air _test % node amaro.cjs 
const foo         = 'bar';
chalker@macbook-air _test % npx @exodus/test --engine=v8:bundle amaro.cjs
Engine: v8:bundle, running on ~/.jsvu/bin/v8
# amaro.cjs
const foo         = 'bar';
All 1 test suites passed
Total time: 185.111ms

This was executed on v8 cli. The code was bundled though, but that could be solved with some minor js bootstrap.
And v8 and amaro are already in-tree.


Perhaps replace Buffer.from with Uint8Array.fromBase64 where available 🙃
Apart from that only utf8 encoding/decoding is needed, and that's pretty much it

@Renegade334
Copy link
Member

Also just because we cannot typecheck everything, it's not worth to adding it at all seems like a poor argument.

This is exactly my point, though. We already have a way of leveraging tsc/tsserver to provide robust typechecking in JS, in a way that can be gradually phased in (@ts-check directives), without any requirement for transpilation, without any new dependencies, and without necessitating breaking changes. I really don't see why we wouldn't want to go down this avenue instead (essentially #60489 (comment)).

@ChALkeR
Copy link
Member

ChALkeR commented Mar 10, 2026

@marco-ippolito this works on pure v8 cli:

Without bundling, with v8 CLI import()
(which could alternatively be replaced with just concatenating it into the script)

(async function() {
  class TextEncoder {} // only constructed, not actually used
  class TextDecoder {
    constructor(encoding = "utf-8", options = {}) {
      if (encoding !== 'utf-8' || !options.ignoreBOM) throw new Error('Unexpected')
    }
    decode(input = new Uint8Array(), options) {
      if (!(input instanceof Uint8Array) || options) throw new Error('Unexpected')
      return decodeURIComponent(escape(String.fromCharCode.apply(String, input)));
    }
  };

  const Amaro = globalThis.module = {}
  globalThis.require = (arg) => {
    if (arg === 'util') return { TextEncoder, TextDecoder }
    if (arg === 'node:buffer') return {
      Buffer: {
        from: (x, encoding) => {
          if (encoding === 'base64') return Uint8Array.fromBase64(x)
          throw new Error('Unexpected')
        }
      }
    }
    throw new Error('Unexpected')
  }

  await import('./node_modules/amaro/dist/index.js')

  const { code } = Amaro.exports.transformSync("const foo: string = 'bar';", { mode: "strip-only" });
  console.log(code);
})();
chalker@macbook-air _test % ~/.jsvu/bin/v8 tempout.cjs
const foo         = 'bar';

@marco-ippolito
Copy link
Member Author

@marco-ippolito this works on pure v8 cli:

Without bundling, with v8 CLI import().

(async function() {
  class TextEncoder {} // only constructed, not actually used
  class TextDecoder {
    constructor(encoding = "utf-8", options = {}) {
      if (encoding !== 'utf-8' || options.fatal || !options.ignoreBOM) throw new Error('Unexpected')
    }
    decode(input = new Uint8Array(), options) {
      if (!(input instanceof Uint8Array) || options) throw new Error('Unexpected')
      return decodeURIComponent(escape(String.fromCharCode.apply(String, input)));
    }
  };

  const Amaro = globalThis.module = {}
  globalThis.require = (arg) => {
    if (arg === 'util') return { TextEncoder, TextDecoder }
    if (arg === 'node:buffer') return {
      Buffer: {
        from: (x, encoding) => {
          if (encoding === 'base64') return Uint8Array.fromBase64(x)
          throw new Error('Unexpected')
        }
      }
    }
    console.log('arg', arg)
  }

  await import('./node_modules/amaro/dist/index.js')

  var { code } = Amaro.exports.transformSync("const foo: string = 'bar';", { mode: "strip-only" });
  console.log(code);
})();

If we could run this in the js2c.cc, we could skip completely the rust dependency

@mcollina
Copy link
Member

Moving to new-only construction of streams, EventEmitters, etc. is one of those "break all the legacy libraries" changes that I somewhat assumed we would never be able to make.

The blast radius of this would be way too great to justify the effort on the ecosystem side. There is a lot of code that depends on this pattern, and updating it all is going to be problematic: even if we deprecate, most developers won't know how to deal with it because it's buried under 10 levels on dependencies on a module that hasn't been updated since 2016. Unless there is a clear reason (like security in the case of Buffer), I would not recommend we do this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

blocked PRs that are blocked by other issues or PRs. build Issues and PRs related to build files or the CI. dependencies Pull requests that update a dependency file. doc Issues and PRs related to the documentations. needs-ci PRs that need a full CI run. semver-major PRs that contain breaking changes and should be released in the next major version. strip-types Issues or PRs related to strip-types support

Projects

None yet

Development

Successfully merging this pull request may close these issues.

X Tutup