The Wayback Machine - https://web.archive.org/web/20230106131251/https://github.com/nodejs/node/pull/45803
deps: add simdutf dependency #45803

Closed
anonrig wants to merge 3 commits

anonrig (Member) commented Dec 9, 2022

simdutf provides fast UTF-8 operations using SIMD instructions. The @nodejs/undici team was looking for a way to validate UTF-8 input, and this dependency can make it happen.

Edit: I'm proposing exposing the following functionality either through a new module (like node:encoding) or through util.types or buffer:

  • validate_ascii(string)
  • validate_utf8(string)
  • count_utf8(string)

PS: simdutf supports more features, and depending on the need, it makes more sense to expose them through a new module, instead of util.types or buffer.
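
As a rough illustration of what the three proposed operations compute, here is a scalar TypeScript sketch. The camelCase names are hypothetical stand-ins for `validate_ascii`, `validate_utf8`, and `count_utf8`. That `count_utf8` counts code points (every byte that is not a continuation byte) is an assumption, not something stated in this thread. simdutf does this work with SIMD instructions, which the sketch does not attempt to reproduce.

```typescript
// Scalar reference sketches (hypothetical names), not simdutf's implementation.

/** True when every byte is in the 7-bit ASCII range. */
function validateAscii(bytes: Uint8Array): boolean {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] > 0x7f) return false;
  }
  return true;
}

/** True when the bytes form well-formed UTF-8. */
function validateUtf8(bytes: Uint8Array): boolean {
  const n = bytes.length;
  let i = 0;
  while (i < n) {
    const b0 = bytes[i];
    if (b0 <= 0x7f) {
      i += 1;
      continue;
    }
    let needed: number; // number of continuation bytes that must follow
    let min: number; // smallest code point allowed for this length
    let cp: number; // code point accumulated so far
    if ((b0 & 0xe0) === 0xc0) {
      needed = 1; min = 0x80; cp = b0 & 0x1f;
    } else if ((b0 & 0xf0) === 0xe0) {
      needed = 2; min = 0x800; cp = b0 & 0x0f;
    } else if ((b0 & 0xf8) === 0xf0) {
      needed = 3; min = 0x10000; cp = b0 & 0x07;
    } else {
      return false; // stray continuation byte or invalid lead byte
    }
    if (i + needed >= n) return false; // sequence is truncated
    for (let k = 1; k <= needed; k++) {
      const b = bytes[i + k];
      if ((b & 0xc0) !== 0x80) return false; // not a continuation byte
      cp = (cp << 6) | (b & 0x3f);
    }
    if (cp < min || cp > 0x10ffff) return false; // overlong or out of range
    if (cp >= 0xd800 && cp <= 0xdfff) return false; // surrogate code point
    i += needed + 1;
  }
  return true;
}

/** Counts code points, assuming the input is already valid UTF-8. */
function countUtf8(bytes: Uint8Array): number {
  let count = 0;
  for (let i = 0; i < bytes.length; i++) {
    if ((bytes[i] & 0xc0) !== 0x80) count++; // skip continuation bytes
  }
  return count;
}
```

A caller would typically run `validateUtf8` first and only then rely on `countUtf8`, since the count is meaningful only for well-formed input.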

nodejs-github-bot (Contributor) commented Dec 9, 2022

Review requested:

nodejs-github-bot added the build, dependencies, needs-ci, and tools labels Dec 9, 2022
anonrig force-pushed the deps/simdutf branch 2 times, most recently from 2c20c9a to 6daa546, Dec 9, 2022
anonrig changed the title from "dep: add simdutf dependency" to "deps: add simdutf dependency" Dec 9, 2022

KhafraDev (Member) commented Dec 9, 2022

This would help speed up both ws and undici's WebSocket implementation (which is still WIP). When we receive a text frame, or a close frame with a reason, we need to validate that the buffer contains valid UTF-8.

There are currently a few ways of doing so: a JS implementation used by default in both undici and ws, and optionally a package such as utf-8-validate. Note that simdutf is many times faster than the C++ version of utf-8-validate in the benchmark above, and the JS fallback version is the slowest.

Here is a PR from @lpinca that shows massive speedups when using simdutf: websockets/utf-8-validate#101. Considering how widespread usage of ws is, exposing a very fast ability to validate utf-8 would improve a ton of the ecosystem.
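
The validation step described above can be sketched with the standard `TextDecoder` in fatal mode, which throws on malformed input. The helper names `decodeTextFrame` and `parseCloseFrame` are hypothetical and not taken from ws or undici. The close-frame layout (a 2-byte big-endian status code followed by an optional UTF-8 reason) follows RFC 6455.

```typescript
// Strict decoder: throws on malformed UTF-8 instead of substituting U+FFFD.
// ignoreBOM keeps a leading BOM as ordinary data rather than stripping it.
const strictDecoder = new TextDecoder("utf-8", { fatal: true, ignoreBOM: true });

/** Decodes a text-frame payload, or returns null when it is not valid UTF-8. */
function decodeTextFrame(payload: Uint8Array): string | null {
  try {
    return strictDecoder.decode(payload);
  } catch {
    return null; // the caller would fail the connection here
  }
}

interface CloseInfo {
  code: number | null; // null when the close frame carries no payload
  reason: string;
}

/** Parses a close-frame payload, or returns null when it is malformed. */
function parseCloseFrame(payload: Uint8Array): CloseInfo | null {
  if (payload.length === 0) return { code: null, reason: "" };
  if (payload.length === 1) return null; // a status code needs two bytes
  const code = (payload[0] << 8) | payload[1];
  const reason = decodeTextFrame(payload.subarray(2));
  return reason === null ? null : { code, reason };
}
```

This sketch decodes and validates in one pass. A dedicated validator is still useful when the caller only needs a yes/no answer and wants to avoid allocating a string.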

anonrig force-pushed the deps/simdutf branch 2 times, most recently from 5027cae to e94ba5f, Dec 9, 2022
richardlau added the request-ci label Dec 9, 2022
github-actions bot removed the request-ci label Dec 9, 2022
anonrig force-pushed the deps/simdutf branch 2 times, most recently from bed88cc to 4269faf, Dec 9, 2022
anonrig force-pushed the deps/simdutf branch 3 times, most recently from ced7ef2 to 5566c99, Dec 10, 2022
anonrig added the request-ci label Dec 10, 2022
github-actions bot removed the request-ci label Dec 10, 2022
anonrig added the request-ci label Dec 10, 2022
github-actions bot removed the request-ci label Dec 10, 2022

lemire (Contributor) commented Dec 22, 2022

@lpinca Please open an issue upstream (simdutf) if you'd like. You have 1.3 million downloads a week, so maybe we can do something specific.

lpinca (Member) commented Dec 22, 2022

I will do it tomorrow. Anyway, as written in #45803 (comment), those 1.3 million downloads a week are basically all ws downloads. There is little to no usage outside of ws.

targos pushed a commit that referenced this pull request Jan 1, 2023
PR-URL: #45803
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michael Dawson <midawson@redhat.com>
targos pushed a commit that referenced this pull request Jan 1, 2023
Co-authored-by: Daniel Lemire <daniel@lemire.me>
PR-URL: #45803
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michael Dawson <midawson@redhat.com>
RafaelGSS added a commit that referenced this pull request Jan 2, 2023
Notable changes:

buffer:
  * (SEMVER-MINOR) add buffer.isUtf8 for utf8 validation (Yagiz Nizipli) #45947
deps:
  * disable avx512 for simutf on benchmark ci (Yagiz Nizipli) #45803
  * add simdutf dependency (Yagiz Nizipli) #45803
http:
  * (SEMVER-MINOR) improved timeout defaults handling (Paolo Insogna) #45778
net:
  * add autoSelectFamily global getter and setter (Paolo Insogna) #45777
os:
  * (SEMVER-MINOR) add availableParallelism() (Colin Ihrig) #45895
util:
  * add fast path for text-decoder fatal flag (Yagiz Nizipli) #45803

PR-URL: TBD
RafaelGSS mentioned this pull request Jan 2, 2023
RafaelGSS added a commit that referenced this pull request Jan 2, 2023
Notable changes:

buffer:
  * (SEMVER-MINOR) add buffer.isUtf8 for utf8 validation (Yagiz Nizipli) #45947
deps:
  * disable avx512 for simutf on benchmark ci (Yagiz Nizipli) #45803
  * add simdutf dependency (Yagiz Nizipli) #45803
http:
  * (SEMVER-MINOR) improved timeout defaults handling (Paolo Insogna) #45778
net:
  * add autoSelectFamily global getter and setter (Paolo Insogna) #45777
os:
  * (SEMVER-MINOR) add availableParallelism() (Colin Ihrig) #45895
util:
  * add fast path for text-decoder fatal flag (Yagiz Nizipli) #45803

PR-URL: #46061
RafaelGSS added a commit that referenced this pull request Jan 3, 2023
Notable changes:

buffer:
  * (SEMVER-MINOR) add buffer.isUtf8 for utf8 validation (Yagiz Nizipli) #45947
deps:
  * disable avx512 for simutf on benchmark ci (Yagiz Nizipli) #45803
  * add simdutf dependency (Yagiz Nizipli) #45803
http:
  * (SEMVER-MINOR) improved timeout defaults handling (Paolo Insogna) #45778
net:
  * add autoSelectFamily global getter and setter (Paolo Insogna) #45777
os:
  * (SEMVER-MINOR) add availableParallelism() (Colin Ihrig) #45895
util:
  * add fast path for text-decoder fatal flag (Yagiz Nizipli) #45803

PR-URL: #46061
RafaelGSS pushed a commit that referenced this pull request Jan 4, 2023
PR-URL: #45803
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michael Dawson <midawson@redhat.com>
RafaelGSS pushed a commit that referenced this pull request Jan 4, 2023
Co-authored-by: Daniel Lemire <daniel@lemire.me>
PR-URL: #45803
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michael Dawson <midawson@redhat.com>
RafaelGSS added a commit that referenced this pull request Jan 4, 2023
Notable changes:

buffer:
  * (SEMVER-MINOR) add buffer.isUtf8 for utf8 validation (Yagiz Nizipli) #45947
deps:
  * disable avx512 for simutf on benchmark ci (Yagiz Nizipli) #45803
  * add simdutf dependency (Yagiz Nizipli) #45803
http:
  * (SEMVER-MINOR) improved timeout defaults handling (Paolo Insogna) #45778
net:
  * add autoSelectFamily global getter and setter (Paolo Insogna) #45777
os:
  * (SEMVER-MINOR) add availableParallelism() (Colin Ihrig) #45895
util:
  * add fast path for text-decoder fatal flag (Yagiz Nizipli) #45803

PR-URL: #46061
RafaelGSS added a commit that referenced this pull request Jan 4, 2023
Notable changes:

buffer:
  * (SEMVER-MINOR) add buffer.isUtf8 for utf8 validation (Yagiz Nizipli) #45947
http:
  * (SEMVER-MINOR) improved timeout defaults handling (Paolo Insogna) #45778
net:
  * add autoSelectFamily global getter and setter (Paolo Insogna) #45777
os:
  * (SEMVER-MINOR) add availableParallelism() (Colin Ihrig) #45895
util:
  * add fast path for text-decoder fatal flag (Yagiz Nizipli) #45803

PR-URL: #46061
RafaelGSS pushed a commit that referenced this pull request Jan 5, 2023
PR-URL: #45803
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michael Dawson <midawson@redhat.com>
RafaelGSS pushed a commit that referenced this pull request Jan 5, 2023
Co-authored-by: Daniel Lemire <daniel@lemire.me>
PR-URL: #45803
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michael Dawson <midawson@redhat.com>
RafaelGSS added a commit that referenced this pull request Jan 5, 2023
Notable changes:

buffer:
  * (SEMVER-MINOR) add buffer.isUtf8 for utf8 validation (Yagiz Nizipli) #45947
http:
  * (SEMVER-MINOR) improved timeout defaults handling (Paolo Insogna) #45778
net:
  * add autoSelectFamily global getter and setter (Paolo Insogna) #45777
os:
  * (SEMVER-MINOR) add availableParallelism() (Colin Ihrig) #45895
util:
  * add fast path for text-decoder fatal flag (Yagiz Nizipli) #45803

PR-URL: #46061
RafaelGSS added a commit that referenced this pull request Jan 6, 2023
Notable changes:

buffer:
  * (SEMVER-MINOR) add buffer.isUtf8 for utf8 validation (Yagiz Nizipli) #45947
http:
  * (SEMVER-MINOR) improved timeout defaults handling (Paolo Insogna) #45778
net:
  * add autoSelectFamily global getter and setter (Paolo Insogna) #45777
os:
  * (SEMVER-MINOR) add availableParallelism() (Colin Ihrig) #45895
util:
  * add fast path for text-decoder fatal flag (Yagiz Nizipli) #45803

PR-URL: #46061
Labels: author ready, build, commit-queue-rebase, dependencies, needs-ci, notable-change, performance, review wanted, tools