The Wayback Machine - https://web.archive.org/web/20221003162731/https://github.com/PowerShell/PowerShell/pull/18195

Improve startup time by triggering initialization of additional types on background thread. #18195

Open
wants to merge 2 commits into base: master

daxian-dbw (Member) commented Sep 30, 2022

PR Summary

Improve startup time by triggering initialization of additional types on background thread.
There are 2 changes:

  1. Trigger the type initializers of Compiler, CachedReflectionInfo, and ExpressionCache, which involve many reflection operations.
  2. Reverse the order of LanguagePrimitives and TypeAccelerators, because the former is needed earlier in the startup process.
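
As a rough, language-neutral illustration of the two changes above (this is not the C# code from the PR, which is not shown in this thread), the Python sketch below warms up expensive one-time initialization on a background thread, in the order the main thread is expected to need it. `LazyResource`, `ensure_initialized`, and `warm_up_in_background` are hypothetical names, and `time.sleep` merely stands in for reflection-heavy setup work.

```python
import threading
import time


class LazyResource:
    """Hypothetical stand-in for a type with an expensive one-time initializer."""

    def __init__(self, name, cost_seconds):
        self.name = name
        self._cost = cost_seconds
        self._lock = threading.Lock()
        self._ready = False

    def ensure_initialized(self):
        # Only the first caller pays the cost. Later callers, on any thread,
        # either wait on the lock or return immediately.
        if self._ready:
            return
        with self._lock:
            if not self._ready:
                time.sleep(self._cost)  # simulated setup work (assumption)
                self._ready = True

    @property
    def ready(self):
        return self._ready


def warm_up_in_background(resources):
    """Initialize the resources, in the given order, on a daemon thread."""

    def run():
        for resource in resources:
            resource.ensure_initialized()

    worker = threading.Thread(target=run, name="warmup", daemon=True)
    worker.start()
    return worker
```

The order of the list passed to `warm_up_in_background` matters in this sketch: whatever the main thread touches first should be listed first, which mirrors the reordering described in item 2.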

PR Context

The change improves the startup time of pwsh.exe by about 30% on my dev machine.
The benchmark result below compares a release build with the change (V73Wip) against a release build without the change (V730); both have R2R images.

## Warm startup -- with module analysis cache pre-populated
## PowerShell is built with 'Release' configuration and R2R images.
Test scenario: pwsh.exe -noprofile -c echo 1


BenchmarkDotNet=v0.13.2, OS=Windows 11 (10.0.22621.521)
Intel Xeon W-2235 CPU 3.80GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK=7.0.100-rc.1.22431.12
  [Host]     : .NET 7.0.0 (7.0.22.42610), X64 RyuJIT AVX2
  DefaultJob : .NET 7.0.0 (7.0.22.42610), X64 RyuJIT AVX2


|            Method |     Mean |    Error |   StdDev | Ratio | Code Size |
|------------------ |---------:|---------:|---------:|------:|----------:|
|   V730StartupTime | 557.2 ms | 10.83 ms | 10.64 ms |  1.00 |      41 B |
| V73WipStartupTime | 386.3 ms |  6.81 ms |  6.03 ms |  0.69 |      41 B |


Benchmark_pwsh_startup.V730StartupTime: DefaultJob
-------------------- Histogram --------------------
[539.780 ms ; 552.371 ms) | @@@@
[552.371 ms ; 572.428 ms) | @@@@@@@@@@
[572.428 ms ; 584.997 ms) | @@
---------------------------------------------------

Benchmark_pwsh_startup.V73WipStartupTime: DefaultJob
-------------------- Histogram --------------------
[374.236 ms ; 387.432 ms) | @@@@@@@@@
[387.432 ms ; 401.417 ms) | @@@@@
---------------------------------------------------
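
The "about 30%" figure can be checked against the Mean column of the table above. The short Python sketch below only redoes that arithmetic; the function name `relative_reduction` is made up for illustration.

```python
def relative_reduction(baseline_ms, candidate_ms):
    """Fraction of the baseline time saved by the candidate build."""
    return 1.0 - candidate_ms / baseline_ms


# Mean values taken from the benchmark table above.
v730_mean_ms = 557.2
v73wip_mean_ms = 386.3

ratio = v73wip_mean_ms / v730_mean_ms
reduction = relative_reduction(v730_mean_ms, v73wip_mean_ms)

print(f"ratio of means: {ratio:.2f}")         # ratio of means: 0.69
print(f"startup reduction: {reduction:.1%}")  # startup reduction: 30.7%
```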

PR Checklist

pull-request-quantifier bot commented Sep 30, 2022

This PR has 5 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!


Quantification details

Label      : Extra Small
Size       : +3 -2
Percentile : 2%

Total files changed: 1

Change summary by file extension:
.cs : +3 -2

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.

Why proper sizing of changes matters

Optimal pull request sizes drive a better predictable PR flow as they strike a
balance between PR complexity and PR review overhead. PRs within the
optimal size (typically small or medium-sized PRs) mean:

  • Fast and predictable releases to production:
    • Optimal size changes are more likely to be reviewed faster with fewer
      iterations.
    • Similarity in low PR complexity drives similar review times.
  • Review quality is likely higher as complexity is lower:
    • Bugs are more likely to be detected.
    • Code inconsistencies are more likely to be detected.
  • Knowledge sharing is improved within the participants:
    • Small portions can be assimilated better.
  • Better engineering practices are exercised:
    • Solving big problems by dividing them into well-contained, smaller problems.
    • Exercising separation of concerns within the code changes.

What can I do to optimize my changes

  • Use the PullRequestQuantifier to quantify your PR accurately
    • Create a context profile for your repo using the context generator
    • Exclude files that are not necessary to be reviewed or do not increase the review complexity. Example: Autogenerated code, docs, project IDE setting files, binaries, etc. Check out the Excluded section from your prquantifier.yaml context profile.
    • Understand your typical change complexity, drive towards the desired complexity by adjusting the label mapping in your prquantifier.yaml context profile.
    • Only use the labels that matter to you, see context specification to customize your prquantifier.yaml context profile.
  • Change your engineering behaviors
    • For PRs that fall outside of the desired spectrum, review the details and check if:
      • Your PR could be split into smaller, self-contained PRs instead
      • Your PR only solves one particular issue. (For example, don't refactor and code new features in the same PR).

How to interpret the change counts in git diff output

  • One line was added: +1 -0
  • One line was deleted: +0 -1
  • One line was modified: +1 -1 (git diff doesn't know about modified, it will
    interpret that line like one addition plus one deletion)
  • Change percentiles: Change characteristics (addition, deletion, modification)
    of this PR in relation to all other PRs within the repository.



iSazonov (Collaborator) commented Oct 1, 2022

On my notebook:

BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.2006/21H2/November2021Update)
Intel Core i5-2410M CPU 2.30GHz (Sandy Bridge), 1 CPU, 4 logical and 2 physical cores
.NET SDK=7.0.100-rc.1.22431.12
  [Host]     : .NET 7.0.0 (7.0.22.42610), X64 RyuJIT AVX
  DefaultJob : .NET 7.0.0 (7.0.22.42610), X64 RyuJIT AVX

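| Method |     Mean | Ratio | Code Size | Allocated | Alloc Ratio |
|------- |---------:|------:|----------:|----------:|------------:|
| Before | 654.5 ms |  1.00 |   1,836 B |   1.01 KB |        1.00 |
|  After | 638.2 ms |  0.97 |      41 B |   1.01 KB |        1.00 |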

Intel Xeon W-2235 CPU 3.80GHz, 1 CPU, 12 logical and 6 physical cores

Obviously, a server CPU with 12 logical cores wins 😄

bergmeister (Contributor) commented Oct 3, 2022

Thanks @daxian-dbw for continuing to improve the startup time; a 30% improvement is awesome and almost unbelievable.
Unrelated to that, may I add something to think about when testing startup time: slow startup is most painful when the CPU is already busy with load from other processes, which restricts the resources available to the PowerShell process. I wonder if it's possible to create a test scenario that measures the impact in that case, or to consider optimisations for it. Although moving work onto background threads usually helps, it can sometimes make the strain on the CPU worse in such scenarios because of the increased context-switching overhead of having too many things happening in the background.
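
One possible shape for such a test scenario, as a minimal Python sketch rather than a BenchmarkDotNet setup: start a number of CPU-spinning processes, then time the command while they compete for cores. The helper names `time_command` and `time_under_load`, the spinner approach, and the settle delay are all assumptions made for illustration.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Return wall-clock durations in seconds for `runs` executions of cmd."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples


def time_under_load(cmd, busy_workers, runs=5, settle_seconds=0.5):
    """Time cmd while `busy_workers` spinning processes compete for the CPU."""
    spinners = [
        subprocess.Popen([sys.executable, "-c", "while True: pass"])
        for _ in range(busy_workers)
    ]
    try:
        time.sleep(settle_seconds)  # give the spinners time to start spinning
        return time_command(cmd, runs)
    finally:
        # Always stop the spinners, even if the timed command fails.
        for spinner in spinners:
            spinner.kill()
        for spinner in spinners:
            spinner.wait()
```

With the scenario used earlier in this thread, the command would be `["pwsh", "-noprofile", "-c", "echo 1"]`. Comparing `time_under_load(cmd, 0)` with `time_under_load(cmd, os.cpu_count())` for both builds would show whether the background initialization still pays off when no core is idle.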
