New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Preserve stdout byte stream for native commands #17857
base: master
Are you sure you want to change the base?
WIP: Preserve stdout byte stream for native commands #17857
Conversation
Also preserve for `native > file.txt`.
|
|
||
| namespace System.Management.Automation; | ||
|
|
||
| internal sealed class AsyncByteStreamDrainer : IDisposable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really don't like this name. StreamReader implies string or I'd use that. Open to suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AsyncByteStreamRepeater?
|
|
||
| namespace System.Management.Automation; | ||
|
|
||
| internal interface IStreamSource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole file is a little over engineered and needs to be rethought a bit
| if (_isPreparedCalled) | ||
| { | ||
| return; | ||
| } | ||
|
|
||
| _isPreparedCalled = true; | ||
|
|
||
| DownStreamNativeCommand?.Prepare(psDefaultParameterValues); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can probably be removed. Initially I wasn't lazily retrieving the stdout stream so the downstream native command had to be prepared before the upstream. I believe I've fixed that
| public ProcessOutputHandler( | ||
| Process process, | ||
| BlockingCollection<ProcessOutputObject> queue, | ||
| BytePipe stdOutDestination, | ||
| out AsyncByteStreamDrainer stdOutDrainer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying not to too drastically alter the classes here but ProcessOutputHandler is a bit non-typical already. We were just creating an instance of it and not saving it anywhere so the constructor sort of functioned like a static method. I need to take a closer look at if it's worth rethinking this class in general as this out param doesn't help it's readability.
| foreach (var redirection in redirections) { | ||
| if (redirection is MergingRedirection) | ||
| { | ||
| redirection.Bind(pipe, commandProcessor, context); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The double enumeration here was a quick band-aid for native 2>&1 > file.txt. In that scenario we want to revert to the current string reading behavior so I needed to process merging redirections before I tried to create a specialized byte pipe for a file stream.
I need to implement a better pattern here.
| @@ -723,7 +757,7 @@ internal static SteppablePipeline GetSteppablePipeline(PipelineAst pipelineAst, | |||
|
|
|||
| foreach (var commandTuple in commandTuples) | |||
| { | |||
| var commandProcessor = AddCommand(pipelineProcessor, commandTuple.Item2.ToArray(), commandTuple.Item1, commandTuple.Item3.ToArray(), context); | |||
| var commandProcessor = AddCommand(pipelineProcessor, commandTuple.Item2.ToArray(), commandTuple.Item1, commandTuple.Item3.ToArray(), context, false); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't touched steppable pipelines yet, that likely doesn't work atm.
| && FromStream is RedirectionStream.Output | ||
| && !string.IsNullOrWhiteSpace(File)) | ||
| { | ||
| nativeCommand.StdOutDestination = FileBytePipe.Create(File, Appending); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently just opening a file stream. It needs path resolution to be done through FileSystemProvider and it needs to surface errors similarly to how Out-File does today.
| if (_inputFormat is not NativeCommandIOFormat.Text) | ||
| { | ||
| AddTextInput(input); | ||
| AddXmlInput(input); | ||
| return; | ||
| } | ||
| else // Xml | ||
|
|
||
| object baseObjInput = input; | ||
| if (input is PSObject pso) | ||
| { | ||
| AddXmlInput(input); | ||
| baseObjInput = pso.BaseObject; | ||
| } | ||
|
|
||
| if (baseObjInput is byte[] bytes) | ||
| { | ||
| _streamWriter.BaseStream.Write(bytes, 0, bytes.Length); | ||
| return; | ||
| } | ||
|
|
||
| if (baseObjInput is byte b) | ||
| { | ||
| _streamWriter.BaseStream.WriteByte(b); | ||
| return; | ||
| } | ||
|
|
||
| AddTextInput(input); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit enables piping bytes directly to native commands. Might need to be in a different PR, not strictly in scope.
| return new FileBytePipe( | ||
| new FileStream( | ||
| fileName, | ||
| append ? FileMode.Append : FileMode.OpenOrCreate, | ||
| FileAccess.Write, | ||
| FileShare.ReadWrite | FileShare.Delete)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The options here likely need to be changed to match Out-File
| lock (_syncObject) | ||
| { | ||
| GetStream().Write(bytes); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lock might not be needed here. Originally this was due to the drainers for stdout and stderr writing to the same stream when redirected, but now we revert to string reading for that.
|
Glad to see this being taken seriously! It is important for generic shells--especially ones that aspire to work cross platform--to be able to pass data through agnostically. (I cloned and built Powershell just to try this PR! The stdout redirection did appear to work on my tests (which require not introducing CR into LF-only streams). That is good! I was able to pipe | and redirect > successfully. However, the powershell I built seems to put the redirected native outputs wherever powershell was started up, not in the current directory. So if run from some directory like: But we find the file in the original directory: Piping commands that aren't native output, e.g. Redirecting 2> for errors is introducing CRs when CMD.EXE does not give them, and does not have the directory issue...so I assume it is not running the new handling. (As per my post on the discussion thread, my vote is very much that stderr be covered by the same handling!) |
I do not know much about PowerShell, but wanted to try [building a PR to help test it](PowerShell#17857 (comment)). (The PR addresses one big reason why I *can't* use PowerShell...[pipe/redirect corruption](PowerShell#1908).) It is nice that there is a script which will automate the build process. But the default disposition of powershell is to reject it. And the link suggested by the warning message given goes to a very long page, with no obvious prescription for the exact incantation to make to run the script. I don't know if what I did is the "right" answer: (`pwsh -ExecutionPolicy Unrestricted`). But it worked--so I'm offering it as a first draft. If there's a better way to advise first-time-builders, then that should be said instead. But some sort of guidance here would be very helpful.
Ah yeah that makes sense. That'd be part of what gets fixed with my comment above about hooking up path resolution via
Yeah I can definitely understand the desire, but the only real way to do that with consistency would be to rewrite the Maybe in the future dotnet will add something to the |
Can you explain in more detail why this is significantly different from the stdout case? |
The biggest technical issue is around merging stderr into stdout. So the ideal way this would work (using Windows APIs as an example) is I would create a single pipe (or file for redirection) with Doing the same for This doesn't directly exclude lighting up the It's also not necessarily a simple switch I'd be flipping to enable it for stderr as well. There are a lot of places it needs to be special cased in pipeline creation similarly to how I'm doing it for stdout but subtly different. If/when this PR is merged I'll create a separate issue that is specifically for preserving bytes when redirecting stderr so the engine WG can discuss it. This PR will be for stdout specifically, but that doesn't necessarily mean another PR can't address it. |
|
The stdout case probably would satisfy most people, whose concerns are more about binary preservation with things like curl/zip. My issue with cross-platform consistency on CRLF is its own crusade, which is distinct from this if one is going to actually "use" PowerShell in a bigger sense. But it happens to be measurably better with this change. I can see that interleaving the streams raises a lot of issues. I'm not clear on how it ever works (are UNIX shells at risk of doing half-codepoints in UTF-8? If not, how? Would their guarantee be accomplished by line buffering vs. understanding UTF-8 encodings specifically? If so, what happens when the line buffer size is exceeded?) |
The trouble mostly comes from utilizing It's much easier when you are synchronously reading from and writing to a single pipe, as most platform specific shells with less output processing will do. |
|
This pull request has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 15 days. It will be closed if no further activity occurs within 10 days of this comment. |
|
@adityapatwardhan What sort of feedback would be needed for PowerShell to prioritize reviewing this? It is a rather popular request (see #1908, and touched upon by some other issues), where these seem like things people expect to work in a shell: Modulo the output directory I found when trying it, it seemed to work for that. |
It's not finished yet, it's a work in progress PR. You can ignore the bot, if it closed it then I'd just reopen |
| { | ||
| isNativeCommand = true; | ||
| nativeCommandProcessor.UpstreamIsNativeCommand = lastCommandWasNative; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this won't work for pipelines invoked via PowerShell.AddCommand, given the pipeline construction won't go through the compiler code.
Similar to the usage of Runspace.CreatePipeline() and Runspace.CreateNestedPipeline, as they don't go through the compiler code either. We may need to do this within the pipeline processor.
|
|
||
| private Task? _readToBufferTask; | ||
|
|
||
| public AsyncByteStreamDrainer(IStreamSource streamSource, SpanAction<byte, object?> callback, object? callbackArg, Action completedCallback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you try the Stream.CopyToAsync API? If it works, we may not need AsyncByteStreamDrainer.
|
This pull request has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 15 days. It will be closed if no further activity occurs within 10 days of this comment. |
|
This pull request has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 15 days. It will be closed if no further activity occurs within 10 days of this comment. |
|
This PR has Quantification details
Why proper sizing of changes matters
Optimal pull request sizes drive a better predictable PR flow as they strike a
What can I do to optimize my changes
How to interpret the change counts in git diff output
Was this comment helpful? |
| catch (Exception e) when (e.Data.Contains(typeof(ErrorRecord))) | ||
| { | ||
| // The error record is attached to the exception when thrown to preserve | ||
| // the call stack. | ||
| ErrorRecord? errorRecord = e.Data[typeof(ErrorRecord)] as ErrorRecord; | ||
| if (errorRecord is null) | ||
| { | ||
| throw; | ||
| } | ||
|
|
||
| e.Data.Remove(typeof(ErrorRecord)); | ||
| throw new RuntimeException(null, e, errorRecord); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the cleanest way I could think of to throw the underlying error record/exception while retaining the same exception type as Out-File without stripping the stack trace. Open to other ideas
This was in #559 (comment) |
So it kind of depends how you interpret what @vors was saying there. In a sense, that's what I'm doing in this PR. In another sense (which is what you're getting at) there is another route that would be a much bigger lift but would be closer to how other shells implement it. On Windows that would be using That would require rewriting a significant chunk of the code as we wouldn't be able to use |

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large amount of related projects for saving online and digital history.

Note that this is currently a proof of concept (albeit completely functional afaik). A lot of work is still needed in terms of polish and consistency. I'll try to outline what work still needs to be done below. Trying to get some feedback as early as possible for this change though so please feel free to review!
PR Summary
PR Context
Left to do:
Stream.CopyToAsync(thank you @daxian-dbw!)PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.(which runs in a different PS Host).