PowerShell should support creating a List similar to how it supports arrays #5643

TravisEz13 · 2017-12-06T23:10:17Z

PowerShell supports creating arrays with $array = 'a', 1, '3'. Then you can add an element to the array with $array += 4, but this creates a new array, which is not performant.

PowerShell should have a syntax that allows creating lists.
Assuming the operator is @[...], you could create a list with $list = @['a', 1, '3'] and then you could add an element to the existing list with $list += 4 without PowerShell having to create a new list.
Note: this new operator might function more like @(...)

This design assumes that changing , would be a breaking change. I'm open to discussing changing , as well.

I filed this based on an offline discussion about this comment on a PR: #5625 (comment)
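
For reference, a minimal sketch of the difference described above, using only what works today (the proposed @[...] syntax does not exist). The variable names $before and $sameList are illustrative only.

```powershell
# Today: += on an array allocates a new array and copies the elements over.
$array = 'a', 1, '3'
$before = $array
$array += 4
[object]::ReferenceEquals($before, $array)    # False: $array now refers to a new array

# Today's workaround: a generic list grows in place via .Add().
$list = [System.Collections.Generic.List[object]]::new()
$list.AddRange([object[]]('a', 1, '3'))
$sameList = $list
$list.Add(4)
[object]::ReferenceEquals($sameList, $list)   # True: still the same list instance
```

The proposal amounts to making the second form as convenient to write as the first.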


rkeithhill · 2017-12-06T23:23:05Z

I like the idea but to truly impact performance you'd need to be operating on large lists. For a convenient way to create large lists, I would expect something like this to work @[Get-ChildItem C:\Windows -r -file *.dll -ea 0]. While the list literal form is nice, I don't see folks creating lists large enough to gain much of a perf benefit over using an array. Well, unless the list literal is created inside a busy (large n) loop.

lzybkr · 2017-12-06T23:39:18Z

Two points:

1. @() says - make sure the thing inside is an array. It's not necessary if you use , because the comma operator always creates an array.
2. I've often wondered if the comma operator could create a list instead of an array. I have a feeling most scripts would never notice a difference because of how freely things are converted to an object array.

TravisEz13 · 2017-12-06T23:56:12Z

@lzybkr I updated my description based on your comments.

daxian-dbw · 2017-12-07T00:37:18Z

I like the idea of having a list literal in PowerShell. I think it could have a syntax like @[1, 2, 3] to directly create a list with elements 1, 2 and 3 without first creating an array literal from 1,2,3 and then making it a list using @[].

cchiu1979 · 2017-12-07T01:48:31Z

@lzybkr is it something like this?
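1,2,3                    # comma operator: a three-element array
(1,2,3).length           # 3
( , (1,2,3) ).length     # 1: the unary comma wraps the array in a one-element array
( @(1,2,3) ).length      # 3: @() leaves an existing array as it is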

lzybkr · 2017-12-07T04:16:46Z

Sure, alias properties would be needed to make lists work just like arrays.

mklement0 · 2017-12-15T00:39:54Z

I've often wondered if the comma operator could create a list instead of an array.

If that's not considered too much of a breaking change, it would certainly be the best solution.

Otherwise:

@rkeithhill:

I get what you're saying about large lists, but that's where += comes in as a convenient syntax for appending to the list (calling .Add() or .AddRange() on the [System.Collections.ArrayList] or [System.Collections.Generic.List[object]] instance behind the scenes - unlike today's behavior of +=, which either silently recreates the variable content as an array or, if the variable was type-constrained, as a new instance).

In other words: something like the following would make sense:

$al = @[] # simpler than: [System.Collections.ArrayList]::new()

for ($i = 0; $i -lt 1000; ++$i) {
  $al += $i # simpler than: $null = $al.Add($i)
}
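
For comparison, a sketch of what the same loop looks like today with both list types mentioned above. The @[] form is only a proposal; the variable name $gl is illustrative.

```powershell
# ArrayList: .Add() returns the index of the new element, so the output is discarded.
$al = [System.Collections.ArrayList]::new()
for ($i = 0; $i -lt 1000; ++$i) {
  $null = $al.Add($i)
}

# Generic list: .Add() returns nothing, so no suppression is needed.
$gl = [System.Collections.Generic.List[object]]::new()
for ($i = 0; $i -lt 1000; ++$i) {
  $gl.Add($i)
}

$al.Count   # 1000
$gl.Count   # 1000
```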

daxian-dbw · 2017-12-29T23:02:37Z

I submitted two PRs (WIP) with different designs for the List support in PowerShell:

[WIP] Support ListExpression '@[]' in PowerShell #5762 -- Support @[], similar to @()
[WIP] Support ListLiteralExpression '[]' in PowerShell #5761 -- Support ListLiteralExpression '[]', similar to ArrayLiteralExpression

@[] is my first design. However, I ran into a blocking issue regarding the closing bracket character ']'. Quoted from #5762:

@[] has a SubExpression like '@()' and '$()'. However, unlike the closing parenthesis character ')', the closing bracket character ']' doesn't always force the start of a new token, and it can be included in a generic token, meaning that ']' can appear in a command name, argument, or function name. This makes it impossible for @[dir] to determine the end of the list expression because dir] will be treated as a single generic token.

This PR adds the property InListSubExpression to Tokenizer, and makes ']' a force-to-start-new-token character when _tokenizer.InListSubExpression is set. This approach solves the most common UX problem but is by no means perfect. For example, compared to @(funcHas[]inName) or @(dir has[]inpath), @[funcHas[]inName] and @[dir has[]inpath] won't work because the first ']' will force the command name to end.

Without breaking change, I think the best we can do is probably to make ']' a force-to-start-new-token character when parsing a command invocation pipeline in @[] but not when parsing any nested expression or statement within the @[].

At the same time, I started to think about an alternative: adding a ListLiteralExpression like the ArrayLiteralExpression. In that case, a list can only contain Expression elements, and hence command names, arguments, and function names won't be a problem for the closing bracket. PR #5761 is for that design, where we use '[]' (the same token pair as TypeConstraint and Attribute).

I hope those 2 PRs can draw more discussion on the design.

markekraus · 2017-12-29T23:47:05Z

@daxian-dbw since this code is beyond my understanding, does it attempt to create a strongly typed list, or is it always List<Object>?

daxian-dbw · 2017-12-29T23:59:27Z

@markekraus It attempts to always create List<object>, like @() always creates an object[].
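
A quick way to see the existing @() behavior that the prototypes mirror, and the List<object> they aim to produce. This is a sketch with today's syntax only; the list is built by hand because @[] is not available.

```powershell
# @() produces an object[] regardless of what the elements are.
@().GetType().FullName          # System.Object[]
@(1, 2, 3).GetType().FullName   # System.Object[]
@('a').GetType().FullName       # System.Object[]

# The List<object> equivalent has to be spelled out today.
$list = [System.Collections.Generic.List[object]]::new()
$list.Add(1)
$list.Add('two')
$list -is [System.Collections.Generic.List[object]]   # True
```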

daxian-dbw · 2017-12-30T00:15:24Z

@lzybkr proposed to use new token pairs instead of @[] to represent a ListExpression in #5762 (comment):

You could consider 2 character tokens.
For example, F# uses this syntax for an array literal:
[| 1; 2 |]
There are other possibilities that probably aren't breaking changes, e.g. [< 1, 2 >].
The key here is to use a second character that can't be in a command name.

It would be great to have @[] to represent ListExpression, but I'm fine with new token pairs. ~~I will prototype with [<>]~~. [<>] won't work because '>]' is allowed in a generic token. [| .. |] may work. If new token pairs are acceptable, I definitely prefer ListExpression over ListLiteral.

mklement0 · 2017-12-30T00:36:43Z

@daxian-dbw: I'm really glad to see you take this on, but before we go any further with the syntax debate:

Is the consensus that we cannot just simply switch ,, the array construction operator, to an array-list/generic-list implementation behind the scenes, as @lzybkr hinted at - for reasons of backward compatibility?

The answer may well be that yes, it's too risky to make that change (I personally cannot tell), but if it happens to be no, after all, there's no need for a syntax debate.

daxian-dbw · 2017-12-31T07:32:46Z

@mklement0 IMHO, there would be 3 problems if we simply change the comma operator ',' to return a list:

1. The AST type name ArrayLiteralAst would be inconsistent, but changing it would be a huge breaking change. There would be other breaking changes like the returned value of StaticType property, but the AST type name would be the most problematic one I guess.
2. With the comma operator, we wouldn't be able to create an empty list.
3. The comma operator only takes Expression elements, not arbitrary statements like @() does. For example, compared to @(dir), you would have to use ,(dir). Besides, the comma operator doesn't unwrap the Expression value because it's literal (ArrayLiteralAst). So ,(dir) would return a one-element list that contains an object array.

I prefer a ListExpressionAst '@[]' over a ListLiteralAst '[]' because of the 3rd one above.
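
The difference between the comma operator and @() with command output, and the AST node named in the first problem, can be observed today. A small sketch; Get-Numbers is a hypothetical stand-in for a command such as dir.

```powershell
function Get-Numbers { 1; 2; 3 }   # hypothetical command that emits three objects

@(Get-Numbers).Count       # 3: @() collects the command's output
(, (Get-Numbers)).Count    # 1: the comma operator wraps the whole result in one element

# Inspect the AST node produced for an array literal.
$ast = [System.Management.Automation.Language.Parser]::ParseInput('1, 2, 3', [ref]$null, [ref]$null)
$literal = $ast.Find({ $args[0] -is [System.Management.Automation.Language.ArrayLiteralAst] }, $true)
$literal.GetType().Name    # ArrayLiteralAst
$literal.StaticType        # the static type the parser reports for the literal
```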

iSazonov · 2018-01-01T13:05:41Z

@daxian-dbw Thanks for the great prototypes!
I'd prefer @[] if it is possible to implement. I would be very surprised to see something like [| ... |] - if there is no other way I'd rather see a simple List( ... ) or [List]1,2,3.

If we have a problem with the last ] in @[], could we use @[ 1, 2, 3 ]@ like multiline string literals?
@@[] doesn't resolve the problem.
We could reuse parentheses with another prefix: if @() is an array and $() a singleton, then %() or &() or *() could be a list.

I personally like *().

markekraus · 2018-01-01T18:23:23Z

*() would be somewhat ambiguous. Should 5*(Get-Random) throw a ~~RuntimeException for missing op_Multiply on List~~ a CommandNotFoundException or should it multiply a random number by 5?

daxian-dbw · 2018-01-01T23:34:38Z

'%(1)' is parsed into a CommandAst today, where '%' is the command name (foreach-object), and the argument is (1).
'&(1)' is parsed into a CommandAst today, where '&' is the invocation operator and the command name is (1).
'*()' is also ambiguous, as @markekraus pointed out.

markekraus · 2018-01-02T00:52:28Z

Minor correction: %{} would be the ForEach-Object. %() is ambiguous with modulo, e.g. 5%(Get-Random -Minimum 1 -Maximum 5)
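
How the parser treats these candidate prefixes today can be checked directly. A rough sketch; Get-ParsedShape is a hypothetical helper written for this illustration, not part of PowerShell.

```powershell
# Hypothetical helper: parse a snippet and report what its first pipeline element is.
function Get-ParsedShape {
    param([string] $Code)
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$null, [ref]$null)
    $pipeline = $ast.Find({ $args[0] -is [System.Management.Automation.Language.PipelineAst] }, $true)
    $element = $pipeline.PipelineElements[0]
    if ($element -is [System.Management.Automation.Language.CommandExpressionAst]) {
        # An expression statement: report the kind of expression it wraps.
        $element.Expression.GetType().Name
    } else {
        $element.GetType().Name
    }
}

Get-ParsedShape '%(1)'     # CommandAst
Get-ParsedShape '&(1)'     # CommandAst
Get-ParsedShape '5%(3)'    # BinaryExpressionAst (modulo)
Get-ParsedShape '5*(3)'    # BinaryExpressionAst (multiplication)
```

The same characters parse as a command at the start of a statement and as a binary operator after an operand, which is the ambiguity raised in the comments above.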

Also, @@[] would possibly be problematic for extended splat literals (if they ever make their way out of RFC).

Outside of the literals, I like the idea of Lists getting an accelerator, but only if it works similarly to using namespace System.Collections.Generic, which makes $MyList = [List[MyClass]]::New() easier. I would not like a [List] accelerator without the ability to set the type, unless it could play nice and create List<Object> by default but still allow creating lists of a desired type.
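
For context, this is roughly what the using namespace route looks like today. A sketch only: int stands in for MyClass, and no [List] accelerator exists, so the element type is always spelled out.

```powershell
using namespace System.Collections.Generic

# Typed list: the element type is part of the type literal.
$ints = [List[int]]::new()
$ints.Add(1)
$ints.Add(2)

# Untyped list: what a bare [List] accelerator would presumably default to.
$objects = [List[object]]::new()
$objects.Add(1)
$objects.Add('two')

$ints.GetType().GetGenericArguments()[0].Name      # Int32
$objects.GetType().GetGenericArguments()[0].Name   # Object
```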

iSazonov · 2018-01-02T12:11:28Z

I have only one question - where I can buy Unicode keyboard with 32000 buttons to replace my 102 keyboard? 😄

We could combine the accelerator idea and list literals:

@[int](1,2,3)
@[string](dir C:\)
@[](1,2,3) as a shortcut for @[object](1,2,3)

mklement0 · 2018-01-05T00:22:02Z

@daxian-dbw: Thanks for the detailed feedback.

I can't speak to 1. (AST names), but perhaps the answer is to special-case @() for the @(<empty-or-scalar-or-array-literal>) cases, such as @(), @(3), or @(1, 2, 3) (note that @(<array-literal>) already is special-cased - see #4280), while leaving any @() that involves a command and/or multiple statements to work as it does now.

The alternative is to simply make @() always return a list. This has the advantage of allowing the definition of lists as a series of individual expression statements (defining an element each), obviating the need for , in multiline definitions (in which case the line breaks take the place of the statement-separating ;). The down-side is that lists would be created in many situations where an array will do; while @(Get-ChildItem) is more convenient than @((Get-ChildItem)), creating a list in such a case strikes me as less important.

Again, it might be too risky, but it would solve the syntax problem.

That said, that alone wouldn't address the desire for explicit typing.

Perhaps the special casing could be tweaked to translate something like
@([string[]] (...)) into a List<string> instance.

The need for the inner (...) - due to operator precedence - makes this slightly awkward, however, and forgetting them can easily go unnoticed, because you quietly get [object[]].

On the other hand, explicit typing is a more advanced use case, and optimizing for the typical case is arguably more important.

daxian-dbw · 2018-01-05T01:51:18Z

The alternative is to simply make @() always return a list.

I talked to @jpsnover about this today and he also brought up changing the semantics of @() to return a list. The downsides are:

1. The AST type name ArrayExpressionAst would be inconsistent with the semantics and the StaticType property.
2. A list is created in some situations where you need an array, but PowerShell can convert List<object> to object[] implicitly, so this might not be an issue.

For (1), could it be OK to have this inconsistency?
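
The implicit conversion mentioned in the second point can be seen with today's types. A minimal sketch; Measure-Items is a hypothetical function used only to force an object[] parameter.

```powershell
# Hypothetical function that insists on an array parameter.
function Measure-Items {
    param([object[]] $Items)
    $Items.GetType().Name + ': ' + $Items.Count
}

$list = [System.Collections.Generic.List[object]]::new()
$list.Add(1); $list.Add(2); $list.Add(3)

Measure-Items -Items $list    # Object[]: 3
[object[]] $array = $list     # a type-constrained assignment converts as well
$array.Count                  # 3
```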

lzybkr · 2018-01-05T02:14:30Z

The Ast type name doesn't matter that much.

There are many examples outside of PowerShell where the name can be misleading - ArrayList is a good one.

Lua is another good example - quoting from here.

Tables in Lua are not a data structure; they are the data structure. All structures that other languages offer---arrays, records, lists, queues, sets---are represented with tables in Lua. More to the point, tables implement all these structures efficiently.

iSazonov · 2018-01-05T09:15:42Z

but powershell can convert List<object> to object[] implicitly, so this might not be an issue.

If $a = @(1, 2, 3) defines a List, then I'd expect that $a = $a + 4 or $a += 4 doesn't convert the List to an Array. We could use $a.ToArray(). In that case we should add a magic ToArray() to arrays too, as we added the magic Count, Length, Where() and ForEach().
Also I expect many customers will ask about typed lists like [int]@(1, 2, 3) or @[int](1, 2, 3).

mklement0 · 2018-01-06T01:38:50Z

@iSazonov:

If $a = @(1, 2, 3) defines a List, then I'd expect that $a = $a + 4 or $a += 4 doesn't convert the List to an Array.

Actually, I would expect that to work with instances of any type that implements the IList interface and therefore has an .Add(Object) method - irrespective of this specific issue; see #5805

Also I expect many customers will ask about typed lists like [int]@(1, 2, 3) or @[int](1, 2, 3)

While slightly awkward, as discussed, @([int[]] (1,2,3)) has the advantage of not introducing new syntax (only new semantics).

KirkMunro · 2019-06-27T14:35:29Z

I don't get the drive error you get. Are you testing that in a session that defines a CommandNotFoundAction handler?

I suppose @: could work too. That mucks up the reference to the angry emoji though. 😠 🤣

I really like the notion that you can add a character to an enclosure prefix in your scripts and voilà, they'll use a more efficient data structure. That would be a very low cost performance enhancement for some scripts if the data structure was implemented properly with operator support for things like +=, etc.

KirkMunro · 2019-06-27T14:41:18Z

Just to put another alternative on the table:

:(1,2,3,4)

That's shorter, but : could be a command (still, it would only be a breaking change if someone had that as a command and they invoked that command by passing arguments in using round brackets).

vexx32 · 2019-06-27T15:08:55Z

Nope, fresh PS7-preview1 session. 🤷‍♂

Yeah, could do, but then you lose the callback to @() a bit and the meaning is a little less clear, I feel?

iSazonov · 2019-06-27T15:24:10Z

As Jason mentioned above, lists are probably an edge case for scripts, so there is no need for syntactic sugar for creating lists. Perhaps we could only enhance the '+' (+=) operator to support lists and concurrent collections (other types?).
We could start with this and add syntactic sugar later if we find a compromise.

vexx32 · 2019-06-27T15:51:31Z

That's definitely a no-brainer; we need the + / += support for lists and similar.

The syntactic sugar would really be nice as well though 😊

ili101 · 2020-08-31T15:52:16Z

Please consider adding support for + / +=.
Even if you ignore the performance benefit this is more natural to use, for example I just did something like this and was surprised by the error:

$MyArrayList = [System.Collections.ArrayList]@(0, 1, 3, 4)
$MyArrayList += 5
$MyArrayList.Insert(2, 2) # Exception calling "Insert" with "2" argument(s): "Collection was of a fixed size."
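
What happens in the snippet above can be made visible by checking the type before and after +=. A sketch; the second variable, $list, is illustrative.

```powershell
$MyArrayList = [System.Collections.ArrayList]@(0, 1, 3, 4)
$MyArrayList.GetType().Name     # ArrayList
$MyArrayList += 5
$MyArrayList.GetType().Name     # Object[]: += replaced the list with a new array

# Staying with the list's own methods keeps it resizable.
$list = [System.Collections.ArrayList]@(0, 1, 3, 4)
$null = $list.Add(5)
$list.Insert(2, 2)
$list -join ','                 # 0,1,2,3,4,5
```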

vexx32 · 2020-08-31T16:18:00Z

I think we have an existing issue for that specifically: #5805

It came up again recently as a duplicate, but my comment there still stands: #13152 (comment)

SteveL-MSFT · 2020-10-10T16:10:15Z

@daxian-dbw perhaps we can turn your @[] implementation w/ addition operator support into an experimental feature? As part of this, we can make the breaking change so that ] forces a new token, as it seems like a bucket 3 breaking change, and we can get real-world feedback via the experimental feature.

vexx32 · 2020-10-10T20:03:50Z

@SteveL-MSFT clarification point on that -- would @[] become another subexpression operator in that case to match @() and $() or would it be more akin to () in that line breaks within it aren't permitted?

SteveL-MSFT · 2020-10-16T17:58:15Z

@vexx32 good question, I suppose it should probably match @() so that hypothetically people could just search and replace in many cases as a replacement and get the benefits

oising · 2020-11-30T23:19:32Z

Has anyone mentioned exposing a $pscollectionpreference variable where one could override what collection type is used natively, for all array operators and pipeline output?

ghost · 2020-12-25T16:47:52Z

@KirkMunro

I'm late to this conversation, but so far in PowerShell, square brackets are always used for indices. For that reason I personally don't like @[] as an enclosure.

I'm late too, but in PowerShell square brackets are also used for types: [Math]::Round(2.2). The same applies to ( and ): they are not only used for defining an array via @(). I don't see any reason why square brackets should not be used for the proposed feature.

I'm all for @[] implementation which matches @().

SteveL-MSFT · 2022-10-15T16:03:58Z

Reviving this. Rather than introduce a new language syntax, I think it would be simpler to just introduce the proposed [list] type accelerator which produces a [system.collections.generic.list[object]].

iRon7 · 2022-10-19T12:08:17Z

... If only that the implication of this syntactic sugar proposal will likely also support Constrained Language Mode.
See also: Mutual lists in Constrained Language Mode

iSazonov · 2022-10-19T12:16:08Z

The type accelerator doesn't change behavior of related operators.

SeeminglyScience · 2022-10-19T16:57:56Z

I meant to comment on this thread but apparently never did. Specifically, regarding making += work that would be the first instance of += actually mutating the object on the LHS rather than returning something new.

With regards to List<> specifically, I think we'd be adding another common pitfall if it sees widespread use. The mutability of List<> is partially why you don't see it a lot in public API surfaces. It's so easy to slip up in this regard there's even an instance of mutating the caller's list in one of our public APIs (#12928).

That said, I did prototype (two years ago apparently) a List<>-y implementation with comparable Add performance, but with some extra guards against mutation. Definitely not ready to be added as is, but it's an approach worth considering.

As a side note on the topic of lists and their place in PowerShell, they aren't actually very frequently better than simply using $myCollection = foreach ($a in $b) { }. The engine handling it for you is almost always better, so the number of use cases for List<> specifically isn't actually as high as it appears. Granted, that's not always an option; I'm not saying scenarios like adding across named blocks don't exist, just that they are not as frequent.
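
A short sketch of the pattern described above, next to the explicit-list version it usually replaces. The doubling is an arbitrary stand-in for real loop work.

```powershell
$b = 1..1000

# Let the engine collect whatever the loop body outputs.
$myCollection = foreach ($a in $b) { $a * 2 }

# The explicit-list version of the same loop.
$list = [System.Collections.Generic.List[object]]::new()
foreach ($a in $b) { $list.Add($a * 2) }

$myCollection.GetType().Name   # Object[]
$myCollection.Count            # 1000
$list.Count                    # 1000
```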

mklement0 · 2022-10-21T16:55:20Z

Good point that using list types is often not needed, @SeeminglyScience.
@iRon7 created a "canonical" answer on Stack Overflow that advises against += and preaches the gospel of statements and pipelines as expressions (where the engine automatically collects multiple outputs in an [object[]] array for you).

As for your "List<>-y" implementation: at least at first glance it sounds similar to what @PetSerAl - who agrees that += shouldn't mutate the LHS (as do I now) - attempted in the context of Assigning the result of an addition (+ operator) with an IList LHS back to that LHS should preserve the list (collection) type (#5805), specifically here.

Implementing our own list type that plays nicely with += without sacrificing (lots of) performance would provide two additional benefits:

- Potentially providing syntactic sugar for construction, as previously discussed (@[...] or ...), though finding a consensus on the syntax may be challenging.
- Being able to avoid the pitfall mentioned by @powercode, namely List<T>'s native .ForEach() method shadowing its intrinsic (engine-provided) counterpart.

The least-effort alternative, perhaps acceptable in light of the need for lists not being as pressing as it may seem, would be:

- Simply provide type accelerators for existing list types, say [arraylist] and [list[object]] - though I'm not sure if the latter, with its generic parameter, fits into the current type-accelerator mechanism.
- Consider their use an advanced use case and expect users to know how those types work and their pitfalls:
  - the need to use .Add(), and additionally for [arraylist], to suppress the usually unwanted return value.
  - that for [list[T]] .ForEach() isn't PowerShell's .ForEach() method
- By contrast, [list], while easier to type, wouldn't readily reveal its relationship with the List<Object> type it represents, and potentially increase the risk of falling into the .ForEach() pitfall.
An unpleasant pitfall that applies to any type-literal / cast-based solution is that an array on the RHS must be (...)-enclosed, though using a type constraint avoids the problem:

using namespace System.Collections.Generic

$list = [List[int]] 1..10 # !! WRONG

$list = [List[int]] (1..10) # OK
[List[int]] $list = 1..10 # OK
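
The .ForEach() pitfall mentioned earlier in this comment can be reproduced in a few lines. A sketch with illustrative variable names; the script block is the same in both calls.

```powershell
$array = 1, 2, 3
$list  = [System.Collections.Generic.List[int]] (1, 2, 3)

# On an array, the intrinsic (engine-provided) .ForEach() runs and returns the results.
$fromArray = $array.ForEach({ $_ * 2 })
$fromArray.Count        # 3

# On List<T>, the type's own ForEach(Action<T>) is called instead; it returns nothing.
$fromList = $list.ForEach({ $_ * 2 })
@($fromList).Count      # 0
```

A foreach statement or the pipeline sidesteps the difference, since neither depends on which .ForEach() method the collection happens to expose.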

SeeminglyScience · 2022-10-21T18:35:21Z

As for your "List<>-y" implementation: at least at first glance it sounds similar to what @PetSerAl - who agrees that += shouldn't mutate the LHS (as do I now) - attempted in the context of Assigning the result of an addition (+ operator) with an IList LHS back to that LHS should preserve the list (collection) type (#5805), specifically here.

Hah! They beat me to it by two years, that's amazing. Thanks for the link ❤️

TravisEz13 mentioned this issue Dec 6, 2017

Packaging: Try to make New-Unix package more readable #5625

Merged

5 tasks

SteveL-MSFT added the WG-Language parser, language semantics label Dec 12, 2017

This was referenced Dec 29, 2017

[WIP] Support ListLiteralExpression '[]' in PowerShell #5761

Closed

[WIP] Support ListExpression '@[]' in PowerShell #5762

Closed

SteveL-MSFT added the Issue-Enhancement the issue is more of a feature request than a bug label Jan 5, 2018

SteveL-MSFT added this to the 6.1.0-Consider milestone Jan 5, 2018

SteveL-MSFT added the Review - Committee The PR/Issue needs a review from the PowerShell Committee label Jan 5, 2018

SteveL-MSFT modified the milestones: 7.0-Consider, 7.1-Consider Dec 9, 2019

mklement0 mentioned this issue Feb 1, 2020

Casting a scalar (single value) to System.Collections.ArrayList fails #11749

Open

JustinGrote mentioned this issue Jul 10, 2020

Feature Request: Support +/-/+=/-= operators for all IList types #13152

Closed

joeyaiello modified the milestones: 7.1-Consider, 7.2-Consider Jul 20, 2020

mklement0 mentioned this issue Oct 16, 2020

Allow HashTable to reference itself #13782

Open

iRon7 mentioned this issue Oct 20, 2020

Enhance hash table syntax #13817

Open

SteveL-MSFT modified the milestones: 7.2-Consider, 7.3-Consider Dec 7, 2020

MartinGC94 mentioned this issue Jun 25, 2022

List Subexpression Operator #17578

Closed

SteveL-MSFT removed this from the 7.3-Consider milestone Oct 15, 2022
