X Tutup
Skip to content

Allow specifying the subnet size for networks allocated from default pools#50114

Merged
akerouanton merged 1 commit intomoby:masterfrom
danegsta:50107-defaultpoolsize
Oct 23, 2025
Merged

Allow specifying the subnet size for networks allocated from default pools#50114
akerouanton merged 1 commit intomoby:masterfrom
danegsta:50107-defaultpoolsize

Conversation

@danegsta
Copy link
Contributor

@danegsta danegsta commented May 30, 2025

- What I did

Updated how subnet arguments are processed to treat 0.0.0.0/<size>, ::/<size>, or /<size> as requests for a subnet from the default pools with the given prefix size (rather than the default prefix size for the pool)

Networks with both IPv4 and IPv6 subnets are supported:

docker network create --ipv6 --subnet ::/65 --subnet 0.0.0.0/24 test results in an IPv6 subnet of size /65 and an IPv4 subnet of size /24, both from the default pools. Fixed address subnets can be combined with default pool subnets as well.

Implements #50107

- How I did it

Added support for treating 0.0.0.0 and :: subnet addresses as requests for networks from the default pools. If a prefix length is provided as well (i.e. 0.0.0.0/24 or ::/64), it will be used when allocating the network instead of the pool default prefix length. If no pool

- How to verify it

I've added unit tests for the new behavior, but the logic can be manually verified by specifying a global subnet to docker network create (i.e. docker network create --subnet 0.0.0.0/24 test, or docker network create --subnet /24 test).

- Human readable description for the release notes

- Users can request a specific prefix size for networks allocated from the default pools by using the unspecified address, for example `--subnet 0.0.0.0/24 --subnet ::/96`.
  - If Docker is downgraded to a version that does not have this support the network will become unusable, it must be deleted and re-created.

- A picture of a cute animal (not mandatory but encouraged)

image

@danegsta
Copy link
Contributor Author

One thing I haven't done in this PR is account for the case where only a subset of the available network pools may be able to fulfill the given preferred network size. For example, imagine pool A has a default size of /24 and pool B has a default size of /20 and the user requests a preferred size of /20. It likely makes sense to attempt to attempt to allocate an address from pool B first (even if pool A isn't fully allocated) before falling back to a /24 network from pool A if there's no available range in pool B.

If a preferred network size is requested, it's likely worth ordering the available pools by default pool size before attempting to allocate a subnet to ensure that subnets that can fulfill the preferred size (or are closes to the preferred size) get precedence.

@thaJeztah
Copy link
Member

Thank you! It's perfectly fine for things to be incomplete, or even "throw away code". Having something to look at, or to give it a spin, and to have the discussion never hurts!

@danegsta
Copy link
Contributor Author

I ended up adding a step to sort the pools if a preferred size is specified (with the idea being to prefer pools that are able to accommodate the preferred size when allocating a network).

My current design treats the default (or configured) pool prefix sizes as hard limits; the resulting network can have a smaller address range than the default, but never larger. For example, if a pool was configured with a default size of 24 and a request was made to create a network with a preferred prefix size of 16, any network from the pool would still be limited to a 24 prefix.

One other possible approach would be to honor the preferred prefix size as long as it was valid for the underlying base network for a given pool. For example, take a default pool configured with a base network of 172.17.0.0/12 with a default allocation size of 24. If a preferred size of 20 was given, that could be honored as it still fits within the overall /12 base subnet. In that case only the base prefix size would need to be considered when evaluating networks, not the default allocation size.

@danegsta danegsta force-pushed the 50107-defaultpoolsize branch from f2c7c9b to 87498f9 Compare July 2, 2025 03:08
@danegsta danegsta changed the title Proposal: Allow hinting to ipam a preferred subnet size from default pool Proposal: Allow specifying the subnet size for networks allocated from default pools Jul 2, 2025
@robmry
Copy link
Contributor

robmry commented Jul 25, 2025

Just found a previous PR/discussion related to address pool sizes. So, linking it here, but it looks like that work stalled in April last year ...

@danegsta
Copy link
Contributor Author

Just found a previous PR/discussion related to address pool sizes. So, linking it here, but it looks like that work stalled in April last year ...

I need to do some work to get this ready to move out of draft; I've been mostly heads down on an upcoming release at work, but should have time again to devote to this proposal now.

@danegsta
Copy link
Contributor Author

I'm suspicious there'll still be a few changes warranted to this PR, but I think it's at a place where it's at ready for actual review.

@iBug
Copy link

iBug commented Sep 19, 2025

but it looks like that work stalled in April last year ...

Author of #47737 here, I was not given clear instructions or guidance on the next steps. There are unresolved arguments on whether "non network-savvy users can easily read and identify /22 subnets" and that's pretty much why it stalled. On my side all required code changes have been done and all I was doing was awaiting input.

@danegsta
Copy link
Contributor Author

but it looks like that work stalled in April last year ...

Author of #47737 here, I was not given clear instructions or guidance on the next steps. There are unresolved arguments on whether "non network-savvy users can easily read and identify /22 subnets" and that's pretty much why it stalled. On my side all required code changes have been done and all I was doing was awaiting input.

I'd argue the two changes are complementary; I don't think anything in this PR invalidates the benefit of more reasonable defaults that increase the number of simultaneous networks available to the average Docker user. I see this PR as being more beneficial for developer tooling that interacts with Docker networking (coincidentally exactly what I work on, but I also recognize it's a more niche scenario). My main motivation with this PR is that, if I know that even 1024 unique addresses is more than a given network will require, I'd rather be a good steward of a developer's system resources and request a smaller range such as /24 (or even /25).

@akerouanton
Copy link
Member

akerouanton commented Sep 30, 2025

[We discussed this PR during today's libnet maintainers call.

We all agreed that the --subnet-based UX is better than forcing users to specify an --ipam-opt.

Regarding API representation, we think that it's fine to put unspecified prefixes in IPAM.Config.Subnet. Older daemons will accept these values, but will fail to attach any container to such networks (because they'll try to allocate / assign the 0.0.0.1 address). There's not much we can do about that — clients will need to discover whether the daemon support this feature (by checking the API version).

We considered putting the subnet size in an IPAM option at the API level — i.e. while still allowing users to specify unspecified prefixes through --subnet — but this would require the CLI / Compose to translate subnet values into IPAM options. Since this would be a new CLI / Compose behavior, older versions would send these as IPAM.Config.Subnet, so that wouldn't solve the initial compatibility issue (i.e. consumers would have to check CLI / Compose version to discover whether they support that translation, plus they'd have to check the API version to know whether the daemon will honor it).

Regarding this comment:

My current design treats the default (or configured) pool prefix sizes as hard limits; the resulting network can have a smaller address range than the default, but never larger. For example, if a pool was configured with a default size of 24 and a request was made to create a network with a preferred prefix size of 16, any network from the pool would still be limited to a 24 prefix.

We're rather erring on the side of making the subnet size a mandatory requirement the Engine has to honor, instead of an hint that the Engine may decide to ignore. The reason is that we're considering #47737 as a next step, and we'd like to take that PR even further by making the default subnet size a global config parameter instead of a per address pool parameter as is currently the case.

@danegsta
Copy link
Contributor Author

[We discussed this PR during today's libnet maintainers call.

We all agreed that the --subnet-based UX is better than forcing users to specify an --ipam-opt.

Regarding API representation, we think that it's fine to put unspecified prefixes in IPAM.Config.Subnet. Older daemons will accept these values, but will fail to attach any container to such networks (because they'll try to allocate / assign the 0.0.0.1 address). There's not much we can do about that — clients will need to discover whether the daemon support this feature (by checking the API version).

We considered putting the subnet size in an IPAM option at the API level — i.e. while still allowing users to specify unspecified prefixes through --subnet — but this would require the CLI / Compose to translate subnet values into IPAM options. Since this would be a new CLI / Compose behavior, older versions would send these as IPAM.Config.Subnet, so that wouldn't solve the initial compatibility issue (i.e. consumers would have to check CLI / Compose version to discover whether they support that translation, plus they'd have to check the API version to know whether the daemon will honor it).

Regarding this comment:

My current design treats the default (or configured) pool prefix sizes as hard limits; the resulting network can have a smaller address range than the default, but never larger. For example, if a pool was configured with a default size of 24 and a request was made to create a network with a preferred prefix size of 16, any network from the pool would still be limited to a 24 prefix.

We're rather erring on the side of making the subnet size a mandatory requirement the Engine has to honor, instead of an hint that the Engine may decide to ignore. The reason is that we're considering #47737 as a next step, and we'd like to take that PR even further by making the default subnet size a global config parameter instead of a per address pool parameter as is currently the case.

Sounds good, the current implementation is fairly close to this already; I'm treating the subnet size as mandatory (when specified) and honoring it via --subnet (works for multiple subnets including a mix of IPv4 and v6). The only real change would be removing the --ipam-opt which is trivial and would only require removing a single if block and updating a few tests. My main motivation for including the --ipam-opt was primarily to make my life easier integrating this feature with developer tooling that I work on, but it's nothing that I can't work around via checking for supported versions.

@danegsta
Copy link
Contributor Author

@akerouanton with the latest commits, I believe the PR is at a place where it matches the desired behavior.

@danegsta
Copy link
Contributor Author

danegsta commented Oct 1, 2025

I missed a unit test that was validating RequestPool with "all address" subnet requests (0.0.0.0/x, ::/x) and expecting the previous behavior (try to process it as an explicit subnet request). Updated the test to account for the new default pool allocation behavior.

@akerouanton
Copy link
Member

Could you squash your commits please? It'd make it easier to review from an IDE. At a quick glance, I think there's no upfront refactoring, so it should probably go into a single commit.

@danegsta danegsta force-pushed the 50107-defaultpoolsize branch from 87a30a0 to e2e763e Compare October 2, 2025 16:31
@corhere
Copy link
Contributor

corhere commented Oct 9, 2025

cc @dperny - want to take a look as well?

@danegsta danegsta force-pushed the 50107-defaultpoolsize branch from d441a9b to cc7d6ef Compare October 9, 2025 19:08
@danegsta
Copy link
Contributor Author

I'm assuming the remaining failing tests are likely noise or an issue with the github agent as they seem to be due to disk capacity issues on an unrelated integration test.

@robmry
Copy link
Contributor

robmry commented Oct 11, 2025

Yes, test failures look unrelated to this change. As you say, it seems a test runner is low on space. Strange to have hit the same thing on the first two runs, but I've given it another kick.

=== FAIL: amd64.docker.docker.integration.build TestBuildWithHugeFile (34.60s)
    build_test.go:514: assertion failed: string "{\"stream\":\"Step 1/2 : FROM busybox\"}\r\n{\"stream\":\"\\n\"}\r\n{\"stream\":\" ---\\u003e 1c35c4412082\\n\"}\r\n{\"stream\":\"Step 2/2 : RUN for g in $(seq 0 8); do dd if=/dev/urandom of=rnd bs=1K count=1 seek=$((1024*1024*g)) status=none; done  \\u0026\\u0026 ls -la rnd \\u0026\\u0026 du -sk rnd\"}\r\n{\"stream\":\"\\n\"}\r\n{\"stream\":\" ---\\u003e Running in 93b70daf87fb\\n\"}\r\n{\"stream\":\"-rw-r--r--    1 root     root     8589935616 Oct 10 08:16 rnd\\n\"}\r\n{\"stream\":\"36\\trnd\\n\"}\r\n{\"stream\":\" ---\\u003e Removed intermediate container 93b70daf87fb\\n\"}\r\n{\"errorDetail\":{\"message\":\"write /rnd: no space left on device\"},\"error\":\"write /rnd: no space left on device\"}\r\n" does not contain "Successfully built"

@danegsta danegsta force-pushed the 50107-defaultpoolsize branch from cc7d6ef to 57f8b6c Compare October 14, 2025 19:28
@github-actions github-actions bot added the area/daemon Core Engine label Oct 14, 2025
@danegsta
Copy link
Contributor Author

Can I trouble someone to re-run the one failing unit test?

@robmry
Copy link
Contributor

robmry commented Oct 20, 2025

Can I trouble someone to re-run the one failing unit test?

Looks like it's going to keep failing ... lots of stuff is broken today, but definitely unrelated to this PR.

@danegsta
Copy link
Contributor Author

Can I trouble someone to re-run the one failing unit test?

Looks like it's going to keep failing ... lots of stuff is broken today, but definitely unrelated to this PR.

Yeah, us-east-1 going down does tend to have some severe knock-on effects ☹️.

Copy link
Member

@akerouanton akerouanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@akerouanton
Copy link
Member

@danegsta You have one conflict to resolve, and we should be able to merge your PR afterward.

@danegsta
Copy link
Contributor Author

@danegsta You have one conflict to resolve, and we should be able to merge your PR afterward.

Merge conflicts are resolved (there was only a single conflicting line, so a very quick update).

@danegsta
Copy link
Contributor Author

Spoke too soon; thought I'd run the integration tests before pushing, but looks like I only ran the unit tests. There's a few additional tweaks that'll have to happen due to other changes that didn't cause conflicts.

…pools

Signed-off-by: David Negstad <David.Negstad@microsoft.com>
@danegsta danegsta force-pushed the 50107-defaultpoolsize branch from 7c41d32 to ea0d934 Compare October 22, 2025 19:46
@danegsta
Copy link
Contributor Author

Okay, fixed the issue with the return type from the new integration network inspect helpers and both unit and integration tests are passing locally for me.

@akerouanton
Copy link
Member

Thanks @danegsta! Let's get it in 🎉

@akerouanton akerouanton merged commit 37ac6ce into moby:master Oct 23, 2025
181 checks passed
@danegsta
Copy link
Contributor Author

Thanks @danegsta! Let's get it in 🎉

Thanks everyone for the reviews and feedback to help get this ready to merge!

@robmry
Copy link
Contributor

robmry commented Oct 24, 2025

Thanks everyone for the reviews and feedback to help get this ready to merge!

Thank you for all the work!

@robmry robmry changed the title Proposal: Allow specifying the subnet size for networks allocated from default pools Allow specifying the subnet size for networks allocated from default pools Oct 24, 2025
@robmry
Copy link
Contributor

robmry commented Oct 24, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/daemon Core Engine area/dependencies area/networking/ipam Networking area/networking Networking backlog docs/revisit impact/changelog kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants

X Tutup