X Tutup
The Wayback Machine - https://web.archive.org/web/20240721024610/https://github.com/python/cpython/issues/70933
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

struct.pack(): trailing padding bytes on x64 #70933

Open
skrah mannequin opened this issue Apr 13, 2016 · 6 comments
Open

struct.pack(): trailing padding bytes on x64 #70933

skrah mannequin opened this issue Apr 13, 2016 · 6 comments
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@skrah
Copy link
Mannequin

skrah mannequin commented Apr 13, 2016

BPO 26746
Nosy @mdickinson, @skrah, @meadori, @vadmium, @eric-wieser

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2016-04-13.08:38:49.170>
labels = ['extension-modules', 'type-bug', '3.7']
title = 'struct.pack(): trailing padding bytes on x64'
updated_at = <Date 2019-04-21.07:32:27.586>
user = 'https://github.com/skrah'

bugs.python.org fields:

activity = <Date 2019-04-21.07:32:27.586>
actor = 'skrah'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Extension Modules']
creation = <Date 2016-04-13.08:38:49.170>
creator = 'skrah'
dependencies = []
files = []
hgrepos = []
issue_num = 26746
keywords = []
message_count = 6.0
messages = ['263315', '263322', '263328', '265213', '269887', '309085']
nosy_count = 6.0
nosy_names = ['mark.dickinson', 'skrah', 'meador.inge', 'martin.panter', 'Eric.Wieser', 'Allan Haldane']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue26746'
versions = ['Python 3.7']

@skrah
Copy link
Mannequin Author

skrah mannequin commented Apr 13, 2016

On the x64 architecture gcc adds trailing padding bytes after the last
struct member. NumPy does the same:

>>> import numpy as np
>>> 
>>> t = np.dtype([('x', 'u1'), ('y', 'u8'), ('z', 'u1')], align=True)
>>> x = np.array([(1, 2, 3)], dtype=t)
>>> x.tostring()
b'\x01\xf7\xba\xab\x03\x7f\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00'

The struct module in native mode does not:

>>> struct.pack("BQB", 1, 2, 3)
b'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03'

I'm not sure if this is intended -- or if full compatibility to
native compilers is even achievable in the general case.

@skrah skrah mannequin added extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error labels Apr 13, 2016
@vadmium
Copy link
Member

vadmium commented Apr 13, 2016

This behaviour seems to be documented, although it is not very explicit, and a bit surprising to me. See the third note at the end of <https://docs.python.org/3/library/struct.html#byte-order-size-and-alignment\>: “align the end . . . with a repeat count of zero”, and the example

>>> pack('llh0l', 1, 2, 3)
b'\x00\x00\x00\x01\x00\x00\x00\x02\x00\x03\x00\x00'

@skrah
Copy link
Mannequin Author

skrah mannequin commented Apr 13, 2016

Thank you. So technically, in the above NumPy example the format
string generated by NumPy would be considered incomplete if we assume
struct syntax:

>>> m = memoryview(x)
>>> m.format
'T{B:x:xxxxxxxL:y:B:z:}'

I find this "0L" thing a very odd notation. Taking care of this
manually requires a) knowledge of what the compiler does and b)
searching for the largest struct member.

@meadori
Copy link
Member

meadori commented May 10, 2016

I'm not to crazy about the trailing padding syntax either. The behavior is documented all the way back to Python 2.6. So, I would be hesitant to change it now.

If the new 'T{...}' struct syntax from bpo-3132 ever gets added, then maybe we could address this there?

FWIW, internal and trailing padding is implementation defined by the C standard. That being said, most compilers I have worked with add the trailing padding.

@AllanHaldane
Copy link
Mannequin

AllanHaldane mannequin commented Jul 6, 2016

Hello,

Over at numpy I have a proposed fix for the bug you discovered, that numpy drops trailing padding in the 3118 format string. My strategy is going to make numpy interpret format strings exactly the same way as the struct module, let me know if you disagree.

See numpy/numpy#7798

@skrah
Copy link
Mannequin Author

skrah mannequin commented Dec 27, 2017

I have just worked on PEP-3118 ==> Datashape translation and I have
encountered many issues similar to the ones in the PR referenced by
Allan.

It seems to me that we should simplify the PEP-3118 struct syntax as much
as possible in order to remove any ambiguities.

I think generally that numpy's approach is the best for data interchange, so I
would propose this modified struct syntax for PEP-3118:

  1. Padding is always explicit and exact, also for natively aligned types.

  2. Padding is only allowed in struct fields.

  3. Trailing padding is explicit.

  4. If no padding is present in a struct, it is assumed to be packed with
    alignment 1 for the entire struct.

  5. The tuple syntax "bxL" is not supported, only the T{} syntax with
    explicit field names.

  6. Repetition "10s" is only allowed for bytes. "10f" is a tuple (not
    supported), an array of 10 floats would be (10)f.

  7. Modifiers (@, =, <, >, !) are only given for primitive data types,
    not for entire structs or types.

  8. Implementations are free to reject any padding that would not arise
    naturally by specifying alignment or packing constraints (like gcc does
    with attributes).

Here is my implementation with a grammar:

https://github.com/plures/ndtypes/blob/master/libndtypes/compat/bpgrammar.y

Some tests against numpy:

https://github.com/plures/xnd/blob/master/python/test_xnd.py#L1509

I think the best way forward would be to tweak the above grammar so that
it covers everything that numpy can export.

@skrah skrah mannequin added the 3.7 (EOL) end of life label Dec 27, 2017
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

2 participants
X Tutup