The Wayback Machine - https://web.archive.org/web/20230608193416/https://github.com/haproxy/haproxy/issues/1035
Crash #1035

Closed
srkunze opened this issue Jan 6, 2021 · 66 comments
Labels
status: needs-triage This issue needs to be triaged. type: bug This issue describes a bug.

Comments

@srkunze

srkunze commented Jan 6, 2021

OK, guys. Crashed it again - this time with a bug report ;-) The change in the config was to add some awareness of static files. The config below crashes with both 2.2.6 and 2.3.2:

global
    ... <from installation>

defaults
    ... <from installation>

frontend https
    bind *:443 ssl crt /etc/haproxy/certs/my.cert.pem

    acl a1 hdr_dom(host) -i app1.domain
    acl a2 hdr_dom(host) -i app2.domain
    acl is_static path -i -m beg /static/

    use_backend b1s if a1 is_static
    use_backend b1 if a1
    use_backend b2 if a2

backend b1s
    server s3 zz.zz.zz.zz:80 check

backend b1
    server s1 zz.zz.zz.zz:8000 check

backend b2
    server s2 qq.qq.qq.qq:80 check

Starting config (didn't work with 2.0.13; confirmed working with 2.2.6):

global
    ... <from installation>

defaults
    ... <from installation>

frontend https
    bind *:443 ssl crt /etc/haproxy/certs/my.cert.pem

    acl a1 hdr_dom(host) -i app1.domain
    acl a2 hdr_dom(host) -i app2.domain

    use_backend b1 if a1
    use_backend b2 if a2

backend b1
    server s1 zz.zz.zz.zz:8000 check

backend b2
    server s2 qq.qq.qq.qq:80 check
root@haproxy:~# haproxy -v
HA-Proxy version 2.3.2-1ppa1~focal 2020/11/29 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2022.
Known bugs: http://www.haproxy.org/bugs/bugs-2.3.2.html
Running on: Linux 5.4.0-1025-raspi #28-Ubuntu SMP PREEMPT Wed Dec 9 17:10:53 UTC 2020 armv8l

@srkunze srkunze added status: needs-triage This issue needs to be triaged. type: bug This issue describes a bug. labels Jan 6, 2021
@wtarreau
Member

wtarreau commented Jan 7, 2021

Thanks. Nothing fancy in your config. We're currently in the process of backporting plenty of fixes to emit a new 2.3 and a new 2.2 hopefully this week, and given the number of bugs fixed, it's possible that yours is as well. If you have the core, please open it in gdb and issue "t a a bt full" to get the full backtrace. It should provide important information about where it happened.

@srkunze
Author

srkunze commented Jan 7, 2021

@wtarreau maybe it's important to note that it's running inside of a linux container on a RaspberryPi.

I will try to get a core dump. Give me a few minutes.

@srkunze
Author

srkunze commented Jan 7, 2021

@wtarreau using ulimit -c unlimited and also looking at /var/lib/systemd/coredump/, I cannot produce one. Is there something I need to do first?

@a-denoyelle
Contributor

> @wtarreau using ulimit -c unlimited and also looking at /var/lib/systemd/coredump/, I cannot produce one. Is there something I need to do first?

If you are using systemd, you should also take a look at /etc/systemd/coredump.conf and the coredumpctl utility.

@wtarreau
Member

wtarreau commented Jan 7, 2021

By the way, coredumps will be automatically disabled by the kernel if you happen to have a "user" statement or various other ones. There is a special set-dumpable global option you can set in your global section that will make the process dumpable regardless of these other ones. I strongly encourage you to set it for this.
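
A minimal sketch of how the two preconditions mentioned so far could be checked from a shell. The helper names are hypothetical (they are not part of haproxy or systemd), and the config path in the usage comments is an assumption based on the commands shown elsewhere in this thread. The first helper only looks for a set-dumpable line anywhere in the file; it does not verify that the line sits in the global section.

```shell
# Hypothetical helper: succeeds if the given config file contains a
# "set-dumpable" line (it does not check which section the line is in).
has_set_dumpable() {
    grep -Eq '^[[:space:]]*set-dumpable([[:space:]]|$)' "$1"
}

# Hypothetical helper: print the kernel's core_pattern. A leading "|"
# means cores are piped to a handler such as systemd-coredump.
core_pattern() {
    cat /proc/sys/kernel/core_pattern
}

# Example use (paths are assumptions):
#   has_set_dumpable /etc/haproxy/haproxy.cfg || echo "set-dumpable is missing"
#   core_pattern
#   coredumpctl list haproxy
```

If the pattern starts with a pipe, the core will likely not appear as a plain file in the working directory, which is why coredumpctl is the place to look in that case.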

@wtarreau
Member

By the way, stupid question since this is an extremely common issue on RPi, are you certain of your power supply's stability ?

@srkunze
Author

srkunze commented Jan 13, 2021

@wtarreau I use the original RPi power supply for my version of the RPi. Do you have something concrete in mind which I could check?

@wtarreau
Member

The original PSU usually is OK. Look at the red LED. If you see it flashing/blinking, it definitely indicates a power supply issue. Not seeing it flash doesn't necessarily indicate everything is OK though, but that's a good start. With that said, I was essentially asking "just in case", as there are definitely bugs from time to time in haproxy which can cause crashes, but PSU issues are also a known cause on RPi, so that's not easy to tell :-)

@chipitsine
Member

I run haproxy on raspberry without any issue. Can you describe repro steps ?

@srkunze
Author

srkunze commented Jan 13, 2021

Of course. You install it in a fresh Ubuntu 20.04.1 LTS installation and use the config from above.

Once the server starts, nothing bad happens.
But as soon as you wget to it, it crashes with the usual failure message.

@wtarreau
Member

Could you please post your global and default sections ? They definitely contain the most important parts and it's impossible to try the config without them. Thanks!

@chipitsine
Member

> Of course. You install it in a fresh Ubuntu 20.04.1 LTS installation and use the config from above.
>
> Once the server starts, nothing bad happens.
> But as soon as you wget to it, it crashes with the usual failure message.

as an option, can you try to run it in gdb ?

gdb --args /path/to/haproxy -f /path/to/config -d

after it crashes, control should return to the gdb prompt, where you can debug as usual, e.g. with "bt full" and so on

@srkunze
Author

srkunze commented Jan 18, 2021

Here you go

global
	maxconn 128
	tune.ssl.default-dh-param 2048
	log /dev/log	local0
	log /dev/log	local1 notice
	chroot /var/lib/haproxy
	stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
	stats timeout 30s
	user haproxy
	group haproxy
	daemon

	# Default SSL material locations
	ca-base /etc/ssl/certs
	crt-base /etc/ssl/private

	# See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
	log	global
	mode	http
	option	httplog
	option	dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
	errorfile 400 /etc/haproxy/errors/400.http
	errorfile 403 /etc/haproxy/errors/403.http
	errorfile 408 /etc/haproxy/errors/408.http
	errorfile 500 /etc/haproxy/errors/500.http
	errorfile 502 /etc/haproxy/errors/502.http
	errorfile 503 /etc/haproxy/errors/503.http
	errorfile 504 /etc/haproxy/errors/504.http

@srkunze
Author

srkunze commented Jan 18, 2021

> There is a special set-dumpable global option you can set in your global section that will make the process dumpable regardless of these other ones.

I gave it another try but was unsuccessful. :-( Will try the gdb idea.

@srkunze
Author

srkunze commented Jan 18, 2021

Well, sorry to bother you again, but what's necessary for gdb to open up the port?

root@haproxy:~# /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg
...

root@haproxy:~# ss -tulpen | grep 443
tcp   LISTEN 0      128                             0.0.0.0:443         0.0.0.0:*                                                                                users:(("haproxy",pid=8471,fd=6)) ino:10954917 sk:5d3 <->                      
root@haproxy:~#
root@haproxy:~# gdb --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg
....

root@haproxy:~# ss -tulpen | grep 443
root@haproxy:~#

@srkunze
Author

srkunze commented Jan 18, 2021

I remembered that you pushed out a new version last week. I installed 2.3.4 but the error remains.

root@haproxy:~# /usr/sbin/haproxy -v
HA-Proxy version 2.3.4-1ppa1~focal 2021/01/15 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2022.
Known bugs: http://www.haproxy.org/bugs/bugs-2.3.4.html
Running on: Linux 5.4.0-1026-raspi #29-Ubuntu SMP PREEMPT Mon Dec 14 17:01:16 UTC 2020 armv8l

@srkunze
Author

srkunze commented Jan 18, 2021

Alright. In the meantime, I boiled the problematic part of the config down to this line:

    acl is_static path -i -m beg /static/

And when removing the -i the server does not crash anymore. :-)

@chipitsine
Member

chipitsine commented Jan 18, 2021

in the config above there's nothing like is_static

can you please post the full config ?

I tried to repro on amd64 on latest master branch using

    acl is_static path -i -m beg /static/
    use_backend xxx if is_static

works as expected. I will try on Raspberry Pi 400 tomorrow

@srkunze
Author

srkunze commented Jan 19, 2021

global
	maxconn 128
	tune.ssl.default-dh-param 2048
	log /dev/log	local0
	log /dev/log	local1 notice
	chroot /var/lib/haproxy
	stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
	stats timeout 30s
	user haproxy
	group haproxy
	daemon

	# Default SSL material locations
	ca-base /etc/ssl/certs
	crt-base /etc/ssl/private

	# See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
	log	global
	mode	http
	option	httplog
	option	dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
	errorfile 400 /etc/haproxy/errors/400.http
	errorfile 403 /etc/haproxy/errors/403.http
	errorfile 408 /etc/haproxy/errors/408.http
	errorfile 500 /etc/haproxy/errors/500.http
	errorfile 502 /etc/haproxy/errors/502.http
	errorfile 503 /etc/haproxy/errors/503.http
	errorfile 504 /etc/haproxy/errors/504.http

frontend https
    bind *:443 ssl crt /etc/haproxy/certs/my.cert.pem

    acl a1 hdr_dom(host) -i app1.domain
    acl a2 hdr_dom(host) -i app2.domain
    acl is_static path -i -m beg /static/

    use_backend b1s if a1 is_static
    use_backend b1 if a1
    use_backend b2 if a2

backend b1s
    server s3 zz.zz.zz.zz:80 check

backend b1
    server s1 zz.zz.zz.zz:8000 check

backend b2
    server s2 qq.qq.qq.qq:80 check

@srkunze
Author

srkunze commented Jan 27, 2021

> Well, sorry to bother you again, but what's necessary for gdb to open up the port?

Somebody here to give me a hint?

@wtarreau
Member

Not yet, sadly, it looks like various more reproducible bugs are already keeping everyone 100% busy :-(

@srkunze
Author

srkunze commented Jan 27, 2021

Well, I see. :-)

Just give me a sign, when you have time for it. It's still reproducible on my machine. :-D

@capflam
Member

capflam commented Jan 27, 2021

I don't see any reason that may explain why gdb is blocking the port. There is nothing to do to "open" a port from gdb, except running the program. Are you sure it is running ? Just in case, you should enter run on the gdb prompt or add -ex run on the command line. For instance :

gdb -ex "run" --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg

In addition, it is probably a good idea not to use the master-worker mode (i.e. to remove the -Ws argument), because it forks the worker process. gdb is then attached to the master process and not to the worker, unless you explicitly follow the child using the command set follow-fork-mode child:

gdb -ex "set follow-fork-mode child" -ex "run" --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg
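
The same idea can be put into a gdb command file so that the backtrace is printed automatically once the process stops. This is only a sketch: the helper name and the /tmp path are hypothetical, and "thread apply all bt full" is the long form of the "t a a bt full" command requested earlier in this thread.

```shell
# Hypothetical helper: write a gdb command file that follows the forked
# child, runs the program, and prints a full backtrace of all threads
# once the program stops (for example on SIGBUS).
write_gdb_script() {
    cat > "$1" <<'EOF'
set follow-fork-mode child
run
thread apply all bt full
EOF
}

# Example use (binary and config paths as used earlier in this thread):
#   write_gdb_script /tmp/haproxy.gdb
#   gdb -batch -x /tmp/haproxy.gdb --args /usr/sbin/haproxy -d -f /etc/haproxy/haproxy.cfg
```

With -batch, gdb executes the commands in order and exits afterwards, so the request that triggers the crash has to be sent from a second terminal while "run" is in progress.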

@wtarreau
Member

I'm noticing that your arch is set to "armv8l" which is the 32-bit version of ARMv8, and there might be a slight difference here. Could you please post the output of "gcc -v" ? In any case, getting the gdb output at the moment of the crash would immensely help.

@srkunze
Author

srkunze commented Jan 28, 2021

Alright here we go, regarding the architecture. Not sure if these things should be mixed up anyway, but I can ask at LXD how they intend it to be used

ubuntu@raspi:~$ uname -a
Linux raspi 5.4.0-1026-raspi #29-Ubuntu SMP PREEMPT Mon Dec 14 17:01:16 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

ubuntu@raspi:~$ lxc shell haproxy
root@haproxy:~# uname -a
Linux haproxy 5.4.0-1026-raspi #29-Ubuntu SMP PREEMPT Mon Dec 14 17:01:16 UTC 2020 armv8l armv8l armv8l GNU/Linux
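
The two outputs differ only in the machine field (aarch64 on the host, armv8l in the container), which suggests a 32-bit userland running on a 64-bit kernel. A small sketch of how that could be confirmed from inside the container; the helper name is hypothetical, and dpkg is assumed to be present because this is an Ubuntu system.

```shell
# Hypothetical helper: word size of the current userland (32 or 64),
# as reported by getconf.
userland_bits() {
    getconf LONG_BIT
}

# Example use inside the container:
#   uname -m                    # machine name as seen by this userland
#   userland_bits               # word size of the userland
#   dpkg --print-architecture   # Debian/Ubuntu package architecture
```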

Regarding gdb, no luck so far. I tried:

gdb -ex "run" --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg.github
gdb -ex "run" --args /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github
gdb -ex "set follow-fork-mode child" -ex "run" --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg.github

Sample output but port 443 still not open:

GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/haproxy...
(No debugging symbols found in /usr/sbin/haproxy)
Starting program: /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github

The worst thing about this is that it's 100% reproducible without gdb but I cannot shoot an HTTP request to trigger it when using gdb.

Output without gdb (just ignore the Layer4 connection problem, that is because I would need to reconfigure the backend servers for the test scenario but that is not necessary to reproduce the bug)

root@haproxy:~# /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg.github 
Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
	[SPOE] spoe
	[CACHE] cache
	[FCGI] fcgi-app
	[COMP] compression
	[TRACE] trace
Using epoll() as the polling mechanism.
[NOTICE] 027/142510 (4308) : New worker #1 (4310) forked
[WARNING] 027/142510 (4310) : Server b1/xx.xx.xx.xx is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[NOTICE] 027/142510 (4310) : haproxy version is 2.3.4-1ppa1~focal
[NOTICE] 027/142510 (4310) : path to executable is /usr/sbin/haproxy
[ALERT] 027/142510 (4310) : backend 'b1' has no server available!
00000000:www-https.accept(0006)=0015 from [xx.xx.xx.xx:57206] ALPN=<none>
00000000:www-https.clireq[0015:ffffffff]: GET /static/admin/css/dashboard.css HTTP/1.1
00000000:www-https.clihdr[0015:ffffffff]: user-agent: Wget/1.20.3 (linux-gnueabihf)
00000000:www-https.clihdr[0015:ffffffff]: accept: */*
00000000:www-https.clihdr[0015:ffffffff]: accept-encoding: identity
00000000:www-https.clihdr[0015:ffffffff]: host: <aaa.bbb.cc>
[NOTICE] 027/142513 (4308) : haproxy version is 2.3.4-1ppa1~focal
[NOTICE] 027/142513 (4308) : path to executable is /usr/sbin/haproxy
[ALERT] 027/142513 (4308) : Current worker #1 (4310) exited with code 135 (Bus error)
[ALERT] 027/142513 (4308) : exit-on-failure: killing every processes with SIGTERM
[WARNING] 027/142513 (4308) : All workers exited. Exiting... (135)

gcc version

root@haproxy:~# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/9/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.3.0-17ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --enable-multilib --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --disable-werror --enable-multilib --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04) 

@wtarreau
Member

So in short, this environment doesn't seem to work well enough for even something as standard as gdb to work.

The bus error is at least a new indication that makes things a bit more precise. It is usually an alignment issue, though there should not be any on armv8, except maybe for atomic ops or double-word load/stores. Could you please show the output of gcc -dM -xc -E - </dev/null ? It will dump all arch-specific defines and will help me figure out whether some specific instructions are used somewhere.

When you get this error there, do you have any extra info in dmesg ? I'm looking for the address of the error and possibly the surrounding code.
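
For reference, the worker's exit code 135 in the log above follows the usual shell convention that a status above 128 means "killed by signal (status - 128)". A minimal sketch that decodes it; the helper name is hypothetical, and the signal numbering is assumed to be that of Linux.

```shell
# Hypothetical helper: turn a shell-style exit status into a signal name.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
signal_from_status() {
    code=$1
    if [ "$code" -gt 128 ]; then
        kill -l "$((code - 128))"
    else
        echo "exit $code"
    fi
}

signal_from_status 135   # on Linux this should print BUS (signal 7)

# The kernel log may carry more detail about the faulting address:
#   dmesg | tail -n 20
```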

@srkunze
Author

srkunze commented Jan 28, 2021

arch-specific defines

root@haproxy:~# gcc -dM -xc -E - </dev/null
#define __SSP_STRONG__ 3
#define __DBL_MIN_EXP__ (-1021)
#define __HQ_FBIT__ 15
#define __FLT32X_MAX_EXP__ 1024
#define __UINT_LEAST16_MAX__ 0xffff
#define __ARM_SIZEOF_WCHAR_T 4
#define __ATOMIC_ACQUIRE 2
#define __SFRACT_IBIT__ 0
#define __FLT_MIN__ 1.1754943508222875e-38F
#define __GCC_IEC_559_COMPLEX 2
#define __UFRACT_MAX__ 0XFFFFP-16UR
#define __UINT_LEAST8_TYPE__ unsigned char
#define __DQ_FBIT__ 63
#define __INTMAX_C(c) c ## LL
#define __ARM_FEATURE_SAT 1
#define __ULFRACT_FBIT__ 32
#define __SACCUM_EPSILON__ 0x1P-7HK
#define __CHAR_BIT__ 8
#define __USQ_IBIT__ 0
#define __UINT8_MAX__ 0xff
#define __ACCUM_FBIT__ 15
#define __WINT_MAX__ 0xffffffffU
#define __FLT32_MIN_EXP__ (-125)
#define __USFRACT_FBIT__ 8
#define __ORDER_LITTLE_ENDIAN__ 1234
#define __SIZE_MAX__ 0xffffffffU
#define __ARM_ARCH_ISA_ARM 1
#define __WCHAR_MAX__ 0xffffffffU
#define __LACCUM_IBIT__ 32
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
#define __DBL_DENORM_MIN__ ((double)4.9406564584124654e-324L)
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 1
#define __GCC_ATOMIC_CHAR_LOCK_FREE 2
#define __GCC_IEC_559 2
#define __FLT32X_DECIMAL_DIG__ 17
#define __FLT_EVAL_METHOD__ 0
#define __unix__ 1
#define __LLACCUM_MAX__ 0X7FFFFFFFFFFFFFFFP-31LLK
#define __FLT64_DECIMAL_DIG__ 17
#define __GCC_ATOMIC_CHAR32_T_LOCK_FREE 2
#define __FRACT_FBIT__ 15
#define __UINT_FAST64_MAX__ 0xffffffffffffffffULL
#define __SIG_ATOMIC_TYPE__ int
#define __UACCUM_FBIT__ 16
#define __DBL_MIN_10_EXP__ (-307)
#define __FINITE_MATH_ONLY__ 0
#define __ARMEL__ 1
#define __ARM_FEATURE_UNALIGNED 1
#define __LFRACT_IBIT__ 0
#define __GNUC_PATCHLEVEL__ 0
#define __FLT32_HAS_DENORM__ 1
#define __LFRACT_MAX__ 0X7FFFFFFFP-31LR
#define __UINT_FAST8_MAX__ 0xff
#define __has_include(STR) __has_include__(STR)
#define __DEC64_MAX_EXP__ 385
#define __INT8_C(c) c
#define __INT_LEAST8_WIDTH__ 8
#define __UINT_LEAST64_MAX__ 0xffffffffffffffffULL
#define __SA_FBIT__ 15
#define __SHRT_MAX__ 0x7fff
#define __LDBL_MAX__ 1.7976931348623157e+308L
#define __FRACT_MAX__ 0X7FFFP-15R
#define __thumb2__ 1
#define __UFRACT_FBIT__ 16
#define __ARM_FP 12
#define __UFRACT_MIN__ 0.0UR
#define __UINT_LEAST8_MAX__ 0xff
#define __GCC_ATOMIC_BOOL_LOCK_FREE 2
#define __UINTMAX_TYPE__ long long unsigned int
#define __LLFRACT_EPSILON__ 0x1P-63LLR
#define __linux 1
#define __DEC32_EPSILON__ 1E-6DF
#define __FLT_EVAL_METHOD_TS_18661_3__ 0
#define __CHAR_UNSIGNED__ 1
#define __UINT32_MAX__ 0xffffffffU
#define __ULFRACT_MAX__ 0XFFFFFFFFP-32ULR
#define __TA_IBIT__ 64
#define __LDBL_MAX_EXP__ 1024
#define __WINT_MIN__ 0U
#define __ARM_ASM_SYNTAX_UNIFIED__ 1
#define __linux__ 1
#define __INT_LEAST16_WIDTH__ 16
#define __ULLFRACT_MIN__ 0.0ULLR
#define __SCHAR_MAX__ 0x7f
#define __WCHAR_MIN__ 0U
#define __INT64_C(c) c ## LL
#define __DBL_DIG__ 15
#define __GCC_ATOMIC_POINTER_LOCK_FREE 2
#define __LLACCUM_MIN__ (-0X1P31LLK-0X1P31LLK)
#define __SIZEOF_INT__ 4
#define __SIZEOF_POINTER__ 4
#define __USACCUM_IBIT__ 8
#define __USER_LABEL_PREFIX__ 
#define __STDC_HOSTED__ 1
#define __LDBL_HAS_INFINITY__ 1
#define __LFRACT_MIN__ (-0.5LR-0.5LR)
#define __HA_IBIT__ 8
#define __FLT32_DIG__ 6
#define __TQ_IBIT__ 0
#define __FLT_EPSILON__ 1.1920928955078125e-7F
#define __APCS_32__ 1
#define __SHRT_WIDTH__ 16
#define __USFRACT_IBIT__ 0
#define __LDBL_MIN__ 2.2250738585072014e-308L
#define __STDC_UTF_16__ 1
#define __FRACT_MIN__ (-0.5R-0.5R)
#define __DEC32_MAX__ 9.999999E96DF
#define __DA_IBIT__ 32
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __FLT32X_HAS_INFINITY__ 1
#define __INT32_MAX__ 0x7fffffff
#define __UQQ_FBIT__ 8
#define __INT_WIDTH__ 32
#define __SIZEOF_LONG__ 4
#define __UACCUM_MAX__ 0XFFFFFFFFP-16UK
#define __STDC_IEC_559__ 1
#define __STDC_ISO_10646__ 201706L
#define __UINT16_C(c) c
#define __PTRDIFF_WIDTH__ 32
#define __DECIMAL_DIG__ 17
#define __LFRACT_EPSILON__ 0x1P-31LR
#define __FLT64_EPSILON__ 2.2204460492503131e-16F64
#define __ULFRACT_MIN__ 0.0ULR
#define __gnu_linux__ 1
#define __INTMAX_WIDTH__ 64
#define __has_include_next(STR) __has_include_next__(STR)
#define __ARM_PCS_VFP 1
#define __LDBL_HAS_QUIET_NAN__ 1
#define __ULACCUM_IBIT__ 32
#define __FLT64_MANT_DIG__ 53
#define __UACCUM_EPSILON__ 0x1P-16UK
#define __GNUC__ 9
#define __ULLACCUM_MAX__ 0XFFFFFFFFFFFFFFFFP-32ULLK
#define __pie__ 2
#define __HQ_IBIT__ 0
#define __FLT_HAS_DENORM__ 1
#define __SIZEOF_LONG_DOUBLE__ 8
#define __BIGGEST_ALIGNMENT__ 8
#define __FLT64_MAX_10_EXP__ 308
#define __GNUC_STDC_INLINE__ 1
#define __DQ_IBIT__ 0
#define __DBL_MAX__ ((double)1.7976931348623157e+308L)
#define __ULFRACT_IBIT__ 0
#define __INT_FAST32_MAX__ 0x7fffffff
#define __DBL_HAS_INFINITY__ 1
#define __HAVE_SPECULATION_SAFE_VALUE 1
#define __ACCUM_IBIT__ 16
#define __DEC32_MIN_EXP__ (-94)
#define __THUMB_INTERWORK__ 1
#define __INTPTR_WIDTH__ 32
#define __LACCUM_MAX__ 0X7FFFFFFFFFFFFFFFP-31LK
#define __FLT32X_HAS_DENORM__ 1
#define __INT_FAST16_TYPE__ int
#define __LDBL_HAS_DENORM__ 1
#define __ARM_FEATURE_LDREX 15
#define __DEC128_MAX__ 9.999999999999999999999999999999999E6144DL
#define __INT_LEAST32_MAX__ 0x7fffffff
#define __DEC32_MIN__ 1E-95DF
#define __ACCUM_MAX__ 0X7FFFFFFFP-15K
#define __DBL_MAX_EXP__ 1024
#define __USACCUM_EPSILON__ 0x1P-8UHK
#define __WCHAR_WIDTH__ 32
#define __FLT32_MAX__ 3.4028234663852886e+38F32
#define __DEC128_EPSILON__ 1E-33DL
#define __SFRACT_MAX__ 0X7FP-7HR
#define __FRACT_IBIT__ 0
#define __PTRDIFF_MAX__ 0x7fffffff
#define __UACCUM_MIN__ 0.0UK
#define __UACCUM_IBIT__ 16
#define __FLT32_HAS_QUIET_NAN__ 1
#define __LONG_LONG_MAX__ 0x7fffffffffffffffLL
#define __SIZEOF_SIZE_T__ 4
#define __ULACCUM_MAX__ 0XFFFFFFFFFFFFFFFFP-32ULK
#define __SIZEOF_WINT_T__ 4
#define __LONG_LONG_WIDTH__ 64
#define __FLT32_MAX_EXP__ 128
#define __SA_IBIT__ 16
#define __ULLACCUM_MIN__ 0.0ULLK
#define __GXX_ABI_VERSION 1013
#define __UTA_FBIT__ 64
#define __FLT_MIN_EXP__ (-125)
#define __USFRACT_MAX__ 0XFFP-8UHR
#define __UFRACT_IBIT__ 0
#define __ARM_FEATURE_QBIT 1
#define __INT_FAST64_TYPE__ long long int
#define __FLT64_DENORM_MIN__ 4.9406564584124654e-324F64
#define __DBL_MIN__ ((double)2.2250738585072014e-308L)
#define __PIE__ 2
#define __FLT32X_EPSILON__ 2.2204460492503131e-16F32x
#define __FLT64_MIN_EXP__ (-1021)
#define __LACCUM_MIN__ (-0X1P31LK-0X1P31LK)
#define __ULLACCUM_FBIT__ 32
#define __GXX_TYPEINFO_EQUALITY_INLINE 0
#define __FLT64_MIN_10_EXP__ (-307)
#define __ULLFRACT_EPSILON__ 0x1P-64ULLR
#define __DEC128_MIN__ 1E-6143DL
#define __REGISTER_PREFIX__ 
#define __UINT16_MAX__ 0xffff
#define __DBL_HAS_DENORM__ 1
#define __ACCUM_MIN__ (-0X1P15K-0X1P15K)
#define __SQ_IBIT__ 0
#define __FLT32_MIN__ 1.1754943508222875e-38F32
#define __UINT8_TYPE__ unsigned char
#define __UHA_FBIT__ 8
#define __NO_INLINE__ 1
#define __SFRACT_MIN__ (-0.5HR-0.5HR)
#define __UTQ_FBIT__ 128
#define __FLT_MANT_DIG__ 24
#define __LDBL_DECIMAL_DIG__ 17
#define __VERSION__ "9.3.0"
#define __UINT64_C(c) c ## ULL
#define __ULLFRACT_FBIT__ 64
#define __FRACT_EPSILON__ 0x1P-15R
#define __ULACCUM_MIN__ 0.0ULK
#define _STDC_PREDEF_H 1
#define __UDA_FBIT__ 32
#define __LLACCUM_EPSILON__ 0x1P-31LLK
#define __GCC_ATOMIC_INT_LOCK_FREE 2
#define __FLT32_MANT_DIG__ 24
#define __FLOAT_WORD_ORDER__ __ORDER_LITTLE_ENDIAN__
#define __USFRACT_MIN__ 0.0UHR
#define __UQQ_IBIT__ 0
#define __STDC_IEC_559_COMPLEX__ 1
#define __SCHAR_WIDTH__ 8
#define __INT32_C(c) c
#define __DEC64_EPSILON__ 1E-15DD
#define __ORDER_PDP_ENDIAN__ 3412
#define __DEC128_MIN_EXP__ (-6142)
#define __UHQ_FBIT__ 16
#define __LLACCUM_FBIT__ 31
#define __FLT32_MAX_10_EXP__ 38
#define __INT_FAST32_TYPE__ int
#define __UINT_LEAST16_TYPE__ short unsigned int
#define unix 1
#define __INT16_MAX__ 0x7fff
#define __SIZE_TYPE__ unsigned int
#define __UINT64_MAX__ 0xffffffffffffffffULL
#define __UDQ_FBIT__ 64
#define __INT8_TYPE__ signed char
#define __thumb__ 1
#define __ELF__ 1
#define __ULFRACT_EPSILON__ 0x1P-32ULR
#define __LLFRACT_FBIT__ 63
#define __FLT_RADIX__ 2
#define __INT_LEAST16_TYPE__ short int
#define __ARM_ARCH_PROFILE 65
#define __LDBL_EPSILON__ 2.2204460492503131e-16L
#define __UINTMAX_C(c) c ## ULL
#define __SACCUM_MAX__ 0X7FFFP-7HK
#define __SIG_ATOMIC_MAX__ 0x7fffffff
#define __GCC_ATOMIC_WCHAR_T_LOCK_FREE 2
#define __VFP_FP__ 1
#define __SIZEOF_PTRDIFF_T__ 4
#define __FLT32X_MANT_DIG__ 53
#define __LACCUM_EPSILON__ 0x1P-31LK
#define __FLT32X_MIN_EXP__ (-1021)
#define __DEC32_SUBNORMAL_MIN__ 0.000001E-95DF
#define __INT_FAST16_MAX__ 0x7fffffff
#define __FLT64_DIG__ 15
#define __UINT_FAST32_MAX__ 0xffffffffU
#define __UINT_LEAST64_TYPE__ long long unsigned int
#define __USACCUM_MAX__ 0XFFFFP-8UHK
#define __SFRACT_EPSILON__ 0x1P-7HR
#define __FLT_HAS_QUIET_NAN__ 1
#define __FLT_MAX_10_EXP__ 38
#define __LONG_MAX__ 0x7fffffffL
#define __DEC128_SUBNORMAL_MIN__ 0.000000000000000000000000000000001E-6143DL
#define __FLT_HAS_INFINITY__ 1
#define __unix 1
#define __USA_FBIT__ 16
#define __UINT_FAST16_TYPE__ unsigned int
#define __DEC64_MAX__ 9.999999999999999E384DD
#define __ARM_32BIT_STATE 1
#define __INT_FAST32_WIDTH__ 32
#define __CHAR16_TYPE__ short unsigned int
#define __PRAGMA_REDEFINE_EXTNAME 1
#define __SIZE_WIDTH__ 32
#define __INT_LEAST16_MAX__ 0x7fff
#define __DEC64_MANT_DIG__ 16
#define __INT64_MAX__ 0x7fffffffffffffffLL
#define __UINT_LEAST32_MAX__ 0xffffffffU
#define __SACCUM_FBIT__ 7
#define __FLT32_DENORM_MIN__ 1.4012984643248171e-45F32
#define __GCC_ATOMIC_LONG_LOCK_FREE 2
#define __SIG_ATOMIC_WIDTH__ 32
#define __INT_LEAST64_TYPE__ long long int
#define __ARM_FEATURE_CLZ 1
#define __INT16_TYPE__ short int
#define __INT_LEAST8_TYPE__ signed char
#define __STDC_VERSION__ 201710L
#define __SQ_FBIT__ 31
#define __DEC32_MAX_EXP__ 97
#define __ARM_ARCH_ISA_THUMB 2
#define __INT_FAST8_MAX__ 0x7f
#define __ARM_ARCH 7
#define __INTPTR_MAX__ 0x7fffffff
#define __QQ_FBIT__ 7
#define linux 1
#define __UTA_IBIT__ 64
#define __FLT64_HAS_QUIET_NAN__ 1
#define __FLT32_MIN_10_EXP__ (-37)
#define __FLT32X_DIG__ 15
#define __LDBL_MANT_DIG__ 53
#define __SFRACT_FBIT__ 7
#define __SACCUM_MIN__ (-0X1P7HK-0X1P7HK)
#define __DBL_HAS_QUIET_NAN__ 1
#define __FLT64_HAS_INFINITY__ 1
#define __SIG_ATOMIC_MIN__ (-__SIG_ATOMIC_MAX__ - 1)
#define __INTPTR_TYPE__ int
#define __UINT16_TYPE__ short unsigned int
#define __WCHAR_TYPE__ unsigned int
#define __SIZEOF_FLOAT__ 4
#define __THUMBEL__ 1
#define __USQ_FBIT__ 32
#define __pic__ 2
#define __UINTPTR_MAX__ 0xffffffffU
#define __INT_FAST64_WIDTH__ 64
#define __DEC64_MIN_EXP__ (-382)
#define __ULLACCUM_IBIT__ 32
#define __FLT32_DECIMAL_DIG__ 9
#define __INT_FAST64_MAX__ 0x7fffffffffffffffLL
#define __GCC_ATOMIC_TEST_AND_SET_TRUEVAL 1
#define __FLT_DIG__ 6
#define __FLT32_HAS_INFINITY__ 1
#define __UINT_FAST64_TYPE__ long long unsigned int
#define __INT_MAX__ 0x7fffffff
#define __LACCUM_FBIT__ 31
#define __USACCUM_MIN__ 0.0UHK
#define __UHA_IBIT__ 8
#define __INT64_TYPE__ long long int
#define __FLT_MAX_EXP__ 128
#define __UTQ_IBIT__ 0
#define __DBL_MANT_DIG__ 53
#define __INT_LEAST64_MAX__ 0x7fffffffffffffffLL
#define __GCC_ATOMIC_CHAR16_T_LOCK_FREE 2
#define __DEC64_MIN__ 1E-383DD
#define __WINT_TYPE__ unsigned int
#define __UINT_LEAST32_TYPE__ unsigned int
#define __SIZEOF_SHORT__ 2
#define __ULLFRACT_IBIT__ 0
#define __LDBL_MIN_EXP__ (-1021)
#define __arm__ 1
#define __FLT64_MAX__ 1.7976931348623157e+308F64
#define __UDA_IBIT__ 32
#define __WINT_WIDTH__ 32
#define __INT_LEAST8_MAX__ 0x7f
#define __FLT32X_MAX_10_EXP__ 308
#define __LFRACT_FBIT__ 31
#define __ARM_ARCH_7A__ 1
#define __LDBL_MAX_10_EXP__ 308
#define __ATOMIC_RELAXED 0
#define __DBL_EPSILON__ ((double)2.2204460492503131e-16L)
#define __ARM_FEATURE_SIMD32 1
#define __UINT8_C(c) c
#define __FLT64_MAX_EXP__ 1024
#define __INT_LEAST32_TYPE__ int
#define __SIZEOF_WCHAR_T__ 4
#define __UINT64_TYPE__ long long unsigned int
#define __LLFRACT_MAX__ 0X7FFFFFFFFFFFFFFFP-63LLR
#define __TQ_FBIT__ 127
#define __INT_FAST8_TYPE__ signed char
#define __ULLACCUM_EPSILON__ 0x1P-32ULLK
#define __UHQ_IBIT__ 0
#define __ARM_FEATURE_COPROC 15
#define __LLACCUM_IBIT__ 32
#define __FLT64_HAS_DENORM__ 1
#define __FLT32_EPSILON__ 1.1920928955078125e-7F32
#define __DBL_DECIMAL_DIG__ 17
#define __STDC_UTF_32__ 1
#define __INT_FAST8_WIDTH__ 8
#define __DEC_EVAL_METHOD__ 2
#define __FLT32X_MAX__ 1.7976931348623157e+308F32x
#define __TA_FBIT__ 63
#define __UDQ_IBIT__ 0
#define __ORDER_BIG_ENDIAN__ 4321
#define __ACCUM_EPSILON__ 0x1P-15K
#define __UINT32_C(c) c ## U
#define __INTMAX_MAX__ 0x7fffffffffffffffLL
#define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
#define __FLT_DENORM_MIN__ 1.4012984643248171e-45F
#define __LLFRACT_IBIT__ 0
#define __INT8_MAX__ 0x7f
#define __LONG_WIDTH__ 32
#define __PIC__ 2
#define __UINT_FAST32_TYPE__ unsigned int
#define __CHAR32_TYPE__ unsigned int
#define __FLT_MAX__ 3.4028234663852886e+38F
#define __USACCUM_FBIT__ 8
#define __INT32_TYPE__ int
#define __SIZEOF_DOUBLE__ 8
#define __FLT_MIN_10_EXP__ (-37)
#define __UFRACT_EPSILON__ 0x1P-16UR
#define __FLT64_MIN__ 2.2250738585072014e-308F64
#define __INT_LEAST32_WIDTH__ 32
#define __INTMAX_TYPE__ long long int
#define __DEC128_MAX_EXP__ 6145
#define __FLT32X_HAS_QUIET_NAN__ 1
#define __ATOMIC_CONSUME 1
#define __GNUC_MINOR__ 3
#define __INT_FAST16_WIDTH__ 32
#define __UINTMAX_MAX__ 0xffffffffffffffffULL
#define __DEC32_MANT_DIG__ 7
#define __FLT32X_DENORM_MIN__ 4.9406564584124654e-324F32x
#define __HA_FBIT__ 7
#define __DBL_MAX_10_EXP__ 308
#define __LDBL_DENORM_MIN__ 4.9406564584124654e-324L
#define __INT16_C(c) c
#define __STDC__ 1
#define __PTRDIFF_TYPE__ int
#define __LLFRACT_MIN__ (-0.5LLR-0.5LLR)
#define __ATOMIC_SEQ_CST 5
#define __DA_FBIT__ 31
#define __UINT32_TYPE__ unsigned int
#define __FLT32X_MIN_10_EXP__ (-307)
#define __UINTPTR_TYPE__ unsigned int
#define __USA_IBIT__ 16
#define __DEC64_SUBNORMAL_MIN__ 0.000000000000001E-383DD
#define __ARM_EABI__ 1
#define __DEC128_MANT_DIG__ 34
#define __LDBL_MIN_10_EXP__ (-307)
#define __SIZEOF_LONG_LONG__ 8
#define __ULACCUM_EPSILON__ 0x1P-32ULK
#define __SACCUM_IBIT__ 8
#define __GCC_ATOMIC_LLONG_LOCK_FREE 2
#define __FLT32X_MIN__ 2.2250738585072014e-308F32x
#define __LDBL_DIG__ 15
#define __FLT_DECIMAL_DIG__ 9
#define __UINT_FAST16_MAX__ 0xffffffffU
#define __GCC_ATOMIC_SHORT_LOCK_FREE 2
#define __INT_LEAST64_WIDTH__ 64
#define __ULLFRACT_MAX__ 0XFFFFFFFFFFFFFFFFP-64ULLR
#define __UINT_FAST8_TYPE__ unsigned char
#define __USFRACT_EPSILON__ 0x1P-8UHR
#define __ULACCUM_FBIT__ 32
#define __ARM_FEATURE_DSP 1
#define __QQ_IBIT__ 0
#define __ATOMIC_ACQ_REL 4
#define __ATOMIC_RELEASE 3

@srkunze
Author

srkunze commented Jan 28, 2021

> I'm wondering what could be done in this lxc environment to make such a thing fail. We could try to intercept the SIGBUS and dump a trace, but it's painful to do :-/

I still don't want to drop the gdb idea completely because it would make life much easier. But the port stays closed when haproxy is started via gdb's run... :-/

@srkunze
Author

srkunze commented Jan 28, 2021

Port 80 does not open, and neither does 8000. I don't understand this.

@wtarreau
Member

But do you type "run" under gdb, or do you just see the gdb prompt? That hasn't been very clear from the beginning. Once at the gdb prompt you have to type run; after that you get no prompt anymore because gdb is waiting for haproxy to quit (or for you to press Ctrl-C). Also make sure to remove -Ws from your command line under gdb.

@srkunze
Author

srkunze commented Jan 28, 2021

I tried all of these variants (with and without -Ws):

gdb --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg.github
gdb -ex "run" --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg.github
gdb -ex "run" --args /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github
gdb -ex "set follow-fork-mode child" -ex "run" --args /usr/sbin/haproxy -V -d -Ws -f /etc/haproxy/haproxy.cfg.github

@srkunze
Author

srkunze commented Jan 28, 2021

(gdb) run
Starting program: /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github
root@haproxy:~# ss -tulpen
Netid State  Recv-Q Send-Q                   Local Address:Port Peer Address:PortProcess                                                                         
udp   UNCONN 0      0                        127.0.0.53%lo:53        0.0.0.0:*    users:(("systemd-resolve",pid=185,fd=12)) uid:101 ino:76932 sk:4a <->          
udp   UNCONN 0      0                   10.215.46.100%eth0:68        0.0.0.0:*    users:(("systemd-network",pid=183,fd=14)) uid:100 ino:358061 sk:6a <->         
udp   UNCONN 0      0      [fe80::216:3eff:fe28:2db8]%eth0:546          [::]:*    users:(("systemd-network",pid=183,fd=19)) uid:100 ino:76853 sk:4c v6only:1 <-> 
tcp   LISTEN 0      4096                     127.0.0.53%lo:53        0.0.0.0:*    users:(("systemd-resolve",pid=185,fd=13)) uid:101 ino:76933 sk:4d <->          
tcp   LISTEN 0      128                            0.0.0.0:22        0.0.0.0:*    users:(("sshd",pid=247,fd=3)) ino:83254 sk:4e <->                              
tcp   LISTEN 0      128                               [::]:22           [::]:*    users:(("sshd",pid=247,fd=4)) ino:83265 sk:4f v6only:1 <->                     

@wtarreau
Member

Could you press Ctrl-C when gdb is in this situation, then issue "t a a bt" (short for "thread apply all bt")? It looks like the old issue affecting locks on macs that makes the process spin-loop on startup, except it's not a mac.

@srkunze
Author

srkunze commented Jan 28, 2021

root@haproxy:~# gdb --args /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/haproxy...
(No debugging symbols found in /usr/sbin/haproxy)
(gdb) run
Starting program: /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github
^C
Program received signal SIGINT, Interrupt.
0xf7fc81e4 in ?? () from /lib/ld-linux-armhf.so.3
(gdb) t a a bt

Thread 1 (process 7840):
#0  0xf7fc81e4 in ?? () from /lib/ld-linux-armhf.so.3
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

@srkunze
Author

srkunze commented Jan 28, 2021

I think I need the debug symbols for this. 1 second.

@srkunze
Author

srkunze commented Jan 28, 2021

Not much of a difference... :-/

root@haproxy:~# gdb --args /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/haproxy...
Reading symbols from /usr/lib/debug/.build-id/88/286ec23e09ff7b4f013e83c953d72df9384f4c.debug...
(gdb) run
Starting program: /usr/sbin/haproxy -V -d -f /etc/haproxy/haproxy.cfg.github
^C
Program received signal SIGINT, Interrupt.
0xf7fd12fe in ?? () from /lib/ld-linux-armhf.so.3
(gdb) t a a bt

Thread 1 (process 7898):
#0  0xf7fd12fe in ?? () from /lib/ld-linux-armhf.so.3
#1  0xf7fc81e4 in ?? () from /lib/ld-linux-armhf.so.3
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

@wtarreau
Member

It's completely bogus, it didn't even start. I sincerely think gdb doesn't manage to work at all in this environment, so you can't count on it, unfortunately.

@wtarreau
Member

By the way, given that it didn't even start, I guess you'll get the same with any other program (e.g. gdb --args /bin/id).

In my opinion that's enough to consider that this setup cannot be trusted, you should stick to the native architecture.

@srkunze
Author

srkunze commented Jan 28, 2021

> gdb --args /bin/id

Good catch. You are right.

> you should stick to the native architecture

It seems so. I will open another issue for lxd about the gdb problem because I think that should work, right? Maybe then I can come back. What do you think?

PS: Production-wise, I can use arm64 in lxd and everything works just fine. Thanks for your help. I learned a lot so far :-)

@wtarreau
Member

OK, it could likely help the LXC team to get this report, maybe it's a simple issue that was not identified, maybe it's a deeper issue that cannot easily be addressed and they'll need to at least emit a warning on such a setup.

As long as it works fine for you on a native setup, you should stay with this. At this point I think we can close the issue, but feel free to feed this ticket once you get some feedback there.

@wtarreau
Member

So I could install all of this into a chroot and found that the crash happens in XXH64 when hashing an unaligned pattern to be used as a cache key:

(gdb) where
#0  XXH64_endian_align (align=XXH_unaligned, endian=XXH_littleEndian, 
    seed=<optimized out>, len=8, input=<optimized out>) at src/xxhash.c:439
#1  XXH64 (input=<optimized out>, len=8, seed=<optimized out>)
    at src/xxhash.c:493
#2  0x00843684 in pat_match_beg (smp=0xff7ffda8, expr=0x283a078, 
    fill=<optimized out>) at src/pattern.c:688
#3  0x00845bd4 in pattern_exec_match (head=head@entry=0x2839fd4, 
    smp=smp@entry=0xff7ffda8, fill=fill@entry=0) at src/pattern.c:2564

The crash happens when reading the 64-bit value:

434         h64 += (U64) len;
435
436         while (p+8<=bEnd)
437         {
438             U64 k1 = XXH_get64bits(p);
439             k1 *= PRIME64_2;
440             k1 = XXH_rotl64(k1,31);
441             k1 *= PRIME64_1;
442             h64 ^= k1;
443             h64 = XXH_rotl64(h64,27) * PRIME64_1 + PRIME64_4;

XXH64 uses an ldrd instruction here, which doesn't support unaligned accesses:

   0x0089a276 <+1150>:  movt    lr, #10196      ; 0x27d4
   0x0089a27a <+1154>:  movt    r8, #49842      ; 0xc2b2
   0x0089a27e <+1158>:  movt    r7, #34283      ; 0x85eb
   0x0089a282 <+1162>:  movt    r12, #40503     ; 0x9e37
   0x0089a286 <+1166>:  sub.w   r6, r3, #8
   0x0089a28a <+1170>:  add     r11, r3
   0x0089a28c <+1172>:  str     r2, [sp, #0]
   0x0089a28e <+1174>:  mov     r3, r5
   0x0089a290 <+1176>:  mov     r2, r4
   0x0089a292 <+1178>:  adds    r6, #8
=> 0x0089a294 <+1180>:  ldrd    r4, r1, [r6]
   0x0089a298 <+1184>:  mul.w   r0, r8, r4
   0x0089a29c <+1188>:  umull   r4, r5, r4, lr
   0x0089a2a0 <+1192>:  mla     r1, lr, r1, r0
   0x0089a2a4 <+1196>:  add     r5, r1
   0x0089a2a6 <+1198>:  lsls    r0, r4, #31
   0x0089a2a8 <+1200>:  lsls    r1, r5, #31
   0x0089a2aa <+1202>:  orr.w   r0, r0, r5, lsr #1
   0x0089a2ae <+1206>:  orr.w   r4, r1, r4, lsr #1
   0x0089a2b2 <+1210>:  mul.w   r1, r12, r0
   0x0089a2b6 <+1214>:  mla     r4, r7, r4, r1
   0x0089a2ba <+1218>:  umull   r0, r1, r0, r7
   0x0089a2be <+1222>:  eor.w   r5, r0, r2

It's not an easy one though, as the alignment restriction only applies to 64-bit accesses. I have no idea why it doesn't fail on a plain armhf system. Maybe the fault is simply caught and emulated by the kernel.
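
To make the failure mode concrete, here is a minimal sketch contrasting the two ways of fetching 64 bits from an arbitrary byte pointer. The function names `read64_cast` and `read64_memcpy` are invented for this example and are not the xxHash ones; it only illustrates the general technique, not the code that was patched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* For contrast only: on 32-bit ARM the compiler is free to turn this
 * dereference into a single ldrd, which traps when p is misaligned.
 * The cast is also undefined behavior in C if p is not suitably aligned. */
static inline uint64_t read64_cast(const void *p)
{
    return *(const uint64_t *)p;
}

/* memcpy() makes no alignment assumption about p, so the compiler has to
 * pick instructions that are valid for any address. */
static inline uint64_t read64_memcpy(const void *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}
```

With a byte buffer `buf`, `read64_memcpy(buf + 1)` is fine everywhere, whereas `read64_cast(buf + 1)` is the kind of access that can end in a bus error when nothing emulates it.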

@wtarreau
Member

OK so I could address it. The patch splits 32-bit and 64-bit alignment checks. I kept it minimal and sent it to XXH as well. It works for me with this, thus I'm going to merge it.
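
As an illustration of what handling 32-bit and 64-bit accesses separately can look like (this is not the actual patch; the `DEMO_*` macros and `demo_*` names are invented here, and `__attribute__((packed))` assumes GCC or Clang), each width gets its own switch, and the 64-bit one always goes through a packed wrapper:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-width switches (not haproxy or xxHash names).
 * 1: a plain pointer cast is assumed to be tolerated for that width.
 * 0: go through a packed struct so the compiler assumes no alignment. */
#if defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)
#  define DEMO_CAST_OK_32 1   /* assumption: 32-bit loads tolerate any address */
#  define DEMO_CAST_OK_64 0   /* ldrd does not, so never cast here */
#else
#  define DEMO_CAST_OK_32 0
#  define DEMO_CAST_OK_64 0
#endif

struct demo_u32 { uint32_t v; } __attribute__((packed));
struct demo_u64 { uint64_t v; } __attribute__((packed));
static_assert(sizeof(struct demo_u64) == 8, "packed wrapper must not add padding");

static inline uint32_t demo_get32(const void *p)
{
#if DEMO_CAST_OK_32
    return *(const uint32_t *)p;
#else
    return ((const struct demo_u32 *)p)->v;
#endif
}

static inline uint64_t demo_get64(const void *p)
{
#if DEMO_CAST_OK_64
    return *(const uint64_t *)p;
#else
    return ((const struct demo_u64 *)p)->v;
#endif
}
```

The point of the split is that a single "unaligned accesses are fine" flag is too coarse on these CPUs: what holds for 32-bit loads does not hold for 64-bit ones.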

@wtarreau
Member

In the end, the real difference is the kernel: arm64 kernels don't implement alignment emulation, yet these systems can run 100% valid 32-bit arm code that usually relies on kernel emulation for 64-bit unaligned accesses, which is not available on such kernels. As such, the arm 32-bit compatibility on such kernels is not 100%.
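
On 32-bit ARM kernels the alignment-trap policy is visible in /proc/cpu/alignment (the same file comes up in the commit message of the fix). The sketch below reads the current mode from it. It assumes the file contains a line of the form "User faults:<tab>N (action)", which is how I remember 32-bit ARM kernels printing it; the function names are made up for the example. On a kernel without the emulation the file is simply expected to be absent.

```c
#include <assert.h>
#include <stdio.h>

/* Scan an already-open stream for the "User faults:" line and return the
 * numeric mode that follows it, or -1 if no such line is found.
 * The line format is an assumption, see the note above. */
int parse_user_faults_mode(FILE *f)
{
    char line[256];
    int mode;

    assert(f != NULL);
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "User faults: %d", &mode) == 1)
            return mode;
    }
    return -1;
}

/* Returns the mode found in the file at path, or -1 if the file cannot be
 * opened (e.g. no alignment emulation on this kernel) or has no such line. */
int alignment_mode(const char *path)
{
    FILE *f = fopen(path, "r");
    int mode;

    if (!f)
        return -1;
    mode = parse_user_faults_mode(f);
    fclose(f);
    return mode;
}
```

A call such as `alignment_mode("/proc/cpu/alignment")` returning -1 inside the container would be a quick hint that unaligned 64-bit accesses will not be fixed up there.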

@srkunze
Author

srkunze commented Jan 31, 2021

@wtarreau Wow, that's really interesting. So, everything works by accident except a few bits here and there. Just wow.

I'd like to thank you for all the work you've done here. Amazing. Should we make the maintainers of the kernel aware of it?

@wtarreau
Member

I doubt this will be sufficient to get armv8 kernels to implement emulation, since it used to be disabled by default in the past. There are very few programs which make use of unaligned accesses, all those focusing on extreme performance, and it's well known that these approaches are often tricky. We don't yet know when the compiler decides to emit such instructions so it's not even trivial to provide a solid reproducer.

@srkunze
Author

srkunze commented Feb 1, 2021

> very few programs which make use of unaligned accesses, all those focusing on extreme performance

I don't understand everything, but I gather that haproxy is such a tool. So you think that at this level everybody is on their own, right?

haproxy-mirror pushed a commit that referenced this issue Feb 4, 2021
There was a special case made to allow ARMv6 to use unaligned accesses
via a cast in xxHash when __ARM_FEATURE_UNALIGNED is defined. But while
ARMv6 (and v7) does support unaligned accesses, it's only for 32-bit
pointers, not 64-bit ones, leading to bus errors when the compiler emits
an ldrd instruction and the input (e.g. a pattern) is not aligned, as in
issue #1035.

Note that v7 was properly using the packed approach here and was safe,
however haproxy versions 2.3 and older use the old r39 xxhash code which
has the same issue for armv7. A slightly different fix is required there,
by using a different definition of packed for 32 and 64 bits.

The problem is really visible when running v7 code on a v8 kernel because
such kernels do not implement alignment trap emulation, and the process
dies when this happens. This is why in the issue above it was only detected
under lxc. The emulation could have been disabled on v7 as well by writing
zero to /proc/cpu/alignment though.

This commit is a backport of xxhash commit a470f2ef ("update default memory
access for armv6").

Thanks to @srkunze for the report and tests, @stgraber for his help on
setting up an easy reproducer outside of lxc, and @Cyan4973 for the
discussion around the best way to fix this. Details and alternate patches
available on Cyan4973/xxHash#490.
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Feb 5, 2021

(cherry picked from commit 4acb99f)
[wt: used the different version suitable for backporting, using the
 distinct packed settings]
Signed-off-by: Willy Tarreau <w@1wt.eu>
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Feb 6, 2021

(cherry picked from commit 4acb99f)
[wt: used the different version suitable for backporting, using the
 distinct packed settings]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 59ad20e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Feb 6, 2021

(cherry picked from commit 4acb99f)
[wt: used the different version suitable for backporting, using the
 distinct packed settings]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 59ad20e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 5b1f60d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 77eed6c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Feb 6, 2021

(cherry picked from commit 4acb99f)
[wt: used the different version suitable for backporting, using the
 distinct packed settings]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 59ad20e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 5b1f60d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 77eed6c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit c02437c)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Feb 6, 2021

(cherry picked from commit 4acb99f)
[wt: used the different version suitable for backporting, using the
 distinct packed settings]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 59ad20e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 5b1f60d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Mar 5, 2021

(cherry picked from commit 4acb99f)
[wt: used the different version suitable for backporting, using the
 distinct packed settings]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 59ad20e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 5b1f60d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 77eed6c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit c02437c)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
(cherry picked from commit 97cee32)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 939467c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
FireBurn pushed a commit to FireBurn/haproxy that referenced this issue Mar 5, 2021

(cherry picked from commit 4acb99f)
[wt: used the different version suitable for backporting, using the
 distinct packed settings]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 59ad20e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 5b1f60d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 77eed6c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit c02437c)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
(cherry picked from commit 97cee32)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>