Hello,
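When some of the supervised processes are crash-looping (exiting and being respawned continuously, whatever the cause), Supervisor itself fails when it receives a restart command: supervisorctl reports a 500 Internal Server Error from the XML-RPC interface, and the Supervisor log stops after an AssertionError.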
root@a3af610c9776:~# echo 'restart all' | supervisorctl
Consumer00:worker_0 RUNNING pid 24378, uptime 0:00:03
Consumer22:worker_0 RUNNING pid 24369, uptime 0:00:03
Consumer22:worker_1 RUNNING pid 24368, uptime 0:00:03
Consumer33:worker_0 RUNNING pid 24367, uptime 0:00:03
Consumer33:worker_1 RUNNING pid 24366, uptime 0:00:03
Consumer33:worker_2 RUNNING pid 24365, uptime 0:00:03
Consumer44:worker_0 RUNNING pid 24376, uptime 0:00:03
Consumer55:worker_0 RUNNING pid 24370, uptime 0:00:03
Consumer66:worker_0 RUNNING pid 24372, uptime 0:00:03
Consumer66:worker_1 RUNNING pid 24371, uptime 0:00:03
Consumer77 RUNNING pid 24375, uptime 0:00:03
Consumer88:worker_0 RUNNING pid 24374, uptime 0:00:03
Consumer88:worker_1 RUNNING pid 24373, uptime 0:00:03
Consumer99:worker_0 RUNNING pid 24377, uptime 0:00:03
supervisor> Consumer33:worker_1: stopped
FAILED: unknown problem killing worker_1 (24382):Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/supervisor/process.py", line 428, in kill
    options.kill(pid, sig)
  File "/usr/lib/python2.7/dist-packages/supervisor/options.py", line 1229, in kill
    os.kill(pid, signal)
OSError: [Errno 3] No such process
Consumer77: stopped
Consumer44:worker_0: stopped
Consumer00:worker_0: stopped
Consumer33:worker_2: stopped
Consumer33:worker_0: stopped
Consumer22:worker_0: stopped
Consumer55:worker_0: stopped
Consumer66:worker_1: stopped
Consumer66:worker_0: stopped
Consumer88:worker_1: stopped
Consumer88:worker_0: stopped
Consumer99:worker_0: stopped
error: <class 'xmlrpclib.ProtocolError'>, <ProtocolError for 127.0.0.1/RPC2: 500 Internal Server Error>: file: /usr/lib/python2.7/dist-packages/supervisor/xmlrpc.py line: 501
supervisor>
root@a3af610c9776:~#
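The OSError in the output above is consistent with a race: a crash-looping worker exits on its own between the moment the stop is requested and the moment the signal is sent, so os.kill() is called with a PID that no longer exists. Below is a minimal sketch of tolerating that case. It is not Supervisor's code; kill_if_alive and demo are hypothetical names used only for illustration, and the demo sends signal 0 (an existence check) rather than a real signal.

```python
import errno
import os
import signal
import subprocess
import sys


def kill_if_alive(pid, sig=signal.SIGTERM):
    """Send sig to pid. Return False instead of raising if the pid is gone."""
    try:
        os.kill(pid, sig)
    except OSError as exc:
        if exc.errno == errno.ESRCH:  # [Errno 3] No such process
            return False
        raise  # any other error (e.g. EPERM) is still a real problem
    return True


def demo():
    """Start a child that exits at once, reap it, then try to signal it."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()  # the child has exited and been reaped
    # Signal 0 only checks whether the pid exists; nothing is delivered.
    return kill_if_alive(child.pid, 0)
```

Under these assumptions, demo() reports that the child was already gone instead of propagating the OSError seen in the traceback.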
log:
.
.
.
2020-11-09 15:59:20,300 INFO exited: worker_1 (exit status 1; not expected)
2020-11-09 15:59:20,558 INFO spawned: 'worker_1' with pid 22270
2020-11-09 15:59:20,573 INFO success: worker_1 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:20,592 INFO exited: worker_0 (exit status 1; not expected)
2020-11-09 15:59:20,860 INFO spawned: 'worker_0' with pid 22271
2020-11-09 15:59:20,869 INFO success: worker_0 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:20,894 INFO exited: worker_0 (exit status 1; not expected)
2020-11-09 15:59:21,182 INFO spawned: 'worker_0' with pid 22272
2020-11-09 15:59:21,192 INFO success: worker_0 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:21,217 INFO exited: worker_1 (exit status 1; not expected)
2020-11-09 15:59:21,484 INFO spawned: 'worker_1' with pid 22273
2020-11-09 15:59:21,493 INFO success: worker_1 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:21,516 INFO exited: worker_1 (exit status 1; not expected)
2020-11-09 15:59:21,818 INFO spawned: 'worker_1' with pid 22274
2020-11-09 15:59:21,829 INFO success: worker_1 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:21,852 INFO exited: worker_0 (exit status 1; not expected)
2020-11-09 15:59:22,145 INFO spawned: 'worker_0' with pid 22277
2020-11-09 15:59:22,157 INFO success: worker_0 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:22,191 INFO exited: worker_0 (exit status 1; not expected)
2020-11-09 15:59:22,194 INFO spawned: 'worker_0' with pid 22278
2020-11-09 15:59:22,195 INFO success: worker_0 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:22,208 INFO stopped: worker_2 (terminated by SIGTERM)
2020-11-09 15:59:22,208 INFO stopped: worker_1 (terminated by SIGTERM)
2020-11-09 15:59:22,217 INFO stopped: worker_0 (terminated by SIGTERM)
2020-11-09 15:59:22,225 CRIT unknown problem killing worker_0 (22278):Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/supervisor/process.py", line 428, in kill
    options.kill(pid, sig)
  File "/usr/lib/python2.7/dist-packages/supervisor/options.py", line 1229, in kill
    os.kill(pid, signal)
OSError: [Errno 3] No such process
2020-11-09 15:59:22,227 INFO stopped: worker_1 (terminated by SIGTERM)
2020-11-09 15:59:22,233 INFO waiting for worker_0 to stop
2020-11-09 15:59:22,233 INFO waiting for worker_1 to stop
2020-11-09 15:59:22,233 INFO waiting for worker_0 to stop
2020-11-09 15:59:22,233 INFO waiting for worker_1 to stop
2020-11-09 15:59:22,233 INFO waiting for worker_0 to stop
2020-11-09 15:59:22,233 INFO waiting for Consumer77 to stop
2020-11-09 15:59:22,233 INFO waiting for worker_0 to stop
2020-11-09 15:59:22,233 INFO waiting for worker_0 to stop
2020-11-09 15:59:22,234 INFO waiting for worker_0 to stop
2020-11-09 15:59:22,235 INFO stopped: worker_0 (terminated by SIGTERM)
2020-11-09 15:59:22,237 INFO stopped: Consumer77 (terminated by SIGTERM)
2020-11-09 15:59:22,243 INFO stopped: worker_1 (terminated by SIGTERM)
2020-11-09 15:59:22,251 INFO stopped: worker_0 (terminated by SIGTERM)
2020-11-09 15:59:22,251 INFO stopped: worker_0 (terminated by SIGTERM)
2020-11-09 15:59:22,252 INFO stopped: worker_0 (terminated by SIGTERM)
2020-11-09 15:59:22,254 INFO stopped: worker_0 (terminated by SIGTERM)
2020-11-09 15:59:22,255 INFO stopped: worker_0 (terminated by SIGTERM)
2020-11-09 15:59:22,256 INFO stopped: worker_1 (terminated by SIGTERM)
2020-11-09 15:59:23,095 INFO spawned: 'worker_2' with pid 22279
2020-11-09 15:59:23,095 INFO success: worker_2 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:23,097 INFO spawned: 'worker_1' with pid 22280
2020-11-09 15:59:23,097 INFO success: worker_1 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:23,099 INFO spawned: 'worker_0' with pid 22281
2020-11-09 15:59:23,099 INFO success: worker_0 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:23,101 INFO spawned: 'worker_1' with pid 22282
2020-11-09 15:59:23,105 INFO success: worker_1 entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2020-11-09 15:59:23,120 ERRO XML-RPC response callback error:Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/supervisor/xmlrpc.py", line 70, in more
    value = self.callback()
  File "/usr/lib/python2.7/dist-packages/supervisor/rpcinterface.py", line 896, in allfunc
    callback = func(name, **extra_kwargs)
  File "/usr/lib/python2.7/dist-packages/supervisor/rpcinterface.py", line 281, in startProcess
    process.spawn()
  File "/usr/lib/python2.7/dist-packages/supervisor/process.py", line 206, in spawn
    ProcessStates.BACKOFF, ProcessStates.STOPPED)
  File "/usr/lib/python2.7/dist-packages/supervisor/process.py", line 179, in _assertInState
    self.config.name, current_state, allowable_states))
AssertionError: Assertion failed for worker_0: UNKNOWN not in EXITED FATAL BACKOFF STOPPED
Nothing further is logged after this line.
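The final traceback shows spawn() refusing to run because worker_0 was in the UNKNOWN state. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the failed kill logged earlier left the process in UNKNOWN, and the start half of the restart then hit the state assertion. The toy model below only imitates that sequence; ToyProcess, kill_failed and ALLOWED_BEFORE_SPAWN are hypothetical names and are not taken from Supervisor's source.

```python
# States from which the traceback says spawn() is allowed.
ALLOWED_BEFORE_SPAWN = ("EXITED", "FATAL", "BACKOFF", "STOPPED")


class ToyProcess(object):
    """Toy stand-in for a supervised process; not Supervisor's class."""

    def __init__(self, name, state="STOPPED"):
        self.name = name
        self.state = state

    def kill_failed(self):
        # Assumption: an unexpected error while killing marks the state UNKNOWN.
        self.state = "UNKNOWN"

    def spawn(self):
        # Mirrors the shape of the assertion message seen in the log.
        if self.state not in ALLOWED_BEFORE_SPAWN:
            raise AssertionError(
                "Assertion failed for %s: %s not in %s"
                % (self.name, self.state, " ".join(ALLOWED_BEFORE_SPAWN))
            )
        self.state = "RUNNING"
```

Calling kill_failed() and then spawn() on a ToyProcess named worker_0 raises an AssertionError with the same text as the last log line. If the assumption holds, handling the "No such process" case in the kill path would keep the process out of UNKNOWN and let the restart proceed.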
env:
root@a3af610c9776:~# uname -a
Linux a3af610c9776 4.14.193-113.317.amzn1.x86_64 #1 SMP Thu Sep 3 19:08:08 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
root@a3af610c9776:~# apt-cache policy supervisor
supervisor:
  Installed: 3.2.0-2ubuntu0.2
  Candidate: 3.2.0-2ubuntu0.2
  Version table:
 *** 3.2.0-2ubuntu0.2 500
        500 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/universe amd64 Packages
        100 /var/lib/dpkg/status
     3.2.0-2 500
        500 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages
(The log excerpt and the command output come from two different failures, so the PIDs in them do not match.)