Hello Knot developers,
I'm testing 1.3.0-rc4, and have found something that looks like a bug.
I'm running knot using the CentOS upstart supervisor, and in the upstart
script, I have:
pre-stop exec knotc -c $CONF -w stop
This means that when I run "initctl stop knot", upstart will run "knotc
-c /etc/knot/knot.conf -w stop". The "-w" is supposed to make knotc wait
until the server has stopped.
However, in reality this is not happening. When the stop command is
given, Knot logs this:
2013-07-17T22:48:23 Stopping server...
2013-07-17T22:48:23 Server finished.
2013-07-17T22:48:23 Shut down.
And knotc returns *immediately*. However, if I examine the process
table, I see the knotd process still running. It takes knotd about 10
more seconds to actually exit, at 22:48:33. This is problematic for
upstart. Since knotc has returned, but the knotd process hasn't yet
died, upstart thinks that it has not responded to the stop request, and
so upstart uses the sledgehammer (kill -9) to stop the knotd process.
My assumption is that the knotd process is still doing housekeeping
stuff, so the KILL signal is not a good idea. By the looks of it, the
"-w" flag to knotc isn't doing what it's supposed to, ie. wait for the
server to exit. Could you please investigate this and fix it?
(As an aside, I can work around this in upstart by using the option
"kill timeout 60" which will make upstart wait at least 60 seconds
before trying a KILL signal, by which time knotd should have exited. But
this is just a work-around, not a solution).
Regards,
Anand Buddhdev
RIPE NCC