Hi Anand,
that's right. I have created an issue
https://gitlab.labs.nic.cz/knot/issues/99 for it.
I'm still thinking about what the best course of action might be, but I
guess I'll deprecate -w, for several reasons. The major one is that we
need to cut the remote control connection at some point during teardown,
and after that we can only guess how long it will take for the server to
fully stop. Back when 'stop' was based on signals, we could just check
the PID for liveness, but that's not possible now.
I could still shift things around and delay the reply, but that wouldn't
always be 100% correct.
I think when you run knotd supervised, it might still be a better
solution to use signals for stopping and wait for the process to
terminate instead of using 'knotc stop'. Even if I end up shifting
things around and not deprecating -w, does this sound reasonable?
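
For illustration, a minimal sketch of the "signal, then wait for the
process to terminate" approach in Python. It assumes knotd shuts down
cleanly on SIGTERM and that the supervisor already knows the PID; the
helper names pid_alive and stop_and_wait are made up for this example
and are not part of Knot or knotc.

```python
import os
import signal
import time


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks the PID
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_and_wait(pid: int, timeout: float = 60.0, poll: float = 0.1) -> bool:
    """Send SIGTERM, then poll until the process is gone or timeout expires.

    Returns True if the process exited in time, False otherwise.
    Note: if the caller is the parent of the process, the child must also
    be reaped (e.g. with os.waitpid), otherwise it lingers as a zombie and
    still counts as alive here.
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_alive(pid):
            return True
        time.sleep(poll)
    return not pid_alive(pid)
```

A supervisor would call stop_and_wait(pid) and only escalate to KILL if
it returns False, so the server gets time to finish its housekeeping.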
Kind regards,
Marek
On 18 July 2013 01:04, Anand Buddhdev <anandb(a)ripe.net> wrote:
Hello Knot developers,
I'm testing 1.3.0-rc4, and have found something that looks like a bug.
I'm running knot using the CentOS upstart supervisor, and in the upstart
script, I have:
pre-stop exec knotc -c $CONF -w stop
This means that when I run "initctl stop knot", upstart will run "knotc
-c /etc/knot/knot.conf -w stop". The "-w" is supposed to make knotc wait
until the server has stopped.
However, in reality this is not happening. When the stop command is
given, Knot logs this:
2013-07-17T22:48:23 Stopping server...
2013-07-17T22:48:23 Server finished.
2013-07-17T22:48:23 Shut down.
And knotc returns *immediately*. However, if I examine the process
table, I see the knotd process still running. It takes knotd about 10
more seconds to actually exit, at 22:48:33. This is problematic for
upstart. Since knotc has returned, but the knotd process hasn't yet
died, upstart thinks that it has not responded to the stop request, and
so upstart uses the sledgehammer (kill -9) to stop the knotd process.
My assumption is that the knotd process is still doing housekeeping
stuff, so the KILL signal is not a good idea. By the looks of it, the
"-w" flag to knotc isn't doing what it's supposed to, i.e. wait for the
server to exit. Could you please investigate this and fix it?
(As an aside, I can work around this in upstart by using the option
"kill timeout 60" which will make upstart wait at least 60 seconds
before trying a KILL signal, by which time knotd should have exited. But
this is just a work-around, not a solution).
Regards,
Anand Buddhdev
RIPE NCC