Hey Anand,
at the moment, the timer database contains just serialised running timers for
the zone:
- next refresh (SOA query sent to master)
- zone expiration (zone content invalidated)
- next flush (zone file synchronization with memory/journal)
- next transfer (AXFR/IXFR update of zone)
The above was the obvious solution when we implemented this feature for 1.6.0. It
has several serious problems though:
- Refresh and transfer are entangled. These events schedule each other, which
is error prone. In addition, there are two separate code paths. One covers
the common case. The other is used for expired zones, where refresh is never
used and transfer basically means bootstrap.
- Current bootstrap interval is not preserved.
- The zone expiration timer is started at the moment when a refresh fails. So a
zone cannot expire if the server was correctly shut down and not started for
a long time.
- Zone flush is planned at the moment when content of the zone is changed. If
you change the zonefile-sync config option and reload the configuration,
existing flush timers won't be updated.
- And possibly some more.
So we decided some time ago to refactor the scheduling and timers. But it
involves a lot of code changes and I can no longer spend 100% of my time
on that. So please be patient. :-)
The new code will merge refresh and transfer events into one, which will:
- Simplify the logic, making it a bit more error resistant.
- Reduce the load on the event scheduler.
- Allow the server to reuse the existing TCP connection from SOA query to
AXFR/IXFR. (Useful for anycasted instances.)
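A rough sketch of what such a merged event could look like, in Python rather
than the server's C, just to show the control flow. The function names and the
callback-based interface are made up for illustration and are not the actual
Knot code. The two callbacks are assumed to share one TCP connection to the
master.

```python
from typing import Callable, Optional


def serial_is_newer(remote: int, local: int) -> bool:
    """RFC 1982 serial number comparison for 32-bit SOA serials."""
    diff = (remote - local) % 2**32
    return 0 < diff < 2**31


def refresh_event(local_serial: Optional[int],
                  query_soa: Callable[[], int],
                  transfer: Callable[[], int]) -> Optional[int]:
    """Single merged refresh/transfer event (hypothetical interface).

    local_serial is None for an expired zone. query_soa and transfer are
    assumed to reuse the same connection; each returns a SOA serial.
    Returns the new serial, or None if the zone was already up to date.
    """
    if local_serial is not None:
        # Live zone: ask for the SOA first and transfer only if needed.
        if not serial_is_newer(query_soa(), local_serial):
            return None
    # Expired zone (bootstrap) or a newer serial on the master.
    return transfer()
```

With one event there is no mutual scheduling left: the expired-zone case is
just the same event with the SOA query skipped.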
The timer database will contain the following values:
- SOA expire value
- last refresh
- next planned refresh
- last flush
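To make that concrete, here is a minimal sketch of the four values as one
per-zone record. The class name, the field names and the JSON encoding are
assumptions for illustration only. They are not the real on-disk format of
the timer database.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ZoneTimers:
    """Hypothetical per-zone timer record."""
    soa_expire: int    # SOA expire value, in seconds
    last_refresh: int  # Unix time of the last refresh
    next_refresh: int  # Unix time of the next planned refresh
    last_flush: int    # Unix time of the last zone file flush

    def serialize(self) -> bytes:
        # JSON is only a stand-in for whatever encoding the database uses.
        return json.dumps(asdict(self), sort_keys=True).encode("ascii")

    @classmethod
    def deserialize(cls, raw: bytes) -> "ZoneTimers":
        return cls(**json.loads(raw.decode("ascii")))
```

Note that the record stores facts about the past plus one planned time,
instead of a set of running timers.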
With respect to the problems above, this should fix them and also simplify
the scheduling:
- The last refresh value is the starting point for computing zone expiration,
and also for correctly computing the bootstrap interval for expired zones.
- Next refresh is when we try to update the zone. That means issuing a zone
transfer (preceded by a SOA query for live zones).
- Last flush just carries the information when we changed the zone for the
last time. So the next zone file flush can be scheduled simply even after a
configuration change.
- SOA expire value clearly indicates when a zone will expire. This allows the
server to expire the zones even when the server was shut down.
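The arithmetic behind these points is simple. A small sketch, assuming all
times are Unix timestamps and that the zonefile-sync option is a non-negative
interval in seconds. The function names are invented for this example.

```python
def expiration_time(last_refresh: int, soa_expire: int) -> int:
    """The zone expires soa_expire seconds after the last refresh."""
    return last_refresh + soa_expire


def is_expired(last_refresh: int, soa_expire: int, now: int) -> bool:
    """Can be evaluated at startup, however long the server was down."""
    return now >= expiration_time(last_refresh, soa_expire)


def next_flush_time(last_flush: int, zonefile_sync: int) -> int:
    """Recomputed from the stored value whenever the config changes."""
    return last_flush + zonefile_sync


# Server stopped shortly after a refresh at t=1000 and restarted at t=10000;
# with an SOA expire of 3600 the zone is correctly treated as expired.
assert is_expired(last_refresh=1000, soa_expire=3600, now=10000)

# After a reload that changes the sync interval from 3600 to 60 seconds,
# the next flush simply moves from 4550 to 1010.
assert next_flush_time(last_flush=950, zonefile_sync=3600) == 4550
assert next_flush_time(last_flush=950, zonefile_sync=60) == 1010
```

Because nothing here depends on a timer that was started earlier, both a
restart and a configuration reload get the right answer from the stored
values alone.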
Authoritative servers are complicated. :-) Does it make sense?
Regards,
Jan
On Fri, Oct 21, 2016 at 2:03 PM, Anand Buddhdev <anandb(a)ripe.net> wrote:
On 21/10/16 13:26, Ondřej Surý wrote:
Hi Ondrej,
we are working on a tool to sneak peek into timers database and manipulate the
timers database.
[snip]
Thanks for this information. My reason for asking was to trigger
discussion about what *should* be in the timer database, to enable Knot
to gracefully handle all the various scenarios with zones when it's a slave.
However, during the progress of issue #479, I've had discussions with
Daniel and Jan, and understood that there is much better code coming
soon for timers, so things should be much better in Knot's future. For
now, the handful of fixes in #479 work well-enough for my common failure
cases, and if they're rolled into a 2.3.2 release, that would be just fine.
I guess the major timer changes will probably go into 2.4, right?
Regards,
Anand
_______________________________________________
knot-dns-users mailing list
knot-dns-users(a)lists.nic.cz
https://lists.nic.cz/cgi-bin/mailman/listinfo/knot-dns-users