restore, key copy failed (already exists)


Einar Bjarni Halldórsson

27 Nov 2025

16:08

Hi,

In our setup, we have one active signer and one backup signer. Both use softhsm, but only the active signer does automatic key management. There is an hourly cron job that syncs keys from the active to the backup signer. It runs knotc zone-backup on the active signer, backing up only the kaspdb. It then syncs the files over to the secondary and runs knotc zone-restore. This has been running for a few years now without problems.

These last two weeks we've been performing algorithm rollovers for some of our zones, and after we run `knotc zone-ksk-submitted nic.is` we start seeing these errors when zone-restore is run on the backup:

error: [nic.is.] zone event 'backup/restore' failed (already exists)
warning: [nic.is.] zone restore failed (already exists)
warning: [nic.is.] restore, key copy failed (already exists)

I searched the Knot DNS source code, but couldn't find where these errors are output.

Like I said, we've been running like this for a few years, doing regular ZSK rollovers and a few KSK rollovers, without problems. There's something about the algorithm rollover that causes this problem with our setup.

I assume I can just delete the keys on the secondary and sync again, but I want to understand what causes these errors so we can avoid them, or at least document them in our process.

.einar


David Vasek

28 Nov

11:38

Hi Einar,

thank you for your report. We are investigating the issue.

Just to be sure: are all your keys in softhsm only? If I understand it right, you sync both the Knot configuration and the softhsm data from the active to the backup signer first, don't you?

Regards,
David

Einar Bjarni Halldórsson

11:44

Hi David,


On 28 Nov 2025, at 10:38, David Vasek <david.vasek(a)nic.cz> wrote: thank you for your report. We are investigating the issue. Just to be sure - do you have all keys in softhsm's only? If I understand it right, you sync both Knot configuration and softhsm data from the active to the backup signer first, don't you?

All keys are in softhsm only and we only sync the softhsm data. The signers were running 3.5.0 from ports on FreeBSD 14.3. I saw some changes regarding keys in 3.5.1 and 3.5.2, so I just tried upgrading, but the issue persists.

On the active signer we run:

knotc zone-backup +backupdir /tmp/somedir +nozonefile +nojournal +notimers \
    +kaspdb +nocatalog

On the backup signer we run:

knotc zone-restore +backupdir /tmp/somedir +nozonefile +nojournal +notimers \
    +kaspdb +nocatalog

I tested removing the keys from /var/db/knot/keys on the secondary and synced without errors. Subsequent syncs also run without errors.

.einar
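For readers following along, the sync described above might be sketched as below. This is not the poster's actual script (that was sent as an attachment and is not shown in the thread). The host name, the `DRY_RUN` switch, and the `run` and `sync_keys` helpers are hypothetical; only the `knotc` flags and the rsync `--delete` option come from the thread.

```sh
#!/bin/sh
# Hypothetical sketch of an hourly KASP DB sync; names below are illustrative.
BACKUP_DIR="${BACKUP_DIR:-/tmp/somedir}"
BACKUP_HOST="${BACKUP_HOST:-backup-signer.example}"   # assumed host name

# Filters quoted in the thread: back up and restore the KASP DB only.
KNOT_FLAGS="+backupdir $BACKUP_DIR +nozonefile +nojournal +notimers +kaspdb +nocatalog"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

sync_keys() {
    # 1. Clean the backup directory on the active signer.
    run rm -rf "$BACKUP_DIR" || return 1
    # 2. Dump the KASP DB (KNOT_FLAGS is split into words on purpose).
    run knotc zone-backup $KNOT_FLAGS || return 1
    # 3. Mirror the backup directory to the backup signer.
    run rsync -a --delete "$BACKUP_DIR/" "$BACKUP_HOST:$BACKUP_DIR/" || return 1
    # 4. Restore on the backup signer.
    run ssh "$BACKUP_HOST" knotc zone-restore $KNOT_FLAGS || return 1
}
```

Calling `sync_keys` with `DRY_RUN=1` set only prints the four commands, which is a cheap way to review the sequence before letting cron run it.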

Daniel Salzman

16:09

Einar,

Are you able to reproduce the issue with a different key set?

How do you synchronize data in softhsm? Do you simply replace the whole directory?

Thanks,
Daniel

Einar Bjarni Halldórsson

16:34

Hi Daniel,


On 28 Nov 2025, at 15:09, Daniel Salzman via knot-dns-users <knot-dns-users(a)lists.nic.cz> wrote: Are you able to reproduce the issue with a different key set?

Yes, but always the same kind, RSASHA256 outgoing, ECDSAP256SHA256 incoming. Starts happening after I run knotc zone-ksk-submitted.


How do you synchronize data in softhsm? Do you simply replace the whole directory?

The script cleans the destination directory, rsyncs the results of knotc zone-backup and runs knotc zone-restore on the destination host. .einar

Daniel Salzman

29 Nov

18:47

Hi Einar,

To be clear, if you use a PKCS #11 keystore, the zone backup doesn't and can't back up the stored private keys. It only backs up metadata stored in the KASP DB. Therefore, you must also synchronize the contents of the HSM. In the case of SoftHSM, you just copy the tokens directory.

In my opinion, SoftHSM is perfect for testing or if your software requires a PKCS #11 device (OpenDNSSEC), but for production (with Knot DNS) it only complicates the setup without providing significant security benefits. I would recommend migrating to a PEM keystore.

Daniel

Einar Bjarni Halldórsson

20:22


On 29 Nov 2025, at 17:47, Daniel Salzman <daniel.salzman(a)nic.cz> wrote: To be clear, if you use a PKCS #11 keystore, the zone backup doesn't and can't back up the stored private keys. It only backs up metadata stored in the KASP DB. Therefore, you must also synchronize contents of the HSM. In the case of SoftHSM, you just copy the tokens directory.

Sorry for the misunderstanding, I incorrectly used softhsm to mean “not HSM”. We *are* using the PEM keystore. .einar

Libor Peltan

1 Dec

12:32

Hi Einar,

thank you for your bug report :) We are trying to reproduce your observations, but without luck so far.

Anyway, it would be useful if you could provide us with more complete information, mostly (at least) about the server where you observe the issue (which is, I assume, the backup signer where the keys are being restored to):

- Knot DNS version
- configuration file (or at least the relevant parts; don't forget to remove any TSIG secrets or sensitive IPs)
- longer log snippets around the time the issue was observed
- the script that you use for the backup (or at least the relevant parts; unless it is somehow sensitive)
- maybe also the directory with the backup whose "restore" triggers the issue (don't forget to delete the contents of all the PEM files in it!!, and note that data.mdb only contains public keys)

I'd also have some more questions to get a complete picture of the situation:

1) Is it possible that the issue is not really triggered by the algorithm rollover, but by a Knot DNS version upgrade? Have you upgraded Knot DNS recently?
2) Do you use PKCS#11 in any way (either an HSM or SoftHSM), or just PKCS#8 (PEM files directly accessed by Knot)?
3) Do you somehow clean up the destination Knot's directories before calling zone-restore?
4) Do you somehow clean up the target directory on the active signer before performing zone-backup into that directory (or do you always create a fresh empty directory for the purpose)?
5) When manipulating the backup directory, do you somehow write its content into an existing directory with an older version of the backup in it?

Thank you very much for providing at least some of those!

Libor

Einar Bjarni Halldórsson

13:05

Hi Libor,


On 1 Dec 2025, at 11:32, Libor Peltan <libor.peltan(a)nic.cz> wrote: Anyway, it would be useful if you provide us with more complete information, mostly (at least) about the server where you do observe the issue (which is, I assume, the backup signer where the keys are being restored to): - Knot DNS version

Initially seen on 3.5.0, but I tried upgrading to 3.5.2 and still get the error there. Both are running on FreeBSD 14.3-RELEASE.

- configuration file (or at least relevant parts; don't forget to remove any TSIG secrets or sensitive IPs)

Attached config files.


- longer log snippets around the time the issue was observed

Complete log file attached. Key sync runs every hour, 8 mins. past the hour.


- the script that you use for the backup (or at least relevant parts; unless it is somehow sensitive)

Scripts attached. sync_keys.sh is run on the active signer every hour; sync_keys_wrapper.sh is on the backup and is used so that authorized_keys permits only the allowed operations.

- maybe also the directory with the backup whose "restore" triggers the issue (don't forget to delete the contents of all the PEM files in it!!, and note that data.mdb only contains public keys)

I will do that once I’ve triggered the error again.


I'd also have some more questions to make a complete picture about the situation: 1) Is it possible that the issue is not really triggered by algorithm rollover, but by Knot DNS version upgrade? Have you upgraded Knot DNS recently?

Last upgrade before the errors appeared was October 2nd 3.4.8 -> 3.5.0. The errors start on November 24th, when we started doing algorithm rollovers.

2) Do you use PKCS#11 in any way (either an HSM or SoftHSM), or just PKCS#8 (PEM files directly accessed by Knot)?

Just PKCS#8.


3) Do you somehow clean up the destination Knot's directories before calling zone-restore?

Yes, see attached scripts.


4) Do you somehow clean up the target directory on the active signer before performing zone-backup into that directory (or you always create fresh empty directory for the purpose)?

We run rsync with '--delete' but do not clean the directory before syncing.

5) When manipulating with the backup directory, do you somehow write its content into an existing directory with an older version of the backup in it?

On the primary, the backup directory is cleaned completely before performing the backup.

Without doing anything to the backup directories, if I remove the /var/db/knot/keys directory on the secondary and run the sync, the problem goes away.

.einar
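Before wiping the keystore on the secondary, it may help to record which key files collide. The helper below is a minimal sketch that assumes the "already exists" message refers to a PEM file already present in the destination keystore, which the thread does not confirm. The function name `common_keys` is hypothetical, and both directory arguments (where the PEM files sit inside the backup, and the destination keystore) have to be supplied by the operator.

```sh
#!/bin/sh
# List *.pem file names present in both directories and say whether the
# contents match. Directory paths are passed in; nothing here is Knot-specific.
common_keys() {
    src_dir=$1   # directory holding the PEM files from the backup
    dst_dir=$2   # keystore directory on the backup signer
    for f in "$src_dir"/*.pem; do
        [ -e "$f" ] || continue          # glob matched nothing
        name=$(basename "$f")
        if [ -e "$dst_dir/$name" ]; then
            if cmp -s "$f" "$dst_dir/$name"; then
                echo "$name identical"
            else
                echo "$name differs"
            fi
        fi
    done
}
```

Saving this output alongside the logs before removing /var/db/knot/keys would preserve the state that triggers the error for later analysis.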

Einar Bjarni Halldórsson

15:13


On 1 Dec 2025, at 11:32, Libor Peltan <libor.peltan(a)nic.cz> wrote: 1) Is it possible that the issue is not really triggered by algorithm rollover, but by Knot DNS version upgrade? Have you upgraded Knot DNS recently?

I just ran `knotc zone-ksk-submitted` on three different servers, all with zones migrating from RSASHA256 to ECDSAP256SHA256 and I’m not seeing the error (yet). All three sets of servers are running Knot 3.5.2 on FreeBSD 14.3. Either the error happens later, when the old keys are purged, or the error has been fixed between 3.5.0 and 3.5.2. I did upgrade a server to 3.5.2 and saw the error, but that was after rollover had finished on the primary when it was running 3.5.0. I’m going to attempt to downgrade a server to 3.5.0 and perform an algorithm rollover and sync. If the error appears, we’ll know it’s in the rollover itself where some state is produced which causes the error. .einar

Einar Bjarni Halldórsson

2 Dec

10:17


On 1 Dec 2025, at 14:13, Einar Bjarni Halldórsson via knot-dns-users <knot-dns-users(a)lists.nic.cz> wrote: I just ran `knotc zone-ksk-submitted` on three different servers, all with zones migrating from RSASHA256 to ECDSAP256SHA256 and I’m not seeing the error (yet). All three sets of servers are running Knot 3.5.2 on FreeBSD 14.3. Either the error happens later, when the old keys are purged, or the error has been fixed between 3.5.0 and 3.5.2. I did upgrade a server to 3.5.2 and saw the error, but that was after rollover had finished on the primary when it was running 3.5.0. I’m going to attempt to downgrade a server to 3.5.0 and perform an algorithm rollover and sync. If the error appears, we’ll know it’s in the rollover itself where some state is produced which causes the error.

I just set up a test environment running 3.5.0, but I can't reproduce the error. There must have been some legacy rot on the signers which caused it.

.einar

Einar Bjarni Halldórsson

1 Dec

12:31


On 28 Nov 2025, at 15:09, Daniel Salzman via knot-dns-users <knot-dns-users(a)lists.nic.cz> wrote: Are you able to reproduce the issue with a different key set?

I have one domain on the staging signers waiting for KSK submission. It’s completing an algorithm rollover from RSASHA256 to ECDSAP256SHA256. Once I run `knotc zone-ksk-submitted` it will trigger the error. Do you have ideas for things I can do before I trigger the error, to collect more and better data? .einar

Libor Peltan

12:34

One thing that came to my mind is to attach strace to the running Knot process, with a filter on the open() syscall (and/or other file-manipulation syscalls), and see which key- and backup-related files are being accessed and how.

/Libor
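The strace suggestion above could look roughly like the sketch below. It assumes a Linux-style `strace` binary is installed; the signers in this thread run FreeBSD, where `truss` or `ktrace` would be the native counterparts. The function name, the default output path, and the `STRACE` override are hypothetical.

```sh
#!/bin/sh
# Attach to a running process and log file-related syscalls only.
#   -f          follow all threads and child processes
#   -tt         prefix each line with a wall-clock timestamp
#   -e trace=%file   restrict tracing to syscalls that take a file name
#   -o          write the trace to a file instead of stderr
trace_knot_files() {
    pid=$1
    out=${2:-/tmp/knotd-files.trace}   # assumed output location
    ${STRACE:-strace} -f -tt -e trace=%file -p "$pid" -o "$out"
}
```

A typical call would be `trace_knot_files "$(pgrep -x knotd)"`, started shortly before the hourly sync and stopped with Ctrl-C once zone-restore has failed. Older strace releases spell the filter `trace=file` without the percent sign.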


knot-dns-users@lists.nic.cz


participants (4)

Daniel Salzman
David Vasek
Einar Bjarni Halldórsson
Libor Peltan