Roll the KSK early and often
ICANN and Verisign have done a stellar job signing the root and publishing the signed zone. The key ceremony was designed and executed with incredible vision and attention to detail. Many TLD operators responded quickly by signing their zones too, and we now have broad implementation of DNSSEC at the top level of the DNS hierarchy. The one area that remains incomplete is the key rollover process.
Standard practice in all cryptographic systems is regular change of the keys. DNSSEC is no exception, and the protocol and operating practices were designed to include changes to the keys in every zone, including the root zone. In addition to changing keys every so often to avoid having the same key in use long enough to become vulnerable, there are two other reasons to change keys. First, the recommended size of keys changes over time as computing power becomes more easily available or better cracking algorithms become known. Second, algorithms tend to have a lifetime, and thus it's necessary to change algorithms occasionally. Over the last few decades we have seen a relatively rapid evolution in hash algorithms from MD4 to MD5 to SHA-1 to SHA-2. We are also in the midst of a transition from RSA as the principal asymmetric signature algorithm to various forms of elliptic curve cryptography functions. For all of these reasons, changes in keys are necessary.
For changes in keys below the very top of the hierarchy, the protocol provides a direct way for the child zone to inform the parent zone of its new key, and to do so in a way that carefully stages the transition so validating resolvers always see a properly signed zone. We have seen many key rollovers below the root. Most of them have worked well, but some of them have not. Individual operators and the community as a whole are still working out the precise details of how to effect key rollovers smoothly.
Most of the issues have involved a dependence on manual processes and a lack of precision in the execution of the necessary sequence of steps. The community is rapidly learning how to do key rollovers more smoothly, and tools are evolving to carry out this process with less human intervention.
Against this backdrop of active learning and improvement in key management and key rollover below the top of the hierarchy, we have the peculiar lack of any activity related to changing the top level key, or root key. The root key, known more precisely as the Root's Key Signing Key (KSK), plays a special role. It is this key, or really the public part of the Root's KSK, that has to be known to every validating resolver around the world. An enormous number of devices now hold the Root's KSK and are enabled to validate signed DNS answers. To facilitate regular changes to the root key, the DNSSEC protocol includes a process, documented in RFC 5011, for propagating a new root key from time to time. This process has not been tested yet, and the community is now being asked for comments in preparation for the first test.
My first comment is that this test should have been carried out much earlier, preferably not long after the root was first signed. Key rollover is an integral part of the overall system, and to leave it untested is very odd, perhaps akin to not testing whether the landing gear on an airplane will work. There's no question the key has to be changed, and when it is eventually changed, the process had better work.
Rolling the top level key will demand attention and care from the operators at Verisign and ICANN. They've already demonstrated great competence and careful execution, so we all expect their part of the process will be flawless. However, they are only a very small part of the overall system. The very large number of validating resolvers make up the other part of the system.
In order for key rollover to work correctly, all of these validating resolvers will have to execute their part of the rollover process correctly. As noted, this has not yet been tested. Is their software implemented correctly? Are the configuration parameters set correctly? Is each of them operated in a mode that will accept the signals to change the key? We certainly hope so, but hope is not the preferred form of assurance. The process must be tested. And not just once, but repeatedly until it's clear the bugs have been wrung out of the system.
So, the way forward is to roll the key not once but at least a few times. This is my second comment: Do it more than once. Leave enough time between key rollovers to learn whatever lessons have to be learned and to make the necessary adjustments, but do it again quickly enough to put the learning to use. My intuition suggests scheduling a handful of key rollovers at three-month intervals, and then settling down to a regular rhythm of rolling the key every two years. I mention these specific intervals to make the advice concrete and vivid, but I am less concerned about the specific numbers than the principles involved.
Until we have strong evidence that the devices and systems that are using the root key are actually able to go through the rollover process without problems, we need to keep stressing the process and helping the entire community focus attention on the rollover process. If we do less than that, we might as well be flying in a plane without confidence it can lower its wheels when it approaches an airport.
Steve Crocker
P.S. This issue was discussed at the Signed Root Symposium in June 2009, a little more than a year before the root was signed. Principals from Verisign, ICANN and NTIA were all in attendance. The report from that symposium was drafted but not completed at the time. The draft is now being posted to provide a record of that discussion almost four years ago. I will send the pointer to it when it's up.