Slightly off topic, but the author recommends expiry periods for certificates. Other than rotating a key because of compromise, this seems unnecessary. We've finally decided that password rotation should no longer be mandatory, so why should certificate rotation be?
If it's for SSH, you should be able to blacklist that particular certificate. You want to do this to revoke access anyway. The only downside is an ever-growing blacklist file, but that is probably a problem you'd need centuries of use to actually have to solve.
Forgive my ignorance, but what's the correlation between expiry and revoking? I'd assume there are several reasons you'd want to revoke a certificate, such as it having been compromised. How would its expiry date help with that?
Author here. My take on this is that fail-closed is a vastly better security model than fail-open. I am genuinely surprised that OpenSSH actually issues certificates with no expiry date as a default.
If you have a certificate which expires within a day by default then an unsuccessful revocation is no longer a huge cause of stress. In the worst case, you lock down access to your bastions and disallow the issue of any future certificates for that user. Within a day, any potential threat from that certificate has vanished. This seems preferable to having a mandatory requirement of an up-to-date revocation database which is synced everywhere.
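As a rough illustration of the one-day default described above, here is a minimal Python sketch that builds an `ssh-keygen` signing command with a bounded validity interval. The flags used (`-s`, `-I`, `-n`, `-V`, `-z`) are standard `ssh-keygen` options; the file names, key ID and principal are placeholders, and the helper name `short_lived_cert_command` is hypothetical rather than anything from the article.

```python
import shlex


def short_lived_cert_command(ca_key, user_pubkey, key_id, principals,
                             validity="+1d", serial=None):
    """Build an ssh-keygen command that signs user_pubkey with a bounded lifetime.

    validity is passed straight to -V; "+1d" means "from now until one day from now".
    """
    cmd = [
        "ssh-keygen",
        "-s", ca_key,                # CA private key used for signing
        "-I", key_id,                # key ID, which shows up in sshd logs
        "-n", ",".join(principals),  # principals the certificate is valid for
        "-V", validity,              # validity interval (the point of this sketch)
    ]
    if serial is not None:
        cmd += ["-z", str(serial)]   # serial number, useful for later revocation
    cmd.append(user_pubkey)
    return cmd


# Placeholder names: adjust to your own CA key, public key and user.
cmd = short_lived_cert_command(
    "ca_key", "id_ed25519.pub", "alice@example.com", ["alice"], serial=42
)
print(shlex.join(cmd))
```

The command is only built and printed here, not executed, so it can be reviewed or wrapped in whatever issuing service you already have.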
Just to make sure I understand your example, for users who require regular access you re-issue certificates daily? I could see that being useful for a "one off" type thing (i.e. you want to temporarily grant access for one day) but how does that help regular users?
I'm also not sure it's easier to "lock down access to your bastions" and wait out the certificate expiration than to maintain a certificate revocation database. Although OpenSSH does not provide a mechanism to distribute the revocation list, it seems trivial to add a certificate to the revocation list and distribute it in an automated fashion.
Lastly, since you have to both lock down hosts and wait out the expiration, does that not constitute a fail-open system? I really don't think an expiration date mechanism makes this a fail-closed system. Either method requires manual intervention upon compromise.
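For comparison, the revocation route might be sketched like this in Python. `ssh-keygen -k` (with `-u` to update an existing list) and the `RevokedKeys` directive in `sshd_config` are real OpenSSH features; the use of `scp` as the distribution step, the host names and the paths are assumptions made purely for illustration, and `revoke_commands` is a hypothetical helper.

```python
def revoke_commands(krl_path, cert_paths, hosts, krl_exists=True):
    """Build the commands to add certificates to a KRL and copy it to each host.

    Each host's sshd_config is assumed to contain a line such as:
        RevokedKeys /etc/ssh/revoked_keys
    """
    update = ["ssh-keygen", "-k"]
    if krl_exists:
        update.append("-u")  # add to the existing KRL instead of replacing it
    update += ["-f", krl_path, *cert_paths]

    # Distribution is outside OpenSSH; scp is just one possible mechanism.
    pushes = [["scp", krl_path, f"{host}:{krl_path}"] for host in hosts]
    return [update] + pushes


# Placeholder certificate and hosts.
for command in revoke_commands(
    "/etc/ssh/revoked_keys",
    ["alice-cert.pub"],
    ["bastion1.example.com", "bastion2.example.com"],
):
    print(" ".join(command))
```

The thread's disagreement is about what happens when one of those copy steps silently fails: a host that never receives the updated file keeps accepting the certificate.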
Yes, even for very regular users I would recommend setting up a process requiring users to get a new certificate on a daily basis with a short validity period. You can automate a lot of this and make it a simple one-command process to get a new certificate - even something like a simple shell script called by ProxyCommand is a good habit to get into. In bigger organisations you'd likely want to centralise this process somehow or institute other tooling.
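A wrapper of the kind mentioned above mostly needs to answer one question: is the current certificate still good, or should a new one be requested first? The Python sketch below parses the `Valid:` line from a certificate listing and decides. It assumes the listing comes from `ssh-keygen -L -f <cert>` and looks like `Valid: from 2024-01-01T09:00:00 to 2024-01-02T09:00:00` (or `Valid: forever` for a certificate with no expiry); treat that format as an assumption and check it against your OpenSSH version. How the new certificate is actually requested is left out, since that depends on your issuing setup.

```python
import re
from datetime import datetime, timedelta

# Assumed shape of the validity line in `ssh-keygen -L` output.
VALID_RE = re.compile(r"Valid:\s+from\s+(\S+)\s+to\s+(\S+)")


def needs_renewal(cert_listing, now, margin=timedelta(minutes=5)):
    """Return True if the certificate should be replaced before connecting.

    Anything that does not show a bounded "from ... to ..." interval, including
    "Valid: forever", is treated as needing replacement.
    """
    match = VALID_RE.search(cert_listing)
    if match is None:
        return True
    not_after = datetime.strptime(match.group(2), "%Y-%m-%dT%H:%M:%S")
    # Renew slightly early so a session does not start on a nearly dead cert.
    return now + margin >= not_after


sample = "        Valid: from 2024-01-01T09:00:00 to 2024-01-02T09:00:00"
print(needs_renewal(sample, datetime(2024, 1, 1, 12, 0, 0)))
print(needs_renewal(sample, datetime(2024, 1, 2, 8, 58, 0)))
print(needs_renewal("        Valid: forever", datetime(2024, 1, 1, 12, 0, 0)))
```

A script called from ProxyCommand could run this check, fetch a new certificate when it returns True, and then hand off to the real connection.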
The overarching reason isn't really a question of "helping users" as such, although I would strongly encourage making the certificate issuing process as quick and easy as possible to encourage adoption and reduce pushback. The people it really helps are security teams and organisations as a whole who can now have more confidence that they haven't left holes in their infrastructure which can be exploited by bad actors. It also checks a lot of boxes for auditing, compliance and reporting purposes which are huge positives in a corporate environment. If you're able to say "yes, disgruntled former employee X had a certificate that would have given them access to all these servers, but it expired three days ago" then that's a lot better than saying "X has a certificate that gives them access to all our servers, but we _think_ we've blocked it from being used everywhere".
Overall, I agree that the model does lend itself better to things like access to critical production infrastructure (where access should be the exception rather than the rule), but in my opinion it's a good practice to get into for access to everything. The ability to log that a certain user requested a certificate at a certain time and then link that to exactly where the certificate was used (via centralised logging, for example) is incredibly powerful.
You're perhaps correct that both do constitute fail-open systems at first. The difference is in the vulnerability period - with an expiring certificate, that ends at a fixed point in the future. With a certificate that has no expiry, that period never ends until such time as you rotate your CA and force everyone to get a new certificate - something which is also far less of a burden when your certificates expire every day by default and you have a process for getting a new one, incidentally.
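One way to make the "vulnerability period" comparison concrete is a small Python model. It is a simplification resting on a stated assumption: a compromised certificate stops being usable at whichever comes first, its expiry or the moment revocation has actually reached every host. The function name and the timestamps are illustrative only.

```python
from datetime import datetime, timedelta


def exposure_window(compromised_at, expires_at=None, revoked_everywhere_at=None):
    """Return how long a compromised certificate stays usable.

    None means the window is unbounded: no expiry, and revocation never
    reached every host.
    """
    cutoffs = [t for t in (expires_at, revoked_everywhere_at) if t is not None]
    if not cutoffs:
        return None
    end = min(cutoffs)
    # A certificate that was already dead at compromise time has no exposure.
    return max(end - compromised_at, timedelta(0))


compromised = datetime(2024, 1, 1, 12, 0)
expiry = datetime(2024, 1, 2, 9, 0)

# Daily certificate, revocation never fully propagates: bounded by expiry.
print(exposure_window(compromised, expires_at=expiry))
# No expiry, revocation never fully propagates: unbounded.
print(exposure_window(compromised))
# Revocation lands ten minutes later: expiry makes no difference.
print(exposure_window(compromised, expires_at=expiry,
                      revoked_everywhere_at=compromised + timedelta(minutes=10)))
```

The model only says something about the case where revocation is slow or incomplete; when revocation works promptly, the two approaches give the same window.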
I appreciate your detailed response but I think we'll just have to agree to disagree here. My personal opinion is that there isn't any value in this arbitrary temporal benchmark for certificates expiring. When a certificate is compromised, or needs to be revoked, it needs to be revoked immediately. At that point, you're trusting the same mechanisms to remove access in either system. An auditor is going to be interested in the period between the user having access and that access being revoked. The fact that the key expires later on (even within just hours) is irrelevant, as it's after revocation and it's already invalid. Anything less provides the bad actor with plenty of time to do something malicious. The example you give in quotes would be immediately followed with "Okay, but how did you disable that access immediately?"
You could make keys valid for only a minute and it wouldn't add any security, as only seconds are needed for a malicious action to take place.