Users of RHEL 7 and CentOS 7 on Windows Active Directory networks are likely enjoying the benefits of the SSSD-AD domain-join client module along with the realmd tool, which takes care of the SSSD client configuration (a very complex task to manage by hand).
Unfortunately, with enterprise domain services like Active Directory (AD), there are MANY things that can go wrong (Murphy’s Law). Every AD domain is unique because AD is highly customizable. One thing that can go wrong is Kerberos ticket validation. Failed verification of the Ticket Granting Ticket (TGT) is such a common issue that Red Hat has dedicated a (customers-only) solution page to it: SSSD user logins fail due to failed TGT validation.
As of March 2017, their suggestions failed to mention a solution that I needed on one of my systems. I’m recording the issue I saw here so I can refer back to it.
- Set debug_level to a high value like 6 for all sections in your sssd.conf file
- Restart sssd: “systemctl restart sssd”
- Attempt to login to your system with a domain user account
- Review the contents of /var/log/sssd/krb5_child.log for indicators of this problem
- “validate_tgt” line like “TGT failed verification using key for [host/your.fqdn@YOUR.REALM]”
- Additional lines like “Server not found in Kerberos database”
- I’m not sure if you can get more useful log output by setting the debug level higher – this was the most detail I found before solving the problem
- CHECK THE DOMAIN FOR DUPLICATE SPNs
- Duplicate Service Principal Names (SPNs) are reported in the output of the Windows command “setspn -X” when it is run by a domain admin user.
- If duplicate SPNs are found, they must be resolved
- EITHER delete unused duplicate objects from AD
- OR, on the Linux client, leave the domain (realm leave), change the hostname (nmtui), reboot, and rejoin the domain (realm join)
- This security validation of the Ticket Granting Ticket (TGT) is controlled by the setting “krb5_validate” (true or false) in the domain-specific section of your “sssd.conf” file. You can temporarily set it to false and restart sssd to test whether the TGT validation checks are causing your issue. Many factors can cause this validation to fail, and a duplicate SPN is only one of them. For other possibilities, please review the above linked Red Hat solution, or try your luck with Google :-).
- Once the underlying issue is resolved, set krb5_validate back to true
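
The log review and the krb5_validate check in the steps above can be sketched as two small shell functions. This is only a rough sketch: the function names check_tgt_validation and show_krb5_validate are made up for illustration, and the default paths assume the usual locations /var/log/sssd/krb5_child.log and /etc/sssd/sssd.conf, which may differ on your system.

```shell
#!/bin/sh
# Sketch only: the function names and default paths below are assumptions
# made for illustration; they are not part of SSSD or realmd.

# Scan a krb5_child log for the two failure indicators described above.
# Returns 0 if an indicator is found, 1 if none is found, 2 if the log is unreadable.
check_tgt_validation() {
    log_file="${1:-/var/log/sssd/krb5_child.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # -n prints line numbers so the matches are easy to find in the full log.
    if grep -n -e 'TGT failed verification' \
               -e 'Server not found in Kerberos database' "$log_file"; then
        return 0
    fi
    echo "no TGT validation failure indicators in $log_file"
    return 1
}

# Print every krb5_validate line in sssd.conf, so that a temporary
# "false" used for testing is not left behind by accident.
show_krb5_validate() {
    conf_file="${1:-/etc/sssd/sssd.conf}"
    if [ ! -r "$conf_file" ]; then
        echo "cannot read $conf_file" >&2
        return 2
    fi
    grep -n -E '^[[:space:]]*krb5_validate[[:space:]]*=' "$conf_file" \
        || echo "krb5_validate not set in $conf_file"
}
```

After sourcing the file as root, running check_tgt_validation following a failed login attempt prints any matching log lines, and show_krb5_validate shows whether validation is currently switched off. Both functions accept an alternative file path as their first argument.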
I hope this is helpful to someone experiencing the same issue. As we can see here, a cleanup of the Active Directory infrastructure resolves the problem (removal of the duplicate SPN in this case). The other basic infrastructure services should also be in place and functioning correctly, including forward and reverse DNS name resolution, secure dynamic DNS name registration, and domain time synchronization (PDC authoritative, automatic to Windows domain members, NTP/chronyd to Linux clients).