How to monitor internal and private SSL certificates

Public certificates have a built-in safety net: real users, browsers, and external uptime checks scream the moment one expires. Internal and private certificates have none of that, which is exactly why they bite harder. This guide covers how to monitor an internal SSL certificate when it lives behind a firewall, on a private PKI, or inside a service mesh — and what to actually check beyond a single expiry date.

Why internal certs are riskier than public ones

A public-facing cert that expires gets noticed within minutes. An internal one can fail silently for days because the only "users" are other services, and services tend to fail closed in confusing ways — a 500 here, a dropped gRPC connection there, a queue that stops draining.

Several properties make private certificates harder to track:

No external visibility. They aren't reachable from the public internet, so an outside checker can't see them at all.
Short lifetimes. Private PKI (Vault, step-ca, cert-manager) and mesh CAs often issue certs valid for hours or days. A 90-day public cert gives you slack; a 24-hour mesh cert does not.
mTLS and service mesh. In an mTLS setup both sides present certs. An expired client cert breaks calls just as hard as an expired server cert, and it's easy to forget the client side exists.
Forgotten endpoints. Load balancers, internal admin panels, databases with TLS, printers, IPMI/BMC controllers, and appliances all carry certs that nobody owns.

The core challenge is reachability. You can't point an external service at https://payments.internal.corp because DNS and routing only exist inside your network. So monitoring has to run from inside.

What to check (not just the expiry date)

Before picking a mechanism, be clear on what a complete check means. Expiry is necessary but not sufficient:

Expiry (notAfter). The obvious one. Alert on a threshold, not on the day of.
Validity start (notBefore). Clock skew or a botched rotation can hand out a cert that isn't valid yet.
Chain completeness. A leaf cert without its intermediates will fail verification on strict clients even though openssl x509 on the leaf looks fine.
SAN coverage. The hostname being connected to must appear in the Subject Alternative Name list. CN-only certs are rejected by modern clients.
CA trust. For a private CA, every client must trust the issuing root. A root missing from the client's trust store is the usual cause of "self signed certificate in certificate chain" errors.
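The two date checks on that list come down to comparing two timestamps against the clock. Here is a minimal sketch, assuming GNU date and the date strings that openssl x509 -noout -startdate -enddate prints after the equals sign. The function name cert_window_state and its optional third argument are illustrative, not part of any tool:

```shell
#!/usr/bin/env bash
# Classify a certificate's validity window (hypothetical helper).
#   $1  notBefore string, e.g. "Jan  1 00:00:00 2030 GMT"
#   $2  notAfter string in the same format
#   $3  optional "now" in epoch seconds (defaults to the current time)
# Prints valid, not-yet-valid, or expired; returns 0 only for valid.
cert_window_state() {
  local not_before="$1" not_after="$2" now="${3:-$(date +%s)}"
  local start end
  start=$(date -d "$not_before" +%s) || return 2   # unparseable date
  end=$(date -d "$not_after" +%s) || return 2
  if [ "$now" -lt "$start" ]; then
    echo "not-yet-valid"; return 1
  elif [ "$now" -ge "$end" ]; then
    echo "expired"; return 1
  fi
  echo "valid"
}

# Example wiring (the PEM path is an assumption):
#   pem=/etc/pki/payments.pem
#   nb=$(openssl x509 -in "$pem" -noout -startdate | cut -d= -f2)
#   na=$(openssl x509 -in "$pem" -noout -enddate | cut -d= -f2)
#   cert_window_state "$nb" "$na"
```

SAN coverage can be checked offline in a similar way: openssl x509 -noout -checkhost payments.internal.corp prints whether the name matches the certificate.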

Approach 1: an in-network checker on a jump host

The simplest reliable pattern is a cron job on a host that can reach your internal endpoints. openssl s_client does the heavy lifting.

Fetch a cert and read its expiry over the network:

echo | openssl s_client -connect payments.internal.corp:443 \
  -servername payments.internal.corp 2>/dev/null \
  | openssl x509 -noout -enddate -subject -ext subjectAltName

Compute days remaining and exit non-zero when under threshold so cron (or your monitoring agent) treats it as a failure:

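#!/usr/bin/env bash
# Exits non-zero if the cert can't be fetched or expires in fewer than WARN_DAYS days.
# Requires GNU date (for `date -d`).
set -euo pipefail
host="${1:?usage: $0 HOST [PORT] [WARN_DAYS]}"; port="${2:-443}"; warn_days="${3:-21}"

# Fetch the leaf cert and extract its notAfter timestamp.
end=$(echo | openssl s_client -connect "$host:$port" -servername "$host" 2>/dev/null \
      | openssl x509 -noout -enddate | cut -d= -f2)
end_epoch=$(date -d "$end" +%s)
now_epoch=$(date +%s)
days=$(( (end_epoch - now_epoch) / 86400 ))

echo "$host expires in $days day(s) ($end)"
# The result of this test becomes the script's exit status.
[ "$days" -ge "$warn_days" ]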

To validate the chain and trust against your private root rather than the system store, verify explicitly:

openssl s_client -connect payments.internal.corp:443 \
  -servername payments.internal.corp \
  -CAfile /etc/pki/internal-root-ca.pem -verify_return_error </dev/null

A non-zero exit or verify error line means the chain or trust is broken — catch that, not just expiry. For ad-hoc inspection you can paste a host or a PEM into the SSL checker to see the same fields without remembering flags.
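To turn that verify line into something a cron job or agent can act on, one option is to parse the "Verify return code" line that s_client prints. A minimal sketch follows; the function name verify_result and its output format are illustrative choices:

```shell
#!/usr/bin/env bash
# Read `openssl s_client` output on stdin and print the first
# "Verify return code" as "<code> <reason>" (hypothetical helper).
# Returns 0 only when the code is 0, and 2 when no such line is found.
verify_result() {
  local line code="" reason=""
  local re='Verify return code: ([0-9]+) \((.*)\)'
  # Read all input (rather than stopping at the first match) so the
  # upstream command never writes into a closed pipe.
  while IFS= read -r line; do
    if [[ -z $code && $line =~ $re ]]; then
      code=${BASH_REMATCH[1]}
      reason=${BASH_REMATCH[2]}
    fi
  done
  if [[ -z $code ]]; then
    echo "no-verify-line"
    return 2
  fi
  echo "$code $reason"
  [[ $code -eq 0 ]]
}

# Example wiring, reusing the host and CA file from above:
#   openssl s_client -connect payments.internal.corp:443 \
#     -servername payments.internal.corp \
#     -CAfile /etc/pki/internal-root-ca.pem </dev/null 2>/dev/null | verify_result
```

Note that s_client does not compare the hostname against the SAN list unless you also pass -verify_hostname, so add that flag if you want SAN coverage included in the result.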

Approach 2: read certs from Kubernetes secrets

In Kubernetes, TLS material usually lives in kubernetes.io/tls secrets, often managed by cert-manager. You can inspect them without ever opening a network connection, which catches problems before a pod even mounts the cert.

kubectl get secret payments-tls -n payments \
  -o jsonpath='{.data.tls\.crt}' \
  | base64 -d \
  | openssl x509 -noout -enddate -subject -ext subjectAltName

Sweep every TLS secret in the cluster and flag short-dated ones:

kubectl get secrets --all-namespaces \
  --field-selector type=kubernetes.io/tls \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' \
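| while read -r ns name; do
    crt=$(kubectl get secret "$name" -n "$ns" -o jsonpath='{.data.tls\.crt}' | base64 -d)
    # openssl x509 reads only the first cert in the bundle, which is the leaf
    end=$(echo "$crt" | openssl x509 -noout -enddate | cut -d= -f2)
    # -checkend N exits non-zero if the cert expires within N seconds (21 days here)
    if echo "$crt" | openssl x509 -noout -checkend $((21 * 86400)) >/dev/null; then
      echo "OK   $ns/$name $end"
    else
      echo "SOON $ns/$name $end"
    fi
  done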

If you use cert-manager, also watch its own signals — a Certificate whose Ready condition is False is a renewal that's failing right now:

kubectl get certificate -A \
  -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,RENEWAL:.status.renewalTime'
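For alerting, that table is easier to use as an exit code. A minimal sketch, assuming the same column order (namespace, name, Ready status) and kubectl's --no-headers flag; the function name not_ready_certs is illustrative:

```shell
#!/usr/bin/env bash
# Read "NS NAME READY ..." rows on stdin and print ns/name for every row
# whose READY column is anything other than "True" (hypothetical helper).
# Returns 1 if at least one such row was found, 0 otherwise.
not_ready_certs() {
  awk 'NF >= 3 && $3 != "True" { print $1 "/" $2; bad = 1 }
       END { exit bad + 0 }'
}

# Example wiring:
#   kubectl get certificate -A --no-headers \
#     -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status' \
#     | not_ready_certs
```

A Certificate with no Ready condition at all is reported too, since its column is not "True".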

More cluster-specific patterns live in the Kubernetes monitoring guide.

Approach 3: pull from Vault or another private PKI

If HashiCorp Vault is your CA, you don't have to scrape endpoints — query the PKI engine directly. List issued serials and read each one's notAfter:

vault list pki/certs

vault read -field=certificate pki/cert/<serial> \
  | openssl x509 -noout -enddate -subject
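Doing that for every serial is a short loop. The sketch below assumes the PKI engine is mounted at pki (as in the commands above) and that vault list prints its default table, with a "Keys" header and a separator line before the serials. The function name vault_cert_expiries is illustrative:

```shell
#!/usr/bin/env bash
# Print "<serial> <notAfter>" for every cert the PKI mount has issued
# (hypothetical helper; $1 is the mount path, default "pki").
vault_cert_expiries() {
  local mount="${1:-pki}" serial end
  # tail -n +3 skips the "Keys" header and the "----" separator.
  vault list "$mount/certs" | tail -n +3 | while read -r serial; do
    [ -n "$serial" ] || continue
    # </dev/null keeps the inner command from consuming the serial list.
    end=$(vault read -field=certificate "$mount/cert/$serial" </dev/null \
          | openssl x509 -noout -enddate | cut -d= -f2)
    echo "$serial $end"
  done
}

# Example: vault_cert_expiries pki
```

On a busy CA this list can be long and includes certs that have already been replaced, so treat the output as an inventory to filter rather than a list of alerts.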

Check the CA cert itself too; a private root or intermediate expiring is a cluster-wide outage, not a single-service one:

curl -s https://vault.internal.corp:8200/v1/pki/ca/pem \
  | openssl x509 -noout -enddate -subject

Export expiry to your existing monitoring

Don't build a parallel alerting stack. Emit days-until-expiry as a metric and let Prometheus/Alertmanager (or whatever you run) own the thresholds and routing. A textfile-collector script for the node exporter is enough:

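#!/usr/bin/env bash
# node_exporter textfile-collector script: seconds until expiry per target.
out=/var/lib/node_exporter/textfile/ssl_expiry.prom
tmp=$(mktemp "$out.XXXXXX")   # build in a temp file so a scrape never reads a half-written file
for target in payments.internal.corp:443 admin.internal.corp:8443; do
  host=${target%:*}; port=${target#*:}
  end=$(echo | openssl s_client -connect "$target" -servername "$host" 2>/dev/null \
        | openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)
  if [ -z "$end" ]; then
    # Unreachable or no cert: GNU date would parse "" as midnight today, so report 0 explicitly.
    echo "could not read cert from $target" >&2
    secs=0
  else
    secs=$(( $(date -d "$end" +%s) - $(date +%s) ))
  fi
  printf 'ssl_cert_expiry_seconds{host="%s",port="%s"} %s\n' "$host" "$port" "$secs" >> "$tmp"
done
chmod 644 "$tmp" && mv "$tmp" "$out"   # atomic rename; readable by the exporter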

Then alert in Prometheus when the runway is short:

- alert: InternalCertExpiringSoon
  expr: ssl_cert_expiry_seconds < 86400 * 21
  for: 1h
  labels: { severity: warning }
  annotations:
    summary: "{{ $labels.host }} cert expires in under 21 days"

For mesh certs measured in hours, drop the threshold accordingly. In that case, prefer monitoring the issuer (cert-manager or Vault) over polling endpoints: a cert that rotates every few hours can be replaced, or expire, between two polls, so a failing issuer is the earlier and more reliable signal.
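One way to make a threshold scale with cert lifetime is to alert on the fraction of the validity window that remains instead of a fixed number of days. A minimal sketch, assuming GNU date and the notBefore and notAfter strings that openssl x509 prints; the function name lifetime_left_pct and the 33 percent figure in the usage comment are illustrative:

```shell
#!/usr/bin/env bash
# Print the integer percentage of a cert's lifetime that is still left
# (hypothetical helper).
#   $1  notBefore string, e.g. "Jan  1 00:00:00 2030 GMT"
#   $2  notAfter string in the same format
#   $3  optional "now" in epoch seconds (defaults to the current time)
lifetime_left_pct() {
  local start end now="${3:-$(date +%s)}" left
  start=$(date -d "$1" +%s) || return 2
  end=$(date -d "$2" +%s) || return 2
  [ "$end" -gt "$start" ] || return 2   # reject an empty or inverted window
  left=$(( end - now ))
  if [ "$left" -lt 0 ]; then left=0; fi  # already expired
  echo $(( left * 100 / (end - start) ))
}

# Example: fail when less than a third of the lifetime remains.
#   [ "$(lifetime_left_pct "$not_before" "$not_after")" -ge 33 ]
```

With this, a 24-hour cert and a 90-day cert can share one rule: a 24-hour cert with six hours left reports 25.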

Monitor it automatically

Internal endpoints need an in-network checker like the ones above; public-facing ones are easier to hand off. SSLNudge watches your externally reachable certificates daily — expiry, chain, and SAN — and emails you well before anything lapses, so the certs your customers hit are covered without a cron job to babysit. Pair it with the in-network scripts here, and both sides of your perimeter are accounted for.