Troubleshooting connections between components

If agent nodes can't retrieve configurations, check for communication, certificate, DNS, and NTP issues.

Agents can't reach the primary server

Agent nodes must be able to communicate with the primary server in order to retrieve configurations.

If agents can't reach the primary server, running telnet <PRIMARY_HOSTNAME> 8140 returns a Name or service not known error.
  1. Verify that the primary server is reachable at a DNS name your agents recognize.
    If you aren't sure how to do this, refer to: Agents aren't using the primary server's valid DNS name
  2. Verify that the pe-puppetserver service is running.
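
The two failure layers on the agent side (name resolution and the port itself) can be told apart with a short script. This is a minimal sketch, assuming bash, getent, and timeout are available on the agent node; the function names resolves, port_open, and check_primary are hypothetical and not part of PE.

```shell
# Hypothetical helpers; not part of Puppet Enterprise.

# Succeeds if the name resolves through the system resolver (DNS, /etc/hosts).
resolves() {
  getent hosts "$1" >/dev/null
}

# Succeeds if a TCP connection to host:port opens within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device.
port_open() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Reports which layer fails for the given primary server hostname.
# The port defaults to 8140.
check_primary() {
  local host="$1"
  local port="${2:-8140}"
  if ! resolves "$host"; then
    echo "dns-failure"
    return 1
  fi
  if ! port_open "$host" "$port"; then
    echo "port-closed"
    return 1
  fi
  echo "ok"
}
```

On an agent node, run check_primary <PRIMARY_HOSTNAME>. A dns-failure result points to step 1; a port-closed result suggests checking the service in step 2, for example with systemctl status pe-puppetserver on a systemd-based primary server.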

Agents don't have signed certificates

Agent certificates must be signed by the primary server.

If the node's Puppet agent logs contain warnings about unverified peer certificates in the current SSL session, the agent has submitted a certificate signing request (CSR) that hasn't yet been signed.
  1. On the primary server, run puppetserver ca list to generate a list of pending CSRs.
    Tip: You can also manage CSRs in the console.
  2. To sign a node's certificate, run: puppetserver ca sign --certname <NODE_NAME>

Agents aren't using the primary server's valid DNS name

Agents trust the primary server only if they contact it at one of the valid hostnames specified when the primary server was installed.

On the agent node, run puppet agent --configprint server. If the output isn't one of the primary server's valid DNS names (which you chose when installing the primary server), the agent node and primary server can't communicate.
  1. To edit the primary server's hostname on agent nodes, open the /etc/puppetlabs/puppet/puppet.conf file, and change the server setting to a valid DNS name.
  2. To reset the primary server's valid DNS names, log in as root (or the Administrator) and run:
    puppet infrastructure run regenerate_primary_certificate --dns_alt_names=<COMMA-SEPARATED_LIST_OF_DNS_NAMES>
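
To check an agent's configured server name against the names you expect, a comparison like the following can help. This is a sketch only: the function name server_name_valid is hypothetical, the list of valid names is something you supply yourself, and the sketch reads the setting with puppet config print rather than --configprint.

```shell
# Hypothetical helper; not part of Puppet Enterprise.
# Prints "match" if the agent's configured server is in the
# comma-separated list of valid DNS names, otherwise "mismatch".
server_name_valid() {
  local valid_names="$1"
  local configured
  configured=$(puppet config print server --section agent)
  case ",$valid_names," in
    *",$configured,"*) echo "match" ;;
    *) echo "mismatch"; return 1 ;;
  esac
}
```

For example, server_name_valid "<COMMA-SEPARATED_LIST_OF_DNS_NAMES>" run on the agent node prints mismatch when the server setting in puppet.conf needs to change.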

Time is out of sync

The date and time must be in sync on the primary server and agent nodes.

If time is out of sync on nodes, running the date command returns incorrect or inconsistent dates.
Set up NTP to get the time in sync. However, keep in mind that NTP can behave unreliably on virtual machines.
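
Comparing Unix timestamps is less error-prone than reading date output by eye. The helper below is a small sketch; clock_skew is a hypothetical name, and how you collect the remote timestamp (for example over SSH) is an assumption about your environment.

```shell
# Hypothetical helper: prints the absolute difference, in seconds,
# between two Unix timestamps.
clock_skew() {
  local diff=$(( $1 - $2 ))
  # Strip a leading minus sign to get the absolute value.
  echo "${diff#-}"
}

# Example (not run here; assumes SSH access to the agent node):
#   clock_skew "$(date -u +%s)" "$(ssh <AGENT_HOSTNAME> date -u +%s)"
```

A result of more than a few seconds, beyond what the SSH round trip itself accounts for, suggests the clocks have drifted.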

Node certificates have invalid dates

The date and time must be in sync when certificates are created.

If certificates were signed out of sync, you get invalid dates (such as certificates with future dates) when you run:
openssl x509 -text -noout -in $(puppet config print --section master ssldir)/certs/<NODE_NAME>.pem
  1. On the primary server, delete certificates with invalid dates by running:
    puppetserver ca clean --certname <NODE_CERT_NAME>
  2. On the nodes with invalid certificates, delete the SSL directory by running:
    rm -r $(puppet config print --section master ssldir)
  3. On each impacted agent node, run puppet agent --test to generate a new certificate request.
  4. On the primary server, run puppetserver ca sign --certname <NODE_NAME> to sign each request.
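
Instead of reading the full openssl text output, you can extract only the validity dates and compare them with the current time. The following is a minimal sketch, assuming GNU date (for the -d option) and the openssl command-line tool; cert_dates_status is a hypothetical name.

```shell
# Hypothetical helper; not part of Puppet Enterprise. Assumes GNU date (-d).
# Classifies a certificate's validity window relative to "now"
# (a Unix timestamp; defaults to the current time).
cert_dates_status() {
  local cert="$1"
  local now="${2:-$(date -u +%s)}"
  local not_before not_after
  # openssl prints lines such as "notBefore=Jan  1 00:00:00 2030 GMT".
  not_before=$(openssl x509 -noout -startdate -in "$cert" | cut -d= -f2)
  not_after=$(openssl x509 -noout -enddate -in "$cert" | cut -d= -f2)
  if [ "$(date -u -d "$not_before" +%s)" -gt "$now" ]; then
    echo "not-yet-valid"   # start date lies in the future
  elif [ "$(date -u -d "$not_after" +%s)" -lt "$now" ]; then
    echo "expired"
  else
    echo "valid"
  fi
}
```

A not-yet-valid result for a node's .pem file is the future-dated case described above, and the certificate is a candidate for the clean and re-sign steps.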

A node is re-using a certname

If a new node re-uses an old node's certname, and the primary server retains the previous node's certificate, the new node can't request a new certificate.

  1. On the primary server, clear the node's certificate by running:
    puppetserver ca clean --certname <NODE_CERT_NAME>
  2. On the agent node, run puppet agent --test to generate a new certificate request.
  3. On the primary server, run puppetserver ca sign --certname <NODE_NAME> to sign the request.

Agents can't reach the filebucket server

If the primary server is installed with a certname that doesn't match its hostname, agents can't back up files to the filebucket on the primary server.

If agent logs contain errors like "could not back up", nodes are likely attempting to back up files to the wrong hostname.

On the primary server, edit /etc/puppetlabs/code/environments/production/manifests/site.pp so that the filebucket server attribute points to the correct hostname. For example:
# Define filebucket 'main':
filebucket { 'main':
  server => '<PRIMARY_DNS_NAME>',
  path   => false,
}
Results
Changing the filebucket server attribute on the primary server fixes the error on all agent nodes.

Orchestrator can't connect to the PE Bolt server

There are two options for debugging a faulty connection between the orchestrator and the PE Bolt server.

  • Set the bolt_server_loglevel parameter in the puppet_enterprise::profile::bolt_server class, and then run Puppet.
  • Manually update the loglevel parameter in the /etc/puppetlabs/bolt-server/conf.d/bolt-server.conf file.

The Bolt server logs are located at: /var/log/puppetlabs/bolt-server/bolt-server.log
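
After raising the log level, filtering the log can make connection problems easier to spot. This is a rough sketch: recent_problems is a hypothetical helper, and it assumes that log lines contain the level names ERROR or WARN, which this page does not confirm.

```shell
# Hypothetical helper. Prints the last N (default 20) lines of a log file
# that mention ERROR or WARN. The level names are an assumption about the
# log format.
recent_problems() {
  grep -E 'ERROR|WARN' "$1" | tail -n "${2:-20}"
}
```

For example: recent_problems /var/log/puppetlabs/bolt-server/bolt-server.log 50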