Troubleshooting Agents Installed with the connectware-agent Helm Chart
Prerequisites
- Helm v3 installed (K8s Helm | Installing Helm).
- kubectl installed (K8s Install Tools).
- kubectl configured with the current context pointing to your target cluster (Configure Access to Multiple Clusters).
Troubleshooting Agent Problems
If you have problems with agents installed using the connectware-agent
Helm chart, the first step is usually to delete any pod that is stuck in a state other than Running and Ready. Stuck pods are common because the agents run as StatefulSets, whose pods are not automatically rescheduled if they are unhealthy when their controller is updated, so they need manual intervention.
Deleting Pods in Faulty State
Use the kubectl get pod -l app.kubernetes.io/component=protocol-mapper-agent
command to display all agent pods, then delete any pod that is in a faulty state using the kubectl delete pod <podname>
command.
Example
kubectl get pod -l app.kubernetes.io/component=protocol-mapper-agent -n <namespace>
kubectl -n <namespace> delete pod <podname>
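If several agent pods are stuck at once, the two commands above can be combined. The following sketch assumes the label selector from the examples in this article and deletes every agent pod whose phase is not Running; note that it will not catch pods that are Running but unready, and you should review the pod list before deleting anything.

```shell
# Sketch: delete all agent pods that are not in the Running phase.
# <namespace> is a placeholder for your Connectware namespace.
NAMESPACE=<namespace>
kubectl -n "$NAMESPACE" get pod \
  -l app.kubernetes.io/component=protocol-mapper-agent \
  --field-selector=status.phase!=Running \
  -o name | xargs -r kubectl -n "$NAMESPACE" delete
```

Because the agents are StatefulSets, the deleted pods are recreated automatically with the same names and volumes.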
If this does not help, you need to look at the faulty pod’s events and logs to check for helpful error messages.
Depending on the pod’s state, you should look at different detail information to find the issue.
Pod state | Kind of problem | Where to check |
---|---|---|
Pending, ContainerCreating | Kubernetes is trying to create the pod. | Pod events or description (see Checking pod state). |
Running, but not ready or not behaving as expected. | Pod unready, application not working correctly. | Current pod logs (see Checking agent pod logs). |
Unknown | Pod status is unknown, Kubernetes cluster problem. | Kubernetes cluster state and events (see https://kubernetes.io/docs/tasks/debug/debug-cluster/). |
ImagePullBackOff | Image for the pod can’t be pulled. | Helm value configuration (see Verifying container image configuration). |
CrashLoopBackOff | Application is crashing. | Previous pod logs (see Checking agent pod logs). |
Checking Pod State
When you have a problem with a pod not being scheduled, there can be different reasons, which fall into two categories:
- Issues with your configuration.
- Issues with your Kubernetes cluster.
For both categories, you will find events detailing the problem associated with the pod. We assume that you have already identified the pod through the previous steps in this article. You need to know the name and namespace of the pod you are trying to debug.
Use the following command to display events associated with your pod:
kubectl get event -n <namespace> --field-selector involvedObject.name=<podname>
Example
Info: You can also view the events at the end of the output of kubectl describe pod <podname>
Issues with your Kubernetes cluster can take many forms and are beyond the scope of this article, but you can use Debug Pods as a starting point to debug any events you see that indicate a problem with your Kubernetes cluster.
Common Problems
The following are a few common scenarios involving issues with your configuration, and how to address them.
Event mentions | Likely problem | Likely solution |
---|---|---|
FailedScheduling, Insufficient cpu, Insufficient memory | You specified CPU and memory resources for your agents that your Kubernetes cluster cannot provide. | Review Configuring compute resources for the connectware-agent Helm chart and adjust the configured resources to something that is available in your Kubernetes cluster. |
FailedScheduling, didn’t match pod anti-affinity rules | There are no available Kubernetes nodes that can schedule the agent because of podAntiAffinity rules. | Review Configuring podAntiAffinity for the connectware-agent Helm chart and adjust your settings, or add additional nodes to your Kubernetes cluster. |
FailedMount in combination with the names you chose as mTLS secret or CA chain, or “mtls-agent-keypair” / “mtls-ca-chain” | You enabled mTLS for an agent without providing the necessary ConfigMap and Secret for CA chain and key pair. | Review Using Mutual Transport Layer Security (mTLS) for agents with the connectware-agent Helm chart and adjust your configuration accordingly. |
FailedMount in combination with the names of volumes (starting with “data-”) | The currently used storage provider is unable to provide the necessary volumes. | Review Configuring agent persistence for the connectware-agent Helm chart and choose a Kubernetes StorageClass that can provide the necessary volumes. |
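For the FailedMount scenarios above, it can help to confirm that the referenced objects actually exist before reviewing the Helm configuration. The object names in this sketch are placeholders for the names from your own configuration, not values defined by the chart:

```shell
# Sketch: check whether the objects named in a FailedMount event exist.
# <namespace> and the object names are placeholders from your own setup.
kubectl -n <namespace> get secret <mtls-keypair-secret-name>
kubectl -n <namespace> get configmap <mtls-ca-chain-configmap-name>
kubectl -n <namespace> get pvc -l app.kubernetes.io/component=protocol-mapper-agent
kubectl get storageclass
```

If an object is missing, the FailedMount event is explained; if it exists, compare its name against the corresponding Helm value.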
Checking Agent Pod Logs
When your pods are scheduled but don’t work the way you expect, are unready, or keep crashing (status: “CrashLoopBackOff”), you need to check the logs of the pod for details.
For pods that are ready or unready, check the current logs. For pods in status “CrashLoopBackOff”, check the logs of the previous container to see why it crashed.
Checking Current Pod Logs
To check the current logs of your pod, use the kubectl logs command with the pod name, and look for error messages.
kubectl logs -n <namespace> <podname>
Example
Checking Previous Pod Logs
To check the logs of a previous container, follow Checking current pod logs, but add the --previous parameter to the command:
kubectl logs -n <namespace> <podname> --previous
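Beyond the basic invocations above, kubectl logs can stream logs live while you reproduce a problem, or limit output to the most recent lines, which is useful for long-running agents. The names in angle brackets are placeholders:

```shell
# Sketch: common kubectl logs variants for debugging an agent pod.
kubectl -n <namespace> logs -f <podname>           # follow new log lines live
kubectl -n <namespace> logs --tail=100 <podname>   # only the last 100 lines
kubectl -n <namespace> logs --previous --tail=100 <podname>  # last crash
```

Streaming with -f while restarting a connection or re-applying configuration often makes the relevant error message easy to spot.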
Common Problems
Event mentions | Likely problem | Likely solution |
---|---|---|
Agent with mTLS enabled not connecting to broker. Agent log shows: Reconnecting to mqtts://connectware:8883. Broker log shows: [warning] can’t authenticate client {"ssl",<<"someName">>} from someIp due to <<"Authentication denied">> | mTLS not enabled in Connectware. | Enable mTLS in Connectware by setting the Helm value global.authentication.mTLS.enabled to true. |
Agent not connecting to broker when mTLS in Connectware is enabled. Agent log shows: VRPC agent connection to broker lost. Reconnecting to mqtts://someIp:8883 | mTLS enabled in Connectware, but not in the agent. | Enable mTLS in the agent as described in Using Mutual Transport Layer Security (mTLS) for agents with the connectware-agent Helm chart. |
Agent with mTLS enabled does not connect to broker. Agent log shows: Error: Client network socket disconnected before secure TLS connection was established | Agent is connecting to the wrong MQTTS port in broker. | If your setup requires manual configuration due to additional NAT or something similar, review Configuring target Connectware for the connectware-agent Helm chart and adjust your configuration accordingly. If you are not aware of any special requirements in your environment, try removing all advanced MQTT target parameters. |
Agent with mTLS enabled does not connect to broker Agent log shows Failed to read certificates during mTLS setup please check the configuration | The certificates provided to the agent are either not found or faulty. | Review Using Mutual Transport Layer Security (mTLS) for agents with the connectware-agent Helm chart and Full mTLS Examples for the connectware-agent Helm chart, and make sure your mTLS certificates fulfill the requirements. |
Allowing an mTLS enabled agent in Connectware Client Registry fails with the message “An Error has occurred – Registration failed”. auth-server logs show: Unable to process request: 'POST /api/client-registry/confirm', because: Certificate Common Name does not match the username. CN: someCN, username: agentName | Agent’s certificate invalid. | Review Using Mutual Transport Layer Security (mTLS) for agents with the connectware-agent Helm chart and Full mTLS Examples for the connectware-agent Helm chart, and make sure your mTLS certificate CN matches the name of the agent. |
Agent with mTLS enabled does not connect to broker Agent log shows: Can not register protocol-mapper agent, because: socket hang up | Agent’s certificate invalid. | Review Using Mutual Transport Layer Security (mTLS) for agents with the connectware-agent Helm chart and Full mTLS Examples for the connectware-agent Helm chart, and make sure your mTLS certificate is signed by the correct Certificate Authority (CA). |
Agent with mTLS enabled does not connect to broker. Agent log shows: Failed to register agent. Response: 409 Conflict. A conflicting registration might be pending, or a user with the same username | The username of the agent is already taken. | Every agent needs a user whose username matches the name Helm value configured for this agent. Verify that the agent’s name is unique. If an old agent with the same name exists, delete the agent using the Systems => Agents UI, then delete the user using the User Management => Users and Roles UI. If you created a user with the agent’s name for something else, choose a different name for the agent. |
Agent pod enters state CrashLoopBackOff. Agent log shows: {"level":30,"time":1670940068658,"pid":8,"hostname":"welder-robots-0","service":"protocol-mapper","msg":"Re-starting using cached credentials"} {"level":50,"time":1670940068759,"pid":8,"hostname":"someName","service":"protocol-mapper","msg":"Failed to query license at https://someIp/api/system/info probably due to authentication: 401 Unauthorized."} | The agent’s credentials are no longer correct. | The agent needs to be re-registered: delete the agent using the Systems => Agents UI, delete the user using the User Management => Users and Roles UI, delete the agent’s StatefulSet (kubectl -n <namespace> delete sts <release-name>-<chart-name>-<agent-name>), delete the agent’s PersistentVolumeClaim (kubectl -n <namespace> delete pvc data-<release-name>-<chart-name>-<agent-name>-0), then re-apply your configuration through helm upgrade as described in Configuring agents with the connectware-agent Helm chart. |
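The command-line part of the re-registration procedure can be sketched as follows. All names in angle brackets are placeholders for your own release, chart, and agent names, and the final helm upgrade invocation is an assumption about how you apply your values; perform the UI deletions (agent and user) before running these commands:

```shell
# Sketch: re-register an agent whose cached credentials are invalid.
# Run only after deleting the agent and its user in the Connectware UI.
kubectl -n <namespace> delete sts <release-name>-<chart-name>-<agent-name>
kubectl -n <namespace> delete pvc data-<release-name>-<chart-name>-<agent-name>-0
# Re-apply your configuration so the StatefulSet and volume are recreated:
helm upgrade <release-name> <chart> -n <namespace> -f values.yaml
```

Deleting the PersistentVolumeClaim discards the cached credentials, so the recreated agent registers freshly in the Client Registry.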
Verifying Container Image Configuration
When an agent pod is in the status “ImagePullBackOff”, Kubernetes is unable to pull the container image required for this agent.
By default, Connectware agents use the official protocol-mapper image from Cybus’ official container registry. This requires a valid secret of the type kubernetes.io/dockerconfigjson, which you can provide in different ways. Alternatively, you can provide the images through a mirror, or even use custom images.
This leaves several options for controlling the image, and you have to find the right combination for your use case. How to configure these parameters is discussed in the following articles:
- Using a custom image registry for the connectware-agent Helm chart
- Configuring image name and version for the connectware-agent Helm chart
- Installing Connectware agents without a license key using the connectware-agent Helm chart
To see the effect of your settings, you need to inspect the complete image definition of your agent pods.
To do so, you can use this command:
kubectl -n <namespace> get pod -l app.kubernetes.io/component=protocol-mapper-agent -o custom-columns="NAME:metadata.name,IMAGE:spec.containers[0].image"
Example
In this example you can see that agent “painter-robots” is trying to use an invalid image name, which needs to be corrected using the image.name Helm value inside the agent’s entry in the protocolMapperAgents section of the Helm values.
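One way to apply such a correction without editing the values file is Helm’s --set syntax. The agent index and the corrected value below are hypothetical; use the position of the affected agent in your own protocolMapperAgents list, or make the equivalent change in your values file and pass it with -f:

```shell
# Sketch: fix the image name of one agent entry and roll out the change.
# '[1]' and 'protocol-mapper' are example values, not taken from your setup.
helm upgrade <release-name> <chart> -n <namespace> \
  --reuse-values \
  --set protocolMapperAgents[1].image.name=protocol-mapper
```

After the upgrade, re-run the image inspection command above to confirm the agent pods now reference the intended image.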
Related Links