Troubleshooting the Installation

This section provides the steps to troubleshoot the DX Platform installation:
System Pods in CrashLoopBackOff State
Symptom:
The status of the kube-system pods displays CrashLoopBackOff.
Solution:
Restart Docker. To restart, run the following command:
systemctl restart docker
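Before and after the restart, it can help to confirm which system pods are affected. The following is a minimal sketch, assuming the default kubectl get pods column layout (NAME, READY, STATUS, ...); the function name crashlooping_pods is hypothetical and not part of the product.

```shell
#!/bin/sh
# Print the names of pods whose STATUS column reads CrashLoopBackOff.
# Reads `kubectl get pods --no-headers` output on stdin.
crashlooping_pods() {
    awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Example usage (requires cluster access):
#   kubectl get pods -n kube-system --no-headers | crashlooping_pods
```

If the command prints nothing after Docker restarts, no kube-system pod is reporting CrashLoopBackOff at that moment.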
Pods are Not in Running State
Symptom:
Pods are not in the running state.
Solution:
Run the following commands to set the iptables FORWARD policy to ACCEPT and display the resulting rules:
iptables -P FORWARD ACCEPT
iptables-save
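To confirm the current FORWARD policy, the rule listing can be filtered as sketched below. This assumes the standard output of iptables -S FORWARD, whose policy line has the form -P FORWARD <policy>; the function name forward_policy is hypothetical.

```shell
#!/bin/sh
# Print the default policy of the FORWARD chain.
# Reads `iptables -S FORWARD` output on stdin.
forward_policy() {
    awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# Example usage (as root):
#   iptables -S FORWARD | forward_policy
```

Note that iptables-save only prints the rules to standard output; it does not make the policy change persistent across reboots.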
Unable to Push the Image and Installation Fails 
Symptom:
The installation fails because I am unable to push the image due to limited memory as shown:
[oerth-scx.ca.com:4443/analytics/jarvis-ldds:2.6.4] Tagging image: localhost:5000/jarvis/ldds:2.6.4.
Error during callback
com.github.dockerjava.api.exception.NotFoundException: {"message":"no such id: oerth-scx.ca.com:4443/dxi/platform/dxi-postgres:aop_v19.1.0.6_1.0.14"}
    at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:103)
    at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:33)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340
Solution:
Follow these steps:
  1. Run the following command to stop Docker:
    systemctl stop docker
  2. Run the following command to remove the docker-storage directory:
    rm -rf /mnt/docker-storage
  3. Run the following command to start Docker:
    systemctl start docker
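Because the steps above delete data, it may be worth previewing them before running them. The sketch below wraps the three commands in a function that only prints them unless APPLY=1 is set; the helper names maybe_run and reset_docker_storage and the APPLY variable are hypothetical conveniences, not part of the installer.

```shell
#!/bin/sh
# Run a command only when APPLY=1; otherwise print what would be run.
maybe_run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Stop Docker, remove the docker-storage directory, and start Docker again.
# Each step runs only if the previous one succeeded.
reset_docker_storage() {
    maybe_run systemctl stop docker &&
    maybe_run rm -rf /mnt/docker-storage &&
    maybe_run systemctl start docker
}

# Example usage:
#   reset_docker_storage           # preview only
#   APPLY=1 reset_docker_storage   # execute (as root)
```

Chaining the steps with && means Docker is not restarted if removing the directory fails, which leaves the failure visible rather than masking it.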
Pods are Not in Running State
Symptom:
Pods are not in the running state.
Solution:
 
Follow these steps:
  1. Run the following command to get the namespace:
    kubectl get ns
  2. Run the following command to get the list of pods:
    kubectl get pods -n <namespace>
  3. Run the following command to check the logs of the pod:
    kubectl logs <podname> -n <namespace>
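The steps above can be combined so that logs are collected for every pod that is not running. This is a minimal sketch, assuming the default kubectl get pods column layout with STATUS in the third column; the function name not_running_pods is hypothetical.

```shell
#!/bin/sh
# Print the names of pods whose STATUS is neither Running nor Completed.
# Reads `kubectl get pods --no-headers` output on stdin.
not_running_pods() {
    awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example usage (replace <namespace> with a value from `kubectl get ns`):
#   NS=<namespace>
#   kubectl get pods -n "$NS" --no-headers | not_running_pods |
#   while read -r pod; do
#       echo "== $pod =="
#       kubectl logs "$pod" -n "$NS" --tail=50
#   done
```

Pods in the Completed state are skipped because they typically belong to finished jobs rather than failed services.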
Installation Fails with Error
Symptom:
During the installation, when I select option 2 (Medium Installation) at the prompt, the installation fails with an error:
Specify the size of the Elasticsearch.
1. Small Installation: One Elasticsearch Hot node is installed.
2. Medium Installation: Three Elasticsearch Hot nodes are installed
Enter your choice: 2
Do you want to enable OI? (Y/N) [N]: y
[es-validate] [ FAILED ] Expected to find 3 nodes with labels master-data-1, master-data-2, master-data-3. Need exactly one node tagged for each label.
Sun Jul 21 19:36:39 PDT 2019 Installation failed.
Solution:
Perform the following steps:
  • Delete the config file from the <install directory> and run the installer again.
  • Ensure that there is only one node with each of the following labels:
    • dxi-es-node=master-data-1
    • dxi-es-node=master-data-2
    • dxi-es-node=master-data-3
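The label requirement can be checked before rerunning the installer. The sketch below assumes the label key dxi-es-node and the output of kubectl get nodes --no-headers -L dxi-es-node, where the label value appears as a sixth column after NAME, STATUS, ROLES, AGE, and VERSION; the function name check_es_labels is hypothetical.

```shell
#!/bin/sh
# Count nodes per Elasticsearch label value and fail unless each of
# master-data-1..3 is carried by exactly one node.
# Reads `kubectl get nodes --no-headers -L dxi-es-node` output on stdin.
check_es_labels() {
    awk '
        NF >= 6 { count[$6]++ }       # column 6 holds the label value, if set
        END {
            status = 0
            for (i = 1; i <= 3; i++) {
                label = "master-data-" i
                n = count[label] + 0
                printf "%s: %d node(s)\n", label, n
                if (n != 1) status = 1
            }
            exit status
        }'
}

# Example usage (requires cluster access):
#   kubectl get nodes --no-headers -L dxi-es-node | check_es_labels
```

A non-zero exit status indicates that at least one label is missing or duplicated, which matches the condition reported by the es-validate step.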
User Access Logs Are Not in EM
Symptom:
After several unsuccessful logins, I want to see the user access in the log, but the EM log does not show any authorization messages.
Solution:
Check the apmservices-manager logs for login failures, because authentication is performed in apmservices-manager and not in EM. The simplest way is to log in to the Cluster Management console and get the logs from the Service screen. Alternatively, use kubectl to query the logs from the running pod:
kubectl logs ${POD_NAME}
Resulting log entry:
2019-08-01 15:11:43.240 ERROR [manager,3064d4205d623f77,35211da6bb868168,false] 1 --- [http-nio-8008-exec-4] com.ca.apm.ess.services.EssManagerImpl : { "code" : 1351129, "msg" : "Authentication Failed: Invalid Username, Password, or Tenant ID", "desc" : "Authentication Failed: Invalid Username, Password, or Tenant ID" }
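To pick out only the login failures from the pod log, the output can be filtered on the message text shown in the log entry. This is a minimal sketch; the function name auth_failures is hypothetical, and it assumes each failure is logged on a single line containing the text Authentication Failed.

```shell
#!/bin/sh
# Keep only log lines that report a failed authentication.
# Reads `kubectl logs` output on stdin.
auth_failures() {
    grep -F 'Authentication Failed'
}

# Example usage (POD_NAME is the apmservices-manager pod):
#   kubectl logs "${POD_NAME}" | auth_failures
```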
Verify the Installation
After the installation is successful, run the following commands to verify the status:
Verification
Steps
Verify if the containers are deployed successfully.
Run the following command to check the deployment status of the pods:
kubectl get pods -n <namespace>
The status must be Running.
If some of the pods report an error, run the following command to view the details:
kubectl get events -n <namespace> | fgrep <service name> | more
Get Ingresses for components
Run the following command to get all the ingresses:
kubectl get ingress -n <namespace>
Log in to the Admin UI Console
Run the following command to display the Admin Console URL:
echo "http://$(kubectl get ingress -n <namespace> | fgrep apmservices-gateway | awk '{print $2}')/dxiportal"
Open the URL that was displayed after the installation. Log in as masteradmin with the password provided during the installation. 
Verify the Elasticsearch Data
Run the following command to display the Elasticsearch ingress URL:
echo "http://$(kubectl get ingress -n <namespace> | fgrep jarvis-es | awk '{print $2}')"
Open the following URL:
<URL printed above>/_cat/indices
Verify that you can see multiple rows of data.
Check the Jarvis Health
Run the following command to display the Jarvis APIs ingress URL:
echo "http://$(kubectl get ingress -n <namespace> | fgrep jarvis-apis | awk '{print $2}')"
Run the following command to check the health of Jarvis. Replace <URL printed above> with the APIs ingress URL. The status of all the components should be green.
curl -X GET "<URL printed above>/health" -H "accept: application/json;charset=UTF-8"
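The URL lookups in this section all follow the same pattern, so they can be factored into one helper. The sketch below assumes that the host name is the second column of the kubectl get ingress output, as in the commands above; newer kubectl versions may print an additional CLASS column, in which case the column index must be adjusted. The function name ingress_url is hypothetical.

```shell
#!/bin/sh
# Print "http://<host>" for the first ingress whose name contains $1.
# Reads `kubectl get ingress` output on stdin.
ingress_url() {
    awk -v name="$1" 'index($1, name) { print "http://" $2; exit }'
}

# Example usage (replace <namespace> with the installation namespace):
#   URL=$(kubectl get ingress -n <namespace> | ingress_url jarvis-apis)
#   curl -X GET "$URL/health" -H "accept: application/json;charset=UTF-8"
```

The same helper can be called with apmservices-gateway or jarvis-es to build the Admin Console and Elasticsearch URLs.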