Troubleshoot the Installation

Use the following solutions when you encounter issues with installation and/or upgrade:
If you are using your own provisioned host, all Docker and Portal installation commands must be prefixed with sudo, for example:
sudo ./portal.sh
How to Inspect Services
To view all Docker services:
docker service ls
To view details for a specific service:
docker service inspect <serviceName>
Deployment Failure
Symptom
The API Portal deployment failed.
Solution
Perform the following steps to troubleshoot a failed deployment:
  1. Ensure that Docker is running.
    For Docker information, see the Docker documentation.
  2. View the services using the following command:
    docker service ls
    Look for services with 0/X under the REPLICAS column, as in the following example:
    ID             NAME                      MODE     REPLICAS   ...
    h8uty1lrwq9r   portal_analytics-server   global   0/1        ...
    2vs9wy6mwvac   portal_apim               global   1/1        ...
  3. View the logs for the failed services using the following command:
    docker service logs -f <service_name>
  4. View the Docker logs using the following command:
    journalctl -fu docker
  5. Restart the failed service using the following command:
    docker service update --force <service_name | service_id>
  6. View the status of all nodes using the following command:
    docker node ls
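Steps 2 and 3 can be combined into a small helper. The following is a minimal sketch and not part of the product: failed_services is a hypothetical function name, and it assumes input in the form produced by docker service ls --format '{{.Name}} {{.Replicas}}', where each line holds a service name followed by its replica count.

```shell
# Hypothetical helper: read "NAME REPLICAS" lines on stdin and print the
# names of services whose running replica count is 0 (for example 0/1).
failed_services() {
  awk '$2 ~ /^0\// { print $1 }'
}

# Example usage on a Swarm manager node (not run here):
#   docker service ls --format '{{.Name}} {{.Replicas}}' | failed_services
#
# To show recent logs for each failed service:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | failed_services |
#     while read -r name; do docker service logs --tail 50 "$name"; done
```

Reading from standard input keeps the filter independent of Docker, so it can be tried against saved output first.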
Script Failure
Symptom
Executing config.sh and portal.sh failed.
Solution
Perform the following steps to troubleshoot a failed execution of config.sh and portal.sh:
The API Portal installation requires external network connectivity to retrieve the required packages when the installation packages are not on the host system.
  1. Ensure that you can reach external resources, using the following example command:
    ping google.com
  2. Ensure that the network service is running, using the following example command:
    service network status
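When several external hosts need to be checked, the ping test in step 1 can be wrapped in a loop. This is a sketch under stated assumptions: check_hosts and ping_once are hypothetical names, and the list of hosts to test is up to you. The probe command is passed in as an argument so that a different check can be substituted where ICMP is blocked.

```shell
# Hypothetical helper: run a probe command against each host and report
# the result. Returns non-zero if any host could not be reached.
check_hosts() {
  probe=$1
  shift
  failed=0
  for host in "$@"; do
    if "$probe" "$host" >/dev/null 2>&1; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}

# Probe that sends a single ICMP echo request, as in step 1.
ping_once() {
  ping -c 1 "$1"
}

# Example usage (not run here):
#   check_hosts ping_once google.com
```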
External Network Unavailable
Symptom
External network connectivity is not available.
Solution
When external network connectivity is not available, use the following method to bypass external resource access:
  1. Back up the existing portal.sh file using the following command:
    cp portal.sh portal.sh.bak
  2. Execute the following command:
    sed -i 's/docker \(login\|pull\)/true/' portal.sh
  3. Execute portal.sh to start API Portal.
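To see what the sed expression in step 2 does before editing portal.sh in place, it can be applied to a few sample lines. The sketch below uses hypothetical sample commands (the actual contents of portal.sh may differ) and a hypothetical function name, neutralize_registry_calls. It runs the same expression without the -i option, so no file is modified.

```shell
# Same substitution as in step 2, reading from stdin instead of editing
# portal.sh. The \| alternation is a GNU sed extension.
neutralize_registry_calls() {
  sed 's/docker \(login\|pull\)/true/'
}

# Hypothetical sample lines, for illustration only.
printf '%s\n' \
  'docker login -u user registry.example.com' \
  'docker pull registry.example.com/image:tag' \
  'docker stack deploy -c file.yml portal' |
  neutralize_registry_calls

# Output:
#   true -u user registry.example.com
#   true registry.example.com/image:tag
#   docker stack deploy -c file.yml portal
```

Because the true command ignores its arguments and always succeeds, the rewritten login and pull lines become no-ops, while other Docker commands are left unchanged.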
Database Lockup during Upgrade
Symptom
Services from Authenticator, portal-data, and Portal Enterprise are not started when upgrading to the latest version, causing the database to lock up.
The following sample log messages show services waiting for the changelog lock:
[email protected] | 2018-04-13 22:36:44.402 [INFO ] org.springframework.boot.liquibase.CommonsLoggingLiquibaseLogger - Waiting for changelog lock....
[email protected] | 2018-04-13 22:42:24.966 [INFO ] org.springframework.boot.liquibase.CommonsLoggingLiquibaseLogger - Waiting for changelog lock....
[email protected] | 2018-04-13 22:52:49.660 [INFO ] org.springframework.boot.web.servlet.AbstractFilterRegistrationBean - Mapping filter: 'loggingFilter' to: [/*]
[email protected] | 2018-04-13 22:52:49.661 [INFO ] org.springframework.boot.web.servlet.AbstractFilterRegistrationBean - Mapping filter: 'characterEncodingFilter' to: [/*]
[email protected] | 2018-04-13 22:36:49.693 [DEBUG] org.springframework.security.saml.metadata.MetadataManager - Reloading metadata [email protected]
Solution
To unlock the database and proceed with the upgrade:
  • If you’re using the out-of-the-box PostgreSQL database, run the following command, replacing <container_id> with the ID of the portal_portaldb container as shown by docker ps:
    docker exec -it <container_id> psql -U admin portal --command "UPDATE DATABASECHANGELOGLOCK SET locked='false', lockgranted=null, lockedby=null WHERE id=1;"
  • If you’re using your own MySQL database, run the following commands on the MySQL console:
    mysql> use portal;
    mysql> UPDATE DATABASECHANGELOGLOCK SET locked=0, lockgranted=null, lockedby=null WHERE id=1;
For more information on Portal services, see API Portal Architecture.
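Before clearing the lock, it can help to confirm that a service is actually waiting on it. The following sketch counts the "Waiting for changelog lock" messages in a log stream. count_lock_waits is a hypothetical helper name, and <service_name> and <container_id> are placeholders for your own values.

```shell
# Hypothetical helper: count log lines on stdin that report a wait for
# the Liquibase changelog lock.
count_lock_waits() {
  grep -c 'Waiting for changelog lock'
}

# Example usage (not run here):
#   docker service logs <service_name> 2>&1 | count_lock_waits
#
# To inspect the current lock row in the out-of-the-box PostgreSQL
# database before updating it:
#   docker exec -it <container_id> psql -U admin portal --command \
#     "SELECT id, locked, lockgranted, lockedby FROM DATABASECHANGELOGLOCK;"
```

A count that keeps growing across repeated runs suggests that the lock is stale rather than briefly held by another service.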
Health Check Failure during Upgrade
Symptom
The Portal Enterprise service cannot be started due to a health check failure after upgrading, showing 0/X under the REPLICAS column in the output of the docker service ls command. This is due to a health check timeout while patches are being applied after the upgrade.
The following sample log messages show a health check failure:
[email protected] | 2018-04-18 20:15:47,149 main ERROR Unable to register shutdown hook because JVM is shutting down. java.lang.IllegalStateException: Cannot add new shutdown hook as this is not started. Current state: STOPPED
[email protected] | at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.addShutdownCallback(DefaultShutdownCallbackRegistry.java:113)
[email protected] | at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:273)
[email protected] | at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)
[email protected] | at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)
Solution
Run the following command to increase the health check start period and restart the services:
sudo docker service update portal_portal-enterprise --health-start-period=10m
For more information on Portal services, see API Portal Architecture.
Restart Deployment
API Portal installation issues are resolved automatically when the Docker daemon is running. If you must restart the deployment, follow these steps:
  1. Remove the existing deployment (services only) using the following command:
    docker stack rm portal
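    Note: Removing the existing deployment does not remove the persistent data. To remove persistent data, use the following command. Be aware that it attempts to remove every Docker volume on the host, not only the volumes that belong to API Portal:
    docker volume rm $(docker volume ls -q)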
  2. Run the deployment.
  3. If you encounter any errors, repeat step 1.
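The docker volume rm $(docker volume ls -q) command in the note targets every volume on the host. On a host that also runs other workloads, a narrower selection may be safer. The sketch below is an assumption, not a documented procedure: it relies on Docker naming the volumes of a stack with the stack name as a prefix (here portal_), and portal_volumes is a hypothetical helper name. Review the list before removing anything.

```shell
# Hypothetical helper: keep only volume names that start with the
# "portal_" prefix (assumed to be the volumes of the "portal" stack).
portal_volumes() {
  grep '^portal_'
}

# Example usage (not run here):
#   docker volume ls -q | portal_volumes                      # review the list first
#   docker volume rm $(docker volume ls -q | portal_volumes)  # then remove
```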
Portal Policies Missing from the Enrolled Gateway
Symptom
A number of Portal integration policies (for example, the Standard Policy Template Fragment) are not created in the enrolled Gateway after the enrollment process is completed.
Solution
Perform the following steps to troubleshoot a failed enrollment:
  1. Inside the Gateway appliance, ping the FQDN of the Portal to make sure that the Gateway can reach the Portal server through your DNS server:
    ping apim.<portal domain>
  2. If the FQDN of the Portal is not reachable through your DNS server, add the FQDN for the Portal to /etc/hosts inside the Gateway appliance as a short-term fix. For more information, see Configure Your DNS Server.
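When applying the /etc/hosts workaround from step 2, it is worth checking whether an entry already exists so that the name is not added twice. The following is a minimal sketch: has_host_entry is a hypothetical helper, and the address 192.0.2.10 and the name apim.portal.example.com are placeholders for your own Portal IP address and FQDN.

```shell
# Hypothetical helper: return 0 if the hosts-format file $1 already maps
# the host name $2, ignoring comment lines and trailing comments.
has_host_entry() {
  awk -v name="$2" '
    /^[[:space:]]*#/ { next }
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break
        if ($i == name) found = 1
      }
    }
    END { exit(found ? 0 : 1) }
  ' "$1"
}

# Example usage inside the Gateway appliance (not run here):
#   has_host_entry /etc/hosts apim.portal.example.com ||
#     echo '192.0.2.10 apim.portal.example.com' | sudo tee -a /etc/hosts
```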
Unable to Send Mail through External Mail Server
Symptom
You are unable to send mail through your external mail server. This occurs if your mail server is on the 172.18.0.0/16 subnet and you are using default settings on the docker_gwbridge network. You will not be able to reach your mail server as Docker uses the 172.18.0.0/16 subnet internally by default.
Sample log messages that accompany this situation are shown next:
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: WARNING: 4: Unable to send email: Could not connect to SMTP host: 172.18.5.22, port: 25. Exception caught!
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: May 03, 2018 7:35:44 AM com.l7tech.server.policy.assertion.alert.ServerEmailAlertAssertion
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: WARNING: 4: Unable to send email: Could not connect to SMTP host: 172.18.5.22, port: 25. Exception caught!
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: May 03, 2018 7:35:44 AM com.l7tech.server.policy.assertion.alert.ServerEmailAlertAssertion
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: WARNING: 4: Unable to send email: Unknown SMTP host: mail.ca.com. Exception caught!
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: May 03, 2018 7:35:44 AM com.l7tech.server.policy.assertion.ServerAuditDetailAssertion
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: INFO: -4: transactionId:,sessionId:,requestId:00000163210b978a-13b20,username:,statusCode:500,domain:
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: WARNING: 3016: Request routing failed with status 600 (Assertion Falsified)
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: May 03, 2018 7:35:44 AM com.l7tech.server.message
May 03 13:05:44 yourserver.yourdomain.com dockerd[23046]: WARNING: Message was not processed: Assertion Falsified (600)
Solution
Remove and re-create the docker_gwbridge network with custom settings using a subnet that is not in use on your network:
  1. Stop the API Portal:
    docker stack rm portal
  2. Delete the existing docker_gwbridge interface:
    docker network rm docker_gwbridge
  3. (Optional) If you receive errors while running the previous command, this may be caused by a Docker bug that prevents the docker_gwbridge network from being properly removed. Run the following command, then repeat step 2:
    docker network disconnect --force docker_gwbridge gateway_ingress-sbox
  4. Re-create the docker_gwbridge network using custom settings. In this example, you assign the 172.20.0.0/16 subnet to the docker_gwbridge network with a default gateway address of 172.20.0.1:
    docker network create \
      --subnet 172.20.0.0/16 \
      --opt com.docker.network.bridge.name=docker_gwbridge \
      --opt com.docker.network.bridge.enable_icc=false \
      --opt com.docker.network.bridge.enable_ip_masquerade=true \
      --gateway 172.20.0.1 \
      docker_gwbridge
  5. Restart the Portal:
    sudo ./portal.sh
Info: For more information about the docker_gwbridge network, see How do I change the docker gwbridge address?
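To decide whether this procedure applies, compare the address of your mail server with the subnet currently assigned to docker_gwbridge. The sketch below covers only the default 172.18.0.0/16 case described in the symptom. in_default_gwbridge_subnet is a hypothetical helper that does a simple prefix match, not general CIDR arithmetic.

```shell
# Hypothetical helper: return 0 if the IPv4 address $1 lies in
# 172.18.0.0/16, the default docker_gwbridge subnet named above.
in_default_gwbridge_subnet() {
  case "$1" in
    172.18.*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example usage (not run here):
#   in_default_gwbridge_subnet 172.18.5.22 && echo "mail server conflicts with docker_gwbridge"
#
# To print the subnet that docker_gwbridge currently uses:
#   docker network inspect docker_gwbridge --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
```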