Services

Use the multi-tenant services to monitor and stabilize the resources that your environment utilizes. Using the
Services
option in the left-navigation pane, you can view the resource consumption of individual services and service instances. Debug actions and scaling options are also available for selected services. For more information on deployed services, see Deployed Services Reference.
Overview of the Services
The
Services
page lists all the services that are deployed and the number of instances for each of the services.
The following options apply to the
Services
page:
  • Active only
    : View all or only active services.
  • Refresh:
    Refresh the services data.
  • Upgrade All Services:
    This option is enabled when at least one service is available for upgrade. For more information, see Upgrade DX Platform.
At the service-level, this page provides the following information:
  • Service name
  • Number of instances running for the service.
  • Service status:
    Running OK
    /
    Warning
    /
    Failure
    When the service status is
    Warning
    , it typically means that manual changes to the deployment were made outside of the DX Cluster Management console, for example, from the Kubernetes or OpenShift console. Click the
    Warning
    message to view the details of the manual changes.
  • Utilization of the system by all the instances of that service in the last eight minutes. We recommend scaling up the service if the utilization is above 80%.
  • Resources utilization (CPU Core and RAM) by all the instances of that service in the last eight minutes. Click the graph to view the metrics.
  • Click the information icon for a service instance to view the instance deployment details.
  • Download logs for a service instance.
  • Delete the instance of a stateless service. Currently, this is applicable to the API Gateway service instance.
  • Options to scale services by adding instances or configure deployment to add resources.
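The 80% utilization recommendation in the list above can be expressed as a simple check. The following is a minimal sketch only: the function name and the sample utilization values are hypothetical, and the product does not necessarily expose utilization data in this form.

```python
# Threshold taken from the recommendation above: scale up when utilization is above 80%.
SCALE_UP_THRESHOLD_PERCENT = 80.0


def services_to_scale(utilization_by_service):
    """Return the names of services whose utilization is above the threshold.

    utilization_by_service maps a service name to its utilization in percent.
    This helper is hypothetical and only illustrates the rule.
    """
    return sorted(
        name
        for name, percent in utilization_by_service.items()
        if percent > SCALE_UP_THRESHOLD_PERCENT
    )


# Illustrative values only, not measurements from a real cluster.
sample = {"NASS": 91.5, "CloudGW": 42.0, "Card": 80.0}
print(services_to_scale(sample))
```

In this sketch, a service at exactly 80% is not flagged, because the recommendation applies to utilization above 80%.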
The following image provides an overview of the
Services
page, various actions available for a service and an instance, and how to view the metrics.
Services Page
Services that begin with
AXA
are available for use only if
DX App Experience Analytics
is enabled. For more information about the deployed services, see the Deployed Services Reference page.
Scalability of Services and Recommendations
For selected services, you can add instances or resources to scale the services. Services such as
Tenants
and
Cluster Management
have only one instance and therefore do not scale.
ACC Sharding
The ACC sharding feature allows you to handle and process large amounts of tenant data (bundles, packages, agents, reports, jobs...) by spreading the load across horizontal partitions (also known as shards), rather than storing it on a single server. For more information, see ACC Sharding.
Services and their Configurability
The table below indicates the list of services and their configurability.
Table Legend
:
  • Partitioning
    : the entity by which the data is distributed across different instances.
  • Utilization Metric Available
    : if marked
    Yes
    , you can view a utilization metric for the service in
    Metric Browser
    . Utilization value allows you to plan for a scale-up of the service.
  • Data Retention Configuration
    : if marked
    Yes
    , you can configure data retention for the service. For more information on configuring data retention, see Configure Cluster Settings.
  • Data Rebalancing
    : if marked
    Yes
    , the service supports the automatic rebalancing of data after scale-up.
  • Stateless
    : services can have multiple instances and they share the same database. If marked
    Yes
    , an instance can be deleted, except for the instance with ID
    1
    .
| Service Name | Partitioning / Shard Size | Utilization Metric Available | Data Retention Configuration | Data Rebalancing | Stateless |
|---|---|---|---|---|---|
| NASS | Agent | Yes | Yes | Yes | No |
| Metadata | Tenant | No | No | No | No |
| TAS | Tenant | No | Yes | No | No |
| States | Tenant | No | Yes | No | No |
| Card | 32 vertex groups per Tenant | Yes | Yes | No | No |
| Tracestore | 32 vertex groups per Tenant | Yes | N/A | Yes | No |
| CloudGW | N/A | Yes | N/A | N/A | Yes |
| Assisted Triage | Tenant | No | Yes | No | No |
| API Gateway | N/A | No | N/A | N/A | Yes |
| ATC | N/A | No | N/A | N/A | No |
| OI Metric Publisher | Agent / Source | No | No | No | No |
| ACC | Tenant | Yes | No | No | Yes |
Impact of Scaling the Service
When you scale the services, the following implications are observed:
  • Partitioning - Agents
    : Because data is stored based on the agent ID, you must add instances before new agents connect. Agents that are already connected continue to send data to their original instance.
  • Partitioning - Tenant
    : The data of one tenant is always stored in one instance, so you must add instances before you add a new tenant. For large tenants, use vertical scaling if it is available for the service, for example, Card, TAS, or Assisted Triage.
  • Partitioning - N/A
    : The service can be scaled up at any time, and the load is balanced automatically.
  • 32 instances/shards per Tenant
    : 32 instances/shards of the Card service are supported for a tenant. Each shard can handle up to 32 instances of the Card service.
  • 32 agent groups per Tenant
    : 32 agents are supported in a group for a tenant.
Recommended Vertical and Horizontal Scaling
The table below indicates the services that you can scale vertically and/or horizontally.
| Service | Horizontal Scaling | Vertical Scaling | Notes |
|---|---|---|---|
| TAS | Yes | Yes | For large tenants, implement vertical scaling. For many small tenants, implement horizontal scaling. |
| NASS | Yes | Yes | NASS distributes metric storage evenly. Horizontal scaling increases the ingestion rate capacity. Vertical scaling increases cache sizes so that queries run faster. |
| Metadata | Yes | Yes | For one or a few large tenants, we recommend vertical scaling. For many small tenants, we recommend horizontal scaling. |
| States | Yes | Yes | For one or multiple large tenants, implement vertical scaling. For many small tenants, implement horizontal scaling. |
| CARD | Yes | Yes | |
| Tracestore | No | No | |
| CloudGW | Yes | No | CloudGW handles traffic from all tenants and supports only horizontal scaling. It has a soft connection limit of 3000, which maps to 100% utilization. The default hard connection limit is 3500, beyond which connections are refused. The number of instances should be greater than or equal to the expected number of agents from all tenants / (0.8 * soft connection limit). For high availability, scale horizontally so that if one CloudGW instance fails, the others have enough connection capacity to take over immediately. |
| Assisted Triage | Yes | Yes | For small tenants, implement horizontal scaling with the default configuration. For large tenants, use vertical scaling only for the required instances. Assess the heap usage metric of the instance to determine when to apply vertical scaling. Vertical scaling allocates more memory for the instance. |
| API Gateway | Yes | No | API Gateway handles requests from all tenants and supports only horizontal scaling. The server has a hard limit of 500 simultaneous connections (threads). The load primarily depends on the expected number of simultaneous Team Center UI and Cluster Management Console clients, and on the number and size of tenant instances. We recommend at least two instances to ensure fast failover of requests. Horizontal scaling is recommended when clients experience frequent response or connection timeouts. |
| ATC | Yes | No | |
| OI Metric Publisher | Yes | No | One instance of the OI Metric Publisher service can handle 1M metrics. When there are more metrics, you can create additional instances to redistribute the load. You can distribute the load across multiple instances by splitting the tenants or the regular expression defined in the respective configuration files. For more information on configuring the OI Metric Publisher service, see Configure the OI Metric Publisher Service. |
| ACC | Yes | No | ACC supports horizontal scaling by adding a new partition, and by adding an instance (replica) to an existing partition. See ACC Sharding. |
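The CloudGW sizing rule and the OI Metric Publisher capacity figure from the table above are simple arithmetic. The sketch below shows how a minimum instance count could be estimated from them. The function names and parameters are hypothetical and are not part of the product. The defaults (soft limit of 3000 connections, 80% target utilization, 1M metrics per instance) are taken from the table notes.

```python
def min_cloudgw_instances(expected_agents, soft_limit=3000, target_utilization_percent=80):
    """Estimate the minimum CloudGW instance count.

    Implements: instances >= expected agents / (0.8 * soft connection limit).
    Integer arithmetic is used so the result is an exact ceiling.
    """
    # Usable connections per instance, multiplied by 100 to avoid floats.
    capacity_x100 = soft_limit * target_utilization_percent
    # Ceiling division written with integer floor division.
    needed = -(-expected_agents * 100 // capacity_x100)
    return max(1, needed)


def min_oi_metric_publisher_instances(total_metrics, metrics_per_instance=1_000_000):
    """Estimate the minimum OI Metric Publisher instance count (1M metrics each)."""
    needed = -(-total_metrics // metrics_per_instance)
    return max(1, needed)


# Example inputs are illustrative only.
print(min_cloudgw_instances(5000))
print(min_oi_metric_publisher_instances(2_500_000))
```

With the defaults, each CloudGW instance is planned for 2400 connections, so 5000 expected agents require three instances. This estimate does not include the extra capacity recommended for high availability; adding spare instances on top of the result is a separate planning decision.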
Add Instances to Scale the Services Horizontally
If a service exhausts its resources, you can scale up the service by increasing the number of instances. You can add the instances using the
Add Instance
button under
Actions.
This button is displayed only for services that can be scaled up.
Follow these steps:
  1. In the
    Services
    page, identify the service that you want to scale and then click
    Add Instance
    .
  2. In the confirmation dialog, click
    Confirm
    to add another instance.
    The confirmation dialog displays details of the resources that will be added. In addition, the resource requirement is checked and estimated, and a message indicates whether additional hardware is required to scale up.
    When you add an instance, the
    Add Instance
    button changes to
    In progress
    and the instance is added to the list of services with the status as red. Click the
    In progress
    button to view the details of the job progress. Once the job is complete, the button changes to
    Instance added
    and the added instance displays the status as green.
ACC supports horizontal scaling by adding a new partition, and by adding an instance (replica) to an existing partition. See ACC Sharding.
Configure the Deployment to Scale up the Application Vertically
Use the
Configure deployment
option to scale the application vertically. Vertical scaling improves the performance of all the service instances. The
Configure deployment
option is available only for a few services such as TAS and Card.
Follow these steps:
  1. In the
    Services
    page, navigate to the service that displays
    Ellipses
    .
  2. Click
    Ellipses
    and then click
    Configure deployment
    .
  3. Select the
    Deployment Type
    . The size of each deployment type may vary depending on the service that you configure.
    • Default size (5GB)
    • Middle size (8GB)
    • Large size (12GB)
    When you select the deployment type, the resource requirement is checked and estimated, and a message indicates whether additional hardware is required to scale up.
  4. Click
    Redeploy
    .
ACC does not support vertical scaling. Instead, you can horizontally scale an existing partition.
Recommended Vertical Scaling
The table below indicates the vertical scaling recommendations.
| Service | Utilization Metric/Indicator | Maximum Recommendation for Default | Maximum Recommendation for Middle | Maximum Recommendation for Large | Notes |
|---|---|---|---|---|---|
| NASS | % Metrics - Shows utilization of metric ingestion rate that is supported | 100 % Metrics | 100 % Metrics | 100 % Metrics | Vertical scaling improves query response time as it increases the size of memory caches and not the capacity. The capacity is increased by horizontal scaling. Query cache sizes are: 700MB; 3GB; 9GB |
| Metadata | | | | | |
| TAS | % Topology - Shows current utilization of topology size that needs to fit into a cache to provide fast query speed | 100 % Topology | 100 % Topology | 100 % Topology | Vertical scaling increases the cache size and also the capacity for tenants (size of tenant). Cache sizes are: 800MB; 3.7GB; 10GB |
| States | % CPU | 400% | N/A | N/A | Horizontal scaling increases the capacity for more tenants. |
| CARD | % Cards | 100 % Cards | 100 % Cards | 100 % Cards | |
| Tracestore | % Transactions | 100 % Transactions | N/A | N/A | Transaction processing is evenly distributed. Horizontal scaling increases the capacity for transaction processing. |
| CloudGW | N/A | N/A | N/A | N/A | Cloud Gateway does not support vertical scaling. |
| Assisted Triage | Verify the following metrics: Memory Heap Max and Memory Heap Used | 2G | 5G | 11G | Capacity is increased by horizontal scaling. For multiple tenants or large tenants that are attached to a particular instance, implement vertical scaling to increase allocated memory. |
| API Gateway | N/A | N/A | N/A | N/A | API Gateway does not support vertical scaling. |
| ATC | % Processor | 200% CPU | | | |
| OI Metric Publisher | N/A | N/A | N/A | N/A | OI Metric Publisher does not support vertical scaling. |
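As an illustration of how the Assisted Triage row could be used, the sketch below picks the smallest deployment type whose maximum recommendation covers an observed Memory Heap Used value. It assumes that the 2G, 5G, and 11G figures are upper bounds on heap usage for the Default, Middle, and Large deployment types; that reading of the table, and the function name, are assumptions rather than documented behavior.

```python
# Maximum recommended heap usage per deployment type, in GB, as listed for
# Assisted Triage in the table above (interpretation is an assumption).
MAX_HEAP_USED_GB = [("Default", 2), ("Middle", 5), ("Large", 11)]


def smallest_fitting_deployment(heap_used_gb):
    """Return the smallest deployment type that covers the heap usage.

    Returns None when the usage exceeds the Large recommendation. In that
    case vertical scaling alone is not enough under this interpretation.
    """
    for name, limit_gb in MAX_HEAP_USED_GB:
        if heap_used_gb <= limit_gb:
            return name
    return None


# Illustrative value only.
print(smallest_fitting_deployment(4.0))
```

A result of None suggests looking at horizontal scaling instead, which the table notes describe as the way to increase capacity.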
Monitor and Debug Service Instances
You can use the debug options to monitor service instances and investigate potential service-related issues.
The
Debug
option is not available for all services.
Follow these steps:
  1. In the
    Services
    page, identify the service that you want to debug.
  2. Expand the service to view the service instances.
  3. Identify the instance that you want to debug.
  4. Under
    Actions
    , click
    Ellipses
    to display the debug menu.
  5. Click a debug action from the menu.
    • Download Logs:
      Enables you to download the following logs:
      • Service Log
        : You can switch from
        Info Level
        to
        Debug Level
        logging for 5 minutes.
      • Access Log
        : You can enable
        Success and Redirect requests
        logging for 5 minutes.
      You can click
      Download
      at any time while debug logging is enabled. Debug logging stops after five minutes by default.
    • Download Thread Dump:
      Downloads a Java thread dump.
    • Download Service Dump:
      Downloads service-specific data in JSON format.
      This action is only available for selected services.
Delete Stateless Service Instance
Using the
Actions
menu for an instance, you can delete a stateless service instance. A stateless service is one that does not preserve any state in the database. Currently, only
API Gateway
is the stateless service for which you can delete an instance. To delete an instance, the service must have more than one instance. The instance with ID 1 cannot be deleted.
Follow these steps:
  1. In the
    Services
    page, open the API Gateway service instance list.
  2. Click the
    Ellipses
    icon under
    Actions
    and click
    delete instance
    .
  3. In the text-box, type
    delete
    .
  4. Click
    Delete
    to confirm the instance deletion.
  5. Click the In progress button to view the deletion job progress.
    Once the job is complete, the instance is deleted and removed from the list of instances. You can now view this activity in the
    Activity Log
    .
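The deletion constraints described in this section can be summarized as a small eligibility check. This is a sketch only: the function, its parameters, and the use of an integer instance ID are hypothetical and are meant to restate the rules, not to describe how the console implements them.

```python
# Per this section, API Gateway is currently the only stateless service
# for which an instance can be deleted.
DELETABLE_SERVICES = {"API Gateway"}


def can_delete_instance(service_name, instance_id, instance_count):
    """Return True if the instance may be deleted under the rules above.

    Rules restated from the text:
      - the service must currently support instance deletion,
      - the service must have more than one instance,
      - the instance with ID 1 cannot be deleted.
    """
    if service_name not in DELETABLE_SERVICES:
        return False
    if instance_count <= 1:
        return False
    return instance_id != 1


print(can_delete_instance("API Gateway", 2, 3))
```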