Migrate the Data Repository

To migrate the Data Repository, install Vertica on the new cluster and migrate the data. The following situations might require migration:
  • You are upgrading to Red Hat Enterprise Linux 6.x.
  • The current database hardware no longer meets sizing requirements.
  • You are moving from virtual machines to physical hardware for the database.
  • You are moving data from another environment.
This migration procedure minimizes Data Repository downtime.
Prepare to Install the Destination Data Repository
Before you install Vertica on the new cluster, prepare the environment for the installation. The new destination cluster must have the same number of nodes as the source cluster, and have the same version of Vertica installed. For more information about preparing the destination cluster, see Prepare to Install the Data Repository.
The node names for the new cluster must match the node names from the source cluster.
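For example, you can confirm that the node names and the installed Vertica versions match before you continue. The following commands are a sketch only; run them on each source node and on its corresponding destination node and compare the output (the exact package name can vary by release):
  hostname
  rpm -qa | grep -i vertica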
Install Vertica on the New Cluster
The installation process is the same as a normal Data Repository installation. For more information, see Install the Data Repository.
Use the same configuration for the new cluster as for the source cluster. For example, the Vertica version, node count, database name, database administrator user, catalog directory, and data directory must be the same as in the original Data Repository.
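One way to record the source configuration before you install the destination cluster is to query the Vertica catalog on the source cluster. This is a sketch that assumes the dauser account and the vsql path that appear elsewhere in this procedure:
  /opt/vertica/vsql -U dauser -c "SELECT version();"
  /opt/vertica/vsql -U dauser -c "SELECT node_name, catalog_path FROM nodes;"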
Migrate the Cluster
After you install the database on the new cluster, migrate the existing data. The migration uses the copy cluster command, which simultaneously backs up the existing database and restores the data to the new cluster. Copy cluster copies all data that was in the Data Repository before you run the command. Because CA Performance Management continues to collect data during the migration, the process requires multiple runs of the copy cluster command.
Create a Configuration File for Copy Cluster
The copy cluster command requires a configuration file that includes the necessary information.
Example:
The following example configuration file is set up to copy a database on a three node cluster (v_vmart_node0001, v_vmart_node0002, and v_vmart_node0003) to another cluster consisting of nodes: test-host01, test-host02, and test-host03:
The dbName parameter is case-sensitive.
[Misc]
snapshotName = CopyVmart
verticaConfig = False
restorePointLimit = 5
tempDir = /tmp/vbr
retryCount = 5
retryDelay = 1
[Database]
dbName = vmart
dbUser = dradmin
dbPassword = password
dbPromptForPassword = False
[Transmission]
encrypt = False
checksum = False
port_rsync = 50000
bwlimit = 0
[Mapping]
; backupDir is not used for cluster copy
v_vmart_node0001= test-host01
v_vmart_node0002= test-host02
v_vmart_node0003= test-host03
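The copycluster task transfers data from each source node to the destination node it is mapped to, over SSH and rsync. Before you run the task, you can verify that the database administrator account on each source node can reach its destination host without a password prompt. A quick check, using the hypothetical host names from the example mapping above:
  ssh test-host01 hostname
  ssh test-host02 hostname
  ssh test-host03 hostname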
Stop the Target Database
Before you start the migration, shut down the database on the target cluster.
Follow these steps:
  1. Log in to the target database cluster as the database admin user.
  2. Open the Vertica Admin Tools:
    /opt/vertica/bin/adminTools
  3. Select (4) Stop Database. Wait for the shutdown to complete before you run copy cluster.
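If you prefer to stop the target database without the interactive menu, adminTools also accepts the task on the command line. A minimal sketch, assuming the example database name vmart; the tool prompts for the database password if one is set:
  /opt/vertica/bin/adminTools -t stop_db -d vmart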
Copy Historical Data
After you install the database on the new cluster, copy the data from the existing database. The copy cluster command copies all information from before you initiate the command. New data continues to come in while the command is running, but the command does not copy this data. The target cluster must be stopped before you invoke copy cluster.
Follow these steps:
  1. Log in to the source cluster with the database administrator account.
  2. Run the copy cluster command:
    vbr.py --task copycluster --config-file CopyClusterConfigurationFile.ini
    The command copies the historical data for the database and displays the following message:
    > vbr.py --config-file CopyVmart.ini --task copycluster
    Preparing...
    Copying...
    1871652633 out of 1871652633, 100%
    All child processes terminated successfully.
    copycluster done!
Verify the Copy of the Historical Data
After the copy cluster process completes, ensure the integrity of your data.
Follow these steps:
  1. Log in to the target database cluster as the database admin user.
  2. Open the Vertica Admin Tools:
    /opt/vertica/bin/adminTools
  3. Start the database.
  4. From any node in the cluster, open the Vertica SQL prompt:
    /opt/vertica/vsql -U dauser
  5. Run the following queries to verify the timestamp of these key database tables:
    SELECT to_timestamp(max(tstamp)) from dauser.reach_rate;
    SELECT to_timestamp(max(tstamp)) from dauser.ifstats_rate;
    The date and time must correspond to the time when you started the copy.
  6. Open adminTools, and stop the database.
Stop the Data Aggregator
To maintain the integrity of your data, stop the Data Aggregator before the final copy of the Data Repository. If the Data Collectors are running and polling when the Data Aggregator is stopped, polling continues. The Data Collectors queue the polled data for future delivery to the Data Aggregator.
Follow these steps:
  1. Log in to the Data Aggregator host as the root user or a sudo user.
  2. Use firewall rules to block traffic from all the Data Collectors. Run the following command for each Data Collector (or script the loop shown after these steps):
    iptables -A INPUT -s DC_IP -j DROP
    DC_IP specifies the IP address of the Data Collector.
  3. Open a command prompt and run one of the following commands:
    • Root user:
      service dadaemon stop
    • Sudo user:
      sudo service dadaemon stop
    The Data Aggregator stops.
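If you have many Data Collectors, you can apply the blocking rules in a loop. The IP addresses below are placeholders; substitute the addresses of your own Data Collectors:
  for DC_IP in 10.0.0.11 10.0.0.12 10.0.0.13; do
    iptables -A INPUT -s "$DC_IP" -j DROP
  done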
Copy Recent Data
After you stop the Data Aggregator, run the copy cluster command again to copy the recent data. This run copies only the new data that arrived after the initial copy.
Follow these steps:
  1. Log in to the source cluster with the database administrator account.
  2. Run the copy cluster command:
    vbr.py --task copycluster --config-file CopyClusterConfigurationFile.ini
    The command copies the recent data.
Verify the Copy of the Recent Data
To ensure the integrity of your data, verify the data.
Follow these steps:
  1. Log in to the target database cluster as the database admin user.
  2. Open the Vertica Admin Tools:
    /opt/vertica/bin/adminTools
  3. Start the database.
  4. From any node in the cluster, open the Vertica SQL prompt: 
    /opt/vertica/vsql -U dauser
  5. Run the following queries to verify the timestamp of these key database tables:
    SELECT to_timestamp(max(tstamp)) from dauser.reach_rate;
    SELECT to_timestamp(max(tstamp)) from dauser.ifstats_rate;
    The date and time must correspond to the time when you started the copy.
Update the Database Connection Information
To enable communication between the Data Aggregator and the new Data Repository cluster, update the database connection information.
Follow these steps:
  1. Log in to the Data Aggregator host.
  2. Open the following file:
    vi /opt/IMDataAggregator/apache-karaf-2.3.0/etc/dbconnection.cfg
  3. Update the following parameter with the hostnames of the new Data Repository cluster:
    dbHostNames=hostname1,hostname2,hostname3
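Before you restart the Data Aggregator, you can also confirm that each new Data Repository node is reachable from the Data Aggregator host on the Vertica client port (5433 by default). A sketch, assuming hypothetical host names and that the nc utility is installed:
  for host in hostname1 hostname2 hostname3; do
    nc -z -w 5 "$host" 5433 && echo "$host reachable"
  done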
Restart the Data Aggregator
After the migration is complete, restart the Data Aggregator.
Follow these steps:
  1. Log in to the Data Aggregator host as the root user or a sudo user.
  2. Open a command prompt and run one of the following commands:
    • Root user:
      service dadaemon start
    • Sudo user:
      sudo service dadaemon start
    The Data Aggregator starts and synchronizes with CA Performance Center and the Data Repository. Any queued, polled data on the Data Collectors is sent to the Data Aggregator. If the queued data exceeds the disk space limit that is configured on the Data Collector system, the oldest data is discarded, which results in a gap in the polled reporting data.
  3. Monitor the Data Aggregator restart process:
    1. Log in to the Data Aggregator host and navigate to the following directory:
      /opt/IMDataAggregator/performance-spool
    2. Verify that no DTO files exist with a size greater than zero (see the example command after these steps).
    3. Enable traffic from the SNMP Data Collector with the largest number of polled items:
      iptables -D INPUT -s DC_IP -j DROP
      The Data Aggregator starts schema validation and processing of cached and new polled data from this Data Collector.
    4. After the Data Aggregator system utilization decreases, enable traffic from the remaining SNMP Data Collectors.
    5. After the Data Aggregator system utilization decreases, enable traffic from the CAMM Data Collectors.
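One way to check the spool directory for non-empty files is with find; if the command returns no output, no DTO files with a size greater than zero remain:
  find /opt/IMDataAggregator/performance-spool -type f -size +0c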
Verify the Migration
After the Data Aggregator startup is complete, log in to CA Performance Center, and verify the following indicators:
  • Verify that the system status is good and that the Data Aggregator data source is available.
  • Verify the Last Polled On date and time.
  • Navigate to the Data Collector List and verify that all Data Collectors are up and collecting data.
  • Open the Infrastructure Overview dashboard, and verify that data is available for the following time ranges:
    • Last hour
    • Last 7 days