Install the Data Repository

Once you meet the prerequisites described in Prepare to Install the Data Repository, complete the installation as follows:

When Data Repository is installed, two users are created. A third user is created during the Data Aggregator installation. The following list provides information about these users:
  • New User Example: dradmin
    Password Example: drpass
    Operating System User Account? Yes
    Vertica Database User Account? Yes
    Notes: This user is the first user that you created when you installed Data Repository. When the dradmin user is created, a verticadba group is also created. The dradmin user is added to this group.
    Permissions: This user can run the Data Repository processes and the Administration Tools utility. This user owns Data Repository catalog files, data files, and so on.
  • New User Example: dauser
    Password Example: dbpassword
    Operating System User Account? No
    Vertica Database User Account? Yes
    Notes: The user that the Data Aggregator uses to interact with the database.
    Note: This user is created during the Data Aggregator installation.
Vertica includes a verticadba group for tighter control over filesystem access in the /opt/vertica/ directories. During the installation, the verticadba group is created, and existing users are added to the group with permissions set to 775. This setting grants full privileges to the verticadba group and read/execute privileges to all other users. The /opt/vertica/log and /opt/vertica/config directories are the folders with the modified permissions.
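As a rough illustration of the permission setting described above, the following shell sketch reports the mode and owning group of the two directories. It assumes GNU stat (as shipped with Red Hat Enterprise Linux); the function names dir_mode_group and has_mode are hypothetical helpers and are not part of the product.

```shell
#!/bin/sh
# Print "<octal mode> <group>" for a directory (assumes GNU stat).
dir_mode_group() {
    stat -c '%a %G' "$1"
}

# Succeed if the directory has the expected octal mode.
# $1 = directory, $2 = expected mode, for example 775
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Report on the two directories named above, if they exist on this host.
for d in /opt/vertica/log /opt/vertica/config; do
    if [ -d "$d" ]; then
        if has_mode "$d" 775; then
            echo "$d: $(dir_mode_group "$d") (as expected)"
        else
            echo "$d: $(dir_mode_group "$d") (expected mode 775)"
        fi
    fi
done
```

On a host where Vertica is not installed yet, the loop prints nothing because neither directory exists.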
Install the Database with Root User Passwordless SSH
To set up the Data Repository, you can install and configure the Vertica database as the root user or sudo user that has or will have passwordless SSH configured.
In a cluster installation, initiate the Data Repository installation from any of the hosts that participate in the cluster. The required software components are pushed to the additional nodes during the installation.
Follow these steps:
  1. Log in to any host in the data repository cluster as the root user.
  2. Copy the installDR.bin file locally.
  3. Change permissions for the installation file by typing the following command:
    chmod u+x installDR.bin
  4. To extract the installation file as the root user, type the command:
    ./installDR.bin
  5. To extract the installation file as the sudo user, type the following command:
    sudo ./installDR.bin
    The installDR.bin file does not install Data Repository. This file extracts the Data Repository rpm, the license file, and the three installation scripts. You install Data Repository later in this procedure.
  6. Follow the instructions in the console.
  7. When prompted, specify the directory into which to extract the Data Repository installation package and Vertica license file. The default installation directory is /opt/CA/IMDataRepository_verticaVersion/, where Version is a placeholder for the version. Press Enter twice.
    The script generates WARN messages for any LVM present in the environment. For help, contact CA Support.
    The Data Repository installation package, license file, and associated setup scripts are extracted to the chosen directory.
  8. Adjust the following parameters in the drinstall.properties file to reflect your installation-specific values. This file applies to dr_validate.sh and dr_install.sh. The drinstall.properties file exists in the installation directory you specified previously.
    • DbAdminLinuxUser=
      The Linux user that is created to serve as the Vertica database administrator
      Default: dradmin
    • DbAdminLinuxUserHome=
      The Vertica Linux database administrator user home directory
      Default: /export/dradmin
      This directory is created if the Vertica installer creates the user. Be sure that the parent directory of the home directory already exists on the system. For example, if you are using /export/dradmin, be sure that /export exists.
    • DbDataDir=
      The location of the data directory
      Default: /data
      Do not use the Logical Volume Manager (LVM) for the data directory.
    • DbCatalogDir=
      The location of the catalog directory
      Default: /catalog
      Do not use the Logical Volume Manager (LVM) for the catalog directory.
    • DbHostNames=
      The comma-delimited list of hostnames for Data Repository
      Default: yourhostname1,yourhostname2,yourhostname3
    • DbName=
      The database name
      Default: drdata
      This parameter is case-sensitive.
    • DbPwd=
      The database password
      Default: dbpass
      The database password that you define here is used during the installation of the Data Aggregator. You can use special characters (except for single quotation marks) in passwords. To use special characters, enclose the password in single quotation marks (for example, DbPwd='test$tr|ng'). If the DbPwd property is not found or is blank, the script prompts for this information at runtime.
  9. Run the validation script. This script verifies the OS settings and modifies the settings if necessary. To run the validation script as the root user, type the following command:
    ./dr_validate.sh -p properties_file
    The validation script establishes SSH without a password for the root user across all hosts in a cluster. If SSH without a password does not exist for the root account, you are prompted for a password. You are sometimes prompted multiple times.
    You can use the -l flag to allow localhost as the value for the DbHostNames property. You can use the -n flag to skip database connectivity checks.
  10. Review any on-screen output for failures or warnings. You can run this script multiple times after you fix any failures or warnings. The script automatically corrects many failures or warnings. Proceed only if the final status is "PASSED". If the final status is not "PASSED", contact CA Support.
    The validation script may ask you to reboot.
    The validation script and the installation script generate a log file in installation_directory/logs on the Data Repository host from which you run the scripts. These log files include the step-by-step output of the scripts. To confirm whether a script run succeeded or failed, review the script output.
    The following example shows the script output and lists what settings the script verifies and changes:
    Log File: logs/install_log_validate_10-29-2015_11-14-11.log
    ===============================================================================
    Checking Passwordless SSH to all hosts: verticahost-dr
    ===============================================================================
    Passwordless SSH from verticahost-dr to [email protected] ...................[ OK ]
    ===============================================================================
    Beginning Data Repository Prerequisite Compliance Enforcement on host verticahost-dr
    ===============================================================================
    Red Hat Enterprise Linux Major Release: 6 ..............................[ OK ]
    Processor Type: Intel ...................................................[ OK ]
    CPU frequency scaling not available on this system ......................[ OK ]
    DR Administrative User dradmin does not exist. It will be created during vertica installation. [ OK ]
    Maximum number of file handles >= 65536 .................................[ OK ]
    Detected incorrect maximum number of memory maps ........................[WARN]
    Set maximum number of memory maps to Total Mem(KB)/16 ...................[ OK ]
    Detected incorrect page reclaim threshold value .........................[WARN]
    Set page reclaim threshold value to 7924 ................................[ OK ]
    Disabling necessary firewall settings. ..................................[ OK ]
    Enabling NTP daemon. ....................................................[ OK ]
    Starting the NTP daemon. ................................................[ OK ]
    Detected incorrect readahead parameter for /dev/sda .....................[WARN]
    Set readahead parameter for /dev/sda to 2048 ............................[ OK ]
    Block Size for /dev/sda is 4096 .........................................[ OK ]
    Readahead parameter for /dev/sda1 is 2048 ...............................[ OK ]
    Block Size for /dev/sda1 is 1024. Expected value >= 4096 ...............[WARN]
    Readahead parameter for /dev/sda2 is 2048 ...............................[ OK ]
    Block Size for /dev/sda2 is 4096 ........................................[ OK ]
    Readahead parameter for /dev/sda3 is 2048 ...............................[ OK ]
    Block Size for /dev/sda3 is 4096 ........................................[ OK ]
    Detected incorrect swappiness setting ...................................[WARN]
    Set swappiness to 0 .....................................................[ OK ]
    Transparent hugepages in /sys/kernel/mm/redhat_transparent_hugepage/enabled are enabled [WARN]
    Disabled Huge Page Compaction ...........................................[ OK ]
    Huge Page Compaction Defrag in /sys/kernel/mm/redhat_transparent_hugepage/defrag is enabled [WARN]
    Disabled Huge Page Compaction Defrag ....................................[ OK ]
    Disk Scheduler for sda is not deadline ..................................[WARN]
    Set Disk Scheduler for sda to deadline ..................................[ OK ]
    Reloading sysctl.conf ...................................................[WARN]
    SELinux is disabled .....................................................[ OK ]
    Verifying Swap Space. ...................................................[ OK ]
    No Logical Volumes exist. ...............................................[ OK ]
    Root entry exists in /etc/sudoers file. .................................[ OK ]
    Verifying ext3 or ext4 filesystem used for data directory. ..............[ OK ]
    Verifying ext3 or ext4 filesystem used for catalog directory. ...........[ OK ]
    Fresh install of Vertica is being performed - skipping database connectivity testing.
    Data Repository Prerequisite Compliance Status on host verticahost-dr -- PASSED
    ===============================================================================
    Script finished - /user/home/verticahost/dr_validate.sh
    ===============================================================================
    If the installation fails early enough in the process, the log file may be available in the home directory of the root or sudo user.
  11. Run the installation script:
    ./dr_install.sh -p properties_file
    This script installs the Data Repository, creates the database, and disables unnecessary Vertica processes on all the hosts in the cluster.
    If the database administrator user does not already exist, the installation script creates the user and prompts you to assign a new password. If the database administrator user exists, but passwordless SSH is not set up, the script prompts for the password so that it can set up passwordless SSH.
    If the installation script returns a WARN message for LVM on directories that Vertica does not use, contact CA Support.
  12. Verify that Data Repository has been installed successfully by doing the following steps:
    1. To log in to the database server as the database administrator user, type the following command:
      su - dradmin
    2. Type the following command:
      /opt/vertica/bin/adminTools
    3. The Administration Tools dialog opens.
    4. Select (1) View Database Cluster State and then select OK or press Enter.
      The database name appears and the State is reported as UP.
    5. Select OK to acknowledge that the database is UP.
    6. Select (E) Exit and press Enter.
    If the database does not start automatically, select Start DB to start the database manually. If the database is not started, the Data Aggregator installation fails.
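The log review in the validation step can be partly scripted. The following shell sketch counts the [WARN] lines in a validation log and checks for the final PASSED status. It assumes that the log uses the same line format as the example output shown in this procedure; the function names count_warnings and validation_passed are hypothetical helpers and are not part of the product.

```shell
#!/bin/sh
# Count the lines in a validation log that carry a [WARN] marker.
count_warnings() {
    grep -F -c '[WARN]' "$1"
}

# Succeed only if the log reports a final compliance status of PASSED.
validation_passed() {
    grep -q 'Prerequisite Compliance Status on host .* -- PASSED' "$1"
}

# Optional usage: pass a log file path as the first argument.
log_file="${1:-}"
if [ -n "$log_file" ] && [ -f "$log_file" ]; then
    echo "Warnings: $(count_warnings "$log_file")"
    if validation_passed "$log_file"; then
        echo "Final status: PASSED"
    else
        echo "Final status: not PASSED, review the log before you continue"
    fi
fi
```

A warning count above zero does not by itself indicate a problem, because the script corrects many settings automatically. The final status is the value that determines whether to proceed.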
Install the Database with Sudo User Passwordless SSH Configured
To set up the Data Repository, you can install and configure the Vertica database as the sudo user.
This functionality is not supported on RHEL 6.x.
Follow these steps:
  1. Log in to each node in the data repository cluster as the sudo user.
  2. Copy the installDR.bin file locally.
  3. Change permissions for the installation file by typing the following command:
    chmod u+x installDR.bin
  4. To extract the installation file as the sudo user, type the following command:
    sudo ./installDR.bin
    The installDR.bin file does not install Data Repository. This file extracts the Data Repository rpm, the license file, and the three installation scripts. You install Data Repository later in this procedure.
  5. Follow the instructions in the console.
  6. When prompted, specify the directory into which to extract the Data Repository installation package and Vertica license file. When you install the Data Repository using the sudo user account with passwordless SSH, you must run the installation on each host in the cluster using the same location after extracting. The default installation directory is /opt/CA/IMDataRepository_verticaVersion/, where Version is a placeholder for the version. Press Enter twice.
    The script generates WARN messages for any LVM present in the environment. For help, contact CA Support.
    The Data Repository installation package, license file, and associated setup scripts are extracted to the chosen directory.
  7. Adjust the following parameters in the drinstall.properties file to reflect your installation-specific values. This file applies to dr_validate.sh and dr_install.sh. The drinstall.properties file exists in the installation directory you specified previously.
    • DbAdminLinuxUser=
      The Linux user that is created to serve as the Vertica database administrator
      Default: dradmin
    • DbAdminLinuxUserHome=
      The Vertica Linux database administrator user home directory
      Default: /export/dradmin
      This directory is created if the Vertica installer creates the user. Be sure that the parent directory of the home directory already exists on the system. For example, if you are using /export/dradmin, be sure that /export exists.
    • DbDataDir=
      The location of the data directory
      Default: /data
      Do not use the Logical Volume Manager (LVM) for the data directory.
    • DbCatalogDir=
      The location of the catalog directory
      Default: /catalog
      Do not use the Logical Volume Manager (LVM) for the catalog directory.
    • DbHostNames=
      The list of hostnames for Data Repository
      Default: yourhostname1,yourhostname2,yourhostname3
      For this step, list the local hostname only. You add all other nodes in a later step.
    • DbName=
      The database name
      Default: drdata
      This parameter is case-sensitive.
    • DbPwd=
      The database password
      Default: dbpass
      The database password that you define here is used during the installation of the Data Aggregator. You can use special characters (except for single quotation marks) in passwords. To use special characters, enclose the password in single quotation marks (for example, DbPwd='test$tr|ng'). If the DbPwd property is not found or is blank, the script prompts for this information at runtime.
  8. Run the validation script with the "-sp" command line argument on each node:
    sudo ./dr_validate.sh -sp properties_file
    You can use the -l flag to allow localhost as the value for the DbHostNames property. You can use the -n flag to skip database connectivity checks.
  9. Review any on-screen output for failures or warnings. You can run this script multiple times after you fix any failures or warnings. The script automatically corrects many failures or warnings. Proceed only if the final status is "PASSED". If the final status is not "PASSED", contact CA Support.
    The validation script may ask you to reboot.
    The validation script and the installation script generate a log file in installation_directory/logs on the Data Repository host from which you run the scripts. These log files include the step-by-step output of the scripts. To confirm whether a script run succeeded or failed, review the script output.
    The following example shows the script output and lists what settings the script verifies and changes:
    Log File: logs/install_log_validate_10-29-2015_11-14-11.log
    ===============================================================================
    Checking Passwordless SSH to all hosts: verticahost-dr
    ===============================================================================
    Passwordless SSH from verticahost-dr to [email protected] ...................[ OK ]
    ===============================================================================
    Beginning Data Repository Prerequisite Compliance Enforcement on host verticahost-dr
    ===============================================================================
    Red Hat Enterprise Linux Major Release: 6 ..............................[ OK ]
    Processor Type: Intel ...................................................[ OK ]
    CPU frequency scaling not available on this system ......................[ OK ]
    DR Administrative User dradmin does not exist. It will be created during vertica installation. [ OK ]
    Maximum number of file handles >= 65536 .................................[ OK ]
    Detected incorrect maximum number of memory maps ........................[WARN]
    Set maximum number of memory maps to Total Mem(KB)/16 ...................[ OK ]
    Detected incorrect page reclaim threshold value .........................[WARN]
    Set page reclaim threshold value to 7924 ................................[ OK ]
    Disabling necessary firewall settings. ..................................[ OK ]
    Enabling NTP daemon. ....................................................[ OK ]
    Starting the NTP daemon. ................................................[ OK ]
    Detected incorrect readahead parameter for /dev/sda .....................[WARN]
    Set readahead parameter for /dev/sda to 2048 ............................[ OK ]
    Block Size for /dev/sda is 4096 .........................................[ OK ]
    Readahead parameter for /dev/sda1 is 2048 ...............................[ OK ]
    Block Size for /dev/sda1 is 1024. Expected value >= 4096 ...............[WARN]
    Readahead parameter for /dev/sda2 is 2048 ...............................[ OK ]
    Block Size for /dev/sda2 is 4096 ........................................[ OK ]
    Readahead parameter for /dev/sda3 is 2048 ...............................[ OK ]
    Block Size for /dev/sda3 is 4096 ........................................[ OK ]
    Detected incorrect swappiness setting ...................................[WARN]
    Set swappiness to 0 .....................................................[ OK ]
    Transparent hugepages in /sys/kernel/mm/redhat_transparent_hugepage/enabled are enabled [WARN]
    Disabled Huge Page Compaction ...........................................[ OK ]
    Huge Page Compaction Defrag in /sys/kernel/mm/redhat_transparent_hugepage/defrag is enabled [WARN]
    Disabled Huge Page Compaction Defrag ....................................[ OK ]
    Disk Scheduler for sda is not deadline ..................................[WARN]
    Set Disk Scheduler for sda to deadline ..................................[ OK ]
    Reloading sysctl.conf ...................................................[WARN]
    SELinux is disabled .....................................................[ OK ]
    Verifying Swap Space. ...................................................[ OK ]
    No Logical Volumes exist. ...............................................[ OK ]
    Root entry exists in /etc/sudoers file. .................................[ OK ]
    Verifying ext3 or ext4 filesystem used for data directory. ..............[ OK ]
    Verifying ext3 or ext4 filesystem used for catalog directory. ...........[ OK ]
    Fresh install of Vertica is being performed - skipping database connectivity testing.
    Data Repository Prerequisite Compliance Status on host verticahost-dr -- PASSED
    ===============================================================================
    Script finished - /user/home/verticahost/dr_validate.sh
    ===============================================================================
    If the installation fails early enough in the process, the log file may be available in the home directory of the root or sudo user.
  10. Repeat the previous steps for each node.
  11. Go to the first node and edit the DbHostNames parameter in the drinstall.properties file to include all the nodes in the cluster.
  12. Run the installation script with the "-sp" command line argument:
    sudo ./dr_install.sh -sp properties_file
    To run the script as sudo, passwordless SSH (public key) must be set up for the sudo account between the Data Repository hosts. If passwordless SSH does not exist for the sudo account, you cannot proceed. For more information, see Prepare to Install the Data Repository.
    This script installs the Data Repository, creates the database, and disables unnecessary Vertica processes on all the hosts in the cluster.
    If the database administrator user does not already exist, the installation script creates the user and prompts you to assign a new password. If the database administrator user exists, but passwordless SSH is not set up, the script prompts for the password so that it can set up passwordless SSH.
    If the installation script returns a WARN message for LVM on directories that Vertica does not use, contact CA Support.
  13. Verify that Data Repository has been installed successfully by doing the following steps:
    1. To log in to the database server as the database administrator user, type the following command:
      su - dradmin
    2. Type the following command:
      /opt/vertica/bin/adminTools
    3. The Administration Tools dialog opens.
    4. Select (1) View Database Cluster State and then select OK or press Enter.
      The database name appears and the State is reported as UP.
    5. Select OK to acknowledge that the database is UP.
    6. Select (E) Exit and press Enter.
    If the database does not start automatically, select Start DB to start the database manually. If the database is not started, the Data Aggregator installation fails.
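In this procedure, the DbHostNames value changes from the local hostname (for validation) to the full node list (before installation). The following shell sketch shows one way to read and update that property from the command line. It assumes that drinstall.properties uses plain key=value lines and that the host names contain no "/" or "&" characters; the function names get_prop and set_db_hostnames are hypothetical helpers and are not part of the product.

```shell
#!/bin/sh
# Print the value of a key from a key=value properties file.
# $1 = properties file, $2 = key
get_prop() {
    sed -n "s/^$2=//p" "$1"
}

# Replace the DbHostNames value in a properties file.
# $1 = properties file, $2 = comma-delimited host list
# Assumption: host names contain no "/" or "&" characters.
set_db_hostnames() {
    tmp_file="$1.tmp.$$"
    sed "s/^DbHostNames=.*/DbHostNames=$2/" "$1" > "$tmp_file" &&
        cat "$tmp_file" > "$1" &&   # keep the original file mode and owner
        rm -f "$tmp_file"
}
```

For example, on the first node, set_db_hostnames drinstall.properties yourhostname1,yourhostname2,yourhostname3 rewrites the property, and get_prop drinstall.properties DbHostNames prints the new value so that you can confirm it before you run the installation script.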
(Optional) Secure Data Repository
To limit the users who can log in to the database to only the Data Repository administrative account and the root user, lock down the database.
Follow these steps:
  1. Modify the /etc/pam.d/sshd file by adding the following entry, for the PAM access module, after the "account required pam_nologin.so" entry:
    account required pam_access.so accessfile=/etc/security/sshd.conf
    If /etc/security/sshd.conf is missing, you must create it using the SSHD documentation.
  2. If the following line from the /etc/security/access.conf file exists, remove it:
    -:ALL EXCEPT database_admin_user root:LOCAL
    For example:
    -:ALL EXCEPT dradmin root:LOCAL
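The placement of the PAM entry in step 1 can be checked with a short script. The following shell sketch succeeds only if the pam_access entry appears after the pam_nologin entry in the file that you pass to it. It assumes that both entries start at the beginning of a line and use spaces or tabs as separators; the function name pam_access_after_nologin is a hypothetical helper.

```shell
#!/bin/sh
# Succeed if the pam_access entry for /etc/security/sshd.conf appears
# after the "account required pam_nologin.so" entry in the given file.
# $1 = PAM configuration file, for example /etc/pam.d/sshd
pam_access_after_nologin() {
    awk '
        /^account[ \t]+required[ \t]+pam_nologin\.so/ { seen_nologin = 1 }
        /^account[ \t]+required[ \t]+pam_access\.so[ \t]+accessfile=\/etc\/security\/sshd\.conf/ {
            if (seen_nologin) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, pam_access_after_nologin /etc/pam.d/sshd && echo "entry is in place" prints the message only when the entry is present and correctly ordered.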
Configure Log Rotation for Data Repository
Configure log rotation for Data Repository because the underlying log file (vertica.log) can grow substantially. The recommended configuration is a daily rotation with logs retained for 21 days.
Follow these steps:
  1. Log in to the database server for Data Repository as the database administrator user. Type the following command:
    su - dradmin
  2. Type the following command:
    /opt/vertica/bin/admintools -t logrotate -d database_name -r frequency -k number
    • -d
      Indicates the database name.
      This parameter is case-sensitive.
    • -r
      Specifies how often to rotate the daily logs.
      Values: daily, weekly, monthly
    • -k
      Specifies how many logs to keep according to the frequency. For example, if the frequency is weekly, a value of 3 keeps three weeks of daily log files.
      Example:
      /opt/vertica/bin/admintools -t logrotate -d drdata -r daily -k 14
  3. (Optional) To verify that the vertica.log rotation has been configured correctly, look at the new gzipped vertica.log files in the Vertica catalog directory for previous days. The log files use the following filename format:
    vertica.log.YYYYMMDD.gz
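To reduce typing errors in the logrotate arguments, the command line can be assembled by a small helper that rejects values outside the documented set. The following shell sketch only prints the command and does not run it. The function name logrotate_cmd is a hypothetical helper, and the value 21 in the usage note reflects the retention period recommended in this section.

```shell
#!/bin/sh
# Print the admintools logrotate command for the given settings.
# $1 = database name (case-sensitive)
# $2 = frequency: daily, weekly, or monthly
# $3 = number of logs to keep
logrotate_cmd() {
    case "$2" in
        daily|weekly|monthly) ;;
        *) echo "frequency must be daily, weekly, or monthly" >&2
           return 1 ;;
    esac
    case "$3" in
        ''|*[!0-9]*) echo "number of logs to keep must be a whole number" >&2
           return 1 ;;
    esac
    echo "/opt/vertica/bin/admintools -t logrotate -d $1 -r $2 -k $3"
}
```

For example, logrotate_cmd drdata daily 21 prints the command for the recommended configuration, which you can then review and run as the database administrator user.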
Set Up Automatic Backups of Data Repository
To preserve your data against failures, set up automatic backups of the Data Repository. For more information, see Back Up the Data Repository.