Install the Data Repository

Install the data repository using these instructions.
After you meet the prerequisites described in Prepare to Install the Data Repository, complete the installation as follows:

The data repository installation creates the following users:
  • dradmin
    Password example: drpass
    Operating system user account: Yes
    Vertica database user account: Yes
    Notes: This user, the database admin user, is the first user that the data repository installation creates. The installation also creates the verticadba group for tighter control over filesystem access in the /opt/vertica/ directories, and adds the dradmin user to this group.
    Permissions: Permissions on the /opt/vertica/log and /opt/vertica/config directories are set to 775. This setting grants full privileges to the verticadba group and read/execute privileges to all other users. This user can run data repository processes and the Administration Tools utility (adminTools). This user owns the data repository catalog files and data files.
  • dauser
    Password example: dbpassword
    Operating system user account: No
    Vertica database user account: Yes
    Notes: The data aggregator uses this user to connect to and interact with the database. The installation script creates this user during the data aggregator installation.
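After the installation finishes, the settings described for the dradmin user can be confirmed from a shell. The following is a minimal sketch rather than part of the product: the helper names perm_of, has_mode, and in_group are hypothetical, and stat -c assumes GNU coreutils, as found on RHEL.

```shell
#!/bin/sh
# Hypothetical helpers for checking the permissions described above.

# Print the octal permission bits of a path (GNU coreutils stat).
perm_of() {
    stat -c '%a' "$1"
}

# Succeed only if the path has exactly the expected octal mode.
has_mode() {
    [ "$(perm_of "$1")" = "$2" ]
}

# Succeed if the named user belongs to the named group.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

# Example usage (directory, user, and group names come from this page):
#   has_mode /opt/vertica/log 775    && echo "log directory OK"
#   has_mode /opt/vertica/config 775 && echo "config directory OK"
#   in_group dradmin verticadba      && echo "group membership OK"
```

The functions only read state, so they are safe to run on a production host.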
Install the Database
To set up the data repository, you install and configure the Vertica database as either the root user or the sudo user. Both procedures require passwordless SSH and are described in the following sections.
Install as Root User with Passwordless SSH Configured
In a cluster installation, initiate the data repository installation from any of the hosts that participate in the cluster. The installation pushes the required software components to the additional nodes.
Follow these steps:
  1. Log in to any host in the data repository cluster as the root user.
  2. Copy the installDR.bin file locally.
  3. Change permissions for the installation file by issuing the following command:
    chmod u+x installDR.bin
  4. Extract the installation file by issuing the following command:
    ./installDR.bin
    The installDR.bin file extracts the data repository RPM Package Manager (RPM) package, the license file, and the validation and installation scripts.
  5. Follow the instructions in the console.
  6. When prompted, specify the installation directory in which to extract the data repository installation package, the Vertica license file, and the associated setup scripts. The default installation directory is /opt/CA/IMDataRepository_verticaVersion/, where Version is a placeholder for the version number. Press the Return key twice.
    The installation script generates WARN messages for any Logical Volume Manager (LVM) present in the environment. For help, contact Support.
  7. Adjust the following parameters in the installation_directory/drinstall.properties file to reflect installation-specific values. This file applies to both the validation script (dr_validate.sh) and the installation script (dr_install.sh).
    • DbAdminLinuxUser
      The Linux user that is created to serve as the Vertica database administrator.
      Default: dradmin
    • DbAdminLinuxUserHome
      The home directory of the Vertica Linux database administrator user.
      Default: /export/dradmin
      This directory is created if the Vertica installer creates the user. Ensure that the parent directory of the home directory already exists on the system. For example, if you are using the /export/dradmin directory, be sure that the /export directory exists.
    • DbDataDir
      The location of the data directory.
      Default: /data
      Do not use LVM for this directory. Ensure that this directory is on a separate mount from the catalog directory. Separate mounts keep the data and catalog file systems from competing for space or I/O performance with each other or with any other disk usage.
    • DbCatalogDir
      The location of the catalog directory.
      Default: /catalog
      Do not use LVM for this directory. Ensure that this directory is on a separate mount from the data directory. Separate mounts keep the data and catalog file systems from competing for space or I/O performance with each other or with any other disk usage.
    • DbHostNames
      The comma-delimited list of hostnames for the data repository.
      Default: yourhostname1,yourhostname2,yourhostname3
    • DbName
      The database name.
      Default: drdata
      Case-sensitive: Yes
    • DbPwd
      The database password.
      Default: dbpass
      The dr_install.sh installation script uses this password during the installation of the data aggregator. You can use special characters (except for single quotation marks) in passwords. If the script does not find the DbPwd property or if it is blank, the script prompts for this information at runtime.
  8. Run the dr_validate.sh validation script. This script verifies the OS settings and modifies the settings if necessary. To run the validation script as the root user, issue the following command:
    ./dr_validate.sh -p properties_file
    The validation script establishes passwordless SSH for the root user across all hosts in a cluster. If passwordless SSH does not exist for the root account, you are prompted for a password. You might be prompted multiple times.
    You can use the -l flag to allow localhost as the value for the DbHostNames property. You can use the -n flag to skip database connectivity checks.
  9. Review any on-screen output for failures or warnings. The script automatically corrects many failures and warnings, and you can run it again after you fix any that remain. Proceed only if the final status is "PASSED". If the final status is not "PASSED", contact Support.
    The validation script might ask you to reboot.
    The validation script and the installation script each generate a log file in the installation_directory/logs directory on the data repository host from which you run the scripts. These log files include the step-by-step output of the scripts. To determine whether a script run succeeded or failed, review the script output.
    The following example shows the script output and lists what settings the script verifies and changes:
    Log File: logs/install_log_validate_10-29-2015_11-14-11.log
    ===============================================================================
    Checking Passwordless SSH to all hosts: verticahost-dr
    ===============================================================================
    Passwordless SSH from verticahost-dr to [email protected] ...................[ OK ]
    ===============================================================================
    Beginning Data Repository Prerequisite Compliance Enforcement on host verticahost-dr
    ===============================================================================
    Red Hat Enterprise Linux Major Release: 6 ..............................[ OK ]
    Processor Type: Intel ...................................................[ OK ]
    CPU frequency scaling not available on this system ......................[ OK ]
    DR Administrative User dradmin does not exist. It will be created during vertica installation. [ OK ]
    Maximum number of file handles >= 65536 .................................[ OK ]
    Detected incorrect maximum number of memory maps ........................[WARN]
    Set maximum number of memory maps to Total Mem(KB)/16 ...................[ OK ]
    Detected incorrect page reclaim threshold value .........................[WARN]
    Set page reclaim threshold value to 7924 ................................[ OK ]
    Disabling necessary firewall settings. ..................................[ OK ]
    Enabling NTP daemon. ....................................................[ OK ]
    Starting the NTP daemon. ................................................[ OK ]
    Detected incorrect readahead parameter for /dev/sda .....................[WARN]
    Set readahead parameter for /dev/sda to 2048 ............................[ OK ]
    Block Size for /dev/sda is 4096 .........................................[ OK ]
    Readahead parameter for /dev/sda1 is 2048 ...............................[ OK ]
    Block Size for /dev/sda1 is 1024. Expected value >= 4096 ...............[WARN]
    Readahead parameter for /dev/sda2 is 2048 ...............................[ OK ]
    Block Size for /dev/sda2 is 4096 ........................................[ OK ]
    Readahead parameter for /dev/sda3 is 2048 ...............................[ OK ]
    Block Size for /dev/sda3 is 4096 ........................................[ OK ]
    Detected incorrect swappiness setting ...................................[WARN]
    Set swappiness to 0 .....................................................[ OK ]
    Transparent hugepages in /sys/kernel/mm/redhat_transparent_hugepage/enabled are enabled [WARN]
    Disabled Huge Page Compaction ...........................................[ OK ]
    Huge Page Compaction Defrag in /sys/kernel/mm/redhat_transparent_hugepage/defrag is enabled [WARN]
    Disabled Huge Page Compaction Defrag ....................................[ OK ]
    Disk Scheduler for sda is not deadline ..................................[WARN]
    Set Disk Scheduler for sda to deadline ..................................[ OK ]
    Reloading sysctl.conf ...................................................[WARN]
    SELinux is disabled .....................................................[ OK ]
    Verifying Swap Space. ...................................................[ OK ]
    No Logical Volumes exist. ...............................................[ OK ]
    Root entry exists in /etc/sudoers file. .................................[ OK ]
    Verifying ext3 or ext4 filesystem used for data directory. ..............[ OK ]
    Verifying ext3 or ext4 filesystem used for catalog directory. ...........[ OK ]
    Fresh install of Vertica is being performed - skipping database connectivity testing.
    Data Repository Prerequisite Compliance Status on host verticahost-dr -- PASSED
    ===============================================================================
    Script finished - /user/home/verticahost/dr_validate.sh
    ===============================================================================
    If the dr_install.sh installation script fails early enough in the process, the log file might be available in the home directory of the root or sudo user.
  10. Run the dr_install.sh installation script by issuing the following command:
    ./dr_install.sh -p properties_file
    This script installs the data repository, creates the database, and disables unnecessary Vertica processes on all the hosts in the cluster.
    If the database administrator user does not already exist, the installation script creates the user and prompts you to assign a new password. If the database administrator user exists but passwordless SSH is not set up, the script prompts for the password so that it can set up passwordless SSH.
    If the installation script returns a WARN message for LVM on directories that Vertica does not use, contact Support.
  11. Verify that the installation script has installed the data repository successfully by doing the following steps:
    1. Log in to the database server as the database administrator user by issuing the following command:
      su - dradmin
    2. Issue the following command:
      /opt/vertica/bin/adminTools
      The Administration Tools dialog opens.
    3. Select (1) View Database Cluster State, and then select OK or press the Return key.
      The database name appears and the State is reported as UP.
    4. Select OK to acknowledge that the database is UP.
    5. Select (E) Exit, and then press the Return key.
    If the database does not start automatically, start it manually by selecting Start DB. If the database is not started, the data aggregator installation fails.
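Before running dr_install.sh, it can help to confirm the validation result from the log file instead of scrolling through the console output. The following sketch assumes that the log uses the same "Compliance Status on host ... -- PASSED" line and [WARN] markers as the sample output in step 9. The function names are hypothetical and are not part of the product.

```shell
#!/bin/sh
# Hypothetical helpers for reading a dr_validate.sh log file.

# Print the final status word from the last
# "Compliance Status on host <host> -- <STATUS>" line in the log.
final_status() {
    sed -n 's/.*Compliance Status on host .* -- \([A-Z]*\).*/\1/p' "$1" | tail -n 1
}

# Print the number of lines that the script flagged with [WARN].
warn_count() {
    grep -c '\[WARN\]' "$1"
}

# Succeed only when the log reports PASSED.
validation_passed() {
    [ "$(final_status "$1")" = "PASSED" ]
}

# Example usage (the log file name comes from the sample output):
#   log=logs/install_log_validate_10-29-2015_11-14-11.log
#   echo "warnings: $(warn_count "$log")"
#   validation_passed "$log" && ./dr_install.sh -p properties_file
```

The check treats any status other than PASSED as a reason to stop, which matches the guidance in step 9.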
Install as Sudo User with Passwordless SSH Configured
You set up the data repository by installing and configuring the Vertica database as the sudo user.
RHEL 6.x does not support this functionality.
Follow these steps:
  1. Log in to each node in the data repository cluster as the sudo user.
  2. Copy the installDR.bin file locally.
  3. Change permissions for the installation file by issuing the following command:
    chmod u+x installDR.bin
  4. Extract the installation file as the sudo user by issuing the following command:
    sudo ./installDR.bin
    The installDR.bin file extracts the data repository RPM, the license file, and the three installation scripts.
  5. Follow the instructions in the console.
  6. When prompted, specify the installation directory in which to extract the data repository installation package and Vertica license file. When you install the data repository as the sudo user with passwordless SSH, run the installation on each host in the cluster after extracting, and use the same location on every host.
    The default installation directory is /opt/CA/IMDataRepository_verticaVersion/, where Version is a placeholder for the version number.
    Press the Return key twice.
    The script generates WARN messages for any LVM present in the environment. For help, contact Support.
    The data repository installation package, license file, and associated setup scripts are extracted to the chosen directory.
  7. Adjust the following parameters in the installation_directory/drinstall.properties file to reflect installation-specific values. This file applies to both the validation script (dr_validate.sh) and the installation script (dr_install.sh).
    • DbAdminLinuxUser
      The Linux user that is created to serve as the Vertica database administrator.
      Default: dradmin
    • DbAdminLinuxUserHome
      The home directory of the Vertica Linux database administrator user.
      Default: /export/dradmin
      This directory is created if the Vertica installer creates the user. Be sure that the parent directory of the home directory already exists on the system. For example, if you are using the /export/dradmin directory, be sure that the /export directory exists.
    • DbDataDir
      The location of the data directory.
      Default: /data
      Do not use LVM for the data directory.
    • DbCatalogDir
      The location of the catalog directory.
      Default: /catalog
      Do not use LVM for the catalog directory.
    • DbHostNames
      The list of hostnames for the data repository.
      Default: yourhostname1,yourhostname2,yourhostname3
      List only the local hostname. You add all other nodes in a later step.
    • DbName
      The database name.
      Default: drdata
      Case-sensitive: Yes
    • DbPwd
      The database password. You can use special characters (except for single quotation marks) in passwords.
      Default: dbpass
      The dr_install.sh installation script uses this password during the installation of the data aggregator. If the script does not find the DbPwd property or if it is blank, the script prompts for this information at runtime.
  8. Run the validation script with the -sp command line argument on each node:
    sudo ./dr_validate.sh -sp properties_file
    Use the -l flag to allow localhost as the value for the DbHostNames property. Use the -n flag to skip database connectivity checks.
  9. Review the on-screen output for failures or warnings. The script automatically corrects many failures and warnings, and you can run it again after you fix any that remain. Proceed only if the final status is "PASSED". If the final status is not "PASSED", contact Support.
    The validation script might ask you to reboot.
    The validation and installation scripts each generate a log file in the installation_directory/logs directory on the data repository host from which you run the scripts. These log files include the step-by-step output of the scripts. To determine whether a script run succeeded or failed, review the script output.
    The following example shows the script output and lists what settings the script verifies and changes:
    Log File: logs/install_log_validate_10-29-2015_11-14-11.log
    ===============================================================================
    Checking Passwordless SSH to all hosts: verticahost-dr
    ===============================================================================
    Passwordless SSH from verticahost-dr to [email protected] ...................[ OK ]
    ===============================================================================
    Beginning Data Repository Prerequisite Compliance Enforcement on host verticahost-dr
    ===============================================================================
    Red Hat Enterprise Linux Major Release: 6 ..............................[ OK ]
    Processor Type: Intel ...................................................[ OK ]
    CPU frequency scaling not available on this system ......................[ OK ]
    DR Administrative User dradmin does not exist. It will be created during vertica installation. [ OK ]
    Maximum number of file handles >= 65536 .................................[ OK ]
    Detected incorrect maximum number of memory maps ........................[WARN]
    Set maximum number of memory maps to Total Mem(KB)/16 ...................[ OK ]
    Detected incorrect page reclaim threshold value .........................[WARN]
    Set page reclaim threshold value to 7924 ................................[ OK ]
    Disabling necessary firewall settings. ..................................[ OK ]
    Enabling NTP daemon. ....................................................[ OK ]
    Starting the NTP daemon. ................................................[ OK ]
    Detected incorrect readahead parameter for /dev/sda .....................[WARN]
    Set readahead parameter for /dev/sda to 2048 ............................[ OK ]
    Block Size for /dev/sda is 4096 .........................................[ OK ]
    Readahead parameter for /dev/sda1 is 2048 ...............................[ OK ]
    Block Size for /dev/sda1 is 1024. Expected value >= 4096 ...............[WARN]
    Readahead parameter for /dev/sda2 is 2048 ...............................[ OK ]
    Block Size for /dev/sda2 is 4096 ........................................[ OK ]
    Readahead parameter for /dev/sda3 is 2048 ...............................[ OK ]
    Block Size for /dev/sda3 is 4096 ........................................[ OK ]
    Detected incorrect swappiness setting ...................................[WARN]
    Set swappiness to 0 .....................................................[ OK ]
    Transparent hugepages in /sys/kernel/mm/redhat_transparent_hugepage/enabled are enabled [WARN]
    Disabled Huge Page Compaction ...........................................[ OK ]
    Huge Page Compaction Defrag in /sys/kernel/mm/redhat_transparent_hugepage/defrag is enabled [WARN]
    Disabled Huge Page Compaction Defrag ....................................[ OK ]
    Disk Scheduler for sda is not deadline ..................................[WARN]
    Set Disk Scheduler for sda to deadline ..................................[ OK ]
    Reloading sysctl.conf ...................................................[WARN]
    SELinux is disabled .....................................................[ OK ]
    Verifying Swap Space. ...................................................[ OK ]
    No Logical Volumes exist. ...............................................[ OK ]
    Root entry exists in /etc/sudoers file. .................................[ OK ]
    Verifying ext3 or ext4 filesystem used for data directory. ..............[ OK ]
    Verifying ext3 or ext4 filesystem used for catalog directory. ...........[ OK ]
    Fresh install of Vertica is being performed - skipping database connectivity testing.
    Data Repository Prerequisite Compliance Status on host verticahost-dr -- PASSED
    ===============================================================================
    Script finished - /user/home/verticahost/dr_validate.sh
    ===============================================================================
    If the installation fails early enough in the process, the log file might be available in the home directory of the root or sudo user.
  10. Repeat the previous steps for each node.
  11. Go to the first node and edit the DbHostNames parameter in the drinstall.properties file to include all the nodes in the cluster.
  12. Run the installation script with the -sp command line argument by issuing the following command:
    sudo ./dr_install.sh -sp properties_file
    To run the script as sudo, you must set up passwordless SSH (the public key) for the sudo account between the data repository hosts.
    For more information, see Prepare to Install the Data Repository.
    This script installs the data repository, creates the database, and disables unnecessary Vertica processes on all the hosts in the cluster.
    If the database administrator user (dradmin) does not already exist, the installation script creates the user and prompts you to assign a new password. If the database administrator user exists but passwordless SSH is not set up, the script prompts for the password so that it can set up passwordless SSH.
    If the installation script returns a WARN message for LVM on directories that Vertica does not use, contact Support.
  13. Verify that the data repository has been installed successfully by doing the following steps:
    1. To log in to the database server as the database administrator (dradmin) user, issue the following command:
      su - dradmin
    2. Issue the following command:
      /opt/vertica/bin/adminTools
      The Administration Tools utility (adminTools) opens.
    3. Select (1) View Database Cluster State, and then select OK or press the Return key.
      The database name appears and the State is reported as UP.
    4. Select OK to acknowledge that the database is UP.
    5. Select (E) Exit, and then press the Return key.
    If the database does not start automatically, select Start DB to start the database manually. If the database is not started, the data aggregator installation fails.
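The DbHostNames edit in step 11 can be scripted when a cluster has many nodes. The following sketch assumes that drinstall.properties stores each parameter as a key=value line with no spaces around the equals sign, which this page does not confirm. The helper names join_hosts and set_prop and the node names are hypothetical. Back up the properties file before trying it.

```shell
#!/bin/sh
# Hypothetical helpers for editing a key=value properties file.

# Join the host names given as arguments into a comma-delimited list.
join_hosts() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Rewrite one key in the file and leave every other line untouched.
# Usage: set_prop FILE KEY VALUE
set_prop() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -F= -v k="$key" -v v="$value" \
        '$1 == k { print k "=" v; next } { print }' "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example usage on the first node (node names are placeholders):
#   cp drinstall.properties drinstall.properties.bak
#   set_prop drinstall.properties DbHostNames "$(join_hosts node1 node2 node3)"
```

Writing the result back with cat keeps the ownership and permissions of the original file.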
(Optional) Secure the Data Repository
To limit the users who can log in to the database to only the data repository administrative account (dradmin) and the root user, lock down the database.
Follow these steps:
  1. Modify the /etc/pam.d/sshd file by adding the following entry, for the PAM access module, after the "account required pam_nologin.so" entry:
    account required pam_access.so accessfile=/etc/security/sshd.conf
    If the /etc/security/sshd.conf file is missing, create it using the SSHD documentation.
  2. If the following line exists in the /etc/security/access.conf file, remove it:
    -:ALL EXCEPT database_admin_user root:LOCAL
    For example:
    -:ALL EXCEPT dradmin root:LOCAL
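The PAM change in step 1 is easy to get wrong by hand, so it can be useful to preview the result first. The following sketch prints the modified content to standard output and never writes to the file it reads. The function name with_pam_access is hypothetical, and the sketch assumes that the pam_nologin.so entry starts with "account" followed by "required", separated by spaces or tabs.

```shell
#!/bin/sh
# The entry from step 1.
PAM_ENTRY='account required pam_access.so accessfile=/etc/security/sshd.conf'

# Print the given PAM file with PAM_ENTRY added after the
# "account required pam_nologin.so" line. If an "account required
# pam_access.so" line is already present, print the file unchanged.
with_pam_access() {
    if grep -Eq '^account[[:space:]]+required[[:space:]]+pam_access\.so' "$1"; then
        cat "$1"
    else
        awk -v entry="$PAM_ENTRY" '
            { print }
            /^account[ \t]+required[ \t]+pam_nologin\.so/ { print entry }
        ' "$1"
    fi
}

# Example usage: preview the change and compare it with the current file.
#   with_pam_access /etc/pam.d/sshd > /tmp/sshd.new
#   diff /etc/pam.d/sshd /tmp/sshd.new
```

Review the diff and keep a root session open before replacing /etc/pam.d/sshd, because a mistake in this file can block SSH logins.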
Configure Log Rotation for Data Repository
To prevent the underlying data repository log file (vertica.log) from becoming too large, configure log rotation for the data repository. The recommended configuration is a daily rotation with logs retained for 21 days.
Follow these steps:
  1. Log in to the database server for the data repository as the database administrator user (dradmin) by issuing the following command:
    su - dradmin
  2. Issue the following command:
    /opt/vertica/bin/admintools -t logrotate -d database_name -r frequency -k number
    • -d
      Indicates the database name. This parameter is case-sensitive.
    • -r
      Specifies how often to rotate the daily logs.
      Values: daily, weekly, monthly
    • -k
      Specifies how many logs to keep according to the frequency. For example, if the frequency is weekly, a value of 3 keeps three weeks of daily log files.
      Example:
      /opt/vertica/bin/admintools -t logrotate -d drdata -r daily -k 14
  3. (Optional) To verify that you have configured the data repository log file rotation correctly, look for the new gzipped vertica.log files for previous days in the Vertica catalog directory. The log files use the following filename format:
    vertica.log.YYYYMMDD.gz
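The optional check in step 3 can be done with a short script that lists files matching the vertica.log.YYYYMMDD.gz format. This is a sketch only: the function names are hypothetical, and the directory that holds vertica.log is passed as an argument because the exact path depends on your catalog location.

```shell
#!/bin/sh
# Hypothetical helpers for checking rotated data repository logs.

# List rotated logs named vertica.log.YYYYMMDD.gz in the given directory.
rotated_logs() {
    for f in "$1"/vertica.log.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].gz; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Print how many rotated logs the directory contains.
rotated_log_count() {
    rotated_logs "$1" | wc -l | tr -d ' '
}

# Example usage (replace the argument with the directory that holds vertica.log):
#   rotated_logs /path/to/log/directory
#   echo "rotated logs kept: $(rotated_log_count /path/to/log/directory)"
```

After rotation has run for longer than the retention period, the count should stay at or below the value that you passed with -k.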
Set Up Automatic Backups of the Data Repository
To preserve your data against failures, set up automatic backups of the data repository.
For more information, see Back Up the Data Repository.