Install the Data Repository

Install the data repository using these instructions.
After you have met the prerequisites described in Prepare to Install the Data Repository, complete the installation as follows:

The data repository installation creates the following users:
  • dradmin
    This is the system account for the database administrator. This user can run data repository processes and the Vertica Administration Tools utility,
    adminTools
    . This user owns the data repository catalog files and the data files.
    The
    dr_install.sh
    installation script creates this user first. It also creates the verticadba group for tighter control over filesystem access in the
    /opt/vertica/
    directories, and adds this user to this group.
    The script sets permissions of 775 on the
    /opt/vertica/log
    and
    /opt/vertica/config
    directories. This setting grants full privileges to the owner and the verticadba group, and read/execute privileges to other users.
    Example password:
    drpass
    User Account:
    Operating System and Vertica Database
  • dauser
    The data aggregator connects and interacts with the database using this user. The
    dr_install.sh
    installation script creates this user during the data aggregator installation.
    Example password:
    dbpassword
    User Account:
    Vertica Database
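The 775 setting described for the dradmin user can be checked after installation. The following is a minimal sketch, not part of the product scripts. It assumes GNU stat (as shipped with RHEL) and uses a hypothetical helper name, dir_mode. It first demonstrates the mode on a scratch directory, then reports the two Vertica directories only if they exist on the host.

```shell
# dir_mode DIR: print the octal permission bits of DIR (hypothetical helper).
dir_mode() {
  stat -c '%a' "$1"
}

# Demonstration on a scratch directory:
# 775 = rwx for owner and group, r-x for other users.
demo_dir=$(mktemp -d)
chmod 775 "$demo_dir"
echo "demo directory mode: $(dir_mode "$demo_dir")"

# Report the directories named in this section, if present on this host.
for d in /opt/vertica/log /opt/vertica/config; do
  if [ -d "$d" ]; then
    echo "$d mode: $(dir_mode "$d")"
  else
    echo "$d not found (is the data repository installed on this host?)"
  fi
done
```

On a host where the data repository is installed, both directories would be expected to report 775 if the installation behaved as described above.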
Install the Database
Set up the data repository by installing and configuring the Vertica database as one of the following users:
Install as Root User with Passwordless SSH Configured
In a cluster installation, initiate the data repository installation from any of the hosts that participate in the cluster. The installation pushes the required software components to the additional nodes.
Follow these steps:
  1. Log in to
    any
    host in the data repository cluster as the root user.
  2. Copy the
    installDR.bin
    file locally.
  3. Change permissions for the installation file by issuing the following command:
    chmod u+x installDR.bin
  4. Extract the installation file by issuing the following command:
    ./installDR.bin
  5. Follow the instructions in the console.
  6. When prompted, specify the installation directory to which to extract the installation package, the Vertica license file, and the associated scripts.
    The default installation directory is
    /opt/CA/IMDataRepository_vertica<Version>/
    .
  7. Press the
    Return/Enter
    key on your keyboard twice.
    The following files are extracted from the installation file:
    • The data repository RPM Package Manager (RPM) installation package
    • The license file
    • dr_validate.sh
      This validation script verifies the OS settings and modifies the settings if necessary.
    • dr_install.sh
      This installation script installs the data repository, creates the database, and disables unnecessary Vertica processes on the hosts in the cluster. If the Vertica database administrator user (dradmin) does not already exist, the installation script creates the user.
  8. Adjust the following parameters in the
    <installation_directory>
    /drinstall.properties
    file to reflect installation-specific values:
    • DbAdminLinuxUser
      The Linux user that is created to serve as the Vertica database administrator.
      Default:
      dradmin
    • DbAdminLinuxUserHome
      The Vertica database administrator user home directory. If the installation script creates the Vertica database administrator user, it will also create this directory.
      Ensure that the directory leading up to the home account (for example, the
      /export
      directory) already exists on the system.
      Default:
      /export/dradmin
    • DbDataDir
      The location of the
      data
      directory.
      Do not use LVM (Logical Volume Manager) for this directory. Ensure that this directory is on a separate mount from the
      catalog
      directory. Separate mounts keep the data and catalog file systems from competing with each other, or with any other disk usage, for space and I/O performance.
      Default:
      /data
    • DbCatalogDir
      The location of the
      catalog
      directory.
      Do not use LVM (Logical Volume Manager) for this directory. Ensure that this directory is on a separate mount from the
      data
      directory. Separate mounts keep the data and catalog file systems from competing with each other, or with any other disk usage, for space and I/O performance.
      Default:
      /catalog
    • DbHostNames
      The comma-delimited list of hostnames for the data repository.
      Default:
      yourhostname1,yourhostname2,yourhostname3
    • DbName
      The database name.
      Default:
      drdata
      Case sensitive:
      Yes
    • DbPwd
      The database password. The installation script uses this password during the installation of the data aggregator. You can use special characters (except for single quotation marks) in passwords. If the script does not find the
      DbPwd
      property or if it is blank, the script prompts for this information at runtime.
      Default:
      dbpass
  9. Run the validation script with the
    -p
    option on
    each
    node in the data repository cluster using the same location by issuing the following command:
    ./dr_validate.sh -p drinstall.properties
    You can use the following options with the command:
    • The
      -l
      option: Allows
      localhost
      as the value for the
      DbHostNames
      property.
    • The
      -n
      option: Skips database connectivity checks.
    The validation script expects passwordless Secure Shell (SSH) to be set up for the root user across the hosts in the cluster. If passwordless SSH is not configured for this user account, you are prompted, sometimes multiple times, for a password.
  10. Review the on-screen output for failures or warnings. You can run the validation script multiple times after you fix any failures or warnings. The script automatically corrects many failures or warnings. Proceed to the next step only if the final status is "PASSED". If the final status is not "PASSED", contact Broadcom Support.
    The validation script might ask you to reboot.
    The validation and installation scripts generate a log file in the
    <installation_directory>
    /logs
    directory on the data repository host from which you run the scripts. If the installation script fails early enough in the process, the log file might be available in the home directory of the root or sudo user. These log files include the step-by-step output of the scripts. To confirm whether a script run succeeded or failed, review the script output.
    The following example shows the script output and lists what settings the script verifies and changes:
    Log File: logs/install_log_validate_10-29-2015_11-14-11.log
    ===============================================================================
    Checking Passwordless SSH to all hosts: verticahost-dr
    ===============================================================================
    Passwordless SSH from verticahost-dr to [email protected] ...................[ OK ]
    ===============================================================================
    Beginning Data Repository Prerequisite Compliance Enforcement on host verticahost-dr
    ===============================================================================
    Red Hat Enterprise Linux Major Release: 6 ..............................[ OK ]
    Processor Type: Intel ...................................................[ OK ]
    CPU frequency scaling not available on this system ......................[ OK ]
    DR Administrative User dradmin does not exist. It will be created during vertica installation. [ OK ]
    Maximum number of file handles >= 65536 .................................[ OK ]
    Detected incorrect maximum number of memory maps ........................[WARN]
    Set maximum number of memory maps to Total Mem(KB)/16 ...................[ OK ]
    Detected incorrect page reclaim threshold value .........................[WARN]
    Set page reclaim threshold value to 7924 ................................[ OK ]
    Disabling necessary firewall settings. ..................................[ OK ]
    Enabling NTP daemon. ....................................................[ OK ]
    Starting the NTP daemon. ................................................[ OK ]
    Detected incorrect readahead parameter for /dev/sda .....................[WARN]
    Set readahead parameter for /dev/sda to 2048 ............................[ OK ]
    Block Size for /dev/sda is 4096 .........................................[ OK ]
    Readahead parameter for /dev/sda1 is 2048 ...............................[ OK ]
    Block Size for /dev/sda1 is 1024. Expected value >= 4096 ...............[WARN]
    Readahead parameter for /dev/sda2 is 2048 ...............................[ OK ]
    Block Size for /dev/sda2 is 4096 ........................................[ OK ]
    Readahead parameter for /dev/sda3 is 2048 ...............................[ OK ]
    Block Size for /dev/sda3 is 4096 ........................................[ OK ]
    Detected incorrect swappiness setting ...................................[WARN]
    Set swappiness to 0 .....................................................[ OK ]
    Transparent hugepages in /sys/kernel/mm/redhat_transparent_hugepage/enabled are enabled [WARN]
    Disabled Huge Page Compaction ...........................................[ OK ]
    Huge Page Compaction Defrag in /sys/kernel/mm/redhat_transparent_hugepage/defrag is enabled [WARN]
    Disabled Huge Page Compaction Defrag ....................................[ OK ]
    Disk Scheduler for sda is not deadline ..................................[WARN]
    Set Disk Scheduler for sda to deadline ..................................[ OK ]
    Reloading sysctl.conf ...................................................[WARN]
    SELinux is disabled .....................................................[ OK ]
    Verifying Swap Space. ...................................................[ OK ]
    No Logical Volumes exist. ...............................................[ OK ]
    Root entry exists in /etc/sudoers file. .................................[ OK ]
    Verifying ext3 or ext4 filesystem used for data directory. ..............[ OK ]
    Verifying ext3 or ext4 filesystem used for catalog directory. ...........[ OK ]
    Fresh install of Vertica is being performed - skipping database connectivity testing.
    Data Repository Prerequisite Compliance Status on host verticahost-dr -- PASSED
    ===============================================================================
    Script finished - /user/home/verticahost/dr_validate.sh
    ===============================================================================
  11. Run the installation script with the
    -p
    option by issuing the following command:
    ./dr_install.sh -p drinstall.properties
The data repository is installed, the database is created, and the unnecessary Vertica processes on the hosts in the cluster are disabled. If the database administrator user (dradmin) does not already exist, the user is created, and you are prompted to assign a new password.
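The parameters from step 8 and the separate-mount requirement for the data and catalog directories can be sketched as a small pre-flight check. This is an illustration only, not part of dr_validate.sh. The key=value layout of drinstall.properties is an assumption (the file contents are not shown in this article), and get_prop and mount_of are hypothetical helper names. The sample file uses the default values listed above.

```shell
# Write a sample properties file. The key=value layout is an assumption.
props=$(mktemp)
cat > "$props" <<'EOF'
DbAdminLinuxUser=dradmin
DbAdminLinuxUserHome=/export/dradmin
DbDataDir=/data
DbCatalogDir=/catalog
DbHostNames=yourhostname1,yourhostname2,yourhostname3
DbName=drdata
DbPwd=dbpass
EOF

# get_prop KEY FILE: print the value of KEY (hypothetical helper).
get_prop() {
  sed -n "s/^$1=//p" "$2" | head -n 1
}

# mount_of PATH: print the mount point that holds PATH (hypothetical helper).
mount_of() {
  df -P "$1" | awk 'NR == 2 { print $6 }'
}

data_dir=$(get_prop DbDataDir "$props")
catalog_dir=$(get_prop DbCatalogDir "$props")

# Compare mount points only when both directories exist on this host.
if [ -d "$data_dir" ] && [ -d "$catalog_dir" ]; then
  if [ "$(mount_of "$data_dir")" = "$(mount_of "$catalog_dir")" ]; then
    echo "WARNING: $data_dir and $catalog_dir share a mount point"
  else
    echo "OK: $data_dir and $catalog_dir are on separate mounts"
  fi
else
  echo "SKIPPED: $data_dir or $catalog_dir does not exist on this host"
fi
```

The check reads the directory names from the properties file rather than hard-coding them, so the same sketch applies when the defaults are changed.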
Install as Sudo User with Passwordless SSH Configured
You set up the data repository by installing and configuring the Vertica database as the sudo user.
RHEL 6.x does not support this functionality.
Follow these steps:
  1. Log in to
    each
    node in the data repository cluster as the sudo user.
  2. Copy the
    installDR.bin
    file locally.
  3. Change permissions for the installation file by issuing the following command:
    chmod u+x installDR.bin
  4. Extract the installation file as the sudo user by issuing the following command:
    sudo ./installDR.bin
  5. Follow the instructions in the console.
  6. Press the
    Return/Enter
    key on your keyboard twice.
    The following files are extracted from the installation file:
    • The data repository RPM Package Manager (RPM) installation package
    • The license file
    • dr_validate.sh
      This validation script verifies the OS settings and modifies the settings if necessary.
    • dr_install.sh
      This installation script installs the data repository, creates the database, and disables unnecessary Vertica processes on the hosts in the cluster. If the Vertica database administrator (dradmin) user does not already exist, the script creates the user.
  7. Adjust the following parameters in the
    <installation_directory>
    /drinstall.properties
    file to reflect installation-specific values. This file applies to the
    dr_validate.sh
    and
    dr_install.sh
    validation and installation scripts.
    • DbAdminLinuxUser
      The Linux user that is created to serve as the Vertica database administrator.
      Default:
      dradmin
    • DbAdminLinuxUserHome
      The Vertica database administrator user home directory.
      Default:
      /export/dradmin
      This directory is created if the Vertica installer creates the user. Be sure that the directory leading up to the home account already exists on the system. For example, if you are using the
      /export/dradmin
      directory, be sure that the
      /export
      directory exists.
    • DbDataDir
      The location of the
      data
      directory.
      Default:
      /data
      Do not use LVM (Logical Volume Manager) for this directory. Ensure that this directory is on a separate mount from the
      catalog
      directory. Separate mounts keep the data and catalog file systems from competing with each other, or with any other disk usage, for space and I/O performance.
    • DbCatalogDir
      The location of the
      catalog
      directory.
      Default:
      /catalog
      Do not use LVM (Logical Volume Manager) for this directory. Ensure that this directory is on a separate mount from the
      data
      directory. Separate mounts keep the data and catalog file systems from competing with each other, or with any other disk usage, for space and I/O performance.
    • DbHostNames
      The list of hostnames for the data repository.
      Default:
      yourhostname1,yourhostname2,yourhostname3
      List only the local hostname. You add all other nodes in a later step.
    • DbName
      The database name.
      Default:
      drdata
      Case sensitive:
      Yes
    • DbPwd
      The database password. You can use special characters (except for single quotation marks) in passwords.
      Default:
      dbpass
      The installation script uses this password during the installation of the data aggregator. If the script does not find the
      DbPwd
      property or if it is blank, the script prompts for this information at runtime.
  8. Run the validation script, with the
    -sp
    option, on
    each
    node in the data repository cluster using the same location by issuing the following command:
    sudo ./dr_validate.sh -sp drinstall.properties
    You can use the following options with the command:
    • To allow
      localhost
      as the value for the
      DbHostNames
      property, use the
      -l
      option.
    • To skip database connectivity checks, use the
      -n
      option.
    The validation script expects passwordless Secure Shell (SSH) to be set up for the sudo user across the hosts in the cluster. If passwordless SSH is not configured for this account, you are prompted, sometimes multiple times, for a password.
  9. Review the on-screen output for failures or warnings. You can run the validation script multiple times after you fix failures or warnings. The script automatically corrects many failures or warnings. Proceed only if the final status is "PASSED". If the final status is not "PASSED", contact Broadcom Support.
    The validation script might ask you to reboot.
    The validation and installation scripts generate a log file in
    <installation_directory>
    /logs
    on the data repository host from which you run the scripts. If the installation fails early enough in the process, the log file might be available in the home directory of the root or sudo user. These log files include the step-by-step output of the scripts. To validate successful/failed script runs, review the script output.
    The following example shows the script output and lists what settings the script verifies and changes:
    Log File: logs/install_log_validate_10-29-2015_11-14-11.log
    ===============================================================================
    Checking Passwordless SSH to all hosts: verticahost-dr
    ===============================================================================
    Passwordless SSH from verticahost-dr to [email protected] ...................[ OK ]
    ===============================================================================
    Beginning Data Repository Prerequisite Compliance Enforcement on host verticahost-dr
    ===============================================================================
    Red Hat Enterprise Linux Major Release: 6 ..............................[ OK ]
    Processor Type: Intel ...................................................[ OK ]
    CPU frequency scaling not available on this system ......................[ OK ]
    DR Administrative User dradmin does not exist. It will be created during vertica installation. [ OK ]
    Maximum number of file handles >= 65536 .................................[ OK ]
    Detected incorrect maximum number of memory maps ........................[WARN]
    Set maximum number of memory maps to Total Mem(KB)/16 ...................[ OK ]
    Detected incorrect page reclaim threshold value .........................[WARN]
    Set page reclaim threshold value to 7924 ................................[ OK ]
    Disabling necessary firewall settings. ..................................[ OK ]
    Enabling NTP daemon. ....................................................[ OK ]
    Starting the NTP daemon. ................................................[ OK ]
    Detected incorrect readahead parameter for /dev/sda .....................[WARN]
    Set readahead parameter for /dev/sda to 2048 ............................[ OK ]
    Block Size for /dev/sda is 4096 .........................................[ OK ]
    Readahead parameter for /dev/sda1 is 2048 ...............................[ OK ]
    Block Size for /dev/sda1 is 1024. Expected value >= 4096 ...............[WARN]
    Readahead parameter for /dev/sda2 is 2048 ...............................[ OK ]
    Block Size for /dev/sda2 is 4096 ........................................[ OK ]
    Readahead parameter for /dev/sda3 is 2048 ...............................[ OK ]
    Block Size for /dev/sda3 is 4096 ........................................[ OK ]
    Detected incorrect swappiness setting ...................................[WARN]
    Set swappiness to 0 .....................................................[ OK ]
    Transparent hugepages in /sys/kernel/mm/redhat_transparent_hugepage/enabled are enabled [WARN]
    Disabled Huge Page Compaction ...........................................[ OK ]
    Huge Page Compaction Defrag in /sys/kernel/mm/redhat_transparent_hugepage/defrag is enabled [WARN]
    Disabled Huge Page Compaction Defrag ....................................[ OK ]
    Disk Scheduler for sda is not deadline ..................................[WARN]
    Set Disk Scheduler for sda to deadline ..................................[ OK ]
    Reloading sysctl.conf ...................................................[WARN]
    SELinux is disabled .....................................................[ OK ]
    Verifying Swap Space. ...................................................[ OK ]
    No Logical Volumes exist. ...............................................[ OK ]
    Root entry exists in /etc/sudoers file. .................................[ OK ]
    Verifying ext3 or ext4 filesystem used for data directory. ..............[ OK ]
    Verifying ext3 or ext4 filesystem used for catalog directory. ...........[ OK ]
    Fresh install of Vertica is being performed - skipping database connectivity testing.
    Data Repository Prerequisite Compliance Status on host verticahost-dr -- PASSED
    ===============================================================================
    Script finished - /user/home/verticahost/dr_validate.sh
    ===============================================================================
  10. Go to the first node and edit the
    DbHostNames
    parameter in the
    drinstall.properties
    file to include
    all
    the nodes in the cluster.
  11. Run the installation script with the
    -sp
    option by issuing the following command:
    sudo ./dr_install.sh -sp drinstall.properties
    You can run the script as sudo by setting up passwordless SSH (the public key) for the sudo account between the data repository hosts.
    For more information, see Prepare to Install the Data Repository.
The data repository is installed, the database is created, and the unnecessary Vertica processes on all the hosts in the cluster are disabled. You are prompted for the sudo user password for
each
node during this process. If the database administrator user (dradmin) does not already exist, the user is created, and you are prompted to assign a new password for
each
node during this process.
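The rule in step 9 (proceed only if the final status is "PASSED") can be checked against a validation log before running the installation script. The following is a minimal sketch under stated assumptions: validation_passed is a hypothetical helper, and the pattern it searches for is taken from the example output shown above. The sample log is a stand-in created for the demonstration. On a real host, point the helper at the log file that the validation script reports.

```shell
# validation_passed FILE: succeed only if FILE contains a PASSED status line
# in the format shown in the example output (hypothetical helper).
validation_passed() {
  grep -q 'Prerequisite Compliance Status on host .* -- PASSED' "$1"
}

# Stand-in log file that holds two lines from the example output.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Verifying ext3 or ext4 filesystem used for catalog directory. ...........[ OK ]
Data Repository Prerequisite Compliance Status on host verticahost-dr -- PASSED
EOF

if validation_passed "$sample_log"; then
  echo "Validation passed: continue with dr_install.sh"
else
  echo "Validation did not pass: fix the reported issues and rerun dr_validate.sh"
fi
```

In a cluster, the same check would apply to the log from each node, because the validation script runs on every node.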
Verify the Database Installation
Verify that the installation script has installed the data repository successfully. Use
adminTools
.
Follow these steps:
  1. Log in to the database server as the database administrator (dradmin) user by issuing the following command:
    su - dradmin
  2. Start
    adminTools
    by running the following command:
    /opt/vertica/bin/adminTools
  3. Select option
    1 (View Database Cluster State)
    from the main menu of the
    Administration Tools
    dialog.
  4. Select
    OK
    or press the
    Return/Enter
    key on your keyboard.
    The database name appears and the State is reported as "UP".
  5. Acknowledge that the database is UP by selecting
    OK
    .
  6. Select option
    E (Exit)
    , and then press the
    Return/Enter
    key on your keyboard.
If the database does not start automatically, start it manually by selecting
Start DB
. The data aggregator installation fails if the database is not running.
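The menu-driven check above can also be approximated from the command line. The following is a sketch, not the documented procedure. It assumes that the Vertica admintools utility accepts the db_status tool with the -s UP option and that it prints the names of running databases separated by whitespace. Confirm both against the Vertica documentation for your release. db_listed is a hypothetical helper, and drdata is the default database name from this article.

```shell
ADMINTOOLS=/opt/vertica/bin/admintools

# db_listed NAME TEXT: succeed if NAME is one of the whitespace-separated
# words in TEXT. The match is case sensitive, like the database name.
db_listed() {
  printf '%s\n' "$2" | tr -s '[:space:]' '\n' | grep -qxF -- "$1"
}

if [ -x "$ADMINTOOLS" ]; then
  # Assumption: this prints the names of the databases that are UP.
  up_dbs=$("$ADMINTOOLS" -t db_status -s UP)
  if db_listed drdata "$up_dbs"; then
    echo "drdata is UP"
  else
    echo "drdata is not UP: start it from adminTools with Start DB"
  fi
else
  echo "admintools not found: run this check on a data repository host as dradmin"
fi
```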
Next Steps
After you have installed the data repository, you can do the following:
(Optional) Secure the Data Repository
To limit the users who can log in to the database to only the data repository administrative account (dradmin) and the root user, lock down the database.
Follow these steps:
  1. Modify the
    /etc/pam.d/sshd
    file by adding the following entry, for the PAM access module, after the "account required pam_nologin.so" entry:
    account required pam_access.so accessfile=/etc/security/sshd.conf
    If this file is missing, create it.
    For more information, see the SSHD documentation.
  2. If the following line from the
    /etc/security/access.conf
    file exists, remove it:
    -:ALL EXCEPT
    <database_admin_user>
    root:LOCAL
    Example:
    -:ALL EXCEPT dradmin root:LOCAL
The data repository is secured.
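The placement rule in step 1 (add the pam_access.so entry after the pam_nologin.so account entry) can be sketched on a copy of the file before touching the real one. This is an illustration only: the sample file contents are invented stand-ins for /etc/pam.d/sshd, and add_pam_access is a hypothetical helper. Review the result and back up the original before applying any change to a live host.

```shell
entry='account required pam_access.so accessfile=/etc/security/sshd.conf'

# Stand-in for /etc/pam.d/sshd. The contents are illustrative only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
account    required     pam_nologin.so
account    include      password-auth
EOF

# add_pam_access IN OUT: copy IN to OUT, adding the entry after the
# pam_nologin.so account line unless the entry is already present.
add_pam_access() {
  if grep -qF -- "$entry" "$1"; then
    cp "$1" "$2"
  else
    awk -v entry="$entry" '
      { print }
      /^account[ \t]+required[ \t]+pam_nologin\.so/ { print entry }
    ' "$1" > "$2"
  fi
}

result=$(mktemp)
add_pam_access "$sample" "$result"
cat "$result"
```

Because the helper checks for the entry first, running it again on its own output leaves the file unchanged.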
Configure Log Rotation for the Data Repository
To prevent the underlying data repository log file (
vertica.log
) from becoming too large, configure log rotation for the data repository. The recommended configuration for the log rotation is a daily rotation with logs retained for 21 days.
Follow these steps:
  1. Log in to the database server for the data repository as the database administrator user (dradmin) by issuing the following command:
    su - dradmin
  2. Issue the following command, with the following options:
    /opt/vertica/bin/admintools -t logrotate -d
    database_name
    -r
    frequency
    -k
    number
    • -d
      Indicates the database name.
      Case sensitive:
      Yes
    • -r
      Specifies how often to rotate the logs.
      Values:
      daily, weekly, monthly
    • -k
      Specifies how many logs to keep according to the frequency. For example, if the frequency is weekly, a value of 3 keeps three weeks of daily log files.
      Example:
      /opt/vertica/bin/admintools -t logrotate -d drdata -r daily -k 14
  3. (Optional) To verify that you have configured the data repository log file rotation correctly, look at the new
    vertica.log
    gzipped files in the Vertica catalog directory for previous days. The log files use the following filename format:
    vertica.log.
    YYYYMMDD
    .gz
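The command from step 2 can be assembled with the recommended settings (daily rotation, 21 logs kept) as follows. This is a sketch: logrotate_cmd is a hypothetical helper that only validates the arguments and prints the command, so it is safe to run on any host. Run the printed command as dradmin on the data repository host.

```shell
ADMINTOOLS=/opt/vertica/bin/admintools

# logrotate_cmd DB FREQUENCY KEEP: validate the arguments, then print the
# admintools command without running it (hypothetical helper).
logrotate_cmd() {
  case "$2" in
    daily|weekly|monthly) ;;
    *) echo "frequency must be daily, weekly, or monthly" >&2; return 1 ;;
  esac
  case "$3" in
    ''|*[!0-9]*) echo "number of logs to keep must be a whole number" >&2; return 1 ;;
  esac
  echo "$ADMINTOOLS -t logrotate -d $1 -r $2 -k $3"
}

# Recommended settings from this section: daily rotation, 21 logs kept.
cmd=$(logrotate_cmd drdata daily 21)
echo "$cmd"

# Filename format from step 3, shown here with today's date as an example.
echo "vertica.log.$(date +%Y%m%d).gz"
```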
Set Up Automatic Backups of the Data Repository
To preserve your data against failures, set up automatic backups of the data repository.