EXTRACT (Extract Data Table)

The EXTRACT function of DBUTLTY reads a ddb table and writes it to the output data set in the specified sequence. EXTRACT outputs data in expanded form.
datacom151
The EXTRACT function of DBUTLTY reads a 
CA Datacom®/DB
 table and writes it to the output data set in the specified sequence. EXTRACT outputs data in expanded form.
 
More Information:
 
The following topics are discussed on this page:
 
 
When to Use EXTRACT
Use the EXTRACT function when you intend to process table information for use with the 
CA Datacom® Datadictionary™
 Record Migration Facility utility (DDRMFLM) in fixed-block format or for use outside the 
CA Datacom®
 environment. To back up a table in compressed form, use the BACKUP function for the entire data area that contains the table.
In Simplify mode, the function expects MUF to be enabled. This provides control and protection. If MUF is not enabled, this function should wait for MUF to enable. If you are required to execute the function without MUF enabled, for example if it is needed to allow restart to execute, the function must either follow a function that can only run with MUF not enabled or follow a function to acknowledge MUF is not enabled with SET OPTION1=MUF_NOT_ENABLED. It is possible (but not recommended) to use SET OPTION1=MUF_ENABLED_OR_DISABLED. The function without MULTUSE=YES and MUF enabled does not require or want the database to be closed. However, there must be no open URT for any table in the base, except that SEQ=PHYSICAL allows URT entries.
Extracting in Native Key Sequence
When not using Simplify or MUF is not enabled, an extract of a table in Native Key sequence without MULTUSE=YES requires that the database not be open for update by any copy of 
CA Datacom®/DB
. This protection helps ensure the successful retrieval of the data. Multiple EXTRACTs may be executing at once with the same or different tables in the database. However, if two or more are executing within a single area, the completion of the first updates the Directory (CXX) to remove the fact that the area is being unloaded, which loses protection for any remaining EXTRACTs for an area. Therefore, to help ensure protection, only one table in an area should be done at one time.
An extract with MULTUSE=YES requires that the database be set in the MUF startup without ACCESS NOOPT. The database is opened for update during the EXTRACT, and that update protects the area containing the table to ensure a successful retrieval of the data.
 With MULTUSE=YES, only one EXTRACT or BACKUP may be executing at one time within a single data area.
When using Native Key sequence you can do a parallel extract.
Extracting in Physical Key Sequence
With UPDATE=YES, the physical extract occurs with the rules as previously described (for Native Key sequence).
Without UPDATE=YES, the physical extract executes with few requirements and no protection. It was designed for the special case of extracting the data with I/O errors or during maintenance for a 
hot
 backup. With no protection, the data is the best that it is possible to get by reading what is on the DASD at the time it is read. This option should be executed when the submitter is watching or controlling other activity.
The EXTRACT function can also be used to verify the integrity of every data block in a data area. To activate this feature, code 'DUMMY' for the DDNAME= keyword value (DDNAME=DUMMY) and also code 'DUMMY' on the JCL DD statement (//DUMMY DD DUMMY).
If any block in the data area is determined to be invalid, the DBUTLTY execution terminates with a S000 U0004 abend. One DB13155W message is produced for each invalid data block found. SNAP dumps are produced for the first ten errors encountered. 
EXTRACT Output
The output contains all the records for the table in the sequence specified by the SEQ keyword. If the table is empty, the function creates a valid null output data set.
Usually, EXTRACT writes each data record as a fixed-length record in a fixed-block format. When you request an EXTRACT of a table that was loaded from a variable-length VSAM file for use by the 
CA Datacom®
 VSAM Transparency, the function writes each data record as a variable-length record in a variable block format.
Master List Requirements
The data area EXTRACT function uses a form of GSETL/GETIT or GSETP/GETPS command for reading the data blocks. This includes reading multiple blocks per I/O. You can allocate sequential buffers to use for reading the area by specifying the SEQBUFS= keyword.
Successful Execution Requirements and Controls
 
 
Environmental Requirements when MULTUSE=NO and either SEQ=NAT or SEQ=PHY,UPDATE=YES, Simplify NO, or MUF down
 
 
  • The database may not be open for update anywhere.
  • The database may or may not be open in any MUF for read.
  • The database may or may not have any table in unloading status.
 
 
Environmental 
Requirements
 
 
when MULTUSE=NO and either SEQ=NAT or SEQ=PHY,UPDATE=YES, Simplify YES, and MUF enabled
 
 
  • The database may be open for update but must have no users.
 
 
Environmental 
Controls 
when MULTUSE=NO and either SEQ=NAT or SEQ=PHY,UPDATE=YES Simplify NO or MUF down
 
 
  • The database is opened for unloading status.
  • The utility has no knowledge about current ACCESS database or area status.
  • The utility sets no ACCESS database or area status.
 
 
Environmental 
Controls
 when MULTUSE=NO and either SEQ=NAT or SEQ=PHY,UPDATE=YES, Simplify YES, and MUF enabled
 
 
  • The database is opened for update if not already.
  • All areas are closed if open.
  • The utility executes with every ACCESS database or area status.
  • The utility sets protection to block other incompatible DBUTLTY functions.
 
 
Environmental 
Requirements
 
when SEQ=PHY,UPDATE=NO, Simplify NO, or MUF down
 
  • The database may or may not be open for update or read or unloading anywhere.
  • The database is not opened for read or update or unloading.
  • This option has no requirements, it provides no protection.
 
 
Environmental 
Requirements 
when SEQ=PHY,UPDATE=NO, Simplify YES, and MUF enabled (this includes MULTUSE=YES, MULTUSE=NO or omitted)
 
 
  • The database may be open for update or will be opened.
 
 
Environmental
 Requirements 
when MULTUSE=YES (except SEQ=PHY,UPDATE=NO above)
 
 
  • The database may or may not be open for update in the connected MUF
  • The database may 
    not
     be open for update in another MUF or Single User job.
  • The database may 
    not
     have any table in 
    unloading
     status.
  • The database ACCESS must be set OPTIMIZE and WRITE/READ/UTLTY.
  • The area ACCESS must be set WRITE/READ/UTLTY.
  • The area may have no table in an open URT with UPDATE=YES.
  • The area may 
    not
     have any other MULTUSE=YES utility executing.
  • When MULTUSE=YES, this function will not be allowed against a Data Sharing MUF if more than one is executing.
 
 
Environmental
 Controls 
when MULTUSE=YES
 
 
  • The database is opened by MUF for update.
  • The database has no 
    unloading
     status set.
  • The utility sets ACCESS area READ (utility set).
  • The utility sets EXTRACT executing this MUF this database or area.
How to Use EXTRACT
You can execute this command in either Single User or with the MUF active. Execute the EXTRACT function using the following command format:
 
Extract Data Table Native (z/OS)
 
►►─ EXTRACT AREA=
a
,DBID=
n
,DDNAME=
d
,TABLE=
t
─┬──────────────┬──────────────────► └─ ,BLKSIZE=
n
─┘ ►─┬───────────────┬─┬──────────────┬─┬───────────────────────┬───────────────► └─ ,FIRSTKEY=
n
─┘ └─ ,LASTKEY=
n
─┘ └─ ,MULTUSE= ─┬─ NO ──┬─┘ └─ YES ─┘ ►─┬────────────────┬─┬───────────────┬─ ,SEQ=NATIVE ─────────────────────────►◄ └─ ,SEQBUFS=
nnn
─┘ └─ ,SLACK=
nnnn
─┘
 
Extract Data Table Physical (z/OS)
 
►►─ EXTRACT AREA=
a
,DBID=
n
,DDNAME=
d
,TABLE=
t
─┬──────────────┬──────────────────► └─ ,BLKSIZE=
n
─┘ ►─┬────────────────┬─┬───────────────┬─ ,SEQ=PHYSICAL ───────────────────────► └─ ,SEQBUFS=
nnn
─┘ └─ ,SLACK=
nnnn
─┘ ►─┬───────────────────────┬──────────────────────────────────────────────────►◄ └─ ,UPDATE= ─┬─ NO ◄ ─┬─┘ └─ YES ──┘
 
Command
 
  •  
    EXTRACT
    Invokes the function to extract data in expanded form.
 
Required Keywords
 
  •  
    AREA=
    Identifies the data area containing the table.
    Code *** (three asterisks) to use the Full Parent name, that is, if you are extracting a partition Full Parent table.
    •  
      Valid Entries:
      DATACOM-NAME of the area, or *** to extract a partition Full Parent table
    •  
      Default Value:
      (No default)
  •  
    ,DBID=
    Identifies the database for the area to be read.
    •  
      Valid Entries:
      DATACOM-ID of the database
    •  
      Default Value:
      (No default)
  •  
    ,DDNAME=
    Specifies the JCL DDNAME for the output data set. This name must match the corresponding name in the JCL.
    A DDNAME is not acceptable
     
    for sequential input or output files if it is a name reserved for a 
    CA Datacom®
     area. Names with the following patterns are therefore not acceptable for DDNAME=:
  • 3-byte names that end with XX, meaning they are reserved as either current or future 
    CA Datacom®
     control areas.
  • 6-byte names that end with what could be a database ID from 001 through 999.
  • 7-byte names that end with what could be a database ID from 1000 through 9999.
    The DDNAME= value is verified for acceptability to protect you from unintentionally causing data corruption. The DDNAME check is the default but optional. You can prevent the DDNAME check by using a DBSIDPR parameter (DBUTLTY_EDIT_DATA_SET=) for individual MUF environments. However, we recommend that you allow the DDNAME check.
    The data corruption risk involves not the DDNAME itself but the content of the data set. For example, suppose that you used the CXX DDNAME as the output of a backup. You then copied the CXX DD statement and changed the DDNAME of the copy to be acceptable, avoiding the DDNAME= error. The backup would, however, then overlay the CXX data set, which is not the intent of a backup.
    If you specify an unacceptable name for DDNAME=, message DB10059E is generated. 
     We recommend that you allow DDNAME= check protection. You can, however, disable DDNAME= protection. To disable protection, assemble the DBSIDPR module used for this 
    CA Datacom®
     environment and specify NONE for the DBUTLTY_EDIT_DATA_SET= parameter. The default is DBUTLTY_EDIT_DATA_SET=FULL_1, which allows DDNAME= protection. 
  •  
    ,TABLE=
    Specifies the name of an individual table to be extracted.
    •  
      Valid Entries:
      DATACOM-NAME of the table
    •  
      Default Value:
      (No default)
 
Optional Keywords
 
  •  
    ,BLKSIZE=
    Indicates the block size for the output data set. The number supplied with this keyword overrides the block size supplied in your JCL. If block size is specified in the JCL and this keyword is not used, the JCL block size is used. In the absence of either this parameter or a specification in the JCL, the utility uses a default as specified below.
    •  
      Valid Entries:
      For fixed-length tables: An integer value that is a multiple of the table record length.
      For variable-length tables (
      CA Datacom®
       VSAM Transparency): An integer value that is at least the length of the longest record plus 8.
    •  
      Default Value:
      The default for BLKSIZE= when running z/OS is 0 (zero), which defaults to the z/OS default. That default is large and optimized for the particular device type, that is, the most optimal default for sites executing this operating system.
  •  
    ,FIRSTKEY= and/or ,LASTKEY=
    These keywords allow you to select a segment of a table or area by Native Key (SEQ=NATIVE) value to be processed for EXTRACT. The EXTRACT function normally defaults to include the full Native Key range from low values to high values. FIRSTKEY= overrides the default starting position while LASTKEY= overrides the default ending position, limiting the records retrieved for output to the EXTRACT file. If one keyword is specified and the other is not specified, a default value is selected for the missing keyword. Normal 
    CA Datacom®
     key lengths are from 1 through 180 bytes. The key value data entered for FIRSTKEY= and LASTKEY= can be from 0 (zero) through 59 bytes long. You only need to code the number of positions of the key value that are significant to you, but be aware that the value coded is 
    left justified regardless of key field data type.
     The utility pads the low order or remaining positions of the key value with low values on FIRSTKEY= and high values on LASTKEY=. To extract the entire area in one step, specify neither FIRSTKEY= nor LASTKEY=. Alternately, you could specify FIRSTKEY=00 and/or LASTKEY=00 which causes processing to start at low values and end at high values. The length of zero indicates that the padded key values (low values and high values) are to be used.
    The data is not edited or interpreted. Therefore, if the key value data you need to enter is in binary, you must set up the JCL using a hexadecimal mode display and enter the key value data in hexadecimal. This data can contain blanks, commas, and any other special characters. 
    CA Datacom®/DB
     cannot edit this data for syntax, so the control statement can 
    look
     invalid, especially when it includes blanks or commas or key value data containing English words. For example, FIRSTKEY=12121212121212 means the length is 12, key value is 121212121212. Another example, FIRSTKEY=10FIRSTKEY10 means the length is 10, key value is FIRSTKEY10.
    •  
      Valid Entries:
      A 2-byte length (in the range of 00 through 59) followed by a character string of key value bytes (for the length specified) that represents the key value.
    •  
      Default Value:
      The starting/ending position of the full Native Key range from low values to high values.
  •  
    ,MULTUSE=
    (For area level DBUTLTY control only.)
     
     EXTRACT with a Full Parent table must 
    not
     specify MULTUSE=YES.
    For more information about area level DBUTLTY control, see Area Level DBUTLTY Control.
    •  
      Valid Entries:
      NO or YES
    •  
      Default Value:
      NO
  •  
    ,SEQ=
    Indicates the sequence for the output data.
    •  
      NATIVE
      CA Datacom®/DB
       reads the Index and writes the data in logical Native Key sequence. When using SEQ=NATIVE, you can use the FIRSTKEY=/LASTKEY= optional keywords to select only a portion of the table or area. One of the uses of FIRSTKEY=/LASTKEY= involves doing parallel extracts. 
    •  
      PHYSICAL
      CA Datacom®/DB
       does not read the Index and outputs the data in physical track sequence.
       EXTRACT with a Full Parent table must 
      not
       specify MULTUSE=YES.
    •  
      Valid Entries:
      NATIVE or PHYSICAL
    •  
      Default Value:
      NATIVE
  •  
    ,SEQBUFS=
    With MULTUSE=YES specified, if necessary to ensure best performance DBUTLTY overrides the SEQBUFS= value either specified or defaulted. If SEQBUFS= is omitted (invoking the default value) or specified as a value 0-128 (even numbers only), the MUF treats this as though 128 was specified (or defaulted). This directs the MUF to allocate 128 private sequential buffers for use by this function. If SEQBUFS= is specified as a number 130-256 (even numbers only) the MUF treats this as though 256 was specified and uses the common data pools (64 blocks per I/O).
    When MULTUSE= is omitted or specified as NO, SEQBUFS= is edited and ignored, and DBUTLTY executes with the best performance by using either the available data buffers (specified with the DATANO= parameter in the DBMSTLST macro) or available private sequential buffers.
    •  
      Valid Entries:
      0-256 (even numbers only)
    •  
      Default Value:
      128
  •  
    ,SLACK=
    Provides record padding in the output data. This allows a larger size to be written to the output file to assist in possible record expansion. Without this parameter, the data written to the output file is in the size used by 
    CA Datacom®/DB
    . The pad, if any, is always blanks.
    •  
      Valid Entries:
      0 to 9999
    •  
      Default Value:
      0
  •  
    ,UPDATE=
    Indicates whether the database can be open for update at the start or during the backup. When using MULTUSE=YES the area level controls apply when backing up an area. 
     Only valid if you specify PHYSICAL for the SEQ= keyword.
Parallel Extract
Use parallel extract by specifying starting and/or ending key value ranges to be processed. Using parallel extract you could, for example, extract the first half of the data in one job and the remaining half in a concurrently executing second job. 
CA Datacom®/DB
 places no restrictions on the number of extracts running in parallel. We recommend a starting point of four so that each processes approximately one fourth of the data. It is up to you to determine the appropriate key value ranges.
The parallel extract output file can be used as input to a regular LOAD DBUTLTY function either as all parallel parts concatenated together in key value order (for a full load) or less than all of the parallel parts to allow for deleting a range of rows.
FIRSTKEY= and LASTKEY= Keywords in Parallel Extract
It is important that the set of parallel extracts not miss any data rows, and 
it is your responsibility
 to edit the key values before executing the set of extract jobs. We strongly recommend that each set of extract jobs be executed through the DBUTLTY edit function and the output reviewed before execution. To do the edit, add the parameter EDIT as the first DBUTLTY function followed by your parallel extract control statements. Execute DBUTLTY and review the output to make certain that you are requesting what you expect.
If the parallel extract is being used to delete from the start or end of the key value range, the extract can accomplish this with one job step. A load of the output extract file would provide a data area loaded without the 
dropped
 rows.
When DBUTLTY echoes back the input parameters for the FIRSTKEY= and LASTKEY= keywords, it prints them as a three line set with the first line being printable characters, the second being the zone portion of the bytes, the third being the digit portion of the bytes. Following is an example:
EXTRACT DBID=111,AREA=SAM,DDNAME=EXTRACT,SEQ=NATIVE FIRSTKEY=0512345,LASTKEY=0299 FUNCTION=EXTRACT AREA=SAM DBID=00111 DDNAME=EXTRACT SEQ=NATIVE FIRSTKEY=0512345 . . . .=FFFFFFF . . . .=0512345 LASTKEY=0299 . . . =FFFF . . . =0299
The following is an example of parallel extract using three output datasets:
EXTRACT DBID=790,AREA=A01,DDNAME=EXTRACT1,SEQ=NATIVE, FIRSTKEY=00,LASTKEY=043333 EXTRACT DBID=790,AREA=A01,DDNAME=EXTRACT2,SEQ=NATIVE, FIRSTKEY=043334,LASTKEY=046666 EXTRACT DBID=790,AREA=A01,DDNAME=EXTRACT3,SEQ=NATIVE, FIRSTKEY=046667,LASTKEY=00
Restrictions
We do 
not
 recommend parallel extract for tables with a NATIVE sequence key defined as descending or as being type sensitive. For these types of keys, the actual key value stored in the index may differ from the value stored in the data row. Parallel extract is therefore allowed with the restriction that the key value provided must be in the internal index format, not in the external format matching the data row data.
Other restrictions for parallel extract are as follows:
  • The option is available only in z/OS.
  • The option is available only to EXTRACT with SEQ=NATIVE and AREA= specified.
  • Exercise due caution when specifying and using parallel extract. Be aware that data rows could be lost due to poorly defined control statements or data set handling. 
    CA Datacom®/DB
     cannot edit the goal, only the specific control statements provided.
Multivolume INITs
You can initialize these areas across multiple extents on multiple volumes. For a description of how the operating system handles secondary allocations, see your operating system JCL manual.
The 
CA Datacom®/DB
 SVC must be installed when using multivolume data sets. 
Example JCL (EXTRACT)
The following shows the command to extract the data table POL in area DEM in database 1 to the output data set, OUT1DD, on tape or disk. 
CA Datacom®/DB
 outputs the data in physical track sequence.
 Use the following as a guide to prepare your JCL. The JCL statements are for example only. Lowercase letters in a statement indicate a value you must supply. Code all statements to your site and installation standards.
//jobname
See the note above and
.
// EXEC PGM=DBUTLTY,REGION=2M //STEPLIB
See the note above and
.
//CXX DD DSN=cxx.data.set,DISP=SHR Directory data set //OUT1DD DD DSN=out1dd,DISP=SHR Output data set //SYSIN DD * Command Input EXTRACT AREA=DEM,DBID=1,DDNAME=OUT1DD,TABLE=POL,SEQ=PHYSICAL /*
Sample Report
Following is a sample report page. For an example report header, see Sample Report Headers.
 
CONTROL CARD(S) .........1.........2.........3.........4.........5.........6.........7.........8 EXTRACT AREA=DEM,DBID=1,DDNAME=OUT1DD,TABLE=POL,SEQ=PHYSICAL FUNCTION=EXTRACT AREA=DEM DBID=1 DDNAME=OUT1DD SEQ=PHYSICAL TABLE=POL
This page of the report shows the following:
  • The command exactly as entered.
  • An analysis of keywords encountered and expected. Any errors found are flagged with a note in the left margin.
  • Any messages related to syntax processing.
TABLE RECORDS PNC 0 PNM 0 POH 0 POL 4 TOTAL 4
This page of the report shows the following:
Confirmation that four records were extracted from the requested table.