Using filler in control file
Andrew Hancock - VMware vExpert.
Delimiter specifications use the following keywords:

BY: An optional keyword for readability.

AND: This keyword specifies a trailing enclosure delimiter, which may be different from the initial enclosure delimiter. If the AND clause is not present, then the initial and trailing delimiters are assumed to be the same.

If the data is not enclosed, it is read as a terminated field.
Sometimes the same punctuation mark that is a delimiter also needs to be included in the data. To make that possible, two adjacent delimiter characters are interpreted as a single occurrence of the character, and this character is included in the data.
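The doubled-delimiter rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the loader's own parser; the function name `read_enclosed` and the choice of a double quotation mark as the enclosure delimiter are assumptions made for the example.

```python
def read_enclosed(text: str, quote: str = '"') -> str:
    """Return the data inside an enclosed field.

    Two adjacent enclosure delimiters are treated as one literal
    occurrence of the delimiter character, as the text describes.
    (Hypothetical helper; the default delimiter is an assumption.)
    """
    if not text.startswith(quote):
        raise ValueError("field does not start with the enclosure delimiter")
    out = []
    i = len(quote)
    while i < len(text):
        if text.startswith(quote, i):
            # A doubled delimiter stands for one literal delimiter character.
            if text.startswith(quote, i + len(quote)):
                out.append(quote)
                i += 2 * len(quote)
                continue
            # A single delimiter closes the field.
            return "".join(out)
        out.append(text[i])
        i += 1
    raise ValueError("missing trailing enclosure delimiter")


print(read_enclosed('"The ""best"" value"'))
```

Because a doubled delimiter is consumed as data, a closing delimiter that is immediately followed by the opening delimiter of the next field would be misread, which is why adjacent fields that share a delimiter are risky.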
For this reason, problems can arise when adjacent fields use the same delimiters: if two such fields, field1 and field2, were adjacent, then the results would be incorrect. Delimited data also has a default maximum length, so delimited fields can require significant amounts of storage for the bind array.
A good policy is to specify the smallest possible maximum value; see Determining the Size of the Bind Array. Trailing blanks can only be loaded with delimited datatypes. For more discussion on whitespace in fields, see Trimming Blanks and Tabs. If conflicting lengths are specified, one of the lengths takes precedence.
A warning is also issued when a conflict exists. This section explains which length is used. If you specify a starting position and ending position for one of these fields, then the length of the field is determined by these specifications. If you specify a length as part of the datatype and do not give an ending position, the field has the given length.
If starting position, ending position, and length are all specified, and the lengths differ, then the length given as part of the datatype specification is used for the length of the field. If a delimited field is specified with a length, or if a length can be calculated from the starting and ending position, then that length is the maximum length of the field.
The actual length can vary up to that maximum, based on the presence of the delimiter. If a starting and ending position are both specified for the field and if a field length is specified in addition, then the specified length value overrides the length calculated from the starting and ending position.
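The precedence rules above can be summarized as a small decision function. This is a minimal sketch of the stated rules, assuming 1-based, inclusive starting and ending positions; the function and parameter names are hypothetical.

```python
from typing import Optional


def field_length(start: Optional[int],
                 end: Optional[int],
                 datatype_length: Optional[int]) -> Optional[int]:
    """Pick the field length according to the precedence described above.

    - A length given with the datatype wins when one is present.
    - Otherwise the length is derived from the start and end positions.
    - Returns None when neither source provides a length.
    Positions are assumed to be 1-based and inclusive.
    """
    position_length = None
    if start is not None and end is not None:
        position_length = end - start + 1
    if datatype_length is not None:
        return datatype_length
    return position_length


print(field_length(1, 10, None))  # length from positions only
print(field_length(1, 10, 6))     # datatype length overrides positions
```

For a delimited field, the value returned here would be read as a maximum rather than an exact length.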
If the expected delimiter is absent and no maximum length has been specified, then the end of record terminates the field. The length of a date field depends on the mask, if a mask is specified. If starting and ending positions are specified, however, then the length calculated from the position specification overrides a length derived from the mask. A specified length such as "DATE (12)" overrides either of those. If the date field is also specified with terminating or enclosing delimiters, then the length specified in the control file is interpreted as a maximum length for the field.

When a datafile created on one platform is to be loaded on a different platform, the data must be written in a form that the target system can read. For example, if the source system has a native, floating-point representation that uses 16 bytes, and the target system's floating-point numbers are 12 bytes, there is no way for the target system to directly read data generated on the source system.
The best solution is to load data across a Net8 database link, taking advantage of the automatic conversion of datatypes. This is the recommended approach, whenever feasible.
Problems with inter-platform loads typically occur with native datatypes. In some situations, it is possible to avoid problems by lengthening a field by padding it with zeros, or to read only part of the field to shorten it.
This applies, for example, when an 8-byte integer is to be read on a system that uses 4-byte integers, or vice versa. Note, however, that incompatible byte-ordering or an incompatible datatype implementation may prevent this.
Datafiles written using these datatypes are longer than those written with native datatypes. They may take more time to load, but they transport more readily across platforms. However, where incompatible byte-ordering is an issue, special filters may still be required to reorder the data.
The bind array does not apply to the direct path load method. Because a direct path load formats database blocks directly, rather than using Oracle's SQL interface, it does not use a bind array. In a conventional load, multiple rows are read at one time and stored in the bind array. The bind array has to be large enough to contain a single row. Beyond that minimum, it contains as many rows as can fit within it, up to the limit set by the value of the ROWS parameter.
Although the entire bind array need not be in contiguous memory, the buffer for each field in the bind array must occupy contiguous memory.
To minimize the number of calls to Oracle and maximize performance, large bind arrays are preferable. In general, performance improves substantially with each increase in the bind array size up to a threshold number of rows. Increasing the bind array size above that threshold generally delivers more modest improvements. So the size in bytes of that threshold number of rows is typically a good value to use. The remainder of this section details the method for determining that size.
It is not usually necessary to perform the detailed calculations described in this section. This section should be read when maximum performance is desired, or when an explanation of memory usage is needed. The bind array never exceeds that maximum.
If that size is too large to fit within the specified maximum, the load terminates with an error. The bind array's size is equivalent to the number of rows it contains times the maximum length of each row.
The maximum length of a row is equal to the sum of the maximum field lengths, plus overhead. Many fields do not vary in size. These fixed-length fields are the same for each loaded row. There is no overhead for these fields. The maximum lengths describe the number of bytes, or character positions, that the fields can occupy in the input data record.
That length also describes the amount of storage that each field occupies in the bind array, but the bind array includes additional overhead for fields that can vary in size. When specified without delimiters, the size in the record is fixed, but the size of the inserted field may still vary, due to whitespace trimming.
So internally, these datatypes are always treated as varying-length fields--even when they are fixed-length fields. A length indicator is included for each of these fields in the bind array.
The space reserved for the field in the bind array is large enough to hold the longest possible value of the field. The length indicator gives the actual length of the field for each row. On most systems, the size of the length indicator is two bytes. On a few systems, it is three bytes.
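The arithmetic described above (rows times maximum row length, where each varying-length field carries a length indicator) can be sketched as follows. The two-byte indicator matches the "most systems" figure in the text but is an assumption for any particular platform, and the field list and function names are made up for illustration.

```python
# Size of the per-field length indicator. Two bytes "on most systems"
# according to the text; treat this value as an assumption.
LENGTH_INDICATOR_BYTES = 2


def max_row_length(fields):
    """Sum the maximum field lengths plus overhead for varying-length fields.

    `fields` is a list of (max_length_in_bytes, is_varying) pairs.
    Fixed-length fields add no overhead.
    """
    total = 0
    for max_len, is_varying in fields:
        total += max_len
        if is_varying:
            total += LENGTH_INDICATOR_BYTES
    return total


def bind_array_bytes(fields, rows):
    """Bind array size = number of rows * maximum length of each row."""
    return rows * max_row_length(fields)


# Hypothetical row: one 4-byte fixed field and two character fields.
example_fields = [(4, False), (20, True), (10, True)]
print(max_row_length(example_fields))
print(bind_array_bytes(example_fields, 64))
```

Shrinking the declared maximum of a single character field reduces the row length directly, which is why small maximum lengths let more rows fit in the same bind array.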
To determine the size of the length indicator on your system, use a control file that "loads" a one-character field using a one-row bind array. No data is actually loaded, due to the numeric conversion error that occurs when "a" is loaded as a number.
The bind array size shown in the log file, minus one (the length of the character field), is the value of the length indicator. Note: A similar technique can determine bind array size without doing any calculations. Multiply by the number of rows you want in the bind array to get the bind array size. The memory requirements differ for each datatype.
They can consume enormous amounts of memory--especially when multiplied by the number of rows in the bind array. It is best to specify the smallest possible maximum length for these fields. This can make a considerable difference in the number of rows that fit into the bind array. Imagine all of the fields listed in the control file as one, long data structure -- that is, the format of a single row in the bind array.
So, it is especially important to minimize the buffer allocations for fields like these. Such generated data does not require any space in the bind array.
If you want all inserted values for a given column to be null, omit the column's specifications entirely. See also Specifying Field Conditions for details on the conditional tests. The condition has the same format as that specified for a WHEN clause.
The column's value is set to null if the condition is true. Otherwise, the value remains unchanged. This specification may be useful if you want certain data values to be replaced by nulls. The value for a column is first determined from the datafile. It is then set to null just before the insert takes place. Totally blank fields for numeric or DATE fields cause the record to be rejected.
If an all-blank CHAR field is surrounded by enclosure delimiters, then the blanks within the enclosures are loaded. Otherwise, the field is loaded as null. More details on whitespace trimming in character fields are presented in the following section.
Blanks and tabs constitute whitespace. Depending on how the field is specified, whitespace at the start of a field (leading whitespace) and at the end of a field (trailing whitespace) may or may not be included when the field is inserted into the database. This section describes the way character data fields are recognized, and how they are loaded. In particular, it describes the conditions under which whitespace is trimmed from fields.
See Preserving Whitespace for more information. The information in this section applies only to fields specified with one of the character-data datatypes.
There are two ways to specify field length. If a field has a constant length that is defined in the control file, then it has a predetermined size. If a field's length is not known in advance, but depends on indicators in the record, then the field is delimited.
Fields that have a predetermined size are specified with a starting position and ending position, or with a length.
When only a length is given, the field's exact position is not specified, but the field's length is still predetermined. Delimiters are characters that demarcate field boundaries. Enclosure delimiters surround a field, like a pair of quotation marks around a value. Termination delimiters signal the end of a field, like a comma that follows a value. If a predetermined size is specified for a delimited field, and the delimiter is not found within the boundaries indicated by the size specification, then an error is generated.
If the delimiter (for example, a comma) is found within those boundaries, then it delimits the field.

When a starting position is not specified for a field, it begins immediately after the end of the previous field, as when the previous field has a predetermined size. If the previous field is terminated by a delimiter, then the next field begins immediately after the delimiter. When a field is specified both with enclosure delimiters and a termination delimiter, then the next field starts after the termination delimiter. In that case, both fields are stored with leading whitespace.
Fields do not include leading whitespace in certain cases. In one such case, the next field starts at the next non-whitespace character.
Leading whitespace is also removed from a field when optional enclosure delimiters are specified but not present. If none is found, then the first non-whitespace character signals the start of the field. Note: If enclosure delimiters are present, leading whitespace after the initial enclosure delimiter is kept, but whitespace before this delimiter is discarded.
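The handling of leading whitespace with optional enclosure delimiters can be illustrated with a short Python sketch. It models only the rule in the note above, for a single field, and ignores terminators and doubled delimiters; the function name and the default quotation-mark delimiter are assumptions.

```python
WHITESPACE = " \t"  # blanks and tabs, as defined in the text


def start_of_field(raw: str, enclosure: str = '"') -> str:
    """Apply the leading-whitespace rule for optionally enclosed fields.

    - Whitespace before an initial enclosure delimiter is discarded.
    - Whitespace after the initial enclosure delimiter is kept.
    - With no enclosure present, the field starts at the first
      non-whitespace character.
    """
    stripped = raw.lstrip(WHITESPACE)
    if stripped.startswith(enclosure):
        closing = stripped.find(enclosure, len(enclosure))
        if closing == -1:
            raise ValueError("missing trailing enclosure delimiter")
        # Everything between the delimiters belongs to the field.
        return stripped[len(enclosure):closing]
    return stripped


print(repr(start_of_field('   "  abc"')))  # inner blanks are kept
print(repr(start_of_field('   abc')))      # leading blanks are dropped
```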
Trailing whitespace is only trimmed from character-data fields that have a predetermined size. It is always trimmed from those fields. If a field is enclosed, or terminated and enclosed, then any whitespace outside the enclosure delimiters is not part of the field.
Any whitespace between the enclosure delimiters belongs to the field, whether it is leading or trailing whitespace. See the following section, Preserving Whitespace, for details on how to prevent trimming.

Whitespace trimming is described in the previous section, Trimming Blanks and Tabs. The PRESERVE BLANKS keyword retains leading whitespace when optional enclosure delimiters are not present. It also leaves trailing whitespace intact when fields are specified with a predetermined size. This keyword preserves tabs and blanks.
Without it, the leading whitespace is trimmed. Both words of the keyword must be specified.

In general, any SQL function that returns a single value may be used in a SQL string. The column name and the name of the column in the SQL string must match exactly, including the quotation marks. The SQL string must be enclosed in double quotation marks. To quote the column name in the SQL string, you must escape it.
The SQL string appears after any other specifications for a given column. If the string is recognized, but causes a database error, the row that caused the error is rejected. To refer to fields in the record, precede the field name with a colon (:). Field values from the current record are substituted.
When a reference such as :field1 appears more than once in a SQL string, only the :field1 that is not in single quotes is interpreted as a column name. For more information on the use of quotes inside quoted strings, see Specifying Filenames and Objects Names. SQL strings also cannot reference filler fields. When used with a date mask, the date mask is evaluated after the SQL string.
A SQL string can also apply formatting to a field as it is loaded. Such a field would be stored with the formatting characters (dollar sign, period, and so on) already in place. You have even more flexibility, however, if you store such values as numeric quantities or dates. You can then apply arithmetic functions to the values in the database, and still select formatted values for your reports.

Column objects in the control file are described in terms of their attributes. In the datafile, the data corresponding to each of the attributes of a column object is in a datafield similar to that corresponding to a simple relational column.
Column objects can be loaded in two ways: first, where the data is in predetermined size fields, and second, where the data is in delimited fields. With variable-length records, the first six characters specify the length of the forthcoming record.

Loading Nested Column Objects

A control file can describe nested column objects (one column object nested in another column object). An object can have a subset of its attributes be null, it can have all of its attributes be null (an attributively null object), or it can be null itself (an atomically null object). In fields corresponding to object columns, you can use the NULLIF clause to specify the field conditions under which a particular attribute should be initialized to null.
Although this approach is workable, it is not ideal when the condition under which an object should take the value of null is independent of any of the mapped fields. In that situation, you can map a filler field to the field in the datafile that indicates whether a particular object is atomically null, and use the filler field in the field condition of the NULLIF clause of that object.
Loading Object Tables

The control file syntax required to load an object table is nearly identical to that used to load a typical relational table. An object table can be loaded with primary key OIDs. Note that by looking only at the control file you might not be able to determine whether the table being loaded is an object table with system-generated OIDs (real OIDs), an object table with primary key OIDs, or a relational table. Note also that you may want to load data that already contains real OIDs and may want to specify that, instead of generating new OIDs, the existing OIDs in the datafile should be used.

When real OIDs are loaded with the row objects, the OID in the datafile is a character string and is interpreted as a 32-digit hex number. The 32-digit hex number is later converted into a 16-byte RAW and stored in the object table.

When real REFs are loaded, the arguments can be specified either as constants or dynamically using filler fields.
When primary key REFs are loaded, the first argument is the table name, followed by arguments that specify the primary key OID on which the REF column to be loaded is based.
The LOB data instances can be in predetermined size fields, delimited fields, or length-value pair fields. For more information on trimming trailing whitespace, see Trimming Whitespace: Summary. Note that loading with predetermined or length-value fields provides better performance than using delimited fields, but can reduce flexibility (for example, you must know the LOB length for each LOB before loading).
LOB data can also be loaded from length-value pair fields. When a LOB field contains no data, the LOB instance is initialized to empty. When LOB data is read from separate files, the processing overhead of dealing with records is avoided.
This type of organization of data is ideal for LOB loading. The predetermined size of the fields allows the data-parser to perform optimally. One difficulty is that it is often hard to guarantee that all the LOBs are of the same size. In the length-value pair format, loading different size LOBs into the same column is not a problem.
A zero-length entry specifies an empty (not null) LOB.

A BFILE column or attribute stores a file locator that points to the external file containing the data. Note that the file which is to be loaded as a BFILE does not have to exist at the time of loading; it can be created later.

Collection descriptions can include a secondary datafile (SDF) specification. Clauses or directives that take field names as arguments cannot use a field name that is in a collection unless the DDL specification is for a field in the same collection.
The field list must contain one non-filler field and any number of filler fields. The same rules apply when loading a varray and a nested table. The main field name corresponding to the VARRAY field description is the same as the field name of its nested non-filler field, specifically, the name of the column object field description. Note how full name field references (dot notated) resolve the field name conflict created by the presence of this filler field.
The SDF specification can also specify a fixed record format within the SDF. The SID clause specifies the field that contains the set-ids for the nested tables. Note also that if the SID clause is specified but the set-id for a particular record is missing from the datafile, a set-id for the record is generated by the system.
Specifying the set-ids in the datafile is optional and does not result in any significant performance gain. When a set-id is supplied in a field such as "mysid", that set-id is loaded. If "mysid" does not contain a valid hexadecimal number, the record is rejected.
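Before running a load, it can be useful to pre-check a datafile for set-id values that would be rejected. The following is a rough Python sketch of such a check; it only tests that every character is a hexadecimal digit and assumes an empty value is invalid. Any further rules the loader applies (for example on length) are not modeled, and the function name is hypothetical.

```python
import string


def is_valid_hex(value: str) -> bool:
    """Return True if `value` is a non-empty string of hexadecimal digits.

    This is an assumption-based pre-check, not the loader's own validation.
    """
    return len(value) > 0 and all(ch in string.hexdigits for ch in value)


for candidate in ["1A2B3C", "12G4", ""]:
    status = "ok" if is_valid_hex(candidate) else "would be rejected"
    print(repr(candidate), status)
```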
Loading a Parent Table Separately from its Child Table

When loading a table that contains a nested table column, it may be possible to load the parent table separately from the child table.
You can do independent loading of the parent and child tables if the SIDs (system-generated or user-defined) are already known at the time of the load.
The larger the value used for ROWS, the fewer the transactions and, therefore, the better the performance.

Options Clause

Load Statement

Note: The character set specified does not apply to data in the control file.

There must not be any spaces between the operator and the operands.

Precision vs. Length

The precision of a numeric field is the number of digits it contains.

Date Mask

The date mask specifies the format of the date value.

Comments in the Control File

Comments can appear anywhere in the command section of the file, but they should not appear within the data. For example:

--This is a Comment

All text to the right of the double hyphen is ignored, until the end of the line.

Operating System Considerations

Specifying a Complete Path

If you encounter problems when trying to specify a complete pathname, it may be due to an operating system-specific incompatibility caused by special characters in the specification.

Escaping the Backslash

If your operating system uses the backslash character to separate directories in a pathname and if the version of Oracle running on your operating system implements the backslash escape character for filenames and other non-portable strings, then you must specify double backslashes in your pathnames and use single quotation marks.

Escape Character Sometimes Disallowed

The version of Oracle running on your operating system may not implement the escape character for non-portable strings. In that case, double backslashes are not needed.

Naming the File

To specify a file that contains the data to be loaded, use the INFILE keyword, followed by the filename and optional processing options string. For example, a control file can specify four datafiles with separate bad and discard files.
For mydat1.DAT, both a bad file and discard file are explicitly specified. Therefore both files are created, as needed. If you have specified that a bad file is to be created, the following applies: if one or more records are rejected, the bad file is logged.

Examples

A bad file can be given a filename such as UGH with the default file extension or file type, or a filename with an explicit extension such as .REJ.

Rejected Records

A record is rejected if it meets either of two conditions. One is that, upon insertion, the record causes an Oracle error (such as invalid data for a given datatype).

A discard file is created according to the following rules: you have specified a discard filename, and one or more records fail to satisfy all of the WHEN clauses specified in the control file. Note that, if the discard file is created, it overwrites any existing file with the same name, so ensure that you do not overwrite any files you wish to retain.
A discard file can also be specified with a full path.

Limiting the Number of Discards

You can limit the number of records to be discarded for each datafile to n, where n must be an integer.
To update existing rows, use the following procedure: Load your data into a work table.

State of Tables and Indexes

When a load is discontinued, any data already loaded remains in the tables, and the tables are left in a valid state.

Continuing Single Table Loads

To continue a discontinued direct or conventional path load involving only one table, specify the number of logical records to skip with the command-line parameter SKIP.

The logical records would be the same in each case. Instead, this character is included when the logical record is assembled.

Choosing which Rows to Load

You can choose to load or discard a logical record by using the WHEN clause to test a condition in the record. Note: Terminators are strings not limited to a single character.
This option is suggested for use when available storage is limited, or when the number of rows to be loaded is small compared to the size of the table (a small ratio is recommended). Hexadecimal strings are padded with hexadecimal zeroes.

Specifying Columns and Fields

You may load any number of a table's columns.
The list of columns is enclosed by parentheses and separated with commas as follows: (columnspec, columnspec, ...)

Specifying Filler Fields

Filler fields have names but they are not loaded into the table.
Before the datatype is specified, the field's position must be specified.

Extracting Multiple Logical Records

Some data storage and transfer media have fixed-length physical records.

Distinguishing Different Input Record Formats

A single datafile might contain records in a variety of formats.

Setting a Column to a Constant Value

This is the simplest form of generated data.

Generating Sequence Numbers for Multiple Tables

Because a unique sequence number is generated for each logical input record, rather than for each table insert, the same sequence number can be used when inserting data into multiple tables.
In the syntax for this datatype, precision is the number of digits in the number, and scale (if given) is the number of digits to the right of the implied decimal point.
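The effect of scale on a digit string with an implied decimal point can be shown with a short Python sketch. It illustrates the arithmetic only and does not decode any particular binary storage format; sign handling is omitted, and the function name is hypothetical.

```python
from decimal import Decimal


def apply_scale(digits: str, scale: int = 0) -> Decimal:
    """Interpret `digits` as a number whose last `scale` digits are fractional.

    The precision is simply len(digits); the decimal point is implied,
    not stored. Unsigned values only (an assumption for this sketch).
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of decimal digits")
    if scale < 0:
        raise ValueError("scale must not be negative")
    # scaleb shifts the exponent: 12345 with scale 2 becomes 123.45.
    return Decimal(int(digits)).scaleb(-scale)


print(apply_scale("12345", 2))
print(apply_scale("12345"))
```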
In the syntax for this datatype, precision is the number of digits in a value.

A maximum length specified in the control file does not include the size of the length subfield.

CHAR: The data field contains character data.

DATE: The data field contains character data that should be converted to an Oracle date using the specified date mask.

Attention: The data is a number in character form, not binary representation.

Filler fields, however, cannot be used in SQL strings. If another field references a nullified filler field, then an error is generated.