Tuesday, May 29, 2007

SQL*Loader

Reference

SQL*Loader files:
log - contains the loading session log
control - a mandatory file containing all the control information (where are the other files, how to parse data and where to save it) It can also contain the data as well
bad - contains the records not loaded due to an error
discard - contains the records not satisfying the record select criteria

Possible ways of running the sql*loader:
sqlldr user/password control=some_file.ctl
sqlldr user/password some_file.ctl
You can also specify the direct = TRUE|FALSE parameter to use the direct path load. When left to default (FALSE) conventional path is used.

Direct path loading is fast. It doesn't generate undo or redo, it always writes above the high water mark, it doesn't issue COMMIT, it bypasses most of the constraints.

Some restrictions with the direct path loading:
  • Indexes are always rebuilt after the loading. If unique constraint is violated, the unique index is left in unusable state. You need to find the duplicate, eliminate them and rebuild the index.
  • Direct path load cannot occur if there are active transactions against the target table as it places an exclusive table lock on the whole table for the time of load.
  • Triggers do not fire during direct path loads
  • Direct path loading into cluster tables is not supported. In fact only loading to heap organized tables is supported.
  • During the direct path loads, foreign key constraints are disabled and then re-enabled

Possible types of input data formats:
  • fixed length. All rows have the same length and fields are separated by some character. This format will be the quickest in processing.
  • Variable length. Each row starts with a number showing how long the row is.
  • Stream format. Will be the slowest to process. But no information on the row length is needed. You just need to specify the field separator.

No comments: