import-mainframe
The import-mainframe tool imports all sequential datasets in a partitioned dataset (PDS) on a mainframe to HDFS.
A PDS is akin to a directory on open systems.
The records in a dataset can contain only character data.
Records will be stored with the entire record as a single text field.
The tool usage is shown below.
$ sqoop import-mainframe <generic-args> <import-args>
$ sqoop-import-mainframe <generic-args> <import-args>
Although the generic Hadoop arguments must precede any import-mainframe arguments, the import-mainframe arguments can be specified in any order with respect to one another.
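For example, a basic import of a PDS might look like the following sketch; zos-host and MYUSER.INPUT.DATA are placeholder host and dataset names, and the --dataset option names the PDS to import:
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser -P \
    --target-dir /data/mainframe/input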
Common arguments:
--connect <jdbc-uri> | Specifies the JDBC connection string
--connection-manager <class-name> | Specifies the connection manager class to use
--connection-param-file <filename> | Specifies an optional properties file that provides connection parameters
--driver <class-name> | Specifies the JDBC driver class to use
--hadoop-mapred-home <dir> | Overrides $HADOOP_MAPRED_HOME
--help | Prints usage instructions
--password-file | Sets the path to a file containing the authentication password
-P | Reads the password from the console
--password <password> | Specifies the authentication password
--username <username> | Specifies the authentication username
--verbose | Prints more information while working
--relaxed-isolation | Instructs Sqoop to use the read-uncommitted isolation level
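To avoid exposing the password on the command line, it can be read interactively with -P or from a file with --password-file; in this sketch the host, dataset, and file path are placeholders:
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser --password-file /user/someuser/mainframe.password \
    --target-dir /data/mainframe/input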
Import control arguments:
--as-avrodatafile | Imports data as Avro Data Files
--as-sequencefile | Imports data to SequenceFiles
--as-textfile | Imports data as plain text (default)
--as-parquetfile | Imports data to Parquet files
--delete-target-dir | Deletes the import target directory if it exists
-m,--num-mappers <n> | Uses n map tasks to import in parallel
--target-dir <dir> | Sets the HDFS destination directory
--warehouse-dir <dir> | Sets an HDFS parent directory for the table destination
-z,--compress | Enables compression
--compression-codec <c> | Specifies the Hadoop compression codec to use (default is gzip)
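As a sketch of the storage options (all names and paths are placeholders), the following stores the records as compressed SequenceFiles and clears any previous output first:
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser -P \
    --as-sequencefile --compress --delete-target-dir \
    --target-dir /data/mainframe/input-seq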
Output line formatting arguments:
--enclosed-by <char> | Sets a required field-enclosing character
--escaped-by <char> | Sets an escape character
--fields-terminated-by <char> | Sets a field separator character
--lines-terminated-by <char> | Sets an end-of-line character
--mysql-delimiters | Uses the MySQL default delimiter set: fields: , lines: \n escaped-by: \ optionally-enclosed-by: '
--optionally-enclosed-by <char> | Sets an optional field-enclosing character
Since a mainframe record contains only one field, files produced by a delimited-format import will not contain any field delimiter. However, the field may be enclosed with an enclosing character or escaped by an escape character.
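For example, to wrap each record in double quotes and escape embedded quotes with a backslash (host and dataset names are placeholders):
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser -P \
    --enclosed-by '\"' --escaped-by '\\' \
    --target-dir /data/mainframe/enclosed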
Input parsing arguments:
--input-enclosed-by <char> | Sets a character that encloses the input
--input-escaped-by <char> | Sets an input escape character
--input-fields-terminated-by <char> | Sets an input field separator
--input-lines-terminated-by <char> | Sets an input end-of-line character
--input-optionally-enclosed-by <char> | Sets an optional field-enclosing character for the input
When Sqoop imports data to HDFS, it generates a Java class that can reinterpret the text files it creates when doing a delimited-format import.
The delimiters are chosen with arguments such as --fields-terminated-by; this controls both how the data is written to disk and how the generated parse() method reinterprets this data.
The delimiters used by the parse() method can be chosen independently of the output arguments by using --input-fields-terminated-by, and so on.
This is useful, for example, to generate classes that can parse records created with one set of delimiters and emit the records to a different set of files using a separate set of delimiters.
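As a sketch of this separation (all names are placeholders), the following writes records enclosed in double quotes while the generated class's parse() method is configured to expect records enclosed in single quotes:
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser -P \
    --enclosed-by '\"' --input-enclosed-by "'" \
    --target-dir /data/mainframe/requoted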
Hive arguments:
--create-hive-table | If set, the job fails if the target Hive table exists
--hive-home <dir> | Overrides $HIVE_HOME
--hive-import | Imports tables into Hive (uses Hive's default delimiters if none are set)
--hive-overwrite | Overwrites existing data in the Hive table
--hive-table <table-name> | Sets the table name to use when importing to Hive
--hive-drop-import-delims | Drops \n, \r, and \01 from string fields when importing to Hive
--hive-delims-replacement | Replaces \n, \r, and \01 in string fields with a user-defined string when importing to Hive
--hive-partition-key | Sets the Hive partition key
--hive-partition-value <v> | Sets the Hive partition value
--map-column-hive <map> | Overrides the default mapping from SQL data types to Hive data types. If you specify commas in this argument, use URL-encoded keys and values, for example, use DECIMAL(1%2C%201) instead of DECIMAL(1, 1)
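For example, to load the imported records into a Hive table rather than leaving them as plain HDFS files (the table and other names are placeholders):
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser -P \
    --hive-import --hive-table mainframe_input --hive-overwrite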
HBase arguments:
--column-family <family> | Sets the target column family for the import
--hbase-create-table | If specified, creates missing HBase tables
--hbase-row-key <col> | Specifies which input column to use as the row key. If the input table contains a composite key, <col> must be a comma-separated list of composite key attributes
--hbase-table <table-name> | Specifies an HBase table to use as the target instead of HDFS
--hbase-bulkload | Enables bulk loading
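Similarly, records can be written into an HBase table instead of HDFS; in this sketch the table, column family, and row-key column are placeholder names:
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser -P \
    --hbase-table mainframe_records --column-family data \
    --hbase-row-key record_key --hbase-create-table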
Code generation arguments:
--bindir <dir> | Sets the output directory for compiled objects
--class-name <name> | Specifies a name for the generated class. This overrides --package-name. When combined with --jar-file, sets the input class
--jar-file <file> | Disables code generation; the provided JAR is used instead
--map-column-java <m> | Overrides the default mapping from SQL type to Java type for configured columns
--outdir <dir> | Sets the output directory for generated code
--package-name <name> | Puts auto-generated classes into the specified package
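Finally, the code generation arguments control where and how the record class is produced; a sketch with placeholder class and directory names:
$ sqoop import-mainframe --connect zos-host --dataset MYUSER.INPUT.DATA \
    --username someuser -P \
    --target-dir /data/mainframe/input \
    --class-name MainframeRecord \
    --outdir ./generated-src --bindir ./generated-bin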