import-all-tables

The import-all-tables tool imports a set of tables from an RDBMS to HDFS. The data from each table is stored in a separate directory in HDFS.

For the import-all-tables tool to be useful, the following conditions must be met:

  • Each table must have a single-column primary key or the --autoreset-to-one-mapper option must be used.

  • You must intend to import all columns of each table.

  • You must not use a non-default splitting column, nor impose any conditions using a WHERE clause.

The tool usage is shown below.

$ sqoop import-all-tables <generic-args> <import-args>
$ sqoop-import-all-tables <generic-args> <import-args>

Although the generic Hadoop arguments must precede any import-all-tables arguments, the import-all-tables arguments can be specified in any order with respect to one another.
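For example, a minimal sketch of an invocation against a hypothetical MySQL database could look as follows; the connection string, username, password file path, and target directory are placeholders, not values from this guide:

$ sqoop import-all-tables \
    --connect jdbc:mysql://db.example.com/corp \
    --username someuser \
    --password-file /user/someuser/.password \
    --warehouse-dir /data/corp

Each table of the corp database is then imported into its own subdirectory under /data/corp in HDFS.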

Common arguments

--connect <jdbc-uri>

Specifies the JDBC connection string

--connection-manager <class-name>

Specifies the connection manager class to use

--connection-param-file <filename>

Specifies optional properties file that provides connection parameters

--driver <class-name>

Specifies the JDBC driver class to use

--hadoop-mapred-home <dir>

Overrides $HADOOP_MAPRED_HOME

--help

Prints usage instructions

--password-file

Sets the path to a file containing the authentication password

-P

Reads the password from the console

--password <password>

Specifies the authentication password

--username <username>

Specifies the authentication username

--verbose

Prints more information while working

--relaxed-isolation

Instructs Sqoop to use the read-uncommitted isolation level
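To illustrate how the connection options combine, the sketch below passes extra JDBC properties through --connection-param-file. The file name and property names are only illustrative; valid property names depend on the JDBC driver in use:

$ cat conn-params.properties
# driver-specific JDBC connection properties (illustrative)
connectTimeout=30000
useSSL=false

$ sqoop import-all-tables \
    --connect jdbc:mysql://db.example.com/corp \
    --connection-param-file conn-params.properties \
    --username someuser \
    --password-file /user/someuser/.password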

Import control arguments

--as-avrodatafile

Imports data as Avro Data Files

--as-sequencefile

Imports data to SequenceFiles

--as-textfile

Imports data as plain text (default)

--as-parquetfile

Imports data to Parquet files

--autoreset-to-one-mapper

Makes the import use one mapper if a table has no primary key and no split-by column is provided. Cannot be used with the --split-by <column-name> option

--direct

Uses the direct connector for the database (if one exists)

--exclude-tables <tables>

Specifies a comma-separated list of tables to exclude from the import process

--inline-lob-limit <n>

Sets the maximum size for an inline LOB

-m,--num-mappers <n>

Uses n map tasks to import in parallel

--warehouse-dir <dir>

Sets the HDFS parent directory for the table destination directories

-z,--compress

Enables compression

--compression-codec <c>

Specifies a Hadoop compression codec to use (default is gzip)

These arguments behave in the same way as when used with the sqoop-import tool, but the --table, --split-by, --columns, and --where arguments are not allowed for import-all-tables. The --exclude-tables argument is intended for import-all-tables only.
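For instance, to skip some tables and fall back to a single mapper for any table that lacks a primary key, an invocation along these lines could be used (the database and table names are hypothetical):

$ sqoop import-all-tables \
    --connect jdbc:mysql://db.example.com/corp \
    --username someuser \
    -P \
    --exclude-tables audit_log,tmp_staging \
    --autoreset-to-one-mapper \
    --num-mappers 4 \
    --warehouse-dir /data/corp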

Output line formatting arguments

--enclosed-by <char>

Sets a required field enclosing character

--escaped-by <char>

Sets an escape character

--fields-terminated-by <char>

Sets a field separator character

--lines-terminated-by <char>

Sets an end-of-line character

--mysql-delimiters

Uses the MySQL default delimiter set: , for fields, \n for lines, \ for escaping, and ' for optional enclosing

--optionally-enclosed-by <char>

Sets an optional field enclosing character
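As an illustration, the sketch below imports all tables as plain text with tab-separated fields; the connection details are placeholders:

$ sqoop import-all-tables \
    --connect jdbc:mysql://db.example.com/corp \
    --username someuser \
    -P \
    --as-textfile \
    --fields-terminated-by '\t' \
    --lines-terminated-by '\n' \
    --warehouse-dir /data/corp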

Input parsing arguments

--input-enclosed-by <char>

Sets a required field enclosing character for the input

--input-escaped-by <char>

Sets an input escape character

--input-fields-terminated-by <char>

Sets an input field separator

--input-lines-terminated-by <char>

Sets an input end-of-line character

--input-optionally-enclosed-by <char>

Sets an optional field enclosing character for the input

Hive arguments

--create-hive-table

If set, then the job fails if the target Hive table exists

--hive-home <dir>

Overrides $HIVE_HOME

--hive-import

Imports tables into Hive (uses Hive's default delimiters if none are set)

--hive-overwrite

Overwrites existing data in the Hive table

--hive-table <table-name>

Sets the table name to use when importing to Hive

--hive-drop-import-delims

Drops \n, \r, and \01 from string fields when importing to Hive

--hive-delims-replacement

Replaces \n, \r, and \01 in string fields with a user-defined string when importing to Hive

--hive-partition-key

Sets the Hive partition key

--hive-partition-value <v>

Sets the Hive partition value

--map-column-hive <map>

Overrides the default mapping from SQL data types to Hive data types. If you specify commas in this argument, use URL-encoded keys and values, for example, use DECIMAL(1%2C%201) instead of DECIMAL(1, 1)
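For example, a sketch of importing all tables straight into Hive, overwriting any data already present in the Hive tables (connection details are placeholders):

$ sqoop import-all-tables \
    --connect jdbc:mysql://db.example.com/corp \
    --username someuser \
    -P \
    --hive-import \
    --hive-overwrite \
    --hive-drop-import-delims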

Code generation arguments

--bindir <dir>

Sets the output directory for compiled objects

--jar-file <file>

Disables code generation; the provided JAR is used instead

--outdir <dir>

Sets the output directory for generated code

--package-name <name>

Puts auto-generated classes into the specified package

The import-all-tables tool does not support the --class-name argument. You may, however, specify a package with --package-name in which all generated classes will be placed.
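For example, the generated classes for all imported tables can be placed into a single package; the package name below is only illustrative:

$ sqoop import-all-tables \
    --connect jdbc:mysql://db.example.com/corp \
    --username someuser \
    -P \
    --package-name com.example.corp.model \
    --warehouse-dir /data/corp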
