codegen

The codegen tool generates Java classes that encapsulate and interpret imported records. The Java definition of a record is generated as part of the import process, but code generation can also be run separately. For example, if the Java source is lost, it can be recreated. You can also create new versions of a class that use different delimiters between fields, and so on.

The tool usage is shown below.

$ sqoop codegen <generic-args> <codegen-args>
$ sqoop-codegen <generic-args> <codegen-args>
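
As an illustration of the usage above, the following sketch regenerates the record class for a single table. The JDBC URI and the table name are hypothetical. The `--table` option, which selects the source table, is a standard Sqoop option that is not listed in the tables on this page. The command is printed first and runs only if `sqoop` is found on `PATH`.

```shell
# Hypothetical connection details; replace with values for your environment.
JDBC_URI="jdbc:mysql://db.example.com/corp"
TABLE="employees"

# Collect the arguments as positional parameters so that quoting is preserved.
set -- codegen --connect "$JDBC_URI" --table "$TABLE"

# Print the command for review, then run it only if Sqoop is installed.
printf '%s\n' "sqoop $*"
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
fi
```
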
Common arguments

--connect <jdbc-uri>

Specifies the JDBC connection string

--connection-manager <class-name>

Specifies the connection manager class to use

--connection-param-file <filename>

Specifies optional properties file that provides connection parameters

--driver <class-name>

Specifies the JDBC driver class to use

--hadoop-mapred-home <dir>

Overrides $HADOOP_MAPRED_HOME

--help

Prints usage instructions

--password-file

Sets the path to a file containing the authentication password

-P

Reads the password from the console

--password <password>

Specifies the authentication password

--username <username>

Specifies the authentication username

--verbose

Prints more information while working

--relaxed-isolation

Instructs Sqoop to use the read-uncommitted isolation level

Code generation arguments

--bindir <dir>

Sets the output directory for compiled objects

--class-name <name>

Specifies a name for the generated class. This overrides --package-name. When combined with --jar-file, sets the input class

--jar-file <file>

Disables code generation; the provided JAR is used instead

--map-column-java <m>

Overrides the default mapping from SQL type to Java type for column <m>

--outdir <dir>

Sets the output directory for generated code

--package-name <name>

Puts auto-generated classes into the specified package
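
The code generation arguments can be combined in one call. The sketch below assumes a hypothetical `employees` table with an `id` column, hypothetical output directories, and a hypothetical package name. It relies on the `column=JavaType` form of `--map-column-java` and on the `--table` option, which is not listed on this page. Because `--class-name` overrides `--package-name`, only the latter is used here.

```shell
# Hypothetical connection details and output locations.
JDBC_URI="jdbc:mysql://db.example.com/corp"
TABLE="employees"
SRC_DIR="./generated/src"        # target for generated .java files
BIN_DIR="./generated/classes"    # target for compiled classes and the JAR

# Map the hypothetical "id" column to java.lang.String instead of the default type.
set -- codegen --connect "$JDBC_URI" --table "$TABLE" \
  --outdir "$SRC_DIR" --bindir "$BIN_DIR" \
  --package-name com.example.records \
  --map-column-java id=String

# Print the command for review, then run it only if Sqoop is installed.
printf '%s\n' "sqoop $*"
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
fi
```

Keeping `--outdir` and `--bindir` separate makes it easier to put the generated source under version control while treating the compiled output as disposable.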

Output line formatting arguments

--enclosed-by <char>

Sets a required field enclosing character

--escaped-by <char>

Sets an escape character

--fields-terminated-by <char>

Sets a field separator character

--lines-terminated-by <char>

Sets an end-of-line character

--mysql-delimiters

Uses the MySQL default delimiter set: fields terminated by a comma (,), lines terminated by \n, escaped by \, optionally enclosed by '

--optionally-enclosed-by <char>

Sets an optional field enclosing character
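
A possible way to set output delimiters when regenerating a class is sketched below. The connection details and table name are hypothetical, and `--table` is not listed on this page. The delimiter values are single-quoted so that the shell passes escape sequences such as `\t` to Sqoop unchanged, and Sqoop interprets them itself.

```shell
# Hypothetical connection details.
JDBC_URI="jdbc:mysql://db.example.com/corp"
TABLE="employees"

# Tab-separated fields, newline-terminated records, and double quotes
# around fields only where needed.
set -- codegen --connect "$JDBC_URI" --table "$TABLE" \
  --fields-terminated-by '\t' \
  --lines-terminated-by '\n' \
  --optionally-enclosed-by '\"'

# printf '%s' prints the backslash sequences literally instead of expanding them.
printf '%s\n' "sqoop $*"
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
fi
```
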

Input parsing arguments

--input-enclosed-by <char>

Sets a required field-enclosing character for input

--input-escaped-by <char>

Sets an input escape character

--input-fields-terminated-by <char>

Sets an input field separator

--input-lines-terminated-by <char>

Sets an input end-of-line character

--input-optionally-enclosed-by <char>

Sets an optional field-enclosing character for input
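
The input parsing arguments matter when the generated class has to read files that were written with non-default delimiters. The sketch below assumes hypothetical pipe-delimited data for a hypothetical `employees` table, and it uses the `--table` option, which is not listed on this page.

```shell
# Hypothetical connection details.
JDBC_URI="jdbc:mysql://db.example.com/corp"
TABLE="employees"

# The pipe character is quoted so that the shell does not treat it as a pipeline.
set -- codegen --connect "$JDBC_URI" --table "$TABLE" \
  --input-fields-terminated-by '|' \
  --input-lines-terminated-by '\n'

# Print the command for review, then run it only if Sqoop is installed.
printf '%s\n' "sqoop $*"
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
fi
```
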

Hive arguments

--create-hive-table

If set, the job fails if the target Hive table already exists

--hive-home <dir>

Overrides $HIVE_HOME

--hive-import

Imports tables into Hive (uses Hive's default delimiters if none are set)

--hive-overwrite

Overwrites existing data in the Hive table

--hive-table <table-name>

Sets the table name to use when importing to Hive

--hive-drop-import-delims

Drops \n, \r, and \01 from string fields when importing to Hive

--hive-delims-replacement

Replaces \n, \r, and \01 in string fields with a user-defined string when importing to Hive

--hive-partition-key

Sets the Hive partition key

--hive-partition-value <v>

Sets the Hive partition value

--map-column-hive <map>

Overrides the default mapping from SQL data types to Hive data types. If the argument contains commas, use URL-encoded keys and values. For example, use DECIMAL(1%2C%201) instead of DECIMAL(1, 1)

If Hive arguments are provided to the codegen tool, Sqoop generates a file containing the HQL statements to create a table and load data.
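
A minimal sketch of passing Hive arguments to codegen follows. The table names and the `salary` column are hypothetical, and `--table` is not listed on this page. The `--map-column-hive` value is quoted because of the parentheses, and its comma and space are URL-encoded as described above.

```shell
# Hypothetical connection details and Hive table name.
JDBC_URI="jdbc:mysql://db.example.com/corp"
TABLE="employees"
HIVE_TABLE="employees_hive"

# DECIMAL(10, 2) is written as DECIMAL(10%2C%202) so that the comma is not
# read as a separator between column mappings.
set -- codegen --connect "$JDBC_URI" --table "$TABLE" \
  --hive-import --hive-table "$HIVE_TABLE" \
  --map-column-hive 'salary=DECIMAL(10%2C%202)'

# printf with a fixed '%s' format keeps the percent signs in the value intact.
printf '%s\n' "sqoop $*"
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
fi
```
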
