It’s all about Apache Sqoop… contd...
In the previous post, we have seen Sqoop Imports, Sqoop Exports and Sqoop Jobs. We will cover the remaining topics in this post.
Sqoop-Codegen – Purpose:
The codegen tool generates Java classes
which encapsulate and interpret imported records. The Java definition of a
record is instantiated as part of the import process, but can also be performed
separately. For example, if Java source is lost, it can be recreated. New
versions of a class can be created which use different delimiters between
fields, and so on.
In
a short, this tool helps us to generate the Data Access Object related to a
RDBMS table along with getter and setter methods in it. All of the database
table columns will be derived in that. The output of the codegen will be a jar file which contains the java file and
compiled .class file.
Syntax will be
$ sqoop codegen (generic-args) (codegen-args)
$ sqoop-codegen (generic-args) (codegen-args)
The
following is some of the Code Generation Arguments:
--bindir
<dir>
|
Output
directory for compiled objects
|
--class-name
<name>
|
Sets
the generated class name. This overrides --package-name. When combined
with --jar-file, sets the input class.
|
--jar-file
<file>
|
Disable
code generation; use specified jar
|
--outdir
<dir>
|
Output
directory for generated code
|
--package-name
<name>
|
Put
auto-generated classes in this package
|
Invocation: The
following is the syntax for the invocation of codegen.
$ sqoop codegen --connect jdbc:mysql://<db_host>/<db_name>
--username <<user> --password <password>
--table <table_name>
Sqoop-Create-Hive_Tables – Purpose:
The create-hive-table tool populates a Hive metastore with a
definition for a table based on a database table previously imported to HDFS,
or one planned to be imported. This effectively performs the "--hive-import" step of sqoop-import without running the
preceding import.
If data was already loaded to HDFS,
you can use this tool to finish the pipeline of importing the data to Hive. You
can also create Hive tables with this tool; data then can be imported and
populated into the target after a preprocessing step run by the user.
Syntax will be
$ sqoop create-hive-table (generic-args) (create-hive-table-args)
$ sqoop-create-hive-table (generic-args) (create-hive-table-args)
Hive Aruguments: The following are some of the Hive
arguments used to import
--hive-home
<dir>
|
Override $HIVE_HOME
|
--hive-overwrite
|
Overwrite
existing data in the Hive table.
|
--create-hive-table
|
If
set, then the job will fail if the target hive table exits. By default, this
property is false.
|
--hive-table
<table-name>
|
Sets
the table name to use when importing to Hive.
|
--table
|
The
database table to read the definition from.
|
The following is the example
snippet to import hive tables.
$ sqoop create-hive-table --connect
jdbc:mysql://<db_host>/<db_name>
--username <<user> --password <password>
--table <table_name>
--hive-table <hive_table_name>
Sqoop-Eval – Purpose:
The eval tool
allows users to quickly run simple SQL queries against a database; results are
printed to the console. This allows users to preview their import queries to
ensure they import the data they expect.
The eval tool
is provided for evaluation purpose only. You can use it to verify database
connection from within the Sqoop or to test simple queries. It’s not supposed
to be used in production workflows.
Syntax will be
$ sqoop eval (generic-args) (create-hive-table-args)
$ sqoop-eval (generic-args) (create-hive-table-args)
The SQL Evaluation arguments will be
-e,--query <statement> Execute statement in
SQL.
Invocations Examples of Eval would be,
$ sqoop eval --connect jjdbc:mysql://<db_host>/<db_name>
--username <<user> --password <password>
--query "<<SELECT QUERY>>"
$ sqoop eval --connect jjdbc:mysql://<db_host>/<db_name>
--username <<user> --password <password>
-e "<<INSERT QUERY>>"
Sqoop-List-Databases – Purpose:
List database schemas
present on a server.
Syntax will be
$ sqoop list-databases (generic-args) (list-databases-args)
$ sqoop-list-databases (generic-args) (list-databases-args)
Invocation example:
$ sqoop list-databases --connect
jdbc:mysql://<db_host>/<db_name>
--username <<user> --password <password>
Sqoop-List-Tables – Purpose:
List all tables present
in a database
Syntax will be
$ sqoop list-tables (generic-args) (list-tables-args)
$ sqoop-list-tables (generic-args) (list-tables-args)
Invocation example:
$ sqoop list-tables --connect
jdbc:mysql://<db_host>/<db_name>
--username <<user> --password <password>
--schema <schema_name>
Other Tools:
$
sqoop help
|
List all available
tools in Sqoop
|
$
sqoop help import
|
Displaying usage
instructions for the Sqoop Import
tool
|
$
sqoop version
|
Displaying version
information for Sqoop
|
Contact us at: