
Sqoop - Import

This chapter describes how to import data from a MySQL database into Hadoop HDFS. The ‘Import tool’ imports individual tables from an RDBMS into HDFS. Each row of a table is treated as a record in HDFS. Records are stored either as text in text files or as binary data in Avro files and SequenceFiles.

Syntax

The following syntax is used to import data into HDFS.
$ sqoop import (generic-args) (import-args) 
$ sqoop-import (generic-args) (import-args)
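
As a rough illustration of the two argument groups, the sketch below puts one generic Hadoop argument (`-D`, here setting a job name) ahead of the import arguments, since generic arguments have to come before tool-specific ones. The job name `emp_import` is made up for this example, and the connection details are the ones used later in this chapter. The script only prints the assembled command so it can be reviewed before being run on a machine with Sqoop installed.

```shell
#!/bin/sh
# Generic Hadoop arguments (-D, -conf, -fs, ...) go first.
generic_args="-D mapreduce.job.name=emp_import"   # hypothetical job name

# Tool-specific import arguments follow the generic ones.
import_args="--connect jdbc:mysql://localhost/userdb --username root --table emp -m 1"

# Assemble the full command line and print it instead of executing it.
import_cmd="sqoop import $generic_args $import_args"
echo "$import_cmd"
```

Swapping the two groups would hand `-D` to the import tool itself, which is why the order matters.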

Example

Let us take an example of three tables named emp, emp_add, and emp_contact, which are in a database called userdb on a MySQL database server.
The three tables and their data are as follows.

emp:

id    name      deg           salary  dept
1201  gopal     manager       50,000  TP
1202  manisha   Proof reader  50,000  TP
1203  khalil    php dev       30,000  AC
1204  prasanth  php dev       30,000  AC
1205  kranthi   admin         20,000  TP

emp_add:

id    hno   street    city
1201  288A  vgiri     jublee
1202  108I  aoc       sec-bad
1203  144Z  pgutta    hyd
1204  78B   old city  sec-bad
1205  720X  hitec     sec-bad

emp_contact:

id    phno     email
1201  2356742  gopal@tp.com
1202  1661663  manisha@tp.com
1203  8887776  khalil@ac.com
1204  9988774  prasanth@ac.com
1205  1231231  kranthi@tp.com

Importing a Table

The Sqoop ‘import’ tool copies the data of a table into the Hadoop file system as a text file or a binary file.
The following command imports the emp table from the MySQL database server into HDFS.
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp -m 1
If it is executed successfully, then you get the following output.
14/12/22 15:24:54 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5
14/12/22 15:24:56 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/12/22 15:24:56 INFO tool.CodeGenTool: Beginning code generation
14/12/22 15:24:58 INFO manager.SqlManager: Executing SQL statement: 
   SELECT t.* FROM `emp` AS t LIMIT 1
14/12/22 15:24:58 INFO manager.SqlManager: Executing SQL statement: 
   SELECT t.* FROM `emp` AS t LIMIT 1
14/12/22 15:24:58 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
14/12/22 15:25:11 INFO orm.CompilationManager: Writing jar file: 
   /tmp/sqoop-hadoop/compile/cebe706d23ebb1fd99c1f063ad51ebd7/emp.jar
-----------------------------------------------------
-----------------------------------------------------
14/12/22 15:25:40 INFO mapreduce.Job: The url to track the job: 
   http://localhost:8088/proxy/application_1419242001831_0001/
14/12/22 15:26:45 INFO mapreduce.Job: Job job_1419242001831_0001 running in uber mode : 
   false
14/12/22 15:26:45 INFO mapreduce.Job: map 0% reduce 0%
14/12/22 15:28:08 INFO mapreduce.Job: map 100% reduce 0%
14/12/22 15:28:16 INFO mapreduce.Job: Job job_1419242001831_0001 completed successfully
-----------------------------------------------------
-----------------------------------------------------
14/12/22 15:28:17 INFO mapreduce.ImportJobBase: Transferred 145 bytes in 177.5849 seconds 
   (0.8165 bytes/sec)
14/12/22 15:28:17 INFO mapreduce.ImportJobBase: Retrieved 5 records.
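
In a script, the record count reported in the last log line can be checked automatically. The following is a minimal sketch, assuming the console output of the import was captured to a file; here that file is recreated in a temporary location with the final line shown above, and the expected count of 5 is simply the number of rows in the emp table.

```shell
#!/bin/sh
# Recreate the final log line of the run above in a temporary file.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
14/12/22 15:28:17 INFO mapreduce.ImportJobBase: Retrieved 5 records.
EOF

# Extract the number from the "Retrieved N records." line.
retrieved=$(sed -n 's/.*Retrieved \([0-9][0-9]*\) records\..*/\1/p' "$log_file")
rm -f "$log_file"

expected=5   # number of rows in the emp table
if [ "$retrieved" = "$expected" ]; then
  echo "OK: $retrieved records imported"
else
  echo "Mismatch: expected $expected, got ${retrieved:-none}" >&2
fi
```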
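To verify the imported data in HDFS, use the following command. Because no target directory was specified, Sqoop writes the files to a directory named after the table inside the user's HDFS home directory, which is what the relative path emp refers to.
$ $HADOOP_HOME/bin/hadoop fs -cat emp/part-m-*
It shows the emp table data, with fields separated by commas (,).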
1201, gopal,    manager,      50000, TP
1202, manisha,  Proof reader, 50000, TP
1203, khalil,   php dev,      30000, AC
1204, prasanth, php dev,      30000, AC
1205, kranthi,  admin,        20000, TP
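
The choice between text and binary storage mentioned at the start of this chapter is made with a format flag on the import command. The sketch below prints one import command per format, using the `--as-textfile` (the default), `--as-avrodatafile`, and `--as-sequencefile` options. The target directory names are invented for the example, and nothing is executed.

```shell
#!/bin/sh
# Print one import command per storage format; nothing is executed.
for format_flag in --as-textfile --as-avrodatafile --as-sequencefile; do
  # Hypothetical target directory derived from the flag, e.g. /emp_as-textfile
  target_dir="/emp_${format_flag#--}"
  echo "sqoop import --connect jdbc:mysql://localhost/userdb --username root --table emp -m 1 $format_flag --target-dir $target_dir"
done
```

Only text files can be inspected directly with `hadoop fs -cat`; the binary formats need a reader that understands them.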

Importing into Target Directory

We can specify the target directory when importing table data into HDFS with the Sqoop import tool.
The target directory is given as an option to the Sqoop import command, using the following syntax. By default the import fails if the directory already exists.
--target-dir <directory in HDFS>
The following command is used to import emp_add table data into ‘/queryresult’ directory.
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp_add \
-m 1 \
--target-dir /queryresult
The following command is used to verify the data imported from the emp_add table into the /queryresult directory.
$ $HADOOP_HOME/bin/hadoop fs -cat /queryresult/part-m-*
It shows the emp_add table data with comma (,) separated fields.
1201, 288A, vgiri,    jublee
1202, 108I, aoc,      sec-bad
1203, 144Z, pgutta,   hyd
1204, 78B,  old city, sec-bad
1205, 720X, hitec,    sec-bad
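
Since an import into a directory that already exists normally fails, it can help to check the target before starting. The sketch below is one possible guard, assuming the `hadoop` client is on the PATH; it uses `hadoop fs -test -d`, which returns success when the path is a directory. The variable name `dir_state` and the messages are just for illustration.

```shell
#!/bin/sh
target_dir=/queryresult   # directory used in the example above

if ! command -v hadoop >/dev/null 2>&1; then
  dir_state="unknown"      # no Hadoop client on this machine
elif hadoop fs -test -d "$target_dir"; then
  dir_state="exists"
else
  dir_state="absent"
fi

case "$dir_state" in
  exists) echo "$target_dir already exists; pick another directory or remove it first" ;;
  absent) echo "$target_dir is free; the import can create it" ;;
  *)      echo "hadoop command not found; cannot check $target_dir" ;;
esac
```

The Sqoop user guide also lists `--delete-target-dir` and `--append` for the cases where the existing directory should be replaced or extended.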

Import Subset of Table Data

We can import a subset of a table using the ‘where’ clause of the Sqoop import tool. Sqoop runs the corresponding SQL query on the database server and stores the result in a target directory in HDFS.
The syntax for the where clause is as follows.
--where <condition>
The following command imports a subset of the emp_add table data: the id and address of the employees who live in Secunderabad (sec-bad).
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp_add \
-m 1 \
--where "city = 'sec-bad'" \
--target-dir /wherequery
The following command is used to verify the imported data in /wherequery directory from the emp_add table.
$ $HADOOP_HOME/bin/hadoop fs -cat /wherequery/part-m-*
It will show you the emp_add table data with comma (,) separated fields.
1202, 108I, aoc,      sec-bad
1204, 78B,  old city, sec-bad
1205, 720X, hitec,    sec-bad
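
To see which rows a condition selects without touching a cluster, the same filter can be mimicked locally on the sample data. The sketch below does this with awk on the emp_add rows from the table at the start of this chapter. It is only an illustration of the selection; Sqoop itself passes the condition to the database rather than filtering text.

```shell
#!/bin/sh
# Sample emp_add rows from this chapter, as comma-separated text.
emp_add_rows='1201,288A,vgiri,jublee
1202,108I,aoc,sec-bad
1203,144Z,pgutta,hyd
1204,78B,old city,sec-bad
1205,720X,hitec,sec-bad'

# Keep rows whose 4th field (city) equals sec-bad, like: city = 'sec-bad'
subset=$(printf '%s\n' "$emp_add_rows" | awk -F',' '$4 == "sec-bad"')
printf '%s\n' "$subset"
```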

Incremental Import

Incremental import is a technique that imports only the newly added rows of a table. It requires the ‘incremental’, ‘check-column’, and ‘last-value’ options.
The following syntax is used for the incremental options in the Sqoop import command. The mode is either append or lastmodified.
--incremental <mode>
--check-column <column name>
--last-value <last check column value>
Let us assume the following row has been newly added to the emp table.
1206, satish p, grp des, 20000, GR
The following command is used to perform the incremental import in the emp table.
$ sqoop import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--table emp \
-m 1 \
--incremental append \
--check-column id \
--last-value 1205
The following command is used to verify the data imported from the emp table into the emp/ directory in HDFS.
$ $HADOOP_HOME/bin/hadoop fs -cat emp/part-m-*
It shows the emp table data with comma (,) separated fields.
1201, gopal,    manager,      50000, TP
1202, manisha,  Proof reader, 50000, TP
1203, khalil,   php dev,      30000, AC
1204, prasanth, php dev,      30000, AC
1205, kranthi,  admin,        20000, TP
1206, satish p, grp des,      20000, GR
The following command is used to see only the newly added rows from the emp table. The pattern part-m-*1 matches the part file whose name ends in 1, which holds the appended rows in this example.
$ $HADOOP_HOME/bin/hadoop fs -cat emp/part-m-*1
It shows the rows newly added to the emp table with comma (,) separated fields.
1206, satish p, grp des, 20000, GR
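
The selection rule behind append mode can be tried out locally: rows whose check column is greater than the last value are picked up, and the largest value seen becomes the last value for the next run. The sketch below mimics this with awk on the sample emp rows. It is an illustration under those assumptions, not Sqoop's implementation, and the variable names are made up for the example.

```shell
#!/bin/sh
# emp rows after the new record was added; id is the check column.
emp_rows='1201,gopal,manager,50000,TP
1202,manisha,Proof reader,50000,TP
1203,khalil,php dev,30000,AC
1204,prasanth,php dev,30000,AC
1205,kranthi,admin,20000,TP
1206,satish p,grp des,20000,GR'

last_value=1205   # value passed as --last-value in the command above

# Append mode picks rows whose check column is greater than last_value.
new_rows=$(printf '%s\n' "$emp_rows" | awk -F',' -v last="$last_value" '$1 > last')

# The largest id seen becomes the --last-value for the next run.
next_last_value=$(printf '%s\n' "$emp_rows" | awk -F',' 'NR == 1 || $1 > max { max = $1 } END { print max }')

echo "new rows: $new_rows"
echo "next --last-value: $next_last_value"
```

In practice the new last value does not have to be tracked by hand: Sqoop prints it at the end of an incremental import, and a saved job created with `sqoop job --create` keeps it between runs.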
