Importing from MariaDB with Sqoop2 (1.99.6)
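All of the commands below run in the Sqoop2 shell. Before anything else, point the shell at the Sqoop server (a minimal sketch; the host and port match the server URL that shows up in the job submission output later in this post):

sqoop:000> set server --host localhost --port 12000 --webapp sqoop
sqoop:000> show version --all

show version --all confirms the shell can actually reach the server. Then list the registered connectors and note the ids of the generic JDBC connector (FROM side) and the HDFS connector (TO side):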

sqoop:000> show connector
+----+------------------------+---------+------------------------------------------------------+----------------------+
| Id |          Name          | Version |                        Class                         | Supported Directions |
+----+------------------------+---------+------------------------------------------------------+----------------------+
| 1  | generic-jdbc-connector | 1.99.6  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO              |
| 2  | kite-connector         | 1.99.6  | org.apache.sqoop.connector.kite.KiteConnector        | FROM/TO              |
| 3  | hdfs-connector         | 1.99.6  | org.apache.sqoop.connector.hdfs.HdfsConnector        | FROM/TO              |
| 4  | kafka-connector        | 1.99.6  | org.apache.sqoop.connector.kafka.KafkaConnector      | TO                   |
+----+------------------------+---------+------------------------------------------------------+----------------------+
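The connectors' full link and job configuration inputs can also be dumped before creating anything (a sketch):

sqoop:000> show connector --all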



=======Create the JDBC link




sqoop:000> create link -c 1
Creating link for connector with id 1
0    [main] WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Please fill following values to create new link object
Name: aa

Link configuration

JDBC Driver Class: org.mariadb.jdbc.Driver
JDBC Connection String: jdbc:mariadb://localhost/hive
Username: hive
Password: **********
JDBC Connection Properties: 
There are currently 0 values in the map:
entry# protocol=tcp
There are currently 1 values in the map:
protocol = tcp
entry# 
New link was successfully created with validation status OK and persistent id 1
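For org.mariadb.jdbc.Driver to resolve, the MariaDB Connector/J jar has to be on the Sqoop server's classpath, otherwise link validation will fail because the driver class cannot be loaded. One way to install it, as a sketch (the jar version and target directory are assumptions; some installs register extra jars through common.loader in server/conf/catalina.properties instead):

# on the Sqoop server host; jar version and paths are assumptions
cp mariadb-java-client-1.5.9.jar $SQOOP_HOME/server/lib/
$SQOOP_HOME/bin/sqoop2-server stop
$SQOOP_HOME/bin/sqoop2-server start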






=======Create the HDFS link




sqoop:000> create link -c 3
Creating link for connector with id 3
Please fill following values to create new link object
Name: hd

Link configuration

HDFS URI: hdfs://hadoop-cluster
Hadoop conf directory: /home/hadoop/etc/hadoop
New link was successfully created with validation status OK and persistent id 4
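Note the persistent id (4 here): create job takes link ids, not connector ids. Both links can be double-checked at this point (a sketch; --all also prints each link's stored configuration):

sqoop:000> show link
sqoop:000> show link --all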

================Create the job


sqoop:000> create job -f 1 -t 4
Creating job for links with from id 1 and to id 4
Please fill following values to create new job object
Name: j11

From database configuration

Schema name: tpch_1g
Table name: NATION
Table SQL statement: 
Table column names: 
Partition column name: N_REGIONKEY
Null value allowed for the partition column: 
Boundary query: 

Incremental read

Check column: 
Last value: 

To HDFS configuration

Override null value: 
Null value: 
Output format: 
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
Choose: 0
Compression format: 
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
  8 : CUSTOM
Choose: 0
Custom compression format: 
Output directory: /home/sqoop/tmp  
Append mode: 

Throttling resources

Extractors: 
Loaders: 
New job was successfully created with validation status OK  and persistent id 7
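Leaving Extractors and Loaders blank keeps the defaults; Extractors roughly corresponds to the number of map tasks, with the table split across them on the partition column (N_REGIONKEY here). A job's answers can be revised later with the update command (a sketch; the -j flag is assumed to follow the same spelling as start job below):

sqoop:000> update job -j 7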



=======Run the job

First, list the existing jobs:


sqoop:000> show job
+----+------+----------------+--------------+---------+
| Id | Name | From Connector | To Connector | Enabled |
+----+------+----------------+--------------+---------+
| 7  | j11  | 1              | 3            | true    |
| 8  | gg   | 1              | 3            | true    |
| 1  | job1 | 1              | 3            | true    |
| 2  | job2 | 1              | 3            | true    |
| 3  | j3   | 1              | 3            | true    |
| 4  | t4   | 1              | 3            | true    |
| 5  | t5   | 1              | 3            | true    |
| 6  | j6   | 1              | 3            | true    |
+----+------+----------------+--------------+---------+
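The Enabled column can be toggled per job, which is handy for parking the leftover test jobs above without deleting them (a sketch; flag spelling assumed to match the other job commands in this post):

sqoop:000> disable job -j 1
sqoop:000> enable job -j 1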




Start job 7 in synchronous mode (-s waits and prints progress until the job finishes):

sqoop:000> start job -j 7 -s

Submission details
Job ID: 7
Server URL: http://localhost:12000/sqoop/
Created by: hadoop
Creation date: 2018-05-03 21:39:10 EDT
Lastly updated by: hadoop
External ID: job_1525389342277_0008
    http://0.0.0.0:8089/proxy/application_1525389342277_0008/
2018-05-03 21:39:10 EDT: BOOTING  - Progress is not available
2018-05-03 21:39:21 EDT: RUNNING  - 0.00 %
2018-05-03 21:39:31 EDT: RUNNING  - 12.50 %
2018-05-03 21:39:42 EDT: RUNNING  - 12.50 %
2018-05-03 21:39:52 EDT: RUNNING  - 37.50 %
2018-05-03 21:40:03 EDT: SUCCEEDED 
Counters:
    org.apache.hadoop.mapreduce.FileSystemCounter
        FILE_LARGE_READ_OPS: 0
        FILE_WRITE_OPS: 0
        HDFS_READ_OPS: 4
        HDFS_BYTES_READ: 509
        HDFS_LARGE_READ_OPS: 0
        FILE_READ_OPS: 0
        FILE_BYTES_WRITTEN: 1858916
        FILE_BYTES_READ: 0
        HDFS_WRITE_OPS: 4
        HDFS_BYTES_WRITTEN: 2299
    org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
        BYTES_WRITTEN: 0
    org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
        BYTES_READ: 0
    org.apache.hadoop.mapreduce.JobCounter
        TOTAL_LAUNCHED_MAPS: 10
        MB_MILLIS_MAPS: 99164160
        SLOTS_MILLIS_REDUCES: 0
        VCORES_MILLIS_MAPS: 96840
        NUM_FAILED_MAPS: 6
        SLOTS_MILLIS_MAPS: 96840
        OTHER_LOCAL_MAPS: 10
        MILLIS_MAPS: 96840
    org.apache.sqoop.submission.counter.SqoopCounters
        ROWS_READ: 25
        ROWS_WRITTEN: 25
    org.apache.hadoop.mapreduce.TaskCounter
        SPILLED_RECORDS: 0
        MERGED_MAP_OUTPUTS: 0
        VIRTUAL_MEMORY_BYTES: 8342585344
        MAP_INPUT_RECORDS: 0
        SPLIT_RAW_BYTES: 509
        MAP_OUTPUT_RECORDS: 25
        FAILED_SHUFFLE: 0
        PHYSICAL_MEMORY_BYTES: 492388352
        GC_TIME_MILLIS: 491
        CPU_MILLISECONDS: 5210
        COMMITTED_HEAP_BYTES: 121896960
Job executed successfully
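The Sqoop counters line up: ROWS_READ and ROWS_WRITTEN are both 25, the full TPC-H NATION table. The result can be cross-checked against the files written to the output directory (a sketch; the file names the HDFS connector generates will vary):

$ hdfs dfs -ls /home/sqoop/tmp
$ hdfs dfs -cat '/home/sqoop/tmp/*' | head

For jobs started without -s, status job -j 7 polls progress and stop job -j 7 cancels a running submission.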
