bigdata

Logcenter Project Architecture

August 16, 2017 Architect, Architecture, bigdata, hadoop, hive, network, NoSQL, rdbms

We created a project called LC (log center) for the ops department.
All members of the ops team use this system for lower-layer analysis.
We collect all types of logs, including db-system, crond, security log, cmdlog, API log, etc.
We use an MQ system for log push, driven by a policy center, and we built a new back-end system for search and management.

See the project design: LC-system-design

HBASE migrate table part 1

December 19, 2014 bigdata, hbase

Copy a table between different HBase clusters – Version 0.96.2-hadoop2

Create a test table and insert some records.

$hbase shell
2014-12-19 15:01:36,085 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014

hbase(main):001:0> create 'liuyang:mig_hbase', 'test2','member_id','address','info' 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-2.3.0-cdh5.0.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2014-12-19 15:03:00,695 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0 row(s) in 4.1540 seconds

=> Hbase::Table - liuyang:mig_hbase
hbase(main):002:0> put'liuyang:mig_hbase','test3','info:age','24'
0 row(s) in 0.1990 seconds

hbase(main):003:0> 
hbase(main):004:0* put'liuyang:mig_hbase','test3','info:birthday','1987-06-17'
0 row(s) in 0.0090 seconds

hbase(main):005:0> 
hbase(main):006:0* put'liuyang:mig_hbase','test3','info:company','alibaba'
0 row(s) in 0.0080 seconds

hbase(main):007:0> 
hbase(main):008:0* put'liuyang:mig_hbase','test3','address:contry','china'
0 row(s) in 0.0080 seconds

hbase(main):009:0> 
hbase(main):010:0* put'liuyang:mig_hbase','test3','address:province','liuyang'
0 row(s) in 0.0080 seconds

hbase(main):011:0> 
hbase(main):012:0* put'liuyang:mig_hbase','test3','address:city','hangzhou' 
0 row(s) in 0.0190 seconds

hbase(main):013:0> scan 'liuyang:mig_hbase'
ROW                                              COLUMN+CELL                                                                                                                                   
 test3                                           column=address:city, timestamp=1418972636248, value=hangzhou                                                                                  
 test3                                           column=address:contry, timestamp=1418972634935, value=china                                                                                   
 test3                                           column=address:province, timestamp=1418972635046, value=liuyang                                                                               
 test3                                           column=info:age, timestamp=1418972634471, value=24                                                                                            
 test3                                           column=info:birthday, timestamp=1418972634626, value=1987-06-17                                                                               
 test3                                           column=info:company, timestamp=1418972634764, value=alibaba                                                                                   
1 row(s) in 0.1010 seconds
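
The same setup can be scripted instead of typed. The sketch below only prints the HBase shell statements from the session above; `gen_test_data` is a made-up helper name, and it assumes the `liuyang` namespace already exists on the cluster.

```shell
#!/usr/bin/env bash
# Emit HBase shell statements that recreate the test table used above.
# Table, row key and cell values are taken from the session;
# gen_test_data is a hypothetical helper name.
TABLE='liuyang:mig_hbase'
ROW='test3'

gen_test_data() {
  echo "create '${TABLE}', 'test2', 'member_id', 'address', 'info'"
  # One "column=value" pair per line becomes one put statement.
  while IFS='=' read -r column value; do
    echo "put '${TABLE}', '${ROW}', '${column}', '${value}'"
  done <<'EOF'
info:age=24
info:birthday=1987-06-17
info:company=alibaba
address:contry=china
address:province=liuyang
address:city=hangzhou
EOF
  echo "scan '${TABLE}'"
}

# Print the statements only; nothing is sent to a cluster here.
gen_test_data
```

To run the statements for real, pipe them into the shell: `gen_test_data | hbase shell`.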

————————————————————

Copy table to another cluster

Use distcp to copy the table to the destination cluster. If the two clusters cannot communicate with each other, use the -copyToLocal and -copyFromLocal commands instead.

$hadoop distcp -overwrite   /hbase/data/liuyang/mig_hbase  hdfs://10.0.128.110/hbase/data/liuyang/mig_hbase
14/12/19 15:10:18 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/hbase/data/liuyang/mig_hbase], targetPath=hdfs://10.0.128.110/hbase/data/liuyang/mig_hbase}
14/12/19 15:10:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/12/19 15:10:20 INFO client.RMProxy: Connecting to ResourceManager at vm-master1/10.0.128.32:8032
14/12/19 15:10:23 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
14/12/19 15:10:23 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
14/12/19 15:10:23 INFO client.RMProxy: Connecting to ResourceManager at vm-master1/10.0.128.32:8032
14/12/19 15:10:24 INFO mapreduce.JobSubmitter: number of splits:3
14/12/19 15:10:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1418285943315_0027
14/12/19 15:10:25 INFO impl.YarnClientImpl: Submitted application application_1418285943315_0027
14/12/19 15:10:25 INFO mapreduce.Job: The url to track the job: http://vm-master1:8088/proxy/application_1418285943315_0027/
14/12/19 15:10:25 INFO tools.DistCp: DistCp job-id: job_1418285943315_0027
14/12/19 15:10:25 INFO mapreduce.Job: Running job: job_1418285943315_0027
14/12/19 15:10:37 INFO mapreduce.Job: Job job_1418285943315_0027 running in uber mode : false
14/12/19 15:10:37 INFO mapreduce.Job:  map 0% reduce 0%
14/12/19 15:10:49 INFO mapreduce.Job:  map 100% reduce 0%
14/12/19 15:10:50 INFO mapreduce.Job: Job job_1418285943315_0027 completed successfully
14/12/19 15:10:50 INFO mapreduce.Job: Counters: 33
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=281799
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=3828
		HDFS: Number of bytes written=1075
		HDFS: Number of read operations=59
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=17
	Job Counters 
		Launched map tasks=3
		Other local map tasks=3
		Total time spent by all maps in occupied slots (ms)=110112
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=27528
		Total vcore-seconds taken by all map tasks=27528
		Total megabyte-seconds taken by all map tasks=28188672
	Map-Reduce Framework
		Map input records=9
		Map output records=0
		Input split bytes=354
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=201
		CPU time spent (ms)=4870
		Physical memory (bytes) snapshot=496263168
		Virtual memory (bytes) snapshot=2825789440
		Total committed heap usage (bytes)=297795584
	File Input Format Counters 
		Bytes Read=2399
	File Output Format Counters 
		Bytes Written=0
	org.apache.hadoop.tools.mapred.CopyMapper$Counter
		BYTESCOPIED=1075
		BYTESEXPECTED=1075
		COPY=9
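
The two copy paths described above (distcp when the clusters can reach each other, -copyToLocal / -copyFromLocal when they cannot) can be sketched as a small planner. It only prints the commands. `copy_plan` and the `STAGING` directory are assumptions for illustration; the staged path also assumes `STAGING` exists, that the staged files are moved to a host of the destination cluster by some other means, and that the table directory does not exist there yet.

```shell
#!/usr/bin/env bash
# Print the HDFS commands for moving one table directory between clusters.
# SRC and DST_NN come from the run above; STAGING is a hypothetical,
# already existing local directory.
SRC='/hbase/data/liuyang/mig_hbase'
DST_NN='hdfs://10.0.128.110'
STAGING='/tmp/hbase_staging'

copy_plan() {
  local mode="$1"
  if [ "$mode" = "direct" ]; then
    # Clusters can reach each other: one distributed copy.
    echo "hadoop distcp -overwrite ${SRC} ${DST_NN}${SRC}"
  else
    # No connectivity: download on the source side ...
    echo "hadoop fs -copyToLocal ${SRC} ${STAGING}"
    # ... and upload into the parent directory on the destination side.
    echo "hadoop fs -copyFromLocal ${STAGING}/$(basename "${SRC}") $(dirname "${SRC}")"
  fi
}

copy_plan direct
copy_plan staged
```

Printing instead of executing keeps the sketch safe to run anywhere; pipe the output to `sh` only after checking the paths.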
	

——————————————————————————————-

old hbase cluster

$hadoop fs -ls hdfs://pajkcluster/hbase/data/liuyang/
14/12/19 15:11:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
drwxr-xr-x   - hadoop supergroup          0 2014-12-19 15:10 hdfs://pajkcluster/hbase/data/liuyang/mig_hbase
drwxr-xr-x   - hadoop supergroup          0 2014-12-19 11:48 hdfs://pajkcluster/hbase/data/liuyang/test1
drwxr-xr-x   - hadoop supergroup          0 2014-12-19 14:30 hdfs://pajkcluster/hbase/data/liuyang/test2		


Run hbck to check consistency

Summary:
  member is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  liuyang:test1 is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  liuyang:test2 is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  test_replication is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  hbase:meta is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  hbase:acl is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  liuyang:mig_hbase is okay.
    Number of regions: 0
    Deployed on: 
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
2 inconsistencies detected.
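
In the summary above, liuyang:mig_hbase is the only table with "Number of regions: 0", presumably because the copied files are on HDFS but the region is not yet registered or deployed. A small filter can pick such tables out of a long summary. This is a sketch that assumes the summary layout shown above; `zero_region_tables` is a made-up helper name.

```shell
#!/usr/bin/env bash
# List tables whose hbck summary reports zero regions.
# zero_region_tables is a hypothetical helper; it reads summary text on stdin
# and assumes the "<table> is okay." / "Number of regions: N" layout shown above.
zero_region_tables() {
  awk '
    / is okay\.[[:space:]]*$/               { table = $1 }     # remember the current table
    /Number of regions: 0[[:space:]]*$/     { print table }    # report it if it has no regions
  '
}

# Sample input copied from the summary above.
zero_region_tables <<'EOF'
  hbase:acl is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  liuyang:mig_hbase is okay.
    Number of regions: 0
    Deployed on: 
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
EOF
```

Against a live cluster the input would come from the check itself, for example `hbase hbck | zero_region_tables`.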



Run hbck -repair

2014-12-19 15:13:00,192 DEBUG [hbasefsck-pool1-t26] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/default/member/ff703bdb6418ca85f5056d9948daf9f7
2014-12-19 15:13:00,195 DEBUG [hbasefsck-pool1-t27] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/hbase/meta/1588230740
2014-12-19 15:13:00,197 DEBUG [hbasefsck-pool1-t17] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/liuyang/mig_hbase/b9b8c6a494dc67e115937736734c4c50
2014-12-19 15:13:00,198 DEBUG [hbasefsck-pool1-t29] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/hbase/acl/6b8e6e2eec2b33c9865a9db27ab8abc2
2014-12-19 15:13:00,201 DEBUG [hbasefsck-pool1-t28] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/default/test_replication/bd129bd09186b37efa83c927b6b8dc84
2014-12-19 15:13:00,205 DEBUG [hbasefsck-pool1-t2] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/liuyang/test2/10100782140c9c01de3086efbf16e6fd
2014-12-19 15:13:00,206 DEBUG [hbasefsck-pool1-t32] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/liuyang/test1/9060543031f308ff2f1c362160d74098
2014-12-19 15:13:00,210 DEBUG [hbasefsck-pool1-t30] util.HBaseFsck: Loading region info from hdfs:hdfs://pajkcluster/hbase/data/hbase/namespace/d6708d93fb70b716e5ee13d323f25eaf
2014-12-19 15:13:00,264 DEBUG [hbasefsck-pool1-t10] util.HBaseFsck: HRegionInfo read: {ENCODED => 10100782140c9c01de3086efbf16e6fd, NAME => 'liuyang:test2,,1418961022564.10100782140c9c01de3086efbf16e6fd.', STARTKEY => '', ENDKEY => ''}
2014-12-19 15:13:00,266 DEBUG [hbasefsck-pool1-t7] util.HBaseFsck: HRegionInfo read: {ENCODED => d6708d93fb70b716e5ee13d323f25eaf, NAME => 'hbase:namespace,,1418893161615.d6708d93fb70b716e5ee13d323f25eaf.', STARTKEY => '', ENDKEY => ''}
2014-12-19 15:13:00,267 DEBUG [hbasefsck-pool1-t15] util.HBaseFsck: HRegionInfo read: {ENCODED => 9060543031f308ff2f1c362160d74098, NAME => 'liuyang:test1,,1418960887489.9060543031f308ff2f1c362160d74098.', STARTKEY => '', ENDKEY => ''}
2014-12-19 15:13:00,272 DEBUG [hbasefsck-pool1-t3] util.HBaseFsck: HRegionInfo read: {ENCODED => b9b8c6a494dc67e115937736734c4c50, NAME => 'liuyang:mig_hbase,,1418972581709.b9b8c6a494dc67e115937736734c4c50.', STARTKEY => '', ENDKEY => ''}
2014-12-19 15:13:00,273 DEBUG [hbasefsck-pool1-t5] util.HBaseFsck: HRegionInfo read: {ENCODED => 6b8e6e2eec2b33c9865a9db27ab8abc2, NAME => 'hbase:acl,,1418893164546.6b8e6e2eec2b33c9865a9db27ab8abc2.', STARTKEY => '', ENDKEY => ''}
2014-12-19 15:13:00,277 DEBUG [hbasefsck-pool1-t31] util.HBaseFsck: HRegionInfo read: {ENCODED => ff703bdb6418ca85f5056d9948daf9f7, NAME => 'member,,1418958193008.ff703bdb6418ca85f5056d9948daf9f7.', STARTKEY => '', ENDKEY => ''}
2014-12-19 15:13:00,279 DEBUG [hbasefsck-pool1-t33] util.HBaseFsck: HRegionInfo read: {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
2014-12-19 15:13:00,280 DEBUG [hbasefsck-pool1-t6] util.HBaseFsck: HRegionInfo read: {ENCODED => bd129bd09186b37efa83c927b6b8dc84, NAME => 'test_replication,,1418959507510.bd129bd09186b37efa83c927b6b8dc84.', STARTKEY => '', ENDKEY => ''}

.......

Summary:
  member is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  liuyang:test1 is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  liuyang:test2 is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  test_replication is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  hbase:meta is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  hbase:acl is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  liuyang:mig_hbase is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on:  hadoop-vm-master2,60020,1418971284917
0 inconsistencies detected.
Status: OK
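
The check-then-repair decision can be read off the "N inconsistencies detected." line that hbck prints at the end. The helper below is a sketch; `next_step` is a made-up name, and it only prints a suggestion rather than running anything.

```shell
#!/usr/bin/env bash
# Decide whether a repair pass is needed from hbck's final count line.
# next_step is a hypothetical helper; it reads hbck output on stdin.
next_step() {
  awk '
    /inconsistencies detected\.[[:space:]]*$/ { n = $1 + 0; seen = 1 }
    END {
      if (!seen)      print "no summary found"
      else if (n > 0) print "hbase hbck -repair"
      else            print "nothing to repair"
    }
  '
}

echo "2 inconsistencies detected." | next_step   # prints: hbase hbck -repair
echo "0 inconsistencies detected." | next_step   # prints: nothing to repair
```

Keeping the repair as a printed suggestion leaves the decision with the operator, since -repair changes cluster metadata.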

Run the hbase shell to check the table status

$hbase shell
2014-12-19 15:28:04,391 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014

hbase(main):001:0> scan 'liuyang:mig_hbase'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-2.3.0-cdh5.0.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2014-12-19 15:28:14,883 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ROW                                              COLUMN+CELL                                                                                                                                   
 test3                                           column=address:city, timestamp=1418972636248, value=hangzhou                                                                                  
 test3                                           column=address:contry, timestamp=1418972634935, value=china                                                                                   
 test3                                           column=address:province, timestamp=1418972635046, value=liuyang                                                                               
 test3                                           column=info:age, timestamp=1418972634471, value=24                                                                                            
 test3                                           column=info:birthday, timestamp=1418972634626, value=1987-06-17                                                                               
 test3                                           column=info:company, timestamp=1418972634764, value=alibaba                                                                                   
1 row(s) in 0.1020 seconds
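
The scan above returns the same cells and timestamps as the scan taken before the copy. Comparing the two by eye gets tedious for larger tables, so here is a sketch that reduces saved scan output to a sorted cell list that can be diffed. `cells_only` is a made-up helper name, and it assumes every cell line contains `column=` as in the output above.

```shell
#!/usr/bin/env bash
# Reduce saved `scan` output to a sorted list of cells so two runs can be compared.
# cells_only is a hypothetical helper; it reads HBase shell output on stdin.
cells_only() {
  grep 'column=' |
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e 's/[[:space:]]\{2,\}/ /g' |
    sort
}

# Example: one cell from the scan above, with different padding and log noise.
a="$(printf ' test3      column=info:age, timestamp=1418972634471, value=24   \n1 row(s) in 0.1010 seconds\n' | cells_only)"
b="$(printf 'SLF4J: noise\n test3  column=info:age, timestamp=1418972634471, value=24\n' | cells_only)"
[ "$a" = "$b" ] && echo "tables match"
```

With the scan output of each cluster saved to a file, the same helper feeds `diff`, for example `diff <(cells_only < old.txt) <(cells_only < new.txt)` (file names are placeholders).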

HBASE Performance Test by YCSB

September 3, 2014 bigdata, hbase

Read this PDF: Hbase_performance