HBase scan command reference: prefix, fuzzy, and regex queries
The stu table stores student records.
Column family base holds basic information such as the student's name and height.
Column family score holds grades.
Row keys look like c1_s1, where c1 is the class and s1 is the student number.
create 'stu','base','score'
put 'stu','c1_s1','base:name','jack'
put 'stu','c1_s2','base:name','jack2'
put 'stu','c1_s3','base:name','jack3'
put 'stu','c1_s4','base:name','jack4'
put 'stu','c2_s1','base:name','tom1'
put 'stu','c2_s2','base:name','tom2'
put 'stu','c2_s2','base:weight','70kg'
put 'stu','c2_s3','base:name','tom3'
put 'stu','c2_s3','base:weight','85kg'
put 'stu','c2_s3','base:height','1.70m'
Tip: to write query results to a file, pipe the command into the shell and redirect the output.
echo "scan 'stu',{LIMIT=>1}" | ./hbase shell > a.txt
1. Scan the whole table and return only specific columns
hbase(main):028:0> scan 'stu',{COLUMNS => ['base:weight','base:height']}
ROW COLUMN+CELL
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
c2_s3 column=base:height, timestamp=1588154125060, value=1.70m
c2_s3 column=base:weight, timestamp=1588154124202, value=85kg
2 row(s)
Took 0.0113 seconds
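2. TIMERANGE: scan cells within a time range (start inclusive, end exclusive)
Note: cells whose timestamp equals the start time are returned; cells whose timestamp equals the end time are not.
hbase(main):028:0> scan 'stu',{TIMERANGE=>[1588153968060,1588153968207]}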
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
2 row(s)
Took 0.0108 seconds
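The half-open interval can be illustrated with a small model. This is a pure-Python sketch of the selection rule, not HBase client code; the cell tuples are copied from the sample data and the function name time_range is made up for illustration.

```python
# Pure-Python model of TIMERANGE selection; it does not talk to HBase.
# Each cell is (row, column, timestamp, value), copied from the sample data.
CELLS = [
    ("c1_s1", "base:name", 1588153968060, "jack"),
    ("c1_s2", "base:name", 1588153968114, "jack2"),
    ("c1_s3", "base:name", 1588153968207, "jack3"),
    ("c1_s4", "base:name", 1588153968258, "jack4"),
]


def time_range(cells, min_ts, max_ts):
    """Keep cells with min_ts <= timestamp < max_ts (half-open interval)."""
    return [cell for cell in cells if min_ts <= cell[2] < max_ts]


selected_rows = [cell[0] for cell in time_range(CELLS, 1588153968060, 1588153968207)]
print(selected_rows)
```

To include the cell at 1588153968207 as well, the end of the range would have to be at least 1588153968208.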
3. STARTROW and STOPROW: scan a range of row keys
Note: the row whose key equals STARTROW is returned; the row whose key equals STOPROW is not.
hbase(main):028:0> scan 'stu',{STARTROW=>'c1_s1',STOPROW=>'c1_s3'}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
2 row(s)
Took 0.0092 seconds
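Row keys are compared as raw bytes, which matters once student numbers reach two digits. The following is a minimal pure-Python model of the range rule, not HBase code; the key c1_s10 is hypothetical and does not exist in the sample table.

```python
# Pure-Python model of a STARTROW/STOPROW range; it does not talk to HBase.
ROW_KEYS = ["c1_s1", "c1_s2", "c1_s3", "c1_s4", "c2_s1", "c2_s2", "c2_s3"]


def row_range(row_keys, start_row, stop_row):
    """Return keys with start_row <= key < stop_row, compared as raw bytes."""
    start, stop = start_row.encode(), stop_row.encode()
    ordered = sorted(key.encode() for key in row_keys)
    return [key.decode() for key in ordered if start <= key < stop]


in_range = row_range(ROW_KEYS, "c1_s1", "c1_s3")
print(in_range)

# Byte-wise ordering is not numeric ordering: the hypothetical key c1_s10
# sorts before c1_s2, so it falls inside the same range.
with_two_digits = row_range(ROW_KEYS + ["c1_s10"], "c1_s1", "c1_s3")
print(with_two_digits)
```

Zero-padding the numeric parts of a row key (for example c01_s010) is a common way to make byte order agree with numeric order.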
4. REVERSED: reverse the result order, optionally combined with a time range
Reverse a full table scan:
scan 'stu', {REVERSED => true}
Combined with a time range:
hbase(main):009:0> scan 'stu',{TIMERANGE=>[1588153968060,1588153968207],REVERSED => true}
ROW COLUMN+CELL
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s1 column=base:name, timestamp=1588153968060, value=jack
5. ALL_METRICS and METRICS: return scan metrics
hbase(main):011:0> scan 'stu',{ALL_METRICS => true}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c1_s4 column=base:name, timestamp=1588153968258, value=jack4
c2_s1 column=base:name, timestamp=1588153968324, value=tom1
c2_s2 column=base:name, timestamp=1588153968367, value=tom2
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
c2_s3 column=base:height, timestamp=1588154125060, value=1.70m
c2_s3 column=base:name, timestamp=1588153968409, value=tom3
c2_s3 column=base:weight, timestamp=1588154124202, value=85kg
7 row(s)
METRIC VALUE
BYTES_IN_REMOTE_RESULTS 0
BYTES_IN_RESULTS 420
MILLIS_BETWEEN_NEXTS 66
NOT_SERVING_REGION_EXCEPTION 0
REGIONS_SCANNED 1
REMOTE_RPC_CALLS 0
REMOTE_RPC_RETRIES 0
ROWS_FILTERED 0
ROWS_SCANNED 7
RPC_CALLS 1
RPC_RETRIES 0
scan 'stu',{METRICS => ['ROWS_SCANNED','RPC_CALLS']}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c1_s4 column=base:name, timestamp=1588153968258, value=jack4
c2_s1 column=base:name, timestamp=1588153968324, value=tom1
c2_s2 column=base:name, timestamp=1588153968367, value=tom2
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
c2_s3 column=base:height, timestamp=1588154125060, value=1.70m
c2_s3 column=base:name, timestamp=1588153968409, value=tom3
c2_s3 column=base:weight, timestamp=1588154124202, value=85kg
7 row(s)
METRIC VALUE
ROWS_SCANNED 7
RPC_CALLS 1
Took 0.0476 seconds
6. Query rows whose row key starts with a given prefix
hbase(main):014:0> scan 'stu',{ROWPREFIXFILTER => 'c1'}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c1_s4 column=base:name, timestamp=1588153968258, value=jack4
4 row(s)
hbase(main):016:0> scan 'stu',{FILTER => "PrefixFilter('c1')"}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c1_s4 column=base:name, timestamp=1588153968258, value=jack4
4 row(s)
Took 0.0181 seconds
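Both commands return the same rows. A prefix query can also be expressed as a plain row range: start at the prefix and stop at the smallest key that is greater than every key with that prefix. The sketch below models that calculation in pure Python. It assumes this is roughly how a prefix is turned into a start/stop row pair; the function names are made up for illustration and are not HBase APIs.

```python
# Pure-Python sketch of turning a row-key prefix into a [start, stop) range.
# Assumption: the stop row is the prefix with its last byte incremented,
# after dropping trailing 0xFF bytes that cannot be incremented.


def prefix_stop_row(prefix: bytes) -> bytes:
    """Smallest key greater than every key starting with prefix.

    Returns b"" when no such key exists, meaning "scan to the end".
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return b""
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def prefix_scan(row_keys, prefix: bytes):
    """Select keys in [prefix, prefix_stop_row(prefix)) in byte order."""
    stop = prefix_stop_row(prefix)
    return [k for k in sorted(row_keys) if k >= prefix and (stop == b"" or k < stop)]


ROW_KEYS = [b"c1_s1", b"c1_s2", b"c1_s3", b"c1_s4", b"c2_s1", b"c2_s2", b"c2_s3"]
print(prefix_stop_row(b"c1"))
print(prefix_scan(ROW_KEYS, b"c1"))
```

Expressed this way, the scan can skip everything outside the range instead of testing every row against a filter.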
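7. QualifierFilter: filter by column qualifier
QualifierFilter matches on the column qualifier, either one exact qualifier or a range of qualifiers. The binary comparator compares against the complete value byte by byte; the substring comparator matches when the qualifier contains the given text. With the sample data the scan below returns no cells, because no qualifier contains jack.
scan 'stu',{FILTER => "(QualifierFilter (<,'binary:name')) AND (QualifierFilter (=,'substring:jack'))"}
8. ColumnPrefixFilter: find columns whose qualifier starts with a given prefix
hbase(main):012:0> scan 'stu',{FILTER=>"ColumnPrefixFilter('na') AND (ValueFilter(=,'substring:1') OR ValueFilter(=,'substring:3'))"}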
ROW COLUMN+CELL
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c2_s1 column=base:name, timestamp=1588153968324, value=tom1
c2_s3 column=base:name, timestamp=1588153968409, value=tom3
3 row(s)
Took 0.0075 seconds
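The AND/OR combination in the filter string can be read as a predicate applied to each cell. The following pure-Python model reproduces the three-row result on the sample data. It is a simplification for illustration (real filters can also skip ahead within a row), and all helper names are made up.

```python
# Pure-Python model of
#   ColumnPrefixFilter('na') AND (ValueFilter(=,'substring:1') OR ValueFilter(=,'substring:3'))
# evaluated cell by cell. Each cell is (row, qualifier, value).
CELLS = [
    ("c1_s1", "name", "jack"),
    ("c1_s2", "name", "jack2"),
    ("c1_s3", "name", "jack3"),
    ("c1_s4", "name", "jack4"),
    ("c2_s1", "name", "tom1"),
    ("c2_s2", "name", "tom2"),
    ("c2_s2", "weight", "70kg"),
    ("c2_s3", "height", "1.70m"),
    ("c2_s3", "name", "tom3"),
    ("c2_s3", "weight", "85kg"),
]


def column_prefix(prefix):
    return lambda cell: cell[1].startswith(prefix)


def value_contains(text):
    # Modeled as case-insensitive containment.
    return lambda cell: text.lower() in cell[2].lower()


def all_of(*predicates):
    return lambda cell: all(p(cell) for p in predicates)


def any_of(*predicates):
    return lambda cell: any(p(cell) for p in predicates)


keep = all_of(column_prefix("na"), any_of(value_contains("1"), value_contains("3")))
matched = [(row, value) for row, qualifier, value in CELLS if keep((row, qualifier, value))]
print(matched)
```

Note that 1.70m also contains 1, but its qualifier height does not start with na, so the AND rejects it.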
9. ValueFilter: filter by value, either one exact value or a range of values
hbase(main):018:0> scan 'stu',{FILTER=>"ValueFilter(=,'binary:jack')"}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
1 row(s)
10. TimestampsFilter: filter by timestamp
hbase(main):022:0> scan 'stu',{FILTER => "TimestampsFilter(1588153968060,1588153968207)"}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
2 row(s)
Took 0.0151 seconds
Only the cells whose timestamp equals 1588153968060 or 1588153968207 are returned.
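Unlike TIMERANGE, which selects an interval, TimestampsFilter selects an explicit set of timestamps. A minimal pure-Python model of that difference, using cells from the sample data (the function name is made up for illustration):

```python
# Pure-Python model of TimestampsFilter: exact timestamp membership.
# Each cell is (row, column, timestamp, value), copied from the sample data.
CELLS = [
    ("c1_s1", "base:name", 1588153968060, "jack"),
    ("c1_s2", "base:name", 1588153968114, "jack2"),
    ("c1_s3", "base:name", 1588153968207, "jack3"),
    ("c1_s4", "base:name", 1588153968258, "jack4"),
]


def timestamps_filter(cells, *timestamps):
    """Keep only cells whose timestamp is one of the listed values."""
    wanted = set(timestamps)
    return [cell for cell in cells if cell[2] in wanted]


exact_rows = [cell[0] for cell in timestamps_filter(CELLS, 1588153968060, 1588153968207)]
print(exact_rows)
```

The cell at 1588153968114 lies between the two values but is not returned, because it is not in the set.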
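11. RAW: instructs the scanner to return all cells, including delete markers and deleted cells that have not yet been collected. This option cannot be combined with requesting specific columns. It is disabled by default.
hbase(main):024:0> scan 'stu',{RAW => true,VERSIONS => 2}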
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c1_s4 column=base:name, timestamp=1588153968258, value=jack4
c2_s1 column=base:name, timestamp=1588153968324, value=tom1
c2_s2 column=base:name, timestamp=1588153968367, value=tom2
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
c2_s3 column=base:height, timestamp=1588154125060, value=1.70m
c2_s3 column=base:name, timestamp=1588153968409, value=tom3
c2_s3 column=base:weight, timestamp=1588154124202, value=85kg
7 row(s)
Took 0.0346 seconds
Now delete one cell:
delete 'stu','c1_s4','base:name'
hbase(main):027:0> scan 'stu',{RAW => true,VERSIONS => 2}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c1_s4 column=base:name, timestamp=1588153968258, type=Delete
c1_s4 column=base:name, timestamp=1588153968258, value=jack4
c2_s1 column=base:name, timestamp=1588153968324, value=tom1
c2_s2 column=base:name, timestamp=1588153968367, value=tom2
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
c2_s3 column=base:height, timestamp=1588154125060, value=1.70m
c2_s3 column=base:name, timestamp=1588153968409, value=tom3
c2_s3 column=base:weight, timestamp=1588154124202, value=85kg
7 row(s)
Took 0.0189 seconds
The delete marker shows up as type=Delete, next to the cell it masks.
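The relationship between a raw scan and a normal scan can be sketched as follows. This is a deliberately simplified pure-Python model: it covers only a version-level delete marker like the one above (same row, column, and timestamp as the cell it masks) and ignores other marker types and compaction. The function names are made up for illustration.

```python
# Simplified pure-Python model of RAW vs. normal scans.
# Each cell is (row, column, timestamp, type, value).
CELLS = [
    ("c1_s3", "base:name", 1588153968207, "Put", "jack3"),
    ("c1_s4", "base:name", 1588153968258, "Delete", None),
    ("c1_s4", "base:name", 1588153968258, "Put", "jack4"),
]


def raw_scan(cells):
    """A raw scan returns every stored cell, delete markers included."""
    return list(cells)


def normal_scan(cells):
    """A normal scan hides delete markers and the cells they mask."""
    masked = {cell[:3] for cell in cells if cell[3] == "Delete"}
    return [cell for cell in cells if cell[3] == "Put" and cell[:3] not in masked]


raw_count = len(raw_scan(CELLS))
visible_rows = [cell[0] for cell in normal_scan(CELLS)]
print(raw_count, visible_rows)
```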
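12. FirstKeyOnlyFilter
A row can hold several columns, and each column can hold several versions. FirstKeyOnlyFilter returns only the first version of the first column in each row.
KeyOnlyFilter: returns only the key of each cell, with the value stripped.
hbase(main):038:0> scan 'stu',FILTER => "FirstKeyOnlyFilter() AND ValueFilter(=,'binary:jack2') AND KeyOnlyFilter()"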
ROW COLUMN+CELL
c1_s2 column=base:name, timestamp=1588153968114, value=
1 row(s)
Took 0.0083 seconds
13. LIMIT: return at most two rows
hbase(main):040:0> scan 'stu', {LIMIT => 2}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
2 row(s)
Took 0.0077 seconds
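14. Importing Java classes
Column pagination filter (ColumnPaginationFilter): pages through the columns of each row, given an offset and the number of columns to return.
Syntax: ColumnPaginationFilter.new(limit, offset)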
hbase(main):002:0> import org.apache.hadoop.hbase.filter.ColumnPaginationFilter
=> [Java::OrgApacheHadoopHbaseFilter::ColumnPaginationFilter]
hbase(main):040:0> scan 'stu', {FILTER =>ColumnPaginationFilter.new(3, 1)}
ROW COLUMN+CELL
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
c2_s3 column=base:name, timestamp=1588153968409, value=tom3
c2_s3 column=base:weight, timestamp=1588154124202, value=85kg
2 row(s)
Took 0.0154 seconds
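The result follows from applying the offset and limit to each row separately: within a row, columns are ordered by name, the first offset columns are skipped, and at most limit columns are returned. A pure-Python model of ColumnPaginationFilter.new(3, 1) on the sample data (the helper names are made up for illustration):

```python
# Pure-Python model of per-row column pagination with limit=3, offset=1.
ROWS = {
    "c1_s1": ["base:name"],
    "c1_s2": ["base:name"],
    "c1_s3": ["base:name"],
    "c2_s1": ["base:name"],
    "c2_s2": ["base:name", "base:weight"],
    "c2_s3": ["base:name", "base:weight", "base:height"],
}


def column_page(columns, limit, offset):
    """Skip the first `offset` columns (in sorted order), keep up to `limit`."""
    return sorted(columns)[offset:offset + limit]


def paginate(rows, limit, offset):
    """Apply column_page to every row and drop rows left with no columns."""
    pages = {row: column_page(cols, limit, offset) for row, cols in rows.items()}
    return {row: cols for row, cols in pages.items() if cols}


page = paginate(ROWS, limit=3, offset=1)
print(page)
```

Rows with a single column disappear entirely, because their only column is consumed by the offset.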
15. Find rows whose row key contains s2
hbase(main):013:0> import org.apache.hadoop.hbase.filter.CompareFilter
=> [Java::OrgApacheHadoopHbaseFilter::CompareFilter]
hbase(main):015:0> import org.apache.hadoop.hbase.filter.SubstringComparator
=> [Java::OrgApacheHadoopHbaseFilter::SubstringComparator]
hbase(main):016:0> import org.apache.hadoop.hbase.filter.RowFilter
=> [Java::OrgApacheHadoopHbaseFilter::RowFilter]
hbase(main):017:0> scan 'stu',{FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),SubstringComparator.new('s2'))}
ROW COLUMN+CELL
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c2_s2 column=base:name, timestamp=1588153968367, value=tom2
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
2 row(s)
Took 0.0427 seconds
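The substring match can be modeled as a simple containment test on each row key. The sketch below is pure Python, not HBase code. It assumes the comparison is case-insensitive, which is how SubstringComparator is documented to behave; the function name is made up for illustration.

```python
# Pure-Python model of RowFilter(EQUAL, SubstringComparator('s2')).
# Assumption: matching is case-insensitive containment.
ROW_KEYS = ["c1_s1", "c1_s2", "c1_s3", "c2_s1", "c2_s2", "c2_s3"]


def rows_containing(row_keys, text):
    """Return the row keys that contain `text`, ignoring case."""
    needle = text.lower()
    return [key for key in row_keys if needle in key.lower()]


s2_rows = rows_containing(ROW_KEYS, "s2")
print(s2_rows)
```

Because this kind of filter has to inspect every row key, it is a full table scan; a prefix or start/stop row range is cheaper whenever the key layout allows it.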
16. Regular expression queries
import org.apache.hadoop.hbase.filter.RegexStringComparator
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter
Paste the four lines above directly into the shell:
hbase(main):018:0> import org.apache.hadoop.hbase.filter.RegexStringComparator
=> [Java::OrgApacheHadoopHbaseFilter::RegexStringComparator]
hbase(main):019:0> import org.apache.hadoop.hbase.filter.CompareFilter
=> [Java::OrgApacheHadoopHbaseFilter::CompareFilter]
hbase(main):020:0> import org.apache.hadoop.hbase.filter.SubstringComparator
=> [Java::OrgApacheHadoopHbaseFilter::SubstringComparator]
hbase(main):021:0> import org.apache.hadoop.hbase.filter.RowFilter
=> [Java::OrgApacheHadoopHbaseFilter::RowFilter]
hbase(main):027:0> scan 'stu', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new('^c\d+_[a-z]\d+$'))}
ROW COLUMN+CELL
c1_s1 column=base:name, timestamp=1588153968060, value=jack
c1_s2 column=base:name, timestamp=1588153968114, value=jack2
c1_s3 column=base:name, timestamp=1588153968207, value=jack3
c2_s1 column=base:name, timestamp=1588153968324, value=tom1
c2_s2 column=base:name, timestamp=1588153968367, value=tom2
c2_s2 column=base:weight, timestamp=1588154167692, value=70kg
c2_s3 column=base:height, timestamp=1588154125060, value=1.70m
c2_s3 column=base:name, timestamp=1588153968409, value=tom3
c2_s3 column=base:weight, timestamp=1588154124202, value=85kg
6 row(s)
Took 0.0385 seconds
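The result looks no different from a full scan, because every row key matches the pattern. Insert a row with a two-digit student number and match on that instead:
hbase(main):036:0> put 'stu','c3_s55','base:name','Lucy'
hbase(main):037:0> scan 'stu', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new('^c\d+_s55$'))}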
ROW COLUMN+CELL
c3_s55 column=base:name, timestamp=1588162870203, value=Lucy
1 row(s)
Took 0.0082 seconds
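The two patterns can be tried out without a cluster. The sketch below uses Python's re module as a stand-in for the Java regex engine, which agrees with it for these simple patterns. It assumes the comparator performs an unanchored search, so ^ and $ are what force a match on the whole key. The function name is made up for illustration.

```python
import re

# Pure-Python stand-in for RowFilter(EQUAL, RegexStringComparator(pattern)).
# Assumption: the pattern is searched for anywhere in the key unless anchored.
ROW_KEYS = ["c1_s1", "c1_s2", "c1_s3", "c2_s1", "c2_s2", "c2_s3", "c3_s55"]


def rows_matching(row_keys, pattern):
    """Return the row keys in which `pattern` is found."""
    compiled = re.compile(pattern)
    return [key for key in row_keys if compiled.search(key)]


all_rows = rows_matching(ROW_KEYS, r"^c\d+_[a-z]\d+$")
only_s55 = rows_matching(ROW_KEYS, r"^c\d+_s55$")
print(all_rows)
print(only_s55)
```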
Original article: https://blog.51cto.com/yuexiaosheng/2491886