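Problem:
The first put into HBase syncs three fields to the index. A second update to the same HBase row that changes only one field makes the other two fields disappear from the index.
Cause:
When the HBase Indexer was created, the configuration file specified read-row="never":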
$ cat morphline-hbase-mapper.xml
<?xml version="1.0"?>
<!-- table: name of the HBase table to index -->
<!-- mapper: the class that reads and applies the specified Morphline configuration file; always MorphlineResultToSolrMapper -->
<indexer table="tableName" mapper="com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper" read-row="never" >
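<!-- The param named morphlineFile specifies the morphlines.conf file -->
<!-- value: path to morphlines.conf. An absolute or relative path refers to the local filesystem; if morphlines.conf is managed through Cloudera Manager, just use the value morphlines.conf -->
<param name="morphlineFile" value="morphlines.conf"/>
<!-- value="TableMap": a user-defined morphline id that is referenced later. The other attributes (mapper, param name, and so on) can keep their defaults -->
<param name="morphlineId" value="TableMap"/>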
</indexer>
After changing it to read-row="dynamic" and testing again, the fields are no longer lost.
read-row documentation: https://github.com/NGDATA/hbase-indexer/wiki/Indexer-configuration#read-row
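For reference, a minimal sketch of the indexer definition after the change. It assumes the same table name, morphline file, and morphline id as the file shown earlier; only the read-row attribute differs. Since the wiki states that "dynamic" is the default, leaving the attribute out should behave the same way.

```xml
<?xml version="1.0"?>
<!-- Same indexer as before; read-row is now "dynamic", so the indexer may
     re-read the row from HBase when an update carries only some of the columns -->
<indexer table="tableName" mapper="com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper" read-row="dynamic">
  <param name="morphlineFile" value="morphlines.conf"/>
  <param name="morphlineId" value="TableMap"/>
</indexer>
```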
read-row
The read-row attribute has two possible values: dynamic, or never.
This attribute is only important when using row-based indexing. It specifies whether or not the indexer should re-read data from HBase in order to perform indexing.
When set to "dynamic", the indexer will read the necessary data from a row if a partial update to the row is performed in HBase. In dynamic mode, the row will not be re-read if all data needed to perform indexing is included in the row update.
If this attribute is set to never, a row will never be re-read by the indexer.
The default setting is "dynamic".
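However, this setting can lead to the following problem, so test it thoroughly before use.
Resolving Solr and HBase data inconsistency caused by HBase Indexer: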
https://blog.csdn.net/d6619309/article/details/51579594