Motivation
Each in its own right, NodeJS and HBase are powerful tools: NodeJS for spinning up efficient APIs fast, and HBase for holding large amounts of data (in its own somewhat peculiar way). More importantly, HBase also solves the small-files problem on Hadoop. So combining them can make sense, but the combination is poorly documented.
HBase comes with a REST API and a Thrift API. The Thrift API is the more efficient of the two, even though the REST API returns JSON, which maps directly onto JavaScript objects. The reason is that Thrift uses a binary wire format, which is more compact than the JSON used by the REST API. There is an older GitHub page with some benchmarking: https://github.com/stelcheck/node-hbase-vs-thrift
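To get a feel for why a binary encoding tends to be smaller than JSON, here is a minimal sketch. It does not reproduce the real Thrift wire format; the cell object, its field values, and the helper names encodeString and encodeCell are made up for illustration. It only compares a simple length-prefixed binary layout with the JSON text for the same record.

```javascript
// Illustration only: this is NOT the actual Thrift wire format, just a
// simple length-prefixed layout to show where JSON spends extra bytes.
const cell = {
  row: 'row-0001',
  family: 'family1',
  qualifier: 'q1',
  value: 'hello',
  timestamp: 1497348343038,
};

// JSON: field names, quotes and punctuation travel with every message.
const asJson = Buffer.from(JSON.stringify(cell), 'utf8');

// Binary: each string is a 4-byte length followed by its raw bytes.
function encodeString(s) {
  const body = Buffer.from(s, 'utf8');
  const len = Buffer.alloc(4);
  len.writeUInt32BE(body.length, 0);
  return Buffer.concat([len, body]);
}

// Fields are written in a fixed order, so no field names are needed.
// The timestamp is a fixed 8-byte integer instead of 13 decimal digits.
function encodeCell(c) {
  const ts = Buffer.alloc(8);
  ts.writeBigInt64BE(BigInt(c.timestamp), 0);
  return Buffer.concat([
    encodeString(c.row),
    encodeString(c.family),
    encodeString(c.qualifier),
    encodeString(c.value),
    ts,
  ]);
}

const asBinary = encodeCell(cell);
console.log('JSON bytes:  ', asJson.length);
console.log('binary bytes:', asBinary.length);
```

The exact numbers depend on the data, but the pattern holds for typical records: the binary form does not repeat field names or quoting for every message.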
At the time of writing, the latest stable version of HBase is 1.2.6. It has two Thrift interfaces: HBase Thrift and HBase Thrift2. HBase Thrift is the more general, administrative interface, where tables can be created and deleted and data manipulated. That is the kind of work I prefer to do in the HBase shell, not from a service. HBase Thrift2 is data only: CRUD and even batch operations, which are not found in HBase Thrift.
HBase Part
To make this post complete, we'll go from creating a table in HBase to connecting to it from NodeJS.
Table creation in HBase from the HBase shell
create_namespace 'foo'
create 'foo:bar', 'family1'
Start the HBase Thrift2 API from the OS shell
bin/hbase-daemon.sh start thrift2
NB! By default, HBase Thrift and HBase Thrift2 are both set up to use the same ports: 9090 for the Thrift service itself and 9095 for its info web UI. If you want them to run concurrently, it is possible to set custom port numbers for the APIs.
NB! The HBase Thrift API can crash due to lack of heap memory. The heap size can be increased in the config file conf/hbase-env.sh:
# The maximum amount of heap to use. Default is left to JVM default.
export HBASE_HEAPSIZE=8G
Good to go
NodeJS Part
Prerequisites, besides having NodeJS installed, are the Thrift compiler and the HBase Thrift definition file. A Thrift definition file serves both as documentation and as the definition from which service and client proxies are generated.
The Thrift compiler can be found on Apache's Thrift homepage: https://thrift.apache.org/
The HBase Thrift definition file can be found in the HBase source package from the HBase homepage: https://hbase.apache.org/
Start the NodeJS project and add the Thrift package
mkdir node_hbase
cd node_hbase
npm init
npm install thrift
Create the proxy client package from the HBase Thrift definition file
thrift-0.10.0.exe --gen js:node hbase-1.2.6-src\hbase-1.2.6\hbase-thrift\src\main\resources\org\apache\hadoop\hbase\thrift2\Hbase.thrift
Create the index.js file (you can call it whatever you want)
var thrift = require('thrift');
// Generated by the Thrift compiler in the previous step.
var HBaseService = require('./gen-nodejs/THBaseService.js');
// Type definitions (TGet, TPut, ...); not used below, but needed for data operations.
// The file name follows the .thrift file name, so adjust the case on case-sensitive file systems.
var HBaseTypes = require('./gen-nodejs/HBase_types.js');

var connection = thrift.createConnection('IP or DNS to your HBase server', 9090);

connection.on('connect', function () {
    var client = thrift.createClient(HBaseService, connection);
    client.getAllRegionLocations('foo:bar', function (err, data) {
        if (err) {
            console.log('error:', err);
        } else {
            console.log('All region locations for table:' + JSON.stringify(data));
        }
        // Close the connection once the single call has completed.
        connection.end();
    });
});

connection.on('error', function (err) {
    console.log('error:', err);
});
Run the script and check the result
node index.js
All region locations for table:[{"serverName":{"hostName":"localhost","port":49048,"startCode":{"buffer":{"type":"Buffer","data":[0,0,1,92,160,234,132,254]},"offset":0}},"regionInfo":{"regionId":{"buffer":{"type":"Buffer","data":[0,0,0,0,0,0,0,0]},"offset":0},"tableName":{"type":"Buffer","data":[102,111,111,58,98,97,114]},"startKey":{"type":"Buffer","data":[]},"endKey":{"type":"Buffer","data":[]},"offline":false,"split":false,"replicaId":0}}]
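The output is hard to read because binary fields and 64-bit integers are printed as raw byte arrays. As a minimal sketch, the snippet below decodes two of the fields from the output above. The helper toBuffer is a hypothetical name, and reading startCode as the region server's start time in epoch milliseconds is an assumption based on how HBase normally builds server names. readBigInt64BE needs Node 12 or newer.

```javascript
// One entry copied from the output above, trimmed to the fields we decode.
// JSON.stringify prints a Buffer as { type: 'Buffer', data: [...] }.
const location = {
  serverName: {
    hostName: 'localhost',
    port: 49048,
    startCode: {
      buffer: { type: 'Buffer', data: [0, 0, 1, 92, 160, 234, 132, 254] },
      offset: 0,
    },
  },
  regionInfo: {
    tableName: { type: 'Buffer', data: [102, 111, 111, 58, 98, 97, 114] },
  },
};

// Hypothetical helper: rebuild a Buffer from its JSON form.
function toBuffer(json) {
  return Buffer.from(json.data);
}

// The table name is plain UTF-8 text.
const tableName = toBuffer(location.regionInfo.tableName).toString('utf8');

// 64-bit integers arrive as 8 big-endian bytes.
const startCode = toBuffer(location.serverName.startCode.buffer).readBigInt64BE(0);

// Assumption: startCode is the server start time in epoch milliseconds.
const startedAt = new Date(Number(startCode));

console.log(tableName);            // foo:bar
console.log(startCode.toString()); // 1497348343038
console.log(startedAt.toISOString());
```

Inside the callback in index.js the values are live objects rather than JSON, so a field like tableName should already be a Buffer and can be passed to toString('utf8') directly.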