1. HBase config
Write-Ahead Log (WAL) Codec Class
org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
2. Add the Phoenix editor to HUE
Hue config
Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini
[notebook]
[[interpreters]]
[[[phoenix]]]
name=phoenix
interface=sqlalchemy
options='{"url": "phoenix://ygbaek07.gitcluster.com:8765", "tls": false, "connect_args": "{\"authentication\": \"SPNEGO\", \"verify\": false }", "has_impersonation": true}'
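The options value above is JSON whose connect_args field is itself a JSON document stored as an escaped string, which is easy to get wrong by hand. A minimal Python sketch that generates the line instead of typing the escapes manually; the helper name build_phoenix_options is hypothetical, and the keys and values are simply the ones from the snippet above:

```python
import json

# Phoenix Query Server URL taken from the snippet above; adjust for your cluster.
PQS_URL = "phoenix://ygbaek07.gitcluster.com:8765"


def build_phoenix_options(url: str) -> str:
    """Return the JSON text for the `options` key of the [[[phoenix]]] interpreter.

    Hypothetical helper: it only reproduces the keys shown in the snippet.
    """
    # connect_args is a JSON document serialized into a string field,
    # so it is dumped once here and again as part of the outer object.
    connect_args = json.dumps({"authentication": "SPNEGO", "verify": False})
    options = {
        "url": url,
        "tls": False,
        "connect_args": connect_args,
        "has_impersonation": True,
    }
    return json.dumps(options)


options_text = build_phoenix_options(PQS_URL)
# Wrap in single quotes, as in hue_safety_valve.ini.
print(f"options='{options_text}'")
```

Parsing the generated text back with json.loads (twice for connect_args) is a quick way to confirm the escaping survived before pasting it into the safety valve.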
3. Test in HUE
CREATE TABLE IF NOT EXISTS Company (company_id INTEGER PRIMARY KEY, name VARCHAR(225));
UPSERT INTO Company VALUES(1, 'Cloudera');
UPSERT INTO Company VALUES(2, 'Apache');
UPSERT INTO Company VALUES(3, 'Test');
SELECT * FROM Company;
4. Verify in the HBase shell
$ hbase shell
hbase:093:0> scan "COMPANY"
ROW COLUMN+CELL
\x80\x00\x00\x01 column=0:\x00\x00\x00\x00, timestamp=2024-04-15T15:29:09.392, value=x
\x80\x00\x00\x01 column=0:\x80\x0B, timestamp=2024-04-15T15:29:09.392, value=Cloudera
\x80\x00\x00\x02 column=0:\x00\x00\x00\x00, timestamp=2024-04-15T15:29:11.190, value=x
\x80\x00\x00\x02 column=0:\x80\x0B, timestamp=2024-04-15T15:29:11.190, value=Apache
\x80\x00\x00\x03 column=0:\x00\x00\x00\x00, timestamp=2024-04-17T10:46:53.693, value=x
\x80\x00\x00\x03 column=0:\x80\x0B, timestamp=2024-04-17T10:46:53.693, value=Test
3 row(s)
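Once a table has been created in Phoenix, modifying its data directly in HBase is not recommended: Phoenix stores row keys and column qualifiers in its own encoded form, as the scan output above shows.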
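The row keys in the scan above (\x80\x00\x00\x01 and so on) are how Phoenix serializes the INTEGER primary key: a 4-byte big-endian integer with the sign bit flipped, so that byte-wise ordering in HBase matches numeric ordering. A minimal Python sketch of that encoding, for reading scan output only; the function names are hypothetical and this is not Phoenix's own code:

```python
import struct


def encode_phoenix_int(value: int) -> bytes:
    """Encode a signed 32-bit integer the way the scan output shows it:
    big-endian, with the sign bit of the first byte flipped."""
    raw = struct.pack(">i", value)
    return bytes([raw[0] ^ 0x80]) + raw[1:]


def decode_phoenix_int(key: bytes) -> int:
    """Reverse of encode_phoenix_int for a 4-byte row key."""
    if len(key) != 4:
        raise ValueError("expected exactly 4 bytes")
    raw = bytes([key[0] ^ 0x80]) + key[1:]
    return struct.unpack(">i", raw)[0]


# Row keys copied from the scan of COMPANY above.
for key in (b"\x80\x00\x00\x01", b"\x80\x00\x00\x02", b"\x80\x00\x00\x03"):
    print(key, "->", decode_phoenix_int(key))
```

Writing such bytes by hand from the HBase shell is exactly the kind of direct manipulation the note above advises against; the sketch is meant only as an aid for interpreting what a scan returns.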
5. Phoenix sqlline
# thin (Phoenix Query Server host:port)
$ phoenix-sqlline-thin 10.200.100.247:8765
# thick (ZooKeeper host:port:HBase znode path)
$ phoenix-sqlline 10.200.100.245:2181:/hbase
Reference: https://it-sunny-333.tistory.com/m/188
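The thin and thick clients differ in what they connect to: the thin client speaks HTTP to Phoenix Query Server, while the thick client goes through ZooKeeper to HBase directly. For tools that take a JDBC URL rather than sqlline arguments, the same endpoints can be expressed as follows. This is a sketch with hypothetical helper names; it assumes an unsecured endpoint, and a Kerberos/SPNEGO cluster would need additional URL parameters that are not shown here:

```python
def thin_jdbc_url(pqs_host_port: str) -> str:
    """JDBC URL for the thin client (HTTP to Phoenix Query Server)."""
    if pqs_host_port.startswith(("http://", "https://")):
        url = pqs_host_port
    else:
        # Assumption: plain HTTP when no scheme is given.
        url = "http://" + pqs_host_port
    return f"jdbc:phoenix:thin:url={url};serialization=PROTOBUF"


def thick_jdbc_url(zk_host_port_znode: str) -> str:
    """JDBC URL for the thick client (ZooKeeper quorum:port:znode)."""
    return "jdbc:phoenix:" + zk_host_port_znode


# Endpoints copied from the sqlline commands above.
print(thin_jdbc_url("10.200.100.247:8765"))
print(thick_jdbc_url("10.200.100.245:2181:/hbase"))
```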
6. Create a table in HBase, then create a view over it in Phoenix
1) Create the HBase table
$ hbase shell
# create 'tablename', 'columnFamily'
hbase:008:0> create 'customer', 'customer_data'
Created table customer
Took 0.7243 seconds
# put '[table name]', '[rowkey]', '[columnFamily:columnName]', '[value]'
hbase:089:0> put 'customer','1','customer_data:username','ygbaek'
Took 0.0433 seconds
hbase:090:0> scan 'customer'
ROW COLUMN+CELL
1 column=customer_data:username, timestamp=2024-04-16T10:29:17.494, value=ygbaek
hbase:097:0> put 'customer','2','customer_data:username','haein'
Took 0.0083 seconds
hbase:098:0> put 'customer','2','customer_data:email','haein@gitcluster.com'
Took 0.0053 seconds
# (the put that added row 1's email at 2024-04-18T13:53:48 is not shown above)
hbase:099:0> scan 'customer'
ROW COLUMN+CELL
1 column=customer_data:email, timestamp=2024-04-18T13:53:48.682, value=ygbaek@gitcluster.com
1 column=customer_data:username, timestamp=2024-04-16T10:29:17.494, value=ygbaek
2 column=customer_data:email, timestamp=2024-04-18T13:55:10.434, value=haein@gitcluster.com
2 column=customer_data:username, timestamp=2024-04-18T13:54:44.395, value=haein
2) Create the Phoenix view
CREATE VIEW "customer" ( k VARCHAR primary key, "customer_data"."username" VARCHAR, "customer_data"."email" VARCHAR);
SELECT * FROM "customer";
+---+----------+-----------------------+
| K | username | email |
+---+----------+-----------------------+
| 1 | ygbaek | ygbaek@gitcluster.com |
| 2 | haein | haein@gitcluster.com |
+---+----------+-----------------------+
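The double quotes in the CREATE VIEW statement matter: Phoenix upper-cases unquoted identifiers, so the lowercase HBase table, column family, and column names only match when quoted. A minimal Python sketch that builds such a statement from the HBase names; create_view_ddl is a hypothetical helper, and it assumes every mapped column holds string bytes (as written by the shell put commands above), hence VARCHAR throughout:

```python
from typing import Optional, Sequence


def quote_ident(name: str) -> str:
    """Double-quote an identifier so Phoenix keeps its case."""
    if '"' in name:
        raise ValueError(f"identifier must not contain a double quote: {name!r}")
    return f'"{name}"'


def create_view_ddl(table: str, family: str, columns: Sequence[str],
                    schema: Optional[str] = None) -> str:
    """Build a CREATE VIEW statement over an existing HBase table.

    The row key is exposed as column k; each HBase column is mapped as
    "family"."qualifier" VARCHAR.
    """
    target = quote_ident(table)
    if schema is not None:
        target = f"{quote_ident(schema)}.{target}"
    cols = ["k VARCHAR PRIMARY KEY"]
    cols += [f"{quote_ident(family)}.{quote_ident(c)} VARCHAR" for c in columns]
    return f"CREATE VIEW {target} ({', '.join(cols)});"


print(create_view_ddl("customer", "customer_data", ["username", "email"]))
```

Passing schema="TEST" produces the namespaced form, for example "TEST"."customer".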
7. Make tables show up in the HUE table list
1) Configure the mapping between HBase namespaces and Phoenix schemas (https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/phoenix-access-data/topics/phoenix-mapping-schemas.html)
2) Create the HBase namespace and table
hbase:006:0> create_namespace 'TEST'
Took 0.2174 seconds
hbase:007:0> list_namespace
NAMESPACE
SYSTEM
TEST
default
hbase
4 row(s)
Took 0.0085 seconds
hbase:008:0> create "TEST:customer","CF1"
Created table TEST:customer
Took 0.7280 seconds
=> Hbase::Table - TEST:customer
3) Create the Phoenix view
CREATE SCHEMA "TEST"; -- once the schema exists, it shows up in HUE
CREATE VIEW "TEST"."customer" ( k VARCHAR primary key, "CF1"."username" VARCHAR);
SELECT * FROM "TEST"."customer";
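The namespace mapping in step 1) comes down to two Phoenix properties, phoenix.schema.isNamespaceMappingEnabled and phoenix.schema.mapSystemTablesToNamespace, both set to true (the SYSTEM namespace in the list_namespace output above is consistent with the second one being on). A minimal Python sketch for checking an hbase-site.xml for them; the function names and the inline SAMPLE document are illustrative assumptions, and on a real cluster the file deployed to both servers and clients would be the thing to check:

```python
import xml.etree.ElementTree as ET

REQUIRED = (
    "phoenix.schema.isNamespaceMappingEnabled",
    "phoenix.schema.mapSystemTablesToNamespace",
)


def read_properties(xml_text: str) -> dict:
    """Parse Hadoop-style <property><name/><value/></property> entries."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_namespace_mapping(xml_text: str) -> list:
    """Return the required properties that are absent or not set to true."""
    props = read_properties(xml_text)
    return [key for key in REQUIRED if props.get(key, "").lower() != "true"]


# Illustrative sample, not copied from a real cluster.
SAMPLE = """<configuration>
  <property>
    <name>phoenix.schema.isNamespaceMappingEnabled</name>
    <value>true</value>
  </property>
  <property>
    <name>phoenix.schema.mapSystemTablesToNamespace</name>
    <value>true</value>
  </property>
</configuration>"""

print(missing_namespace_mapping(SAMPLE))
```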