ETL integration test results

2018. 11. 21. 16:38

Analysis server status

-test1

input size : 1.4T, all servers

conditions : watermark 60s, trigger 1s

Test time : 10am~1pm

result : incorrect-result rate of 50% or more

cause of incorrect results : bottleneck in the first-stage ETL (consumer lag on a specific partition of Kafka server 3)

-test2

input size : 1.4T, all servers

Conditions : watermark 120s, trigger 2s, Thread option adjusted on servers 3, 4, 5 (8,8,8 → 6,9,9)

Test time : 2pm~3pm

result : incorrect-result rate of 1% or less
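
One plausible reading of the gap between the two runs is the way a watermark treats late data: a record whose event time has fallen behind the watermark is dropped from the aggregation, so a lagging consumer on one partition can lose records under a short delay threshold. Below is a minimal Python sketch of that rule, not the Spark implementation. It checks each record individually, whereas Spark advances the watermark between micro-batches, and the 90-second lag and the timestamps are made-up illustration values, not measurements from these tests.

```python
def count_dropped(event_times, delay_s):
    """Count records dropped as too late under a simplified watermark rule.

    event_times: event timestamps (seconds) in arrival order.
    delay_s: watermark delay threshold in seconds.
    """
    max_seen = float("-inf")
    dropped = 0
    for t in event_times:
        # Watermark = latest event time seen so far minus the allowed delay.
        watermark = max_seen - delay_s
        if t < watermark:
            dropped += 1  # record is older than the watermark
        max_seen = max(max_seen, t)
    return dropped


# Hypothetical arrival order: a healthy partition delivers events on time,
# while a lagging partition delivers events that are 90 s old.
LAG_S = 90  # assumed lag, for illustration only
arrivals = []
for now in range(100, 200, 10):
    arrivals.append(now)          # on-time record
    arrivals.append(now - LAG_S)  # record from the lagging partition

dropped_60 = count_dropped(arrivals, delay_s=60)
dropped_120 = count_dropped(arrivals, delay_s=120)
print(dropped_60, dropped_120)
```

In this toy model every record from the lagging partition is dropped with a 60 s delay and none with a 120 s delay. The cost of the longer delay is that results are finalized later.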

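* To reduce the Spark watermark (delay time) and still improve the incorrect-result rate, the Kafka and Spark clusters need to be scaled out together, and the Kafka partitions, Spark RDD partitions, number of Spark cores, and the first-stage ETL bottleneck all need to be improved in combination.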
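
A rough way to see why these factors have to move together: with the Kafka source, Spark by default creates one input partition per Kafka topic partition, and with default settings each task occupies one core, so reading a micro-batch takes about ceil(partitions / cores) rounds of tasks. The helper below is a back-of-the-envelope Python sketch under those assumptions. The partition and core counts are hypothetical, not the values of this cluster.

```python
import math


def task_waves(kafka_partitions, total_cores):
    """Rounds of tasks needed to read one micro-batch.

    Assumes one Spark task per Kafka partition and one core per task.
    """
    if kafka_partitions < 0 or total_cores <= 0:
        raise ValueError("partitions must be >= 0 and cores must be > 0")
    return math.ceil(kafka_partitions / total_cores)


def idle_cores(kafka_partitions, total_cores):
    """Cores left without an input task when partitions < cores."""
    return max(total_cores - kafka_partitions, 0)


# Hypothetical sizes, for illustration only.
print(task_waves(48, 24))   # partitions outnumber cores: tasks queue up
print(idle_cores(12, 24))   # too few partitions: extra cores stay idle
```

Adding cores without adding partitions leaves cores idle during the read stage, and adding partitions without adding cores only adds rounds, which is why neither change helps much on its own. Later stages that shuffle data have their own partition counts and are not covered by this estimate.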

CPU, Memory, Disk I/O usage - summary

buffer cache flush needed

Host name	CPU(%user)	Memory	buffer cache flush
bigdata01	0.26%	10%	10.56%
bigdata02	0.16%	31%	5.83%
bigdata03	45%	99%	15%
bigdata04	55%	99%	12.4%
bigdata05	45%	99%	10.7%
bigdata06	0.2%	81%	5%
bigdata07	18%	99%	8%
bigdata08	18%	99%	12.4%
bigdata09	17%	99%	10.9%
bigdata10	15%	99%	11.8%
bigdata11	1.5%	99%	7%
bigdata12	11%	99%	35.8%
bigdata13	3%	99%	22%
bigdata14	1%	99%	5.7%
bigdata15	1.5%	99%	8%
