Tez: can an MRR plan be optimized to MR?

Published: 2020-06-16 19:33:51  Source: web  Views: 755  Author: r7raul  Category: Big Data

https://issues.apache.org/jira/browse/HIVE-2340

select userid,count(*) from u_data group by userid order by userid will produce an MRR (map-reduce-reduce) plan.

I think that when the result of userid,count(*) is small (one reducer can process it), this query plan could be optimized to MR. Is that possible?

To prevent bad reducer merging, the reducer merging only kicks in when the optimizer thinks it gets a perf boost.

MR -> MRR is not a big win when it comes to Tez, due to container reuse; going wide on the large cardinality in case of missing map-side aggregation will be safer.

If hive.map.aggr=true and the userid set fits within memory, then smushing the reducers would be nicer.

To reset the wide-narrow checks, do

set hive.optimize.reducededuplication.min.reducer=1;

But be aware that it will fail (I've seen full disks) as you scale upwards to the 10+ TB cases.

Cheers,

Gopal

hive.optimize.reducededuplication.min.reducer

Default Value: 4
Added In: Hive 0.11.0 with HIVE-2340

Reduce deduplication merges two RSs (reduce sink operators) by moving the key/parts/reducer-num of the child RS to the parent RS. That means that if the reducer-num of the child RS is fixed (order by or forced bucketing) and small, the merge can produce a single, very slow MR job. The optimization will be disabled if the number of reducers is less than the specified value.
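One way to check whether the setting changes the plan for the query from the question is to compare EXPLAIN output under both values. The following is only a sketch: it assumes the u_data table from the question exists and that the session runs on Tez, and the plan actually produced may differ by Hive version and table statistics.

```sql
-- Assumption: the session runs on Tez and u_data is the table from the question.
set hive.execution.engine=tez;
set hive.map.aggr=true;

-- Default threshold (4): reducer merging is skipped when the merged
-- reducer count would be below this value.
set hive.optimize.reducededuplication.min.reducer=4;
explain
select userid, count(*) from u_data group by userid order by userid;

-- Lowered threshold (1): lets the optimizer consider merging the two
-- reduce stages even though order by forces a single reducer.
set hive.optimize.reducededuplication.min.reducer=1;
explain
select userid, count(*) from u_data group by userid order by userid;
```

Comparing the number of reducer stages in the two plans shows whether the merge was applied. Given the warning about full disks at the 10+ TB scale, it is probably safer to lower the threshold per session for queries known to produce a small result than to change it globally.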
