博碩士論文查詢聯邦系統首頁 聯絡信箱 關於我們 加入我們

本論文已被瀏覽 34 次, [ 造訪詳細資料與全文 ] 17 次,[ 回到前頁查詢結果 ] [ 重新搜尋 ]

Mutex Locking versus Hardware Transactional Memory: An Experimental Evaluation.

作者:Sean Ryan Moore,
畢業學校:VT
出版單位:VT
核准日期:2015-10-06
類型:Electronic Thesis or Dissertation
權限:unrestricted.I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my th....

英文摘要

It has historically been the case that CPUs have run programs ever faster without significant intervention on the behalf of the programmer. However, this "free lunch" has largely ended due to the end of exponentially increasing core frequency and the current slow increase in instruction-level parallelism but continues to a degree in cache size improvements. But since Moore's law still largely continues "lunch", i.e. program performance, can still be bought at the price of rewriting code for multiple cores, which is enabled by the trend Moore's law describes. Multicore architectures cannot aid performance for problems whose solutions are necessarily sequential in nature and writing efficient and correct concurrent programs is not easy in all cases when using synchronization methods like fine-grained mutex locks.
<br></br>
Transactional memory, and its implementation as hardware transactional memory, allow programmers to write concurrent applications without the attendant complexity of programming with mutex locks. This allows programmers to focus on optimizing the application for performance. Given that transactions can run two segments of code in parallel that a mutex lock would force to run sequentially and that transactions can abort, causing a program to do the same work more than once, whether transactions perform better or worse than mutex locks is dependent on the program's execution profile and the coarseness or fineness at which mutex locks are used.
<br></br>
In this thesis the GNU C Library's futex implementation of mutex locks and Intel's Restricted Transactional Memory have been compared and the behavior of those transactions has been analyzed. This analysis includes a pathological behavior permitted by the GNU C Library's hardware transactional memory implementation of mutex locks. The tradeoffs between fine-grained and global locking implementations have been discussed, compared, and used in the context of fallback locks for hardware transactions. This thesis provides evidence to the effect that fine-grained locking is not critical for program performance and that in many cases global locking and hardware transactions can provide nearly equivalent performance without the programming difficulties. This work has shown that across the 23 applications examined, with relation to their original locking implementation, a global locking scheme without elision has a 0.96x speedup, Intel's Restricted Transactional Memory (RTM) with the application's original locks as a fallback has a 1.01x speedup and with global lock fallback RTM has a speedup of 0.97x.
<br></br>
This work is supported in part by NAVSEA/NEEC under grant 3003279297. Any opinions, findings, and conclusions or recommendations expressed in this thesis are those of the author and do not necessarily reflect the views of NAVSEA.


chair - Binoy Ravindran

committee_member - Roberto Palmieri

committee_member - Dongyoon Lee


 

計畫贊助者: