Amar Phanishayee
Microsoft Research
Verified email at cs.cmu.edu
Title
Cited by
Year
FAWN: A Fast Array of Wimpy Nodes
DG Andersen, J Franklin, M Kaminsky, A Phanishayee, L Tan, ...
SOSP 2009: Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems …, 2009
Cited by: 801; Year: 2009
PipeDream: generalized pipeline parallelism for DNN training
D Narayanan, A Harlap, A Phanishayee, V Seshadri, NR Devanur, ...
SOSP 2019: Proceedings of the 27th ACM Symposium on Operating Systems …, 2019
Cited by: 743; Year: 2019
Safe and effective fine-grained TCP retransmissions for datacenter communication
V Vasudevan, A Phanishayee, H Shah, E Krevat, DG Andersen, ...
SIGCOMM 2009: 39 (4), 303-314, 2009
Cited by: 585; Year: 2009
The non-IID data quagmire of decentralized machine learning
K Hsieh, A Phanishayee, O Mutlu, PB Gibbons
ICML 2020: International Conference on Machine Learning (arXiv preprint …, 2019
Cited by: 547; Year: 2019
Efficient large-scale language model training on GPU clusters using Megatron-LM
D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 458; Year: 2021
Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads
M Jeon, S Venkataraman, A Phanishayee, J Qian, W Xiao, F Yang
USENIX ATC 2019 (arXiv preprint arXiv:1901.05758), 2019
Cited by: 393*; Year: 2019
Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems
A Phanishayee, E Krevat, V Vasudevan, DG Andersen, GR Ganger, ...
FAST 2008: 6th USENIX Conference on File and Storage Technologies 8, 1-14, 2008
Cited by: 369; Year: 2008
ProjecToR: Agile Reconfigurable Data Center Interconnect
M Ghobadi, R Mahajan, A Phanishayee, N Devanur, J Kulkarni, ...
SIGCOMM 2016, 216-229, 2016
Cited by: 342; Year: 2016
PipeDream: Fast and efficient pipeline parallel DNN training
A Harlap, D Narayanan, A Phanishayee, V Seshadri, N Devanur, ...
arXiv preprint arXiv:1806.03377, 2018
Cited by: 248; Year: 2018
TBD: Benchmarking and Analyzing Deep Neural Network Training
H Zhu, M Akrout, B Zheng, A Pelegris, A Phanishayee, B Schroeder, ...
IISWC 2018 - International Symposium on Workload Characterization - arXiv …, 2018
Cited by: 219; Year: 2018
Themis: Fair and Efficient GPU Cluster Scheduling
K Mahajan, A Balasubramanian, A Singhvi, S Venkataraman, A Akella, ...
NSDI 2020: 17th USENIX Symposium on Networked Systems Design and …, 2020
Cited by: 211; Year: 2020
Gist: Efficient Data Encoding for Deep Neural Network Training
A Jain, A Phanishayee, J Mars, L Tang, G Pekhimenko
ISCA 2018: Proceedings of The 45th International Symposium on Computer …, 2018
Cited by: 190; Year: 2018
Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads
D Narayanan, K Santhanam, F Kazhamiaka, A Phanishayee, M Zaharia
OSDI 2020: 14th USENIX Symposium on Operating Systems Design and …, 2020
Cited by: 188; Year: 2020
Memory-efficient pipeline-parallel DNN training
D Narayanan, A Phanishayee, K Shi, X Chen, M Zaharia
International Conference on Machine Learning, 7937-7947, 2021
Cited by: 172; Year: 2021
Parameter hub: a rack-scale parameter server for distributed deep neural network training
L Luo, J Nelson, L Ceze, A Phanishayee, A Krishnamurthy
SOCC 2018: Proceedings of the ACM Symposium on Cloud Computing, 41-54, 2018
Cited by: 136; Year: 2018
Atomic In-place Updates for Non-volatile Main Memories with Kamino-Tx
A Memaripour, A Badam, A Phanishayee, Y Zhou, R Alagappan, ...
EuroSys 2017: Twelfth European Conference on Computer Systems, 499-512, 2017
Cited by: 128; Year: 2017
Blink: Fast and generic collectives for distributed ML
G Wang, S Venkataraman, A Phanishayee, J Thelin, N Devanur, I Stoica
MLSys 2020: Third Conference on Machine Learning and Systems (arXiv preprint …, 2019
Cited by: 118; Year: 2019
On application-level approaches to avoiding TCP throughput collapse in cluster-based storage systems
E Krevat, V Vasudevan, A Phanishayee, DG Andersen, GR Ganger, ...
Proceedings of the 2nd international workshop on Petascale data storage …, 2007
Cited by: 95; Year: 2007
Analyzing and mitigating data stalls in DNN training
J Mohan, A Phanishayee, A Raniwala, V Chidambaram
arXiv preprint arXiv:2007.06775, 2020
Cited by: 93; Year: 2020
Data center topology having multiple classes of reliability
M Ghobadi, R Mahajan, A Phanishayee, D Zhuo, XK Zou
US Patent 10,187,292, 2019
Cited by: 80; Year: 2019