Saurabh Gupta
AMD Server Performance
Verified email at ncsu.edu
Title
Cited by
Year
Understanding GPU errors on large-scale HPC systems and the implications for system design and operation
D Tiwari, S Gupta, J Rogers, D Maxwell, P Rech, S Vazhkudai, D Oliveira, ...
2015 IEEE 21st International Symposium on High Performance Computer …, 2015
Cited by: 196; Year: 2015
Failures in large scale systems: long-term measurement, analysis, and implications
S Gupta, T Patel, C Engelmann, D Tiwari
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by: 175; Year: 2017
Lazy checkpointing: Exploiting temporal locality in failures to mitigate checkpointing overheads on extreme-scale systems
D Tiwari, S Gupta, SS Vazhkudai
Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP …, 2014
Cited by: 119; Year: 2014
A large-scale study of soft-errors on GPUs in the field
B Nie, D Tiwari, S Gupta, E Smirni, JH Rogers
2016 IEEE International Symposium on High Performance Computer Architecture …, 2016
Cited by: 104; Year: 2016
Reliability lessons learned from GPU experience with the Titan supercomputer at Oak Ridge Leadership Computing Facility
D Tiwari, S Gupta, G Gallarno, J Rogers, D Maxwell
Proceedings of the international conference for high performance computing …, 2015
Cited by: 100; Year: 2015
Machine learning models for GPU error prediction in a large scale HPC system
B Nie, J Xue, S Gupta, T Patel, C Engelmann, E Smirni, D Tiwari
2018 48th Annual IEEE/IFIP International Conference on Dependable Systems …, 2018
Cited by: 86; Year: 2018
Understanding and exploiting spatial properties of system failures on extreme-scale HPC systems
S Gupta, D Tiwari, C Jantzi, J Rogers, D Maxwell
2015 45th Annual IEEE/IFIP International Conference on Dependable Systems …, 2015
Cited by: 77; Year: 2015
Characterizing temperature, power, and soft-error behaviors in data center systems: Insights, challenges, and opportunities
B Nie, J Xue, S Gupta, C Engelmann, E Smirni, D Tiwari
2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation …, 2017
Cited by: 64; Year: 2017
Best practices and lessons learned from deploying and operating large-scale data-centric parallel file systems
S Oral, J Simmons, J Hill, D Leverman, F Wang, M Ezell, R Miller, D Fuller, ...
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 59; Year: 2014
Adaptive cache bypassing for inclusive last level caches
S Gupta, H Gao, H Zhou
Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International …, 2013
Cited by: 59; Year: 2013
Locality Principle Revisited: A Probability-Based Quantitative Approach
S Gupta, P Xiang, Y Yang, H Zhou
Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th …, 2012
Cited by: 58; Year: 2012
Reducing waste in extreme scale systems through introspective analysis
L Bautista-Gomez, A Gainaru, S Perarnau, D Tiwari, S Gupta, ...
2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2016
Cited by: 48; Year: 2016
Power-capping aware checkpointing: On the interplay among power-capping, temperature, reliability, performance, and energy
K Tang, D Tiwari, S Gupta, P Huang, Q Lu, C Engelmann, X He
2016 46th Annual IEEE/IFIP International Conference on Dependable Systems …, 2016
Cited by: 29; Year: 2016
A model-driven approach to warp/thread-block level GPU cache bypassing
H Dai, C Li, H Zhou, S Gupta, C Kartsaklis, M Mantor
Proceedings of the 53rd Annual Design Automation Conference, 1-6, 2016
Cited by: 26; Year: 2016
Spatial Locality-Aware Cache Partitioning for Effective Cache Sharing
S Gupta, H Zhou
ICPP, 2015
Cited by: 24; Year: 2015
Improving large-scale storage system performance via topology-aware and balanced data placement
F Wang, S Oral, S Gupta, D Tiwari, SS Vazhkudai
2014 20th IEEE International Conference on Parallel and Distributed Systems …, 2014
Cited by: 20; Year: 2014
A Multi-faceted Approach to Job Placement for Improved Performance on Extreme-Scale Systems
C Zimmer, S Gupta, S Atchley, SS Vazhkudai, C Albing
29th International Conference on High Performance Computing, Networking …, 2016
Cited by: 18; Year: 2016
Understanding and analyzing interconnect errors and network congestion on a large scale HPC system
M Kumar, S Gupta, T Patel, M Wilder, W Shi, S Fu, C Engelmann, D Tiwari
2018 48th Annual IEEE/IFIP International Conference on Dependable Systems …, 2018
Cited by: 17; Year: 2018
Adaptive power profiling for many-core HPC architectures
J Kelley, C Stewart, D Tiwari, S Gupta
2016 IEEE International Conference on Autonomic Computing (ICAC), 179-188, 2016
Cited by: 17; Year: 2016
Check radiography after fixation of hip fractures: is it necessary?
K Mohanty, SK Gupta, RM Evans
Journal of the Royal College of Surgeons of Edinburgh 45 (6), 398-399, 2000
Cited by: 17; Year: 2000