Khaled Hamidouche

Cited by

	All	Since 2019
Citations	1339	687
h-index	21	14
i10-index	37	18

Citations per year
Year	Citations
2011	12
2012	9
2013	18
2014	73
2015	91
2016	98
2017	135
2018	181
2019	158
2020	177
2021	133
2022	105
2023	95
2024	19

Public access

15 articles / 3 articles (available / not available)
Based on funding mandates

Khaled Hamidouche

AMD Research

Verified email at amd.com

Senior Research Scientist


Title	Authors	Venue	Cited by	Year
S-Caffe: Co-designing MPI runtimes and Caffe for scalable deep learning on modern GPU clusters	AA Awan, K Hamidouche, JM Hashmi, DK Panda	Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of …, 2017	177	2017
Efficient inter-node MPI communication using GPUDirect RDMA for InfiniBand clusters with NVIDIA GPUs	S Potluri, K Hamidouche, A Venkatesh, D Bureddy, DK Panda	2013 42nd International Conference on Parallel Processing, 80-89, 2013	174	2013
Efficient large message broadcast using NCCL and CUDA-aware MPI for deep learning	AA Awan, K Hamidouche, A Venkatesh, DK Panda	Proceedings of the 23rd European MPI Users' Group Meeting, 15-22, 2016	55	2016
MVAPICH-PRISM: A proxy-based communication framework using InfiniBand and SCIF for Intel MIC clusters	S Potluri, D Bureddy, K Hamidouche, A Venkatesh, K Kandalla, ...	Proceedings of the International Conference on High Performance Computing …, 2013	52	2013
Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters	R Shi, S Potluri, K Hamidouche, J Perkins, M Li, D Rossetti, DK Panda	2014 21st International Conference on High Performance Computing (HiPC), 1-10, 2014	51	2014
A case for application-oblivious energy-efficient MPI runtime	A Venkatesh, A Vishnu, K Hamidouche, N Tallent, D Panda, D Kerbyson, ...	Proceedings of the international conference for high performance computing …, 2015	44	2015
Designing MPI library with dynamic connected transport (DCT) of InfiniBand: early experiences	H Subramoni, K Hamidouche, A Venkatesh, S Chakraborty, DK Panda	International Supercomputing Conference, 278-295, 2014	39	2014
HAND: A hybrid approach to accelerate non-contiguous data movement using MPI datatypes on GPU clusters	R Shi, X Lu, S Potluri, K Hamidouche, J Zhang, DK Panda	2014 43rd International Conference on Parallel Processing, 221-230, 2014	33	2014
Designing optimized MPI broadcast and allreduce for many integrated core (MIC) InfiniBand clusters	K Kandalla, A Venkatesh, K Hamidouche, S Potluri, D Bureddy, DK Panda	2013 IEEE 21st Annual Symposium on High-Performance Interconnects, 63-70, 2013	33	2013
Power-check: An energy-efficient checkpointing framework for HPC clusters	RR Chandrasekar, A Venkatesh, K Hamidouche, DK Panda	2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2015	30	2015
A scalable and portable approach to accelerate hybrid HPL on heterogeneous CPU-GPU clusters	R Shi, S Potluri, K Hamidouche, X Lu, K Tomko, DK Panda	2013 IEEE International Conference on Cluster Computing (CLUSTER), 1-8, 2013	29	2013
INAM²: InfiniBand Network Analysis and Monitoring with MPI	H Subramoni, AM Augustine, M Arnold, J Perkins, X Lu, K Hamidouche, ...	International Conference on High Performance Computing, 300-320, 2016	27	2016
A framework for an automatic hybrid MPI+OpenMP code generation	K Hamidouche, J Falcou, D Etiemble	SpringSim (hpc), 48-55, 2011	27	2011
Re-designing CNTK deep learning framework on modern GPU enabled clusters	DS Banerjee, K Hamidouche, DK Panda	2016 IEEE international conference on cloud computing technology and science …, 2016	26	2016
CUDA kernel based collective reduction operations on large-scale GPU clusters	CH Chu, K Hamidouche, A Venkatesh, AA Awan, DK Panda	2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2016	26	2016
Scalable Graph500 design with MPI-3 RMA	M Li, X Lu, S Potluri, K Hamidouche, J Jose, K Tomko, DK Panda	2014 IEEE International Conference on Cluster Computing (CLUSTER), 230-238, 2014	25	2014
Parallel Smith-Waterman comparison on multicore and manycore computing platforms with BSP++	K Hamidouche, FM Mendonca, J Falcou, ACMA de Melo, D Etiemble	International Journal of Parallel Programming 41, 111-136, 2013	25	2013
Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters	K Hamidouche, A Venkatesh, AA Awan, H Subramoni, CH Chu, ...	2015 IEEE International Conference on Cluster Computing, 78-87, 2015	24	2015
GPU triggered networking for intra-kernel communications	M LeBeane, K Hamidouche, B Benton, M Breternitz, SK Reinhardt, ...	Proceedings of the International Conference for High Performance Computing …, 2017	23	2017
Hybrid bulk synchronous parallelism library for clustered SMP architectures	K Hamidouche, J Falcou, D Etiemble	Proceedings of the fourth international workshop on High-level parallel …, 2010	22	2010
