Efficient volume exploration using the Gaussian mixture model
Y Wang, W Chen, J Zhang, T Dong, G Shan, X Chi
IEEE Transactions on Visualization and Computer Graphics 17 (11), 1560-1573, 2011
Optimizing symmetric dense matrix-vector multiplication on GPUs
R Nath, S Tomov, TT Dong, J Dongarra
Proceedings of 2011 International Conference for High Performance Computing …, 2011
Batched matrix computations on hardware accelerators based on GPUs
A Haidar, T Dong, P Luszczek, S Tomov, J Dongarra
The International Journal of High Performance Computing Applications 29 (2 …, 2015
A step towards energy efficient computing: Redesigning a hydrodynamic application on CPU-GPU
T Dong, V Dobrev, T Kolev, R Rieben, S Tomov, J Dongarra
2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014
Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems
I Yamazaki, T Dong, R Solcà, S Tomov, J Dongarra, T Schulthess
Concurrency and Computation: Practice and Experience 26 (16), 2652-2666, 2013
LU factorization of small matrices: Accelerating batched DGETRF on the GPU
T Dong, A Haidar, P Luszczek, JA Harris, S Tomov, J Dongarra
2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 …, 2014
A framework for batched and GPU-resident factorization algorithms applied to block Householder transformations
A Haidar, TT Dong, S Tomov, P Luszczek, J Dongarra
International Conference on High Performance Computing, 31-47, 2015
A fast batched Cholesky factorization on a GPU
T Dong, A Haidar, S Tomov, J Dongarra
2014 43rd International Conference on Parallel Processing, 432-440, 2014
Volume exploration using ellipsoidal Gaussian transfer functions
Y Wang, W Chen, G Shan, T Dong, X Chi
2010 IEEE Pacific Visualization Symposium (PacificVis), 25-32, 2010
Towards batched linear solvers on accelerated hardware platforms
A Haidar, T Dong, P Luszczek, S Tomov, J Dongarra
ACM SIGPLAN Notices 50 (8), 261-262, 2015
Acceleration of computational fluid dynamics codes on GPU
TX Dong, XL Li, S Li, XB Chi
Jisuanji Xitong Yingyong-Computer Systems and Applications 20 (1), 104-109, 2011
Mixed-precision orthogonalization scheme and adaptive step size for improving the stability and performance of CA-GMRES on GPUs
I Yamazaki, S Tomov, T Dong, J Dongarra
International Conference on High Performance Computing for Computational …, 2014
Optimization for performance and energy for batched matrix computations on GPUs
A Haidar, T Dong, P Luszczek, S Tomov, J Dongarra
Proceedings of the 8th Workshop on General Purpose Processing Using GPUs, 59-69, 2015
MAGMA: a new generation of linear algebra library for GPU and multicore architectures
J Dongarra, T Dong, M Gates, A Haidar, S Tomov, I Yamazaki
SC12, 2012
Optimizing the SVD bidiagonalization process for a batch of small matrices
T Dong, A Haidar, S Tomov, J Dongarra
Procedia Computer Science 108, 1008-1018, 2017
MAGMA Batched: A batched BLAS approach for small matrix factorizations and applications on GPUs
T Dong, A Haidar, P Luszczek, S Tomov, A Abdelfattah, J Dongarra
Technical report, 2016
Seismic wave propagation simulation using accelerated support operator rupture dynamics on multi-GPU
Y Zhou, S Song, T Dong, DA Yuen
Computational Science and Engineering (CSE), 2011 IEEE 14th International …, 2011
Tridiagonalization of a symmetric dense matrix on a GPU cluster
I Yamazaki, T Dong, S Tomov, J Dongarra
2013 IEEE International Symposium on Parallel & Distributed Processing …, 2013
The GPU Acceleration of a Two-Dimensional Diffusion Equation
Computer Engineering & Science 11, 034, 2009
Accelerating the SVD bi-diagonalization of a batch of small matrices using GPUs
T Dong, A Haidar, S Tomov, J Dongarra
Journal of Computational Science 26, 237-245, 2018
