Mehdi Goli
VP R&D and AI Enablement
Parallel patterns for heterogeneous CPU/GPU architectures: Structured parallelism from cluster to cloud
S Campa, M Danelutto, M Goli, H González-Vélez, AM Popescu, ...
Future Generation Computer Systems 37, 354-366, 2014
A new vertical fragmentation algorithm based on ant collective behavior in distributed database systems
M Goli, SMT Rouhani Rankoohi
Knowledge and Information Systems 30, 435-455, 2012
Heterogeneous algorithmic skeletons for FastFlow with seamless coordination over hybrid architectures
M Goli, H González-Vélez
2013 21st Euromicro International Conference on Parallel, Distributed, and …, 2013
Accelerated machine learning using TensorFlow and SYCL on OpenCL Devices
M Goli, L Iwanski, A Richards
Proceedings of the 5th International Workshop on OpenCL, 1-4, 2017
Towards cross-platform performance portability of DNN models using SYCL
M Goli, K Narasimhan, R Reyes, B Tracy, D Soutar, S Georgiev, ...
2020 IEEE/ACM International Workshop on Performance, Portability and …, 2020
Streaming dynamic coarse-grained CPU/GPU workloads with heterogeneous pipelines in FastFlow
M Goli, MT Garba, H González-Vélez
2012 IEEE 14th International Conference on High Performance Computing and …, 2012
SYCL-BLAS: leveraging expression trees for linear algebra
JI Aliaga, R Reyes, M Goli
Proceedings of the 5th International Workshop on OpenCL, 1-5, 2017
oneAPI open-source math library interface
M Krainiuk, M Goli, VR Pascuzzi
2021 International Workshop on Performance, Portability and Productivity in …, 2021
Mapping parallel programs to heterogeneous CPU/GPU architectures using a Monte Carlo Tree Search
M Goli, J McCall, C Brown, V Janjic, K Hammond
2013 IEEE Congress on Evolutionary Computation, 2932-2939, 2013
N‐body computations using skeletal frameworks on multicore CPU/graphics processing unit architectures: an empirical performance evaluation
M Goli, H González-Vélez
Concurrency and Computation: Practice and Experience 26 (4), 972-986, 2014
VisionCpp: A SYCL-based computer vision framework
M Goli
Proceedings of the 4th International Workshop on OpenCL, 1-4, 2016
Towards performance portability of AI models using SYCL-DNN
M Tanvir, K Narasimhan, M Goli, O El Farouki, S Georgiev, I Ault
International Workshop on OpenCL, 1-3, 2022
Cross-platform performance portability using highly parametrized SYCL kernels
J Lawson, M Goli, D McBain, D Soutar, L Sugy
arXiv preprint arXiv:1904.05347, 2019
Autonomic coordination of skeleton-based applications over CPU/GPU multi-core architectures
M Goli, H González-Vélez
International Journal of Parallel Programming 45, 203-224, 2017
Improving performance of SYCL applications on CPU architectures using LLVM-directed compilation flow
P Ghiglio, U Dolinsky, M Goli, K Narasimhan
Proceedings of the Thirteenth International Workshop on Programming Models …, 2022
Performance portability through machine learning guided kernel selection in SYCL libraries
J Lawson, M Goli
Parallel Computing 107, 102813, 2021
Benchmarking a Proof-of-Concept Performance Portable SYCL-based Fast Fourier Transformation Library
VR Pascuzzi, M Goli
International Workshop on OpenCL, 1-9, 2022
Achieving Near-Native Runtime Performance and Cross-Platform Performance Portability for Random Number Generation Through SYCL Interoperability
VR Pascuzzi, M Goli
International Workshop on Accelerator Programming Using Directives, 22-45, 2021
Toward performance portability of highly parametrizable TRSM algorithm using SYCL
T Sabino, M Goli
International Workshop on OpenCL, 1-10, 2021
Formalised composition and interaction for heterogeneous structured parallelism
M Goli, H González-Vélez
International Journal of Parallel Programming 46, 120-151, 2018