KeOps: seamless Kernel Operations on GPU without memory overflows

Machine Learning
useR 2020, online (originally Saint-Louis, Missouri, USA, cancelled due to Covid-19 crisis)

Benjamin Charlier

Jean Feydy

Joan Glaunès

Ghislain Durif

François-David Collin


July 15, 2020

Keywords: Kernel operation, Matrix reduction, GPU, Autodifferentiation, KeOps, R


The KeOps library provides routines to compute generic reductions of large 2D arrays whose entries are given by a mathematical formula, such as kernel matrices. Its C++/CUDA implementation with GPU support combines a tiled reduction scheme with an automatic differentiation engine. Because it relies on online map-reduce schemes, it is well suited to the scalable computation of kernel dot products and their gradients, even when the full kernel matrix does not fit into GPU memory.
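
To make the idea of a tiled, online reduction concrete, here is a minimal pure-Python sketch. It is not the KeOps implementation or its API: the function names, the Gaussian kernel, the tile size and the data are all illustrative assumptions. It computes a kernel product a_i = sum_j k(x_i, y_j) b_j twice, once by building the full kernel matrix and once by streaming over tiles of columns, so that only a small block of entries exists at any time.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel value k(x, y) = exp(-|x - y|^2 / (2 sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dense_kernel_product(xs, ys, b, sigma):
    """Reference version: build the full M-by-N kernel matrix, then reduce."""
    K = [[gaussian_kernel(x, y, sigma) for y in ys] for x in xs]
    return [sum(k_ij * b_j for k_ij, b_j in zip(row, b)) for row in K]


def tiled_kernel_product(xs, ys, b, sigma, tile=2):
    """Same reduction, but entries are generated one tile of columns at a time."""
    out = [0.0] * len(xs)
    for start in range(0, len(ys), tile):
        ys_tile = ys[start:start + tile]
        b_tile = b[start:start + tile]
        for i, x in enumerate(xs):
            # Entries are computed from the formula, accumulated, then dropped.
            out[i] += sum(gaussian_kernel(x, y, sigma) * b_j
                          for y, b_j in zip(ys_tile, b_tile))
    return out


# Hypothetical toy data: 3 target points, 5 source points in the plane.
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
ys = [(0.5, 0.5), (1.0, 1.0), (2.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
b = [1.0, -2.0, 0.5, 4.0, 1.5]

dense = dense_kernel_product(xs, ys, b, sigma=1.0)
tiled = tiled_kernel_product(xs, ys, b, sigma=1.0, tile=2)
```

Both versions agree up to floating-point rounding, but the tiled one never holds more than `tile` kernel entries per row, which is the property that lets such a scheme scale past the size of the full matrix.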

KeOps is all about breaking through this memory bottleneck and making GPU power available for standard mathematical computations. As of 2020, such GPU support has been mostly restricted to the operations needed to implement deep learning algorithms. KeOps provides GPU support for custom mathematical operators without the cost of developing a dedicated CUDA implementation.

KeOps can be used to implement various kernel-based methods, especially in statistics and machine learning, e.g. density estimation, classification and regression (SVM, k-NN), interpolation, and kriging. We provide the R package RKeOps to use KeOps efficiently from R.
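
As an illustration of how one of these methods maps onto a reduction, the following pure-Python sketch phrases 1-nearest-neighbor classification as a running arg-min over squared distances. It is only a sketch under stated assumptions: it does not use KeOps or RKeOps, and the function name and toy data are hypothetical. The point is that each query needs a single pass over the training points and no distance matrix is ever stored.

```python
def nearest_neighbor_labels(queries, train_points, train_labels):
    """1-NN classification via a running arg-min over squared distances."""
    predictions = []
    for q in queries:
        best_dist, best_index = float("inf"), -1
        for j, p in enumerate(train_points):
            # Distance entry computed on the fly from its formula.
            d = sum((qi - pi) ** 2 for qi, pi in zip(q, p))
            if d < best_dist:
                best_dist, best_index = d, j
        predictions.append(train_labels[best_index])
    return predictions


# Hypothetical toy data: two small clusters labeled "a" and "b".
train_points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.2)]
train_labels = ["a", "a", "b", "b"]
queries = [(0.1, 0.0), (1.1, 0.9), (0.4, 0.4)]

predicted = nearest_neighbor_labels(queries, train_points, train_labels)
print(predicted)  # ['a', 'b', 'a']
```

Density estimation follows the same pattern with a sum in place of the arg-min.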