**Consider relying on one or more GPUs. A CPU is designed to be a multitasker that can quickly switch between tasks, whereas a Graphics Processing Unit (GPU) is designed to perform the same calculation repeatedly across large amounts of data, which gives large increases in performance on such workloads. The stacks in the listed papers achieved dramatically higher speeds, yet they did not use modern designs or graphics cards, which kept them from running even faster.**

The GPU (Graphics Processing Unit) is changing the face of large-scale data mining by significantly speeding up data mining algorithms. For example, a GPU-accelerated version of the K-Means clustering algorithm was found to be 200x-400x faster than the popular benchmark program MineBench running on a single-core CPU, and 6x-12x faster than a highly optimised CPU-only version running on an 8-core CPU workstation.

These GPU-accelerated performance results also hold for large data sets. For example, in 2009, on a data set with 1 billion 2-dimensional data points and 1,000 clusters, the GPU-accelerated K-Means algorithm took 26 minutes (using a GTX 280 GPU with 240 cores), whilst the CPU-only version running on a single-core CPU workstation, using MineBench, took close to 6 days (see the research paper “Clustering Billions of Data Points using GPUs” by Ren Wu and Bin Zhang, HP Laboratories). Substantial additional speed-ups would be expected if the tests were run today on the latest Fermi GPUs with 480 cores and 1 TFLOPS performance.
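To make the K-Means discussion concrete, here is a minimal sketch of a single K-Means iteration in plain Python. It is not the HP Laboratories implementation; the function names `nearest_centroid` and `kmeans_step` are hypothetical and chosen only for illustration. The point it tries to show is that the assignment of each data point is independent of every other point, which is the kind of repetitive, uniform work a GPU is built for.

```python
def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, centroid in enumerate(centroids):
        dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def kmeans_step(points, centroids):
    """One K-Means iteration: assign points to centroids, then recompute the centroids."""
    # Assignment step: every point is processed independently of the others,
    # so a GPU version could, for example, give each point its own thread.
    labels = [nearest_centroid(point, centroids) for point in points]

    # Update step: each centroid becomes the mean of the points assigned to it.
    dims = len(centroids[0])
    sums = [[0.0] * dims for _ in centroids]
    counts = [0] * len(centroids)
    for point, label in zip(points, labels):
        counts[label] += 1
        for d in range(dims):
            sums[label][d] += point[d]

    # A centroid with no assigned points is left where it was.
    new_centroids = [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(len(centroids))
    ]
    return labels, new_centroids


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    labels, centroids = kmeans_step(points, [(0.0, 0.0), (10.0, 10.0)])
    print(labels, centroids)
```

With 1 billion points and 1,000 clusters, the assignment step alone performs on the order of 10^12 distance calculations per iteration, which gives a sense of why the serial version takes days.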

Over the last two years hundreds of research papers have been published, all confirming the substantial improvement in data mining that the GPU delivers. I will identify a further 7 data mining algorithms where substantial GPU acceleration has been achieved, in the hope that it will stimulate your interest in using GPUs to accelerate your data mining projects:

Hidden Markov Models (HMM) have many data mining applications, such as financial economics, computational biology, financial time series modelling (where non-stationarity and non-linearity are the main challenges), and the analysis of network intrusion logs. Using parallel HMM algorithms designed for the GPU, researchers (see “cuHMM: a CUDA Implementation of Hidden Markov Model Training and Classification” by Chaun Lin, May 2009) achieved speed-ups of up to 800x on a GPU compared with the time taken on a single-core CPU workstation.

Sorting is a very important part of many data mining applications. Last month Duane Merrill and Andrew Grimshaw (from the University of Virginia) reported a very fast implementation of the radix sorting method that exceeded an average sort rate of 1G keys/sec on the GTX480 (NVidia Fermi GPU). See http://goo.gl/wpra
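For readers unfamiliar with radix sort, the following is a minimal serial sketch of the least-significant-digit variant in plain Python. It is not Merrill and Grimshaw's GPU code, which is far more sophisticated; the function name `radix_sort` and the parameters `bits_per_pass` and `key_bits` are assumptions made for this illustration. Each pass has three phases (histogram, prefix sum, scatter), and it is this structure that parallel implementations typically build on.

```python
def radix_sort(keys, bits_per_pass=4, key_bits=32):
    """LSD radix sort for non-negative integers smaller than 2**key_bits."""
    radix = 1 << bits_per_pass   # number of distinct digit values per pass
    mask = radix - 1             # bit mask that extracts one digit
    keys = list(keys)

    for shift in range(0, key_bits, bits_per_pass):
        # Phase 1: histogram of the current digit across all keys.
        counts = [0] * radix
        for key in keys:
            counts[(key >> shift) & mask] += 1

        # Phase 2: exclusive prefix sum turns counts into starting offsets.
        offsets = [0] * radix
        running = 0
        for digit in range(radix):
            offsets[digit] = running
            running += counts[digit]

        # Phase 3: stable scatter of each key to its slot for this digit.
        output = [0] * len(keys)
        for key in keys:
            digit = (key >> shift) & mask
            output[offsets[digit]] = key
            offsets[digit] += 1
        keys = output

    return keys


if __name__ == "__main__":
    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Because radix sort never compares two keys with each other, the amount of work per key is fixed by the key width, which helps explain why it is a popular choice for high-throughput sorting.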

Density-based Clustering is an important paradigm in clustering, since it is typically robust to noise and outliers and very good at finding clusters of arbitrary shape in metric and vector spaces. Tests have shown that the GPU speed-up ranged from 3.5x for 30k points to almost 15x for 2 million data points. A guaranteed GPU speed-up factor of at least 10x was obtained on data sets consisting of more than 250k points (see “Density-based Clustering using Graphics Processors” by Christian Bohm et al.).

Similarity Join is an important building block for similarity search and data mining algorithms. Researchers used a special algorithm, the index-supported similarity join for the GPU, to outperform the CPU by a factor of 15.9x on 180 MB of data (see “Index-supported Similarity Join on Graphics Processors” by Christian Bohm et al.).

Bayesian Mixture Models have applications in many areas, and of particular interest is the Bayesian analysis of structured massive multivariate mixtures with large data sets. Recent research work (see “Understanding GPU Programming for Statistical Computation: Studies in Massively Parallel Massive Mixtures” by Marc Suchard et al.) has demonstrated that an older-generation GPU (GeForce GTX285 with 240 cores) was able to achieve a 120x speed-up over a quad-core CPU version.

Support Vector Machines (SVM) have many diverse data mining uses, including classification and regression analysis. Training SVMs and using them for classification remains computationally intensive. The GPU version of an SVM algorithm was found to be 43x-104x faster than the CPU version for building classification models and 112x-212x faster than the CPU version for building regression models. See “GPU Accelerated Support Vector Machines for Mining High-Throughput Screening Data” by Quan Liao, Jibo Wang, et al.

Kernel Machines. Algorithms based on kernel methods play a central part in data mining, including modern machine learning and non-parametric statistics. Central to these algorithms are a number of linear operations on matrices of kernel functions, which take the training and testing data as arguments. Recent work (see “GPUML: Graphical processors for speeding up kernel machines” by Balaji Srinivasan et al., 2009) involves transforming these Kernel Machines into parallel kernel algorithms on a GPU. The following are two examples where considerable speed-ups were achieved: (1) To estimate the densities of 10,000 data points on 10,000 samples, the CPU implementation took 16 seconds whilst the GPU implementation took 13 ms, a speed-up well in excess of 1,230x; (2) In a Gaussian process regression on 8-dimensional data, the GPU took 2 seconds to make predictions whilst the CPU version took hours to make the same predictions, which again is a significant speed-up over the CPU version.
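The density estimation example can be illustrated with a minimal sketch of Gaussian kernel density estimation in plain Python. This is not the GPUML code: the function name `gaussian_kde`, the restriction to one dimension, and the `bandwidth` parameter are all assumptions made for illustration. It shows the "matrix of kernel functions" structure described above: every query point is combined with every data point, so 10,000 data points evaluated at 10,000 samples means 10^8 independent kernel evaluations.

```python
import math


def gaussian_kde(data, queries, bandwidth):
    """Evaluate a 1-D Gaussian kernel density estimate at each query point."""
    # Normalisation so that the estimated density integrates to 1.
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    densities = []
    for q in queries:
        # One row of the kernel matrix: k(q, x) for every data point x.
        # Rows are independent of each other, so they can be computed in parallel.
        total = sum(math.exp(-0.5 * ((q - x) / bandwidth) ** 2) for x in data)
        densities.append(norm * total)
    return densities


if __name__ == "__main__":
    data = [-1.0, 0.0, 0.5, 2.0]
    print(gaussian_kde(data, [0.0, 1.0], bandwidth=0.75))
```

The cost grows with the product of the number of data points and the number of query points, which is why this kind of workload benefits so much from being spread across hundreds of GPU cores.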

If you want to use GPUs but do not want to get your hands “dirty” writing CUDA C/C++ code (or using other language bindings such as Python, Java, .NET, Fortran, Perl, or Lua), then consider using the MATLAB Parallel Computing Toolbox. This is a powerful solution for those who know MATLAB. Alternatively, R now has GPU plugins. A subsequent post will cover using MATLAB and R for GPU-accelerated data mining.