CULA R10 Versus MAGMA 1.0 (Part 2)
In the previous post, we described the performance of the state of the art in GPU-assisted linear algebra computations. While performance is a huge motivating factor for adopting GPU code, there is also a lot to be said for a library's usability and capabilities. We will use this post to highlight some of our favorite features.
Just as important as speed is the question "are all my routines supported?", a critical first question when evaluating an alternative library. Counting precision variants, our current roster stands at 158 LAPACK routines and 34 BLAS routines (see here for info on our BLAS system). In comparison, MAGMA has roughly 100 routines (ignoring non-LAPACK variants), and not all of them have both CPU and GPU interfaces, whereas all CULA routines do. This is a point of pride for us; we want to provide a consistent and confusion-free experience across all platforms, all interfaces, and as many languages as possible.
Speaking of interfaces, we provide many interfaces into CULA to match as many programming styles as possible. We have the basic C bindings that most libraries support, plus type-overloaded calls in the C++ headers, and both of these come in host memory and device memory versions. We also have Fortran bindings for gfortran, Intel Fortran, and PGI Fortran. Our Bridge interface is a very low-effort way to quickly try out CULA's host interface for ALL of the supported LAPACK and BLAS calls in your whole program! For all the MATLAB users out there, we have demonstrated how to call CULA functions in your MATLAB MEX routines. In comparison, MAGMA supports only plain C interfaces for host and device calls, so the integration effort is placed on the user.
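To make "type-overloaded calls" a little more concrete, here is a minimal sketch of the general idea in C++. The routine names (`demo_saxpy`, `demo_daxpy`, `demo_axpy`) are hypothetical stand-ins invented for this illustration and are not part of CULA or MAGMA; they only mimic the usual LAPACK/BLAS convention of one routine per precision, with an overloaded wrapper on top.

```cpp
#include <cstddef>

// Hypothetical stand-ins for precision-specific C-style routines. They mimic
// the LAPACK/BLAS naming convention (s = float, d = double) but are NOT part
// of CULA or MAGMA; each computes y = alpha * x + y on host memory.
void demo_saxpy(std::size_t n, float alpha, const float* x, float* y) {
    for (std::size_t i = 0; i < n; ++i) y[i] += alpha * x[i];
}

void demo_daxpy(std::size_t n, double alpha, const double* x, double* y) {
    for (std::size_t i = 0; i < n; ++i) y[i] += alpha * x[i];
}

// Type-overloaded wrappers: one name for both precisions. The compiler
// selects the matching variant from the argument types at compile time.
inline void demo_axpy(std::size_t n, float alpha, const float* x, float* y) {
    demo_saxpy(n, alpha, x, y);
}

inline void demo_axpy(std::size_t n, double alpha, const double* x, double* y) {
    demo_daxpy(n, alpha, x, y);
}
```

With wrappers like these, calling code can switch between `float` and `double` data without renaming every call, which is the convenience an overloaded C++ header is meant to provide on top of plain C bindings.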
This isn't to say we're perfect, but if you check out our forums, you will see that we make an effort to aid users in their integration, and when bugs are discovered we attempt to correct them very quickly (see, as an example, this post). We feel strongly that CULA provides the best user experience, and heartily encourage you to take it for a test drive, starting with the free CULA Basic version.
Feature | MAGMA | CULA |
Number of Unique LAPACK Routines | 100 | 158 |
Number of Unique BLAS Routines | 36 | 34 |
Optimized SVD Solver | | |
Optimized Symmetric Eigenvalue Solver | | |
Banded Solvers | | |
Check and Report Errors | | |
Benchmark Program | | |
Has Examples | | |
Host Memory Interface | Partial | |
Device Memory Interface | Partial | |
Bridge Interface | | |
Fortran Support | Partial | |
Compiles Easily | Requires edits | |
Precompiled | | |