Setting up PyOpenCL with a CPU implementation of OpenCL

Posted by admin on Sat 08 September 2012

I spent today trying to get a working OpenCL setup on my Ubuntu machine. I wanted to avoid installing the non-free drivers, so I decided to go for a software implementation.

Here are the steps I followed. They look straightforward, so why does it feel like it was a struggle? Because these are the important points that give hard-to-trace errors when you get them wrong:

  • choose pocl, not clover, as the software implementation
  • compile pocl using clang, not using gcc
  • compile pyopencl to support only OpenCL version 1.1

In detail, these are the steps you need:

Install the most recent version of llvm and clang from source

$ # install the packaged version of clang to compile llvm
$ sudo apt-get install llvm clang
$ ### -- the rest is probably optional -- ###
$ # download and extract llvm and clang
$ wget
$ wget
$ tar -xf llvm-3.1.src.tar.gz
$ tar -xf clang-3.1.src.tar.gz
$ mv llvm-3.1{.src,}
$ mv clang-3.1.src llvm-3.1/tools/clang
$ cd llvm-3.1
$ ./configure && make && sudo make install
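
With both a packaged and a source-built toolchain on the machine, it is easy to lose track of which clang and llvm-config are actually picked up. The following is a minimal sketch, not part of the original setup, that reports the first line of `--version` for each tool found on the PATH. The helper name tool_version is made up for this illustration, and it assumes a plain Python 3 interpreter rather than Sage's Python.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the first line of `<command> --version`, or None if unavailable."""
    path = shutil.which(command)
    if path is None:
        # The tool is not on the PATH at all.
        return None
    try:
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # The tool exists but could not be run (or timed out).
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    for tool in ("clang", "clang++", "llvm-config"):
        print("%-12s %s" % (tool, tool_version(tool) or "not found on PATH"))
```

If the reported versions are not the ones you just built, the packaged binaries are probably earlier on the PATH than the freshly installed ones.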

Install the most recent version of pocl from source

$ bzr branch lp:pocl
$ cd pocl
$ ### --- if you have installed llvm from tarball, you need this ---
$ patch -p0 <<"EOF"
--- tools/llvm-ld/ 2012-08-22 11:35:13 +0000
+++ tools/llvm-ld/ 2012-09-08 18:11:01 +0000
@@ -1,6 +1,6 @@
 bin_PROGRAMS = pocl-llvm-ld
 pocl_llvm_ld_SOURCES = llvm-ld.cpp Optimize.cpp
-pocl_llvm_ld_LDFLAGS = `@LLVM_CONFIG@ --ldflags` -lLLVM-`@LLVM_CONFIG@ --version`
+pocl_llvm_ld_LDFLAGS = `@LLVM_CONFIG@ --ldflags` `@LLVM_CONFIG@ --libs`
 pocl_llvm_ld_CXXFLAGS = `@LLVM_CONFIG@ --cxxflags`
EOF
$ # pocl needs to be compiled with clang (I haven't checked whether it's
$ # clang or clang++ that matters). Also, for some reason, -ldl isn't
$ # properly picked up from llvm-config --ldflags, so I'm adding it, too
$ CC=clang CXX=clang++ LDFLAGS="-ldl" ./configure && make && sudo make install
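
Before moving on to pyopencl, it can help to check whether the dynamic linker sees an OpenCL library at all. This is a rough sketch using only the standard library. The library names "OpenCL" and "pocl" are assumptions about what ends up installed on a given system, and find_opencl_libraries is a hypothetical helper, not something pocl provides.

```python
import ctypes.util


def find_opencl_libraries(names=("OpenCL", "pocl")):
    """Map each library name to what the dynamic linker finds, or None."""
    # find_library returns a file name such as 'libOpenCL.so.1' when the
    # library is known to the linker, and None otherwise.
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in find_opencl_libraries().items():
        print("%-8s %s" % (name, found or "not found"))
```

If everything shows up as not found right after `sudo make install`, refreshing the linker cache with `sudo ldconfig` may be worth a try before suspecting the build itself.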

Install the most recent version of pyopencl from source

In fact, I wanted to use pyopencl inside Sage, not just in plain Python, so here's what I did.

$ wget
$ tar -xf pyopencl-2012.1
$ cd pyopencl-2012.1
$ # pocl doesn't have a complete implementation of the API, so we
$ # restrict to OpenCL 1.1
$ ./configure.py --cl-pretend-version=1.1
$ sage -python setup.py build
$ sage -python setup.py install
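
Once the install step has finished, a quick way to see whether pyopencl finds the software implementation is to list the platforms and devices it reports. The sketch below assumes nothing beyond pyopencl's get_platforms, get_devices, and the name and version attributes. The function describe_opencl_platforms is a name invented for this example, and it returns an empty list instead of failing when pyopencl or a platform is missing.

```python
def describe_opencl_platforms():
    """Return one line per (platform, device); empty list if nothing usable."""
    try:
        import pyopencl as cl
    except ImportError:
        # pyopencl is not installed for this interpreter.
        return []
    lines = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                lines.append(
                    "%s (%s): %s" % (platform.name, platform.version, device.name)
                )
    except cl.Error:
        # No platform or no device could be queried.
        return []
    return lines


if __name__ == "__main__":
    found = describe_opencl_platforms()
    print("\n".join(found) if found else "no OpenCL platform/device found")
```

Run it with the same interpreter that pyopencl was installed into (here, `sage -python`), since a system Python will not see a module installed inside Sage.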

Run the example program for pyopencl

$ sage
sage: import pyopencl as cl
sage: import numpy
sage: import numpy.linalg as la

sage: a = numpy.random.rand(50000).astype(numpy.float32)
sage: b = numpy.random.rand(50000).astype(numpy.float32)

sage: ctx = cl.create_some_context()
sage: queue = cl.CommandQueue(ctx)

sage: mf = cl.mem_flags
sage: a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
sage: b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
sage: dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)

sage: # don't mind the pocl warning.
sage: # when copy-pasting, don't forget to remove the leading ....:
sage: prg = cl.Program(ctx, """
....:    __kernel void sum(__global const float *a,
....:    __global const float *b, __global float *c)
....:    {
....:      int gid = get_global_id(0);
....:      c[gid] = a[gid] + b[gid];
....:    }
....:    """).build()
pocl warning: encountered incomplete implementation in clBuildProgram.c:56
sage: prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)
<pyopencl._cl.Event object at 0x553c600>
sage: a_plus_b = numpy.empty_like(a)
sage: cl.enqueue_copy(queue, a_plus_b, dest_buf)
<pyopencl._cl.NannyEvent object at 0x553c830>
sage: print(la.norm(a_plus_b - (a+b)))
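
The last line prints the Euclidean norm of the difference between the result computed through OpenCL and the same sum computed by numpy, so a value at or very near zero suggests the kernel did its job. To make clear what the kernel computes, here is a plain-Python sketch of the same logic, with one loop iteration standing in for each work item. The function names are made up for this illustration, and nothing here touches OpenCL.

```python
import math


def emulate_sum_kernel(a, b):
    """Mimic the `sum` kernel: work item gid writes c[gid] = a[gid] + b[gid]."""
    c = [0.0] * len(a)
    for gid in range(len(a)):  # each iteration plays the role of one work item
        c[gid] = a[gid] + b[gid]
    return c


def norm_of_difference(x, y):
    """Euclidean norm of x - y, the quantity printed in the session above."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [0.5, 0.25, 0.125]
    c = emulate_sum_kernel(a, b)
    print(c)
    print(norm_of_difference(c, [x + y for x, y in zip(a, b)]))
```

On the real device the iterations run as independent work items, which is why the kernel body only ever touches index gid.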