Building OpenKinect on Mac OS X Lion

OpenKinect is an open source suite for interfacing with the Xbox Kinect.  The instructions for building on OS X were a little vague, and the Python bindings didn’t build out of the box.  This is how I built it.  Building libfreenect is mostly the same as in the documentation; some paths differ, but it’s fairly obvious what is required.

  1. Download and install Xcode.  You first need to purchase Xcode from the App Store.  Once it has downloaded, you will find an app called Install Xcode in the Applications folder.  Run the installer, which installs the toolkit, compilers, IDE and so on.  You need this so you can compile packages.
  2. Install MacPorts.  Details on how to do so can be found on the MacPorts web site.
  3. Use port to install cython and numpy (the MacPorts port names for Python 2.7 use the py27- prefix):
    sudo port install py27-cython py27-numpy
  4. Use port to install git, libusb, cmake and libtool:
    sudo port install git-core
    sudo port install libusb-devel
    sudo port install cmake
    sudo port install libtool
  5. Download libfreenect using git.  This will create a new directory called libfreenect:
    git clone git://github.com/OpenKinect/libfreenect.git
  6. Change into this directory, create a new directory called build, and change into it:
    cd libfreenect
    mkdir build
    cd build
  7. Run ccmake.  This will pop up a curses-based menu: hit c to configure, change any paths that are required, then hit g to generate.
    ccmake ..
  8. Generate the makefile, then build and install libfreenect:
    cmake ..
    make
    make install

Now that libfreenect is installed, it’s time to build the Python bindings.  Change into the libfreenect/wrappers/python directory and build it like you would any other Python package.  You do need to make sure your library and include paths are set correctly, however.

  1. Change into the python wrappers dir.
    cd ../wrappers/python
  2. Modify setup.py as follows: add /opt/local/lib to the runtime_library_dirs array, add /opt/local/include and /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/include to the extra_compile_args array.
  3. Run python setup.py install as normal
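
The paths in step 2 all hang off the MacPorts prefix, so if your prefix or Python version differs it can help to derive them rather than type them out.  This is only a sketch: macports_build_paths and its dictionary keys are names I made up for illustration, and it assumes the default /opt/local prefix and the framework layout shown in step 2.  It just builds the strings; you still paste them into setup.py yourself.

```python
import posixpath


def macports_build_paths(prefix="/opt/local", py_version="2.7"):
    """Return the directories that step 2 adds to setup.py.

    Hypothetical helper: the key names are illustrative only.
    """
    site_packages = posixpath.join(
        prefix, "Library", "Frameworks", "Python.framework", "Versions",
        py_version, "lib", "python" + py_version, "site-packages")
    return {
        # Goes into the runtime_library_dirs array.
        "runtime_library_dirs": [posixpath.join(prefix, "lib")],
        # These go into the extra_compile_args array.
        "compile_include_paths": [
            posixpath.join(prefix, "include"),
            posixpath.join(site_packages, "numpy", "core", "include"),
        ],
    }


paths = macports_build_paths()
for key in sorted(paths):
    for directory in paths[key]:
        print("%s: %s" % (key, directory))
```

If numpy lives somewhere else on your machine, numpy.get_include() will report the include directory it actually uses.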

Test the installation by starting python and running import freenect.
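
If you would rather script the check, something like the following should do.  It is a minimal sketch: can_import is a helper name I made up, and all it tells you is whether the module loads, not whether a Kinect is attached or working.

```python
import importlib


def can_import(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Prints True only if the freenect bindings were built and installed
# for the python that is running this script.
print("freenect importable: %s" % can_import("freenect"))
```

If this prints False, check that you are running the same python that setup.py installed into.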

 

 

LSF Queue Statistics

Something that is rarely discussed when talking about cluster efficiency is the amount of CPU time that is wasted.  When I say wasted, I mean time spent on a job that runs, but does not run to completion.  CPU time can be wasted in a number of ways.  As an end user, I can let my job run for a week before realizing that my input file had the wrong value.  I cancel the job, but I’ve essentially wasted a week of CPU time without producing anything useful.

As an administrator, I can kill a job, either by accident or out of necessity, but assuming the job cannot checkpoint, that is still CPU time that was unproductive.

Hardware failures can also waste huge amounts of resources.  The aim of the game may be commodity computing, but a node failing 3 weeks into a 128-core job can really dent efficiency.  To highlight this, I wrote a script using the LSF accounting library.  It produces some stats that show, per queue and per user, how much time jobs have run for, based on their exit status.
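
To put a number on that 128-core example, here is the arithmetic as a quick sketch.  It assumes the job cannot checkpoint, so everything done before the failure is lost, and that all 128 cores were busy for the full 3 weeks.

```python
import datetime

cores = 128
# Wall-clock time the job had been running when the node failed.
elapsed = datetime.timedelta(weeks=3)

# CPU time is the wall-clock time multiplied by the number of cores.
lost = cores * elapsed
core_hours = lost.total_seconds() / 3600

print("Lost CPU time: %s" % lost)            # 2688 days, 0:00:00
print("Lost core-hours: %d" % core_hours)    # 64512
```

That is a little over seven years of CPU time gone because of a single node.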

Name: batchq
 Total Jobs:      20000
 Failed Jobs:     297
 Total Wait Time: 719 days, 8:56:32
 Total Wall Time: 407 days, 21:27:31
 Total CPU Time:  407 days, 21:27:31
 Total Terminated CPU Time: 3 days, 13:51:15
Name: joe
 Total Jobs:      4144
 Failed Jobs:     294
 Total Wait Time: 687 days, 20:27:17
 Total Wall Time: 388 days, 7:29:02
 Total CPU Time:  388 days, 7:29:02
 Total Terminated CPU Time: 3 days, 13:51:15
Name: fred
 Total Jobs:      3
 Failed Jobs:     2
 Total Wait Time: 5:46:34
 Total Wall Time: 0:10:56
 Total CPU Time:  0:10:56
 Total Terminated CPU Time: 0:00:00
Name: barney
 Total Jobs:      4
 Failed Jobs:     0
 Total Wait Time: 0:13:41
 Total Wall Time: 0:14:31
 Total CPU Time:  0:14:31
 Total Terminated CPU Time: 0:00:00
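
The raw totals are easier to judge as a ratio.  As a rough sketch, using only the batchq figures from the output above, the terminated share of CPU time works out like this:

```python
import datetime

# Figures copied from the batchq entry in the sample output.
total_cpu = datetime.timedelta(days=407, hours=21, minutes=27, seconds=31)
terminated_cpu = datetime.timedelta(days=3, hours=13, minutes=51, seconds=15)

# Fraction of the queue's CPU time that went to jobs that were terminated.
wasted_fraction = terminated_cpu.total_seconds() / total_cpu.total_seconds()

print("Wasted: %.2f%% of CPU time" % (100 * wasted_fraction))
```

For this queue that comes to a little under 1% of the total CPU time.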

The script works quite simply: it iterates through the accounting file, adding up the wait time, wall time and CPU time of each job, and checks the termination status of each job.  If the status is not zero, the job is added to the failed job count and its CPU time is added to the terminated CPU time.

import datetime

# AcctFile comes from the LSF Python tools, acctf is the path to the
# accounting file.

# Dictionary containing an entry for each queue, which is itself a dictionary
# containing the stats for the queue
qs={}
us={}

for i in AcctFile(acctf):
    # If the queue does not have an entry in the dictionary, then create
    # one now.
    if i.queue not in qs:
        qs[i.queue]={
                'name':i.queue,
                'numJobs':0,
                'numFJobs':0,
                'waitTime':datetime.timedelta(0),
                'runTime':datetime.timedelta(0),
                'wallTime':datetime.timedelta(0),
                'wasteTime':datetime.timedelta(0),
                }
    # Based on the queue, increment the timers and counters accordingly
    # increment the number of jobs
    qs[i.queue]['numJobs']+=1
    # Add the time the job had to wait before it was started
    qs[i.queue]['waitTime']+=i.waitTime
    # Work out the CPU time, this is the wall clock time multiplied by the
    # number of slots.
    qs[i.queue]['runTime']+=(i.numProcessors*i.runTime)
    # Add the wall clock time
    qs[i.queue]['wallTime']+=i.runTime
    # If the terminfo number is >0, then it was not a normal exit status.  Add
    # the cpu time to the wasted time.
    if i.termInfo.number>0:
        qs[i.queue]['wasteTime']+=(i.numProcessors*i.runTime)
        qs[i.queue]['numFJobs']+=1

Once the stats are gathered, it’s just a case of pretty printing them out:

# Print out a summary per queue.  The parenthesised form of print works
# under both Python 2 and Python 3.
for q in qs.values():
    print("Name: %s" % q['name'])
    print(" Total Jobs:      %d" % q['numJobs'])
    print(" Failed Jobs:     %d" % q['numFJobs'])
    print(" Total Wait Time: %s" % q['waitTime'])
    print(" Total Wall Time: %s" % q['wallTime'])
    print(" Total CPU Time:  %s" % q['runTime'])
    print(" Total Terminated CPU Time: %s" % q['wasteTime'])
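
If you want to play with the accumulation logic without an LSF accounting file to hand, here is a self-contained sketch of the same idea.  The Job record, the summarise function and the sample jobs are all made up for illustration.  They are not part of the LSF Python tools, and the real accounting entries carry many more fields.  The same function produces the per-queue and the per-user breakdown by swapping the key.

```python
import collections
import datetime

# Hypothetical stand-in for an accounting record.
Job = collections.namedtuple(
    "Job", "queue user numProcessors waitTime runTime termNumber")


def new_stats(name):
    """Return an empty set of counters and timers for one queue or user."""
    zero = datetime.timedelta(0)
    return {"name": name, "numJobs": 0, "numFJobs": 0, "waitTime": zero,
            "wallTime": zero, "runTime": zero, "wasteTime": zero}


def summarise(jobs, key):
    """Accumulate stats per value of key(job), e.g. per queue or per user."""
    stats = {}
    for job in jobs:
        s = stats.setdefault(key(job), new_stats(key(job)))
        # CPU time is the wall clock time multiplied by the number of slots.
        cpu = job.numProcessors * job.runTime
        s["numJobs"] += 1
        s["waitTime"] += job.waitTime
        s["wallTime"] += job.runTime
        s["runTime"] += cpu
        # A termination number above zero means the job did not exit normally.
        if job.termNumber > 0:
            s["wasteTime"] += cpu
            s["numFJobs"] += 1
    return stats


def minutes(n):
    return datetime.timedelta(minutes=n)


# Made-up sample data: two jobs for joe (one terminated), one for fred.
jobs = [
    Job("batchq", "joe", 4, minutes(5), minutes(60), 0),
    Job("batchq", "joe", 8, minutes(10), minutes(30), 1),
    Job("batchq", "fred", 1, minutes(1), minutes(10), 0),
]

by_queue = summarise(jobs, lambda j: j.queue)
by_user = summarise(jobs, lambda j: j.user)

print("batchq CPU time: %s" % by_queue["batchq"]["runTime"])
print("joe terminated CPU time: %s" % by_user["joe"]["wasteTime"])
```

Keying on a function rather than hard-coding i.queue is just a convenience for the sketch; keeping two separate dictionaries updated in the same loop does the same job.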

You can download jobStats.py directly, and you will also need the LSF Python tools.