Feb 29 2012

64-bit Python Computing

Recently I ran into a memory problem running large point cloud arrays through Python and Numpy.  I quickly determined that I was asking Numpy to work on massive arrays that exceeded the memory available to a 32-bit Python process in Windows.  I came up with a workaround: subtract a constant offset from the UTM X-Y coordinates so that the remaining values are small enough to store as 32-bit floating point numbers (half the size of Numpy's default 64-bit floats) without losing precision, then add the offset back at the end.  Basically, I translated the X-Y coordinates (352845.49 4346713.91 --> 2845.49 6713.91), ran the computation, then translated back to the original UTM values.  This was OK, but I wanted to overcome the 32-bit limit in Python.
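The offset trick can be sketched roughly as follows.  This is a minimal illustration and not the original processing code: the function names and the fixed offsets (350000 and 4340000, chosen to match the example coordinates above) are assumptions.  The important detail is that the subtraction happens in 64-bit floats before the downcast; a 32-bit float carries only about 7 significant digits, so a raw UTM northing like 4346713.91 cannot be stored to the centimeter.

```python
import numpy as np

# Hypothetical offsets, chosen to match the example coordinates in the post.
OFFSET = np.array([350000.0, 4340000.0])

def to_local(utm_xy, offset=OFFSET):
    """Subtract the offset in float64, then downcast the small values to float32."""
    return (np.asarray(utm_xy, dtype=np.float64) - offset).astype(np.float32)

def to_utm(local_xy, offset=OFFSET):
    """Upcast to float64 and add the offset back to recover UTM coordinates."""
    return np.asarray(local_xy, dtype=np.float64) + offset

utm = np.array([[352845.49, 4346713.91]])
local = to_local(utm)          # float32, half the memory of float64
restored = to_utm(local)

# Error from casting the raw northing straight to float32, for comparison.
naive_error = abs(float(np.float32(4346713.91)) - 4346713.91)
roundtrip_error = np.abs(restored - utm).max()
print(local.dtype, roundtrip_error, naive_error)
```

The round trip through the offset values stays well under a millimeter, while the direct cast of the northing is off by several centimeters.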

There are unofficial 64-bit builds available here, but I wanted to try something established.  I got one of our machines running dual-boot Windows 7 / Ubuntu 11.10 64-bit and compiled all the Python, Scipy, and Numpy libs in Ubuntu.  Because they are compiled on a 64-bit OS, these builds are natively 64-bit, so the 32-bit address space limit no longer gets in the way of large arrays.  Back to work!
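A quick way to confirm that a given interpreter really is a 64-bit build is to look at its pointer size.  The helper names below are made up for illustration; the calls themselves are from the standard library.

```python
import struct
import sys

def pointer_bits():
    """Size of a C pointer in this interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8

def is_64bit():
    """True when the interpreter's native index type is wider than 32 bits."""
    return sys.maxsize > 2**32

print(pointer_bits(), is_64bit())
```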

Apr 07 2011

Open Source Terrain Processing

I am very excited by the current prospects of incorporating free, open-source terrain processing algorithms into our workflow.  While we are ultimately interested in studying the trees in our 3D scans, it is necessary to automatically derive a digital terrain model (DTM) that represents the ground below the canopy for the purpose of estimating tree height.

A recent paper in the open-access journal Remote Sensing (Tinkham et al. 2011) describes several freely available algorithms for terrain processing.  I am in the process of converting the entire ArcGIS workflow we used in our first paper (Dandois and Ellis 2010) into an automated Python workflow, and am excited about the prospect of incorporating other open-source algorithms into the mix.  Currently, by working with Numpy in Python, my processing code takes an input Ecosynth point cloud and applies two levels of ‘global’ and ‘local’ statistical filtering to remove outlier and noise elevation points in about a minute for 500,000 points.  This had previously taken hours with ArcGIS, but by formatting the data into arrays, Numpy effortlessly screams through all the points in no time.
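To give a feel for why arrays make this fast, here is a rough sketch of what a two-level filter can look like in Numpy.  It is not the actual Ecosynth code: the post does not spell out the filters, so the rules used here (drop points more than k standard deviations from the mean elevation, first over the whole cloud and then within each X-Y grid cell), the cell size, the threshold, and the function names are all assumptions.

```python
import numpy as np

def global_filter(points, k=3.0):
    """Keep points whose Z is within k standard deviations of the cloud-wide mean."""
    z = points[:, 2]
    return points[np.abs(z - z.mean()) <= k * z.std()]

def local_filter(points, cell=10.0, k=3.0):
    """Keep points whose Z is within k standard deviations of their grid cell's mean."""
    # Assign every point to a square X-Y cell (assumed cell size in map units).
    ix = np.floor((points[:, 0] - points[:, 0].min()) / cell).astype(np.int64)
    iy = np.floor((points[:, 1] - points[:, 1].min()) / cell).astype(np.int64)
    cell_id = ix * (iy.max() + 1) + iy
    _, inverse = np.unique(cell_id, return_inverse=True)

    # Per-cell mean and standard deviation without any Python loop.
    z = points[:, 2]
    counts = np.bincount(inverse)
    mean = np.bincount(inverse, weights=z) / counts
    dev = z - mean[inverse]
    std = np.sqrt(np.bincount(inverse, weights=dev**2) / counts)
    return points[np.abs(dev) <= k * std[inverse]]

# Synthetic stand-in for a point cloud: 1000 points near Z=100 plus one wild outlier.
rng = np.random.default_rng(0)
cloud = np.column_stack([
    rng.uniform(0, 100, 1000),
    rng.uniform(0, 100, 1000),
    rng.normal(100.0, 1.0, 1000),
])
cloud = np.vstack([cloud, [50.0, 50.0, 1000.0]])

after_global = global_filter(cloud)
after_local = local_filter(after_global)
print(len(cloud), len(after_global), len(after_local))
```

Both passes are plain array operations over all points at once, which is where the speedup over a point-by-point workflow comes from.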

I am going to focus on two pieces of software.  One is the Multiscale Curvature Classification algorithm (MCC-LIDAR) by Evans and Hudak, at sourceforge here, that was mentioned in the recent paper in Remote Sensing.  The other is the libLAS module for Python, included with OSGeo, that can be used to read and write the industry-standard LAS format for LiDAR data.  Fun, fun!  This of course is going on in the meantime while I try to get my proposal finished.


Dandois, J.P.; Ellis, E.C. Remote Sensing of Vegetation Structure Using Computer Vision. Remote Sens. 2010, 2, 1157-1176.

Tinkham, W.T.; Huang, H.; Smith, A.M.S.; Shrestha, R.; Falkowski, M.J.; Hudak, A.T.; Link, T.E.; Glenn, N.F.; Marks, D.G. A Comparison of Two Open Source LiDAR Surface Classification Algorithms. Remote Sens. 2011, 3, 638-649.