Archive for the 'main' Category

What is a KdTree?!

Saturday, September 4th, 2010

So what is a KdTree? There are heaps of explanations you can find on the web. Wikipedia’s explanation is ok -although kinda hard to read. Anyway I figured I’d give a quick overview of what its all about.

So a KdTree is a type of “spatial partitioning” data structure. You feed in a bunch of spatial data and the KdTree organizes it in a way that makes certain queries really fast. In particular nearest neighbour searches.

I originally started toying with this after trying to get my skin weight loading script a bit smarter when dealing with radically different geometries. Searching within a tolerance doesn’t work well if the geometries can be vastly different. After all, some verts could be right next to each other, while others have matches further away. Weighting the further points helps, but it can screw up the points that have more exact matches.

So how do we solve this? Well I use an algorithm that finds the closest point, grabs the distance to this closest point and given a ratio looks for others within this radius. You give the search function a factor – it defaults to 2 – and it sees if there are any other points within that radius. So basically the tolerance is dynamic, not fixed. So if the closest point is very close, the search radius used is very small. However if the closest point is very far away, then all points within a much larger radius get considered. It turns out this works really well when transferring vertex data like skin weighting between radically different geometries.

Anyway, this is totally possible to do using the binary search, but because we’re now searching twice (finds the closest then the tolerance search) we now need every single bit of speed we can extract – hence my delving into faster ways of searching. Of course, I should probably be doing this all in c++, but… Maybe next.

Let me finish off with some numbers. On my 3 year old, dual core 2Ghz laptop, given a 10k point KdTree, I can run 10k queries on the data in about 2.2 seconds given similar inputs, or if the data is vastly different it takes about 7 seconds. As you can see, the early out test I wrote about in the previous post can make the code up to 4 times faster. By contrast, the binary search algorithm takes about 25 seconds.

Building classes is slow

Friday, September 3rd, 2010

So I’ve been trying to write a really speedy KdTree class the last couple of nights. I think I have it as fast as I think I can make it in python (although I’d be pretty happy if you could prove me wrong), but anyway I made an unusual discovery just now.

So one of the shortcuts I put in is detecting exact point matches. The closest point search needs to ultimately determine the squared distance to the query point. So I figured I’d put in a test to see if that distance was zero, and because the search function is recursive, the easiest way to bail on the recurse stack is to wrap the top entry point in a try:except block, and throw a known exception when an exact match is detected. All good.

But I was being a bit lazy and was declaring the class within the closest point function. Ie the code looked something like this:

def getClosest( self, queryPoint ):
	class ExactMatch(Exception): pass
	def search( node, depth ):
		...
		if dist == 0: raise ExactMatch  #short circuit test
		...

	try:
		search( self.rootNode, 0 )
	except ExactMatch: pass
	...

You get the idea. Anyway, so I use this for matching two pieces of arbitrary but similar geometry – so getClosest gets called heaps. Like 5-20k times. Adding this test saved a heap of time in the cases where lots of points were the same, but it also seemed to slow the case where none of the points were the same a lot. Way more than I expected. Or so I thought.

So after a bit more investigation I realized it wasn’t the extra test I had added, it was the ExactMatch class declaration that was doing it.

Lo and behold, declaring classes is SUPER slow. Like really really slow. You should try it out. Go on – here is some example code

import time
s=time.clock()
for n in range( 1000000 ):
	class ExactMatch(Exception): pass
print '%0.3g' % (time.clock()-s)

Now try running that same code and instead of declaring a class, instantiate a list or something. On my machine its around 30 times slower to declare this empty class than it is to instantiate a list.

Anyway – hardly a big deal, but something to keep in mind.

NOTE: feel free to use the kdtree code above, but you’ll either need to either use the Vector class here, provide a get_magnitude method on your own Vector class, or tweak my code to get rid of the need for a get_magnitude call all together. Its only used in a single place, but I’m lazy and couldn’t be bothered writing the extra line or two of code that it needs to be fully independent. ;)

Oh, the code also assumes that vector points in the tree are indexable. Ie you can get at the “X” component of a point using someVector[0]. My Vector class inherits directly from list because accessing vector members this way is way faster than having to define an attribute or worse, property.

Maya UI Wrappers

Wednesday, August 25th, 2010

Ever since I first started writing python code in maya I’ve been slowly building up my own UI wrappers.  I’m certainly not the only one.  Chad has done it with pymel, Byron has done it with MRV, and I’m sure there are a bunch of others.  But of course, I wasn’t using pymel back then, nor did I know anything about MRV.

The good thing about my implementation however, is that its really small and self contained.  Its a single script.  Its fairly gratuitously commented, and there is a page with some tips, tricks and general high level discussion about it here.  It a nutshell it tries to make UI authoring more like writing UI code in WX or QT.

Anyway, for those interested take a look at it here.  Enjoy!

Oh. Duh.

Friday, August 20th, 2010

On the new vs init thing…  Duh.  Multiple inheritance.  Separating new and init mean you can leverage more instance setup when doing multiple inheritance.  I guess I generally don’t use multiple inheritance…  But good to know.

[edit:] to be more explicit – new can only be called once because it is responsible for actual instance creation.  new does the actual object creation whereas init just takes the new instance and initializes it.  Initialization methods on multiple parent classes can both do complementary things on the same instance, so being able to split out initialization from construction makes a lot of things possible if you’re working with multiple parent classes.  Its not as useful with single inheritance – arguably it makes things more confusing.  But multiple inheritance its a godsend.

python’s new and init

Wednesday, August 11th, 2010

I’ve never realized this before, but the **kw that gets passed to the __new__ method is NOT the same dictionary thats given to __init__.  Ie if you want to mess with the **kw dict in the __new__ method for some reason, you can’t because python gives __init__ a new copy of the original **kw dict passed in.

Seems kinda weird to me.  Any python gurus out there that know why this is?  I’ve never really understood why python has both a __new__ and an __init__.  Why not just ditch one of them and be done with it?

auto complete in maya’s script editor

Tuesday, August 10th, 2010

I just discovered this.

I never realized how badly you could screw up an auto completer.  Do these people use their own software?!

Kahn Academy FTW!

Tuesday, August 3rd, 2010

This is too awesome not to share – the Kahn Academy is some dude just spewing out heaps of video tutorials covering a wide variety of subjects.  Most interestingly to me at the moment, he covers a heap of linear algebra.

I took linear algebra in school, but it didn’t seem at all applicable to life at the time so its since been purged from my brain.  My lack of 3d math skills has always been one of my weaknesses, and I try to rectify that whenever the opportunity arises.  Anyway this guy has some videos that have helped, and I figured others would probably benefit as well.

Finding Dependencies

Tuesday, July 27th, 2010

For some time now I’ve been almost the only one really writing python tools at work. But that has started to change recently.  So I’ve been becoming more and more concerned about the lack of formalization in the existing code base. There aren’t really any unit-tests, and there hasn’t been any way of querying what other tools depend on the one you’re changing.  Which of course means its easy to break things and not even know it.

So last week I decided to bite the bullet and learn what I could about these sorts of things.  I started off with the dependency problem because testing seemed useless without an understanding of what to test.

A quick google search pointed me to this page. Not quite what I was after, but it was a fantastic piece of code to learn just how all encompassing the python standard library is. To save you having to look at the code and trawl through it yourself, this is basically what it does.

In the python standard library is a module called moduleFinder. What it does is it takes a module, finds where it exists on disk, opens the file and reads it in. This is important, at this point it has not imported the code – ie the code has not been executed. It has simply loaded the file and read its contents into memory. Then it takes this and compiles it into a code object using the compile builtin function.

Now the next bit is cool – it takes the code object, which contains the bytecode for the python code that was just compiled, and iterates over all the instructions looking for various commands – most importantly import commands. So its pretty robust, because its using python to reverse engineer its own code and walk through the dependencies.

Because the code isn’t being executed you can query dependencies for maya tools without having to run the script from inside maya.  So if I make a change to say my vectors module (which is a standalone python module) the dependency query will still be able to list all the maya scripts that use this module.  This is obviously important if you’re in a studio where you have a significant portion of your python codebase outside maya.

Surprisingly this is super fast too, because the bytecode is literally just a stream of bytes. So its just integer comparisons when iterating over the bytecode. For the 320 scripts in one of our branches it takes 20 seconds to scan.

Now 20 seconds is still a drag, so I wrote a simple caching mechanism. The cache basically stores a crc for each file, and that file’s immediate downstream dependencies. When the cache is loaded all entries are checked to make sure they still exist on disk, and the crc’s are matched. Those that have changed get re-scanned and the cache is updated. With the caching, even a huge change that touches many files only takes a second or two to make a query against. Which is super cool.

Now all I have to do is figure out how to associate a set of unit tests with each script and then the dependency query could be made to run the downstream scripts potentially affected to ensure the change is sound. That would be cool.

So anyway, if you’re thinking about writing tools to query dependencies, do it – its really easy, and the link above gives you a great place to start.

Winning The Perf War

Wednesday, July 21st, 2010

So one super convenient thing pymel gives you is a name independent way of tracking nodes in your code. So doing this:

from maya.cmds import *
o = group( em=True )
rename( o, 'something_else' )
select( o )

This will fail because the o variable is just the name of the group when it was originally created, and the name changes before the select command is run. Now obviously the above is a trivial case and can be easily solved, but sometimes its not so easy.

For example, if you have a group with a non-unique leaf name that is nested deeply in the hierarchy, just by re-parenting one of its parents, the node will change. Tracking this changed name is non-trivial, and is the sort of problem you run into every now and then when writing generalized tools for rigging, or just general scene construction.

Anyhoo, so poking around a bit I experimented with adding a few methods to MObject, kinda like this:

from maya.OpenMaya import *
def asMObject( otherMobject=None ):
	'''
	tries to cast the given obj to an mobject - it can be string
	'''
	if otherMobject is None:
		return MObject()

	if isinstance( otherMobject, basestring ):
		sel = MSelectionList()
		sel.add( otherMobject )

		tmp = MObject()
		sel.getDependNode( 0, tmp )
		#tmp.cast()

		return tmp

def partialPathName( self ):
	'''
	returns the partial name of the node
	'''
	if self.hasFn( MFn.kDagNode ):
		dagPath = MDagPath()
		MDagPath.getAPathTo( self, dagPath )

		return dagPath.partialPathName()

	return MFnDependencyNode( self ).name()

MObject.__str__ = partialPathName
MObject.__unicode__ = partialPathName
MObject.partialPathName = partialPathName
MObject.name = partialPathName

def _hash( self ):
	return MObjectHandle( self ).hashCode()

MObject.__hash__ = _hash

def cmpNodes( a, b ):
	'''
	compares two nodes and returns whether they're equivalent or not.
	the compare is done on MObjects not strings so passing the fullname
	and a partial name to the same dag will still return True
	'''
	if not isinstance( a, MObject ):
		a = asMObject( a )

	if not isinstance( b, MObject ):
		b = asMObject( b )

	return a.name() == b.name()

Now this is hardly rocket science here, but it does go along way toward making MObjects useful in just your average script, and more to the point it solves the problem discussed above.

How? Well, first off, its easy to cast a node name to an MObject with the asMObject function. More importantly, you can still pass the variable to a normal mel command. ie this works:

from maya.cmds import *
o = asMObject( group( em=True ) )
rename( o, 'something_else' )
select( o )

Plus you can compare two nodes without having to worry about whether one is a full path to the node or a partial path.

This code can be packaged up into a simple little module that you can import at will – simply by importing the script you get the functionality added to the MObject class, so all you need to do really is add this to the top of your scripts:

from apiExtensions import asMObject

And wham – you get the rest for free.

Casting to these MObjects is still slow – there’s no way around that. Thats just maya being a slut. But you can easily target which parts of your script to add this functionality to without having to change your entire code base to use something like pymel or MRV. Plus you can also pass this MObject to the api class – again without having to wrap anything.

I’m smelling win.

Thoughts?

Not to be repetitive

Tuesday, July 20th, 2010

But pymel really is horribly slow. Perhaps I’m doing something wrong – although if I am I’m pretty sure I’m not alone. But I just spent a few hours tearing it out of my rigging tool. There is still a long way to go, but all of a sudden the tool is unbelievably faster. Like to the point where I no longer have to spend time trying to optimize the tool.

So be warned people! Pymel might make for more readable code, but there is a huge cost. HUGE.