::Finding Dependencies::

July 27th, 2010 by hamish

For some time now I’ve been almost the only one really writing python tools at work, but that has started to change recently, so I’ve become more and more concerned about the lack of formalization in the existing code base. There aren’t really any unit tests, and there hasn’t been any way of querying what other tools depend on the one you’re changing, which of course means it’s easy to break things and not even know it.

So last week I decided to bite the bullet and learn what I could about these sorts of things.  I started off with the dependency problem because testing seemed useless without an understanding of what to test.

A quick google search pointed me to this page. It was not quite what I was after, but it was a fantastic piece of code for learning just how all-encompassing the python standard library is. To save you having to trawl through the code yourself, this is basically what it does.

In the python standard library is a module called modulefinder. It takes a module, finds where it exists on disk, opens the file and reads it in. This is important: at this point it has not imported the code, i.e. the code has not been executed. It has simply read the file’s contents into memory. Then it compiles that source into a code object using the compile builtin function.
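As a rough illustration of the "compile but never execute" step, here is a minimal sketch. The helper name compile_file and the file some_maya_tool.py are made up for the example; only open and the compile builtin are doing the work, and this is not the code from modulefinder itself.

```python
import os
import tempfile


def compile_file(path):
    """Read a Python source file and compile it to a code object without running it."""
    with open(path, "rb") as f:
        source = f.read()
    return compile(source, path, "exec")


# Hypothetical tool: its body would fail outside maya if it were ever executed.
SOURCE = "import maya.cmds\nraise RuntimeError('only raised if the module runs')\n"

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "some_maya_tool.py")
    with open(script, "w") as f:
        f.write(SOURCE)
    code = compile_file(script)  # no ImportError and no RuntimeError here

print(type(code).__name__)  # code
```

Neither the import of maya.cmds nor the raise statement fires, because the source is only turned into bytecode.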

Now the next bit is cool: it takes the code object, which contains the bytecode for the python code that was just compiled, and iterates over all the instructions looking for various commands, most importantly import commands. So it’s pretty robust, because it’s using python to reverse engineer its own code and walk through the dependencies.
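The bytecode walk can be sketched like this. Note the swap: this version uses the dis module's instruction iterator from Python 3 rather than comparing raw bytes, so it shows the idea and not the original implementation. The function name find_imports and the sample source are assumptions for the example.

```python
import dis
import types


def find_imports(code):
    """Return the set of module names imported anywhere in a code object."""
    found = set()
    for instr in dis.get_instructions(code):
        if instr.opname == "IMPORT_NAME":
            found.add(instr.argval)
    # Function and class bodies are nested code objects stored in co_consts,
    # so recurse into them to catch imports that are not at module level.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            found |= find_imports(const)
    return found


# Hypothetical source; compiling it does not require vectors or maya to exist.
SOURCE = (
    "import os\n"
    "from vectors import Vector\n"
    "def build():\n"
    "    import maya.cmds as cmds\n"
)
imports = find_imports(compile(SOURCE, "<example>", "exec"))
print(sorted(imports))  # ['maya.cmds', 'os', 'vectors']
```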

Because the code isn’t being executed you can query dependencies for maya tools without having to run the script from inside maya. So if I make a change to, say, my vectors module (which is a standalone python module) the dependency query will still be able to list all the maya scripts that use this module. This is obviously important if you’re in a studio where you have a significant portion of your python codebase outside maya.
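Here is a small sketch of that kind of query using the standard library's ModuleFinder class directly. The file names (vectors.py, rig_tool.py, other_tool.py) and the helper imports_of are invented for the example, and it assumes maya is not importable where it runs, which is the whole point.

```python
import os
import tempfile
from modulefinder import ModuleFinder


def imports_of(script_path, search_path):
    """Scan one script with ModuleFinder; return (found, missing) module names."""
    finder = ModuleFinder(path=search_path)
    finder.run_script(script_path)  # analyses bytecode; the script is not run
    found = set(finder.modules) - {"__main__"}
    missing = set(finder.badmodules)  # imports that could not be located
    return found, missing


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical code base: one standalone module and two maya scripts.
    files = {
        "vectors.py": "class Vector(object):\n    pass\n",
        "rig_tool.py": "import maya.cmds\nimport vectors\n",
        "other_tool.py": "import maya.cmds\n",
    }
    for name, source in files.items():
        with open(os.path.join(tmp, name), "w") as f:
            f.write(source)

    users_of_vectors = []
    for name in ("rig_tool.py", "other_tool.py"):
        found, missing = imports_of(os.path.join(tmp, name), [tmp])
        if "vectors" in found:
            users_of_vectors.append(name)

print(users_of_vectors)
```

The maya imports simply end up in the missing set instead of raising, so the scan works from a plain python interpreter.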

Surprisingly this is super fast too, because the bytecode is literally just a stream of bytes, so it’s just integer comparisons when iterating over it. For the 320 scripts in one of our branches it takes 20 seconds to scan.

Now 20 seconds is still a drag, so I wrote a simple caching mechanism. The cache basically stores a CRC for each file, and that file’s immediate downstream dependencies. When the cache is loaded all entries are checked to make sure they still exist on disk, and the CRCs are compared against the files. Those that have changed get re-scanned and the cache is updated. With the caching, even a huge change that touches many files only takes a second or two to make a query against. Which is super cool.
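A cache along those lines might look roughly like the sketch below. The cache layout (path mapped to a crc and a deps list), the function names, and the fake_scan stand-in are all assumptions made for the example, not the actual tool.

```python
import os
import tempfile
import zlib


def file_crc(path):
    """CRC32 of a file's contents, masked to an unsigned 32-bit value."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read()) & 0xFFFFFFFF


def refresh_cache(cache, paths, scan):
    """Bring cache up to date in place and return the paths that were rescanned.

    cache maps path -> {"crc": int, "deps": [names]}; scan is any callable
    that takes a path and returns the names that file depends on.
    """
    # Drop entries whose files no longer exist on disk.
    for path in list(cache):
        if not os.path.exists(path):
            del cache[path]
    rescanned = []
    for path in paths:
        crc = file_crc(path)
        entry = cache.get(path)
        if entry is None or entry["crc"] != crc:
            cache[path] = {"crc": crc, "deps": sorted(scan(path))}
            rescanned.append(path)
    return rescanned


def fake_scan(path):
    """Stand-in for a real bytecode scan; always reports one dependency."""
    return ["vectors"]


with tempfile.TemporaryDirectory() as tmp:
    tool = os.path.join(tmp, "tool.py")
    with open(tool, "w") as f:
        f.write("import vectors\n")
    cache = {}
    first = refresh_cache(cache, [tool], fake_scan)   # new file: scanned
    second = refresh_cache(cache, [tool], fake_scan)  # unchanged: cache hit
    with open(tool, "w") as f:
        f.write("import vectors\nimport os\n")
    third = refresh_cache(cache, [tool], fake_scan)   # CRC changed: rescanned
    os.remove(tool)
    fourth = refresh_cache(cache, [], fake_scan)      # deleted: entry dropped
```

Since the cache is a plain dict of ints and lists of strings, it could be written to disk with json between runs.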

Now all I have to do is figure out how to associate a set of unit tests with each script and then the dependency query could be made to run the downstream scripts potentially affected to ensure the change is sound. That would be cool.
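Picking which scripts to re-test comes down to walking the import graph backwards from the changed module. A minimal sketch, assuming the scan results are already available as a dict from module name to its direct imports (the module names here are made up):

```python
from collections import deque


def downstream_of(changed, imports):
    """Return every module that directly or indirectly imports `changed`."""
    # Invert the graph: for each module, who imports it?
    importers = {}
    for module, deps in imports.items():
        for dep in deps:
            importers.setdefault(dep, set()).add(module)

    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in importers.get(current, ()):
            if user not in affected:
                affected.add(user)
                queue.append(user)
    affected.discard(changed)  # only relevant when there is an import cycle
    return affected


# Hypothetical scan results: module -> modules it imports directly.
IMPORTS = {
    "vectors": [],
    "rigUtils": ["vectors"],
    "skinTool": ["rigUtils", "os"],
    "exporter": ["os"],
}
print(sorted(downstream_of("vectors", IMPORTS)))  # ['rigUtils', 'skinTool']
```

Each name in the result would then be mapped to whatever tests are associated with it.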

So anyway, if you’re thinking about writing tools to query dependencies, do it. It’s really easy, and the link above gives you a great place to start.

This post is public domain

This entry was posted on Tuesday, July 27th, 2010 at 20:31 and is filed under main.

  • Byron

    A very interesting post indeed! Usually I end up running all my unit tests before deploying, which usually reveals bugs that were not found before because the tests in question were just not run.
    A tool that figures out which modules depend on the module that was just changed, and runs their related tests, could help find related bugs much earlier in the development cycle. The guy who has to make a release will be most grateful too ;).

  • hamish

    Yeah it’s really worth doing, and so easy too. My solution, including the caching functionality and a bunch of other specific features that help with overriding code in superior project branches, is a tad over 300 lines of code. So very worth doing. It’s also cool because you can dump out deep dependency graphs really easily. So it can be useful for building up a mind map of how a piece of code is related to everything else.

  • varomix

    Hi
    Just wanted to point you to this in case you haven’t seen it: Pythonscope, for unit testing, http://www.disneyanimation.com/technology/opensource.html

    I’m not that advanced yet so I haven’t used it.

    I’m very interested to know if you finally overcame the pymel slowness problem, and how. An example would be awesome.

    Thank you

  • hamish

    Nice. Yeah certainly looks helpful, but it seems to be just a tool that generates the skeleton code for the test. Which is certainly helpful, but the hard part is actually writing the test.

    I spent a bunch of time last week writing some unit tests for one of our maya tools. It took me about a day to write a reasonably comprehensive test for one of our simple tools. Just figuring out all the things that could go wrong, writing the code to test them, testing that test code, making sure things get cleaned up properly and reliably… It’s hard work.

    As for the wrapper code I’m writing, sure, I’ll post it when I’m done. It’s pretty usable right now, but its scope is pretty limited.