Parallel processing

BigRoy · October 12, 2015, 2:16pm

How do you see this relating to pointcaches and how we’re currently tackling them in Magenta? There, every asset goes into its own Alembic file and as such is its own instance, correct?

In most real production scenes this would result in a great many instances (similar to @tokejepsen’s number). When is something “combined”?

tokejepsen · October 12, 2015, 2:18pm

It is probably a workflow problem. The main reason for having all the meshes as individual instances was for the artist to decide on what to publish through the UI, but I’m unsure how much that is being used.
I’ll look at the combined solution instead.

marcus · October 12, 2015, 2:33pm

I suppose it comes down to what you expect to change individually.

If you have 300+ things in a scene where you expect each to change in isolation - e.g. for each to have their own versions - you are doing something wrong.

Determine what changes together and what changes separately. For example, a character Instance might consist of both a body mesh, and a clothing mesh, and perhaps no need for them to be treated separately.

Do you mean in The Deal? I don’t have it in front of me at the moment, but doesn’t it have about 5-10 instances at most? 5-10 seems reasonable I think, but again, it comes down to workflow and what you expect to change individually.
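
A rough sketch of "determine what changes together", written in plain Python rather than against the Pyblish API. The `<asset>_<part>` naming convention and the `group_by_asset` helper are assumptions made purely for illustration; a real pipeline would group on whatever metadata it already has.

```python
from collections import defaultdict


def group_by_asset(mesh_names):
    """Group mesh names into one instance per asset.

    Assumes a hypothetical "<asset>_<part>" naming convention.
    """
    instances = defaultdict(list)
    for name in mesh_names:
        # Everything before the first underscore is treated as the asset.
        asset = name.split("_", 1)[0]
        instances[asset].append(name)
    return dict(instances)


meshes = ["hero_body", "hero_clothing", "table_top", "table_legs", "mug_body"]
instances = group_by_asset(meshes)
for asset, members in sorted(instances.items()):
    print(asset, members)
```

Here five meshes collapse into three instances, because the body and clothing of one character are expected to be versioned together.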

BigRoy · October 12, 2015, 3:52pm

Yes, I was referring to The Deal, where things are considered separate assets and you always wanted the pieces to be separate. Seeing you aim for around 10-20 instances in total makes much more sense to me, but it seems very different from what was being done in The Deal, since it’s set up so that every single asset is pointcached to its own file from animation. (And I always considered real productions to have many more assets in a scene than just a table, two mugs and two characters.) Anyway… 10-20 instances sounds about right.

tokejepsen · March 14, 2016, 4:04pm

Running into this issue again on a shot. It is a very heavy shot, maybe the heaviest on the show, so I don’t think it’s an immediate concern, but I also don’t think it’s the last time I’ll encounter a shot with hundreds of instances.
Has anyone else encountered production shots with this many instances?

The issue is with renderlayers in Maya. We have an asset that uses 6+ renderlayers, and there are 40 versions of the asset in a shot. As we have tested before, the Pyblish GUI becomes quite sluggish, and processing slows down, when there are a lot of instances. But even when Pyblish only has to process one of the many instances, it’s slow.
Do we have any idea where this slowdown could be coming from?

marcus · March 14, 2016, 4:15pm

I’m not sure what might be causing the slowdown.

But let’s try and narrow it down. Is it as slow with pyblish.util.publish()?

tokejepsen · March 14, 2016, 5:51pm

It’s definitely something to do with pyblish-qml. I had ~4 secs per instance with the GUI, and I couldn’t even time pyblish.util.publish() correctly because it was so fast :)

marcus · March 14, 2016, 5:55pm

Ok.

I think the quickest and most succinct way to test this is via standard unit tests. It’s unlikely that the drawing itself is what takes time, and more likely something happening behind the scenes, and unit tests give us a reproducible way to measure that.

I would have a look at test_control.py to try and reproduce the slowdown via an as-small-as-possible script. From there we can start disabling things via monkeypatching to see which is the culprit.

Could you have a look at reproducing it with a unit test, @tokejepsen?
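
As a sketch of the monkeypatching idea: the snippet below temporarily wraps a function so that every call is timed, which helps show where the time goes before anything is disabled. It assumes Python 3 and uses `json.dumps` only as a stand-in target; `timed_patch` is a hypothetical helper, not part of Pyblish or its test suite.

```python
import json
import time
from unittest import mock


def timed_patch(target, name, timings):
    """Patch `target.name` with a wrapper that records each call's duration.

    Returns a context manager; the original attribute is restored on exit.
    """
    original = getattr(target, name)

    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            timings.append(time.perf_counter() - start)

    return mock.patch.object(target, name, wrapper)


timings = []
with timed_patch(json, "dumps", timings):
    for index in range(100):
        json.dumps({"name": "temp%d" % index, "family": "instance"})

print("calls: %d, total: %.6f sec" % (len(timings), sum(timings)))
```

Swapping the wrapper for a function that does nothing would turn the same pattern into the "disable it and see if the slowdown goes away" test.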

tokejepsen · March 14, 2016, 5:57pm

Sure, I’ll have a look at it when I have time.

Could you or someone else just test this first?

import pymel.core
import pyblish.api
import pyblish.util
import pyblish_integration

# setup scene
for count in range(0, 500):
    pymel.core.spaceLocator()

# collecting locators
class CollectLocators(pyblish.api.Collector):
    
    def process(self, context):
        for node in pymel.core.ls(type='locator'):
            instance = context.create_instance(name=node.getParent().name())
            instance.set_data('family', value='locator')

# empty plugin
class EmptyPlugin(pyblish.api.Validator):
    
    families = ['locator']
    
    def process(self, instance):
        self.log.info(instance)
        return

pyblish.api.register_plugin(CollectLocators)
pyblish.api.register_plugin(EmptyPlugin)

# ~4 sec per instance
#pyblish_integration.show()

# can't time it because it's so fast
#pyblish.util.publish()

marcus · March 14, 2016, 8:29pm

Thanks, I can confirm that that is indeed really slow.

Looking in the process manager, I can also spot that Maya is under incredibly heavy load during this time, which suggests that the slowdown is likely the serialisation of context/instances to JSON, and probably not related to network traffic.
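
One way to check the serialisation theory in isolation is to time a JSON round trip of instance-like data outside of any host. This is only a sketch, assuming Python 3: the dictionary shape in `make_instances` is made up, and it does not reproduce what pyblish-qml actually serialises.

```python
import json
import time


def make_instances(count):
    """Build plain dicts standing in for serialised instances (hypothetical shape)."""
    return [
        {"name": "temp%d" % index, "data": {"family": "instance", "publish": True}}
        for index in range(count)
    ]


def time_serialisation(instances, repeats=10):
    """Return the average seconds taken by one JSON round trip of all instances."""
    start = time.perf_counter()
    for _ in range(repeats):
        json.loads(json.dumps(instances))
    return (time.perf_counter() - start) / repeats


instances = make_instances(500)
print("%.6f sec per round trip" % time_serialisation(instances))
```

If this number is tiny compared to the per-instance time seen in the GUI, the cost is more likely in how often the data is serialised than in a single serialisation.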

marcus · March 14, 2016, 8:30pm

By the way, that’s a really neat way of showing the GUI in a software-independent way. You could probably do away with the pymel calls, and just create empty instances altogether. That way we could test in multiple hosts.

tokejepsen · April 1, 2016, 3:50pm

Good idea

Here is a test where you can get the timing it takes to process as well:

import pyblish.api
import pyblish.util
import pyblish_integration
import time


# collecting instances
class CollectInstances(pyblish.api.ContextPlugin):
    
    order = pyblish.api.CollectorOrder
    
    def process(self, context):
        for count in range(0, 500):
            context.create_instance(name='temp' + str(count), family='instance')


# empty plugin
class EmptyPlugin(pyblish.api.InstancePlugin):

    order = pyblish.api.ValidatorOrder    
    families = ['instance']
    
    def process(self, instance):
        self.log.info(instance)
        
        if 'time' in instance.context.data:
            t = time.time() - instance.context.data['time']
            
            try:
                times = instance.context.data['times']
                times.append(t)
                self.log.info(sum(times)/len(times))
            except KeyError:
                # First measurement: no 'times' list stored yet
                instance.context.data['times'] = [t]
        
        instance.context.data['time'] = time.time()
        return

pyblish.api.deregister_all_plugins()
pyblish.api.register_plugin(CollectInstances)
pyblish.api.register_plugin(EmptyPlugin)

pyblish_integration.show()
#pyblish.util.publish()

Maya

pyblish_integration.show() > ~ 0.877218753099 sec
pyblish.util.publish() > ~ 0.00121442731731 sec

Nuke

pyblish_integration.show() > ~ 0.861935779589 sec
pyblish.util.publish() > ~ 0.00168537185761 sec

This was done with the latest optimisations.
Either the host hasn’t got anything to do with the slowdown, or it’s a common issue across multiple hosts.

marcus · April 3, 2016, 12:30pm

Thanks for running these, @tokejepsen.

I think it’s about time we dug into this and started optimising. Let me have a think about a minimal example to reproduce the slowdown. I think it should be a matter of setting up an XML-RPC server and client and sending data back and forth. That is currently the weakest link, even though it really shouldn’t be this weak.
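
A minimal sketch of such an XML-RPC round-trip test, using only the Python 3 standard library. The `echo` function, the payload shape and `measure_round_trips` are assumptions for illustration and are not how pyblish-qml sets up its own server.

```python
import threading
import time
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def echo(payload):
    """Return the payload unchanged, so only the transport cost is measured."""
    return payload


def measure_round_trips(payload, calls=20):
    """Return average seconds per XML-RPC call against a local echo server."""
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(echo)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()

    try:
        port = server.server_address[1]
        with ServerProxy("http://127.0.0.1:%d" % port) as proxy:
            start = time.perf_counter()
            for _ in range(calls):
                proxy.echo(payload)
            elapsed = time.perf_counter() - start
    finally:
        server.shutdown()
        server.server_close()

    return elapsed / calls


# Hypothetical payload: 500 small instance-like dictionaries.
payload = [{"name": "temp%d" % i, "family": "instance"} for i in range(500)]
print("%.6f sec per call" % measure_round_trips(payload))
```

Varying the payload size and the number of calls should show whether the cost scales with the amount of data per call or with the number of calls.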

tokejepsen · April 29, 2016, 12:59pm

Don’t know what you guys did, @marcus and @BigRoy, but I’m seeing an 8x speed increase publishing with the GUI!

Maya

pyblish_integration.show() > ~ 0.177625002464 sec

Nuke

pyblish_integration.show() > ~ 0.180383561409 sec

marcus · April 29, 2016, 1:06pm

We sped it up all right. Glad to hear it’s making a difference!

BigRoy · April 29, 2016, 1:21pm

Sweet! Definitely what we were looking for!