It’s been touched upon before, but I thought I’d make a formal exploration into the topic of creating one Instance using multiple Collectors.
Goal
To simplify Collection.
Motivation
A goal of Pyblish is for validation to be as general and as encompassing as possible. Ultimately I’d like there to be a wide, global repository of validations that anyone can benefit from and that applies to everything from the most general to the most specific asset requirements, whilst still being technically compatible with any Instance.
To accomplish this, more responsibility must be delegated to Collection. It is the Collector’s job to map information from the complex per-studio asset into a format compatible with Pyblish.
Example
If an Instance is to be tested for height, a collector responsible for finding and storing height must be present. To avoid the same collector having to be modified each time a new validator appears, a new collector can be added to support it.

    class CollectHeight(pyblish.api.Collector):
        families = ["model", "rig"]

        def process(self, context):
            for instance_name in pipeline.ls():
                if instance_name not in context:
                    context.create_instance(instance_name)
                instance = context[instance_name]
                host.compute

Implementation
There are two major approaches to cooperative collection.
- Ordered
- Unordered
In a nutshell they are each other’s opposites; unordered favours independence and compatibility, whereas ordered favours less code and higher performance.
Ordered cooperative collection (OCC) is simple: CollectorB depends on CollectorA; i.e. CollectorA must process before CollectorB.

    import pyblish.api
    import pyblish.util

    class CollectorA(pyblish.api.Collector):
        order = pyblish.api.Collector.order + 0.0

        def process(self, context):
            my_instance = context.create_instance(
                name="MyInstance",
                family="MyFamily")
            my_instance.set_data("age", 12)

    class CollectorB(pyblish.api.Collector):
        order = pyblish.api.Collector.order + 0.1

        def process(self, context):
            # This would break unless A ran first
            my_instance = context["MyInstance"]
            my_instance.set_data("height", 1.12)

    # Register the classes defined above
    pyblish.api.register_plugin(CollectorA)
    pyblish.api.register_plugin(CollectorB)

    context = pyblish.util.publish()
    print(context["MyInstance"].data("height"))
    # 1.12

Conversely, unordered cooperative collection (UCC) means Collectors can run in any order and still produce identical results.

    import pyblish.api
    import pyblish.util

    class CollectorA(pyblish.api.Collector):
        def process(self, context):
            # Create the Instance only if no other collector has done so
            if "MyInstance" not in context:
                context.create_instance(
                    name="MyInstance",
                    family="MyFamily")
            my_instance = context["MyInstance"]
            my_instance.set_data("age", 12)

    class CollectorB(pyblish.api.Collector):
        def process(self, context):
            if "MyInstance" not in context:
                context.create_instance(
                    name="MyInstance",
                    family="MyFamily")
            my_instance = context["MyInstance"]
            my_instance.set_data("height", 1.12)

    pyblish.api.register_plugin(CollectorA)
    pyblish.api.register_plugin(CollectorB)

    context = pyblish.util.publish()
    print(context["MyInstance"].data("height"))
    # 1.12

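The claim that unordered collectors produce identical results in any order can be checked mechanically. The following is only a rough sketch that does not use Pyblish at all: the context is modelled as a plain dict, the collectors as plain functions, and the names collect_age, collect_height, run and is_order_independent are made up for illustration.

```python
import itertools


def collect_age(context):
    # Create the instance only if no other collector has done so yet.
    instance = context.setdefault("MyInstance", {"family": "MyFamily"})
    instance["age"] = 12


def collect_height(context):
    instance = context.setdefault("MyInstance", {"family": "MyFamily"})
    instance["height"] = 1.12


def run(collectors):
    """Run collectors in the given order against a fresh context."""
    context = {}
    for collector in collectors:
        collector(context)
    return context


def is_order_independent(collectors):
    """Return True if every possible ordering yields the same context."""
    results = [run(order) for order in itertools.permutations(collectors)]
    return all(result == results[0] for result in results)


print(is_order_independent([collect_age, collect_height]))
```

The number of permutations grows factorially, so a check like this is probably only practical for a handful of collectors at a time.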
Observations
Here are some observations of the approaches so far.
Ordered Pros
- The relationship between two or more collectors is clear; one must come before the other
- Subsequent collectors can communicate by passing data from one Instance to the other.
Ordered Cons
- Encourages tight coupling between collectors
- Difficult to re-use (as they depend on each other)
- Difficult to test (as they can’t run without each other)
Unordered Pros
- Mixable; any collector can be added to contribute to the final Instance without regard to what comes before it.
- Testable; without ordering, testing can happen in isolation
Unordered Cons
- More code; nothing can be assumed to exist, so everything must be queried before it is used.
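Some of that extra code could perhaps be factored into a shared helper. The sketch below is an assumption on my part, not something Pyblish provides: get_or_create is a hypothetical function, and StubContext and StubInstance are hypothetical stand-ins that only mimic the "in", indexing and create_instance behaviour used in the examples above, so that the snippet runs without Pyblish installed.

```python
class StubInstance(dict):
    """Hypothetical stand-in for an Instance; stores data like a dict."""


class StubContext(dict):
    """Hypothetical stand-in for a Context, keyed by instance name."""

    def create_instance(self, name, **data):
        instance = StubInstance(data)
        self[name] = instance
        return instance


def get_or_create(context, name, **data):
    """Return the named instance, creating it only if it is missing."""
    if name not in context:
        context.create_instance(name, **data)
    return context[name]


context = StubContext()

# Two independent "collectors" contributing to the same instance.
get_or_create(context, "MyInstance", family="MyFamily")["age"] = 12
get_or_create(context, "MyInstance", family="MyFamily")["height"] = 1.12

print(len(context), context["MyInstance"])
```

With something like this, each unordered collector shrinks to one line for fetching the Instance plus whatever data it contributes.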
It would seem that from a long-term perspective, where validations are written not by a single developer but need to be interchangeable with those of others, UCC is favourable.
UCC enables unknown validations to be plugged into an existing plug-in stack and append to existing Instances without disrupting prior collectors.
OCC danger
When you couple collectors by their order, you must take care when modifying data. This is similar to the typical multi-process problem known as a “race condition”.

    import pyblish.api
    import pyblish.util

    class CollectorA(pyblish.api.Collector):
        order = pyblish.api.Collector.order + 0.0

        def process(self, context):
            my_instance = context.create_instance(
                name="MyInstance",
                family="MyFamily")
            my_instance.set_data("members", [1])

    class CollectorB(pyblish.api.Collector):
        order = pyblish.api.Collector.order + 0.1

        def process(self, context):
            my_instance = context["MyInstance"]
            my_instance.data("members").append(2)

    class CollectorC(pyblish.api.Collector):
        order = pyblish.api.Collector.order + 0.2

        def process(self, context):
            my_instance = context["MyInstance"]
            my_instance.data("members").append(3)

    pyblish.api.register_plugin(CollectorA)
    pyblish.api.register_plugin(CollectorB)
    pyblish.api.register_plugin(CollectorC)

    context = pyblish.util.publish()
    print(context["MyInstance"].data("members"))
    # [1, 2, 3]

From here, you can build upon your knowledge that members will always be a list of incremented numbers. The problem then is when an external or unknown collector is introduced.

    class CollectorAB(pyblish.api.Collector):
        # Runs after CollectorB (0.1) but before CollectorC (0.2)
        order = pyblish.api.Collector.order + 0.15

        def process(self, context):
            my_instance = context["MyInstance"]
            my_instance.data("members").append(0.5)

The resulting members now includes a floating point number at an unexpected position.

    # [1, 2, 0.5, 3]

Hence there is no way to guarantee the value of members unless you first gain full insight into each collector added to your stack, which can be difficult if validations come from elsewhere and are unknown to you.
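One possible way around this, assuming the position of each member carries no meaning, is to have collectors add to a set and to sort only when the data is read. This is just a sketch of the idea in plain Python, not an existing Pyblish feature; the Instance is modelled as a dict and all function names are made up for illustration.

```python
def collector_a(instance):
    # setdefault means no collector relies on another having run first.
    instance.setdefault("members", set()).add(1)


def collector_b(instance):
    instance.setdefault("members", set()).add(2)


def collector_c(instance):
    instance.setdefault("members", set()).add(3)


def collector_ab(instance):
    # The "unknown" collector from the example above.
    instance.setdefault("members", set()).add(0.5)


def collect(collectors):
    """Run collectors in the given order and return the sorted members."""
    instance = {}
    for collector in collectors:
        collector(instance)
    # Sort on read so consumers always see one deterministic order.
    return sorted(instance["members"])


forward = collect([collector_a, collector_b, collector_ab, collector_c])
backward = collect([collector_c, collector_ab, collector_b, collector_a])
print(forward == backward)
```

The unknown collector still changes the contents of members, but it can no longer change where the other values end up, and no collector depends on running before or after another.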
Discussion
I have yet to test things out in practice; it’s likely things aren’t as solid as they seem and that the amount of duplicated code outweighs the benefit of cooperative collection. If not, I see a very bright future ahead, one I will share with you shortly.
Do try it out, UCC in particular, and share your experiences here.