Multiple families workflow

marcus · November 17, 2016, 1:55pm

Complex is easy, simple is hard.

When it comes to introducing new functionality, especially one that is inherently more complex, I try and look towards the benefits of keeping it simple.

Less of a learning curve
Easier to understand other people’s code
Easier to maintain

And then think about whether the advantages of the new functionality outweigh these.

In this case, the ability to define custom matching algorithms, albeit cool and logical the way you’ve proposed it, does it add more value than cost?

At one extreme, I see a future where every plug-in defines a corresponding matching algorithm. At that point, understanding a series of plug-ins, especially those mixed and matched from elsewhere, would take noticeably longer.

On the other extreme, where there is only one matching algorithm, you need to get creative with the little you’ve got in order to achieve complex behaviour.

Having said that, it’s also possible that families as they exist today are a subset of what this does, that defining your own matching algorithm could be the de facto method of associating plug-ins to instances, and that it makes for both more flexibility and simplicity.

Let’s explore it.

First off, I’m interested in what you mentioned @tokejepsen about your requirement in pyblish-ftrack and what hoops you currently jump through in order to achieve the necessary effect.

tokejepsen · November 17, 2016, 2:24pm

The main problem was originally that we couldn’t process all instances, because they wouldn’t all have the same family name. This was back when you could only associate a single family with an instance. So we decided to process all instances, but just return early when a certain data member was missing.

These days with multiple families we could easily utilize a certain family name like ftrack to figure out which instances to process. The user could easily add another family to an instance. So there isn’t actually much of an argument for matching algorithms.
There is the edge case where someone is using the family ftrack for something else. In that case, a matching algorithm that only processes the instances that have the correct data would be beneficial, but this is highly unlikely to happen. Even crossover between different plugin packages has yet to be an issue.

marcus · November 18, 2016, 10:22am

The above api.Intersection, api.Subset and api.Exact algorithms have now been merged and released as 1.4.3.

$ pip install pyblish-base
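For readers new to the three algorithms, here is a rough sketch of how they can be read in plain set terms. It is an illustration only, not the library's implementation: the function names are hypothetical, and the direction of the Subset test (plug-in families contained in the instance's families) is my understanding of the semantics rather than something stated in this thread.

```python
# Illustrative stand-ins for api.Intersection, api.Subset and api.Exact.
# These helper names are hypothetical and do not exist in pyblish.

def intersection(plugin_families, instance_families):
    """Match when at least one family is shared."""
    return bool(set(plugin_families) & set(instance_families))


def subset(plugin_families, instance_families):
    """Match when every plug-in family is present on the instance."""
    return set(plugin_families) <= set(instance_families)


def exact(plugin_families, instance_families):
    """Match when both sides hold exactly the same families."""
    return set(plugin_families) == set(instance_families)


plugin = ["ftrack", "camera"]

# One shared family is enough for intersection, but not for subset.
print(intersection(plugin, ["ftrack", "model"]))
print(subset(plugin, ["ftrack", "model"]))

# An extra family on the instance still satisfies subset, but not exact.
print(subset(plugin, ["ftrack", "camera", "review"]))
print(exact(plugin, ["ftrack", "camera", "review"]))
```

Read this way, Intersection is the most permissive of the three and Exact the strictest.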

tokejepsen · November 22, 2016, 9:43am

Just tested this with a new plugin, and it works great

mattiaslagergren · November 22, 2016, 2:15pm

We just switched to using this feature in our tech preview. So now we’re collecting families like:

instance = context.create_instance(
   node.name(), families=['ftrack', 'camera']
)

And on the plugins:

families = ['ftrack', 'camera']
match = pyblish.api.Subset

Seems to work well - nice addition
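To make the effect of the setup above concrete, here is a small sketch of which instances a plug-in with families ['ftrack', 'camera'] would pick up under a subset-style match. The instance names and the extra families are made up for illustration, and the helper only mimics the matching in plain Python. It assumes Subset means "all plug-in families are present on the instance".

```python
# Hypothetical instances and their families, for illustration only.
instances = {
    "cameraShape1": ["ftrack", "camera"],
    "pCube1": ["ftrack", "model"],
    "renderCam": ["ftrack", "camera", "review"],
}

plugin_families = ["ftrack", "camera"]


def matches_subset(plugin_families, instance_families):
    """True when every plug-in family is present on the instance."""
    return set(plugin_families).issubset(instance_families)


matched = sorted(
    name for name, families in instances.items()
    if matches_subset(plugin_families, families)
)

print(matched)  # ['cameraShape1', 'renderCam']
```

With the default intersection-style match, pCube1 would also be picked up, because it shares the ftrack family.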

marcus · November 24, 2016, 12:51pm

.match attribute added to API documentation.

tokejepsen · December 16, 2016, 8:26pm

I’ve noticed one workflow where a custom matching algorithm would be useful

It comes up when you collect some instances and want to append data to them with a different collector. The case is host-specific instances, where you want to append data to instances across hosts without duplicating code per host.
Once the instances are created, they have to be publishable to be processed by other InstancePlugins. Here I’m using a ContextPlugin and manually filtering to the correct instances, to force processing of all instances.
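A minimal sketch of that manual filtering, assuming the context can be iterated to yield instances that each carry a data dictionary. FakeInstance, append_farm_data and the "farm" family are stand-ins invented for this example so it runs without a host. In a real plug-in, the loop would sit inside the process(self, context) method of a ContextPlugin.

```python
class FakeInstance(object):
    """Hypothetical stand-in for a pyblish instance: a name plus a data dict."""

    def __init__(self, name, **data):
        self.name = name
        self.data = dict(data)


def append_farm_data(context):
    """Append data to every 'farm' instance, whether or not it is publishable."""
    for instance in context:
        # Manual family filter, replacing the plug-in level `families` match.
        if "farm" not in instance.data.get("families", []):
            continue

        # Note that the "publish" member is deliberately not checked here.
        instance.data["SomeData"] = {"something": "else", "some": 1}


context = [
    FakeInstance("A", families=["farm"], publish=False),
    FakeInstance("B", families=["model"], publish=True),
]

append_farm_data(context)

print([i.name for i in context if "SomeData" in i.data])  # ['A']
```

The cost is that the family filter now lives inside the plug-in body, where it is less visible than a families attribute.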

marcus · December 17, 2016, 9:10am

Would it be possible to post an example of this?

tokejepsen · December 17, 2016, 4:18pm

Sure

Maya Collector

class MayaCollector(pyblish.api.ContextPlugin):

    order = pyblish.api.CollectorOrder
    hosts = ["maya"]

    def process(self, context):

        instance = context.create_instance("A")
        instance.data["families"] = ["farm"]
        instance.data["publish"] = False

Houdini Collector

class HoudiniCollector(pyblish.api.ContextPlugin):

    order = pyblish.api.CollectorOrder
    hosts = ["houdini"]

    def process(self, context):

        instance = context.create_instance("B")
        instance.data["families"] = ["farm"]
        instance.data["publish"] = False

Append data collector

class AppendData(pyblish.api.InstancePlugin):

    order = pyblish.api.CollectorOrder + 0.1
    families = ["farm"]

    def process(self, instance):

        instance.data["SomeData"] = {"something": "else", "some": 1}

marcus · December 17, 2016, 5:02pm

Do you mean that because of instance.data["publish"] = False, the AppendData plug-in has no effect? You would have expected those instances to be processed by plug-ins within the Collection order even with this publish member set to False?

How about a post plug-in, to determine defaults?

class HoudiniCollector(pyblish.api.ContextPlugin):

    order = pyblish.api.CollectorOrder
    hosts = ["houdini"]

    def process(self, context):

        instance = context.create_instance("B")
        instance.data["families"] = ["farm"]
        # instance.data["publish"] = False

class AppendData(pyblish.api.InstancePlugin):

    order = pyblish.api.CollectorOrder + 0.1
    families = ["farm"]

    def process(self, instance):
        instance.data["SomeData"] = {"something": "else", "some": 1}

class SetPublish(pyblish.api.InstancePlugin):

    order = pyblish.api.CollectorOrder + 0.2
    families = ["farm"]

    def process(self, instance):
        # "something" is a placeholder for whatever condition decides the default
        instance.data["publish"] = something is True

I think the reason I’d avoid custom algorithms is for (1) the added learning curve for anyone looking to learn your plug-ins, and for the (2) lessened re-usability and (3) intermixing of plug-ins. The advantage would need to be rather significant to justify such a sacrifice and there would ideally be no workaround, or at least one that was significantly more difficult to manage.

I think if you could show me a way to implement it without encountering the 3 cons above, that would be a great starting point for the feature.

tokejepsen · December 19, 2016, 6:58am

Initially I probably would have, because I wrote the plugin without thinking too much. Although when I think about how Pyblish works, I know that it wouldn’t get processed.
It might just be one of those pitfalls you need to be aware of.

You could definitely do this, but I personally would like to avoid too many order dependencies. I’ve been down that road, and the more plugins you offset from each other, the more difficult it becomes to manage.

I agree. Think my posts here are also more of thoughts out loud

marcus · December 19, 2016, 7:56am

It’s a scenario where the word “publish” doesn’t apply, so maybe there’s room for change here.

Initially, the word was chosen because an instance was created and either published, or not published. So “publish” seemed the most logical name. But in this case, your second collector doesn’t necessarily publish the instance and so the logic breaks.

We could call it what plug-ins call it?

instance.data["active"] = False

That might make it more clear that it won’t be considered anymore.

Off topic, but this would make a fantastic use-case story for a dedicated thread! Why did you do it? What did you expect? What did you find? Why was it difficult to manage?

Consider it?

tokejepsen · December 19, 2016, 8:12am

That might work

I’ll try and collect some thoughts, but it won’t be for a while.

tokejepsen · October 5, 2017, 3:52pm

What do we think about having an Exclude matching algorithm?

If any of the families on the plugin are in the instances families, then exclude that instance.
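In plain set terms, the rule described above could be sketched like this. It only illustrates the proposal and is not an existing implementation. The function name, instance names and families are hypothetical.

```python
def exclude(plugin_families, instance_families):
    """Match only when the instance shares no family with the plug-in."""
    return set(plugin_families).isdisjoint(instance_families)


# Hypothetical plug-in that should skip anything of family "source".
plugin_families = ["source"]

instances = {
    "mainComponent": ["ftrack", "component"],
    "sourceFile": ["ftrack", "source"],
}

processed = sorted(
    name for name, families in instances.items()
    if exclude(plugin_families, families)
)

print(processed)  # ['mainComponent']
```

This is the logical inverse of an intersection-style match: whatever that match would select, this rejects, and vice versa.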

marcus · October 8, 2017, 3:24pm

An example would help.

tokejepsen · October 9, 2017, 8:01am

For example I would like a plugin to process all instances except for those of family source.

marcus · October 9, 2017, 8:09am

Thanks, but could you share an example of how you would use it and the problem it would solve? I’d like to make sure there isn’t some other way of accomplishing what you’re looking for, before adding more features to support it.

tokejepsen · October 9, 2017, 8:36am

Ahh, I see.

The most recent plugin I’ve made where this exclusion matching would help, is about updating statuses in Ftrack.

On every component published to Ftrack, I would like to update the status to “In Progress”, except for any source components.

marcus · October 9, 2017, 9:11am

Ok, so it’d be this, but the inverse?

from pyblish import api

class UpdateStatus(api.InstancePlugin):
  order = api.IntegratorOrder
  families = ["includeMe", "meToo", "andMe"]

  def process(self, instance):
    import ftrack
    ftrack.update_status(instance, "Done")

Such as this?

from pyblish import api

class UpdateStatus(api.InstancePlugin):
  order = api.IntegratorOrder
  families = ["excludeMe"]
  match = api.Exclude

  def process(self, instance):
    import ftrack
    ftrack.update_status(instance, "Done")

tokejepsen · October 9, 2017, 9:29am

Yup, exactly. Currently I’m processing all instances, but skipping the families I don’t want in the plugin.

from pyblish import api

class UpdateStatus(api.InstancePlugin):
  order = api.IntegratorOrder

  def process(self, instance):
    families = [instance.data["family"]]
    families += instance.data.get("families", [])
    if "excludeMe" in families:
        return

    import ftrack
    ftrack.update_status(instance, "Done")