Multiple families workflow

Topic

I’ve been looking at how to make use of the “multiple families” support that was added in Pyblish 1.2, but I have been having a hard time pinpointing what a good workflow is.

What I’ve been trying

Currently I’ve been playing around with having an instance get an additional comment family when any notes are available on the object. These are then transferred, similar to a git commit message, and cleaned up afterwards so they won’t be there again for the next publish.

The problem now is integration. Currently the integrated files (as I have been using them) depend on the family to define their name, so that the output of a pointcache and a proxy (both Alembic files) can sit side by side.

# simplified example names
publish/asset_pointcache_v001.abc
publish/asset_proxy_v001.abc

More in-depth, think of the integrated filename as a format of {asset}_{family}_{subset}_{version}.{extension}. Also see Asset Map 0.7 for terminology.
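
A minimal sketch of that naming scheme, assuming a zero-padded `v###` version token as in the simplified names above. The helper `integrated_filename` and the sample values (including the `main` subset) are hypothetical and not part of Pyblish.

```python
def integrated_filename(asset, family, subset, version, extension):
    """Build a publish filename from its parts (hypothetical helper)."""
    template = "{asset}_{family}_{subset}_v{version:03d}.{extension}"
    return template.format(
        asset=asset,
        family=family,
        subset=subset,
        version=version,
        extension=extension,
    )


# Two Alembic outputs of the same asset stay distinct because the
# family is part of the name.
pointcache = integrated_filename("asset", "pointcache", "main", 1, "abc")
proxy = integrated_filename("asset", "proxy", "main", 1, "abc")
```

Once an instance carries several families, it is no longer obvious which value should fill `{family}`, which is the integration problem described above.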

Questions

  • Is anyone using multiple families in production?
  • Do you have good (simple?) use cases with an example?
  • Does the artist tag an instance with more families (e.g. geometry, proxy), or where does that responsibility lie?
    • How would you validate that a “family” is missing? Or even better, would you even do that?

It’s very likely that I’ve been looking at this feature the wrong way and as such am not sure how to put it to good use. So any insights are greatly appreciated. Looking for something that defines the whole “from artist interaction/creation to final filename output”.

Good initiative, I’ve been meaning to do a write-up or tutorial on this but haven’t found the right time. Maybe now is the time.

Basics

For starters, the full potential of the technique is still to be discovered. I’d imagine everyone using it in different ways until things settle and clear pros and cons can be identified, but I can try to summarise the initial intent behind why it was implemented in the first place.

Multiple families is a complete shift in thinking; the inverse of using a single family. That is, rather than associating multiple plug-ins to a family, you associate multiple families to a plug-in. Put yet another way, rather than associating multiple operations to a single type of data, you dynamically redefine data to encompass multiple operations.

Examples

Consider these scenarios - one with single family and one with multiple families.

Single family


import pyblish.api
import pyblish.util


class CollectInstances(pyblish.api.ContextPlugin):
    order = pyblish.api.CollectorOrder

    def process(self, context):
        instance = context.create_instance("capeModel")
        instance.data["family"] = "model"


class ValidateNormals(pyblish.api.InstancePlugin):
    order = pyblish.api.ValidatorOrder
    families = ["model"]

    def process(self, instance):
        self.log.info("Validating normals..")


class ValidateHierarchy(pyblish.api.InstancePlugin):
    order = pyblish.api.ValidatorOrder
    families = ["model"]

    def process(self, instance):
        self.log.info("Validating hierarchy..")


pyblish.api.register_plugin(CollectInstances)
pyblish.api.register_plugin(ValidateNormals)
pyblish.api.register_plugin(ValidateHierarchy)

pyblish.util.publish()

Multiple families

import pyblish.api
import pyblish.util


class CollectInstances(pyblish.api.ContextPlugin):
    order = pyblish.api.CollectorOrder

    def process(self, context):
        instance = context.create_instance("capeModel")
        instance.data["families"] = ["geometry", "prop"]


class ValidateNormals(pyblish.api.InstancePlugin):
    order = pyblish.api.ValidatorOrder
    families = ["geometry"]

    def process(self, instance):
        self.log.info("Validating normals..")


class ValidateHierarchy(pyblish.api.InstancePlugin):
    order = pyblish.api.ValidatorOrder
    families = ["prop"]

    def process(self, instance):
        self.log.info("Validating hierarchy..")


pyblish.api.register_plugin(CollectInstances)
pyblish.api.register_plugin(ValidateNormals)
pyblish.api.register_plugin(ValidateHierarchy)

pyblish.util.publish()

Both of which achieve the same end result.

Adding an external plug-in

Now consider adding an external plug-in to your pipeline. Let’s imagine you just found and downloaded this file off the internet.

validate_tangency.py

ValidateTangency:
 - description: "Ensure no angles between two edges exceed 60 degrees."
 - families: ["organicMaterial"]
 - complexity: O(n)
 - running time: 2ns/vertex

You don’t know or need to know how it works, you just know what it does and the families it operates on.

Single family

In the single family example, the only way you could use this plug-in is by either redefining your existing families to use organicMaterial instead, or by rewriting the plug-in to also cover the families you would like it to apply to.

ValidateTangency:
 - families: ["organicMaterial", "model"]

Rewriting one isn’t a big problem. Updating becomes a hassle, but it’s doable.

Rewriting many on the other hand is a no-go. Considering that the goal is to facilitate installing plug-ins via a package manager, similar to Sublime Text or Atom, there must be a better way.

Multiple families

In this case, “installing” this plug-in is under your control. Specifically, under the control of your Collector(s). You don’t need to modify the external plug-in.

class CollectInstances(pyblish.api.ContextPlugin):
    order = pyblish.api.CollectorOrder

    def process(self, context):
        instance = context.create_instance("capeModel")
        instance.data["families"] = ["geometry", "prop", "organicMaterial"]

Mindset

Before

With single families, the assumed mindset of the implementer is this.

  • In our pipeline, we have Models, Character Rigs and Animation
  • Models should be exported as obj and validated for x, y and z.
  • Character Rigs are exported as .ma and validated for a, b and c.
  • Animations are published as .abc and validated for q, w and e.

You then build plug-ins that live up to these requirements and associate families to content matching the description.

After

Once multiple families have settled and we’ve got our package manager, I’d imagine a mindset like this.

  • In our pipeline, we have Models, Character Rigs and Animation
  • Models in our pipeline are published using the families ["geometry", "napoleon.proxy", "xyzKit.reviewEntity"]
  • Character rigs are published with ["rig", "filmKit.rig", "xyzKit.animatableEntity"]
  • Animations are published with ["animation", "filmKit.animation"]

Where napoleon, xyzKit and filmKit are external vendors you’ve installed, and their contained plug-ins are prefixed with the name of the package. The assignments to each instance are made via your Collector. In the case of Magenta, the Collectors delegate the responsibility to the artist, in which case the artist is the one who makes these assignments.

The external vendors might have dozens of additional plug-ins and families, but in your pipeline, you only care about a handful. The rest is taken care of by bespoke plug-ins developed in-house, in this case for families geometry, rig and animation.

Whether multiple families is for you, your studio, or your extension is hard to say. I would experiment with both to try and find which of the two makes more sense to you.

Thanks @marcus

That’s a great, thorough explanation. It’s also very similar to how I expected it to be used, where a datatype in the pipeline, like a model, would be built from multiple families.

As such the pipeline would define that their publish of model would contain something like the following: ["geometry", "magenta.geometry"]. This means that the project’s pipeline defines the combination of families for a type.

This also means that it wouldn’t be up to the artist to tag the contents they’re producing for a model with all the required families, as that would be prone to human error (aside from the confusion of having to tag things like ["geometry", "napoleon.geometry", "magenta.geometry"]).

The artist would create their output according to the pipeline’s definition of what they’re producing; for simplicity let’s stick to a model. The artist would tag it model. Then it’s up to the collector to define what is related for this pipeline to be a model and run the relevant plug-ins.

This would mean that families is more of an internal organization for the TDs (plug-in developers and pipeline managers) than a naming for the artist.

As such I’d imagine a type of sorts to be registered (instead of knowing about families, artists would know about these types) to hold such groupings of families.


types = {
    "model": ["geometry", "napoleon.proxy", "magenta.mesh.uv"],
    "rig": ["magenta.rig", "no_anim"],
}

Does that seem about right?
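
One way the registry idea above might look in practice, as a rough sketch only. `TYPES` and `families_for_type` are hypothetical names, and nothing here is an existing Pyblish feature; the artist-facing type is simply expanded into the families the pipeline wants to run.

```python
# Hypothetical registry: artist-facing type -> families run by the pipeline.
TYPES = {
    "model": ["geometry", "napoleon.proxy", "magenta.mesh.uv"],
    "rig": ["magenta.rig", "no_anim"],
}


def families_for_type(type_name, registry=TYPES):
    """Return a copy of the families registered for an artist-facing type."""
    try:
        return list(registry[type_name])
    except KeyError:
        raise ValueError("Unknown type: %r" % type_name)


# Inside a Collector this could become:
#   instance.data["families"] = families_for_type("model")
model_families = families_for_type("model")
```

Returning a copy keeps a Collector from accidentally mutating the shared registry when it appends extra families to one instance.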

Can’t tell you whether it’s right or wrong, but it definitely sounds doable. Looks like you could append to the list of families without affecting an artist’s workflow. Maybe even reaping the benefits of both single and multi families.

If you head down this route, I’d be interested in how it turns out.

All of this has a lot of great potential.

I haven’t tried multiple families myself yet, but I’m wondering how this works with the UI. Does an instance show up multiple times, once for each family?

If so, it could be quite overkill for the artist if ben suddenly appears in multiple categories.

From some quick testing:

With the current version the GUI doesn’t seem to make any assumptions and only shows an instance grouped under the family data, if present. As such, think of there being one main family, where families are “additions” that for now are not shown. What the expected or preferred behavior should be I don’t know, but this is how it currently operates.

Also when only families are present and no family data it looks as if the plug-in doesn’t get any group associated with it and is lacking a family header which looks rather confusing (maybe a bug?).

I’m referring to the family and families data keys on the instances.

As such currently it seems the GUI doesn’t show in any way what families are related to the plug-in. So there is no visual way in the GUI to see what families it is related to (except for the family data). Of course you could log the data from a Plug-in for debugging purposes.

It still shows family.

For the time being, when using multiple families, think of family as a description.

Give it a purely human descriptive name, such as Geometry Assets or Pipeline Data etc.

Once things have settled, and we know more about the technique and things are more clear, we can have a look at how to better transition or otherwise represent the options of using family or families.

I see.

Well there are 3 options that might standardize this that pop to mind.

  1. When an instance is created, automatically assign the first families entry to its family. family can then be treated as the ‘master’ that dictates how it shows in the UI and generally acts as its default, so to speak.
    That would make sure every instance always has family (making it a compulsory data member), while families would stay optional.

  2. Do a simple check whether family data exists. If so, then use it for the UI. If it doesn’t, take the first entry from families and show that in the UI.

  3. Another one I just thought of: if family is not present, Pyblish could assume that the user wants to display all the families in the UI. I can think of some scenarios where it might be useful, for instance the one with the comment collected from the instance. If it shows as both a model and a comment, the user can turn the comment off.
    Developer would also have fairly good control over how to show instances in the UI. If I want it to appear only as a single entry, then I make sure I set family, if I want multiple entries, I leave it empty.
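
Option 2 could be sketched roughly as below, operating on a plain dictionary standing in for `instance.data`. `display_family` and the `fallback` label are hypothetical names, not Pyblish API.

```python
def display_family(data, fallback="default"):
    """Pick the family a UI could group an instance under (option 2)."""
    # Prefer the explicit single family when it is set.
    family = data.get("family")
    if family:
        return family

    # Otherwise fall back to the first entry of families.
    families = data.get("families") or []
    if families:
        return families[0]

    # Neither key is present.
    return fallback
```

For example, `display_family({"families": ["geometry", "prop"]})` would group the instance under `geometry`, while an instance with `family` set keeps that value regardless of its `families`.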

Edit: @marcus replied as I was writing this and cleared things up. However, these points might still be good food for thought.

We are currently using families in Hiero, where this feature fits perfectly with Hiero’s native workflow of tagging track items for export. You can see a preview of it in the ftrack webinar coming up, but basically the track items are displayed once even though they have multiple families associated with them for publishing.

I’m not on the latest Pyblish, but last time I checked if an instance doesn’t have the family data member it would be labelled underneath a default family.

That’s right. This got added, rather sneakily, back in October during the refactoring of the plain dictionary for data.

I’m increasingly using more families as a way to tag and categorize instances. This gives me quite a bit of control over which instances I want to get a hold of, without having to know specifically what those instances are called.

What I am missing a lot of the time is finer control over which instances a plugin processes. Take these instances:

class Collect(pyblish.api.ContextPlugin):

    order = pyblish.api.CollectorOrder

    def process(self, context):

        instance = context.create_instance(name="A")
        instance.data["families"] = ["alembic", "local"]

        instance = context.create_instance(name="B")
        instance.data["families"] = ["alembic", "farm"]

        instance = context.create_instance(name="C")
        instance.data["families"] = ["renderlayer", "local"]

Say I want to have a plugin process instance A only. I have to do further filtering on the passed in instances:

class Plugin(pyblish.api.InstancePlugin):

    order = pyblish.api.ValidatorOrder

    def process(self, instance):

        # Filter to instances that have "alembic" AND "local" families.
        families = instance.data["families"]
        if "alembic" not in families or "local" not in families:
            return

        self.log.info(str(instance))

These might be edge cases, but when using multiple families I find I have to do this a lot. My question is whether we could (and should?) have a way for plugins to process only the instances that fulfill all of the families.

It’s an interesting topic, I don’t think it’s an edge case but rather something fundamental to the mindset when building these associations.

I think we’re looking at 3 potential matching-algorithms, where only the first one is currently in place.

  1. Intersection
  2. Subset
  3. Exact match

# 1. Include on any match
assert set(["a", "b"]).intersection(["b", "c"])

# 2. Include on all match
assert set(["a", "b"]).issubset(["a", "b", "c"])

# 3. Include on exact match
assert ["a", "b"] == ["a", "b"]
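
To make the difference concrete, here is a small sketch that applies the three algorithms to the instances A, B and C from the post above, for a plug-in interested in `["alembic", "local"]`. The function names are made up for illustration, and the exact match here ignores order, unlike the list comparison above.

```python
def intersection(plugin_families, instance_families):
    """1. Include on any match."""
    return bool(set(plugin_families).intersection(instance_families))


def subset(plugin_families, instance_families):
    """2. Include when every plug-in family is present on the instance."""
    return set(plugin_families).issubset(instance_families)


def exact(plugin_families, instance_families):
    """3. Include only when both hold the same families (order ignored)."""
    return set(plugin_families) == set(instance_families)


instances = {
    "A": ["alembic", "local"],
    "B": ["alembic", "farm"],
    "C": ["renderlayer", "local"],
}
plugin_families = ["alembic", "local"]

matches = {
    algorithm.__name__: sorted(
        name for name, families in instances.items()
        if algorithm(plugin_families, families)
    )
    for algorithm in (intersection, subset, exact)
}
# matches == {"intersection": ["A", "B", "C"], "subset": ["A"], "exact": ["A"]}
```

Subset and exact match agree here only because A carries no extra families; an instance tagged `["alembic", "local", "review"]` would pass subset but fail exact.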

@tokejepsen which of the subsequent two algorithms do you think would be a good fit for your usecase?

I would say Subset would be the best algorithm.

In the case where you can’t get the instances you want with Subset, you either have to revise which families you are searching for in the plugin, or which families are available from collection.

The whole point of tagging the instances is to be able to get certain instances without knowing all the tags. An Exact match algorithm would be similar to the current Intersection, where we are just using longer descriptive families.

The next question then is how to best describe this relationship.

Perhaps the most straightforward and naive way of achieving it, would be with a global setting.

pyblish.api.matching_algorithm = "subset"

The problem with that is that you’ve now limited all plug-ins, even those from packages outside of your control like the default ones with pyblish-maya or some third-party provider, to this one algorithm, which might (most likely) cause them to behave in unexpected ways.

Another approach might be to add another property to a plug-in (or instance?).

class MyPlugin(...):
  matching_algorithm = "subset"

# Or..
instance.data["matching_algorithm"] = "subset"

But I can’t see ahead of time where (if at all) that might be an insufficient means of describing it.

Can you think of any other approach?

I definitely think it should be a plugin property.
Although having it on the instance might be interesting, I don’t see a use case for it. Mostly because in my head instances are for storing data, and plugins are for processing.

What I’m not fond of is magical key words that mean something to Pyblish but nothing to a user. How about assigning the matching algorithm to the plugin property?

class MyPlugin(...):
    matching_algorithm = pyblish.api.algorithms.families_subset

In theory you could make your own matching algorithms. For example, I think @mkolar and I were once talking about how to match all plugins that have a certain data member.
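
A rough sketch of what a callable-based property might look like. `pyblish.api.algorithms` is only a proposal in this thread, so the functions below are stand-ins defined locally; the `(plugin, data)` signature, `is_match` and the example classes are all assumptions made for illustration.

```python
def families_intersection(plugin, data):
    """Current behaviour: any shared family is enough."""
    return bool(set(plugin.families) & set(data.get("families", [])))


def families_subset(plugin, data):
    """Proposed: the instance must carry every family of the plug-in."""
    return set(plugin.families) <= set(data.get("families", []))


def has_comment(plugin, data):
    """Custom rule: match on a data member instead of families."""
    return "comment" in data


class ValidateLocalAlembic(object):
    families = ["alembic", "local"]
    matching_algorithm = staticmethod(families_subset)


class ExtractComment(object):
    families = []
    matching_algorithm = staticmethod(has_comment)


class LegacyPlugin(object):
    families = ["alembic"]  # no matching_algorithm attribute


def is_match(plugin, data, default=families_intersection):
    """Look up the plug-in's algorithm, falling back to intersection."""
    algorithm = getattr(plugin, "matching_algorithm", default)
    return algorithm(plugin, data)
```

Plug-ins that never set the property keep the existing intersection behaviour, so third-party packages would be unaffected by one plug-in opting in to subset matching.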

Perfect, I think it sounds really good.

I’ll put a PR up now and we can talk implementation specifics in there.

Ok, let me know what you think.

Brain dump (sorry!)

This might be slightly off track but I wanted to present an idea which has been stuck in my head for a while. What if the plug-in would be able to tell whether it is compatible with an instance? Say:

plugin.is_compatible(instance)

This could be called to identify whether a plug-in should be run to process a particular instance. This also would take some of the stress away from implementing similar functionality into UIs since it could ask the plug-in whether it’s compatible. For example a UI could decide to hide a plug-in when it’s not compatible with anything (no Context or Instance) and as such won’t be run.

The interesting bit there would be that a plug-in could be exactly tailored to run in a very specific situation if the user decides to override it.

class MyPlugin():
    def is_compatible(self, context):
        # Compatible as soon as any instance contains "x"
        for instance in context:
            if "x" in instance:
                return True
        return False

Whether it should receive the context or instance (or maybe both is_compatible(instance_or_context)?) is still a question. Because you might want to run it only for the Instance or only for the Context. Or only once for the Context if a particular Instance is present. It could also just return (instead of True/False) the instances that should be processed.

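import pyblish.api


class MyPlugin():
    def is_compatible(self, node):
        # Ignore the context
        if isinstance(node, pyblish.api.Context):
            return False

        # Accept any instance
        if isinstance(node, pyblish.api.Instance):
            return True

        return False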

To remain backwards compatible the default implementation could just be the current behavior.
The is_compatible method could then also implement the variable family-matching algorithms by default.

That is an interesting approach @BigRoy.
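
As a sketch of that backwards-compatible default: the base class below is a stand-in, not the real `pyblish.api.Plugin`, and `is_compatible` is the hypothetical method proposed here. For simplicity it only looks at the `families` key of a dictionary standing in for `instance.data`.

```python
class Plugin(object):
    """Stand-in base class; not the real pyblish.api.Plugin."""
    families = []

    def is_compatible(self, instance_data):
        # Default: keep the "any family matches" behaviour.
        instance_families = instance_data.get("families", [])
        return bool(set(self.families) & set(instance_families))


class ValidateNormals(Plugin):
    # Inherits the default matching.
    families = ["geometry"]


class ValidateLocalAlembic(Plugin):
    families = ["alembic", "local"]

    def is_compatible(self, instance_data):
        # Override: require every family, i.e. the Subset algorithm.
        instance_families = instance_data.get("families", [])
        return set(self.families) <= set(instance_families)
```

Existing plug-ins would inherit the default and behave as before, while a plug-in that needs stricter matching overrides a single method.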

The functionality is what I was referring to when talking about custom matching algorithms.
I do like the approach of overwriting the function as you would normally do when subclassing.

As for whether an instance or context gets passed in, it could depend on whether the plugin is an InstancePlugin or a ContextPlugin. Or maybe that is too magical?

Forgot to say that one problem with subclassing could be to provide people with options, similar to Subset and Exact algorithms.
I guess you could still provide these algorithms via the api.