Pyblish Magenta

I’d imagine so, yes. It depends on how hierarchical your assets are, I suppose. For example, the environment may be a single Instance, or hundreds.

Just so we’re talking about the same thing: I’m saying a character would likely become its own .abc file, alongside the table and individual chairs. Each would process and get extracted separately; anything else would be an optimisation, something we could take a look at later. E.g. by running many copies of Maya, either locally or on a farm, caching individual Instances.

Yeah, it’s chicken or egg. I think we’re on the right path currently.

What are we expecting the wildcard to represent? '*_pointcache_SEL'

I was thinking there would only ever be a single pointcache set per rig; its namespace would help associate it with a particular Instance. If there are more sets, they could have different names, like collisionGeometry_SEL or focusLocators_SEL. Do you have any specific outputs in mind?

Other than that, and without having tested it, I’d say it looks good.

Basically, I’m using the pointcache_SEL suffix to identify what needs to be extracted as a point cache; what is in front is the name of the instance. What you’re doing with collisionGeometry_SEL would mean it wouldn’t get caught by the current Collector, since you’re losing the pointcache identifier. Basically, would that mean anything ending with _SEL is considered for pointcaching?
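In plain Python, that naming convention could be sketched like this (collect_pointcache_instances and the input list are illustrative, not Magenta’s actual Collector):

```python
def collect_pointcache_instances(object_sets):
    # Hypothetical helper: map each objectSet name matching the
    # '*_pointcache_SEL' convention to the instance name in front of it.
    suffix = "_pointcache_SEL"
    instances = {}
    for name in object_sets:
        if name.endswith(suffix):
            instances[name[:-len(suffix)]] = name
    return instances
```

Anything not ending in the full identifier, such as collisionGeometry_SEL, would simply be skipped.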

Did a quick run within thedeal on a test animation and it seemed to work fine so far.

Note that I also changed IntegrateAsset in one of my earlier commits to accommodate both Shots and Assets. You’ll need that to work with the updated schema.yaml of thedeal.

Anything within the namespace of the asset? E.g. ben01_:collisionGeometry_SEL

Hmm, then I think the suffix _SEL is the confusing bit, since I linked it to ‘selection’, which I thought was a reference to the collection in an objectSet. Were you referring to something else?


Also, I wonder why you oppose making a ‘scene-wide cache’ and instead want to separate as much as possible into individual Instances. Especially since Alembic has such strengths in data de-duplication?

Usually we (at Colorbleed) put as much information about a scene into a single Alembic as possible. Looking at the video you linked (this one) I feel like they are making a single cache of the scene as opposed to numerous smaller ones.

Not sure which of the two is considered best practice.

I assume having a cache that recreates the scene is what it is meant for; it seems to be the only way to ensure you can load that same scene into different applications without ‘middle-managing’ it? E.g. loading an Alembic into Fusion/Nuke for compositing, or into Houdini for sims?

Another thing I just thought of… where do you include the camera in the Alembic?

Sure, any suffix will do; that’s how I’ve seen it used in the past, and I think it refers to “selection set”. How about _SET?

I’d never consider putting everything into one giant blob, simply because updating one thing means updating everything, both for the artists using it and the artist producing it. Now it suddenly makes sense how you can end up with cache files of 50 GB…

I wouldn’t.

Edit: To elaborate, cameras to me are assets like any other. Like rigs, in fact. They are published, versioned and used like any other. In this case, you could use Alembic. I’ve mostly worked with FBX for cameras, but it’s possible Alembic works equally well. What I wouldn’t do is include the camera with the rest of the pointcached geometry, for the same reason as above. They change independently and should be treated as such.

In preparation for shot work - about the directory layout, it currently works like this.

thedeal
└── shots
    └── film
        └── shot1

We’ve got a project with a sequence named film, inside a directory called shots, with a directory called shot1.

A tad confusing; it might require some explanation for someone new, along with reminders to yourself when getting back into it after a few days/weeks.

How about this instead?

thedeal
└── film
    └── seq1
        └── shot1

Where film now is a top-level group alongside assets, and contains sequences. The sequence then contains shots, like before.


About validate_single_root_transform.

This plug-in isn’t necessary.

It is already guaranteed that only a single transform is ever exported with the way things are collected; i.e. by looking for a match between ITEM from the environment, such as ben, followed by a _GRP suffix.

Having this validator means there can’t be anything else in the scene at the top-level, not even things used during development-only, which is a bit restrictive and hampers creativity and possibility.

Edit: Actually, scratch that.

The way things work is that it warns if the nodes to be exported actually contain multiple root transforms. This shouldn’t happen, and it’s good it’s being validated against!

Yup, was just going to say that. Plus it doesn’t warn if there are other groups in the scene, only if other top groups were collected. :wink:

I think the validation could even be optimized by checking whether it’s an assembly node using cmds.ls, like so:

from maya import cmds

assemblies = cmds.ls(instance, assemblies=True)
assert len(assemblies) == 1

Also note that there’s a typo in the raised error, the sentence is a bit unreadable. :wink:

Rewritten to this.

import pyblish.api

class ValidateSingleAssembly(pyblish.api.Validator):
    """Ensure all nodes are in a single assembly

    Published assets must be contained within a single transform
    at the root of your outliner.

    """

    families = ['rig', 'model']
    hosts = ['maya']
    category = 'rig'
    version = (0, 1, 0)
    label = 'Single Assembly'

    def process(self, instance):
        from maya import cmds
        assemblies = cmds.ls(instance, assemblies=True)
        assert 1 == len(assemblies), (
            'Multiple assemblies found: %s' % assemblies)

Here comes a few updates.

Let’s have the discussion here, not in the pull request, unless it involves the code itself.

Renamed validate_single_root_transform

This one is straightforward enough, it’s now called validate_single_assembly

Added support to find_next_version for multiple numbers

Before, version numbers couldn’t go above 10; this was a bug and is fixed now.
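A sketch of such a fix, assuming version folders are named v001, v002 and so on; the actual find_next_version in Magenta may differ. A naive string sort breaks past v009 (“v10” sorts before “v9” lexically), whereas parsing the digits numerically does not:

```python
import re

def find_next_version(versions):
    # Parse the numeric part of each existing version name and
    # return the next one, zero-padded to three digits. (Sketch.)
    pattern = re.compile(r"^v(\d+)$")
    highest = 0
    for name in versions:
        match = pattern.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return "v%03d" % (highest + 1)
```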

Adding option for extractor

Made it so that maya_ascii publishes don’t include their original references; instead they are baked into the publish.

Renaming integrate_cleanup.py to cleanup_tempdir.py

To better align with the new cleanup_comment.py.

Added support for comments

When publishing, it looks for a node of any type called “comment”. If it has anything in its “notes” field, it will be published as “comment.txt” alongside everything else.

The comment is then removed during CleanupComment to avoid the same comment being published more than once.
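A rough sketch of the extraction side, with Maya left out (the real plug-in would read the notes attribute via maya.cmds; extract_comment and its arguments are illustrative):

```python
import os

def extract_comment(notes, temp_dir):
    # Write the contents of the "comment" node's notes field to
    # comment.txt alongside the other extracted files. (Sketch.)
    if not notes:
        return None  # no comment, nothing to publish
    path = os.path.join(temp_dir, "comment.txt")
    with open(path, "w") as f:
        f.write(notes)
    return path
```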

Moving temp_dir to Context, for simplified integration

Something to talk about; before, the temp directory was embedded into each Instance, meaning integration operated on a per-instance basis. I moved the temp dir to the Context now, such that integration simply moves everything inside of it into its corresponding version.

Benefit is simpler integration, disadvantage is less control over individual files as we no longer have access to files relative to which instance they came from. Whether or not it matters remains to be seen.

Does it make sense though? It might not have been necessary; it’s just a trial.

Nice update @marcus!

Tricky.

Because the content of extraction belongs to that instance, not to the context. If you want a single temp directory, I would propose making the instance’s directory available inside the context’s temp directory, e.g. tmpdir/instance_name/<contents>.

Imagine publishing multiple instances; this way they could never end up in their respective folders as they should. I think we can’t assume a context to only ever hold up to a single instance. If that were true, then an instance as data would never have been necessary anyway.
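That layout could look something like this in plain Python (instance_dir is a hypothetical helper, not Magenta’s actual code):

```python
import os

def instance_dir(context_tmpdir, instance_name):
    # Give each instance its own folder inside the context's temp
    # directory, e.g. tmpdir/ben01/<contents>, so files from multiple
    # instances never clash during integration. (Sketch.)
    path = os.path.join(context_tmpdir, instance_name)
    if not os.path.exists(path):
        os.makedirs(path)
    return path
```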

I think you’re right.

Will attempt to remedy this next.

@BigRoy, I’m having some trouble with Lucidity. Looking at this, can you spot what’s wrong?

Fixed it behind the scenes, on Hangouts. Thanks Roy.

This.

    shot.dev: 
        pattern: '{@shot}/{task}/work'
    shot.asset:
        pattern: '{@shot}/{task}/publish'

Became this.

    shot.work: 
        pattern: '{@shot}/{task}/work'
    shot.publish:
        pattern: '{@shot}/{task}/publish'

New Changes

For code review, let’s talk in the issue. For pipeline and workflow topics, let’s talk here.

I’m not completely happy with how this turned out, but it’s a start.

Published assets end up in, e.g.

Pyblish\thedeal\assets\sofa\rigging\publish\v004\sofa_rig.ma

Where sofa is the name of the Instance, and rig is its family.

Whereas Shots end up in, e.g.

Pyblish\thedeal\film\seq01\1000\animation\publish\v019\sofa02_pointcache.abc

Where sofa02 is the name of the instance, provided by its namespace in the scene, and pointcache is its family.

What I’m not happy about is that Assets have the name of their Instance, which in this case corresponds to the name of the asset itself. It’s duplicate information. And that’s bad bad bad.

Other than that, it’s shaping up nicely.

Here are some things I think we could do better.

  1. Environment Dependency Currently, publishing is relative to the environment. If we are in animation “mode”, certain things happen, otherwise they do not. I think we could eliminate this dependency and look solely at what is in the actual scene, and to the scene file for anything external (which should not be much).
  2. Conflicting Argument Count Currently, the 2nd and 3rd arguments passed via the command-line are assumed to be ITEM and TASK. But this isn’t true for shots, which have an extra argument: thedeal seq01 1000 animation. I’ve worked around this by looking at the last argument, which is always the task, instead of counting from the start.
  3. Asset Initialisation Instances in the scene are assumed to be namespaces, such as ben01_:, where ben is the name of the asset and 01 counts how many of them are in the scene. The _ is added for cosmetics; it’s easier to look at in the Outliner. The problem is, I’m doing this manually for now: importing a published asset and giving it a name explicitly. We could automate this, and this is where an asset library is handy.
  4. Asset Names Currently, an asset has its top-level assembly named after the ITEM we are working with, followed by _GRP, e.g. ben_GRP. This, alongside the namespace ben01_:, is duplicate information. It would be better to instead name the top-level assembly by what it is, such as rig_GRP. This way, the full name of a referenced asset becomes ben01_:rig_GRP, as opposed to ben01_:ben_GRP.
  5. Publish/Work Minor complaint, but these two words are verbs; they should really be adjectives, such as “/published” and “/development_files”, as they describe what is inside of them.
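For point 2, the argument handling could be sketched like this in plain Python (parse_cli is a hypothetical name, not the actual entry point):

```python
def parse_cli(args):
    # The project is always first and the task always last; anything
    # in between is the item/shot hierarchy, which varies in length:
    #   ['thedeal', 'ben', 'rigging']
    #   ['thedeal', 'seq01', '1000', 'animation']
    project, task = args[0], args[-1]
    hierarchy = args[1:-1]
    return project, hierarchy, task
```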

In practice, a filename that defines what it actually holds or represents is the best way to format the name. Most applications will highlight the name of the file when it is opened, loaded, referenced or used in any way, to identify which file it’s currently working with. Think videos in Quicktime Player or VLC, references in Maya’s reference editor, or even drag-’n’-dropping a random file into Maya, which uses the namespace from the file by default. Or something like Rebus Farm (an online render farm), which drops all linked files (textures/references) into a single folder without checking for clashing names. Or finding the extracted content to be somewhat identifiable if it failed to integrate and you’re searching your temp folders.

If you ever need to share the file with anyone, or the file ends up out of its context, it’s perfect that it is descriptive by itself, even if it’s only by the asset name.

To be honest, I wouldn’t make it hold less data than this.

Environment Dependency

You usually don’t want to publish a model from a rig, even though the mesh in the rig is the exact same content (because it’s the output of the model department being used there). Preferably it is used in such a way that it’s swappable with an updated version of the published content. So 1-to-1 usage.

Might make it tricky to avoid publishing the model again in the other scenes? Or how would you avoid having it trigger the ‘model publish’ on that content?

Still interesting to hear about alternatives, broaden the horizon.

Asset Initialisation & Asset Names

Combining bullet point 3 & 4:

Cons
  • Is renaming namespaces for a referenced asset, especially when it’s already animated, still buggy? As in, if you rename after something was done with the object, it sometimes loses connections to animations or changes.
  • I can imagine people using tools that are not proprietary to manage references in a scene; eg. we have custom scripts where a reference is duplicated with position or animations to ease layouting. I assume others might use similar tools that might not be written by themselves? It might be hard to change non-proprietary tools to provide this same structure to loaded assets. But if it remains purely cosmetic then I think it’s perfectly fine.
  • Other than that, I know many artists who prefer to work with namespaces hidden in the outliner, who might be bugged by losing the asset name after the namespace.
  • Plus there are actually export formats that ditch namespaces, resulting solely in output of rig_GRP. Though I can’t remember which those were, and whether it got fixed in newer versions of the file format. Does .obj support namespaces?
Pros
  • An upside to having just rig_GRP instead of the asset name in the group is that it makes it easier to transfer reference edits since it might just happen that replacing a reference with another asset automatically has the same name and takes the same reference edits without problems.

Publish/Work

Working with artists, work and publish were immediately clear, whereas development had no meaning/context for them. Publish was clear to the artists as long as they knew the terminology of publishing in a pipeline.

Maybe work and published makes for a nice combination, one is where you actually work in whereas the other has contents that were produced.

This is a good place to start talking about what we design for code, and what we design for humans.

I think we can make a distinction here, such that we can escape the burden of metadata deficiency when designing for code, along with having the freedom to include as much as is necessary and relevant for humans.

For example, we could export Quicktime versions of each asset as we are today

/thedeal/assets/ben/rigging/publish/v065/Turntable.mov

…but then also integrate them into a /dailies folder. The Quicktime file in this folder can then contain all information relevant to the particular asset.

thedeal/dailies/2015-08-10/ben_rigging_v065_marcus.mov
thedeal/dailies/2015-08-10/seq01_1000_animation_v011_roy.mov

The human-facing integrator would then be the one to decide which of the available Quicktime video(s) are relevant for dailies.

This way, we gain the advantage of absolute paths with no duplicity, along with fully qualified filenames where relevant. Win-win?
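The human-facing name could be derived from the same metadata that drives the code-facing publish path; a sketch (dailies_name is a hypothetical helper, the naming scheme is the one from the examples above):

```python
def dailies_name(asset_or_shot, task, version, author, ext="mov"):
    # Build a fully qualified, human-facing dailies filename such as
    # 'ben_rigging_v065_marcus.mov' from publish metadata. (Sketch.)
    return "%s_%s_v%03d_%s.%s" % (asset_or_shot, task, version, author, ext)
```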

You’re right, but that’s not exactly what I’m proposing we do.

Each family is a contract, it specifies in which shape and form content must be formatted as in order to qualify as the family, such as a model. If the content is formatted in such a manner that it conforms to the model contract, then it is a model, otherwise it is not.

If content then is picked up as a model where it should not have been, such as when located within content qualified as rig, then the contract of model is at fault and will need to be updated. For example, it could include that the content must be an assembly.

The goal is to loosen the restrictions on what has to be done in order for the pipeline for function, and at the same time gain better control and specificity into what a family of content actually is.

I can imagine people using tools that are not proprietary to manage references in a scene

That’s a good point, but we can’t realistically design a pipeline to encompass all possible third-party tools out there.

The idea is that, with a well-defined pipeline, these tools should be trivial to implement yourself. Isn’t this what a pipeline is for to begin with?

Other than that I know many artists who prefer to work with the namespaces hidden in the outliner who might be bugged by having it removed as information after the namespace.

I won’t dignify this with an answer. :smile:

Plus there are actually formats of exporting that ditch namespaces resulting in solely output of rig_GRP.

We are in control over how things are exported, this isn’t a problem.

+1

I think this duplicity of data doesn’t beat the fact that the Turntable.mov by itself still lacks even the slightest definition of what the turntable is from. You’re reducing duplicity in a filename, but end up having to duplicate the file?

You’re actually using that published file directly elsewhere, where the context of the file might not be directly visible (or even reachable!). I’m saying there are many production tools (or workflows) you will not have control over, or won’t have the time to write a workaround for.

For example Zbrush picks up the filename by default. Would we write tools for Zbrush to ease working with our pipeline? I don’t even know if there’s a way to access where an imported file came from in Zbrush except for that it takes over the file’s name. Would I even know what version was imported? And Mari?

Again not saying that all data should be in the filename, but unfortunately there’s not something that every application supports to override how a file is handled upon import.

If the pipeline allows us to retrieve information on how the filename should get formatted and we can access that method then it’s not making our code more unmanageable if the filename itself holds more data. Is it?

Ok. Clear. I think we should manage our workflow in such a way that it’s the least blocking/limiting. Otherwise every tool ever used will need to be wrapped in pipeline tools. This will become crazy to maintain if you need to add/swap the plug-ins you use, e.g. fur/cloth tools or separate sculpting tools.

Not that much if you’re trying to extract something that is currently referenced. You won’t be able to rename pre-export; the only way to fix it would be to change it post-export.

Turntable.mov is meant for code, this is the point. This file doesn’t work outside of its parent directory, and its full name does include context.

/thedeal/assets/ben/rigging/v012/Turntable.mov

Easy now, that was merely an example.

The point is to facilitate tools and humans, and there are many ways of doing this.

If a file in /dailies needed to be traced back to its origin, then you can do that with the information from the file, which should be enough? You can always guarantee it’s from the /publish directory.

Another example could be to perhaps use a symlink. The symlink would reduce disk use (optimisation) but also maintain a physical link to its origin that can be traced by tools.

If that’s not good enough, then why not include a link to its origin along with the integration?

/dailies/thedeal/dailies/2015-08-10/ben_rigging_v065_marcus/Turntable.mov
/dailies/thedeal/dailies/2015-08-10/ben_rigging_v065_marcus/Origin.lnk

Or how about a link to the published directory instead of a file?

There are many ways to solve this riddle.

How do you mean? Aren’t artists browsing directories to get to this file? Would it be possible for you to make a short recording to illustrate the problem?

This is when a file has already been imported/loaded into an application. For example during playback in Quicktime. Or when reopening a Zbrush file into which you already imported a model. Or when looking at the Maya reference editor at the files that are already loaded.

Of course, when looking solely at the file within that folder, it has context (both for humans and the computer). But once it goes into an application, there’s no guarantee you’ll be able to know what the source file was. Similarly, if you open twelve Quicktime players looking at different files, you might want to know which is older and which is newer. This same dependency on the file name exists in almost all applications. In many (or most) applications, opening a file moves it out of context, since the application reads it and loses the link to the original file, like importing a model in Zbrush.

Does that make more sense?

I will jump into this for a second even though I’m not following the development of Magenta closely. I have to agree with BigRoy on the filenames, 100%. The way I see it, including at least the basics of information directly in a file name is super trivial, and only adds benefit to tools and humans working with the files. For humans: way better readability (even in situations where you are loading the file but the UI is too small to show the full path, you certainly see the filename). For tools: extra information that you can potentially use to your benefit. I’ve had to search a filesystem before, when a file went rogue (artists are able to go around any pipeline when they don’t think), but I was able to retrieve it easily by comparing file locations to filenames. If that data didn’t match, you knew there was a problem and could deal with it.

My point being: it’s super easy to screw up the location of files even in rigid pipelines, so descriptive filenames give you a good double-check on things.
