Pyblish Magenta


Let’s give it a go. Is this something you could refactor when you have the time?

And thanks for putting the thought into this.


No need to refactor, what I’m suggesting is something like this.


from .utils.maya.context import (

They would be a form of “aliases” to their implementations. It means we can shuffle the internals around, without worrying about breaking the interface everyone depends on.
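The alias idea can be illustrated outside of Maya entirely. A minimal sketch, simulated with in-memory modules (in the real package these would be files like `pyblish_magenta/utils/maya/context.py` and `pyblish_magenta/api.py`; the function name is hypothetical):

```python
# Sketch of the "alias" idea: a public module re-exports names from
# internal modules, so internals can be reshuffled freely while the
# public names stay put. Simulated with in-memory modules here.
import types

# Internal implementation module (free to be reorganised at any time)
context = types.ModuleType("pyblish_magenta.utils.maya.context")

def maintained_selection():
    """Hypothetical helper living deep inside the package."""
    return "selection maintained"

context.maintained_selection = maintained_selection

# Public interface module: a flat set of aliases others depend on
api = types.ModuleType("pyblish_magenta.api")
api.maintained_selection = context.maintained_selection

# Callers only ever touch the alias; moving the implementation to a
# different internal module only means updating the alias line above.
print(api.maintained_selection())
```

The point is that `api` contains no logic of its own, only references, so refactoring the internals never touches the names users import.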


Ah in that case we keep the organization we have in place. I misunderstood before. Seems like a good way forward, let’s do it.


Exactly, the key is to allow ourselves any organisation internally, to what makes the most sense during development, but then to also provide a consistent interface for others that won’t break when we refactor.

Have a look at pyblish.api for reference, it’s along those lines I’m thinking.


Should we also set up a pyblish_magenta.api then? This way we keep it out of

@Mahmoodreza_Aarabi A working selector for the rig is in place so I hope it becomes trivial to implement some Validators for your rig. I know you had some great ideas and I’d love to see them implemented. Are you up for it?

Also… we had an animator on board who wanted to do some testing with the system right? How can we make it simple for him/her to hop in and get playing with it including running the animations through Pyblish?


That’s a good question; it would create symmetry with how pyblish.api is used. But the only reason Pyblish’s API isn’t exposed at the package root is the configuration initialisation that happens when importing api. It was important that this didn’t happen internally within the library, only when used from the outside.

That being said, there is comfort (in my opinion) in having a dedicated module for the API to separate responsibilities; it also makes it easier to one day split it fully into its own repo when/if things get too complex (with backwards/forwards compatibility, for example).

Finally, most other libraries expose their APIs this way, so in that sense doing the same would be familiar.

Not sure which one is better, I’ll leave the decision to you as it’s your baby.


Let’s use pyblish_magenta.api so people know it’s a public facing API.

We’re in the early stages, and we can assume that for some time to come everything used from Magenta will still live in the Magenta repository, so refactoring in a month or so shouldn’t do any harm. Let’s use that time to get a feeling for whether this is the right way forward.


With the latest commits to Pyblish Magenta it’s becoming easier and easier to hop in and contribute. We’re working hard towards having newcomers get access to The Deal Dropbox folder and actively contributing within a matter of minutes. To get to that point we’ll need to get some testers on board and set up a quick 5-minute how-to tutorial in the wiki; see Setting up “Getting Started” documentation (#26).

A draft implementation of versioning is in place (and working!), so there’s no better time to jump in and start contributing the simplest of models right away. Here’s a quick publish of the cup asset:

Of course I’d love to get everyone involved in discussing versioning, integration and other possible validators. If you think something is essential for a pipeline in terms of API or Validators/Extractors, hop onto GitHub and set up an issue. We’d be glad to take your ideas and work towards them within Magenta!


I’ve pushed an update to my fork of pyblish_magenta that adds collecting animations and extracting an Alembic from them; this is the relevant commit.

Consider it a quick draft (it’s currently hardcoded to always take frame 1 to 10; just for testing).

Collect Animation

I wanted to have a quick discussion for two opposing implementations of collecting what needs to be extracted for an animation scene; or any scene really!

Collect whatever is visible!

Currently the Collector for animation (from the commit) retrieves all visible geometry and non-default cameras and considers that worth extracting. No need to have an artist set up publish sets. Basically the animator’s scene (visually) represents the outgoing data. Somewhat WYSIWYG.


Pros:

  • The published content is similar to what we see in the animated scene.
  • No need to define what gets published, since it’s basically filtering the normal scene contents down to the relevant nodes.


Cons:

  • That the published content mirrors the animated scene has its clear downsides:
    • The animator might be working with a proxy version of the rig, but should publish the high-res version.
  • The code to define what is relevant for extraction gets complex quickly (for example, looking up whether a node is visible at least once).
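To illustrate that last point: even before animated visibility enters the picture, deciding whether a node is effectively visible means walking every ancestor. A minimal sketch over a toy scene graph (node names and structure are invented; in Maya this would query attributes per node, and per frame):

```python
# Effective visibility: a node is only visible when it and every
# ancestor are visible. A toy stand-in for what the Collector has to
# do per node; with animated visibility it also becomes per frame.
scene = {
    # name: (parent, visible)
    "|char":               (None, True),
    "|char|geo":           ("|char", True),
    "|char|geo|body":      ("|char|geo", True),
    "|char|proxy":         ("|char", False),
    "|char|proxy|body_lo": ("|char|proxy", True),  # hidden via parent
}

def is_effectively_visible(node):
    while node is not None:
        parent, visible = scene[node]
        if not visible:
            return False
        node = parent
    return True

visible_nodes = [n for n in scene if is_effectively_visible(n)]
print(visible_nodes)
```

Even this simplified version hides a subtlety: `|char|proxy|body_lo` is visible on its own, yet must be excluded because a parent is hidden.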

Or take control over what gets extracted!

Another approach is to have the artist define what is relevant for extraction, for example creating a publish_animation objectSet that contains all relevant nodes. The downside is that it becomes prone to human error; what happens if the animator adds ‘wrong’ objects? (And will we be able to validate?)


Pros:

  • There could be multiple objectSets in a scene to separate certain extractions, possibly each with different settings.
  • Additional settings for the publish can be set on the node in the scene. For example, it could have an attribute defining the start and end frame of the animation, so that time range is used even if the current timeline differs.
  • With the artist more in control (at a more granular level) it’s easier to perform quick hacks for complex projects that might require a workaround (e.g. separating a super-high-res mesh for Alembic extraction, separating an unsupported mesh format into its own extraction, or keeping a procedural system as a procedural network for optimisation).


Cons:

  • What happens if the objectSet contains both the proxy mesh and the hi-res mesh of a rig? How do we identify whether what the user did is the correct data to export?
  • Related to Alembic: I assume we want to use writeVisibility so the animator remains in control of animating visibilities. But if the high-res mesh of a rig was hidden (e.g. when working with the proxy rig) it would get published as hidden geometry.
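Whether the first con is even validatable depends on conventions. One cheap approach, sketched here under an assumed naming convention (the `_lo`/`_hi` suffixes are made up, not an existing Magenta rule), is a Validator that flags when both resolutions of the same mesh land in the publish set:

```python
# Hypothetical validation: flag members of a publish set where both
# the proxy ("_lo") and hi-res ("_hi") variant of the same mesh are
# present. The suffix convention is an assumption for illustration.
def conflicting_resolutions(members):
    basenames = {}
    for name in members:
        for suffix in ("_lo", "_hi"):
            if name.endswith(suffix):
                basenames.setdefault(name[:-len(suffix)], set()).add(suffix)
    return [base for base, suffixes in basenames.items()
            if {"_lo", "_hi"} <= suffixes]

members = ["body_lo", "body_hi", "hat_hi"]
print(conflicting_resolutions(members))  # both resolutions of "body"
```

A Validator like this can only ever be as good as the convention behind it, which is exactly the two-way problem discussed below.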

@marcus, @mkolar, @Mahmoodreza_Aarabi I think this is definitely something you can all contribute to, based on previous experience from productions?



  1. The publish needs to be reproducible without human intervention
  2. and the artist must be able to take some form of responsibility.

There’s an implementation of an Alembic extractor in the Napoleon package here; I remember there were a few things that needed solving that got solved there. Not sure if you have had similar experiences, but maybe it can help.

You define two objectSets.

I’d assume this depends on your renderer and what it does with it, but in general I’ve rarely had to consider visibility on anything to be animatable. There’s just no realistic equivalent to what that would look like in real life.


The Extractor that I implemented was based on that one (have a look at the code, it’s pretty similar), but I added some of the missing arguments (still need to add all the relevant ones to the docstring) as data that can be overridden. I also removed the dependency on evaluating a MEL command and used the Python version of AbcExport directly.
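For reference, the Python route still boils down to building the same job string MEL would receive and handing it to `maya.cmds.AbcExport(j=job)`. The job-string assembly itself needs no Maya and can be sketched on its own (flag names follow AbcExport’s documented `-j` syntax; the paths and roots are examples):

```python
# Assemble an AbcExport job string from override data. Inside Maya
# this would be passed as cmds.AbcExport(j=job); maya.cmds is not
# imported here so the sketch stays runnable outside Maya.
def build_abc_job(roots, path, start, end, write_visibility=True):
    flags = ["-frameRange", str(start), str(end)]
    if write_visibility:
        flags.append("-writeVisibility")
    for root in roots:
        flags.extend(["-root", root])
    flags.extend(["-file", path])
    return " ".join(flags)

job = build_abc_job(["|char|geo"], "/tmp/char.abc", 1, 10)
print(job)
```

Keeping the job assembly as a plain function also makes the hardcoded 1-10 frame range trivial to replace with per-instance data later.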

This is true for both options. Since the scene is analyzed to define the extraction the same scene will result in the same extraction, it is reproducible.

The hurdle here is working with freelancers, interns or anyone who’s new or short-term on a project. The responsibility they have to take to prepare a scene in a certain way makes it a two-way problem: one side has to explain and ensure they know how to proceed, and the other has to learn, adapt and actually perform this step.


The objectSet isn’t defined by the animators, but by the rigger. He knows what is meant to be cached and can make the appropriate sets. They can then be updated from a single source, and automatically re-cached/published where needed.


Same problem different place?

But I can see how this does reduce the amount of ‘set defining’ that needs to be done.

Need to think about this some more to see what’s a feasible solution, weighing the pros and cons correctly. Something we do regularly is reduce the cached content to what is required for a scene to function. For example, a table’s legs could be removed if the whole shot takes place on top of the table. (And then consider the table to be some huge high-res building consisting of multiple floors of high-res geometry, so it’s worth hiding it.) Whereas the modeler/rigger might have considered the whole building to be important for caching!

Again, it’s a level of granular control we currently have which has had its clear upsides… like when you need to reduce the Alembic’s size from 50 GB because a whole hi-resolution set gets shattered. There are likely other solutions we haven’t thought of or tried, since we already had this ‘workaround’ working.

Either way I think having sets somewhere is a solution for this. In theory the set could even be created and filled with contents by a script that follows the same rules the Collector is currently doing in this draft implementation. This keeps both options open during production.


Well, same problem, but one time as opposed to once every shot.

It will have to get defined at one point or another; I’m not seeing how an animator hiding parts of the scene - which he might do for his own sake as an optimisation - would better guarantee any form of consistency.

That’s an optimisation, and a premature one at best. Off the top of my head, with Alembic you can selectively import only part of a file, and exclude the table legs from there. It’s part of its core functionality.


Hmm… not entirely a solution, I suppose. The fact is you want to reduce the size of the Alembic file to speed up the reading process, right? Whether it’s loaded deferred by the renderer or loaded into a scene for playback/lighting, you want to be able to scrub through the frames to some extent. Alembic isn’t a one-time import; it keeps the connection with the source file and reads from it as you scrub through time. It’s more a reference of sorts, at least in Maya it is (for moving geometry).

Also, you can filter by name with AbcImport in Maya, yes. Not sure how it works with hierarchies.

Either way… I’m likely solving the extremes I know from production with the quick workarounds that worked at the time. I think when times become critical we’ll find others… just means we need to push ourselves to the limits for testing.

@marcus How would you propose retrieving the sets and doing the Alembic extraction? Would you extract every single element into its own Alembic? If so, why? An even better question: how would you prefer to extract an animation scene so that everything the other department (e.g. lighting) needs comes out? How would they recreate the scene?

And if so, wouldn’t it be better to set up the Extractor so that it processes the context as opposed to individual instances, extracting all instances in a single run by passing multiple jobs to the AbcExport command?
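Since AbcExport accepts multiple `-j` jobs in one invocation, a context-level extractor could gather one job per instance and fire a single export. The bookkeeping side of that idea needs no Maya and can be sketched as follows (instance data and paths are invented; inside Maya the final step would be something like `cmds.AbcExport(j=jobs)`):

```python
# Context-level batching sketch: build one job string per instance,
# then hand the whole list to a single AbcExport call. The instance
# data below is made up for the example.
instances = [
    {"roots": ["|ben01"], "path": "/tmp/ben01.abc"},
    {"roots": ["|table01"], "path": "/tmp/table01.abc"},
]

def build_job(instance, start=1, end=10):
    parts = ["-frameRange", str(start), str(end)]
    for root in instance["roots"]:
        parts += ["-root", root]
    parts += ["-file", instance["path"]]
    return " ".join(parts)

jobs = [build_job(i) for i in instances]
print(len(jobs))  # one job per instance, a single export run
```

Whether a single run is actually faster than per-instance exports would need measuring; the scene only has to be evaluated once per frame, which is the usual argument for batching.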


There’s a great talk about Alembic and how it works internally, but in short: no, reading performance is not affected by its size.

After the 14:00 mark.

Alembic is random-access, meaning it reads only the part of the file that it needs, as opposed to other types of files, such as .rar in which case the whole file is loaded and decompressed before anything can be read.

I’ll get back to you on that.


Heard/read that before, thanks for confirming. Really need to test more why Maya is so darn slow with the bigger caches then!

Perfect. I’ll push an update to my fork in a bit that loads an objectSet by name and such. I guess it’s a better starting point either way.


Sounds good.

I’d make it a standardised name per objectSet, such as controls_SEL and pointcache_SEL for animation controls and pointcache geometry respectively. Every rig can have these.
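A Validator for that convention could be as simple as checking each rig’s namespace for the expected set names. A sketch with a fake list of scene sets (in Maya you’d query `cmds.ls(type="objectSet")` instead; the scene contents are invented):

```python
# Check that every rig namespace carries the standardised objectSets,
# following the controls_SEL / pointcache_SEL convention suggested
# above. The scene contents here are made up for the example.
REQUIRED_SETS = ("controls_SEL", "pointcache_SEL")

scene_sets = ["ben01:controls_SEL", "ben01:pointcache_SEL",
              "table01:pointcache_SEL"]

def missing_sets(namespace, existing=scene_sets):
    return [s for s in REQUIRED_SETS
            if "%s:%s" % (namespace, s) not in existing]

print(missing_sets("ben01"))    # []
print(missing_sets("table01"))  # ['controls_SEL']
```

Because the rigger authors the sets once per rig, this check runs against every referenced copy for free, which is the "one time as opposed to once every shot" argument from earlier in the thread.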


When you say “element” do you mean every node/mesh? I’d include an entire Instance in the Alembic, I think this is what it’s made for. It does a lot of de-duplication and optimisations internally where meshes are similar or non-deforming, it’d be a shame not to take advantage of this, especially when caching sets.

I’d try and extract every Instance separately, and import them separately. Didn’t we talk about “recipes” earlier? About specifying how assets belong together as JSON or similar? This applies here as well. The same recipe used to initially build a scene with rigs can be used to build one with cached assets. The same tools apply, the same initialisation.


Ben01: Ben
Ben02: Ben
Table01: Table
Chair01: Chair
Chair02: Chair

The source files for these can be relative to the shot itself, from its own published assets.
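The recipe above is just a mapping of instance name to asset, so consuming it is straightforward. A sketch that reads it as JSON and resolves a per-instance cache path relative to the shot (the JSON shape and path layout are assumptions, not an existing Magenta format):

```python
# Parse a shot "recipe" mapping instances to source assets, then
# derive where each instance's Alembic would live relative to the
# shot. The JSON layout and path scheme are hypothetical.
import json

recipe_json = """{
    "Ben01": "Ben",
    "Ben02": "Ben",
    "Table01": "Table",
    "Chair01": "Chair",
    "Chair02": "Chair"
}"""

recipe = json.loads(recipe_json)

def cache_path(instance, shot="shot010"):
    # One Alembic per instance, published under the shot itself
    return "%s/caches/%s.abc" % (shot, instance)

for instance, asset in sorted(recipe.items()):
    print(instance, asset, cache_path(instance))
```

The same mapping can drive both scene assembly (reference `Ben` twice, as `Ben01` and `Ben02`) and cache re-assembly, which is the symmetry marcus describes.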



No, more like every single Asset. For example 10 characters or buildings. I think you just mentioned them as Instances.

Just checking… you would easily get tons of instances in a single scene, correct? Roughly similar to the number of references you would have in a scene?

So they would process after each other into a cache?

More and more I feel like we need to have a working pipeline API before we can proceed and manage things like this within Magenta.

Here’s the new CollectAnimation that uses an objectSet.