I know you have your concerns with repairing, but would this dependency injection affect the repair methods as well?
That is a good question.
It would make sense to mirror the change to affect repair
as well.
def repair(self, context, instance):
    # perform repair
But ideally we would get on with Actions and spend time where it counts, as I think it’s pretty clear by now that that’s the way forward.
I think for the time being, as it’s not directly relevant to beginners, I’ll leave repair as-is unless it turns out to be relatively straightforward to bring it along, and save any additional work for when we move on to Actions.
Of course, pull-requests are welcome. Perhaps we could develop both simultaneously.
Any takers?
Cool, sounds good. When I have the pyblish-deadline package done I might jump onto this, can’t promise anything though :)
In-memory plug-ins, families and hosts defaults, and Collector and Integrator plug-ins have been implemented, along with #170 and #178.
To follow along
If you can, follow along and update Pyblish as development progresses. Things might break and it will be buggy, but the more tests we put it through, the faster a working version can get pushed out.
- Install Pyblish from the mottosso fork.
- Append the installation before any other version, either via PYTHONPATH or sys.path
To toggle between bleeding edge and original, simply rename the repo, or remove the installation from the path.
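For reference, prepending an installation via sys.path might look like this; the checkout location below is hypothetical, so adjust it to wherever you keep the repository.

```python
import sys

# Hypothetical path to your clone of the mottosso fork;
# adjust to wherever you checked out the repository
fork = "/path/to/pyblish"

# Prepend, so this copy wins over any other installed version
sys.path.insert(0, fork)

assert sys.path[0] == fork
```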
Changelog so far.
- Feature: In-memory plug-ins (see #140)
- Enhancement: Logic unified between pyblish.util and pyblish.cli
- Bugfix: Order now works with pyblish.util and pyblish.cli
- pyblish.util minified. For data visualisation, refer to pyblish-qml
- API: Added pyblish.api.plugins_by_instance()
- API: New defaults for `hosts` and `families` of plug-ins. (see #176)
- API: Added pyblish.api.register_plugin()
- API: Added pyblish.api.deregister_plugin()
- API: Added pyblish.api.registered_plugins()
- API: Added pyblish.api.deregister_all_plugins()
- API: Renamed pyblish.api.deregister_all -> deregister_all_paths
Otherwise, an update will be pushed to Pyblish Suite and Pyblish for Windows in a few weeks.
Updates
In-memory plug-ins
In-memory plug-ins were added mainly for testing and tutorials, but it’s got the potential to re-shape how you deploy plug-ins in your organisation. For example, you could discard physical files altogether, and register all plug-ins at run-time.
Here’s an example of how in-memory plug-ins work.
import pyblish.api
import pyblish.util

# Mock file-system and destination server
_disk = list()
_server = dict()


class SelectInstances(pyblish.api.Selector):
    def process_context(self, context):
        instance = context.create_instance(name="MyInstance")
        instance.set_data("family", "MyFamily")

        SomeData = type("SomeData", (object,), {})
        SomeData.value = "MyValue"
        instance.add(SomeData)


class ValidateInstances(pyblish.api.Validator):
    def process_instance(self, instance):
        assert instance.data("family") == "MyFamily"


class ExtractInstances(pyblish.api.Extractor):
    def process_instance(self, instance):
        for child in instance:
            _disk.append(child)


class IntegrateInstances(pyblish.api.Integrator):
    def process_instance(self, instance):
        _server["assets"] = list()

        for asset in _disk:
            asset.metadata = "123"
            _server["assets"].append(asset)


# Register all plug-ins
for plugin in (SelectInstances,
               ValidateInstances,
               ExtractInstances,
               IntegrateInstances):
    pyblish.api.register_plugin(plugin)

# Publish
pyblish.util.publish()

# Disk and server have been updated
assert _disk[0].value == "MyValue"
assert _server["assets"][0].value == "MyValue"
assert _server["assets"][0].metadata == "123"
Convenience publishing and the command-line interface
Both have seen a major overhaul in terms of logging output.
Initially, publishing via scripting was the primary means of publishing anything, so logging was essential. Nowadays, results are visualised in the GUI and less is required of publishing via scripting.
As a result, the implementation is much smaller and maintenance is simplified. As an added bonus, the order attribute now works via both scripting and the command-line.
pyblish.api.plugins_by_instance
This was added for symmetry with pyblish.api.instances_by_plugin, and merely runs pyblish.api.plugins_by_family by automatically fetching the family from the given instance. Symmetry is good.
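As a rough sketch of the idea, with a simplified stand-in for plugins_by_family (not the actual Pyblish source):

```python
def plugins_by_family(plugins, family):
    # Simplified stand-in for pyblish.api.plugins_by_family;
    # keep plug-ins supporting the given family, or any family
    return [p for p in plugins
            if family in p.families or "*" in p.families]


def plugins_by_instance(plugins, instance):
    # The convenience described above; fetch the family from
    # the instance and defer to the family-based filter
    return plugins_by_family(plugins, instance.data("family"))
```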
New defaults
For plug-ins that support any host or family, there’s now no need to specify a wildcard.
# This
class MyPlugin(...):
    families = ["*"]
    hosts = ["*"]

# Is identical to this
class MyPlugin(...):
    pass
Dependency Injection
Here’s some topics of discussion I’ve encountered while implementing this.
1. Order of execution
Currently, process_context is always processed regardless of the presence of any instances, and it’s always processed before process_instance in cases where instances are present. With DI, this behaviour is lost.
def process_context(self, context):
    print("I'm processed first, and only once")

def process_instance(self, instance):
    print("I'm processed for every available instance")
With a context of three instances, this yields.
"I'm processed first, and only once"
"I'm processed for every available instance"
"I'm processed for every available instance"
"I'm processed for every available instance"
With DI, it would look like this.
def process(self, context, instance):
    print("I'm processed for every available instance")
Which under the same scenario outputs:
"I'm processed for every available instance"
"I'm processed for every available instance"
"I'm processed for every available instance"
Possible Solution
Since initialisation can sometimes be important, one alternative is to handle it in __init__.
def __init__(self):
    print("I'm processed first, and only once")

def process(self, context, instance):
    print("I'm processed for every available instance")
Which will output results identical to the current behaviour.
"I'm processed first, and only once"
"I'm processed for every available instance"
"I'm processed for every available instance"
"I'm processed for every available instance"
In addition, handling initialisation in __init__ is more Pythonic and familiar to newcomers.
It does mean a minor but significant change in the overall behaviour: plug-ins are no longer stateless.
# Current event loop
for Plugin in Plugins:
    for instance in context:
        Plugin().process(instance)

# DI event loop
for Plugin in Plugins:
    plugin = Plugin()
    for instance in context:
        plugin.process(instance)
Not being stateless, from a development point of view, is more flexible and more powerful. But at what cost?
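To illustrate what statefulness buys you, here is a hypothetical plug-in that accumulates results across instances under the DI event loop; both the class and the mock context are made up for the example.

```python
class CountInstances(object):
    """Hypothetical plug-in; accumulates state across calls."""

    def __init__(self):
        # Runs once, before any processing
        self.processed = []

    def process(self, instance):
        self.processed.append(instance)


# The DI event loop from above, with a mock context
context = ["instanceA", "instanceB", "instanceC"]

plugin = CountInstances()
for instance in context:
    plugin.process(instance)

assert plugin.processed == ["instanceA", "instanceB", "instanceC"]
```

Under the current, stateless loop, each call would construct a fresh plug-in and the list would never grow past one item.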
Open questions
Can you think of any other way to solve this? Is ordering important to begin with? Do we need initialisation? What are the practical benefits? Costs?
2. Services
DI opens up doors for added functionality not before possible.
def process(self, context, instance):
    # The function is given the current Context and Instance
The above mimics the current behaviour, with slightly less typing and the option to exclude Context and/or Instance from the function signature where needed.
But it also means the ability to inject custom functionality.
def process(self, instance, time, user):
    print("%s was published @ %s by %s" % (instance.data("name"), time(), user))
In which time and user are injected on-demand, providing additional functionality to the plug-in; in this case, a callable time which returns the current time, and a static value user.
Furthermore, services can be registered by developers.
import datetime

import pyblish.api

pyblish.api.register_service(
    "time", lambda: datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%fZ"))
In the above, a custom service time is registered and made available to plug-ins, providing a pre-formatted version of the current time, such that every plug-in uses the same formatting and needn’t concern itself with maintaining updates to it.
Services vs. Data
Where does the line go between what is data and what is a service?
If data, added via e.g. Context.set_data(key, value), represents data shared amongst plug-ins, then services may represent shared functionality.
Though there is technically nothing preventing you from storing callables as data…
import time

context.set_data("time", lambda: time.time())
Just as there is technically nothing preventing you from providing constants as a service.
import getpass

pyblish.api.register_service("user", getpass.getuser())
It may make sense from a maintenance point of view to make the data/function separation. This way, data can be kept constant which simplifies archiving and visualisation, like passing the entire thing to a database, whereas functionality can be kept free of constants.
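One hypothetical way to enforce that separation is to require data to be JSON-serialisable (constants only, so the whole thing can go to a database) and services to be callable (functionality only). None of this is actual Pyblish behaviour; the helpers below are made up for illustration.

```python
import json


def set_data(store, key, value):
    # Data must survive a round-trip through JSON; constants only
    json.dumps(value)  # Raises TypeError for e.g. callables
    store[key] = value


def register_service(services, name, func):
    # Services must be callable; functionality only
    if not callable(func):
        raise TypeError("Service must be callable: %r" % name)
    services[name] = func


data, services = dict(), dict()
set_data(data, "family", "myFamily")
register_service(services, "time", lambda: "2015-01-01T00:00:00Z")
```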
Open questions
Are additional services something we need, or do they add complexity? When a plug-in requests a service that isn’t available, when do we throw an error?
def process(self, not_exist):
    pass
- Thrown during discovery
- Thrown during processing, e.g. in the GUI
- Silently skipped; rely on external tool for checking correctness.
# Checking correctness
$ pyblish check select_something.py
Plug-in is valid.
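For what it’s worth, such a check could be sketched at discovery-time with the inspect module, assuming some registry of injectable names; the registry and the check function below are hypothetical, not part of Pyblish.

```python
import inspect

# Hypothetical registry of names available for injection
AVAILABLE = {"context", "instance", "time", "user"}


def check(plugin):
    # Compare the signature of process() against the known
    # services, returning any names that can't be injected
    requested = set(inspect.signature(plugin.process).parameters)
    return sorted(requested - {"self"} - AVAILABLE)


class SelectSomething(object):
    def process(self, context, not_exist):
        pass


assert check(SelectSomething) == ["not_exist"]
```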
DI Transition Guide
Here are some notes of what is involved in converting your plug-ins to a dependency injection-style of working.
Note that none of this is fixed; it is all very much up for debate at the moment, so if you have any concerns or input, now is a good time.
- In cases where you have either process_context or process_instance, a simple search-and-replace to process will work fine.
- In cases where you have both, see below.
- For process() to be called, it must ask for context and/or instance. If neither is present, process() will not be called at all. See below.
- During the transition phase, the distinction is made internally by looking for the existence of a process_context or process_instance method.
- If either exists, the plug-in is deemed “old-style” and is processed using the current implementation.
- If both process and either process_context or process_instance are present, old-style wins and process will not be called.
I’ll update the list as more things come to mind. So far, updating the entire Napoleon extension took less than a minute and was a matter of a simple search-and-replace, leaving the behaviour unspoiled.
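To illustrate the search-and-replace, here is an old-style plug-in next to its converted equivalent; plain classes stand in for the actual Pyblish base classes.

```python
# Old-style plug-in
class ValidateOld(object):
    def process_instance(self, instance):
        assert instance.data("family") == "myFamily"


# New-style equivalent; only the method name changed
class ValidateNew(object):
    def process(self, instance):
        assert instance.data("family") == "myFamily"
```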
Both process_context and process_instance
The current behaviour is for process_context to be processed first, followed by process_instance. This behaviour isn’t possible any more. You can however process both in the same function.
def process(self, context, instance):
    # do things
In case you do have both, process_instance will overwrite process_context, due to your plug-in being re-written to its Dependency Injection equivalent at run-time.
def process_context(self, context):
    # I will not be called. :(

def process_instance(self, instance):
    # Runs as usual
Old-first
The reason for looking for old-style methods before new-style ones is the newly introduced ability to use __init__. In cases where __init__ is used and process is not implemented, the plug-in is still deemed new-style, as __init__ is assumed not to have been in use.
Empty process()
If neither context nor instance is present in the signature of process(), nothing happens.
I struggled to provide the ability to implement “anonymous” processes, for things that do something unrelated to either the Context or the Instance, but primarily to aid in the initial learning phase.
For example.
class MyPlugin(...):
    def process(self):
        cmds.file(exportSelected=True)
This could be a user’s first plug-in. From here, he could learn about the benefits of using Context and thereafter Instance, thereby learning why they exist in the first place. Baby-step style.
But, I just can’t for the life of me figure out how to do that in a way that makes sense.
For example, in this case.
class ValidateInstanceA(pyblish.Validator):
    families = ["familyA"]

    def process(self, instance):
        # validate the instance
It’s quite clear that when there isn’t an instance suited to this family, process should not be called.
However.
class ValidateInstanceA(pyblish.Validator):
    families = ["familyA"]

    def process(self, context):
        # validate the context
What about now? The context isn’t dependent on a family, but should always be called regardless. So clearly, process is called, even if no compatible instance is present.
Which brings us to.
class ValidateInstanceA(pyblish.Validator):
    families = ["familyA"]

    def process(self):
        # do something simple
What happens now? Should it be called?
I considered letting it run if the arguments are either empty or context is present. But that doesn’t work if other arguments are to be injected.
class ValidateInstanceA(pyblish.Validator):
    families = ["familyA"]

    def process(self, time):
        # do something simple with time
Thoughts?
Hey @mkolar, @BigRoy and @tokejepsen, I just updated the post above; would you mind having a look, specifically at the last part about whether or not to run an empty process()?
It’s a subtle but important distinction that will be difficult to change once implemented; your input would be very valuable.
I think this is where the difference between your interpretation and mine lies.
I would consider that the families attribute here is what limits it from being processed. If this would still get processed even when no compatible instance is available, that would only make it more confusing.
If someone really wanted to run just a context check no matter what the family, then the family can just be ["*"].
As you state, the context isn’t dependent on the family. That is true, but the plug-in is dependent on the family, so it should not get processed.
The other confusing bit here is the number of times things get run. With the context you would expect a single run (over the context), whereas with instances you want each individual one processed. It’s a clear distinction that might no longer be apparent with dependency injection.
1. Keep process_instance and process_context separate.
Maybe this is where we decide that both process_context() and process_instance() have their respective place (one gets called once, the other per instance). Of course they could still have the benefits of Dependency Injection for other arguments.
2. Drop the behaviour of per instance processing as a built-in method.
The other side might be to drop the behaviour of running something like process_instance(). Instead, only have it run once, always. But that might make it harder to implement behaviour per instance (especially since one error will kill it for every single Instance, and you make the plug-in developer responsible for error catching).
I think 1 could work best? It’s already proven itself that it works.
Thanks @BigRoy, really got me thinking.
I woke up this morning to another potential solution, which is to make the presence of instance determine whether to process once or per-instance. If instance is not requested, it will process once regardless.
- Process once

class ValidateInstanceA(pyblish.Validator):
    families = ["*"]

    def process(self):
        # validate the world, once

- Process per-instance

class ValidateInstanceA(pyblish.Validator):
    families = ["*"]

    def process(self, instance):
        # validate each instance
I like the sound of that.
For clarity, let me give some examples.
class ValidateInstanceA(pyblish.Validator):
    families = ["*"]

    def process(self, context):
        # validate context

This would process once per publish, regardless of instances.

class ValidateInstanceA(pyblish.Validator):
    families = ["myFamily"]

    def process(self, context):
        # validate context

Whereas this would process once per publish, and only if an instance of family "myFamily" is present.
This is rather complex and possibly confusing, but also flexible and how it works currently.
Should we keep this behaviour?
Having implemented the above and run it through the tests, it looks very good.
Currently, every discovered plug-in is processed at least once, with those requesting instance being processed once per available instance; in case there are no compatible instances, they don’t process at all.
instance then acts as a filter, enabling processing of every instance and preventing processing in cases where instances aren’t available. It’s a subtle difference, but I think it is the one that makes most sense.
It also means SimplePlugin now works as-is, without any custom code. It’s been given an order of -1, meaning it will run before anything else, but it can of course be given an order explicitly, effectively making it into an SVEC plug-in in case its order is set between 0-3.
This isn’t how it will work. It eliminated the use of plug-ins when no instances were present, like plug-ins that only operate on the context, and SimplePlugin, which doesn’t have any notion of instances.
About this, repair will also see an update to dependency injection, but I’m expecting a deprecation shortly in favour of Actions.
Your current repair_instance will continue to work fine, with the addition of being able to instead implement repair, passing it instance. As with process_*, a simple search-and-replace will suffice.
So a plug-in with a specific family will always get its process() triggered (even if none of those families is available as an instance)? In that case I think it should be clarified that family means instance_family and is only a filter for instances.
I would think it’s more convenient to always filter by family (even for an ordinary process), except for when the family is not filtered (like family = ["*"]). In that case SimplePlugin should still behave as you want, since the default is that plug-ins are unfiltered. Maybe, to clarify being unfiltered even further, the family might be None by default?
Looking forward to a draft Actions implementations, woohoo!
Yeah, that sounds like it would work. I’ll have to double check the logic…
It looks like it does work, all tests pass and your logic is sound.
Considering it’s easier to go from here and back, than it is to go from allowing everything to adding limits, I’ll leave this in for the next release. I also think it makes more sense.
Thanks for spotting this.
Ok, so the logic is essentially this:
- Asking for instance will limit your plug-in to only processing supported instances.
- Asking for instance when no instances are present, or only instances of unsupported families, means the plug-in will never get run. Not even once.
- All plug-ins process at least once, unless limited to a particular set of families.
class ValidateInEmergency(pyblish.Validator):
    families = ["emergencyFamily"]

    def process(self):
        call_police()

This plug-in will only run if an instance of emergencyFamily is present.
This looks quite clear and predictable to me.
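The three rules above could be sketched like this; a hypothetical dispatcher with instances as plain dictionaries, not the actual implementation.

```python
import inspect


def supported(Plugin, context):
    # Instances whose family the plug-in supports
    return [i for i in context
            if "*" in Plugin.families or i["family"] in Plugin.families]


def run(Plugin, context):
    plugin = Plugin()
    compatible = supported(Plugin, context)

    if "instance" in inspect.signature(plugin.process).parameters:
        # Asking for `instance` limits processing to supported
        # instances; with none available, it never runs
        for instance in compatible:
            plugin.process(instance)
    elif "*" in Plugin.families or compatible:
        # Everything else runs exactly once, unless limited to
        # families of which no instance is present
        plugin.process()


calls = []


class ValidateInEmergency(object):
    families = ["emergencyFamily"]

    def process(self):
        calls.append("emergency")


class ValidateWorld(object):
    families = ["*"]

    def process(self):
        calls.append("world")


context = [{"family": "myFamily"}]
run(ValidateInEmergency, context)  # Skipped; no emergencyFamily instance
run(ValidateWorld, context)        # Unfiltered; runs once

assert calls == ["world"]
```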
As a last-minute modification, I’m considering adding arbitrary arguments to create_instance.
# Before
instance = context.create_instance(name="MyInstance")
instance.set_data("family", "myFamily")

# After
instance = context.create_instance(name="MyInstance", family="myFamily")
Where name is the first positional, required argument, and everything after it is an arbitrary keyword argument.
def create_instance(name, **kwargs):
    instance = Instance(name)

    for key, value in kwargs.items():
        instance.set_data(key, value)

    return instance
Allowing us to do things a bit more succinctly and flexibly.
instance = context.create_instance(
    name="MyInstance",
    family="myFamily",
    data1="more",
    data2="even more")
It’s a non-breaking change, but would be difficult to turn back from. Considering 1.1 is a large adjustment already, I’d say we’d either implement it now or at 2.0.
Thoughts?
I’d go for it. Looks like it would make the code a tiny bit cleaner, which is always a good thing.
Sounds good. Since it’s a non-breaking change I think implementing it right away should be fine. +1