Cleaning up extracted files

Another “best practices” question: are people deleting the extracted transient files after integration is complete? Is there any method in Pyblish to handle this sort of clean-up in situations where extraction or integration fails?

Don’t think we have discussed a best practice on this.

Personally I extract files to a temporary local directory, which I then delete as a last step in integration (optional of course), with try/except.

Great question, @morganloomis!

I’ve found the approach you describe to work best, personally, but I think it varies amongst developers.

I think of clean-up generally as a two-step process.

  1. Gently clean up locally created files, accept failures.
  2. Use a scheduled “clean-up crew”, such as a background daemon.

The locally created files I typically clean up as you describe. If you store the location of where each extractor extracts into your context, then you can assign a dedicated integrator as a last step in the pipeline to wipe those.

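import shutil
import pyblish.api


class CleanupIntegrator(pyblish.api.ContextPlugin):
    """Wipe the shared temporary directory once integration is done."""

    # Offset so this runs after the regular integrators
    order = pyblish.api.IntegratorOrder + 0.2

    def process(self, context):
        # Nothing to do if no extractor registered a temporary directory
        tempdir = context.data.get("tempdir")
        if tempdir:
            # Accept failures; the clean-up crew can pick up leftovers
            shutil.rmtree(tempdir, ignore_errors=True)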
The 0.2 means it’ll run after the normal integration has taken place. Any plug-in may be offset like this, down to and including -0.5 and up to but not including 0.5. Go beyond that and you are stepping into the previous/next CVEI (Collect, Validate, Extract, Integrate) step. Integration happens to be the last stage, so it’s safe to go beyond, but in general it’s good to keep things consistent.
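To make the offset rule concrete, here is a small stand-alone sketch that maps an order value to the stage it lands in. The helper stage_for_order is hypothetical, and the base values 0 to 3 are my understanding of pyblish.api.CollectorOrder through IntegratorOrder, so double-check them against your version.

```python
# Base orders of the four CVEI stages. Assumed to mirror
# pyblish.api.CollectorOrder .. IntegratorOrder; verify before relying on it.
STAGES = {
    "Collect": 0,
    "Validate": 1,
    "Extract": 2,
    "Integrate": 3,
}


def stage_for_order(order):
    """Return the stage whose range [base - 0.5, base + 0.5) contains `order`."""
    for name, base in STAGES.items():
        if base - 0.5 <= order < base + 0.5:
            return name
    return None  # outside every stage's range in this sketch


cleanup_stage = stage_for_order(3 + 0.2)   # "Integrate"
lower_bound = stage_for_order(2 - 0.5)     # "Extract": -0.5 is included
upper_bound = stage_for_order(2 + 0.5)     # "Integrate": +0.5 spills into the next stage
```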

By keeping the clean-up integrator simple, and separating it out at the end, it’s guaranteed to be run. Then it’s up to you to ensure that it doesn’t actually fail either. In those cases, it’s better to be forgiving than strict, as deleting files later can sometimes be preferable to having to recover from an error right away.
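One way to be forgiving is to wrap the removal in try/except and log instead of raise. This is just a sketch with the standard library; remove_quietly and the "cleanup" logger name are placeholders of mine.

```python
import logging
import os
import shutil
import tempfile

log = logging.getLogger("cleanup")  # placeholder logger name


def remove_quietly(path):
    """Try to delete `path`; return True on success, False otherwise.

    Failures are logged rather than raised, so a locked or missing
    directory never turns a successful publish into a failed one.
    """
    try:
        shutil.rmtree(path)
    except OSError as exc:
        log.warning("Could not remove %s, leaving it for later: %s", path, exc)
        return False
    return True


tempdir = tempfile.mkdtemp()
removed = remove_quietly(tempdir)        # directory existed and is now gone
removed_again = remove_quietly(tempdir)  # already gone; logged, not raised
```

Inside a plug-in you would call something like this from process() with the path stored in the context, and anything it leaves behind becomes a job for the clean-up crew.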

The clean-up crew is somewhat outside the scope of Pyblish, but is important to keep in mind when designing your clean-up integrator. In a nutshell, it could delete files after a certain lifetime (say, 1 month) and/or based on some other metric, such as whether there are 10 versions when only 9 are allowed to co-exist. Such a daemon could crawl common temporary directories and ensure nothing remains.
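A bare-bones version of such a crew might look like the sketch below, meant to be run on a schedule (cron or similar). Everything here is assumed for the sake of example: the function names, the "v001"-style version folders, and the 30 day / 9 version limits.

```python
import os
import shutil
import tempfile
import time

DAY = 24 * 60 * 60  # seconds


def find_stale(root, max_age_days=30):
    """Return sub-directories of `root` not modified for `max_age_days`."""
    cutoff = time.time() - max_age_days * DAY
    return sorted(
        entry.path for entry in os.scandir(root)
        if entry.is_dir(follow_symlinks=False)
        and entry.stat(follow_symlinks=False).st_mtime < cutoff
    )


def find_excess_versions(root, keep=9):
    """Return all but the `keep` newest version directories under `root`.

    Assumes names like "v001", "v002", ... that sort oldest-first.
    """
    versions = sorted(
        entry.path for entry in os.scandir(root)
        if entry.is_dir(follow_symlinks=False) and entry.name.startswith("v")
    )
    return versions[:-keep] if keep > 0 else versions


def sweep(root, max_age_days=30, keep=9):
    """Delete stale and excess directories; return the paths targeted."""
    doomed = set(find_stale(root, max_age_days))
    doomed |= set(find_excess_versions(root, keep))
    for path in doomed:
        shutil.rmtree(path, ignore_errors=True)  # forgiving here as well
    return sorted(doomed)


# Small demo: ten fresh versions plus one scratch folder aged 40 days.
root = tempfile.mkdtemp()
for number in range(1, 11):
    os.mkdir(os.path.join(root, "v%03d" % number))
scratch = os.path.join(root, "scratch")
os.mkdir(scratch)
aged = time.time() - 40 * DAY
os.utime(scratch, (aged, aged))

removed = sweep(root)  # targets "scratch" (too old) and "v001" (tenth version)
```

In practice you would point sweep at whichever temporary directories your extractors use, and probably log what it removes.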

Cool, that all sounds good, thanks!