I’ve been meaning to give this a proper introduction at some point, but since it’s relevant to the conversation I figured I’d put it out there to give some perspective on an alternative to path parsing.
Motivation
Path parsing is ultimately about metadata: deriving information about a particular file or folder (i.e. an “asset”) from its absolute path or, in the other direction, creating paths from that information. Since paths are also structural, the metadata becomes very tightly coupled with access to the asset, that is, with its location on disk, even though the two have very little in common and needn’t be coupled at all.
In the real world, taking a fruit out of a bowl doesn’t make it any less of a fruit.
With path parsing, this is no longer true, as an asset’s identity is based on where it is. This ultimately leads to brittle tools built upon it, and to a very limited amount and type of data you may associate with any asset.
Amount and type can be worked around. A common example is to associate an absolute path with an external entity, such as a database. This way, an absolute path points to an external source where its identity, along with an arbitrary amount of binary data, lives.
But no matter what you choose to associate a path with, it doesn’t eliminate the most destructive disadvantage to tools development and asset re-use, which is that when you take the fruit out of the bowl, the fruit is no longer a fruit.
An alternative
Some time ago, I invested some research and development into solving this, and the fruits of this labour (pun!) led to a system similar to what Sony Imageworks and ILM use, based on the Unix philosophy “Everything is a file”.
In a nutshell, rather than associating identity with an absolute path, it is associated with the asset itself in the form of side-car files.
$ cd Peter
$ cquery tag .Asset
This way, no matter where it is, a fruit is always a fruit and remains a fruit even across different projects.
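To make the idea concrete, here is a minimal sketch in Python. It is not how cquery is implemented; it only assumes that a tag is an empty side-car file sitting directly inside the tagged directory, and the helper names `tag` and `has_tag` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def tag(directory, name):
    """Mark `directory` with an empty side-car file called `name`, e.g. ".Asset".

    Assumption: the marker lives directly in the directory. The real tool may
    store it elsewhere.
    """
    marker = Path(directory) / name
    marker.touch(exist_ok=True)
    return marker


def has_tag(directory, name):
    """Return True if `directory` carries the side-car file `name`."""
    return (Path(directory) / name).is_file()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        peter = root / "projectA" / "characters" / "Peter"
        peter.mkdir(parents=True)
        tag(peter, ".Asset")

        # Move the asset into another project. Nothing about the new path
        # says "asset", yet the tag travels with the directory.
        (root / "projectB").mkdir()
        moved = Path(shutil.move(str(peter), str(root / "projectB" / "Peter")))
        print(has_tag(moved, ".Asset"))
```

The point of the sketch is that identity is checked by looking inside the directory, never by inspecting its path.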
Another advantage to this approach is that paths no longer require a schema. Remember, the sole purpose of a path schema is to derive metadata from the path. Here, metadata is available at the source, which means paths are fully decoupled from it.
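As a rough illustration of what "no schema" means in practice, the sketch below finds assets by looking for the side-car file rather than by matching paths against a pattern. It again assumes a tag is an empty marker file inside the directory, and `find_tagged` is a hypothetical helper, not part of cquery.

```python
import os
import tempfile
from pathlib import Path


def find_tagged(root, name):
    """Yield every directory under `root` that contains the side-car file `name`."""
    for current, _dirs, files in os.walk(root):
        if name in files:
            yield Path(current)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Two assets at unrelated depths; no naming convention is involved.
        for relative in ("film/characters/Peter", "library/props/old/Bowl"):
            asset = root / relative
            asset.mkdir(parents=True)
            (asset / ".Asset").touch()
        (root / "film" / "scratch").mkdir()  # untagged, so not an asset

        for found in sorted(find_tagged(root, ".Asset")):
            print(found.relative_to(root))
```

No regular expression or folder convention is consulted here, so directories can be rearranged freely without breaking the query.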
I mocked up an example of this, which I refer to as a “Schemaless Directory Structure”, in a GitHub gist here:
Transitioning
Due to how different this approach is from path parsing, a transition isn’t necessarily pretty. However, I’m confident the gain is enough to justify it. If you are in a position to rewrite and can make a clean start, then getting started should be quite a lot easier than with path parsing. No schemas, no regular expressions, just tag it.
Performance
Databases are second to none when it comes to the performance of querying and filtering information, and it may seem that storing identity with assets eliminates this advantage. But there’s no reason it has to.
For example, rather than associating a database entry with an absolute path, associate the database entry with the asset.
$ cd MyAsset
$ cquery tag .0f3fvbs36nASjr
Move the fruit and its identity moves along with it, while you still reap the benefits of database performance.
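A minimal sketch of that combination, using an in-memory SQLite database. Everything here is an assumption made for illustration: the command above uses the tag name itself as the identifier, whereas this sketch stores the identifier as the contents of a hypothetical `.id` side-car file, and the `assets` table and helper names are made up.

```python
import shutil
import sqlite3
import tempfile
import uuid
from pathlib import Path

ID_FILE = ".id"  # hypothetical side-car file holding the database key


def open_db():
    """Create an in-memory database with a made-up `assets` table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE assets (id TEXT PRIMARY KEY, kind TEXT)")
    return db


def assign_id(directory, db, kind):
    """Register the asset in the database and store its key next to it."""
    asset_id = uuid.uuid4().hex
    (Path(directory) / ID_FILE).write_text(asset_id)
    db.execute("INSERT INTO assets (id, kind) VALUES (?, ?)", (asset_id, kind))
    return asset_id


def read_id(directory):
    """Read the key from the side-car file; the path itself is never parsed."""
    return (Path(directory) / ID_FILE).read_text().strip()


def lookup(directory, db):
    """Return the `kind` recorded for the asset in `directory`, or None."""
    row = db.execute(
        "SELECT kind FROM assets WHERE id = ?", (read_id(directory),)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = open_db()
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        asset = root / "MyAsset"
        asset.mkdir()
        assign_id(asset, db, "fruit")

        # Relocate the asset; the database row is found through the id
        # that moved with it, not through the old or new path.
        moved = Path(shutil.move(str(asset), str(root / "SomewhereElse")))
        print(lookup(moved, db))
```

The database keeps doing what it is good at (querying and filtering by key), while the key itself lives with the asset rather than in its location.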
Future
As I’m the only maintainer of the project and my time is spent mostly on Pyblish, development has been slow. But I believe it’s a clear winner among future asset identification techniques, and I am frankly astounded by how large, established brands get by relying solely on path parsing, and there are several out there. I’ve gotten several level-ups since the project got started and intend to continue what I’ve started when the time is right.