Wishlist: A Metamodel Partition in Neo4j Graph Database

I had many reasons for exploring the embedded metamodel subgraph design pattern for Neo4j databases in the first series of FactMiners ecosystem design documents.

My interest is grounded in my experience in the 1990s developing a pair of complementary Distributed Smalltalk frameworks to do what we called "executable business models." The basic idea was that if we came up with a super-elegant metamodel about how to do business processes AND STUCK TO IT no matter what on the server/executing-model side of things, then the Desktop Visualization framework could dynamically generate what customers/users mistook for "applications."

At the time, this required us to do something that was considered heretical in the OOP community, which was to explicitly objectify BusinessProcess and similar "non-object" objects. While the OOP purists pooh-poohed what we were doing, we found INCREDIBLE leverage in design-to-implementation and stakeholder buy-in (what we built made sense to them... it was their mental model embodied in software).

We can bring this kind of leverage to graph databases, and I hope Neo4j will be the pioneer in this, by coming up with a community standard for an embedded metamodel subgraph feature. By adopting the general embedded metamodel subgraph pattern and defining a core structure for the general semantics of a metamodel in a graph database, we would have a common mechanism on which third-party product/service developers could hang their tool-specific "decorations"/hints/config-specs/whatever. With some kind of lightweight mechanism for registering a property namespace in the META partition (or whatever this subgraph is called), third-party devs would be able to work together more efficiently to provide interoperability and other pipeline-type features that are all good for the Neo4j ecosystem.
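To make the namespace idea a bit more concrete, here is a minimal in-memory sketch in Python. It is not a Neo4j API and not a proposed standard: `MetaPartition`, `MetaNode`, `register_namespace`, `decorate`, and the `namespace.key` property naming are all hypothetical names invented for illustration. The sketch only shows how registered namespaces could keep each tool's decorations from colliding on a shared metamodel node.

```python
from dataclasses import dataclass, field


@dataclass
class MetaNode:
    """One node of the hypothetical META partition, describing a domain label."""
    describes: str                                   # domain label, e.g. "Person"
    properties: dict = field(default_factory=dict)   # namespaced decorations


class MetaPartition:
    """Toy stand-in for an embedded metamodel subgraph (assumption, not a Neo4j feature)."""

    def __init__(self):
        self.namespaces = {}   # namespace -> owning tool
        self.nodes = {}        # domain label -> MetaNode

    def register_namespace(self, namespace, owner):
        """Claim a property namespace; refuse if another owner already holds it."""
        current = self.namespaces.get(namespace)
        if current is not None and current != owner:
            raise ValueError(f"namespace {namespace!r} already registered by {current!r}")
        self.namespaces[namespace] = owner

    def decorate(self, label, namespace, key, value):
        """Attach a tool-specific hint to the metamodel node for `label`."""
        if namespace not in self.namespaces:
            raise KeyError(f"namespace {namespace!r} is not registered")
        node = self.nodes.setdefault(label, MetaNode(label))
        node.properties[f"{namespace}.{key}"] = value

    def decorations_for(self, label, namespace):
        """Return only the hints that one namespace attached to `label`."""
        node = self.nodes.get(label)
        if node is None:
            return {}
        prefix = namespace + "."
        return {k[len(prefix):]: v
                for k, v in node.properties.items() if k.startswith(prefix)}


meta = MetaPartition()
meta.register_namespace("structr", owner="Structr")
meta.register_namespace("keylines", owner="KeyLines")
meta.decorate("Person", "structr", "view", "card")
meta.decorate("Person", "keylines", "color", "blue")
```

In a real database the same idea would presumably live in node properties inside the META subgraph; the point here is only that a tool reads back its own namespace and can see, but need not understand, everyone else's.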

While nothing technically would need to be done in Neo4j core to create and use such a design pattern, a couple of things would help:

  • As part of its "schema-like support" Neo4j could support an OPTIONAL configuration directive to name the label of an embedded non-connected subgraph, e.g. META.

    While our interest is in using this for an embedded metamodel, there doesn't need to be any requirement on its use other than that it is a "database within a database" that can be systematically IGNORED in Cypher queries unless explicitly referenced. (I am not sure of the technical feasibility of this systematic exclusion. But even without it, I believe the pattern can be used without too much chance of 'result contamination', provided best practices for naming conventions are followed, etc.)

  • While not strictly required and not limited in usefulness to metamodel modeling, I would like to see some kind of optional PATH SEMANTIC for LABELS to express subset containment and not just subset membership. E.g., (foo:Man.Chu) is different from (foo:Man:Chu) in that in the first, Chu is a subset of Man, whereas the second only says that foo is a member of both Man and Chu.
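Until something like the first bullet exists, the "ignore unless explicitly referenced" behavior can be approximated by convention. The Python sketch below builds a Cypher MATCH clause that filters out nodes carrying the partition label. The helper name `domain_match` and the `META` label are assumptions for illustration; the only Cypher it relies on is the ordinary label predicate in a WHERE clause.

```python
META_LABEL = "META"  # assumed name of the metamodel partition label


def domain_match(variable="n", label=None, meta_label=META_LABEL):
    """Build a MATCH clause that skips nodes carrying the metamodel label."""
    # Only plain identifiers are accepted, so nothing unexpected can be
    # spliced into the generated query text.
    for name in (variable, meta_label) + ((label,) if label else ()):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    pattern = f"({variable}:{label})" if label else f"({variable})"
    return f"MATCH {pattern} WHERE NOT {variable}:{meta_label}"


query = domain_match("p", "Person") + " RETURN p.name"
print(query)  # MATCH (p:Person) WHERE NOT p:META RETURN p.name
```

This is exactly the kind of boilerplate a configuration directive could make unnecessary: every query author has to remember the filter, which is where the risk of result contamination comes from.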

Subset containment would be EXTREMELY helpful for organizing a metamodel as explored in our first FactMiners GraphGists. The best solution for adding a label path semantic is one that would provide a pattern-matching syntax in Cypher so users would not need to resort to regexes, etc. when doing metamodel discovery, interpretation, and application or visualization configuration, etc.
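To illustrate the containment semantic I have in mind, here is a small Python sketch that treats a dotted label such as Man.Chu as a path and tests containment segment by segment. This is an emulation on the client side, not anything Neo4j or Cypher provides; the function names `is_contained_in` and `labels_under` are made up for the example.

```python
def is_contained_in(label, ancestor):
    """True if `label` equals `ancestor` or sits below it in the dotted path."""
    segments = label.split(".")
    ancestor_segments = ancestor.split(".")
    # Compare whole segments so that "Manual" is not mistaken for a child of "Man".
    return segments[:len(ancestor_segments)] == ancestor_segments


def labels_under(labels, ancestor):
    """Filter a collection of labels down to those contained in `ancestor`."""
    return [label for label in labels if is_contained_in(label, ancestor)]


all_labels = ["Man", "Man.Chu", "Chu", "Manual", "Man.Chu.Fu"]
print(labels_under(all_labels, "Man"))      # ['Man', 'Man.Chu', 'Man.Chu.Fu']
print(labels_under(all_labels, "Man.Chu"))  # ['Man.Chu', 'Man.Chu.Fu']
```

Comparing whole segments rather than string prefixes is the part a naive regex tends to get wrong, which is one reason a built-in pattern-matching syntax would be preferable to everyone rolling their own.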

I can certainly see how this design pattern could be used to provide a "freeze-dry" mechanism for in-graphdb storage and transmission of Structr-specific info. I would love it if, when you load a new Neo4j database into Structr, it finds a META partition and grabs all the Structr-specific info the database creator/updater provides. If Structr, upon examining the metamodel, finds KeyLines "decorations", for example, and the current Structr user has KeyLines installed, Structr/CambridgeIntelligence can work out what that combination means and provide tool-2-tool configuration... again, to be stored in the shared embedded metamodel once that collaboration is determined.

I am looking at this strictly as a wish list from an "itch-scratching" developer. I do not know the particulars of what it would take to implement such an optional ignorable subgraph partition. I just know what I'd do with one if we had it. :-)

Thoughts?