Feature: Rel tables with only forward directed storage #4320

Open
2 of 10 tasks
ray6080 opened this issue Sep 30, 2024 · 1 comment · May be fixed by #4705
Assignees
Labels
feature New features or missing components of existing features high-priority

Comments

@ray6080
Contributor

ray6080 commented Sep 30, 2024

Description

Currently, every rel table stores its tuples twice: each tuple is written to both the forward- and the backward-directed storage. This gives the planner the flexibility to pick plans that scan in either direction. The downsides are: 1) storage space overhead; 2) extra work on every copy/insert/update/delete.

In some cases only forward or only backward scans are needed, so there is no need to duplicate the storage for both directions:

  1. advanced users know that a rel table will only ever be scanned in one direction;
  2. the full text search index always scans its rel tables in one direction.
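
To make the overhead concrete, here is a toy sketch (not Kuzu's actual storage layout) of an adjacency-list rel table where the backward copy is optional. `ToyRelTable` and `store_bwd` are hypothetical names used only for illustration.

```python
from collections import defaultdict


class ToyRelTable:
    """Toy adjacency-list rel table; a stand-in, not Kuzu's real storage."""

    def __init__(self, store_bwd=True):
        self.fwd = defaultdict(list)  # src -> [dst, ...]
        # Backward lists exist only when both directions are stored.
        self.bwd = defaultdict(list) if store_bwd else None

    def insert(self, src, dst):
        self.fwd[src].append(dst)
        if self.bwd is not None:
            self.bwd[dst].append(src)  # the duplicated copy of the tuple

    def num_stored_entries(self):
        n = sum(len(dsts) for dsts in self.fwd.values())
        if self.bwd is not None:
            n += sum(len(srcs) for srcs in self.bwd.values())
        return n


edges = [(1, 2), (1, 3), (2, 3)]
both = ToyRelTable(store_bwd=True)
fwd_only = ToyRelTable(store_bwd=False)
for src, dst in edges:
    both.insert(src, dst)
    fwd_only.insert(src, dst)
```

In this toy model the forward-only table holds one entry per rel, while the duplicated table holds two, and every insert touches two lists instead of one.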

Syntax changes

// We introduce `WITH` to specify storage options.
CREATE REL TABLE Follows (FROM User TO User, since DATE) WITH (storage_direction = 'fwd');

or 

// We embed the storage option in schema definition.
CREATE REL TABLE Follows (FROM User TO User, since DATE, storage_direction = 'fwd');
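
For the first (`WITH`) variant, a rough sketch of how the option could be pulled out of a statement and validated before it reaches the catalog. This is a regex toy, not the real grammar; `parse_storage_direction` is a hypothetical helper, and the values `'bwd'` and `'both'` (and `'both'` as the default) are assumptions, since only `'fwd'` appears in the proposal.

```python
import re

# Only 'fwd' appears in the proposal; 'bwd' and 'both' are assumed here.
ALLOWED_DIRECTIONS = {"fwd", "bwd", "both"}

# Matches a trailing "WITH (...)" clause, optionally followed by ';'.
_WITH_CLAUSE = re.compile(r"\bWITH\s*\(([^)]*)\)\s*;?\s*$", re.IGNORECASE)
# Matches a single "key = 'value'" option.
_OPTION = re.compile(r"\s*(\w+)\s*=\s*'([^']*)'\s*")


def parse_storage_direction(statement, default="both"):
    """Return the storage_direction option of a CREATE REL TABLE statement."""
    match = _WITH_CLAUSE.search(statement)
    if match is None:
        return default  # no WITH clause: keep today's behaviour (assumed)
    options = {}
    for part in match.group(1).split(","):
        opt = _OPTION.fullmatch(part)
        if opt is None:
            raise ValueError(f"malformed option: {part.strip()!r}")
        options[opt.group(1).lower()] = opt.group(2)
    direction = options.get("storage_direction", default)
    if direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"unknown storage_direction: {direction!r}")
    return direction
```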

Storage and Operator changes

  1. The partitioner should be aware of the storage direction so that it avoids duplicating partitions in both directions when storage_direction = 'fwd' is specified.
  2. bwdIndex in LocalRelTable should be optional.
  3. bwdRelTableData in RelTable should also be optional.
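
A minimal sketch of the partitioner change in item 1, assuming rels are `(src, dst)` pairs and that partitions are keyed by the bound node of each direction. `partition_rels` is a hypothetical function, and the modulo bucketing is a simplification rather than how Kuzu actually assigns partitions.

```python
def partition_rels(rels, directions, num_partitions):
    """Bucket (src, dst) pairs once per stored direction.

    'fwd' partitions by src and 'bwd' by dst, so passing only ["fwd"]
    skips the second partitioning pass entirely.
    """
    key_index = {"fwd": 0, "bwd": 1}
    partitions = {}
    for direction in directions:
        idx = key_index[direction]
        buckets = [[] for _ in range(num_partitions)]
        for rel in rels:
            buckets[rel[idx] % num_partitions].append(rel)
        partitions[direction] = buckets
    return partitions
```

With `directions=["fwd"]` the result has no `"bwd"` entry at all, which is the same shape of change as making bwdIndex and bwdRelTableData optional.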

Planner changes

  1. When planning a COPY REL statement, we need to decide whether there will be two RelBatchInsert pipelines or just one.
  2. The planner should be aware that, for a table with only forward storage, scans are limited to fwd-directed Extend.
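
Both decisions can be sketched as functions of the set of stored directions. The helper names below are hypothetical and the logic is a simplification of what a real planner would do (no cost model, for instance).

```python
def num_rel_batch_insert_pipelines(stored_directions):
    """One RelBatchInsert pipeline per direction that is actually stored."""
    return len(set(stored_directions))


def choose_extend_direction(stored_directions, preferred):
    """Return the first preferred Extend direction that is stored.

    `preferred` lists directions in the order the planner would like
    to use them; directions without storage are skipped.
    """
    for direction in preferred:
        if direction in stored_directions:
            return direction
    raise ValueError(
        f"no stored direction among {list(preferred)}; "
        f"table stores only {list(stored_directions)}"
    )
```

For a forward-only table, a plan that would prefer a backward Extend falls back to the forward one if the pattern allows it, and planning fails with an error otherwise.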

TODOs

  • Apply grammar changes and catalog entry change.
  • Support rel batch insert for fwd directed storage.
  • Support insert/update/delete for fwd directed storage.
  • Support planner to correctly plan for scans (i.e. Extend) over fwd directed storage.
  • Support forward directed storage for rel table groups.
  • GDS algorithms
    • Check which algorithms can be modified to work with single-direction rel storage
    • For algorithms that require both directions, we should throw if they are run on single-direction rel storage
  • Correctly handle features that currently require both rel directions to work properly:
    • Rel multiplicity constraints
    • Detach delete
    • Multi-label patterns (if matched tables have different storage directions we should be able to handle that in the planner)

Note: for the current version of full text search to work, we only need to get the first two TODO items done.
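
As an illustration of why detach delete is on the list above: without backward lists, the incoming rels of a node can only be found by scanning every forward list. The sketch below uses plain dicts as a stand-in for rel storage. `detach_delete` is a hypothetical helper, and the full-scan fallback is just one possible approach, not something this issue commits to.

```python
def detach_delete(fwd, bwd, node):
    """Remove `node` and every rel touching it.

    fwd: dict mapping src -> list of dst
    bwd: dict mapping dst -> list of src, or None for forward-only storage
    """
    outgoing = fwd.pop(node, [])
    if bwd is not None:
        # Both directions stored: each side tells us where the copies live.
        for dst in outgoing:
            bwd[dst].remove(node)
        for src in bwd.pop(node, []):
            fwd[src].remove(node)
    else:
        # Forward-only: incoming rels require a scan over all forward lists.
        for src, dsts in fwd.items():
            fwd[src] = [dst for dst in dsts if dst != node]
```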

@ray6080 ray6080 added the feature New features or missing components of existing features label Sep 30, 2024
@ray6080 ray6080 self-assigned this Sep 30, 2024
@semihsalihoglu-uw
Contributor

Another use case for this is storing undirected relationships, if we start supporting them natively. I think sooner or later we need this feature, because the "directed only" nature of edges in property graphs forces applications to resort to tricks that impose an artificial direction on the edges.


3 participants