Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify 'sameClass' boolean during _CopyFrom to check that we aren't dealing with two different databases #670

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

nick4598
Copy link
Contributor

@nick4598 nick4598 commented Mar 13, 2024

Make the 'sameClass' boolean in DgnElement::_CopyFrom more restrictive. Previously there was a chance that across two databases we were cloning an element who had the same elementClassId in both the source and target which would consider them the sameClass.

The source schema I added had a Quantity type of int, the target schema I added had a Quantity type of string. Cloning the element when sameClass was true meant that the targetElementProps had a Quantity type of int. When sameClass is false, the targetElementProps had a Quantity type of string as it should.

I think its also worth mentioning that during transformations there are often times where the target starts off as a copy of the source, which would make it safe to do the original sameClass check of just checking their ElementClassIds until new schemas got imported. So we were likely using this across dbs without issue, but that doesn't mean an issue won't ever arise.

@@ -1646,7 +1646,8 @@ void DgnElement::_CopyFrom(DgnElementCR other, CopyFromOptions const& opts)
const auto ecOtherInstanceIsValid = ecOther->IsValid();
if (ecOtherInstanceIsValid)
{
bool sameClass = (GetElementClassId() == other.GetElementClassId());
// bool test = (&other.GetDgnDb() == &GetDgnDb()) && (GetElementClassId() == other.GetElementClassId());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if reviewer has a preference or opinion, but is there a better way to do this? checking by fileName vs the bool test I've commented out?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClassIds can be different for same class across different imodels

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File names are not straightforward when you open a cloudsqlite file. We used to have DbFIleGuid even if that is not set sometime. We do not store imodel id in file either.

Another way to see if two files have the same map is the following.

iModelConsole> PRAGMA checksum(ecdb_map);

sha3_256
--------------------
21dc87c6555dc4a24f52c630960410e139407cada11eb460f0acc98b2b020b5e

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or may be use PRAGMA checksum(ecdb_schema);

Copy link
Contributor Author

@nick4598 nick4598 Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are you saying that sameClass should be determined off of the equality of the checksums of the two schemas containing those classes?

Couldn't two different schemas have a class which had all the same properties? I.e. their checksum might be different because schema A has some other class defined in it, but the class they do have in common is identical? Or is that too granular and we should consider per schema only?

EDIT: Bill also suggested that sameClass could be a comparison of schema name, schema class and schema version.
I do worry about the performance of some of these checks since I believe this will take place for every element we are processing changes on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should go for the dumbest check. If the files are not the same then always assume it is not the same class.

If we're going for the dumbest check then what if we checked if the classes were the same memory reference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not only can this cause the program to incorrectly assume that a class is the same and do horrible things, but afaict this only ever works on classes that are coincidentally the same, likely mostly classes in a common seed or exact schema versions imported in the same order coincidentally.

I would say it's worth measuring the performance impact (that's what the transformer performance regression pipeline is for) of never using the same-class optimization, because I think not doing it when the dbs are different is equivalent to never. (transformation is rarely over the same file, barring e.g. template placement)

I think the best thing to do would be Bill's idea, which is what the transformer does it in other places iirc. We should ahead of time calculate the mapping of classes between the iModels by id, if we don't already, and use that to check for class id equality. There should be many less classes than elements in an iModel, although I know there are pathological cases involving dynamic schemas. I think for dynamic schemas we can always assume classes are not the same without much issue which takes care of that.

Copy link
Contributor Author

@nick4598 nick4598 Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you link to snippets where the transformer does the schema comparison in other places?

@pmconne
Copy link
Member

pmconne commented Mar 13, 2024

Please provide a description for your PR.

@nick4598 nick4598 changed the title Modify 'sameClass' boolean during _CloneFrom to be more restrictive Modify 'sameClass' boolean during _CloneFrom to check that we aren't dealing with two different databases Mar 13, 2024
@nick4598 nick4598 changed the title Modify 'sameClass' boolean during _CloneFrom to check that we aren't dealing with two different databases Modify 'sameClass' boolean during _CopyFrom to check that we aren't dealing with two different databases Mar 13, 2024
@nick4598
Copy link
Contributor Author

Please provide a description for your PR.

done

@nick4598 nick4598 requested a review from ColinKerr March 13, 2024 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants