Modify 'sameClass' boolean during _CopyFrom to check that we aren't dealing with two different databases #670
base: main
@@ -1646,7 +1646,8 @@ void DgnElement::_CopyFrom(DgnElementCR other, CopyFromOptions const& opts)
    const auto ecOtherInstanceIsValid = ecOther->IsValid();
    if (ecOtherInstanceIsValid)
        {
        bool sameClass = (GetElementClassId() == other.GetElementClassId());
        // bool test = (&other.GetDgnDb() == &GetDgnDb()) && (GetElementClassId() == other.GetElementClassId());
Not sure if the reviewer has a preference or opinion, but is there a better way to do this? For example, checking by file name versus the bool test I've commented out?
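For readers following along, the commented-out check can be sketched in isolation as below. This is only an illustration under stated assumptions: `FakeDb`, `FakeElement`, and `IsSameClassInSameDb` are hypothetical stand-ins, not the real `DgnDb` / `DgnElement` types, and "same database" here means "same in-memory database object", which is what the address comparison in the commented-out line tests.

```cpp
#include <cstdint>

// Hypothetical stand-in for a database handle; NOT the real DgnDb.
struct FakeDb {
    int unused = 0;
};

// Hypothetical stand-in for an element; NOT the real DgnElement.
struct FakeElement {
    FakeDb const* db;       // database object the element belongs to
    std::uint64_t classId;  // class id, only meaningful within `db`
};

// Class ids are compared only when both elements belong to the very same
// database object; across different databases the answer is always false.
inline bool IsSameClassInSameDb(FakeElement const& a, FakeElement const& b) {
    return a.db == b.db && a.classId == b.classId;
}
```

Note that an address comparison treats two separately opened handles to the same file as different databases, so it errs on the side of skipping the same-class path.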
ClassIds can be different for the same class across different iModels.
File names are not straightforward when you open a cloudsqlite file. We used to have DbFIleGuid, but even that is sometimes not set. We do not store the iModel id in the file either.
Another way to see if two files have the same map is the following.
iModelConsole> PRAGMA checksum(ecdb_map);
sha3_256
--------------------
21dc87c6555dc4a24f52c630960410e139407cada11eb460f0acc98b2b020b5e
Or maybe use PRAGMA checksum(ecdb_schema);
So are you saying that sameClass should be determined off of the equality of the checksums of the two schemas containing those classes?
Couldn't two different schemas have a class which had all the same properties? I.e. their checksum might be different because schema A has some other class defined in it, but the class they do have in common is identical? Or is that too granular and we should consider per schema only?
EDIT: Bill also suggested that sameClass could be a comparison of schema name, schema class and schema version.
I do worry about the performance of some of these checks since I believe this will take place for every element we are processing changes on.
I think we should go for the dumbest check. If the files are not the same then always assume it is not the same class.
If we're going for the dumbest check then what if we checked if the classes were the same memory reference?
Not only can this cause the program to incorrectly assume that a class is the same and do horrible things, but afaict this only ever works on classes that are coincidentally the same, most likely classes from a common seed or exact schema versions that happened to be imported in the same order.
I would say it's worth measuring the performance impact (that's what the transformer performance regression pipeline is for) of never using the same-class optimization, because I think not doing it when the dbs are different is equivalent to never doing it (transformation is rarely over the same file, barring e.g. template placement).
I think the best thing to do would be Bill's idea, which is what the transformer does in other places, iirc. We should calculate the mapping of classes between the iModels by id ahead of time, if we don't already, and use that to check for class id equality. There should be far fewer classes than elements in an iModel, although I know there are pathological cases involving dynamic schemas. For dynamic schemas, I think we can always assume classes are not the same without much issue, which takes care of that.
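A rough sketch of the precomputed mapping described above, to make the idea concrete. Everything here is an assumption for illustration: `ClassInfo`, `BuildClassIdMap`, and `IsMappedSameClass` are hypothetical names, the key is (schema name, class name, schema version) as suggested earlier in the thread, and this is not how the transformer actually implements it.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <unordered_map>
#include <vector>

// Hypothetical description of one class in an iModel; NOT a real API type.
struct ClassInfo {
    std::uint64_t id;
    std::string schemaName;
    std::string className;
    std::string schemaVersion;
    bool isDynamicSchema;
};

using ClassKey = std::tuple<std::string, std::string, std::string>;
using ClassIdMap = std::unordered_map<std::uint64_t, std::uint64_t>;

inline ClassKey KeyOf(ClassInfo const& c) {
    return ClassKey{c.schemaName, c.className, c.schemaVersion};
}

// Built once per source/target pair: maps a source class id to the target
// class id with the same schema name, class name, and schema version.
// Classes from dynamic schemas are never mapped, so they never count as same.
inline ClassIdMap BuildClassIdMap(std::vector<ClassInfo> const& source,
                                  std::vector<ClassInfo> const& target) {
    std::map<ClassKey, std::uint64_t> targetByKey;
    for (auto const& c : target)
        if (!c.isDynamicSchema)
            targetByKey[KeyOf(c)] = c.id;

    ClassIdMap result;
    for (auto const& c : source) {
        if (c.isDynamicSchema)
            continue;
        auto it = targetByKey.find(KeyOf(c));
        if (it != targetByKey.end())
            result[c.id] = it->second;
    }
    return result;
}

// Per-element check: a single hash lookup instead of a schema comparison.
inline bool IsMappedSameClass(ClassIdMap const& map, std::uint64_t sourceId,
                              std::uint64_t targetId) {
    auto it = map.find(sourceId);
    return it != map.end() && it->second == targetId;
}
```

The cost of building the map scales with the number of classes, while the per-element check is one lookup, which speaks to the performance concern raised earlier.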
Could you link to snippets where the transformer does the schema comparison in other places?
Please provide a description for your PR.
done
Make the 'sameClass' boolean in DgnElement::_CopyFrom more restrictive. Previously, when cloning an element across two databases, there was a chance that the source and target had the same elementClassId, in which case they would be considered the same class (sameClass) even though the ids referred to different classes.
The source schema I added had a Quantity type of int, the target schema I added had a Quantity type of string. Cloning the element when sameClass was true meant that the targetElementProps had a Quantity type of int. When sameClass is false, the targetElementProps had a Quantity type of string as it should.
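The scenario above can be reduced to a toy model. The sketch below is not how `_CopyFrom` works internally; `ClassTable` and `QuantityTypeAfterClone` are hypothetical names, and each "database" is modeled as nothing more than a map from class id to the type name of its Quantity property.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy model (an assumption for illustration): each database maps a class id
// to the type name of that class's "Quantity" property.
using ClassTable = std::map<std::uint64_t, std::string>;

// Returns the Quantity type the cloned element ends up with.
// When sameClass is true, the source data is taken over as-is, so the
// source's type wins; otherwise the target's class definition is used.
inline std::string QuantityTypeAfterClone(ClassTable const& source,
                                          ClassTable const& target,
                                          std::uint64_t sourceClassId,
                                          std::uint64_t targetClassId,
                                          bool sameClass) {
    return sameClass ? source.at(sourceClassId) : target.at(targetClassId);
}
```

With a source table `{0x100: "int"}` and a target table `{0x100: "string"}`, a check based on id equality alone sets sameClass to true and yields "int", while treating the classes as different yields "string".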
I think it's also worth mentioning that during transformations the target often starts off as a copy of the source, which would make the original sameClass check (comparing only their ElementClassIds) safe until new schemas got imported. So we were likely using this across dbs without issue, but that doesn't mean an issue won't ever arise.