-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
During incremental copy, the last row is always updated #104
Comments
I have the same doubt. |
I agree, it should be ">". I've tested this change and am no longer getting the duplicates. Hoping we can get this change made to master. |
- [Issue 104](singer-io#104) - Incremental sync needs to look for records '>', not '>=' state
In fact, it's normal to have ">=" to avoid loss of data. |
What's the best way to avoid duplicated rows then? I haven't run into this issue with any of the other taps I've worked with. |
The target must manage that and upsert or merge the New data. |
@NicolasRisi Any recommends when original table doesn't have the PK but target has PK? so WHERE {} >= '{}'::{} this makes duplicated key for the last row as @timmysuh mentioned when rerun the query in this case. |
I cannot think of a reason why we have ">=" instead of just ">" for INCREMENTAL replication logic?
It seems that we always end up updating the last row. Why is this a desired behavior?
if replication_key_value:
select_sql = """SELECT {}
FROM {}
WHERE {} >= '{}'::{}
ORDER BY {} ASC""".format(','.join(escaped_columns),
post_db.fully_qualified_table_name(schema_name, stream['table_name']),
post_db.prepare_columns_sql(replication_key), replication_key_value, replication_key_sql_datatype,
post_db.prepare_columns_sql(replication_key))
The text was updated successfully, but these errors were encountered: