Debezium connector failing randomly with error "history topic or its content is fully or partially missing" #5

Open
Nicodox opened this issue Jun 14, 2022 · 6 comments

@Nicodox

Nicodox commented Jun 14, 2022

Hi all,

We've deployed Debezium on an Azure AKS cluster and connected it to Azure Event Hubs to capture data changes from Azure SQL Server (PaaS). Everything works fine, but we randomly get errors and then have to create a new connector. More details follow.

We deployed Debezium for CDC on the AKS cluster following this guide.

The Debezium deployment captures CDC events from SQL Server PaaS on Azure and publishes them as events to the event hub.

The Debezium connector works as expected, publishing CDC events to the event hub, where they are consumed by other tools. However, after some time Debezium returns the following error:

ERROR || WorkerSourceTask{id=wwi4-0} Task threw an uncaught and unrecoverable exception. Task is being killed
and will not recover until manually restarted [org.apache.kafka.connect.runtime.WorkerTask]
io.debezium.DebeziumException: The db history topic or its content is fully or partially missing. Please check database history topic
configuration and re-execute the snapshot.

Once this error occurs, CDC stops working. The only workaround we have found is to create a new Debezium connector, but the new connector ends up with the same error after some time. The interval between creating the connector and hitting the error can be anywhere from a few hours to a few days.

This is the Connector:

{
"snapshot.mode": "schema_only",
"connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
"database.hostname": "XXXXXXXXX",
"database.port": "1433",
"database.user": "XXXXX",
"database.password": "XXXXXX",
"database.dbname": "XXXXX",
"database.server.name": "SQLAzure",
"tasks.max": "1",
"decimal.handling.mode": "string",
"table.include.list": "XXXXXX,YYYYY,ZZZZZ",
"transforms": "Reroute",
"transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
"transforms.Reroute.topic.regex": "(.*)",
"transforms.Reroute.topic.replacement": "wwi",
"tombstones.on.delete": false,
"database.history": "io.debezium.relational.history.MemoryDatabaseHistory"
}

I think the problem is related to the parameter and value: "database.history": "io.debezium.relational.history.MemoryDatabaseHistory"

From the Debezium documentation (https://debezium.io/documentation/reference/stable/operations/debezium-server.html#debezium-source-database-history-class)
we read that: io.debezium.relational.history.MemoryDatabaseHistory is a "volatile store for test environments".

So my questions are:

  1. Is there a way to avoid the "history topic fully or partially missing" error while keeping the value io.debezium.relational.history.MemoryDatabaseHistory, and if so, how?

  2. Can we go to production with the value io.debezium.relational.history.MemoryDatabaseHistory, even though the Debezium documentation describes it as a "volatile store for test environments"?

Can you please help us solve this issue?

Thank you,

kind regards from Italy,

Nicolò

@thepaulmacca

@Nicodox I would love to see how you've configured this on AKS. Do you have a blog post or anything that you could share please?

@yorek
Contributor

yorek commented Jul 11, 2022

Hi @Nicodox, and sorry for not answering sooner; the issue went completely under the radar. Thanks @thepaulmacca for bringing this up again :) My suggestion would be to use FileDatabaseHistory or RedisDatabaseHistory, as mentioned in the documentation you linked. That applies if Event Hubs still doesn't support automatic topic creation for the database history (it's been a while since I tested it, so you may want to check whether the issue has since been resolved: Azure/azure-event-hubs-for-kafka#61). Otherwise the default KafkaDatabaseHistory should be the preferred option, AFAIK.
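
To make the file-based option concrete, here is a minimal sketch of the two entries that would replace the MemoryDatabaseHistory line in the connector configuration posted above. It uses the Debezium 1.x property names that also appear later in this thread; the path is a placeholder and an assumption, not something taken from the original setup.

```json
{
  "database.history": "io.debezium.relational.history.FileDatabaseHistory",
  "database.history.file.filename": "<path-on-persistent-volume>/history.dat"
}
```

On AKS the file would presumably need to live on a volume that survives pod restarts. If it sits on the container's ephemeral filesystem, the history disappears with the pod, much as it does with the in-memory store.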

@thepaulmacca

@yorek do you know of an example on using this with AKS or anything?

Between the debezium docs and this repo, I'm struggling to get my head around an end-to-end configuration (I am quite new to debezium though!)

If you know of any useful blog posts or anything, that would be helpful

Thanks

@yorek
Contributor

yorek commented Jul 11, 2022

@thepaulmacca unfortunately not. I'm not an AKS expert, and I've only used Debezium with "plain" containers (Docker and Azure Container Instances). Ping me on Twitter (https://twitter.com/mauridb) so that I can bring Gunnar - the dev lead for Debezium - into the discussion. I'm sure he can help more than I can :)

@thepaulmacca

That would be great, thanks!

@poikjo

poikjo commented Sep 3, 2024

Anything new on this topic?

We had Debezium 1.8 running for almost a year and a half without issues, using history settings like:
"database.history": "io.debezium.relational.history.FileDatabaseHistory", "database.history.file.filename": "history.dat"
Now, after updating to 2.7 and switching to an Event Hubs schema history, we have started seeing the "The db history topic is missing" exception, and the only solution is apparently to recreate the connection.

Should a schema history stored in Event Hubs work fine? What about the cleanup policy? Is there a mismatch in the documentation, given that the SQL Server connector configuration still mentions using MemoryDatabaseHistory?
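
For comparison, this is a sketch of how the file-based history settings quoted above would probably be spelled under Debezium 2.x. It assumes the 2.x renaming of the database.history.* properties to schema.history.internal.* and the file store's move into the storage module; the class and property names should be checked against the documentation for the exact version in use.

```json
{
  "schema.history.internal": "io.debezium.storage.file.history.FileSchemaHistory",
  "schema.history.internal.file.filename": "history.dat"
}
```

The filename is carried over from the 1.8 configuration above. Whether a file-based history suits a given 2.7 deployment is a separate question that this sketch does not answer.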
