From c966e75a38a9b1a23eda98d02a771c8b08b3ba24 Mon Sep 17 00:00:00 2001
From: Andreas Motl <andreas.motl@crate.io>
Date: Wed, 18 Dec 2024 16:06:19 +0100
Subject: [PATCH] Documentation: Add `backlog.md`

---
 docs/backlog.md | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)
 create mode 100644 docs/backlog.md

diff --git a/docs/backlog.md b/docs/backlog.md
new file mode 100644
index 0000000..14f2d61
--- /dev/null
+++ b/docs/backlog.md
@@ -0,0 +1,78 @@
+# Backlog
+
+## Iteration +1
+- Release 0.0.0
+- Revise "Project links" (see PyPI)
+- Mitigate `CRATEDB_API_KEY`
+- Provide async variants for all methods
+- Release 0.0.1
+- Add CrateDB adapter to LangChain's "Provider" list 
+- Documentation: Update all Notebooks to provide concise usage information
+- Release 0.1.0
+
+## Iteration +2
+- Documentation: Improve `as_retriever` feature:
+  - https://python.langchain.com/docs/tutorials/retrievers/#retrievers
+  - https://python.langchain.com/api_reference/mongodb/vectorstores/langchain_mongodb.vectorstores.MongoDBAtlasVectorSearch.html
+
+## Iteration +3
+- Document retrieval demo (PDF)
+  - Import tests from `langchain-mongodb` about PDFs
+  - https://github.com/langchain-ai/langchain-mongodb/blob/main/libs/mongodb/tests/integration_tests/conftest.py
+  - https://github.com/microsoft/markitdown (https://api.python.langchain.com/en/latest/_modules/langchain_community/document_loaders/markdown.html)
+  - https://github.com/getomni-ai/zerox (https://python.langchain.com/docs/integrations/document_loaders/zeroxpdfloader/#loader-features)
+  - https://github.com/freedmand/textra
+  - https://github.com/gsidhu/winocr_cli
+- Unlock other subsystems: Cache, Pipeline, Docstores, Retrievers, Store, ByteStore
+  - https://python.langchain.com/api_reference/astradb/index.html
+  - https://python.langchain.com/api_reference/elasticsearch/index.html
+  - https://github.com/langchain-ai/langchain-mongodb/tree/main/libs/mongodb/langchain_mongodb
+  - https://python.langchain.com/api_reference/couchbase/cache/langchain_couchbase.cache.CouchbaseSemanticCache.html
+  - https://python.langchain.com/api_reference/mongodb/cache/langchain_mongodb.cache.MongoDBAtlasSemanticCache.html
+  - https://python.langchain.com/api_reference/community/storage/langchain_community.storage.sql.SQLStore.html
+  - https://python.langchain.com/api_reference/community/storage/langchain_community.storage.mongodb.MongoDBStore.html
+  - https://python.langchain.com/api_reference/community/storage/langchain_community.storage.mongodb.MongoDBByteStore.html
+- SQL-based Docstores
+  - https://github.com/langchain-ai/langchain/discussions/12085
+  - https://stackoverflow.com/questions/77438251/langchain-parentdocumetretriever-save-and-load
+  - SQLStrStore and SQLDocStore persist `str` and `Document` objects, but they can be extended easily.
+    Thanks, @gcheron.
+    - https://github.com/langchain-ai/langchain/pull/15909
+    - https://github.com/gcheron/langchain/blob/sql-store/docs/docs/integrations/stores/sql.ipynb
+  - https://medium.com/@guilhem.cheron35/sql-storage-langchain-rags-inmemorystore-alternative-ex-with-parentdocumentretriever-pgvector-5cc162950d77
+- Get into / do more with Retrievers
+  - https://python.langchain.com/docs/concepts/retrieval/
+  - https://api.python.langchain.com/en/latest/community/retrievers.html
+  - https://api.python.langchain.com/en/latest/community/retrievers/langchain_community.retrievers.bm25.BM25Retriever.html#langchain_community.retrievers.bm25.BM25Retriever
+  - https://api.python.langchain.com/en/latest/community/retrievers/langchain_community.retrievers.elastic_search_bm25.ElasticSearchBM25Retriever.html#langchain_community.retrievers.elastic_search_bm25.ElasticSearchBM25Retriever
+  - https://api.python.langchain.com/en/latest/community/retrievers/langchain_community.retrievers.qdrant_sparse_vector_retriever.QdrantSparseVectorRetriever.html
+  - https://api.python.langchain.com/en/latest/community/retrievers/langchain_community.retrievers.weaviate_hybrid_search.WeaviateHybridSearchRetriever.html
+  - https://python.langchain.com/docs/how_to/time_weighted_vectorstore/
+- Translators
+  - https://python.langchain.com/docs/how_to/query_constructing_filters/
+- Misc
+  - https://python.langchain.com/api_reference/experimental/sql/langchain_experimental.sql.base.SQLDatabaseChain.html#langchain_experimental.sql.base.SQLDatabaseChain
+  - https://python.langchain.com/api_reference/experimental/tabular_synthetic_data/langchain_experimental.tabular_synthetic_data.base.SyntheticDataGenerator.html#langchain_experimental.tabular_synthetic_data.base.SyntheticDataGenerator
+  - https://python.langchain.com/api_reference/experimental/sql/langchain_experimental.sql.base.SQLDatabaseChain.html#langchain_experimental.sql.base.SQLDatabaseChain
+  - https://python.langchain.com/api_reference/experimental/retrievers/langchain_experimental.retrievers.vector_sql_database.VectorSQLDatabaseChainRetriever.html#langchain_experimental.retrievers.vector_sql_database.VectorSQLDatabaseChainRetriever
+- Product
+  - https://aws.amazon.com/kendra/
+  - https://aws.amazon.com/bedrock/knowledge-bases/
+  - https://cloud.google.com/document-ai
+  - https://cloud.google.com/generative-ai-app-builder/docs/try-enterprise-search
+  - https://cloud.google.com/generative-ai-app-builder/docs/try-media-search
+  - https://python.langchain.com/docs/tutorials/retrievers/
+  - https://api.python.langchain.com/en/latest/community/retrievers/langchain_community.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html#langchain_community.retrievers.bedrock.AmazonKnowledgeBasesRetriever
+  - https://api.python.langchain.com/en/latest/community/retrievers/langchain_community.retrievers.google_vertex_ai_search.GoogleVertexAISearchRetriever.html
+  - https://api.python.langchain.com/en/latest/community/retrievers/langchain_community.retrievers.google_cloud_documentai_warehouse.GoogleDocumentAIWarehouseRetriever.html
+
+## Done
+- Make it work
+- Remove JSONB, only PostgreSQL has it
+- Update README: https://github.com/langchain-ai/langchain-postgres
+- Prune subsystems that are not applicable
+- Rename `chat_message_histories` to just `chat_history`, like ES is doing it
+- Minimally update documentation for uploading to PyPI v0.0.0
+- Add code coverage reporting
+- Dependencies: Use version ranges, focused around upper bounds
+- Add PyPI project metadata