Microsoft.Extensions.VectorData currently has the GetRecordOptions, DeleteRecordOptions and UpsertRecordOptions types, which are accepted by e.g. GetAsync, DeleteAsync, UpsertAsync (and their batching counterparts). DeleteRecordOptions and UpsertRecordOptions are empty, and GetRecordOptions contains only IncludeVectors.
AFAICT, these can serve two possible purposes:
1. To allow providers to subclass these (e.g. MyDbDeleteRecordOptions), and add their own provider-specific options.
2. To allow adding new universal options directly on these objects, which apply to all/most vector databases (like GetRecordOptions.IncludeVectors today).
Re the 1st purpose (provider-specific options), consuming code that wishes to use a provider-specific option will have to interact with provider-specific code in any case (i.e. with the subclass of *RecordOptions); therefore, it seems better to accept provider-specific options on a provider-specific overload exposed by the implementation, rather than having it on the abstraction. In addition, in many cases a provider-specific options object will already be exposed via the low-level SDK of the vector database; a provider-specific overload can simply accept that low-level options type directly, but the current abstraction method requires the options type to extend the *RecordOptions base types, which the low-level options type will never do. So to summarize, it seems better to leave these provider-specific extension points out of the abstraction, and simply have implementations add whatever overloads make sense for them.
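To make the provider-specific overload idea concrete, here is a minimal sketch. Everything in it is hypothetical and for illustration only: `IRecordCollection<TKey>` is a simplified stand-in for the abstraction (not the actual Microsoft.Extensions.VectorData interface), `MyDbSdkDeleteOptions` stands in for an options type that a database's low-level SDK might expose, and `WaitForCommit` is an invented option.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var collection = new MyDbCollection();
collection.Store["a"] = "first";
collection.Store["b"] = "second";

// Provider-agnostic code only sees the abstraction, which has no options parameter.
IRecordCollection<string> abstraction = collection;
await abstraction.DeleteAsync("a");

// Code that wants a provider-specific option already depends on the provider,
// so it can call the provider's own overload directly.
await collection.DeleteAsync("b", new MyDbSdkDeleteOptions { WaitForCommit = true });

Console.WriteLine($"Remaining records: {collection.Store.Count}");

// Hypothetical, simplified abstraction: no options type at all.
public interface IRecordCollection<TKey>
{
    Task DeleteAsync(TKey key, CancellationToken cancellationToken = default);
}

// Hypothetical options type owned by the database's low-level SDK.
// It does not (and could not) derive from an abstraction-level options class.
public sealed class MyDbSdkDeleteOptions
{
    public bool WaitForCommit { get; init; }
}

// Hypothetical provider backed by an in-memory dictionary.
public sealed class MyDbCollection : IRecordCollection<string>
{
    public Dictionary<string, string> Store { get; } = new();

    public bool LastDeleteWaitedForCommit { get; private set; }

    // Abstraction method: delegates to the overload below with default SDK options.
    public Task DeleteAsync(string key, CancellationToken cancellationToken = default)
        => DeleteAsync(key, new MyDbSdkDeleteOptions(), cancellationToken);

    // Provider-specific overload that accepts the SDK's own options type directly.
    public Task DeleteAsync(string key, MyDbSdkDeleteOptions options, CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();
        LastDeleteWaitedForCommit = options.WaitForCommit;
        Store.Remove(key);
        return Task.CompletedTask;
    }
}
```

The point of the sketch is that the abstraction stays free of options types, while the provider can accept whatever type its SDK already defines.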
Re the 2nd purpose (universal options), while it's theoretically possible that we'd want to add new ones, we already have a very rich sample of vector databases, and it seems quite unlikely that we're missing such an option or that a new one would appear. So I'd propose we simplify the API by removing these extension points (noting that the Delete and Upsert options are currently empty), at least for now. Note that we can always re-introduce these in the future via a new, separate overload that accepts the options (this would throw by default; this isn't any worse than today's story, where if we introduce a new property into e.g. DeleteRecordOptions, it wouldn't be respected by all existing providers anyway). The only potential negative point there would be IncludeVectors, which would be an additional parameter on GetAsync (instead of on GetRecordOptions); but even for that case, adding a GetAsync overload that accepts GetRecordOptions alongside it doesn't seem like it would be problematic.
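As a rough sketch of the "re-introduce later via a new overload that throws by default" idea: one way this could look is a default interface method, so that existing providers keep compiling and simply report the options overload as unsupported. The names here (`IRecordCollection`, `DeleteOptions`, `SoftDelete`, `LegacyCollection`) are hypothetical and are not part of Microsoft.Extensions.VectorData; an abstract base class with a virtual method would be an alternative to the default interface method assumed here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

IRecordCollection legacy = new LegacyCollection();

// The original overload keeps working for every provider.
await legacy.DeleteAsync("a");

bool threw = false;
try
{
    // The later-added overload throws unless the provider overrides it.
    await legacy.DeleteAsync("b", new DeleteOptions { SoftDelete = true });
}
catch (NotSupportedException ex)
{
    threw = true;
    Console.WriteLine($"Not supported: {ex.Message}");
}

// Hypothetical options type, introduced only once a universal option is needed.
public sealed class DeleteOptions
{
    public bool SoftDelete { get; init; }
}

// Hypothetical, simplified abstraction.
public interface IRecordCollection
{
    // The overload that shipped originally, without options.
    Task DeleteAsync(string key, CancellationToken cancellationToken = default);

    // Added in a later version. The default implementation throws, so
    // existing providers are not broken but do not silently ignore the options.
    Task DeleteAsync(string key, DeleteOptions options, CancellationToken cancellationToken = default)
        => throw new NotSupportedException("This provider does not support delete options.");
}

// A provider written before the options overload existed.
public sealed class LegacyCollection : IRecordCollection
{
    private readonly HashSet<string> _keys = new() { "a", "b" };

    public int Count => _keys.Count;

    public Task DeleteAsync(string key, CancellationToken cancellationToken = default)
    {
        _keys.Remove(key);
        return Task.CompletedTask;
    }
}
```

Compared with adding a property to an existing options class, a throwing default makes the lack of support visible at the call site instead of letting the new option be silently ignored.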
The reason the three options classes were added originally was to easily be able to extend GetAsync, DeleteAsync and UpsertAsync with additional optional options in future if needed, but as you point out, nothing has really shown up that would be required for Delete and Upsert.
Also note that we are following a pattern in the SDK where required parameters are on the signature but optional ones are delegated to an options class, with an exception only for CancellationToken.
I think for GetAsync, related to our recent discussion, adding IncludeData may also make sense, if we do something similar on vector & non-vector search, so I would not be that much in favour of removing GetRecordOptions.
Makes sense. So I'll submit a PR removing the options for Delete/Upsert (at least for now) and leave the Get options in place.
> Also note that we are following a pattern in the SDK where required parameters are on the signature but optional ones are delegated to an options class, with an exception only for CancellationToken.
OK. I'm not aware of a strict pattern like that, e.g. in .NET API design. Where we're reasonably sure there's only a small number of optional parameters, it seems OK to simply include them directly on the method rather than systematically adding options classes (which add to the concept count). But I agree it makes sense for the vector search API, and so, at the very least for consistency, also for Get... We can always discuss case-by-case as needed.
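For comparison, the two styles discussed here might look roughly as follows for Get. This is only an illustrative sketch: `DemoCollection` and `GetOptions` are hypothetical types, and the string return value is a placeholder for a real record.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var collection = new DemoCollection();

// Style A: the optional setting is a parameter directly on the method.
string viaParameter = await collection.GetAsync("k1", includeVectors: true);

// Style B: the optional setting is carried by an options class.
string viaOptions = await collection.GetAsync("k1", new GetOptions { IncludeVectors = true });

Console.WriteLine(viaParameter == viaOptions);

// Hypothetical options class; one more type for users to learn,
// but new optional settings can be added without new overloads.
public sealed class GetOptions
{
    public bool IncludeVectors { get; init; }
}

// Hypothetical collection exposing both styles side by side.
public sealed class DemoCollection
{
    public Task<string> GetAsync(string key, bool includeVectors = false, CancellationToken cancellationToken = default)
        => Task.FromResult(includeVectors ? $"{key} (with vectors)" : key);

    public Task<string> GetAsync(string key, GetOptions options, CancellationToken cancellationToken = default)
        => GetAsync(key, options.IncludeVectors, cancellationToken);
}
```

Both overloads can coexist without ambiguity because the second argument's type (`bool` versus `GetOptions`) selects the overload.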
/cc @westey-m