-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Pydantic integration #3086
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pydantic integration #3086
Conversation
1f3f66c
to
b9ada0f
Compare
fe23bb4
to
8902d66
Compare
🔍 Preview links for changed docs |
c15e7b5
to
4884de8
Compare
4884de8
to
b19fa58
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM. I only have comments on the example app. I only skimmed the frontend code.
doc = None | ||
try: | ||
doc = await Quote._doc.get(id) | ||
except NotFoundError: | ||
pass | ||
if not doc: | ||
raise HTTPException(status_code=404, detail="Item not found") | ||
return Quote.from_doc(doc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this code the same?
doc = None | |
try: | |
doc = await Quote._doc.get(id) | |
except NotFoundError: | |
pass | |
if not doc: | |
raise HTTPException(status_code=404, detail="Item not found") | |
return Quote.from_doc(doc) | |
try: | |
doc = await Quote._doc.get(id) | |
return Quote.from_doc(doc) | |
except NotFoundError: | |
raise HTTPException(status_code=404, detail="Item not found") |
This also applies to get_quote
and delete_quote
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is close, but not the same. My version doesn't raise any typing errors, yours does. The problem is that our AsyncDocument.get()
method can return None
if the document is not found and AsyncElasticsearch.get()
is not configured to raise the not found exception. That makes all uses of the get()
method awkward, because you always have to account for the possible None
.
async def create_quote(req: Quote) -> Quote: | ||
embed_quotes([req]) | ||
doc = req.to_doc() | ||
doc.meta.id = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for security. We do not want the client to decide what the id should be, so by blanking it before calling save()
we make sure if they provided one it is discarded.
Quotes database example, which demonstrates the Elasticsearch integration with | ||
Pydantic models. This example features a React frontend and a FastAPI back end. | ||
|
||
 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great quotes! Do you maybe have an example that does rely on embeddings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you mean here. This example uses embeddings, both on their own and combined with BM25 search.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, the results would have been the same with BM25 only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe there is something else than "dogs and books" that we can use. That's just a nitpick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, you are talking about the screenshot. In fact, "dogs and books" does not match the Groucho Marx quote at the top when using BM25, because that quote has "dog" and "book" in it in singular. I have to test this, but maybe using "canine" instead of dog makes the example more clear and returns the same results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "books and pets" search term returns a few good ones, including the one from Groucho Marx at the top of the list.
b19fa58
to
d2638e6
Compare
Co-authored-by: Quentin Pradet <[email protected]>
Co-authored-by: Quentin Pradet <[email protected]>
* Support `Annotated` typing hint * Add option to exclude DSL class field from mapping * Pydantic integration with the BaseESModel class * object and nested fields * complete CRUD example * Use a smaller dataset * documentation * Update examples/quotes/backend/pyproject.toml Co-authored-by: Quentin Pradet <[email protected]> * Update examples/quotes/backend/quotes.py Co-authored-by: Quentin Pradet <[email protected]> * Use a better screenshot --------- Co-authored-by: Quentin Pradet <[email protected]> (cherry picked from commit f842d9b)
This change adds a few features that support the use of Pydantic models with the DSL module, instead of the standard models defined as subclasses of the
AsyncDocument
class.As part of this work some additions have been made to the typing implementation of DSL documents.
Annotated
syntax when defining document fields in the DSL module. Examples:New
BaseESModel
andAsyncBaseESModel
classes that inherit from Pydantic'sBaseModel
and add Elasticsearch superpowers. In particular, any model defined with one of these as its base class will havemeta
and_doc
private attributes andto_doc()
andfrom_doc()
methods. Themeta
attribute includes metadata for each document, things such asid
orscore
. The_doc
attribute is a dynamically generatedDocument
orAsyncDocument
instance that can be used whenever access to the Elasticsearch index is needed. The methods convert between Pydantic models and ES documents.Aside from the extra attributes, this class works exactly like
BaseModel
and can be used to define data attributes and their validation rules, and the ES document is derived from them automatically. In particular, this class can be used in FastAPI routes, as shown in thequotes
example included in this PR. Any annotations intended for the DSL module can be included in theAnnotated[]
type hint of the respective fields. TheIndex
inner class can be included as well.