// tag::cohere[]

[source,console]
--------------------------------------------------
PUT cohere-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": { <1>
        "type": "dense_vector", <2>
        "dims": 1024, <3>
        "element_type": "byte"
      },
      "content": { <4>
        "type": "text" <5>
      }
    }
  }
}
--------------------------------------------------
<1> The name of the field to contain the generated embeddings. It must be
referenced in the {infer} pipeline configuration in the next step.
<2> The field to contain the embeddings is a `dense_vector` field.
<3> The output dimensions of the model. Find this value in the
https://docs.cohere.com/reference/embed[Cohere documentation] of the model you
use.
<4> The name of the field from which to create the dense vector representation.
In this example, the name of the field is `content`. It must be referenced in
the {infer} pipeline configuration in the next step.
<5> The field type, which is `text` in this example.

// end::cohere[]

// tag::hugging-face[]

[source,console]
--------------------------------------------------
PUT hugging-face-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": { <1>
        "type": "dense_vector", <2>
        "dims": 768, <3>
        "element_type": "float"
      },
      "content": { <4>
        "type": "text" <5>
      }
    }
  }
}
--------------------------------------------------
<1> The name of the field to contain the generated embeddings. It must be
referenced in the {infer} pipeline configuration in the next step.
<2> The field to contain the embeddings is a `dense_vector` field.
<3> The output dimensions of the model. Find this value in the
https://huggingface.co/sentence-transformers/all-mpnet-base-v2[HuggingFace model documentation].
<4> The name of the field from which to create the dense vector representation.
In this example, the name of the field is `content`. It must be referenced in
the {infer} pipeline configuration in the next step.
<5> The field type, which is `text` in this example.

// end::hugging-face[]

// tag::openai[]

[source,console]
--------------------------------------------------
PUT openai-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": { <1>
        "type": "dense_vector", <2>
        "dims": 1536, <3>
        "element_type": "float",
        "similarity": "dot_product" <4>
      },
      "content": { <5>
        "type": "text" <6>
      }
    }
  }
}
--------------------------------------------------
<1> The name of the field to contain the generated embeddings. It must be
referenced in the {infer} pipeline configuration in the next step.
<2> The field to contain the embeddings is a `dense_vector` field.
<3> The output dimensions of the model. Find this value in the
https://platform.openai.com/docs/guides/embeddings/embedding-models[OpenAI documentation]
of the model you use.
<4> The faster `dot_product` function can be used to calculate similarity
because OpenAI embeddings are normalised to unit length. See the
https://platform.openai.com/docs/guides/embeddings/which-distance-function-should-i-use[OpenAI docs]
for guidance on which similarity function to use.
<5> The name of the field from which to create the dense vector representation.
In this example, the name of the field is `content`. It must be referenced in
the {infer} pipeline configuration in the next step.
<6> The field type, which is `text` in this example.

// end::openai[]
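
The OpenAI mapping sets `"similarity": "dot_product"` on the grounds that the
embeddings are normalised to unit length. The reasoning is that, for unit-length
vectors, the dot product and the cosine similarity are the same number, so the
cheaper function can stand in for the other. The following is a minimal sketch
of that equivalence in plain Python. The helper names (`dot`, `normalize`,
`cosine`) and the three-dimensional vectors are made up for illustration; they
are not real model output and not part of any {es} or OpenAI API.

[source,python]
--------------------------------------------------
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity: dot product divided by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical toy vectors standing in for embeddings.
a = normalize([3.0, 4.0, 0.0])
b = normalize([1.0, 2.0, 2.0])

# Once both vectors have length 1, the division in cosine() is a no-op,
# so the two measures agree up to floating-point rounding.
assert math.isclose(dot(a, b), cosine(a, b))
--------------------------------------------------

If the vectors are not normalised, the two values generally differ, which is
why the choice of similarity function depends on the model you use.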