////
[source,console]
----
DELETE _ingest/pipeline/my-text-embeddings-pipeline
----
// TEST
// TEARDOWN
////
// tag::elser[]
Create an ingest pipeline that uses the ELSER model:

[source,console]
----
PUT _ingest/pipeline/my-text-embeddings-pipeline
{
  "description": "Text embedding pipeline",
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_2",
        "target_field": "my_embeddings",
        "field_map": { <1>
          "my_text_field": "text_field"
        },
        "inference_config": {
          "text_expansion": { <2>
            "results_field": "tokens"
          }
        }
      }
    }
  ]
}
----
<1> The `field_map` object maps the input document field name (which is
`my_text_field` in this example) to the name of the field that the model expects
(which is always `text_field`).
<2> The `text_expansion` inference type must be used in the inference ingest
processor.

To ingest data through the pipeline and generate tokens with ELSER, refer to the
<<reindexing-data-elser>> section of the tutorial. After you have successfully
ingested documents by using the pipeline, your index contains the tokens
generated by ELSER.
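Before reindexing, you can check the pipeline's behavior with the simulate
pipeline API, which runs documents through the pipeline without indexing them.
This request is a sketch; the sample text is illustrative, and it assumes the
ELSER model is already deployed:

[source,console]
----
POST _ingest/pipeline/my-text-embeddings-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "my_text_field": "How is the weather in Jamaica?"
      }
    }
  ]
}
----

The response contains the simulated documents with the generated tokens under
the `my_embeddings` target field.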
// end::elser[]
// tag::dense-vector[]
Create an ingest pipeline that uses a text embedding model:

[source,console]
----
PUT _ingest/pipeline/my-text-embeddings-pipeline
{
  "description": "Text embedding pipeline",
  "processors": [
    {
      "inference": {
        "model_id": "sentence-transformers__msmarco-minilm-l-12-v3", <1>
        "target_field": "my_embeddings",
        "field_map": { <2>
          "my_text_field": "text_field"
        }
      }
    }
  ]
}
----
<1> The model ID of the text embedding model you want to use.
<2> The `field_map` object maps the input document field name (which is
`my_text_field` in this example) to the name of the field that the model expects
(which is always `text_field`).

To ingest data through the pipeline and generate text embeddings with your
chosen model, refer to the
{ml-docs}/ml-nlp-text-emb-vector-search-example.html#ex-text-emb-ingest[Add the text embedding model to an inference ingest pipeline]
section. The example shows how to create the pipeline with the inference
processor and reindex your data through the pipeline. After you have
successfully ingested documents by using the pipeline, your index contains the
text embeddings generated by the model.
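Besides reindexing, individual documents can also be routed through the
pipeline with the `pipeline` query parameter of the index API. This is a
sketch: `my-index` and the sample text are placeholders, and the index is
assumed to have a mapping suitable for the embeddings written to
`my_embeddings`:

[source,console]
----
POST my-index/_doc?pipeline=my-text-embeddings-pipeline
{
  "my_text_field": "How is the weather in Jamaica?"
}
----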
// end::dense-vector[]