Skip to content
云栖回顾 | 2024 云栖大会微服务和网关相关演讲材料Know more

AI RAG

Function Description

Implement LLM-RAG by integrating with Alibaba Cloud Vector Search Service, as shown in the figure below:

Running Attributes

Plugin execution phase: Default Phase
Plugin execution priority: 400

Configuration Description

NameData TypeRequirementDefault ValueDescription
dashscope.apiKeystringRequired-Token used for authentication when accessing Tongyi Qianwen service.
dashscope.serviceFQDNstringRequired-Tongyi Qianwen service name
dashscope.servicePortintRequired-Tongyi Qianwen service port
dashscope.serviceHoststringRequired-Domain name for accessing Tongyi Qianwen service
dashvector.apiKeystringRequired-Token used for authentication when accessing Alibaba Cloud Vector Search Service.
dashvector.serviceFQDNstringRequired-Alibaba Cloud Vector Search service name
dashvector.servicePortintRequired-Alibaba Cloud Vector Search service port
dashvector.serviceHoststringRequired-Domain name for accessing Alibaba Cloud Vector Search service
dashvector.topkintRequired-Number of vectors to retrieve from Alibaba Cloud Vector Search
dashvector.thresholdfloatRequired-Vector distance threshold; documents above this threshold will be filtered out
dashvector.fieldstringRequired-Field name where documents are stored in Alibaba Cloud Vector Search

Once the plugin is enabled, while using the tracing feature, the document ID information retrieved by RAG will be added to the span’s attributes for troubleshooting purposes.

Example

dashscope:
apiKey: xxxxxxxxxxxxxxx
serviceFQDN: dashscope
servicePort: 443
serviceHost: dashscope.aliyuncs.com
dashvector:
apiKey: xxxxxxxxxxxxxxxxxxxx
serviceFQDN: dashvector
servicePort: 443
serviceHost: vrs-cn-xxxxxxxxxxxxxxx.dashvector.cn-hangzhou.aliyuncs.com
collection: xxxxxxxxxxxxxxx
topk: 1
threshold: 0.4
field: raw

The CEC-Corpus dataset contains 332 news reports on emergency events, along with annotation data. The original news text is extracted, vectorized, and then added to Alibaba Cloud Vector Search Service. For text vectorization tutorials, you can refer to “Implementing Semantic Search Based on Vector Search Service and Lingji”.

Below is an example enhanced using RAG, with the original request being:

Where did the rear-end collision in Hainan occur? What was the cause? How many casualties were there?

The result returned by LLM without RAG plugin processing was:

I'm sorry, as an AI model, I cannot retrieve and update specific information on news events in real time, including details such as location, cause, and casualties. For such specific events, it is recommended that you consult the latest news reports or official announcements for accurate information. You can visit mainstream media websites, use news applications, or follow announcements from relevant government departments to get dynamic updates.

After processing with RAG plugin, the result returned by LLM was:

The rear-end collision in Hainan occurred on the Haiven Expressway, 37 kilometers from Wenchang to Haikou. Regarding the specific cause of the accident, traffic police were still conducting further investigations at the time, so the exact cause of the accident cannot be determined based on the provided information. The casualty situation is 1 death (the driver died on the spot) and 8 injuries (including 2 children and 6 adults). All injured persons were rescued and sent to the hospital for treatment.