Using the xprobe/xinference:latest image
1. Pull the image
Pull the latest stable Xinference image
docker pull xprobe/xinference:latest
Verify that the image was pulled (it succeeded if the output lists xprobe/xinference)
docker images | findstr xinference
2. Run the image (CMD or PowerShell)
Windows CMD command (copy and run as-is)
docker run -d ^
--name xinference-server ^
-p 9997:9997 ^
-v %USERPROFILE%\.xinference:/root/.xinference ^
xprobe/xinference:latest ^
xinference-local --host 0.0.0.0 --port 9997
In PowerShell, use the following instead (tested successfully on Windows 10)
docker run -d `
--name xinference-server `
-p 9997:9997 `
-v $env:USERPROFILE\.xinference:/root/.xinference `
xprobe/xinference:latest `
xinference-local --host 0.0.0.0 --port 9997
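Parameter notes:
- --name xinference-server: names the container so it is easier to manage later;
- -p 9997:9997: maps container port 9997 to the same port on the host, so the service is reachable from outside the container;
- -v %USERPROFILE%\.xinference:/root/.xinference: mounts a host directory as the model cache, so downloaded models survive container restarts (in PowerShell the host path is written $env:USERPROFILE\.xinference);
- --host 0.0.0.0: binds the Xinference service to all interfaces so it can be reached from outside the container.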
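As a rough illustration of how the flags above fit together, the following Python sketch assembles the same docker run command as an argument list. The function name build_docker_run_args and the sample home directory in the usage note are hypothetical, introduced only for illustration; the flags and values are the ones from the commands above.

```python
from pathlib import PureWindowsPath


def build_docker_run_args(home_dir, name="xinference-server", port=9997,
                          image="xprobe/xinference:latest"):
    """Return an argv list equivalent to the docker run command above.

    home_dir stands in for %USERPROFILE% / $env:USERPROFILE.
    """
    # Host-side cache directory, e.g. C:\Users\<you>\.xinference
    cache_dir = PureWindowsPath(home_dir) / ".xinference"
    return [
        "docker", "run", "-d",
        "--name", name,                          # container name
        "-p", f"{port}:{port}",                  # host:container port mapping
        "-v", f"{cache_dir}:/root/.xinference",  # persist the model cache
        image,
        "xinference-local", "--host", "0.0.0.0", "--port", str(port),
    ]
```

On a machine with Docker installed, the list returned by build_docker_run_args(r"C:\Users\demo") could be passed to subprocess.run(args, check=True); the path here is only a placeholder.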
3. Verify that the container and the service started
Open a shell inside the xinference-server container
docker exec -it xinference-server bash
Example: load the bge-reranker-v2-m3 rerank model inside the container
xinference launch --model-name bge-reranker-v2-m3 --model-type rerank --repository-id BAAI/bge-reranker-v2-m3
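One way to confirm from the host that the service is up and the model is loaded is to query the HTTP API through the mapped port. The sketch below assumes the server exposes an OpenAI-style GET /v1/models endpoint whose response carries a data list with one id per running model; the helper names are hypothetical, and the response shape should be checked against the Xinference version in use.

```python
import json
from urllib.request import urlopen


def running_model_ids(payload):
    """Extract model ids from a /v1/models style response dict (assumed shape)."""
    return [item["id"] for item in payload.get("data", [])]


def fetch_running_model_ids(base_url="http://localhost:9997"):
    """Query the server through the mapped port; the container must be running."""
    with urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        return running_model_ids(json.load(resp))
```

Calling fetch_running_model_ids() after the launch command finishes should return a list that includes the id of the rerank model, provided the assumptions above hold.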
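4. Cache a model in Xinference
Result after the model has been cached successfully


Running models

Cache settings for the model

Wait for caching to complete.
5. Run a model in Xinference

6. Configure the Xinference model in Dify

7. Use the reranker model in agent orchestration

After configuring the knowledge base, remember to publish.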
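To make the reranking step more concrete, here is a minimal sketch of what a rerank call against Xinference might look like outside Dify. It assumes a POST /v1/rerank endpoint that accepts model, query, documents and top_n, and returns a results list whose entries carry index and relevance_score; the function names are hypothetical, and the model value must be whatever id the running model is registered under.

```python
import json
from urllib.request import Request, urlopen


def build_rerank_request(base_url, model, query, documents, top_n=4):
    """Build the POST request for the (assumed) /v1/rerank endpoint."""
    body = {"model": model, "query": query, "documents": documents, "top_n": top_n}
    return Request(
        f"{base_url}/v1/rerank",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_documents(response, documents, score_threshold=None):
    """Pair each result with its document, best score first, optionally filtered."""
    ranked = sorted(response["results"],
                    key=lambda r: r["relevance_score"], reverse=True)
    return [
        (documents[r["index"]], r["relevance_score"])
        for r in ranked
        if score_threshold is None or r["relevance_score"] >= score_threshold
    ]


def rerank(base_url, model, query, documents, top_n=4):
    """Send the request and return (document, score) pairs; needs a live server."""
    req = build_rerank_request(base_url, model, query, documents, top_n)
    with urlopen(req, timeout=60) as resp:
        return top_documents(json.load(resp), documents)
```

The default top_n=4 mirrors the top_k: 4 setting of the knowledge retrieval node in the DSL, and score_threshold=None mirrors its score_threshold: null.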
Appendix: DSL content. Save it as a .yml file and it can be imported into Dify directly.
app:
  description: ''
  icon: 🤖
  icon_background: '#FFEAD5'
  mode: advanced-chat
  name: 运维规章制度
  use_icon_as_answer_icon: false
dependencies:
- current_identifier: null
  type: marketplace
  value:
    marketplace_plugin_unique_identifier: langgenius/openai_api_compatible:0.0.25@6c02d20ecf7eba40234be5201f25c2b6ea918ec09e0f8eb2a333efb495947d02
    version: null
kind: app
version: 0.5.0
workflow:
  conversation_variables: []
  environment_variables: []
  features:
    file_upload:
      allowed_file_extensions:
      - .JPG
      - .JPEG
      - .PNG
      - .GIF
      - .WEBP
      - .SVG
      allowed_file_types:
      - image
      allowed_file_upload_methods:
      - local_file
      - remote_url
      enabled: false
      fileUploadConfig:
        audio_file_size_limit: 50
        batch_count_limit: 5
        file_size_limit: 15
        image_file_size_limit: 10
        video_file_size_limit: 100
        workflow_file_upload_limit: 10
      image:
        enabled: false
        number_limits: 3
        transfer_methods:
        - local_file
        - remote_url
      number_limits: 3
    opening_statement: 你好 我是运维管理员
    retriever_resource:
      enabled: true
    sensitive_word_avoidance:
      enabled: false
    speech_to_text:
      enabled: false
    suggested_questions: []
    suggested_questions_after_answer:
      enabled: false
    text_to_speech:
      enabled: false
      language: ''
      voice: ''
  graph:
    edges:
    - data:
        sourceType: llm
        targetType: answer
      id: llm-answer
      source: llm
      sourceHandle: source
      target: answer
      targetHandle: target
      type: custom
    - data:
        isInLoop: false
        sourceType: start
        targetType: knowledge-retrieval
      id: 1771985974968-source-1771986921623-target
      source: '1771985974968'
      sourceHandle: source
      target: '1771986921623'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInLoop: false
        sourceType: knowledge-retrieval
        targetType: llm
      id: 1771986921623-source-llm-target
      source: '1771986921623'
      sourceHandle: source
      target: llm
      targetHandle: target
      type: custom
      zIndex: 0
    nodes:
    - data:
        desc: 请输入需要了解的制度内容
        selected: false
        title: 用户输入
        type: start
        variables:
        - default: ''
          hint: ''
          label: 请输入您的问题
          max_length: 48
          options: []
          placeholder: ''
          required: true
          type: text-input
          variable: input_text
      height: 136
      id: '1771985974968'
      position:
        x: -250.08913350951966
        y: 324.99999999999994
      positionAbsolute:
        x: -250.08913350951966
        y: 324.99999999999994
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    - data:
        context:
          enabled: true
          variable_selector:
          - '1771986921623'
          - result
        desc: 大模型
        model:
          completion_params:
            temperature: 0.7
          mode: chat
          name: Qwen3-32B
          provider: langgenius/openai_api_compatible/openai_api_compatible
        prompt_template:
        - id: 2ee916cc-893c-43bd-a02d-975bc50446ff
          role: system
          text: "角色:\n您是一个运维管理员,熟悉所有的运维管理规范\n任务:\n请根据智慧运维知识库的所有内容,并提取核心观点,最后生成一段简短的摘要:\
            \ \n要求: \n1、 阅读索引文件并进行总结,语言简洁,不超过200字。 \n2、使用列表形式展示核心观点。\n3、仅使用知识库内容回答问题。\n"
        selected: false
        structured_output_enabled: false
        title: LLM
        type: llm
        vision:
          enabled: false
      height: 115
      id: llm
      position:
        x: 391.05956484547767
        y: 353.7284936842676
      positionAbsolute:
        x: 391.05956484547767
        y: 353.7284936842676
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    - data:
        answer: '{{#llm.text#}}/'
        selected: false
        title: 直接回复
        type: answer
        variables: []
      height: 102
      id: answer
      position:
        x: 754.7621681354833
        y: 378.51301645002985
      positionAbsolute:
        x: 754.7621681354833
        y: 378.51301645002985
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    - data:
        dataset_ids:
        - I96tLXkjt6ZmRKEH/mjSSzrEXpBR1mpvo9tbogayCnZr22f+SmIh2J1nJoaSpWnJ
        multiple_retrieval_config:
          reranking_enable: true
          reranking_mode: reranking_model
          reranking_model:
            model: models--baai--bge-reranker-v2-m3
            provider: langgenius/openai_api_compatible/openai_api_compatible
          score_threshold: null
          top_k: 4
          weights:
            keyword_setting:
              keyword_weight: 0.3
            vector_setting:
              embedding_model_name: qwen3-embedding:8b
              embedding_provider_name: langgenius/ollama/ollama
              vector_weight: 0.7
            weight_type: customized
        query_variable_selector:
        - '1771985974968'
        - input_text
        retrieval_mode: multiple
        selected: true
        title: 知识检索
        type: knowledge-retrieval
      height: 89
      id: '1771986921623'
      position:
        x: 66.20826320047445
        y: 353.7284936842676
      positionAbsolute:
        x: 66.20826320047445
        y: 353.7284936842676
      selected: true
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    viewport:
      x: 247.65853216270784
      y: 10.441073703044367
      zoom: 1.0000000000000009
  rag_pipeline_variables: []