文章

华为昇腾910B基于Ctyunos部署Qwen3

华为昇腾910B基于Ctyunos部署Qwen3

环境准备

下载Qwen3模型文件

由于 modelscope 下载需要 Python 3.10 以上的支持,为了不影响系统的 Python 版本,可以选择在 Docker 容器中下载模型文件。

步骤 1: 启动 Docker 容器

运行以下命令启动一个容器:

1
2
docker run -d --name mypython --rm -v /mnt/nvme01/download:~/.cache/modelscope/hub \
  666860.xyz/python:3.11-slim tail -f /dev/null
步骤 2: 进入容器内部

使用以下命令进入容器内部:

1
docker exec -it mypython bash
步骤 3: 下载模型文件

在容器内执行以下命令安装 modelscope 并下载模型文件:

1
2
pip install modelscope -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn
modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Llama-70B

下载完成后,模型文件会存放在指定的目录 /mnt/nvme01/download

拉取mindie容器镜像

建议选择我这个相同的版本,因为我到华为官网没有发现这个版本的,于是就下载了他们的最新版本(是不是最新的我不知道),会出现各种各样的问题,比如出现什么【qwen3不支持】哈哈🤣

1
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/thxcode/mindie:2.0.T17-800I-A2-py311-openeuler24.03-lts-linuxarm64

拉取好镜像我们的食材就准备好了,接下来可以进行做饭了。

创建mindie配置文件

1
vim /root/config.json

填入下面的内容(下面的内容是我从容器中复制出来的,并且亲测过可以使用,没有特殊需求不需要改,相关配置信息可以查看官方手册):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
{
    "Version" : "1.0.0",

    "ServerConfig" :
    {
        "ipAddress" : "0.0.0.0",
        "managementIpAddress" : "0.0.0.0",
        "port" : 1025,
        "managementPort" : 1026,
        "metricsPort" : 1027,
        "allowAllZeroIpListening" : true,
        "maxLinkNum" : 1000,
        "httpsEnabled" : false,
        "fullTextEnabled" : false,
        "tlsCaPath" : "security/ca/",
        "tlsCaFile" : ["ca.pem"],
        "tlsCert" : "security/certs/server.pem",
        "tlsPk" : "security/keys/server.key.pem",
        "tlsPkPwd" : "security/pass/key_pwd.txt",
        "tlsCrlPath" : "security/certs/",
        "tlsCrlFiles" : ["server_crl.pem"],
        "managementTlsCaFile" : ["management_ca.pem"],
        "managementTlsCert" : "security/certs/management/server.pem",
        "managementTlsPk" : "security/keys/management/server.key.pem",
        "managementTlsPkPwd" : "security/pass/management/key_pwd.txt",
        "managementTlsCrlPath" : "security/management/certs/",
        "managementTlsCrlFiles" : ["server_crl.pem"],
        "kmcKsfMaster" : "tools/pmt/master/ksfa",
        "kmcKsfStandby" : "tools/pmt/standby/ksfb",
        "inferMode" : "standard",
        "interCommTLSEnabled" : false,
        "interCommPort" : 1121,
        "interCommTlsCaPath" : "security/grpc/ca/",
        "interCommTlsCaFiles" : ["ca.pem"],
        "interCommTlsCert" : "security/grpc/certs/server.pem",
        "interCommPk" : "security/grpc/keys/server.key.pem",
        "interCommPkPwd" : "security/grpc/pass/key_pwd.txt",
        "interCommTlsCrlPath" : "security/grpc/certs/",
        "interCommTlsCrlFiles" : ["server_crl.pem"],
        "openAiSupport" : "vllm",
        "tokenTimeout" : 600,
        "e2eTimeout" : 600,
        "distDPServerEnabled":false
    },

    "BackendConfig" : {
        "backendName" : "mindieservice_llm_engine",
        "modelInstanceNumber" : 1,
        "npuDeviceIds" : [[0,1,2,3,4,5,6,7]],
        "tokenizerProcessNumber" : 8,
        "multiNodesInferEnabled" : false,
        "multiNodesInferPort" : 1120,
        "interNodeTLSEnabled" : false,
        "interNodeTlsCaPath" : "security/grpc/ca/",
        "interNodeTlsCaFiles" : ["ca.pem"],
        "interNodeTlsCert" : "security/grpc/certs/server.pem",
        "interNodeTlsPk" : "security/grpc/keys/server.key.pem",
        "interNodeTlsPkPwd" : "security/grpc/pass/mindie_server_key_pwd.txt",
        "interNodeTlsCrlPath" : "security/grpc/certs/",
        "interNodeTlsCrlFiles" : ["server_crl.pem"],
        "interNodeKmcKsfMaster" : "tools/pmt/master/ksfa",
        "interNodeKmcKsfStandby" : "tools/pmt/standby/ksfb",
        "ModelDeployConfig" :
        {
            "maxSeqLen" : 32768,
            "maxInputTokenLen" : 32768,
            "truncation" : false,
            "ModelConfig" : [
                {
                    "modelInstanceType" : "Standard",
                    "modelName" : "Qwen3-32B",
                    "modelWeightPath" : "/mnt/nvme01/model/Qwen3-32B",
                    "worldSize" : 8,
                    "cpuMemSize" : 5,
                    "npuMemSize" : -1,
                    "backendType" : "atb",
                    "trustRemoteCode" : false
                }
            ]
        },

        "ScheduleConfig" :
        {
            "templateType" : "Standard",
            "templateName" : "Standard_LLM",
            "cacheBlockSize" : 128,

            "maxPrefillBatchSize" : 50,
            "maxPrefillTokens" : 32768,
            "prefillTimeMsPerReq" : 150,
            "prefillPolicyType" : 0,

            "decodeTimeMsPerReq" : 50,
            "decodePolicyType" : 0,

            "maxBatchSize" : 200,
            "maxIterTimes" : 32768,
            "maxPreemptCount" : 0,
            "supportSelectBatch" : false,
            "maxQueueDelayMicroseconds" : 5000
        }
    }
}

创建容器

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
docker run -it -d --net=host --shm-size=1g \
    --name Qwen3-32B \
    --device=/dev/davinci_manager \
    --device=/dev/hisi_hdc \
    --device=/dev/devmm_svm \
    --device=/dev/davinci0 \
    --device=/dev/davinci1 \
    --device=/dev/davinci2 \
    --device=/dev/davinci3 \
    --device=/dev/davinci4 \
    --device=/dev/davinci5 \
    --device=/dev/davinci6 \
    --device=/dev/davinci7 \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro \
    -v /usr/local/sbin:/usr/local/sbin:ro \
    -v /root/qwen/config.json:/usr/local/Ascend/mindie/latest/mindie-service/conf/config.json \
    -v /mnt/nvme01/model/Qwen3-32B:/mnt/nvme01/model/Qwen3-32B:ro \
    swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/thxcode/mindie:2.0.T17-800I-A2-py311-openeuler24.03-lts-linuxarm64 bash

这样我们就创建了一个名为Qwen3-32B的容器,可以使用下面命令进入容器进行操作:

1
docker exec -it Qwen3-32B bash

进入容器后,需要更新transformers,因为Qwen3需要使用新版本的transformers,在此感谢:@Lucent的博客https://lucent.blog/?p=xKdMOl2l

1
pip install --upgrade transformers==4.51.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

前台测试是否正常启动:

1
/usr/local/Ascend/mindie/latest/mindie-service/bin/mindieservice_daemon

如果有问题可以查看日志,看不懂可以在左侧找我的联系方式。

image-20250710220426414

在后台启动:

1
nohup /usr/local/Ascend/mindie/latest/mindie-service/bin/mindieservice_daemon > /root/mindie/log/mindieservice_daemon.log 2>&1 &

如果需要重启的话,可以直接重启容器,但是重启容器后需要进入容器再次后台运行一下。后续查看运行日志在/root/mindie/log/mindieservice_daemon.log

本文由作者按照 CC BY 4.0 进行授权