Below is a detailed tutorial for deploying DeepSeek locally on a CentOS system. It covers deploying a large language model from DeepSeek (深度求索), walking through environment preparation, dependency installation, model configuration, and a test run.
### **I. System Preparation and Updates**
1. **Update the system and install base tools**
   ```bash
   sudo yum update -y
   sudo yum groupinstall "Development Tools" -y
   sudo yum install epel-release -y  # enable the EPEL repository
   sudo yum install wget curl git openssl-devel bzip2-devel libffi-devel -y
   ```
2. **Install Python 3.8+**
   ```bash
   # Install Python build dependencies
   sudo yum install sqlite-devel xz-devel zlib-devel -y
   # Download and compile Python 3.8 (3.8.18 shown here; any 3.8.x release works)
   wget https://www.python.org/ftp/python/3.8.18/Python-3.8.18.tgz
   tar xzf Python-3.8.18.tgz
   cd Python-3.8.18
   ./configure --enable-optimizations
   make -j$(nproc)
   sudo make altinstall
   # Verify the installation
   python3.8 --version
   pip3.8 --version
   ```
3. **Set up a virtual environment (recommended)**
   ```bash
   pip3.8 install virtualenv
   virtualenv deepseek-env --python=python3.8
   source deepseek-env/bin/activate
   ```
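   To confirm the environment is active, a quick check (the interpreter should resolve inside `deepseek-env`):
   ```bash
   # Both should point at deepseek-env/bin/ after activation
   which python
   python --version
   ```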
### **II. Install the GPU Driver and CUDA (for NVIDIA GPUs)**
1. **Install the NVIDIA driver**
   - Disable the default nouveau driver:
     ```bash
     echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
     sudo dracut --force
     sudo reboot  # reboot for the change to take effect
     ```
   - Download the driver from the [NVIDIA website](https://www.nvidia.com/Download/index.aspx) and install it:
     ```bash
     chmod +x NVIDIA-Linux-x86_64-*.run
     sudo ./NVIDIA-Linux-x86_64-*.run
     ```
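   Once the installer finishes, it is worth confirming that nouveau stayed disabled and the NVIDIA driver loaded (a quick check):
   ```bash
   # Should print nothing: nouveau must no longer be loaded
   lsmod | grep nouveau
   # Should list the driver version and the detected GPU(s)
   nvidia-smi
   ```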
2. **Install CUDA 11.8**
   - Download CUDA 11.8 from the [NVIDIA CUDA Archive](https://developer.nvidia.com/cuda-toolkit-archive):
     ```bash
     wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
     sudo sh cuda_11.8.0_520.61.05_linux.run
     ```
   - Configure environment variables:
     ```bash
     echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc
     echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
     source ~/.bashrc
     ```
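   You can then verify the toolkit is on the PATH (a quick check):
   ```bash
   # Should report: Cuda compilation tools, release 11.8
   nvcc --version
   ```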
3. **Install cuDNN**
   - Download the version matching your CUDA release from the [NVIDIA cuDNN page](https://developer.nvidia.com/cudnn), then extract the archive and copy the files into the CUDA directory (note the archive is `.tar.xz`, so use plain `tar -xvf` rather than the gzip flag):
     ```bash
     tar -xvf cudnn-linux-x86_64-*_cuda11-archive.tar.xz
     sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include/
     sudo cp cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64/
     sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
     ```
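   To confirm the copy, you can read the version macros from the installed header (cuDNN 8.x archives keep them in `cudnn_version.h`):
   ```bash
   # Prints CUDNN_MAJOR / CUDNN_MINOR / CUDNN_PATCHLEVEL
   grep -A 2 '#define CUDNN_MAJOR' /usr/local/cuda/include/cudnn_version.h
   ```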
### **III. Install PyTorch and Dependencies**
1. **Install PyTorch (built against CUDA 11.8)**
   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
   ```
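   Before moving on, it is worth checking that PyTorch actually sees the GPU (a quick check):
   ```bash
   # Should print the PyTorch version followed by True
   python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
   ```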
2. **Install the model dependencies**
   ```bash
   pip install transformers datasets accelerate sentencepiece peft deepspeed
   ```
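   A quick sanity check that the key libraries import cleanly:
   ```bash
   python -c "import transformers, accelerate; print(transformers.__version__, accelerate.__version__)"
   ```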
### **IV. Download and Configure the DeepSeek Model**
1. **Get the model files**
   - Download the model from Hugging Face or an official channel (e.g. `deepseek-ai/deepseek-llm-7b-chat`):
     ```bash
     git lfs install
     git clone https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat
     ```
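   The 7B weights are tens of gigabytes, so it is worth confirming that the Git LFS files were actually downloaded rather than left as small pointer stubs (a quick check):
   ```bash
   # Total size should be tens of GB; a few MB means the LFS pull failed
   du -sh deepseek-llm-7b-chat
   ls -lh deepseek-llm-7b-chat/*.bin deepseek-llm-7b-chat/*.safetensors 2>/dev/null
   ```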
2. **Configure the model path**
   - Create a Python script `run_deepseek.py`:
     ```python
     from transformers import AutoTokenizer, AutoModelForCausalLM

     model_path = "./deepseek-llm-7b-chat"
     tokenizer = AutoTokenizer.from_pretrained(model_path)
     model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

     prompt = "你好,请介绍一下你自己。"  # "Hello, please introduce yourself."
     inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
     outputs = model.generate(**inputs, max_new_tokens=100)  # 100 is an example cap
     print(tokenizer.decode(outputs[0], skip_special_tokens=True))
     ```
### **V. Run and Test**
1. **Launch the inference script**
   ```bash
   python run_deepseek.py
   ```
   - The model's generated reply should be printed to the terminal.
2. **Enable a web service (optional)**
   - Build an API with FastAPI or Gradio:
     ```bash
     pip install fastapi uvicorn gradio
     ```
   - Create `api.py` (example):
     ```python
     from fastapi import FastAPI
     from transformers import pipeline

     app = FastAPI()
     model = pipeline("text-generation", model="./deepseek-llm-7b-chat")

     @app.post("/generate")
     def generate_text(prompt: str):
         return model(prompt, max_length=512)[0]["generated_text"]  # 512 is an example limit
     ```
   - Start the service (host and port below are example values):
     ```bash
     uvicorn api:app --host 0.0.0.0 --port 8000
     ```
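   With the service up, you can exercise the endpoint from another shell (assuming the example port 8000 above; in this minimal handler, `prompt` is passed as a query parameter):
   ```bash
   curl -X POST "http://127.0.0.1:8000/generate?prompt=hello"
   ```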
### **VI. Firewall and Port Access**
```bash
sudo firewall-cmd --zone=public --add-port=8000/tcp --permanent  # match the service port
sudo firewall-cmd --reload
```
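To confirm the rule took effect:
```bash
# The opened port should appear in the list
sudo firewall-cmd --zone=public --list-ports
```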
### **Common Issues**
1. **CUDA out of memory**
   - Reduce the `batch_size` or load the model in `fp16` precision:
     ```python
     import torch  # needed for torch.float16
     model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
     ```
2. **Dependency conflicts**
   - Use a virtual environment and keep dependency versions consistent:
     ```bash
     pip freeze > requirements.txt    # save the environment
     pip install -r requirements.txt  # restore the environment
     ```
Following the steps above completes a local deployment of the DeepSeek model on CentOS. Adjust model parameters to match your actual hardware (e.g., GPU memory limits).