First attempt at installing llm-d, an AI model serving platform, on Kubernetes 1.31: a detailed walkthrough

2025/9/27 17:20:07, source: https://www.cnblogs.com/slgkaifa/p/19115339

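Note:

I followed the official documentation, leaving aside the steps it does not explain clearly, and got as far as the final step. The only thing missing is HF_TOKEN, because my Kubernetes cluster cannot reach Hugging Face.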

[root@bastion quickstart]# cat /etc/redhat-release

Rocky Linux release 9.5 (Blue Onyx)

[root@bastion quickstart]#

[root@bastion quickstart]# kubectl get nodes

NAME                        STATUS   ROLES           AGE   VERSION
master01.kcloudonline.com   Ready    control-plane   46h   v1.31.0
worker01.kcloudonline.com   Ready    <none>          46h   v1.31.0
worker02.kcloudonline.com   Ready    <none>          46h   v1.31.0
worker03.kcloudonline.com   Ready    <none>          46h   v1.31.0

[root@bastion quickstart]#

Get the code

Clone the llm-d-deployer repository.

git clone https://github.com/llm-d/llm-d-deployer.git

Navigate to the quickstart directory

cd llm-d-deployer/quickstart

[root@bastion software]# dnf install git -y

[root@bastion software]# mkdir llm-d

[root@bastion software]# cd llm-d/

[root@bastion llm-d]# git clone https://github.com/llm-d/llm-d-deployer.git

[root@bastion llm-d]# cd llm-d-deployer/

[root@bastion llm-d-deployer]# ls

chart-dependencies CONTRIBUTING.md ct-install.yaml DCO LICENSE Makefile OWNERS README.md

charts cr.yaml ct.yaml helpers lintconf.yaml notes quickstart REPO_DOCS.md

[root@bastion llm-d-deployer]# cd quickstart/

[root@bastion quickstart]# ls

examples grafana grafana-setup.md infra install-deps.sh llmd-installer.sh metrics-overview.md README.md README-minikube.md test-request.sh

[root@bastion quickstart]#

Required tools

The following prerequisites are required for the installer to work.

yq (mikefarah) – installation

jq – download & install guide

git – installation guide

Helm – quick-start install

Kustomize – official install docs

kubectl – install & setup

You can use the installer script that installs all the required dependencies.

./install-deps.sh

# Download and install yq

sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq

# Make it executable

sudo chmod +x /usr/local/bin/yq

# Verify the installation

yq --version
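The download URL above hard-codes the amd64 build, so on a host with a different CPU architecture the asset name has to change. A minimal sketch, assuming only the yq_linux_amd64 and yq_linux_arm64 release assets are of interest; the helper name yq_asset_for_arch is made up for illustration:

```shell
# Map the output of `uname -m` to a yq release asset name.
# Only the two common 64-bit Linux architectures are handled (an assumption).
yq_asset_for_arch() {
  case "$1" in
    x86_64)        echo "yq_linux_amd64" ;;
    aarch64|arm64) echo "yq_linux_arm64" ;;
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Print the download URL for this host, or nothing if the architecture is unknown.
asset="$(yq_asset_for_arch "$(uname -m)")" || asset=""
if [ -n "$asset" ]; then
  echo "https://github.com/mikefarah/yq/releases/latest/download/${asset}"
fi
```

The printed URL can then be passed to wget in place of the hard-coded one.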

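Install with the official script (recommended)

# Download and install the latest version of Kustomize

curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash

# Move kustomize into a directory on the system PATH

sudo mv kustomize /usr/local/bin/

# Verify the installation

kustomize version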

[root@bastion quickstart]# sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq

Resolving release-assets.githubusercontent.com (release-assets.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...

Connecting to release-assets.githubusercontent.com (release-assets.githubusercontent.com)|185.199.110.133|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: 11477176 (11M) [application/octet-stream]

Saving to: ‘/usr/local/bin/yq’

/usr/local/bin/yq 100%[=====================================================================================>] 10.95M 1002KB/s in 7.1s

2025-09-26 08:34:22 (1.55 MB/s) - ‘/usr/local/bin/yq’ saved [11477176/11477176]

[root@bastion quickstart]# sudo chmod +x /usr/local/bin/yq

[root@bastion quickstart]# yq --version

yq (https://github.com/mikefarah/yq/) version v4.47.2

[root@bastion quickstart]#

[root@bastion llm-d]# curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash

v5.7.1

kustomize installed to /software/llm-d/kustomize

[root@bastion llm-d]# ls

kustomize llm-d-deployer

[root@bastion llm-d]# cp kustomize /usr/local/bin/

[root@bastion llm-d]# kustomize version

v5.7.1

[root@bastion llm-d]#

[root@bastion quickstart]# ./install-deps.sh

Rocky Linux 9 - BaseOS 2.5 kB/s | 4.1 kB 00:01

Rocky Linux 9 - AppStream 5.0 kB/s | 4.5 kB 00:00

Rocky Linux 9 - Extras 631 B/s | 2.9 kB 00:04

Dependencies resolved.

================================================================================
 Package        Architecture        Version             Repository        Size
================================================================================
Installing:
 make           x86_64              1:4.3-8.el9         baseos            529 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 529 k
Installed size: 1.6 M
Downloading Packages:
make-4.3-8.el9.x86_64.rpm                        301 kB/s | 529 kB     00:01
--------------------------------------------------------------------------------
Total                                            212 kB/s | 529 kB     00:02

Running transaction check

Transaction check succeeded.

Running transaction test

Transaction test succeeded.

Running transaction

Preparing : 1/1

Installing : make-1:4.3-8.el9.x86_64 1/1

Running scriptlet: make-1:4.3-8.el9.x86_64 1/1

Verifying : make-1:4.3-8.el9.x86_64 1/1

Installed:

make-1:4.3-8.el9.x86_64

Complete!

Installing yq...

[root@bastion quickstart]#
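The transcript ends at "Installing yq..." without a summary, so it is worth confirming that every tool from the required list is actually on PATH before running the installer. A minimal sketch; the function name missing_tools is hypothetical and not part of the llm-d scripts:

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Check the tools listed under "Required tools".
missing="$(missing_tools yq jq git helm kustomize kubectl)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found"
fi
```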

Required credentials and configuration

llm-d-deployer GitHub repo – clone here(https://github.com/llm-d/llm-d-deployer.git)

HuggingFace HF_TOKEN (https://huggingface.co/docs/hub/en/security-tokens) with download access for the model you want to use. By default the sample application will use meta-llama/Llama-3.2-3B-Instruct.

⚠️ Your Hugging Face account must have access to the model you want to use. You may need to visit Hugging Face meta-llama/Llama-3.2-3B-Instruct and accept the usage terms if you have not already done so.
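Since the installer refuses to start without HF_TOKEN, a small pre-flight check can save a failed run. The sketch below only tests that the variable is non-empty and offers an optional reachability probe. The function names are hypothetical, and the probe runs from the bastion host, so a success there does not prove that the cluster nodes can reach Hugging Face.

```shell
# Return 0 if HF_TOKEN is set and non-empty, 1 otherwise.
hf_token_is_set() {
  [ -n "${HF_TOKEN:-}" ]
}

# Print the HTTP status code returned by a URL; curl prints 000 when no response arrives.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}

if hf_token_is_set; then
  echo "HF_TOKEN is set"
else
  echo "HF_TOKEN is not set; run: export HF_TOKEN=<your_token>"
fi

# Optional probe from this host (not from inside the cluster):
# http_status https://huggingface.co
```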

Target Platforms

Since the llm-d-deployer is based on helm charts, llm-d can be deployed on a variety of Kubernetes platforms.

llm-d Installation

Only a single installation of llm-d on a cluster is currently supported. In the future, multiple model services will be supported. Until then, uninstall llm-d before reinstalling.

The llm-d-deployer contains all the helm charts necessary to deploy llm-d. To facilitate the installation of the helm charts, the llmd-installer.sh script is provided. This script will populate the necessary manifests in the manifests directory. After this, it will apply all the manifests in order to bring up the cluster.

The llmd-installer.sh script aims to simplify the installation of llm-d using the llm-d-deployer as its main function. It scripts as many of the steps as possible to make the installation process more streamlined. This includes:

Installing the GAIE infrastructure

Creating the namespace with any special configurations

Creating the pull secret to download the images

Creating the model service CRDs

Applying the helm charts

Deploying the sample app (model service)

It also supports uninstalling the llm-d infrastructure and the sample app.

Before proceeding with the installation, ensure you have completed the prerequisites and are able to issue kubectl or oc commands to your cluster by configuring your ~/.kube/config file or by using the oc login command.
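One way to confirm cluster access and admin rights before running the installer is sketched below. It relies on kubectl cluster-info and kubectl auth can-i, which prints "yes" or "no"; the helper name is_cluster_admin is made up here, and treating a wildcard can-i check as equivalent to cluster-admin is an assumption.

```shell
# Return 0 if the current kubeconfig user may perform any verb on any resource.
is_cluster_admin() {
  [ "$(kubectl auth can-i '*' '*' --all-namespaces 2>/dev/null)" = "yes" ]
}

if ! command -v kubectl >/dev/null 2>&1; then
  echo "kubectl not found on PATH"
elif ! kubectl cluster-info >/dev/null 2>&1; then
  echo "kubectl cannot reach a cluster; check ~/.kube/config"
elif is_cluster_admin; then
  echo "Cluster reachable; current user appears to have cluster-admin rights"
else
  echo "Cluster reachable, but current user may lack cluster-admin rights"
fi
```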

Usage

The installer needs to be run from the llm-d-deployer/quickstart directory as a cluster admin with CLI access to the cluster.

./llmd-installer.sh [OPTIONS]

Flags

Examples

Install llm-d on an Existing Kubernetes Cluster

export HF_TOKEN="your-token"

./llmd-installer.sh

[root@bastion quickstart]# ./llmd-installer.sh

Setting up script environment...

kubectl can reach to a running Kubernetes cluster.

❌ HF_TOKEN not set; Run: export HF_TOKEN=<your_token>

[root@bastion quickstart]#

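Note:

The llm-d installation is not separated from the model, and I have some doubts about this design. As I understand it, installing the platform first and loading a model afterwards might work better.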

Install on OpenShift

Before running the installer, ensure you have logged into the cluster as a cluster administrator. For example:

oc login --token=sha256~yourtoken --server=https://api.yourcluster.com:6443

export HF_TOKEN="your-token"

./llmd-installer.sh

Validation

The inference-gateway serves as the HTTP ingress point for all inference requests in our deployment. It’s implemented as a Kubernetes Gateway (gateway.networking.k8s.io/v1) using either kgateway or istio as the gatewayClassName, and sits in front of your inference pods to handle path-based routing, load balancing, retries, and metrics. This example validates that the gateway itself is routing your completion requests correctly. You can execute the test-request.sh script to test on the cluster.

# Default options (the model id will be discovered via /v1/models)

./test-request.sh

# Non-default namespace/model

./test-request.sh -n <NAMESPACE> -m <FULL_MODEL_NAME> --minikube
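For readers who want to see roughly what such a smoke test sends, here is a hand-rolled sketch of an OpenAI-style completion request. It is not the content of test-request.sh. GATEWAY_URL is a hypothetical address that depends on how the gateway is exposed, and the /v1/completions path is assumed from the OpenAI-compatible API that serves /v1/models.

```shell
# Build a minimal OpenAI-style completion request body.
# Arguments are inserted verbatim, so they must not contain double quotes or backslashes.
completion_payload() {
  printf '{"model": "%s", "prompt": "%s", "max_tokens": %d}\n' "$1" "$2" "${3:-16}"
}

GATEWAY_URL="${GATEWAY_URL:-http://localhost:8080}"   # hypothetical address
MODEL="${MODEL:-meta-llama/Llama-3.2-3B-Instruct}"     # default sample model named earlier

# Print the request body that would be sent.
completion_payload "$MODEL" "Who are you?"

# Once the gateway is reachable at GATEWAY_URL, the requests could look like this:
# curl -s "$GATEWAY_URL/v1/models"
# curl -s -H 'Content-Type: application/json' \
#      -d "$(completion_payload "$MODEL" "Who are you?")" \
#      "$GATEWAY_URL/v1/completions"
```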

If you receive an error indicating PodSecurity "restricted" violations when running the smoke-test script, you need to remove the restrictive PodSecurity labels from the namespace. Once these labels are removed, re-run the script and it should proceed without PodSecurity errors. Run the following command:

kubectl label namespace <NAMESPACE> \
  pod-security.kubernetes.io/warn- \
  pod-security.kubernetes.io/warn-version- \
  pod-security.kubernetes.io/audit- \
  pod-security.kubernetes.io/audit-version-
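Before or after removing the labels, it can help to list which PodSecurity labels the namespace actually carries. A minimal sketch using kubectl get with --show-labels; the default namespace name llm-d and the helper name filter_pod_security_labels are assumptions for illustration:

```shell
# Read a comma-separated label list on stdin and keep only Pod Security labels.
filter_pod_security_labels() {
  tr ',' '\n' | grep '^pod-security\.kubernetes\.io/' || true
}

NAMESPACE="${NAMESPACE:-llm-d}"   # assumed namespace; replace with your own

# The last column of `kubectl get --show-labels` holds the labels.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get namespace "$NAMESPACE" --show-labels --no-headers 2>/dev/null \
    | awk '{print $NF}' \
    | filter_pod_security_labels
fi
```

An empty result means the namespace has no PodSecurity labels left.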
