{"id":899,"date":"2025-09-22T14:49:13","date_gmt":"2025-09-22T06:49:13","guid":{"rendered":"http:\/\/www.o6s.net\/?p=899"},"modified":"2025-10-25T22:50:47","modified_gmt":"2025-10-25T14:50:47","slug":"%e5%9c%a8-openeuler-%e4%b8%8a%e4%bd%bf%e7%94%a8-kserve-%e9%83%a8%e7%bd%b2-qwen3","status":"publish","type":"post","link":"https:\/\/www.o6s.net\/index.php\/2025\/09\/22\/%e5%9c%a8-openeuler-%e4%b8%8a%e4%bd%bf%e7%94%a8-kserve-%e9%83%a8%e7%bd%b2-qwen3\/","title":{"rendered":"Deploying Qwen3 with KServe on openEuler"},
"content":{"rendered":"<p>Introduction<br \/>\nKServe is a Kubernetes-based model serving platform that simplifies deploying and managing machine learning models in production. Through standardized interfaces and CRDs (Custom Resource Definitions), KServe supports several mainstream inference backends (such as TensorFlow Serving, TorchServe, Triton Inference Server, and Hugging Face Server) and is suited to online inference for a wide range of deep learning models.<\/p>\n<p>This article shows how to deploy and use KServe on the OpenAtom openEuler operating system (\u201copenEuler\u201d for short) to run text generation with the Hugging Face Qwen3 model.<\/p>\n<p>Scenario<br \/>\nIn this example, we deploy an InferenceService that uses the Hugging Face serving runtime to serve the Qwen3 model from Hugging Face for text generation.<\/p>\n<p>By default, the KServe Hugging Face runtime uses vLLM as the backend for serving large language models (LLMs), which gives a faster time to first token (TTFT) and higher token generation throughput than the official Hugging Face API.<\/p>\n<p>Environment preparation<br \/>\nOperating system version<br \/>\nThis guide uses openEuler 24.03 LTS SP2 as an example; other recent openEuler releases can follow the same steps. Make sure the system has the latest patches installed and that you have sudo privileges.<\/p>\n<p>Installing basic dependencies<br \/>\nInstall the required system tools and dependencies:<br \/>\nyum update -y<br \/>\nyum install -y wget curl tar iptables<br \/>\nInstall Docker<br \/>\ncurl -sL https:\/\/raw.githubusercontent.com\/cnrancher\/euler-packer\/refs\/heads\/main\/scripts\/others\/install-docker.sh | bash -<br \/>\nInstall Kind<br \/>\ncurl -Lo .\/kind https:\/\/kind.sigs.k8s.io\/dl\/v0.29.0\/kind-linux-amd64<br \/>\nchmod +x .\/kind<br \/>\nmv .\/kind \/usr\/local\/bin\/kind<br \/>\nInstall the Kubernetes CLI<br \/>\ncurl -LO &quot;https:\/\/dl.k8s.io\/release\/$(curl -L -s https:\/\/dl.k8s.io\/release\/stable.txt)\/bin\/linux\/amd64\/kubectl&quot;<br \/>\ninstall -o root -g root -m 0755 kubectl \/usr\/local\/bin\/kubectl<br \/>\nInstall Helm<br \/>\nwget https:\/\/get.helm.sh\/helm-v3.18.4-linux-amd64.tar.gz<br \/>\ntar -zxvf helm-v3.18.4-linux-amd64.tar.gz<br \/>\nmv linux-amd64\/helm \/usr\/local\/bin\/helm<br \/>\nInstallation steps<br \/>\nInstall KServe<br \/>\nCreate a Kubernetes cluster<br \/>\nkind create cluster<br \/>\nSwitch the kubectl context<br \/>\nkubectl config use-context kind-kind<br \/>\nInstall KServe and its dependencies<br \/>\ncurl -sL &quot;https:\/\/gitee.com\/openeuler\/openeuler-docker-images\/raw\/master\/AI\/kserve\/controller\/doc\/quick_install.sh&quot; | bash -s -- -r<br \/>\nDeploy the Qwen3 InferenceService<br \/>\nCreate a Secret object for the Hugging Face token (set HF_TOKEN to your own access token)<\/p>\n<p>kubectl apply -f - &lt;&lt;EOF<br \/>\napiVersion: v1<br \/>\nkind: Secret<br \/>\nmetadata:<br \/>\nname: hf-secret<br \/>\ntype: Opaque<br \/>\nstringData:<br \/>\nHF_TOKEN:<br \/>\nEOF<br \/>\nCreate the InferenceService custom resource for the Hugging Face Qwen3 service<\/p>\n<p>kubectl apply -f - &lt;&lt;EOF<br \/>\n[&hellip;]<br \/>\n&gt; POST \/openai\/v1\/completions HTTP\/1.1<br \/>\n&gt; Host: huggingface-qwen3-default.example.com<br \/>\n&gt; User-Agent: curl\/7.88.1<br \/>\n&gt; Accept: *\/*<br \/>\n&gt; content-type: application\/json<br \/>\n&gt; Content-Length: 91<br \/>\n&gt;<br \/>\n&lt; HTTP\/1.1 200 OK<br \/>\n&lt; date: Tue, 12 Aug 2025 05:36:03 GMT<br \/>\n&lt; server: uvicorn<br \/>\n&lt; content-length: 474<br \/>\n&lt; content-type: application\/json<br \/>\n&lt;<br \/>\n* Connection #0 to host 10.96.149.169 left intact<br \/>\n{&quot;id&quot;:&quot;cmpl-a2ead2a3246f47fe85c48b7aadbd30d5&quot;,&quot;object&quot;:&quot;text_completion&quot;,&quot;created&quot;:1754976963,&quot;model&quot;:&quot;qwen3&quot;,&quot;choices&quot;:[{&quot;index&quot;:0,&quot;text&quot;:&quot; in the style of a haiku, with each line containing a different color and a different season, and each line also incorporating a different sense. The&quot;,&quot;logprobs&quot;:null,&quot;finish_reason&quot;:&quot;length&quot;,&quot;stop_reason&quot;:null,&quot;prompt_logprobs&quot;:null}],&quot;usage&quot;:{&quot;prompt_tokens&quot;:5,&quot;total_tokens&quot;:35,&quot;completion_tokens&quot;:30,&quot;prompt_tokens_details&quot;:null}}<\/p>\n","protected":false},
"excerpt":{"rendered":"<p>Introduction KServe is a Kubernetes-based model serving platform that simplifies [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[],"class_list":["post-899","post","type-post","status-publish","format-standard","hentry","category-openeuler"],"_links":{"self":[{"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/posts\/899","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/comments?post=899"}],"version-history":[{"count":1,"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/posts\/899\/revisions"}],"predecessor-version":[{"id":904,"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/posts\/899\/revisions\/904"}],"wp:attachment":[{"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/media?parent=899"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/categories?post=899"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.o6s.net\/index.php\/wp-json\/wp\/v2\/tags?post=899"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}",
"templated":true}]}}