[Personal Notes] Installing K3S on openEuler and Configuring It as a GPU Node
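Preface
Network access from mainland China is restricted, which makes online installation troublesome, so K3S is deployed offline here.
The overall plan is:
- Install the GPU driver
- Install the CUDA toolkit
- Install the NVIDIA container runtime
- Install K3S
- Configure K3S to use the GPU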
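Base environment
Everything is deployed All In One on a single machine (there is only one GPU card anyway).
Item | Value |
---|---|
OS | openEuler 22.03 (LTS-SP3) |
CPU | 8 |
Memory | 64 GB |
System disk | 500 GB |
GPU | V100-32G |
The GPU is attached via passthrough; vGPU should work much the same way.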
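Install the GPU driver
Download the driver
First download the driver from the official site (https://www.nvidia.cn/geforce/drivers/).
Select your GPU model, choose Linux 64-bit as the operating system, search, and download the result.
For my V100, the download link is https://cn.download.nvidia.com/tesla/560.35.03/NVIDIA-Linux-x86_64-560.35.03.run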
wget https://cn.download.nvidia.com/tesla/560.35.03/NVIDIA-Linux-x86_64-560.35.03.run -O NVIDIA-Linux.run
Install the build tools and dependencies
yum install gcc make kernel-devel-$(uname -r) vulkan-loader -y
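The installer compiles a kernel module, so the kernel-devel headers have to match the running kernel. Below is a minimal pre-check sketch, assuming the headers live under /usr/src/kernels/<release> (the same path this post passes to --kernel-source-path). The helper names kernel_source_path and has_kernel_headers are hypothetical, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the expected kernel source path for a release.
kernel_source_path() {
    printf '/usr/src/kernels/%s\n' "$1"
}

# Hypothetical helper: succeed only if the header directory exists.
# The optional second argument overrides the path (useful for testing).
has_kernel_headers() {
    dir=${2:-$(kernel_source_path "$1")}
    [ -d "$dir" ]
}

release=$(uname -r)
if has_kernel_headers "$release"; then
    echo "kernel headers found: $(kernel_source_path "$release")"
else
    echo "kernel headers missing for $release; install kernel-devel-$release first" >&2
fi
```

If the check reports missing headers, the running kernel is probably newer or older than the kernel-devel package that yum could find.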
Run the installer
Run the command below and follow the prompts until it finishes.
bash NVIDIA-Linux.run --kernel-source-path=/usr/src/kernels/$(uname -r)
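Check that the installation succeeded
nvidia-smi
The command should print the GPU and driver information; if it does, the driver is installed correctly.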
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100-PCIE-32GB           Off |   00000000:00:06.0 Off |                    0 |
| N/A   41C    P0             39W /  250W |     310MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
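For scripted checks, the boxed table is awkward to parse. A small sketch using nvidia-smi's CSV query mode is shown here; the helper name driver_version_from_csv is hypothetical, and the sample line in the comment is what the query would be expected to look like for the card above rather than captured output.

```shell
#!/bin/sh
# Hypothetical helper: extract the driver version (2nd CSV field) from one line of
#   nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
# e.g. "Tesla V100-PCIE-32GB, 560.35.03, 32768 MiB"
driver_version_from_csv() {
    printf '%s\n' "$1" | awk -F', ' '{ print $2 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    line=$(nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader | head -n 1)
    echo "driver version: $(driver_version_from_csv "$line")"
else
    echo "nvidia-smi not found; the driver is probably not installed" >&2
fi
```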
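Install the CUDA toolkit
Download the CUDA toolkit
Go to the official archive (https://developer.nvidia.com/cuda-toolkit-archive) and pick the version to download.
I chose 12.4.0 here (a version PyTorch supports). On the download page, select Linux as the operating system, x86_64 as the architecture, RHEL as the distribution, 9 as the version, and runfile (local) as the installer type.
Then run the commands from the installation instructions. The runfile bundles driver 550.54.14; since driver 560.35.03 is already installed, it is probably safest to deselect the Driver component in the installer menu and install only the toolkit.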
wget https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda_12.4.0_550.54.14_linux.run
sudo sh cuda_12.4.0_550.54.14_linux.run
Verify CUDA
nvcc -V
The command prints the CUDA version information:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
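If nvcc is reported as not found, the toolkit's directories are most likely missing from the environment, because the runfile installer does not edit shell profiles. The sketch below assumes the default prefix /usr/local/cuda-12.4 (the same prefix the cuDNN copy commands in this post use); prepend_once is a hypothetical helper that keeps the paths from being added twice.

```shell
#!/bin/sh
# Assumed install prefix; override CUDA_HOME if the toolkit lives elsewhere.
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda-12.4}

# Hypothetical helper: prepend directory $1 to the colon-separated list $2
# unless it is already present, and print the result.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *) printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

PATH=$(prepend_once "$CUDA_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
export CUDA_HOME PATH LD_LIBRARY_PATH
```

Putting these lines in a profile script such as ~/.bashrc makes the setting persist across logins.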
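Install cuDNN
Go to the official archive (https://developer.nvidia.com/rdp/cudnn-archive) and pick the version to download.
The site requires a login. Follow the instructions to download the archive to the server; the download link expires, so fetch it promptly.
Once downloaded, extract the archive and copy the files into the CUDA directory: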
xz -d cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz
tar -xvf cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar
cd cudnn-linux-x86_64-8.9.7.29_cuda12-archive/
/usr/bin/cp -r include/* /usr/local/cuda-12.4/include
/usr/bin/cp -r lib/* /usr/local/cuda-12.4/lib64
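To confirm the copy worked, the version macros in cudnn_version.h can be read back. This is only a sketch: cudnn_version is a hypothetical helper, and it assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL the way cuDNN 8.x headers do.

```shell
#!/bin/sh
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from a cudnn_version.h file.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

header=/usr/local/cuda-12.4/include/cudnn_version.h
if [ -r "$header" ]; then
    echo "cuDNN $(cudnn_version "$header")"
else
    echo "$header not found; the cuDNN headers were not copied" >&2
fi
```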
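Install the NVIDIA container runtime
For containers to use the GPU, the NVIDIA container runtime has to be installed. Most tutorials that turn up online for this step target Ubuntu, so here I follow the official instructions instead.
Add the YUM repository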
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
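Enable the experimental repository (optional; needs the config-manager plugin from dnf-plugins-core)
sudo dnf config-manager --set-enabled nvidia-container-toolkit-experimental
Install the container toolkit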
sudo dnf install -y nvidia-container-toolkit
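That is all for now; the toolkit is used in a later step.
Install K3S
Online installation of K3S is unreliable because of the network, so an offline installation is used here.
Download the binary and images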
mkdir -p /var/lib/rancher/k3s/agent/images/ /usr/local/bin/
wget https://github.com/k3s-io/k3s/releases/download/v1.31.5%2Bk3s1/k3s -O /usr/local/bin/k3s
chmod a+x /usr/local/bin/k3s
wget https://github.com/k3s-io/k3s/releases/download/v1.31.5%2Bk3s1/k3s-airgap-images-amd64.tar -O /var/lib/rancher/k3s/agent/images/k3s-airgap-images-amd64.tar
These downloads are slow; a mirror or proxy site such as https://gh-proxy.com can speed them up.
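When files come through a third-party proxy, it is worth checking them against the release's checksum list. The sketch below assumes sha256sum-amd64.txt from the same release (https://github.com/k3s-io/k3s/releases/download/v1.31.5%2Bk3s1/sha256sum-amd64.txt) has been saved to the current directory; verify_sha256 is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the SHA-256 of file $1 matches the entry
# named $2 in the checksum list $3 (lines of the form "<hash>  <name>").
verify_sha256() {
    expected=$(awk -v name="$2" '$2 == name { print $1 }' "$3")
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

sums=sha256sum-amd64.txt
if [ -r "$sums" ]; then
    verify_sha256 /usr/local/bin/k3s k3s "$sums" \
        && echo "k3s binary OK" || echo "k3s binary MISMATCH" >&2
    verify_sha256 /var/lib/rancher/k3s/agent/images/k3s-airgap-images-amd64.tar \
        k3s-airgap-images-amd64.tar "$sums" \
        && echo "airgap images OK" || echo "airgap images MISMATCH" >&2
fi
```

A mismatch usually means a truncated or altered download, in which case the file should be fetched again.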
Download the install script
wget https://get.k3s.io -O install.sh
If the download fails, the command below saves the install script directly. The heredoc delimiter is quoted and unique so that the shell writes the script verbatim, without expanding its variables or stopping at the EOF markers inside it.
cat > install.sh <<'K3S_INSTALL_EOF'
#!/bin/sh
set -e
set -o noglob

# Usage:
#   curl ... | ENV_VAR=... sh -
#       or
#   ENV_VAR=... ./install.sh
#
# Example:
#   Installing a server without traefik:
#     curl ... | INSTALL_K3S_EXEC="--disable=traefik" sh -
#   Installing an agent to point at a server:
#     curl ... | K3S_TOKEN=xxx K3S_URL=https://server-url:6443 sh -
#
# Environment variables:
#   - K3S_*
#     Environment variables which begin with K3S_ will be preserved for the
#     systemd service to use. Setting K3S_URL without explicitly setting
#     a systemd exec command will default the command to "agent", and we
#     enforce that K3S_TOKEN is also set.
#
#   - INSTALL_K3S_SKIP_DOWNLOAD
#     If set to true will not download k3s hash or binary.
#
#   - INSTALL_K3S_FORCE_RESTART
#     If set to true will always restart the K3s service
#
#   - INSTALL_K3S_SYMLINK
#     If set to 'skip' will not create symlinks, 'force' will overwrite,
#     default will symlink if command does not exist in path.
#
#   - INSTALL_K3S_SKIP_ENABLE
#     If set to true will not enable or start k3s service.
#
#   - INSTALL_K3S_SKIP_START
#     If set to true will not start k3s service.
#
#   - INSTALL_K3S_VERSION
#     Version of k3s to download from github. Will attempt to download from the
#     stable channel if not specified.
#
#   - INSTALL_K3S_COMMIT
#     Commit of k3s to download from temporary cloud storage.
#     * (for developer & QA use)
#
#   - INSTALL_K3S_PR
#     PR build of k3s to download from Github Artifacts.
#     * (for developer & QA use)
#
#   - INSTALL_K3S_BIN_DIR
#     Directory to install k3s binary, links, and uninstall script to, or use
#     /usr/local/bin as the default
#
#   - INSTALL_K3S_BIN_DIR_READ_ONLY
#     If set to true will not write files to INSTALL_K3S_BIN_DIR, forces
#     setting INSTALL_K3S_SKIP_DOWNLOAD=true
#
#   - INSTALL_K3S_SYSTEMD_DIR
#     Directory to install systemd service and environment files to, or use
#     /etc/systemd/system as the default
#
#   - INSTALL_K3S_EXEC or script arguments
#     Command with flags to use for launching k3s in the systemd service, if
#     the command is not specified will default to "agent" if K3S_URL is set
#     or "server" if not. The final systemd command resolves to a combination
#     of EXEC and script args ($@).
#
#     The following commands result in the same behavior:
#       curl ... | INSTALL_K3S_EXEC="--disable=traefik" sh -s -
#       curl ... | INSTALL_K3S_EXEC="server --disable=traefik" sh -s -
#       curl ... | INSTALL_K3S_EXEC="server" sh -s - --disable=traefik
#       curl ... | sh -s - server --disable=traefik
#       curl ... | sh -s - --disable=traefik
#
#   - INSTALL_K3S_NAME
#     Name of systemd service to create, will default from the k3s exec command
#     if not specified. If specified the name will be prefixed with 'k3s-'.
#
#   - INSTALL_K3S_TYPE
#     Type of systemd service to create, will default from the k3s exec command
#     if not specified.
#
#   - INSTALL_K3S_SELINUX_WARN
#     If set to true will continue if k3s-selinux policy is not found.
#
#   - INSTALL_K3S_SKIP_SELINUX_RPM
#     If set to true will skip automatic installation of the k3s RPM.
#
#   - INSTALL_K3S_CHANNEL_URL
#     Channel URL for fetching k3s download URL.
#     Defaults to 'https://update.k3s.io/v1-release/channels'.
#
#   - INSTALL_K3S_CHANNEL
#     Channel to use for fetching k3s download URL.
#     Defaults to 'stable'.

GITHUB_URL=${GITHUB_URL:-https://github.com/k3s-io/k3s/releases}
GITHUB_PR_URL=""
STORAGE_URL=https://k3s-ci-builds.s3.amazonaws.com
DOWNLOADER=

# --- helper functions for logs ---
info()
{
    echo '[INFO] ' "$@"
}
warn()
{
    echo '[WARN] ' "$@" >&2
}
fatal()
{
    echo '[ERROR] ' "$@" >&2
    exit 1
}

# --- fatal if no systemd or openrc ---
verify_system() {
    if [ -x /sbin/openrc-run ]; then
        HAS_OPENRC=true
        return
    fi
    if [ -x /bin/systemctl ] || type systemctl > /dev/null 2>&1; then
        HAS_SYSTEMD=true
        return
    fi
    fatal 'Can not find systemd or openrc to use as a process supervisor for k3s'
}

# --- add quotes to command arguments ---
quote() {
    for arg in "$@"; do
        printf '%s\n' "$arg" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/'/"
    done
}

# --- add indentation and trailing slash to quoted args ---
quote_indent() {
    printf ' \\\n'
    for arg in "$@"; do
        printf '\t%s \\\n' "$(quote "$arg")"
    done
}

# --- escape most punctuation characters, except quotes, forward slash, and space ---
escape() {
    printf '%s' "$@" | sed -e 's/\([][!#$%&()*;<=>?\_`{|}]\)/\\\1/g;'
}

# --- escape double quotes ---
escape_dq() {
    printf '%s' "$@" | sed -e 's/"/\\"/g'
}

# --- ensures $K3S_URL is empty or begins with https://, exiting fatally otherwise ---
verify_k3s_url() {
    case "${K3S_URL}" in
        "")
            ;;
        https://*)
            ;;
        *)
            fatal "Only https:// URLs are supported for K3S_URL (have ${K3S_URL})"
            ;;
    esac
}

# --- define needed environment variables ---
setup_env() {
    # --- use command args if passed or create default ---
    case "$1" in
        # --- if we only have flags discover if command should be server or agent ---
        (-*|"")
            if [ -z "${K3S_URL}" ]; then
                CMD_K3S=server
            else
                if [ -z "${K3S_TOKEN}" ] && [ -z "${K3S_TOKEN_FILE}" ]; then
                    fatal "Defaulted k3s exec command to 'agent' because K3S_URL is defined, but K3S_TOKEN or K3S_TOKEN_FILE is not defined."
                fi
                CMD_K3S=agent
            fi
        ;;
        # --- command is provided ---
        (*)
            CMD_K3S=$1
            shift
        ;;
    esac

    verify_k3s_url

    CMD_K3S_EXEC="${CMD_K3S}$(quote_indent "$@")"

    # --- use systemd name if defined or create default ---
    if [ -n "${INSTALL_K3S_NAME}" ]; then
        SYSTEM_NAME=k3s-${INSTALL_K3S_NAME}
    else
        if [ "${CMD_K3S}" = server ]; then
            SYSTEM_NAME=k3s
        else
            SYSTEM_NAME=k3s-${CMD_K3S}
        fi
    fi

    # --- check for invalid characters in system name ---
    valid_chars=$(printf '%s' "${SYSTEM_NAME}" | sed -e 's/[][!#$%&()*;<=>?\_`{|}/[:space:]]/^/g;' )
    if [ "${SYSTEM_NAME}" != "${valid_chars}" ]; then
        invalid_chars=$(printf '%s' "${valid_chars}" | sed -e 's/[^^]/ /g')
        fatal "Invalid characters for system name:
            ${SYSTEM_NAME}
            ${invalid_chars}"
    fi

    # --- use sudo if we are not already root ---
    SUDO=sudo
    if [ $(id -u) -eq 0 ]; then
        SUDO=
    fi

    # --- use systemd type if defined or create default ---
    if [ -n "${INSTALL_K3S_TYPE}" ]; then
        SYSTEMD_TYPE=${INSTALL_K3S_TYPE}
    else
        SYSTEMD_TYPE=notify
    fi

    # --- use binary install directory if defined or create default ---
    if [ -n "${INSTALL_K3S_BIN_DIR}" ]; then
        BIN_DIR=${INSTALL_K3S_BIN_DIR}
    else
        # --- use /usr/local/bin if root can write to it, otherwise use /opt/bin if it exists
        BIN_DIR=/usr/local/bin
        if ! $SUDO sh -c "touch ${BIN_DIR}/k3s-ro-test && rm -rf ${BIN_DIR}/k3s-ro-test"; then
            if [ -d /opt/bin ]; then
                BIN_DIR=/opt/bin
            fi
        fi
    fi

    # --- use systemd directory if defined or create default ---
    if [ -n "${INSTALL_K3S_SYSTEMD_DIR}" ]; then
        SYSTEMD_DIR="${INSTALL_K3S_SYSTEMD_DIR}"
    else
        SYSTEMD_DIR=/etc/systemd/system
    fi

    # --- set related files from system name ---
    SERVICE_K3S=${SYSTEM_NAME}.service
    UNINSTALL_K3S_SH=${UNINSTALL_K3S_SH:-${BIN_DIR}/${SYSTEM_NAME}-uninstall.sh}
    KILLALL_K3S_SH=${KILLALL_K3S_SH:-${BIN_DIR}/k3s-killall.sh}

    # --- use service or environment location depending on systemd/openrc ---
    if [ "${HAS_SYSTEMD}" = true ]; then
        FILE_K3S_SERVICE=${SYSTEMD_DIR}/${SERVICE_K3S}
        FILE_K3S_ENV=${SYSTEMD_DIR}/${SERVICE_K3S}.env
    elif [ "${HAS_OPENRC}" = true ]; then
        $SUDO mkdir -p /etc/rancher/k3s
        FILE_K3S_SERVICE=/etc/init.d/${SYSTEM_NAME}
        FILE_K3S_ENV=/etc/rancher/k3s/${SYSTEM_NAME}.env
    fi

    # --- get hash of config & exec for currently installed k3s ---
    PRE_INSTALL_HASHES=$(get_installed_hashes)

    # --- if bin directory is read only skip download ---
    if [ "${INSTALL_K3S_BIN_DIR_READ_ONLY}" = true ]; then
        INSTALL_K3S_SKIP_DOWNLOAD=true
    fi

    # --- setup channel values
    INSTALL_K3S_CHANNEL_URL=${INSTALL_K3S_CHANNEL_URL:-'https://update.k3s.io/v1-release/channels'}
    INSTALL_K3S_CHANNEL=${INSTALL_K3S_CHANNEL:-'stable'}
}

# --- check if skip download environment variable set ---
can_skip_download_binary() {
    if [ "${INSTALL_K3S_SKIP_DOWNLOAD}" != true ] && [ "${INSTALL_K3S_SKIP_DOWNLOAD}" != binary ]; then
        return 1
    fi
}

can_skip_download_selinux() {
    if [ "${INSTALL_K3S_SKIP_DOWNLOAD}" != true ] && [ "${INSTALL_K3S_SKIP_DOWNLOAD}" != selinux ]; then
        return 1
    fi
}

# --- verify an executable k3s binary is installed ---
verify_k3s_is_executable() {
    if [ ! -x ${BIN_DIR}/k3s ]; then
        fatal "Executable k3s binary not found at ${BIN_DIR}/k3s"
    fi
}

# --- set arch and suffix, fatal if architecture not supported ---
setup_verify_arch() {
    if [ -z "$ARCH" ]; then
        ARCH=$(uname -m)
    fi
    case $ARCH in
        amd64)
            ARCH=amd64
            SUFFIX=
            ;;
        x86_64)
            ARCH=amd64
            SUFFIX=
            ;;
        arm64)
            ARCH=arm64
            SUFFIX=-${ARCH}
            ;;
        s390x)
            ARCH=s390x
            SUFFIX=-${ARCH}
            ;;
        aarch64)
            ARCH=arm64
            SUFFIX=-${ARCH}
            ;;
        arm*)
            ARCH=arm
            SUFFIX=-${ARCH}hf
            ;;
        *)
            fatal "Unsupported architecture $ARCH"
    esac
}

# --- verify existence of network downloader executable ---
verify_downloader() {
    # Return failure if it doesn't exist or is no executable
    [ -x "$(command -v $1)" ] || return 1

    # Set verified executable as our downloader program and return success
    DOWNLOADER=$1
    return 0
}

# --- create temporary directory and cleanup when done ---
setup_tmp() {
    TMP_DIR=$(mktemp -d -t k3s-install.XXXXXXXXXX)
    TMP_HASH=${TMP_DIR}/k3s.hash
    TMP_ZIP=${TMP_DIR}/k3s.zip
    TMP_BIN=${TMP_DIR}/k3s.bin
    cleanup() {
        code=$?
        set +e
        trap - EXIT
        rm -rf ${TMP_DIR}
        exit $code
    }
    trap cleanup INT EXIT
}

# --- use desired k3s version if defined or find version from channel ---
get_release_version() {
    if [ -n "${INSTALL_K3S_PR}" ]; then
        VERSION_K3S="PR ${INSTALL_K3S_PR}"
        get_pr_artifact_url
    elif [ -n "${INSTALL_K3S_COMMIT}" ]; then
        VERSION_K3S="commit ${INSTALL_K3S_COMMIT}"
    elif [ -n "${INSTALL_K3S_VERSION}" ]; then
        VERSION_K3S=${INSTALL_K3S_VERSION}
    else
        info "Finding release for channel ${INSTALL_K3S_CHANNEL}"
        version_url="${INSTALL_K3S_CHANNEL_URL}/${INSTALL_K3S_CHANNEL}"
        case $DOWNLOADER in
            curl)
                VERSION_K3S=$(curl -w '%{url_effective}' -L -s -S ${version_url} -o /dev/null | sed -e 's|.*/||')
                ;;
            wget)
                VERSION_K3S=$(wget -SqO /dev/null ${version_url} 2>&1 | grep -i Location | sed -e 's|.*/||')
                ;;
            *)
                fatal "Incorrect downloader executable '$DOWNLOADER'"
                ;;
        esac
    fi
    info "Using ${VERSION_K3S} as release"
}

# --- get k3s-selinux version ---
get_k3s_selinux_version() {
    available_version="k3s-selinux-1.2-2.${rpm_target}.noarch.rpm"
    info "Finding available k3s-selinux versions"

    # run verify_downloader in case it binary installation was skipped
    verify_downloader curl || verify_downloader wget || fatal 'Can not find curl or wget for downloading files'

    case $DOWNLOADER in
        curl)
            DOWNLOADER_OPTS="-s"
            ;;
        wget)
            DOWNLOADER_OPTS="-q -O -"
            ;;
        *)
            fatal "Incorrect downloader executable '$DOWNLOADER'"
            ;;
    esac
    for i in {1..3}; do
        set +e
        if [ "${rpm_channel}" = "testing" ]; then
            version=$(timeout 5 ${DOWNLOADER} ${DOWNLOADER_OPTS} https://api.github.com/repos/k3s-io/k3s-selinux/releases | grep browser_download_url | awk '{ print $2 }' | grep -oE "[^\/]+${rpm_target}\.noarch\.rpm" | head -n 1)
        else
            version=$(timeout 5 ${DOWNLOADER} ${DOWNLOADER_OPTS} https://api.github.com/repos/k3s-io/k3s-selinux/releases/latest | grep browser_download_url | awk '{ print $2 }' | grep -oE "[^\/]+${rpm_target}\.noarch\.rpm")
        fi
        set -e
        if [ "${version}" != "" ]; then
            break
        fi
        sleep 1
    done
    if [ "${version}" == "" ]; then
        warn "Failed to get available versions of k3s-selinux..defaulting to ${available_version}"
        return
    fi
    available_version=${version}
}

# --- download from github url ---
download() {
    [ $# -eq 2 ] || fatal 'download needs exactly 2 arguments'

    # Disable exit-on-error so we can do custom error messages on failure
    set +e

    # Default to a failure status
    status=1

    case $DOWNLOADER in
        curl)
            curl -o $1 -sfL $2
            status=$?
            ;;
        wget)
            wget -qO $1 $2
            status=$?
            ;;
        *)
            # Enable exit-on-error for fatal to execute
            set -e
            fatal "Incorrect executable '$DOWNLOADER'"
            ;;
    esac

    # Re-enable exit-on-error
    set -e

    # Abort if download command failed
    [ $status -eq 0 ] || fatal 'Download failed'
}

# --- download hash from github url ---
download_hash() {
    if [ -n "${INSTALL_K3S_PR}" ]; then
        info "Downloading hash ${GITHUB_PR_URL}"
        curl -s -o ${TMP_ZIP} -H "Authorization: Bearer $GITHUB_TOKEN" -L ${GITHUB_PR_URL}
        unzip -p ${TMP_ZIP} k3s.sha256sum > ${TMP_HASH}
    else
        if [ -n "${INSTALL_K3S_COMMIT}" ]; then
            HASH_URL=${STORAGE_URL}/k3s${SUFFIX}-${INSTALL_K3S_COMMIT}.sha256sum
        else
            HASH_URL=${GITHUB_URL}/download/${VERSION_K3S}/sha256sum-${ARCH}.txt
        fi
        info "Downloading hash ${HASH_URL}"
        download ${TMP_HASH} ${HASH_URL}
    fi
    HASH_EXPECTED=$(grep " k3s${SUFFIX}$" ${TMP_HASH})
    HASH_EXPECTED=${HASH_EXPECTED%%[[:blank:]]*}
}

# --- check hash against installed version ---
installed_hash_matches() {
    if [ -x ${BIN_DIR}/k3s ]; then
        HASH_INSTALLED=$(sha256sum ${BIN_DIR}/k3s)
        HASH_INSTALLED=${HASH_INSTALLED%%[[:blank:]]*}
        if [ "${HASH_EXPECTED}" = "${HASH_INSTALLED}" ]; then
            return
        fi
    fi
    return 1
}

# Use the GitHub API to identify the artifact associated with a given PR
get_pr_artifact_url() {
    github_api_url=https://api.github.com/repos/k3s-io/k3s

    # Check if jq is installed
    if ! [ -x "$(command -v jq)" ]; then
        fatal "Installing PR builds requires jq"
    fi

    # Check if unzip is installed
    if ! [ -x "$(command -v unzip)" ]; then
        fatal "Installing PR builds requires unzip"
    fi

    if [ -z "${GITHUB_TOKEN}" ]; then
        fatal "Installing PR builds requires GITHUB_TOKEN with k3s-io/k3s repo permissions"
    fi

    # GET request to the GitHub API to retrieve the latest commit SHA from the pull request
    set +e
    commit_id=$(curl -f -s -H "Authorization: Bearer ${GITHUB_TOKEN}" "${github_api_url}/pulls/${INSTALL_K3S_PR}" | jq -r '.head.sha')
    set -e

    if [ -z "${commit_id}" ]; then
        fatal "Installing PR builds requires GITHUB_TOKEN with k3s-io/k3s repo permissions"
    fi

    # GET request to the GitHub API to retrieve the Build workflow associated with the commit
    run_id=$(curl -s -H "Authorization: Bearer ${GITHUB_TOKEN}" "${github_api_url}/commits/${commit_id}/check-runs?check_name=build%20%2F%20Build" | jq -r '[.check_runs | sort_by(.id) | .[].details_url | split("/")[7]] | last')

    # Extract the artifact ID for the "k3s" artifact
    GITHUB_PR_URL=$(curl -s -H "Authorization: Bearer ${GITHUB_TOKEN}" "${github_api_url}/actions/runs/${run_id}/artifacts" | jq -r '.artifacts[] | select(.name == "k3s") | .archive_download_url')
}

# --- download binary from github url ---
download_binary() {
    if [ -n "${INSTALL_K3S_PR}" ]; then
        # Since Binary and Hash are zipped together, check if TMP_ZIP already exists
        if ! [ -f ${TMP_ZIP} ]; then
            info "Downloading K3s artifact ${GITHUB_PR_URL}"
            curl -s -f -o ${TMP_ZIP} -H "Authorization: Bearer $GITHUB_TOKEN" -L ${GITHUB_PR_URL}
        fi
        # extract k3s binary from zip
        unzip -p ${TMP_ZIP} k3s > ${TMP_BIN}
        return
    elif [ -n "${INSTALL_K3S_COMMIT}" ]; then
        BIN_URL=${STORAGE_URL}/k3s${SUFFIX}-${INSTALL_K3S_COMMIT}
    else
        BIN_URL=${GITHUB_URL}/download/${VERSION_K3S}/k3s${SUFFIX}
    fi
    info "Downloading binary ${BIN_URL}"
    download ${TMP_BIN} ${BIN_URL}
}

# --- verify downloaded binary hash ---
verify_binary() {
    info "Verifying binary download"
    HASH_BIN=$(sha256sum ${TMP_BIN})
    HASH_BIN=${HASH_BIN%%[[:blank:]]*}
    if [ "${HASH_EXPECTED}" != "${HASH_BIN}" ]; then
        fatal "Download sha256 does not match ${HASH_EXPECTED}, got ${HASH_BIN}"
    fi
}

# --- setup permissions and move binary to system directory ---
setup_binary() {
    chmod 755 ${TMP_BIN}
    info "Installing k3s to ${BIN_DIR}/k3s"
    $SUDO chown root:root ${TMP_BIN}
    $SUDO mv -f ${TMP_BIN} ${BIN_DIR}/k3s
}

# --- setup selinux policy ---
setup_selinux() {
    case ${INSTALL_K3S_CHANNEL} in
        *testing)
            rpm_channel=testing
            ;;
        *latest)
            rpm_channel=latest
            ;;
        *)
            rpm_channel=stable
            ;;
    esac

    rpm_site="rpm.rancher.io"
    if [ "${rpm_channel}" = "testing" ]; then
        rpm_site="rpm-testing.rancher.io"
    fi

    [ -r /etc/os-release ] && . /etc/os-release
    if [ `expr "${ID_LIKE}" : ".*suse.*"` != 0 ]; then
        rpm_target=sle
        rpm_site_infix=microos
        package_installer=zypper
        if [ "${ID_LIKE:-}" = suse ] && ( [ "${VARIANT_ID:-}" = sle-micro ] || [ "${ID:-}" = sle-micro ] ); then
            rpm_target=sle
            rpm_site_infix=slemicro
            package_installer=zypper
        fi
    elif [ "${ID_LIKE:-}" = coreos ] || [ "${VARIANT_ID:-}" = coreos ] || [ "${VARIANT_ID:-}" = "iot" ] || \
         { { [ "${ID:-}" = fedora ] || [ "${ID_LIKE:-}" = fedora ]; } && [ -n "${OSTREE_VERSION:-}" ]; }; then
        rpm_target=coreos
        rpm_site_infix=coreos
        package_installer=rpm-ostree
    elif [ "${VERSION_ID%%.*}" = "7" ] || ( [ "${ID:-}" = amzn ] && [ "${VERSION_ID%%.*}" = "2" ] ); then
        rpm_target=el7
        rpm_site_infix=centos/7
        package_installer=yum
    elif [ "${VERSION_ID%%.*}" = "8" ] || [ "${VERSION_ID%%.*}" = "V10" ] || [ "${VERSION_ID%%.*}" -gt "36" ]; then
        rpm_target=el8
        rpm_site_infix=centos/8
        package_installer=yum
    else
        rpm_target=el9
        rpm_site_infix=centos/9
        package_installer=yum
    fi

    if [ "${package_installer}" = "rpm-ostree" ] && [ -x /bin/yum ]; then
        package_installer=yum
    fi

    if [ "${package_installer}" = "yum" ] && [ -x /usr/bin/dnf ]; then
        package_installer=dnf
    fi

    policy_hint="please install:
    ${package_installer} install -y container-selinux
    ${package_installer} install -y https://${rpm_site}/k3s/${rpm_channel}/common/${rpm_site_infix}/noarch/${available_version}
"

    if [ "$INSTALL_K3S_SKIP_SELINUX_RPM" = true ] || can_skip_download_selinux || [ ! -d /usr/share/selinux ]; then
        info "Skipping installation of SELinux RPM"
        return
    fi

    get_k3s_selinux_version
    install_selinux_rpm ${rpm_site} ${rpm_channel} ${rpm_target} ${rpm_site_infix}

    policy_error=fatal
    if [ "$INSTALL_K3S_SELINUX_WARN" = true ] || [ "${ID_LIKE:-}" = coreos ] ||
       [ "${VARIANT_ID:-}" = coreos ] || [ "${VARIANT_ID:-}" = iot ]; then
        policy_error=warn
    fi

    if ! $SUDO chcon -u system_u -r object_r -t container_runtime_exec_t ${BIN_DIR}/k3s >/dev/null 2>&1; then
        if $SUDO grep '^\s*SELINUX=enforcing' /etc/selinux/config >/dev/null 2>&1; then
            $policy_error "Failed to apply container_runtime_exec_t to ${BIN_DIR}/k3s, ${policy_hint}"
        fi
    elif [ ! -f /usr/share/selinux/packages/k3s.pp ]; then
        if [ -x /usr/sbin/transactional-update ] || [ "${ID_LIKE:-}" = coreos ] || \
            { { [ "${ID:-}" = fedora ] || [ "${ID_LIKE:-}" = fedora ]; } && [ -n "${OSTREE_VERSION:-}" ]; }; then
            warn "Please reboot your machine to activate the changes and avoid data loss."
        else
            $policy_error "Failed to find the k3s-selinux policy, ${policy_hint}"
        fi
    fi
}

install_selinux_rpm() {
    if [ -r /etc/redhat-release ] || [ -r /etc/centos-release ] || [ -r /etc/oracle-release ] ||
       [ -r /etc/fedora-release ] || [ -r /etc/system-release ] || [ "${ID_LIKE%%[ ]*}" = "suse" ]; then
        repodir=/etc/yum.repos.d
        if [ -d /etc/zypp/repos.d ]; then
            repodir=/etc/zypp/repos.d
        fi
        set +o noglob
        $SUDO rm -f ${repodir}/rancher-k3s-common*.repo
        set -o noglob
        if [ -r /etc/redhat-release ] && [ "${3}" = "el7" ]; then
            $SUDO yum install -y yum-utils
            $SUDO yum-config-manager --enable rhel-7-server-extras-rpms
        fi
        $SUDO tee ${repodir}/rancher-k3s-common.repo >/dev/null << EOF
[rancher-k3s-common-${2}]
name=Rancher K3s Common (${2})
baseurl=https://${1}/k3s/${2}/common/${4}/noarch
enabled=1
gpgcheck=1
repo_gpgcheck=0
gpgkey=https://${1}/public.key
EOF
        case ${3} in
        sle)
            rpm_installer="zypper --gpg-auto-import-keys"
            if [ "${TRANSACTIONAL_UPDATE=false}" != "true" ] && [ -x /usr/sbin/transactional-update ]; then
                transactional_update_run="transactional-update --no-selfupdate -d run"
                rpm_installer="transactional-update --no-selfupdate -d run ${rpm_installer}"
                : "${INSTALL_K3S_SKIP_START:=true}"
            fi
            # create the /var/lib/rpm-state in SLE systems to fix the prein selinux macro
            $SUDO ${transactional_update_run} mkdir -p /var/lib/rpm-state
            ;;
        coreos)
            rpm_installer="rpm-ostree --idempotent"
            # rpm_install_extra_args="--apply-live"
            : "${INSTALL_K3S_SKIP_START:=true}"
            ;;
        *)
            rpm_installer="yum"
            ;;
        esac
        if [ "${rpm_installer}" = "yum" ] && [ -x /usr/bin/dnf ]; then
            rpm_installer=dnf
        fi
        if rpm -q --quiet k3s-selinux; then
            # remove k3s-selinux module before upgrade to allow container-selinux to upgrade safely
            if check_available_upgrades container-selinux ${3} && check_available_upgrades k3s-selinux ${3}; then
                MODULE_PRIORITY=$($SUDO semodule --list=full | grep k3s | cut -f1 -d" ")
                if [ -n "${MODULE_PRIORITY}" ]; then
                    $SUDO semodule -X $MODULE_PRIORITY -r k3s || true
                fi
            fi
        fi
        # shellcheck disable=SC2086
        $SUDO ${rpm_installer} install -y "k3s-selinux"
    fi
    return
}

check_available_upgrades() {
    set +e
    case ${2} in
        sle)
            available_upgrades=$($SUDO zypper -q -t -s 11 se -s -u --type package $1 | tail -n 1 | grep -v "No matching" | awk '{print $3}')
            ;;
        coreos)
            # currently rpm-ostree does not support search functionality https://github.com/coreos/rpm-ostree/issues/1877
            ;;
        *)
            available_upgrades=$($SUDO yum -q --refresh list $1 --upgrades | tail -n 1 | awk '{print $2}')
            ;;
    esac
    set -e
    if [ -n "${available_upgrades}" ]; then
        return 0
    fi
    return 1
}

# --- download and verify k3s ---
download_and_verify() {
    if can_skip_download_binary; then
        info 'Skipping k3s download and verify'
        verify_k3s_is_executable
        return
    fi

    setup_verify_arch
    verify_downloader curl || verify_downloader wget || fatal 'Can not find curl or wget for downloading files'
    setup_tmp
    get_release_version
    download_hash

    if installed_hash_matches; then
        info 'Skipping binary downloaded, installed k3s matches hash'
        return
    fi

    download_binary
    verify_binary
    setup_binary
}

# --- add additional utility links ---
create_symlinks() {
    [ "${INSTALL_K3S_BIN_DIR_READ_ONLY}" = true ] && return
    [ "${INSTALL_K3S_SYMLINK}" = skip ] && return

    for cmd in kubectl crictl ctr; do
        if [ ! -e ${BIN_DIR}/${cmd} ] || [ "${INSTALL_K3S_SYMLINK}" = force ]; then
            which_cmd=$(command -v ${cmd} 2>/dev/null || true)
            if [ -z "${which_cmd}" ] || [ "${INSTALL_K3S_SYMLINK}" = force ]; then
                info "Creating ${BIN_DIR}/${cmd} symlink to k3s"
                $SUDO ln -sf k3s ${BIN_DIR}/${cmd}
            else
                info "Skipping ${BIN_DIR}/${cmd} symlink to k3s, command exists in PATH at ${which_cmd}"
            fi
        else
            info "Skipping ${BIN_DIR}/${cmd} symlink to k3s, already exists"
        fi
    done
}

# --- create killall script ---
create_killall() {
    [ "${INSTALL_K3S_BIN_DIR_READ_ONLY}" = true ] && return
    info "Creating killall script ${KILLALL_K3S_SH}"
    $SUDO tee ${KILLALL_K3S_SH} >/dev/null << \EOF
#!/bin/sh
[ $(id -u) -eq 0 ] || exec sudo --preserve-env=K3S_DATA_DIR $0 $@

K3S_DATA_DIR=${K3S_DATA_DIR:-/var/lib/rancher/k3s}

for bin in ${K3S_DATA_DIR}/data/**/bin/; do
    [ -d $bin ] && export PATH=$PATH:$bin:$bin/aux
done

set -x

for service in /etc/systemd/system/k3s*.service; do
    [ -s $service ] && systemctl stop $(basename $service)
done

for service in /etc/init.d/k3s*; do
    [ -x $service ] && $service stop
done

pschildren() {
    ps -e -o ppid= -o pid= | \
    sed -e 's/^\s*//g; s/\s\s*/\t/g;' | \
    grep -w "^$1" | \
    cut -f2
}

pstree() {
    for pid in $@; do
        echo $pid
        for child in $(pschildren $pid); do
            pstree $child
        done
    done
}

killtree() {
    kill -9 $(
        { set +x; } 2>/dev/null;
        pstree $@;
        set -x;
    ) 2>/dev/null
}

remove_interfaces() {
    # Delete network interface(s) that match 'master cni0'
    ip link show 2>/dev/null | grep 'master cni0' | while read ignore iface ignore; do
        iface=${iface%%@*}
        [ -z "$iface" ] || ip link delete $iface
    done

    # Delete cni related interfaces
    ip link delete cni0
    ip link delete flannel.1
    ip link delete flannel-v6.1
    ip link delete kube-ipvs0
    ip link delete flannel-wg
    ip link delete flannel-wg-v6

    # Restart tailscale
    if [ -n "$(command -v tailscale)" ]; then
        tailscale set --advertise-routes=
    fi
}

getshims() {
    ps -e -o pid= -o args= | sed -e 's/^ *//; s/\s\s*/\t/;' | grep -w "${K3S_DATA_DIR}"'/data/[^/]*/bin/containerd-shim' | cut -f1
}

killtree $({ set +x; } 2>/dev/null; getshims; set -x)

do_unmount_and_remove() {
    set +x
    while read -r _ path _; do
        case "$path" in $1*) echo "$path" ;; esac
    done < /proc/self/mounts | sort -r | xargs -r -t -n 1 sh -c 'umount -f "$0" && rm -rf "$0"'
    set -x
}

do_unmount_and_remove '/run/k3s'
do_unmount_and_remove '/var/lib/kubelet/pods'
do_unmount_and_remove '/var/lib/kubelet/plugins'
do_unmount_and_remove '/run/netns/cni-'

# Remove CNI namespaces
ip netns show 2>/dev/null | grep cni- | xargs -r -t -n 1 ip netns delete

remove_interfaces

rm -rf /var/lib/cni/
iptables-save | grep -v KUBE- | grep -v CNI- | grep -iv flannel | iptables-restore
ip6tables-save | grep -v KUBE- | grep -v CNI- | grep -iv flannel | ip6tables-restore
EOF
    $SUDO chmod 755 ${KILLALL_K3S_SH}
    $SUDO chown root:root ${KILLALL_K3S_SH}
}

# --- create uninstall script ---
create_uninstall() {
    [ "${INSTALL_K3S_BIN_DIR_READ_ONLY}" = true ] && return
    info "Creating uninstall script ${UNINSTALL_K3S_SH}"
    $SUDO tee ${UNINSTALL_K3S_SH} >/dev/null << EOF
#!/bin/sh
set -x
[ \$(id -u) -eq 0 ] || exec sudo --preserve-env=K3S_DATA_DIR \$0 \$@

K3S_DATA_DIR=\${K3S_DATA_DIR:-/var/lib/rancher/k3s}

${KILLALL_K3S_SH}

if command -v systemctl; then
    systemctl disable ${SYSTEM_NAME}
    systemctl reset-failed ${SYSTEM_NAME}
    systemctl daemon-reload
fi
if command -v rc-update; then
    rc-update delete ${SYSTEM_NAME} default
fi

rm -f ${FILE_K3S_SERVICE}
rm -f ${FILE_K3S_ENV}

remove_uninstall() {
    rm -f ${UNINSTALL_K3S_SH}
}
trap remove_uninstall EXIT

if (ls ${SYSTEMD_DIR}/k3s*.service || ls /etc/init.d/k3s*) >/dev/null 2>&1; then
    set +x; echo 'Additional k3s services installed, skipping uninstall of k3s'; set -x
    exit
fi

for cmd in kubectl crictl ctr; do
    if [ -L ${BIN_DIR}/\$cmd ]; then
        rm -f ${BIN_DIR}/\$cmd
    fi
done

clean_mounted_directory() {
    if ! grep -q " \$1" /proc/mounts; then
        rm -rf "\$1"
        return 0
    fi

    for path in "\$1"/*; do
        if [ -d "\$path" ]; then
            if grep -q " \$path" /proc/mounts; then
                clean_mounted_directory "\$path"
            else
                rm -rf "\$path"
            fi
        else
            rm "\$path"
        fi
    done
}

rm -rf /etc/rancher/k3s
rm -rf /run/k3s
rm -rf /run/flannel
clean_mounted_directory \${K3S_DATA_DIR}
rm -rf /var/lib/kubelet
rm -f ${BIN_DIR}/k3s
rm -f ${KILLALL_K3S_SH}

if type yum >/dev/null 2>&1; then
    yum remove -y k3s-selinux
    rm -f /etc/yum.repos.d/rancher-k3s-common*.repo
elif type rpm-ostree >/dev/null 2>&1; then
    rpm-ostree uninstall k3s-selinux
    rm -f /etc/yum.repos.d/rancher-k3s-common*.repo
elif type zypper >/dev/null 2>&1; then
    uninstall_cmd="zypper remove -y k3s-selinux"
    if [ "\${TRANSACTIONAL_UPDATE=false}" != "true" ] && [ -x /usr/sbin/transactional-update ]; then
        uninstall_cmd="transactional-update --no-selfupdate -d run \$uninstall_cmd"
    fi
    $SUDO \$uninstall_cmd
    rm -f /etc/zypp/repos.d/rancher-k3s-common*.repo
fi
EOF
    $SUDO chmod 755 ${UNINSTALL_K3S_SH}
    $SUDO chown root:root ${UNINSTALL_K3S_SH}
}

# --- disable current service if loaded --
systemd_disable() {
    $SUDO systemctl disable ${SYSTEM_NAME} >/dev/null 2>&1 || true
    $SUDO rm -f /etc/systemd/system/${SERVICE_K3S} || true
    $SUDO rm -f /etc/systemd/system/${SERVICE_K3S}.env || true
}

# --- capture current env and create file containing k3s_ variables ---
create_env_file() {
    info "env: Creating environment file ${FILE_K3S_ENV}"
    $SUDO touch ${FILE_K3S_ENV}
    $SUDO chmod 0600 ${FILE_K3S_ENV}
    sh -c export | while read x v; do echo $v; done | grep -E '^(K3S|CONTAINERD)_' | $SUDO tee ${FILE_K3S_ENV} >/dev/null
    sh -c export | while read x v; do echo $v; done | grep -Ei '^(NO|HTTP|HTTPS)_PROXY' | $SUDO tee -a ${FILE_K3S_ENV} >/dev/null
}

# --- write systemd service file ---
create_systemd_service_file() {
    info "systemd: Creating service file ${FILE_K3S_SERVICE}"
    $SUDO tee ${FILE_K3S_SERVICE} >/dev/null << EOF
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=${SYSTEMD_TYPE}
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-${FILE_K3S_ENV}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2>/dev/null'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=${BIN_DIR}/k3s \\
    ${CMD_K3S_EXEC}

EOF
}

# --- write openrc service file ---
create_openrc_service_file() {
    LOG_FILE=/var/log/${SYSTEM_NAME}.log

    info "openrc: Creating service file ${FILE_K3S_SERVICE}"
    $SUDO tee ${FILE_K3S_SERVICE} >/dev/null << EOF
#!/sbin/openrc-run

depend() {
    after network-online
    want cgroups
}

start_pre() {
    rm -f /tmp/k3s.*
}

supervisor=supervise-daemon
name=${SYSTEM_NAME}
command="${BIN_DIR}/k3s"
command_args="$(escape_dq "${CMD_K3S_EXEC}")
    >>${LOG_FILE} 2>&1"

output_log=${LOG_FILE}
error_log=${LOG_FILE}

pidfile="/var/run/${SYSTEM_NAME}.pid"
respawn_delay=5
respawn_max=0

set -o allexport
if [ -f /etc/environment ]; then . /etc/environment; fi
if [ -f ${FILE_K3S_ENV} ]; then . ${FILE_K3S_ENV}; fi
set +o allexport
EOF
    $SUDO chmod 0755 ${FILE_K3S_SERVICE}

    $SUDO tee /etc/logrotate.d/${SYSTEM_NAME} >/dev/null << EOF
${LOG_FILE} {
    missingok
    notifempty
    copytruncate
}
EOF
}

# --- write systemd or openrc service file ---
create_service_file() {
    [ "${HAS_SYSTEMD}" = true ] && create_systemd_service_file && restore_systemd_service_file_context
    [ "${HAS_OPENRC}" = true ] && create_openrc_service_file
    return 0
}

restore_systemd_service_file_context() {
    $SUDO restorecon -R -i ${FILE_K3S_SERVICE} 2>/dev/null || true
    $SUDO restorecon -R -i ${FILE_K3S_ENV} 2>/dev/null || true
}

# --- get hashes of the current k3s bin and service files
get_installed_hashes() {
    $SUDO sha256sum ${BIN_DIR}/k3s ${FILE_K3S_SERVICE} ${FILE_K3S_ENV} 2>&1 || true
}

# --- enable and start systemd service ---
systemd_enable() {
    info "systemd: Enabling ${SYSTEM_NAME} unit"
    $SUDO systemctl enable ${FILE_K3S_SERVICE} >/dev/null
    $SUDO systemctl daemon-reload >/dev/null
}

systemd_start() {
    info "systemd: Starting ${SYSTEM_NAME}"
    $SUDO systemctl restart ${SYSTEM_NAME}
}

# --- enable and start openrc service ---
openrc_enable() {
    info "openrc: Enabling ${SYSTEM_NAME} service for default runlevel"
    $SUDO rc-update add ${SYSTEM_NAME} default >/dev/null
}

openrc_start() {
    info "openrc: Starting ${SYSTEM_NAME}"
    $SUDO ${FILE_K3S_SERVICE} restart
}

has_working_xtables() {
    if $SUDO sh -c "command -v \"$1-save\"" 1> /dev/null && $SUDO sh -c "command -v \"$1-restore\"" 1> /dev/null; then
        if $SUDO $1-save 2>/dev/null | grep -q '^-A CNI-HOSTPORT-MASQ -j MASQUERADE$'; then
            warn "Host $1-save/$1-restore tools are incompatible with existing rules"
        else
            return 0
        fi
    else
        info "Host $1-save/$1-restore tools not found"
    fi
    return 1
}

# --- startup systemd or openrc service ---
service_enable_and_start() {
    if [ -f "/proc/cgroups" ] && [ "$(grep memory /proc/cgroups | while read -r n n n enabled; do echo $enabled; done)" -eq 0 ];
    then
        info 'Failed to find memory cgroup, you may need to add "cgroup_memory=1 cgroup_enable=memory" to your linux cmdline (/boot/cmdline.txt on a Raspberry Pi)'
    fi

    [ "${INSTALL_K3S_SKIP_ENABLE}" = true ] && return

    [ "${HAS_SYSTEMD}" = true ] && systemd_enable
    [ "${HAS_OPENRC}" = true ] && openrc_enable

    [ "${INSTALL_K3S_SKIP_START}" = true ] && return

    POST_INSTALL_HASHES=$(get_installed_hashes)
    if [ "${PRE_INSTALL_HASHES}" = "${POST_INSTALL_HASHES}" ] && [ "${INSTALL_K3S_FORCE_RESTART}" != true ]; then
        info 'No change detected so skipping service start'
        return
    fi

    for XTABLES in iptables ip6tables; do
        if has_working_xtables ${XTABLES}; then
            $SUDO ${XTABLES}-save 2>/dev/null | grep -v KUBE- | grep -iv flannel | $SUDO ${XTABLES}-restore
        fi
    done

    [ "${HAS_SYSTEMD}" = true ] && systemd_start
    [ "${HAS_OPENRC}" = true ] && openrc_start
    return 0
}

# --- re-evaluate args to include env command ---
eval set -- $(escape "${INSTALL_K3S_EXEC}") $(quote "$@")

# --- run the install process --
{
    verify_system
    setup_env "$@"
    download_and_verify
    setup_selinux
    create_symlinks
    create_killall
    create_uninstall
    systemd_disable
    create_env_file
    create_service_file
    service_enable_and_start
}
K3S_INSTALL_EOF
Install K3S
export INSTALL_K3S_EXEC="--system-default-registry registry.cn-hangzhou.aliyuncs.com --write-kubeconfig $HOME/.kube/config --disable=traefik --embedded-registry --service-node-port-range=443-32767"
# Use the mirror in mainland China
export INSTALL_K3S_MIRROR=cn
# Force-overwrite existing symlinks (kubectl, crictl, ctr)
export INSTALL_K3S_SYMLINK=force
# Mirror URL
export INSTALL_K3S_MIRROR_URL=${INSTALL_K3S_MIRROR_URL:-'rancher-mirror.rancher.cn'}
# Skip the download and use the k3s binary already on disk (offline install)
export INSTALL_K3S_SKIP_DOWNLOAD=true
bash install.sh
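Before moving on, it can help to confirm that the node actually came up. The helper below is a minimal sketch and not part of the original steps: `node_ready` is a hypothetical name, and it assumes the usual `kubectl get nodes --no-headers` layout, where the second column is the node status.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin.
# Succeeds only if at least one node is listed and every node is "Ready".
node_ready() {
    awk '
        { total++ }
        $2 == "Ready" { ready++ }
        END { exit (total > 0 && ready == total) ? 0 : 1 }
    '
}
```

One way to use it: `systemctl is-active k3s && kubectl get nodes --no-headers | node_ready && echo "node is Ready"`.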
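Configure registry mirrors
After installation, registry mirrors still need to be configured; otherwise image pull errors such as ErrImagePull are very likely during use.
cat >/etc/rancher/k3s/registries.yaml <<EOF
mirrors:
  gcr.io:
  quay.io:
  ghcr.io:
  registry.k8s.io:
  docker.io:
    endpoint:
      - "https://docker.1panel.live"
      - "https://docker.1ms.run"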
EOF
systemctl restart k3s
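Public mirror endpoints come and go, so it may be worth checking that the ones in registries.yaml still respond. This is a rough sketch under assumptions: `list_endpoints` and `check_endpoints` are hypothetical helper names, `grep -o` and `curl` are assumed to be installed, and a registry is assumed to answer on its `/v2/` path.

```shell
#!/bin/sh
# Hypothetical helper: print every endpoint URL found in a registries.yaml.
list_endpoints() {
    grep -oE 'https?://[^"[:space:]]+' "$1"
}

# Hypothetical helper: print "<http code> <url>" for each endpoint.
# A registry usually answers /v2/ with 200 or 401; both mean it is reachable.
# 000 means the request failed or timed out.
check_endpoints() {
    for url in $(list_endpoints "$1"); do
        code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$url/v2/") || code=000
        echo "$code $url"
    done
}
```

For example: `check_endpoints /etc/rancher/k3s/registries.yaml`.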
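Configure K3S to use the GPU
A fresh K3S install cannot use the GPU, because its default container runtime has no way to access GPU devices.
Most guides online cover changing the Docker runtime or a full Kubernetes setup. After some trial and error, these are the steps that worked for me:
Edit the containerd config so that K3S uses the NVIDIA container runtime by default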
nvidia-ctk runtime configure --runtime=containerd --set-as-default --config /var/lib/rancher/k3s/agent/etc/containerd/config.toml
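Persist the configuration
K3S regenerates config.toml on every restart, so the modified config has to be saved as a config.toml.tmpl template for K3S to load it automatically.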
cp /var/lib/rancher/k3s/agent/etc/containerd/config.toml /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
Restart to apply the configuration
systemctl restart k3s
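A quick way to see whether the change survived the restart is to look for the default runtime setting in the containerd config. The sketch below is an assumption-laden check, not an official tool: `nvidia_is_default` is a hypothetical name, and it assumes that `nvidia-ctk` with `--set-as-default` writes a `default_runtime_name = "nvidia"` line.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a containerd config file sets "nvidia"
# as the default runtime. The key name is an assumption about what
# nvidia-ctk writes when --set-as-default is used.
nvidia_is_default() {
    grep -Eq '^[[:space:]]*default_runtime_name[[:space:]]*=[[:space:]]*"nvidia"' "$1"
}
```

For example: `nvidia_is_default /var/lib/rancher/k3s/agent/etc/containerd/config.toml && echo "nvidia runtime is the default"`. If the check fails after a restart, the config.toml.tmpl copy is the first thing to inspect.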
Install the NVIDIA device plugin
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.5/nvidia-device-plugin.yml
Then check that it was installed successfully
kubectl get pods -n kube-system | grep nvidia-device-plugin
On success, the output should look similar to this
nvidia-device-plugin-daemonset-mbptx 1/1 Running 0 78m
Next, check the device plugin logs for errors:
kubectl logs -n kube-system nvidia-device-plugin-daemonset-mbptx
Lines like these mean the device was registered successfully
I0220 11:14:12.852130 1 main.go:356] Retrieving plugins.
I0220 11:14:12.957036 1 server.go:195] Starting GRPC server for 'nvidia.com/gpu'
I0220 11:14:12.995786 1 server.go:139] Starting to serve 'nvidia.com/gpu' on /var/lib/kubelet/device-plugins/nvidia-gpu.sock
I0220 11:14:13.030801 1 server.go:146] Registered device plugin for 'nvidia.com/gpu' with Kubelet
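Besides the plugin log, the node itself should now advertise `nvidia.com/gpu` as an allocatable resource. The following is a small sketch for counting it: `sum_gpus` is a hypothetical helper, and it assumes the two-column `custom-columns` output shown in the usage line, where nodes without a GPU show `<none>`.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME GPU" rows (header line included) on stdin
# and print the total number of allocatable GPUs across all nodes.
# Rows whose GPU column is not a plain number (e.g. "<none>") are ignored.
sum_gpus() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { sum += $2 } END { print sum + 0 }'
}
```

For example: `kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' | sum_gpus`. On the single-GPU machine used here, the expected total would be 1.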
Test GPU access from K3S
Finally, verify that a pod running on K3S can use the GPU
cat > gpu-test.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-smi-test
spec:
  restartPolicy: OnFailure
  containers:
    - name: nvidia-smi
      image: nvidia/cuda:12.1.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
kubectl create -f gpu-test.yaml
kubectl logs pod/nvidia-smi-test
The pod log should show that the GPU was detected:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03 Driver Version: 560.35.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla V100-PCIE-32GB Off | 00000000:00:06.0 Off | 0 |
| N/A 41C P0 39W / 250W | 310MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------------------------------------------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
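Running `kubectl logs` immediately after `kubectl create` can fail while the image is still being pulled. A small polling helper is one way around that. This is only a sketch: `retry` is a hypothetical function, and the attempt count and delay in the usage line are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been used,
# sleeping DELAY seconds between tries. Returns 0 on success, 1 otherwise.
retry() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        "$@" && return 0
        attempts=$((attempts - 1))
        [ "$attempts" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

For example: `retry 30 5 sh -c '[ "$(kubectl get pod nvidia-smi-test -o jsonpath={.status.phase})" = Succeeded ]' && kubectl logs pod/nvidia-smi-test`. Once the output has been checked, `kubectl delete -f gpu-test.yaml` removes the test pod and frees the GPU.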
Configuration is now complete.