news 2026/6/10 16:16:42

AWS EKS部署Prometheus和Grafana

作者头像

张小明

前端开发工程师

1.2k 24
文章封面图
AWS EKS部署Prometheus和Grafana

、创建Prometheus工作区

1.创建工作区

为了可以把Prometheus数据写入到AWS managed Prometheus,需要先在AWS Prometheus控制台中创建工作区

image

2.保存工作区配置

点击AWS Prometheus工作区ID进入详情,将提取/收集 中的配置保存为prometheus.yaml,后面会在安装prometheus时使用。

image

3.创建从EKS提取指标的role

使用以下内容创建名为 createIRSA-AMPIngest.sh 的文件。将 <my_amazon_eks_clustername> 替换为您集群的名称,并将 <my_prometheus_namespace> 替换为您的 Prometheus 命名空间

复制代码

#!/bin/bash -e

CLUSTER_NAME=<my_amazon_eks_clustername>

SERVICE_ACCOUNT_NAMESPACE=<my_prometheus_namespace>

AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)

OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")

SERVICE_ACCOUNT_AMP_INGEST_NAME=amp-iamproxy-ingest-service-account

SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE=amp-iamproxy-ingest-role

SERVICE_ACCOUNT_IAM_AMP_INGEST_POLICY=AMPIngestPolicy

#

# Set up a trust policy designed for a specific combination of K8s service account and namespace to sign in from a Kubernetes cluster which hosts the OIDC Idp.

#

cat <<EOF > TrustPolicy.json

{

"Version": "2012-10-17",

"Statement": [

{

"Effect": "Allow",

"Principal": {

"Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"

},

"Action": "sts:AssumeRoleWithWebIdentity",

"Condition": {

"StringEquals": {

"${OIDC_PROVIDER}:sub": "system:serviceaccount:${SERVICE_ACCOUNT_NAMESPACE}:${SERVICE_ACCOUNT_AMP_INGEST_NAME}"

}

}

}

]

}

EOF

#

# Set up the permission policy that grants ingest (remote write) permissions for all AMP workspaces

#

cat <<EOF > PermissionPolicyIngest.json

{

"Version": "2012-10-17",

"Statement": [

{"Effect": "Allow",

"Action": [

"aps:RemoteWrite",

"aps:GetSeries",

"aps:GetLabels",

"aps:GetMetricMetadata"

],

"Resource": "*"

}

]

}

EOF

function getRoleArn() {

OUTPUT=$(aws iam get-role --role-name $1 --query 'Role.Arn' --output text 2>&1)

# Check for an expected exception

if [[ $? -eq 0 ]]; then

echo $OUTPUT

elif [[ -n $(grep "NoSuchEntity" <<< $OUTPUT) ]]; then

echo ""

else

>&2 echo $OUTPUT

return 1

fi

}

#

# Create the IAM Role for ingest with the above trust policy

#

SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN=$(getRoleArn $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE)

if [ "$SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN" = "" ];

then

#

# Create the IAM role for service account

#

SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN=$(aws iam create-role \

--role-name $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE \

--assume-role-policy-document file://TrustPolicy.json \

--query "Role.Arn" --output text)

#

# Create an IAM permission policy

#

SERVICE_ACCOUNT_IAM_AMP_INGEST_ARN=$(aws iam create-policy --policy-name $SERVICE_ACCOUNT_IAM_AMP_INGEST_POLICY \

--policy-document file://PermissionPolicyIngest.json \

--query 'Policy.Arn' --output text)

#

# Attach the required IAM policies to the IAM role created above

#

aws iam attach-role-policy \

--role-name $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE \

--policy-arn $SERVICE_ACCOUNT_IAM_AMP_INGEST_ARN

else

echo "$SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN IAM role for ingest already exists"

fi

echo $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN

#

# EKS cluster hosts an OIDC provider with a public discovery endpoint.

# Associate this IdP with AWS IAM so that the latter can validate and accept the OIDC tokens issued by Kubernetes to service accounts.

# Doing this with eksctl is the easier and best approach.

#

eksctl utils associate-iam-oidc-provider --cluster $CLUSTER_NAME --approve

复制代码

执行以上脚本创建role

bash createIRSA-AMPIngest.sh

使用以下内容创建名为 createIRSA-AMPQuery.sh 的文件。将 <my_amazon_eks_clustername> 替换为集群的名称,并将 <my_prometheus_namespace> 替换为您的 Prometheus 命名空间。

复制代码

#!/bin/bash -e

CLUSTER_NAME=<my_amazon_eks_clustername>

SERVICE_ACCOUNT_NAMESPACE=<my_prometheus_namespace>

AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)

OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")

SERVICE_ACCOUNT_AMP_QUERY_NAME=amp-iamproxy-query-service-account

SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE=amp-iamproxy-query-role

SERVICE_ACCOUNT_IAM_AMP_QUERY_POLICY=AMPQueryPolicy

#

# Setup a trust policy designed for a specific combination of K8s service account and namespace to sign in from a Kubernetes cluster which hosts the OIDC Idp.

#

cat <<EOF > TrustPolicy.json

{

"Version": "2012-10-17",

"Statement": [

{

"Effect": "Allow",

"Principal": {

"Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"

},

"Action": "sts:AssumeRoleWithWebIdentity",

"Condition": {

"StringEquals": {

"${OIDC_PROVIDER}:sub": "system:serviceaccount:${SERVICE_ACCOUNT_NAMESPACE}:${SERVICE_ACCOUNT_AMP_QUERY_NAME}"

}

}

}

]

}

EOF

#

# Set up the permission policy that grants query permissions for all AMP workspaces

#

cat <<EOF > PermissionPolicyQuery.json

{

"Version": "2012-10-17",

"Statement": [

{"Effect": "Allow",

"Action": [

"aps:QueryMetrics",

"aps:GetSeries",

"aps:GetLabels",

"aps:GetMetricMetadata"

],

"Resource": "*"

}

]

}

EOF

function getRoleArn() {

OUTPUT=$(aws iam get-role --role-name $1 --query 'Role.Arn' --output text 2>&1)

# Check for an expected exception

if [[ $? -eq 0 ]]; then

echo $OUTPUT

elif [[ -n $(grep "NoSuchEntity" <<< $OUTPUT) ]]; then

echo ""

else

>&2 echo $OUTPUT

return 1

fi

}

#

# Create the IAM Role for query with the above trust policy

#

SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN=$(getRoleArn $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE)

if [ "$SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN" = "" ];

then

#

# Create the IAM role for service account

#

SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN=$(aws iam create-role \

--role-name $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE \

--assume-role-policy-document file://TrustPolicy.json \

--query "Role.Arn" --output text)

#

# Create an IAM permission policy

#

SERVICE_ACCOUNT_IAM_AMP_QUERY_ARN=$(aws iam create-policy --policy-name $SERVICE_ACCOUNT_IAM_AMP_QUERY_POLICY \

--policy-document file://PermissionPolicyQuery.json \

--query 'Policy.Arn' --output text)

#

# Attach the required IAM policies to the IAM role create above

#

aws iam attach-role-policy \

--role-name $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE \

--policy-arn $SERVICE_ACCOUNT_IAM_AMP_QUERY_ARN

else

echo "$SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN IAM role for query already exists"

fi

echo $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN

#

# EKS cluster hosts an OIDC provider with a public discovery endpoint.

# Associate this IdP with AWS IAM so that the latter can validate and accept the OIDC tokens issued by Kubernetes to service accounts.

# Doing this with eksctl is the easier and best approach.

#

eksctl utils associate-iam-oidc-provider --cluster $CLUSTER_NAME --approve

复制代码

执行以上脚本,创建role

bash createIRSA-AMPQuery.sh

二、部署Prometheus

1.添加helm仓库

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

helm repo add kube-state-metrics https://kubernetes.github.io/kube-state-metrics

helm repo update

2.创建部署Prometheus的命名空间

kubectl create namespace monitoring

3.检查Amazon EBS CSI

如果EBS CSI组件没有附加对应的IAM role,需要在IAM 控制台中创建附有AmazonEBSCSIDriverPolicy权限且类型为AWS账号的role,否则EKS创建PVC时会报错

image

4.创建storageClass

复制代码

#cat sc.yaml

apiVersion: storage.k8s.io/v1

kind: StorageClass

metadata:

name: ebs-sc

annotations:

storageclass.kubernetes.io/is-default-class: "true"

provisioner: ebs.csi.aws.com

allowVolumeExpansion: true

volumeBindingMode: WaitForFirstConsumer

parameters:

type: gp3

#kubectl apply -f sc.yaml

复制代码

5.部署Prometheus

helm install prometheus prometheus -n monitoring -f prometheus.yaml

6.查看Prometheus是否部署成功

kubectl get pods -n monitoring

7.部署grafana

复制代码

#cat grafana.yaml

---

apiVersion: v1

kind: PersistentVolumeClaim

metadata:

name: grafana-pvc

spec:

accessModes:

- ReadWriteOnce

resources:

requests:

storage: 1Gi

---

apiVersion: apps/v1

kind: Deployment

metadata:

labels:

app: grafana

name: grafana

spec:

selector:

matchLabels:

app: grafana

template:

metadata:

labels:

app: grafana

spec:

securityContext:

fsGroup: 472

supplementalGroups:

- 0

containers:

- name: grafana

image: grafana/grafana:latest

imagePullPolicy: IfNotPresent

ports:

- containerPort: 3000

name: http-grafana

protocol: TCP

readinessProbe:

failureThreshold: 3

httpGet:

path: /robots.txt

port: 3000

scheme: HTTP

initialDelaySeconds: 10

periodSeconds: 30

successThreshold: 1

timeoutSeconds: 2

livenessProbe:

failureThreshold: 3

initialDelaySeconds: 30

periodSeconds: 10

successThreshold: 1

tcpSocket:

port: 3000

timeoutSeconds: 1

resources:

requests:

cpu: 250m

memory: 750Mi

volumeMounts:

- mountPath: /var/lib/grafana

name: grafana-pv

volumes:

- name: grafana-pv

persistentVolumeClaim:

claimName: grafana-pvc

---

apiVersion: v1

kind: Service

metadata:

name: grafana

spec:

ports:

- port: 3000

protocol: TCP

targetPort: http-grafana

selector:

app: grafana

sessionAffinity: None

type: ClusterIP

#kubectl apply -f grafana.yaml -n monitoring

复制代码

三、访问Prometheus和grafana

Prometheus和grafana部署完成以后,可以将SVC类型改为nodeport,然后通过ALB暴露出来,通过公网进行访问

grafana默认用户密码为admin/admin

image

版权声明: 本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若内容造成侵权/违法违规/事实不符,请联系邮箱:809451989@qq.com进行投诉反馈,一经查实,立即删除!
网站建设 2026/6/10 15:50:45

Wayback Machine浏览器扩展终极指南:5个实用技巧快速上手

Wayback Machine浏览器扩展终极指南&#xff1a;5个实用技巧快速上手 【免费下载链接】wayback-machine-webextension A web browser extension for Chrome, Firefox, Edge, and Safari 14. 项目地址: https://gitcode.com/gh_mirrors/wa/wayback-machine-webextension …

作者头像 李华
网站建设 2026/6/9 21:30:29

RT-DETR 2025实战指南:动态卷积如何重塑工业级目标检测

RT-DETR 2025实战指南&#xff1a;动态卷积如何重塑工业级目标检测 【免费下载链接】rtdetr_r101vd_coco_o365 项目地址: https://ai.gitcode.com/hf_mirrors/PekingU/rtdetr_r101vd_coco_o365 技术痛点与行业挑战 当前工业级目标检测面临三大核心难题&#xff1a;精度…

作者头像 李华
网站建设 2026/6/10 1:15:29

GRF框架:构建下一代因果机器学习系统的核心技术解析

GRF框架&#xff1a;构建下一代因果机器学习系统的核心技术解析 【免费下载链接】grf Generalized Random Forests 项目地址: https://gitcode.com/gh_mirrors/gr/grf 在当今数据驱动的决策环境中&#xff0c;准确识别和量化因果效应已成为企业和研究机构的核心需求。G…

作者头像 李华
网站建设 2026/6/10 5:41:14

OpenXR-Toolkit完全指南:轻松打造高性能VR应用的终极解决方案

想要让你的VR应用性能翻倍却不知从何入手&#xff1f;OpenXR-Toolkit正是你需要的那个强大工具&#xff01;这个强大的开源工具包专为OpenXR应用程序量身定制&#xff0c;提供了一系列即插即用的功能模块&#xff0c;让复杂的VR开发变得简单高效。 【免费下载链接】OpenXR-Tool…

作者头像 李华
网站建设 2026/6/10 15:49:14

SOME/IP在智能座舱中的5个典型应用场景解析

快速体验 打开 InsCode(快马)平台 https://www.inscode.net输入框内输入如下内容&#xff1a; 构建一个智能座舱通信模拟器&#xff0c;包含&#xff1a;1. 中控屏服务 2. 仪表盘服务 3. HUD服务 4. 语音控制服务。要求&#xff1a;- 各服务间通过SOME/IP通信 - 模拟用户点击中…

作者头像 李华
网站建设 2026/6/10 14:05:26

Windows 11安装终极指南:轻松绕过硬件限制的完整解决方案

Windows 11安装终极指南&#xff1a;轻松绕过硬件限制的完整解决方案 【免费下载链接】MediaCreationTool.bat Universal MCT wrapper script for all Windows 10/11 versions from 1507 to 21H2! 项目地址: https://gitcode.com/gh_mirrors/me/MediaCreationTool.bat 还…

作者头像 李华