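Monitoring Huawei Cloud Load Balancer Connections with Feishu Notifications
In day-to-day cloud operations, continuously monitoring resource status is one of the keys to keeping a system stable. This article walks through a real case: using the Huawei Cloud Go SDK to fetch a load balancer's connection count and sending it to a team group through a Feishu webhook, so that operations staff receive up-to-date monitoring information promptly. I originally planned to use CES alarms directly, but after looking at the templates and best practices, webhook support did not seem very good, so I implemented it myself with the Go SDK.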
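Background
On Huawei Cloud, Elastic Load Balance (ELB) distributes incoming client requests across multiple cloud servers so that the system keeps providing stable, reliable service under varying load. ELB metrics such as the number of concurrent connections (the m1_cps metric used below) are an important indicator of how much load the system is carrying, so we usually want to monitor them in near real time.
As cloud services have matured, large organizations often feed monitoring data into their instant messaging tools so that team members can see and respond to potential problems right away. In this case the messaging tool is Feishu, and the Huawei Cloud Go SDK is how we talk to Huawei Cloud services.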
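Environment Setup
The Huawei Cloud Go SDK is a development kit built around the Huawei Cloud APIs that makes it convenient to call cloud services from Go. Here we use the Cloud Eye Service (CES) API through the SDK to retrieve the ELB connection metric.
Installing the Huawei Cloud Go SDK
First, install the Huawei Cloud Go SDK. The required package can be installed with the go get command: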
go get -u github.com/huaweicloud/huaweicloud-sdk-go-v3
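Once installed, the SDK modules can be imported into the project.
Initializing the Client
To interact with Huawei Cloud services, we need to create and initialize an SDK client. The example below creates a client for CES (Cloud Eye Service) and authenticates with an Access Key (AK) and Secret Key (SK).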
package main

// Import the required packages.
import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "time"

    "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
    ces "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1"
    "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/model"
    region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/region"
)

const (
    feishuWebhookURL = "xxxx"  // Feishu webhook URL
    ak               = "xxx"   // Access Key
    sk               = "xxxxx" // Secret Key
)

func main() {
    // Build the credentials.
    auth := basic.NewCredentialsBuilder().
        WithAk(ak).
        WithSk(sk).
        Build()

    // Initialize the CES client.
    client := ces.NewCesClient(
        ces.CesClientBuilder().
            WithRegion(region.ValueOf("cn-east-3")).
            WithCredential(auth).
            Build())

    // ... remaining code
}
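Hardcoding the AK and SK as constants is fine for a quick test, but the keys then tend to end up in version control. The following is a minimal sketch of reading them from the environment instead. The helper loadCredentials and the variable names HUAWEICLOUD_SDK_AK and HUAWEICLOUD_SDK_SK are assumptions made for this illustration, not something the article's code relies on.

```go
package main

import (
    "errors"
    "fmt"
    "os"
)

// loadCredentials reads the AK/SK pair through the supplied lookup function.
// The environment variable names are an assumption for this sketch.
func loadCredentials(getenv func(string) string) (ak, sk string, err error) {
    ak = getenv("HUAWEICLOUD_SDK_AK")
    sk = getenv("HUAWEICLOUD_SDK_SK")
    if ak == "" || sk == "" {
        return "", "", errors.New("HUAWEICLOUD_SDK_AK and HUAWEICLOUD_SDK_SK must both be set")
    }
    return ak, sk, nil
}

func main() {
    ak, sk, err := loadCredentials(os.Getenv)
    if err != nil {
        fmt.Println("credentials not configured:", err)
        return
    }
    // The values would then be passed to WithAk(ak) and WithSk(sk).
    fmt.Printf("loaded AK of length %d and SK of length %d\n", len(ak), len(sk))
}
```

Passing the lookup function as a parameter keeps the helper easy to test without touching the real process environment.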
Setting Up the Timer and Running the Task
A ticker periodically checks the load balancer's maximum connection count, for example:
ticker := time.NewTicker(1 * time.Minute) // fire a check every minute
for {
    select {
    case t := <-ticker.C:
        currentHour := t.Hour()
        // Only run within the configured hours.
        if currentHour >= 7 && currentHour < 24 {
            // Collect data at minute 59 of each hour.
            if t.Minute() == 59 {
                go collectDataAndSendToFeishu(client)
            }
        }
    }
}
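The ticker is restricted to the hours from 07:00 to 24:00; nothing runs between 00:00 and 07:00. Within that window, the task runs at minute 59 of every hour.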
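The schedule rule can be pulled out into a small pure function, which makes it easy to check without waiting for a real tick. This is only a sketch; the name shouldCollect is hypothetical. Since an hour value is always below 24, the upper bound is left implicit here.

```go
package main

import (
    "fmt"
    "time"
)

// shouldCollect reports whether a tick at the given hour and minute falls on
// minute 59 of an hour between 07:00 and 24:00.
func shouldCollect(hour, minute int) bool {
    return hour >= 7 && minute == 59
}

func main() {
    samples := []time.Time{
        time.Date(2024, 1, 1, 6, 59, 0, 0, time.UTC),
        time.Date(2024, 1, 1, 7, 59, 0, 0, time.UTC),
        time.Date(2024, 1, 1, 23, 30, 0, 0, time.UTC),
    }
    for _, t := range samples {
        fmt.Println(t.Format("15:04"), shouldCollect(t.Hour(), t.Minute()))
    }
}
```

Note that the hour and minute come from the tick's own time zone, so the 07:00 to 24:00 window follows the local time of the machine running the program.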
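Collecting and Sending Data
Once the ticker fires and the conditions are met, we collect the load balancer's maximum connection count and send it to Feishu. See the Huawei Cloud CES ShowMetricData API documentation for reference.
The implementation is as follows; pay attention to the parameters in **ShowMetricDataRequest**: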
func collectDataAndSendToFeishu(client *ces.CesClient) {
    // Query the one-minute window from minute 58 to minute 59 of the current hour (UTC).
    currentTime := time.Now().UTC()
    startTime := currentTime.Truncate(time.Hour).Add(58 * time.Minute)
    endTime := startTime.Add(time.Minute)
    startTimestamp := startTime.UnixNano() / int64(time.Millisecond)
    endTimestamp := endTime.UnixNano() / int64(time.Millisecond)

    request := &model.ShowMetricDataRequest{
        Namespace:  "SYS.ELB",
        MetricName: "m1_cps",
        Dim0:       "lbaas_instance_id,xxxxxx",
        Filter:     model.GetShowMetricDataRequestFilterEnum().MAX,
        Period:     int32(1),
        From:       startTimestamp,
        To:         endTimestamp,
    }

    response, err := client.ShowMetricData(request)
    if err != nil {
        fmt.Println("Error querying CES data:", err)
        return
    }
    fmt.Printf("CES response: %+v\n", response)

    // Extract the max value and timestamp from the response. Skip the
    // notification when there is no usable datapoint, so that a zero value
    // with a 1970 timestamp is never reported.
    datapoints := response.Datapoints
    if datapoints == nil || len(*datapoints) == 0 || (*datapoints)[0].Max == nil {
        fmt.Println("No datapoints returned for m1_cps")
        return
    }
    maxConnection := *(*datapoints)[0].Max
    timestamp := (*datapoints)[0].Timestamp

    // Format the millisecond timestamp in a readable form.
    readableTime := time.Unix(timestamp/1000, (timestamp%1000)*int64(time.Millisecond)).Format("2006-01-02 15:04:05")

    // Prepare the message and send it to Feishu.
    feishuMessage := fmt.Sprintf("At %s the load balancer's maximum connection count was %.2f", readableTime, maxConnection)
    if err := sendToFeishuWebhook(feishuWebhookURL, feishuMessage); err != nil {
        fmt.Println("Error sending to Feishu webhook:", err)
    }
}
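The From and To parameters are Unix timestamps in milliseconds. As a sketch of how the query window is derived, the calculation can be isolated and run against a fixed instant. The names queryWindow and sampleNow are hypothetical and exist only for this illustration.

```go
package main

import (
    "fmt"
    "time"
)

// queryWindow returns the window, in Unix milliseconds, that covers minute 58
// to minute 59 of the UTC hour containing now.
func queryWindow(now time.Time) (from, to int64) {
    start := now.UTC().Truncate(time.Hour).Add(58 * time.Minute)
    end := start.Add(time.Minute)
    return start.UnixMilli(), end.UnixMilli()
}

// sampleNow is a fixed instant used only to demonstrate the calculation.
var sampleNow = time.Date(2024, 1, 1, 10, 59, 30, 0, time.UTC)

func main() {
    from, to := queryWindow(sampleNow)
    fmt.Println(
        time.UnixMilli(from).UTC().Format("15:04:05"),
        time.UnixMilli(to).UTC().Format("15:04:05"),
    ) // prints 10:58:00 10:59:00
}
```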
Sending the Webhook Notification
Finally, implement the sendToFeishuWebhook function to push the message to Feishu.
func sendToFeishuWebhook(webhookURL string, message string) error {
    // FeishuWebhookMessage is defined in the complete listing.
    webhookMessage := FeishuWebhookMessage{
        MsgType: "text",
    }
    webhookMessage.Content.Text = message

    jsonData, err := json.Marshal(webhookMessage)
    if err != nil {
        return fmt.Errorf("failed to marshal webhook message: %v", err)
    }

    req, err := http.NewRequest("POST", webhookURL, bytes.NewBuffer(jsonData))
    if err != nil {
        return fmt.Errorf("failed to create HTTP request: %v", err)
    }
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("failed to send HTTP request: %v", err)
    }
    defer resp.Body.Close()

    responseBody, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to read webhook response body: %v", err)
    }
    fmt.Println("Feishu webhook response:", string(responseBody))
    return nil
}
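To see what is actually posted to the webhook, the JSON body can be built on its own. The sketch below mirrors the message struct used in this article; the names feishuTextMessage and buildTextPayload are hypothetical.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// feishuTextMessage mirrors the text-message payload used in this article.
type feishuTextMessage struct {
    MsgType string `json:"msg_type"`
    Content struct {
        Text string `json:"text"`
    } `json:"content"`
}

// buildTextPayload serializes a plain-text message into the JSON body that
// is POSTed to the webhook.
func buildTextPayload(text string) (string, error) {
    msg := feishuTextMessage{MsgType: "text"}
    msg.Content.Text = text
    data, err := json.Marshal(msg)
    if err != nil {
        return "", err
    }
    return string(data), nil
}

func main() {
    payload, err := buildTextPayload("max connections: 42.00")
    if err != nil {
        fmt.Println("marshal failed:", err)
        return
    }
    fmt.Println(payload)
    // prints {"msg_type":"text","content":{"text":"max connections: 42.00"}}
}
```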
Complete code:
At minute 59 of every hour, take the maximum over minutes 58 to 59 and send it to Feishu:
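package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "time"

    "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
    ces "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1"
    "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/model"
    region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/region"
)

// FeishuWebhookMessage is the JSON body of a Feishu text message.
type FeishuWebhookMessage struct {
    MsgType string `json:"msg_type"`
    Content struct {
        Text string `json:"text"`
    } `json:"content"`
}

const (
    feishuWebhookURL = "xxxx"
    ak               = "xxx"
    sk               = "xxxxx"
)

func main() {
    auth := basic.NewCredentialsBuilder().
        WithAk(ak).
        WithSk(sk).
        Build()
    client := ces.NewCesClient(
        ces.CesClientBuilder().
            WithRegion(region.ValueOf("cn-east-3")).
            WithCredential(auth).
            Build())

    ticker := time.NewTicker(1 * time.Minute) // check every minute for the 59th minute
    for {
        select {
        case t := <-ticker.C:
            // Only run between 07:00 and 24:00.
            currentHour := t.Hour()
            if currentHour >= 7 && currentHour < 24 {
                if t.Minute() == 59 {
                    go collectDataAndSendToFeishu(client)
                }
            }
        }
    }
}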
func collectDataAndSendToFeishu(client *ces.CesClient) {
    // Query the one-minute window from minute 58 to minute 59 of the current hour (UTC).
    currentTime := time.Now().UTC()
    startTime := currentTime.Truncate(time.Hour).Add(58 * time.Minute)
    endTime := startTime.Add(time.Minute)
    startTimestamp := startTime.UnixNano() / int64(time.Millisecond)
    endTimestamp := endTime.UnixNano() / int64(time.Millisecond)

    request := &model.ShowMetricDataRequest{
        Namespace:  "SYS.ELB",
        MetricName: "m1_cps",
        Dim0:       "lbaas_instance_id,xxxxxx",
        Filter:     model.GetShowMetricDataRequestFilterEnum().MAX,
        Period:     int32(1),
        From:       startTimestamp,
        To:         endTimestamp,
    }

    response, err := client.ShowMetricData(request)
    if err != nil {
        fmt.Println("Error querying CES data:", err)
        return
    }
    fmt.Printf("CES response: %+v\n", response)

    // Extract the max value and timestamp; skip the notification when there
    // is no usable datapoint.
    datapoints := response.Datapoints
    if datapoints == nil || len(*datapoints) == 0 || (*datapoints)[0].Max == nil {
        fmt.Println("No datapoints returned for m1_cps")
        return
    }
    maxConnection := *(*datapoints)[0].Max
    timestamp := (*datapoints)[0].Timestamp

    // Format the millisecond timestamp in a readable form.
    readableTime := time.Unix(timestamp/1000, (timestamp%1000)*int64(time.Millisecond)).Format("2006-01-02 15:04:05")

    // Prepare the message and send it to Feishu.
    feishuMessage := fmt.Sprintf("At %s the load balancer's maximum connection count was %.2f", readableTime, maxConnection)
    if err := sendToFeishuWebhook(feishuWebhookURL, feishuMessage); err != nil {
        fmt.Println("Error sending to Feishu webhook:", err)
    }
}

// sendToFeishuWebhook sends a message to the Feishu webhook.
func sendToFeishuWebhook(webhookURL string, message string) error {
    webhookMessage := FeishuWebhookMessage{
        MsgType: "text",
    }
    webhookMessage.Content.Text = message

    jsonData, err := json.Marshal(webhookMessage)
    if err != nil {
        return fmt.Errorf("failed to marshal webhook message: %v", err)
    }

    req, err := http.NewRequest("POST", webhookURL, bytes.NewBuffer(jsonData))
    if err != nil {
        return fmt.Errorf("failed to create HTTP request: %v", err)
    }
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("failed to send HTTP request: %v", err)
    }
    defer resp.Body.Close()

    responseBody, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to read webhook response body: %v", err)
    }
    fmt.Println("Feishu webhook response:", string(responseBody))
    return nil
}
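The main loop above wakes up every minute and checks whether the minute is 59. An alternative, sketched here under the assumption that a single long sleep is acceptable, is to compute the wait until the next 59th minute directly. The names untilMinute59 and sampleNow are hypothetical.

```go
package main

import (
    "fmt"
    "time"
)

// untilMinute59 returns how long to wait from now until the clock next shows
// minute 59, second 0, strictly after now.
func untilMinute59(now time.Time) time.Duration {
    next := time.Date(now.Year(), now.Month(), now.Day(), now.Hour(), 59, 0, 0, now.Location())
    if !next.After(now) {
        next = next.Add(time.Hour)
    }
    return next.Sub(now)
}

// sampleNow is a fixed instant used only to demonstrate the calculation.
var sampleNow = time.Date(2024, 1, 1, 10, 30, 0, 0, time.UTC)

func main() {
    fmt.Println(untilMinute59(sampleNow)) // prints 29m0s
    // In a real loop one would sleep for untilMinute59(time.Now()),
    // run the collection, and repeat.
}
```

The per-minute ticker is simpler to reason about; this variant mainly avoids 59 idle wake-ups per hour.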
Running the code above produces the hourly message in the Feishu group (screenshot not reproduced here).
Other extensions:
Check every minute and trigger an alert when the connection count exceeds 100
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "time"

    "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
    ces "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1"
    "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/model"
    region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/region"
)

// FeishuWebhookMessage is the JSON body of a Feishu text message.
type FeishuWebhookMessage struct {
    MsgType string `json:"msg_type"`
    Content struct {
        Text string `json:"text"`
    } `json:"content"`
}

const (
    feishuWebhookURL = "xxxx"
    ak               = "xxxx"
    sk               = "xxxxx"
)

func main() {
    auth := basic.NewCredentialsBuilder().
        WithAk(ak).
        WithSk(sk).
        Build()
    client := ces.NewCesClient(
        ces.CesClientBuilder().
            WithRegion(region.ValueOf("cn-east-3")).
            WithCredential(auth).
            Build())

    ticker := time.NewTicker(1 * time.Minute) // check every minute
    for {
        select {
        case t := <-ticker.C:
            // Only run between 07:00 and 24:00.
            currentHour := t.Hour()
            if currentHour >= 7 && currentHour < 24 {
                go collectDataAndSendToFeishu(client)
            }
        }
    }
}
func collectDataAndSendToFeishu(client *ces.CesClient) {
    // Query the most recent complete minute (UTC).
    currentTime := time.Now().UTC().Truncate(time.Minute)
    startTime := currentTime.Add(-1 * time.Minute)
    endTime := currentTime
    startTimestamp := startTime.UnixNano() / int64(time.Millisecond)
    endTimestamp := endTime.UnixNano() / int64(time.Millisecond)

    request := &model.ShowMetricDataRequest{
        Namespace:  "SYS.ELB",
        MetricName: "m1_cps",
        Dim0:       "lbaas_instance_id,xxxxx",
        Filter:     model.GetShowMetricDataRequestFilterEnum().MAX,
        Period:     int32(1),
        From:       startTimestamp,
        To:         endTimestamp,
    }

    response, err := client.ShowMetricData(request)
    if err != nil {
        fmt.Println("Error querying CES data:", err)
        return
    }
    fmt.Printf("CES response: %+v\n", response)

    // Extract the max value and timestamp; nothing to check when there is no
    // usable datapoint.
    datapoints := response.Datapoints
    if datapoints == nil || len(*datapoints) == 0 || (*datapoints)[0].Max == nil {
        return
    }
    maxConnection := *(*datapoints)[0].Max
    timestamp := (*datapoints)[0].Timestamp

    // Format the millisecond timestamp in a readable form.
    readableTime := time.Unix(timestamp/1000, (timestamp%1000)*int64(time.Millisecond)).Format("2006-01-02 15:04:05")

    // Only send an alert when the connection count exceeds 100.
    if maxConnection > 100 {
        feishuMessage := fmt.Sprintf("Warning: at %s the load balancer connection count exceeded 100; current value: %.2f", readableTime, maxConnection)
        if err := sendToFeishuWebhook(feishuWebhookURL, feishuMessage); err != nil {
            fmt.Println("Error sending to Feishu webhook:", err)
        }
    }
}

// sendToFeishuWebhook sends a message to the Feishu webhook.
func sendToFeishuWebhook(webhookURL string, message string) error {
    webhookMessage := FeishuWebhookMessage{
        MsgType: "text",
    }
    webhookMessage.Content.Text = message

    jsonData, err := json.Marshal(webhookMessage)
    if err != nil {
        return fmt.Errorf("failed to marshal webhook message: %v", err)
    }

    req, err := http.NewRequest("POST", webhookURL, bytes.NewBuffer(jsonData))
    if err != nil {
        return fmt.Errorf("failed to create HTTP request: %v", err)
    }
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("failed to send HTTP request: %v", err)
    }
    defer resp.Body.Close()

    responseBody, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to read webhook response body: %v", err)
    }
    fmt.Println("Feishu webhook response:", string(responseBody))
    return nil
}
For reusability and readability, the value 100 can be extracted into a configurable constant:
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "time"

    "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
    ces "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1"
    "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/model"
    region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/region"
)

// FeishuWebhookMessage is the JSON body of a Feishu text message.
type FeishuWebhookMessage struct {
    MsgType string `json:"msg_type"`
    Content struct {
        Text string `json:"text"`
    } `json:"content"`
}

const (
    feishuWebhookURL = "xxxxx"
    ak               = "xxxx"
    sk               = "xxxx"

    loadBalancerConnectionThreshold = 100.00 // load balancer connection threshold
)

func main() {
    auth := basic.NewCredentialsBuilder().
        WithAk(ak).
        WithSk(sk).
        Build()
    client := ces.NewCesClient(
        ces.CesClientBuilder().
            WithRegion(region.ValueOf("cn-east-3")).
            WithCredential(auth).
            Build())

    ticker := time.NewTicker(1 * time.Minute) // check every minute
    for {
        select {
        case t := <-ticker.C:
            // Only run between 07:00 and 24:00.
            currentHour := t.Hour()
            if currentHour >= 7 && currentHour < 24 {
                go collectDataAndSendToFeishu(client)
            }
        }
    }
}
func collectDataAndSendToFeishu(client *ces.CesClient) {
    // Query the most recent complete minute (UTC).
    currentTime := time.Now().UTC().Truncate(time.Minute)
    startTime := currentTime.Add(-1 * time.Minute)
    endTime := currentTime
    startTimestamp := startTime.UnixNano() / int64(time.Millisecond)
    endTimestamp := endTime.UnixNano() / int64(time.Millisecond)

    request := &model.ShowMetricDataRequest{
        Namespace:  "SYS.ELB",
        MetricName: "m1_cps",
        Dim0:       "lbaas_instance_id,xxxx",
        Filter:     model.GetShowMetricDataRequestFilterEnum().MAX,
        Period:     int32(1),
        From:       startTimestamp,
        To:         endTimestamp,
    }

    response, err := client.ShowMetricData(request)
    if err != nil {
        fmt.Println("Error querying CES data:", err)
        return
    }
    fmt.Printf("CES response: %+v\n", response)

    // Extract the max value and timestamp; nothing to check when there is no
    // usable datapoint.
    datapoints := response.Datapoints
    if datapoints == nil || len(*datapoints) == 0 || (*datapoints)[0].Max == nil {
        return
    }
    maxConnection := *(*datapoints)[0].Max
    timestamp := (*datapoints)[0].Timestamp

    // Format the millisecond timestamp in a readable form.
    readableTime := time.Unix(timestamp/1000, (timestamp%1000)*int64(time.Millisecond)).Format("2006-01-02 15:04:05")

    // Only send an alert when the connection count exceeds the threshold.
    if maxConnection > loadBalancerConnectionThreshold {
        feishuMessage := fmt.Sprintf("Alert: at %s the load balancer connection count exceeded %.2f; current value: %.2f", readableTime, loadBalancerConnectionThreshold, maxConnection)
        if err := sendToFeishuWebhook(feishuWebhookURL, feishuMessage); err != nil {
            fmt.Println("Error sending to Feishu webhook:", err)
        }
    }
}

// sendToFeishuWebhook sends a message to the Feishu webhook.
func sendToFeishuWebhook(webhookURL string, message string) error {
    webhookMessage := FeishuWebhookMessage{
        MsgType: "text",
    }
    webhookMessage.Content.Text = message

    jsonData, err := json.Marshal(webhookMessage)
    if err != nil {
        return fmt.Errorf("failed to marshal webhook message: %v", err)
    }

    req, err := http.NewRequest("POST", webhookURL, bytes.NewBuffer(jsonData))
    if err != nil {
        return fmt.Errorf("failed to create HTTP request: %v", err)
    }
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("failed to send HTTP request: %v", err)
    }
    defer resp.Body.Close()

    responseBody, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to read webhook response body: %v", err)
    }
    fmt.Println("Feishu webhook response:", string(responseBody))
    return nil
}
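Note that loadBalancerConnectionThreshold must be a float64: it is compared against a float64 value and formatted with %.2f, so it is written as 100.00 rather than 100.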
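The reason the constant type matters can be shown in isolation. An untyped integer constant such as 100 becomes an int when passed to a formatting function, and the %.2f verb rejects it. The names in this small sketch are hypothetical.

```go
package main

import "fmt"

const (
    intThreshold   = 100    // untyped integer constant, becomes int when passed as an argument
    floatThreshold = 100.00 // untyped floating-point constant, becomes float64
)

// formatThreshold formats any value with the %.2f verb used in the alert text.
func formatThreshold(v interface{}) string {
    return fmt.Sprintf("%.2f", v)
}

func main() {
    fmt.Println(formatThreshold(intThreshold))   // prints %!f(int=100)
    fmt.Println(formatThreshold(floatThreshold)) // prints 100.00
}
```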
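Querying Multiple MetricName Values
This example uses an Elastic IP (EIP) together with the load balancer. In addition to alerting when the load balancer's connection count exceeds 100, I want to alert on four metrics of the EIP bound to the load balancer: "upstream_bandwidth_usage", "downstream_bandwidth_usage", "upstream_bandwidth", and "downstream_bandwidth". The listing below covers the four bandwidth metrics; the complete code is as follows: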
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "time"

    "github.com/huaweicloud/huaweicloud-sdk-go-v3/core/auth/basic"
    ces "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1"
    "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/model"
    region "github.com/huaweicloud/huaweicloud-sdk-go-v3/services/ces/v1/region"
)

// FeishuWebhookMessage is the JSON body of a Feishu text message.
type FeishuWebhookMessage struct {
    MsgType string `json:"msg_type"`
    Content struct {
        Text string `json:"text"`
    } `json:"content"`
}

const (
    feishuWebhookURL = "xxxxxxx"
    ak               = "xxx"
    sk               = "xxxx"

    // Thresholds are floating-point constants so that they format correctly with %.2f.
    upstreamBandwidthThreshold   = 40.0 // outbound bandwidth threshold, in Mbps
    downstreamBandwidthThreshold = 80.0 // inbound bandwidth threshold, in Mbps
    upstreamUsageThreshold       = 20.0 // outbound bandwidth usage threshold, in percent
    downstreamUsageThreshold     = 40.0 // inbound bandwidth usage threshold, in percent
)

var metricNames = []string{
    "upstream_bandwidth_usage",
    "downstream_bandwidth_usage",
    "upstream_bandwidth",
    "downstream_bandwidth",
}

func main() {
    auth := basic.NewCredentialsBuilder().
        WithAk(ak).
        WithSk(sk).
        Build()
    client := ces.NewCesClient(
        ces.CesClientBuilder().
            WithRegion(region.ValueOf("cn-east-3")).
            WithCredential(auth).
            Build())

    ticker := time.NewTicker(time.Minute) // check every minute
    for {
        select {
        case <-ticker.C:
            go collectDataAndSendToFeishu(client)
        }
    }
}

func collectDataAndSendToFeishu(client *ces.CesClient) {
    // Query the most recent complete minute (UTC).
    currentTime := time.Now().UTC().Truncate(time.Minute)
    startTime := currentTime.Add(-1 * time.Minute)
    endTime := currentTime
    startTimestamp := startTime.UnixMilli()
    endTimestamp := endTime.UnixMilli()
    dimensionValues := "bandwidth_id,xxxx" // replace with the actual dimension value

    for _, metricName := range metricNames {
        request := &model.ShowMetricDataRequest{
            Namespace:  "SYS.VPC",
            MetricName: metricName,
            Dim0:       dimensionValues,
            Filter:     model.GetShowMetricDataRequestFilterEnum().MAX,
            Period:     int32(1),
            From:       startTimestamp,
            To:         endTimestamp,
        }

        response, err := client.ShowMetricData(request)
        if err != nil {
            fmt.Printf("Error querying CES data for %s: %v\n", metricName, err)
            continue
        }
        if response.Datapoints == nil || len(*response.Datapoints) == 0 {
            fmt.Printf("No datapoints received for %s\n", metricName)
            continue
        }

        var maxUsage float64
        for _, point := range *response.Datapoints {
            if point.Max != nil {
                if metricName == "upstream_bandwidth" || metricName == "downstream_bandwidth" {
                    // Convert from bits to Mbits.
                    maxUsage = *point.Max / 1000000.0
                } else {
                    // Usage metrics are already percentages.
                    maxUsage = *point.Max
                }
                break // assuming there is only one datapoint with the MAX filter
            }
        }

        if (metricName == "upstream_bandwidth" && maxUsage > upstreamBandwidthThreshold) ||
            (metricName == "downstream_bandwidth" && maxUsage > downstreamBandwidthThreshold) ||
            (metricName == "upstream_bandwidth_usage" && maxUsage > upstreamUsageThreshold) ||
            (metricName == "downstream_bandwidth_usage" && maxUsage > downstreamUsageThreshold) {
            alertMessage := createAlertMessage(metricName, maxUsage, endTime)
            if err := sendToFeishuWebhook(feishuWebhookURL, alertMessage); err != nil {
                fmt.Printf("Error sending to Feishu webhook: %v\n", err)
            }
        }
    }
}

func createAlertMessage(metricName string, usage float64, endTime time.Time) string {
    // Display the UTC query time in UTC+8.
    readableTime := endTime.In(time.FixedZone("UTC+8", 8*60*60)).Format("2006-01-02 15:04:05")

    var alertMessage string
    // Adjust the wording and units to your own needs.
    switch metricName {
    case "upstream_bandwidth":
        alertMessage = fmt.Sprintf("Alert: at %s outbound bandwidth exceeded %.2fMbps; current bandwidth: %.2fMbps", readableTime, upstreamBandwidthThreshold, usage)
    case "downstream_bandwidth":
        alertMessage = fmt.Sprintf("Alert: at %s inbound bandwidth exceeded %.2fMbps; current bandwidth: %.2fMbps", readableTime, downstreamBandwidthThreshold, usage)
    case "upstream_bandwidth_usage":
        alertMessage = fmt.Sprintf("Alert: at %s outbound bandwidth usage exceeded %.2f%%; current usage: %.2f%%", readableTime, upstreamUsageThreshold, usage)
    case "downstream_bandwidth_usage":
        alertMessage = fmt.Sprintf("Alert: at %s inbound bandwidth usage exceeded %.2f%%; current usage: %.2f%%", readableTime, downstreamUsageThreshold, usage)
    }
    return alertMessage
}

func sendToFeishuWebhook(webhookURL string, message string) error {
    webhookMessage := FeishuWebhookMessage{
        MsgType: "text",
    }
    webhookMessage.Content.Text = message

    jsonData, err := json.Marshal(webhookMessage)
    if err != nil {
        return fmt.Errorf("failed to marshal webhook message: %v", err)
    }

    req, err := http.NewRequest("POST", webhookURL, bytes.NewBuffer(jsonData))
    if err != nil {
        return fmt.Errorf("failed to create HTTP request: %v", err)
    }
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("failed to send HTTP request: %v", err)
    }
    defer resp.Body.Close()

    responseBody, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to read webhook response body: %v", err)
    }
    fmt.Printf("Feishu webhook response: %s\n", responseBody)
    return nil
}
A test alert was received in the Feishu group (screenshot not reproduced here).
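The per-metric checks above repeat the metric names in the scaling step, the threshold condition, and the message text. One possible way to tidy this up, shown only as a sketch, is a lookup table keyed by metric name. The rule type, the rules map, and evaluate are hypothetical names; the thresholds are the ones used above.

```go
package main

import "fmt"

// rule describes how one metric is scaled and compared.
type rule struct {
    label     string  // human-readable name for the alert text
    unit      string  // unit shown after scaling
    divisor   float64 // the raw value is divided by this before comparing
    threshold float64 // alert when the scaled value exceeds this
}

var rules = map[string]rule{
    "upstream_bandwidth":         {"outbound bandwidth", "Mbps", 1e6, 40},
    "downstream_bandwidth":       {"inbound bandwidth", "Mbps", 1e6, 80},
    "upstream_bandwidth_usage":   {"outbound bandwidth usage", "%", 1, 20},
    "downstream_bandwidth_usage": {"inbound bandwidth usage", "%", 1, 40},
}

// evaluate returns the alert text for a raw datapoint, or "" when the metric
// is unknown or the value is within its threshold.
func evaluate(metric string, raw float64) string {
    r, ok := rules[metric]
    if !ok {
        return ""
    }
    value := raw / r.divisor
    if value <= r.threshold {
        return ""
    }
    return fmt.Sprintf("Alert: %s exceeded %.2f%s, current value: %.2f%s",
        r.label, r.threshold, r.unit, value, r.unit)
}

func main() {
    fmt.Println(evaluate("upstream_bandwidth", 55e6))
    fmt.Println(evaluate("downstream_bandwidth_usage", 12) == "")
}
```

Adding a fifth metric then means adding one map entry instead of touching three places in the code.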
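Further Ideas
- Iterate over the list of load balancers and query the connection count of each one in a batch? Include the load balancer's name when sending the alert?
- Look up the EIP instances bound to each load balancer, find the bandwidth_id of every EIP, and report the metrics for all of them?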
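As a starting point for the first idea, the sketch below loops over a list of load balancers and builds one alert line per instance that is over the threshold, including its name. The query function is a stand-in for a ShowMetricData call; loadBalancer, queryFunc, overThreshold, and the sample data are all hypothetical.

```go
package main

import "fmt"

// loadBalancer pairs an instance ID with a display name.
type loadBalancer struct {
    ID   string
    Name string
}

// queryFunc stands in for a ShowMetricData call that returns the latest
// maximum connection count for one load balancer ID.
type queryFunc func(id string) (float64, error)

// overThreshold queries every load balancer in order and returns one line for
// each instance whose connection count exceeds threshold. Query errors are
// reported as lines too, so one failing instance does not hide the others.
func overThreshold(lbs []loadBalancer, query queryFunc, threshold float64) []string {
    var lines []string
    for _, lb := range lbs {
        value, err := query(lb.ID)
        if err != nil {
            lines = append(lines, fmt.Sprintf("%s: query failed: %v", lb.Name, err))
            continue
        }
        if value > threshold {
            lines = append(lines, fmt.Sprintf("%s: %.2f connections (threshold %.2f)", lb.Name, value, threshold))
        }
    }
    return lines
}

func main() {
    lbs := []loadBalancer{{"id-1", "web"}, {"id-2", "api"}}
    fake := func(id string) (float64, error) {
        if id == "id-1" {
            return 150, nil
        }
        return 30, nil
    }
    for _, line := range overThreshold(lbs, fake, 100) {
        fmt.Println(line) // prints web: 150.00 connections (threshold 100.00)
    }
}
```

In a real program the list itself would come from the ELB API, and the resulting lines could be joined into a single Feishu message.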
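Summary
This article showed how to use the Go SDK to fetch the maximum connection count of a load balancer on Huawei Cloud, how to query several EIP metrics with per-metric conditions, and how to send notifications through a Feishu webhook. The implementation can be adapted to your own needs, for example by changing the monitored metrics or the way messages are delivered. I hope it helps you monitor and manage your Huawei Cloud resources more effectively.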