Description
I am running the ipmi-exporter image v1.10.0 and keep running into the following problem. After ipmi-exporter starts, the server hosting the container sometimes ends up with thousands of ipmi-sensors processes, or even more. These processes are never cleaned up, so the server's CPU stays under sustained high load, which is catastrophic for the other containers on the same server.
The attached image shows the server load as recorded by Prometheus; the PromQL expression is node:load5:ratio * 100 > 300.
I have not found a fix yet; the only mitigation so far is to shut down ipmi-exporter temporarily. The problem occurs sporadically, with no discernible pattern, and we have no way to handle it when it happens.
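To put a number on the process pile-up when it next happens, something like the following sketch could be run on the affected host. It is only an illustration, not part of ipmi-exporter: it assumes a Linux host with /proc mounted, and the helper names `read_process_names` and `count_matching` are made up for this example. It counts processes whose command name starts with "ipmi", so ipmi_exporter itself will also show up in the output.

```python
import os
from collections import Counter


def read_process_names(proc_root="/proc"):
    """Return the command name (comm) of every process visible under proc_root."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux host, or /proc is not mounted
    names = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are PIDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                names.append(fh.read().strip())
        except OSError:
            continue  # the process exited while we were scanning
    return names


def count_matching(names, prefix="ipmi"):
    """Count process names that start with the given prefix."""
    return Counter(name for name in names if name.startswith(prefix))


if __name__ == "__main__":
    # Print one line per matching command name, most frequent first.
    for name, count in count_matching(read_process_names()).most_common():
        print(f"{count:6d}  {name}")
```

Running this periodically (for example from cron) and logging the output would show whether the ipmi-sensors count grows steadily or jumps all at once, which may help narrow down the trigger.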
Here is my ipmi-exporter configuration file:
modules:
  advanced:
    collector_cmd:
      ipmi: sudo
      sel: sudo
    collectors:
    - ipmi
    - sel
    custom_args:
      ipmi:
      - ipmimonitoring
      sel:
      - ipmi-sel
    driver: LAN
    pass: secret_pw
    privilege: admin
    user: some_user
  dcmi:
    collectors:
    - dcmi
    driver: LAN_2_0
    pass: another_pw
    privilege: admin
    user: admin_user
  default:
    collectors:
    - bmc
    - ipmi
    - chassis
    driver: LAN_2_0
    exclude_sensor_ids:
    - 2
    - 29
    - 32
    - 50
    - 52
    - 55
    pass: xxxx
    privilege: user
    timeout: 10000
    user: xxx
  thatspecialhost:
    collectors:
    - ipmi
    - sel
    custom_args:
      ipmi:
      - --bridge-sensors
    driver: LAN
    pass: secret_pw
    privilege: admin
    user: some_user
    workaround_flags:
    - discretereading