The core of our London office network is handled by a Cisco Catalyst switch. It quietly gets on with the job, but we don’t have any monitoring in place. We’ve never really needed it: when things go wrong we can check on the command line to see what’s going on, temperature-wise and across a whole range of other stats. Temperature is a theme at the moment because of the environment the switch sits in; as an eco charity, we don’t use air conditioning.
Once you’ve SSH’d into it, you can run sh env to get a printout of the temperature sensors placed all over the device. When things are getting toasty, it looks like this:
Module  Sensor             Temperature            Status
1       Air inlet          57C (56C,68C,71C)      warning
1       Air inlet remote   46C (46C,59C,62C)      warning
1       Air outlet         56C (66C,76C,79C)      ok
1       Air outlet remote  42C (60C,71C,74C)      ok
2       Air inlet          62C (56C,68C,71C)      warning
2       Air inlet remote   47C (46C,59C,62C)      warning
2       Air outlet         58C (66C,76C,79C)      ok
2       Air outlet remote  45C (60C,71C,74C)      ok
3       air inlet          58C (45C,60C,70C)      warning
3       air outlet         56C (61C,76C,86C)      ok
4       air inlet          52C (45C,60C,70C)      warning
4       air outlet         59C (61C,76C,86C)      ok
5       Air inlet          41C (50C,66C,72C)      ok
5       Air outlet         61C (73C,80C,92C)      ok
5       XPP                109C (101C,120C,123C)  warning
5       IFE                69C (65C,90C,120C)     warning
5       CONAN              81C (67C,82C,100C)     warning
5       CPU                72C (69C,102C,105C)    warning
(Pardon the formatting…)
It looks slightly better on the command line, but not much. As with all metrics, I like them in Graphite / Grafana so we can see trends, look back through history and be alerted. Points to anyone who can identify what XPP, IFE & CONAN are; this has stumped even our senior infra guy.
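Before any of those readings can be graphed, each row of the sh env output has to be turned into something structured. The snippet below is only an illustrative sketch of that step, not the script we actually run: the names Reading, ROW and parse_row are made up for this example, and it assumes every row has the same shape as the ones printed above (module number, sensor name, temperature, three bracketed values, status).

```python
import re
from typing import NamedTuple, Optional, Tuple


class Reading(NamedTuple):
    """One row of the temperature table (hypothetical structure)."""
    module: int
    sensor: str
    celsius: int
    thresholds: Tuple[int, int, int]  # the three bracketed values, in printed order
    status: str


# Assumed row shape: "<module> <sensor name> <temp>C (<a>C,<b>C,<c>C) <status>"
ROW = re.compile(
    r"^\s*(\d+)\s+(.+?)\s+(\d+)C\s+\((\d+)C,\s*(\d+)C,\s*(\d+)C\)\s+(\S+)\s*$"
)


def parse_row(line: str) -> Optional[Reading]:
    """Return a Reading for a data row, or None for headers and other lines."""
    match = ROW.match(line)
    if match is None:
        return None
    module, sensor, temp, t1, t2, t3, status = match.groups()
    return Reading(
        module=int(module),
        sensor=sensor,
        celsius=int(temp),
        thresholds=(int(t1), int(t2), int(t3)),
        status=status,
    )
```

Because the sensor name can be one, two or three words, the pattern matches it lazily and lets the temperature field mark where the name ends. Lines that don’t fit, such as the header, simply come back as None and can be skipped.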
I can’t take credit for any of this code; this one is all Steven. He isn’t on GitHub, so with his blessing I’m sharing it on his behalf, in the hope it helps someone else keep a closer eye on their switch.
The code is here. It’s neat and to the point, and is designed to run on a Raspberry Pi running Raspbian.
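For anyone who wants a feel for the Graphite side before opening the real thing, here is a rough sketch of how a reading could be pushed using Carbon’s plaintext protocol (one "path value timestamp" line per metric, sent over TCP, with 2003 as the default port). To be clear, this is not Steven’s code: the function names, the metric naming scheme and any prefix or hostname you pass in are assumptions for illustration only.

```python
import socket
import time


def metric_path(prefix, module, sensor):
    """Build a dotted Graphite path, e.g. prefix.module1.air_inlet_remote.

    The naming scheme here is hypothetical; pick whatever suits your tree.
    """
    slug = "_".join(sensor.lower().split())
    return f"{prefix}.module{module}.{slug}"


def format_lines(samples, timestamp):
    """Render (path, value) pairs as plaintext-protocol lines."""
    return "".join(f"{path} {value} {timestamp}\n" for path, value in samples)


def send_to_graphite(payload, host, port=2003, timeout=5.0):
    """Open a TCP connection to Carbon and send the whole payload."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload.encode("ascii"))


def push(samples, host, port=2003):
    """Stamp the samples with the current time and send them."""
    payload = format_lines(samples, int(time.time()))
    send_to_graphite(payload, host, port)
```

Run from cron every minute or so, something like push([(metric_path("network.london.core", 1, "Air inlet"), 57)], "graphite.example.org") would be enough for Grafana to start drawing a trend line; both the prefix and the hostname there are placeholders.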