Monitoring

From Wikiversity

Monitoring systems are responsible for collecting and displaying metrics, and for generating alarms, for monitored devices such as servers or network devices.

Tools

Linux includes different tools for collecting, observing and monitoring system metrics or performance, such as sysstat, atop, dstat and vmstat. Some of them include recording and alerting capabilities, while others only offer visualization.

Monitoring solutions

Zabbix and Nagios have historically been the tools used to monitor different servers or devices. Other tools include Icinga (a Nagios fork), Prometheus and Netdata.

Alerting capabilities

Netdata supports email alerts, and support for Slack is planned.

Prometheus Alertmanager supports different notification methods.
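
To make the idea of an email alert concrete, the following Python sketch composes a notification message and hands it to an SMTP relay. It is only an illustration and not how Netdata or Alertmanager implement notifications: the function names, the addresses and the assumption of a mail relay listening on localhost port 25 are all hypothetical.

```python
import smtplib
from email.message import EmailMessage


def build_alert_email(host, metric, value, threshold, sender, recipient):
    """Compose an alert message; all argument names are illustrative."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[ALERT] {metric} on {host}: {value} (threshold {threshold})"
    msg.set_content(
        f"{metric} on {host} is {value}, which exceeds the threshold of {threshold}."
    )
    return msg


def send_alert(msg, smtp_host="localhost", smtp_port=25):
    """Deliver the message through an SMTP relay (assumed to accept local mail)."""
    with smtplib.SMTP(smtp_host, smtp_port) as server:
        server.send_message(msg)
```

Keeping message construction separate from delivery lets the alert text be checked without a mail server; send_alert only works where such a relay actually exists.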

vmstat

vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
10  2 837532 459608  64052 23976592    1    2   992  1714    1    1 23  5 70  2  0
 1  0 837532 421692  64052 24014288    0    0 33444  2280 20629 33647 38  5 55  2  0
 2  1 837532 496892  64012 23937976    0   32 56464  2224 19104 35788 24  4 70  3  0
 7  0 837532 435928  64020 23999028    0    0 55584  2272 22717 37604 32  5 60  2  0
10  8 837532 411988  64020 24021820    0    0 21532 270348 25256 33189 41  6 38 16  0
 8  3 837532 447948  63984 23986276    0    0 28788 20560 27664 42733 39  7 41 13  0
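
Output like the sample above can also be processed by a script. The following Python sketch parses one vmstat data row into named fields and flags rows with high I/O wait (wa) or many processes in uninterruptible sleep (b). The column list is copied from the header shown above; the function names and the thresholds are illustrative assumptions, not recommended values.

```python
# Column names as printed in the second vmstat header line of the sample.
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat_line(line):
    """Turn one vmstat data row into a dict of column name -> int.

    Returns None for header lines or rows that do not match the layout.
    """
    fields = line.split()
    if len(fields) != len(VMSTAT_COLUMNS):
        return None
    try:
        values = [int(field) for field in fields]
    except ValueError:
        return None  # header row such as " r  b   swpd ..."
    return dict(zip(VMSTAT_COLUMNS, values))


def flag_sample(sample, wa_limit=10, blocked_limit=4):
    """Return warnings for one parsed row (thresholds are illustrative)."""
    warnings = []
    if sample["wa"] >= wa_limit:
        warnings.append(f"high I/O wait: {sample['wa']}%")
    if sample["b"] >= blocked_limit:
        warnings.append(f"{sample['b']} processes in uninterruptible sleep")
    return warnings


# Fifth data row of the sample output, where bo jumps to 270348.
row = parse_vmstat_line(
    "10  8 837532 411988  64020 24021820    0    0 21532 270348 25256 33189 41  6 38 16  0"
)
print(flag_sample(row))
# prints ['high I/O wait: 16%', '8 processes in uninterruptible sleep']
```

Note that the first data row printed by vmstat reports averages since boot, so a script like this would normally skip it.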

Activities

  1. Review Wikipedia's list of software monitors: https://en.wikipedia.org/wiki/System_monitor#List_of_software_monitors
  2. Review the Wikiversity articles covering system, software or network monitors: monit, Nagios, Prometheus (software), Netdata, sar and Zabbix
  3. Identify key differences between network monitoring, system monitoring[1] and application performance monitoring (APM)[2].
  4. Implement a solution to detect disk array failures (see System administration/ProLiant): remove a disk from your redundant storage array and review the OS logs
  5. Monitor disk space:
    1. Configure sysstat to collect disk space usage
    2. Configure Netdata with the diskspace plugin[3]
  6. Monitor your RAID devices:
    1. Software RAIDs: mdadm
    2. Hardware RAIDs: HPE Array Controllers
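
As a starting point for the disk space and software RAID activities, the following Python sketch computes the used percentage of a filesystem and lists md arrays that report a missing member. It is a minimal sketch under stated assumptions: the function names and the sample text are made up for illustration, and the parser assumes the usual /proc/mdstat layout in which a healthy two-disk mirror shows "[2/2] [UU]" and a degraded one "[2/1] [U_]". Hardware RAID controllers need the vendor's own tools and are not covered.

```python
import re
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of used space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def degraded_md_arrays(mdstat_text):
    """Return names of md arrays whose status line shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = re.match(r"^(md\w+)\s*:", line)
        if name:
            current = name.group(1)  # e.g. "md0 : active raid1 ..."
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Hypothetical /proc/mdstat content: md0 is healthy, md1 has lost one disk.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      1048576 blocks [2/1] [U_]

unused devices: <none>
"""

print(degraded_md_arrays(SAMPLE_MDSTAT))  # prints ['md1']
print(f"/ is {disk_usage_percent('/'):.1f}% full")
```

On a real system the text would come from reading /proc/mdstat, and both checks could run periodically and feed whichever alerting mechanism is in use. The percentage here is used space divided by total space, which can differ slightly from the figure df reports.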

References
