
    Zabbix – Network Monitoring

    Date: 01.10.2024



    Network and server monitoring is the process of continuously observing infrastructure and its parameters so that administrators understand how it behaves and how to manage it. Monitoring network devices allows for early identification of issues, prevention of failures, and maintenance of high availability and performance of systems. This gives administrators better insight into the functioning of the entire infrastructure and enables them to respond promptly to potential problems.

    Network monitoring plays a key role in managing IT infrastructure. When properly configured, it not only helps in the rapid detection and resolution of problems but also in their prevention. Preventive measures are made possible through constant access to data on the performance of networks, servers, applications, and network devices.

    Main aspects of network monitoring:

    1. Traffic monitoring: Analysis of data transmitted over the network, including packets, protocols, and IP addresses.
    2. Anomaly detection: Identification of unusual behaviors that may indicate performance issues, hacking attempts, or other irregularities.
    3. Resource management: Monitoring the usage of network resources, such as bandwidth, data flow, CPU usage, memory, and disk space.
    4. Security: Protection against threats, including detecting and responding to intrusion attempts, malware, or DoS attacks.
    5. Diagnostics and troubleshooting: Identifying and fixing network issues, helping to minimize downtime and maintain business continuity.
    6. Reporting and analysis: Creating reports on the status of the network, which can be used to analyze trends, plan future needs, and make decisions.

    Various tools and protocols are used for network monitoring. The most popular of these is SNMP (Simple Network Management Protocol), which allows administrators to collect and analyze data from network devices such as routers, switches, servers, and end devices. Using SNMP, administrators can gather information about device status and load and detect issues.

    Data collected via SNMP can be supplemented with data from other sources, such as system logs (syslog) or NetFlow, which allow for the analysis of network traffic. However, merely collecting data is not sufficient; it is crucial to process, analyze, and visualize it properly.

    To achieve this, administrators utilize a variety of tools that assist in monitoring infrastructure. There are many commercial and open-source solutions available on the market that support IT administrators’ work. Examples of such tools include LibreNMS, Nagios, SolarWinds, and Zabbix.


    Zabbix

    Zabbix is a comprehensive open-source solution for monitoring the status and performance of IT infrastructure components such as networks, servers, virtual machines, and cloud services. It can also monitor devices like UPSs, IoT sensors (e.g., thermometers, entrance counters, humidity sensors), and other devices connected to the network.

    Zabbix enables not only the collection and analysis of data but also its visualization, reporting, and alerting in case of problems. A significant advantage of this tool is its openness: Zabbix is an open-source project, so every user has access to the source code, and a large community actively supports the software’s development. Its large user base and well-prepared documentation make Zabbix user-friendly and flexible in configuration.

    Zabbix Architecture

    Zabbix consists of three main modules:

    1. Zabbix Server – the heart of the system, responsible for analyzing data and sending alerts.
    2. Database – the place where data and user-configured server settings are stored.
    3. Front-end – the user interface that enables data visualization, server configuration, and user interaction through a GUI (graphs, dashboards).

    Figure 1. Zabbix Architecture

    Zabbix Proxy

    An additional element of the architecture can be the Zabbix Proxy, which collects data on network performance and availability on behalf of the Zabbix server. With this architecture, Zabbix becomes a highly scalable application. In large installations, when the Zabbix server or proxy requires more resources, another Zabbix Proxy can be added to collect data from another part of the network.

    Installation of Zabbix Agent

    The Zabbix Agent can be installed on various operating systems, including Linux, Windows, and macOS.

    Figure 2. Available Zabbix Installers by Platform

    The software can operate on both physical machines and in virtual environments, and it can also be deployed in the cloud or in containers (e.g., Docker). Depending on the chosen environment and the number of monitored devices, the resource requirements will vary.

    To install Zabbix, appropriate physical resources will be needed, such as CPU, RAM, and disk space. These resources depend on the number of monitored devices and the amount of data collected. For example, in the case of larger installations, it is recommended to use more powerful servers and ample disk space for monitoring history storage.

    Figure 3. Recommended Physical Parameters by the Manufacturer

    The disk on which the data is stored must have an appropriate size, depending on how long we want to retain historical data and how large that data will be. The required disk space can be calculated using a formula that takes into account the configuration file, history, trends, and events. The size of each parameter can be determined using:

    Figure 4. Formula for Required Disk Space
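
    The formula in the figure is not reproduced here. As a rough sketch, the sizing approach commonly described for Zabbix sums a fixed configuration size with estimates for history, trends, and events. The per-record byte sizes and the example workload below are assumptions chosen for illustration and vary by database engine and Zabbix version:

```python
SECONDS_PER_DAY = 24 * 3600


def history_bytes(days, items, refresh_rate_s, bytes_per_value=90):
    """History keeps every collected value: items / refresh rate = values per second."""
    return days * (items / refresh_rate_s) * SECONDS_PER_DAY * bytes_per_value


def trends_bytes(days, items, bytes_per_trend=90):
    """Trends keep one aggregated record per item per hour."""
    return days * (items / 3600) * SECONDS_PER_DAY * bytes_per_trend


def events_bytes(days, events_per_second, bytes_per_event=170):
    """Events depend on how often triggers change state."""
    return days * events_per_second * SECONDS_PER_DAY * bytes_per_event


def total_bytes(config_bytes, history, trends, events):
    """Total = configuration + history + trends + events."""
    return config_bytes + history + trends + events


# Hypothetical workload: 3000 items polled every 60 s, 30 days of history,
# one year of trends and events, 10 MiB of configuration data.
estimate = total_bytes(
    config_bytes=10 * 1024**2,
    history=history_bytes(days=30, items=3000, refresh_rate_s=60),
    trends=trends_bytes(days=365, items=3000),
    events=events_bytes(days=365, events_per_second=1),
)
print(f"Estimated database size: {estimate / 1024**3:.1f} GiB")
```

    Shortening the history retention or lengthening the polling interval has the largest effect on the result, because history stores every single value.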

    Trends, History, and Events – What Are They?

    Trends: A built-in mechanism in Zabbix that allows for the reduction of historical data. It stores minimum, maximum, average values, and the total count of values for each hour for numerical data. Trends help decrease the amount of stored data without losing information about long-term performance changes.

    History: Stores every collected value, which means it is more resource-intensive than trends. History is useful when detailed information about each collected value is required.

    Events: Generated by triggers in the Zabbix system. Each event is recorded in the database, allowing for tracking when and why a particular issue occurred. The amount of space allocated for events depends on the number of alarms generated in the system.
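
    The relationship between history and trends can be illustrated with a small sketch. It is not Zabbix's internal implementation, only a model of the idea described above: raw history values are grouped by hour and reduced to a minimum, average, maximum, and count. The function name and the sample data are hypothetical:

```python
from collections import defaultdict


def build_trends(history):
    """Reduce (unix_timestamp, value) history samples to hourly trend records."""
    buckets = defaultdict(list)
    for timestamp, value in history:
        hour_start = timestamp - timestamp % 3600  # truncate to the full hour
        buckets[hour_start].append(value)
    return {
        hour: {
            "min": min(values),
            "avg": sum(values) / len(values),
            "max": max(values),
            "count": len(values),
        }
        for hour, values in sorted(buckets.items())
    }


# Three CPU-load samples: two in the first hour, one in the second.
samples = [(0, 10.0), (60, 20.0), (3600, 5.0)]
trends = build_trends(samples)
print(trends[0])
```

    However many values arrive within an hour, the trend record for that hour stays the same size, which is why trends can be retained much longer than history.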

    The choice of the database where the data will be stored depends on the preferences and experience of the administrator.

    Now that we have discussed the physical requirements, we should also mention network communication. The default values are as follows:

    Figure 5. Network Communication for the Zabbix Application

    Zabbix Configuration

    1. Host / Host group

    In Zabbix, a “host” refers to any physical or virtual device, application, service, or any logically related set of monitored parameters. To add a new host, navigate to the Configuration tab => Hosts => Create host.

    Figure 6. Configuration of a New Host

    The value of “Host name” must be unique for each object created in Zabbix. When creating a host, we have the option to assign it to the appropriate host group, which will facilitate future configuration. Therefore, before proceeding with the configuration, we should analyze our needs and consider what groups will be created. If we are operating in a distributed architecture, we can also choose which proxy server will be responsible for collecting data from the host.

    Depending on whether we will monitor our object using an agent or SNMP, we select the appropriate option and provide the IP address of our host. Once the configuration is complete, we click Add.

    Host groups allow for the grouping of hosts of the same type. In the future, a template can be assigned to a particular group instead of doing so individually for each host. If we choose not to use host groups, we can also assign the appropriate template tag to the host.

    If we already have a file with previously collected hosts, we can import them into Zabbix from that file.

    2. Item

    An item is an individual metric used for data collection. After configuring a host, an item must be added to obtain actual data. One way to quickly add multiple items is to assign one of the predefined templates to the host. However, to optimize system performance, it may be necessary to fine-tune the templates to ensure there are only as many items and as frequent monitoring as needed.

    Items can be created from the host configuration level or from the template level. To create a new item, navigate to Configuration => Hosts => host_name => Items => Create Item.

    Figure 7. Configuration of a New Item

    Each item key must be unique within a host. The same item definition can be reused across multiple hosts, for example through a template. Depending on the needs, the item type can be customized, such as data collected by an agent, SNMP, or other data sources. We can individually adjust the data retention length or leave the default global settings. To complete the configuration, we click Add.

    Figure 8. Example List of Item Types

    3. Trigger

    A trigger is a logical expression that “evaluates” the data collected by items and represents the current state of the system. Triggers allow for the definition of a threshold, determining what state of the data is “acceptable.” If the data exceeds the acceptable threshold, the trigger will be “activated” and change its status to PROBLEM.

    To create a new trigger, follow the same steps as for an item: Configuration => Hosts => host_name => Triggers => Create Trigger.

    Figure 9. Configuration of a Trigger

    We need to create a unique name for the trigger and select the appropriate severity, which defines the importance of the problem in the system. The most challenging part is creating the correct expression that will evaluate the state of the collected data. We can use the expression wizard, which simplifies the task, or create it manually. For example, we can configure a trigger that responds to a lack of response from three consecutive pings to the device. We can also set a recovery expression that defines the conditions for resolving the trigger. To finish, we click Add or Update if we are editing an existing trigger.
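
    The three-pings example can be modeled in a few lines. This is a sketch of the logic only, not Zabbix code; it assumes ping results are stored as 1 (reply) or 0 (no reply). In recent Zabbix versions a condition of this kind is typically written as something like max(/my-router/icmpping,#3)=0, where the host name my-router is hypothetical and the exact syntax depends on the Zabbix version:

```python
def trigger_status(ping_results, consecutive=3):
    """Return "PROBLEM" when the last `consecutive` pings all failed, else "OK".

    ping_results is a list ordered oldest to newest; 1 = reply, 0 = no reply.
    """
    latest = ping_results[-consecutive:]
    if len(latest) == consecutive and all(result == 0 for result in latest):
        return "PROBLEM"
    return "OK"


print(trigger_status([1, 1, 0, 0]))     # only two failures in a row
print(trigger_status([1, 0, 0, 0]))     # three failures in a row
print(trigger_status([0, 0, 0, 1]))     # device answered again
```

    Requiring several consecutive failures instead of one avoids raising a problem for a single lost packet.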

    4. Events

    In Zabbix, an event is a record of changes in the state of a monitored item or trigger. Events are a key element of the monitoring system as they log when specific changes occurred, allowing for precise tracking of issues and the system’s responses to these problems.

    An example of an event is a trigger event: every time a trigger changes its status (OK → PROBLEM → OK), an event is generated. All generated events can be viewed in the Monitoring => Problems tab.

    Figure 10. Example of a Generated Event
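
    The idea that an event is recorded only when a trigger changes status can be sketched as follows. The record layout and field names are assumptions for illustration and do not mirror the Zabbix database schema:

```python
def events_from_statuses(samples, initial="OK"):
    """Emit one event per status change in a sequence of (timestamp, status) pairs."""
    events = []
    previous = initial
    for timestamp, status in samples:
        if status != previous:
            events.append({"clock": timestamp, "from": previous, "to": status})
            previous = status
    return events


evaluations = [
    (100, "OK"),
    (160, "PROBLEM"),   # OK -> PROBLEM: event
    (220, "PROBLEM"),   # unchanged: no event
    (280, "OK"),        # PROBLEM -> OK: event
]
for event in events_from_statuses(evaluations):
    print(event)
```

    Four evaluations produce only two events here, which is why the space needed for events depends on how often triggers change state rather than on how often data is collected.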

    5. Graphs

    With a large amount of data flowing into Zabbix, it is significantly easier for users to analyze the data if they can view a visual representation of the situation rather than just numbers. In this case, graphs come into play. Graphs allow for a quick understanding of data flow, correlating problems, discovering when something began, or determining when something might escalate into an issue.

    Figure 11. Creating a New Graph

    Creating a new graph is done from the following path: Configuration => Hosts => host_name => Graphs => Create Graph.

    Figure 12. Example of a Created Graph

    In addition to a unique name for the graph, we can configure its size and select the item based on which the graph will be created. We can also add a legend to the graph, and if there is a configured trigger that responds to the exceeding of certain values, this will be noted on the graph as an event.

    Figure 13. Example of a Created Graph for RAM Usage of a Virtual Machine

    6. Maps and Screens

    Maps, screens, and dashboards allow multiple graphs and events to be visualized in one place. A dashboard serves as a central location where we can present the status of the entire network.

    Figure 14. Dashboard for Assessing the Status of the Network

    Maps allow for the graphical grouping of hosts. An example is a map of physical connections between network devices, which represents the topology.

    Figure 15. Graphical Topology of Network Device Connections

    Depending on the status of the device and the triggers that have fired, the color of the devices on the map changes. We can also create nested maps that allow navigation between different maps by clicking on a specific device or group of devices.

    Figure 16. Higher-Level Map

    In Figure 16, the Higher-Level Map hides subsequent lower-level maps. A problem triggered in one of the locations will be displayed on the global map.

    Screens are nothing more than a slideshow composed of selected maps, allowing us to create a sequence of successive maps or dashboards on the monitoring screen.

    Figure 17. Creating a Screen

    7. Template / Template Group

    Templates are a useful tool for simplifying the administrator’s work. In templates, you can define the values of items, graphs, and triggers, which will be automatically assigned to devices or virtual machines that are added to them. This way, we don’t have to configure variables separately for each host, but rather for a group of hosts. It’s advisable to consider the division of devices before starting the configuration of Zabbix to effectively utilize templates. Nested templates can also be created within templates.

    Figure 18. Example of Templates with Defined Values

    8. Macros

    Zabbix supports the creation of macros. Macros are variables that resolve to a specific value depending on the context in which they are used. Using macros saves time and simplifies configuration. Macros can be used, for example, in items, such as “item.key[server_{HOST.HOST}_local]”. Effective use of macros makes configuration clearer.
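
    As a minimal sketch of what macro resolution means, the snippet below substitutes {HOST.HOST} in the item key quoted above. Zabbix performs this substitution internally; the function and the host name web01 are hypothetical:

```python
def expand_macros(text, context):
    """Replace each {NAME} placeholder with its value from the context."""
    for name, value in context.items():
        text = text.replace("{" + name + "}", value)
    return text


key_template = "item.key[server_{HOST.HOST}_local]"
resolved = expand_macros(key_template, {"HOST.HOST": "web01"})
print(resolved)  # item.key[server_web01_local]
```

    The same key template can therefore sit in one template and still produce a host-specific key for every host it is applied to.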

    9. Users

    All Zabbix users access the application through the web interface. Each user is assigned a unique login name and password. User accounts can be defined locally or, for example, using LDAP. Communication between the user and the Zabbix web server can be secured using SSL/TLS (HTTPS).

    Figure 19. List of Created Users

    10. Agent

    The Zabbix agent can be deployed on the monitored device to actively monitor local resources and applications, such as hard drives, memory, CPU statistics, etc. The agent collects operational data locally and sends it to the Zabbix server for further processing. In the event of a failure (e.g., disk overflow or malfunctioning processes), the Zabbix server can immediately notify administrators of the problem. From the agent, we can also configure active monitoring tasks, such as executing the fping command to assess whether the machine can communicate with the Internet.

    Figure 20. List of available agents depending on the system

    Figure 21. Checking with the Zabbix agent whether the virtual machine’s system is not in ReadOnly state

    Figure 22. Example of Zabbix agent configuration

    11. Proxy

    Zabbix Proxy is a module that can collect performance and availability data on behalf of the Zabbix server. The proxy can take over some of the load from the Zabbix server and is invaluable in distributed installations. With proxies, we can collect monitoring data at multiple locations and forward all of it to a single Zabbix server.

    Figure 23. Example of using a proxy server

    12. Discovery

    Instead of manually adding hosts or agents, Zabbix also offers a host auto-discovery feature. This can be achieved using SNMP by scanning a specific subnet or through agent auto-registration. Auto-discovery automates the process of adding new hosts to the system.

    Figure 24. Setting up auto-discovery rules
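
    Conceptually, subnet-based discovery walks through the addresses of a range, probes each one, and registers the hosts that respond and are not yet known. The sketch below models only that loop; the probe is passed in as a function (in Zabbix it would be an SNMP, ICMP, or agent check), and the subnet, addresses, and function names are assumptions:

```python
import ipaddress


def discover_hosts(subnet, is_alive, known_hosts):
    """Return addresses in `subnet` that respond to the probe and are not yet known."""
    discovered = []
    for ip in ipaddress.ip_network(subnet).hosts():
        address = str(ip)
        if address not in known_hosts and is_alive(address):
            discovered.append(address)
    return discovered


# Stand-in probe for the example: pretend only these two devices answer.
responding = {"192.168.10.1", "192.168.10.2"}
new_hosts = discover_hosts(
    "192.168.10.0/29",
    is_alive=lambda address: address in responding,
    known_hosts={"192.168.10.1"},
)
print(new_hosts)
```

    Keeping the probe separate from the scan loop mirrors how a discovery rule combines an IP range with one or more checks.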

    13. API

    Zabbix provides an API that enables automation of system configuration and interaction. The Zabbix API operates based on HTTP requests and data encoded in JSON format. It can be used for automatically creating hosts, items, triggers, and generating reports.
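
    A request to the API is a JSON-RPC 2.0 message sent over HTTP POST. The sketch below only builds such a request for the host.get method and does not send it. The server URL and token are placeholders, and passing the token in an Authorization header is an assumption that holds for newer Zabbix releases; older versions expect it in an "auth" field of the request body instead:

```python
import json
import urllib.request


def build_api_request(url, method, params, token, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 request for the Zabbix API."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json-rpc",
            "Authorization": f"Bearer {token}",  # assumption: newer Zabbix versions
        },
        method="POST",
    )


request = build_api_request(
    url="https://zabbix.example.com/api_jsonrpc.php",  # placeholder server
    method="host.get",
    params={"output": ["hostid", "host"]},
    token="REPLACE_WITH_API_TOKEN",                    # placeholder token
)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

    The same request builder works for other methods, since only the method name and the params object change between calls.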

    14. Latest Data

    In the Monitoring => Latest Data tab, we can check the most recent data received from agents or SNMP. This tab is useful for verifying the accuracy of the data received, such as resource usage information for virtual machines. Here, you can also check whether the data is arriving on time and if there are any communication issues.

    Figure 25. Data received regarding disk I/O on virtual machines

    15. Reports

    The Reports tab allows users to generate custom-defined reports. With these reports, administrators can assess the state of the network, see how many triggers have been fired, understand which SLAs are being met, and evaluate the downtime of individual resources. An example of a report is the SLA report, which provides information about the availability levels of services.

    Figure 26. Example SLA Report

    16. Integrations

    Zabbix supports a wide range of integrations with third-party systems. If a specific integration is not officially supported, users can benefit from community assistance, which offers numerous plugins and scripts. These integrations allow alerts to be sent to various tools, such as email, SMS, or messaging applications. Zabbix can be integrated with LDAP for centralized user management, as well as with SMTP servers for sending email alerts. It is also possible to send SMS notifications through telecommunications operators.

    Summary

    Zabbix is a powerful and versatile tool for monitoring IT infrastructure, providing full control over various components, from networks to applications and servers. With its modular architecture, flexible configuration options, and the ability to integrate with other systems, Zabbix meets the needs of both small and large organizations. Its open architecture and active community make it one of the best open-source tools in its class. Zabbix enables the automation of many processes, enhances security, and ensures the stability of the entire IT infrastructure, making it an invaluable tool in the daily management of IT environments.

    Expert Confidence

    Drawing on many years of experience, we support companies in managing their IT infrastructure effectively. Our knowledge, backed by numerous projects and active cooperation with clients, allows us to select solutions precisely tailored to the needs of each organization. We know the challenges involved in monitoring networks and servers, so our work always takes into account both performance optimization and security, regardless of the size of the enterprise.

    If you want to learn more or have doubts about which solution would be best for you, talk to our engineers!

    Author

    Olga Żulińska

    A marketing veteran with 16 years of experience in the IT industry. She gained her expertise working in value-added distribution in Poland, where every problem was an opportunity for a creative solution. Well-versed in Photoshop and Illustrator, she organizes events with her eyes closed, and gadgets are her secret weapon – always on point. She approaches marketing with her head, but never forgets her heart. She’s passionate about studying people's reactions to campaigns, and the subtle nuances of marketing psychology are her daily source of inspiration, fueling her creative superpowers as she continuously crafts better marketing campaigns.
