netdata: Linux Real Time Performance Monitoring

netdata should be installed on each of your Linux servers. It is the equivalent of the monitoring agent provided by other monitoring solutions. netdata is a highly optimized Linux daemon that provides real-time performance monitoring for Linux systems, applications, and SNMP devices, over the web. It is useful for visualizing what is happening right now on your systems and applications.

This is what you get:

  • Stunning interactive bootstrap dashboards
    mouse and touch friendly, in 2 themes: dark, light
  • Amazingly fast
    responds to all queries in less than 0.5 ms per metric, even on low-end hardware
  • Highly efficient
    collects thousands of metrics per server per second, with just 1% CPU utilization of a single core, a few MB of RAM and no disk I/O at all
  • Sophisticated alarming
    supports dynamic thresholds, hysteresis, alarm templates, multiple role-based notification methods (such as email)
  • Extensible
    you can monitor anything you can get a metric for, using its Plugin API (anything can be a netdata plugin, BASH, python, perl, node.js, java, Go, ruby, etc)
  • Embeddable
    it can run anywhere a Linux kernel runs (even IoT) and its charts can be embedded on your web pages too
  • Customizable
    custom dashboards can be built using simple HTML (no javascript necessary)
  • Zero configuration
    auto-detects everything, it can collect up to 5000 metrics per server out of the box
  • Zero dependencies
    it is even its own web server, for its static web files and its web API
  • Zero maintenance
    you just run it, it does the rest
  • Scales to infinity
    requiring minimal central resources
  • Back-ends supported
    can archive its metrics on graphite or opentsdb, in the same or lower detail (lower: to prevent it from congesting these servers due to the amount of data collected)
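
The "own web server" and "web API" items above mean that a running agent can be queried over plain HTTP. Below is a minimal Python sketch of such a query. It assumes the agent listens on port 19999 and exposes a `/api/v1/data` endpoint that accepts `chart`, `after`, `points` and `format` parameters; none of these details appear in the list above, so check them against your installed version. The helper names `build_data_url` and `fetch_chart` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_data_url(host, chart, seconds=60, points=60, port=19999):
    """Build a query URL for one chart (endpoint and port are assumptions)."""
    query = urlencode({
        "chart": chart,       # chart id, e.g. "system.cpu"
        "after": -seconds,    # negative value: seconds relative to now
        "points": points,     # number of data points to return
        "format": "json",
    })
    return f"http://{host}:{port}/api/v1/data?{query}"


def fetch_chart(host, chart, **kwargs):
    """Fetch and decode the JSON response; needs a reachable agent."""
    with urlopen(build_data_url(host, chart, **kwargs), timeout=5) as resp:
        return json.load(resp)
```

Calling `fetch_chart("localhost", "system.cpu")` on a server running the agent should return the last minute of CPU data as a Python dictionary, provided the assumptions above hold.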

This is what it currently monitors (most with zero configuration):

  • CPU
    usage, interrupts, softirqs, frequency, total and per core
  • Memory
    RAM, swap and kernel memory usage, including KSM (Kernel Samepage Merging), the kernel memory deduplicator
  • Disks
    per disk: I/O, operations, backlog, utilization, space
  • Network interfaces
    per interface: bandwidth, packets, errors, drops
  • IPv4 networking
    bandwidth, packets, errors, fragments, tcp: connections, packets, errors, handshake, udp: packets, errors, broadcast: bandwidth, packets, multicast: bandwidth, packets
  • IPv6 networking
    bandwidth, packets, errors, fragments, ECT, udp: packets, errors, udplite: packets, errors, broadcast: bandwidth, multicast: bandwidth, packets, icmp: messages, errors, echos, router, neighbor, MLDv2, group membership, break down by type
  • Interprocess Communication – IPC
    such as semaphores and semaphore arrays
  • netfilter / iptables Linux firewall
    connections, connection tracker events, errors
  • Linux DDoS protection
    SYNPROXY metrics
  • fping latencies
    for any number of hosts, showing latency, packets and packet loss
  • Processes
    running, blocked, forks, active
  • Entropy
    random numbers pool, used in cryptography
  • NFS file servers and clients
    NFS v2, v3, v4: I/O, cache, read ahead, RPC calls
  • Network QoS
    the only tool that visualizes network tc classes in realtime
  • Linux Control Groups
    containers: systemd, lxc, docker
  • Applications
    by grouping the process tree and reporting CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets – per group
  • Users and User Groups resource usage
    by summarizing the process tree per user and group, reporting: CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets
  • Apache and lighttpd web servers
    mod-status (v2.2, v2.4) and cache log statistics, for multiple servers
  • Nginx web servers
    stub-status, for multiple servers
  • Tomcat
    accesses, threads, free memory, volume
  • MySQL databases
    multiple servers, each showing: bandwidth, queries/s, handlers, locks, issues, tmp operations, connections, binlog metrics, threads, innodb metrics, and more
  • Postgres databases
    multiple servers, each showing: per database statistics (connections, tuples read – written – returned, transactions, locks), backend processes, indexes, tables, write ahead, background writer and more
  • Redis databases
    multiple servers, each showing: operations, hit rate, memory, keys, clients, slaves
  • memcached databases
    multiple servers, each showing: bandwidth, connections, items
  • ISC Bind name servers
    multiple servers, each showing: clients, requests, queries, updates, failures and several per view metrics
  • Postfix email servers
    message queue (entries, size)
  • exim email servers
    message queue (emails queued)
  • Dovecot POP3/IMAP servers
  • IPFS
    bandwidth, peers
  • Squid proxy servers
    multiple servers, each showing: clients bandwidth and requests, servers bandwidth and requests
  • Hardware sensors
    temperature, voltage, fans, power, humidity
  • NUT and APC UPSes
    load, charge, battery voltage, temperature, utility metrics, output metrics
    multiple instances, each reporting connections, requests, performance
  • hddtemp
    disk temperatures
  • SNMP devices
    can be monitored too (although you will need to configure these)

And you can extend it, by writing plugins that collect data from any source, using any computer language.
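
To make the plugin idea concrete, here is a minimal Python sketch of what an external plugin's output might look like. It assumes netdata's text protocol for external plugins, in which a plugin prints `CHART` and `DIMENSION` lines once and then repeats `BEGIN` / `SET` / `END` blocks on standard output; the exact fields accepted on each line should be checked against the plugin documentation for your version. The chart id `example.random`, the function names and the `read_values` callback are all hypothetical.

```python
import time


def chart_definition(chart_id, title, units, dimension_ids):
    """Return the lines that declare one chart and its dimensions.

    Assumed line formats: CHART type.id name title units
                          DIMENSION id name algorithm multiplier divisor
    Titles and units must not contain single quotes in this sketch.
    """
    lines = [f"CHART {chart_id} '' '{title}' '{units}'"]
    for dim in dimension_ids:
        lines.append(f"DIMENSION {dim} '' absolute 1 1")
    return lines


def update_lines(chart_id, values):
    """Return one BEGIN/SET/END block carrying the current values."""
    lines = [f"BEGIN {chart_id}"]
    for dim, value in values.items():
        lines.append(f"SET {dim} = {int(value)}")  # values sent as integers
    lines.append("END")
    return lines


def run(chart_id, title, units, dimension_ids, read_values,
        interval=1.0, iterations=None):
    """Print the definition once, then one update block per interval.

    read_values is a caller-supplied function returning {dimension: value}.
    iterations=None loops forever, as a long-running plugin would.
    """
    print("\n".join(chart_definition(chart_id, title, units, dimension_ids)),
          flush=True)
    count = 0
    while iterations is None or count < iterations:
        print("\n".join(update_lines(chart_id, read_values())), flush=True)
        count += 1
        time.sleep(interval)
```

A script that calls `run("example.random", "A random number", "value", ["random"], lambda: {"random": 42})` would emit one chart with a single dimension. Because the protocol is plain text on standard output, the same structure can be written in BASH, perl, Go or any other language.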

