Linux System Monitoring Tools (cyberciti.biz)
118 points by hiteshiitk on Nov 15, 2010 | 15 comments


I suggest dstat over vmstat: it has color-coded output and abbreviates units automatically. It's easy to add columns or monitor specific devices or interfaces.

http://dag.wieers.com/home-made/dstat/

I suggest OpenNMS as a Cacti and Nagios alternative. It eliminates most of the manual configuration: it can automatically detect nodes and services, and if you give it SNMP information it can monitor the specifics of each machine. I've used it to monitor hundreds of machines, but it can be resource intensive.

http://www.opennms.org/

iftop is also a nice lightweight alternative to iptraf and helps track down bandwidth-heavy hosts and connections.

http://ex-parrot.com/pdw/iftop/


These tools are great for looking at what's happening now if you're logged into the server.

They're complemented by monitoring products like:

Self hosted:

- Nagios (already mentioned in the post)

- Cacti / Munin

Hosted:

- http://www.serverdensity.com (tool my company produces)

- http://www.cloudkick.com (monitoring + cloud infrastructure management)

- http://www.scoutapp.com

These give you similar metrics plus various other things like alerting, graphs, mobile apps, etc.


http://librato.com is another hosted product (disclosure: I work on this) for systematically monitoring/managing applications.


Does anybody actually use top rather than htop? htop is the first thing I install on every system I build.

Something I've become very fond of recently is Monit, which doesn't appear to be on the list. I've found it very reassuring to have Monit set up and watching the processes on my server.


I came here to say the same thing. htop is streets ahead of top.


atop is fun too


Another one I find quite useful is iotop:

http://guichaz.free.fr/iotop/

Very handy to quickly see what process is causing that disk thrashing, for example.


For people who care about security, I would add these monitoring tools:

- OSSEC - log + file system security monitoring (http://ossec.net)

- Snort - network-based IDS (http://snort.org)

- Sucuri (not free) - web site monitoring (http://sucuri.net)


I have also seen Munin, which provides robust monitoring.


ps_mem.py determines how much RAM each program is currently using. It is useful when top fails to report actual memory use because pages are shared copy-on-write among multiple processes.

http://www.pixelbeat.org/scripts/ps_mem.py

http://wiki.apache.org/spamassassin/TopSharedMemoryBug
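To illustrate why shared pages confuse top, here is a minimal Python sketch that sums the `Pss` (proportional set size) lines from `/proc/<pid>/smaps`, where each shared page is divided among the processes mapping it. This is not ps_mem.py's actual code, only the general idea; the function names `pss_kib` and `pss_for_pid` are made up for this example, and it assumes a Linux kernel recent enough to report `Pss`.

```python
def pss_kib(smaps_text):
    """Sum the per-mapping Pss values (in kB) found in smaps-formatted text."""
    total = 0
    for line in smaps_text.splitlines():
        key, _, rest = line.partition(":")
        # Match the exact "Pss" key only, not variants such as "Pss_Dirty".
        if key == "Pss":
            total += int(rest.split()[0])
    return total


def pss_for_pid(pid):
    """Read /proc/<pid>/smaps (Linux only; usually needs to be the owner or root)."""
    with open(f"/proc/{pid}/smaps") as f:
        return pss_kib(f.read())
```

Calling `pss_for_pid(os.getpid())` after `import os` gives the figure for the current process. Adding up Pss across all processes of one program counts each shared page once in total, whereas adding up RSS counts it once per process.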


In addition to tcpdump, I'd like to add the command 'tshark'. tshark usually comes bundled with Wireshark and allows you to use the same search capabilities as Wireshark from the command line. I find it much easier to use than tcpdump, especially if you already have experience with Wireshark.


When it comes to Wireshark and remote servers, I often do this:

    ssh root@someserver "/tmp/tcpdump -i any -p -s0 -w - not port 22" | \
        wireshark -i - -k


I often use basic command-line tools (vm/io/snmpstat, fiddling with /proc using cat/cut) and chart the results along the way, in real time, with this little tool: http://freshmeat.net/projects/trend
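For anyone who would rather script the /proc part than use cat/cut, here is a rough Python sketch of one such number: the busy CPU fraction between two readings of the first line of `/proc/stat`. The function names are invented for this example, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
def _cpu_fields(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of jiffy counters."""
    parts = stat_line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    # Keep user..steal only; the guest fields are already included in user/nice.
    return [int(x) for x in parts[1:9]]


def cpu_busy_fraction(prev_line, curr_line):
    """Fraction of CPU time spent non-idle between two samples (0.0 to 1.0)."""
    prev, curr = _cpu_fields(prev_line), _cpu_fields(curr_line)
    total = sum(curr) - sum(prev)
    # Fields 3 and 4 are idle and iowait; older kernels may lack iowait.
    idle = sum(curr[3:5]) - sum(prev[3:5])
    return 0.0 if total <= 0 else (total - idle) / total
```

Read the first line of `/proc/stat` twice, a second or so apart, and pass both lines in. The result is a single value per interval that can be printed and piped into a charting tool.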


Wait, you don't just look at the load average?


After you see that you have a high load average, these are the tools you would use to track down why.
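As a small illustration of that first step, here is a Python sketch that reads the load average and scales it by the CPU count, since a load of 4 means something different on 2 cores than on 16. The function names are hypothetical, and the "above 1.0 per CPU is worth investigating" rule of thumb is an assumption rather than a hard threshold.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def load_per_cpu(load, cpus):
    """Scale a load average by the number of CPUs (treating 0 CPUs as 1)."""
    return load / max(cpus, 1)


def current_load_per_cpu():
    """Read the live 1-minute load on Linux and scale it by the CPU count."""
    with open("/proc/loadavg") as f:
        one, _, _ = parse_loadavg(f.read())
    return load_per_cpu(one, os.cpu_count() or 1)
```

If `current_load_per_cpu()` stays well above 1.0, that is the cue to reach for top/htop, iotop, iftop and the rest to find out which resource is the bottleneck.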



