Application Monitoring Dashboard Solution

By Sonu K. Meena

In this post I’ll discuss some of the options available for application monitoring. As your application grows, your technology stack grows with it, and so do the surprises that come along. These often lead to outages, which I presume no organization, small or big, can afford.

So, let’s get started.

Application Monitoring: At a minimum, this means monitoring whether the application is up; at the other end it extends to full application performance monitoring. With all these metrics in front of you, you’ll be able to gauge your infrastructure’s efficiency and scalability in a more predictable manner.

Imagine your application is getting popular and the number of users is increasing day by day. With these metrics you’ll be able to see how much RAM and CPU and how many instances are in use, and how much is still available. With current resource utilization and user growth stats in hand, you can take the necessary measures to ensure smooth operation of your service.

What are the metrics we should keep an eye on?

Let’s start with the server itself.

A few important metrics are: RAM, CPU, disk space, bandwidth, disk I/O, DB reads/writes, etc.

  • Disk I/O

    A process has to wait until the disk finishes reading its file. So, fast input/output operations on the physical disk will improve the performance of a disk-bound application significantly.

    That’s why SSDs are popular nowadays. They open files roughly 30% faster than traditional HDDs, and the file copy or write speed of an SSD is typically about 200-550 MB/s, while an HDD manages only around 50-120 MB/s.

    You can easily see the difference in turnaround time of a disk I/O heavy application just by changing this single piece of hardware. But do you really need an SSD? Look at your application metrics and answer for yourself ;)

  • RAM

    The more RAM you have, the more processes can run and stay in memory for longer. This means less disk I/O and page swapping. With hardware getting cheaper, extra RAM won’t cost much.

    However, knowing when to increase RAM is the point of this post. One option is simply to record RAM usage over the last couple of days and decide your next step from that.

  • CPU

    For computation-heavy applications, RAM alone is not enough. You need a powerful brain to process it all, so watch your CPU performance as well. Generally, if the metrics show CPU usage above 80% most of the time, you should upgrade to more powerful machines with more cores.

  • Bandwidth

    Tracking bandwidth not only helps you keep an eye on your monthly bandwidth bill but also helps you spot anomalies. Watching the bandwidth usage pattern often gives you clues about what is going wrong.
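
The 80% rule of thumb for CPU can be turned into a small check. The following is a minimal sketch in Python, assuming you already have a list of usage samples in percent (how you collect them is up to you). The function name `needs_upgrade` and the `min_fraction` parameter are illustrative choices, not part of any tool mentioned in this post, and "most of the time" is interpreted as "more than half of the samples".

```python
def needs_upgrade(samples, threshold=80.0, min_fraction=0.5):
    """Return True if usage exceeds `threshold` in more than `min_fraction` of samples.

    `samples` is a sequence of usage percentages (0-100).
    The default min_fraction of 0.5 is an assumption; tune it to your needs.
    """
    if not samples:
        return False  # no data, no decision
    busy = sum(1 for s in samples if s > threshold)
    return busy / len(samples) > min_fraction


# Hypothetical samples, e.g. one CPU reading per hour
cpu_samples = [85, 90, 78, 92, 88, 60, 95, 83]
print(needs_upgrade(cpu_samples))  # prints True (6 of 8 samples are above 80)
```

The same check works for RAM usage if you feed it memory utilization percentages instead.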

Next step: how do you get these metrics?

Start with the web server. If you use nginx, gather its statistics. Then gather the app server statistics. If you run RoR, you may have used or tried unicorn, in which case you might need to dig into its log file to extract metrics.

  • Phusion Passenger

    Phusion Passenger makes all this easy, as it is both a web server and an app server. It provides administration tools that allow you to detect whether an application is stuck and non-responsive.

    It lets you watch and monitor many application-level, process-level and system-level statistics from a central place:

    • CPU, memory and swap usage, both system-wide and per-process.
    • Connections and requests.
    • Hypervisor VM interference rates.
    • Load averages, fork rates and swap rates.
    • Application-level backtraces.

    The best part is the easy integration with external tools: these statistics are queryable over an HTTP JSON API.

    These stats can be pushed into centralized storage (MongoDB or MySQL).

  • CollectD

    collectd is a daemon that collects system performance statistics periodically and provides mechanisms to store the values in a variety of ways, for example in RRD files. Out of the box it provides a multitude of plugins (cpu, memory, network, disk, etc.). For bandwidth usage you can look at the collectd-network-bandwidth-usage plugin.

    These stats can be collected in MongoDB through the Write MongoDB plugin or pushed into Graphite through the collectd-carbon plugin.
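
To make the "push into Graphite" step more concrete: Carbon accepts a simple plaintext protocol, one line per data point in the form `metric.path value timestamp`, by default on TCP port 2003. The sketch below shows roughly what ends up on the wire. The host name, the metric path and the helper names are made up for illustration, and in practice a collectd plugin does this work for you.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one data point in Carbon's plaintext format."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_carbon(lines, host="graphite.example.com", port=2003):
    """Send preformatted lines to a Carbon listener (host is a placeholder)."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


# Hypothetical metric name; nothing is sent until send_to_carbon() is called.
line = carbon_line("servers.web01.cpu.user", 42.5, timestamp=1400000000)
print(repr(line))  # prints 'servers.web01.cpu.user 42.5 1400000000\n'
```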

How to create a monitoring dashboard?

Let me go from minimal to advanced monitoring options that can be used to monitor your application and the servers it runs on.

  • Minimum Application Monitoring: (Monit)

    Monit is a small open source utility for managing and monitoring Unix systems. It conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.

    It can be used to monitor your endpoints, and it provides a nice, simple web GUI where you can see the live status of application interfaces and services.

    If any service goes down or resource utilization crosses a threshold, it takes the action you specified. The action can be restarting the crashed service, sending mail to the system admin, or executing a script or command you provide.

    Monit can help you if you have a small infrastructure. But as the infrastructure grows, handling a dashboard for every server gets tiresome. To ease this you can use Monittr.

    Monittr provides a Ruby interface for the Monit systems management system. Its main goal is to aggregate statistics from multiple Monit instances and display them in an attractive web interface.

    For live monitoring and timely alerts on service crashes, downtime or excessive resource utilization, Monit can serve your purpose.

    If you want to read resource utilization patterns, gauge your infrastructure growth and scale accordingly, you need to store all these metrics in centralized storage or a database. The next solution lets you read past stored data as well.
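
The check-then-act behaviour described for Monit can be pictured with a few lines of Python. This is only a sketch of the idea, not how Monit is implemented or configured; the URL, the `restart_service` callback and the function names are hypothetical.

```python
import urllib.error
import urllib.request


def is_alive(url, timeout=5):
    """Return True if the endpoint answers with an HTTP 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False


def check_and_act(url, on_failure, probe=is_alive):
    """Probe the endpoint once and call `on_failure(url)` if it is down."""
    alive = probe(url)
    if not alive:
        on_failure(url)
    return alive


def restart_service(url):
    # Placeholder action: a real setup might restart a process or send mail.
    print(f"{url} is down, taking action")


# Demo with a fake probe so the example runs without network access.
check_and_act("http://localhost:3000/health", restart_service, probe=lambda u: False)
```

Passing the probe in as a parameter keeps the decision logic separate from the network call, which also makes it easy to try out without a running service.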

Recently I came across an application monitoring architecture used by one of the pioneering bed & breakfast service providers. The solution below is drawn from it.

  • Monitoring like a boss (Design Yourself)

Phusion Passenger        CollectD
        +                    +
        |                    |
        |                    |
        v                    v
     MongoDB        Carbon (Graphite)
           |            |
           |  JSON API  |
           |            |
           v            v
          +--------------+
          |  Your cool   |
          |  Dashboard   |
          +--------------+

Send Passenger data into MongoDB. Send CollectD metrics into the Graphite database (Whisper). Graphite provides a render URL API that can return JSON, which gives you a data endpoint for near-real-time metrics.

Now, with JSON data collected from MongoDB and Graphite, you can expose a common central JSON API endpoint that powers your dashboard and visualizes the metrics.
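
As a rough illustration of the Graphite side: the render API returns JSON when called with `format=json`, as a list of series, each holding a `target` name and `datapoints` made of `[value, timestamp]` pairs. The sketch below builds such a URL and flattens a response. The base URL and metric name are placeholders, and the sample response is hand-written, not real data.

```python
import json
from urllib.parse import urlencode


def render_url(base, target, since="-1h"):
    """Build a Graphite render URL that asks for JSON output."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return f"{base}/render?{query}"


def parse_series(body):
    """Turn a render API JSON body into {target: [(timestamp, value), ...]}.

    Graphite reports missing samples as null; those are dropped here.
    """
    result = {}
    for series in json.loads(body):
        points = [(ts, val) for val, ts in series["datapoints"] if val is not None]
        result[series["target"]] = points
    return result


# Hand-written sample in the shape the render API returns.
sample = (
    '[{"target": "servers.web01.cpu.user", '
    '"datapoints": [[12.5, 1400000000], [null, 1400000060], [14.0, 1400000120]]}]'
)
print(render_url("http://graphite.example.com", "servers.web01.cpu.user"))
print(parse_series(sample))
```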

A little on choosing technology for building the dashboard

The big problem can be broken down into small sub-problems:

  Get Data -> Store data ->  Provide JSON Interface -> Pull & Visualize Data
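
The "Provide JSON Interface" stage could look like the following sketch, which uses only Python's standard library. The two `load_*` functions are stand-ins that return hard-coded, made-up values; in a real setup they would query MongoDB and Graphite. The `/metrics` path, the port and all field names are assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_passenger_stats():
    # Stand-in for a MongoDB query; values are made up.
    return {"requests_in_queue": 0, "processes": 4}


def load_system_stats():
    # Stand-in for a Graphite render API call; values are made up.
    return {"cpu_user": 12.5, "memory_used_mb": 2048}


def build_payload():
    """Merge both sources into the single document the dashboard pulls."""
    return {"passenger": load_passenger_stats(), "system": load_system_stats()}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = json.dumps(build_payload()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve http://localhost:<port>/metrics until interrupted."""
    HTTPServer(("localhost", port), MetricsHandler).serve_forever()


# Show the document a dashboard would receive; call serve() to expose it over HTTP.
print(json.dumps(build_payload()))
```

A dashboard widget then only needs to poll one URL, regardless of how many backends feed it.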

For building a realtime dashboard you can use any JavaScript MVC framework, such as AngularJS or Ember.js.

For creating widgets (gauge, pie chart, geomap, etc.) you can use AngularJS template directives or Ember.js components.

While these approaches are fine, I would like to introduce web components here, especially Polymer.

Web Components usher in a new era of web development based on encapsulated and interoperable custom elements that extend HTML itself. Built atop these new standards, Polymer makes it easier and faster to create anything from a button to a complete application across desktop, mobile, and beyond.

With Polymer you can create custom widgets very cleanly. The best part is that you can reuse them in your next cool projects.

So, by plugging the pieces together through standard interfaces, you can build a nice application monitoring dashboard for your infrastructure.