Node.js environment Monitoring and Alerting – Part 3 (New Relic)

Part two of the series looked at what AWS and Azure each offer as basic capabilities for telemetry viewing and alerting. Independent companies also offer products in the area of Application Performance Monitoring (APM). From what I have seen, the cloud providers have yet to create any tools as nice as those from the dedicated third-party APM vendors. Microsoft does offer Microsoft Operations Management Suite (OMS), but its scope is much larger than just APM.

I hesitate to even attempt listing the third-party APM tools, as there are quite a few and I would certainly leave off someone's favorite. You can start by looking at AppDynamics, Ruxit, Stackify and New Relic and then search for others. Loggly is another option, but it is narrower: it deals only with sending custom logging from your application up to their service, where you can set up dashboards and alerts.

Do not take this post as favoring one over the other; it is meant to introduce you to what is possible. Be sure to do your own trials and assessments to choose what works best for you. Do not read this post and immediately pick one and start using it!

Go back to the first post of the series and see the image showing that there are two components inside of your Node.js application that are sending telemetry. This post explains how that works. I will walk through the usage of New Relic and describe what is going on as we step through it.

Getting set up

The first thing to do is to provision the New Relic SaaS. Go to the New Relic web site and it will walk you through the steps to set up APM as a SaaS offering. You can choose a free trial.

One thing you are instructed to do is to install the New Relic Node.js module from npm into your project. Follow the other steps and you are almost there with basic telemetry reporting. Once the Node require() statement is in place, you can run your Node application and within minutes be viewing telemetry in the New Relic portal.

[Image: New Relic overview]

At this point you are using the built-in telemetry collection capability. You can also find a link in the console to download and install a machine metric reporting agent if you like. This will give you an overall look at things like disk, network, memory and CPU metrics.

To send custom telemetry, you need to instrument your code with custom calls. The sections below explain how you go about doing that.

Further transaction granularity

“Out of the box”, New Relic does its job of reporting all of your Web Service call transactions. You can see the HTTP/REST call breakdown by percentage and by performance.

What if you want a further breakdown of the work that is going on inside a POST route? This is where you can take advantage of the ability of New Relic to dive further into the details of what makes up a given Node.js Web Service call. New Relic provides the createTracer() function to accomplish this.

Tracers run inside of an already occurring transaction that New Relic knows about and let you report back to New Relic the breakdown of any given sub-operations. The following code shows how you would take a route, such as one for user login, and create a more detailed report of what is going on. You wrap the code for each sub-operation in a tracer block, and you need to be careful with asynchronous nested callbacks. In the code below you see that inside of the login route there is a connection to some database being made and then some other processing goes on until the response is finally sent back.

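var nr = require('newrelic');

// app, dbConnect and connInfo are assumed to be defined elsewhere
app.get('/login', function (req, res) {
   // Outer tracer: reports the database connection step
   var t1 = nr.createTracer('login:dbconnect', function () {
      dbConnect(connInfo, function () {
         // Inner tracer: reports the work done once connected
         var t2 = nr.createTracer('login:otherStuff', function () {
            // ...do some other stuff
            res.send('Welcome Ted');
            res.end();
         });
         t2();
      });
   });
   t1();
});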
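To make the care needed with asynchronous callbacks more concrete, here is a toy stand-in for the agent. It is not New Relic's code. It assumes, as I understand the agent documentation, that a tracer segment is timed from the moment createTracer() is called until the wrapped callback finishes. The fake clock and the dbConnect() helper are made up for the illustration.

```javascript
// Toy stand-in for the agent, for illustration only (not New Relic's code).
// Assumption: a segment runs from the createTracer() call until the wrapped
// callback has finished executing.
function makeFakeAgent(now) {
  const segments = [];
  return {
    segments,
    createTracer(name, callback) {
      const started = now(); // timing starts when the tracer is created
      return function (...args) {
        try {
          return callback.apply(this, args);
        } finally {
          // timing stops when the wrapped callback is done
          segments.push({ name, duration: now() - started });
        }
      };
    },
  };
}

// Fake clock so the numbers are deterministic.
let clock = 0;
const agent = makeFakeAgent(() => clock);

// Hypothetical async operation: "connecting" costs 20 ticks.
function dbConnect(done) {
  clock += 20;
  done();
}

// Passing the traced callback straight to the async call means the
// segment covers the connection time plus the callback's own work.
dbConnect(agent.createTracer('login:dbconnect', function () {
  clock += 5; // work done once connected
}));

console.log(agent.segments); // [ { name: 'login:dbconnect', duration: 25 } ]
```

Under that assumption, where you create the tracer decides what gets measured, which is why nested callbacks deserve attention. Also note that createTracer() is the call available in the agent version used in this post. Later releases, as far as I know, offer startSegment() in its place, so check the API reference for your version.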

Custom telemetry transactions

There are cases where you have code that is running outside of a Web Service call transaction. This would be code that you might have set up on some backend timer that fires to do some worker processing. New Relic can’t know about that on its own to capture what is going on, because this recurring background batch processing is unrelated to incoming Web requests. Another case is a request that comes in and gets a response right away, but queues up something to be processed later.

Here is how you would handle these situations. In either case, it is the same code. You simply write some code to tell New Relic that you have a background transaction. Then when the code completes, you call back to New Relic to tell it that the transaction completed. These then show up in the UI alongside the other transactions and are able to report their timing data.

In the following code, you can see that there is an asynchronous function with a callback that needs to have the endTransaction() call in it to notify New Relic of the completion. This allows the processing to be reported back to New Relic so you can see the telemetry data for it.

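var nr = require('newrelic');

// Run the background work once a minute
setInterval(function () {
   var bg = nr.createBackgroundTransaction('update:data', function () {
      updateData(function () {
         // Tell New Relic the background transaction has completed
         nr.endTransaction();
      });
   });
   bg();
}, 60000);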
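One thing the pattern above leaves open is what happens if the work throws or calls its callback twice. Here is a small sketch of a helper that ends the transaction exactly once either way. The helper name runAsBackgroundTransaction is my own, and the agent is passed in as a parameter (it would be the object returned by require('newrelic')) so that the sketch can also run against a stub.

```javascript
// Hypothetical helper: runs `work` inside a background transaction and
// makes sure endTransaction() is called exactly once.
// `agent` is assumed to expose createBackgroundTransaction() and
// endTransaction() as used in the code above.
function runAsBackgroundTransaction(agent, name, work) {
  const wrapped = agent.createBackgroundTransaction(name, function () {
    let ended = false;
    function finish() {
      if (!ended) {
        ended = true;
        agent.endTransaction();
      }
    }
    try {
      work(finish); // `work` calls finish() when its async processing is done
    } catch (err) {
      finish();     // a synchronous failure still ends the transaction
      throw err;
    }
  });
  wrapped();
}
```

With this in place, the timer body would shrink to something like runAsBackgroundTransaction(nr, 'update:data', updateData), assuming updateData takes a completion callback as it does above.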

If your Node.js project uses some type of TCP/IP socket transmission, then the New Relic instrumentation cannot really tell when a transaction begins and ends. For example, if you are using Socket.IO and are passing messages back and forth, you will have code that responds to these messages. It is in those message handlers that you insert New Relic endTransaction() calls, so that New Relic knows how to break the work up into transactions and report on them.
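A rough sketch of what that can look like for a message handler follows. It assumes the agent exposes createWebTransaction(url, handler) alongside endTransaction(), which is not shown elsewhere in this post, so verify it against the API of your agent version. The helper names, the event name and the transaction name are all made up for the example.

```javascript
// Hypothetical helper: wraps a message handler so that each incoming
// message is reported as its own transaction.
// Assumption: `agent` (the object from require('newrelic')) provides
// createWebTransaction(url, handler), returning a wrapped function.
function tracedHandler(agent, name, handler) {
  return agent.createWebTransaction(name, function (message) {
    // The handler calls done() once the message is fully processed,
    // which marks the end of the transaction.
    handler(message, function done() {
      agent.endTransaction();
    });
  });
}

// Hypothetical wiring; `socket` is anything with an on() method,
// such as a Socket.IO socket.
function attach(agent, socket, onChatMessage) {
  socket.on('chat message', tracedHandler(agent, '/websocket/chat-message', onChatMessage));
}
```

The handler decides when the message is done, so asynchronous work inside it is still counted before the transaction is closed.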

Custom metrics

There may be metric values that are not really timed transactions. They are just values that you want to be able to report on, see in reports and alert against. Examples are the number of users currently logged in or the number of entries in some backend processing queue. With one simple line addition, you can send up custom metrics. These are orthogonal to any Web request transactions that are going on. Here is code that could be in a route where a PUT operation is happening and you could report a metric value for something connected with that:

app.put('/cart/items', function (req, res) {
   // ...code to add the item to the cart and return total number so far
   newrelic.recordMetric('Custom/Cart/ItemCount', itemCount);
});
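Values such as a queue length are not tied to any request, so one option is to report them on a timer. The sketch below does that with the same recordMetric() call. The helper name, the metric name and the queue are assumptions for illustration, and the agent is passed in so the function can be exercised with a stub.

```javascript
// Hypothetical helper: reports the current length of `queue` as a custom
// metric right away and then every `intervalMs` milliseconds.
// `agent` is the object returned by require('newrelic').
function startQueueDepthReporter(agent, queue, intervalMs) {
  function report() {
    agent.recordMetric('Custom/Queue/Depth', queue.length);
  }
  report();
  const timer = setInterval(report, intervalMs);
  timer.unref(); // the reporter alone should not keep the process alive
  return function stop() {
    clearInterval(timer);
  };
}
```

The function returns a stop() callback so the reporting can be shut down cleanly, for example when the server closes.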

Custom events

New Relic will report a set of standard events it knows about. You can add additional ones to be reported. Transactions are measured executions that report the time they took and the throughput. Metrics are single measured values that are meaningful to look at for their total, average and change over time. An event, on the other hand, is a set of information related to something that happened at a point in time that you want to stamp. One example is the starting of the Node process. Others would be exceptions or errors that happened in the code. Here are some code examples:

// EVENTS like node start or critical error like backend service down
newrelic.recordCustomEvent('NodeServer_START', { level: "Information", message: "Node server was started", port: server.address().port });

// On callback err
if (err) {
   req.newrelic.recordCustomEvent("EVENT_MyBackendServiceError", { level: "Warning", TrackingID: uuid.v4(), message: "Failure processing the file.", err: err.toString() });
}
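If you record many error events, a small helper can keep the attributes consistent. The sketch below is one way to do it. The helper name is hypothetical, and it assumes that event attribute values should be plain strings, numbers or booleans, so anything else is converted to a JSON string before being sent.

```javascript
// Hypothetical helper: records an error as a custom event with flat attributes.
// Assumption: attribute values should be strings, numbers or booleans,
// so other values are serialized to JSON text first.
// `agent` is the object returned by require('newrelic').
function recordErrorEvent(agent, eventType, err, extra) {
  const attributes = {
    level: 'Warning',
    message: err && err.message ? err.message : String(err),
  };
  for (const [key, value] of Object.entries(extra || {})) {
    const kind = typeof value;
    attributes[key] =
      kind === 'string' || kind === 'number' || kind === 'boolean'
        ? value
        : String(JSON.stringify(value));
  }
  agent.recordCustomEvent(eventType, attributes);
  return attributes;
}
```

A call such as recordErrorEvent(newrelic, 'EVENT_MyBackendServiceError', err, { TrackingID: uuid.v4() }) would then produce an event shaped like the one in the example above.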

Custom metrics and events can then be viewed in a custom dashboard in the Insights portion of the New Relic UI.

Other capabilities

There are a lot of other capabilities in New Relic and the other competing offerings. For example, you can set up synthetic monitors that hit your API endpoints or run through a script to control a browser interaction, and then report on the SLA from those. These can even be scaled out and run from multiple data centers at once.

New Relic is also adding a new Alerting feature where you can set thresholds on operational telemetry and be alerted to any violations.

New Relic also has a plugin model where you can download capabilities, such as being able to combine reporting on a MongoDB database. They have dozens of technologies supported already.

Of course, my purpose in this post was to expose operational insights. If you also want insights into business metrics and customer behaviors, New Relic has you covered there as well.

Conclusion

Remember that you need to instrument your code to return telemetry in the form of events, logs and metrics. You do this so you are not flying blind! A pilot can fly a plane in the dark because of the instrumentation and telemetry in the cockpit. Instrumentation is the only way you will be able to scale and survive. The last thing you want to have to do is remote into individual servers in a cluster and start poking around to find a “needle in a haystack”.

Aim for the right amount of logging: enough information to diagnose problems, but not so much that it affects performance or leaves you with too much to search through. These APM tools make it easy to filter and watch trends.

I hope you enjoyed this series of posts on this very important topic and will be proactive about your APM implementation. In the end it will save you time and headaches if you get it implemented right. Start with something simple and build from there. You will certainly learn where your strategy is deficient as you go along and iterate to a great solution.

Node.js environment Monitoring and Alerting – Part 1

Node.js environment Monitoring and Alerting – Part 2 (Application Insights and AWS CloudWatch)

About Bushman

Living a purposeful life.