Streaming HTML5 video through Node.js

To support streaming HTML5 video, the server side needs to handle headers sent in by the browser. So unfortunately you can’t just read the video bytes and send everything back in the response.

The main thing to note here is that the browser may send in a Range HTTP header, which will specify what byte range from the video the browser is requesting. If the range is missing, we can send the whole video starting from byte 0. If it’s there, we’ll want to send only the range of bytes requested. The range header will be in this format:

Range: bytes=0-

…meaning it’s requesting the whole video (or at least it’s the initial request so the size of the video can be determined by the browser from the Content-Length header in the response).

Or:

Range: bytes=5000-10000

…meaning the browser is requesting bytes 5000 through 10000 of the video, both inclusive (the user may have skipped ahead).
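Here’s a minimal sketch of parsing those two forms, plus the suffix form (bytes=-500, meaning the last 500 bytes), which browsers may also send. The parseRange() name and its return shape are my own assumptions for illustration, and it only handles a single range, not a comma-separated list:

```javascript
// Hypothetical helper (not part of the original code): parse a Range header
// against a file of `size` bytes. Returns null when the range is unusable.
function parseRange(header, size) {
  var match = /^bytes=(\d*)-(\d*)$/.exec(header || '');
  if (!match || (match[1] === '' && match[2] === '')) return null;

  var start, end;
  if (match[1] === '') {
    // Suffix form "bytes=-500": the last 500 bytes of the file.
    var suffix = parseInt(match[2], 10);
    if (suffix === 0) return null;
    start = Math.max(size - suffix, 0);
    end = size - 1;
  } else {
    start = parseInt(match[1], 10);
    // An open-ended range ("bytes=5000-") runs to the last byte;
    // an end past the file is clamped to the last byte.
    end = match[2] === '' ? size - 1 : Math.min(parseInt(match[2], 10), size - 1);
  }
  if (start > end || start >= size) return null;
  return { start: start, end: end };
}

console.log(parseRange('bytes=0-', 20000));         // { start: 0, end: 19999 }
console.log(parseRange('bytes=5000-10000', 20000)); // { start: 5000, end: 10000 }
```

Returning null for anything unusable leaves it up to the caller whether to answer with an error or fall back to sending the whole file.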

It’s also important to note the response headers sent back from the server. These should include:

Accept-Ranges: bytes
Content-Type: video/mp4
Content-Length: (length)
Content-Range: bytes (start)-(end)/(total)

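Accept-Ranges tells the browser that the server can take byte-range requests. Content-Length is the number of bytes in this response body: the whole file for a full response, or just the size of the requested range for a partial one. And Content-Range gives the range of the content being returned plus the total file size, in bytes; it is only sent with a partial (206) response.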
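To make the arithmetic concrete, here’s a small sketch that builds the headers for a partial response. The rangeHeaders() helper is a hypothetical name of mine, and it assumes start and end are both inclusive byte positions:

```javascript
// Hypothetical helper: headers for a 206 response covering bytes start..end
// (both inclusive) of a file that is `total` bytes long.
function rangeHeaders(start, end, total, contentType) {
  return {
    'Accept-Ranges': 'bytes',
    'Content-Type': contentType,
    'Content-Length': end - start + 1,
    'Content-Range': 'bytes ' + start + '-' + end + '/' + total
  };
}

var headers = rangeHeaders(5000, 10000, 20000, 'video/mp4');
console.log(headers['Content-Length']); // 5001
console.log(headers['Content-Range']);  // bytes 5000-10000/20000
```

Note the + 1: because both ends are inclusive, bytes 5000 through 10000 are 5001 bytes, not 5000.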

So in our Node.js API that handles the HTML5 video streaming requests, we need to be able to handle these headers, and ship the video file back accordingly in ranges (if so requested).

Here’s what the code would look like:

var fs = require('fs');

exports.stream = function(req, res) {
  var fileName = req.params.fileName ? req.params.fileName : null;
  if(!fileName)
    return res.status(404).send();

  fs.stat(fileName, function(err, stats) {
    if (err) {
      // A missing file is a 404; any other stat failure is a server error
      if (err.code === 'ENOENT') {
        return res.status(404).send();
      }
      return res.status(500).send();
    }

    var start;
    var end;
    var total = 0;
    var contentRange = false;
    var contentLength = 0;

    var range = req.headers.range;
    if (range)
    {
      var positions = range.replace(/bytes=/, "").split("-");
      start = parseInt(positions[0], 10);
      total = stats.size;
      end = positions[1] ? parseInt(positions[1], 10) : total - 1;
      var chunksize = (end - start) + 1;
      contentRange = true;
      contentLength = chunksize;
    }
    else
    {
      start = 0;
      end = stats.size - 1; // createReadStream's end is inclusive
      contentLength = stats.size;
    }

    if(start<=end)
    {
      var responseCode = 200;
      var responseHeader =
      {
        "Accept-Ranges": "bytes",
        "Content-Length": contentLength,
        "Content-Type": "video/mp4"
      };
      if(contentRange)
      {
        responseCode = 206;
        responseHeader["Content-Range"] = "bytes " + start + "-" + end + "/" + total;
      }
      res.writeHead(responseCode, responseHeader);

      var stream = fs.createReadStream(fileName, { start: start, end: end })
        .on("readable", function() {
          var chunk;
          while (null !== (chunk = stream.read(1024))) {
            res.write(chunk);
          }
        }).on("error", function(err) {
          res.end(); // headers are already sent, so just close the response
        }).on("end", function(err) {
          res.end();
        });
    }
    else
    {
      return res.status(416).send(); // 416 Range Not Satisfiable
    }
  });
};

Let’s analyze this code briefly. There is a function called stream() which is passed the request and response. The function looks for a request parameter named fileName, though you can pass around a file identifier, or whatever you please, as long as you have a way to map it to an exact file path on disk.

First we check whether the HTTP headers include the Range header. If they do, we can assume the browser requested only a certain range of bytes, so we proceed accordingly. Otherwise, if the header is not present, we plan on shipping the whole file back. Node.js’s fs.createReadStream() lets you create a read stream from a file while specifying the byte to start at and the byte to end at, both inclusive (as directed by the browser’s request, or the whole file). And then we ship that stream back to the browser.

Note: I did this a while back so I’m forgetting the exact reasons now, but the chunked return is important, otherwise the browser acts funny when playing the video. You may want to adjust the chunk size, but I find 1024 works nicely.

And that’s it!

Skip certain file extensions from Morgan HTTP logger for Express.js

Morgan is a useful HTTP request logger middleware for Express.js, which plugs in nicely to Node.js and the MEAN stack. More info on Morgan at https://github.com/expressjs/morgan

One useful feature is the ability to add a filter to skip certain files that you don’t want logged. For example, you may not want a log entry for every single GET of an image file.

First you define a filter function that returns a boolean for certain file extensions:

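function skipLog (req, res) {
  var url = req.url;
  // Strip the query string so only the path is tested
  if(url.indexOf('?') !== -1)
    url = url.substring(0, url.indexOf('?'));
  // The leading \. keeps paths that merely end in "js" or "css" from matching
  return /\.(js|jpg|png|ico|css|woff|woff2|eot)$/i.test(url);
}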

The function above will return true for any file with the extension .js, .jpg, .png (and so on…). Note: you return true for the requests you want skipped, because Morgan skips a request when the skip function evaluates to true. Also note that the code strips any query string from the URL before testing the extension.

Then to use it, you would initialize Morgan like so when setting it up in Express.js:

var morgan = require('morgan');
var express = require('express');
var app = express();
//... (accessLogStream is a writable stream created elsewhere, e.g. with fs.createWriteStream)
app.use(morgan('combined', {stream: accessLogStream, skip: skipLog}));
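As an aside on the skip filter: if you’d rather not maintain a regex, here’s a sketch of the same check using Node’s built-in path module. The hasSkippedExtension() name is hypothetical; you’d call it from your skip function with req.url:

```javascript
var path = require('path');

// Extensions to leave out of the log (same list as the filter above).
var SKIPPED_EXTENSIONS = ['.js', '.jpg', '.png', '.ico', '.css', '.woff', '.woff2', '.eot'];

// Hypothetical helper: decide from a request URL whether to skip logging.
function hasSkippedExtension(url) {
  var pathname = url.split('?')[0];                     // drop any query string
  var ext = path.posix.extname(pathname).toLowerCase(); // e.g. ".PNG" -> ".png"
  return SKIPPED_EXTENSIONS.indexOf(ext) !== -1;
}

console.log(hasSkippedExtension('/assets/logo.PNG?v=3'));    // true
console.log(hasSkippedExtension('/api/nodejs'));             // false
console.log(hasSkippedExtension('/api/users?sort=name.js')); // false
```

Comparing whole extensions means a path like /api/nodejs, which only happens to end in "js", still gets logged.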

And that’s it!

MEAN application, getting all sockets during a request

In my MEAN application I needed to get the list of all open websockets during a particular request. It turns out that with socket.io this is easily doable if you save a reference to the socketio variable during initialization.

For a typical socket.io init in Express, the app.js would contain the following code:

var app = express();
var socketio = require('socket.io')(server, {
    path: '/socket.io-client'
});
require('./config/socketio')(socketio);

So what you need to do is save a reference to the “socketio” variable, which can be retrieved from the request passed down in Express, and used to get all sockets. To do so, you can set it as an application-wide variable in Express like so in app.js:

app.set('socketio', socketio);

Next, a typical Express function handling a request looks like so:

function(req, res) {
}

From your request (the “req” variable) you can get a reference to the Express app, which allows you to get a reference to the socketio variable you set application wide, which you can use to get the list of sockets. Like so:

function(req, res)  {
    var sockets = req.app.get('socketio').sockets.sockets;
    //...
}

And now you can loop through all the sockets like below, and do whatever with them you’d like:

for (var socketId in sockets) {
    var socket = sockets[socketId];
}

As easy as that!

socket.io: Getting all connected sockets in Node/Express

Sometimes you need a list of all connected sockets in your server side API. This could be, for example, if you want to loop through all the sockets and emit something when a certain event happens, such as a data model updating.

As of socket.io version 1.3, you have access to all sockets using “socketio.sockets.sockets” programmatically.

So for example if you needed all sockets in an Express controller running on node.js, you could access it this way.

In your server side app.js, socket.io is usually configured in express this way:

var app = express();
//...
var socketio = require('socket.io')(server, {
  //...
});

And then, you can save the reference for socketio for later using Express’s app.set() and app.get(). In app.js you would do:

app.set('socketio', socketio);

Then in an express controller, you can access socketio like so:

exports.whateverController = function(req, res, next) {
  var socketio = req.app.get('socketio');
  //...
}

And then you can loop through the sockets:

var sockets = socketio.sockets.sockets;
for(var socketId in sockets)
{
  var socket = sockets[socketId]; //loop through and do whatever with each connected socket
  //...
}
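One caveat, as far as I know: in socket.io 3.x and later, sockets.sockets is a Map rather than a plain object, so a for...in loop won’t visit anything. Here’s a sketch of a loop that copes with both shapes. The forEachSocket() helper is hypothetical, and the sockets in the example are stand-in objects, not real connections:

```javascript
// Hypothetical helper: call fn(socket, id) for every connected socket,
// whether `sockets` is a plain object keyed by id or a Map.
function forEachSocket(sockets, fn) {
  if (sockets instanceof Map) {
    sockets.forEach(function (socket, id) { fn(socket, id); });
    return;
  }
  Object.keys(sockets).forEach(function (id) { fn(sockets[id], id); });
}

// Stand-in objects instead of real sockets, just to show the call shape.
var seen = [];
forEachSocket({ a1: { id: 'a1' } }, function (socket, id) { seen.push(id); });
forEachSocket(new Map([['b2', { id: 'b2' }]]), function (socket, id) { seen.push(id); });
console.log(seen); // [ 'a1', 'b2' ]
```

In a controller you would pass it req.app.get('socketio').sockets.sockets.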

Simple, right?!

Paginating documents/items in MEAN

If you’ve ever scrolled through the Facebook newsfeed, you’ve noticed that the topmost stories are the most recent ones, and as you scroll to the bottom, older ones keep getting loaded for as long as you keep scrolling.

This feature is, first of all, kind of “cool”, and it fits perfectly in a single page application. It’s also pretty useful from a performance standpoint, since not all of the documents (items your page is displaying, such as Facebook news stories, classified ads, search results, etc.) need to be loaded up front when the user first lands on the page.

Paginating your documents in a MEAN application can be accomplished fairly easily, though it isn’t necessarily obvious. So I thought I’d write about the process I took, and the code I wrote, to get it done.

Let’s start with the AngularJS side of things. I used the ngInfiniteScroll module (https://github.com/sroze/ngInfiniteScroll) to accomplish the continuous scrolling effect. It’s pretty simple to configure, so please read up on the documentation. Essentially it can just be wrapped around an Angular ng-repeat directive, and be configured with a function to call to fetch more documents when the bottom of the page is reached (ngInfiniteScroll does all the calculations internally). Here is an example of what it would look like for getting more “classifieds” from the database to add them to the view:

[Image: loading-classifieds]

So in the example above, the getMorePosted() function in your controller is called whenever ngInfiniteScroll detects that the user is at the bottom of the page. Note here that ngInfiniteScroll will most likely trigger right when the user lands on the page, unless you pre-load some documents in your controller. I chose to have getMorePosted() fetch both the initial set of documents and every successive set as well. Depending on how you set things up, this may or may not make a difference, but it did for me.

My getMorePosted() function in the controller looks like this (note: it uses a factory called Classified to do the actual getting of classifieds from the API (Express/MongoDB on the server side of MEAN) which I’ll define later):

$scope.initialLoadDone = false;
$scope.loadingClassifieds = false;
$scope.getMorePosted = function() {
  if($scope.loadingClassifieds) return;

  $scope.loadingClassifieds = true;

  if(!$scope.initialLoadDone) {
    Classified.getPosted(function (postedClassifieds) {
      $scope.postedClassifieds = postedClassifieds;
      $scope.loadingClassifieds = false;
      $scope.initialLoadDone = true;
    });
  }
  else
  {
    Classified.getMorePosted(function(err,numberOfClassifiedsGotten) {
      $scope.loadingClassifieds = false;
      if(numberOfClassifiedsGotten==0)
        $scope.noMoreClassifieds=true;
    });
  }
}

A couple of things to note here. While the classifieds are being loaded, the $scope.loadingClassifieds flag is set to true. This keeps ngInfiniteScroll from loading more classifieds while a load is already in progress, and it can also be used to show the user a message that loading is underway (in case it doesn’t happen near instantly due to a slow connection). Furthermore, getMorePosted() tracks through the $scope.noMoreClassifieds flag when the end has been reached (if ever, depending on how many thousands or millions of documents are in your database, and how far down the user scrolls). It does this by counting the documents returned: if the number equals zero, the end of pagination has been reached.

This is what getPosted() and getMorePosted() look like in the Classified factory:

app.factory('Classified', function Classified(ClassifiedResource, ...) {
      var postedClassifieds = [];
      var postedClassifiedsLoaded = false;
      //...
      getPosted: function(callback) {
          var cb = callback || angular.noop;
          if (postedClassifiedsLoaded) {
            //console.log("Sending already-loaded postedClassifieds");
            return cb(postedClassifieds);
          } else {
            return ClassifiedResource.Posted.query(
              function(_postedClassifieds) {
                //console.log("Loading postedClassifieds from webservice");
                postedClassifieds = _postedClassifieds;
                postedClassifiedsLoaded = true;
                return cb(postedClassifieds);
              },
              function(err) {
                return cb(err);
              }).$promise;
          }
        },
        getMorePosted: function(callback) {
          var cb = callback || angular.noop;
          if (!postedClassifiedsLoaded)
            cb();
          else {
            return ClassifiedResource.Posted.query({
                startTime: new Date(postedClassifieds[postedClassifieds.length - 1].posted).getTime()
              },
              function(_postedClassifieds) {
                //console.log("Loading more postedClassifieds from webservice, from before startTime="+postedClassifieds[postedClassifieds.length-1].posted);
                for (var i = 0; i < _postedClassifieds.length; i++)
                  replaceOrInsertInArray(postedClassifieds, _postedClassifieds[i], true);
                return cb(null, _postedClassifieds.length);
              },
              function(err) {
                return cb(err);
              }).$promise;
          }
        },
        //...

And this is what ClassifiedResource looks like:

app.factory('ClassifiedResource', function ($resource) {
  return {
    Posted: $resource(
      '/api/classified/getPosted/:startTime',
      {
      },
      {
      }
    )
  };
});

So note that in my setup, the service loads and maintains the list of documents (postedClassifieds) in memory. getPosted() fetches the first set of documents, or returns the list if it is already loaded. getMorePosted() is where the magic happens. It takes the timestamp of the last classified in the list and sends it to the API (server side, Express), which then loads the next “page” of documents (classifieds in this case) posted before that timestamp.

Before we continue to examine the server side, it’s important to note that you’ll need a field to sort by in descending order (or ascending if you want the oldest documents up front). A timestamp value will work great. Otherwise a MongoDB ObjectId could work too, since it begins with a creation timestamp and so sorts roughly by creation time. It will depend on your data. In my case, a timestamp called “posted” was available in my data, and very consistent. Documents could be removed from the past, but never added with a past timestamp (even then, this wouldn’t be a huge problem). So that works just fine with this pagination approach.

Here is what the server side looks like in Express/NodeJS:

var Classified = require('./classified.model');
exports.getPosted = function(req, res) {
  var startTime = req.params.startTime ? req.params.startTime : null;

  var query = Classified.find(
      {posted: { $ne: null }}
  );
  query.sort('-posted -_id');
  query.limit(20);
  if(startTime)
    query.where({posted: {$lt: new Date(Number(startTime))}}); // startTime arrives as a string of milliseconds
  query
    .exec(function (err, classifieds) {
      if(err) { ... }
      return res.status(200).json(classifieds);
    });

}

Note that “Classified” is my model, which is queried using Mongoose. I limit the number of documents returned to 20, which works well for my application. And the query is sorted in descending order by the “posted” field, which is a timestamp. You’ll notice a where clause added, which gets only the classifieds posted before the time sent in (“startTime”) from the UI, so that works in conjunction with the sort and returns up to 20 more classifieds from before the “startTime”. Also note that I send the timestamp in milliseconds, which gives a nice clean number that can be sent down to the API from the UI.
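If it helps to see the paging logic without a database, here’s an in-memory sketch of the same idea. The getPage() function and the sample data are made up for illustration; it stands in for the Mongoose query, it does not replace it:

```javascript
// Hypothetical in-memory stand-in for the query: newest first, at most
// `limit` items, optionally only those posted before `startTime` (ms).
function getPage(items, startTime, limit) {
  return items
    .filter(function (item) {
      return startTime == null || item.posted.getTime() < Number(startTime);
    })
    .sort(function (a, b) { return b.posted - a.posted; })
    .slice(0, limit);
}

var items = [1000, 2000, 3000, 4000, 5000].map(function (ms) {
  return { posted: new Date(ms) };
});

// First page, then use the last item's timestamp as the cursor for the next.
var first = getPage(items, null, 2);
var cursor = first[first.length - 1].posted.getTime();
var second = getPage(items, String(cursor), 2);

console.log(first.map(function (c) { return c.posted.getTime(); }));  // [ 5000, 4000 ]
console.log(second.map(function (c) { return c.posted.getTime(); })); // [ 3000, 2000 ]
```

One thing this makes visible: because the filter is strictly "before", two documents with the exact same timestamp that straddle a page boundary could be skipped, so a field with unique values is the safer cursor.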

And, that’s it!

Something I want to add is that on your client side (in AngularJS) if you end up loading too many documents/items in your ng-repeat, the application performance will greatly degrade. With ngInfiniteScroll, all items on the page are always kept once they’re loaded, even if they’re not in the view currently. There’s another module: https://github.com/angular-ui/ui-scroll which will allow you to destroy and re-create items as they go in and out of the view from the user’s browser as the user scrolls through. This will vastly improve performance when a lot of documents are loaded.

MEAN stack foreign language translations

For the MEAN application I’m currently building, there is a requirement to have it served in multiple user-selectable languages. I used the MEAN fullstack generator (https://github.com/DaftMonk/generator-angular-fullstack), which does not provide i18n (internationalization) support.

When setting up my application for i18n, I realized that I needed translations available up and down the stack. So not just in the View, but also in the Model and Controller. I ended up using angular-translate (https://github.com/angular-translate/angular-translate) and MomentJS (https://github.com/moment/moment/) in the client side AngularJS. And I created my own custom solution, very simple, in Node for the server side model and controller.

I think angular-translate works great in Angular, and there are plenty of guides around so I won’t go into it. But I want to mention that angular-translate doesn’t have great support (at least that I could find) for translating dates and numbers. This is where MomentJS can fill in the gaps. Again, there are plenty of guides and good documentation out there for MomentJS.

For Node, I created a module that simply has a JSON object of all the translations for each language, and a function that returns the translation. Example below:

—translations.js—

'use strict';
var en = {
  VERIFICATION_EMAIL_SUBJECT: 'Sign up verification',
  VERIFICATION_EMAIL_TEXT: 'You are receiving this email because you or someone else has signed up with this email address (%s)',
};
var fr = {
  VERIFICATION_EMAIL_SUBJECT: 'S\'inscrire vérification',
  VERIFICATION_EMAIL_TEXT: 'Vous recevez ce courriel parce que vous ou quelqu\'un d\'autre a signé avec cette adresse email (%s)',
};
module.exports.get = function(lang, key)
{
  if(lang == 'en')
    return en[key];
  else if(lang == 'fr')
    return fr[key];
};
module.exports.en = en;
module.exports.fr = fr;

And then use it like so:

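var sprintf = require('sprintf-js').sprintf; // assumes the sprintf-js package; any sprintf implementation works
var translations = require('./translations'); // path is relative to the requiring file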
console.log(translations.get('en','VERIFICATION_EMAIL_SUBJECT'));
console.log(sprintf(translations.get(user.language,'VERIFICATION_EMAIL_TEXT'),'blah@blah.com'));

This way translations can be available anywhere on the server side that uses Node.
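One thing the simple version doesn’t handle is a missing language or a missing key. Here’s a sketch of a lookup that falls back to English and fills in the %s placeholders with Node’s built-in util.format. The translate() name and the trimmed-down tables are assumptions for illustration only:

```javascript
var util = require('util');

// Trimmed-down tables, just for this sketch.
var tables = {
  en: { GREETING: 'Hello %s', FAREWELL: 'Goodbye' },
  fr: { GREETING: 'Bonjour %s' }
};

// Hypothetical helper: look up `key` in `lang`, fall back to English,
// and finally to the key itself so a missing entry is easy to spot.
function translate(lang, key) {
  var table = tables[lang] || tables.en;
  var template = table[key] || tables.en[key] || key;
  var args = Array.prototype.slice.call(arguments, 2);
  return util.format.apply(util, [template].concat(args));
}

console.log(translate('fr', 'GREETING', 'Marie')); // Bonjour Marie
console.log(translate('fr', 'FAREWELL'));          // Goodbye
console.log(translate('de', 'GREETING', 'Max'));   // Hello Max
console.log(translate('en', 'MISSING_KEY'));       // MISSING_KEY
```

Falling back to the key itself means an untranslated string shows up as something like MISSING_KEY instead of undefined, which is much easier to catch.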