Gorgi Kosev

code, music, math


JavaScript isn't cancer

Thu Oct 06 2016

The last few days, I've been thinking about what leads so many people to hate JavaScript.

JS is so quirky and unclean! That's supposed to be the primary reason, but after working with a few other dynamic languages, I don't buy it. JS actually has a fairly small number of quirks compared to other dynamic languages.

Just think about PHP's named functions, which are always in the global scope. Except when they are in namespaces (oh hi, another concept), and then it's kinda weird because namespaces can be relative. There are no first-class named functions, but function expressions can be assigned to variables. Which must be prefixed with $. There are no real modules, or proper nestable scope - at least not for functions, which are always global. But nested functions only exist once the outer function is called!

In Ruby, blocks are like lambdas except when they are not, and you can pass a block explicitly or yield to the first block implicitly. But there are also lambdas, which are different. Modules are uselessly global, cannot be parameterised over other modules (without resorting to metaprogramming), and there are several ways to nest them: if you don't nest them lexically, the lookup rules become different. And there are classes, with private variables, which are prefixed with @. I really don't get that sigil fetish.

The above examples are only scratching the surface.

And which are the most often cited problems of JavaScript? Implicit conversions (the wat talk), no large integers, hard-to-understand prototypal inheritance, and the this keyword. That doesn't look any worse than the above lists! Plus, the language (pre-ES6) is very minimalistic. It has freeform records with prototypes, and closures with lexical scope. That's it!

So this supposed "quirkiness" of JavaScript doesn't seem like a satisfactory explanation. There must be something else going on here, and I think I finally realized what that is.

JavaScript is seen as a "low status" language. A 10-day accident, a silly toy language for the browser that ought to be simple and easy to learn. To an extent this is true, largely thanks to the fact that there are very few distinct concepts to be learned.

However, those few concepts combine together into a package with a really good power-to-weight ratio. Additionally, the simplicity ensures that the language is malleable towards even more power (e.g. you can extend it with a type system and then you can idiomatically approximate some capabilities of algebraic sum types, like making illegal states unrepresentable).
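As a rough sketch of what that tagged-records style can look like in plain JS (the names here are hypothetical; a type checker like Flow or TypeScript would additionally verify that each tag carries only its own fields, which is what makes illegal states unrepresentable):

```javascript
// Hypothetical example: modeling a remote resource as tagged records.
// Each state carries only the fields that make sense for it, so states
// like "loaded without data" or "failed without an error" never occur.
const loading = () => ({ type: 'loading' });
const loaded = (data) => ({ type: 'loaded', data });
const failed = (error) => ({ type: 'failed', error });

function render(state) {
    switch (state.type) {
        case 'loading': return 'Spinner';
        case 'loaded': return 'Data: ' + state.data;
        case 'failed': return 'Error: ' + state.error.message;
    }
}
```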

The emphasis above is on idiomatically for a reason. This sort of extension is somehow perfectly normal in JavaScript. If you took Ruby and used its dictionary type to add a comparable feature, it would have a significantly lower likelihood of being accepted by developers. Why? Because Ruby has standard ways of doing things. You should be using objects and classes, not hashes, to model most of your data. (*)

That was not the case with the simple pre-ES6 JavaScript. There was no module system to organize code. No class system to hierarchically organize blueprints of things that hold state. There was a lack of basic standard library items, such as maps, sets, iterables, streams, and promises, and a lack of functions to manipulate existing data structures (dictionaries and arrays).

Combine sufficient power, simplicity/malleability, and the lack of basic facilities. Add to this the fact that it's the default option in the browser, the most popular platform. What do you get? You get a TON of people working in it to extend it in various different ways. And they invent a TON of stuff!

We ended up with several popular module systems (object based namespaces, CommonJS, AMD, ES6, the angular module system, etc) as well as many package managers to manage these modules (npm, bower, jspm, ...). We also got many object/inheritance systems: plain objects, pure prototype extension, simulating classes, "composable object factories", and so on and so forth. Heck, a while ago every other library used to implement its own class system! (That is, until CoffeeScript came and gave the definitive answer on how to implement classes on top of prototypes. This is interesting, and I'll come back to it later.)

This creates dissonance with the language's simplicity. JavaScript is this simple browser language that was supposed to be easy, so why is it so hard? Why are there so many things built on top of it, and how the heck do I choose which one to use? I hate it. Why do I hate it? Probably it's all these silly quirks that it has! Just look at its implicit conversions and its lack of number types other than doubles!

It doesn't matter that many languages are much worse. A great example of the reverse phenomenon is C++. It's a complete abomination, far worse than JavaScript - a Frankenstein's monster in the language domain. But it's seen as "high status", so it has many apologists who will come to defend its broken design: "Yeah, C++ is a serious language, you need grown-up pants to use it". Unfortunately, JS has no such luck: its status as hack-together glue for web pages seems to have been forever cemented in people's heads.

So how do we fix this? You might not realize it, but this is already being fixed as we speak! Remember how CoffeeScript slowed down the proliferation of custom object systems? Browsers and environments are quickly implementing ES6, which standardizes a huge percentage of what used to be the JS wild west. We now have the standard way to do modules, the standard way to do classes, the standard way to do basic procedural async (Promises; async/await). The standard way to do bundling will probably be no-bundling: HTTP2 push + ES6 modules will "just work"!

Finally, I believe the people who think that JavaScript will always be transpiled are wrong. As ES6+ features get implemented in major browsers, more and more people will find that the overhead of ES.Next-to-ES transpilers isn't worth it. This process will stop entirely at some point as the basics get fully covered.

At this point, I'm hoping several things will happen. We'll finally get those big integers and number types that Brendan Eich has been promising. We'll have some more stuff on top of SharedArrayBuffer to enable easier shared memory parallelism, perhaps even immutable datastructures that are transferable objects. The wat talk will be obsolete: obviously, you'd be using a static analysis tool such as Flow or TypeScript to deal with that; the fact that the browser ignores those type annotations and does its best to interpret what you meant will be irrelevant. async/await will be implemented in all browsers as the de-facto way to do async control flow; perhaps even async iterators too. We'll also have widely accepted standard libraries for data and event streams.

Will JavaScript finally gain the status it deserves then? Probably. But at what cost? JavaScript is big enough now that there is less space for new inventions. And it's fun to invent new things and read about other people's inventions!

On the other hand, maybe then we'll be able to focus on the stuff we're actually building instead.

(*) Or metaprogramming, but then everyone has to agree on the same metaprogramming. In JS, everyone uses records, and they probably use a tag field to discriminate them already: it's a small step to add types for that.

ES7 async functions - a step in the wrong direction

Sun Aug 23 2015

Async functions are a new feature scheduled to become a part of ES7. They build on top of previous capabilities made available by ES6 (promises), letting you write async code as though it were synchronous. At the moment, they're a stage 1 proposal for ES7 and supported by babel / regenerator.

When generator functions were first made available in node, I was very excited. Finally, a way to write asynchronous JavaScript that doesn't descend into callback hell! At the time, I was unfamiliar with promises and the language power you get back by simply having async computations be first class values, so it seemed to me that generators are the best solution available.

Turns out, they aren't. And the same limitations apply for async functions.

Predicates in catch statements

With generators, thrown errors bubble up the function chain until a catch statement is encountered, much like in other languages that support exceptions. On one hand, this is convenient, but on the other, you never know what you're catching once you write a catch statement.

JavaScript catch doesn't support any mechanism to filter errors. This limitation isn't too hard to get around: we can write a guard function:

function guard(e, predicate) {
  if (!predicate(e)) throw e;
}

and then use it to, e.g., only filter "not found" errors when downloading an image:

try {
    await downloadImage(url);
} catch (e) {
    guard(e, e => e.code == 404);
    // handle the "not found" case here
}

But that only gets us so far. What if we want to have a second error handler? We must resort to using if-then-else, making sure that we don't forget to rethrow the error at the end:

try {
    await downloadImage(url);
} catch (e) {
    if (e.code == 404) {
        // handle "not found"
    } else if (e.code == 401) {
        // handle "unauthorized"
    } else {
        throw e;
    }
}

Since promises are a userland library, restrictions like the above do not apply. We can write our own promise implementation that demands the use of a predicate filter:

downloadImage(url)
    .catch(e => e.code == 404, e => {
        // handle "not found"
    })
    .catch(e => e.code == 401, e => {
        // handle "unauthorized"
    });

Now if we want all errors to be caught, we have to say it explicitly:

.catch(e => true, e => {
    // handle all errors
});

Since these constructs are not built-in language features but a DSL built on top of higher order functions, we can impose any restrictions we like instead of waiting on TC39 to fix the language.
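As a minimal sketch of how such a predicate-demanding catch could be built in userland (this is not bluebird's actual implementation, and catchIf is a hypothetical name; bluebird exposes this idea as an overload of .catch):

```javascript
// Wrap a promise so an error is only handled when the predicate matches;
// everything else is rethrown and keeps bubbling up the chain.
function catchIf(promise, predicate, handler) {
    return promise.then(null, function (e) {
        if (!predicate(e)) throw e; // not ours - rethrow
        return handler(e);
    });
}

// usage (downloadImage and useFallback are hypothetical):
// catchIf(downloadImage(url), e => e.code == 404, e => useFallback());
```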

Cannot use higher order functions

Because generators and async-await are shallow, you cannot use yield or await within lambdas passed to higher order functions.

This is better explained here - the example given there is:

async function renderChapters(urls) {
  urls.map(getJSON).forEach(j => addToPage((await j).html));
}

and will not work, because you're not allowed to use await from within a nested function. The following will work, but will execute in parallel:

async function renderChapters(urls) {
  urls.map(getJSON).forEach(async j => addToPage((await j).html));
}

To understand why, you need to read this article. In short: it's much harder to implement deep coroutines, so browser vendors probably won't do it.

Besides being very unintuitive, this is also limiting. Higher order functions are succinct and powerful, yet we cannot really use them inside async functions. To get sequential execution we have to resort to clumsy built-in for loops, which often force us into writing ceremonial, stateful code.
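By contrast, sequential iteration can itself be packaged as a higher-order function on top of promises. Here is a sketch (mapSeries is a hypothetical name here, though bluebird ships a similar helper):

```javascript
// Run an async function over the items one at a time, in order,
// collecting the results into an array.
function mapSeries(items, fn) {
    return items.reduce(function (acc, item) {
        return acc.then(function (results) {
            return fn(item).then(function (r) {
                return results.concat([r]);
            });
        });
    }, Promise.resolve([]));
}

// usage: mapSeries(urls, getJSON).then(pages => ...)
```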

Arrow functions give us more power than ever before

Functional DSLs were very powerful even before JS had short lambda syntax. But with arrow functions, things get even cleaner. The amount of code one needs to write can be reduced greatly thanks to short lambda syntax and higher order functions. Let's take the motivating example from the async-await proposal:

function chainAnimationsPromise(elem, animations) {
    var ret = null;
    var p = currentPromise;
    for (var anim of animations) {
        p = p.then(function(val) {
            ret = val;
            return anim(elem);
        });
    }
    return p.catch(function(e) {
        /* ignore and keep going */
    }).then(function() {
        return ret;
    });
}

With bluebird's Promise.reduce, this becomes:

function chainAnimationsPromise(elem, animations) {
  return Promise.reduce(animations,
      (lastVal, anim) => anim(elem).catch(_ => Promise.reject(lastVal)),
      currentPromise)
    .catch(lastVal => lastVal);
}

In short: functional DSLs are now more powerful than built in constructs, even though (admittedly) they may take some getting used to.

But this is not why async functions are a step in the wrong direction. The problems above are not unique to async functions. The same problems apply to generators: async functions merely inherit them as they're very similar.

Async functions also go another step backwards.

Loss of generality and power

Despite their shortcomings, generator based coroutines have one redeeming quality: they allow you to redefine the coroutine execution engine. This is extremely powerful, and I will demonstrate by giving the following example:

Let's say we were given the task of writing the save function for an issue tracker. The issue author can specify the issue's title and text, as well as any other issues that are blocking the solution of the newly entered issue.

Our initial implementation is simple:

async function saveIssue(data, blockers) {
    let issue = await Issues.insert(data);
    for (let blockerId of blockers) {
        await BlockerIssues.insert({blocker: blockerId, blocks: issue.id});
    }
}

Issues.insert = async function(data) {
    return db.query("INSERT ... VALUES", data).execWithin(db.pool);
}

BlockerIssues.insert = async function(data) {
    return db.query("INSERT ... VALUES", data).execWithin(db.pool);
}

Issues and BlockerIssues are references to the corresponding tables in an SQL database. Their insert methods return a promise that indicates when the query has completed. The query is executed by a connection pool.

But then we run into a problem: we don't want to partially save the issue if some of the data was not inserted successfully. We want the entire save operation to be atomic. Fortunately, SQL databases support this via transactions, and our database library has a transaction abstraction. So we change our code:

async function saveIssue(data, blockers) {
    let tx = db.beginTransaction();
    let issue = await Issues.insert(tx, data);
    for (let blockerId of blockers) {
        await BlockerIssues.insert(tx, {blocker: blockerId, blocks: issue.id});
    }
}

Issues.insert = async function(tx, data) {
    return db.query("INSERT ... VALUES", data).execWithin(tx);
}

BlockerIssues.insert = async function(tx, data) {
    return db.query("INSERT ... VALUES", data).execWithin(tx);
}

Here, we changed the code in two ways. Firstly, we created a transaction within the saveIssue function. Secondly, we changed both insert methods to take this transaction as an argument.

Immediately we can see that this solution doesn't scale very well. What if we need to use saveIssue as a part of a larger transaction? Then it has to take a transaction as an argument. Who will create the transactions? The top level service. What if the top level service becomes a part of a larger service? Then we need to change the code again.

We can reduce the extent of this problem by writing a base class that automatically initializes a transaction if one is not passed via the constructor, and then have Issues, BlockerIssues, etc. inherit from this class:

class Transactionable {
    constructor(tx) {
        this.transaction = tx || db.beginTransaction();
    }
}
class IssueService extends Transactionable {
    async saveIssue(data, blockers) {
        let issues = new Issues(this.transaction);
        let blockerIssues = new BlockerIssues(this.transaction);
        // ...
    }
}
class Issues extends Transactionable { /* ... */ }
class BlockerIssues extends Transactionable { /* ... */ }
// etc

Like many OO solutions, this only spreads the problem across the plate to make it look smaller but doesn't solve it.

Generators are better

Generators let us define the execution engine. The iteration is driven by the function that consumes the generator, which decides what to do with the yielded values. What if instead of only allowing promises, our engine let us also:

  1. Specify additional options which are accessible from within
  2. Yield queries. These will be run in the transaction specified in the options above
  3. Yield other generator iterables: These will be run with the same engine and options
  4. Yield promises: These will be handled normally

Let's take the original code and simplify it:

function* saveIssue(data, blockers) {
    let issue = yield Issues.insert(data);
    for (var blockerId of blockers) {
        yield BlockerIssues.insert({blocker: blockerId, blocks: issue.id});
    }
}

Issues.insert = function* (data) {
    return db.query("INSERT ... VALUES", data);
}

BlockerIssues.insert = function* (data) {
    return db.query("INSERT ... VALUES", data);
}

From our HTTP handler, we can now write:

var myengine = require('./my-engine');

app.post('/issues/save', function(req, res) {
  myengine.run(saveIssue(data, blockers), {tx: db.beginTransaction()});
});

Let's implement this engine:

function run(iterator, options) {
    function id(x) { return x; }
    function iterate(value) {
        var next = iterator.next(value);
        var request = next.value;
        var nextAction = next.done ? id : iterate;

        if (isIterator(request)) {
            return run(request, options).then(nextAction);
        } else if (isQuery(request)) {
            return request.execWithin(options.tx).then(nextAction);
        } else if (isPromise(request)) {
            return request.then(nextAction);
        }
    }
    return iterate();
}
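The isIterator, isQuery, and isPromise helpers are left undefined above; a duck-typing sketch might look like this (the execWithin check is an assumption based on the query objects of the hypothetical database library used in these examples):

```javascript
// Duck-type checks for the three kinds of values the engine accepts.
function isIterator(x) { return x != null && typeof x.next === 'function'; }
function isPromise(x) { return x != null && typeof x.then === 'function'; }
function isQuery(x) { return x != null && typeof x.execWithin === 'function'; }
```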

The best part of this change is that we did not have to change the original code at all. We didn't have to add the transaction parameter to every function, to take care to properly propagate it everywhere, and to properly create the transaction. All we needed to do was change our execution engine.

And we can add much more! We can yield a request to get the current user, if any, so we don't have to thread that through our code. In fact, we can implement continuation-local storage with only a few lines of code.

Async generators are often given as a reason why we need async functions. If yield is already being used as await, how can we get both working at the same time without adding a new keyword? Is that even possible?

Yes. Here is a simple proof of concept: github.com/spion/async-generators. All we needed to do was change the execution engine to support a mechanism to distinguish between awaited and yielded values.
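One way such a mechanism could work (a rough sketch, not the actual spion/async-generators implementation) is to wrap awaited values in a marker object, so the engine can tell awaits apart from plain yields:

```javascript
// Values wrapped with await_() are resolved by the engine; every other
// yielded value is handed to the consumer as a produced value.
const AWAIT = Symbol('await');
const await_ = (p) => ({ [AWAIT]: p });

function runAsyncGen(iter, onYield) {
    return new Promise(function (resolve, reject) {
        function step(sent) {
            var next = iter.next(sent);
            if (next.done) return resolve(next.value);
            var v = next.value;
            if (v !== null && typeof v === 'object' && AWAIT in v) {
                // awaited: resolve it and send the result back in
                Promise.resolve(v[AWAIT]).then(step, reject);
            } else {
                onYield(v); // produced: hand it to the consumer
                step();
            }
        }
        step();
    });
}
```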

Another example worth exploring is a query optimizer that supports aggregate execution of queries. If we replace Promise.all with our own implementation called parallel, then we can add support for non-promise arguments.

Let's say we have the following code to notify owners of blocked issues in parallel when an issue is resolved:

let blocked = yield BlockerIssues.where({blocker: blockerId})
let owners  = yield engine.parallel(blocked.map(issue => issue.getOwner()))

for (let owner of owners) yield owner.notifyResolved(issue)

Instead of returning an SQL based query, we can have getOwner() return data about the query:

{table: 'users', id: issue.user_id}

and have the engine optimize the execution of parallel queries by sending a single query per table rather than per item:

if (isParallelQuery(query)) {
    var results = _(query.items).groupBy('table')
        .map((items, t) => db.query(`select * from ${t} where id in ?`,
                                    items.map(it => it.id)))
        .value();
    return Promise.all(results)
        .then(results => results.sort(byOrderOf(query.items)))
        .then(nextAction);
}

And voila, we've just implemented a query optimizer. It will fetch all issue owners with a single query. If we add an SQL parser into the mix, it should be possible to rewrite real SQL queries.

We can do something similar on the client too with GraphQL queries by aggregating multiple individual queries.

And if we add support for iterators, the optimization becomes deep: we would be able to aggregate queries that are several layers deep within other generator functions. In the above example, getOwner() could be another generator which produces a query for the user as its first result. Our implementation of parallel will run all those getOwner() iterators and consolidate their first queries into a single query. All this is done without those functions knowing anything about it (thus, without breaking modularity).

Async functions can't let us do any of this. All we get is a single execution engine that only knows how to await promises. To make matters worse, thanks to the unfortunately short-sighted recursive thenable assimilation design decision, we can't simply create our own thenable that supports the above extra features. If we try to do that, we will be unable to safely use it with promises. We're stuck with what we get by default in async functions, and that's it.

Generators are JavaScript's programmable semicolons. Let's not take away that power by taking away the programmability. Let's drop async/await and write our own interpreters.

Why I am switching to promises

Mon Oct 07 2013

I'm switching my node code from callbacks to promises. The reasons aren't merely aesthetic; they're practical:

Throw-catch vs throw-crash

We're all human. We make mistakes, and then JavaScript throws an error. How do callbacks punish that mistake? They crash your process!

But spion, why don't you use domains?

Yes, I could do that. I could crash my process gracefully instead of letting it just crash. But it's still a crash, no matter what lipstick you put on it. It still results in an inoperative worker. With thousands of requests, 0.5% hitting a throwing path means over 50 process shutdowns and most likely a denial of service.

And guess what a user that hits an error does? Starts repeatedly refreshing the page, that's what. The horror!

Promises are throw-safe. If an error is thrown in one of the .then callbacks, only that single promise chain will die. I can also attach error or "finally" handlers to do any clean up if necessary - transparently! The process will happily continue to serve the rest of my users.

For more info see #5114 and #5149. To find out how promises can solve this, see bluebird #51

if (err) return callback(err)

That line is haunting me in my dreams now. What happened to the DRY principle?

I understand that it's important to explicitly handle all errors. But I don't believe it's important to explicitly bubble them up the callback chain. If I don't deal with the error here, that's because I can't deal with the error there - I simply don't have enough context.

But spion, why don't you wrap your callbacks?

I guess I could do that and lose the callback stack when generating a new Error(). Or since I'm already wrapping things, why not wrap the entire thing with promises, rely on longStackSupport, and handle errors at my discretion?

Also, what happened to the DRY principle?

Promises are now part of ES6

Yes, they will become a part of the language. New DOM APIs will be using them too. jQuery already switched to promise...ish things. Angular utilizes promises everywhere (even in the templates). Ember uses promises. The list goes on.

Browser libraries already switched. I'm switching too.

Containing Zalgo

Your promise library prevents you from releasing Zalgo. You can't release Zalgo with promises. It's impossible for a promise to result in the release of the Zalgo-beast. Promises are Zalgo-safe (see section 3.1).
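A quick illustration of why: a .then callback always runs asynchronously, even when the promise is already settled, so a caller can never observe it firing synchronously (which is exactly the sometimes-sync, sometimes-async behavior that releases Zalgo):

```javascript
// With callbacks, an API might call you back synchronously sometimes and
// asynchronously at other times. Promise callbacks always run on a later
// tick, even for an already-resolved promise:
var called = false;
Promise.resolve(42).then(function () { called = true; });
console.log(called); // still false here - the callback hasn't run yet
```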

Callbacks getting called multiple times

Promises solve that too. Once the operation is complete and the promise is resolved (either with a result or with an error), it cannot be resolved again.
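This is easy to demonstrate with the ES6 Promise constructor:

```javascript
// A promise settles exactly once: the second resolve call is ignored.
var p = new Promise(function (resolve) {
    resolve('first');
    resolve('second'); // no effect - p is already resolved
});
```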

Promises can do your laundry

Oops - unfortunately, promises won't do that. You still need to do it manually.

But you said promises are slow!

Yes, I know I wrote that. But I was wrong. A month after I wrote the giant comparison of async patterns, Petka Antonov wrote Bluebird. It's a wickedly fast promise library, and here are the charts to prove it:

[chart: time to complete (ms) vs. parallel requests]

[chart: memory usage (MB) vs. parallel requests]

And now, a table containing many patterns, 10 000 parallel requests, 1 ms per I/O op. Measure ALL the things!

file                              time(ms)  memory(MB)
callbacks-original.js                  316       34.97
callbacks-flattened.js                 335       35.10
callbacks-catcher.js                   355       30.20
promises-bluebird-generator.js         364       41.89
dst-streamline.js                      441       46.91
callbacks-deferred-queue.js            455       38.10
callbacks-generator-suspend.js         466       45.20
promises-bluebird.js                   512       57.45
thunks-generator-gens.js               517       40.29
thunks-generator-co.js                 707       47.95
promises-compose-bluebird.js           710       73.11
callbacks-generator-genny.js           801       67.67
callbacks-async-waterfall.js           989       89.97
promises-bluebird-spawn.js            1227       66.98
promises-kew.js                       1578      105.14
dst-stratifiedjs-compiled.js          2341      148.24
rx.js                                 2369      266.59
promises-when.js                      7950      240.11
promises-q-generator.js              21828      702.93
promises-q.js                        28262      712.93
promises-compose-q.js                59413      778.05

Promises are not slow. At least, not anymore. In fact, bluebird generators are almost as fast as regular callback code (they're also the fastest generators as of now). And bluebird promises are definitely at least two times faster than async.waterfall.

Considering that bluebird wraps the underlying callback-based libraries and makes your own callbacks exception-safe, this is really amazing. async.waterfall doesn't do this: exceptions still crash your process.

What about stack traces?

Bluebird has them behind a flag that slows it down about 5 times. They're even longer than Q's longStackSupport: bluebird can give you the entire event chain. Simply enable the flag in development mode, and you're suddenly in debugging nirvana. It may even be viable to turn them on in production!

What about the community?

This is a valid point. Mikeal said it: If you write a library based on promises, nobody is going to use it.

However, both bluebird and Q give you promise.nodeify. With it, you can write a library with a dual API that can both take callbacks and return promises:

module.exports = function fetch(itemId, callback) {
    return locate(itemId).then(function(location) {
        return getFrom(location, itemId);
    }).nodeify(callback);
}

And now my library is not imposing promises on you. In fact, my library is even friendlier to the community: if I make a dumb mistake that causes an exception to be thrown in the library, the exception will be passed as an error to your callback instead of crashing your process. Now I don't have to fear the wrath of angry library users expecting zero downtime on their production servers. That's always a plus, right?

What about generators?

To use generators with callbacks you have two options

  1. use a resumer style library like suspend or genny
  2. wrap callback-taking functions to become thunk returning functions.

Since #1 is proving to be unpopular, and #2 already involves wrapping, why not just s/thunk/promise/g in #2 and use generators with promises?

But promises are unnecessarily complicated!

Yes, the terminology used to explain promises can often be confusing. But promises themselves are pretty simple - they're basically like lightweight streams for single values.

Here is a straight-forward guide that uses known principles and analogies from node (remember, the focus is on simplicity, not correctness):

Edit (2014-01-07): I decided to re-do this tutorial into a series of short articles called promise nuggets. The content is CC0 so feel free to fork, modify, improve or send pull requests. The old tutorial will remain available within this article.

Promises are objects that have a then method. Unlike node functions, which take a single callback, the then method of a promise can take two callbacks: a success callback and an error callback. When one of these two callbacks returns a value or throws an exception, then must behave in a way that enables stream-like chaining and simplified error handling. Let's explain that behavior of then through examples:

Imagine that node's fs was wrapped to work in this manner. This is pretty easy to do - bluebird already lets you do something like that with promisify(). Then this code:

fs.readFile(file, function(err, res) {
    if (err) return handleError(err);
    // use res here
});

will look like this:

fs.readFile(file).then(function(res) {
    // use res here
}, function(err) {
    handleError(err);
});

What's going on here? fs.readFile(file) starts a file reading operation. That operation is not yet complete at the point when readFile returns. This means we can't return the file content. But we can still return something: we can return the reading operation itself. And that operation is represented with a promise.

This is sort of like a single-value stream:

net.connect(port).on('data', function(res) {
    // use res here
}).on('error', function(err) {
    handleError(err);
});

So far, this doesn't look that different from regular node callbacks - except that you use a second callback for the error (which isn't necessarily better). So when does it get better?

It's better because you can attach the callback later if you want. Remember, fs.readFile(file) returns a promise now, so you can put that in a var, or return it from a function:

var filePromise = fs.readFile(file);
// do more stuff... even nest inside another promise, then
filePromise.then(function(res) { ... });

Yup, the second callback is optional. We're going to see why later.

Okay, that's still not much of an improvement. How about this then? You can attach more than one callback to a promise if you like:

filePromise.then(function(res) { uploadData(url, res); });
filePromise.then(function(res) { saveLocal(url, res); });

Hey, this is beginning to look more and more like streams - they too can be piped to multiple destinations. But unlike streams, you can attach more callbacks and get the value even after the file reading operation completes.

Still not good enough?

What if I told you... that if you return something from inside a .then() callback, then you'll get a promise for that thing on the outside?

Say you want to get a line from a file. Well, you can get a promise for that line instead:

var filePromise = fs.readFile(file);

var linePromise = filePromise.then(function(data) {
    return data.toString().split('\n')[line];
});

var beginsWithHelloPromise = linePromise.then(function(line) {
    return /^hello/.test(line);
});

That's pretty cool, although not terribly useful - we could just put both sync operations in the first .then() callback and be done with it.

But guess what happens when you return a promise from within a .then callback. Do you get a promise for a promise outside of .then()? Nope, you just get the same promise!

function readProcessAndSave(inPath, outPath) {
    // read the file
    var filePromise = fs.readFile(inPath);
    // then send it to the transform service
    var transformedPromise = filePromise.then(function(content) {
        return service.transform(content);
    });
    // then save the transformed content
    var writeFilePromise = transformedPromise.then(function(transformed) {
        return fs.writeFile(outPath, transformed);
    });
    // return a promise that "succeeds" when the file is saved.
    return writeFilePromise;
}

readProcessAndSave(file, otherPath).then(function() {
    // success!
}, function(err) {
    // This function will catch *ALL* errors from the above
    // operations including any exceptions thrown inside .then
    console.log("Oops, it failed.", err);
});

Now it's easier to understand chaining: at the end of every function passed to a .then() call, simply return a promise.

Let's make our code even shorter:

function readProcessAndSave(file, otherPath) {
    return fs.readFile(file)
        .then(service.transform)
        .then(fs.writeFile.bind(fs, otherPath));
}

Mind = blown! Notice how I don't have to manually propagate errors. They will automatically get passed with the returned promise.

What if we want to read, process, then upload, then also save locally?

function readUploadAndSave(file, url, otherPath) {
    var content;
    // read the file
    return fs.readFile(file).then(function(vContent) {
        content = vContent;
        // then upload it
        return uploadData(url, content);
    }).then(function() { // after it's uploaded
        // save it
        return fs.writeFile(otherPath, content);
    });
}

Or just nest it if you prefer the closure.

function readUploadAndSave(file, url, otherPath) {
    // read the file
    return fs.readFile(file).then(function(content) {
        // then upload it
        return uploadData(url, content).then(function() {
            // after it's uploaded, save it
            return fs.writeFile(otherPath, content);
        });
    });
}

But hey, you can also upload and save in parallel!

function readUploadAndSave(file, url, otherPath) {
    // read the file
    return fs.readFile(file)
        .then(function(content) {
            // create a promise that is done when both the upload
            // and the file write are done:
            return Promise.join(
                uploadData(url, content),
                fs.writeFile(otherPath, content));
        });
}

No, these are not "conveniently chosen" functions. Promise code really is that short in practice!

Similarly to how in a stream.pipe chain the last stream is returned, in promise pipes the promise returned from the last .then callback is returned.

That's all you need, really. The rest is just converting callback-taking functions to promise-returning functions and using the stuff above to do your control flow.
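The conversion itself is mechanical. Promise libraries ship a helper for it, but as a sketch (using the ES6 Promise constructor; the name promisify here is just illustrative):

```javascript
// Wrap a node-style function (one that takes a trailing
// callback(err, result)) into one that returns a promise.
function promisify(fn) {
    return function () {
        var self = this;
        var args = Array.prototype.slice.call(arguments);
        return new Promise(function (resolve, reject) {
            // append the node-style callback that settles the promise
            args.push(function (err, result) {
                if (err) reject(err);
                else resolve(result);
            });
            fn.apply(self, args);
        });
    };
}

// usage: var readFile = promisify(require('fs').readFile);
```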

You can also return values in case of an error. So for example, to write a readFileOrDefault (which returns a default value if for example the file doesn't exist) you would simply return the default value from the error callback:

function readFileOrDefault(file, defaultContent) {
    return fs.readFile(file).then(function(fileContent) {
        return fileContent;
    }, function(err) {
        return defaultContent;
    });
}

You can also throw exceptions within both callbacks passed to .then. The user of the returned promise can catch those errors by adding a second .then handler.
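For example (a contrived sketch; checkedParse is a made-up helper):

```javascript
// Throwing inside a .then callback rejects the promise it returns,
// so a later error handler in the chain will catch it.
function checkedParse(textPromise) {
    return textPromise.then(function (text) {
        if (!text) throw new Error("empty response");
        return JSON.parse(text); // a parse error also rejects the promise
    });
}

checkedParse(Promise.resolve('{"ok":true}')).then(function (config) {
    // config is the parsed object
}, function (err) {
    // gets both "empty response" errors and JSON.parse errors
});
```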

Now how about configFromFileOrDefault that reads and parses a JSON config file, falls back to a default config if the file doesn't exist, but reports JSON parsing errors? Here it is:

function configFromFileOrDefault(file, defaultConfig) {
    // if fs.readFile fails, a default config is returned.
    // if JSON.parse throws, this promise propagates that error.
    return fs.readFile(file).then(JSON.parse,
        function ifReadFails() {
            return defaultConfig;
        });
    // if we want to catch JSON.parse errors, we need to chain another
    // .then here - this one only captures errors from fs.readFile(file)
}

Finally, you can make sure your resources are released in all cases, even when an error or exception happens:

var result = doSomethingAsync();

return result.then(function(value) {
    // clean up first, then return the value.
    return cleanUp().then(function() { return value; });
}, function(err) {
    // clean up, then re-throw the error
    return cleanUp().then(function() { throw err; });
});

Or you can do the same using .finally (from both Bluebird and Q):

var result = doSomethingAsync();
return result.finally(cleanUp);

The same promise is still returned, but only after cleanUp completes.

But what about async?

Since promises are actual values, most of the tools in async.js become unnecessary and you can just use whatever you're using for regular values, like your regular array.map / array.reduce functions, or just plain for loops. That, and a couple of promise array tools like .all, .spread and .some.

You already have async.waterfall and async.auto with .then and .spread chaining:

files.getLastTwoVersions(filename)
    .then(function(items) {
        // fetch versions in parallel
        var v1 = versions.get(items.last),
            v2 = versions.get(items.previous);
        return [v1, v2];
    })
    .spread(function(v1, v2) {
        // both of these are now complete.
        return diffService.compare(v1.blob, v2.blob);
    })
    .then(function(diff) {
        // voila, diff is ready. Do something with it.
    });

async.parallel / async.map are straightforward:

// download all items, then get their names
var pNames = ids.map(function(id) {
    return getItem(id).then(function(result) {
        return result.name;
    });
});
// wait for things to complete:
Promise.all(pNames).then(function(names) {
    // we now have all the names.
});

What if you want to wait for the current item to download first (like async.mapSeries and async.series)? That's also pretty straightforward: just wait for the current download to complete, then start the next download, then extract the item name - and that's exactly what you say in the code:

// start with current being an "empty" already-fulfilled promise
var current = Promise.fulfilled();
var namePromises = ids.map(function(id) {
    // wait for the current download to complete, then get the next
    // item, then extract its name.
    current = current
        .then(function() { return getItem(id); })
        .then(function(item) { return item.name; });
    return current;
});
Promise.all(namePromises).then(function(names) {
    // use all names here.
});

The only thing that remains is mapLimit - which is a bit harder to write - but still not that hard:

var queued = [], parallel = 3;
var namePromises = ids.map(function(id) {
    // How many items must complete before fetching the next?
    // The queued, minus those running in parallel, plus one of
    // the parallel slots.
    var mustComplete = Math.max(0, queued.length - parallel + 1);
    // when enough items are complete, queue another request for an item
    var download = Promise.some(queued, mustComplete)
        .then(function() { return getItem(id); });
    queued.push(download);
    return download.then(function(item) {
        // after that new download completes, get the item's name.
        return item.name;
    });
});
Promise.all(namePromises).then(function(names) {
    // use all names here.
});

That covers most of async.

What about early returns?

Early returns are a pattern used throughout both sync and async code. Take this hypothetical sync example:

function getItem(key) {
    var item;
    // early-return if the item is in the cache.
    if (item = cache.get(key)) return item;
    // continue to get the item from the database. cache.put returns the item.
    item = cache.put(key, database.get(key));

    return item;
}

If we attempt to write this using promises, at first it looks impossible:

function getItem(key) {
    return cache.get(key).then(function(item) {
        // early-return if the item is in the cache.
        if (item) return item;
        return database.get(key);
    }).then(function(putOrItem) {
        // what do we do here to avoid the unnecessary cache.put ?
    });
}

How can we solve this?

We solve it by remembering that the callback variant looks like this:

function getItem(key, callback) {
    cache.get(key, function(err, res) {
        // early-return if the item is in the cache.
        if (res) return callback(null, res);
        // continue to get the item from the database
        database.get(key, function(err, res) {
            if (err) return callback(err);
            // cache.put calls back with the item
            cache.put(key, res, callback);
        });
    });
}

The promise version can do pretty much the same - just nest the rest of the chain inside the first callback.

function getItem(key) {
    return cache.get(key).then(function(res) {
        // early return if the item is in the cache
        if (res) return res;
        // continue the chain within the callback.
        return database.get(key).then(cache.put);
    });
}

Or alternatively, if a cache miss results with an error:

function getItem(key) {
    return cache.get(key).catch(function(err) {
        return database.get(key).then(cache.put);
    });
}

That means that early returns are just as easy as with callbacks, and sometimes even easier (in the case of errors).

What about streams?

Promises can work very well with streams. Imagine a limit stream that allows at most 3 promises resolving in parallel, backpressuring otherwise, processing items from leveldb:

originalSublevel.createReadStream().pipe(limit(3, function(data) {
    return convertor(data.value).then(function(converted) {
        return {key: data.key, value: converted};
    });
}));

Or how about stream pipelines that are safe from errors without attaching error handlers to all of them?

pipeline(original, limiter, converted).then(function(done) {
    // all streams in the pipeline completed successfully.
}, function(streamError) {
    // one of the streams in the pipeline failed.
});

Looks awesome. I definitely want to explore that.

The future?

In ES7, promises will become monadic (by getting flatMap and unit). Also, we're going to get generic syntax sugar for monads. Then it truly won't matter what style you use - stream, promise or thunk - as long as it also implements the monad functions. That is, except for callback-passing style - it won't be able to join the party because it doesn't produce values.

I'm just kidding, of course. I don't know if that's going to happen. Either way, promises are useful and practical and will remain useful and practical in the future.

Closures are unavoidable in node

Fri Aug 23 2013

A couple of weeks ago I wrote a giant comparison of node.js async code patterns that mostly focuses on the new generators feature in EcmaScript 6 (Harmony)

Among other implementations there were two callback versions: original.js, which contains nested callbacks, and flattened.js, which flattens the nesting a little bit. Both make extensive use of JavaScript closures: every time the benchmarked function is invoked, a lot of closures are created.

Then Trevor Norris wrote that we should be avoiding closures when writing performance-sensitive code, hinting that my benchmark may be an example of "doing it wrong"

I decided to try and write two more flattened variants. The idea is to minimize performance loss and memory usage by avoiding the creation of closures.

You can see the code here: flattened-class.js and flattened-noclosure.js

Of course, this made the complexity skyrocket. Let's see what it did for performance.

These are the results for 50 000 parallel invocations of the upload function, with simulated I/O operations that always take 1ms. Note: suspend is currently the fastest generator-based library.

file time(ms) memory(MB)
flattened-class.js 1398 106.58
flattened.js 1453 110.19
flattened-noclosure.js 1574 102.28
original.js 1749 124.96
suspend.js 2701 144.66

No performance gains. Why?

Because this kind of code requires that results from previous callbacks are passed to the next callback. And unfortunately, in node this means creating closures.

There really is no other option. Node core functions only take callback functions. This means we have to create a closure: it's the only mechanism in JS that allows you to bundle context together with a function.

And yeah, bind also creates a closure:

function bind(fn, ctx) {
    return function bound() {
        return fn.apply(ctx, arguments);
    };
}

Notice how bound is a closure over ctx and fn.

Now, if node core functions were also able to take a context argument, things could have been different. For example, instead of writing:

fs.readFile(f, bind(this.afterFileRead, this));

if we were able to write:

fs.readFile(f, this.afterFileRead, this);

then we would be able to write code that avoids closures and flattened-class.js could have been much faster.

But we can't do that.

What if we could though? Let's fork timers.js from node core and find out:

I added context passing support to the Timeout class. The result was timers-ctx.js, which in turn resulted in flattened-class-ctx.js

And here is how it performs:

file time(ms) memory(MB)
flattened-class-ctx.js 929 59.57
flattened-class.js 1403 106.57
flattened.js 1452 110.19
original.js 1743 125.02
suspend.js 2834 145.34

Yeah. That shaved off a few hundred more milliseconds.

Is it worth it?

name tokens complexity
suspend.js 331 1.10
original.js 425 1.41
flattened.js 477 1.58
flattened-class-ctx.js 674 2.23

Maybe, maybe not. You decide.

Analysis of generators and other async patterns in node

Fri Aug 09 2013

Table of contents:

Async coding patterns are the subject of never-ending debates for us node.js developers. Everyone has their own favorite method or pet library as well as strong feelings and opinions on all the other methods and libraries. Debates can be heated: sometimes social pariahs may be declared or grave rolling may be induced.

The reason for this is that JavaScript never had any continuation mechanism to allow code to pause and resume across the event loop boundary.

Until now.

A gentle introduction to generators

If you know how generators work, you can skip this and continue to the analysis

Generators are a new feature of ES6. Normally they would be used for iteration. Here is a generator that generates Fibonacci numbers. The example is taken from the ECMAScript harmony wiki:

function* fibonacci() {
    let [prev, curr] = [0, 1];
    for (;;) {
        [prev, curr] = [curr, prev + curr];
        yield curr;
    }
}

And here is how we iterate through this generator:

for (n of fibonacci()) {
    // truncate the sequence at 1000
    if (n > 1000) break;
    console.log(n);
}

What happens behind the scenes?

Generator functions are actually constructors of iterators. The returned iterator object has a next() method. We can invoke that method manually:

var seq = fibonacci();
console.log(seq.next()); // 1
console.log(seq.next()); // 2 etc.

When next is invoked, it starts the execution of the generator. The generator runs until it encounters a yield expression. Then it pauses, and execution goes back to the code that called next.

So in a way, yield works similarly to return. But there is a big difference. If we call next on the generator again, the generator will resume from the point where it left off - from the last yield line.

In our example, the generator will resume to the top of the endless for loop and calculate the next Fibonacci pair.

So how would we use this to write async code?

A great thing about the next() method is that it can also send values to the generator. Let's write a simple number generator that also collects the stuff it receives. When it gets two things it prints them using console.log:

function* numbers() {
    var stuffIgot = [];
    for (var k = 0; k < 3; ++k) {
        var itemReceived = yield k;
        stuffIgot.push(itemReceived);
        if (stuffIgot.length == 2)
            console.log(stuffIgot);
    }
}

This generator gives us 3 numbers using yield. Can we give something back?

Let's give two things to this generator:

var iterator = numbers();
// Can't give anything the first time: we need to get to a yield first.
console.log(iterator.next()); // logs 0
console.log(iterator.next('present')); // logs 1
fs.readFile('file.txt', function(err, resultFromAnAsyncTask) {
    console.log(iterator.next(resultFromAnAsyncTask)); // logs 2
});

The generator will log the string 'present' and the contents of file.txt.

It seems that we can keep the generator paused across the event loop boundary.

What if instead of numbers, we yielded some files to be read?

function* files(fileNames) {
    var results = [];
    for (var k = 0; k < fileNames.length; ++k)
        results.push(yield fileNames[k]);
    return results;
}

We could process those file reading tasks asynchronously.

var iterator = files(['file1.txt', 'file2.txt']);
function process(iterator, sendValue) {
    var fileTask = iterator.next(sendValue);
    fs.readFile(fileTask, function(err, res) {
        if (err) iterator.throw(err);
        else process(iterator, res);
    });
}
process(iterator);

But from the generator's point of view, everything seems to be happening synchronously: it gives us the file using yield, then it waits to be resumed, then it receives the contents of the file and makes a push to the results array.

And there is also generator.throw(). It causes an exception to be thrown from inside the generator. How cool is that?
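Here is a tiny sketch of throw in action (a made-up reader generator):

```javascript
function* reader() {
    try {
        var data = yield "file.txt"; // ask the driver for a file
        console.log("got " + data.length + " bytes");
    } catch (e) {
        // errors injected with iterator.throw() land here
        console.log("read failed: " + e.message);
    }
}

var it = reader();
it.next();                     // runs up to the yield
it.throw(new Error("ENOENT")); // resumes by throwing at the yield point
```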

With next and throw combined together, we can easily run async code. Here is an example from one of the earliest ES6 async generators library task.js.

spawn(function* () {
    var data = yield $.ajax(url);
    var status = $('#status').html('Download complete.');
    yield status.fadeIn().promise();
    yield sleep(2000);
});

This generator yields promises, which causes it to suspend execution. The spawn function that runs the generator takes those promises and waits until they're fulfilled. Then it resumes the generator by sending it the resulting value.
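The driver behind spawn can be sketched in a dozen lines (my illustration, not task.js's actual code): call next, and for every yielded promise wait for it, then resume the generator with the result, or throw the error into it.

```javascript
// Run a generator that yields promises; returns a promise
// for the generator's final return value.
function spawn(generatorFn) {
    var gen = generatorFn();
    return new Promise(function (resolve, reject) {
        function step(method, value) {
            var result;
            try {
                result = gen[method](value); // resume the generator
            } catch (e) {
                return reject(e); // the generator threw and didn't catch
            }
            if (result.done) return resolve(result.value);
            // wait for the yielded promise, then resume with its
            // value - or throw its error into the generator.
            Promise.resolve(result.value).then(
                function (v) { step("next", v); },
                function (e) { step("throw", e); });
        }
        step("next");
    });
}
```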

When used in this form, generators look a lot like classical threads. You spawn a thread, it issues blocking I/O calls using yield, then the code resumes execution from the point it left off.

There is one very important difference though. While threads can be suspended involuntarily at any point by the operating system, generators have to willingly suspend themselves using yield. This means that there is no danger of variables changing under our feet, except after a yield.

Generators go a step further with this: it's impossible to suspend execution without using the yield keyword. In fact, if you want to call another generator you will have to write yield* anotherGenerator(args). This means that suspend points are always visible in the code, just like they are when using callbacks.
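A quick sketch of that delegation (inner and outer are made-up names):

```javascript
function* inner() {
    // a suspension point, visible because of the yield keyword
    var x = yield "need a number";
    return x + 1;
}

function* outer() {
    // calling another generator requires yield*, so the fact that
    // outer may suspend here stays visible in the code.
    var result = yield* inner();
    yield result;
}

var it = outer();
it.next();   // suspends inside inner, yielding "need a number"
it.next(41); // resumes inner with 41; outer then yields 42
```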

Amazing stuff! So what does this mean? What is the reduction of code complexity? What are the performance characteristics of code using generators? Is debugging easy? What about environments that don't have ES6 support?

I decided to do a big comparison of all existing node async code patterns and find the answers to these questions.

The analysis

For the analysis, I took file.upload, a typical CRUD method extracted from DoxBee called when uploading files. It executes multiple queries to the database: a couple of selects, some inserts and one update. Lots of mixed sync / async action.

It looks like this:

function upload(stream, idOrPath, tag, done) {
    var blob = blobManager.create(account);
    var tx = db.begin();
    function backoff(err) {
        tx.rollback();
        return done(new Error(err));
    }
    blob.put(stream, function (err, blobId) {
        if (err) return done(err);
        self.byUuidOrPath(idOrPath).get(function (err, file) {
            if (err) return done(err);
            var previousId = file ? file.version : null;
            var version = {
                userAccountId: userAccount.id,
                date: new Date(),
                blobId: blobId,
                creatorId: userAccount.id,
                previousId: previousId
            };
            version.id = Version.createHash(version);
            Version.insert(version).execWithin(tx, function (err) {
                if (err) return backoff(err);
                if (!file) {
                    var splitPath = idOrPath.split('/');
                    var fileName = splitPath[splitPath.length - 1];
                    var newId = uuid.v1();
                    self.createQuery(idOrPath, {
                        id: newId,
                        userAccountId: userAccount.id,
                        name: fileName,
                        version: version.id
                    }, function (err, q) {
                        if (err) return backoff(err);
                        q.execWithin(tx, function (err) {
                            afterFileExists(err, newId);
                        });
                    });
                }
                else return afterFileExists(null, file.id);
            });
            function afterFileExists(err, fileId) {
                if (err) return backoff(err);
                FileVersion.insert({fileId: fileId, versionId: version.id})
                    .execWithin(tx, function (err) {
                        if (err) return backoff(err);
                        File.whereUpdate({id: fileId}, {
                            version: version.id
                        }).execWithin(tx, function (err) {
                            if (err) return backoff(err);
                            tx.commit(done);
                        });
                    });
            }
        });
    });
}
Slightly pyramidal code full of callbacks.

This is how it looks when written with generators:

var genny = require('genny');
module.exports = genny.fn(function* upload(resume, stream, idOrPath, tag) {
    var blob = blobManager.create(account);
    var tx = db.begin();
    try {
        var blobId = yield blob.put(stream, resume());
        var file = yield self.byUuidOrPath(idOrPath).get(resume());
        var previousId = file ? file.version : null;
        var version = {
            userAccountId: userAccount.id,
            blobId: blobId,
            creatorId: userAccount.id,
            previousId: previousId
        version.id = Version.createHash(version);
        yield Version.insert(version).execWithin(tx, resume());
        if (!file) {
            var splitPath = idOrPath.split('/');
            var fileName = splitPath[splitPath.length - 1];
            var newId = uuid.v1();
            var file = {
                id: newId,
                userAccountId: userAccount.id,
                name: fileName,
                version: version.id
            var q = yield self.createQuery(idOrPath, file, resume());
            yield q.execWithin(tx, resume());
        yield FileVersion.insert({fileId: file.id, versionId: version.id})
            .execWithin(tx, resume());
        yield File.whereUpdate({id: file.id}, {version: version.id})
            .execWithin(tx, resume());
        yield tx.commit(resume());
    } catch (e) {
        throw e;

Shorter, very straightforward code and absolutely no nesting of callback functions. Awesome.

Yet subjective adjectives are not very convincing. I want to have a measure of complexity, a number that tells me what I'm actually saving.

I also want to know what the performance characteristics are - how much time and memory would it take to execute a thousand parallel invocations of this method? What about 2000 or 3000?

Also, what happens if an exception is thrown? Do I get a complete stack trace like in the original version?

I also wanted to compare the results with other alternatives, such as fibers, streamlinejs and promises (without generators).

So I wrote a lot of different versions of this method, and I will share my personal impressions before giving you the results of the analysis.

The examples


original.js

The original solution, presented above. Vanilla callbacks. Slightly pyramidal. I consider it acceptable, if a bit mediocre.


flattened.js

Flattened variant of the original via named functions. Taking the advice from callback hell, I flattened the pyramid a little bit. As I did that, I found that while the pyramid shrunk, the code actually grew.


catcher.js

I noticed that the first two vanilla solutions had a lot of common error handling code everywhere. So I wrote a tiny library called catcher.js which works very much like node's domain.intercept. This reduced the complexity and the number of lines further, but the pyramidal looks remained.


async.js

Uses the waterfall function from caolan's async. Very similar to flattened.js but without the need to handle errors at every step.

flattened-class.js, flattened-noclosure.js, flattened-class-ctx.js

See this post for details


promises.js

I'll be honest. I've never written promise code in node before. Driven by Gozala's excellent post, I concluded that everything should be a promise, and things that can't handle promises should also be rewritten.

Take for example this particular line in the original:

var previousId = file ? file.version : null;

If file is a promise, we can't use the ternary operator or the property getter. Instead we need to write two helpers: a ternary operator helper and a property getter helper:

var previousIdP = p.ternary(fileP, p.get(fileP, 'version'), null);

Unfortunately this gets out of hand quickly:

var versionP = p.allObject({
    userAccountId: userAccount.id,
    blobId: blobIdP,
    creatorId: userAccount.id,
    previousId: previousIdP
});
versionP = p.set(versionP, p.allObject({
    id: fn.call(Version.createHash, versionP)
}));
// Even if Version.insert has been lifted to take promise arguments, it returns
// a promise and therefore we cannot call execWithinP. We have to wait for the
// promise to resolve to invoke the function.
var versionInsert = p.eventuallyCall(
    Version.insert(versionP), 'execWithinP', tx);
var versionIdP = p.get(versionP, 'id');

So I decided to write a less aggressive version, promiseish.js

note: I used when because I liked its function-lifting API better than Q's

promiseish.js and promiseishQ.js

Nothing fancy here, just some .then() chaining. In fact it feels less complex than the promises.js version, where I felt like I was fighting the language all the time.

The second file promiseishQ.js uses Q instead of when. No big difference there.


fibrous.js

Fibrous is a fibers library that creates "sync" methods out of your async ones, which you can then run in a fiber.

So if for example you had:

fs.readFile(file, function(err, data){ ... });

Fibrous would generate a version that returns a future, suspends the running fiber and resumes execution when the value becomes available.

var data = fs.sync.readFile(file);

I also needed to wrap the entire upload function:

fibrous(function upload() { ... })

This felt very similar to the generators version above, but with sync instead of yield to indicate the methods that will yield. The one benefit I can think of is that it feels more natural for chaining - fewer parentheses are needed.

somefn.sync(arg).split('/')
// vs
(yield somefn(arg, resume)).split('/')

Major drawback: this will never be available outside of node.js or without native modules.

Library: fibrous

suspend.js and genny.js

suspend and genny are generator-based solutions that can work directly with node-style functions.

I'm biased here since I wrote genny. I still think that this is objectively the best way to use generators in node. Just replace the callback with a placeholder generator-resuming function, then yield that. Comes back to you with the value.

Kudos to jmar777 for realizing that you don't need to actually yield anything and can resume the generator using the placeholder callback instead.

Both suspend and genny use generators roughly the same way. The resulting code is very clean, very straightforward and completely devoid of callbacks.


qasync.js

Q provides two methods that allow you to use generators: Q.spawn and Q.async. In both cases the generator yields promises and in turn receives resolved values.

The code didn't feel very different from genny and suspend. It's slightly less complicated: you can yield the promise instead of placing the provided resume function at every point where a callback is needed.

Caveat: as always with promises you will need to wrap all callback-based functions.

Library: Q

co.js and gens.js

Gens and co are generator-based libraries. Both can work by yielding thunk-style functions: that is, functions that take a single argument which is a node style callback in the format function (err, result)

The code looks roughly the same as qasync.js

The problem is, thunks still require wrapping. The recommended way to wrap node style functions is to use co.wrap for co and fn.bind for gens - so that's what I did.


src-streamline._js

Uses streamlinejs CPS transformer and works very much like co and qasync, except without needing to write yield all the time.

Caveat: you will need to compile the file in order to use it. Also, even though it looks like valid JavaScript, it isn't JavaScript. Superficially, it has the same syntax, but it has very different semantics, particularly when it comes to the _ keyword, which acts like yield and resume combined in one.

The code however is really simple and straightforward: in fact, it has the lowest complexity.


Complexity

To measure complexity I took the number of tokens in the source code found by Esprima's lexer (comments excluded). The idea is taken from Paul Graham's essay Succinctness is Power.

I decided to allow all callback wrapping to happen in a separate file: In a large system, the wrapped layer will probably be a small part of the code.


name tokens complexity
src-streamline._js 302 1.00
co.js 304 1.01
qasync.js 314 1.04
fibrous.js 317 1.05
suspend.js 331 1.10
genny.js 339 1.12
gens.js 341 1.13
catcher.js 392 1.30
promiseishQ.js 396 1.31
promiseish.js 411 1.36
original.js 421 1.39
async.js 442 1.46
promises.js 461 1.53
flattened.js 473 1.57
flattened-noclosure.js 595 1.97
flattened-class-ctx.js 674 2.23
flattened-class.js 718 2.38
rx.js 935 3.10

Streamline and co have the lowest complexity. Fibrous, qasync, suspend, genny and gens are roughly comparable.

Catcher is comparable with the normal promise solutions. Both are roughly comparable to the original version with callbacks, but there is some improvement as the error handling is consolidated to one place.

It seems that flattening the callback pyramid increases the complexity a little bit. However, arguably the readability of the flattened version is improved.

Using caolan's async in this particular case doesn't seem to yield much improvement. Its complexity however is lower than the flattened version because it consolidates error handling.

Going promises-all-the-way as Gozala suggests also increases the complexity because we're fighting the language all the time.

The rx.js sample is still a work in progress - it can be made much better.

Performance (time and memory)

All external methods are mocked using setTimeout to simulate waiting for I/O.

There are two variables that control the test:

  • n - the number of parallel "upload requests"
  • t - average wait time per async I/O operation

For the first test, I set the time for every async operation to 1ms, then ran every solution for n ∈ {100, 500, 1000, 1500, 2000}.

note: hover over the legend to highlight the item on the chart.

Wow. Promises seem really, really slow. Fibers are also slow, with O(n^2) time complexity. Everything else seems to be much faster.

Update (Dec 20 2013): Promises not slow anymore. PetkaAntonov wrote Bluebird, which is faster than almost everything else and very low on memory usage. For more info read Why I am switching to Promises

Let's try removing all those promises and fibers to see what's down there.

Ah, much better.

The original and flattened solutions are the fastest, as they use vanilla callbacks, with the fastest flattened solution being flattened-class.js.

suspend is the fastest generator-based solution. It incurred a minimal overhead of about 60% in running time. It's also roughly comparable with streamlinejs (when in raw callbacks mode).

caolan's async adds some measurable overhead (it's about 2 times slower than the original versions). It's also somewhat slower than the fastest generator-based solution.

genny is about 3 times slower. This is because it adds some protection guarantees: it makes sure that callback-calling functions behave and call the callback only once. It also provides a mechanism to enable better stack traces when errors are encountered.

The slowest of the generator bunch is co, but not by much. There is nothing intrinsically slow about it though: the slowness is probably caused by co.wrap which creates a new arguments array on every invocation of the wrapped function.

All generator solutions become about 2 times slower when compiled with Google Traceur, an ES6 to ES5 compiler which we need to run generators code without the --harmony switch or in browsers.

Finally we have rx.js which is about 10 times slower than the original.

However, this test is a bit unrealistic.

Most async operations take much longer than 1 millisecond to complete, especially when the load is as high as thousands of requests per second. As a result, performance is I/O bound - why measure things as if it were CPU-bound?

So let's make the average time needed for an async operation depend on the number of parallel calls to upload().

On my machine redis can be queried about 40 000 times per second; node's "hello world" http server can serve up to 10 000 requests per second; postgresql's pgbench can do 300 mixed or 15 000 select transactions per second.

Given all that, I decided to go with 10 000 requests per second - it looks like a reasonable (rounded) mean.

Each I/O operation will take 10 ms on average when there are 100 running in parallel and 1000 ms when there are 10 000 running in parallel. Makes much more sense.

promises.js and fibrous.js are still significantly slower. However, all of the other solutions are quite comparable now. Let's remove the worst two:

Everything is about the same now. Great! So in practice, you won't notice the CPU overhead in I/O bound cases - even if you're using promises. And with some of the generator libraries, the overhead becomes practically invisible.

Excellent. But what about memory usage? Let's chart that too!

Note: the y axis represents peak memory usage (in MB).

Seems like promises also use a lot of memory, especially the extreme implementation promises.js. promiseish.js as well as qasync.js are not too far behind.

fibrous.js, rx.js and stratifiedjs are somewhat better than the above; however, their memory usage is still more than 5 times that of the original.

Let's remove the hogs and see what remains underneath.

Streamline's fibers implementation uses 35MB while the rest use between 10MB and 25MB.

This is amazing. Generators (without promises) also have a low memory overhead, even when compiled with traceur.

Streamline is also quite good in this category. It has very low overhead, both in CPU and memory usage.

It's important to note that the testing method I use is not statistically sound. However, it's good enough for comparing orders of magnitude, which is fine given the narrowly defined benchmark.

With that said, here is a table for 1000 parallel requests with a 10 ms response time for I/O operations (i.e. 100K I/O operations per second):

file                      time (ms)   memory (MB)
suspend.js                      101          8.62
flattened-class.js              111          4.48
flattened-noclosure.js          124          5.04
flattened.js                    125          5.18
original.js                     130          5.33
async.js                        135          7.36
dst-streamline.js               139          8.34
catcher.js                      140          6.45
dst-suspend-traceur.js          142          6.76
gens.js                         149          8.40
genny.js                        161         11.69
co.js                           182         11.14
dst-genny-traceur.js            250          8.84
dst-stratifiedjs-014.js         267         23.55
dst-co-traceur.js               284         13.54
rx.js                           295         40.43
dst-streamline-fibers.js        526         17.05
promiseish.js                   825        117.88
qasync.js                       971         98.39
fibrous.js                     1159         57.48
promiseishQ.js                 1161         96.47
dst-qasync-traceur.js          1195        112.10
promises.js                    2315        240.39


Having good performance is important. However, all the performance in the world is worth nothing if our code doesn't do what it's supposed to. Debugging is therefore at least as important as performance.

How can we measure debuggability? We can look at source maps support and the generated stack traces.

Source maps support

I split this category into 5 levels:

  • level 1: no source maps, but needs them (wacky stack trace line numbers)

  • level 2: no source maps and needs them sometimes (to view the original code)

    Streamline used to be in this category but now it does have source maps support.

  • level 3: has source maps and needs them always.

    Nothing is in this category.

  • level 4: has source maps and needs them sometimes

    Generator libraries are in this category. When compiled with traceur (e.g. for the browser), source maps are required. If native ES6 generators are available, source maps are unnecessary.

    Streamline is also in this category for another reason. With streamline, you don't need source maps to get accurate stack traces. However, you will need them if you want to read the original code (e.g. when debugging in the browser).

  • level 5: doesn't need source maps

    Everything else is in this category. That's a bit unfair as fibers will never work in a browser.

Stack trace accuracy

This category also has 5 levels:

  • level 1: stack traces are missing

    suspend, co and gens are in this category. When an error happens in one of the async functions, this is what the result looks like:

    Error: Error happened
      at null._onTimeout (/home/spion/Documents/tests/async-compare/lib/fakes.js:27:27)
      at Timer.listOnTimeout [as ontimeout] (timers.js:105:15)

    No mention of the original file, examples/suspend.js.

    Unfortunately, if you throw an error to a generator using iterator.throw(error), the last yield point will not be present in the resulting stack trace. This means you will have no idea which line in your generator is the offending one.

    Regular exceptions that are not thrown using iterator.throw have complete stack traces, so only yield points will suffer.

    Some solutions that aren't generator based are also in this category, namely promiseish.js and async.js. When a library handles errors for you, the callback stack trace will not be preserved unless special care is taken to preserve it. async and when don't do that.

  • level 2: stack traces are correct with native modules

    Bruno Jouhier's generator based solution galaxy is in this category. It has a native companion module called galaxy-stack that implements long stack traces without a performance penalty.

    Note that galaxy-stack doesn't work with node v0.11.5

  • level 3: stack traces are correct with a flag (adding a performance penalty).

    All Q-based solutions are here, even qasync.js, which uses generators. Q's support for stack traces via Q.longStackSupport = true; is good:

    Error: Error happened
        at null._onTimeout (/home/spion/Documents/tests/async-compare/lib/fakes.js:27:27)
        at Timer.listOnTimeout [as ontimeout] (timers.js:105:15)
    From previous event:
        at /home/spion/Documents/tests/async-compare/examples/qasync.js:41:18
        at GeneratorFunctionPrototype.next (native)

    So, does this mean that it's possible to add long stack trace support to a callback-based generator library the way that Q does it?

    Yes it does! Genny is in this category too:

    Error: Error happened
        at null._onTimeout (/home/spion/Documents/tests/async-compare/lib/fakes.js:27:27)
        at Timer.listOnTimeout [as ontimeout] (timers.js:105:15)
    From generator:
        at upload (/home/spion/Documents/tests/async-compare/examples/genny.js:38:35)

    However it incurs about 50-70% memory overhead and is about 6 times slower.

    Catcher is also in this category; it has about 100% memory overhead and is about 10 times slower.

  • level 4: stack traces are correct but fragile

    All the raw-callback solutions are in this category: original, flattened, flattened-class, etc. At the moment, rx.js is in this category too.

    As long as the callback functions are defined by your code everything will be fine. However, the moment you introduce some wrapper that handles the errors for you, your stack traces will break and will show functions from the wrapper library instead.

  • level 5: stack traces are always correct

    Streamline and fibers are in this category. Streamline compiles the file in a way that preserves line numbers, making stack traces correct in all cases. Fibers also preserve the full call stack.
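The iterator.throw() problem from level 1 is easy to reproduce (a minimal sketch; note that V8 captures an error's stack when the Error object is constructed, which happens outside the generator):

```javascript
// Minimal reproduction of the iterator.throw() issue from level 1: the
// error's stack trace is captured where the Error is constructed, outside
// the generator, so the offending yield point never shows up in it.
function* upload() {
  yield 'pretend this is an async operation'; // this line won't be in the trace
}

var iterator = upload();
iterator.next(); // run to the first yield point

var caught = null;
try {
  iterator.throw(new Error('Error happened'));
} catch (e) {
  // e.stack points at the `new Error` line above, not at the yield
  caught = e;
}
```

Since the generator has no try/catch of its own, the thrown error propagates straight back out of iterator.throw(), carrying only the stack captured at construction time.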

Ah yes. A table.

name                source maps  stack traces  total
fibrous.js                    5             5     10
src-streamline._js            4             5      9
original.js                   5             4      9
flattened*.js                 5             4      9
rx.js                         5             4      9
catcher.js                    5             3      8
promiseishQ.js                5             3      8
qasync.js                     4             3      7
genny.js                      4             3      7
async.js                      5             1      6
promiseish.js                 5             1      6
promises.js                   5             1      6
suspend.js                    4             1      5
gens.js                       4             1      5
co.js                         4             1      5

Generators are not exactly great. They're doing well enough thanks to qasync and genny.


If this analysis left you even more confused than before, you're not alone. It seems hard to make a decision even with all the data available.

My opinion is biased. I love generators, and I've been pushing pretty hard to direct the attention of V8 developers to them (maybe a bit too hard). And it's obvious from the analysis above that they have good characteristics: low code complexity and good performance.

More importantly, they will eventually become a part of everyday JavaScript, with no compilation (except for older browsers) or native modules required, and the yield keyword is in principle as good an indicator of async code as callbacks are.

Unfortunately, the debugging story for generators is somewhat bad, especially because of the missing stack traces for thrown errors. Fortunately, there are solutions and workarounds, like those implemented by genny (obtrusive, reduces performance) and galaxy (unobtrusive, but requires native modules).

But there are things that cannot be measured. How will the community accept generators? Will people find it hard to decide whether to use them or not? Will they be frowned upon when used in code published to npm?

I don't have the answers to these questions. I only have hunches. But they are generally positive. Generators will play an important role in the future of node.

Special thanks to Raynos, maxogden, mikeal and damonoehlman for their input on the draft version of this analysis.

Thanks to jmar777 for making suspend.
