Chapter 1. Introduction
A very brief introduction to Node.js
Node.js is many things, but mostly it's a way of running JavaScript outside the web browser. This book will cover why that's important, and what benefits Node.js provides. This introduction attempts to sum up that explanation in a few paragraphs, rather than a few hundred pages.
Many people use the JavaScript programming language extensively for programming the interfaces of Web sites. Node.js allows this popular programming language to be applied in many more contexts, in particular on Web servers. Several features of Node.js make it worthy of interest.
Node is a wrapper around the high-performance V8 JavaScript runtime from the Google Chrome browser. Node tunes V8 to work better in contexts other than the browser, mostly by providing additional APIs that are optimized for specific use cases. For example, servers often need to manipulate binary data, which the JavaScript language, and as a result V8, supports poorly. Node's Buffer class provides easy manipulation of binary data. As such, Node doesn't just provide direct access to the V8 JavaScript runtime. It also makes JavaScript more useful for the contexts in which people use Node.
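As a small illustration of the Buffer class mentioned above, the following sketch turns a string into raw bytes and prints them in two encodings. It assumes a recent Node release, where Buffer.from is the recommended way to create a buffer; code written for the Node version covered in this chapter would typically have used new Buffer(...) instead. The variable name greeting is just for illustration.

```javascript
// Buffer is a global in Node, so no require is needed.
// Assumption: a recent Node release, where Buffer.from is available.
var greeting = Buffer.from('Hello', 'utf8');

// A Buffer holds raw bytes rather than characters.
console.log(greeting.length);              // 5
console.log(greeting[0]);                  // 72, the byte value of 'H'
console.log(greeting.toString('hex'));     // 48656c6c6f
console.log(greeting.toString('base64'));  // SGVsbG8=
```

Each index into the buffer returns a number between 0 and 255, which is the kind of byte-level access that plain JavaScript strings do not offer.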
V8 itself uses some of the newest techniques in compiler technology. This often allows code written in a high-level language like JavaScript to perform similarly to code written in a much lower-level language, like C, at a fraction of the development cost. This focus on performance is a key aspect of Node.
JavaScript is an event-driven language, and Node uses this to its advantage to produce highly scalable servers. Using an architecture called an event loop, Node makes programming highly scalable servers both easy and safe. There are various strategies for making servers performant. Node has chosen an architecture that performs very well but also reduces complexity for the application developer. This is an extremely important feature. Programming concurrency is hard and fraught with dangers. Node sidesteps these while still offering impressive performance. As always, any approach has trade-offs, which are discussed in detail later in the book.
To support the event-loop approach, Node supplies a set of "non-blocking" libraries. In essence, these are interfaces to things like the filesystem or databases that operate in an event-driven way. When you make a request to the file system, rather than requiring Node to wait for the hard drive to spin up and retrieve the file, the non-blocking interface simply notifies Node when it has access, in the same way that web browsers notify your code about an onclick event. This model simplifies access to slow resources in a scalable way that is intuitive to JavaScript programmers and easy to learn for everyone else.
While not unique to Node, supporting JavaScript on the server is also a powerful feature. Whether we like it or not, the browser environment gives us little choice of programming languages. Certainly, if we would like our code to work in any reasonable percentage of browsers, JavaScript is the only choice. To achieve any aspirations of sharing code between the server and the browser, we must use JavaScript. Given the increasing complexity of the client applications we are building in the browser with JavaScript (such as GMail), the more code we can share between the browser and the server, the more we can reduce the cost of creating rich web applications. Since we must rely on JavaScript in the browser, having a server-side environment that uses JavaScript opens the door to code sharing in a way that is not possible with other server-side languages such as PHP, Java, Ruby or Python. While there are other platforms that support programming web servers with JavaScript, Node is quickly becoming the dominant platform in the space.
Aside from what you can build with Node, one extremely pleasing aspect is how much you can build for Node. Node is extremely extensible with a large volume of community modules having been built in the relatively short time the project has been running. Many of these are drivers to connect with databases or other software, but many are also useful software applications in their own right.
The last but certainly not least reason to celebrate Node is its community. The Node project is still very young, and yet rarely has the author seen such fervor around a project. Both novices and experts have coalesced around the project to use and contribute to Node, making it both a pleasure to explore and a supportive place to share and get advice.
Installing Node.js
Installing Node.js is extremely simple. Node runs on Windows, Linux, Mac and other POSIX OSes (such as Solaris and BSD). Node.js is available from two primary locations: the project's web site or the Github repository. You're probably better off with the Node web site because it contains the stable releases. The latest cutting edge features are hosted on Github for the core development team and anyone else who wants a copy. While these features are new and often intriguing, they are also less stable than those in a stable release.
Let's get started by installing Node.js. The first thing to do is download Node.js from the web site. So let's go there and find the latest release. From the Node homepage find the download link. The current release at the time of print is 0.6.6, which is a stable release. The Node web site provides installers for Windows and Mac as well as the stable source code. If you are on Linux you can either do a source install or you can use your usual package manager (apt-get, yum, etc).
Note
Node.js version numbers follow the C convention of major.minor.patch. Stable versions of Node.js have an even minor version number; development versions have an odd minor version number. It's unclear when Node will start using the version numbers, but it's a fair assumption that it will only be when the Windows and Unix combined release is considered mature.
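The even/odd rule in this note can be expressed in a few lines. The helper below, isStableRelease, is hypothetical and not part of Node; it is a sketch that applies only to the major.minor.patch convention described here.

```javascript
// Hypothetical helper, not part of Node: applies the even/odd minor
// version convention described in the note above.
function isStableRelease(version) {
  // Accept both '0.6.6' and 'v0.6.6'.
  var parts = version.replace(/^v/, '').split('.');
  var minor = parseInt(parts[1], 10);
  return minor % 2 === 0;
}

console.log(isStableRelease('0.6.6'));   // true: minor version 6 is even
console.log(isStableRelease('v0.5.10')); // false: minor version 5 is odd
```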
If you are doing a source install, you can follow the steps in this section. If you used an installer, you can skip to the section called “First Steps in Code”. Otherwise, once you have the source code, you'll need to unpack it. The tar command does this using the flags xzf. The x stands for extract (rather than compress), z tells tar to also decompress using the GZIP algorithm, and f indicates we are unpacking the filename given as the final argument:
Example 1.1. Unpacking the code
enki:Downloads $ tar xzf node-v0.6.6.tar.gz
enki:Downloads $ cd node-v0.6.6
enki:node-v0.6.6 $ ls
AUTHORS Makefile common.gypi doc test
BSDmakefile Makefile-gyp configure lib tools
ChangeLog README.md configure-gyp node.gyp vcbuild.bat
LICENSE benchmark deps src wscript
enki:node-v0.6.6 $
The next step is to configure the code for your system. Node.js uses the configure/make system for its installation. The configure script looks at your system and finds the paths Node needs to use for its dependencies. Node generally has very few dependencies. The installer requires Python 2.4 or greater, and if you wish to use TLS or cryptography (such as SHA1), Node needs the OpenSSL development libraries. Running configure will let you know whether any of these dependencies are missing.
Example 1.2. Configuring the Node install
enki:node-v0.6.6 $ ./configure
Checking for program g++ or c++ : /usr/bin/g++
Checking for program cpp : /usr/bin/cpp
Checking for program ar : /usr/bin/ar
Checking for program ranlib : /usr/bin/ranlib
Checking for g++ : ok
Checking for program gcc or cc : /usr/bin/gcc
Checking for gcc : ok
Checking for library dl : yes
Checking for openssl : not found
Checking for function SSL_library_init : yes
Checking for header openssl/crypto.h : yes
Checking for library util : yes
Checking for library rt : not found
Checking for fdatasync(2) with c++ : no
'configure' finished successfully (0.991s)
enki:node-v0.6.6 $
The next installation step is to make the project. This compiles Node and builds the binary version of the project that you will use into a build directory of the source directory we've been using. Node numbers each of the build steps it needs to do so you can follow the progress it makes during the compile.
Example 1.3. Compiling Node with the make command
enki:node-v0.6.6 $ make
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
Parallel Jobs: 1
Product type: program
[ 1/35] copy: src/node_config.h.in -> out/Release/src/node_config.h
[ 2/35] cc: deps/http_parser/http_parser.c -> out/Release/deps/http_parser/http_parser_3.o
/usr/bin/gcc -rdynamic -pthread -arch x86_64 -g -O3 -DHAVE_OPENSSL=1 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DHAVE_FDATASYNC=0 -DARCH="x64" -DPLATFORM="darwin" -D__POSIX__=1 -Wno-unused-parameter -D_FORTIFY_SOURCE=2 -IRelease/deps/http_parser -I../deps/http_parser ../deps/http_parser/http_parser.c -c -o Release/deps/http_parser/http_parser_3.o
[ 3/35] src/node_natives.h: src/node.js lib/dgram.js lib/console.js lib/buffer.js lib/querystring.js lib/punycode.js lib/http.js lib/net.js lib/stream.js lib/events.js lib/util.js lib/module.js lib/_debugger.js lib/assert.js lib/fs.js lib/child_process.js lib/os.js lib/readline.js lib/vm.js lib/url.js lib/tls.js lib/crypto.js lib/sys.js lib/https.js lib/freelist.js lib/dns.js lib/_linklist.js lib/buffer_ieee754.js lib/tty.js lib/cluster.js lib/repl.js lib/path.js lib/string_decoder.js lib/timers.js lib/zlib.js lib/constants.js -> out/Release/src/node_natives.h
[ 4/35] uv: deps/uv/include/uv.h -> out/Release/deps/uv/uv.a
...
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'build' finished successfully (2m53.573s)
-rwxr-xr-x 1 sh1mmer staff 6.8M Jan 3 21:56 out/Release/node
enki:node-v0.6.6 $
The final step is to use make to install Node. First I'm going to show how to install Node globally for the whole system. This requires you either to have access to the root user or to have sudo privileges that let you act as root.
Example 1.4. Installing Node for the whole system
enki:node-v0.6.6 $ sudo make install
Password:
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
Parallel Jobs: 1
Product type: program
* installing deps/uv/include/ares.h as /usr/local/include/node/ares.h
* installing deps/uv/include/ares_version.h as /usr/local/include/node/ares_version.h
* installing deps/uv/include/uv.h as /usr/local/include/node/uv.h
...
* installing out/Release/src/node_config.h as /usr/local/include/node/node_config.h
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'install' finished successfully (0.915s)
enki:node-v0.6.6 $
If you want to install only for the local user, and avoid using the sudo command, you need to run the configure script with the --prefix argument to tell Node to install somewhere other than the default.
Example 1.5. Installing Node for a local user
enki:node-v0.6.6 $ mkdir ~/local
enki:node-v0.6.6 $ ./configure --prefix=~/local
Checking for program g++ or c++ : /usr/bin/g++
Checking for program cpp : /usr/bin/cpp
...
'configure' finished successfully (0.501s)
enki:node-v0.6.6 $ make && make install
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
...
* installing out/Release/node as /Users/sh1mmer/local/bin/node
* installing out/Release/src/node_config.h as /Users/sh1mmer/local/include/node/node_config.h
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'install' finished successfully (0.747s)
enki:node-v0.6.6 $
First Steps in Code
Node REPL
One of the things that's often hard to understand about Node.js is that, in addition to being a server, it's also a runtime environment in the same way that Perl, Python and Ruby are. As such, while we often refer to Node.js as "server-side JavaScript," that doesn't really accurately describe what Node.js does. One of the best ways to get to grips with Node.js is to use Node REPL ("Read-Evaluate-Print-Loop"), an interactive Node.js programming environment. It's great for testing out and learning about Node.js. You can try out any of the snippets in this book using Node REPL. In addition, because Node is a wrapper around V8, Node REPL is an ideal place to easily try out JavaScript. However, when you want to run a Node program, you can write it in your favourite text editor, save it in a file, and simply run node filename.js. REPL is a great learning and exploration tool, but we don't use it for production code.
Let's launch Node REPL and try out a few bits of JavaScript to warm up. Open up a console on your system. The system I'm using is a Mac with a custom command prompt, so your system might look a little different, but the commands should be the same:
Example 1.6. Starting Node REPL and trying some JavaScript
Enki:~ $ node
> 3 > 2 > 1
false
> true == 1
true
> true === 1
false
Note
The first line, which evaluates to false, is from http://wtfjs.com, a collection of weird and amusing things about JavaScript.
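The results in Example 1.6 follow from how JavaScript evaluates comparisons, and can be unpacked step by step. The variable names below are just for illustration.

```javascript
// Comparisons are evaluated left to right, so 3 > 2 > 1 is (3 > 2) > 1.
var firstStep = 3 > 2;            // true
var secondStep = firstStep > 1;   // true is converted to 1, and 1 > 1 is false

console.log(firstStep);           // true
console.log(secondStep);          // false
console.log((3 > 2 > 1) === secondStep); // true: same result as the one-liner

// == converts types before comparing; === does not.
console.log(Number(true));        // 1, which is why true == 1 is true
console.log(typeof true, typeof 1); // boolean number, so true === 1 is false
```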
Having a live programming environment is a really great learning tool, but you should know a few helpful features of Node REPL to make the most of it. It offers meta-commands, which all start with a period (.). Thus, .help shows the help menu, .clear clears the current context, and .exit quits Node REPL. The most useful command is .clear, which wipes out any variables or closures you have in memory without the need to restart the REPL.
Example 1.7. Using the meta-features in Node REPL
> console.log('Hello World');
Hello World
> .help
.clear Break, and also clear the local context.
.exit Exit the prompt
.help Show repl options
> .clear
Clearing context...
> .exit
Enki:~ $
When using REPL, simply typing the name of a variable will enumerate it in the shell. Node tries to do this intelligently, so a complex object won't just be represented as a simple Object, but through a description that reflects what's in the object. The main exception to this involves functions. It's not that REPL doesn't have a way to enumerate functions; it's that functions have a tendency to be very large. If REPL enumerated functions, a lot of output could scroll by.
Example 1.8. Setting and enumerating objects with REPL
Enki:~ $ node
> myObj = {};
{}
> myObj.list = ["a", "b", "c"];
[ 'a', 'b', 'c' ]
> myObj.doThat = function(first, second, third) { console.log(first); };
[Function]
> myObj
{ list: [ 'a', 'b', 'c' ]
, doThat: [Function]
}
>
A First Server
While REPL gives us a great tool for learning and experimentation, the main application of Node.js is as a server. One of the specific design goals of Node.js is to provide a highly scalable server environment. This is an area where Node differs from V8, which I described in Chapter 1, Introduction. Although the V8 runtime is used in Node.js to interpret the JavaScript, Node.js also uses a number of highly optimized libraries to make the server efficient. In particular, the HTTP module was written from scratch in C to provide a very fast non-blocking implementation of HTTP. Let's take a look at the canonical Node "Hello World" example using an HTTP server.
Example 1.9. A Hello World Node.js Web Server
var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(8124, "127.0.0.1");
console.log('Server running at http://127.0.0.1:8124/');
The first thing that this code does is use require to include the HTTP library into the program. This concept is used in many languages, but Node uses the CommonJS module format, which we'll talk about more later in the chapter. The main thing to know at this point is that the functionality in the HTTP library is now assigned to the http object.
Next we need an HTTP server. Unlike some languages such as PHP, which run inside a server such as Apache, Node itself acts as the Web server. However, that also means we have to create it. The next line calls a factory method from the HTTP module that creates new HTTP servers. The new HTTP server isn't assigned to a variable; it's simply going to be an anonymous object in the global scope. Instead, we use chaining to initialize the server and tell it to listen on port 8124.
When calling createServer, we passed an anonymous function as an argument. This function is attached to the new server's event listener for the request event. Events are central to both JavaScript and Node. In this case, whenever there is a new request to the Web server, it will call the method we've passed to deal with the request. We call these kinds of methods callbacks. That's because whenever an event happens, we "call back" all the methods listening for that event.
Perhaps a good analogy would be ordering a book from a bookshop. When your book is in stock they call back to let you know you can come and collect it. This specific callback takes the arguments req for the request object and res for the response object.
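To make the link between createServer and the request event more concrete, here is a sketch that attaches the same callback in two ways: once by passing it to the factory method, and once by listening for the event by name. The names handleRequest, implicitServer and explicitServer are illustrative. Neither server is told to listen on a port, so the script exits immediately.

```javascript
var http = require('http');

// The callback from the Hello World example, given a name so it can be reused.
function handleRequest(req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}

// Form 1: pass the callback to the factory method.
var implicitServer = http.createServer(handleRequest);

// Form 2: create the server first, then listen for the event by name.
var explicitServer = http.createServer();
explicitServer.on('request', handleRequest);

// Both servers now hold the same function as their request listener.
console.log(implicitServer.listeners('request')[0] === handleRequest); // true
console.log(explicitServer.listeners('request')[0] === handleRequest); // true
```

Which form to use is a matter of taste; the second makes the event explicit, while the first is shorter.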
Inside the function we created for the callback, we call a couple of methods on the res object. These calls modify the response. This example doesn't use the request, but typically you would use both the request and response objects.
The first thing we must do is set the HTTP response header. We can't send any actual response to the client without it. The res.writeHead method does this. We pass it the value 200 (for the HTTP status code 200 OK) and a list of HTTP headers. In this case the only header we specify is Content-Type.
After we've written the HTTP header to the client, we can write the HTTP body. In this case we use a single method to both write the body and close the connection. The end method closes the HTTP connection, but since we also passed it a string, it will send that to the client before it closes the connection.
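The body does not have to be sent in one go. The sketch below is a variation on the Hello World server that writes the body in pieces with res.write before finishing with res.end. To keep it self-contained, it also requests the page from itself and then shuts down. Listening on port 0 (which asks the operating system for any free port), the receivedBody variable and the agent: false option are choices made for this sketch, not part of the original example.

```javascript
var http = require('http');

// Illustrative variable: collects what the client receives.
var receivedBody = '';

var server = http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.write('Hello ');   // send part of the body and keep the response open
  res.write('World');    // send some more
  res.end('\n');         // send the final piece and finish the response
});

// Assumption for this sketch: port 0 lets the operating system pick a free
// port, so it cannot collide with a server already running on 8124.
server.listen(0, '127.0.0.1', function () {
  var options = {
    host: '127.0.0.1',
    port: server.address().port,
    path: '/',
    agent: false         // use a one-off connection for this request
  };
  http.get(options, function (res) {
    res.setEncoding('utf8');
    res.on('data', function (chunk) { receivedBody += chunk; });
    res.on('end', function () {
      console.log(JSON.stringify(receivedBody)); // "Hello World\n"
      server.close();    // stop listening so the script can exit
    });
  });
});
```

From the client's point of view the result is the same as calling res.end('Hello World\n') once; writing in pieces is mainly useful when the body is produced gradually.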
Finally, the last line of our example uses console.log. This simply prints information to STDOUT, much like the browser counterpart supported by Firebug and Web Inspector.
Let's run this with Node.js on the console and see what we get:
Example 1.10. Running the Hello World example
Enki:~ $ node
node> var http = require('http');
node> http.createServer(function (req, res) {
... res.writeHead(200, {'Content-Type': 'text/plain'});
... res.end('Hello World\n');
... }).listen(8124, "127.0.0.1");
node> console.log('Server running at http://127.0.0.1:8124/');
Server running at http://127.0.0.1:8124/
node>
Here I start a Node REPL and type in the code from the sample (I'll forgive you for pasting it from the web site). Node REPL accepts the code using ... to indicate that you haven't completed the statement and should continue entering it. When we run the console.log line, Node REPL prints out Server running at http://127.0.0.1:8124/. Now we are ready to call our Hello World example in a web browser (Figure 1.1, “Viewing the Hello World Web Server from a browser”).
It works! While this isn't exactly a stunning demo, it is notable that we got "Hello World" working in 6 lines of code. Not that I would recommend that style of coding, but we are starting to get somewhere. In the next chapter we'll look at a lot more code, but first let's think about why Node is how it is.
Why Node?
In writing this book I've been acutely aware of how new Node.js is. Many platforms take years to find adoption, and yet I've found a level of excitement around Node.js that I've never seen before in such a young platform. I hope that by looking at the reasons other people are getting so excited about Node.js I will find reasons that also resonate for you. By looking at Node.js' strengths we can find the places where it is most applicable. This chapter will look at the factors that have come together to create a space for Node.js and look at the reasons why it's become so popular in such a short time.
High Performance Web Servers
When I first started writing web applications, more than 10 years ago, the web was much smaller. Sure we had the .com bubble but the sheer volume of people on the Internet was considerably less, and the sites we made were much less ambitious. Fast forward and we have the advent of Web 2.0 and widely available Internet connections on cell phones. So much more is expected of us as developers. Not only are the features we need to deliver more complex, more interactive, more real but many more people are using them more often from more devices than ever before. This is a pretty steep challenge. While hardware continues to improve we also need to make improvements to our software development practices to support such demands. If we kept just buying hardware to support ever increasing features or users it wouldn't be very cost effective.
Node is an attempt to solve this problem by introducing the architecture called event-driven computing to the programming space for web servers. As it turns out, Node isn't the first platform to do this, but it is by far the most successful, and I would argue the easiest to use. We are going to talk about event-driven programming in a lot more detail later in the book, but let's do a quick intro. Imagine you connect to a web server to get a web page. The time to reach that web server is probably 100ms or so over a reasonable DSL connection. When you connect to a typical web server, it creates a new instance of a program on the server that represents your request. That program runs from the top to the bottom (following all of the function calls) to provide your web page. This means that the server has to allocate a fixed amount of memory to that request until it is totally finished, including the 100ms+ to send the data back to you. Node doesn't work that way. Instead, Node keeps all users in a single program. Whenever Node has to do something slow, like wait for a confirmation that you got your data (so it can mark your request as finished), it simply moves on to another user. I'm glossing over the details a bit, but this means Node is a lot more efficient with memory than traditional servers and can keep providing a very fast response time with lots and lots of concurrent users. This is a huge win. This approach is one of the main reasons people like Node.
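The idea of "moving on to another user" while something slow is in progress can be sketched without a real server. In the example below, setTimeout is a stand-in for slow I/O, and handleUser, finished, and the user names are all hypothetical; it is an illustration of the behaviour described above rather than how Node's HTTP server is implemented.

```javascript
// Illustrative bookkeeping: the order in which responses complete.
var finished = [];

// Hypothetical stand-in for handling a request. setTimeout plays the role
// of slow I/O, such as waiting on the network or the disk.
function handleUser(name, waitMs) {
  console.log(name + ': request received');
  setTimeout(function () {
    finished.push(name);
    console.log(name + ': response sent after ' + waitMs + 'ms');
  }, waitMs);
}

// All three users are handled by the same single program.
handleUser('alice', 100);
handleUser('bob', 10);
handleUser('carol', 50);

// Nothing has finished yet: Node accepted all three requests without
// waiting for any of the slow operations to complete.
console.log('requests accepted: 3, finished so far: ' + finished.length);
```

All three "request received" lines are printed before any response is sent, and the responses then complete in order of how long each wait takes (bob, carol, alice) rather than in the order the requests arrived.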
Professionalism in JavaScript
Another reason people like Node is JavaScript. JavaScript was created by Brendan Eich in 1995 to be a simple scripting language for use in web pages on the Netscape browser platform. Surprisingly, almost since its inception JavaScript has been used in non-browser settings. Some of the early Netscape server products supported JavaScript (known then as LiveScript) as a server-side scripting language. While server-side JavaScript didn't really catch on then, that certainly wasn't true for the exploding browser market. On the Web, JavaScript competed with Microsoft's VBScript to provide programming functionality in Web pages. It's hard to say why JavaScript won. Perhaps Microsoft allowing JavaScript in Internet Explorer[1] did it. Perhaps it was the JavaScript language itself. But win it did. This meant that by the early 2000s JavaScript had emerged as the Web language. Not the first choice, but the only choice for programming with HTML in browsers.
What does this have to do with Node.js? Well, the important thing to remember is that when the AJAX revolution happened and the Web became big business (think Yahoo, Amazon, Google, etc), the only choice for the "J" in AJAX was JavaScript. There simply wasn't an alternative. As a result, a whole industry needed an awful lot of JavaScript programmers, really good ones at that, rather fast. The emergence of the Web as a serious platform and JavaScript as its programming language meant that we, as JavaScript programmers, needed to shape up. We can equate the change in JavaScript as the second or third programming language of a programmer to the change in perception of its importance. We started to get emerging experts who led the charge in making JavaScript respectable.
Arguably at the head of this movement was Douglas Crockford. His popular articles and videos on JavaScript have helped many programmers discover that inside a much-maligned language there is a lot of inner beauty. Most programmers working with JavaScript had spent the majority of their time working with the browser implementation of the W3C DOM API for manipulating HTML or XML documents. Unfortunately, the DOM is probably not the prettiest API ever conceived, but worse, its various implementations in the browsers are inconsistent and incomplete. No wonder that for a decade after its release JavaScript was not thought of as a "proper" language by so many programmers. More recently, Douglas' work on "the good parts" of JavaScript has helped create a movement of advocates of the language who recognize that it has a lot going for it despite the warts.
In 2010 we now have a proliferation of JavaScript experts advocating well written, performant, maintainable JavaScript code. People such as Douglas Crockford, Dion Almaer, Peter Paul Koch (PPK), John Resig, Alex Russell, Thomas Fuchs, and many more have provided research, advice, tools, and primarily libraries that have allowed thousands of professional JavaScript programmers worldwide to practice their trade with a spirit of excellence. Libraries like jQuery, YUI, Dojo, Prototype, Mootools, Sencha and many others are now used daily by thousands of people and deployed on millions of Web sites. It is in this environment where JavaScript is not only accepted, but widely used and celebrated that a platform larger than the web makes sense. When so many programmers know JavaScript its ubiquity has become a distinct advantage.
When I speak at conferences I can ask a room full of Web programmers what languages they use. Java and PHP are very popular, Ruby is probably next most popular these days or at least closely tied with Python, and Perl still has a huge following. However, almost without exception, anyone who does any programming for the web has programmed in JavaScript. While backend languages are fractured, in-browser programming is united by the necessities of deployment. Various browsers and browser plugins allow the use of other languages, but they simply aren't universal enough for the web. So here we are with a single universal web language. How can we get it on the server?
Browser Wars 2.0
Fairly early in the days of the Web we had the infamous browser wars. Internet Explorer and Netscape competed viciously on Web features, adding various incompatible programmatic features to their browsers and not supporting the features in the other browser. For those of us who programmed the web this was the cause of much anguish because it made Web programming really tiresome. Internet Explorer more or less emerged the winner of that round and became the dominant browser. Fast forward a few years: Internet Explorer has been languishing at version 6, and a new contender, Firefox, emerges from the remnants of Netscape. Firefox kicks off a resurgence in browsers, followed by Webkit (Safari) and then Chrome. Most interesting about this current trend is the return of competition to the browser market.
Unlike the first iteration of the browser wars, today's browsers compete on two fronts: adhering to the standards that emerged after the previous browser war, and performance. As Web sites have become more complex, users want the fastest experience possible. This has meant that browsers not only need to support the Web standards well, allowing developers to optimize, but also to do a little optimization of their own. JavaScript, being a core component of Web 2.0 and AJAX web sites, has become part of the battleground.
Each browser has its own JavaScript runtime: SpiderMonkey for Firefox, SquirrelFish Extreme for Safari, Carakan for Opera, and finally V8 for Chrome. As these runtimes compete on performance, they create an environment of innovation for JavaScript. In order to differentiate their browsers, vendors are going to great lengths to make them as fast as possible.
[1] Internet Explorer doesn't actually support JavaScript or ECMAScript; it supports a language variant called JScript. In recent years JScript has fully supported the ECMAScript 3 standard and has some ECMAScript 5 support. However, it also implements proprietary extensions, in the same way that Mozilla JavaScript has features that ECMAScript does not.





