Chapter 5. Helper APIs
This chapter covers a number of core APIs that you'll almost certainly use regularly, but that aren't used as much as those in Chapter 4.
DNS
Programmers, like end users, normally want to refer to things by their domain names instead of their IP addresses. The dns module provides this lookup facility to you, but it is also used under the hood whenever you are able to use a domain name, for example in HTTP clients.
The dns module consists of two main methods and a number of convenience methods. The two main methods are resolve(), which turns a domain name into a DNS record, and reverse(), which turns an IP address into a domain. All of the other methods in the dns module are more specialised forms of these methods.
dns.resolve() takes three arguments:
- A string containing the domain to be resolved
This can include subdomains, such as www.yahoo.com. The www is technically a hostname, but the system will resolve it for you.
- A string containing the types of records being requested
This requires a little more understanding of DNS. Most people are familiar with the "address" or A record type. This type of record maps a domain name (as defined in the previous item) to an IPv4 address. The "canonical name" or CNAME records allow you to create an alias of an A record or another CNAME. For example, www.example.com might be a CNAME of the A record at example.com. MX records point to the mail server for a domain for the use of SMTP. When you email person@domain.com, the MX record for domain.com tells your email server where to send their mail. Text records, or TXT, are notes attached to a domain. They have been used for all kinds of functions. The final type supported by this library is Service or SRV records, which provide information on the services available at a particular domain.
- A callback
This returns the response from the DNS server. The prototype will be shown in the example that follows.
Calling dns.resolve() is easy, although the callback may be slightly different from other callbacks you've used so far.
Example 5.1. Calling dns.resolve()
var dns = require('dns');

dns.resolve('yahoo.com', 'A', function(e, r) {
  if (e) {
    console.log(e);
  }
  console.log(r);
});
We called dns.resolve() with the domain and a record type of A, along with a trivial callback that prints results. The first argument of the callback is an error object. If an error occurs, the object will be non-null, and we can consult it to see what went wrong. The second argument is a list of the records returned by the query.
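The error-first callback shape described above can be sketched as a small reusable handler. This is only an illustration: handleRecords is a hypothetical helper rather than part of the dns module, and the handler is invoked by hand with sample values so that the sketch runs without a network connection.

```javascript
// Hypothetical helper: builds a callback in the (error, records) shape
// that dns.resolve() expects. Messages are passed to `log`.
function handleRecords(log) {
  return function (e, r) {
    if (e) {
      // On failure there are no records to print, so stop here.
      log('lookup failed: ' + e.message);
      return;
    }
    log('records: ' + r.join(', '));
  };
}

// Invoke the handler by hand with sample values (documentation-range addresses).
var lines = [];
var callback = handleRecords(function (line) { lines.push(line); });
callback(null, ['192.0.2.1', '192.0.2.2']);
callback(new Error('sample failure'));
console.log(lines);

// With a network connection, the same handler could be passed to the resolver:
// require('dns').resolve('yahoo.com', 'A', handleRecords(console.log));
```

Returning early in the error branch avoids printing an undefined record list after a failed lookup.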
There are convenience methods for all the types of records listed earlier. For example, rather than calling resolve('example.com', 'MX', callback) you can call resolveMx('example.com', callback) instead. The API also provides resolve4() and resolve6() methods, which resolve IPv4 and IPv6 address records respectively.
Example 5.2. Using resolve() vs resolveMx()
var dns = require('dns');
dns.resolve('example.com', 'MX', function(e, r) {
  if (e) {
    console.log(e);
  }
  console.log(r);
});
dns.resolveMx('example.com', function(e, r) {
  if (e) {
    console.log(e);
  }
  console.log(r);
});
Since resolve() usually returns a list containing many IP addresses, there is also a convenience method called dns.lookup() that returns just one IP address from an A record query. The method takes a domain, an IP family (4 or 6), and a callback. Unlike dns.resolve(), it always returns a single address. If you don't pass an IP family, it defaults to the network interface's current setting.
Example 5.3. Looking up a single A record with lookup()
var dns = require('dns');
dns.lookup('google.com', 4, function(e, a) {
  console.log(a);
});
Crypto
Cryptography is used in lots of places for a variety of tasks. Node uses the OpenSSL library as the basis of its cryptography. This is because OpenSSL is already a well-tested, hardened implementation of cryptographic algorithms. But you have to compile Node with OpenSSL support in order to use the methods in this section.
The crypto module enables a number of different tasks. First, it powers the SSL/TLS parts of Node. Second, it contains hashing algorithms like MD5 or SHA-1 that you might want to use in your application. Third, it allows you to use HMAC. [12] Fourth, it provides ciphers for encrypting data. Finally, it contains other public key cryptographic functions to sign data and verify signatures.
Each of these tasks is handled by a Class (or Classes), which we'll look at in the following sections.
Hashing
Hashes are used for a few important functions, such as obfuscating data in a way that allows it to be validated or providing a small checksum for a much larger piece of data. In order to use hashes in Node, you should create a Hash object using the factory method crypto.createHash(). This returns a new Hash instance using a specified hashing algorithm. Most popular algorithms are available. The exact ones depend on your version of OpenSSL, but common ones are:
md5
sha1
sha256
sha512
ripemd160
These algorithms all have different advantages and disadvantages. MD5, for example, is used in many applications but has a number of known flaws including collision issues.[13] Depending on your application, you can pick either a widely deployed algorithm like MD5 or (preferably) the newer SHA1, or a less universal but more hardened algorithm like RIPEMD or SHA256 or SHA512.
Once you have a Hash object, you add the data to be hashed by calling hash.update(). You can keep updating a Hash with more data until you want to output it; the data you add to the hash is simply concatenated to the data passed in previous calls. To output the hash, call the hash.digest() method. This will output the digest of the data that was input into the hash with hash.update(). No more data can be added after you call hash.digest().
Example 5.4. Creating a digest using hash
> var crypto = require('crypto');
> var md5 = crypto.createHash('md5');
> md5.update('foo');
{}
> md5.digest();
'¬½\u0018ÛLÂø\\íïeOÌĤØ'
>
Notice that the output of the digest is a bit weird. That's because it's the binary representation. More commonly, a digest is printed in hex. We can do that by adding 'hex' as a parameter to hash.digest:
Example 5.5. The lifespan of hashes and getting hex output
> var md5 = crypto.createHash('md5');
> md5.update('foo');
{}
> md5.digest();
'¬½\u0018ÛLÂø\\íïeOÌĤØ'
> md5.digest('hex');
Error: Not initialized
at [object Context]:1:5
at Interface.<anonymous> (repl.js:147:22)
at Interface.emit (events.js:42:17)
at Interface._onLine (readline.js:132:10)
at Interface._line (readline.js:387:8)
at Interface._ttyWrite (readline.js:564:14)
at ReadStream.<anonymous> (readline.js:52:12)
at ReadStream.emit (events.js:59:20)
at ReadStream._emitKey (tty_posix.js:280:10)
at ReadStream.onData (tty_posix.js:43:12)
> var md5 = crypto.createHash('md5');
> md5.update('foo');
{}
> md5.digest('hex');
'acbd18db4cc2f85cedef654fccc4a4d8'
>
When we call hash.digest() again, we get an error. This is because once hash.digest() is called, the Hash object is finalised and cannot be reused. We need to create a new instance of Hash and use that instead. This time we get the hex output that is often more useful. The options for hash.digest() output are binary (default), hex, and base64.
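Because a Hash can be digested only once, it can be convenient to wrap the create, update, and digest steps in one function. The following is a minimal sketch of that idea; digestAs is a hypothetical helper name, and it passes an explicit encoding every time rather than relying on the default output format.

```javascript
var crypto = require('crypto');

// Hypothetical helper: hash `data` once and return the digest in `encoding`.
// A fresh Hash is created on every call because digest() finalises the object.
function digestAs(algorithm, data, encoding) {
  var hash = crypto.createHash(algorithm);
  hash.update(data);
  return hash.digest(encoding);
}

var hex = digestAs('md5', 'foo', 'hex');
var base64 = digestAs('md5', 'foo', 'base64');

console.log(hex);    // acbd18db4cc2f85cedef654fccc4a4d8 (as in Example 5.5)
console.log(base64); // the same 16 bytes, base64-encoded
```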
Because the data passed in successive calls to hash.update() is concatenated, the following examples produce identical digests:
Example 5.6. Looking at how hash update concatenates input
> var sha1 = crypto.createHash('sha1');
> sha1.update('foo');
{}
> sha1.update('bar');
{}
> sha1.digest('hex');
'8843d7f92416211de9ebb963ff4ce28125932878'
> var sha1 = crypto.createHash('sha1');
> sha1.update('foobar');
{}
> sha1.digest('hex');
'8843d7f92416211de9ebb963ff4ce28125932878'
>
It is also important to know that while hash.update() looks a lot like a stream, it isn't really. You can easily hook a stream to hash.update(), but you can't use stream.pipe().
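Hooking a stream to hash.update() by hand might look like the following sketch. The hashStream helper is hypothetical, and a plain EventEmitter stands in for a readable stream (it emits the same data and end events) so that the example runs without touching the disk or network.

```javascript
var crypto = require('crypto');
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: feed every 'data' chunk from `source` into a Hash and
// hand the hex digest to `done` once the source emits 'end'.
function hashStream(source, algorithm, done) {
  var hash = crypto.createHash(algorithm);
  source.on('data', function (chunk) {
    hash.update(chunk);
  });
  source.on('end', function () {
    done(hash.digest('hex'));
  });
}

// Stand-in for a readable stream: emit two chunks, then 'end'.
var source = new EventEmitter();
var result;
hashStream(source, 'sha1', function (digest) { result = digest; });
source.emit('data', 'foo');
source.emit('data', 'bar');
source.emit('end');
console.log(result); // 8843d7f92416211de9ebb963ff4ce28125932878 (as in Example 5.6)
```

With a real stream, the call would look something like hashStream(fs.createReadStream('key.pem'), 'sha1', console.log).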
HMAC
HMAC combines the hashing algorithms with a cryptographic key in order to stop a number of attacks on the integrity of the signature. This means that HMAC uses both a hashing algorithm (such as the ones discussed in the previous section) and an encryption key. The HMAC API in Node is virtually identical to the Hash API. The only difference is that the creation of an Hmac object requires a key as well as a hash algorithm.
crypto.createHmac() returns an instance of Hmac, which offers update() and digest() methods that work identically to the Hash methods we saw in the previous section.
The key required to create an Hmac object is passed as a string. In this chapter we use a PEM-encoded key, which is easy to create on the command line using OpenSSL.
Example 5.7. Creating a PEM encoded key
Enki:~ $ openssl genrsa -out key.pem 1024
Generating RSA private key, 1024 bit long modulus
...++++++
............................++++++
e is 65537 (0x10001)
Enki:~ $
This example creates an RSA private key in PEM format and puts it into a file, in this case called key.pem. We could also have called the same functionality directly from Node using the process module (discussed later in this chapter) if we omitted the -out key.pem option to get the results on the stdout stream. Instead we are going to import the key from the file and use it to create an Hmac object and create a digest:
Example 5.8. Creating an hmac digest
> var crypto = require('crypto');
> var fs = require('fs');
>
> var pem = fs.readFileSync('key.pem');
> var key = pem.toString('ascii');
>
> var hmac = crypto.createHmac('sha1', key);
>
> hmac.update('foo');
{}
> hmac.digest('hex');
'7b058f2f33ca28da3ff3c6506c978825718c7d42'
>
This example uses fs.readFileSync(), since a lot of the time, loading keys will be a server setup task. As such, it's fine to load them synchronously (which might slow down server startup) because you aren't serving clients yet, so blocking the event loop is OK. In general, other than the use of the key, using an Hmac is exactly like using a Hash.
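To see the effect of the key, the sketch below computes the same HMAC under two different keys. It makes a few assumptions: hmacHex is a hypothetical helper, and short placeholder strings stand in for the contents of key.pem so that the example runs without any files on disk.

```javascript
var crypto = require('crypto');

// Hypothetical helper: HMAC `data` with `key` and return the hex digest.
function hmacHex(algorithm, key, data) {
  var hmac = crypto.createHmac(algorithm, key);
  hmac.update(data);
  return hmac.digest('hex');
}

// Placeholder keys for illustration; a real key would be loaded at startup.
var keyA = 'placeholder key A';
var keyB = 'placeholder key B';

var first = hmacHex('sha1', keyA, 'foo');
var again = hmacHex('sha1', keyA, 'foo');
var other = hmacHex('sha1', keyB, 'foo');

console.log(first === again); // true: same key and data give the same digest
console.log(first === other); // false: a different key changes the digest
```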
Public Key Cryptography
The public key cryptography functions are split into four Classes: Cipher, Decipher, Sign, and Verify. Like all the other Classes in crypto, they have factory methods. Cipher encrypts data, Decipher decrypts data, Sign creates a cryptographic signature for data, and Verify validates cryptographic signatures.
For the HMAC operations, we used a private key. For these operations we are going to use both the public and the private keys. Public key cryptography has matched sets of keys. One, the private key, is kept by the owner and it is used to decrypt and sign data. The other, the public key, is made available to other parties. The public key can be used to encrypt data that only the private key owner can read, or to verify the signature of data signed with the private key.
Let's extract the public key of the private key we generated to do the HMAC digests. Node expects public keys in certificate format, which requires you to input additional "information." But you can leave all the information blank if you like:
Example 5.9. Extracting a public key certificate from a private key
Enki:~ $ openssl req -key key.pem -new -x509 -out cert.pem
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:
State or Province Name (full name) [Some-State]:
Locality Name (eg, city) []:
Organization Name (eg, company) [Internet Widgits Pty Ltd]:
Organizational Unit Name (eg, section) []:
Common Name (eg, YOUR name) []:
Email Address []:
Enki:~ $ ls cert.pem
cert.pem
Enki:~ $
We simply ask OpenSSL to read in the private key, and then output the public key into a new file called cert.pem in X509 certificate format. All of the operations in crypto expect keys in PEM format.
Encrypting with Cipher
The Cipher Class provides a wrapper for encrypting data using a private key. The factory method to create a cipher takes an algorithm and the private key. The algorithms supported come from those compiled into your OpenSSL implementation, for example:
blowfish
aes192
Many modern cryptographic algorithms use block ciphers. This means that the output is always in standard-sized "blocks." The block sizes vary between algorithms: Blowfish, for example, uses 64-bit (8-byte) blocks. This is significant when using the Cipher API, because the API will always output fixed-size blocks. This helps prevent information from being leaked to an attacker about the data being encrypted or the specific key being used to do the encryption.
Like Hash and Hmac, the Cipher API also uses the update() method to input data. However, update() works differently when used in a cipher. First, cipher.update() returns a block of encrypted data if it can. This is where block size becomes important. If the amount of data in the cipher plus the amount of data passed to cipher.update() is enough to create one or more blocks, the encrypted data will be returned. If there isn't enough to form a block, the input will be stored in the cipher.
Cipher also has a new method, cipher.final(), which replaces the digest() method. When cipher.final() is called, any remaining data in the cipher will be returned encrypted, but with enough padding to make sure the block size is reached.
Example 5.10. Ciphers and block size
> var crypto = require('crypto');
> var fs = require('fs');
>
> var pem = fs.readFileSync('key.pem');
> var key = pem.toString('ascii');
>
> var cipher = crypto.createCipher('blowfish', key);
>
> cipher.update(new Buffer(4), 'binary', 'hex');
''
> cipher.update(new Buffer(4), 'binary', 'hex');
'ff57e5f742689c85'
> cipher.update(new Buffer(4), 'binary', 'hex');
''
> cipher.final('hex')
'96576b47fe130547'
>
In order to make the example easier to read, I specified the input and output formats. The input and output formats are both optional and will be assumed to be binary unless specified. For this example, I specified a binary input format because I'm passing a new Buffer (containing whatever random junk was in memory), along with hex output to produce something easier to read. You can see that the first time I call cipher.update(), with 4 bytes of data, I get back an empty string. The second time, because I have enough data to generate a block, I get the encrypted data back as hex. When I call cipher.final(), there isn't enough data to create a full block, so the output is padded and a full (and final) block is returned. If I sent more data than would fit in a single block, cipher.final() would output as many blocks as it could before padding. Since cipher.final() is just for outputting existing data, it doesn't accept an input format.
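The same buffering behavior can be observed with a different algorithm. The sketch below rests on several assumptions that are not part of the example above: it uses crypto.createCipheriv() with AES-192 in CBC mode (16-byte blocks) instead of createCipher() with Blowfish, and the key and initialisation vector are fixed placeholder values chosen only for illustration.

```javascript
var crypto = require('crypto');

// Placeholder key and IV for illustration only: AES-192 needs a 24-byte key,
// and CBC mode needs a 16-byte initialisation vector.
var key = Buffer.alloc(24, 1);
var iv = Buffer.alloc(16, 2);
var cipher = crypto.createCipheriv('aes-192-cbc', key, iv);

// Feed 4 zero bytes at a time and record the hex length of what comes back.
var lengths = [];
for (var i = 0; i < 5; i++) {
  lengths.push(cipher.update(Buffer.alloc(4), undefined, 'hex').length);
}

// final() pads the 4 leftover bytes up to a full 16-byte block.
var last = cipher.final('hex');

console.log(lengths);     // hex characters returned by each update() call
console.log(last.length); // hex characters returned by final()
```

Only the fourth call, which completes a 16-byte block, should return any output; the fifth call's 4 bytes stay buffered until final() pads them.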
Decrypting with Decipher
The Decipher class is almost the exact inverse of the Cipher class. You can pass encrypted data to a Decipher object using decipher.update(). It will buffer the data into blocks until it can output the unencrypted data. You might think that since cipher.update() and cipher.final() always give fixed-length blocks, you would have to give perfect blocks to decipher, but luckily it will buffer the data, so you can pass it data you got off some other I/O transport such as the disk or network, even if this might give you block sizes different from those used by the encryption algorithm.
Let's take a look at an example of encrypting data and then decrypting it:
Example 5.11. Encrypting and decrypting text
> var crypto = require('crypto');
> var fs = require('fs');
>
> var pem = fs.readFileSync('key.pem');
> var key = pem.toString('ascii');
>
> var plaintext = new Buffer('abcdefghijklmnopqrstuv');
> var encrypted = "";
> var cipher = crypto.createCipher('blowfish', key);
>
> encrypted += cipher.update(plaintext, 'binary', 'hex');
> encrypted += cipher.final('hex');
>
> var decrypted = "";
> var decipher = crypto.createDecipher('blowfish', key);
> decrypted += decipher.update(encrypted, 'hex', 'binary');
> decrypted += decipher.final('binary');
>
> var output = new Buffer(decrypted);
>
> output
<Buffer 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76>
> plaintext
<Buffer 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76>
>
It is important to make sure both the input and output formats match up for both the plaintext and the encrypted data. It's also worth noting that in order to get a buffer, you'll have to make one from the strings returned by cipher and decipher.
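One way to sidestep the string conversions is to leave out the encodings and work with buffers throughout. The following sketch assumes a Node release that provides crypto.createCipheriv() and crypto.createDecipheriv(), and uses AES-192 in CBC mode with a placeholder key and IV rather than the Blowfish setup above. It also feeds the decipher in 5-byte chunks to show that the chunks need not line up with the cipher's blocks.

```javascript
var crypto = require('crypto');

// Placeholder key and IV for illustration only (24-byte key, 16-byte IV).
var key = Buffer.alloc(24, 1);
var iv = Buffer.alloc(16, 2);

var plaintext = Buffer.from('abcdefghijklmnopqrstuv');

// Encrypt, keeping the output as Buffers rather than encoded strings.
var cipher = crypto.createCipheriv('aes-192-cbc', key, iv);
var encrypted = Buffer.concat([cipher.update(plaintext), cipher.final()]);

// Decrypt in 5-byte chunks, which do not line up with the 16-byte blocks.
var decipher = crypto.createDecipheriv('aes-192-cbc', key, iv);
var pieces = [];
for (var i = 0; i < encrypted.length; i += 5) {
  pieces.push(decipher.update(encrypted.subarray(i, i + 5)));
}
pieces.push(decipher.final());
var decrypted = Buffer.concat(pieces);

console.log(decrypted.toString());        // abcdefghijklmnopqrstuv
console.log(decrypted.equals(plaintext)); // true
```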
Creating signatures using Sign
Signatures verify that some data has been authenticated by the signer using the private key. However, unlike with HMAC, the public key can be used to authenticate the signature. The API for Sign is nearly identical to that for HMAC. crypto.createSign() is used to make a Sign object. It takes only the signing algorithm. sign.update() allows you to add data to the Sign object. When you want to create the signature, call sign.sign() with a private key to sign the data.
Example 5.12. Signing data with sign
> var sign = crypto.createSign('RSA-SHA256');
> sign.update('abcdef');
{}
> sig = sign.sign(key, 'hex');
'35eb47af5260a00c7bad26edfbe7732a897a3a03290963e3d17f48331a42d48cc0753621e95c374c917424574be237e112c2be2a5eff74c200697b58e275a846f46da5e6f53ba69a0532b1db7e93aba49e3a463620f53461e7215189ae4a1658f4a57720174dddc5a9eaffbd34982ca4a3eefc03886f3caa53e013c3f03aa81b'
>
Verifying signatures with Verify
The Verify API uses a method like the ones we've seen, verify.update(), to add data. When you have added all the data to be verified against the signature, verify.verify() validates the signature. It takes the cert (the public key), the signature, and the format of the signature.
Example 5.13. Verifying signatures
> var crypto = require('crypto');
> var fs = require('fs');
>
> var privatePem = fs.readFileSync('key.pem');
> var publicPem = fs.readFileSync('cert.pem');
> var key = privatePem.toString();
> var pubkey = publicPem.toString();
>
> var data = "abcdef"
>
> var sign = crypto.createSign('RSA-SHA256');
> sign.update(data);
{}
> var sig = sign.sign(key, 'hex');
>
> var verify = crypto.createVerify('RSA-SHA256');
> verify.update(data);
{}
> verify.verify(pubkey, sig, 'hex');
1
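A verification is only useful if it fails when the data changes, which the following sketch checks. It rests on assumptions that differ from the example above: the key pair is generated in memory with crypto.generateKeyPairSync() (available in newer Node releases) instead of being read from key.pem and cert.pem, and signHex and verifyHex are hypothetical helper names. On such releases verify() returns a boolean rather than 1.

```javascript
var crypto = require('crypto');

// Assumption: generate a throwaway RSA key pair in memory instead of
// reading key.pem and cert.pem from disk.
var pair = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Hypothetical helpers wrapping the Sign and Verify objects.
function signHex(privateKey, data) {
  var sign = crypto.createSign('RSA-SHA256');
  sign.update(data);
  return sign.sign(privateKey, 'hex');
}

function verifyHex(publicKey, data, signature) {
  var verify = crypto.createVerify('RSA-SHA256');
  verify.update(data);
  return verify.verify(publicKey, signature, 'hex');
}

var sig = signHex(pair.privateKey, 'abcdef');
console.log(verifyHex(pair.publicKey, 'abcdef', sig)); // true
console.log(verifyHex(pair.publicKey, 'abcdeX', sig)); // false: the data was changed
```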
Processes
Although Node abstracts a lot of things from the operating system, you are still running in an operating system and may want to interact more directly with it. Node allows you to interact with system processes that already exist, as well as create new child processes to do work of various kinds. While Node itself is generally a 'fat' thread with a single event loop, you are free to start other processes to do work outside of the event loop.
process module
The process module gives you the ability to get information about and change the settings of the current Node process. Unlike most modules, the process module is global and is always available as the variable process.
process events
process is an instance of EventEmitter, so it provides events based on system calls to the Node process. The exit event provides a final hook before the Node process exits. Importantly, the event loop will not run after the exit event, so only code without callbacks will be executed:
Example 5.14. Calling code when Node is exiting
process.on('exit', function () {
  setTimeout(function () {
    console.log('This will not run');
  }, 100);
  console.log('Bye.');
});
Since the loop isn't going to run again, the setTimeout() code will never be evaluated.
An extremely useful event provided by process is uncaughtException. After you've spent any time with Node, you'll find that exceptions that hit the main event loop will kill your Node process. In many use cases, especially servers that are expected to never be down, this is unacceptable. The uncaughtException event provides an extremely brute force way of catching these exceptions. It's really a last line of defense, but it's extremely useful for that purpose.
Example 5.15. Trapping an exception with the uncaughtException event
process.on('uncaughtException', function (err) {
  console.log('Caught exception: ' + err);
});
setTimeout(function () {
  console.log('This will still run.');
}, 500);
// Intentionally cause an exception, but don't catch it.
nonexistentFunc();
console.log('This will not run.');
Let's break down what's happening. First we create an event listener for uncaughtException. This is not a smart handler; it simply outputs the exception to stdout. If this Node script were running as a server, stdout could easily be used to log into a file and capture these errors. Because the handler captures the exception thrown by the call to the non-existent function, Node will not exit. However, the standard flow is disrupted. We know that all the JavaScript runs once, and then any callbacks will be run each time their event listener emits an event. In this scenario, since nonexistentFunc() will throw an exception, no code following it will be called. However, anything that has already been set up will continue to work. This means the setTimeout() callback will still be called. This is significant when writing servers. Let's consider some more code in this area:
Example 5.16. The effect on callbacks of catching exceptions
var http = require('http');
var server = http.createServer(function(req, res) {
  res.writeHead(200, {});
  res.end('response');
  badLoggingCall('sent response');
  console.log('sent response');
});
process.on('uncaughtException', function(e) {
  console.log(e);
});
server.listen(8080);
This code creates a simple HTTP server and then listens for any uncaught exceptions at the process level. In our HTTP server, the callback deliberately calls a bad function after we've sent the HTTP response. Let's look at the console output for this script.
Example 5.17. Output of Example 5.16, “The effect on callbacks of catching exceptions”
Enki:~ $ node ex-test.js
{ stack: [Getter/Setter],
arguments: [ 'badLoggingCall' ],
type: 'not_defined',
message: [Getter/Setter] }
{ stack: [Getter/Setter],
arguments: [ 'badLoggingCall' ],
type: 'not_defined',
message: [Getter/Setter] }
{ stack: [Getter/Setter],
arguments: [ 'badLoggingCall' ],
type: 'not_defined',
message: [Getter/Setter] }
{ stack: [Getter/Setter],
arguments: [ 'badLoggingCall' ],
type: 'not_defined',
message: [Getter/Setter] }
When we start the example script, the server is available and I have made a number of HTTP requests to it. Notice that the server doesn't shut down at any point. Instead the errors are logged using the function attached to the uncaughtException event. However, we are still serving complete HTTP requests. Why? The deliberate error prevented the callback in progress from proceeding and calling console.log(), but it affected only that one callback and the server kept running, so any other code was unaffected by the exception encapsulated in one specific code path.
It's important to understand the way that listeners are implemented in Node.
Example 5.18. The abbreviated listener code for EventEmitter
EventEmitter.prototype.emit = function(type) {
  ...
  var handler = this._events[type];
  ...
  } else if (isArray(handler)) {
    var args = Array.prototype.slice.call(arguments, 1);
    var listeners = handler.slice();
    for (var i = 0, l = listeners.length; i < l; i++) {
      listeners[i].apply(this, args);
    }
    return true;
  ...
};
After an event is emitted, one of the checks in the run-time handler is to see whether there is an array of listeners. If there is more than one listener, the run-time calls the listeners by looping through the array in order. This means the first attached listener will be called first with apply(), then the second, etc. What's important to note here is that all listeners on the same event are part of the same code path. So an uncaught exception in one callback will stop execution for all other callbacks on the same event. However, an uncaught exception in one instance of an event won't affect other events.
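The shared code path can be demonstrated directly. In this sketch the event names job and other are made up for illustration, and the exception is caught with try/catch around emit() rather than with uncaughtException, so that the result can be inspected straight away.

```javascript
var EventEmitter = require('events').EventEmitter;

var emitter = new EventEmitter();
var calls = [];

// Three listeners on the same event; the second one throws.
emitter.on('job', function () { calls.push('first'); });
emitter.on('job', function () { throw new Error('deliberate failure'); });
emitter.on('job', function () { calls.push('third'); });

// A listener on a different event.
emitter.on('other', function () { calls.push('other'); });

try {
  emitter.emit('job');
} catch (e) {
  // The exception propagates out of emit(), so the third listener never runs.
  calls.push('caught: ' + e.message);
}

// A separate emit is a separate code path and is unaffected.
emitter.emit('other');

console.log(calls); // [ 'first', 'caught: deliberate failure', 'other' ]
```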
We also get access to a number of system events through process. When the process gets a signal, it is exposed to Node via events emitted by process. An operating system can generate a lot of POSIX system events, which can be found in the sigaction(2) man page. Really common ones include SIGINT, the interrupt signal. Typically, a SIGINT is what happens when you press CTRL-C in the terminal on a running process. Unless you handle the signal events via process, Node will just perform the default action, which in the case of a SIGINT would immediately kill the process. You can change the default behavior (except for a couple of signals that can never get caught) through the process.on() method.
Example 5.19. Catching signals to the Node process
// Start reading from stdin so we don't exit.
process.stdin.resume();
process.on('SIGINT', function () {
  console.log('Got SIGINT. Press Control-D to exit.');
});
In order to make sure Node doesn't exit on its own, we read from stdin (described later in the section called “Operating system input/output”) so the Node process continues to run. If you press CTRL-C while the program is running, the OS will send a SIGINT to Node, which will be caught by the SIGINT event handler. Here, instead of exiting, we log to the console.
Interacting with the current Node process
Process contains a lot of meta-information about the Node process. This can be really helpful when you need to manage your Node environment from within the process. There are a number of properties that contain immutable (read-only) information about Node, such as:
- process.version
Contains the version number of the instance of Node you are running.
- process.installPrefix
Contains the install path (/usr/local, ~/local, etc.) used during installation.
- process.platform
Lists the platform on which Node is currently running. The output will specify the kernel (linux2, darwin, etc.) rather than "Redhat ES3," "Windows 7," "OSX 10.7," etc.
- process.uptime()
Contains the number of seconds the process has been running.
There are also a number of things that you can get and set about the Node process. When the process runs, it does so with a particular user and group. You can get these and set them with process.getgid(), process.setgid(), process.getuid(), and process.setuid(). These can be very useful for making sure that Node is running in a secure way. It's worth noting that the set methods take either the numerical ID of the group or user, or the group name or username itself. However, if you pass a name, the methods do a blocking lookup to turn it into an ID, which takes a little time.
The Process ID or PID of the running Node instance is also available as the process.pid property. You can set the title that Node displays to the system using the process.title property. Whatever is set in this property will be displayed in the ps command. This can be extremely useful when using multiple Node processes in a production environment. Instead of having a lot of processes called node or possibly node app.js, you can set names intelligently to make them easy to refer to. When one process is hogging CPU or RAM, it's great to have a quick idea of which one is doing so.
Other available information includes process.execPath, which shows the execution path of the current Node binary (e.g., /usr/local/bin/node). The current working directory (to which all files opened will be relative) is accessible with process.cwd(). The working directory is the directory you were in when Node was started. You can change it using process.chdir() (this will throw an exception if the directory is unreadable or doesn't exist). You can also get the current memory usage of the current Node process using process.memoryUsage(). This returns an object specifying the size of the memory usage in a couple of ways: rss shows how much RAM is being used and vsize shows the total memory used, including both RAM and swap. You'll also get some V8 stats: heapTotal and heapUsed show how much memory V8 has allocated and how much it is actively using.
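Several of these values can be gathered into a single snapshot, for example to write to a log at startup. The sketch below is only an illustration: describeProcess and the title helper-api-demo are hypothetical names, and it reads only the fields that are widely available across Node versions.

```javascript
// Collect a few of the properties described above into one object.
// `describeProcess` is a hypothetical helper, not part of Node.
function describeProcess() {
  var memory = process.memoryUsage();
  return {
    version: process.version,
    platform: process.platform,
    pid: process.pid,
    execPath: process.execPath,
    cwd: process.cwd(),
    uptimeSeconds: process.uptime(),
    rss: memory.rss,
    heapTotal: memory.heapTotal,
    heapUsed: memory.heapUsed
  };
}

// Give the process a recognisable name for tools such as ps.
process.title = 'helper-api-demo';

console.log(describeProcess());
```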
Operating system input/output
There are a number of places where you can interact with the OS (besides making changes to the Node process the program is running in) from process. One of the main ones is having access to the standard OS I/O streams. stdin is the default input stream to the process, stdout is the process's output stream, and stderr is its error stream. These are exposed with process.stdin, process.stdout, and process.stderr respectively. process.stdin is a readable stream, whereas process.stdout and process.stderr are writable streams.
process.stdin
stdin is a really useful device for interprocess communication. It's used to facilitate things like piping in the shell. When I type cat file.txt | node program.js, it will be the stdin stream that receives the data from the cat command.
Since process is always available, the process.stdin stream is always initialized in any Node process. But it starts out in a paused state, where Node can write to it but you can't read from it. Before attempting to read from stdin, call its resume() method. Until then, Node will just fill the read buffer for the stream and then stop until you are ready to deal with it. This approach avoids data loss.
Example 5.20. Writing stdin to stdout
process.stdin.resume();
process.stdin.setEncoding('utf8');
process.stdin.on('data', function (chunk) {
  process.stdout.write('data: ' + chunk);
});
process.stdin.on('end', function () {
  process.stdout.write('end');
});
We ask process.stdin to resume(), set the encoding to UTF-8, and then set a listener to push any data sent to process.stdout. When the process.stdin sends the end event, we pass that on to the process.stdout stream. We could also easily do this with the stream pipe() method, since stdin and stdout are both real streams.
Example 5.21. Writing stdin to stdout using Pipe
process.stdin.resume();
process.stdin.pipe(process.stdout);
This is the most elegant way of connecting two streams.
process.stderr
stderr is used to output exceptions and problems with program execution. In POSIX systems, because it is a separate stream, output logs and error logs can be easily redirected to different destinations. This can be very desirable, but it creates a couple of caveats in Node. When you write to stderr, Node guarantees that the write will happen. However, unlike a regular stream, this is done as a blocking call. Typically, calls to stream.write() return a boolean value indicating whether Node was able to write to the kernel buffer. With process.stderr this will always be true, but it might take a while to return, unlike the regular write(). Typically, it will be very fast, but the kernel buffer may sometimes be full and hold up your program. This means that it is generally inadvisable to write a lot to stderr in a production system, because it may block real work.
One final thing to note is that process.stderr is always a UTF-8 stream. Any data you write to process.stderr will be interpreted as UTF-8 without you having to set an encoding. Moreover, you are not able to change the encoding here.
Another place where Node programmers often touch the operating system is to retrieve the arguments passed when their program is started. process.argv is an array containing the command-line arguments, starting with the node command itself.
Example 5.22. A simple script outputting argv
console.log(process.argv);
Example 5.23. Running Example 5.22, “A simple script outputting argv”
Enki:~ $ node argv.js -t 3 -c "abc def" -erf foo.js
[ 'node',
'/Users/croucher/argv.js',
'-t',
'3',
'-c',
'abc def',
'-erf',
'foo.js' ]
Enki:~ $
There are a few things to notice here. First, the process.argv array is simply a split of the command line based on whitespace. If there are many whitespace characters between two arguments, they count as only a single split. The check for whitespace is \s+ in regex. This doesn't count for whitespace in quotes, however. Quotes can be used to keep tokens together. Also, notice how the first file argument is expanded. So you can pass a relative file argument on the command line and it will appear as its absolute pathname in argv. This is also true for special characters such as using ~ to refer to the home directory. Only the first argument is expanded this way.
argv is extremely helpful for writing command-line scripts, but it's pretty raw. There are a number of community projects that extend its support to help you easily write command-line applications, including support for automatically enabling features, writing inline help systems, and other more advanced features.
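To make the "pretty raw" point concrete, here is a minimal sketch of the kind of work those community projects do for you. The parseArgs() helper and its convention (a token starting with a dash is a flag, and the following token is its value unless that token is also a flag) are my own assumptions for illustration. They are not part of Node or of any particular library.

```javascript
// Hypothetical helper: turn raw argv tokens into flags and positional values.
// Assumed convention: a token starting with "-" is a flag. If the next token
// is not a flag, it is taken as that flag's value; otherwise the flag is true.
function parseArgs(argv) {
  const args = argv.slice(2); // skip the node binary and the script path
  const result = { flags: {}, positional: [] };
  for (let i = 0; i < args.length; i++) {
    const token = args[i];
    if (token.charAt(0) === '-' && token.length > 1) {
      const name = token.replace(/^-+/, '');
      const next = args[i + 1];
      if (next !== undefined && next.charAt(0) !== '-') {
        result.flags[name] = next;
        i++; // the value has been consumed
      } else {
        result.flags[name] = true;
      }
    } else {
      result.positional.push(token);
    }
  }
  return result;
}

console.log(parseArgs(process.argv));
```

Even this small helper has to make choices. Run against the arguments from Example 5.23, it treats foo.js as the value of -erf rather than as a file to open, which is exactly the sort of ambiguity a fuller library lets you declare up front.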
Event loop and tickers
If you've done work with JavaScript in browsers, you'll be familiar with setTimeout(). In Node we have a much more direct way to access the event loop and defer work that is extremely useful. process.nextTick() creates a callback to be executed on the next "tick" or iteration of the event loop. While it is implemented as a queue, it will supersede other events. Let's explore that a little bit:
Example 5.24. Using process.nextTick() to insert callbacks into the event loop
> var http = require('http');
> var s = http.createServer(function(req, res) {
... res.writeHead(200, {});
... res.end('foo');
... console.log('http response');
... process.nextTick(function(){console.log('tick')});
... });
> s.listen(8000);
>
> http response
tick
http response
tick
This example creates an HTTP server with a callback that creates a callback on process.nextTick(). No matter how many requests I make to the HTTP server, the "tick" will always occur on the next pass of the event loop. Unlike other callbacks, nextTick() callbacks are not a single event and thus are not subject to the usual callback exception brittleness:
Example 5.25. nextTick() continues after other code's exceptions
process.on('uncaughtException', function(e) {
console.log(e);
});
process.nextTick(function() {
console.log('tick');
});
process.nextTick(function() {
iAmAMistake();
console.log('tock');
});
process.nextTick(function() {
console.log('tick tock');
});
console.log('End of 1st loop');
Example 5.26. Results of Example 5.25, “nextTick() continues after other code's exceptions”
Enki:~ $ node process-next-tick.js
End of 1st loop
tick
{ stack: [Getter/Setter],
arguments: [ 'iAmAMistake' ],
type: 'not_defined',
message: [Getter/Setter] }
tick tock
Enki:~ $
Despite the deliberate error, each of the ticks is isolated, unlike other event callbacks on a single event. Let's walk through the code. First we set an exception handler to catch any exceptions. Next we set a number of callbacks on process.nextTick(). Each of these outputs to the console; however, the second has a deliberate error. Finally we log a message to the console. When Node runs the program, it evaluates all the code, which includes outputting 'End of 1st loop'. Then it calls the callbacks on nextTick() in order. First 'tick' is outputted, then we throw an error. This is because we hit our deliberate mistake on the next tick. The error causes process to emit() an uncaughtException event, which runs our function to output the error to the console. Since we threw an error, 'tock' was not outputted to the console. However, 'tick tock' still is. This is because every time nextTick() is called, each callback is created in isolation. You could consider the execution of events to be emit(), which is called inline in the current pass of the event loop; nextTick(), which is called at the beginning of the event loop in preference to other events; and finally other events, in order, at the beginning of the event loop.
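The ordering described in that last sentence can be checked with a short script. This is only a sketch: the order array and the 'ping' event name are mine, and a zero-delay setTimeout() stands in for "other events".

```javascript
const EventEmitter = require('events').EventEmitter;

const order = [];
const emitter = new EventEmitter();
emitter.on('ping', function () {
  order.push('emit');
});

// A zero-delay timer is an ordinary event: it waits for a later pass.
setTimeout(function () {
  order.push('timeout');
  console.log(order.join(' -> '));
}, 0);

// nextTick() callbacks run once the current code has finished,
// ahead of timers and I/O events.
process.nextTick(function () {
  order.push('tick');
});

// emit() calls its listeners inline, before the next statement runs.
emitter.emit('ping');
order.push('end of script');
```

Although the timer and the tick are both scheduled before emit() is called, the script prints emit -> end of script -> tick -> timeout.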
Child Process
The child_process module allows you to create child processes of your main Node process. Since Node has only one event loop in a single process, sometimes it is helpful to create child processes. This can be done to make use of more cores of your CPU, because a single Node process can only use one of the cores. child_process can also be used to launch other programs and let Node interact with them. This is extremely helpful when writing command-line scripts.
There are two main methods in child_process. spawn() creates a child process with its own stdin, stdout, and stderr file descriptors. exec() creates a child process and returns the result as a callback when the process is complete. This provides one extremely versatile way to create child processes, a way that is still non-blocking but doesn't require you to write extra code in order to steam forward.
Every child process has some common properties. They all contain properties for stdin, stdout, and stderr, which I discussed in the section called “Operating system input/output”. There is also a pid property that contains the OS process ID of the process. Children emit the exit event when they exit. Other data events are available via the stream methods of child_process.stdin, child_process.stdout, and child_process.stderr.
child_process.exec()
Let's start with exec() as the most straight-forward use case. Using exec(), you can create a process that will run some program (possibly another Node program) and then return the results for you in a callback:
Example 5.27. Calling ls with exec()
var cp = require('child_process');
cp.exec('ls -l', function(e, stdout, stderr) {
if(!e) {
console.log(stdout);
console.log(stderr);
}
});
When you call exec(), you can pass a shell command for the new process to run. Note that the entire command is a string. If you need to pass arguments to the shell command, they should be constructed into the string. In the example, I passed ls the -l argument to get the long form output. You can also include complicated shell features, such as | to pipe commands. Node will return the results of the final command in the pipeline.
The callback function receives three arguments: an error object, the result of stdout, and the result of stderr. Notice that just calling ls will run it in the current working directory of Node, which you can retrieve by running process.cwd().
It's important to understand the difference between the first and third arguments. The error object returned will be null unless an error status code is returned from the child process or there was another exception. When the child process exits, it passes a status up to the parent process. In Unix, for example, this is 0 for success and an 8-bit number greater than 0 for an error. The error object is also used when the command called doesn't meet the constraints placed on it by Node. When an error code is returned from the child process, the error object will contain the error code and stderr. However, when a process is successful, there may still be data on stderr.
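The following sketch shows both situations side by side. The run() helper is hypothetical, and the two shell commands are simply ones I picked to produce a successful exit with data on stderr and an unsuccessful exit with a known status.

```javascript
const cp = require('child_process');

// Hypothetical helper: run a shell command and summarise how it ended.
function run(command, done) {
  cp.exec(command, function (e, stdout, stderr) {
    done({
      failed: e !== null,
      code: e ? e.code : 0, // exit status reported by the child
      stdout: stdout,
      stderr: stderr
    });
  });
}

// Exits with status 0 even though it writes to stderr,
// so the error object is null.
run('echo careful 1>&2', function (r) {
  console.log(r.failed, r.code, JSON.stringify(r.stderr));
});

// Exits with a non-zero status, so an error object is passed.
run('exit 3', function (r) {
  console.log(r.failed, r.code);
});
```

The point of the design is that stderr is never used to decide whether the command worked. Only the error object answers that question.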
exec() takes an optional second argument with an options object. By default this object contains the following:
Example 5.28. Default options object for child_process.exec()
var options = { encoding: 'utf8',
timeout: 0,
maxBuffer: 200 * 1024,
killSignal: 'SIGTERM',
setsid: false,
cwd: null,
env: null };
The properties are:
- encoding
The encoding for passing characters on the I/O streams.
- timeout
The number of milliseconds the process can run before Node kills it.
- killSignal
The signal to use to terminate the process in case of a time or buffer size overrun.
- maxBuffer
The maximum number of bytes that stdout or stderr may each grow to.
- setsid
Whether to create a new session inside Node for the process.
- cwd
The initial working directory for the process (where null uses Node's current working directory).
- env
The process's environment variables. All environment variables are also inherited from the parent.
Let's set some of the options to put constraints on a process. First let's try restricting the buffer size of the response:
Example 5.29. Restricting the Buffer size on child_process.exec() calls
> var child = cp.exec('ls', {maxBuffer:1}, function(e, stdout, stderr) {
... console.log(e);
... }
... );
> { stack: [Getter/Setter],
arguments: undefined,
type: undefined,
message: 'maxBuffer exceeded.' }
In this example, you can see that when we set a tiny maxBuffer (just 1 byte), running ls quickly exhausted the available space and threw an error. It's important to check for errors so that you can deal with them in a sensible way. You don't want to cause an actual exception by trying to access resources that are unavailable because you've restricted the child_process. If the child_process returns with an error, its stdin and stdout properties will be unavailable and attempts to access them will throw an exception.
It's also possible to stop a Child after a set amount of time:
Example 5.30. Setting a timeout on child_process.exec() calls
> var child = cp.exec('for i in {1..100000};do echo $i;done',
... {timeout:500, killSignal:'SIGKILL'},
... function(e, stdout, stderr) {
... console.log(e);
... });
> { stack: [Getter/Setter], arguments: undefined, type: undefined, message: 'Command failed: ', killed: true, code: null, signal: 'SIGKILL' }
This example defines a deliberately long-running process (counting from 1 to 100,000 in a shell script), but I also set a short timeout. Notice that I also specified a killSignal. By default, the kill signal is SIGTERM, but I used SIGKILL[14] to show the feature. When we get the error back, notice there is a killed property that tells us that Node killed the process and that it didn't exit voluntarily. This is also true for the previous example. Since it didn't exit on its own, there isn't a code property or some of the other properties of a system error.
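In a script, the killed property gives you a simple way to tell the three outcomes apart. The sketch below assumes a POSIX system with a sleep command available. The runWithLimit() helper and its message strings are mine, not part of Node.

```javascript
const cp = require('child_process');

// Hypothetical helper: run a command with a time limit and classify the outcome.
function runWithLimit(command, ms, done) {
  cp.exec(command, { timeout: ms }, function (e, stdout) {
    if (e && e.killed) {
      // Node ended the process because a constraint was exceeded.
      done('killed by Node (' + e.signal + ')');
    } else if (e) {
      // The process exited by itself with an error status.
      done('exited with status ' + e.code);
    } else {
      done('finished: ' + stdout.trim());
    }
  });
}

// Takes far longer than the limit, so Node kills it.
runWithLimit('sleep 2', 100, function (message) {
  console.log(message);
});

// Finishes well inside the limit.
runWithLimit('echo done', 5000, function (message) {
  console.log(message);
});
```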
child_process.spawn()
spawn() is very similar to exec(). However, it is a more general-purpose method that requires you to deal with streams and their callbacks yourself. This makes it a lot more powerful and flexible, but it also means that more code is required to do the kind of one-shot system calls we did with exec(). As a result, spawn() is most commonly used in server contexts to create subcomponents of a server. This is the most common way people make Node work with multiple cores on a single machine.
While it performs the same function as exec(), the API for spawn() is slightly different. The first argument is still the command to start the process with. However, unlike exec(), it is not a command string, but just the executable. The process's arguments are passed in an array as the (optional) second argument to spawn(). It's like an inverse of process.argv: instead of the command being split() across spaces, you provide an array to be join()ed with spaces.
Finally, spawn() also takes an options object as the final argument. Some of these options are the same as for exec(), but we'll cover that in more detail shortly.
Example 5.31. Starting child processes using spawn()
var cp = require('child_process');
var cat = cp.spawn('cat');
cat.stdout.on('data', function(d) {
console.log(d.toString());
});
cat.on('exit', function() {
console.log('kthxbai');
});
cat.stdin.write('meow');
cat.stdin.end();
Example 5.32. Results of previous example
Enki:~ $ node cat.js
meow
kthxbai
Enki:~ $
In this example I'm using the UNIX program cat. Cat simply echoes back whatever input it gets. You can see that, unlike exec(), we don't issue a callback to spawn() directly. That's because we are expecting to use the Streams provided by the Child class to get and send data. I named the variable with the instance of Child "cat". I can access cat.stdout to set events on the STDOUT stream of the child process. I set a listener on cat.stdout to watch for any data events, and I set a listener on the child itself in order to watch for the exit event. I can send my new child data using STDIN by accessing its child.stdin stream. This is just a regular writable stream. However, as a behaviour of the cat program, when I close STDIN the process exits. This might not be true for all processes, but it is true for cat, which only exists to echo back data.
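To show the arguments array as well, here is a sketch that wraps spawn() in a small helper. The collect() function is hypothetical. It listens for the close event rather than exit, which assumes a Node version that emits close once the child's streams have finished, and it uses process.execPath (the path of the running node binary) so that the child is a tiny Node program rather than a UNIX-only command.

```javascript
const cp = require('child_process');

// Hypothetical helper: spawn a program, gather its stdout, and report the
// exit status once the child and its streams have finished.
function collect(command, args, done) {
  const child = cp.spawn(command, args);
  let output = '';
  child.stdout.on('data', function (d) {
    output += d.toString();
  });
  child.on('close', function (code) {
    done(code, output);
  });
}

// Each argument is its own array element, so no shell quoting is needed
// even though the second argument contains spaces and quotes.
collect(process.execPath, ['-e', 'console.log("hello from the child")'],
  function (code, output) {
    console.log(code, output.trim());
  });
```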
The options that can be passed to spawn() aren't exactly the same as for exec(). This is because you are expected to manage more things by hand with spawn(). The env, setsid, and cwd properties are all options for spawn(), as are uid and gid to set the user ID and the group ID respectively. As with process, setting the uid or the gid to a username or a group name will block briefly while the user or group is looked up. There is one more option for spawn() which doesn't exist for exec(). You can set custom file descriptors that will be given to the new child process. Let's take a little bit of time to cover this topic because it's a little complex.
A file descriptor in UNIX is a way of keeping track of which programs are doing what with which files. Since UNIX lets many programs run at the same time, there needs to be a way to make sure that when they interact with the file system they don't accidentally overwrite someone else's changes. The file descriptor table keeps track of all the files that a process wants to access. The kernel might lock a particular file to stop two programs writing to the file at the same time, as well as performing other management functions. A process will look at its file descriptor table to find the file descriptor representing a particular file and pass that to the kernel to access the file. The file descriptor is simply an integer.
The important thing is that not just pure files are represented by file descriptors. The name is a little deceptive because network and other sockets are also allocated file descriptors. UNIX has inter-process communications (IPC) sockets which let processes talk to each other. We've been calling them STDIN, STDOUT, and STDERR. This is interesting because spawn() lets us specify specific file descriptors when starting a new child process. This means that instead of the OS assigning a new file descriptor, we can ask a child process to share an existing file descriptor with the parent process. That file descriptor might be a network socket to the Internet or just the parent's STDIN. The point is that we have a powerful way of delegating work to child processes.
How does this work in practice? When passing the options object to spawn() we can specify customFds to pass our own three file descriptors to the child instead of them creating a STDIN, STDOUT and STDERR file descriptor.
Example 5.33. Passing STDIN, STDOUT and STDERR to a child process
var cp = require('child_process');
var child = cp.spawn('cat', [], {customFds:[0, 1, 2]});
Example 5.34. Running the previous example and piping in data to STDIN
Enki:~ $ echo "foo"
foo
Enki:~ $ echo "foo" | node
readline.js:80
tty.setRawMode(true);
^
Error: ENOTTY, Inappropriate ioctl for device
at new Interface (readline.js:80:9)
at Object.createInterface (readline.js:38:10)
at new REPLServer (repl.js:102:16)
at Object.start (repl.js:218:10)
at Function.runRepl (node.js:365:26)
at startup (node.js:61:13)
at node.js:443:3
Enki:~ $ echo "foo" | cat
foo
Enki:~ $ echo "foo" | node fds.js
foo
Enki:~ $
The file descriptors 0, 1, and 2 represent STDIN, STDOUT, and STDERR respectively. In this example we create a child and pass it STDIN, STDOUT, and STDERR from the parent Node process. We can test this wiring using the command line. The echo command outputs the string we give it. If we pass that directly to node with a pipe (STDOUT to STDIN), we get an error. We can, however, pass it to the cat command, which echoes it back. If we pipe to the Node process running our script, it echoes back. This is because we've hooked up the STDIN, STDOUT, and STDERR of the Node process directly to the cat command in our child process. When the main Node process gets data on STDIN, it gets passed to the cat child process, which echoes it back on the shared STDOUT. One thing to note is that once you wire up the Node process this way, the child process loses its child.stdin, child.stdout, and child.stderr file descriptor references. This is because once you pass the file descriptors to the process they are duplicated and the kernel handles the data passing. This means that Node isn't in between the process and the file descriptors, so you cannot add events to those streams.
Example 5.35. Trying to access file descriptor streams fails when custom FDs are passed
var cp = require('child_process');
var child = cp.spawn('cat', [], {customFds:[0, 1, 2]});
child.stdout.on('data', function(d) {
console.log('data out');
});
Example 5.36. Results of test
Enki:~ $ echo "foo" | node fds.js
node.js:134
throw e; // process.nextTick error, or 'error' event on first tick
foo
^
TypeError: Cannot call method 'on' of null
at Object.<anonymous> (/Users/croucher/fds.js:3:14)
at Module._compile (module.js:404:26)
at Object..js (module.js:410:10)
at Module.load (module.js:336:31)
at Function._load (module.js:297:12)
at Array.<anonymous> (module.js:423:10)
at EventEmitter._tickCallback (node.js:126:26)
Enki:~ $
When custom file descriptors are specified, the streams are literally set to null and are completely inaccessible from the parent. It is still preferable in many cases, though, because the kernel is much faster at routing than if we did something like stream.pipe() with Node to connect the streams together. However, STDIN, STDOUT, and STDERR aren't the only file descriptors worth connecting to child processes. A very common use case is connecting network sockets to a number of children, which allows multi-core utilization.
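The customFds option belongs to the Node version this chapter was written against. As far as I know, later releases express the same idea through the stdio option of spawn(), so treat the following as a sketch for those newer versions rather than a drop-in replacement for the examples above. The child here is a tiny Node program started through process.execPath, which is my own choice to keep the sketch portable.

```javascript
const cp = require('child_process');

// Share the parent's STDIN, STDOUT and STDERR with the child.
// 'inherit' is shorthand for passing file descriptors 0, 1 and 2.
const child = cp.spawn(process.execPath,
  ['-e', 'console.log("written by the child")'],
  { stdio: 'inherit' });

// Node is no longer sitting between the child and the descriptors,
// so there are no stream objects to attach listeners to.
console.log(child.stdout === null);

child.on('exit', function (code) {
  console.log('child exited with', code);
});
```

The child's output appears on the parent's terminal without the parent ever handling the data, which is the same trade-off described above: faster routing, but no events.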
Say we are creating a web site, or a game server, or anything that has to deal with a bunch of traffic. We have this great server which has a bunch of processors, each of which has 2 or 4 cores. If we just started a Node process running our code, we'd have just one core being used. While CPU isn't always the critical factor for Node, we want to make sure we get as close to CPU bound as we can. We could start a bunch of Node processes with different ports and load balance them with Nginx or Apache Traffic Server. However, that's inelegant and requires us to use more software. We could create a Node process that creates a bunch of child processes and routes all the requests to them. This is a bit closer to our optimal solution, but we just created a single point of failure. There is one Node process which routes all the traffic. This isn't ideal. This is where passing custom FDs comes into its own. In the same way that we can pass the STDIN, STDOUT, and STDERR of a master process, we can create other sockets and pass those in to child processes. However, because we are passing file descriptors instead of messages, the kernel will deal with the routing. This means that while the master Node process is still required, it isn't bearing the load for all the traffic.
Testing through assert
assert is a core library that provides the basis for testing code. Node's assertions work pretty much like the same feature in other languages and environments: they allow you to make claims about objects and function calls and send out messages when the assertions are violated. These methods are really easy to get started with and provide a great way to unit test your code's features. Node's own tests are written with assert.
Most assert methods come in pairs: one method providing the positive test and the other providing the negative one. For instance, the following example shows equal() and notEqual(). The methods take two values to compare: the first is the actual value, and the second is the expected value. An optional third argument is the message to report if the assertion fails.
Example 5.37. Basic assertions
> var assert = require('assert');
> assert.equal(1, true, 'Truthy');
> assert.notEqual(1, true, 'Truthy');
AssertionError: Truthy
at [object Context]:1:8
at Interface.<anonymous> (repl.js:171:22)
at Interface.emit (events.js:64:17)
at Interface._onLine (readline.js:153:10)
at Interface._line (readline.js:408:8)
at Interface._ttyWrite (readline.js:585:14)
at ReadStream.<anonymous> (readline.js:73:12)
at ReadStream.emit (events.js:81:20)
at ReadStream._emitKey (tty_posix.js:307:10)
at ReadStream.onData (tty_posix.js:70:12)
>
The most obvious thing here is that when an assert method doesn't pass, it throws an exception. This is a fundamental principle in the test suites. When a test suite runs, it should just run, without throwing an exception. If that is the case, the test is successful.
There are just a few assertions. equal() and notEqual() check for the == equality and != inequality operators. This means they test weakly for truthy and falsy values, as Crockford termed them. In brief, when tested as a Boolean, falsy values consist of false, 0, empty strings (e.g., ""), null, undefined, and NaN. All other values are truthy. A string such as "false" is truthy. A string containing "0" is also truthy. As such, equal() and notEqual() are fine to compare simple values (strings, numbers, etc) with each other, but you should be careful checking against Booleans to ensure you got the result you wanted.
The strictEqual() and notStrictEqual() methods test equality with === and !==, which will ensure that only actual values of true and false are treated as true and false respectively. The ok() method is a shorthand for testing whether something is truthy: the value is converted to a Boolean and the result is compared with true.
Example 5.38. Testing something is truthy with assert.ok()
> assert.ok('This is a string', 'Strings that are not empty are truthy');
> assert.ok(0, 'Zero is not truthy');
AssertionError: Zero is not truthy
at [object Context]:1:8
at Interface.<anonymous> (repl.js:171:22)
at Interface.emit (events.js:64:17)
at Interface._onLine (readline.js:153:10)
at Interface._line (readline.js:408:8)
at Interface._ttyWrite (readline.js:585:14)
at ReadStream.<anonymous> (readline.js:73:12)
at ReadStream.emit (events.js:81:20)
at ReadStream._emitKey (tty_posix.js:307:10)
at ReadStream.onData (tty_posix.js:70:12)
>
Often the things you want to compare aren't simple values, but objects. JavaScript doesn't have a way to let objects define equality operators on themselves, and even if it did, people often wouldn't define the operators. So the deepEqual() and notDeepEqual() methods provide a way of deeply comparing object values. Without going into too many of the gory details, these methods perform a few checks. If any check fails, the test throws an exception. The first test checks whether the values simply match with the === operator. Next, the values are checked to see whether they are Buffers and, if so, are checked for their length, and then checked byte by byte. Next, if the object types don't match with the == operator, they can't be equal. Finally, if the arguments are objects, more extensive tests are done, comparing the prototypes of the two objects and the number of properties, and then recursively performing deepEqual() on each property.
The important point here is that deepEqual() and notDeepEqual() are extremely helpful and thorough but also potentially expensive. You should try to use them only when needed. Although they will attempt to do the most efficient tests first, it can still take longer to find an inequality. If you can provide a more specific reference, such as the property of an object rather than the whole object, you can improve the performance of your tests a lot.
The next assert methods are throws() and doesNotThrow(). These check whether a particular block of code does or doesn't throw an exception. You can check for a specific exception or just whether any exception is thrown. The methods are pretty straightforward, but have a few options that are worth reviewing.
It might be easy to overlook these tests, but handling exceptions is an essential part of writing robust JavaScript code, so you should use the tests to make sure the code you write throws exceptions in all the correct places. Chapter 3, Building Robust Node Applications offers more information on how to deal with exceptions well.
In order to pass blocks of code to throws() and doesNotThrow(), wrap them in functions that take no arguments. The exception being tested for is optional. If one isn't passed, throws() will just check whether any exception happened and doesNotThrow() will ensure one doesn't. If a specific error is passed, throws() will check that the specified exception and only that exception was thrown. If any other exceptions are thrown or the exception isn't thrown the test will not pass. For doesNotThrow(), when an error is specified, it will continue without error if any exception is thrown except for the specific one. If an exception matching the specified error is thrown, it will cause the test to fail.
Example 5.39. Using assert.throws() and assert.doesNotThrow() to check for exception handling
> assert.throws(
... function() {
... throw new Error("Seven Fingers. Ten is too mainstream.");
... });
> assert.doesNotThrow(
... function() {
... throw new Error("I lived in the ocean way before Nemo");
... });
AssertionError: "Got unwanted exception (Error).."
at Object._throws (assert.js:281:5)
at Object.doesNotThrow (assert.js:299:11)
at [object Context]:1:8
at Interface.<anonymous> (repl.js:171:22)
at Interface.emit (events.js:64:17)
at Interface._onLine (readline.js:153:10)
at Interface._line (readline.js:408:8)
at Interface._ttyWrite (readline.js:585:14)
at ReadStream.<anonymous> (readline.js:73:12)
at ReadStream.emit (events.js:81:20)
>
There are four ways to specify the type of error to look for or avoid. Pass one of the following:
- Comparison function
The function should take the exception error as its single argument. In the function, compare the exception actually thrown to what you expect, to find whether there is a match. Return true if there is a match, false otherwise.
- Regular expression
The library will compare the regex to the error message to find a match using the regex.test() method in JavaScript.
- String
The library will directly compare the string to the error message.
- Object constructor
The library will perform an instanceof test on the exception. If the exception is an instance of the constructor, then the exception matches. This can be used to make throws() and doesNotThrow() very flexible.
VM
The vm or Virtual Machine module allows you to run arbitrary chunks of code and get a result back. It has a number of features that allow you to change the context in which the code runs. This can be useful to act as a kind of faux sandbox. However, the code is still running in the same Node process, so you should be cautious. vm is similar to eval(), but offers some more features and a better API for managing code. It doesn't have the ability to interact with the local scope in the way that eval() does, however.
There are two ways to run code with vm. Running the code "inline" is similar to using eval(). The second way is to precompile the code into a vm.Script object. Let's have a look at running code inline using vm.
Example 5.40. Using vm to run code
> var vm = require('vm');
> vm.runInThisContext("1+1");
2
So far, vm looks a lot like eval(). We pass some code to it and we get a result back. However, vm doesn't interact with local scope in the same way that eval() does. Code run with eval() will behave as if it was truly inline and replaces the eval() method call. But calls to vm methods will not interact with the local scope. So eval() can change the surrounding context, whereas vm cannot, as shown in the following example.
Example 5.41. Accessing the local scope: the differences between vm and eval()
> var vm = require('vm'),
... e = 0,
... v = 0;
> eval('e=e+1');
1
> e
1
> vm.runInThisContext('v=v+1');
ReferenceError: v is not defined
at evalmachine.<anonymous>:1:1
at [object Context]:1:4
at Interface.<anonymous> (repl.js:171:22)
at Interface.emit (events.js:64:17)
at Interface._onLine (readline.js:153:10)
at Interface._line (readline.js:408:8)
at Interface._ttyWrite (readline.js:585:14)
at ReadStream.<anonymous> (readline.js:73:12)
at ReadStream.emit (events.js:81:20)
at ReadStream._emitKey (tty_posix.js:307:10)
>
> vm.runInThisContext('v=0');
0
> vm.runInThisContext('v=v+1');
1
> v
0
I've created two variables, e and v. When I use the e variable with eval(), the end result of the statement applies back to the main context. However, when I try the same thing with v and vm.runInThisContext(), I get an exception because I refer to v on the right side of the equal sign and that variable is not defined. While eval() runs in the local scope, vm does not.
The vm subsystem actually maintains its own local context that persists from one invocation of vm to another. Thus, if I create v within the scope of the vm, the variable is available subsequently to later vm invocations, maintaining the state in which the first vm left it. However, the variable from the vm has no impact on v in the local scope of the main event loop.
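The same contrast can be reproduced in a script file rather than at the REPL. In this sketch the compare() function and the counter variable are mine. The variable is deliberately declared inside a function so that it is local no matter how the script is run.

```javascript
const vm = require('vm');

function compare() {
  let counter = 0; // local to this function

  // eval() sees the local variable and changes it.
  eval('counter = counter + 1');

  // runInThisContext() cannot see local variables, so reading counter fails.
  let vmError = null;
  try {
    vm.runInThisContext('counter = counter + 1');
  } catch (e) {
    vmError = e.name;
  }

  return { counter: counter, vmError: vmError };
}

console.log(compare());
```

The local counter ends up at 1, changed only by eval(), and the vm call reports a ReferenceError.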
It's also possible to pass a pre-existing context to vm. This context will be used in place of the default context.
Example 5.42. Passing a context in to vm
> var vm = require('vm');
> var context = { alphabet:"" };
> vm.runInNewContext("alphabet+='a'", context);
'a'
> vm.runInNewContext("alphabet+='b'", context);
'ab'
> context
{ alphabet: 'ab' }
>
This example uses vm.runInNewContext(), which takes a context object as a second argument. The scope of that object becomes the context for the code we run with vm. If we continue to pass the same object from call to call, the context will keep being modified. However, the context object is also available to the global scope.
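A slightly larger sketch shows data flowing in both directions through the context object. The sandbox object and its property names are made up for the example.

```javascript
const vm = require('vm');

// The sandbox object is the only "global scope" the code will see.
const sandbox = { total: 0, prices: [5, 10, 20] };

vm.runInNewContext(
  'total = prices.reduce(function (a, b) { return a + b; }, 0);' +
  'label = "summed";',
  sandbox);

// Changes to existing properties are visible on our object...
console.log(sandbox.total);
// ...and so are new globals that the code created.
console.log(sandbox.label);
// Nothing leaked into the scope of the calling script.
console.log(typeof label);
```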
You can also compile vm.Script objects. These save a piece of code that you can then run repeatedly. At run time, you can choose the context to be applied. This is helpful when you are repeatedly running the same code against multiple contexts.
Example 5.43. Compiling code into a script with vm
> var vm = require('vm');
> var fs = require('fs');
>
> var code = fs.readFileSync('example.js');
> code.toString();
'console.log(output);\n'
>
> var script = vm.createScript(code);
> script.runInNewContext({output:"Kick Ass"});
ReferenceError: console is not defined
at undefined:1:1
at [object Context]:1:8
at Interface.<anonymous> (repl.js:171:22)
at Interface.emit (events.js:64:17)
at Interface._onLine (readline.js:153:10)
at Interface._line (readline.js:408:8)
at Interface._ttyWrite (readline.js:585:14)
at ReadStream.<anonymous> (readline.js:73:12)
at ReadStream.emit (events.js:81:20)
at ReadStream._emitKey (tty_posix.js:307:10)
> script.runInNewContext({"console":console,"output":"Kick Ass"});
Kick Ass
This example reads in a JavaScript file that contains the simple command console.log(output);. I compile this into a script object. I can then run script.runInNewContext() on the script and pass in a context. I deliberately triggered an error to show that, just as when running vm.runInNewContext(), you need to pass in the objects to which you refer, such as the console object, or even basic global functions are not available. It's also worth noting that the exception is thrown from undefined:1:1.
All the vm run commands take a filename as an optional final argument. It doesn't change the functionality, but allows you to set the name of the file that appears in a message if an error is thrown. This is useful if you load a lot of files from disk and run them, so that you know which piece of code threw an error. The parameter is totally arbitrary, so you could use whatever string is meaningful to help you debug the code.
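The sketch below combines both ideas: one compiled script run against several contexts, with a label attached for error reporting. It assumes a newer Node version than the examples above, where scripts are created with the vm.Script constructor and the label is passed as a filename option. The label itself, the script source, and the context objects are all invented for the example.

```javascript
const vm = require('vm');

// Compile once. The filename is an arbitrary label used only in error output.
const script = new vm.Script(
  'result = greeting + ", " + name.toUpperCase();',
  { filename: 'greeting-template.js' });

// Run the same compiled code against several contexts.
const contexts = [
  { greeting: 'Hello', name: 'enki' },
  { greeting: 'Goodbye', name: 'node' },
  { greeting: 'Oops' } // name is missing, so this run throws
];

const results = [];
contexts.forEach(function (context) {
  try {
    script.runInNewContext(context);
    results.push(context.result);
  } catch (e) {
    // The stack trace refers to our label rather than to a real file.
    const labelled = e.stack.indexOf('greeting-template.js') !== -1;
    results.push(e.name + ', label in stack: ' + labelled);
  }
});

console.log(results);
```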
[12] Hash-based Message Authentication Code is a cryptographic way of verifying data. It is often used like hashing algorithms to verify that two pieces of data match, but it also verifies that the data hasn't been tampered with.
[13] It's possible to deliberately make two pieces of data with the same MD5 checksum, which for some purposes can make the algorithm less desirable. More modern algorithms are less prone to this, although people are finding similar problems with SHA1 now.
[14] SIGKILL can be invoked in the shell through kill -9.




