Chapter 2. Type Less, Do More
In This Chapter
We ended the last chapter with a few “teaser” examples of Scala code. This chapter discusses uses of Scala that promote succinct, flexible code. We’ll discuss organization of files and packages, importing other types, variable declarations, miscellaneous syntax conventions and a few other concepts. We’ll emphasize how the concise syntax of Scala helps you work better and faster.
Scala’s syntax is especially useful when writing scripts. Separate compile and run steps aren’t required for simple programs that have few dependencies on libraries outside of what Scala provides. You compile and run such programs in one shot with the scala command. If you’ve downloaded the example code for the book, many of the smaller examples can be run using the scala command, e.g., scala filename.scala. See the README.txt files in each chapter’s code examples for more details. See also the section called “Command Line Tools” in Chapter 14, Scala Tools, Libraries and IDE Support for more information about using the scala command.
Semicolons
You may have already noticed that there were very few semicolons in the code examples in the previous chapter. You can use semicolons to separate statements and expressions, as in Java, C, PHP, and similar languages. In most cases, though, Scala behaves like many scripting languages in treating the end of the line as the end of a statement or an expression. When a statement or expression is too long for one line, Scala can usually infer when you are continuing on to the next line, as shown in this example.
// code-examples/TypeLessDoMore/semicolon-example-script.scala
// Trailing equals sign indicates more code on next line
def equalsign = {
  val reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine =
    "wow that was a long value name"
  println(reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine)
}
// Trailing opening curly brace indicates more code on next line
def equalsign2(s: String) = {
  println("equalsign2: " + s)
}
// Trailing comma, operator, etc. indicates more code on next line
def commas(s1: String,
           s2: String) = {
  println("comma: " + s1 +
          ", " + s2)
}
When you want to put multiple statements or expressions on the same line, you can use semicolons to separate them. We used this technique in the ShapeDrawingActor example in the section called “A Taste of Concurrency” in Chapter 1, Zero to Sixty: Introducing Scala.
case "exit" => println("exiting..."); exit
This code could also be written as follows.
...
case "exit" =>
  println("exiting...")
  exit
...
You might wonder why you don’t need curly braces ({…}) around the two statements after the case … => line. You can put them in if you want, but the compiler knows when you’ve reached the end of the “block” when it finds the next case clause or the curly brace (}) that ends the enclosing block for all the case clauses.
Omitting optional semicolons means fewer characters to type and fewer characters to clutter your code. Breaking separate statements onto their own lines increases your code’s readability.
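The rules in this section can be collected into one small sketch. The object and method names (SemicolonDemo, swapSum, respond) are ours, chosen only for illustration; they are not part of the book's example code.

```scala
// Illustrative sketch; all names here are hypothetical.
object SemicolonDemo {
  // A trailing operator tells the compiler that the expression continues.
  val total = 1 +
    2

  // Semicolons separate several statements written on one line.
  def swapSum(a: Int, b: Int): Int = { val x = b; val y = a; x + y }

  // A case clause may hold several statements without curly braces.
  def respond(command: String): String = command match {
    case "exit" =>
      val message = "exiting..."
      message.toUpperCase
    case other =>
      "unknown: " + other
  }

  def main(args: Array[String]): Unit = {
    println(total)
    println(respond("exit"))
  }
}
```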
Variable Declarations
Scala allows you to decide whether a variable is immutable (read-only) or not (read-write) when you declare it. An immutable “variable” is declared with the keyword val (think value object).
val array: Array[String] = new Array(5)
To be more precise, the array reference cannot be changed to point to a different Array, but the array itself can be modified, as shown in the following scala session.
scala> val array: Array[String] = new Array(5)
array: Array[String] = Array(null, null, null, null, null)
scala> array = new Array(2)
<console>:5: error: reassignment to val
array = new Array(2)
^
scala> array(0) = "Hello"
scala> array
res3: Array[String] = Array(Hello, null, null, null, null)
scala>
An immutable val must be initialized, that is, defined, when it is declared.
A mutable variable is declared with the keyword var.
scala> var stockPrice: Double = 100.
stockPrice: Double = 100.0
scala> stockPrice = 10.
stockPrice: Double = 10.0
scala>
Scala also requires you to initialize a var when it is declared. You can assign a new value to a var as often as you want. Again, to be precise, the stockPrice reference can be changed to point to a different Double object (e.g., 10.). In this case, the object that stockPrice refers to can’t be changed, because Doubles in Scala are immutable.
There are a few exceptions to the rule that you must initialize vals and vars when they are declared. Both keywords can be used with constructor parameters. When used as constructor parameters, the mutable or immutable variables specified will be initialized when an object is instantiated. Both keywords can be used to declare "abstract" (uninitialized) variables in abstract types. Also, derived types can override vals declared inside parent types. We’ll discuss these exceptions in Chapter 5, Basic Object-Oriented Programming in Scala.
Scala encourages you to use immutable values whenever possible. As we will see, this promotes better object-oriented design and it is consistent with the principles of “pure” functional programming. It may take some getting used to, but you’ll find a newfound confidence in your code when it is written in an immutable style.
Note
The var and val keywords only specify if the reference can be changed to refer to a different object (var) or not (val). They don’t specify whether or not the object they reference is mutable.
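To make the distinction concrete, here is a minimal sketch that pairs a val with a mutable object and a var with an immutable one. The names ValVarDemo, buildWithVal, and buildWithVar are hypothetical.

```scala
// Illustrative sketch; all names here are hypothetical.
object ValVarDemo {
  // val: the reference is fixed, but the StringBuilder it refers to is mutable.
  def buildWithVal(): String = {
    val builder = new StringBuilder("Type less")
    builder.append(", do more")
    // builder = new StringBuilder()  // would not compile: reassignment to val
    builder.toString
  }

  // var: the reference can change, but each String it refers to is immutable.
  def buildWithVar(): String = {
    var title = "Programming"
    title = title + " Scala"
    title
  }

  def main(args: Array[String]): Unit = {
    println(buildWithVal())
    println(buildWithVar())
  }
}
```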
Method Declarations
We saw several examples in Chapter 1, Zero to Sixty: Introducing Scala of how to define methods, which are functions that are members of a class. Method definitions start with the def keyword, followed by the method name, optional argument lists, a colon character ‘:’ and the return type of the method, an equals sign ‘=’, and finally the method body. Methods are implicitly declared “abstract” if you leave off the equals sign and method body. The enclosing type is then itself abstract. We’ll discuss abstract types in more detail in Chapter 5, Basic Object-Oriented Programming in Scala.
We said “optional argument lists”, meaning more than one. Scala lets you define more than one argument list for a method. This is required for currying methods, which we’ll discuss in the section called “Currying” in Chapter 8, Functional Programming in Scala. It is also very useful for defining your own domain-specific languages (DSLs), as we’ll see in Chapter 11, Domain-Specific Languages in Scala. Note that each argument list is surrounded by parentheses and the arguments are separated by commas.
If a method body has more than one expression, you must surround it with curly braces {…}. You can omit the braces if the method body has just one expression.
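The following sketch shows the pieces of a method definition described above: a single-expression body, a braced body, two argument lists, and an abstract method. MethodDeclDemo and Greeter are hypothetical names used only for illustration.

```scala
// Illustrative sketch; all names here are hypothetical.
object MethodDeclDemo {
  // One expression: the curly braces may be omitted.
  def square(x: Int): Int = x * x

  // More than one expression: the curly braces are required.
  def describe(x: Int): String = {
    val kind = if (x % 2 == 0) "even" else "odd"
    x.toString + " is " + kind
  }

  // Two argument lists, each surrounded by its own parentheses.
  def scale(factor: Int)(values: List[Int]): List[Int] =
    values.map(v => v * factor)
}

// No equals sign and no body, so greet is abstract and Greeter must be too.
abstract class Greeter {
  def greet(name: String): String
}
```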
Method Default and Named Arguments (Scala Version 2.8)
Many languages let you define default values for some or all of the arguments to a method. Consider the following script with a StringUtil object that lets you join a list of strings with a user-specified separator.
// code-examples/TypeLessDoMore/string-util-v1-script.scala
// Version 1 of "StringUtil".
object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)
  def joiner(strings: List[String]): String = joiner(strings, " ")
}
import StringUtil._ // Import the joiner methods.
println( joiner(List("Programming", "Scala")) )
There are actually two “overloaded” joiner methods. The second one uses a single space as the “default” separator. Having two methods seems a bit wasteful. It would be nice if we could eliminate the second joiner method and declare that the separator argument in the first joiner has a default value. In fact, in Scala version 2.8, you can now do this.
// code-examples/TypeLessDoMore/string-util-v2-v28-script.scala
// Version 2 of "StringUtil" for Scala v2.8 only.
object StringUtil {
  def joiner(strings: List[String], separator: String = " "): String =
    strings.mkString(separator)
}
import StringUtil._ // Import the joiner methods.
println(joiner(List("Programming", "Scala")))
There is another alternative for earlier versions of Scala. You can use implicit arguments, which we will discuss in the section called “Implicit Function Parameters” in Chapter 8, Functional Programming in Scala.
Scala version 2.8 offers another enhancement for method argument lists, named arguments. We could actually write the last line of the previous example in several ways. All of the following println statements are functionally equivalent.
println(joiner(List("Programming", "Scala")))
println(joiner(strings = List("Programming", "Scala")))
println(joiner(List("Programming", "Scala"), " ")) // #1
println(joiner(List("Programming", "Scala"), separator = " ")) // #2
println(joiner(strings = List("Programming", "Scala"), separator = " "))
Why is this useful? First, if you choose good names for the method arguments, then your calls to those methods document each argument with a name. For example, compare the two lines with comments #1 and #2. In the first line, it may not be obvious what the second, " " argument is for. In the second case, we supply the name separator, which suggests the purpose of the argument.
The second benefit is that you can specify the parameters in any order when you specify them by name. Combined with default values, you can write code like the following.
// code-examples/TypeLessDoMore/user-profile-v28-script.scala
// Scala v2.8 only.
object OptionalUserProfileInfo {
  val UnknownLocation = ""
  val UnknownAge = -1
  val UnknownWebSite = ""
}
class OptionalUserProfileInfo(
  location: String = OptionalUserProfileInfo.UnknownLocation,
  age: Int = OptionalUserProfileInfo.UnknownAge,
  webSite: String = OptionalUserProfileInfo.UnknownWebSite)
println( new OptionalUserProfileInfo )
println( new OptionalUserProfileInfo(age = 29) )
println( new OptionalUserProfileInfo(age = 29, location="Earth") )
OptionalUserProfileInfo represents all the “optional” user profile data in your next Web 2.0, social networking site. It defines default values for all its fields. The script creates instances with zero or more named parameters. The order of those parameters is arbitrary.
The examples we have shown use constant values as the defaults. Most languages with default argument values only allow constants or other values that can be determined at parse-time. However, in Scala, any expression can be used as the default, as long as it can compile where used. For example, an expression could not refer to an instance field that will be computed inside the class or object body, but it could invoke a method on a singleton object.
Finally, another constraint on named parameters is that once you provide a name for a parameter in a method invocation, then the rest of the parameters appearing after it must also be named. For example, new OptionalUserProfileInfo(age = 29, "Earth") would not compile because the second argument is not passed by name.
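As a hedged sketch of a non-constant default, the joiner below takes its default separator from a method on a singleton object. The Separators object and its standard method are hypothetical, invented for this illustration, and the code assumes Scala 2.8 or later.

```scala
// Illustrative sketch; Separators and DefaultExprDemo are hypothetical names.
// Requires Scala 2.8 or later for default and named arguments.
object Separators {
  def standard: String = ", "
}

object DefaultExprDemo {
  // The default value is an expression: a method call on a singleton object.
  def joiner(strings: List[String], separator: String = Separators.standard): String =
    strings.mkString(separator)

  def main(args: Array[String]): Unit = {
    println(joiner(List("Programming", "Scala")))
    // Named arguments may appear in any order.
    println(joiner(separator = "-", strings = List("Programming", "Scala")))
  }
}
```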
We’ll see another useful example of named and default arguments when we discuss case classes in the section called “Case Classes” in Chapter 6, Advanced Object-Oriented Programming In Scala.
Nesting Method Definitions
Method definitions can also be nested. Here is an implementation of a factorial calculator, where we use a conventional technique of calling a second, nested method to do the work.
// code-examples/TypeLessDoMore/factorial-script.scala
def factorial(i: Int): Int = {
  def fact(i: Int, accumulator: Int): Int = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator)
  }
  fact(i, 1)
}
println( factorial(0) )
println( factorial(1) )
println( factorial(2) )
println( factorial(3) )
println( factorial(4) )
println( factorial(5) )
The second method calls itself recursively, passing an accumulator parameter, where the result of the calculation is “accumulated”. Note that we return the accumulated value when the counter i reaches 1. (We’re ignoring invalid negative integers. The function actually returns 1 for i < 0.) After the definition of the nested method, factorial calls it with the passed-in value i and the initial accumulator value of 1.
Like a local variable declaration in many languages, a nested method is only visible inside the enclosing method. If you try to call fact outside of factorial, you will get a compiler error.
Did you notice that we use i as a parameter name twice, first in the factorial method and again in the nested fact method? As in many languages, the use of i as a parameter name for fact “shadows” the outer use of i as a parameter name for factorial. This is fine, because we don’t need the outer value of i inside fact. We only use it the first time we call fact, at the end of factorial.
What if we need to use a variable that is defined outside a nested function? Consider this contrived example.
// code-examples/TypeLessDoMore/count-to-script.scala
def countTo(n: Int): Unit = {
  def count(i: Int): Unit = {
    if (i <= n) {
      println(i)
      count(i + 1)
    }
  }
  count(1)
}
countTo(5)
Note that the nested count method uses the n value that is passed as a parameter to countTo. There is no need to pass n as an argument to count. Because count is nested inside countTo, n is visible to it.
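Here is one more sketch of a nested method that reads values from its enclosing method, this time returning a result instead of printing. The names NestedScopeDemo, countMultiples, and loop are hypothetical.

```scala
// Illustrative sketch; all names here are hypothetical.
object NestedScopeDemo {
  // Counts the multiples of step between 1 and limit, inclusive.
  def countMultiples(step: Int, limit: Int): Int = {
    // loop sees step and limit from the enclosing method, so only its own
    // changing state (i and found) is passed as arguments.
    def loop(i: Int, found: Int): Int =
      if (i > limit) found
      else if (i % step == 0) loop(i + 1, found + 1)
      else loop(i + 1, found)
    loop(1, 0)
  }

  def main(args: Array[String]): Unit =
    println(countMultiples(3, 10))
}
```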
The declaration of a field (member variable) can be prefixed with keywords indicating the visibility, just as in languages like Java and C#. Similarly the declaration of a non-nested method can be prefixed with the same keywords. We will discuss the visibility rules and keywords in the section called “Visibility Rules” in Chapter 5, Basic Object-Oriented Programming in Scala.
Inferring Type Information
Statically-typed languages can be very verbose. Consider this typical declaration in Java.
import java.util.Map;
import java.util.HashMap;
...
Map<Integer, String> intToStringMap = new HashMap<Integer, String>();
We have to specify the type parameters <Integer, String> twice. (Scala uses the term type annotations for explicit type declarations like HashMap<Integer, String>.)
Scala supports type inference (see, for example, [TypeInference] and [Pierce2002]). The language’s compiler can discern quite a bit of type information from the context, without explicit type annotations. Here’s the same declaration rewritten in Scala, with inferred type information.
import java.util.Map
import java.util.HashMap
...
val intToStringMap: Map[Integer, String] = new HashMap
Recall from Chapter 1 that Scala uses square brackets ([…]) for generic type parameters. We specify Map[Integer, String] on the left-hand side of the equals sign. (We are sticking with Java types for the example.) On the right-hand side, we instantiate the actual type we want, a HashMap, but we don’t have to repeat the type parameters.
For completeness, suppose we don’t actually care if the instance is of type Map (the Java interface type). It can be of type HashMap for all we care.
import java.util.Map
import java.util.HashMap
...
val intToStringMap2 = new HashMap[Integer, String]
This declaration requires no type annotations on the left-hand side because all of the type information needed is on the right-hand side. The compiler automatically makes intToStringMap2 a HashMap[Integer,String].
Type inference is used for methods, too. In most cases, the return type of the method can be inferred, so the ‘:’ and return type can be omitted. However, type annotations are required for all method parameters.
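A short sketch of these rules, using hypothetical names (InferenceDemo, add, words): the parameter types are written out, while the method's return type and the variable's type are left to the compiler.

```scala
// Illustrative sketch; all names here are hypothetical.
object InferenceDemo {
  // Parameter types are required; the return type (Int) is inferred.
  def add(a: Int, b: Int) = a + b

  // Inferred as List[String] from the right-hand side.
  val words = List("type", "less")

  def main(args: Array[String]): Unit = {
    // These annotated assignments compile only if the inferred types match.
    val sum: Int = add(1, 2)
    val copy: List[String] = words
    println(sum)
    println(copy)
  }
}
```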
Pure functional languages like Haskell (see, e.g., [O'Sullivan2009]) use type inference algorithms like Hindley-Milner (see [Spiewak2008] for an easily digested explanation). Code written in these languages requires type annotations less often than in Scala, because Scala’s type inference algorithm has to support object-oriented typing as well as functional typing. So, Scala requires more type annotations than languages like Haskell. Here is a summary of the rules for when explicit type annotations are required in Scala.
Note
The Any type is the root of the Scala type hierarchy (see the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System for more details). If a block of code returns a value of type Any unexpectedly, chances are good that the type inferencer couldn’t figure out what type to return, so it chose the most generic possible type.
Let’s look at examples where explicit declarations of method return types are required. In the following script, the upCase method has a conditional return statement for zero-length strings.
// code-examples/TypeLessDoMore/method-nested-return-script.scala
// ERROR: Won't compile until you put a String return type on upCase.
def upCase(s: String) = {
  if (s.length == 0)
    return s // ERROR - forces return type of upCase to be declared.
  else
    s.toUpperCase()
}
println( upCase("") )
println( upCase("Hello") )
Running this script gives you the following error.
... 6: error: method upCase has return statement; needs result type
return s
^
You can fix this error by changing the first line of the method to the following.
def upCase(s: String): String = {
Actually, for this particular script, an alternative fix is to remove the return keyword from the line. It is not needed for the code to work properly, but it illustrates our point.
Recursive methods also require an explicit return type. Recall our factorial method in the section called “Nesting Method Definitions”, previously in this chapter. Let’s remove the : Int return type on the nested fact method.
// code-examples/TypeLessDoMore/method-recursive-return-script.scala
// ERROR: Won't compile until you put an Int return type on "fact".
def factorial(i: Int) = {
  def fact(i: Int, accumulator: Int) = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator) // ERROR
  }
  fact(i, 1)
}
Now it fails to compile.
... 9: error: recursive method fact needs result type
fact(i - 1, i * accumulator)
^
Overloaded methods can sometimes require an explicit return type. When one such method calls another, we have to add a return type to the one doing the calling, as in this example.
// code-examples/TypeLessDoMore/method-overloaded-return-script.scala
// Version 1 of "StringUtil" (with a compilation error).
// ERROR: Won't compile: needs a String return type on the second "joiner".
object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)
  def joiner(strings: List[String]) = joiner(strings, " ") // ERROR
}
import StringUtil._ // Import the joiner methods.
println( joiner(List("Programming", "Scala")) )
The two joiner methods concatenate a List of strings together. The first method also takes an argument for the separator string. The second method calls the first with a “default” separator of a single space.
If you run this script, you get the following error.
... 9: error: overloaded method joiner needs result type
def joiner(strings: List[String]) = joiner(strings, " ")
Since the second joiner method calls the first, it requires an explicit String return type. It should look like this.
def joiner(strings: List[String]): String = joiner(strings, " ")
The final scenario can be subtle, when a more general return type is inferred than what you expected. You usually see this error when you assign a value returned from a function to a variable with a more specific type. For example, you were expecting a String, but the function inferred an Any for the returned object. Let’s see a contrived example that reflects a bug where this scenario can occur.
// code-examples/TypeLessDoMore/method-broad-inference-return-script.scala
// ERROR: Won't compile. Method actually returns List[Any], which is too "broad".
def makeList(strings: String*) = {
  if (strings.length == 0)
    List(0) // #1
  else
    strings.toList
}
val list: List[String] = makeList() // ERROR
Running this script returns the following error.
...11: error: type mismatch;
found : List[Any]
required: List[String]
val list: List[String] = makeList()
^
We intended for makeList to return a List[String], but when strings.length equals zero, we returned List(0), incorrectly “assuming” that this expression is the correct way to create an empty list. In fact, we returned a List[Int] with one element, 0. We should have returned List(). Since the else expression returns a List[String], the result of strings.toList, the inferred return type for the method is the closest common super type of List[Int] and List[String], which is List[Any]. Note that the compilation error doesn’t occur in the function definition. We only see it when we attempt to assign the value returned from makeList to a List[String] variable.
In this case, fixing the bug is the solution. Alternatively, when there isn’t a bug, it may be that the compiler just needs the “help” of an explicit return type declaration. Investigate the method that appears to return the unexpected type. In our experience, you often find that you modified that method (or another one in the call path) in such a way that the compiler now infers a more general return type than necessary. Add the explicit return type in this case.
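A minimal sketch of the repaired method, wrapped in a hypothetical MakeListDemo object: it returns List() for the empty case and declares the return type explicitly, so a mistake like List(0) would be reported inside the method rather than at the call site.

```scala
// Illustrative sketch; MakeListDemo is a hypothetical wrapper object.
object MakeListDemo {
  // The explicit return type makes the compiler check both branches
  // against List[String].
  def makeList(strings: String*): List[String] =
    if (strings.length == 0)
      List() // an empty List[String]
    else
      strings.toList

  def main(args: Array[String]): Unit = {
    val list: List[String] = makeList()
    println(list)
    println(makeList("Programming", "Scala"))
  }
}
```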
Another way to prevent these problems is to always declare return types for methods, especially when defining methods for a public API. Let’s revisit our StringUtil example and see why explicit declarations are a good idea (adapted from [Smith2009a]).
Here is our StringUtil “API” again with a new method, toCollection.
// code-examples/TypeLessDoMore/string-util-v3.scala
// Version 3 of "StringUtil" (for all versions of Scala).
object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)
  def joiner(strings: List[String]): String = strings.mkString(" ")
  def toCollection(string: String) = string.split(' ')
}
The toCollection method splits a string on spaces and returns an Array containing the substrings. The return type is inferred, which is a potential problem, as we will see. The method is somewhat contrived, but it will illustrate our point. Here is a client of StringUtil that uses this method.
// code-examples/TypeLessDoMore/string-util-client.scala
import StringUtil._
object StringUtilClient {
  def main(args: Array[String]) = {
    args foreach { s => toCollection(s).foreach { x => println(x) } }
  }
}
If you compile these files with scalac, you can run the client as follows.
$ scala -cp ... StringUtilClient "Programming Scala"
Programming
Scala
Note
For the -cp … class path argument, use the directory where scalac wrote the class files, which defaults to the current directory (i.e., use -cp .). If you used the build process in the downloaded code examples, the class files are written to the build directory (using scalac -d build ...). In this case, use -cp build.
Everything is fine at this point, but now imagine that the code base has grown. StringUtil and its clients are now built separately and bundled into different jars. Imagine also that the maintainers of StringUtil decide to return a List instead of the default.
object StringUtil {
  ...
  def toCollection(string: String) = string.split(' ').toList // changed!
}
The only difference is the final call to toList that converts the computed Array to a List. You recompile StringUtil and redeploy its jar. Then you run the same client, without recompiling it first.
$ scala -cp ... StringUtilClient "Programming Scala"
java.lang.NoSuchMethodError: StringUtil$.toCollection(...
at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
...
What happened? When the client was compiled, StringUtil.toCollection returned an Array. Then toCollection was changed to return List. In both versions, the method return value was inferred. Therefore, the client should have been recompiled, too.
However, had an explicit return type of Seq been declared, which is a parent for both Array and List, then the implementation change would not have forced a recompilation of the client.
Note
When developing APIs that are built separately from their clients, declare method return types explicitly and use the most general return type you can. This is especially important when APIs declare abstract methods (see, e.g., Chapter 4, Traits).
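One way to follow this advice is sketched below. The object name StringUtilSketch is hypothetical, and the body shown is just one possible implementation behind the declared Seq[String] return type.

```scala
// Illustrative sketch; StringUtilSketch is a hypothetical name.
object StringUtilSketch {
  // Clients are compiled against Seq[String], so the collection built
  // inside the method can change without changing the method signature.
  def toCollection(string: String): Seq[String] = string.split(' ').toList

  def main(args: Array[String]): Unit =
    toCollection("Programming Scala").foreach(println)
}
```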
There is another scenario to watch for when using declarations of collections like val map = Map(), as in this example.
val map = Map()
map.update("book", "Programming Scala")
... 3: error: type mismatch;
found : java.lang.String("book")
required: Nothing
map.update("book", "Programming Scala")
^
What happened? The type parameters of the generic type Map were inferred as [Nothing,Nothing] when the map was created. (We’ll discuss Nothing in the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System, but its name is suggestive!) We attempted to insert an incompatible key, value pair of types String and String. Call it a Map to nowhere! The solution is to parameterize the initial map declaration, e.g., val map = Map[String, String]() or to specify initial values so the map parameters are inferred, e.g., val map = Map("Programming" → "Scala").
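Both remedies are sketched here. As an assumption made for this sketch, the first map is a scala.collection.mutable.Map so that update modifies it in place on current Scala versions; the names MapInferenceDemo, books, and topics are hypothetical.

```scala
import scala.collection.mutable

// Illustrative sketch; all names here are hypothetical.
object MapInferenceDemo {
  // Explicit type parameters, so nothing is inferred as Nothing.
  val books = mutable.Map[String, String]()
  books.update("book", "Programming Scala")

  // Type parameters inferred from the initial entry: Map[String, String].
  val topics = Map("Programming" -> "Scala")

  def main(args: Array[String]): Unit = {
    println(books)
    println(topics)
  }
}
```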
Finally, there is a subtle behavior with inferred return types that can cause unexpected and baffling results [ScalaTips]. Consider the following example scala session.
scala> def double(i: Int) { 2 * i }
double: (Int)Unit
scala> println(double(2))
()
Why did the second command print () instead of 4? Look carefully at what the scala interpreter said the first command returned, double: (Int)Unit. We defined a method named double that takes an Int argument and returns Unit. The method doesn’t return an Int as we would expect.
The cause of this unexpected behavior is a missing equals sign in the method definition. Here is the definition we actually intended.
scala> def double(i: Int) = { 2 * i }
double: (Int)Int
scala> println(double(2))
4
Note the equals sign before the body of double. Now, the output says we have defined double to return an Int and the second command does what we expect it to do.
There is a reason for this behavior. Scala regards a method with the equals sign before the body as a function definition and a function always returns a value in functional programming. On the other hand, when Scala sees a method body without the leading equals sign, it assumes the programmer intended the method to be a “procedure” definition, intended for performing side effects only with the return value Unit. In practice, it is more likely that the programmer simply forgot to insert the equals sign!
Warning
When the return type of a method is inferred and you don’t use an equals sign before the opening curly brace for the method body, Scala infers a Unit return type, even when the last expression in the method is a value of another type.
By the way, where did that () come from that was printed before we fixed the bug? It is actually the real name of the singleton instance of the Unit type! (This name is a functional programming convention.)
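One defensive habit, sketched here with hypothetical names (ProcedureDemo, printDouble), is to write both the equals sign and the return type, so the intent is explicit whether the method returns a value or only performs a side effect.

```scala
// Illustrative sketch; all names here are hypothetical.
object ProcedureDemo {
  // Equals sign and explicit return type: the value of the body is returned.
  def double(i: Int): Int = { 2 * i }

  // Explicit Unit: clearly a procedure, called only for its side effect.
  def printDouble(i: Int): Unit = { println(2 * i) }

  def main(args: Array[String]): Unit = {
    println(double(2))
    printDouble(2)
  }
}
```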
Literals
Often, a new object is initialized with a literal value, such as val book = "Programming Scala". Let’s discuss the kinds of literal values supported by Scala. Here, we’ll limit ourselves to lexical syntax literals. We’ll cover literal syntax for functions (used as values, not member methods), tuples, and certain types like Lists and Maps, as we come to them.
Integer Literals
Integer literals can be expressed in decimal, hexadecimal, or octal. The details are summarized in Table 2.1, “Integer literals.”
Table 2.1. Integer literals.
| Kind | Format | Examples |
|---|---|---|
| Decimal | 0 or a nonzero digit followed by zero or more digits (0-9) | 0, 1, 321 |
| Hexadecimal | 0x followed by one or more hexadecimal digits (0-9, A-F, a-f) | 0xFF, 0x1a3b |
| Octal | 0 followed by one or more octal digits (0-7) | 013, 077 |
For Long literals, it is necessary to append the L or l character at the end of the literal. Otherwise, an Int is used.
The valid values for an integer literal are bounded by the type of the variable to which the value will be assigned.
Table 2.2, “Ranges of allowed values for integer literals (boundaries are inclusive).” defines the limits, which are inclusive.
Table 2.2. Ranges of allowed values for integer literals (boundaries are inclusive).
| Target Type | Minimum (inclusive) | Maximum (inclusive) |
|---|---|---|
| Long | −2^63 | 2^63 - 1 |
| Int | −2^31 | 2^31 - 1 |
| Short | −2^15 | 2^15 - 1 |
| Char | 0 | 2^16 - 1 |
| Byte | −2^7 | 2^7 - 1 |
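The following sketch exercises a few of these literal forms and limits. The object and value names are hypothetical, and octal literals are left out of the sketch on purpose.

```scala
// Illustrative sketch; all names here are hypothetical.
object IntegerLiteralDemo {
  val decimal = 321          // an Int
  val hex = 0xFF             // an Int, 255 in decimal
  val big = 12345678901L     // a Long: too large for an Int, so L is required
  val small: Byte = 127      // fits, because 127 is the largest Byte value
  // val tooBig: Byte = 128  // would not compile: type mismatch

  def main(args: Array[String]): Unit = {
    println(hex)
    println(big)
    println(Byte.MaxValue)
    println(Int.MaxValue)
  }
}
```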
A compile-time error occurs if an integer literal number is specified that is outside these ranges, as in the following examples.
scala> val i = 12345678901234567890
<console>:1: error: integer number too large
val i = 12345678901234567890
scala> val b: Byte = 128
<console>:4: error: type mismatch;
found : Int(128)
required: Byte
val b: Byte = 128
^
scala> val b: Byte = 127
b: Byte = 127
Floating Point Literals
Floating point literals are expressions with zero or more digits, followed by a period ., followed by zero or more digits. If there are no digits before the period, i.e., the number is less than 1.0, then there must be one or more digits after the period. For Float literals, append the F or f character at the end of the literal. Otherwise, a Double is assumed. You can optionally append a D or d for a Double.
Floating point literals can be expressed with or without exponentials. The format of the exponential part is e or E, followed by an optional + or -, followed by one or more digits.
Here are some example floating point literals.
0. .0 0.0 3. 3.14 .14 0.14 3e5 3E5 3.E5 3.e5 3.e+5 3.e-5 3.14e-5 3.14e-5f 3.14e-5F 3.14e-5d 3.14e-5D
Float consists of all IEEE 754 32-bit, single-precision binary floating point values. Double consists of all IEEE 754 64-bit, double-precision binary floating point values.
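A few of these forms are sketched below with hypothetical names. The sketch sticks to literals that have digits on both sides of the period or an exponent, which is a choice made here for portability across Scala versions.

```scala
// Illustrative sketch; all names here are hypothetical.
object FloatingPointLiteralDemo {
  val d1 = 3.14       // a Double by default
  val d2 = 3e5        // a Double with an exponent
  val f1 = 3.14e-5f   // a Float, because of the trailing f
  val d3 = 0.14D      // a Double, with the optional D suffix

  def main(args: Array[String]): Unit = {
    // These annotated assignments compile only if the literal types match.
    val asDouble: Double = d1
    val asFloat: Float = f1
    println(asDouble)
    println(asFloat)
    println(d2)
  }
}
```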
Warning
To avoid parsing ambiguities, you must have at least one space after a floating point literal, if it is followed by a token that starts with a letter. Also, the expression 1.toString returns the integer value 1 as a string, while 1. toString uses the operator notation to invoke toString on the floating point literal 1..
Boolean Literals
The boolean literals are true and false. The type of the variable to which they are assigned will be inferred to be Boolean.
scala> val b1 = true
b1: Boolean = true
scala> val b2 = false
b2: Boolean = false
Character Literals
A character literal is either a printable Unicode character or an escape sequence, written between single quotes. A character with Unicode value between 0 and 255 may also be represented by an octal escape, a backslash \ followed by a sequence of up to three octal characters.
It is a compile time error if a backslash character in a character or string literal does not start a valid escape sequence.
Here are some examples.
'A'
'\u0041' // 'A' in Unicode
'\n'
'\012' // '\n' in octal
'\t'
The valid escape sequences are shown in Table 2.3, “Character escape sequences.”
Table 2.3. Character escape sequences.
| Sequence | Unicode | Meaning |
|---|---|---|
| \b | \u0008 | backspace BS |
| \t | \u0009 | horizontal tab HT |
| \n | \u000a | linefeed LF |
| \f | \u000c | form feed FF |
| \r | \u000d | carriage return CR |
| \" | \u0022 | double quote |
| \' | \u0027 | single quote |
| \\ | \u005c | backslash \ |
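As a quick check of a few entries, the sketch below compares escaped character literals with their numeric values. The names are hypothetical, and octal escapes are omitted from the sketch.

```scala
// Illustrative sketch; all names here are hypothetical.
object CharLiteralDemo {
  val letter = 'A'
  val sameLetter = '\u0041' // the Unicode escape for the same letter
  val newline = '\n'        // linefeed
  val tab = '\t'            // horizontal tab
  val quote = '\''          // an escaped single quote

  def main(args: Array[String]): Unit = {
    println(letter == sameLetter)
    println(newline.toInt)
    println(tab.toInt)
  }
}
```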
String Literals
A string literal is a sequence of characters enclosed in double quotes or triples of double quotes, i.e., """…""".
For string literals in double quotes, the allowed characters are the same as the character literals. However, if a double quote " character appears in the string, it must be “escaped” with a \ character. Here are some examples.
"Programming\nScala"
"He exclaimed, \"Scala is great!\""
"First\tSecond"
The string literals bounded by triples of double quotes are also called multi-line string literals. These strings can cover several lines; the line feeds will be part of the string. They can include any characters, including one or two double quotes together, but not three together. They are useful for strings with \ characters that don’t form valid Unicode or escape sequences, like the valid sequences listed in Table 2.3, “Character escape sequences.” Regular expressions, which we’ll discuss in Chapter 3, Rounding Out the Essentials, are a typical example. However, if escape sequences appear, they aren’t interpreted.
Here are three example strings.
"""Programming\nScala""" """He exclaimed, "Scala is great!" """ """First line\n Second line\t Fourth line"""
Note that we had to add a space before the trailing """ in the second example to prevent a parse error. Trying to escape the second " that ends the "Scala is great!" quote, i.e., "Scala is great!\", doesn’t work.
Copy and paste these strings into the scala interpreter. Do the same for the previous string examples. How are they interpreted differently?
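If you want to check the difference programmatically, here is a small sketch, with val names chosen only for illustration. It compares the lengths of an ordinary and a triple-quoted literal with the same source text, and shows why the triple-quoted form is convenient for regular expression patterns.

```scala
val escaped = "Programming\nScala"       // \n becomes a single line-feed character
val raw     = """Programming\nScala"""   // the backslash and the n stay as two characters

println(escaped.length)                  // 17
println(raw.length)                      // 18
println(escaped.indexOf('\n') >= 0)      // true: contains a real line feed
println(raw.indexOf('\n') >= 0)          // false: no line feed, only \ and n

// The same regular expression pattern, written both ways.
val digits1 = "\\d+"
val digits2 = """\d+"""
println(digits1 == digits2)              // true
```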
Symbol Literals
Scala supports symbols, which are interned strings, meaning that two symbols with the same “name”, i.e., the same character sequence, will actually refer to the same object in memory. Symbols are used less often in Scala than in some other languages, like Ruby, Smalltalk, and Lisp. They are useful as map keys instead of strings.
A symbol literal is a single quote ', followed by a letter, followed by zero or more digits and letters. Note that an expression like '1 is invalid, because the compiler thinks it is an incomplete character literal.
A symbol literal 'id is a shorthand for the expression scala.Symbol("id").
Note
If you want to create a symbol that contains whitespace, use e.g., scala.Symbol(" Programming Scala "). All the whitespace is preserved.
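The following sketch illustrates interning and the use of symbols as map keys. It uses the Symbol("...") factory form throughout, partly because the single-quote literal has been deprecated in newer Scala releases; the val names and the settings map are made up for the example.

```scala
val s1 = Symbol("id")
val s2 = Symbol("id")
println(s1 eq s2)                  // true: both names refer to the same object
println(s1.name)                   // id

// Whitespace passed to the factory is preserved.
val spaced = Symbol(" Programming Scala ")
println("[" + spaced.name + "]")   // [ Programming Scala ]

// Symbols used as map keys instead of strings.
val settings = Map(Symbol("host") -> "localhost", Symbol("port") -> "8080")
println(settings(Symbol("port")))  // 8080
```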
Tuples
How many times have you wanted to return two or more values from a method? In many languages, like Java, you only have a few options, none of which is very appealing. You could pass in parameters to the method that will be modified for all or some of the “return” values, which is ugly. Or, you could declare some small “structural” class that holds the two or more values, then return an instance of that class.
Scala supports tuples, groupings of two or more items, usually created with the literal syntax of a comma-separated list of the items inside parentheses, e.g., (x1, x2, …). The types of the xi elements are unrelated to each other; you can mix and match types. These literal “groupings” are instantiated as scala.TupleN instances, where N is the number of items in the tuple. The Scala API defines separate TupleN classes for N between 1 and 22, inclusive. Tuple instances are immutable, first-class values, so you can assign them to variables, pass them as values, and return them from methods.
The following example demonstrates the use of tuples.
// code-examples/TypeLessDoMore/tuple-example-script.scala
def tupleator(x1: Any, x2: Any, x3: Any) = (x1, x2, x3)
val t = tupleator("Hello", 1, 2.3)
println( "Print the whole tuple: " + t )
println( "Print the first item: " + t._1 )
println( "Print the second item: " + t._2 )
println( "Print the third item: " + t._3 )
val (t1, t2, t3) = tupleator("World", '!', 0x22)
println( t1 + " " + t2 + " " + t3 )
Running this script with scala produces the following output.
Print the whole tuple: (Hello,1,2.3)
Print the first item: Hello
Print the second item: 1
Print the third item: 2.3
World ! 34
The tupleator method simply returns a “3-tuple” with the input arguments. The first statement that uses this method assigns the returned tuple to a single variable t. The next four statements print t in various ways. The first print statement calls Tuple3.toString, which wraps parentheses around the item list. The following three statements print each item in t separately. The expression t._N retrieves the Nth item, starting at 1, not 0 (this choice follows functional programming conventions).
The last two lines show that we can use a tuple expression on the left-hand side of the assignment. We declare three vals, t1, t2, and t3, to hold the individual items in the tuple. In essence, the tuple items are extracted automatically.
Notice how we mixed types in the tuples. You can see the types more clearly if you use the interactive mode of the scala command, which we introduced in Chapter 1, Zero to Sixty: Introducing Scala.
Invoke the scala command with no script argument. At the scala> prompt, enter val t = ("Hello",1,2.3) and see that you get the following result, which shows you the types of each element in the tuple.
scala> val t = ("Hello",1,2.3)
t: (java.lang.String, Int, Double) = (Hello,1,2.3)
It’s worth noting that there’s more than one way to define a tuple. We’ve been using the more common parenthesized syntax, but you can also use the arrow operator between two values, as well as special factory methods on the tuple-related classes.
scala> 1 -> 2
res0: (Int, Int) = (1,2)
scala> Tuple2(1, 2)
res1: (Int, Int) = (1,2)
scala> Pair(1, 2)
res2: (Int, Int) = (1,2)
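To tie this back to the original motivation, returning several values from one method, here is a minimal sketch. The minMax helper is hypothetical and exists only for this illustration.

```scala
// Hypothetical helper: returns the smallest and largest element as a 2-tuple.
def minMax(xs: Seq[Int]): (Int, Int) = (xs.min, xs.max)

// Extract both results in a single declaration.
val (lo, hi) = minMax(Seq(3, 9, 4))
println("range: " + lo + " to " + hi)   // range: 3 to 9

// The arrow operator builds the same kind of value as the parenthesized literal.
val pair = "one" -> 1
println(pair == Tuple2("one", 1))       // true
println(pair._1)                        // one
```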
Option, Some, and None: Avoiding nulls
We’ll discuss the standard type hierarchy for Scala in the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System. However, three useful classes to understand now are the Option class and its two subclasses, Some and None.
Most languages have a special keyword or object that’s assigned to reference variables when there’s nothing else for them to refer to. In Java, this is null; in Ruby, it’s nil. In Java, null is a keyword, not an object, and thus it’s illegal to call any methods on it. But this is a confusing choice on the language designer’s part. Why return a keyword when the programmer expects an object?
To be more consistent with the goal of making everything an object, as well as to conform with functional programming conventions, Scala encourages you to use the Option type for variables and function return values when they may or may not refer to a value. When there is no value, use None, an object that is a subclass of Option. When there is a value, use Some, which wraps the value. Some is also a subclass of Option.
Note
None is declared as an object, not a class, because we really only need one instance of it. In that sense, it’s like the null keyword, but it is a real object with methods.
You can see Option, Some, and None in action in the following example, where we create a map of state capitals in the United States.
// code-examples/TypeLessDoMore/state-capitals-subset-script.scala
val stateCapitals = Map(
"Alabama" -> "Montgomery",
"Alaska" -> "Juneau",
// ...
"Wyoming" -> "Cheyenne")
println( "Get the capitals wrapped in Options:" )
println( "Alabama: " + stateCapitals.get("Alabama") )
println( "Wyoming: " + stateCapitals.get("Wyoming") )
println( "Unknown: " + stateCapitals.get("Unknown") )
println( "Get the capitals themselves out of the Options:" )
println( "Alabama: " + stateCapitals.get("Alabama").get )
println( "Wyoming: " + stateCapitals.get("Wyoming").getOrElse("Oops!") )
println( "Unknown: " + stateCapitals.get("Unknown").getOrElse("Oops2!") )
The convenient -> syntax for defining name-value pairs to initialize a Map will be discussed in the section called “The Predef Object” in Chapter 7, The Scala Object System. For now, we want to focus on the two groups of println statements, where we show what happens when you retrieve the values from the map. If you run this script with the scala command, you’ll get the following output.
Get the capitals wrapped in Options:
Alabama: Some(Montgomery)
Wyoming: Some(Cheyenne)
Unknown: None
Get the capitals themselves out of the Options:
Alabama: Montgomery
Wyoming: Cheyenne
Unknown: Oops2!
The first group of println statements invokes toString implicitly on the instances returned by get. We are calling toString on Some or None instances, because the values returned by Map.get are automatically wrapped in a Some when there is a value in the map for the specified key. Note that the Scala library doesn’t store the Some in the map; it wraps the value in a Some upon retrieval. Conversely, when we ask for a map entry that doesn’t exist, the None object is returned, rather than null. This occurred in the last println of the three.
The second group of println statements goes a step further. After calling Map.get, they call get or getOrElse on each Option instance to retrieve the value it contains. Option.get requires that the Option is not empty, that is, the Option instance must actually be a Some. In this case, get returns the value wrapped by the Some, as demonstrated in the println where we print the capital of Alabama. However, if the Option is actually None, then None.get throws a NoSuchElementException.
We also show the alternative method, getOrElse, in the last two println statements. This method returns either the value in the Option, if it is a Some instance, or the argument we passed to getOrElse, if it is a None instance. In other words, the argument to getOrElse functions as the default return value.
So, getOrElse is the more defensive of the two methods. It avoids a potential thrown exception. We’ll discuss the merits of alternatives like get vs. getOrElse in the section called “Exceptions and the Alternatives” in Chapter 13, Application Design.
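The difference between the two methods is easy to see in a small sketch. The map and val names below are chosen only for this illustration; the exception is caught just to make the failure visible.

```scala
val capitals = Map("Alabama" -> "Montgomery")

// Looking up a missing key yields None rather than null.
val missing: Option[String] = capitals.get("Unknown")
println(missing)                       // None
println(missing.getOrElse("Oops!"))    // Oops!

// Calling get on None throws, so we catch the exception here to show it.
val outcome =
  try {
    missing.get
  } catch {
    case e: NoSuchElementException => "caught NoSuchElementException"
  }
println(outcome)                       // caught NoSuchElementException
```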
Note that because the Map.get method returns an Option, it automatically documents the fact that there may not be an item matching the specified key. The map handles this situation by returning a None. Most languages would return null (or the equivalent) when there is no “real” value to return. You learn from experience to expect a possible null. Using Option makes the behavior more explicit in the method signature, so it’s more self-documenting.
Also, thanks to Scala’s static typing, you can’t make the mistake of attempting to call a method on a value that might actually be null. While this mistake is easy to make in Java, it won’t compile in Scala because you must first extract the value from the Option. So, the use of Option strongly encourages more resilient programming.
Because Scala runs on the JVM and .NET and because it must interoperate with other libraries, Scala has to support null. Still, you should avoid using null in your code. Tony Hoare, who invented the null reference in 1965 while working on an object-oriented language called ALGOL W, called its invention his “billion dollar mistake” [Hoare2009]. Don’t contribute to that figure.
So, how would you write a method that returns an Option? Here is a possible implementation of get that could be used by a concrete subclass of Map (Map.get itself is abstract). For a more sophisticated version, see the implementation of get in scala.collection.immutable.HashMap in the Scala library source code distribution.
def get(key: A): Option[B] = {
  if (contains(key))
    new Some(getValue(key))
  else
    None
}
The contains method is also defined for Map. It returns true if the map contains a value for the specified key. The getValue method is intended to be an internal method that retrieves the value from the underlying storage, whatever it is.
Note how the value returned by getValue is wrapped in a Some[B], where the type B is inferred. However, if the call to contains(key) returns false, then the object None is returned.
You can use this same idiom when your methods return an Option. We’ll explore other uses for Option in subsequent sections. Its pervasive use in Scala code makes it an important concept to grasp.
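As a sketch of the idiom applied to your own code, consider a lookup method that may or may not find a result. The ports table and the portFor method are hypothetical names invented for this example.

```scala
// Hypothetical lookup table, used only for this sketch.
val ports = Map("http" -> 80, "https" -> 443)

// Return Some(value) when there is a result and None when there is not.
def portFor(scheme: String): Option[Int] =
  if (ports.contains(scheme))
    Some(ports(scheme))
  else
    None

println(portFor("https"))                 // Some(443)
println(portFor("gopher"))                // None
println(portFor("gopher").getOrElse(-1))  // -1
```

The declared return type Option[Int] tells callers that the result may be absent, so they have to decide what to do in that case.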
Organizing Code in Files and Namespaces
Scala adopts the package concept that Java uses for namespaces, but Scala offers a more flexible syntax. Just as file names don’t have to match the type names, the package structure does not have to match the directory structure. So, you can define packages in files independent of their “physical” location.
The following example defines a class MyClass in a package com.example.mypkg using the conventional Java syntax.
// code-examples/TypeLessDoMore/package-example1.scala
package com.example.mypkg
class MyClass {
  // ...
}
The next example is contrived. It defines packages using the nested package syntax in Scala, which is similar to the namespace syntax in C# and the use of modules as namespaces in Ruby.
// code-examples/TypeLessDoMore/package-example2.scala
package com {
  package example {
    package pkg1 {
      class Class11 {
        def m = "m11"
      }
      class Class12 {
        def m = "m12"
      }
    }
    package pkg2 {
      class Class21 {
        def m = "m21"
        def makeClass11 = {
          new pkg1.Class11
        }
        def makeClass12 = {
          new pkg1.Class12
        }
      }
    }
    package pkg3.pkg31.pkg311 {
      class Class311 {
        def m = "m21"
      }
    }
  }
}
Two packages pkg1 and pkg2 are defined under the com.example package. A total of three classes are defined between the two packages. The makeClass11 and makeClass12 methods in Class21 illustrate how to reference a type in the “sibling” package, pkg1. You can also reference these classes by their full paths, com.example.pkg1.Class11 and com.example.pkg1.Class12, respectively.
The package pkg3.pkg31.pkg311 shows that you can “chain” several packages together in one clause. It is not necessary to use a separate package clause for each package.
Following the conventions of Java, the root package for Scala’s library classes is named scala.
Warning
Scala does not allow package declarations in scripts that are executed directly with the scala interpreter. The reason has to do with the way the interpreter converts statements in scripts to valid Scala code before compiling to byte code. See the section called “The scala Command Line Tool” in Chapter 14, Scala Tools, Libraries and IDE Support for more details.
Importing Types and Their Members
To use declarations in packages, you have to import them, just as you do in Java and, in similar ways, in other languages. However, compared to Java, Scala greatly expands your options. The following example illustrates several ways to import Java types.
// code-examples/TypeLessDoMore/import-example1.scala
import java.awt._
import java.io.File
import java.io.File._
import java.util.{Map, HashMap}
You can import all types in a package, using the underscore _ as a wild card, as shown on the first line. You can also import individual Scala or Java types, as shown on the second line.
Java uses the “star” character * as the wild card for matching all types in a package or all static members of a type when doing “static imports”. In Scala, this character is allowed in method names, so _ is used as a wild card, as we saw previously.
As shown on the third line, you can import all the static methods and fields in Java types. If java.io.File were actually a Scala object, as discussed previously, then this line would import the fields and methods from the object.
Finally, you can selectively import just the types you care about. On the fourth line, we import just the java.util.Map and java.util.HashMap types from the java.util package. Compare this one-line import statement with the two-line import statements we used in our first example in the section called “Inferring Type Information”. They are functionally equivalent.
The next example shows more advanced options for import statements.
// code-examples/TypeLessDoMore/import-example2-script.scala
def writeAboutBigInteger() = {
  import java.math.BigInteger.{
    ONE => _,
    TEN,
    ZERO => JAVAZERO }
  // ONE is effectively undefined
  // println( "ONE: "+ONE )
  println( "TEN: "+TEN )
  println( "ZERO: "+JAVAZERO )
}
writeAboutBigInteger()
This example demonstrates two features. First, we can put import statements almost anywhere we want, not just at the top of the file, as required by Java. This feature allows us to scope the imports more narrowly. For example, we can’t reference the imported BigInteger definitions outside the scope of the method. Another advantage of this feature is that it puts an import statement closer to where the imported items are actually used.
The second feature shown is the ability to rename imported items. First, the java.math.BigInteger.ONE constant is renamed to the underscore wild card. This effectively makes it invisible and unavailable to the importing scope. This is a useful technique when you want to import everything except a few particular items.
Next, the java.math.BigInteger.TEN constant is imported without renaming, so it can be referenced simply as TEN.
Finally, the java.math.BigInteger.ZERO constant is given the alias JAVAZERO.
Aliasing is useful if you want to give the item a more convenient name or you want to avoid ambiguities with other items in scope that have the same name.
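A common case for aliasing is a Java type whose name collides with a Scala type. The sketch below is one way to handle it; the alias JHashMap is a name we picked for illustration, not a library convention you must follow.

```scala
import java.util.{HashMap => JHashMap}     // alias avoids a clash with Scala's HashMap
import scala.collection.mutable.HashMap

// The Java class is reachable only through its alias.
val javaMap = new JHashMap[String, Int]()
javaMap.put("one", 1)

// The unqualified name HashMap now refers unambiguously to the Scala class.
val scalaMap = HashMap("one" -> 1)

println(javaMap.get("one"))    // 1
println(scalaMap("one"))       // 1
```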
Imports are Relative
There’s one other important thing to know about imports: they are relative. Note the comments for the following imports:
// code-examples/TypeLessDoMore/relative-imports.scala
import scala.collection.mutable._
import collection.immutable._ // Since "scala" is already imported
import _root_.scala.collection.jcl._ // full path from real "root"
package scala.actors {
  import remote._ // We're in the scope of "scala.actors"
}
Note that the last import statement, nested in the scala.actors package scope, is relative to that scope.
The [ScalaWiki] has other examples at http://scala.sygneca.com/faqs/language#how-do-i-import.
It’s fairly rare that you’ll have problems with relative imports, but they sometimes cause surprises, especially if you are accustomed to languages like Java, where imports are absolute. If you get a mystifying compiler error that a package wasn’t found, check that the statement is properly relative to the last import statement, or add the _root_. prefix. Also, you might see an IDE or other tool insert an import _root_… statement in your code. Now you know what it means.
Warning
Remember that import statements are relative, not absolute. To create an absolute path, start with _root_.
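Here is a small sketch of the two styles side by side, using only standard library packages. The val names are ours; the point is that the second import is resolved against the name brought in by the first, while the third starts from the real root.

```scala
import scala.collection.mutable                    // brings the name `mutable` into scope
import mutable.ListBuffer                          // relative: resolved against `mutable` above
import _root_.scala.collection.immutable.ListMap   // absolute: starts from the real root

val buffer = ListBuffer(1, 2)
buffer += 3
val lookup = ListMap("a" -> 1, "b" -> 2)

println(buffer.toList)   // List(1, 2, 3)
println(lookup("b"))     // 2
```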
Abstract Types And Parameterized Types
We mentioned in the section called “A Taste of Scala” in Chapter 1, Zero to Sixty: Introducing Scala that Scala supports parameterized types, which are very similar to generics in Java. (We could use the two terms interchangeably, but it’s more common to use “parameterized types” in the Scala community and “generics” in the Java community.) The most obvious difference is in the syntax, where Scala uses square brackets ([…]), while Java uses angle brackets (<…>).
For example, a list of strings would be declared as follows.
val languages: List[String] = ...
There are other important differences with Java’s generics, which we’ll explore in the section called “Understanding Parameterized Types” in Chapter 12, The Scala Type System.
For now, we’ll mention one other useful detail that you’ll encounter before we can explain it in depth in Chapter 12, The Scala Type System. If you look at the declaration of scala.List in the Scaladocs, you’ll see that the declaration is written as … class List[+A]. The ‘+’ in front of the A means that List[B] is a subtype of List[A] for any B that is a subtype of A. If there is a ‘-’ in front of a type parameter, then the relationship goes the other way: if the declaration is Foo[-A], then Foo[B] is a supertype of Foo[A] for any B that is a subtype of A.
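A brief sketch may make the two annotations more concrete. List is the real library type; the Printer class is hypothetical and is defined here only to illustrate a ‘-’ annotation.

```scala
// '+': List is declared as List[+A], so a List[String] can be used as a List[Any].
val strings: List[String] = List("Scala", "Java")
val anys: List[Any] = strings

// '-': a hypothetical Printer[-A], defined only for this sketch.
class Printer[-A] {
  def show(a: A): String = "<" + a + ">"
}

// A Printer[Any] can be used where a Printer[String] is expected.
val anyPrinter: Printer[Any] = new Printer[Any]
val stringPrinter: Printer[String] = anyPrinter

println(anys.length)                 // 2
println(stringPrinter.show("hi"))    // <hi>
```

Without the ‘-’ in the declaration of Printer, the assignment to stringPrinter would not compile.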
Scala supports another type abstraction mechanism called abstract types, used in many functional programming languages, such as Haskell. Abstract types were also considered for inclusion in Java when generics were adopted. We want to introduce them now, because you’ll see many examples of them before we dive into their details in Chapter 12, The Scala Type System. For a very detailed comparison of these two mechanisms, see [Bruce1998].
Abstract types can be applied to many of the same design problems for which parameterized types are used. However, while the two mechanisms overlap, they are not redundant. Each has strengths and weaknesses for certain design problems.
Here is an example that uses an abstract type.
// code-examples/TypeLessDoMore/abstract-types-script.scala
import java.io._
abstract class BulkReader {
  type In
  val source: In
  def read: String
}
class StringBulkReader(val source: String) extends BulkReader {
  type In = String
  def read = source
}
class FileBulkReader(val source: File) extends BulkReader {
  type In = File
  def read = {
    val in = new BufferedInputStream(new FileInputStream(source))
    val numBytes = in.available()
    val bytes = new Array[Byte](numBytes)
    in.read(bytes, 0, numBytes)
    new String(bytes)
  }
}
println( new StringBulkReader("Hello Scala!").read )
println( new FileBulkReader(new File("abstract-types-script.scala")).read )
Running this script with scala produces the following output.
Hello Scala!
import java.io._
abstract class BulkReader {
...
The BulkReader abstract class declares three abstract members: a type named In, a val field source, and a read method. As in Java, instances in Scala can only be created from concrete classes, which must have definitions for all members.
The derived classes, StringBulkReader and FileBulkReader, provide concrete definitions for these abstract members. We’ll cover the details of class declarations in Chapter 5, Basic Object-Oriented Programming in Scala and the particulars of overriding member declarations in the section called “Overriding Members of Classes and Traits” in Chapter 6, Advanced Object-Oriented Programming In Scala.
For now, note that the type field works very much like a type parameter in a parameterized type. In fact, we could rewrite this example as follows, where we show only what would be different.
abstract class BulkReader[In] {
  val source: In
  ...
}
class StringBulkReader(val source: String) extends BulkReader[String] {...}
class FileBulkReader(val source: File) extends BulkReader[File] {...}
Just as for parameterized types, if we define the In type to be String, then the source field must also be defined as a String. Note that the StringBulkReader's read method simply returns the source field, while the FileBulkReader's read method reads the contents of the file.
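For reference, here is a runnable sketch of the parameterized variant, assembled from the two listings above. It includes only the string-based reader, so that it does not depend on any file being present; the reader val is our own addition.

```scala
abstract class BulkReader[In] {
  val source: In
  def read: String
}

// The type argument String plays the role that "type In = String" played before.
class StringBulkReader(val source: String) extends BulkReader[String] {
  def read = source
}

val reader: BulkReader[String] = new StringBulkReader("Hello Scala!")
println(reader.read)   // Hello Scala!
```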
As demonstrated by [Bruce1998], parameterized types tend to be best for collections, which is how they are most often used in Java code, while abstract types are most useful for type “families” and other type scenarios.
We’ll explore the details of Scala’s abstract types in Chapter 12, The Scala Type System. For example, we’ll see how to constrain the possible concrete types that can be used.
Reserved Words
Table 2.4, “Reserved Words.” lists the reserved words in Scala, which we sometimes call “keywords”, and briefly describes how they are used [ScalaSpec2009].
Table 2.4. Reserved Words.
Notice that break and continue are not listed. These control keywords don’t exist in Scala. Instead, Scala encourages you to use functional programming idioms that are usually more succinct and less error prone. We’ll discuss alternative approaches when we discuss for loops (see the section called “Generator Expressions” in Chapter 3, Rounding Out the Essentials).
Some Java methods use names that are reserved by Scala, e.g., java.util.Scanner.match. To avoid a compilation error, surround the name with single back quotes, e.g., java.util.Scanner.`match`.
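The same technique applies to any reserved word. The sketch below calls the static Java method Thread.yield, whose name collides with Scala’s yield keyword, and then defines a method named match on a hypothetical Legacy object that exists only for this illustration.

```scala
// yield is reserved in Scala, so the Java method Thread.yield() must be quoted.
Thread.`yield`()

// Back quotes also work when defining and calling your own members.
// Legacy is a hypothetical object, used only for this sketch.
object Legacy {
  def `match`(s: String): Boolean = s.length > 0
}

val matched = Legacy.`match`("scala")
println(matched)   // true
```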
Recap and What’s Next
We covered several ways that Scala’s syntax is concise, flexible, and productive. We also described many Scala features. In the next chapter, we will round out some Scala essentials before we dive into Scala’s support for object-oriented programming and functional programming.




