Java Rules Vol.2 (2003)
Doug Dunn
Java is a registered trademark of Sun Microsystems, Inc. Windows 95, Windows NT, Win-
dows 2000 and Windows XP are trademarks of Microsoft Corporation. All other product or
company names mentioned herein are the property of their respective owners. If the pub-
lisher was aware of a trademark claim, the product or company name is capitalized.
The publisher has taken care in the preparation of this book, but makes no expressed or
implied warranty of any kind and assumes no responsibility for errors or omissions. No lia-
bility is assumed for incidental or consequential damages in connection with or arising out
of the use of the information or programs contained herein.
The publisher is accepting back orders for printed editions of this book. For more
information, please write to [email protected].
– Doug
Table of Contents
Be advised that this table of contents (TOC) only includes the first two section
levels (referred to as heads in the publishing industry). For a complete TOC, see
the chapter-level TOC at the beginning of each chapter. Those include section
numbers with four components, such as 1.5.2.1 The Problem of Changeable
Inlined Constants in the first chapter.
1.6 Methods 131
1.6.1 abstract Methods 134
1.6.2 Result Types and the return Statement 139
1.6.3 Formal Parameter Lists 144
1.6.4 The throws Clause 165
1.7 Local Variables 174
1.8 “Write Once, Compile Anywhere” 179
1.8.1 Definite Assignment 181
1.8.2 Unreachable Statements 184
1.9 Qualifying Type versus Compile-Time Declaration 188
1.10 The Five General Forms 193
1.10.1 The Meaning of a Simple Field or Method Name 198
1.10.2 Method Invocation Chaining 204
1.10.3 Casting a Target Reference 206
1.10.4 Accessing static Members using a Primary
Expression 208
1.11 Method Signatures 209
1.11.1 The Compiler-Enforced Method Contract 215
1.11.2 Overloaded Methods 225
1.11.3 Overriding and Dynamic Method Lookup 233
1.12 Method Forwarding 246
4.5.6 Bitwise Operators &, |, ^, >>, >>>, and << 454
4.5.7 Ternary Conditional Operator ?: 465
4.5.8 The Simple and Compound Assignment Operators 470
4.6 The instanceof Type Comparison Operator 473
4.7 A Bitwise Primer 475
4.7.1 Bits 477
4.7.2 Converting Nybbles to Hexadecimal Digits 492
4.7.3 General-Purpose Integer to String Conversion Methods 498
4.7.4 Unsigned Bytes 501
4.7.5 Some Miscellaneous Uses of the Bitwise Operators 523
4.8 Statements 526
4.8.1 Control-flow Statements 529
4.8.2 Labeled Statements 554
4.8.3 Control-transfer Statements (a.k.a. Abrupt Completion) 560
4.9 Blocks 566
Assertions, Exceptions, and Logging 659
6.1 Introduction 660
6.2 Assertions 661
6.2.1 Preconditions, Postconditions, and Invariants 671
6.2.2 assert false and Logic Traps (or Control-flow
Invariants) 679
6.2.3 Catching CloneNotSupportedException 682
6.3 An Execution Stack Primer 685
6.4 The Exception Mechanism 690
6.5 The Throwable Class Hierarchy 698
6.5.1 General Exception Classes 723
6.5.2 Unspecified Runtime Exceptions and Errors 736
6.5.3 Asynchronous Exceptions 759
6.6 Throwable Objects 762
6.6.1 The Loggable Interface 767
6.7 The throw Statement 768
6.8 The try Statement 785
6.8.1 The catch Clause 791
6.8.2 The finally Clause 817
6.9 Exception Handling 859
6.9.1 Rethrowing the Same Exception 864
6.9.2 Exception Translation and Chaining 867
6.9.3 Squelching an Exception 874
6.9.4 Setting an Error Flag 878
6.9.5 Retry 879
6.9.6 Try an Alternative 881
6.10 Uncaught Exceptions 883
6.10.1 Top-level Exception Handlers 884
6.10.2 A Stack Trace Primer 922
6.11 Logging 936
6.11.1 The Logger Namespace 952
6.11.2 Logging Configuration 957
6.11.3 Logging Methods 975
6.11.4 The Cleaner Thread (A Shutdown Hook in
LogManager) 984
Preface for Volume 2
This book has been under continuous development for many years now. I began
writing it shortly after the JDK 1.0.2 release. From the beginning of this project,
I felt the need for something new and different to bridge the chasm between the
tutorials (as well written as many of them are) and the “this is all the neat stuff I
know” books written by the best and brightest. Both genres are indispensable. In
between are reference works that claim to be both comprehensive and detailed.
That is the genre to which Mastering the Fundamentals belongs, but it is dif-
ferent from all of the others in that it focuses entirely on language fundamentals.
For example, there are no other books within the reference work genre of which
I am aware that completely ignore (perhaps to my peril) graphics and network
programming. These are specialized subjects I hope to cover in future volumes,
but not until there is a proven market for my style of technical writing. That style
can be summarized as “an uncompromising level of detail in a technical writing
style that is easy to read.”
Why is an “uncompromising level of detail” so important? There are a lot
more pages on language fundamentals in this book than in any other Java book
ever written (or that ever will be written I am sure), including even the specifica-
tions. Mastering the Fundamentals deliberately sacrifices brevity in order to
focus on details that are not found in most books on the subject. This is a very
different approach to technical writing from that found in almost every other
computer language book on the market, one that assumes readers are willing to read
more so long as the material is relevant (no “fluff” which Merriam-Webster
defines as “something inconsequential”1) and easy to read. One reader said he
liked my work because it didn’t have any “speed bumps,” an analogy that I
greatly appreciated. A “speed bump” is a passage of technical writing that
requires you to stop reading and ponder its meaning or significance. I have expe-
rienced this in some very popular works. One paragraph in a best seller always
comes to mind in this context because I had to think about what it meant for the
better part of a half hour, and was still not sure I understood what the author was
saying. I argue that so long as the material is of some practical importance
and easy to read, length must be allowed to suffer. At what price brevity? To
carefully explain something in detail requires more words. It is that sim-
ple. Nevertheless most programmers do not like to read a lot. This is precisely
gramming language. Mastering the Java programming language will change not
only how you approach the work of computer programming, but the industry as
a whole. The obvious place to begin to excel is exactly the material in this book.
Thanks to Dr. Gosling and the other language designers at Sun, Java is a pro-
gramming language that can in fact be mastered in a couple years.
Nothing would bring me greater satisfaction than to learn that this work has
earned a place on the bookshelf of a significant number of teachers and univer-
sity professors. I have diligently sought to find the most natural organization of
the material, one that I could feel justified referring to as a reference work. This
has involved untold thousands of permutations over a period of more than seven
years and has yielded some interesting results (such as the sections on field ini-
tialization anomalies in the first chapter). I have also built a number of concep-
tual frameworks from scratch. More than mere exercises in vocabulary, these
conceptual frameworks are designed to help students understand language fun-
damentals at a professional level. Among them are constructor mathematics,
inlined constants, system-induced argument checks, the five general
forms (of field access and method invocation expressions), the compiler-
enforced method contract, compilation unit scope, type privileges, catch-
all exception handlers, umbrella exceptions, and many others. I also tackle
some really tough subjects such as the practical uses of bitwise operators,
the type of a variable or expression versus the class of an object, and sub-
stitution as a higher concept than polymorphism that other books on the
subject tend to either skirt or completely ignore.
I owe it to myself to also mention that while selling this book online (before
Volume 1 was published by Addison Wesley Professional), I had many mechanical
engineers as well as computer programmers from the Indian subcontinent inter-
ested in my work. In both cases I felt honored, but the Indian readers were of par-
ticular interest to me. Finally, I asked one why he thought so many of his fellow
countrymen would be interested in my work. His reply was that my style of techni-
cal writing was more like that of the English than Americans. If this is so, I have no
idea why.
More generally, this book is for any programmer who wants to truly
master the fundamentals of the Java programming language. It is the travel
log I kept while making the same journey. While formalism and stylistic norms
forbid me from including students in the target audience, I firmly believe that this
book can be profitably used to teach the Java programming language in a class-
room setting (if not as a primary text, then as an auxiliary reference work).
Acknowledgements
I have had to wipe the slate clean in this department, so I will start with the latest
help that I got from Neal Gafter. Neal Gafter, I was surprised to learn, is the only
software engineer working on the javac compiler. That’s his baby. He recently
helped me to understand overloaded method matching and the changes that are
taking place in the 1.4.2 release (the declaring class of applicable methods is no
longer relevant). A warning here, however. Just because someone helped
doesn’t mean they approve of the resulting text. Gafter showed me a lot of
patience. On the other hand, if you learn something while reading 5.8 Overloaded
Method Matching, please believe me when I say that it is entirely due to
the generosity of Neal Gafter. Thanks Neal. I hope you don’t regret it after read-
ing the final text.
2. See www.javaranch.com/contact.jsp#ThomasPaul.
About This Book
No Pagination
The term pagination refers to various techniques used to format the pages in a
printed book. In particular, it refers to page breaks and their impact on the for-
mat of the rest of a page. Electronic editions of Mastering the Fundamentals
are not paginated. Pagination is a huge job and it must be redone every time a
book is published. I will be posting new versions far too frequently to paginate
every time. This may result in some rather odd page breaks in printed copies of
the electronic book.
Married to the Java Specifications
This book is based on The Java Language Specification (JLS 1). In fact, I
started out to write what I regarded as The Java Language Specification for
Mainstream Business Application Programmers. Shortly thereafter, how-
ever, I realized that material from The Java Virtual Machine Specification
(JVMS) would have to be included as well. The work has grown to include every-
thing of interest to mainstream business application programmers in a host of
Java specifications, including notably the Second Editions of both the JLS and
the JVMS.
The main difference between the specifications and this book is the target
audience. The JVMS, of course, is written for someone who wants to implement
a JVM. What is not as generally understood is that the JLS is a grammar
intended for someone who wants to write a Java compiler, such as the jikes
compiler team at IBM. This is a very small group of people, undoubtedly well
known to each other. For example, the javac compiler is developed and main-
tained by one software engineer (Neal Gafter, a widely recognized authority on
the JLS). Because of this very small target audience, the JLS includes a lot of
material that is of interest only to those programmers who make a living in the
arcane world of compilers. That is truly unfortunate because the JLS contains a
wealth of information of interest to application programmers that rarely makes it
into mainstream Java books. The same can be said of the JVMS. This book
extracts all of that information, elaborates upon it where necessary, and pre-
sents it in a technical writing style appropriate for application programmers.
Much more is covered, but the reader is guaranteed that both the JLS and
JVMS have been diligently searched for every scrap of information that is
of interest to application programmers.
1. Technically speaking, the initialisms JLS and JVMS should be italicized because they are book
titles. My decision not to do so is based on font aesthetics and nothing else.
Section Names are Very Descriptive
I have over the years decomposed unwieldy sections into subsections, each of
which has a very descriptive name. Because the section names are so
descriptive, the chapter-level TOC is intended to replace the usual introduction
to a chapter (which more often than not is simply an overview of the
chapter contents). Thus, I tend to use chapter introductions for miscellaneous
notes such as pointing out sections that I think are of particular importance,
explaining omissions, and the like. In short, the chapter introductions in this book
are little more than footnotes for the chapter-level TOC.
print is also considered. Occasionally, I introduce a new term “of my own mak-
ing” (as I like to say) or take exception with the usage in official specifications.
However, this is never done without first alerting the reader.
Correct usage is nothing to brag about. More experienced programmers may
actually look unfavorably at technical writers who place an unusually strong
emphasis on using terms correctly. There are more important things to think about. Who
really cares if a term is used loosely? We are computer programmers, not doc-
tors or lawyers. I agree. As your knowledge of computer programming increases,
the importance of terminology diminishes. All that really matters is good inter-
face design and a bug-free, efficient implementation.
For less experienced programmers, however, meaning is attached to terms.
If terms are used consistently and have a clear meaning, the learning process
can be reduced to the acquisition of new terms, the building of a vocabulary that
eventually translates into the actual writing of programs. For programmers at
this stage of development (which includes all of my target audience) terms
are the building blocks of knowledge. It is unnatural for a technical writer not
to respect this reality, unless perhaps the target audience is more experienced
programmers. Then correct usage is not as important.
So please, if at times I seem overbearing or a zealot when it comes to mat-
ters of terminology, try to take it in your stride. I am not trying to change the
culture of computer programming; I am just trying to do my job as a tech-
nical writer.
Release Information
I recall Dr. Gosling once saying that he spends an inordinate amount of time behind
closed doors speaking to lawyers. I have never fully understood why “Java 2 Plat-
form” was added to the beginning of official release names starting with the 1.2
release, but suspect that it is somehow related to those closed door sessions. Like
most software engineers and technical writers, I use what Joshua Bloch refers to
as engineer version numbers2 (a.k.a. platform versions) such as “the 1.4
release” instead of official release names such as “Java 2 Platform, Standard
Edition v 1.4” (or J2SE v1.4 for short). There is no significant difference (for pro-
grammers at least) because the engineering version number is part of the official
release name. Figure 1.1 shows how to read an engineering version number.
The java -version DOS command can be used to determine which ver-
sion of Java you are running. Note that the -version option is the only way to
determine what specific build you are using, such as 1.4.1_01-b01.
There never has been a new major release; 2.0 will be the first. Starting
with the 1.2 release, Sun committed to a new functionality release approxi-
mately once every 18 months. (These are sometimes referred to as feature
releases.) As the name implies, functionality releases introduce new packages
as well as changes (new classes, methods, etc.) to existing packages. Also
beginning with the 1.2 release, there are at most one or two maintenance
releases for each functionality release. At least some of them have insect code
names such as (praying) Mantis because they are primarily bug fixes. This also
explains why they are sometimes referred to as bug releases.
Beginning with the 1.1.4 release, Sun livened things up a bit by using code
names such as Merlin for the 1.4 release and Tiger for the 1.5 release. Table
1.1 includes all of the code names of which I am aware. Many programmers who
research the Bug Database on a regular basis find these code names mind
numbing, especially in light of the fact that Sun has never bothered to publish
something akin to Table 1.1, which would make it possible to translate code
names into engineering release numbers.
Table 1.1 Release Information

Engineering Version   JDK/Java 2 SDK   Code Name   Release Dates   Latest Update Release(a)
1.0.2                 JDK
1.1                   JDK
1.1.1                 JDK
1.1.2                 JDK
1.1.3                 JDK
C:\Java\classes>java Test
In order to make the DOS command appear more natural (at least to Windows
users), I always use C:\Java\classes> as the DOS prompt. Such a default
DOS prompt indicates that C:\Java\classes is the current working directory.
Otherwise, I have tried my best to be a platform-neutral author, but the
effort is imperfect, and I apologize in advance to any hard-core UNIX or Mac pro-
grammers who may take offense.
Examples of Code
Most of the examples in this book are so short and easy to read that to explain
them might insult some readers. This is very deliberate. I personally do not like
the long examples found in most other Java books. Code is difficult to read. You
have to acquaint yourself with variable names and other intricacies of the code
that have no long-term benefit to the reader. In this book, examples are always
as short as I can make them, preferably no more than 10 to 15 lines of code.
Furthermore, bold is often used to help draw the eye to the most important lines
of code. This makes it possible to quickly read and comprehend an example
without a detailed, blow-by-blow description of what the code is doing.
Another thing that annoys me about the examples in other Java books is the
lack of output. Not only are you expected to read and understand lengthy exam-
ples, but there is an assumption that you already know what the code outputs. I
always show the output from examples.
There are two kinds of examples: the most common is a short program
named Test; the other is based partly or wholly on snippets of code from the
core API of a J2SE release. These snippets usually consist of no more than a
few lines of code. Nevertheless, one must respect the copyright notices at the
top of each of the compilation units in the core API. Therefore, the examples of
source code from the core API are never the actual source code. They are sim-
plified versions of the same code that are good only for instructional purposes.
That is why I always qualify the introduction of these examples by saying that the
source code “looks like this.” That is, it is not the actual source code, but is
close enough to make the salient point without either breaking copyright laws or
bringing any disrepute to the software engineers at Sun. If I feel that a particular
example is overly simplified, I will usually include a note explicitly reminding the
reader that the actual source code is different.
fore (at least to my way of thinking) potentially misleading. Only class methods
such as System.arraycopy are qualified with the name of the class in which
they are declared (as they also appear in code).
Deprecated Members
My general policy towards deprecated classes, fields, methods, and construc-
tors is to pretend as if they do not exist. Deprecated code is something that
should ideally be removed from the language. Why waste time explaining the way
things used to be? The only exception to this rule is when explaining why some-
thing was deprecated helps to better understand the current API.
A Pet Peeve
In a Bill Venners interview, Dr. Gosling was asked some questions about Design
by Contract. The following is excerpted from that interview.
I believe it is completely beyond the state of the art to try to do that kind of
analysis statically, that particular one, but there are all kind of analyses that
you could actually do. You don't declare failure if you can't do everything stati-
cally, but every thing that you can push into the static analysis phase of
the system to get earlier and earlier is yet another source of reliability
to the system.1 [emphasis added]
This emphasis on “static analysis” is at odds with the following evaluation of Bug
Id 4128179, “Add a lint-like facility to javac.”
Many RFE's have been submitted against javac requesting warnings of the sort
that lint would give for C. Because of the importance of comformance [sic] and
the role that javac plays in establishing it, it is our policy … to not give
warnings by default. However, an optional mode that is not the default would
be fine.2
What we are talking about here is compiler warnings, or rather the lack of them.
A chapter should be added to the JLS so that “Write Once, Compile Anywhere”
(which at present describes only definite assignment and unreachable state-
ments) is extended to include compiler warnings. Special flags (also known as
compiler options) such as -Xswitchcheck should not be used to solve this
problem, not on the Java platform. Every compiler will have different options,
and Java programmers will not benefit from the standardization to which they
have become accustomed. In short, all of the software engineers responsible for
writing Java compilers should sit down at the earliest possible date and have a
compiler warnings powwow.
1. Dr. James Gosling in an interview with Bill Venners entitled “A Conversation with James Gosling
(May 2001),” on the artima.com Web site (Artima Software, Inc.), www.artima.com/intv/
gosling310.html. Please remember that Dr. Gosling is speaking extemporaneously. There also
appear to be some transcription errors that I have not marked “[sic]” or bothered to correct.
2. Evaluation of Bug Id 4128179.
Chapter 1
Chapter Contents
1.1 Introduction 34
1.2 Fields 35
1.3 Field Initialization 38
1.3.1 Automatic Initialization Using Standard Default Values 38
1.3.2 Initialization Blocks 39
1.3.3 Constructors 42
1.3.3.1 Reference Constructors 48
1.3.3.2 Alternative Constructor Designs 50
1.4 Field Initialization Anomalies 55
1.4.1 The Problem of Forward Referencing 56
1.4.2 Invoking Overridden Methods During Object Initialization 61
1.4.3 Inlined Constants Always Appear to Have Been Initialized 65
1.4.4 The Variable Initializer for a Field is Always Executed 68
1.4.5 StackOverflowError During Object Initialization 70
1.4.6 Throwing Checked Exceptions During Initialization 73
1.5 Constants 81
1.5.1 Compile-Time Constant Expressions 82
1.5.2 Inlined Constants 89
1.5.2.1 The Problem of Changeable Inlined Constants 95
1.5.3 Declaring Mutable Objects final 101
1.5.4 Blank Finals 102
1.5.5 Enumerated Types 106
1.5.6 Declaring Local Variables and Parameters final 126
1.5.7 The Constant Interface Antipattern 130
1.6 Methods 131
1.6.1 abstract Methods 134
1.6.2 Result Types and the return Statement 139
1.6.2.1 Using Return Values to Indicate Failure 142
1.6.3 Formal Parameter Lists 144
1.6.3.1 Argument Checks 147
1.6.3.2 On const References 155
1.6.3.3 Making Defensive Copies 160
1.6.4 The throws Clause 165
1.7 Local Variables 174
1.8 “Write Once, Compile Anywhere” 179
1.8.1 Definite Assignment 181
1.8.2 Unreachable Statements 184
1.9 Qualifying Type versus Compile-Time Declaration 188
1.10 The Five General Forms 193
1.10.1 The Meaning of a Simple Field or Method Name 198
1.10.2 Method Invocation Chaining 204
1.10.3 Casting a Target Reference 206
1.10.4 Accessing static Members using a Primary Expression 208
1.11 Method Signatures 209
1.11.1 The Compiler-Enforced Method Contract 215
1.11.1.1 throws Clause Conflicts in abstract Methods 221
1.11.1.2 Covariant Result Types 224
1.11.2 Overloaded Methods 225
1.11.3 Overriding and Dynamic Method Lookup 233
1.11.3.1 The invokeinterface machine instruction 241
1.12 Method Forwarding 246
1.1 Introduction
Methods and constructors have a lot in common. According to the JLS, the fol-
lowing are “identical in structure and behavior”1 for both constructors and meth-
ods.
• Parameter lists
• Signatures
• The throws clause
• Overloading2
Because of these similarities between methods and constructors, many of the
sections on methods in this chapter apply equally to constructors.
1. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, The Java Language Specification,
Second Edition (Boston: Addison-Wesley, 2000), §8.8.1, “Formal Parameters,” §8.8.4, “Construc-
tor Throws,” and James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First
Edition, (Reading: Addison-Wesley, 1996), §8.6.2, “Constructor Signature.” (Do not update.)
I consider 1.10 The Five General Forms (of field access and method invoca-
tion expressions) to be one of the most important sections in the whole book.
There are a great many sections throughout this volume and the first that make
reference to at least one of the five general forms. The term five general forms
is mine. However, it is merely a restatement of the JLS rules for determining the
class or interface searched by a compiler to find the declaration of a field or
method, one that is better adapted for classroom settings.
This chapter makes extensive use of the <init> and <clinit> method
names. The more formal names of these methods are instance initialization
method and class initialization method, both of which are simply too long for
repeated use. If you are not familiar with the <init> and <clinit> methods,
they are discussed in 2.3.1 Special Initialization Methods in Volume 1.
The only field and method modifiers discussed in detail in this chapter are
final and abstract. The access modifiers public, protected, and
private are discussed in the next chapter. The static and strictfp
modifiers are discussed in Volume 1. The other field and method modifiers
(transient, volatile, synchronized, and native) are discussed in
an as yet unpublished volume.
1.2 Fields
Fields are either class variables, instance variables, or interface constants. They
are declared in a class or interface body, whereas local variables are declared in
a block. Fields and local variables share a common declaration syntax, which is
discussed in Table 1.1. The constants declared in the body of an interface are
implicitly public static final. These are the only modifiers that can be
used when declaring interface constants. The explicit use of constant modifiers
was actively discouraged “as a matter of style” in the original JLS.
2. Gosling et al., The Java Language Specification, §8.8.6, “Constructor Overloading.” Actually
the JLS says only that constructor overloading “is identical in behavior,” not “identical in structure
and behavior.” Constructors are inherently overloaded because they have the same name as the
class in which they are declared, which (I presume) explains why constructor overloading is not also
“identical in structure.”
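To make the implicit modifiers concrete, here is a short program that confirms them using reflection. The Limits interface and its MAX_RETRIES constant are my own, purely for illustration:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// A constant declared in an interface body is implicitly
// public static final, even when no modifiers are written.
interface Limits {
    int MAX_RETRIES = 3; // same as: public static final int MAX_RETRIES = 3;
}

class Test {
    public static void main(String[] args) throws Exception {
        Field f = Limits.class.getField("MAX_RETRIES");
        int m = f.getModifiers();
        System.out.println(Modifier.isPublic(m)); // true
        System.out.println(Modifier.isStatic(m)); // true
        System.out.println(Modifier.isFinal(m));  // true
    }
}
```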
b Field or local variable The field modifiers for class and instance variables
modifiers are the access modifiers (public, protected,
and private ), static, final,
transient, and volatile. The only field
modifiers for interface constants are public,
static, and final. These are referred to as
interface constant modifiers. The only local
variable modifier is final.
3. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First Edition,
§9.3, “Field (Constant) Declarations.” (Do not update.)
4. Gosling et al., The Java Language Specification, §8.3.1, “Field Modifiers.”
This order is not arbitrary. It reflects the order in which the modifiers are
most commonly used, starting with the access modifiers and ending with
volatile. The same order of the access modifier followed by static and
then final is also used in method declarations.
There is only one illegal combination of field modifiers. A field cannot be
declared both final and volatile. To do so would be an inherent contra-
diction because final fields are constants and the volatile modifier
implies change.
The default serialization mechanism uses the transient keyword to
determine which instance variables to include in the serialized form. Instance
variables that are not declared transient are said to represent the persis-
tent state of an object.
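For example, the following short program (the Session class is of my own making) serializes an object to a byte array and reads it back; the transient field does not survive the round trip:

```java
import java.io.*;

// Instance variables declared transient are excluded from the default
// serialized form; on deserialization they are given standard default values.
class Session implements Serializable {
    String user = "doug";          // persistent state
    transient int cachedHash = 42; // not part of the persistent state

    static Session roundTrip(Session s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(s);
        out.flush();
        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
        return (Session) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        Session copy = roundTrip(new Session());
        System.out.println(copy.user);       // doug
        System.out.println(copy.cachedHash); // 0 (the standard default value)
    }
}
```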
It is interesting to note that in alpha releases of the Java programming lan-
guage the name of the volatile modifier was threadsafe. The effect of
declaring a field volatile is that it can be safely accessed by unsynchronized
methods.
A variable declarator consists of an identifier and an optional variable ini-
tializer. There can be any number of comma separated variable declarators in
the same declaration statement. The purpose of a variable declarator is to allow
for more than one variable to be declared at a time. The modifier(s) and type
name are the same for all of the fields or local variables declared. For example,
/** Constant for the Unicode character block of the same name. */
public static final UnicodeBlock
BASIC_LATIN = new UnicodeBlock("BASIC_LATIN"),
LATIN_1_SUPPLEMENT = new UnicodeBlock("LATIN_1_SUPPLEMENT"),
LATIN_EXTENDED_A = new UnicodeBlock("LATIN_EXTENDED_A"),
LATIN_EXTENDED_B = new UnicodeBlock("LATIN_EXTENDED_B"),
…
Table 1.2 Standard Default Values
(columns: Data Type, Default Initialization, Literal Value)
The automatic initialization of fields using standard default values occurs
immediately after memory is allocated for a field and before the
<init> or <clinit> method is invoked.
The question arises: Should fields be explicitly initialized with a stan-
dard default value? Doing so would seem redundant, but there is also a main-
tenance issue here. Looking only at the following declaration, can you tell
whether the intent is to initialize count with the standard default value of
zero or whether field initialization is being deferred?
int count;
int count = 0;
There is a slight performance price to pay, however, because the variable initial-
izer must be executed after default initialization has already occurred, in effect
assigning zero to a field that is already equal to zero. See 1.4.4 The Variable Ini-
tializer for a Field is Always Executed for a discussion.
Array components are fields in the dynamically created array classes. They
are always initialized to standard default values. It does not matter if the
array is created in a method or constructor and assigned to a local variable. In
that case, the array variable is local, not the array components.
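A short program of my own making demonstrates the point:

```java
// Array components are always initialized to standard default values,
// even when the array is created in a method and assigned to a local
// variable. The local variable itself must be definitely assigned, but
// the components it refers to belong to the array object.
class Test {
    public static void main(String[] args) {
        int[] counts = new int[3];
        Object[] refs = new Object[2];
        boolean[] flags = new boolean[1];
        System.out.println(counts[0]); // 0
        System.out.println(refs[0]);   // null
        System.out.println(flags[0]);  // false
    }
}
```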
Loading this array requires several lines of code, which is simply more than can
fit into a variable initializer. Notice the placement of the curly braces. This style
of coding static initialization blocks is comparable to method declarations. It
minimizes the indentation of code in the block and also helps to standardize the
look and feel of source code.
With the exception of anonymous classes, variable initializers cannot invoke
methods that throw a checked exception. Therefore such methods must be
invoked in initialization blocks. In the case of static initialization blocks, the
checked exception must be caught and handled. Instance initialization blocks
can throw checked exceptions, but only if the checked exception is included in
the throws clause of every constructor in the class. For example,
class Test {
    int instanceVariable = initialize();
    Test() throws Exception { }
    int initialize() throws Exception {
        return 0;
    }
}
For a complete discussion of this subject see 1.4.6 Throwing Checked Excep-
tions During Initialization.
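To make the static case concrete, here is a minimal sketch (the load() method and its checked exception are hypothetical names of mine, not from the text): because the <clinit> method has no throws clause, the checked exception must be caught and handled inside the block.

```java
class StaticInit {
    static final String VALUE;
    static {
        String v;
        try {
            v = load();        // may throw a checked exception
        } catch (Exception e) {
            v = "default";     // must be caught and handled in the static block
        }
        VALUE = v;
    }
    static String load() throws Exception {
        return "loaded";
    }
}
```

Note that the blank final VALUE is assigned exactly once on both paths, which keeps the compiler's definite assignment analysis satisfied.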
Instance initialization blocks are rarely used as a replacement for construc-
tors in anonymous classes, which is ironic because that is precisely why they
were introduced to the language in the 1.1 release as part of the Inner Classes
Specification. (Anonymous classes cannot declare constructors because there
is no class name to use in the constructor declaration.) That part of the specifi-
cation was not implemented until the 1.4 release, however. By then the decision
had been made to allow the instance variable initializers in an anonymous class
to throw checked exceptions also. Invoking instance methods that throw
checked exceptions is one of the main reasons why a replacement for construc-
tors was needed in the first place.
Initialization blocks must be able to complete normally. This was a change to
the language that was not implemented until the 1.3 release. The primary Bug Id
is 4064281. It relates to the specification for unreachable statements, the very
first rule of which is that “the block that is the body of a constructor, method,
instance initializer or static initializer is reachable.”5 Remember that initialization
blocks are executed in the <clinit> and <init> special initialization methods
in the same textual order in which they are found in the source code. If one of
them cannot complete normally then others that follow would not be reachable.
For example,
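The textual-order rule can be sketched with the following class (the names are hypothetical): the two static blocks and the variable initializer for x all execute in the <clinit> method, in the order in which they appear in the source.

```java
class Order {
    static final StringBuilder LOG = new StringBuilder();
    static { LOG.append("first;"); }   // runs first in <clinit>
    static int x = initX();            // variable initializers interleave in textual order
    static { LOG.append("third;"); }   // runs last
    static int initX() {
        LOG.append("second;");
        return 0;
    }
}
```

If the first static block could not complete normally, the initializer for x and the second block would be unreachable.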
NOTE 1.1
Constructors serve two purposes. One is the chaining together of in-
stance initialization methods, which is discussed in 2.4.1.2 Default
Constructors in the Second Edition of Volume 1. The other is dynamic
initialization of instance variables, which is the subject of the following
section.
1.3.3 Constructors
The term constructor is something of a misnomer. Constructors do not actually
“construct” anything. Leaving aside the issue of implicit or explicit superclass
constructor invocations (the chaining together of instance initialization methods),
constructors more or less just assign values to instance variables after an object
has been created. With or without a constructor, the runtime system creates and
initializes an object every time a class instance creation expression is evaluated.
That the this keyword can be used in constructors shows that the object is
created before the constructor (or rather, the corresponding <init> method)
is executed. The fact is that <init> methods are not any different from other
instance methods except that they happen to be the first method invoked after
an object is created.
The four ways in which to initialize instance variables are summarized in
Table 1-3. What makes constructors fundamentally different from other initializa-
tion code (automatic initialization using standard default values, instance variable
initializers, and instance initialization blocks) is that the values assigned to
instance variables can be different from one object to the next because they are
passed by client programmers as arguments in class instance creation expres-
sions. In the context of a discussion of constructor design, I refer to automatic
The value of today will be different for every object created. Nevertheless, they
are the same in the sense that they are the current date. Client programmers
could pass any date, including one in the past.
The term dynamically assigned initial values is just a fancy way of refer-
ring to constructor parameters, which are either required or optional. They are
required if every constructor in the class includes a parameter for a particular
dynamically assigned initial value. Otherwise, they are optional. For example,
class Widget {
    private long date; //dynamically assigned initial value
    Widget() {
        date = new Date().getTime();
    }
    Widget(Date date) {
        if (date == null)
            throw new NullPointerException();
        this.date = date.getTime();
    }
}
6. This term is derived from “constructor madness” as used in “Java Tip 63” in Java World, which
is entitled “Avoid ‘constructor madness.’” See www.javaworld.com/javaworld/javatips/
jw-javatip63.html.
Table 1.4 Constructor Mathematics

Row B
Initial Values: Either there are no instance variables (such as a utility class) or
all of the instance variables have statically compiled values.
Constructor Requirements: None (uses the default constructor).

Row C
Initial Values: All instance variables are initialized with required constructor
parameters.
Constructor Requirements: One constructor (which is alternatively referred to
as the minimal constructor).a

Row D
Initial Values: All instance variables default if not specified. Therefore all of
the constructor parameters are optional.
Constructor Requirements: For every “mathematically possible” combination of
constructor parameters, there is one constructor. Quotation marks are used
around “mathematically possible” because very often not every combination
makes sense from an interface design perspective. Assuming that all of the
combinations are used, however, the arithmetic is simple. You use the number
of optional constructor parameters as a power of two to determine the number
of required constructors. For example, if there are three optional constructor
parameters, the class would require 2^3 or eight constructors. To that must be
added a no-argument constructor.

Rows C and D
Initial Values: Required constructor parameters and optional constructor
parameters.
Constructor Requirements: Same as row D except that a minimal constructor
replaces the no-argument constructor.

a. I refer to the constructor that includes all of the required constructor parameters (and only those parame-
ters) as the minimal constructor. Other constructors in the same class will have optional parameters in addi-
tion to the required ones. (The term minimal constructor is of my own making. It is only useful in discussing the
concept of constructor mathematics, and is not to be confused with the more widely used term reference con-
structor discussed below.)
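As a sketch of the arithmetic, consider a hypothetical Gadget class (the class and its parameters are mine) with one required parameter (id) and two optional ones (width and height):

```java
class Gadget {
    final int id;
    final int width;   // optional: defaults to 100 if not specified
    final int height;  // optional: defaults to 50 if not specified
    Gadget(int id) { this(id, 100, 50); }              // the minimal constructor
    Gadget(int id, int width) { this(id, width, 50); }
    Gadget(int id, int width, int height) {            // all parameters specified
        this.id = id;
        this.width = width;
        this.height = height;
    }
    // A Gadget(int id, int height) constructor would have the same signature
    // as Gadget(int id, int width) -- one reason why not every "mathematically
    // possible" combination makes sense from an interface design perspective.
}
```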
append(str);
}
Can you tell which of these is the reference constructor? In this case, the refer-
ence constructor merely creates an array and marks it as not shared.
Reference constructors are used as a way of assuring that instance initializa-
tion is exactly the same for all of the objects in a class. For example, factory
methods typically use a private reference constructor. Likewise, if there is
no constructor that includes only the initialization logic common to some
or all of the constructors in a class, a private method can be used
instead. Consider the following example from the HashMap class.
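This is not HashMap's actual source, but a sketch of the idiom: a private method holds the initialization logic common to all of the constructors (the Cache class and its members are hypothetical names of mine).

```java
import java.util.HashMap;
import java.util.Map;

class Cache {
    final Map<String, String> map;
    Cache()             { map = init(16); }       // default capacity
    Cache(int capacity) { map = init(capacity); }
    private Map<String, String> init(int capacity) {  // common initialization logic
        return new HashMap<String, String>(capacity);
    }
}
```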
computer programmer. One of them is using something the way it was
designed to be used. Doing so tends to avoid unforeseen problems. In this
case, that means using constructors instead of factory methods unless there is
a compelling reason to do otherwise. I urge you to consider doing the same.
Alternative constructor designs solve one of the following problems.
• A need for more than one constructor with the same signature, such as
more than one no-arg constructor
• A class that has an unmanageable number of constructors (when constructor
mathematics does indeed become constructor madness). This always
stems from the fact that there are a lot of what would normally be optional
constructor parameters (as defined in 1.3.3 Constructors) that default if not
specified. Here it is important to differentiate between a class that cannot
know for sure which combinations of optional constructor parameters make
the most sense and a class that simply has an extraordinary number of
optional constructor parameters. The former tends to use set methods
while the latter uses a separate class with public fields (a struct-like
data structure) to make it easier for client programmers to set the fields.
Both are discussed further below.
• Constructors are like nameless methods. They can only be differentiated by
their signature. Constructors that serve a highly specialized purpose
(beyond what can be explained by the concept of constructor mathematics)
need a name to make that purpose more explicit to client programmers.
This is precisely why the BigInteger.probablePrime(int
bitLength, Random rnd) factory method was added in the 1.4
release.
The first problem is sometimes solved by rearranging constructor parameters
so that the signatures are arbitrarily different. From an interface design perspec-
tive, that is a really bad idea (and does not address the problem of needing more
than one no-arg constructor).7 A much better solution is to use factory methods.
Here is an example of this use of factory methods from the DateFormat
class:
7. Actually I recall an article in a popular Java magazine that suggested using a dummy parameter
in order to obtain a second no argument constructor. The dummy parameter was of type boolean. It
was suggested that clients always pass true; not that it mattered because the value was completely
ignored. This is a bizarre suggestion and evidences a total disregard for interface design.
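For illustration (this is my example, not the book's elided listing), two of the real DateFormat factory methods make the point: had they been constructors, both would have had the signature (int, Locale).

```java
import java.text.DateFormat;
import java.util.Locale;

class FactoryNames {
    public static void main(String[] args) {
        // The method names differentiate what constructor signatures could not:
        DateFormat dates = DateFormat.getDateInstance(DateFormat.SHORT, Locale.US);
        DateFormat times = DateFormat.getTimeInstance(DateFormat.SHORT, Locale.US);
        System.out.println(dates != times);
    }
}
```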
Note how many of these factory methods would have the same signature if they
were constructors instead. Factory methods are not usually declared final,
which makes it all the more obvious that this is not a traditional use of factory
methods. As discussed in the seminal work Design Patterns,8 the whole point
of using factory methods is to override them in subclasses. Programmers
should be able to readily differentiate the different uses of factory meth-
ods. Here is another example of this particular use (a solution to the problem of
having more than one constructor with the same signature) from the Break-
Iterator class in java.text:
8. Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, Design Patterns, Elements of
Reusable Object-Oriented Software, (Boston, Addison-Wesley Professional, 1995).
public static Logger getLogger(String name)
public static Logger getLogger(String name,
String resourceBundleName)
This use of factory methods is very common. The factory methods in the last
two examples can be overridden, but they still have no correspondence whatso-
ever to factory methods as envisioned in Design Patterns. Certainly using fac-
tory methods for something as simple as “constructor names” (the third bulleted
item in the above list of constructor design problems) is not something you will
find discussed in Design Patterns. The essential definition of a factory method,
however, is to “manufacture” an object, and using them to solve constructor
design problems such as these is perfectly valid.
Another example of how the use of factory methods has evolved since the
publication of Design Patterns is their use to return instances of cached immu-
table objects. If there are many requests for the same immutable object, creat-
ing it over and over again in a constructor makes no sense. For example, the API
docs for the valueOf(long unscaledVal, int scale) method in the
BigDecimal class states:
This “static factory method” is provided in preference to a (long,
int) constructor because it allows for reuse of frequently used
BigDecimals.9
That is also why the Boolean.valueOf(boolean b) method was added in the 1.4
release. There are many such examples in the core API. The reason for making
these comparisons to Design Patterns is not to discredit the Gang-of-Four (the
GoF, or authors), merely to point out that their ideas have evolved over time
along with the Java programming language.
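A sketch of the caching idiom (this is not the JDK's actual source; the Flag class is a hypothetical stand-in for Boolean): the constructor is private, and the factory method hands out shared immutable instances instead of creating new ones.

```java
final class Flag {
    static final Flag TRUE = new Flag(true);
    static final Flag FALSE = new Flag(false);
    private final boolean value;
    private Flag(boolean value) { this.value = value; }  // no public constructor
    static Flag valueOf(boolean b) {
        return b ? TRUE : FALSE;  // reuses the cached immutable instances
    }
    boolean booleanValue() { return value; }
}
```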
The problem of too many constructors is usually solved by using a no-arg
constructor and set methods. The GregorianCalendar class comes to mind;
it declares the following constructors:
9. API docs for the valueOf(long unscaledVal, int scale) method in the
java.math.BigDecimal class
GregorianCalendar()
GregorianCalendar(int year, int month, int date)
GregorianCalendar(int year, int month, int date,
int hour, int minute)
GregorianCalendar(int year, int month, int date,
int hour, int minute, int second)
GregorianCalendar(Locale aLocale)
GregorianCalendar(TimeZone zone)
GregorianCalendar(TimeZone zone, Locale aLocale)
If other fields need to be set, the set(int field, int value) method is
used after the GregorianCalendar object is created. The Component
and JComponent classes are other important examples of this solution in the
Core API. Though not as immediately obvious, the applyPattern(String
pattern) methods in classes such as DecimalFormat and Simple-
DateFormat are also examples.
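The no-arg-constructor-plus-set-methods idiom can be sketched as follows (the build() wrapper is mine; the Calendar API calls are real):

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class CalendarDemo {
    static GregorianCalendar build() {
        GregorianCalendar cal = new GregorianCalendar(); // no-arg constructor
        cal.set(Calendar.YEAR, 2003);                    // set methods supply what
        cal.set(Calendar.MONTH, Calendar.AUGUST);        // would otherwise be a large
        cal.set(Calendar.DAY_OF_MONTH, 14);              // family of constructor parameters
        return cal;
    }
}
```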
Finally, there is the rare case in which a class has an extraordinary number
of optional constructor parameters that default if not specified. The GridBag-
Layout and GridBagConstraints classes in the java.awt package are
a well-known example of this. GridBagLayout is a LayoutManager used
to lay out GUI components. The GridBagConstraints class is described as
follows in the API docs.
The GridBagConstraints class specifies constraints for compo-
nents that are laid out using the GridBagLayout class.10
These “constraints” very well could have been constructor parameters in a differ-
ent solution to the problem of laying out GUI components. Instead, they are
public fields in the GridBagConstraints class, which is basically a C
struct (or record in other legacy systems) in which the value of what would
be constructor parameters can be directly set. The would-be constructor
parameters are then passed, as an instance of the GridBagConstraints
class, to the GridBagLayout class that actually uses them. All of the public fields in
such a class represent optional constructor parameters that default if not speci-
fied. In the example of the GridBagLayout and GridBagConstraints
classes, the default values are set in a no-argument GridBagConstraints
constructor. The API docs for that no-arg constructor reads as follows.
Creates a GridBagConstraint object with all of its fields set to
their default value.11
Only those fields that do not use the default values need to be set. This also sim-
plifies the overall interface design in that the class that actually uses the con-
structor parameters has only one constructor which is passed an instance of the
class used to set those parameters. Using a separate class in which constructor
parameters are declared as public fields is an extreme solution that I hesitate
to even characterize as an alternative constructor design. Examples such as
GridBagLayout and GridBagConstraints are rare in the core API.
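The struct-like usage pattern can be sketched as follows (the constrain() wrapper is mine; the GridBagConstraints fields are real): only the non-default fields are set, and the constraints object is then handed to the layout manager along with the component being added.

```java
import java.awt.GridBagConstraints;

class ConstraintsDemo {
    static GridBagConstraints constrain() {
        GridBagConstraints c = new GridBagConstraints(); // no-arg constructor sets
        c.gridx = 0;                                     // all fields to their defaults;
        c.gridy = 1;                                     // set only the non-default ones
        c.fill = GridBagConstraints.HORIZONTAL;
        return c;  // passed to GridBagLayout along with the component
    }
}
```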
class Test {
int a = b; //COMPILER ERROR
int b = a;
}
class Test {
int a = b; //NOT A FORWARD REFERENCE
static int b = 10;
}
The difference is that class variables are initialized in the <clinit> method,
which is executed when the class file is loaded. Only class variable initializers
and static initialization blocks can forward reference a class variable.
Method invocation expressions are not checked for forward references. For
example,
class Test {
static int a = forwardReference();
static int b = 10;
public static void main(String[] args) {
System.out.println(Test.a);
}
static int forwardReference() { return b; }
}
Care must be exercised, therefore, that the methods invoked in variable initializ-
ers and initialization blocks do not reference a field that is declared further down
in the compilation unit.
Though forward references may involve method invocation expressions,
they always occur in either a variable initializer or an initialization block.
In the case of variable initializers, the problem is always the textual order of field
declarations. Nothing can be done to systematically avoid this problem except
to make sure that if an instance variable is used in an instance variable initializer
(or a class variable is used in a class variable initializer) that it is declared before
the field being initialized. In short, reorder the field declarations. In the case of
initialization blocks, however, the problem of forward referencing can be
systematically avoided.
This specification was not fully implemented until the 1.4 release. I will look at
the last bulleted item first. The necessity for this change can be seen in the fol-
lowing example.
class Test {
public static void main(String[] args) {
new Test().new InnerClass().print();
}
class InnerClass {
void print() {
System.out.println(s);
}
}
String s = "not a forward reference";
}
class Test {
static {
s = "this is NOT a forward reference";
}
static String s = "";
}
In light of the discussion in 1.4.4 The Variable Initializer for a Field is Always Exe-
cuted, this change only makes sense. The reality is that fields can be assigned
values after automatic initialization using standard default values and before vari-
able initializers are executed. Why not make that reality more explicit?
It is the following example that intrigues me.
class Test {
static {
s = "this is NOT a forward reference";
String copy = s; //Is this really a forward reference?
}
static String s = "";
}
Prior to the 1.4.1 release, both statements in the static initialization block
would generate a forward reference compiler error. Now that only the
second one does, I have to question whether this is really a forward reference.
Another approach would be to say that there are no forward references after a
value has been assigned to the field.
Finally, there is an interesting little hole in the compiler analysis for circular
initialization. Here is an example:
class Test {
static final int INLINED_CONSTANT = Test.INLINED_CONSTANT;
public static void main(String[] args) {
System.out.println(INLINED_CONSTANT);
}
}
This example compiles and when executed prints 0. The bytecodes for this
example are even more interesting:
Method Test()
0 aload_0
1 invokespecial #1 <Method java.lang.Object()>
4 return
6 invokevirtual #4 <Method void println(int)>
9 return
Method static {}
0 getstatic #3 <Field int INLINED_CONSTANT>
3 putstatic #3 <Field int INLINED_CONSTANT>
6 return
While this is not a bug (it is a correct implementation of the JLS), one
can imagine a useful optional warning message for such cases. I'm
changing this to an RFE.16 [example modified to conform with mine]
Looking at the byte codes, however, I would say that this should be an error.
class Superclass {
Superclass() { overridden(); }
void overridden() { System.out.println("Superclass"); }
}
class Test extends Superclass {
String s = "Test";
public static void main(String[] args) {
new Test().overridden();
}
void overridden() { System.out.println(s); }
}
In both cases the result is a forward reference, but in this example the instance
variable that has only a standard default value (because the variable initializer
has not executed) is declared in a subclass. Therein lies the substantial
difference. Executing this program prints
null
Test
There are two invocations of the overridden() method. The first is from the
Superclass constructor. At that point the <init> method for Test has not
yet been invoked, and all of the instance variables in Test still have standard
default values. Thus null is printed.
In the last example, the overridden() method is invoked in a construc-
tor. The exact same thing happens if you invoke overridden instance methods in
instance variable initializers or instance initialization blocks. For example,
class Superclass {
{ System.out.println(overridden());}
String s = overridden();
String overridden() { return "Superclass"; }
}
class Test extends Superclass {
String s = "Test";
public static void main(String[] args) {
new Test().print();
}
void print() { System.out.println(super.s); }
String overridden() { return s; }
}
Executing this program prints

null
null
Apparently the author did not realize this problem would not go away. Further-
more, it is not limited to “some implementations.” It is just the problem of invok-
17. John Rose, Inner Classes Specification (Mountain View: Sun Microsystems, 1997), “How do
inner classes work?” As I am reading him, “up-level reference” in this context means access to the
private members of an enclosing class. It must mean that because this$0 (otherwise known
as a link variable) is only used to defeat the normal access control mechanism in a JVM
class Superclass {
class InnerMemberSuperclass {
InnerMemberSuperclass() { print(); }
void print() {System.out.println("overridden method");}
}
}
class Test extends Superclass {
String s = "private member";
public static void main(String[] args) {
new Test().new InnerMemberSubclass();
}
class InnerMemberSubclass extends InnerMemberSuperclass {
void print() {System.out.println(s);}
}
}
class Test {
private String s = "this is a private instance variable";
public static void main(String[] args) {
new Test().new Test$InnerMemberClass().print();
}
String access$000() { //COMPILER-GENERATED ACCESSOR METHOD
return s; //DEFEATS NORMAL ACCESS CONTROL
}
}
class Test$InnerMemberClass {
private Test this$0; // COMPILER-GENERATED LINK VARIABLE
// JVM QUIETLY PASSES BOTH this AND this$0
Test$InnerMemberClass (Test this$0) {
this.this$0 = this$0;
}
void print() {System.out.println(this$0.access$000());}
}
The effect of this specification is that inlined constants are not subject to the
field initialization anomalies discussed in the previous two sections. To demon-
strate this fact, I will use examples from those sections and modify them to use
inlined constants. Here is the first of the two examples from 1.4.1 The Problem
of Forward Referencing:
class Test {
static int a = forwardReference();
static final int b = 10;
public static void main(String[] args) {
System.out.println(Test.a);
}
static int forwardReference() { return b; }
}
18. Gosling et al., The Java Language Specification, §13.1, “The Form of a Binary.”
class Superclass {
Superclass() { overridden(); }
void overridden() { System.out.println("Superclass"); }
}
Test
Test
This program used to print null followed by Test before adding the final
modifier to the declaration of s (thereby making it an inlined constant).
There is a question as to how the inlined constants “must always appear to
have been initialized” specification is implemented. The JLS very misleadingly
suggests that this specification is implemented by merely initializing inlined con-
stants before the other fields in a class:
One subtlety here is that, at run time, static variables that are
final and that are initialized with compile-time constant values are ini-
tialized first. This also applies to such fields in interfaces. These vari-
ables are “constants” that will never be observed to have their default
initial values, even by devious programs.19
19. Gosling et al., §8.3.2.1, “Initializers for Class Variables.” There is a similar statement in refer-
ence to interface constants in §9.3.1, “Initialization of Fields in Interfaces.”
In the case of instance variables, variable initialization is not reordered. That
does not matter, however, because the value of the inlined constant is never
“observed.” It is inlined. In the case of a class variable, the declaration of the
static field is written to the class file, but there appears to be no initialization
logic at all. The reordering of initialization is therefore a moot point. Yet
elsewhere the JLS clearly says that there should be no references to an inlined
constant “present in the code in a binary file
(except in the class or interface containing the constant field, which will have
code to initialize it).…20 Here is an example,
class Test {
String one = "1";
String two = "2";
final String THREE = "3"; //inlined instance variable
static final String FOUR = "4"; //inlined class variable
{
String s = THREE;
}
static {
String s = FOUR;
}
}
Method Test()
0 aload_0
1 invokespecial #1 <Method java.lang.Object()>
4 aload_0
20. Gosling et al., The Java Language Specification, §13.1, “The Form of a Binary.” Emphasis
added.
Method static {}
0 ldc #8 <String "4">
2 astore_0
3 return
These bytecodes show that THREE and FOUR are indeed inlined constants (there
are no references to these fields in the decompiled code, only their values). As
you can see, THREE is not initialized before the other instance variables. There
is no reordering of the initialization as suggested in the above quote from the
JLS. The declaration of FOUR is written to the file, but not the variable initializer.
None of this, however, changes the fact that inlined constants “always appear
to have been initialized.” It is just a question of how this specification is imple-
mented.
int x = 0;
class Test {
public static void main(String[] args) {
System.out.println(i);
}
static {
i = Integer.MAX_VALUE;
}
static int i = 0;
}
Executing this program prints 0. If the variable initializer were not executed it
would print the value of Integer.MAX_VALUE.
According to 8.3.2.3, “Restrictions on the use of Fields during Initialization”
in the JLS, this is not an example of forward referencing (which is somewhat
obvious because otherwise it would not have compiled). That section was added
to the Second Edition of the JLS and reflects a change in the Java programming
language that was not actually implemented until the 1.4.1 release. The fact that
“the variable initializer for a field is always executed” is an important change in
the Java programming language that was implemented in the 1.2 release, however.21 The
rationale for the change at the time was in fact forward references and the
closely related problem of invoking overridable methods during object initializa-
tion. Prior to the 1.2 release, the next two examples would print 100, whereas
now they print 0.
class Test {
int j = forwardReference();
int i = 0;
int forwardReference() {
return i = 100;
}
public static void main(String[] args) {
System.out.println(new Test().i);
}
}
21. I became aware of this issue while reading the “Compatibility with Previous Releases” document
in the 1.2 release. For some reason the primary Bug Id 1227855 is no longer available on the Bug
Database.
The assignment statement i = 0 simply was not included in the <init> meth-
ods for both of these examples because of the assumption that doing so would
overwrite the standard default value. Similar examples can be constructed for
class variables and reference type fields that are initialized with the standard
default value of null. This change is significant in part because it means explic-
itly initializing fields with their standard default value is not without cost.
class Test {
public static void main(String[] args) {
new Superclass();
}
}
class Superclass {
Subclass subclass = new Subclass();
}
class Subclass extends Superclass { }
Executing this program throws a StackOverflowError:

Exception in thread "main" java.lang.StackOverflowError
at Superclass.<init>(Test.java:7)
at Subclass.<init>(Test.java:9)
at Superclass.<init>(Test.java:7)
…
NOTE
Every once in a while I come across some of my own technical writing
that makes absolutely no sense (even to me) and is obviously wrong.
While this is never a pleasant experience, there is usually some reason
for the confusion. While working on the index, that is exactly what hap-
pened with the following section on August 14, 2003. It should be com-
pletely ignored in all versions of the book published prior to that date.
The problem in this case was a huge compiler bug that I immediately
submitted. Actually, I am sure that it will result in a change to the spec-
ification rather than the language because otherwise there will be an
equally huge backwards compatibility problem.
Note that this specification never saw the light of day until the 1.4 release.23
Uncharacteristically, the Second Edition of the JLS included the following
very thoughtful explanation for the difference between instance initialization
blocks in named versus unnamed (or anonymous) classes.
An instance initializer of a named class may not throw a checked
exception unless that exception or one of its superclasses is explicitly
declared in the throws clause of each constructor of its class and the
The first paragraph is a restatement of the original in the Inner Classes Speci-
fication. The second paragraph is an elaborate explanation as to why instance
initialization blocks in an anonymous class can throw any checked exception. In
plain English, anonymous classes are only used by the programmers who code
them. Therefore, there is no need for a throws clause.
Although an undocumented feature of the language, instance variable initial-
izers behave exactly the same as instance initialization blocks in this regard;
they can throw checked exceptions so long as the exception is assignable
to an exception class in the throws clause of every constructor in the
class (which assumes that at least one constructor is declared because default
constructors do not throw exceptions). On August 14, 2003, I submitted the fol-
lowing bug which will explain this in detail.
I am sure this is a documentation problem in the JLS, but am filing it
against the compiler because it clearly contradicts the Second Edition
of the JLS.
Amazingly, I have not been able to find any previously submitted bugs
or any discussion of this issue whatsoever on the JDC.
Here I must point out that only the last paragraph of that section is
being corrected. The second paragraph is an addition to the printed
specification.
class Test {
String s = doSomething();
Test() throws Exception { }
String doSomething() throws Exception {
return "throws a checked exception";
}
}
It is clear that the language changed sometime between the 1.1.8 and
the 1.3 releases. The change is that instance variable initializers are
behaving the same as instance initializers in this regard. They can throw a
checked exception as long as the exception is assignable to an excep-
tion class in the throws clause of every constructor in the class (which
again assumes that at least one constructor is declared).
As of this writing, I have not yet received a reply to this bug submission. It could
be that I am wrong, but it looks pretty straightforward to me.
While on the subject of instance variable initializers, there is a minor error in
the Second Edition of the JLS. The last paragraph in 11.2 Compile-Time Check-
ing of Exceptions read as follows.
Variable initializers for fields and static initializers must not result in a
checked exception; if one does, a compile-time error occurs.25
25. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First Edition,
§11.2, “Compile-Time Checking of Exceptions.” (Do not update.)
26. Gosling et al., §11.2, “Compile-Time Checking of Exceptions.”
27. See java.sun.com/docs/books/jls/clarifications-2-2nd-ed.html.
28. This is a somewhat complex term that is defined in an as yet unpublished volume of Java Rules.
The first active use of a class or interface type is defined by §5.5, “Initialization” in the JVMS. Note
that in response to Bug Id 4073950, “invocation of certain reflective methods” (in particular, the
Class.forName(String className) method) is now included in the definition of first active use.
Note that this says “instance of class Error or one of its subclasses.” As
stated above, even instances of RuntimeException are translated into an
ExceptionInInitializerError. For example,
class Test {
static { throwsRuntimeException(); }
public static void main(String[] args) {
System.out.println("Hello World!");
}
static void throwsRuntimeException() {
throw new RuntimeException();
}
}
29. Gosling et al., The Java Language Specification, §14.17, “The throw Statement.”
Notice that the original exception is included in the stack trace printed. This
was not the case before the Java 1.2 release. To determine the cause of the
problem in earlier releases, the getException() method in Exception-
InInitializerError had to be invoked. This change in the 1.2 release of
the ExceptionInInitializerError class became the model for print-
ing chained exceptions in the 1.4 release.
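A sketch of both behaviors (the class names are mine): the RuntimeException thrown during class initialization is translated into an ExceptionInInitializerError, and getException() retrieves the original.

```java
class Boom {
    static {
        if (true)                                     // conditional, so the block can
            throw new RuntimeException("the cause");  // (formally) complete normally
    }
}

class CauseDemo {
    static Throwable provoke() {
        try {
            new Boom();                   // first active use triggers <clinit>
            return null;
        } catch (ExceptionInInitializerError e) {
            return e.getException();      // the wrapped RuntimeException
        }
    }
    public static void main(String[] args) {
        System.out.println(provoke());
    }
}
```

In the 1.4 release and later, printing the error's stack trace shows the original exception as the chained cause; before 1.2, getException() was the only way to see it.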
1.5 Constants
Variables declared final are constants. The fields in an interface are implic-
itly final, which explains why they are usually referred to as interface con-
stants. After initialization, the value of a constant cannot be changed. If a
constant is used in a context that requires a variable, such as the left-hand operand
of an assignment expression (except for the initialization of a blank final
as discussed in 1.5.4 Blank Finals), a compiler error is generated. For example,
class Test {
    static final String s = "cannot be changed";
    public static void main(String[] args) {
        s = "compiler error";
    }
}
class Test {
    final int ZERO = 0; // instance variable
    public static void main(String[] args) {
        System.out.println(Subclass.ZERO);
    }
}
class Subclass extends Test {
    static final String ZERO = "zero"; // class variable
}
This program compiles and, when executed, prints zero. Hiding is discussed at
length in the next chapter.
The subject of constants is much more complex than simply declaring a field
or local variable final, however, and is thoroughly discussed in the following
subsections.
true
(short)(1*2*3*4*5*6)
Integer.MAX_VALUE / 2
2.0 * Math.PI
"The integer " + Long.MAX_VALUE + " is mighty big."30
Some of these examples are highly contrived. More often than not, compile-time
constant expressions are just literals.
The term constant expression has a very precise meaning in the Java programming
language, as explained in this section. It is not to be confused with constants
in general (variables declared final). Perhaps because the more
general term constant is so easily confused with constant expression, the JLS
prefers the more formal term compile-time constant expression.31 Using the
term constant to refer to compile-time constant expressions, as is done by a
number of software engineers and technical writers, is extremely confusing
because not all variables declared final (which is the general meaning of the
term constants) are compile-time constant expressions.
Understanding the definition of compile-time constant expressions in the
Java programming language is important for the following reasons (which I have
tried to list in their order of importance).32 While reading this list, it is important
to know that inlined constants (which are discussed in the next section) are
final variables initialized with the value of a compile-time constant
expression.
• The case labels in a switch statement must be constant expressions. This
is the most “in your face” use of compile-time constant expressions for the
average programmer. See 4.8.1.1 The switch Statement for a discus-
sion.
• Inner classes can declare constants, but only if they are inlined.
• References to inlined constants from other classes and interfaces do not
cause the class or interface in which they are declared to be loaded, linked,
and initialized. In other words, references to inlined constants do not constitute
the “first active use” of a class or interface.
• If the operands in a string concatenation operation are not compile-time con-
stant expressions, the string concatenation operation must be executed at
runtime. This is the definition of a computed string as discussed in 5.11
String Concatenation Operations in Volume 1.
• The general expectation in conditional compilation is that debugging code is
not written to class files if the debug flag is set to false. In order for
31. Though not used in 15.28 Constant Expressions, the more formal compile-time constant
expression is used everywhere else in the JLS.
32. I had been working on this list on my own for several years, but some additions were discovered
(and elaborated upon) while reading the unofficial “Java Spec Report” at www.ergnosis.com/
java-spec-report/java-language/jls-15.28-e.html as well as Bug Id 4396260 (includ-
ing some of the insightful comments). All such lists are built by searching for either “constant” or
“constant expression” in the JLS. However, you can take this too far. There seems to be a consen-
sus that the list as shown here is complete.
33. Gosling et al., The Java Language Specification, §15.25, “Conditional Operator ? :”
As you can see, this is an extensive list. This is why I say it is important for all
Java programmers to understand the definition of compile-time constant expres-
sions.
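One item from that list, string concatenation, can be demonstrated directly. A minimal sketch (the class name is mine):

```java
class ConcatDemo {
    public static void main(String[] args) {
        // Both operands are compile-time constant expressions, so the
        // concatenation is performed by the compiler and the result interned.
        String a = "Hello, " + "World";
        final String suffix = "World";     // a constant variable (inlined)
        String b = "Hello, " + suffix;     // still a constant expression
        String c = "Hello, " + args.length; // computed string: built at run time
        System.out.println(a == b);        // true: both refer to the interned literal
    }
}
```

Because both a and b are resolved at compile time to the same interned string, the identity comparison prints true; c, by contrast, is a computed string created at run time.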
Compile-time constant expressions are defined in terms of what they can
include, which is any of the following.
• Any literal other than the null literal or class literals.
• Simple name or TypeName.fieldName references to inlined constants.34
• Any unary, binary, or ternary operator other than the increment and decrement
operators (prefix or postfix) and the assignment operators (simple or compound).
The excluded operators are the only unary, binary, or ternary operators that require a
variable. There are only two operators that are not unary, binary, or ternary:
the instanceof and cast operators. The instanceof operator
cannot be used. Cast operators can be used, but only if the type name in
the cast operator is a primitive data type or String.
• Parentheses that enclose a compile-time constant expression as defined by
the previous bulleted items.
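The inclusion rules above can be illustrated with a short sketch (the class and field names are mine, not from the JLS). Every initializer below is a compile-time constant expression, so each field is an inlined constant:

```java
class ConstantExpressionExamples {
    static final int     CAST_OK = (int) 3.99f;           // cast to a primitive type
    static final String  TEXT    = (String) "literal";    // cast to String is allowed
    static final boolean FLAG    = 1 > 0 ? true : false;  // ternary operator
    static final int     PARENS  = (2 + 3) * 4;           // enclosing parentheses
    // instanceof may not appear in a constant expression, and a cast to any
    // reference type other than String disqualifies the expression.
}
```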
Constant expressions can also be defined in terms of what they do not include. A
constant expression cannot include any value that could possibly change after
compilation.
Here I am using the strict definition of variables, which does not include variables
declared final. (Is that a confusing sentence or what?)
The remainder of this section discusses a documentation problem that has
plagued compile-time constant expressions. Is null a compile-time constant
expression or not? The short answer is No, but there is some interesting history. For example,
class Test {
    static final Object NOT_INLINED = null;
    void test() {
        Object obj = NOT_INLINED;
    }
}
As you can see, null is clearly not inlined. That stands in stark contrast to Bug
Id 4083093, which is marked “Closed, fixed” and includes the following evaluation.
The spec needs to be corrected. As xxxxx@xxxxx says, javac has
always treated null as a compile time constant.
The full extent of this documentation problem can be seen in the following code
from the System class.
35. Unascribed, “Clarifications and Amendments to The Java Language Specification” available
online at java.sun.com/docs/books/jls/clarify.html, (Mountain View: Sun Microsys-
tems, 1995-2003), “null is a Compile time Constant.”
public final static InputStream in = nullInputStream();
public final static PrintStream out = nullPrintStream();
public final static PrintStream err = nullPrintStream();

/**
 * The following two methods exist because in, out, and err
 * must be initialized to null. The compiler, however, cannot
 * be permitted to inline access to them, since they are later
 * set to more sensible values by initializeSystemClass().
 */
private static InputStream nullInputStream() throws NullPointerException {
    if (currentTimeMillis() > 0)
        return null;
    throw new NullPointerException();
}

private static PrintStream nullPrintStream() throws NullPointerException {
    if (currentTimeMillis() > 0)
        return null;
    throw new NullPointerException();
}
If this strikes you as odd, the reason in part is that this code has been in place
since before blank finals were introduced to the language in the 1.1 release. The
problem I have with this code is that InputStream and PrintStream are
reference types other than String (the values of which are never inlined), but
this code does make me wonder if null was in fact inlined at some point in the
evolution of the Java platform. The explanation you are about to read as to why
null cannot be inlined would suggest that the answer is No, but then again this
is the System class. One assumes the responsible programmer is in the know.
One thing is for sure, however: you should not emulate this code. Just use
a blank final if the need arises.
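A sketch of that alternative, assuming a hypothetical Streams class: a blank final initialized in a static initialization block is never treated as a compile-time constant, so no dummy method is needed.

```java
import java.io.InputStream;

class Streams {
    // Blank final: declared without an initializer, so it can never be a
    // compile-time constant and is never inlined, even though the value
    // assigned happens to be null.
    static final InputStream IN;
    static {
        IN = null;  // assigned exactly once in a static initialization block
    }
}
```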
Apparently in dealing with this problem during the writing of the Second Edi-
tion of the JLS, the decision was made that null is not a compile-time constant
expression. The only explanation I can find for this is at the bottom of Bug Id
4475252, which includes the following comment.
null is not a constant expression because its value cannot be
expressed in the VM's constant pool.
class Test {
    final int i = 1 / 0;
}
Method Test()
   0 aload_0
   1 invokespecial #1 <Method java.lang.Object()>
   4 aload_0
   5 iconst_1
   6 iconst_0
   7 idiv
   8 putfield #2 <Field int i>
  11 return
The compiler is writing code that it knows for sure will throw an error, but there
is nothing unusual about that. The other thing they had to do was to change the
specification for inlined constants. The following was added to the “Clarifications
and Amendments to The Java Language Specification, Second Edition” docu-
ment.
15.28
One of the reasons I decided to include the 1/0 discussion at the bottom of this
section is to make a point about the JLS. Sometimes the quest for succinctness
comes at a very high cost. This, I believe, is a perfect example. Without
any further explanation in the specification, how many man hours will be lost by
programmers trying to figure out the significance of “that does not complete
abruptly” in the above specification? However many that may be, they could be
wiped away with a simple footnote. All that it need do is reference the primary
Bug Id 4178182. Am I the only one who questions the wisdom of these kinds of
trade-offs?
37. Unascribed, “Clarifications and Amendments to The Java Language Specification, Second Edi-
tion” available online at java.sun.com/docs/books/jls/clarifications-2-2nd-
ed.html, (Mountain View: Sun Microsystems, 1995-2003), “15.28.”
class Test {
    static final String INLINED = "value";
    static final String NOT_INLINED = getString();
    public static void main(String[] args) {
        String s = INLINED;
        s = NOT_INLINED;
    }
    static String getString() {
        return "value";
    }
}
As you can see, there is no reference to the INLINED constant. Instead, an ldc
(load constant) machine instruction is used to push the value of a string constant
onto the operand stack; that value is subsequently stored in the local variable array.
The equivalent operation for the NOT_INLINED field requires a getstatic
machine instruction. I refer to fields such as INLINED as inlined constants.
The specification for inlined constants is somewhat buried in 13.1 The Form of a
Binary in the JLS:
References to fields that are final and initialized with compile-time
constant expressions are resolved at compile time to the constant
value that is denoted. No reference to such a constant field should be
present in the code in a binary file (except in the class or interface con-
taining the constant field, which will have code to initialize it)… 38
The only concrete explanation the JLS offers for inlined constants involves switch
statements. It is imperative for the machine instructions used to implement a
switch statement that no two case constants are the same. The compiler
must therefore check for duplicate case constants. Those constant expressions
are then inlined as a means of making sure they cannot possibly change. That is
not the only reason for inlined constants, however. The inlining of String type
constants is an important part of the intern mechanism discussed in 5.11.3 The
Intern Mechanism in Volume 1.
More generally, inlined constants in and of themselves make the Java programming
language much faster. In fact, it can rightly be said that inlined constants
are one of the most important performance optimizations in the
whole of Java technology. They eliminate an untold number of getfield and
getstatic machine instructions. An extreme example of this is a group of
related constants (all of which are inlined) declared either as a class or interface.
Such a class or interface is never loaded into a JVM. It is only used by the com-
piler. For example,
class Test {
    public static void main(String[] args) {
        System.out.println(InlinedConstants.s);
    }
}
interface InlinedConstants {
    String s = "testing, testing, testing";
}
If this program is compiled and executed using the -verbose option (which
lists the names of all the class and interface types loaded), the name of the
InlinedConstants interface cannot be found anywhere in the output.
38. Gosling et al., The Java Language Specification, §13.1, “The Form of a Binary.” Note that
§13.4.8, “final Fields and Constants” is often incorrectly cited as the source of this specification.
That section only discusses the problems of binary compatibility that arise as the result of this specification
for inlining constants (what this chapter refers to as the problem of changeable inlined
constants). In fact, as of the Second Edition, §13.4.8, “final Fields and Constants” still incorrectly
suggests that inlined constants must be declared static. That has never been the case. As far
back as the 1.0 release, final instance variables initialized with a compile-time constant expression have
been inlined.
/*
* Note that this class imports sun.tools.java.RuntimeConstants and
* references many final static primitive fields of that interface.
* By JLS section 13.4.8, the compiler should inline all of these
* references, so their presence should not require the loading of
* RuntimeConstants at runtime when ProxyGenerator is linked. This
* non-requirement is important because ProxyGenerator is intended
* to be bundled with the JRE, but classes in the sun.tools
* hierarchy, such as RuntimeConstants, are not.
*
* The Java compiler does add a CONSTANT_Class entry in the constant
* pool of this class for "sun/tools/java/RuntimeConstants". The
* evaluation of bugid 4162387 seems to imply that this is for the
* compiler's implementation of the "-Xdepend" option. This
* CONSTANT_Class entry may, however, confuse tools which use such
* entries to compute runtime class dependencies or virtual machine
* implementations which use them to effect eager class resolution.
*/39
interface Test {
    long TIME_LOADED = System.currentTimeMillis();
}
The TIME_LOADED field is not an inlined constant because of the method invocation
expression in the variable initializer. The implication is that a JVM implementation
has per-interface tables in which the values of interface constants are
stored.
Changing the value of an inlined constant breaks compatibility with pre-exist-
ing binaries.40 This is sometimes referred to as the problem of inconstant
constants, and is discussed in the following subsection.
As with compile-time constant expressions, inlined constants suffer from a
number of documentation problems, the most significant of which is what to call
them. The term inlined constant is mine. In the original JLS, these were called
primitive constants:
We call a field that is static, final, and initialized with a compile-
time constant expression a primitive constant.41
Bracha understandably deleted this sentence from the Second Edition. I say
“understandably” because the term primitive constant suggests only a primi-
tive data type declared final. In other places, where the term primitive con-
stant was used in the First Edition, it was replaced by compile-time constant
or compile-time constant field.42 There is no official explanation for this
change, but the following sentence from the evaluation of Bug Id 4015781
makes it clear that primitive constant fell out of favour.
JLS 13.4.8 has been amended. The distinction between primitive con-
stants and other constants has been abandoned.43
The term should have been replaced (by something more meaningful than compile-time
constant or compile-time constant field, both of which are practically
indistinguishable from compile-time constant expressions), however,
rather than abandoning a meaningful distinction between constants that are
40. I really do not like the term pre-existing binaries. What is the difference between an existing
binary and a pre-existing binary? You can actually find pre-exist in a couple of dictionaries, but in
reference to a change in a Java program (which we assume to be happening in the moment), all
binaries are pre-existing (except perhaps a helper class). Note also that binary (or binaries) is
just a fancy term for another class file. So to say that a “change is not compatible with pre-existing
binaries” means that the change you are contemplating or have just made is not compatible with
existing class files. Generally speaking this means that a LinkageError will be thrown, but the
definition of incompatible is sometimes extended to include other changes such as changing the
value of an inlined constant.
41. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First Edition,
§13.4.8, “final Fields and Constants.” (Do not update.)
42. For example, see §13.4.8, “final Fields and Constants” in the Second Edition.
43. Evaluation of Bug Id 4015781.
This change is encouraging because it acknowledges the need for a term, but
the choice of constant variable is as unfortunate as primitive constant. Why
not call a thing what it is? In this case, that would be an inlined constant.
In the original JLS, the following statements from §15.27, “Constant Expres-
sion” and §13.1, “The Form of a Java Binary” were inconsistent.
A compile-time constant expression is an expression denoting a value
of primitive type or a String that is composed using only the follow-
ing:
The first quote says final and the second one says static final. There is
a discussion of this in Bug Id 4262182. Inlined constants do not have to be
declared static, and the Second Edition fixed 13.1 The Form of a Java Binary
by dropping the static keyword. It did not, however, make the same change
in 13.4.8 final Fields and Constants.
The best way to avoid problems with “inconstant constants” in widely-
distributed code is to declare as compile time constants only values
44. Unascribed, “Clarifications and Amendments to The Java Language Specification, Second Edi-
tion” available online at java.sun.com/docs/books/jls/clarifications-2-2nd-
ed.html, (Mountain View: Sun Microsystems, 1995-2003), “JLS 4.5.4.”
45. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First Edition,
§15.28, “Constant Expression.” (Do not update.)
46. Ibid., “The Form of a Java Binary.” (Do not update.)
which truly are unlikely ever to change. …Other than for true mathematical
constants, we recommend that source code make very sparing
use of class variables that are declared static and final.47
For example,
class InlinedConstant {
    static final int x = 2;
}
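To make the scenario concrete, here is a sketch of a client compiled against this constant (the InlinedConstant declaration is repeated so the example is self-contained):

```java
// Repeats the declaration above so the sketch is self-contained.
class InlinedConstant {
    static final int x = 2;
}

class Client {
    public static void main(String[] args) {
        // The compiler replaces InlinedConstant.x with the literal 2 here.
        // If x is changed to 3 and only InlinedConstant.java is recompiled,
        // this still prints 2 until Client.java is also recompiled.
        System.out.println(InlinedConstant.x);
    }
}
```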
Existing binaries (read other class files) do not see the change in an
inlined constant until recompiled.
That is the whole problem in a nutshell. What is the solution? The JLS suggests:
The best way to avoid problems with “inconstant constants” in widely-
distributed code is to declare as compile time constants only values
which truly are unlikely ever to change. Many compile time constants in
interfaces are small integer values replacing enumerated types, which
the language does not support; these small values can be chosen arbitrarily,
and should not need to be changed. Other than for true mathematical
constants, we recommend that source code make very sparing
use of class variables that are declared static and final.49
The first very important point to add to this suggestion is that the problem of
changeable inlined constants does not apply to reference types other than
String. Note also that inlined constants should be class variables because it
makes no sense to have copies of a variable that is truly constant in every
instance of a class. Nevertheless, this advice applies equally to instance vari-
ables (or non-static fields). In other words, the last sentence of this quote
should read:
Other than for true mathematical constants, we recommend that Java
code make very sparing use of primitive data type or String type
fields declared final.
In declaring such a field final you expose yourself to the problem of change-
able inlined constants.
As stated above, the hard part is remembering to ask yourself this question.
Be especially alert when declaring an interface, because the fields in an
interface are implicitly final; that makes interfaces especially susceptible
to the problem of changeable inlined constants.
When are such fields likely never to change? Asking this question is the same
as asking when it is okay to use inlined constants. The following list is doubtless
incomplete, but covers most uses of inlined constants of which I am aware:
• True mathematical constants
• Bit masks
• Mnemonics
These are from the java.util.TimeZone class. There is a fine line between
convenience constants and mnemonics. The easiest way to distinguish between
the two is that mnemonics are not enumerated constants. By that I mean there
is usually only one of them. For example, consider the following int type constants
from an old version of a package-private class in the java.net package:
The use of inlined constants for true mathematical constants, bit masks, and
mnemonics will always be a part of the Java programming language. The same
is true of using strings (as property names) to access a Properties object
because of the following specification.
Because Properties inherits from Hashtable, the put and
putAll methods can be applied to a Properties object. Their use
is strongly discouraged as they allow the caller to insert entries whose
keys or values are not Strings. The setProperty method should
be used instead. If the store or save method is called on a “compro-
mised” Properties object that contains a non-String key or value, the
call will fail.50
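A minimal sketch of the recommended usage, with a String inlined constant serving as the property name (the key name is hypothetical):

```java
import java.util.Properties;

class PropertiesDemo {
    // A String inlined constant used as a property name (key).
    static final String TIMEOUT_KEY = "connection.timeout";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(TIMEOUT_KEY, "30");  // preferred: keys and values are Strings
        // props.put(TIMEOUT_KEY, Integer.valueOf(30)); // discouraged: non-String value
        System.out.println(props.getProperty(TIMEOUT_KEY));
    }
}
```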
The use of inlined constants (both int and String type) as keys for hash
tables other than Properties objects is actually just one of the many uses
of convenience constants. Convenience constants were originally suggested as
a replacement for “enums” (enumerated constants declared using the enum
keyword in the C and C++ programming languages) by no less than Dr. Gosling.
They have fallen out of favour, however, and are gradually being replaced by the
typesafe enum pattern discussed in 1.5.5 Enumerated Types.
What about primitive data types or String type fields that are not true
mathematical constants, bit masks, mnemonics, or property names? Is there a
hard and fast rule that such fields should not be declared final? Generally
speaking the answer is Yes. There are always exceptions, however. Consider
the following system-wide values in the java.io.File class.
Note that these fields fail to heed the advice of the JLS to make very sparing
use of final fields. They are not inlined constants (which would be disastrous),
however, because all four variable initializers include either a variable or a
method invocation expression. None of these values are inlined. Unless you were going to use a variable,
method invocation expression, or class instance creation expression to initialize
the field anyway, a blank final is more efficient than any of these concoctions.
The blank final can be initialized in an initialization block that immediately follows
the declaration. Blank finals are never inlined.
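The pattern just described can be sketched as follows (the Config class and field are hypothetical):

```java
class Config {
    // TIME_LOADED cannot be an inlined constant: it is a blank final
    // assigned in the initialization block that immediately follows
    // the declaration, not in a variable initializer.
    static final long TIME_LOADED;
    static {
        TIME_LOADED = System.currentTimeMillis();
    }
}
```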
System.getStdIn().println("Hello World!");
This specification was always a problem because the raison d'être for initialization
blocks (complex initialization logic and invoking methods that throw checked
51. I normally use a fixed font for keywords, but make an exception for the term blank final. The
problem is that I never pluralize words in fixed font (such as type names), and more often than not
the term blank final is used in the plural.
52. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First Edition,
§8.3.1.2, “final Fields.” (Do not update.)
Assignments to blank finals are still much more restrictive than they are for non-
final fields. For example, this quote says “calculated by a loop” (i.e., com-
plex initialization logic), not assigned in a loop. Attempting to assign a value to a
blank final anywhere in a loop generates a compiler error. (There is an example
of this below.) Moreover, fields that are blank finals must be definitely
assigned somewhere in the special initialization method that initializes
the corresponding class or object. For static fields, that means the defi-
nite assignment must occur somewhere in a static initialization block. For
non-static fields, the definite assignment must occur either in an instance ini-
tialization block or every constructor. A compiler error is generated if one or
more of the constructors do not assign a value to the blank final. For example,
class Test {
    final String s;
    Test() { }                      // compiler error: s is not definitely assigned
    Test(String s) { this.s = s; }
}
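One way to repair the example is to have the no-arg constructor delegate to the other, so that every constructor definitely assigns the blank final (a sketch; the default value is mine):

```java
class Test {
    final String s;  // blank final
    Test() { this("default"); }     // delegates, so s is definitely assigned
    Test(String s) { this.s = s; }  // every constructor path assigns s
}
```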
53. Ken Arnold and James Gosling, “Appendix D, Changes for 1.1” (excerpted from the fourth printing
of The Java Programming Language (Boston: Addison-Wesley Professional, 1996)), D.1.2,
“New Uses for final.”
Blank finals cannot be initialized inside loops for the same reason. For example,
That this is the same tentative language used in definite assignment compiler
errors is no coincidence. The rules of definite assignment were extended in the
1.1 release to include fields that are blank finals as well as local variables. Thus
Chapter 16, “Definite Assignment” in the JLS begins as follows.
Each local variable and every blank final field must have a defi-
nitely assigned value when any access of its value occurs. A Java
compiler must carry out a specific conservative flow analysis to make
sure that, for every access of a local variable or blank final field f, f
is definitely assigned before the access; otherwise a compile-time error
must occur.55
54. The term double-assignment is from the “Compatibility with Previous Releases” document
cited above. The examples of double-assignment in this section are based on examples in that doc-
ument.
class Test {
    void print() {
        final char EXCLAMATION_MARK;
        class LocalClass {
            String s = "Hello World" + EXCLAMATION_MARK;
        }
        EXCLAMATION_MARK = '!';
        System.out.println(new LocalClass().s);
    }
}
There are other special rules for initializing blank finals. These are just some of
the more obvious ones.
NOTE 1.2
If some programmers are willing to sacrifice everything at the altar of
performance (including readability), the bitwise operators are the gods
to which they pray. This religious order in which every bit is cherished
has existed undisturbed since the dawn of computer time. Enter the
typesafe enum pattern. If taken to the extreme (read “full-featured abstractions,”56
far beyond mere compile-time type checking), the typesafe
enum pattern is a vain attempt to preempt a world in which
the thought of exchanging a bit mask for an object is nothing
less than ludicrous. This is harsh language, I realize, but the point
must be made that this cannot be an either-or choice. The bit worshipers
must realize that the opportunity to eliminate an entire class of runtime
exceptions is an efficiency of a different sort; and the typesafe
enum pattern enthusiasts must realize that they are standing on holy
ground. Is a marriage of these two worlds possible? Read on.
56. Joshua Bloch, Effective Java Programming Language Guide, (Boston: Addison-Wesley,
2001), “Item 21: Replace enum constructs with classes.”
57. This is sometimes referred to as a closed set or a bounded domain by the mathematically liter-
ate.
Java has no enum types. You can obtain something similar to enum by
declaring a class whose only raison d'etre is to hold constants. You
could use this feature something like this:
You can now refer to, say, the South constant using the notation
Direction.South.
Using classes to contain constants in this way provides a major advan-
tage over C's enum types. In C (and C++), names defined in enums
must be unique: if you have an enum called HotColors containing
names Red and Yellow, you can't use those names in any other
enum. You couldn't, for instance, define another Enum [sic] called
TrafficLightColors also containing Red and Yellow.
Using the class-to-contain-constants technique in Java, you can use the
same names in different classes, because those names are qualified by
the name of the containing class. From our example just above, you
might wish to create another class called CompassRose:
Thus began one of the longest running design discussions in the whole of the
Java programming language. This focus on the namespace problems in C and
C++ was something of a ruse. In a more casual setting, Dr. Gosling later admit-
ted that he more or less ran out of time in trying to “converge on a design that
made sense” as a replacement for enum in an object-oriented programming lan-
guage:
Enumerations were left out of the Java spec not because I think they're
a bad idea, but because I couldn't converge on a design that made
sense. Enumerations means different things to different people.
There was more than enough grayness in the area that I decided to put
the issue aside for the time being. 59
In fact, enum was apparently a reserved word while the 1.0 release of the JDK
was still in beta testing.
The kind of enumerated constants advocated in this white paper are used
extensively in pre-1.4 releases of the core API, and have come to be known as
convenience constants (or less commonly as the int enum pattern). The fol-
lowing example of date and time fields is from the Calendar class.
58. James Gosling and Henry McGilton, “The Java Programming Language Environment: A White
Paper” (Mountain View: Sun Microsystems, 1996), java.sun.com/docs/white/langenv/
index.html.
59. Gosling, Letters to the Editor, Java World, June 1998, www.javaworld.com/javaworld/
jw-06-1998/jw-06-letters.html.
As you can see, convenience constants are usually public static final
int declarations, but String is sometimes used as the data type. The follow-
ing example of the less common String type convenience constants is from
the BorderLayout class in the java.awt package.
Most optimizing JVMs will inline this method. It is invoked from several different
public methods so that the switch statement doesn’t clutter those methods.
The use of a typesafe enum obviates the need for such argument checks.
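A minimal sketch of the typesafe enum pattern in the pre-1.5 idiom (the Direction class and its constants are hypothetical, not from the core API):

```java
// The private constructor guarantees the only instances that ever exist
// are the ones declared here, so the set of values is closed.
final class Direction {
    private final String name;
    private Direction(String name) { this.name = name; }
    public String toString() { return name; }

    public static final Direction NORTH = new Direction("NORTH");
    public static final Direction SOUTH = new Direction("SOUTH");
}
```

A method declared as void move(Direction d) simply cannot be passed an out-of-range int, so the run-time argument check disappears; the compiler does the checking.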
The other problems with convenience constants pale in comparison to throw-
ing runtime exceptions versus compile-time type checking:
• Client programmers sometimes ignore convenience constants and use an
int or String literal instead. That is a problem because it introduces the
possibility that an incorrect value is passed. The eternal debate over
whether Calendar.JANUARY should be 0 or 1 is a classic example of
this. This problem is exacerbated when using String type convenience
constants because misspellings are not caught by the compiler. It is even
60. Bloch, Effective Java , “Item 21: Replace enum constructs with classes.”
61. As far as I know, this was the first serious break with convenience constants in the core API. It
was a bold initiative, and the responsible programmer(s) should be commended for setting an exam-
ple.
This is the central complaint in Bug Id 4401321, which is the RFE to sup-
port enum in the Java programming language, and it is as true today as it
was on January 2, 2001 when the RFE was submitted. Bloch refers to this as a
“minor disadvantage”63 but I am not sure everyone would agree. What we are
talking about here is coding styles. The suggestion that nested if-then-else
statements should be used instead of a switch statement is sacrilege for the
bit worshipers. I am only partly kidding because Bloch really did cross the line
when suggesting that bitsets should be exchanged for a collection of objects in
the following quote.
The typesafe enum pattern has few disadvantages when compared to
the int pattern. Perhaps the only serious disadvantage is that it is
more awkward to aggregate typesafe enum constants into sets. With
int-based enums, this is traditionally done by choosing enumeration
constant values, each of which is a distinctive positive power of two,
and representing a set as the bitwise OR of the relevant constants:
62. The typesafe enum pattern is an example of the more general Flyweight design pattern in
Design Patterns .
63. Bloch, Effective Java, “Item 21: Replace enum constructs with classes.”
His real faux pas, however, is obsessing over the idea of evolving convenience
constants into “full-featured abstractions.”65 My question is: why? The raison
d'etre for using the typesafe enum pattern is compile-time type checking. The
idea that every enumerated type should evolve into a “full-featured abstraction”
overlooks the common case of a discrete set of integers that have nothing
whatsoever to do with behavior. The Level class is a perfect example. The
enumerated constants in the Level class are primarily used in the following if
statement to determine if a log record should be published.
64. Bloch, Effective Java , “Item 21: Replace enum constructs with classes.” I lavished so much
praise on Bloch in Volume 1 as to embarrass myself, so I trust I can be allowed the occasional con-
structive criticism for the general good. There can be no doubt, however, that he has achieved the
same status in the pantheon of Java gods as Dr. Gosling, Guy Steele, Doug Lea, and a great many
others. Besides that, I made essentially the same mistake in Volume 1 by suggesting that there
should be a HashCode class to facilitate the computation of hash codes. In both cases, there is a
profound disregard for the importance of bitwise programming (which is something I did not fully
develop until writing 4.7 A Bitwise Primer).
65. Ibid.
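A sketch of that check, reconstructed from the 1.4 java.util.logging.Logger source (the levelValue and offValue field names appear in that source, and again in the benchmark later in this section); treat it as an illustration rather than the exact listing:

```java
import java.util.logging.Level;
import java.util.logging.LogRecord;

// Sketch of the "cheap comparison" that decides whether a log
// record is published: two int compares, no object creation.
class LevelCheckSketch {
    private volatile int levelValue = Level.INFO.intValue(); // effective level
    private static final int offValue = Level.OFF.intValue();

    boolean isLoggable(LogRecord record) {
        if (record.getLevel().intValue() < levelValue || levelValue == offValue) {
            return false; // drop the record without publishing it
        }
        return true;
    }
}
```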
Level is never going to evolve into a “full-featured abstraction.” Levels are a dis-
crete set of integers and nothing else.
What can be done to make the typesafe enum pattern usable in switch
statements? Interestingly, Bloch hints at that answer in the very next sentence of
the passage quoted above:
…Such a set is best implemented in the same package as the element
type to allow access, via a package-private field or method, to a bit
value internally associated with each typesafe enum constant.66
66. Ibid. Note that this explains why none of the examples in Effective Java include a
getValue() or intValue() method that returns an integer value for the enumerated con-
stant.
The typesafe enum redirect is always available should you want to use it as
a hash table key or for some other reason. The point is that by providing an
int value that can be used in bitwise programming you offer client pro-
grammers the greatest flexibility in deciding how to use the typesafe
enum. Dyed-in-the-wool bitwise programmers will most likely choose to com-
pletely ignore the typesafe enum after invoking redirect.intValue(),
and they should have that option.
As further support for bitwise programming, enumerated constants that
are part of a public interface should be initialized with a power of two.
There are 32 powers of two (or bit masks) in an int, which is more than the
number of enumerated constants in most enumerated types. If more enumer-
ated constants are needed, a long can be used.
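As a sketch of these two recommendations (the class name and constants here are hypothetical, not from the book), a typesafe enum whose int values are powers of two supports both switch statements and bitwise aggregation:

```java
// Hypothetical typesafe enum: each constant is backed by a power of two.
public class TypesafeDirection {
    public static final int NORTH_VALUE = 1 << 0;
    public static final int SOUTH_VALUE = 1 << 1;
    public static final int EAST_VALUE  = 1 << 2;
    public static final int WEST_VALUE  = 1 << 3;

    public static final TypesafeDirection NORTH = new TypesafeDirection(NORTH_VALUE);
    public static final TypesafeDirection SOUTH = new TypesafeDirection(SOUTH_VALUE);
    public static final TypesafeDirection EAST  = new TypesafeDirection(EAST_VALUE);
    public static final TypesafeDirection WEST  = new TypesafeDirection(WEST_VALUE);

    private final int value;
    private TypesafeDirection(int value) { this.value = value; }
    public int intValue() { return value; }

    public static void main(String[] args) {
        TypesafeDirection d = EAST;
        switch (d.intValue()) {                // switch works via the int value
            case NORTH_VALUE: System.out.println("north"); break;
            case EAST_VALUE:  System.out.println("east");  break;
            default:          System.out.println("other");
        }
        int set = NORTH_VALUE | EAST_VALUE;    // bitset aggregation
        System.out.println((set & d.intValue()) != 0); // prints true
    }
}
```

Because the case labels are compile-time constant int expressions, the switch statement is legal; client programmers who prefer bitsets can ignore the enum objects entirely after calling intValue().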
The interface contract for enumerated types should specifically state that
the int value of an enumerated constant will never change. Once the
immutable class modifier is implemented (as I am utterly convinced will hap-
pen for a number of reasons), the language will guarantee this.
The typesafe enum pattern reduces but does not entirely eliminate the possi-
bility of a runtime exception being thrown. Compile-time type checking assures
only that one of the enumerated constants can be passed, but there is now the
possibility that a null reference is passed. The example above uses an explicit
argument check. Other methods may use a system-induced argument check
that throws a NullPointerException. I would argue that passing null is
an IllegalArgumentException, but other programmers may feel just as
strongly that NullPointerException should always be thrown. This must
be regarded as a matter of style because both are runtime exceptions.
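A minimal sketch of such an explicit argument check for a typesafe enum parameter (the class and method names are hypothetical):

```java
import java.util.logging.Level;

// Hypothetical setter: compile-time type checking guarantees a Level,
// but an explicit check is still needed for null.
class LoggerSketch {
    private Level level = Level.INFO;

    void setLevel(Level newLevel) {
        if (newLevel == null)                  // explicit argument check
            throw new IllegalArgumentException("newLevel is null");
        this.level = newLevel;                 // only enum constants get here
    }
}
```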
I arrived at this implementation of the typesafe enum pattern by reasoning
that typesafe enums should support the use of switch statements (or more
This is precisely the point that Vladimir Roubtsov makes in a recent contribution
to the enumerated type design discussion.68 I could not agree with him more,
67. As of this writing RFE 4403347 has only nine votes (one of which is mine). I think this is mislead-
ing, however. I bet that a significant number of the 466 votes (as of this writing) for RFE 4401321
would be changed to 4403347 were there not 100+ pages of comments (making it easy to forget
which RFE is under discussion) at the bottom of the former. In fact, there are so many proposals for
an Enum or EnumeratedType baseclass in those comments that RFE 4403347 actually refer-
ences it as a source of examples.
68. Vladimir Roubtsov, “Java Tip 122: Beware of Java typesafe enumerations,” (New York, Java-
World, 2002), www.javaworld.com/javaworld/javatips/jw-javatip122.html
Nor is Level ever going to evolve into a “full-featured abstraction.” Levels are a
discrete set of integers and nothing else. I have deliberately repeated this point
because it is central to this entire section.
The real reason for not creating an EnumeratedType baseclass in the
java.util package is that it will be subject to the worst kind of abuse, per-
petuating seriously outdated design ideas that evolved in procedural program-
ming languages, and that have no place in an object-oriented programming
71. The term polymorphism is normally used in this context. However, I define polymorphism much
more narrowly than the average programmer. See 5.4 Substitution is a Higher Concept than Poly-
morphism.
class Test {
private int levelValue =
java.util.logging.Level.SEVERE.intValue();
private static final int offValue =
java.util.logging.Level.OFF.intValue();
private int level = Level.Constants.SEVERE;
This example involves a test to see if trace information should be logged. It sim-
ulates a logger that is set to Level.SEVERE. Thus the test fails every time.
This is very common in production because trace information is typically not pub-
lished unless there is a problem. This “cheap comparison” (as it is described in
the following quote) is coded exactly as it is in all of the logging methods in Table
6.8 Logging Methods on page 976. It is described as follows in the API docs.
The APIs are structured so that calls on the Logger APIs can be cheap
when logging is disabled. If logging is disabled for a given log level,
then the Logger can make a cheap comparison test and return.73
These test results show that a greater appreciation for bitwise program-
ming could have significantly reduced the cost of this comparison. This
test actually favors the existing implementation in that the second Boolean
72. Steve Wilson and Jeff Kesselman, Java Platform Performance: Strategies and Tactics,
(Reading: Addison-Wesley, 2000), Listing 3-2, “Reusable stopwatch class.”
73. Unascribed, “Java Logging Overview” in the API docs for the 1.4.0 release, (Mountain View: Sun
Microsystems, 2002), §1.1, “Overview of Control Flow.”
I quote this specification to show that the raison d'être for declaring local vari-
ables and parameters final is to make them “available to inner classes.”
Actually, only block classes (local and anonymous classes) can use local
variables and parameters declared final.
Elsewhere the Inner Classes Specification says that the copies of local vari-
ables and parameters that are passed to a block class constructor “never con-
tain inconsistent values”76 because they are declared final. That is the extent to
which the Inner Classes Specification explains why local variables and param-
eters used in block classes must be declared final, and the JLS is silent on
the subject. The remainder of this section will explain this language design deci-
sion.
There are no “potential synchronization problems” because local variables
and parameters are allocated in the local variable array on the stack, which is
not shared memory. “Shared access” is therefore an impossibility. This lan-
guage design decision is intended entirely to correct the mistaken perception
that a local variable or parameter (that exists only in a local variable array in a
single frame on the stack) could actually be shared between classes. For exam-
ple,
class Test { // NB: javac actually rejects this example because x is not
             // declared final; it is shown only to illustrate the copy semantics
public static void main(String[] args) {
int x = 0;
class LocalClass {
void print() {
System.out.println(x);
}
}
LocalClass local = new LocalClass(); //value of x passed here
local.print();
x = 10;
local.print();
}
}
0
0
The value of x is passed to the local class constructor when new Local-
Class() is evaluated. All classes, including local and anonymous classes, are
top-level package members after compilation. The transformed code looks like
this:
class Test {
public static void main(String[] args) {
int x = 0;
Test$1$LocalClass local = new Test$1$LocalClass(x);
local.print();
x = 10;
local.print();
}
}
class Test$1$LocalClass {
private int val$local;
Test$1$LocalClass(int val$local) {
this.val$local = val$local;
}
void print() {
System.out.println(val$local);
}
}
Now you can clearly see why the value 10 is not printed. The decision to require
that local variables and parameters used in a block class be declared final is
not based on “potential synchronization problems.”77 It is intended entirely to
correct the mistaken perception that changing the value of a local vari-
able or parameter after a block class has been instantiated could some-
how change the corresponding value in the block class. I really do not want
to be seen as questioning this language design decision. They really had no
choice because the mechanism for passing the value of a local variable or
class Test {
public static void main(String[] args) {
final int[] x = {0};
class LocalClass {
void print() {
System.out.println(x[0]);
}
}
LocalClass local = new LocalClass(); //value of x passed here
local.print();
x[0] = 10;
local.print();
}
}
0
10
I would note, however, that there really are “potential synchronization prob-
lems”79 in shared access to a one-element array.
Beyond their use in block classes, declaring local variables and method or
constructor parameters final is useful in documenting that the value never
changes.
NOTE 1.1
Patterns are good programming practices. An antipattern is a bad one.
Inasmuch as Joshua Bloch both defined and solved the problem of the
Constant Interface antipattern (his term), it would be reckless not to
attribute the following section to him. It is merely a restatement of his
“Item 17: Use interfaces only to define types” in Effective Java.80
Importing a class does not make it possible to use the simple names
of static members, but implementing an interface does. This is a
little confusing at first.
class. The static members in that class must still be qualified with the class
name. Thus programmers have evidenced a strong preference for declaring
constants in interfaces rather than in classes. (The opponents of the new
import static facility in the 1.5 release should remember this.)
This was the status quo before Bloch pointed out that public interfaces are
part of the API design, and should not be used merely to declare constants. For
example, what does it mean for a class to implement javax.swing.Swing-
Constants? Is a JProgressBar a SwingConstant in the polymorphic sense?
80. Bloch, Effective Java, “Item 17: Use interfaces only to define types.”
The opponents of the import static facility would argue that using
Math.PI is much clearer than using just PI and does not run the risk of name
conflicts similar to those that occur when using type-import-on-demand declara-
tion. I do not necessarily disagree with them. This facility must be used with
care, especially when invoking static methods. See 2.6.5 import static
in the Second Edition of Volume 1 for a discussion.
One thing is clear, however. The days of using interfaces to
declare constants (unrelated to a specific type) are over. Examples such as
SwingConstants in the core API are no less an anachronism than similar
uses outside of Sun. As Bloch says, such interfaces in the core API are “anoma-
lies and should not be emulated.”81 This practice is described as an antipattern
because the fact that a class uses constants such as those declared in
SwingConstants is an implementation detail that should not be exposed. In
other words, using interfaces to declare constants breaks the encapsulation of
any class that implements the interface.
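The contrast can be sketched along the lines of Bloch's own Item 17 example (reproduced here from memory and abridged): a constant interface versus a noninstantiable utility class.

```java
// The antipattern: an interface used only to export constants. Any class
// that implements it leaks this implementation detail into its public type.
interface PhysicalConstants {
    double AVOGADROS_NUMBER = 6.02214199e23; // implicitly public static final
}

// The alternative Bloch recommends: a noninstantiable utility class.
final class PhysicalConstants2 {
    private PhysicalConstants2() {} // suppress instantiation
    public static final double AVOGADROS_NUMBER = 6.02214199e23;
}
```

With the utility class, clients write PhysicalConstants2.AVOGADROS_NUMBER and no type hierarchy is polluted.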
1.6 Methods
A method declaration consists of a method header and method body. A
method body is one of the following:
• A semicolon indicating either (1) that an abstract method is not imple-
mented, or (2) that the body of a native method is omitted
• A pair of empty braces indicating a void method that returns without doing
anything
81. Ibid.
b Method modifiers The method modifiers are the access modifiers (public,
protected, and private), abstract, static,
final, synchronized, native, and, as of the 1.3
release, strictfp.
c Result type The result type is either void (for methods that do not return
a value) or the type of the method invocation expression.
Notice this says result type. The term return type is a
bastardization of return statement and result type. There
is no such thing as a “return type,” because of the method
return conversion context discussed in 5.7.3 Method Return
Conversion Context.
Putting the abstract modifier in front of the access modifier is a very reason-
able exception to the general rule that the access modifiers always come first
because there can be at most two modifiers in such a declaration.
Subclasses cannot declare methods with the same signature as final
superclass methods. Consequently, attempting to hide or override a final
method is a compiler error. The practical effect of declaring a method
private is to make it final because private methods are not inherited.
Likewise, all of the methods in a final class are implicitly final because the
class cannot be extended. Declaring such methods final is redundant, but
does not generate a compiler error.
82. Gosling et al., The Java Language Specification, §8.4.3, “Method Modifiers.”
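These rules can be seen in a small sketch (all names are hypothetical; the commented-out declaration is the one the compiler rejects):

```java
// Sketch of the final/private rules described above.
class Base {
    final void frozen() {}
    private void secret() {}   // not inherited, so effectively final
}
class Derived extends Base {
    // void frozen() {}        // compiler error: cannot override final frozen()
    private void secret() {}   // legal: a new method, not an override
}
```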
package com.javarules.examples;
public abstract class Superclass {
abstract void defaultAccess();
}
83. Before the 1.2 release, many of these illegal combinations of abstract and other field mod-
ifiers would compile. See Bug Id 1266571.
import com.javarules.examples.*;
public class Test extends Superclass {
void defaultAccess() {}
}
The compiler must allow such declarations because the superclass can be
extended in the same package. Outside of that package, however, default
access (which means the method is not inherited) and the abstract modifier
(which requires that the method be implemented) are as contradictory as any of
the other illegal combinations of method modifiers listed above.
That leaves only public abstract, which is simply a way of declaring a
behavior that subclasses must implement before they can be instantiated, and
protected abstract. Subclasses inherit interface and implementation. If
methods declared abstract are not implemented, then subclasses can only
inherit interface from them. Given the definition of the protected access mod-
ifier in 2.8.1 The protected Access Modifier (as “implementation inherit-
ance”), a protected abstract method may seem to be an inherent
contradiction. Indeed, such methods have a very special purpose in object-ori-
ented programming. A protected instance method that is declared
abstract, has an empty method body, or a default implementation that must
be overridden by at least some subclasses is described as a superclass imple-
mentation hook in this book. These are very special methods discussed in 3.9
Designing Extensible Classes. Thus abstract methods have only two access levels, public and protected.
Unlike interface constants, the emphasized text was not removed in the Second
Edition. Apparently, no one thinks method modifiers should be used when declar-
ing methods in an interface.
The difference between an abstract method and an empty method imple-
mentation is subtle. Syntactically one is represented by a semicolon and the other
by empty braces. As stated above, an abstract method defines a behavior that
must be implemented somewhere in a class hierarchy before subclasses can be
instantiated. In other words, abstract methods are not implemented. Empty method
implementations are “do nothing” methods. One of the most common uses of
empty method declarations is in classes such as WindowAdapter that imple-
ment EventListener interfaces in the java.awt.event package. Such
classes are designed to simplify the implementation of anonymous classes used
as event adapters. For example,
addWindowListener(new WindowAdapter() {
public void windowClosing(WindowEvent e) {
System.exit(0);
}
});
85. Gosling et al., The Java Language Specification, §6.4.2, “The Members of a Class Type.”
86. Gosling et al., §8.4.6.1, “Overriding (By Instance Methods).” See also §9.4.1, “Inheritance
and Overriding,” the opening sentence of which explicitly states that subinterface methods “over-
ride” superinterface methods.
This method throws a general IOException. Now consider the following over-
riding method in a subclass.
Implementations of this method further down in the class hierarchy are now
restricted to throwing the more specific FileNotFoundException.
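A hypothetical pair of declarations consistent with this description (the class and method names are my own, not the book's listing):

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// The superclass method declares the general IOException...
class Reader1 {
    void open(String name) throws IOException { }
}

// ...and the override narrows the throws clause, so all further overrides
// are limited to FileNotFoundException (or a subclass, or nothing at all).
class Reader2 extends Reader1 {
    void open(String name) throws FileNotFoundException { }
}
```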
class Widget {
private static final Widget[] OUT = new Widget[0]; // cached at class loading
private Widget[] stock;
public Widget[] getStock() { return (stock == null) ? OUT : stock; }
}
As shown in this example, zero-length arrays are immutable and can be returned
over and over again. This class caches a zero-length array during class loading.
The problem with returning null can be seen in the args parameter of the
main method. One must always test for null before attempting to access the
args array (to search for an element or even just to query the length field).
Client programmers are confronted with the same problem if you return null
instead of a zero-length array. As noted by Joshua Bloch in his best seller Effec-
tive Java:
…there is no reason ever to return null from an array-valued
method instead of returning a zero-length array. This idiom is
likely a holdover from the C programming language, in which array
lengths are returned separately from actual arrays. In C, there is no
advantage to allocating an array if zero is returned as the length.88
87. The term empty array cannot be used in this context. In fact, I do not use that term at all
because it can be interpreted to mean either a zero-length array or an array in which no values have
been assigned to the components.
88. Bloch, “Item 27: Return zero-length arrays, not nulls.”
The problem with the return statement in this accessor method is that it is
conditionally executed. If fd == null is true, the method will fall through.
Here is the actual method declaration from the RandomAccessFile class of
the java.io package:
Placing the empty brackets after the formal parameter list in a method declara-
tion is still supported for backward compatibility with those early releases of
Java. However, all editions of the JLS have specified that this syntax “should not
be used in new code.”90 See also 1.11.1.2 Covariant Result Types.
89. Gosling et al., The Java Language Specification, §8.4, “Method Declarations.”
90. Ibid.
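For example, the two declarations below are equivalent; the first uses the legacy C-style placement of the brackets after the formal parameter list (the class name is hypothetical):

```java
// Legacy versus modern placement of the array brackets.
class Legacy {
    int oldStyle()[] { return new int[] {1, 2, 3}; } // discouraged syntax
    int[] newStyle() { return new int[] {1, 2, 3}; } // equivalent declaration
}
```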
This perhaps explains why the JLS uses funny values instead, and Bloch uses
distinguished return value in Effective Java. I have also heard them referred
to as special values.
Java exceptions are intended to replace the use of return values to indicate
failure. As stated in the JLS:
Explicit use of throw statements provides an alternative to the old-
fashioned style of handling error conditions by returning funny values,
such as the integer value -1 where a negative value would not normally
be expected. Experience shows that too often such funny values are
ignored or not checked for by callers, leading to programs that are not
robust, exhibit undesirable behavior, or both.92
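A minimal illustration of the two styles the JLS contrasts (the methods are hypothetical):

```java
class FunnyValues {
    // Old style: -1 is a funny value the caller can silently ignore.
    static int indexOf(int[] a, int key) {
        for (int i = 0; i < a.length; i++)
            if (a[i] == key) return i;
        return -1;
    }
    // Exception style: failure cannot be silently ignored.
    static int indexOfChecked(int[] a, int key) {
        int i = indexOf(a, key);
        if (i < 0) throw new IllegalArgumentException("not found: " + key);
        return i;
    }
}
```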
A small number of methods in the core API return -1, but not to signal failure.
Methods in the java.io and java.nio packages use -1 to signal EOF, and
the String and StringBuffer classes return -1 to indicate that a char
or substring was not found. What is happening on the Java platform is that
null is being returned instead of an object. In 6.9 Exception Handling, I argue
that this is not one whit different than returning -1 to indicate failure. Such meth-
ods should either throw an exception or else use the null object pattern. There
are also a handful of boolean methods in the core API that return false to
indicate failure. This is also a mistake. Return values should not be used to indi-
cate failure in the Java programming language. The problem with doing so is
A parameter declared final is like any other constant. The value cannot be
changed in the method or constructor body. See 1.5.6 Declaring Local Variables
and Parameters final for a discussion.
Parameters are created when a method invocation expression or class
instance creation expression is evaluated and destroyed when control passes
out of the method or constructor either because a return or throw state-
class Test {
public static final void main(String[] args) {
int i = 100;
change(i);
System.out.println(i);
}
static void change(int value) {
value--;
}
}
ble, however, one of two things can happen at this point. If a defensive copy is
94. There is a mindless debate that rages over whether Java passes objects “by value” or “by refer-
ence.” This terminology is used nowhere in either the JLS or JVMS. It is a throwback to the C
and C++ programming languages. Furthermore, to argue that Java passes objects “by value”
because a copy of the reference is passed is superficial analysis. The real issue is that Java does
not create objects on the stack. Therefore, there is no mechanism for automatically creating a copy
of the object passed. In short, Java cannot pass an object. Java can only pass references to
objects. What possible good can come from arguing that Java passes those references “by value”
because they are “copies” of the original? To do so is merely to explain how parameters are initial-
ized. They are initialized with a copy of the corresponding argument expression. Whether that is
considered the equivalent of “passing by value” or “passing by reference” is irrelevant to Java pro-
grammers who have no background in the C and C++ programming languages (which sooner or
later will be most of them).
class Test {
public static final void main(String[] args) {
Mutable mutable = new Mutable();
System.out.println(mutable);
change(mutable);
System.out.println(mutable);
}
static void change(Mutable value) {
value.mutatorMethod();
}
}
class Mutable {
int i = 100;
void mutatorMethod() {
i--;
}
public String toString() { return Integer.toString(i); }
}
100
99
NOTE 1.2
Everything said in the following section applies to constructors as well
as to methods. There is a dearth of argument checks in constructors
(including constructors in the core API). The arguments in a class in-
stance creation expression are not substantially different from those in
a method invocation expression. Generally speaking, they should be
95. Prior to the publication of the “Programming With Assertions” in the 1.4 release in which asser-
tions were introduced to the Java programming language, I used the term parameter check. The
change from parameter check to argument check is consistent with my policy of always deferring
to the software engineers and technical writers at Sun in matters of terminology.
There is one important exception to coding argument checks at the very top of a
method or constructor. As discussed in 1.6.3.3 Making Defensive Copies, argu-
ment checks should always be made against the defensive copy. Therefore the
code to make the defensive copy comes first.
Argument checks such as these cannot be coded without documenting them
using the @param and @throws tags because they throw runtime exceptions
that are not included in the throws clause of a method or constructor header.
They are always used in public and protected methods, but in default
access and private methods assertions are preferable to argument checks
for reasons explained in 6.2 Assertions.
There are two very specific reasons why assertions cannot be used as
replacements for argument checks in public and protected methods: they
can be disabled; and if the Boolean expression in an assertion evaluates to
false, an AssertionError is thrown. Either of these would violate the
exception specification in the API docs. Disabling assertions violates the excep-
tion specification because the exception would not be thrown. Throwing
AssertionError would violate the exception specification because the
method is documented as throwing IllegalArgumentException (or
some other runtime exception). Normally neither runtime exceptions nor errors
are explicitly caught, but changing from a runtime exception to an error could
change the behavior of a program that uses a catch(Exception e) catch-
all exception handler.
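A sketch of this division of labor (all names here are hypothetical): the public method uses an argument check that is always enforced, while the private method uses an assertion that may be disabled at runtime.

```java
class TimeoutHolder {
    private long timeout;

    public void setTimeout(long timeout) {          // public: argument check
        if (timeout < 0)
            throw new IllegalArgumentException("timeout value is negative");
        setTimeoutInternal(timeout);
    }

    private void setTimeoutInternal(long timeout) { // private: assertion
        assert timeout >= 0 : timeout;              // may be disabled (-da)
        this.timeout = timeout;
    }
}
```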
Here are the actual documentation comments in the Object class that result
in the API docs that you see above:
The @throws tag was introduced in the 1.2 release as a “synonym” for
@exception with the explanation that throws is the keyword and so
@throws is generally preferred over @exception. Both are translated into
“Throws:” by javadoc. The @throws tag is discussed at length in 1.6.4
The throws Clause.
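A sketch modeled on the wait(long, int) documentation comment and argument checks in the Object class (abridged and renamed; consult the actual java.lang.Object source):

```java
class WaitDocSketch {
    /**
     * Waits for at most the given amount of time.
     *
     * @param  timeout the maximum time to wait in milliseconds.
     * @param  nanos additional time, in nanoseconds range 0-999999.
     * @throws IllegalArgumentException if the value of timeout is negative
     *         or the value of nanos is not in the range 0-999999.
     */
    public void waitSketch(long timeout, int nanos) {
        if (timeout < 0)
            throw new IllegalArgumentException("timeout value is negative");
        if (nanos < 0 || nanos > 999999)
            throw new IllegalArgumentException(
                "nanosecond timeout value out of range");
    }
}
```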
In the examples above, “in nanoseconds range 0-999999”, “if the value of
timeout is negative”, and “if newLocale is null” are known as precondi-
tions, which the Cambridge International Dictionary of English defines as “some-
thing which must happen or be true before it is possible for something else to
happen.”96 Preconditions are mentioned in 6.2.1 Preconditions, Postconditions,
and Invariants for the sake of completeness, but are primarily discussed in this
section. The relationship between argument checks and preconditions is that the
former are used to test that client programmers are in compliance with the lat-
ter. Other tests are required to test preconditions, however. For example, tests
that throw IllegalStateException are not usually argument checks.
Thus the term precondition has a broader meaning than argument checks.
Preconditions can be documented using @param tag, the @throws tag, or
both. Here is an example from the java.nio package that documents the pre-
conditions using the @param tag:
Parameters:
newPosition - The new position value; must be non-nega-
tive and no larger than the current limit
Throws:
IllegalArgumentException - If the preconditions on
newPosition do not hold
This usage is inconsistent with the rest of the core API, which throws an
IllegalArgumentException under the same circumstances. Throwing
ArithmeticException instead anticipates something that has not as yet
happened.
When coding argument checks for strings, including a check for an empty
string (i.e., "" or a string that has no length) is often as important as checking
for null. In fact, you should make it a point to stop and consider the implica-
tions of passing an empty string when coding any method or constructor with
String type parameters. Usually an IllegalArgumentException is
thrown as a result. It may be tempting to code the following, but throwing
NullPointerException for an empty string is clearly misleading.
if (s == null || s.length() == 0)
throw new NullPointerException();
If you want to code a single argument check such as this, throw Illegal-
ArgumentException instead. Be sure to mention both null and empty
strings in the @throws tag documentation. Forgetting to check for an
empty string is a very common problem. For example, in Bug Id 4481055,
“LOGGING APIs: Undefined behavior for FileHandler constructors with
empty pattern” none of the constructors in the FileHandler class check for
an empty string. There is at least one other example of this in the logging API,
which is documented in Bug Id 4486791, “LOGGING APIs: Undetermined behav-
ior on empty logger names.” My reason for picking on the logging API is that the
/**
* Sets the time field with the given value.
* @param field the given time field.
* @param value the value to be set for the given time field.
*/
public final void set(int field, int value) {
fields[field] = value;
}
Here the exception is thrown in a private method, which can be very confus-
ing to client programmers reading a stack trace. (All of the constructors in the
Locale class are documented to throw NullPointerException, so I
assume the absence of an explicit argument check is an example of a system-
induced argument check performance optimization and not just an oversight.)
NullPointerException is the classic system-induced argument check.
Many other methods and constructors in the core API doubtless make the same
mistake. In fact, explicit argument checks for null references are often omitted
even when it is painfully obvious that passing a null reference is eventually
going to result in NullPointerException or some other exception being
thrown. I would argue that, if you are going to use the system-induced argument
check for a NullPointerException, the first use of the parameter
should be as a target reference in a field access or method invocation expression.
if (name == null)
throw new NullPointerException();
detail. In either case, the runtime exception is thrown as the result of the value
passed and should be so documented. In fact, I would argue that it is actually
more important to document a runtime exception thrown as the result of a sys-
tem-induced argument check than one that is explicitly thrown (which at the very
least can be readily seen looking at the source code).
Immutable types (not objects) are how const references are imple-
mented in the Java programming language.
actually an immutable type that is required. The immutable type may be either a
class or interface type, but in either case does not include any mutator methods.
class Account {
The code in bold is mine. Other than some gratuitous name changes, the rest is
from Concurrent Programming in Java.97 Immutable is an inner class so
97. Doug Lea, Concurrent Programming in Java, Second Edition, (Boston, Addison-Wesley,
2002), §2.4.3, “Read-Only Adapters.” It is difficult for me to imagine anyone not owning Lea’s book,
so I will let you read his (rather entertaining) explanation as to why recorder should not be
passed an instance of the mutable class.
Note that Account can still be an interface rather than a class type. I just
wanted to simplify the example as much as possible. The required changes to
the other two classes in Doug Lea’s example are very minor, and are also
marked in bold:
public AccountHolder(AccountRecorder r) {
recorder = r;
}
NOTE 1.3
Being a thorough Bloch devotee, it took me a while to realize that he
confuses Would-Be Mutators and making defensive copies in Effective
Java. In “Item 13: Favor immutability” he talks about making defensive
copies (of the values stored in instance variables) while actually copying
the current object. Returning a copy of the current object is the Would-
Be Mutator design pattern. The motivation for the Would-Be Muta-
rather than:
[end of quote] 99
This passage from the JLS contributes to a popular myth among Java program-
mers that read-only access can be implemented by using private fields and
100.String objects are sometimes called read-only, but this is an abuse of terminology. Vari-
ables are read-only. Objects are either mutable or immutable.
Note that the argument check now uses instance variables (the defensive
copies) and not the method parameters. This eliminates the possibility that
whoever creates a Period object (either maliciously or because the Period
class is documented to make defensive copies) immediately changes the Date
objects passed. In a multithreaded application, it is possible for such a change
to occur after the argument check and before the defensive copies are made. In
other words, during the window of vulnerability. The consequence of such a
change is twofold. First, the dates passed are not what the client programmer
intended (unless done maliciously), and second, the start > end class invari-
ant is compromised. You should make it a practice to always code like this, even
if the class is initially used only in single-threaded applications.
101.Joshua Bloch, “More Effective Programming With Java™ Technology” delivered at the 2002
JavaOne conference (Session 2502).
Why is using the clone() method okay when making defensive copies
in accessor methods, but not in mutator methods or constructors? Under-
standing the answer to this question is critical to proper encapsulation. The dif-
ference is that in mutator methods and constructors you are being passed an
object that purports to be a Date object but may in fact be a subclass of Date.
If the object is in fact a subclass of Date, the clone() method in the subclass
can be overridden to store references to the objects it clones. For example,
import java.util.*;
class Test {
public static void main(String[] args) {
Period period = new Period(new MaliciousDate(),
new MaliciousDate());
Date start = period.start();
Date end = period.end();
System.out.println("start = " + start);
System.out.println(" end = " + end);
MaliciousDate.attack();
System.out.println("start = " + start);
System.out.println(" end = " + end);
}
}
class Period {
Date start, end;
public Period(Date start, Date end) {
if (start.compareTo(end) > 0)
throw new IllegalArgumentException(start + ">" + end);
// Window of vulnerability
this.start = (Date) start.clone(); //big mistake
this.end = (Date) end.clone(); // ~ditto~
}
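A corrected constructor along the lines the text describes can be sketched as follows (a sketch, not the book’s listing; the SafePeriod name and the main harness are mine). The constructor copies first, using a Date constructor rather than clone(), so an untrusted Date subclass never gets a chance to keep a reference to the copies; the accessors may safely use clone() because the fields are known to hold genuine Date objects:

```java
import java.util.Date;

class SafePeriod {
    private final Date start, end; // defensive copies, never exposed directly

    public SafePeriod(Date start, Date end) {
        // copy FIRST, with a constructor (not clone()), so a malicious
        // Date subclass cannot retain a reference to the copies
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        // the argument check uses the copies, closing the window of vulnerability
        if (this.start.compareTo(this.end) > 0)
            throw new IllegalArgumentException(this.start + ">" + this.end);
    }

    // clone() is safe here: the field is known to be a genuine java.util.Date
    public Date start() { return (Date) start.clone(); }
    public Date end()   { return (Date) end.clone(); }

    public static void main(String[] args) {
        Date d1 = new Date(0), d2 = new Date(1000);
        SafePeriod p = new SafePeriod(d1, d2);
        d1.setTime(5000);        // mutating the original has no effect
        p.start().setTime(5000); // mutating the returned copy has no effect
        System.out.println(p.start().getTime() + ".." + p.end().getTime());
    }
}
```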
It is important to understand why the @throws tag has the same weight as the
throws clause in establishing an exception specification.
The JLS says that “it is permitted but not required to mention other
(unchecked) exceptions in a throws clause.”103 This specification is seriously
outdated and should have been changed in the Second Edition of the JLS. It is
indicative of a subtle and rarely discussed change in how exceptions are docu-
mented on the Java platform. In the First Edition of the JLS (which unlike the Sec-
ond Edition included the API docs for the java.lang, java.util, and
java.io packages) runtime exceptions were included in the throws clause
of method declarations. This practice is now openly discouraged:
By convention, unchecked exceptions should not be included in a
throws clause. (Including them is considered to be poor programming
practice. The compiler treats them as comments, and does no check-
ing on them.)104
The only exception to this rule that unchecked exceptions should not be included
in throws clauses is runtime exceptions that should have been declared as
checked exceptions. In the core API this includes MissingResource-
Exception, SecurityException, and NumberFormatException.
102. Unascribed, “How to Write Doc Comments for the Javadoc Tool” at java.sun.com/j2se/
javadoc/writingdoccomments/index.html#throwstag, (Mountain View: Sun Microsys-
tems, Inc., 2000), “Documenting Exceptions with @throws Tag.”
103. Gosling et al., §8.4.4, “Method Throws.”
104. Unascribed, “How to Write Doc Comments for the Javadoc Tool” at java.sun.com/j2se/
javadoc/writingdoccomments/index.html#throwstag, “Documenting Exceptions with
@throws Tag.” Bug Id 4349458, “Runtime exceptions should not be included in throws clause” is
interesting in this regard. It concerns the javax.microedition package (J2ME) and com-
plains that runtime exceptions in throws clauses bloat class files.
from the “How to Write Doc Comments for the Javadoc Tool” document (written
some five years later) reflect a subtle change that de-emphasized the throws
clause as a means of documenting exceptions. The throws clause is now best
thought of as primarily used by compilers to check for exception handlers. The
@throws tag should be used to “document” exceptions (versus merely “declar-
ing” that they are thrown). There is some redundancy in this design in that
checked exceptions are both documented using the @throws tag and declared
in the throws clause, but the redundancy is not without purpose. Unlike the
throws clause, the reason(s) why an exception is thrown can only be
explained using an @throws tag.
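The division of labor can be sketched as follows (a sketch of mine, not the book’s listing; the Example class, readTimestamp method, and file name are all assumed): the throws clause declares the exception for the compiler’s benefit, while the @throws tag explains *why* it is thrown:

```java
import java.io.IOException;

class Example {
    /**
     * Reads a timestamp from the named file.
     *
     * @throws IOException if the file does not exist or cannot be read
     *         (the "why" can only be explained here, not in the throws clause)
     */
    static long readTimestamp(String fileName) throws IOException {
        // body omitted in this sketch; always fails
        throw new IOException("cannot read " + fileName);
    }

    public static void main(String[] args) {
        try {
            readTimestamp("config.dat");
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```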
The checked exceptions either explicitly thrown or propagated in a method
or constructor body only need to be assignable to one of the exception types in
the throws clause. As stated in the JLS:
For each checked exception which is a possible result, the throws
clause for the method or constructor must mention the class of that
exception or one of the superclasses of the class of that exception.105
There are only two cases, however, in which an exception superclass should be
named in the throws clause: high-level exceptions as defined in 6.5.1 General
Exception Classes and umbrella exceptions. The documentation requirements
106. The compiler will allow related classes to be named in a throws clause. For example,
throws IOException, FileNotFoundException compiles, but no one codes like this.
If an exception superclass is used in the throws clause, subclasses are documented using the
@throws tag.
class Test {
public static void main(String[] args) {
throw new Exception(); //compiler error: unreported exception
}
}
This is essentially what “checked” exception means; the compiler “checks” for
an exception handler. If there is none, the method or constructor must declare
that the exception is thrown.
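Either choice satisfies the compiler’s check, as in this minimal sketch (the class and method names are mine):

```java
class CheckedDemo {
    // option 1: declare the exception in a throws clause
    static void declarer() throws Exception {
        throw new Exception("declared");
    }

    // option 2: provide an exception handler
    static void handler() {
        try {
            throw new Exception("handled");
        } catch (Exception e) {
            System.out.println("caught: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        handler();
        try {
            declarer();
        } catch (Exception e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```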
All checked exceptions should be documented using the @throws
tag. There is general agreement on this point. Opinions as to which unchecked
exceptions should be documented vary. Some software engineers and technical
writers think that all of the unchecked exceptions should be documented. For
example, in “Item 44: Document all exceptions thrown by each method” Bloch
This advice is totally bogus. Generally speaking, runtime exceptions are not
caught. They are programmer errors (or bugs) that are fixed during development
and unit testing. Furthermore, even if they are caught, it would most likely be in a
catchall exception handler such as catch(Throwable e), not explicitly as
this specification implies. The reason for documenting runtime exceptions
is that they are preconditions. If the reason why they are thrown is not docu-
mented, programmers will have to resort to reading source code to find out why
their code is failing.
As explained in 6.5 The Throwable Class Hierarchy, runtime exceptions
that should have been declared as checked exceptions (but were not because
there is no general agreement on the meaning of the term runtime exception)
should be documented as if they actually were checked exceptions. There are
three such runtime exceptions in the core API. They are MissingResource-
Exception, SecurityException, and NumberFormatException.
These runtime exceptions and others like them should not only be documented
using the @throws tag, but should also be declared in throws clauses.
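Treating such a runtime exception as if it were checked looks like this in practice (a sketch; the PortParser class and parsePort method are names of mine, not from the core API):

```java
class PortParser {
    /**
     * Parses a port number.
     *
     * @throws NumberFormatException if s is not a decimal integer
     *         (a precondition, documented and declared as if checked)
     */
    static int parsePort(String s) throws NumberFormatException {
        // the compiler treats the throws clause above as a comment
        return Integer.parseInt(s);
    }

    public static void main(String[] args) {
        System.out.println("port = " + parsePort("8080"));
        try {
            parsePort("eighty");
        } catch (NumberFormatException e) {
            System.out.println("precondition violated");
        }
    }
}
```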
Errors are documented at the programmer’s discretion. There are some rea-
sonable guidelines that can be formulated, however. As explained in 6.5 The
Throwable Class Hierarchy, errors are either “every method errors,” “non-
occurring errors,” or “recurring end-user errors.” The only every method error
that would ever be documented is OutOfMemoryError. There are only two
examples of this in the entire J2SE (including support packages) of which I am
aware, so you can be assured that it is rarely done. Recurring end-user errors
should always be documented using the @throws tag (to encourage client pro-
grammers to catch them in top-level exception handlers). That only leaves non-
occurring errors, the overwhelming majority of which are either assertion errors
or system configuration errors. Assertion errors should never be documented.
The “programmer’s discretion” then is limited to the documentation of system configuration errors.
110. Unascribed, “How to Write Doc Comments for the Javadoc Tool,” java.sun.com/j2se/
javadoc/writingdoccomments/index.html#throwstag.
I really do not like this example. It is misleading because most runtime exceptions
are direct subclasses of RuntimeException and do not have any subclasses.
In fact, ignoring the org.omg package, the only other runtime
exceptions in the core API that do have subclasses are IllegalArgument-
Exception, IllegalStateException, and UnsupportedOpera-
tionException. When a subclass of one of these runtime exceptions is
thrown, it (versus the superclass) is always documented in the @throws tag. In
fact, the only time I have seen this example put into practice is in the String
class, which is funny because if the methods in the String class are not going
to throw StringIndexOutOfBoundsException who is? Nevertheless,
care should be taken not to document unchecked exceptions that may not be
thrown in a different implementation of the same method.
Encapsulation issues aside, documenting system configuration errors may
encourage client programmers to write exception handlers that translate them
into some high-level exception. How useful is that if these are non-occurring
errors? Explicitly catching a non-occurring error is like “Waiting for Godot.” A
111. Ibid.
112. API docs for the java.util.logging package. See also Bug Id 4478366.
Doing so makes methods much easier to read because code does not have to
be searched to determine the initial value of a local variable or if it is used else-
where in the method, constructor, or initialization block. Bloch argues essentially
the same thing in “Item 29: Minimize the scope of local variables”, so why does
the “Code Conventions for the Java Programming Language” document include
the following?
Put declarations only at the beginning of blocks. (A block is any code
surrounded by curly braces “{” and “}”.) Don't wait to declare variables
until their first use; it can confuse the unwary programmer and hamper
code portability within the scope.
113. The main Bug Id is 4486754, “LOGGING APIs: Expected NullPointerException isn't
thrown” in which more than a few methods and constructors in the logging API are faulted for not
actually throwing the exception. Other such bugs include Bug Id 4625722,
“java.util.logging.Level(null, int[, String]) (sic) doesn't throw NPE” and Bug
Id 4635308, “ java.util.logging null reaction doc should be updated,” and Bug Id
4398380, “Logging APIs: SocketHandler constructors spec need clarification.”
void myMethod() {
    int int1 = 0;       // beginning of method block

    if (condition) {
        int int2 = 0;   // beginning of "if" block
        ...
    }
}114
The answer is that this specification is just plain wrong. In fact, the statement
that the placement of local variable declarations “may hamper code portability
within the scope” is a little bizarre. If Sun is going to continue to promote this
document it should be revised to reflect current coding practices.
The remainder of this section discusses declaring local variables in a loop.
For example,
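consider a loop of this shape (a sketch; the class name and loop body are stand-ins of mine):

```java
class LoopDecl {
    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            String s = "iteration " + i; // s is declared inside the loop
            System.out.println(s);
        }
    }
}
```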
Is s allocated over and over again? The short answer is No, but it is important to
understand why because otherwise there is a tendency to think that local vari-
able declaration statements inside a loop are somehow inefficient. That simply is
not the case.
Local variables and parameters are allocated on the stack in the same local
variable array as this. Unlike Java arrays, the components in a local variable
array do not all have the same component type. The following is a complete list
of variables allocated in a local variable array:
• The this reference (in local variable zero)
• Method parameters
• Constructor parameters
• Exception-handler parameters
• Local variables
114. Unascribed, “Code Conventions for the Java Programming Language,” (Mountain View: Sun
Microsystems, 1995-1999), §6.3, “Placement.”
class Test {
void test() {
/*
* The astore_<n> instructions are used to pop a reference
* off of the stack and store it in the local variable array.
* The <n> is the index value. In an instance method such as
* this, "local variable zero" is always a reference to the
* current object (a.k.a. this).
*/
String mathew = "Mathew"; //astore_1
String mark = "Mark"; //astore_2
String luke = "Luke"; //astore_3
String john = "John"; //astore 4 (astore_<n> exists only for <n> = 0-3)
Object Paul = this; //astore 5
}
}
class Test {
void test() {
if (true) {
String mathew = "Mathew";
String mark = "Mark";
String luke = "Luke";
String john = "John";
Object Paul = this;
} //the scope of these five local variables ends here
String Judas = "Judas"; //so Judas can reuse local variable slot 1
}
}
//FIRST EXAMPLE
int defined = 0;
char ch;
for (int i=0; i<=Character.MAX_VALUE; i++) {
ch = (char)(i);
if (Character.isDefined(ch))
defined++;
}
System.out.println(defined + " character codes are defined");
//SECOND EXAMPLE
int defined = 0;
for (int i=0; i<=Character.MAX_VALUE; i++) {
char ch = (char)(i);
if (Character.isDefined(ch))
defined++;
}
System.out.println(defined + " character codes are defined");
The answer is neither. The bytecodes for these two examples are identical except
for the fact that ch and i have swapped places in the local variable array. The
only real difference is the scope of ch. In the first example, by declaring ch out-
side of the loop you make it impossible to reuse ch as a data name in the remain-
der of the method or constructor body.
invocation expressions that directly use the value of an intermediate result. For
example,
int defined = 0;
for (int i=0; i<=Character.MAX_VALUE; i++) {
if (Character.isDefined((char)(i)))
defined++;
}
System.out.println(defined + " character codes are defined");
This is not always possible, however, because the resultant expression may be
too long to fit on a single line. For example,
int hashCollisions = 0;
for (int i=0; i<=Character.MAX_VALUE; i++) {
int hashCode = (new Character((char)i)).hashCode();
if ((hashCode & 0x7FFFFFFF) % size == index)
hashCollisions++;
}
115. Gosling et al., The Java Language Specification, §16.2.14, “try Statements.”
116. Gosling et al., The Java Language Specification, introduction to Chapter 16, “Definite
Assignment” and also 14.20, “Unreachable Statements.”
Here is a very simple example from the core API of what the JLS means by
“every possible execution path:”
Signal signal;
if (name.startsWith("CTRL"))
signal = new Signal(name);
else
signal = new Signal(name.substring(3));
int x;
Only the simple assignment operator can be used to initialize a local variable
because the compound assignment operators +=, -=, *=, /=, &=, |=, ^=, %=,
<<=, >>=, and >>>= first use the left-hand operand in the compounded opera-
tion, and then assign the result to the variable.
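A minimal sketch of the difference (the class name is mine):

```java
class InitDemo {
    public static void main(String[] args) {
        int i;
        // i += 1;  // compiler error: variable i might not have been initialized
        i = 1;      // only simple assignment can definitely assign i
        i += 1;     // now legal, because i was definitely assigned above
        System.out.println("i = " + i);
    }
}
```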
From the point at which a local variable is declared until it is definitely
assigned, any use of the variable is a compiler error. For example,
class Test {
public static void main(String[] args) {
int i;
if (Boolean.valueOf(args[0]).booleanValue())
i = 0;
System.out.println(i); //compiler error: i might not have been initialized
}
}
Attempting to compile this program generates exactly the same error message.
I like to use the try statement to emphasize that the compiler decides what is
“definite.” Assignments to local variables in a try block are never definite. That
means in order to initialize a local variable in a try block, a standard default
value must be assigned in the variable initializer. For example,
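the following sketch (the port variable and the parse call are stand-ins of mine, not the book’s listing):

```java
class TryInit {
    public static void main(String[] args) {
        int port = 0; // standard default value; required because the
                      // assignment inside the try block is never "definite"
        try {
            port = Integer.parseInt("8080");
        } catch (NumberFormatException e) {
            // fall through with the default value
        }
        // compiles only because of the variable initializer above
        System.out.println("port = " + port);
    }
}
```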
117. The authors of the JLS are frequently flogged for not having written a more detailed specifica-
tions for definite assignment and unreachable statements. I find much of the criticism to be out of
balance. For example, should the JLS include a specification that says: “If the type of the expression
in a switch statement is byte and there is a case label for all 256 integer values, the state-
ments in a default label are unreachable”? No! The specification is clear that the value of the
expressions in a case label is not taken into consideration. Dr. Gosling once said that Guy Steele
and Gilad Bracha were the only ones qualified to write the JLS because they were “well-known totally
anal freaks.” Well I’m sure Dr. Gosling meant that in a nice way, but some of the suggestions for
improving the specifications for definite assignment and unreachable statements really are “totally
anal.”
while (true) {
continue;
System.out.println("unreachable statement");
}
switch (i) {
case 1:
System.out.println("reachable");
break;
System.out.println("unreachable statement");
default:
System.out.println("reachable");
}
void test() {
return;
System.out.println("unreachable statement");
}
Perhaps the second most obvious group of unreachable statements are loops in
which the expression is an inlined constant that evaluates to false. For example,
class Test {
static final boolean DEBUG = false;
public static void main(String[] args) {
for (;DEBUG;) {
System.out.println("unreachable statement");
}
while (DEBUG) {
System.out.println("unreachable statement");
}
do {
System.out.println("reachable"); //always executes once
}
while (DEBUG);
}
}
What this means to application programmers is that debug flags that are
inlined constants cannot be used to control the execution of a loop.
The rules for unreachable statements have been especially written to allow
the if statement to be used for conditional compilation; statements contained
in an if (false) statement are always considered reachable.
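The contrast can be sketched as follows (the class name is mine); the loop form is a compiler error, the if form is not:

```java
class ConditionalCompilation {
    static final boolean DEBUG = false; // an inlined constant

    public static void main(String[] args) {
        // while (DEBUG) { ... }  // compiler error: unreachable statement
        if (DEBUG) {              // special-cased: always considered reachable,
            System.out.println("debug output"); // so this compiles (and is
        }                                       // compiled away entirely)
        System.out.println("done");
    }
}
```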
Except for loop control variables that are inlined constants, the value of an
expression is never taken into consideration when determining if a statement is
reachable. The JLS has always included a curious little mistake in this regard:
Except for the special treatment of while, do, and for statements
whose condition expression has the constant value true, the values of
expressions are not taken into account in the flow analysis.119
(Recall that “constant value” is the interim term for inlined constant in the Sec-
ond Edition.) The “whose condition expression has a constant value true”
Here are a couple examples of what the specification means by not “taking into
account” the value of an expression:
switch (0) {
case 1:
System.out.println("technically reachable");
default:
System.out.println("reachable");
}
Both of these statements compile even though it is obvious they include state-
ments that are not really reachable. Here is another example made rather
famous by the Jikes compiler team at IBM:
void method(boolean b) {
if (b) return;
else return;
b = !b;
}
The last statement in bold is unreachable, but not because the value of b was
taken into consideration. It is unreachable because all of the branches in the if-
then-else statement resulted in abrupt completion.
A catch clause can also be unreachable. For example,
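a sketch of mine: the subclass exception must be caught first, because reversing the two catch clauses below would make the FileNotFoundException clause unreachable, which is a compiler error:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class CatchOrder {
    public static void main(String[] args) {
        try {
            throw new FileNotFoundException("missing");
        } catch (FileNotFoundException e) { // must precede IOException
            System.out.println("specific: " + e.getMessage());
        } catch (IOException e) {           // placed first, this clause would
            System.out.println("general");  // make the one above unreachable
        }
    }
}
```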
NOTE 1.4
The following section is necessary because I use the term qualifying
type to describe the class or interface that a compiler searches for the
compile-time declaration of a method (most notably in Table 1.7 The
Five General Forms and Qualifying Types). Anyone already familiar with
this term knows that the Second Edition of the JLS introduced it precisely so as to differentiate the name of the class or interface written to
a class file from the class or interface that a compiler searches for the
compile-time declaration of a method. The two are only different, how-
ever, when the method invoked is declared in the Object class. Never-
theless, the reader should be aware that I am using the term qualifying
type with a slightly different meaning than the JLS. My reason for blur-
ally invoked at runtime. These three methods may be all the same, or they could
all be different (but only if the method is declared in the Object class) as in the
following example of invoking an overridden toString() method.
class Test {
public static void main(String[] args) {
Superclass superclass = new Subclass();
superclass.toString();
}
}
class Superclass { }
class Subclass extends Superclass {
public String toString() { return "overridden in Subclass"; }
}
Because this method is declared in the Object class the qualifying type written
to the class file is Object. This may surprise some readers because it is a
quirk in the Java programming language that was not formalized until the Second
Edition of the JLS. The qualifying type of a method declared in the Object
class is always Object, regardless of how the method is invoked. If the
method is not declared in the Object class, the qualifying type is always the
same as the class or interface searched by a compiler for the compile-time dec-
laration.
Why the difference? The Second Edition of the JLS offers no explanation for
the difference. The change, however, can be seen in the addition of the following
specification in 13.1 The Form of a Binary.
• Given a method invocation expression in a class or interface
C referencing a method named m declared in a (possibly dis-
tinct) class or interface D, we define the qualifying type of
the method invocation as follows:
If D is Object then the qualifying type of the expression is
Object.…120
What follows the ellipsis is more or less an exact restatement of the rules in
15.12.1 Compile-Time Step 1: Determine Class or Interface to Search, so the
entire difference between the qualifying type and the class or interface searched
by a compiler is the last sentence quoted above.
class Test {
public static void main(String[] args) {
Dummy dummy = new Dummy() {};
System.out.println(dummy.getClass());
}
}
interface Dummy { }
This program compiles and when executed prints class Test$1 (the name of
an anonymous class). The problem is that although the type of the primary
expression in dummy.getClass() is an interface, the invokeinterface
machine instruction cannot be used to invoke the public methods in the
Object class. The precise reason why is not of interest to application program-
mers, but is discussed in Bug Id 4398789. The solution is to invoke the method
using the invokevirtual machine instruction. If you read the JVMS for the
invokevirtual machine instruction you will see that the type name written
to the class file as part of the symbolic reference to the method invoked must be
a class type. To simplify the specification, the JLS states that the qualifying type
for such a method invocation is always Object.
I probably would not even broach the subject were it not for examples such as
the following involving the protected clone() and finalize() methods.
class Test {
public static void main(String[] args) {
Dummy dummy = new Dummy() {};
dummy.clone(); //compiler error: clone() has protected access
}
}
interface Dummy { }
As explained in 3.7 Do Interfaces Extend the Object Class?, the clone() and
finalize() methods are “filtered out” when interfaces extend the Object
class. This example should therefore generate a cannot resolve symbol
error instead of an access control error. I believe this is why Bug Id 4644627,
“interfaces extend Object?” is still marked “In progress, bug.” This is very confus-
ing because the following interface compiles in the same 1.4.1_01 release (the
latest release as of this writing).
class Test {
public static void main(String[] args) {
Dummy dummy = null;
dummy.clone();
}
}
interface Dummy {
void clone(); //result type not Object
}
The error message generated by the first example (incorrectly) implies that inter-
faces inherit the protected clone() method from Object. The second
example implies that it is not inherited because the clone() method declared
in the Dummy interface has a void result type that would otherwise generate a
compiler error. (In fact, such interface method declarations do generate a com-
piler error in pre-1.4.1 releases.) This obviously needs to be fixed.
While on the subject of the difference between the qualifying type of a
method invocation and the class or interface that a compiler searches for the
compile-time declaration of a method, I want to address a serious problem with
the term compile-time declaration. This term is actually only used in one sec-
tion of the JLS. It is defined as follows.
class Test {
public static void main(String[] args) {
Subclass sub = new Subclass();
sub.print();
}
}
class Superclass {
void print() {
System.out.println("inherited by subclass");
}
}
class Subclass extends Superclass {
}
122. Gosling et al., §15.12.3, “Compile-Time Step 3: Is the Chosen Method Appropriate?”
When you hear the term compile-time declaration think “the class
or interface the compiler searched” not for a declaration, but for a
member with a given method signature.
This says that the name of the class or interface written to the class file is the
one in which the method is “declared.” This has never been the case. The name
of the class or interface written to the class file has always been the one the
compiler searched. This specification was corrected in the Second Edition by
the introduction of the term qualifying type:
A reference to a method must be resolved at compile time to a sym-
bolic reference to the qualifying type of the invocation.124
123. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First Edition,
§13.1, “The Form of a Java Binary.” (Do not update.)
124. Gosling et al., §13.1, “The Form of a Binary.”
import java.util.*;
class Test {
public static void main(String[] args) {
GregorianCalendar today = new GregorianCalendar();
today.getDate(); //compiler error: cannot resolve symbol
}
}
symbol : class …
symbol : constructor …
symbol : method …
symbol : variable …
Notice that there is no symbol : field. The compiler has no idea how to
determine that an unqualified variable name is supposed to be a field. It could
just as easily be a local variable or parameter, so variable is always used in
the error message. The third part of the message is what interests us the most
in this context. The third line always begins with the word location followed
by either the class or interface keyword and then the name of the class
or interface searched. The type name in a cannot resolve symbol error
message is always fully qualified. Be on the alert for simple type names. They
125. Gosling et al., The Java Language Specification, Introduction to Chapter 6, “Names.”
REFERENCE TYPE LITERALS, comprised of String literals (very unusual) and class
literals: The compiler searches the String class or the class named in the class literal.
VARIABLES, comprised of local variables and parameters, the simple name of a field, field
access expressions, and array access expressions:b The compiler searches the declared type
of the field, local variable, or parameter. In the case of an array access expression, the
compiler searches the component type.
ANY USE OF THE new KEYWORD, comprised of class instance and array creation
expressions (the latter is rare): The compiler searches the class or interface named in a class
instance creation expression. In the case of an array creation expression, the compiler
searches the component type.
a. Before the 1.2 release, the javac compiler “silently tolerated” use of the TypeName.fieldName gen-
eral form to reference instance variables as if it had been written this.fieldName. (See Bug Id 4087127.)
b. As presented in 4.2.1 Primary Expressions, the this keyword is included in this list. It is excluded here
because this.fieldName and this.methodName are presented as separate general forms.
c. Methods declared void do not have a type, and therefore cannot be used as the primary expression in this
context.
face searched is not what you expected, you should analyze the code using the
five general forms listed in Table 1.7.
In a non-static context, qualifying the name of a static member with this
works, but only because the type of this is used by the compiler, not the
this reference. See 1.10.4 Accessing static Members using a Primary
Expression for a detailed discussion. If you try that in a static context, how-
ever, a compiler error is generated (because this is undefined in a static
context). The statement that the simple name of a field or method is the same as
the this.fieldName or this.methodName general forms is a gross
oversimplification that only applies to instance variables and instance methods
referenced from within the body of the same package member in which they are
declared.
An entirely different approach to this subject is required, one that equally
emphasizes the following idea.
Which type name depends on whether the simple name is used in a package
member or nested type. The rule for package members is stated as follows in
Table 1.7.
126. Rose, “How do inner classes affect the idea of this in Java code?”
I know of no other Java book that suggests the simple name of a field or method
in a static context is implicitly qualified by the name of the class or interface
in which the simple name appears. It is a novel idea, but one that is extremely
useful in a classroom setting. The closest the specifications come to saying
something like this is the following quote from the JVMS.
A class method may refer to other fields and methods of the class by
simple name only if they are class methods and class (static) vari-
ables.127
The JLS says even less: that the simple name of an instance variable in a
static context generates a compiler error.128
Determining the meaning of a simple field or method name used in a nested
type requires searching all of the enclosing types as well as any blocks that may
enclose a local or anonymous class. Because of scoping rules, the innermost
enclosing type must be searched first, followed by any blocks that may enclose
a local or anonymous class, and then the remaining enclosing classes. For
example,
class Test {
public static void main(String[] args) {
A.B.C abc = new A().new B().new C(); //don't try this at home
abc.test();
}
}
class A {
String s = "outermost enclosing class";
class B {
String s = "enclosing class";
127. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Second Edition,
(Boston: Addison-Wesley, 1999), §2.10.3, “Method Modifiers.”
128. Gosling et al., §6.5.6, “Meaning of Expression Names.”
129. Rose, “How does the Java Language Specification change for inner classes?”
I think this is an error of omission that should be fixed in the next edition.
In the previous example, the variable s has four different meanings:
The local variable s
this.s (the same meaning as C.this.s)
B.this.s
A.this.s
This further complicates the notion that the simple name of a field or method is
implicitly qualified by either this (non-static members) or the name of the
type in which the simple name appears (static members). For example:
class Test {
private int x = 0;
class InnerClass {
void test() {
System.out.println(x);
}
}
}
This compiles because inner classes can access the private members of
enclosing classes. Now qualify the simple name x with the this keyword:
class Test {
private int x = 0;
class InnerClass {
void test() { System.out.println(this.x); } //compiler error
}
}
130. Gosling et al., §8.1.2, “Inner Classes and Enclosing Instances.” Interestingly, subsection
6.5.6.1 Simple Expression Names of 6.5.6 Meaning of Expression Names did not have to be
changed to specify that the meaning of a simple field name in a block class may be a local variable
or parameter.
A qualified this keyword should have been used. The following code does com-
pile.
class Test {
private int x = 0;
class InnerClass {
void test() {
System.out.println(Test.this.x);
}
}
}
See 3.6.2 Qualifying the this Keyword in Volume 1 for a detailed discussion of
multiple current instances.
There is a similar problem when saying that the simple name of a field or
method in a static context is implicitly qualified by a type name. In a contain-
ment or inner class hierarchy,131 there is more than one enclosing type. For exam-
ple,
This is comparable to the last example, only the test() method is now a
static context. Now qualify the simple name x with the name
NestedTopLevelClass:
class Test {
private static int x = 0;
static class NestedTopLevelClass {
static void test() {
System.out.println(NestedTopLevelClass.x);
}
}
}
Attempting to compile the Test class generates the following compiler error:
131. Basically, the term containment hierarchy refers to a package member and all of the
nested top-level classes declared in that package member. An inner class hierarchy refers
to all of the inner classes (inner member, local, and anonymous classes) in a given top-level
class. Contrary to the JLS, I define top-level class as any class declared static (the same as
John Rose did in the Inner Classes Specification). That means the class at the top of an inner
class hierarchy may be either a package member or a nested top-level class. See 2.12 Containment
and Inner Class Hierarchies in Volume 1 for a complete definition of these terms. Be advised that I
repeat this footnote a number of different times in Volume 2 because these terms are so unusual.
method invocation. The efficient thing to do in that case is to chain the method
invocations together. This works because the result type of getClass() is
Class , and getName() is a member of the Class class. Here is an example
involving a class instance creation expression:
boolean b = new Boolean(args[0]).booleanValue();
There are two intermediate results in this example. One is the args[0] array
access expression, which is used to create a Boolean object. The value of
that class instance creation expression is then immediately used as the target
reference in a method invocation expression. The Boolean object is created,
used once, and then discarded.
You should never have to create an object just to get at the code in
an instance method.
Any time you see code like this there is a need for a class method that does the
same thing. Hence Bug Ids 4302078 and 4262398 (both filed in 1999), which
eventually resulted in the addition of the Boolean.valueOf(boolean b)
factory method in the 1.4 release (years later).
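The difference can be sketched as follows (the class name ValueOfDemo is mine):
class ValueOfDemo {
   public static void main(String[] args) {
      // Before 1.4: an object had to be created just to reach the
      // code in the booleanValue() instance method
      boolean b = new Boolean("true").booleanValue();

      // The valueOf(boolean) factory method added in the 1.4 release
      // returns the cached Boolean.TRUE or Boolean.FALSE instead of
      // creating a new object
      Boolean cached = Boolean.valueOf(true);

      System.out.println(b + " " + (cached == Boolean.TRUE)); // prints "true true"
   }
}
Because valueOf(boolean) returns the canonical Boolean.TRUE and Boolean.FALSE
objects, no object is created at all.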
I find it very interesting that invocation chaining was emphasized in the
design of the java.nio package. The document entitled NIO APIs: Beta 3
Changes in the 1.4 release included the following last minute change:
Improved invocation chaining: Revised the methods Datagram-
Channel.connect, DatagramChannel.disconnect,
FileChannel.position(long), FileChannel.truncate
(long), SelectableChannel.configureBlocking,
SelectionKey.interestOps(int), Selector.wakeup,
Matcher.appendReplacement, Matcher.reset(), and
Matcher.reset(java.lang.CharSequence) to return the object
upon which they are invoked. Revised Matcher.appendTail to return
the string buffer object passed to it.132
132. “NIO APIs: Beta 3 Changes” document in the API docs for the 1.4 release.
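A short sketch of what this change makes possible, using the Matcher class
(whose reset methods now return the Matcher upon which they are invoked):
import java.util.regex.*;

class ChainingDemo {
   public static void main(String[] args) {
      Matcher m = Pattern.compile("a+b").matcher("");
      // reset(CharSequence) returns the Matcher, so the find()
      // invocation can be chained onto it
      boolean found = m.reset("xxaab").find();
      System.out.println(found); // prints "true"
   }
}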
class Test {
public static void main(String[] args) {
Subclass subclass = new Subclass();
System.out.println((Superclass) subclass.s);
(Superclass) subclass.print();
}
}
class Superclass {
public String s = "hidden field";
public static void print() {
System.out.println("hidden class method");
}
}
class Subclass extends Superclass {
public String s = "subclass field";
public static void print() {
System.out.println("subclass method");
}
}
class Test {
public static void main(String[] args) {
Subclass subclass = new Subclass();
System.out.println(((Superclass)subclass).s);
((Superclass)subclass).print();
}
}
This is the same program as before, only using parentheses to enclose both the
cast operator and the target reference. This program does compile, and when
executed prints
hidden field
hidden class method
Note that both the cast operator and the target reference must be enclosed in
parentheses. Furthermore, the closing parenthesis comes before the period
separator. “Breaking apart” a field access or method invocation expression like
this can take a little getting used to, and explains in part why casting target refer-
ences is discussed in this chapter (apart from the other uses of a cast operator
in 5.7.4 The Cast Operator).
Casting a target reference has the effect of telling the compiler to search in
a different class or interface. It is therefore conceptually different from other
type conversions. For example,
import java.util.Stack;
class Test {
public static void main(String[] args) {
Stack stack = new Stack();
stack.push(new String("abc"));
System.out.println(((String)stack.pop()).toUpperCase());
}
}
class Test {
static String s = "this is a class variable";
public static void main(String[] args) {
Test test = null;
System.out.println(test.s);
test.print();
}
static void print() {
System.out.println("this is a class method");
}
}
java.io.PrintStream.println(Ljava/lang/String;)V
The V at the end of this symbolic reference is the result type, which is void. At
the front of the symbolic reference is the fully qualified class or interface name in
which the method is declared. In the middle is the method signature, which in
Overloaded methods have different method signatures; they are not bound by the
compiler-enforced method contract. Arguably, overloaded methods are the same
method in the mind of a client programmer.
Overriding methods have the same method signatures; they are therefore bound by
the compiler-enforced method contract. Client programmers expect overriding
methods to implement a different, more specialized behavior.
class Test {
public static void main(String[] args) {
String local = args[0];
}
}
Method Test()
0 aload_0
1 invokespecial #1 <Method java.lang.Object()>
4 return
Look closely and you will see that args and local appear nowhere in this out-
put. There is a detailed discussion of this in 1.7 Local Variables.
There are four components to a method signature.
The example:
There is no mention of order anywhere. Yet another section includes the
following specification.
The following compile-time information is then associated with the
method invocation for use at run time:
class Test {
   public static void main(String[] args) {
      String s = test("echo"); // compiler error: the most specific
                               // method, test(String s), is void
   }
   static String test(Object o) {
      System.out.println("test(Object o)");
      return null;
   }
   static void test(String s) {
      System.out.println("test(String s)");
   }
}
Why is the result type not part of the method signature in the Java programming
language? Why exclude the result type, but include the parameter types? These
are reasonable questions because of examples such as the previous one.
doSomething();
class IncompatibleChange {
private int count = 0;
public int getCount() {
return count;
}
}
class Test {
public static void main(String[] args) {
IncompatibleChange incompatibleChange =
new IncompatibleChange();
System.out.println(incompatibleChange.getCount());
}
}
One effect of this implicit check of the result type is that a void method cannot
be substituted for a method that has a different result type. This is important
because the compiler checks to make sure that void methods are only invoked
in top-level expressions. As discussed in one of the following subsections,
however, a future release of the Java programming language may allow for
covariant result types, in which subclasses can override superclass methods
using a narrower result type that is assignment compatible with the result type
of the overridden method.
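A sketch of what a covariant result type would look like (the Shape and
Circle classes are hypothetical):
class Shape {
   Shape copy() { return new Shape(); }
}
class Circle extends Shape {
   // Covariant result type: Circle is narrower than, and assignment
   // compatible with, the Shape result type of the overridden method
   Circle copy() { return new Circle(); }
}
This was eventually allowed beginning with the 5.0 release.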
Quite the contrary, this section explains the extent to which the compiler
enforces this contract. The term method contract is also used to refer to the
API docs for a particular method. When so used, method contract is more or
less synonymous with interface contract. To avoid any ambiguity in meaning, I
consistently use the term compiler-enforced method contract when refer-
ring to the rules discussed in this section. The compiler-enforced method con-
tract is not about enforcing the interface contract (or API docs) for a particular
method (referred to as “behavior” in the above quote) because it only applies to
It is this passage from the JLS in particular that led me to name these rules the
compiler-enforced method contract. Nevertheless, I should point out that this
term is of my own making. It is not used in 8.4.6.3 Requirements in Overloading
and Hiding, which is the corresponding section in the JLS.
The compiler-enforced method contract always involves subclass methods
that have the same signature as a superclass or superinterface method. There
are exactly three cases in which the compiler enforces the method contract.
• Hiding methods: I believe this is an unnecessary language restriction that
could be lifted (though I seriously doubt it ever will be). There is a discussion
of this below.
• Overriding methods: This includes implementing abstract methods
inherited from either a superclass or superinterface. It also includes inherit-
ing a non-abstract method from a superclass that has the same signa-
ture as one or more abstract methods inherited from superinterfaces. In
that case the superclass method overrides and implements the superinter-
face method(s).
• More Than One abstract Method With The Same Signature: This is a
special case that can only happen when at least one of the abstract
methods is inherited from an interface. The abstract methods are bound
by the method contract with the notable exception of the throws clause.
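The second case can be sketched as follows (the Printer, SuperWriter, and
Writer types are hypothetical):
interface Printer {
   void print();
}
class SuperWriter {
   // Same signature as the abstract Printer.print() method
   public void print() { System.out.println("superclass implementation"); }
}
// Writer declares no print() method of its own; the non-abstract method
// inherited from SuperWriter both overrides and implements the abstract
// Printer.print() method
class Writer extends SuperWriter implements Printer { }

class ContractDemo {
   public static void main(String[] args) {
      Printer p = new Writer();
      p.print(); // prints "superclass implementation"
   }
}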
Access of hidden or overridden method    Permitted access of hiding or overriding method(a)
public                                   public
protected                                public or protected
none (default)                           public, protected, or none (default)
a. Note that private methods cannot be hidden. Subclasses are therefore not bound by the compiler-enforced
method contract when declaring a method with the same signature as a private superclass method.
native, and strictfp. Any use of the final modifier precludes this entire
discussion because methods cannot be abstract final, and of course
final methods cannot be hidden or overridden. There are no restrictions on
the use of synchronized, native, and strictfp when hiding or overriding.
For example, methods that hide or override synchronized methods do
not have to be declared synchronized.
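For example, the following sketch compiles (the Counter and FastCounter
classes are hypothetical):
class Counter {
   synchronized void increment() { }
}
class FastCounter extends Counter {
   // Legal: synchronized is not part of the compiler-enforced method
   // contract, so the overriding method need not repeat it
   void increment() { }
}
class SyncDemo {
   public static void main(String[] args) {
      Counter c = new FastCounter();
      c.increment();
      System.out.println("ok"); // prints "ok"
   }
}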
There are a couple of very interesting exceptions to the compiler-enforced
method contract. The first concerns access control. As stated in the JLS:
What we are talking about here are changes to the superclass method that in
effect invalidate the assumption that overriding methods can be safely substi-
tuted at run time. This is a little known quirk in dynamic linking that generally
goes unnoticed. For example,
package com.javarules.examples;
import java.io.*;
public class Superclass {
void print() throws IOException {
System.out.println("superclass");
}
}
package com.javarules.examples;
import java.io.*;
public class Subclass extends Superclass {
void print() throws IOException {
System.out.println("subclass");
}
}
Both classes are members of the com.javarules.examples package and have
package-private print() methods that throw IOException. Now suppose
the following changes are made to Superclass without recompiling the Subclass.
import com.javarules.examples.*;
import java.io.*;
class Test {
public static void main(String[] args) {
Superclass sub = new Subclass();
try {
sub.print();
} catch(FileNotFoundException e) { }
}
}
Executing this program prints subclass. This program has just invoked a
package-private method in a different package inside of a try statement that
cannot catch the checked exception thrown by that method.
One need only ponder the design of the compiler-enforced method contract
and dynamic linking for a moment to realize there is really nothing that can be
done about this. The superclass programmer opens up a hole in the access con-
trol mechanism by declaring an overridden method to be more public than
the corresponding subclass method. Likewise, adding a checked exception to a
superclass throws clause that is narrower than any thrown in the correspond-
ing subclass methods invalidates the subclass exception specification. These
problems will be caught when the subclasses are recompiled. For example,
recompiling Subclass as shown above generates the following compiler error.
com/javarules/examples/Subclass.java:4: print() in
com.javarules.examples.Subclass cannot override print() in
com.javarules.examples.Superclass; attempting to assign weaker access privileges
com/javarules/examples/Subclass.java:4: print() in
com.javarules.examples.Subclass cannot override print() in
com.javarules.examples.Superclass; overridden method does not
throw java.io.IOException
public void print() throws IOException {
^
1 error
interface Superinterface {
double test();
}
And if the public modifier is removed from the test() method in Super-
class , the same code generates the following compiler error:
The same abstract methods, however, can throw entirely different checked
exceptions. For example,
interface One {
void test() throws A, C;
}
interface Two {
   void test() throws B, C;
}
Attempting to compile the Test class generates the following compiler error:
This shows that the compiler enforces an entirely different rule for methods that
override more than one abstract method; the checked exceptions thrown
must be in the intersection of the throws clauses for the overridden meth-
ods.
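The intersection rule can be sketched as follows (A, B, and C are assumed to
be checked exception classes, and the Impl class name is mine):
class A extends Exception { }
class B extends Exception { }
class C extends Exception { }

interface One { void test() throws A, C; }
interface Two { void test() throws B, C; }

class Impl implements One, Two {
   // Only C appears in both throws clauses, so only C may be thrown here
   public void test() throws C { throw new C(); }
}

class IntersectionDemo {
   public static void main(String[] args) {
      try {
         new Impl().test();
      } catch (C e) {
         System.out.println("caught C"); // prints "caught C"
      }
   }
}
Declaring Impl.test() to throw A or B would generate a compiler error.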
Of course, the larger issue here is one of API design. If the abstract
methods are throwing different checked exceptions, they may not have the
same semantics. You want to carefully read the interface contracts of the
abstract methods involved. Providing a single implementation for
abstract methods that represent different behaviors would be a gross
mistake. See also 3.8.3 Inheriting Methods With the Same Signature.
import java.util.*;
class Superclass {
   static long a() { return 0; }
   Calendar b() { return Calendar.getInstance(); }
}
class Subclass extends Superclass {
   static int a() { return 0; }
   GregorianCalendar b() { return new GregorianCalendar(); }
}
Such code will likely compile beginning with the same release that introduces
parameterized types. As of this writing, there are a total of 709 votes on the Bug
Parade for this RFE. The primary Bug Id is 4144488.
Because constructors always have the same name as the class in which they are
declared, they are inherently overloaded.
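For example (the Point class is hypothetical):
class Point {
   int x, y;
   Point() { this(0, 0); }    // the overloaded constructors share
   Point(int x, int y) {      // the name of the class
      this.x = x;
      this.y = y;
   }
}
class PointDemo {
   public static void main(String[] args) {
      System.out.println(new Point().x + " " + new Point(3, 4).y); // prints "0 4"
   }
}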
Overloaded methods are not bound by the compiler-enforced method con-
tract. They can have different result types, different levels of access, and differ-
class Test {
public static void overloaded() { }
public String overloaded(String s) { return s; }
}
The fact that overloaded methods can have different result types actually comes
in very handy. There are numerous examples of overloaded methods in which
the result type corresponds to the parameter type(s). The overloaded methods
in the Math utility class are a very important example of this:
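Math.abs, for instance, is overloaded for each of the four numeric primitive
types int, long, float, and double, and in each overloading the result type
matches the parameter type:
class AbsDemo {
   public static void main(String[] args) {
      int i = Math.abs(-1);      // abs(int) returns int
      long l = Math.abs(-1L);    // abs(long) returns long
      float f = Math.abs(-1.0f); // abs(float) returns float
      double d = Math.abs(-1.0); // abs(double) returns double
      System.out.println(i + " " + l + " " + f + " " + d); // prints "1 1 1.0 1.0"
   }
}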
There are four overloaded toBinaryString methods, one each for the int,
long, float, and double data types. All four methods do the same thing,
Do client programmers think about the fact that there are different
method implementations when invoking an overloaded method?
144. This term is of my own making. It is precisely defined in that an Object type parameter in
fully overloaded methods such as print and println can accept any reference type. On the
other hand, some fully overloaded methods include one or more array type parameters.
import java.util.*;
class Test {
public static void main(String[] args) {
List list = Arrays.asList(args);
ListIterator iterator = list.listIterator();
Object
Object
Object
There is nothing confusing about this example. If the expectation was that it
would print String three times instead of Object , it’s just plain wrong. The
result type of the next() method in the Iterator interface is Object, so
the overloaded classType method invoked is always going to be the one that
has an Object type parameter. The transition from this rather obvious pro-
gramming mistake to a discussion of “confusing uses of overloading” is subtle
and illogical:
Because overriding is the norm and overloading is the exception, over-
riding sets people’s expectation for the behavior of method invocation.
As demonstrated by the CollectionClassifier example, over-
loading can easily confound these expectations. It is bad practice to
write code whose behavior would not be obvious to the average pro-
grammer upon inspection. This is especially true for APIs.147
147. Ibid.
If you are going to code overloaded methods, the one thing that you must do
above all else is to make sure that the correct thing happens no matter what
types of arguments are passed. In other words, client programmers should
never have to think about which overloaded method is invoked. That
Bloch is on a seriously bad tack in Item 26 can be seen in his advice to “never
export two overloadings with the same number of parameters.”149 What about
the common case of fully overloaded methods such as println in the
PrintStream and PrintWriter classes? These classes export 19 different
methods with the same number of parameters. Are client programmers
confused by this? Au contraire! Client programmers never have to think about
which fully overloaded method is invoked. And what about all the examples
from the Math utility class above? Are they confusing?
Perhaps the most important thing to remember about overloaded method
matching is that the choice of which method to invoke is made at compile time.
As stated in the JLS:
Java is designed to prevent additions to contracts and accidental name
collisions from breaking binary compatibility; specifically:
148. Ibid.
149. Ibid.
The point is that one overloaded method cannot be used as more or less a
replacement for another. A naive programmer may think so because the newly
added method is more specific. If the older method is public, deleting it is a
mistake. In fact, deleting public methods is always a mistake in terms of
binary compatibility. It simply does not matter if the method is overloaded.
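The compile-time nature of overloaded method matching can be sketched as
follows (the which methods are mine):
class ResolutionDemo {
   static String which(Object o) { return "which(Object)"; }
   static String which(String s) { return "which(String)"; }

   public static void main(String[] args) {
      Object o = "abc"; // run-time type String, compile-time type Object
      // The choice is made at compile time from the declared type of o
      System.out.println(which(o));     // prints "which(Object)"
      System.out.println(which("abc")); // prints "which(String)"
   }
}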
152. Besides having the same method signature, I think it is a good idea to use the same parameter
names as an overridden superclass method. This is purely a matter of style, but it helps to make the
connection for client programmers reading the API docs. Method signatures are not what they see.
They see method headers that include the parameter names. If a client programmer is familiar with
the superclass parameter names, changing them in an overriding subclass method is a potential
source of confusion. For example, the String class declares equals(Object anObject) .
In the Object class the declaration is equals(Object obj). The difference is admittedly
small, but always gives me pause.
153. C++ programmers must use the virtual keyword for late binding and overriding function dec-
larations. Consequently, they use the term virtual functions when referring to functions that can be
overridden. The term virtual methods is nothing more than a throwback to C++. I strongly discourage
this usage in Java.
class Test {
public static void main (String[] args) {
Superclass superclass = new Subclass();
superclass.print();
}
}
class Superclass {
static void print() {
System.out.println("superclass");
}
}
class Subclass extends Superclass {
static void print() {
System.out.println("subclass");
}
}
Executing this program prints superclass because the declared type of the
superclass variable is Superclass. The compiler always uses the
invokestatic machine instruction when invoking class methods, exactly as
if the method invocation expression had been written using the Type-
Name.methodName general form. Therefore the compile-time declaration is
always invoked.
When a non-private instance method is invoked the JVM searches the
method dispatch table that corresponds to the class of the object referenced
at run time. That may be the same class as the compile-time declaration or a
subclass thereof. For example,
class Test {
   public static void main (String[] args) {
      Superclass superclass = new Subclass();
      superclass.print();
   }
}
class Superclass {
   void print() {
      System.out.println("superclass");
   }
}
class Subclass extends Superclass {
   void print() {
      System.out.println("subclass");
   }
}
154. Gosling et al., §15.11.1, “Field Access Using a Primary.” Gilad Bracha cut a bunch of “in Java”
when working on the Second Edition. Here the “in” didn’t get cut.
This is the same as the previous example, only using instance methods. In both
cases, the compile-time declaration is found in Superclass. Executing this
program, however, prints subclass.
To fully understand overriding and dynamic method lookup you must know
something about invocation modes. Invocation modes determine whether or
not dynamic method lookup is used. There are five invocation modes defined in
the JLS: virtual, nonvirtual, super, static, and interface. It is
important to understand that invocation modes are conceptual only. They are
not written to class files.
There are four bytecodes (or machine instructions) that can be used to invoke
methods.155 The relationship between invocation modes and these four
bytecodes is an interesting one. Neither the JLS nor the JVMS map invocation
modes to machine instructions. In fact, there is not a sin-
155. Strictly speaking, the term bytecode refers only to the code arrays in a class file. There is a
code array for each method, including the special initialization methods. They are a series of one-byte
opcodes usually followed by one or more operands. The opcodes are unsigned bytes with values
ranging from zero to 255. For every opcode, the JVMS has a mnemonic, such as nop for opcode zero,
which is a “do nothing” machine instruction more commonly spelled no-op (for no operation). I tend
to use the term bytecode when referring to the numeric value of an opcode and machine instruction
when referring to the corresponding JVMS mnemonic. Note also that I use a fixed font for invocation
modes, bytecodes, and machine instructions, but that has no particular significance.
156. Gosling et al., §15.12.3, “Compile-Time Step 3: Is the Chosen Method Appropriate?.”
157. Tim Lindholm and Frank Yellin, §3.11.8, “Method Invocation and Return Instructions.”
because a special ACC_SUPER flag must be set in the class file. The explana-
tion for this is of little interest to application programmers.158
Presumably this mapping from invocation modes to machine instructions is
only of interest to the software engineers who write compilers, but without the
JVMS application programmers would be left thinking that the super invocation
mode uses dynamic method lookup:
The strategy for method lookup depends on the invocation mode.
158. Briefly, in the 1.0.2 release methods invoked using super were resolved at compile time,
which was a huge mistake. It is possible for the resolved method to be overridden in another super-
class that is compiled after the subclass in which method invocation appears is compiled. If the
ACC_SUPER flag is set (and it always is) the JVM looks in the method dispatch table of the direct
superclass (as it always should have). See Bug Id 4069324 for more details.
159. Gosling et al., §15.12.4.4, “Locate Method to Invoke.”
This specification in the JLS should be clarified. Even the statement that “overrid-
ing is possible” is confusing unless the reader is familiar with Bug Id 4069324.
Table 1.11 summarizes everything discussed so far. There is a sequence of
things to consider. At the end of that list, all remaining method invocation expres-
sions are virtual. You should take the time to compare this table to Table 1.7
160. Tim Lindholm and Frank Yellin, under invokespecial in Chapter 6, “The Java Virtual
Machine Instruction Set.”
class Test {
public static void main (String[] args) {
Runnable runnable = new Runnable() { public void run(){ } };
runnable.run();
runnable = new Thread();
runnable.run();
}
}
The decompiled code for the main method in the Test class is as follows:
As stated above, the additional operand was the “guess.” The original JVMS
included an entire chapter on the _quick machine instructions. They are no
longer documented, in part because the Java Platform Debugger Architecture
obviates the need for Sun to make such implementation details publicly
available, but also because new Sun implementations now use a more efficient
algorithm when searching for interface methods (the details of which have never
been made public, but to which Dr. Gosling alludes in a Bill Venners interview
quoted below).
I have always found this chapter in the unfolding Java platform story to be of
particular interest because anyone who has closely studied Dr. Gosling’s style of
coding will notice what I can only describe as an aversion to using interfaces. I
have always suspected this had something to do with the invokeinterface
machine instruction, and I am still pretty well convinced that it does. The
problem with even mentioning this curiosity in a book about the Java programming
language is that the invokeinterface machine instruction will be used to argue
that interface types are somehow inefficient. Here I am reminded of some quotes
from “Item 37: Optimize judiciously” in Effective Java:
161. An implementation-defined “identifier” column was also used to uniquely identify each of the
interface methods in a method dispatch table. This alternative key was faster than a string compari-
son of method descriptors, and was used when guessing at the index value. The so-called “guess”
was correct if the identifier at that index value was the same as the one stored in the
invokeinterface_quick machine instruction.
162. Tim Lindholm and Frank Yellin, under invokeinterface in Chapter 6, “The Java Virtual
Machine Instruction Set.”
163. William A. Wulf, A Case Against the GOTO. Proceedings of the 25th ACM National Confer-
ence 2 (1972): 791-797.
164. Donald E. Knuth, Structured Programming with go to Statements. Computing Surveys 6
(1974): 261-301.
165. M. A. Jackson, Principles of Program Design , (London, Academic Press, 1975).
166. Erich Gamma et al., §1.6, “How Design Patterns Solve Design Problems.”
This was the first confirmation I ever had that this was a real issue, by which I
mean that I was reading the JVMS correctly. I was grateful for that, but then
along came the HotSpot VM and it was clear they had done something different
when implementing invokeinterface. What they had done, and how much of a
difference it made, once again had me on the lookout for some clue from a Sun
source.
A Bill Venners interview with Gosling provided enough of an answer for me to
put this issue to rest once and for all:
There is still part of me that says, maybe interfaces should never have
existed. People should just use classes for interfaces. But there turned
out to be some nice things that get done with interfaces that are differ-
ent. There's an interesting performance difference that most people
never think about, which is that interfaces need to do a kind of a
dynamic dispatch, whereas strict classes don't. The class model in
Java is really rather reactionary. It's almost exactly the original class
model from Simula 67. And one of the nice things about that model is
you can make method dispatch just fiendishly fast.
There are all kinds of tricks for doing interface-style dispatches, flexible
multiple-inherited dispatches, pretty fast. But they are always a couple
of instructions longer at least and maybe more, depending on how you
do it. Although there are techniques that trade it off against memory.
You can actually get it to the same performance as single inheritance,
if you are willing to basically spend more RAM building tables. But one
of the unsung nice things about the difference between classes and
interfaces is that it is statically knowable whether you can do dynamic
versus static dispatching.168
167. “JDJ EXCLUSIVE: An Interview with James Gosling,” (SYS-CON Publications, Inc., 1999),
www.sys-con.com/java/javaone/interviews/gosling.html.
168. Dr. James Gosling in an interview with Bill Venner entitled “A Conversation with James Gosling
(May 2001),” on the artima.com Web site (Artima Software, Inc.), www.artima.com/intv/
gosling314.html.
Chapter Contents
2.1 Introduction 248
2.2 Namespaces 249
2.2.1 The Meaning of a Simple or Partially Qualified Name 252
2.2.2 Disambiguating Type Names 254
2.3 The Fundamentals of Lexical Scoping 258
2.3.1 The Mysterious Scope Hole 266
2.3.2 Compilation Unit Scope 268
2.3.3 Members Shadow Declarations in Enclosing Scopes 268
2.3.4 The Scope of Types in the Unnamed Package 271
2.3.5 Circular Dependencies in Type Declarations 276
2.4 Shadowing 278
2.5 Obscuring 281
2.6 Observable Compilation Units and Packages 284
2.7 Qualified Access 286
2.8 Access Control 290
2.8.1 The protected Access Modifier 301
2.8.2 Full Access to the Members of an Enclosing Class 312
2.8.3 Members More Accessible Than Their Class Type 315
2.8.4 Accessing the Implementation of Same Class Objects 323
2.9 Encapsulation 324
How does shadowing fit in? Shadowing is an exception to the rule. A shadowed
declaration is by definition in scope, but cannot be referenced using a simple
name.
More generally, every (Java programming language) entity that exists on the
planet, or that ever will exist for that matter, is one of the following:
• In scope and visible
• In scope, but shadowed or obscured
• Accessible
• Inaccessible
• Unobservable
1. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, The Java Language Specification, Sec-
ond Edition, (Boston: Addison-Wesley, 2000), §6.6, “Access Control.”
2. Ibid.
3. Gosling et al., introduction to Chapter 6, “Names.”
I have always maintained this very same distinction in Java Rules (even before
the Second Edition of the JLS was released), and was glad to see Bracha intro-
duce this new terminology to support the distinction. It is by no means super-
fluous. Bracha continues to use the term hiding in the context of members that
are not inherited because that is what most programmers understood hiding to
mean. He uses the new terms shadowing and obscuring for the more refined
meanings of what used to be lumped together as hiding. Note also that the term
visible is now defined in terms of shadowing, not hiding. A name that is
shadowed is not visible, whereas an entity that is hidden is not inherited and therefore
not in scope. Only fields, class methods, and nested types (members of a class
or interface type) are hidden. That about sums up the change. Obscuring is also
discussed in this chapter for the sake of completeness.
2.2 Namespaces
The concept of a namespace is not difficult to master. A family is a namespace
in which each individual must have a unique first name. A person’s full name is
like the fully qualified name of an entity in the Java programming language,
except that the syntax is different. Instead of writing John Doe, in Java you write
Doe.John. In this analogy, the family name corresponds to a package name,
and the first name corresponds to a class or interface type. A package is a group
of logically (versus biologically) related classes and interfaces, each of which
5. Sheng Liang and Gilad Bracha, Dynamic Class Loading in the Java Virtual Machine, Pro-
ceedings of the 1998 ACM SIGPLAN Conference on Object-Oriented Programming Systems, Lan-
guages and Applications (OOPSLA’98), citeseer.nj.nec.com/liang98dynamic.html. This
is a must read for anyone wanting to fully understand dynamic class loading.
6. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Second Edition,
(Boston: Addison-Wesley, 1999), §4.2, “Internal Form of Fully Qualified Class Names.”
import java.awt.*;
import java.util.*;
class Test {
List l;
}
Both of these packages include a type named List. The compiler cannot arbi-
trarily decide which type-import-on-demand declaration you intended to use,
which is exactly what it would be doing if the search stopped as soon as a
matching type name was found.
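One solution is a single-type-import declaration, which takes precedence over
any type-import-on-demand declarations (the class name ImportDemo is mine):
import java.awt.*;
import java.util.*;
import java.util.List; // single-type-import declaration

class ImportDemo {
   // The single-type-import takes precedence over both
   // type-import-on-demand declarations, so this is java.util.List
   List l;

   public static void main(String[] args) {
      List list = new ArrayList();
      System.out.println(list.size()); // prints "0"
   }
}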
Assuming that there is no package statement at the top of a compilation
unit, these rules stipulate that the unnamed package is searched before
named packages imported using a type-import-on-demand declaration.
This is a language feature that sooner or later comes to the attention of most
Java programmers when they inadvertently declare a class that has the same
simple name as a class in a named package, including notably the core API. If
that class is stored in the unnamed package, it is only a matter of time before
you try to compile a program that attempts to import a like-named class from a
named package using a type-import-on-demand declaration. In my case I
thoughtlessly created a Properties class that sat in the unnamed package
for months (like a time bomb waiting to explode) until one day when I tried
importing java.util.Properties using import java.util.*. All
sorts of compiler errors were generated because I was trying to use my
Properties class as if it were java.util.Properties. This is a very
hard problem to diagnose the first time you encounter it.
As stated above, the first two steps in the process of disambiguating type
names are normal lexical scoping rules. For example,
class Test {
    class Widget { }              // inner member class
    public static void main(String[] args) {
        System.out.println(Widget.class);
        class Widget { }          // local class
    }
}
class Widget { }                  // helper class

Executing this program prints

class Test$Widget
This is the binary name of the inner member class. The local class is not in
scope because it has not yet been declared. In neither case does the simple
name mean the Widget helper class.
Notice that declarations, not names, have a scope. In fact, Gilad Bracha
changed the name of the section in which scope is defined from 6.3 Scope of a
Simple Name to 6.3 Scope of a Declaration. Although the JLS says declarations
have a scope, I prefer the term entity rather than declaration. For example,
programmers must know if an entity is in scope or out of scope in order to
know if the simple name can be used. If the entity is out of scope, then some
form of qualified access must be used.
9. John Rose, Inner Classes Specification (Mountain View: Sun Microsystems, 1997), “What are
the new binary compatibility requirements for Java 1.1 classes?”
10. Gosling et al., introduction to Chapter 6, “Names.”
11. Local classes are interesting in this regard. Do they continue to exist when out of scope? The
answer is Yes (as class files that are dynamically loaded). Given the five choices of “in scope and
visible,” “in scope, but shadowed or obscured,” “accessible,” “inaccessible,” or “unobservable” from
the introduction to this chapter, we say that local and anonymous classes are implicitly private
and therefore inaccessible outside of the block in which they are declared (which for anonymous
classes created in the variable initializer of a field is the body of an <init> or <clinit>
method). Thus there exists an entity that cannot be named except when in scope. In that sense, they
are comparable to local variables and parameters.
12. John Rose, “How does the Java Language Specification change for inner classes?”
ical scoping. You run into the classic problem of trying to describe water to a
fish when explaining the term lexical scoping to someone who has never writ-
ten programs in a dynamically scoped language. Nevertheless, all Java program-
mers should at least be aware of the fact that Java is a lexically scoped
language.
Lexical scoping can be compared to throwing a pebble into a pond. The
compiler begins searching for the declaration of an entity at the point at which a
simple or partially qualified name is used and works outward. Hence, smaller
scopes such as block scope are always searched before the larger scopes such
as a compilation unit or package. There are five scopes in the Java programming
language. They are listed here in order of size:
• Host system scope
• Package scope
• Compilation unit scope
• Type scope 13
• Block scope (which includes methods, constructors,
and initialization blocks)
If the simple name pebble were used, you can look at this list as representing
the concentric circles made by the pebble as the compiler searches for a decla-
ration with the same name.
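The outward search can be sketched minimally (the class and method names here are illustrative): the block-scope declaration is found before the like-named field in the enclosing type scope.

```java
class Pebble {
    static String where = "type scope";

    static String locate() {
        String where = "block scope";  // searched first, working outward
        return where;                  // refers to the local variable
    }

    public static void main(String[] args) {
        System.out.println(locate());      // prints "block scope"
        System.out.println(Pebble.where);  // qualified access to the field
    }
}
```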
All entities scope to one of these lexical constructs. The complete list of enti-
ties in the Java programming language are repeated here for your convenience.
13. I prefer the term type scope over class scope. Otherwise, you must say that interface con-
stants are in scope in their own variable initializers as well as in the variable initializers of other
interface constants declared textually after them in a given interface type because they have what?
Class scope? No, the answer is because interface constants have type scope.
Table 2.1 Lexical Scoping

Entity                                            Lexical Construct (Scope)
Packages                                          Host system scope
Package members                                   Package scope
Imported types                                    Compilation unit scope
Fields, methods, and member types                 Type scope
Local classes, local variables, and parameters    Block scope
Assuming the declarations are not shadowed, this table can be stated in plain English (including a few
more details) as follows.
• Observable package names can be used anywhere.
• The simple name of a package member can be used anywhere in any of the
compilation units that belong to that package.
• The simple name of an imported type can be used anywhere in the compila-
tion unit in which the import declaration appears. Although member types
only have type scope, they can be imported. The effect of doing so is to
Though this specification does not explicitly exclude package and import
declarations, 7.5 Import Declarations does:
16. Gosling et al., §6.3, “Scope of a Declaration.” This section says: “The scoping rules for various
constructs are given in the sections that describe those constructs. For convenience, the rules are
repeated here,” but I cannot find the scope of a class type discussed anywhere else in the JLS.
What is the point in this? The entity named in a package statement is a fully
qualified package name. While single-type-import declarations do name individual
class or interface types, the JLS is clear that a fully qualified name (or, more
precisely, the canonical name) is required:
A single-type-import declaration imports a single named type, by men-
tioning its canonical name.18
I prefer lexical construct.20 This use of the term scope (as a noun) is analogous
to namespace. This gives rise to the term enclosing scope as in the following
quote from the Inner Classes Specification:
The code of an inner class can use simple names from enclosing
scopes, including both class and instance members of enclosing
classes, and local variables of enclosing blocks.21
The term enclosing scope is useful because of inner classes. However, soft-
ware engineers and technical writers freely combine any number of adjectives
with the term scope. Here are some examples I have found in other Java books:
• Current scope
• Outer scope
• Lower scope
• Temporary scope
• Intervening scope
Contrast this with the following specifications from the Second Edition of the JLS
(which superseded the Inner Classes Specification).
The scope of a declaration of a member m declared in or inherited by a
class type C is the entire body of C, including any nested type declara-
tions.23 [from JLS 8.1.5 Class Body and Member Declarations]
22. John Rose, author of the Inner Classes Specification, in a personal email dated 7 February
1998.
23. Gosling et al., §8.1.5, “Class Body and Member Declarations.”
In both cases, the blanket statement “including any nested type declarations” is
incorrect. For example,
class C {
    String m = "Where oh where has my little String gone? " +
               "Where oh where can it be?";
    static class ScopeHole {
        void print() {
            System.out.println(m); // COMPILER ERROR
        }
    }
}
The member m disappeared down the mysterious scope hole. The JLS does
address the scope hole in the following specification.
It is a compile-time error if a static class contains a usage of a non-
static member of an enclosing class.25
If you do not already understand why, then you need to read Volume 1 of Java
Rules. As Rose said above, the scope hole is “an unavoidable implication of the
meaning of ‘static,’” which is the subject of Chapter 3 in Volume 1. More gener-
ally, however, you need to fully understand the difference between top-level
classes and inner classes, which is discussed in Chapter 2 of the same volume.
There is no scope hole in inner classes, which are by definition non-static.
As stated in the Inner Classes Specification:
See 3.6 Multiple Current Instances (a.k.a. Levels) in Volume 1 for a closely
related discussion.
class Test {
    public static void main(String[] args) {
        System.out.println(new Test().new NestedType().s);
    }
    class ScopeTest { }        // declaration is an enclosing scope
    class NestedType {
        class ScopeTest { }    // declared member
        String s = new ScopeTest().getClass().getName();
    }
}
class Test {
    public static void main(String[] args) {
        new EnclosingClass.NestedType().print();
    }
    String scopeTest() { return "inheritance"; }
}

class EnclosingClass {
    static String scopeTest() { return "enclosing scope"; }
    static class NestedType extends Test {
        void print() {
            System.out.println(scopeTest());
        }
    }
}
Executing this program prints inheritance. What makes this example espe-
cially interesting is that the instance method inherited by NestedType is shad-
owing a static method in Test. In a containment hierarchy, an instance
This specification was more or less taken directly from the original Inner Classes
Specification. It only mentions class types, however. Consequently, it was only
included in Chapter 8, “Classes,” of the JLS, but the exact same specification
should be included in comparable sections of Chapter 9, “Interfaces.” For example,
class Test {
    public static void main(String[] args) {
        System.out.println(
            EnclosingScope.NestedSubInterface.PRECEDENCE_TEST);
    }
}
27. Gosling et al., §8.1.5, “Class Body and Member Declarations.” In the Inner Classes Specifica-
tion , the same specification went on to say: “Additionally, unless the [shadowed] definition is a pack-
age member, the simple name m is illegal; the programmer must write C.this.m.” The language
designers were concerned about ambiguity. As stated elsewhere in the Inner Classes Specifica-
tion : “Sometimes the combination of inheritance and lexical scoping can be confusing. For exam-
ple, if the class E inherited a field named array from Enumeration, the field would [shadow] the
parameter of the same name in the enclosing scope. To prevent ambiguity in such cases, Java 1.1
allows inherited names to [shadow] ones defined in enclosing block or class scopes, but prohibits
them from being used without explicit qualification.” This was a contradiction in terms, however. You
cannot say the inherited member shadows the like-named entity in an enclosing scope and in the
same breath require the use of a qualified name. This restriction was lifted shortly after the publica-
tion of the Inner Classes Specification.
package com.javarules.examples;

class Test {
    public static void main(String[] args) {
        String name = new Directory().getName();
        System.out.println("directory = " + name);
    }
}
C:\Java\classes\com\javarules\examples>javac Test.java
Test.java:4: cannot access com.javarules.examples.Directory
bad class file:
C:\Java\classes\com\javarules\examples\Directory.class
class file contains wrong class: Directory
Please remove or make sure it appears in the correct subdirectory
of the classpath.
String name = new Directory().getName();
^
1 error
This is how the compiler complains that a class does not have the correct
package statement.
Those Java books that make statements to the effect that “the unnamed
package is automatically imported” or “the unnamed package is always in
scope” are either ignorant of the facts or else they assume that access is from
another member of the unnamed package.
Prior to the 1.4 release, members of the unnamed package could be
imported using a single-type-import declaration. However, the original JLS
included stern warnings against doing so:
Caution must be taken when using unnamed packages. It is possible
for a compilation unit in a named package to import a type from an
These warnings were removed from the Second Edition of the JLS without any
explanation. I can only speculate that this change in the specification was some-
how connected to Bug Id 4361575 in which importing from the unnamed pack-
age was made illegal. The description (copied from the evaluation) of that bug
report begins as follows.
The compiler now correctly scopes import declarations.
Among other effects of this change, the compiler now rejects import
statements that import a type from the unnamed namespace. The com-
piler used to accept such import declarations before, but they were
arguably not allowed by the language (because the type name appear-
ing in the import clause is not in scope). Rather than try to rationalize
the specification with the compiler's behavior, the compiler has been
brought into line with the specification, and the specification is being
clarified to outright say that you can't have a simple name in an import
statement, nor can you import from the unnamed namespace. There
were ample warnings in the language specification warning against
importing names from the unnamed namespace into a named name-
space. Those warnings are no longer necessary, as it is outright illegal.
This is likely to break lots of code, but virtually all of it is example code
rather than production code.
import SimpleName;
29. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification (Reading: Addison-
Wesley, 1996), §7.4.2, “Unnamed Packages.” (Do not update.) There is actually a second para-
graph that was removed (a continuation of the warning) which I am not quoting here.
To fix such problems in your code, move all of the classes from the
unnamed namespace into a named namespace.30
This analysis is all wrong and for a very simple reason. JLS 6.7 Fully Qualified
Names and Canonical Names states very clearly that
The fully qualified name of a top level class or top level interface
that is declared in an unnamed package is the simple name of
the class or interface.
For every package, top level class, top level interface and primi-
tive type, the canonical name is the same as the fully qualified
name.31 [emphasis added]
Now go back and read the description again. The rationale for this change in the
language is that the name of a class or interface type in the unnamed package is
not in scope, but scope is not an issue. The specification is very clear that
canonical names must be used in import declarations and that the simple name of a
class or interface type that is a member of the unnamed package is the canonical
name. This also applies to nested types in the unnamed package because of
the following specification from the same section of the JLS:
A member class or member interface M declared in another class C
has a canonical name if and only if C has a canonical name. In that
case, the canonical name of M consists of the canonical name of C,
followed by ".", followed by the simple name of M.32
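The distinction between the binary name and the canonical name of a nested type can be observed at run time with the Class.getCanonicalName() method (added in J2SE 5.0, after this book was written); the class names below are illustrative:

```java
class Outer {                 // a member of the unnamed package
    static class Inner { }    // canonical name: Outer.Inner

    public static void main(String[] args) {
        // The binary name joins nested types with '$';
        // the canonical name joins them with '.'.
        System.out.println(Outer.Inner.class.getName());           // Outer$Inner
        System.out.println(Outer.Inner.class.getCanonicalName());  // Outer.Inner
    }
}
```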
I am really at a loss as to why Sun would so totally abandon its strict policy of
backwards compatibility on this issue.
It is clear from the comments at the bottom of this and related bugs that a
lot of production code was in fact broken. Moreover, language designers usually
This is blatantly false. Other than removing the warnings against importing
from the unnamed package, the Second Edition of the JLS is totally silent on this
issue. Furthermore, as of this writing (well after this change was implemented)
there is also nothing on the JLS “Maintenance Page” (where you would expect to
find it because this bug was reported after the publication of the Second Edition)
at java.sun.com/docs/books/jls/jls-maintenance.html.
Nevertheless, as late as the 1.4.1_01 release, import declarations that
use either a simple name or the fully qualified name of a nested type in the
unnamed package no longer compile. For example,
package com.javarules.examples;

import Directory;

class Test {
    public static void main(String[] args) {
        String name = new Directory().getName();
        System.out.println("directory = " + name);
    }
}
In the example:
class Point {
    int x, y;
    PointList list;
    Point next;
}

class PointList {
    Point first;
}
class Test {
    public static void main(String[] args) {
        System.out.println(new A().getClass().getName());
    }
}

class A {
    B b = new B();
}

class B {
    A a = new A();
}
The only substantial difference between this example and the one in the JLS is
the addition of instance variable initializers. This example of circular dependen-
cies in type declarations does indeed compile, but attempting to instantiate
either class results in an instance initialization loop (the “hall of mirrors” effect) at
run time.
2.4 Shadowing
All entities are in scope when they are declared. They are not, however, neces-
sarily visible throughout the lexical construct to which they scope (referred to
simply as their scope). Those lexical constructs are shown in Table 2.1 Lexical
Scoping on page 262. Many entities are shadowed in part of their scope by the
declaration of another entity with the same simple name. This complicates the
rule for when simple names can be used.
The simple name of an entity can be used only if that entity is both in
scope and visible.
The term visible means not shadowed. The design of the Java programming
language (or any lexically scoped programming language for that matter) is that
if two or more entities of the same kind and with the same simple name are in
scope, one is visible and the other is shadowed. The simple name will always
refer to the visible entity. In order to reference the shadowed entity, a qualified
name can be used for static members; the this keyword can be used for
non-static members. Some entities simply cannot be referenced while they
are shadowed. Local variables and parameters shadowed by declarations in a
local class are an example of this.
As with hiding, the visible and shadowed entities are always the same kind of
entity because of the syntactic classification of names according to context. More
specifically, both are one of the following.
• Type
• Variable (field, local variable, or parameter)
• Method
This rule applies to type, variable, and method declarations alike (as well as to
labels). The remainder of this section discusses the only three acceptable uses
of shadowing:
• Constructor parameters shadow instance variables
• Types imported using single-type-import declarations shadow members
either declared in other compilation units of the same package or imported
using a type-import-on-demand declaration
• Members of a nested type shadow entities in enclosing classes
These accepted uses of shadowing are listed in the order in which they are most
commonly used.
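The second use can be sketched in a single compilation unit. Both java.util and java.awt declare a type named List, but the single-type-import declaration shadows the type supplied on demand:

```java
import java.util.List;   // single-type-import declaration
import java.awt.*;       // type-import-on-demand (also supplies a List)

class Test {
    // The simple name List refers to java.util.List;
    // java.awt.List is shadowed throughout this compilation unit.
    List names;
}
```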
Java programmers use shadowing all the time in constructors. Constructor
parameters shadow the name of the instance variable to which they are
assigned. This is an important programmer convention that results in much
cleaner and easier to read constructors. For example,
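A reconstruction along the lines the text describes follows; this is a sketch, not the actual java.util.Locale source, and the accessor method is an illustrative addition:

```java
final class Locale {
    private final String language;
    private final String country;
    private final String variant;

    // The parameters shadow the like-named instance variables, so the
    // fields must be qualified with 'this' inside the constructor body.
    Locale(String language, String country, String variant) {
        this.language = language;
        this.country  = country;
        this.variant  = variant;
    }

    String getLanguage() { return language; }
}
```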
This is a simplified constructor from the Locale class. The constructor param-
eters language, country, and variant shadow the like-named fields
throughout the body of the constructor. This particular use of shadowing is
encouraged in the JLS:
…the constructor takes parameters having the same names as the
fields to be initialized. This is simpler than having to invent different
2.5 Obscuring
Declarations can be hidden, shadowed, and obscured. As stated in the JLS
“obscuring is distinct from shadowing and hiding.”38 The difference between
them is subtle. You may recall from the last section that both the visible and
shadowed entity (or entities) are one of the following.
• Type
• Variable (field, local variable, or parameter)
• Method
• Label
At the beginning of the next chapter, 3.2 Hiding makes essentially the same
point. Both entities are always one of the following.
• Fields
• Class methods
• Member types
This is precisely what makes obscuring “distinct from shadowing and hiding.” In
obscuring different kinds of entities are involved. Only package and types
names are obscured. There are three possibilities:
• Variable names that obscure package names
• Type names that obscure package names
• Variable names that obscure type names
The last two possibilities are rare because of Java naming conventions, so this
discussion of obscuring focuses largely on variable names that obscure pack-
age names.
Obscuring is based on 6.5.2 Reclassification of Contextually Ambiguous
Names in the JLS. Basically, the syntactic classification of names according to
context is less than perfect. It is very obvious in some contexts that a name
class Test {
    public static void main(String[] args) {
        String Widget = "Test is a variable name";
        System.out.println(Widget.length());
    }
}

class Widget {
    static int length() {
        return 0;
    }
}
Executing this program prints 23, which is the length of the Widget string. In
both examples, the compiler somewhat arbitrarily decides the ambigu-
ous name is a variable, not a type or package. As stated in the JLS:
A simple name may occur in contexts where it may potentially be inter-
preted as the name of a variable, a type or a package. In these situa-
tions, the rules [§6.5.2 Reclassification of Contextually Ambiguous
Names] specify that a variable will be chosen in preference to a type,
and that a type will be chosen in preference to a package.39
In effect, this means the simple name of the test class and test package
cannot be used anywhere in the scope of the test variable. In these examples,
the type name is obscured (and obscures) because the responsible programmer
is flouting Java naming conventions. That should not happen in practice.
Obscured package names, however, are a different matter. Java naming
conventions do not help because “names of packages intended only for local
use should have a first identifier that begins with a lowercase letter.”40 Thus a
variable name can obscure a package name even when Java naming conven-
tions are followed. For example, nothing can be done to name the shadowed
x.length field in the Widget helper class in the first example because the
test package is doubly obscured. The JLS includes the following two sugges-
tions should the obscuring of package names become a problem.
When package names occur in expressions:
39. Gosling et al., §6.3.2, “Obscured Declarations.” The actual specification is in §6.5.2, “Reclassi-
fication of Contextually Ambiguous Names.”
40. Ibid., §6.8.1, “Package Names.”
41. Ibid.
The older top-level domain names are the ones you need to be concerned about.
They are com, net, org, gov, mil, and edu. You should probably never
declare local variables or parameters to have one of these names. With the pos-
sible exception of biz, the seven newer ones are not likely to be sources of
widely distributed Java packages. They are biz, info, name, pro, aero,
coop, and museum. By the way, Verisign has an excellent white paper on the
new top-level domain names entitled “Journey to the Right of the Dot: ICANN’s
New Web Extensions.”43
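A hedged sketch of why those names are dangerous as variable names: once a local variable named com is in scope, the simple name com is reclassified as a variable name in expressions, obscuring every com.* package (the commented-out line illustrates the failure without breaking compilation).

```java
class Test {
    public static void main(String[] args) {
        String com = "local variable";   // obscures the com.* package names
        // In an expression, the simple name com now denotes the local
        // variable, so a qualified name such as com.javarules.examples.…
        // can no longer be used to access members of that package here.
        System.out.println(com.length()); // prints 14
    }
}
```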
In one of many wonderful usage distinctions Gilad Bracha added to the Second
Edition, packages (and compilation units) are now either observable or not.46
The above specification was changed to read as follows in the Second Edition of
the JLS.
Which compilation units are observable is determined by the host sys-
tem. However, all the compilation units of the package java and its
subpackages lang and io must always be observable. The observ-
ability of a compilation unit influences the observability of its pack-
age.47
45. Gosling et al., §7.4.4, “Access to Members of a Package.” (Do not update.)
46. I cannot agree with the analysis in Bug Id 4420532, which is also available on the (unofficial)
Java Spec Report Web site at www.ergnosis.com/java-spec-report/java-language/
jls-7.4.3.html. Recursively defining the observability of packages based on either the observ-
ability of subpackages or the compilation units that belong to the package, and then defining the
observability of subpackages on the observability of compilation units that belong to the subpack-
age, is both intuitive and non-circular.
47. Gosling et al., §7.3, “Compilation Units.”
48. Gosling et al., §7.4.3, “Observability of a Package.”
The Second Edition of JLS does not actually define the term observable, but nei-
ther did the original JLS define the “conventions of the host system” except indi-
rectly by the suggestion at the bottom of 7.4.4 Access to Members of a
Package above that file systems and databases allow access to all of the compi-
lation units in a package. It basically means that the development directory, JAR,
or Zip file that contains the package is not on the bootstrap or extension class
path and cannot be loaded using a user-defined class loader (including the appli-
cation or applet class loader). A package on some removable media buried under a
bunch of trash at the bottom of a dumpster in an alley somewhere in downtown
Tokyo, Japan is a good example of an “unobservable” package. A development
directory that cannot be read because of file protections would be an even bet-
ter example.
These three means of access are grouped together not only because they look
the same (or rather are “syntactically similar”), but also because of lexical scop-
ing. An entity that is not in scope must be accessed using a qualified name, field
access expression, or method invocation expression. These are the two halves
of a whole. Where one stops, the other begins.
As shown in Figure 2.2, constructors are a fourth means of access. The
failure to mention constructors in the definition of access in the JLS is an obvi-
ous error of omission because access modifiers are used in their declaration.
The fundamental distinction between constructors (instantiating a class) and
qualified access (accessing the implementation of a class) can be seen in the fol-
lowing definition of access control.
Access control applies to qualified access and to the invocation of con-
structors by class instance creation expressions and explicit construc-
tor invocations.52
51. Ibid., §6.6, “Access Control.” As with the definition of access in the JLS, this definition of quali-
fied access should specify that it does not include method invocation expressions “in which the
method is not specified by a simple name.”
52. Gosling et al., The Java Language Specification, Second Edition, §6.6, “Access Control.”
53. For the time being at least, I am ignoring the instantiation of nested types. The rationale for
doing so is that nested top-level classes are not substantially different from package members, and
inner classes are rarely instantiated outside of the class in which they are declared.
Only fully qualified type names include a package name and thus
“access” the members of a package.
import java.util.*;

class Test {
    HashMap map;
}
The JLS definition of qualified access does not specifically cover this
most common means of access. Nevertheless, we know that HashMap is
accessed because it is a member of another package. It is important to dif-
ferentiate between actually “accessing” a package and simply being able
to use the name of a class or interface type. I address this problem by intro-
ducing the term type privileges, which is defined as any use of a type name.
No distinction is made between simple and qualified names.
Type privileges are further divided into class type privileges and interface
type privileges. This is important because interface types cannot be instanti-
ated and all of their members are implicitly public. Therefore interface type
privileges are not relevant to any discussion of access control.
54. The term access specifier used in a number of very popular Java books is in direct conflict with
the consistent use of access modifier in the JLS. Access modifiers are specified in the declaration
of a class, interface, field, constructor, or method, but there is no such thing as an “access speci-
fier.”
55. Gosling et al., The Java Language Specification, Second Edition, Chapter 1, “Introduction.”
56. Ibid., §6.6, “Access Control.”
I just refuse to go along with the status quo that includes a row labeled “Sub-
classes declared in other packages.” Nevertheless the reader should be aware
that protected is more accessible than default access. Some software engi-
neers and technical writers like to say more public rather than more acces-
sible. The meaning is the same. The fact that protected is more accessible
than default access is significant because of the compiler-enforced method con-
tract. Methods declared protected cannot be overridden by default access
methods. The protected access modifier is discussed in detail in the follow-
ing subsection.
57. Ken Arnold, James Gosling, and David Holmes, The Java Programming Language, Third
Edition , (Reading: Addison-Wesley, 2000), 81.
lexical construct (if you are willing to accept my definition of containment and
inner class hierarchies as lexical constructs,59 comparable to packages).
The main thing that is being glossed over in the Second Edition of the JLS is
a very subtle change in the meaning of the private access modifier.
59. Basically, the term containment hierarchy refers to a package member and all of the
nested top-level classes declared in that package member. An inner class hierarchy refers
to all of the inner classes (inner member, local, and anonymous classes) in a given top-level
class. Contrary to the JLS, I define top-level class as any class declared static (the same as
John Rose did in the Inner Classes Specification). That means the class at the top of an inner
class hierarchy may be either a package member or a nested top-level class. See 2.12 Containment
and Inner Class Hierarchies in Volume 1 for a complete definition of these terms. Be advised that I
repeat this footnote a number of different times in Volume 2 because these terms are so unusual.
class Test {
    private String s = "not in scope but accessible";
    static class NestedType {
        void print() {
            System.out.println(s);            // COMPILER ERROR
            System.out.println(new Test().s);
        }
    }
}
In this example, s is not in scope (because of the mysterious scope hole), but it
is accessible using a field access expression in which the primary expression is
a class instance creation expression. This is significantly different from
being able to use the simple name of a private field or method.
My approach to this subject depends heavily on the definition of containment
and inner class hierarchies in Volume 1. I begin by restating the specification for
the accessibility of members and constructors in nested types:
class Test {
    private static String s = "in scope";
    class InnerMemberClass {
        void print() {
            System.out.println(s);
        }
    }
    static class NestedTopLevelClass {
        void print() {
            System.out.println(s);
        }
    }
}
class Test {
    private String s = "in scope";
    class InnerMemberClass {
        void print() {
            System.out.println(s);            // implicitly Test.this.s
        }
    }
    static class NestedTopLevelClass {
        void print() {
            System.out.println(new Test().s);
        }
    }
}
The s field is now an instance variable. Therefore the second rule comes into
play. In the InnerMemberClass, the simple name of s can still be used
because it is implicitly qualified by Test.this. See 1.10.1 The Meaning of a
Simple Field or Method Name for a discussion. There is no enclosing instance of
the Test class in NestedTopLevelClass, however. Thus the second rule
applies and an instance of the Test class must be created before the s
instance variable can be accessed. See 2.3.1 The Mysterious Scope Hole for a
discussion.
These rules are most useful when one nested type accesses the members
of another nested type. For example,
class Test {
    class InnerMemberClass {
        private String s = "accessible";
        void print() {
            System.out.println(new NestedTopLevelClass().s);
        }
    }
    static class NestedTopLevelClass {
        private String s = "accessible";
        void print() {
            InnerMemberClass imc = new Test().new InnerMemberClass();
            System.out.println(imc.s);
        }
    }
}
61. John Rose, “What are top-level classes and inner classes?”
The Second Edition of the JLS does not use the term inaccessible, saying only
that
It is a compile-time error if a local class declaration contains any one of
the following access modifiers: public, protected, private, or
static.63
It is a mistake to characterize local and anonymous classes as inaccessible. The
fact is that other local and anonymous classes in the same block can access the
members of a local class, including members declared private. For example,
class Test {
    public static void main(String[] args) {
        class A {
            private String s = "accessible";
        }
        class B {
            private void print() {
                System.out.println(new A().s);
            }
        }
        new B().print();
    }
}
Access control error messages were not always so precise. This same program
used to generate the following compiler error:
The conceptual difficulty arises from the fact that the meaning of the
protected access modifier depends on the entity modified. It is very much
like the static modifier in this regard. The protected access modifier has
a different meaning for each of the following kinds of entities:
• static fields and methods, and all member types (including inner mem-
ber classes)
• non-static fields and methods
• Constructors
The grouping in the first bulleted item in and of itself presents a conceptual diffi-
culty because inner classes are not normally grouped together with static
members.
Most programmers think protected members are accessible from subclasses
declared in different packages. This is true, but for non-static fields
and methods, as well as for constructors, how the member is accessed
must be taken into consideration. Here is an example of what most program-
mers think of as protected access:
package com.javarules.examples;
public class Superclass {
    protected static void print() {
        System.out.println("Hello World!");
    }
}
import com.javarules.examples.*;
class Test extends Superclass {
    public static void main(String[] args) {
        Superclass superclass = new Superclass();
        superclass.print();
    }
}

package com.javarules.examples;
public class Superclass {
    protected class InnerMemberClass {
        public InnerMemberClass() { }
    }
    protected static class NestedTopLevelClass {
        public NestedTopLevelClass() { }
    }
}

import com.javarules.examples.Superclass;
import com.javarules.examples.Superclass.*;
class Test extends Superclass {
    public static void main(String[] args) {
        InnerMemberClass imc = new Superclass().
            new InnerMemberClass();
        NestedTopLevelClass ntlc =
            new Superclass.NestedTopLevelClass();
    }
}
As this example shows, subclasses declared in other packages have type privi-
leges for all of the protected member types declared in a superclass, even if
the protected member type is an inner member class.
The fact that static members and member types declared protected
can be accessed by subclasses declared in other packages irrespective of the
class of the object accessed was actually a bug in the Java programming lan-
guage. It came to the attention of the language designers as a result of Bug Id
4033907. That in turn led to further relaxation of the rules concerning access to
protected static members from subclasses declared in other packages.
Those changes were introduced in the 1.3 release and were subsequently for-
malized in the Second Edition of the JLS. This Bug Id includes comments from
Guy Steele, Gilad Bracha, and other important software engineers at Sun. It is an
historically important bug, one that I hope is never removed from the Bug
Database. My reason for mentioning this change to the Java programming
language is that it shows all the more that the basic rule is inadequate to fully
explain protected access.
The rule for non-static fields and methods is by far the most difficult to
understand. Looking only at the examples involving the protected print()
method in Superclass, you might conclude that subclasses declared in other
import com.javarules.examples.*;
class Test extends Superclass {
    public static void main(String[] args) {
        Test test = new Test();
        test.print();
    }
}

package com.javarules.examples;
public class Superclass {
    protected void print() {
        System.out.println("Hello World!");
    }
}
66. Gosling et al., §6.6.2.1, “Access to a protected Member.” At the beginning of Table 2.2.1
the difference between a qualified name and a field access or method invocation expression is dis-
cussed. Based on that discussion, I am at a loss to explain the first bulleted item in this specifica-
tion. Access to an instance variable is never by means of a qualified name. This specification is
just plain wrong to suggest otherwise. I am tempted to address this in the main body of the text, but
to do so would only serve to distract someone trying to understand the protected access modi-
fier. Later on when I say “the JLS defines this in terms of the primary expression general form” I am
deliberately ignoring the first bulleted item.
import com.javarules.examples.*;
class Test extends Superclass {
    public static void main(String[] args) {
        Superclass superclass = new Superclass();
        superclass.print();
    }
}
In the example that did compile, test was an instance of the subclass. Why is
the print() method inaccessible using superclass as a reference? This is
the same as asking: What is being protected?
The superclass itself does not need protecting (because protected members
are in fact accessible from subclasses declared in other packages), but
instances of the superclass do. Those subclasses cannot change the value of a
protected instance variable in a superclass object. Nor can they invoke a
protected instance method (as in this example) if the target object is an
instance of the superclass in which the protected instance method is
declared. This should shed some light on what the JLS means by “code that is
responsible for the implementation of an object.”65
An entirely different approach to this subject is possible. It requires first
exploring the idea that protected instance variables and instance meth-
ods (and protected constructors for that matter) are at once both
accessible and inaccessible. Access control is normally discussed strictly in
terms of the member accessed: public members that are not in scope are
always accessible (assuming the package in which they are declared is observ-
able); a default access member is accessible from class or interface types
declared in the same package. For example,
package com.javarules.examples;
public class Superclass {
    protected void print() {
        System.out.println("Hello World!");
    }
}

import com.javarules.examples.Superclass;
public class Test extends Superclass {
    public static void main(String[] args) {
        new Test().test();
    }
    void test() {
        print();                  //OKAY
        new Superclass().print(); //COMPILER ERROR
    }
}
67. Indeed, there is a section in the JLS entitled “Accessing Superclass Members using super”
which includes examples of accessing hidden instance variables. This is not easily understood
because whenever the super.fieldName or super.methodName general forms are used,
this is implicitly used as the target reference. How can this be used to access hidden instance
variables or invoke overridden instance methods that exist only in the superclass method dispatch
table? The answer is very different for fields versus methods. Hidden fields are not inherited by sub-
classes, but they are nevertheless included in the “list of instance variables” that comprise an
instance of the subclass in memory. See 4.11 The Object Class in Volume 1 for a discussion of
how objects are implemented. In the case of the super.methodName general form, the answer
to this question requires an understanding of invocation modes and is discussed in 1.11.3 Overrid-
ing and Dynamic Method Lookup. See also 3.4.1.1 this is Polymorphic, super is Not in Volume 1
for a discussion of how the super keyword is implemented. Based on this explanation of
protected instance variables and instance methods being at once both accessible and inacces-
sible, the analysis in Bug Id 4493343 (and not the JLS) is flawed.
This program does not compile because of the second print() method invo-
cation. The only difference is that this is implicitly used as the target reference
in the first method invocation whereas an instance of the superclass is used in
the second. Thus a different mindset is required. You must think in terms of
the class of the object referenced, not the member accessed.
While this analysis of protected instance variables and instance methods
being at once both accessible and inaccessible is technically correct, it suffers
from the same problem as the “code that is responsible for the implementation
of an object”65 explanation of protected access in the JLS in that it is too
complicated. This leads to an unfortunate disconnect between how this subject
is presented by technical writers and what most programmers actually think.
Truth be told, application programmers more or less think of protected
instance variables and instance methods as having default access. In that sense,
the primary use of the protected access modifier is not access control.
package com.javarules.examples;
public class Superclass {
    protected Superclass(String s) {
        System.out.println(s);
    }
}

package dummy;
import com.javarules.examples.Superclass;
public class Test extends Superclass {
    Test(String s) {
        super(s);
    }
    public static void main(String[] args) {
        new Test("protected constructor");       //OKAY
        new Superclass("protected constructor"); //COMPILER ERROR
    }
}
68. Joshua Bloch, Effective Java Programming Language Guide, (Boston: Addison-Wesley,
2001), “Item 15: Design and document for inheritance or else prohibit it.”
69. Ken Arnold and James Gosling, The Java Programming Language, Second Edition (Read-
ing: Addison-Wesley, 1998), §3.11, “Designing a Class to Be Extended.”
package dummy;
import com.javarules.examples.Superclass;
public class Test {
    public static void main(String[] args) {
        new Superclass("protected constructor") { };
    }
}
Notice that Test no longer extends Superclass, and these classes are now
in different packages. The Test program compiles and when executed prints
protected constructor. How is this possible, you ask? Well, actually the
protected superclass constructor is being invoked from within the body of the
anonymous subclass. In that sense, this is not really an exception to the rule at
all. It is just a subclass that has neither a package statement nor a class header
to make it obvious that the access occurs from within the body of a subclass.
This section has defined the meaning of the protected access modifier
in great detail. 3.9 Designing Extensible Classes discusses how protected
fields, methods, constructors, and member types are actually used. If you want
to fully develop your understanding of protected access, you should read
that section now.
NOTE 2.1
The following section was moved from Volume 1 and corrected as per
the following comment at the bottom of Bug Id 4116802, appropriately
named “Can a nested class access protected fields inherited by an en-
closing class?”
class Test {
    private String s = "this is a private instance variable";
    public static void main(String[] args) {
        new Test().new InnerMemberClass().print();
    }
    class InnerMemberClass {
        void print() { System.out.println(s); }
    }
}
class Test {
    private String s = "this is a private instance variable";
    public static void main(String[] args) {
        new Test().new Test$InnerMemberClass().print();
    }
    String access$000() { //COMPILER-GENERATED ACCESSOR METHOD
        return s;         //DEFEATS NORMAL ACCESS CONTROL
    }
}
The link variable is to an inner class hierarchy what the extends clause is to
an inheritance hierarchy. It literally “links” an inner class to an instance of the
innermost enclosing class. That link is used to invoke compiler-generated
accessor and mutator methods that have access to the private mem-
bers of the enclosing class (thus defeating the normal access control
mechanism in a JVM). Similar transformations are required for nested top-
level classes, nested interfaces, and orphaned local and anonymous classes
that access the private static members of an enclosing type, only there is
no link variable.
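The transformation described above can be mimicked by hand. The following is a sketch of the idea, not what the compiler actually emits; all names are mine, with this$0 standing in for the link variable and sAccessor for the compiler-generated access$000 method.

```java
// Hand-written analogue of the inner class transformation: the link
// variable references the enclosing instance, and a package-access
// accessor method exposes the private member to the "inner" class.
class Outer {
    private String s = "private member of the enclosing class";
    String sAccessor() {        // plays the role of the synthetic access$000
        return s;
    }
}

class HoistedInner {
    private final Outer this$0; // the link variable
    HoistedInner(Outer outer) {
        this$0 = outer;
    }
    String readS() {
        return this$0.sAccessor();
    }
}
```

Because sAccessor has package access, a JVM's normal access control mechanism never sees an access to the private field from outside Outer.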
70. Rose, Inner Classes Specification, “How do inner classes affect the idea of this in Java code?”
This specification influences Figure 2.4, which is repeated from above. In this
high-level view of access control, class type privileges appear to control both
instantiation and access to the implementation of an object, but is this
really true? The analogy I like to use is safety deposit boxes in a bank. Before
you can open a safety deposit box you have to get through the door of the bank
vault. Is the access modifier of a class type like the door on the bank vault?
package com.javarules.examples;
public class Superclass {
    protected class NestedSuperClass {
        //uses default constructor
    }
}

import com.javarules.examples.Superclass;
public class Test extends Superclass {
    public static void main(String[] args) {
        Superclass superclass = new Superclass();
        NestedSuperClass nested = superclass.new
            NestedSuperClass();
    }
}
import com.javarules.examples.Superclass;
public class Test extends Superclass {
    class NestedSubClass extends NestedSuperClass { }
    public static void main(String[] args) {
        NestedSubClass nested = new Test().new NestedSubClass();
    }
}
class SortAlgorithm {
    protected boolean stopRequested = false;
    public void setParent(SortItem p) { … }
    protected void pause() throws Exception { … }
    protected void pause(int H1) throws Exception { … }
    protected void pause(int H1, int H2) throws Exception { … }
    public void stop() { … }
    public void init() { … }
    void sort(int a[]) throws Exception { }
}
The ellipses indicate method implementations that have been deliberately omit-
ted. There are several details to notice about the sort(int a[]) method.
• It is package-private
• There is no default implementation. The sort(int a[]) method is coded
exactly as you see it here
• Dr. Gosling’s unusual parameter specifier style (please do not emulate him
in this case)
All of the SortAlgorithm class methods invoked by the applet are declared
public except for the sort(int a[]) method. Why declare them public?
And why treat the sort(int a[]) method differently? The answers to these
class Test {
    public static void main(String[] args) {
        Widget widget = new Widget();
        new Widget().print(widget);
    }
}

class Widget {
    private String s = "this is the current instance";
    public void print(Widget widget) {
        widget.s = "this is another object of the same class";
        System.out.println(s);
        System.out.println(widget.s);
    }
}
2.9 Encapsulation
This chapter has discussed two fundamentally different means of access, instan-
tiation and accessing the implementation of an object. When I think of encapsula-
tion, I think of the latter. The essential meaning of encapsulation is that the
instance variables in which the state of an object is stored are declared
private. That is why the JLS states that “well-designed classes have very few
public or protected fields, except for fields that are constants (final
static fields).”73 The JLS would be more correct to say well-designed
classes have very few non-private instance variables.
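A minimal sketch of the point (the Account class is my own example, not from this chapter): when state lives in a private field, only the declaring class's own code can be responsible for corrupting it.

```java
// Encapsulation in miniature: balance is private, so the deposit()
// method below is the only statement in the system that mutates it,
// and it can enforce its own invariants.
class Account {
    private long balance;   // encapsulated state
    long balance() {
        return balance;
    }
    void deposit(long amount) {
        if (amount < 0) {
            throw new IllegalArgumentException("negative deposit");
        }
        balance += amount;  // the only code that mutates balance
    }
}
```

If balance were declared with default or public access, debugging a corrupted balance would mean auditing every class with access to it.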
If someone has access to a field, it means not only that they can read the
value stored in that field, but they can also change the value. Therefore, any
code that has access to a field is potentially responsible for corrupting the data
stored in that field. If access to instance variables is uncontrolled (read
non-private), the first step in debugging is to determine all of the classes in a
system that have access to the corrupted data. In large applications that might be
72. Martin Fowler, UML Distilled: Applying the Standard Object Modeling Language, (Read-
ing: Addison-Wesley, 1997), 101.
73. Gosling et al., §6.8.4, “Field Names.”
Chapter Contents
3.1 Introduction 329
3.2 Hiding 332
3.3 The Definition of Baseclass 340
3.4 The Definition of Related Classes 341
3.5 Generalization in Inheritance Hierarchies 342
3.6 Inheritance 344
3.6.1 Interface Inheritance 345
3.6.2 Implementation Inheritance 347
3.6.3 Inheriting Overloaded Methods 362
3.7 Do Interfaces Extend the Object Class? 362
3.8 Inheriting Members With The Same Name 367
3.8.1 Re-Inheritance 367
3.8.2 Ambiguous Names Related to Inheritance 368
3.8.3 Inheriting Methods With the Same Signature 370
3.9 Designing Extensible Classes 374
3.10 Capping a Class Hierarchy 387
3.1 Introduction
The meaning of hiding as defined in this chapter is consistent with changes
Gilad Bracha made in the Second Edition of the JLS. Specifically, hiding is not
the same as shadowing or obscuring as defined in the last chapter. Only mem-
bers are hidden, and they are always hidden by subclass members. The signifi-
cance of hiding a superclass member is that the hidden member is not inherited.
Hence, hiding is discussed before inheritance.
Inheritance, hiding, and overriding are implemented at the machine level. If you could watch a method
dispatch table being loaded, you could actually “see” how inheritance, hiding,
and overriding work: inaccessible superclass methods are not loaded into the
method dispatch table of a subclass, and therefore cannot be executed; hiding
and overriding methods overlay the entry in a method dispatch table in which the
hidden or overridden superclass method was previously loaded; etc.
In 3.4 The this and super Keywords in Volume 1, I said that understand-
ing how the this mechanism works “is like pulling the curtain back on the Wiz-
ard of Oz when it comes to understanding how object-oriented programming
languages work.” The same could be said for understanding how method dis-
patch tables are loaded. Once you know that inheritance is a factor of how
method dispatch tables are loaded, there is little more that even Dr. Gosling (the
Wizard at Sun) could say to make inheritance, hiding, and overriding any clearer
or easier to understand.
class Test {
    public static void main(String[] args) {
        Subclass sub = new Subclass();
        sub.print();
        System.out.println(sub.s);
        System.out.println(((Superclass)sub).s);
    }
}

class Superclass {
    static String s = "superclass";
    int x = 0;
}
This example shows that an instance variable can hide a class variable, and that
a double can hide an int. These are not gratuitous examples. It is difficult to
imagine why someone would hide just the value of a superclass field by declar-
ing a subclass field with the same field modifiers, type, and identifier. In fact, it is
difficult to imagine any reason for hiding a field. In order for a field to be hidden, it
must first be accessible, so what we are talking about here is non-private
fields. Furthermore, the field would have to be non-final. If you understand
protected access, hiding a protected field makes no sense whatsoever.
A public, non-final field is a rare bird indeed (because of the need for
encapsulation). That leaves only default access. The remainder of this section
will take a close look at the problem with hiding default access fields before
explaining why hiding is allowed at all.
When discussing the hiding of instance variables, it is important to under-
stand that field access expressions are not polymorphic. In the primary-
Expression.fieldName general form, the type of the variable or other pri-
mary expression is always used. As stated in the JLS:
Note, specifically, that only the type of the Primary expression, not
the class of the actual object referred to at run time, is used in deter-
mining which field to use.2
Thus assigning a subclass object to a superclass field does not change the
result of a field access expression. Here is a simplified version of the example in
the JLS:
class Test {
    public static void main(String[] args) {
        T t = new T();
        S s = new S();
        System.out.println("t.x = " + t.x);
        System.out.println("s.x = " + s.x);
        s = t;
        System.out.println("s.x = " + s.x);
    }
}
class S { int x = 0; }
class T extends S { int x = 1; }
2. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, The Java Language Specification,
Second Edition (Boston: Addison-Wesley, 2000), §15.11.1, “Field Access Using a Primary.”
t.x = 1
s.x = 0
s.x = 0
The assignment s = t does not change the fact that the type of s is S. Hence
the value accessed is still zero.
I am using the examples in the JLS for a very deliberate reason. The JLS
goes on to make the point that “the power of late binding and overriding is avail-
able in, [sic] but only when instance methods are used.”3 Here again is a simpli-
fied version of the next example in the JLS:
class Test {
    public static void main(String[] args) {
        T t = new T();
        S s = new S();
        System.out.println("t.z() = " + t.z());
        System.out.println("s.z() = " + s.z());
        s = t;
        System.out.println("s.z() = " + s.z());
    }
}
class S { int x = 0; int z() { return x; } }
class T extends S { int x = 1; int z() { return x; } }
t.z() = 1
s.z() = 0
s.z() = 1
3. Gosling et al., §15.11.1, “Field Access Using a Primary.” Gilad Bracha cut a bunch of “in Java”
when working on the Second Edition. Here the “in” didn’t get cut.
class Test {
    public static void main(String[] args) {
        Subclass sub = new Subclass();
        System.out.println(((Superclass)sub).getS());
        System.out.println(sub.getS());
    }
}

class Superclass {
    String s = "superclass";
    String getS() {
        return s;
    }
}

class Subclass extends Superclass {
    String s = "subclass";
    //String getS() {
    //    return s;
    //}
}
superclass
superclass
subclass
subclass
This should give the programmer some idea about what the authors think about
the practice of shadowing fields in a method. Indeed, the JLS goes on to say, “it
is considered poor style to have local variables with the same names as fields.”5
What it does not say is that the designers of the Java programming language
wanted to make hiding of superclass fields a compiler error. You can see that in
the following quote from The Java Programming Language (which Dr. Gos-
ling coauthored).
…where fields are concerned, it is hard to think of cases in which hid-
ing them is a useful feature.
You need to appreciate that if these are not Dr. Gosling’s own words, then they
at least come with his stamp of approval. Therefore we conclude the following.
The binary compatibility problem that would exist in superclasses if hiding were
not allowed can still be seen in accessor methods. Suppose the following Sub-
class were a “pre-existing binary.”
Look what happens if Superclass adds an accessor method with the same
signature:
class Superclass {
    private int x = 0;
    public int getX() {
        return x;
    }
}
6. Ken Arnold, James Gosling, and David Holmes, The Java Programming Language, Third
Edition (Boston, Addison-Wesley Professional, 2000), §3.3.3, “Accessing Inherited Members.” This
quote has been part of the book since the First Edition, which means it was certainly at least care-
fully read by Dr. Gosling.
The next attempt to compile Subclass generates the following compiler error:
I realize these are instance methods. The point is to show why hiding must be
allowed so as to not break subclasses.
Hidden fields and class methods are members of the direct superclass. Thus
the super.fieldName and super.methodName general forms are said
to access hidden superclass members. It is interesting to note, however, that if
a superclass field or class method is not hidden, then the super.fieldName
and super.methodName general forms are the same as this.field-
Name and this.methodName. For example,
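A minimal sketch of this equivalence (the class names are mine): when nothing in the subclass hides or overrides the superclass member, the super and this forms resolve to the same member.

```java
class Sup {
    String s = "superclass";
    String name() { return "Sup.name"; }
}

class Sub extends Sup {
    // Neither s nor name() is redeclared here, so nothing is hidden
    // or overridden, and the two forms access the same members.
    String viaSuper() { return super.s + "/" + super.name(); }
    String viaThis()  { return this.s + "/" + this.name(); }
}
```

Both methods return the same string; the distinction only matters once Sub declares its own s or name().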
Classes in different class hierarchies are obviously not related. As this example
shows, however, classes in the same class hierarchy can also be unrelated. This
makes the term related classes somewhat counterintuitive because in a compa-
rable family tree all of the family members would be related. The terms related
and unrelated classes are particularly important when discussing type conver-
sion. For example, the only permitted conversions between class types are
those between related classes, which means that the classes are necessarily
part of the same class hierarchy.
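The distinction can be sketched with a toy hierarchy of my own:

```java
// Dog and Animal are related (one is a subclass of the other), so
// casts between them are permitted. Dog and Cat are unrelated even
// though they share the Animal superclass, so a cast between them
// is rejected at compile time.
class Animal { String kind() { return "animal"; } }
class Dog extends Animal { String kind() { return "dog"; } }
class Cat extends Animal { }

class RelatedClasses {
    static String demo() {
        Animal a = new Dog(); // widening conversion, always legal
        Dog d = (Dog) a;      // narrowing cast between related classes
        // Cat c = (Cat) d;   // unrelated classes: compiler error
        return d.kind();
    }
}
```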
superclass, superinterface: supertype, parent, ancestor, base classa
subclass, subinterface: subtype, child, descendant, derived class
a. As defined in this book, baseclass is not a synonym for superclass. Baseclasses are always a direct subclass of Object. It is also spelled as one word.
• I is a direct superinterface of C.
• C has some direct superinterface J for which I is a super-
interface…
• I is a superinterface of the direct superclass of C.
A class is said to implement all its superinterfaces.7
More general and more specific are the preferred terms with which to discuss
the relationships between supertypes and subtypes in a class or interface hierar-
chy. However, there are alternative terms for generalization:
• abstraction
• derivation
• specialization
• restriction
Any class is an abstraction. Derivation, specialization, and restriction emphasize
subtypes: subclasses are “derived” from their superclass; a subclass is “a spe-
cial kind of” its direct superclass; and subclasses represent a “restriction” of
their superclass behavior.
3.6 Inheritance
A class inherits from the direct superclass either explicitly named in the
extends clause of a class header or implicitly from the Object class, as well
as from any direct superinterfaces named in the implements clause. An inter-
face inherits from any direct superinterfaces named in the extends clause of
an interface header.
class Test {
    public static void main(String[] args) {
        Subclass sub = new Subclass();
        if (sub instanceof Object &&
            sub instanceof Superclass &&
            sub instanceof Subclass &&
            sub instanceof Multiple &&
            sub instanceof Inheritance &&
            sub instanceof Superinterface &&
            sub instanceof Subinterface)
            System.out.println("this is an example of " +
                               "interface inheritance");
    }
}

interface Multiple {}
interface Inheritance {}
interface Superinterface extends Multiple, Inheritance {}
interface Subinterface extends Superinterface {}
class Superclass implements Subinterface {}
class Subclass extends Superclass {}
This compiler error is the result of a bug fix. There is a very funny comment in
the evaluation of the primary Bug Id:
There is an unfortunate specification vaccuum [sic] in this area -- inter-
actions of inner classes and inheritance range from purely nonsensical
to merely pathological. We will likely draw the line to favor simplicity
and avoidance of pathologies as opposed to assigning meaning to
every case possible.9
Actually, the range was a little wider than is described here. Jon Steelman
exploited this language feature in his “Java Tip 50: On the Parameter Constants
pattern” in JavaWorld.10 This article was an important contribution to the evolu-
tion of enumerated types in the Java programming language. The interface type
can always be imported. The only significant difference is that subclasses may
have to either import or implement the same interface because the members will
not be inherited.
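The constant-interface idea the article describes can be sketched roughly as follows (the names are mine, not from the article):

```java
// Fields declared in an interface are implicitly public static final.
// A class that implements the interface can refer to them by simple
// name; other code must qualify them with the interface name.
interface Direction {
    int NORTH = 0;
    int SOUTH = 1;
}

class Walker implements Direction {
    int facing = NORTH; // simple name works because Walker implements Direction
}
```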
Now all we have to say is that hidden fields are not inherited, just like any other
kind of entity.
In 3.2 Hiding, I said that hidden instance variables are considered members
of the direct superclass. This point is worth elaborating on in order to clarify the
class Test {
    public static void main(String[] args) {
        Subclass subclass = new Subclass();
        System.out.println(subclass.s);
        System.out.println(((Superclass)subclass).s);
    }
}

class Superclass {
    String s = "superclass";
}

class Subclass extends Superclass {
    String s = "subclass";
}
subclass
superclass
The Subclass only has one member, yet the value of two different fields is
printed using the same subclass object. One of those fields is a member of
Superclass. All hidden fields or class methods are members of the
direct superclass. This becomes a very significant point when discussing the
meaning of the protected access modifier.
There is nothing remarkable about the fact that overridden instance methods
are not inherited. For example,
class Test {
    public static void main(String[] args) {
        Subclass sub = new Subclass();
        System.out.println(sub.a());
    }
}
class Test {
    public static void main(String[] args) {
        Subclass sub = new Subclass();
        Subclass.Top top = new Subclass.Top();
        Subclass.Inner inner = sub.new Inner();
        System.out.println(sub.a);
        System.out.println(sub.b);
        System.out.println(sub.c());
        System.out.println(sub.d());
        System.out.println(sub.e);
        System.out.println(top.getS());
        System.out.println(inner.getS());
    }
}

class Superclass {
    static String a = "Subclasses inherit class variables";
    String b = "Subclasses inherit instance variables";
    static String c() {
        return "Subclasses inherit class methods";
    }
    String d() {
        return "Subclasses inherit instance methods";
Method dispatch tables have always been used in Sun implementations. I think it
is safe to say that all commercial JVM implementations use them. There is only
one such table for each class (which is why they are sometimes referred to as
per-class method dispatch tables). Every method that can be invoked for a
given class of objects can be found in the per-class method dispatch table for
that class, including all of the methods inherited from superclasses. Thus a sin-
gle table lookup is required to resolve symbolic references to methods in that
class. This dramatically improves the overall efficiency of a JVM. Without the use
of per-class method dispatch tables, classes would have to be searched
recursively starting at the class of the target object and working up the class
hierarchy.
12. Gosling et al., §15.12.4.4, “Locate Method to Invoke.” As always, runtime data structures are
implementation defined. This discussion assumes the use of per-class method dispatch tables
(which I am reasonably sure is the case in all Sun implementations).
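The loading process just described can be modeled in a few lines. This is a toy sketch of the idea, not an actual JVM data structure; the names are mine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of per-class method dispatch tables: a class's table is
// its superclass's table copied, then overlaid with entries for the
// methods the class itself declares (hiding/overriding). A single
// lookup in the resulting table resolves any invokable method.
class DispatchTableModel {
    static Map<String, String> load(Map<String, String> superTable,
                                    Map<String, String> declared) {
        Map<String, String> table = new LinkedHashMap<>(superTable); // inherit
        table.putAll(declared); // overlay the slots for redeclared methods
        return table;
    }
}
```

For example, overlaying a subclass's own toString() on an inherited entry leaves a single slot pointing at the subclass implementation, which is exactly the overriding behavior described above.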
package java.lang;
public class Object {
    private static native void registerNatives();
    public final native Class getClass();
    public native int hashCode();
    public boolean equals(Object obj)
    protected native Object clone()
    public String toString()
    public final native void notify();
    public final native void notifyAll();
    public final native void wait(long timeout)
    public final void wait(long timeout, int nanos)
    public final void wait()
    protected void finalize()
}
package java.io;
public class BufferedReader extends Reader {
    private void ensureOpen()
    private void fill()
    public int read()
    private int read1(char[] cbuf, int off, int len)
    public int read(char cbuf[], int off, int len)
    String readLine(boolean skipLF)
    public String readLine()
    public long skip(long n)
    public boolean ready()
    public boolean markSupported()
    public void mark(int readAheadLimit)
    public void reset()
    public void close()
}
package java.io;
public class LineNumberReader extends BufferedReader {
    public void setLineNumber(int lineNumber)
    public int getLineNumber()
    public int read()
    public int read(char cbuf[], int off, int len)
    public String readLine()
    public long skip(long n)
    public void mark(int readAheadLimit)
    public void reset()
}
13. The dynamically created array classes actually use the same method dispatch table as the
Object class (at least in Sun implementations) because they have no methods of their own, only
the length field.
equals(Ljava/lang/Object;)Z public
clone()Ljava/lang/Object; protected
toString()Ljava/lang/String; public
notify()V public final
uses the table. In this case that means only the private methods in Line-
NumberReader are loaded.
Table 3.3 shows the method dispatch table after loading methods from the
abstract Reader baseclass:
finalize()V protected
read()I public
read([C)I public
read([CII)I public
skip(J)J public
ready()Z public
markSupported()Z public
mark(I)V public
reset()V public
close()V public
Reader does not override any of the housekeeping methods in the Object
class, so the method dispatch table is merely extended. Note that abstract
methods are loaded. One of the columns in method dispatch tables that I am not
showing is the abstract method modifier. A JVM will throw
AbstractMethodError if such a method is invoked in a fully loaded method
dispatch table. That cannot happen in the LineNumberReader method
dispatch table because all of the abstract methods loaded from the
abstract Reader baseclass are overlaid (read overridden) when loading
methods from BufferedReader.
toString()Ljava/lang/String; public
notify()V public final
notifyAll()V public final
wait(J)V public final
finalize()V protected
read()I public
read([C)I public
read([CII)I public
skip(J)J public
ready()Z public
markSupported()Z public
mark(I)V public
reset()V public
close()V public
readLine(Z)Ljava/lang/String;
readLine()Ljava/lang/String; public
The rows loaded from BufferedReader are shown in bold. The overloaded
readLine methods added to the end of the table are the only new methods.
All of the other rows in bold override instance methods that were already in
the table.
Finally, methods from the LineNumberReader class are loaded, including
all of the private methods. It just so happens, however, that there are no
private methods in the LineNumberReader class, only public ones
(and all but two of those override methods already in the table). After loading the
methods in LineNumberReader, the method dispatch table includes an entry
for every method that can be invoked given a reference to a LineNumber-
Reader object. Table 3.5 shows the fully loaded LineNumberReader
method dispatch table. The final keyword is interesting in terms of how hiding
class Superclass {
    private String s = "Hello World!";
    void print() {
        System.out.println(s);
    }
}

class Test extends Superclass {
    public static void main(String[] args) {
        new Test().print(); //private members are not inherited
    }
}
This program compiles and when executed prints "Hello World!". The
Test object created therefore must include a field named s even though that
field is not inherited. Even if the superclass method is hidden or overridden (and
therefore not inherited), it can still be invoked in the subclass by using the
super.methodName general form or, in the case of hidden static members,
by casting to the superclass type (a cast does not defeat dynamic method lookup
for overridden instance methods). In either of these scenarios inaccessible and
hidden instance variables must be implemented in the subclass object.
Understanding this is central to your
understanding of how object-oriented programming languages work. See 4.11
The Object Class in Volume 1 for a detailed discussion.
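Here is a minimal sketch (the class and member names are mine) of the two techniques just described. Note that super reaches an overridden instance method, whereas a cast reaches a hidden field; a cast does not defeat dynamic method lookup for overridden instance methods.

```java
class Base {
    String name = "Base field";
    String describe() { return "Base method"; }
}

class Derived extends Base {
    String name = "Derived field";                       //hides Base.name
    String describe() { return "Derived method"; }       //overrides Base.describe()

    String superDescribe() { return super.describe(); }  //super.methodName general form
    String baseName() { return ((Base) this).name; }     //a cast reaches the hidden field
}

class HiddenVsOverridden {
    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.superDescribe());     //prints "Base method"
        System.out.println(d.baseName());          //prints "Base field"
        System.out.println(((Base) d).describe()); //prints "Derived method" (dynamic lookup)
    }
}
```

The last line shows why hidden fields must still be implemented in the subclass object: both Base.name and Derived.name exist in every Derived object.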
This paragraph was changed to read as follows in the Second Edition of the JLS:
The members of an interface are:
This paragraph was not substantially changed in the Second Edition of the JLS:
While every class is an extension of class Object, there is no single
interface of which all interfaces are extensions.18
The following is a similar comment from the closely related Bug Id 4526026.
javac implements the rules using inheritance instead of by explic-
itly inserting the members as per JLS 2 9.2 bullet 3. This subtle and
generally innocuous bug is the only known symptom (well, see also
4479715) of this compiler shortcut. Implementing the JLS2 directly
would require making special cases in many places in the compiler for
very little benefit. We are more likely to fix this by filtering out protected
members that were inherited into an interface.21 [emphasis added]
This is a problem because there are two protected methods in the Object
class: clone() and finalize(). Neither of these should be inherited by
interfaces. Here is an example from Bug Id 4479715:
interface I {
int clone();
}
This should compile (even though the result type is int instead of Object)
because clone() is a protected method. Prior to the 1.4.1 release, how-
ever, it generated the following compiler error:
19. According to a comment made in one of the bugs under discussion this is also true of the
jikes compiler, which would be very surprising because historically the jikes compiler team at
IBM has been a real stickler for implementing the JLS as written.
20. Evaluation of Bug Id 4479715. Bug Id 4526026 covers the same issue for the finalize()
method. Both of these bug reports along with a number of others have been consolidated into Bug
Id 4644627, “interfaces extend Object?” (though this is not immediately obvious in the case of Bug
Id 4479715), which is now the primary Bug Id for this issue.
21. Evaluation of Bug Id 4526026.
As is often the case with examples that break compilers, this declaration is
“completely useless”22 (because any attempt to implement this interface will
generate a similar compiler error). Nevertheless, as Gilad Bracha says in the
same bug report, an interface should have “no knowledge of the protected
members of Object, so there is no constraint on the clone method in an inter-
face.”22 This problem was solved by “filtering out protected members that were
inherited into an interface.”23 As of the 1.4.1 release, this example compiles but
the cat is out of the bag; interfaces do extend the Object class.
If the protected clone() and finalize() methods are filtered out by the
compiler, why does the following program generate an access control compiler
error (in the 1.4.1 release) instead of a cannot resolve symbol error?
import java.util.*;

class Test {
    public static void main(String[] args) {
        Set set = new HashSet();
        set.clone(); //clone() is not inherited by the Set interface
    }
}
22. This is excerpted from an email Gilad Bracha sent to Eric Blake, and that is quoted in the bug
report. I do not know Eric Blake, but he submitted this bug report and many others, apparently
in a jikes-like effort to implement a Java compiler that is as true as possible to the specification.
23. Evaluation of Bug Id 4526026.
Interfaces that override one of the three non-final methods in bold are subject
to the compiler-enforced method contract. It makes no sense for an interface to
override any of these methods (except perhaps to modify the interface con-
tract), however, because classes inherit their default implementations from the
Object class. The methods inherited from the Object class would override
and implement any method with the same signature declared in an interface. For
example,
class Test {
    public static void main(String[] args) {
        AnObject object = new AnObject() { };
    }
}

interface AnObject {
    int hashCode();
    boolean equals(Object obj);
    String toString();
}
This compiles, showing that all three of the abstract interface methods are
implemented by an anonymous class that declares no members.
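A short follow-up sketch (the AnObject2 and InheritedImplementations names are mine) showing that the implementations satisfying these abstract methods really are the ones inherited from the Object class:

```java
interface AnObject2 {
    int hashCode();
    boolean equals(Object obj);
    String toString();
}

class InheritedImplementations {
    public static void main(String[] args) {
        AnObject2 o = new AnObject2() { }; //declares no members
        //Object's default implementations satisfy all three abstract methods
        System.out.println(o.equals(o));                //true (reference identity)
        System.out.println(o.toString().contains("@")); //true (default className@hashCode format)
    }
}
```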
3.8.1 Re-Inheritance
Because of what is described as “(multiple) interface inheritance”24 in the JLS, it
is possible for a class or interface to inherit the same interface member from
more than one interface. This is referred to as re-inheritance. References to a
re-inherited field, method, or member type are never ambiguous. The fol-
lowing example illustrates one of a practically unlimited number of scenarios that
result in the “re-inheritance” of interface members.
interface SuperInterface {
    String ROME = "re-inherited member"; //all roads lead to Rome
}

interface SubInterface extends SuperInterface { }

class Superclass implements SuperInterface { }

class Subclass extends Superclass implements SubInterface { }

class Test extends Subclass implements SubInterface {
    public static void main(String[] args) {
        System.out.println(ROME);
    }
}
Both DEBUG fields are inherited and neither shadows the other (because they
have the same scope). Qualified names must be used to reference either field in
the subclass. Another option is to use the super.fieldName general form to
access the superclass field.
The names of inherited member types can also be ambiguous. This is com-
pletely analogous to ambiguous field names. The difference is that inheriting two
or more member types with the same name is a lot less likely to occur. Never-
theless, here is an example:
class Superclass {
    class Widget { }
}

interface Superinterface {
    class Widget { }
}
It does not matter that one of the inherited member types (the one declared in
Superinterface) is a nested top-level class and the other is an inner mem-
ber class. Nested type ambiguity (like that of fields) is based solely on the
type name. None of the modifiers are taken into consideration.
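A sketch (the names are mine, modeled on the example above) showing how qualified type names resolve the ambiguity; the simple name Widget would not compile in the subclass:

```java
class Super {
    class Widget { }                 //inner member class
}

interface SuperI {
    class Widget { }                 //nested top-level (implicitly static) class
}

class Sub extends Super implements SuperI {
    //Widget w;                      //would not compile: the simple name Widget is ambiguous
    SuperI.Widget a = new SuperI.Widget();     //qualified name disambiguates
    Super.Widget b = new Super().new Widget(); //inner class requires an enclosing instance
}

class QualifiedWidgets {
    public static void main(String[] args) {
        Sub s = new Sub();
        System.out.println(s.a.getClass().getName()); //SuperI$Widget
        System.out.println(s.b.getClass().getName()); //Super$Widget
    }
}
```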
25. Gosling et al., §8.4.6.4, “Inheriting Methods with the Same Signature.”
26. Ibid.
27. Gosling et al., §15.12.2.2, “Choose the Most Specific Method.” (Do not update.)
28. I strongly disagree with the JLS in that the issue of methods with the same signature should not
be addressed in 15.12.2.2 Choose the Most Specific Method which is about overloaded method
matching. Methods with the same signature are not overloaded; they are the same method.
That the JLS is inconsistent as a result can be seen in the above specifications. 8.4.6.4 Inheriting
Methods with the Same Signature in the JLS clearly says that all of the abstract methods are inher-
ited, whereas the addition to 15.12.2.2 Choose the Most Specific Method says that one of them is
“arbitrarily” chosen. Which is it? The answer is clearly the latter, but the JLS could do even better.
What does it mean to say that one of the abstract methods is “arbitrarily chosen”? The choice does
not affect the qualifying type of the method invocation. That is always going to be K in the Perera
example above; so what exactly does it mean to “arbitrarily” choose between two abstract
methods that have the same signature? Whatever the answer, this discussion should be moved to
8.4.6.4 Inheriting Methods with the Same Signature in the JLS. There is yet another problem with
this addition to 15.12.2.2 Choose the Most Specific Method. The discussion of throws clause is a
flawed reiteration of the same discussion in 8.4.4 Method Throws of the JLS. Including it here at
least suggests that it is somehow an exception to the rule. It is not.
interface I {
    void f() throws X; //X and Y are exception classes declared elsewhere
}

interface J {
    void f() throws X, Y;
}

interface K extends I, J {
}
class Test {
    public static void main(String[] args) {
        K k = null;
        k.f(); // ambiguous
    }
}
However it is clear that in this case the method invocation is not ambig-
uous, since in an important sense I.f() and J.f() represent the
“same” (dynamically selected) method.29
Surprisingly, prior to the 1.2 release examples such as this would not compile.
Only methods with the same name but different signatures (i.e., overloaded
methods) can be ambiguous. However, this subject is not discussed until 5.8
Overloaded Method Matching.
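A minimal sketch (the interface names are mine) of the rule just stated: two inherited abstract methods with the same signature are treated as the same method, so a single implementation satisfies both and the invocation is not ambiguous.

```java
interface I2 { void f(); }

interface J2 { void f(); }

interface K2 extends I2, J2 { } //re-inherits the "same" method f()

class SameSignature {
    public static void main(String[] args) {
        K2 k = new K2() {
            public void f() { System.out.println("one implementation for both"); }
        };
        k.f(); //not ambiguous: I2.f() and J2.f() are the same method
    }
}
```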
That methods with the same signature are considered to be the same
method is further supported by the fact that they are subject to the compiler-
enforced method contract as discussed in 1.11.1 The Compiler-Enforced
Method Contract: all of the methods must have the same result type; a super-
29. Roly Perera, quoted with permission from a personal email with the subject line of “JLS2
15.12.2.2 addition” and dated June 6, 2003. Note that the throws clauses in f() are not rele-
vant to the immediate discussion.
The reason you make it visible is that it's often necessary in order to
allow subclasses to do their job, or to do it efficiently. But once you've
done it, you're committed to it. It is now something that you are not
allowed to change, even if you later find a more efficient implementa-
tion that no longer involves the use of a particular field or method.
So all other things being equal, you shouldn't have any protected mem-
bers at all. But that said, if you have too few, then your class may not
be usable as a super class, or at least not as an efficient super class.
Often you find out after the fact. My philosophy is to have as few pro-
tected members as possible when you first write the class. Then try to
• C directly depends on T.
• C directly depends on an interface I that depends on T.
• C directly depends on a class D that depends on T (using this
definition recursively).
It is a compile-time error if a class depends on itself.32
Although the UML definition is much broader and the JLS definition is really
directed at what are best described as “compiler dependencies,” both defini-
tions are applicable to this discussion.
The definition of implementation dependencies is actually much more
precise than either of the above definitions. The protected fields and meth-
ods in a superclass are only the most obvious implementation dependencies.
Those fields and methods are by definition part of the superclass implementation
30. Joshua Bloch in an interview with Bill Venners entitled “A Conversation with Josh Bloch” (First
Published in JavaWorld, January 4, 2002), on the artima.com Web site (Artima Software, Inc.),
www.artima.com/intv/blochP.html.
31. UML Semantics, Glossary.
32. Gosling et al., §8.1.3, “Superclasses and Subclasses.”
33. I must acknowledge Joshua Bloch as the source for this emphasis on documenting overridable
methods when designing a class to be extended. See “Item 15: Design and document for inherit-
ance or else prohibit it” in his book Effective Java (which is having a profound influence on the Java
programming community at large). He did not, however, originate the idea of designing a class to
be extended. That can be traced back to the First Edition of The Java Programming Language in
a section named “Designing a class to be extended.”
Whenever the API docs start talking about a method implementation, take a
moment to confirm for yourself that this is an overridable method in an extensi-
ble class. Then you know what you are reading is actually a note to subclass pro-
grammers and not part of the API specification. Bloch rightly suggests that
API documentation generator tools such as javadoc should be modified
at the earliest possible date to reflect this reality. Classes have three inter-
faces to maintain: the API specification, the so-called protected interface for
subclass programmers, and the serialized form. Why is the protected inter-
face being treated as if it were a second-class citizen? Just as there are separate
pages of automatically generated documentation for the @serial tag, there
should be separate pages of automatically generated documentation directed
at subclass programmers using the @subclass tag.
Designing an extensible class therefore means both deciding which mem-
bers should be declared protected and documenting the implementation of
overridable methods. Failure to do either will make it difficult or impossible to
extend the class (especially if subclass programmers do not have access to the
source code). The “design” of the protected interface is essentially an effort
to minimize implementation dependencies while at the same time supporting
subclasses. Superclass programmers commit to supporting both the
protected members and the documented implementation details of overrid-
able methods. That part of the implementation cannot be changed without
“breaking” subclasses declared in other packages. Doing so is comparable to
changing the interface contract on which client programmers depend. This is
precisely why extensible classes are much more difficult to code than final
import java.util.Date;

class Test {
    private long date;
    public void setDate(Date date) {
        if (date == null)
            this.date = new Date().getTime();
        else
            this.date = date.getTime(); //effectively a defensive copy
    }
}

class Subclass extends Test {
This example does not compile because date is not inherited by the subclass,
effectively making it impossible to override the superclass method. The date
field should be declared protected.
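Here is one way the example might be repaired, as just suggested (the class names and the Subclass2 body are my own illustration):

```java
import java.util.Date;

class Test2 {
    protected long date; //protected so that subclasses can see the field
    public void setDate(Date date) {
        this.date = (date == null) ? new Date().getTime()
                                   : date.getTime(); //effectively a defensive copy
    }
}

class Subclass2 extends Test2 {
    public void setDate(Date date) {
        super.setDate(date);
        System.out.println("stored " + this.date); //the inherited date field is now accessible
    }
}
```

Had date remained private, the reference to this.date in Subclass2 would generate a cannot resolve symbol error.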
The other use of protected instance variables is the field equivalent of
what this section describes as superclass implementation hooks (a very spe-
cial use of protected methods discussed below). The protected instance
variable is declared and accessed in a superclass, but subclasses are responsi-
ble for assigning a value. For example, AbstractList in the java.util
package includes the following declaration.
The primary responsibility for setting this field rests with subclasses, but it is
also set by the setTimeInMillis(long millis) method in the
Calendar superclass in which it is declared.
A private instance variable and a protected set method are some-
times used instead of a protected instance variable. The subclass program-
mer can either invoke the superclass set method or override the same. Here is
an actual example from the PrintStream and PrintWriter classes:
class Test {
public static void main(String[] args) {
new Subclass().doSomething();
Note that computeTime() is one of the most important methods in the entire
Calendar class hierarchy. It is responsible for taking fields such as MONTH,
DATE, and YEAR and converting them into what is essentially a timestamp.
34. Joshua Bloch, Effective Java Programming Language Guide, (Boston: Addison-Wesley,
2001), “Item 15: Design and document for inheritance or else prohibit it.”
35. Ibid.
36. Ibid.
37. Gosling et al., The Java Language Specification, §15.9.5, “Anonymous Class Declarations.”
Any class that has a name can be declared final, but you should not “cap”
local classes because it is an obvious source code optimization that compilers
will do for you.
Capping class hierarchies costs nothing in terms of binary compati-
bility. This is a very important point to grasp. As stated in the JLS:
Changing a class that was declared final to no longer be declared
final does not break compatibility with pre-existing binaries.39
Bloch addressed this in a Bill Venner interview when he said:
My view is you can always add something, but you can't take it away.
Make it final. If somebody really needs to subclass it, they will call you.
Listen to their argument. Find out if it's legitimate. If it is, in the next
release you can take out the final. In terms of binary compatibility, it is
not something that you have to live with forever. You can take some-
thing that was final and make it non-final. If you had no public construc-
tors, you can add a public constructor. Or if you actually labeled the
class final, you can remove the access modifier. If you look at the
binary compatibility chapter of the JLS (Chapter 13), you will see that
you have not hurt binary compatibility by adding extendibility [sic] in a
later release.40
I was happy to have read this statement from a Java luminary after having writ-
ten this section on capping class hierarchies five years earlier.
The benefits of capping a class hierarchy are:
• Method Inlining: Methods in a final class are said to be implicitly
final. They cannot be overridden because the class cannot be extended.
38. John Rose, Inner Classes Specification (Mountain View: Sun Microsystems, 1997), “Can a
nested class be declared final, private, protected, or static?”
39. Gosling et al., §13.4.2, “final Classes.”
40. Joshua Bloch in an interview with Bill Venners entitled “A Conversation with Josh Bloch” on the
artima.com Web site (Artima Software, Inc.), www.artima.com/intv/blochP.html.
41. In a Classic JVM, method inlining is a source code optimization that requires using the -O or
-O:interclass compiler options. Source code optimizations have proven to be problematic,
however. In a newer HotSpot JVM, method inlining is a bytecode optimization performed at run time.
Note that static and private methods can also be inlined for the same reason as final
methods (dynamic method lookup is not required).
42. Human language is in a constant state of flux. Some purists prefer cracker to hacker. Popular
usage, however, dictates that a hacker is a cracker. Furthermore, in some circles cracker has an
entirely different, pejorative meaning (as in “you honky cracker”).
Expressions, Statements,
and Blocks
Chapter Contents
4.1 Introduction 392
4.2 Expressions 393
4.2.1 Primary Expressions 397
4.2.2 Expression Statements and Other Top-Level Expressions 402
4.3 Operator Expressions 405
4.3.1 Numeric Promotion 407
4.3.2 Operator Order of Precedence and Parenthesized Expressions 415
4.3.3 The Associative Property of Operators 428
4.3.4 Nondestructive Operators 430
4.4 Exceptions are Precise 431
4.5 The 38 Unary, Binary, and Ternary Operators 435
4.5.1 Increment and Decrement Operators -- and ++ 436
4.5.2 Negation Operators -, ~, and ! 438
4.5.3 The Elementary School Operators 444
4.5.4 Remainder Operator % 446
4.5.5 Boolean Logical Operators &&, ||, &, |, and ^ 450
4.5.6 Bitwise Operators &, |, ^, >>, >>>, and << 454
4.5.7 Ternary Conditional Operator ?: 465
4.5.8 The Simple and Compound Assignment Operators 470
4.6 The instanceof Type Comparison Operator 473
4.7 A Bitwise Primer 475
4.7.1 Bits 477
4.7.2 Converting Nybbles to Hexadecimal Digits 492
4.7.3 General-Purpose Integer to String Conversion Methods 498
4.7.4 Unsigned Bytes 501
4.7.5 Some Miscellaneous Uses of the Bitwise Operators 523
4.1 Introduction
The bulk of the work done in a Java program is accomplished by evaluating
expressions. They are the workhorses of a programming language. All expres-
sions are either primary expressions or operator expressions. The section
on primary expressions in the JLS begins with the following paragraph:
Primary expressions include most of the simplest kinds of expressions,
from which all others are constructed: literals, class literals, field
accesses, method invocations, and array accesses. A parenthesized
expression is also treated syntactically as a primary expression. 1
“From which all others are constructed” simply means that operator expressions
are composed of a sequence of one or more operators and primary expres-
sions. This is true no matter how complex the operator expression.
As first discussed in 1.8 Operators in Volume 1, there are a total of 40 oper-
ators in the Java programming language:
 1   cast operator
 1   instanceof operator
38   unary, binary, and ternary operators
40   total
1. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, The Java Language Specification, Sec-
ond Edition (Boston: Addison-Wesley, 2000), §15.7, “Primary Expressions.”
4.2 Expressions
The term expression is poorly defined, or left undefined, in most books about
computer programming languages. Those that do attempt a definition usually err
by defining only operator expressions involving unary, binary, and ternary opera-
tors. Primary expressions are excluded by such a definition. Primary expres-
sions include literals, local variables and parameters, the simple name of a field,
the this keyword, as well as the more complex primary expressions such as
field access expressions, method invocation expressions, and class instance
creation expressions.
What then is the essential meaning of the term expression? Is there one def-
inition that can describe everything from a local variable or parameter to a class
instance creation expression or a complex operator expression? How about “any
token or combination of tokens that represents a single value”? Is this an accept-
able definition of the term expression as used in computer languages? Well, no.
This definition does not work because invoking a method that has a result type
of void is also an expression—a “void expression” that denotes nothing.
This is the reason why so few books attempt to define the term expression.
No one definition can suffice because there are actually three very distinct kinds
of expressions. When an expression is evaluated, the result denotes one of the
following:
class Test {
    public static void main(String[] args) {
        final int x = 0;
        x = 10; //compiler error: cannot assign a value to final variable x
        x++;    //compiler error: cannot assign a value to final variable x
    }
}
2. There are lvalues and rvalues. A variable is said to have both an lvalue and an rvalue, whereas
a constant only has an rvalue. The lvalue is the location in which the value is stored. The rvalue (or
read value) is the value stored in that variable. This is a somewhat antiquated terminology that is not
used in either the JLS or the JVMS. Because it is primarily used in reference to assignment state-
ments, lvalue and rvalue are sometimes interpreted to mean the left and right-hand operands.
It is interesting to note that because these are the only operator expressions
that include a primary expression that denotes a variable, they are also the only
way to change the value of a variable.
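The complete list of variable-changing operator expressions is short enough to show in one sketch (the class name is mine):

```java
class ChangingVariables {
    public static void main(String[] args) {
        int v = 1;
        v = 5;    //simple assignment
        v += 2;   //compound assignment
        v++;      //postfix increment (prefix ++v also changes v)
        --v;      //prefix decrement (postfix v-- also changes v)
        System.out.println(v); //prints 7
    }
}
```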
The third kind of expression is the most narrowly defined of all. Invoking a
method that has a result type of void is the only expression in the Java pro-
gramming language that denotes nothing. Such method invocation expressions
can only be invoked as a top-level expression. They are executed for their side
effects. The term top-level expression is defined in 4.2.2 Expression State-
ments and Other Top-Level Expressions.
Note that none of these definitions for the term expression include type
names or labels, neither of which are considered expressions. For example, the
type name in a cast operator or in the right-hand operand of an instanceof
operator denotes a type, not a value. Nor do the labels used in break and
continue statements denote a value. Having said this, it could be argued that
type names and labels are primary expressions that denote a type or label,
respectively. The defining characteristic of a primary expression is that
there are no operators. If some primary expressions denote a variable, why
cannot others denote a type or label? Such a redefinition of the term expres-
sion, however, would be inconsistent with 15.8 Primary Expressions in the JLS.
As an lvalue must denote a non-final variable, an analysis of the other
uses of expressions uncovers similar rules of which you should be aware. For
example, expressions used in control-flow statements must evaluate to the
boolean data type. The six general uses of expressions are summarized in
Table 4.1. This analysis does not include subexpressions or operands. There are
arguably other highly specialized uses of expressions:
b  Variable initializers     These are the initial values for variables.

c  Argument expressions      Argument expressions differ from variable initializers
                             only in that they provide the initial values for method
                             or constructor parameters.

d  Control-flow expressions  All six of the control-flow statements use Boolean
                             expressions enclosed in parentheses. This group also
                             includes the Boolean expression in a conditional
                             operator (?:). Finally, if you understand how the
                             compiler implements assert statements, then the
                             Boolean expressions used in assertions must also be
                             included in this group.

e  Control-transfer          There are four control-transfer statements, but only
   expressions               two of them are included in this group: the return
                             and throw statements. As discussed above, the labels
                             in break and continue statements are not
                             expressions. The commonality of this group of
                             expressions is that they are usually very simple. The
                             type of the expression in a return statement must
                             be assignment compatible with the result type. The
                             type of an expression in a throw statement must be
                             Throwable or a subclass thereof.

f  Dimension and index       Both dimension and index expressions are enclosed in
   expressions               brackets, and must evaluate to integral types other
                             than long. Dimension expressions are used in array
                             creation expressions; index expressions are used in
                             array access expressions. The type of these
                             expressions is promoted to int using unary numeric
                             promotion.
Note that this includes all of the expressions that have long names (consisting of
three or more words):
a = b + c;
class Test {
    static int value = 5;
    public static void main(String[] args) {
        getValue();
        new Test();
    }
    static int getValue() {
        return value;
    }
}
The value returned by the getValue() method or the reference to the newly
created Test object is said to be “quietly discarded.” Most of the time, how-
ever, the value of a primary expression is used as an argument or an oper-
and in some larger context.
Much like operator expressions, the complex primary expressions (the ones
with long names) have an order of evaluation. The term order of evaluation
refers to the order in which subexpressions are evaluated. Understanding the
order of evaluation is important for two reasons. If an expression has a side
effect, then it is particularly important for a programmer to know exactly when
the side effect occurs. The other reason is debugging. A programmer who does
not understand the order of evaluation will find it difficult to debug complex
class Test {
    static char a = 'a', b = 'b';
    public static void main(String[] args) {
        print(a, b, b = 'c');
    }
    static void print(char a, char b, char c) {
        System.out.println("" + a + b + c);
    }
}
Executing this program prints abc because the assignment expression b = 'c'
is not evaluated until after the second argument expression is evaluated.
The fact that dimension expressions in an array creation expression are eval-
uated before an array is actually created is an exception to the general rule that
expressions are evaluated from left-to-right. The dimension expression(s) in an
array creation expression must be evaluated before the array is created because
the value of a dimension expression determines the length of the array to be cre-
ated. The order of evaluation for an array creation expression is as follows.
class Test {
    public static void main(String[] args) {
        int[] array = new int[getArrayDimensions()];
    }
    static int getArrayDimensions() {
        System.out.println("array dimension expressions are " +
                           "evaluated first");
        return Integer.MAX_VALUE; //the subsequent array creation throws an OutOfMemoryError
    }
}
class Test {
    public static void main(String[] args) {
        int x = 1;
        System.out.println((x = 2) + x); //prints 4: operands are evaluated from left to right
    }
}
a = b = c;
Method invocation expressions     The value of a method invocation expression is the
                                  value returned in the body of the method invoked.
                                  The side effect of a method invocation expression
                                  is everything else that the method does.

Class instance and array          The value of a class instance creation expression
creation expressions              or an array creation expression is a reference to
                                  an object. The side effect is the creation of an
                                  object on the heap. Additional side effects occur
                                  if the class has not already been loaded, linked,
                                  and initialized (including the dynamically created
                                  array classes).
This is an important table because any expression that produces a side effect can be
followed by a semicolon and executed as a statement. When executed as state-
ments, these expressions are referred to as expression statements. Here are
some examples of expression statements:

a = b + c;
System.out.println("Goodbye Cruel C++ World");
i++;
new Main();
--array[i];
x = Math.pow(2, a = b + c);
The JLS is wrong to say that “such an expression can be used only as an expres-
sion statement.” It is possible, though somewhat unusual, to invoke such a
method in the ForInit or ForUpdate part of a for statement header. For example,
import java.io.*;

class Test {
    public static void main(String[] args) throws IOException {
        FileReader fr = new FileReader("Test.java");
        for (int c; (c = fr.read()) != -1; System.out.print((char) c))
            continue;
    }
}
It should be clear from this quote that the conditional operator is right-asso-
ciative.
6. Perform Assignment Operations: These are all of the right-associative
binary operators. Scan the expression from right-to-left performing any
assignment operations. Compound assignment operators are executed as if
they were two separate operations with the result type of the first operation
cast to the type of the left-hand variable before the assignment operation is
performed. See 4.5.8 The Simple and Compound Assignment Operators
below for a discussion. Do not forget to increment or decrement the vari-
able in any postfix increment or decrement operations when you are
done manually evaluating the operator expression (presumably on a
blackboard).
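The hidden cast in a compound assignment operator can be observed directly (a minimal sketch; the class name is mine):

```java
class CompoundAssignment {
    public static void main(String[] args) {
        byte b = 100;
        b += 100;        //compiles: equivalent to b = (byte)(b + 100)
        //b = b + 100;   //would not compile: b + 100 has type int
        System.out.println(b); //prints -56 because of the narrowing cast
    }
}
```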
There should be a single value left in the end. That is the value of the operator
expression. In the language of the JLS, every operator expression has a type
and a value. Often the value is referred to as the result. Thus it is natural to use
Executing this program prints the following hex dump of the test.ser file:
AC ED 00 05 73 72 00 04 54 65 73 74 52 F8 93 41 29 2E 78 A9 02 00
02 5A 00 02 62 30 5A 00 02 62 31 78 70 00 01
The last two bytes are the boolean values. As you can see, 00 is false and
01 is true. The hex dump also shows that boolean values are serialized as
bytes. Indeed, the interface contract for the readBoolean() method in the
DataInput interface begins with the following sentence.
Reads one input byte and returns true if that byte is nonzero, false
if that byte is zero.6
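The same one-byte encoding can be observed without serialization by using the writeBoolean(boolean v) method in DataOutputStream (the class name is mine):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class BooleanBytes {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeBoolean(false);
        out.writeBoolean(true);
        byte[] b = bytes.toByteArray();
        //each boolean value is written as a single byte: 0 for false, 1 for true
        System.out.println(b.length + " " + b[0] + " " + b[1]); //prints "2 0 1"
    }
}
```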
VM heap that the boolean data type is also a byte.7 Likewise, implementa-
tions are free to use either arrays of bytes or bits (referred to as packed
boolean arrays in the JVMS8) to implement a boolean[]. The bastore
machine instruction (used to store boolean and byte values into arrays from
the operand stack) includes the following documentation.
If the components of the array are of type boolean, the int value is
truncated to its low order bit then zero-extended to the storage size for
components of boolean arrays used by the implementation.
Thus for at least one machine instruction a boolean value is always a bit.
The boolean data type is the extreme case in this regard, but it is useful in
introducing the concept of computational types. There are only four primitive
data types in the constant pool of a class file: int, long, float, and
double. These same data types are known as computational types in a
7. See Bug Id 4392283 which states that “Booleans are 1 byte fields instead of 4 bytes since the
object packing change was implemented.” Object packing is briefly discussed in a number of
HotSpot VM documents (for example, java.sun.com/products/hotspot/docs/
whitepaper/Java_HSpot_WP_v1.4_802_2.html).
8. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Second Edition,
(Boston: Addison-Wesley, 1999), documentation for the newarray machine instruction.
9. There are actually two other computational types in a JVM: reference and return-
Address.
10. Gosling et al., §5.6, “Numeric Promotions.”
11. James Gosling and Henry McGilton, The Java Language Environment: A White Paper (Mountain
View: Sun Microsystems, 1996), §1.2.3, “Architecture Neutral and Portable.”
Most programmers know about the computational types, but you may not know
how int bytecodes are made to work for byte, short, char, and
boolean values. Here is a very simple example:
class Test {
    public static void main(String[] args) {
        byte b = 99;
        b++;
    }
}
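A javap -c disassembly of the main method looks roughly like the following (a sketch; exact offsets and output format vary by compiler release):

```
 0: bipush 99    // push the operand 99, sign-extended to an int
 2: istore_1     // store it in local variable b
 3: iload_1      // push b
 4: iconst_1     // push the int constant 1
 5: iadd         // int addition
 6: i2b          // narrowing primitive conversion back to byte
 7: istore_1     // store the result in b
 8: return
```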
Unlike boolean values discussed above, the value of a byte can be stored
directly in the bytecodes that implement a method. That is precisely the case
with the bipush machine instruction. The 99 you see in this decompiled code
is actually an operand that immediately follows the machine instruction. The bi
in bipush stands for byte-to-integer push. The value 99 is sign-extended and
pushed onto the operand stack as an integer. Thus bipush is how the byte
12. Gosling et al., §3.11.1, “Types and the Java Virtual Machine.”
This clearly shows that a byte value was promoted to an int. Unary numeric
promotion is not performed on the operand of an increment or decrement oper-
ator, as is suggested by the following quote.
For unary operators (such as ++ or --), the situation is very simple:
operands of the type byte or short are converted to int, and all
other types are left as is.13
This simply is not true. For increment or decrement operations, binary numeric
promotion is performed on the int value of +1 and the declared type of the
variable incremented or decremented. Furthermore, the type of an increment
or decrement operation is the declared type of the variable incremented
or decremented, not the promoted type as is suggested in the above quote.
If that type is a byte or short, then a narrowing primitive conversion is
required to store the result of the increment or decrement operation in the vari-
able denoted. Interestingly, we just decompiled such an example. The i2b
machine instruction discussed above is the narrowing primitive conversion.
If the operands of a binary operator are primitive numeric types, binary
numeric promotion is performed with one notable exception. In shift expres-
13. I lost this citation. Under the circumstances, I’m sure the author will not mind.
H    exclusive OR       ^     Left
I    inclusive OR       |     Left
J    conditional-OR     ||    Left
There appears to be some confusion about the cast operator level of prece-
dence in relation to the unary operators. The JLS does not have an operator
order of precedence table and is silent on this matter. The cast operator is
clearly right-to-left associative as are the unary operators. For example,
class Test {
    public static void main(String[] args) {
        int x = 0;
        System.out.println(++(int)x); //COMPILER ERROR
        System.out.println(--(int)x); //COMPILER ERROR
    }
}
If the cast operator has the same level of precedence as the unary operators
and is also right-associative, then you might expect this program to compile. It
does not, but the reason is precisely because the cast operator does have the
same level of precedence as the other unary operators and is right-asso-
ciative. The order of evaluation is therefore ++((int)x) and --((int)x).
The problem is that ((int)x) evaluates to zero, which is not a variable. Here
are the compiler error messages:
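For each of the two lines, javac's complaint reads roughly as follows (exact wording varies by compiler release):

```
unexpected type
required: variable
found   : value
```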
As discussed in the next section, nothing can come between a prefix increment
or decrement operator and the name of the variable being incremented or decre-
mented. With the exception of this prefix increment and decrement operator
anomaly, however, the cast operator behaves as would be expected when used
with other right-associative operators that have the same level of precedence
(i.e., with other unary operators). For example,
class Test {
public static void main(String[] args) {
-1
-1
0
0
0
0
1
1
1
1
true
true
The larger lesson here is that the cast operator is just another unary opera-
tor. This means parentheses are not required when the expression following a
unary operator includes a cast operator. For example,
This line of code is from a class in the sun package. It also means that the
cast operator never has more than one operand. In the above line of code,
count is cast to an int, not count & 3. The “Code Conventions for the Java
Programming Language” document says:
Casts should be followed by a blank space. Examples:
[end of quote]15
If the cast operator is a unary operator, why treat it differently from the other
unary operators? I would say just the opposite; never use a space after a cast
operator. Doing so is a constant reminder that the cast operator is a unary opera-
tor.
I have elected not to include the instanceof operator in this table,
though it certainly does have an order of precedence, as can be seen in the fol-
lowing program.
class Test {
    public static void main(String[] args) {
        boolean TRUE = true;
        /*
         * Lower than ! (and therefore all unary operators)
         * and the cast operator because removing the
         * parentheses generates a compiler error.
         */
        if (!("Hello World!" instanceof String))
            System.out.println("passed test #1");
        if ((boolean) ("Hello World!" instanceof String))
            System.out.println("passed test #2");
15. Unascribed, “Code Conventions for the Java Programming Language” available online at
java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html, (Mountain View: Sun Micro-
systems, 1995-1999), §10.5.1, “Parentheses.”
passed test #2
passed test #3a
passed test #3b
passed test #4a
passed test #4b
The lesson here is first that parentheses are always required when negating the
value of an instanceof operator expression (or any operator expression for
that matter). It is doubtful that you would ever cast the value of such an expres-
sion (in what would necessarily be a boolean identity conversion), and none of
the other operators above == and != in Table 4.3 Operator Order of Precedence
have boolean operands, so the other lesson is that parentheses are never
required around an instanceof operator expression except to negate it.
The remainder of this section discusses when parentheses are required. The
advice on this subject that can be found in Java books is wildly contradictory.
The reason for this is that some technical writers and software engineers regard
the use of parentheses as explicit precedence. As stated in the JLS:
Java programming language implementations must respect the order
of evaluation as indicated explicitly by parentheses and implicitly by
operator precedence.16
This is precisely why the definition of primary expression in the JLS includes
parenthesized expressions. Primary expressions (and therefore parenthesized
expressions) are always the first thing evaluated in any operator expression. The
more confident a programmer is in his or her knowledge of the normal order of
evaluation (as dictated by the operator order of precedence), however, the
fewer parentheses are required.
Most programmers know that the multiplicative operators have a higher
order of precedence than the additive operators. For example,
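The example itself does not survive in this text; a minimal sketch of the point (variable names a, b, and c follow the surrounding discussion):

```java
class PrecedenceDemo {
    public static void main(String[] args) {
        int a = 1, b = 2, c = 3;
        // Parentheses override the normal order of evaluation.
        System.out.println((a + b) * c); // prints 9
        // Without them, * binds tighter: evaluated as a + (b * c).
        System.out.println(a + b * c);   // prints 7
    }
}
```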
In this case, parentheses are required in order to override the normal order of
evaluation. The addition operator in the subexpression a + b has a lower order
of precedence than the multiplication operator. Unless parentheses are used,
the operator expression will be evaluated as follows.
a + (b * c)
if (a == b && c == d) // AVOID!
if ((a == b) && (c == d)) // RIGHT
[end of quote]17
Bonkers! This advice must be balanced against the following quote from The
Java Programming Language, which is coauthored by no less than Dr. Gos-
ling.
Our use of parentheses is sparse—we use them only when code
seems otherwise unclear. Operator precedence is part of the language
and should be generally understood. Others inject parentheses liber-
ally. Try not to use parentheses everywhere—code becomes com-
pletely illegible…18
17. Unascribed, “Code Conventions for the Java Programming Language” available online at
java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html, (Mountain View: Sun Micro-
systems, 1995-1999), §10.5.1, “Parentheses.”
I know how intimidating this must sound to some readers, but you do not have to
know what a polynomial is in order to understand, from what Steele is saying, that
the operator order of precedence evolved in mathematics precisely so that
“parentheses or other grouping marks” would not have to be used.
Although not directly related to the operator order of precedence, primary
expressions are always evaluated before the operator expressions in which they
18. Ken Arnold and James Gosling, The Java Programming Language, Third Edition (Reading:
Addison-Wesley, 1998), 178. Although I continually update references for new editions, this quote
has always been part of the book. For example, in the Second Edition it is on page 120. I suspect
very strongly that if it was not actually written by Dr. Gosling, that it was directly influenced by him.
19. I have had the good fortune of having had email exchanges with several of the authors of the
JLS and JVMS as well as other specifications (notably, John Rose, author of the Inner Classes
Specification ). I can assure you these are not only amazingly smart people, but I have found them
to be some of the nicest computer programmers I have ever met.
dstMask[z] = ~(dstMax[z]);
return !(iterator().hasNext());
if (!(my.equalsIgnoreCase(his)))
if (!(mycomps.nextElement().equals(comps.nextElement())))
if (!(visBounds.equals(previousBounds)))
I understand this is a matter of style, but I also think if more programmers under-
stood that parentheses are almost never required around the operands of the
&& and || operators, there would be fewer of them. Here is another example
from no less than the Boolean class:
I would not have used parentheses around the operands of the && operator in
either of these examples. Nor do I use them around the operands of the ||
operators.
One of the reasons why I am focusing on these two operators is to empha-
size that there is a considerable difference between using them separately and
together. When used separately, the meaning of && is always “if all of these
operands evaluate to true,” and the meaning of || is “if any one of these
operands evaluates to true.” It does not matter how many times the operator is
used, the meaning is always the same. That simplicity is lost, however, if the &&
and || operators are used together in the same expression. This is one case in
which I think the most ardent advocate of memorizing the operator order of pre-
cedence would not only tolerate unnecessary parentheses, but might even use
them as a matter of style. For example,
class Test {
    public static void main(String[] args) {
        boolean a = true, b = false, c = true;
        if (a && b || c)
            System.out.print("Can you tell if this will print?");
    }
}
Only one of the operands of an || operator has to be true in order for the
expression to evaluate to true, so the answer is somewhat obvious. The ques-
tion is: What is the left-hand operand of the || operator? Is it b or a && b?
Because && has a higher order of precedence than ||, the expression is evaluated as
(a && b) || c
In this case, the use of parentheses makes it much easier to read the expres-
sion.
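A small sketch makes the grouping visible (the values are chosen so that the two possible groupings produce different results):

```java
class AndOrDemo {
    public static void main(String[] args) {
        boolean a = false, b = true, c = true;
        System.out.println(a && b || c);   // grouped as (a && b) || c: prints true
        System.out.println(a && (b || c)); // explicit grouping: prints false
    }
}
```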
You should always remember that the unary operators (which include the
cast operator), the ?: conditional operator, and the assignment operators are
at the extremes of the operator order of precedence. There are a number of
simple rules that grow out of this.
Here is an example from a private method in the core API (method and vari-
able names have been changed):
The parentheses in the first (1 << mask) are definitely not required because
assignment operators have the lowest level of precedence. They have a certain
aesthetic value, however, because the same parentheses are required in the
second assignment operation. Here is yet another example from the core API in
which a condition expression is negated:
This is a very common programming error. It does not compile because the
entire binary operator expression must be cast. The correct way to write such
an expression is
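The code itself is not reproduced here; the error being described is commonly of the following form (hypothetical variable names, a sketch):

```java
class CastDemo {
    public static void main(String[] args) {
        long a = 1, b = 2;
        // int sum = (int) a + b;  // does not compile: the cast applies to a
        //                         // alone, so the sum is still of type long
        int sum = (int) (a + b);   // cast the entire binary operator expression
        System.out.println(sum);   // prints 3
    }
}
```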
The other rule for operators at the extreme ends of the operator order of prece-
dence is stated as follows.
This is a common example that writes the contents of a file one character at a
time to standard output.
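The example does not survive in this text, but the familiar idiom reads a stream one character at a time; the parentheses around the assignment are required because the assignment operators have the lowest level of precedence (sketched here with an in-memory stream so it is self-contained):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class CopyChars {
    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("Hi".getBytes());
        int c;
        // (c = in.read()) must be parenthesized; without the parentheses
        // the compiler would read the condition as c = (in.read() != -1).
        while ((c = in.read()) != -1)
            System.out.print((char) c);   // prints Hi
    }
}
```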
The relationship between the conditional operator and the assignment opera-
tors is very special. The conditional operator has a higher level of precedence.
Normally that would mean that any operand that uses an assignment operator
would require parentheses, but only the third operand (to the right of the colon)
actually requires parentheses if an assignment operator is used. For example,
class Test {
static int x;
public static void main(String[] args) {
System.out.println(test(true));
System.out.println(test(false));
}
This is one of those compiler messages that does more harm than good. The
compiler is reading the return statement as follows.
If parentheses are placed around x=1 this example compiles. What makes the
relationship between these two levels of precedence special is that the compiler
uses ? and : to delimit the first and second operands. Parentheses are therefore
not required even if they are assignment expressions (such as x=0 in this exam-
ple). The JLS really should document this behavior.
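Since the test method is truncated above, here is a sketch of the behavior described (the method name test and the static field x follow the surrounding example):

```java
class CondAssign {
    static int x;
    static int test(boolean b) {
        // x = 0 needs no parentheses: the ? and : delimit the second operand.
        // The third operand does need them; without the parentheses the
        // compiler reads (b ? x = 0 : x) = 1, and the left-hand side of that
        // assignment is a value, not a variable.
        return b ? x = 0 : (x = 1);
    }
    public static void main(String[] args) {
        System.out.println(test(true));  // prints 0
        System.out.println(test(false)); // prints 1
    }
}
```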
The only other use of parentheses that I feel deserves special mention is in
querying bits. Querying a bit requires the use of a bitwise AND or inclusive OR
operator as well as one of the equality operators. Here is a typical example from the core
API:
The bitwise operators have a lower order of precedence than the equality opera-
tors and therefore always require parentheses in this context. There are numer-
ous other examples of this in 4.7.1 Bits.
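The core API example is not reproduced here; a sketch of the pattern (the flag names are hypothetical):

```java
class BitQuery {
    static final int BOLD   = 0x1;  // hypothetical style flags
    static final int ITALIC = 0x2;

    public static void main(String[] args) {
        int style = BOLD | ITALIC;
        // The parentheses are required: & has a lower precedence than !=,
        // so style & BOLD != 0 would be read as style & (BOLD != 0),
        // which does not compile.
        if ((style & BOLD) != 0)
            System.out.println("bold bit is set");
    }
}
```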
Parentheses are never required around the expression in a return
statement. They are sometimes used as a matter of style, but this should be
limited to conditional expressions using the ?: operator. Using parentheses
The plus and minus operators have the same level of precedence and are left
associative (as are all of the binary operators). Therefore, the computer evalu-
ates this expression as if it were written (a + b) - c.
Any combination of unary operators is valid except the following.
• The operand of an increment or decrement operator must denote a variable.
Hence, none of the other unary operators can come between a prefix incre-
ment or decrement operator and the variable name. The problem is that
these operators are right-associative, and the result of the first unary opera-
tor expression is a value, not a variable. A special case of this rule is the
combination +++ and ---. This is not allowed because the parser inter-
prets them as a prefix increment operator followed by unary plus operator
and as a prefix decrement operator followed by a unary minus operator,
respectively.
• For any given unary expression, only one prefix increment operator, prefix
decrement operator, postfix increment operator, or postfix decrement oper-
ator can be used. In other words, these unary operators are mutually exclusive.
For example,
x = ~a--;
x = -++a;
x = -+~a++;
All of these examples compile. Note that such compounded unary expressions
cannot be used in expression statements:
Because unary operators are right-associative, these examples are a bitwise com-
plement expression, unary minus expression, and unary plus expression, respec-
tively. As such, they cannot be used in expression statements. Attempting to
compile them generates “not a statement” error messages. This is significant
because increment and decrement operators can be used in expression state-
ments (because they have side effects). It is interesting to note that prior to the
1.3 release, these same lines of code would generate “invalid expression
statement” compiler error messages.
The numerical comparison and equality operators are sometimes described
as non-associative. Technically speaking, they are left-associative, but some
explanation is required. As stated in the JLS:
The relational operators are syntactically left-associative (they group
left-to-right), but this fact is not useful.21
if (a == b == c)
then statement;
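Left associativity means a == b == c is grouped as (a == b) == c, and that grouping is rarely what anyone wants; a sketch:

```java
class NonAssoc {
    public static void main(String[] args) {
        int a = 1, b = 1, c = 1;
        // if (a == b == c) ...  // does not compile: (a == b) is boolean,
        //                       // which cannot be compared to the int c
        System.out.println((a == b) == (b == c)); // prints true
    }
}
```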
int a = 1;
int b = a << 1;
int c = b << 1;
System.out.println(a);
System.out.println(b);
System.out.println(c);
1
2
4
int a = 25;
int b = 25;
int c = 25;
boolean d = true;
int x;
boolean y;
x = +a;
x = -b;
x = ~c;
y = !d;
The term nondestructive could be used to describe these unary operators also.
This point can be taken even further by adding the final modifier to the decla-
ration of a, b, c, and d.
class Test {
    static {System.out.println("loading program");} // <clinit>
    public static void main(String[] args) {
        System.out.println("main");
        System.out.println("invoking a()");
        a();
        System.out.println("end of program");
    }
    static void a() {
        System.out.println("invoking b() in a try-finally");
        try {
            b();
        }
        finally {
            System.out.println("finally blocks are always executed");
            System.out.println("STACK TRACE TO FOLLOW…");
        }
    }
    static void b() {
        System.out.println("invoking c()");
        c();
    }
    static void c() {
        int i = 0;
        System.out.println("ABOUT TO DIVIDE AN INTEGER BY ZERO");
        i /= 0;
        System.out.println("exceptions are precise");
    }
}
Stack traces are discussed in 6.10.2 A Stack Trace Primer. The point to under-
stand now is that “abrupt” completion is just that. Control is immediately trans-
ferred to the first statement in the catch block, which explains why neither
“exceptions are precise” nor “end of program” print.
There are two exceptions to the rule that exceptions are precise. Expres-
sions that can throw a NullPointerException do so only after all of the
subexpressions are fully evaluated. This includes method invocation expressions
and array access expressions. The following is an example of throwing a
NullPointerException late:
class Test {
    static String s;
    public static void main(String[] args) {
        try {
            s.equals(s = "abc");
        }
        catch(Exception e) {
            System.out.println(e);
            System.out.print("s = " + s);
        }
    }
}
java.lang.NullPointerException
s = abc
class Test {
    public static void main(String[] args) {
        String[] strings = new String[10];
        String s = null;
        strings[99] = printSomething();
    }
    static String printSomething() {
        System.out.println("ArrayIndexOutOfBounds is thrown late");
        return "";
    }
}
Although the left-hand operand is evaluated first, the rather obvious Array-
IndexOutOfBoundsException is not thrown until after the right-hand
operand has been evaluated.
There is an exception to both of these exceptions to the rule that exceptions
are precise. If that sentence is not enough to confuse you, the exception to
the exceptions involves the left-hand operand of a compound assignment opera-
tor, in which case both NullPointerException and ArrayIndexOut-
OfBoundsException are precise. If the left-hand operand of a compound
assignment expression throws either a NullPointerException or an
ArrayIndexOutOfBoundsException, the exception is in fact thrown
before any part of the right-hand operand is evaluated. The necessity for this
exception to the exception is the “fetch and save” behavior of compound assign-
ment operators. For example,
int i = 12;
i = i + (i = 3);
int i = 12;
i += (i = 3);
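A sketch of why the fetch matters (both forms fetch the original value of the left-hand operand before evaluating the right-hand side):

```java
class FetchAndSave {
    public static void main(String[] args) {
        int i = 12;
        i = i + (i = 3);       // left operand fetched first: 12 + 3
        System.out.println(i); // prints 15

        int j = 12;
        j += (j = 3);          // the compound form also fetches 12 first
        System.out.println(j); // prints 15
    }
}
```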
++   increment operator   Evaluates to the value of the variable before or after (postfix and prefix, respectively) the result of the increment operation is stored in the variable. The increment operation adds 1 to the value of the variable (x++ is the same as x = x + 1).
--   decrement operator   Evaluates to the value of the variable before or after (postfix and prefix, respectively) the result of the decrement operation is stored in the variable. The decrement operation subtracts 1 from the value of the variable (x-- is the same as x = x - 1).
class Test {
    public static void main(String[] args) {
        int x = 1;
        System.out.println(x++);
        System.out.println(x);
    }
}
1
2
class Test {
    public static void main(String[] args) {
        int a = 2;
        int b = 1;
        System.out.println(a * ++b);
    }
}
NOTE 4.1
In choosing to organize the unary operators as follows, I deliberately ig-
nore the unary plus operator (which does nothing). An unsigned nu-
meric literal is always positive, and never requires the + operator to
make it so. The explanation for including it in the language is typically
one of “symmetry” with the unary minus operator.
System.out.println(-(+1));
System.out.println(-(-1));
System.out.println(-(+0));
System.out.println(-(-0));
-1
1
0
0
Note that zero is not signed in the primitive integral types. It is, however, signed
in the floating-point types. For example,
System.out.println(-(-0.0));
System.out.println(-(+0.0));
System.out.println(-(Double.POSITIVE_INFINITY));
System.out.println(-(Double.NEGATIVE_INFINITY));
0.0
-0.0
-Infinity
Infinity
This is why ~ is best grouped with the other “bitwise” operators &, ^, and |
rather than thinking of it as a unary operator. All four of these bitwise opera-
tors are discussed in 4.7 A Bitwise Primer along with the shift operators.
The logical complement operator is almost always used in Boolean control
flow expressions to negate either a boolean type variable or a method that
returns a boolean value. Interestingly enough, this leads to a very subtle
naming convention. For example,
if (!hashCodeCached) {
hashCode = hashCode();
hashCodeCached = true;
}
Do you see how naturally these read because the variable names suggest the
value is true? Here is another example involving a method that returns a
boolean value:
Here a second set of parentheses is required in order to negate the entire condi-
tional expression. If both of the operands are boolean, however, a != b is
the same as !(a == b) and is an easier read.
After boolean type variables and methods, one of the most common uses of
the ! operator in Boolean control flow expressions is in front of a parenthesized
expression using the instanceof type comparison operator. The following
if (!(this.type.equals(that.type) &&
      this.name.equals(that.name) &&
      this.actions.equals(that.actions)))
    return false;
This if statement returns false if “any” of these instance variables are not
equal.
There is just one other use of the logical complement operator that deserves
special mention. The type of a numerical comparison operation is boolean,
which means they can be negated using the logical complement operator. Here
are a couple examples from the core API:
if (!('A' <= c && c <= 'Z' || 'a' <= c && c <= 'z'))
break;
I find such expressions hard to read and avoid them as a matter of style. Here
are the equivalent numerical comparison operations stated positively:
if (size <= 0) {
throw new IllegalArgumentException("negative send size");
}
The first change helps to point out a problem with the detail message. I know it
is controversial to say so, but there is always a positive alternative to negating
the result of a numerical comparison operation.
*    multiplication operator          Evaluates to the product of the two operands. The left-hand operand is the multiplicand. The right-hand operand is the multiplier.
+    addition operator                Evaluates to the sum of the two operands. Both operands are addends.
<    numerical comparison operator    Evaluates to true if the value of the left-hand operand is less than the value of the right-hand operand. Otherwise, the expression evaluates to false.
>    numerical comparison operator    Evaluates to true if the value of the left-hand operand is greater than the value of the right-hand operand. Otherwise, the expression evaluates to false.
<=   numerical comparison operator    Evaluates to true if the value of the left-hand operand is less than or equal to the value of the right-hand operand. Otherwise, the expression evaluates to false.
>=   numerical comparison operator    Evaluates to true if the value of the left-hand operand is greater than or equal to the value of the right-hand operand. Otherwise, the expression evaluates to false.
!=   equality operator                Evaluates to true if the value of the left-hand operand is not equal to the value of the right-hand operand. Otherwise, the expression evaluates to false.
This group of operators includes the only two operators in the Java program-
ming language that throw an exception. Those are the / and % operators, which
throw an ArithmeticException if the divisor (the right-hand operand) is
zero.24
The operands of the arithmetic and numerical comparison operators must
be primitive numeric types. The type of the operator expression is the promoted
type of the operands using binary numeric promotion.
The equality operators work with all data types, but the operands must have
matching types. That is to say, both operands must be primitive numeric
types, the boolean type, or reference types. The concept of matching types
largely corresponds to the type conversion boundaries in Figure 5.4, “Type Con-
version Boundaries” on page 595. For example, if the type of the left-hand oper-
and is a reference type, the type of the right-hand operand must be either
another reference type or the special null type, else the equality expression will
not compile. If the operand types are primitive numeric types, they are promoted
using binary numeric promotion. Depending on the operand types, equality oper-
ators are referred to as numerical equality operators, Boolean equality
operators, or reference equality operators. Equality expressions always
evaluate to the boolean type.
24. ArithmeticException is discussed in a section that was moved to Volume 1 (along with
the sections on floating-point arithmetic and rounding modes) after the first edition of that book was
published. A cross-reference to the correct section will be added to the Second Edition of this vol-
ume if and when the Second Edition of Volume 1 is published. I eagerly await that day.
There is no way around this in the Java programming language. See also Table
4.6 The Complete Set of Logical Expressions on page 442.
Finally, there are a few miscellaneous notes. The expression a != b is the
same as !(a == b), regardless of the operand type. In Boolean equality, the
expression false == false is true. The expression a != b is also the
same as a ^ b, if the operands are boolean.
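These identities can be checked in a few lines:

```java
class BooleanIdentities {
    public static void main(String[] args) {
        boolean a = true, b = false;
        System.out.println((a != b) == !(a == b)); // prints true
        System.out.println((a != b) == (a ^ b));   // prints true
        System.out.println(false == false);        // prints true
    }
}
```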
class Test {
    public static void main(String[] args) {
        double dividend = 5.5, divisor = 2, quotient, remainder;
        quotient = dividend/divisor;
        remainder = dividend%divisor;
        System.out.println("quotient = " + quotient);
        System.out.println("remainder = " + remainder);
    }
}
quotient = 2.75
remainder = 1.5
The fact is that the remainder operator is seldom used to obtain the remainder
from the implied division operation. How then is it used?
A comparison can be made to the shift operators, which always multiply or
divide the left-hand operand by a power of two, but are seldom used with this in
mind. Likewise, the value of a remainder operation is always the remainder of an
implied division of the left-hand operand by the right-hand operand. Neverthe-
less, that is not how the operator is used. Most of the time the % operator is
used as a modulo operator to reduce a number (the left-hand operand or
dividend) modulo the divisor, the result of which is a number in a known
range of values. Sound funny? Welcome to modular arithmetic, which is
defined as follows in Webster’s online dictionary.
arithmetic that deals with whole numbers where the numbers are
replaced by their remainders after division by a fixed number <in a
modular arithmetic with modulus 5, 3 multiplied by 4 is 2>25
In other words 12%5 equals 2. In modular arithmetic, 12 modulo 5 is read
“repeatedly subtract 5 from 12 until the difference is less than 5.” It is a mathe-
matical certainty that the value of a remainder operation will be less than the divi-
sor (the right-hand operand) and have the same sign as the dividend (the left-
hand operand). If the remainder operation uses integer math, for example, the
hashCode = key.hashCode();
index = (hashCode & 0x7FFFFFFF) % hashTable.length;
This example of computing an index for a hash table is from 4.11.2 Understand-
ing Hash Tables in Volume 1. The only reason Hashtable and other hash-
based collections do not inexplicably throw an ArrayIndexOutOfBounds-
Exception when accessed is because of % hashTable.length (reducing
the index modulo the hash table size) in this algorithm.
The other common use of the remainder operator is in an equality expres-
sion (n % x == 0 or n % x != 0), which is read “n is (or is not) a multiple of
x.” For example,
I bet you can tell what this code is doing. It is the return statement (which
in this case is the entire method implementation) from the isLeapYear(int
year) method.
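The return statement itself is not reproduced here; the familiar Gregorian rule it implements reads like this sketch:

```java
class LeapYear {
    // A sketch of the Gregorian leap-year rule: a year is a leap year if it
    // is a multiple of 4, unless it is a multiple of 100 that is not also a
    // multiple of 400.
    static boolean isLeapYear(int year) {
        return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
    }
    public static void main(String[] args) {
        System.out.println(isLeapYear(2000)); // prints true
        System.out.println(isLeapYear(1900)); // prints false
    }
}
```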
One very common trick of the trade is to use the remainder operator to exe-
cute some code “every nth iteration” of a loop, where n is the divisor in a remain-
der operation. For example,
The bold line of code appends a space to the bitPattern string buffer every
8th iteration of the loop.
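The loop itself is missing from this text; a sketch of the technique (the bitPattern name comes from the surrounding description, the rest is hypothetical):

```java
class EveryNth {
    public static void main(String[] args) {
        StringBuffer bitPattern = new StringBuffer();
        for (int i = 1; i <= 16; i++) {
            bitPattern.append('0');
            if (i % 8 == 0)             // every 8th iteration of the loop
                bitPattern.append(' '); // separate the bits into groups of 8
        }
        System.out.println(bitPattern); // two 8-bit groups separated by a space
    }
}
```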
The remainder operator should not be used to determine if a number is even
or odd. Nevertheless, % 2 == 0 or % 2 != 0 (or % 2 == 1) are sometimes used
to test for even and odd numbers, respectively. As discussed in the last section,
& 1 == 0 or & 1 != 0 are also used for this purpose. The only difference is that
& 1 is significantly faster than % 2.
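Both tests can be sketched side by side (note the parentheses around n & 1, required because & has a lower precedence than the equality operators):

```java
class EvenOdd {
    public static void main(String[] args) {
        int n = 7;
        System.out.println(n % 2 != 0);   // prints true: n is odd
        System.out.println((n & 1) != 0); // prints true: same test on the low bit
    }
}
```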
There are other remainder methods in the core API, including IEEE-
remainder(double f1, double f2) in the java.lang.Math class
and remainder(BigInteger val) in BigInteger. The former rounds
NOTE 4.2
The AND, inclusive OR, and exclusive OR operators (&, |, and ^), are re-
ferred to as either Boolean logical operators or integer bitwise op-
erators in the JLS depending on the type of the operands. The Boolean
logical operators are rarely, if ever, used. They are primarily discussed
in the next section. The integer bitwise operators are introduced in
4.5.6 Bitwise Operators &, |, ^, >>, >>>, and <<, but their practical
significance may not be appreciated until reading 4.7 A Bitwise Primer.
&&   conditional-AND (cf. & AND)   Evaluates to true if both the left- and right-hand
                                   operands are true. Otherwise, evaluates to false.
AND
false false false
false true false
true false false
true true true
Inclusive OR
false false false
false true true
true false true
true true true
Exclusive OR
false false false
false true true
true false true
true true false
As stated above, the AND, inclusive OR, and exclusive OR operators seldom
if ever have boolean type operands. There is only one programmer convention
of which I am aware that uses any of these Boolean logical operators. It is an
alternative to using the ?: operator to invoke equals(Object obj) without
the possibility of throwing a NullPointerException. For example,
Presumably this is being done because the ^ operator is faster than using the
?: operator. The idea has gained a significant foothold in the core API. Here is
yet another example:
Most examples follow this general coding style, but at least one core API pro-
grammer uses the same coding technique as follows.
Not only is this programmer convention clearly more readable, I have run a num-
ber of microbenchmark tests and using the ^ operator as shown in the core
API examples above is actually slower than using the ?: operator. Really!
&   Bitwise AND   Each bit evaluates to 1 if the corresponding bit in the left-hand and
                  right-hand operand is 1. Otherwise, the bit evaluates to 0. For example,

    00110101
  & 00101100
    00100100

    00110101
  | 00101100
    00111101

    00110101
  ^ 00101100
    00011001
>> Right shift Evaluates to the value of the left-hand operand right-shifted the shift
distance using sign extension. The shift distance is usually the value
of the right-hand operand, but not always. Shift distances are
discussed in detail below.
>>> Unsigned Evaluates to the value of the left-hand operand right-shifted the shift
Right shift distance using zero extension.
<< Left shift Evaluates to the value of the left-hand operand left-shifted the shift
distance. Low-order bits are zero filled.
a  b    a AND b    a inclusive-OR b    a exclusive-OR b
0  0       0              0                   0
0  1       0              1                   1
1  0       0              1                   1
1  1       1              1                   0
Note that the descriptions of the shift operators in Table 4.9 do not mention
multiplying and dividing by powers of two, as explained in the following quote
from the JLS.
The value of n<<s is n left-shifted s bit positions; this is equivalent
(even if overflow occurs) to multiplication by two to the power s.
While this is certainly mathematically true, most Java books make way too much
of this. It is actually done a lot less frequently than you may imagine. Here is a
rare example from a simplified version of the write(int b) method in
java.io.ByteArrayOutputStream , which writes a single byte to the
output stream:
The count is the number of characters in the buffer, which must not exceed the
buffer size. If the buffer is already full, this method increases the buffer size by
two times the current size or the current buffer size plus one, whichever is
greater. Note that newcount will never be greater than buf.length << 1
unless buf.length is zero (which is possible given the constructor design) or
buf.length << 1 results in overflow. This is a questionable implementation.
Most of the time when the shift operators are used to multiply or divide the
right-hand operand is 1 (as in this example), which means that the left-hand oper-
and is being multiplied or divided by two. While this is marginally more efficient
than using the * and / operators, it is so seldom done that I think mainstream
business application programmers should avoid it altogether. It is definitely not
an easy read because of the shift distance. For example, i << 3 means i * 8,
not i * 3. Note also that using the right-shift operator to divide by powers
of two only works for non-negative numbers. Of course, left and right shifting is
always multiplying and dividing by powers of two; this is a question of intent. For
example, I regard the initialization of bit masks such as MAXIMUM_CAPACITY
= 1 << 30 as purely a shift operation. Others may well regard it as multiplica-
tion. If you want to know how the shift operators are used “in the real world” read
4.7 A Bitwise Primer.
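As a quick check of the arithmetic, here is a minimal sketch. The last two lines show why right-shift division only matches the / operator for non-negative values (>> rounds toward negative infinity, / rounds toward zero):

```java
class ShiftArithmetic {
    public static void main(String[] args) {
        int i = 5;
        System.out.println(i << 3);  // 40: i * 8 (two to the power of the shift distance)
        System.out.println(i * 8);   // 40
        System.out.println(-7 >> 1); // -4: rounds toward negative infinity
        System.out.println(-7 / 2);  // -3: rounds toward zero
    }
}
```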
In shift operations, the left-hand operand is the value to be shifted, and the
right-hand operand is the shift distance. The operands of a shift expression
are promoted separately using unary numeric promotion. The type of a
shift expression is the promoted type of the left-hand operand, which is always
either int or long. Shift distances are usually expressed as numeric literals
(4, 8, 16, 24, 32, etc.), but a variable is occasionally used. It should be
understood that shifting more than 31 bits in an int is meaningless. Likewise,
shifting a long more than 63 bits is meaningless. Thus the ishl, ishr, iushr,
lshl, lshr, and lushr machine instructions mask off all but the low-order five
or six bits of the shift distance. As the JLS explains:
If the promoted type of the left-hand operand is long, then only the six
lowest-order bits of the right-hand operand are used as the shift dis-
tance. It is as if the right-hand operand were subjected to a bitwise log-
ical AND operator & with the mask value 0x3f. The shift distance
actually used is therefore always in the range 0 to 63, inclusive.27
The following program demonstrates this behavior using a 0x1f mask (as if left
or right shifting an int).
class Test {
public static void main(String[] args) {
//just enough times to show integer overflow
for (int distance=0; distance < 40; distance++) {
System.out.println(BitPattern.toBinaryString
(distance & 0x1f)); //mask for ints
}
}
}
27. Ibid.
There are two details to notice; the first is how 11111 (the binary value of
0x1f) is the largest number that can be expressed by the parenthesized
expression (distance & 0x1f) . The other is that this is a special case of
integer overflow.
The unsigned right-shift operator is a new operator that does not exist in the
C and C++ programming languages. The rationale for a new shift operator is
import java.util.Random;
class Test {
public static void main(String[] args) {
Random random = new Random();
for (int i=0; i < 10; i++) {
System.out.print(random.nextInt() & 3);
System.out.print(" ");
System.out.println(random.nextInt() & 4);
}
}
}
2 0
1 4
2 4
2 4
2 4
3 4
1 4
3 4
1 0
0 0
As you can see, if the bits are not consecutive, this programming technique does
not work. That is why hexadecimal literals should always be used.
You have already seen this use of the AND operator in the shift distance
example. Here is a more elaborate example from the java.io package:
long bitset = 0;
final int LONG_BITS = 64;
int[] counter = new int[LONG_BITS];
…
for (int i=0; i<LONG_BITS; i++) {
// If no (more) bits are set, break out of the loop
if (bitset == 0) {
break;
}
if ((bitset & 1) != 0) { //uses an integer literal as a mask
counter[i]++;
}
bitset >>>= 1;
}
These are sometimes referred to as sets, but not in Java Rules. For rea-
sons explained in the note at the start of 4.7.1 Bits on page 477, I typically
name the int or long in which bits are stored bitset (instead of
flags). The same term cannot be used to describe both of the operands
in a bitwise operation. Thus I use the term combined bit mask instead of set to
describe bit masks such as ALL.
• Nybble mask: This is either 0xf or 0xF. Used when converting nybbles to
hexadecimal digits.
Assuming that the type of ch is char, can you tell what this code is asking?
This if statement tests to see if ch is in the range of an ASCII or Latin-1 charac-
ter. If any of the bits in the high-order byte were set, ch & 0xff would not equal
ch, and the comparison would evaluate to false.
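The tested code is not reproduced above, but a hypothetical reconstruction of such a range test might look like the following sketch (isLatin1 is my name, not the core API's):

```java
class CharRange {
    // hypothetical reconstruction: true if ch fits in one byte (ASCII or Latin-1)
    static boolean isLatin1(char ch) {
        return (ch & 0xFF) == ch; // any bit set in the high-order byte makes this false
    }
    public static void main(String[] args) {
        System.out.println(isLatin1('A'));      // true
        System.out.println(isLatin1('\u00E9')); // true: Latin-1 e with acute accent
        System.out.println(isLatin1('\u0394')); // false: Greek capital delta
    }
}
```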
Assuming that the same bit is not set in both operands, the | operator is to
bits what the + operator is to numbers. For example,
255
255
2
4
The values added in the first two lines of code are all unique non-zero powers of
two; hence, the results are equal. When adding two plus two, however, the
results are different because the | operator does not “carry” as does the +
operator. Thus one thinks of the inclusive OR operator as “combining” (unique)
bits rather than “adding” them.
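The example the prose describes is not shown above; here is a sketch that reproduces those four results:

```java
class OrVersusPlus {
    public static void main(String[] args) {
        System.out.println(1+2+4+8+16+32+64+128); // 255
        System.out.println(1|2|4|8|16|32|64|128); // 255: unique bits combine like addition
        System.out.println(2 | 2);                // 2: the | operator does not carry
        System.out.println(2 + 2);                // 4: the + operator carries into the next bit
    }
}
```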
The ^ operator (exclusive OR) is the least intuitive of the bitwise operators
because if both bits are 1, the result is 0. Note that ^ is frequently used in the
API docs for exponentiation. For example, 2^32 is read as “two raised to the
power of 32.” This use of ^ has nothing to do with the exclusive-OR operator.
hashcode = language.hashCode() ^
country.hashCode() ^
variant.hashCode();
This is the code used to compute a hash code for the Locale class. Even as
an operator for individual bits (or bit flags), the exclusive OR operator is used to
toggle a bit from one to zero or vice versa, depending on the setting of the bit
prior to the operation. In practice, however, bits are seldom toggled.
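A minimal sketch of toggling with the exclusive OR operator:

```java
class ToggleBit {
    // set bits in the mask become clear, and clear bits become set
    static int toggle(int bitset, int mask) {
        return bitset ^ mask;
    }
    public static void main(String[] args) {
        int bitset = 0x5;             // binary 0101
        bitset = toggle(bitset, 0x1); // bit 0 was set, so it is cleared: 0100
        System.out.println(bitset);   // 4
        bitset = toggle(bitset, 0x1); // bit 0 was clear, so it is set again: 0101
        System.out.println(bitset);   // 5
    }
}
```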
29. Ibid.
This example from the core API returns a key, value, or a Map.Entry.
• As a guard against the possibility of a NullPointerException being
thrown when invoking an instance method. For example,
if (!(o1==null ? o2==null : o1.equals(o2)))
return false;
These three well established uses easily account for the overwhelming majority
of all conditional operators.
As suggested in the second bulleted item, the conditional operator is often
thought of as an alternative to if-then-else statements for implementing
binary branches (witness “if-else operator” and “ternary if-else operator” from the
list of alternative names above). This is not really true. Conditional expressions
are just that: expressions. They are not statements; nor can they be used as
expression statements. For example,
class Test {
public static void main(String[] args) {
boolean b = true;
b ? System.out.println("true") :
System.out.println("false");
if (n == 0)
value = false;
else if (n == 1)
value = true;
else
assert false;
This line of code generates all sorts of compiler errors because assert
false is a statement.
When there is actually a choice between using the conditional operator and
an if-then-else statement, preference should be given to the conditional oper-
ator because it is much more compact and does not suffer from the mainte-
nance problems that can occur when braces are not used in if-then-else
statements.
The “Code Conventions for the Java Programming Language” document
says that if a binary operator is used in the first expression (to the left of the ?)
in a ?: operator, parentheses should be used, as in (x >= 0) ? x : -x.30 The
same document also warns against embedded assignments. For example,
d = (a = b + c) + r; // AVOID!
should be written as
a = b + c;
d = a + r;31
class Test {
public static void main(String[] args)
throws java.io.IOException {
int i;
System.out.println("PRESS CONTROL Z (^Z) TO END PROGRAM");
while ((i = System.in.read()) != -1) {
char c = (char) i;
if (c=='\n' || c=='\r')
continue;
System.out.println(c + " = ASCII " + i);
}
}
}
The simple assignment expression i = Math.PI does not compile. The type of
the right-hand operand expression Math.PI is double, and a conversion from
double to int obviously can result in a loss of data. Now consider the same
expression written using a compound assignment operator:
int i = 0;
i += Math.PI;
System.out.println(i);
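The compound form compiles because it contains an implicit cast to the type of the left-hand operand. A sketch showing the effect:

```java
class CompoundCast {
    public static void main(String[] args) {
        int i = 0;
        // i = i + Math.PI;    // does not compile: possible loss of precision
        i += Math.PI;          // compiles as i = (int)(i + Math.PI)
        System.out.println(i); // 3: the fractional part is silently truncated
    }
}
```

The implicit cast is convenient, but it silently discards precision that the simple assignment operator would have forced you to acknowledge with an explicit cast.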
NOTE 4.3
However unconventional it may sound, the instanceof and cast op-
erators should always be discussed together because at the machine
instruction level they are practically the same. The only substantial
difference is that the instanceof operator returns false whereas
a cast operator throws a ClassCastException. Therefore 5.7.4
The Cast Operator should be read at the same time as this section.
class Test {
public static void main(String[] args) {
String s = null;
if (s instanceof StringBuffer) //compiler error
System.out.println("does not print");
}
}
This simply is not true. Dr. Mark Davis solved this problem years ago. His solu-
tion is to not use the instanceof operator where the meaning is “the class of
this object must be this class.” Here is Bloch’s example with the solution sug-
gested by Dr. Davis.
import java.awt.Color;
class Point {
private final int x;
private final int y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
public boolean equals(Object obj) {
if (obj == null) return false;
if (obj.getClass() != this.getClass())
return false;
Point p = (Point)obj;
return p.x == x && p.y == y;
}
}
class ColorPoint extends Point {
private Color color;
32. Joshua Bloch, Effective Java Programming Language Guide, (Boston: Addison-Wesley,
2001), “Item 7: Obey the general contract when overriding equals.”
NOTE 4.4
There is a common thread that runs through all of the bitwise primer
subsections. They all discuss what I refer to as conceptual data
types, which are bits, nybbles, and unsigned bytes. There are no type
names that can be used to allocate a bit, nybble, or unsigned byte.
They are allocated as one of the integral data types byte, short,
char, int, or long (usually int because that is the smaller of the
two computational types). Conceptual data types are made possible by
the use of the bitwise operators.
NOTE 4.5
The next section uses some nonstandard terminology to refer to the
int or long in which bits are allocated. That field is often named
flags, and individual bits are more generally referred to as bit flags.
Instead of referring to such fields as flags, I have expropriated the term
bitset from the BitSet class in java.util (which is too slow to
be used by serious bitwise programmers). Likewise, instead of refer-
ring to individual bits as flags, I simply refer to them as bits. Besides
being easier to read, this terminological shift was more or less forced
on all of us by the name of the BitSet class in the core API.
4.7.1 Bits
The term bit is short for binary digit. The bits in an integral data type are num-
bered from right to left beginning at zero. Thus bit 31 is the sign bit in an int or
float. In a long or double, bit 63 is the sign bit. A bit has one of two
values, 0 or 1. A bit is said to be set if its value is 1, and reset or cleared if its
value is 0.
Zero and one are binary values which gives rise to the question of boolean
versus bit. Bits are always an alternative to using boolean type variables, espe-
short changed = 0;
if( (flags & ImageObserver.HEIGHT) != 0 )
if( ! getElement().getAttributes().isDefined
(HTML.Attribute.HEIGHT) ) {
changed |= 1;
}
if( (flags & ImageObserver.WIDTH) != 0 )
if( ! getElement().getAttributes().isDefined
(HTML.Attribute.WIDTH) ) {
changed |= 2;
}
synchronized(this) {
if ((changed & 1) == 1) {
fWidth = width;
}
if ((changed & 2) == 2) {
fHeight = height;
}
if (loading) {
return true;
}
}
if( changed != 0 ) {
…
In my opinion, this code would read a lot easier had it used two boolean type
variables (perhaps widthChanged and heightChanged ) instead of a bit
set (the changed variable). If there are many binary values, however, using bits
class Test {
public static void main(String[] args) {
long bitset = 0;
final int WHATEVER = 1 << 31;
33. See Bug Id 4392283 which states that “Booleans are 1 byte fields instead of 4 bytes since the
object packing change was implemented.” Object packing is briefly discussed in a number of differ-
ent very similar HotSpot VM documents (for example, java.sun.com/products/hotspot/
docs/whitepaper/Java_HSpot_WP_v1.4_802_2.html).
class Test {
public static void main(String[] args) {
int bitset = (1 << 16) - 1; //set 16 low-order bits
System.out.println(BitPattern.toBinaryString(bitset));
bitset = ~((1 << 16) - 1); //set 16 high-order bits
System.out.println(BitPattern.toBinaryString(bitset));
}
}
This is one programming technique that you will definitely want to add to your
bag of tricks. It is used a surprising number of times in the core API in a variety
of different ways.
I would not use the leading zeros as a matter of style unless there are a lot of bit
masks, as in this example. (When it comes to readability, “less is always better”
is my motto.) This repetition of 1, 2, 4, 8 is how to express powers of two in
hexadecimal literals. For example,
class Test {
public static void main(String[] args) {
System.out.println(0x1);
System.out.println(0x2);
System.out.println(0x4);
System.out.println(0x8);
System.out.println(0x10);
System.out.println(0x20);
System.out.println(0x40);
System.out.println(0x80);
System.out.println(0x100);
System.out.println(0x200);
System.out.println(0x400);
System.out.println(0x800);
}
}
Note that such bit masks are inlined constants, which means their value should
never be changed. The fact that bit masks are inlined constants significantly
adds to the overall efficiency of bit processing. Alternatively, the left-shift opera-
tor can be used to initialize bit masks. For example,
Of course, the first bit mask could just as easily have been initialized with one
(the expression 1 << 0 equals one). The << 0 is added as a matter of style. The
programmer convention is to use hexadecimal literals because left-shifting
becomes awkward if there are a lot of bit masks.
Sometimes masks do not need a name and can be stored in an array in
which the index value corresponds to the bit mask you want to use. For example,
the BitPattern program uses a generic BIT_MASK array that is initialized
as follows:
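The BIT_MASK initialization itself is not reproduced above; here is a hypothetical version consistent with the description (the index value corresponds to the bit number):

```java
class BitMasks {
    // hypothetical initialization; the book's BitPattern utility is not shown here
    static final int[] BIT_MASK = new int[32];
    static {
        for (int i = 0; i < BIT_MASK.length; i++)
            BIT_MASK[i] = 1 << i; // the mask at index i has only bit i set
    }
    public static void main(String[] args) {
        System.out.println(BIT_MASK[0]); // 1
        System.out.println(BIT_MASK[4]); // 16
    }
}
```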
These are unnamed bit masks, which are somewhat unusual. You can even set
the unnamed bit mask in a for statement header. For example,
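A sketch of such a for statement header (the book's own example is not reproduced here); the unnamed mask visits every bit in an int:

```java
class ForHeaderMask {
    // counts the one-bit masks generated in the for statement header
    static int countMasks() {
        int count = 0;
        for (int mask = 1; mask != 0; mask <<= 1) // unnamed bit mask in the header
            count++;
        return count;
    }
    public static void main(String[] args) {
        System.out.println(countMasks()); // 32: one mask per bit in an int
    }
}
```

The loop terminates naturally: after the mask reaches the sign bit (1 << 31), one more left shift produces zero.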
class Test {
public static void main(String[] args) {
int bitset = 0;
final int ONE = 0x1;
final int TWO = 0x2;
final int THREE = ONE | TWO;
bitset |= ONE;
if ((bitset & (ONE | TWO)) != 0)
System.out.println("at least one of the bits are set");
if (!((bitset & THREE) == THREE))
System.out.println("both bits are not set");
bitset |= TWO;
if ((bitset & THREE) == THREE)
System.out.println("both bits are now set");
}
}
Setting     Inclusive OR    Set this bit or all of the bits in a combined bit
                            mask:
                                bitset |= MASK;
                            Set all of the bits except this one (this works
                            equally well for a combined bit mask):
                                bitset |= ~MASK;

Clearing    AND             Clear this bit or all of the bits in a combined bit
                            mask. Be careful, though, because the bitwise
                            complement operator (or bit flipper) has exactly
                            the opposite meaning it has when querying, setting,
                            or toggling:
                                bitset &= ~MASK;
                            Clear all of the bits except this one (this works
                            equally well for a combined bit mask):
                                bitset &= MASK;

Toggling    Exclusive OR    Toggle this bit or all of the bits in a combined
                            bit mask:
                                bitset ^= MASK;
                            Toggle all of the bits except this one (the bit
                            flipper at work; this works equally well for a
                            combined bit mask):
                                bitset ^= ~MASK;

a. Such a query would be very awkward for a dynamically combined bit mask because the bits would have to
be dynamically combined on both sides of the equality operator.
There is a similar difference when checking if the bits in a combined bit mask
have been cleared. When setting, clearing, or toggling bits, however, there
is no such difference. This is a critical point to grasp when working with bits. It
helps to explain why dynamically combined bit masks are very common when
setting or clearing bits.

The only time a bitwise programmer has to think about the difference
between an individual bit mask and a combined bit mask is when
querying. There is no difference when setting, clearing, or toggling.

Here is an example from the core API:
All of these bits are cleared, no different than if a statically combined bit mask
were used. The same could be said of toggling bits, but bits are seldom toggled.
Be careful when clearing one or more bits because, unlike the other opera-
tions, the bitwise negation operator is used to clear the bit (or bits) in the mask.
All other uses of the bitwise negation operator have the meaning of “bit(s) not in
the mask.” For example,
class Test {
public static void main(String[] args) {
int bitset = ~0;
System.out.println(BitPattern.toBinaryString(bitset));
final int MASK = 0x1;
bitset &= ~MASK; //clear this bit
System.out.println(BitPattern.toBinaryString(bitset));
bitset = ~0;
bitset &= MASK; //clear all but this bit
System.out.println(BitPattern.toBinaryString(bitset));
}
}
class Test {
public static final int TRUE = 1;
public static final int FALSE = 2;
public static void main(String[] args) {
test(0); //illegal argument
test(TRUE);
test(FALSE);
test(TRUE|FALSE); //illegal argument
}
public static void test(int mask) {
if ((mask & ~(TRUE|FALSE)) != 0)
System.out.println(
"throwing IllegalArgumentException " + mask);
switch(mask) {
case(TRUE): System.out.println("true");
break;
case(FALSE): System.out.println("false");
break;
default: System.out.println("assertion failure");
break;
}
}
}
assertion failure
true
false
assertion failure
The argument check has two bugs. The first thing that needs to be done is to
add a check for zero:
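The corrected check itself is not reproduced above; here is one hypothetical way to fix both bugs (zero and the combined mask both slipping through), at the cost of the awkwardness discussed next:

```java
class MaskCheck {
    static final int TRUE = 1;
    static final int FALSE = 2;
    // hypothetical corrected check: zero, combined masks, and undefined
    // bits are all rejected by comparing against each legal value
    static boolean isValid(int mask) {
        return mask == TRUE || mask == FALSE;
    }
    public static void main(String[] args) {
        System.out.println(isValid(0));            // false
        System.out.println(isValid(TRUE));         // true
        System.out.println(isValid(TRUE | FALSE)); // false
    }
}
```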
This gets to be very awkward however when there are more bit masks with
longer names than TRUE and FALSE. Many programmers use a switch state-
ment under these circumstances. Such argument checks are still valid as com-
plex assertions in non- public methods. For example,
class Test {
private static final int TRUE = 1;
private static final int FALSE = 2;
public static void main(String[] args) {
test(0); //illegal argument
test(TRUE);
test(FALSE);
test(TRUE|FALSE); //illegal argument
}
private static void test(int mask) {
assert checkMask(mask);
switch(mask) {
case(TRUE): System.out.println("true");
break;
case(FALSE): System.out.println("false");
break;
default: System.out.println("assertion failure");
break;
}
}
private static boolean checkMask(int mask) {
switch(mask) {
case(TRUE):
case(FALSE):
return true;
default:
return false;
}
}
}
Such boolean methods are important because they make it possible to encap-
sulate the bit set variable as well as the bit masks (assuming that a typesafe
enum is used to pass the values).
Everything you have seen up until now is what I would describe as standard
bit processing. It is extremely efficient, both in terms of CPU cycles and memory
usage. The remainder of this section discusses alternatives to standard bit
processing. None of these alternatives are quite as efficient as traditional bit
processing.
// Using BitSet would make this easier, but it's significantly slower.
There really is no comparison. I searched the entire core API as well as all of the
support classes for the most recent J2SE implementation and could find only
eleven classes that use the BitSet class, whereas there are hundreds of
classes that use traditional bit processing. This is true even though the BitSet
class has been part of the core API since the 1.0 release.
The BigInteger class also has bit operations, but is an immutable class
that creates a new object every time a bit is set, cleared, or toggled. There is a
method for each of the basic bit operations:
Using the testBit() method as a test case, I can find only two uses of these
bit operations in either the core API or any of the support classes for the most
recent Windows implementation of the J2SE. One is in BigDecimal (which
34. For the sake of consistent usage in this section, I had to rename the methods as well as their
parameters.
Basically, instead of being actual bit masks, these constants are the shift
distances required to compute the value of a bit mask. Thus the computed value of
the REQUEST_FOCUS_DISABLED bit mask is one (a power of two), not zero. As
stated above, zero is never used as a bit mask.
Code that uses this approach to bit processing is only slightly less efficient
than traditional bit processing because of additional method invocations and the
continual recomputation of mask values. The main advantage is readability. This
style of bit processing is much more readable than traditional bit processing, but
the performance degradation (however negligible) and the limitation of setting or
clearing one bit at a time will prevent it from ever becoming a programmer con-
vention. It simply does not pass the “best practice” muster.
A do loop is used so that zero returns "0". Because the nybbles are processed
from right to left, the char[] buffer is loaded using a prefix decrement opera-
tor. Note that the hexDigits array can be easily changed to use upper case
A-F.
There are a number of optimizations common to most hexadecimal conversion
methods:
• The do loop does not invoke Character.forDigit((i & 0xF),
16) for reasons explained below
• A char[] buffer is used in conjunction with the String(char[]
value, int offset, int count) constructor. The alternative of
using a StringBuffer is too slow
• The hexDigits array is static. Using a local array would be disastrous
because every conversion operation incurs the cost of allocating and load-
ing the array.
The Character.forDigit(int digit, int radix) method is a classic
example of a method that costs more than it is worth. Here is one possible
implementation of that method:
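The implementation referred to is not reproduced above; here is a sketch of what such a method might look like (limited to radix 16 for brevity; the real method supports radices up to Character.MAX_RADIX, which is 36):

```java
class Digits {
    static final char[] hexDigits = {'0','1','2','3','4','5','6','7','8','9',
                                     'a','b','c','d','e','f'};
    // one possible implementation (a sketch, not the actual core API source)
    static char forDigit(int digit, int radix) {
        if ((digit >= radix) || (digit < 0))
            return '\0';             // first argument check
        if ((radix < 2) || (radix > 16))
            return '\0';             // second argument check (real limit is 36)
        return hexDigits[digit];     // a single array access expression
    }
    public static void main(String[] args) {
        System.out.println(forDigit(10, 16)); // a
        System.out.println(forDigit(5, 10));  // 5
    }
}
```

All of the method-invocation overhead and both argument checks buy you nothing more than the array access in the last line.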
Besides the overhead of method invocation, there are two argument checks. All
of this to evaluate a single array access expression.
The toHexString methods are very easy to understand, which makes
them a good introduction to how the right-shift operators are used as conveyor
belts. 4.5.6 Bitwise Operators &, |, ^, >>, >>>, and << included the following.
In practice, the signed and unsigned right shift operators are most
often used interchangeably as conveyor belts at the end of which bits
are processed in groups of four or eight. Sign extension versus zero
filling is a moot point because the “conveyor belt” effectively stops
after the last bit is processed (bit 31 or 63). Thus the use of the signed
or unsigned right shift operator becomes a matter of style. There are
times when the use of the unsigned right-shift operator is absolutely
necessary, but that is not usually the case.
There are eight nybbles in an int. The following program illustrates the concept
of a conveyor belt in action.
class Test {
static char[] hexDigits = {'0','1','2','3','4',
'5','6','7','8','9',
'a','b','c','d','e','f'};
The bold shows what is left of the original int. Each iteration masks the four
low-order bits and converts them into a hexadecimal digit. Can you see how this
“conveyor belt” works? It is the same whether you are processing nybbles or
unsigned bytes. The loop continues until i is equal to zero. This is therefore an
example of when the unsigned right-shift operator is required.
Character sequences such as 0123456789, ABCDEF, and abcdef are like sub-arrays
of character codes that can be indexed by the value of the nybble. In the case of
ABCDEF or abcdef, ten must be subtracted from the value of the nybble in
order to use it as an index. (Remember, the value of a nybble is 0-15.) To con-
vert this index value to an actual character it must be added to the first charac-
ter code (the value of '0', 'a', or 'A') in the respective sub-array. For
example,
do {
int offset = i & 0xF; //value of the nybble
char digit = (char)((offset < 10) ?
('0' + offset) : ('a' + offset - 10));
} while ((i >>>= 4) > 0);
NOTE 4.1
The following section discusses a programming technique that does
not actually use the bitwise operators. Nevertheless, it cannot be un-
derstood without thinking at the bit level, and it is a natural follow-up
to the previous section on nybble to hexadecimal digit conversions.
class Test {
public static void main(String[] args) {
int i = 123456789;
System.out.println(i);
while (i != 0)
System.out.println(i /= 10);
}
}
123456789
12345678
1234567
123456
12345
1234
123
12
1
0
class Test {
public static void main(String[] args) {
int radix = 10;
int integer = 123456789;
System.out.println(integer % radix);
}
}
Here then is a complete example of the coding technique for converting inte-
gers to decimal digits:
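The complete example is not reproduced above; here is a sketch of the technique under a stated constraint (non-negative int values only):

```java
class IntToDecimal {
    // a sketch of the technique: peel digits off right to left with / and %
    static String toDecimalString(int i) {
        char[] buf = new char[11];                   // enough for any non-negative int
        int charPos = buf.length;
        do {
            buf[--charPos] = (char)('0' + (i % 10)); // rightmost remaining digit
        } while ((i /= 10) > 0);                     // the do loop makes zero return "0"
        return new String(buf, charPos, buf.length - charPos);
    }
    public static void main(String[] args) {
        System.out.println(toDecimalString(123456789)); // 123456789
        System.out.println(toDecimalString(0));         // 0
    }
}
```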
Except for the differences discussed above, this looks very much like the last
example in the previous section.
As a final note on integer to string conversions, the programming technique
discussed in this section is indeed used in the core API in general-purpose inte-
ger to string conversions such as the Integer.toString(int i, int
radix) and Long.toString(long i, int radix) methods. As of the
1.4 release, however, it is no longer used in the Integer.toString(int
i) method, which apparently is a “hot spot” because it is getting a lot of atten-
tion. Prior to the 1.4 release that method essentially used this programming
technique only with some very clever loop unrolling. The 1.4 implementation is
considerably different. It “avoids division by 10”35 by using what the responsible
programmer refers to as invariant division by multiplication.36 The resulting
implementation is impressively complex. Interestingly enough, “invariant division
by multiplication” (based on what I have seen in Integer.toString(int
i) because I have not yet read the paper) is essentially an optimization that has
figured out how to use the shift operators and bit masks for radices that are not
a power of two. If you are a teacher, I would suggest comparing the source
code for this method in the 1.3 release to the source code in the latest release.
Doing so would be an excellent lesson in source code optimization.
35. End-of-line comments in the source code for the Integer.toString(int i) method.
36. T. Granlund and P. Montgomery, “Division by Invariant Integers using Multiplication,” ACM PLDI,
1994
class Test {
public static void main(String[] args) {
byte b = (byte) 255;
System.out.println("signed byte = " + b);
System.out.println("unsigned byte = " + (b & 0xff));
}
}
The (byte) cast operator truncates the high-order bits in the int type
numeric literal so that b is initialized with the bit pattern 11111111. Executing
this program prints -1 for the signed byte and 255 for the unsigned byte.
37. Historically a byte has not always been eight bits; the term octet is used as an unambiguous ref-
erence to an eight-bit byte. It is particularly favoured by network programmers.
It is just a matter of how the bit pattern is interpreted. By using the AND operator
with a 0xff mask, the signed byte data type can be used as if it were an
unsigned byte.
There are two programming techniques discussed in this section. One uses
the right-shift operators along with an 0xff byte mask to process the bytes in
primitive data types as if they were coming down a conveyor belt at the end of
which they are written to an output stream one (unsigned) byte at a time. For
example,
38. All of the (simplified) source code examples are from the older java.io package because
java.nio uses a preprocessor, making the source code much more difficult to work with. In fact,
it took me about two hours just to locate the comparable methods in the java.nio package.
They are buried in the package-private java.nio.Bits utility class. As it turns out, the best place
to see both programming techniques in a single class is the package-private java.io.Bits util-
ity class. (Note that this says io, not nio.)
The byte[] data type is the only data type in the Java programming
language that can address an arbitrary number of unsigned bytes.
This explains why the central abstraction in the older java.io package is
referred to as a byte stream. In the newer java.nio package, the term byte
channel is used instead. In either case, it is a byte something. Personally, I pre-
fer byte train. Then you can think of unsigned bytes as open boxcars on a rail-
road track standing ready to transport eight bits of binary data either to a file or
across a network to another computer. Stops along the track are referred to as
byte (or file) buffers. (One of the problems with the old java.io package is
that all of the trains are local.) If you can grasp the special significance of the
byte[] data type as used in the java.io and java.nio packages, you will
have a much better understanding of why byte buffers, byte streams, byte chan-
nels, etc. are so called.
Now we are ready to discuss how the bitwise operators are used to convert
primitive data types to a sequence of unsigned bytes, and vice versa. First a few
Parentheses are not required around the shift operations because shift opera-
tors have a higher level of precedence than bitwise operators. They are often
used, however, as a matter of style. Also, right shifting with a shift distance of
zero accomplishes nothing. It is done merely to make the code more readable.
The writeInt(int v) method could have been written as follows.
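Here is a sketch of that alternative, modeled on java.io.DataOutputStream.writeInt(int v) but written against ByteArrayOutputStream so that it is self-contained (the real method writes to any OutputStream):

```java
class WriteIntSketch {
    // a sketch: shift and then mask, dropping the pointless ">>> 0"
    static void writeInt(java.io.ByteArrayOutputStream out, int v) {
        out.write((v >>> 24) & 0xFF); // most significant byte first (big-endian)
        out.write((v >>> 16) & 0xFF);
        out.write((v >>> 8) & 0xFF);
        out.write(v & 0xFF);          // no shift needed for the last byte
    }
    public static void main(String[] args) {
        java.io.ByteArrayOutputStream bos = new java.io.ByteArrayOutputStream();
        writeInt(bos, 0x12345678);
        byte[] b = bos.toByteArray();
        for (int i = 0; i < b.length; i++)
            System.out.println(Integer.toHexString(b[i] & 0xFF)); // 12 34 56 78
    }
}
```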
When converting primitive data types to a sequence of unsigned bytes using this
style of coding, the use of the right-shift operator versus the unsigned right-shift
operator is also a matter of style. This discussion is momentarily deferred.
Now for the readInt() method. The source code is repeated here for
your convenience:
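The repeated source is not reproduced above; here is a sketch modeled on java.io.DataInputStream.readInt(), without the end-of-file check, for brevity:

```java
class ReadIntSketch {
    // a sketch: read() returns the next unsigned byte as an int,
    // which is why no & 0xff mask is needed here
    static int readInt(java.io.ByteArrayInputStream in) {
        int ch1 = in.read();
        int ch2 = in.read();
        int ch3 = in.read();
        int ch4 = in.read();
        return (ch1 << 24) + (ch2 << 16) + (ch3 << 8) + ch4;
    }
    public static void main(String[] args) {
        byte[] bytes = {0x12, 0x34, 0x56, 0x78};
        int i = readInt(new java.io.ByteArrayInputStream(bytes));
        System.out.println(Integer.toHexString(i)); // 12345678
    }
}
```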
If you are confused as to why this example uses & 0xff and the readInt()
method does not, the difference is data types. This example is accessing a
byte[]. Reading from an input stream in the java.io package returns an
int. I will explain the significance of this difference momentarily.
There is only one matter of style to discuss when converting from a sequence
of unsigned bytes to a primitive data type (or reading binary data). There are
four different operators that can be used to “add” or “reassemble” the unsigned
bytes after they have been shifted back to their original positions: +, |, +=, and
|= . The only appreciable difference from a performance perspective is
that the compound operators are slower. For example,
import java.util.*;
class Test {
static int retval=0;
static int[] unsignedBytes = {'t','e','s','t'};
public static void main(String[] args) {
test1(10000000, true);
test2(10000000, true);
test3(10000000, true);
test4(10000000, true);
System.out.println();
These tests accurately reflect coding styles found in the core API. I would cau-
tion against making too much out of the difference between the + and | opera-
tors (or between the += and |= operators for that matter). With so many
iterations (two billion in the last test), those differences are best described as
minuscule. I say this because I strongly prefer the | (inclusive OR) operator
as a matter of style because it does not require parentheses around the
operands and simply looks better.
In fact, I think the bitwise operators should be used exclusively or not at
all. That excludes any use of compound assignment operators in bitwise pro-
gramming. The majority of the core API programmers apparently agree because
the | (inclusive OR) operator is definitely the most commonly used operator in
methods such as readInt(). Use of the compound assignment operators
when converting a sequence of unsigned bytes to a primitive data type is far
less common. Here is a very clean implementation of the readInt() method
using only bitwise operators and eliminating the unnecessary parentheses:
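A sketch of what such an implementation can look like, assuming a read() method that (as in the java.io package) returns each unsigned byte in an int; the class and the simulated stream are mine, not the core API's:

```java
// Sketch of a readInt()-style conversion using only bitwise operators.
// The read() calls of the real method go to an underlying stream; here
// they are simulated with a byte[] and an index (an assumption).
class ReadIntSketch {
    private final byte[] buf;
    private int pos;

    ReadIntSketch(byte[] buf) { this.buf = buf; }

    // stand-in for InputStream.read(): the next unsigned byte in an int
    int read() { return buf[pos++] & 0xff; }

    // | binds more loosely than <<, so no parentheses are required
    int readInt() {
        int ch1 = read();
        int ch2 = read();
        int ch3 = read();
        int ch4 = read();
        return ch1 << 24 | ch2 << 16 | ch3 << 8 | ch4;
    }

    public static void main(String[] args) {
        ReadIntSketch in = new ReadIntSketch(new byte[] {0x12, 0x34, 0x56, 0x78});
        System.out.println(Integer.toHexString(in.readInt())); // prints 12345678
    }
}
```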
if(byteOrder == 0xFE) {
sectDirStart = is.read();
sectDirStart += is.read()<<8;
sectDirStart += is.read()<<16;
sectDirStart += is.read()<<24;
}
else {
sectDirStart = is.read()<<24;
sectDirStart += is.read()<<16;
sectDirStart += is.read()<<8;
sectDirStart += is.read();
}
Here is what this example would look like using the | (inclusive OR) operator:
if(byteOrder == 0xFE) {
sectDirStart = is.read() |
is.read() << 8 |
is.read() << 16 |
is.read() << 24;
} else {
sectDirStart = is.read() << 24 |
is.read() << 16 |
is.read() << 8 |
is.read();
}
When reading from an input stream, the order of the shift distances must reflect
the big- or little-endian byte order of the input data.
The most important matter of style is one that I would argue should not be a
matter of style. Most of the examples in this section conform to the following
programmer conventions:
• When converting primitive data types to a sequence of unsigned bytes (or
writing binary data), the programmer convention is to (right) shift and then
mask
• When converting a sequence of unsigned bytes to a primitive data type (or
reading binary data), the programmer convention is to mask and then (left)
shift. However, masking is only necessary when the unsigned bytes are
stored in a byte or byte[]
The main reason why these are (or rather should be) programmer conventions
and not matters of style is that the alternatives require more complicated bit
masks. For example,
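The core API code the book refers to is not reproduced here, but the alternative for writing, masking first and then shifting, has this general shape (the class and method names are assumed):

```java
// Masking before shifting forces the wider, more complicated masks,
// and the most significant byte must then be shifted with >>>.
class MaskThenShift {
    static int[] toUnsignedBytes(int i) {
        int b1 = (i & 0xff000000) >>> 24;  // >>> is required here, not a matter of style
        int b2 = (i & 0x00ff0000) >> 16;
        int b3 = (i & 0x0000ff00) >> 8;
        int b4 =  i & 0x000000ff;
        return new int[] {b1, b2, b3, b4};
    }

    // the conveyor-belt convention: shift first, then one simple 0xff mask
    static int[] shiftThenMask(int i) {
        return new int[] {i >>> 24, i >> 16 & 0xff, i >> 8 & 0xff, i & 0xff};
    }

    public static void main(String[] args) {
        int[] a = toUnsignedBytes(0x80402010);
        int[] b = shiftThenMask(0x80402010);
        for (int k = 0; k < 4; k++)
            System.out.println(a[k] + " " + b[k]);  // the two forms agree
    }
}
```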
Note also that the unsigned right-shift operator is required when shifting the
most significant byte. Its use when masking and then shifting is definitely not a
matter of style. I would argue, too, that right shifting and then masking (the con-
veyor belt) is more intuitive than this alternative.
Here is a comparable example of converting a sequence of unsigned bytes
to a primitive data type from a class in the com.sun package:
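The com.sun code itself is not reproduced here; its general shape, shifting first and then applying the wider masks, can be sketched as follows (the names are mine):

```java
// Reading the other way around: left shift first, then mask. The masks
// are more complicated, though the result is the same as the convention
// of masking each byte with 0xff before shifting.
class ShiftThenMaskRead {
    static int toInt(byte[] b) {
        return b[0] << 24 & 0xff000000 |
               b[1] << 16 & 0x00ff0000 |
               b[2] << 8  & 0x0000ff00 |
               b[3]       & 0x000000ff;
    }

    // the conventional form: mask with 0xff, then shift
    static int conventional(byte[] b) {
        return (b[0] & 0xff) << 24 | (b[1] & 0xff) << 16 |
               (b[2] & 0xff) << 8  |  b[3] & 0xff;
    }

    public static void main(String[] args) {
        byte[] b = {(byte)0x80, 0x40, 0x20, 0x10};
        System.out.println(toInt(b) == conventional(b)); // prints true
    }
}
```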
Here again the more complicated bit masks are the primary problem. However,
this too suffers from the problem of being less intuitive.
Now I can discuss the use of the right-shift operator versus the unsigned right-
shift operator when converting primitive data types to sequences of unsigned
bytes. The following is from 4.5.6 Bitwise Operators &, |, ^, >>, >>>, and <<.
In practice, the signed and unsigned right shift operators are most
often used interchangeably as conveyor belts at the end of which bits
are processed in groups of four or eight. Sign extension versus zero
filling is a moot point because the “conveyor belt” effectively stops
after the last bit is processed (bit 31 or 63). Thus the use of the signed
or unsigned right shift operator becomes a matter of style.
This is only true when using the programmer convention of (right) shift-
ing and then masking. To show how this works, let us look at the difference
between the right-shift and unsigned right-shift operators:
class Test {
public static void main(String[] args) {
int i = 0x80000000; //sets the sign bit to 1
System.out.println(BitPattern.toBinaryString(i));
System.out.println(BitPattern.toBinaryString(i >>> 24));
System.out.println(BitPattern.toBinaryString(i >> 24));
}
}
I refer to shifting in multiples of eight as byte shifting. (In the previous section
there are examples of right shifting with a shift value of four and using a 0xF
nybble mask. For consistency’s sake, this could be referred to as nybble shifting.)
Now consider the same example with a 0xff mask added:
class Test {
public static void main(String[] args) {
int i = 0x80000000; //sets the sign bit to 1
System.out.println(BitPattern.toBinaryString(i));
System.out.println(BitPattern.toBinaryString
(i >>> 24 & 0xff));
System.out.println(BitPattern.toBinaryString
(i >> 24 & 0xff));
}
}
This is not the same output as before. The sign extended 1 bits in the second bit
pattern are now 0 because of the bitwise AND operator.
If you would like to see an example of when the unsigned right-shift operator
is absolutely necessary, take a look at this slightly modified version of a method
in com.sun.media.sound.SunFileReader:
int toLittleEndian(int i) {
int b1, b2, b3, b4 ;
b1 = (i & 0xFF) << 24 ;
b2 = (i & 0xFF00) << 8;
b3 = (i & 0xFF0000) >> 8;
b4 = (i & 0xFF000000) >>> 24;
return b1 | b2 | b3 | b4;
}
I have tried several alternatives for converting from big- to little-endian and think
this programmer has it right. The unsigned right-shift operator is absolutely nec-
essary because if i were negative, the signed right-shift operator would result in
a very large negative number instead of an unsigned byte in the range of 0 to
255. The do statement in the toHexString(int i) method in 4.7.2 Con-
verting Nybbles to Hexadecimal Digits is another example in which using the
unsigned right-shift operator is required. Such examples, however, are rare. This
finishes the discussion of matters of style. I will now address more significant
issues such as inadvertent sign extension.
Inadvertent sign extension resulting from numeric promotion is a menacing
bug that constantly threatens bitwise programmers. Therefore it is important to
understand the following.
If an unsigned byte is incorrectly promoted and the most significant bit happens
to be 1 the value of the unsigned byte changes as a result. For example,
class Test {
public static void main(String[] args) {
/*
* These are the values of the unsigned bytes
* in an int that is equal to 255
*/
byte b1 = 0;
byte b2 = 0;
byte b3 = 0;
byte b4 = -1; //all bits are set to 1
int promoted = b4;       // numeric promotion sign extends the byte
System.out.println(promoted);
System.out.println(BitPattern.toBinaryString(promoted));
int masked = b4 & 0xff;  // masking prevents the sign extension
System.out.println(masked);
System.out.println(BitPattern.toBinaryString(masked));
}
}
-1
11111111 11111111 11111111 11111111
255
00000000 00000000 00000000 11111111
It is imperative that whenever you are reading binary data and converting
sequences of unsigned bytes to primitive data types, you stop and consider
the data type of the unsigned bytes. If it is either byte or byte[], masking is
required to prevent inadvertent sign extension resulting from numeric promotion.
I should point out, however, that the most significant byte never really needs to
be masked. In the example above, the alternative of b1 << 24 would indeed
result in sign extension of the left-hand operand, but all the sign extended bits do
an exit stage left. Nevertheless, I have never seen a programmer actually omit
the & 0xff for the most significant byte when left shifting an unsigned byte
stored in either a byte or byte[].
There is an unwritten rule when converting a sequence of unsigned bytes not
stored in either a byte or byte[] (which typically means stored in an int) to
a primitive data type. A 0xff mask is not used on the value of the unsigned
byte prior to left shifting. This explains why readInt() and other conversion
methods that read from byte streams do not use & 0xff on the value of the
unsigned byte. (The result type of the read() method in the java.io pack-
age is int.) There are two reasons for this:
• Assuming that the value of the int or other non-byte integral data type is
actually an unsigned byte, masking is completely unnecessary because all of
the bits above the low-order byte are already zero
return Double.longBitsToDouble(
((buf[pos + 0] & 0xffL) << 56) +
((buf[pos + 1] & 0xffL) << 48) +
((buf[pos + 2] & 0xffL) << 40) +
((buf[pos + 3] & 0xffL) << 32) +
((buf[pos + 4] & 0xffL) << 24) +
((buf[pos + 5] & 0xffL) << 16) +
((buf[pos + 6] & 0xffL) << 8) +
((buf[pos + 7] & 0xffL) << 0));
These are the return statements from the readLong() and readDouble()
methods in ObjectInputStream. Note also that shift distances
greater than 31 are only meaningful in expressions that evaluate to long.
When the left-hand operand is an int, the shift distance is reduced modulo 32
(for a long, modulo 64), so the excess distance is simply ignored.
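This can be demonstrated directly. Per the JLS, only the low-order five bits of the shift distance are used for an int shift (six bits for a long), so a distance of 32 applied to an int leaves the value unchanged:

```java
// The shift distance is reduced modulo the width of the promoted
// left-hand operand, so shifting an int by 32 is a shift by zero.
class ShiftDistance {
    public static void main(String[] args) {
        int i = 1;
        System.out.println(i << 32);        // prints 1 (distance 32 & 0x1f == 0)
        System.out.println((long)i << 32);  // prints 4294967296
    }
}
```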
A (long) cast operator is sometimes used to promote the value being
shifted. Here is an example from the java.io package:
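The java.io code is not reproduced above; readLong() in DataInputStream is written along the following lines (the surrounding class here simulates the stream and is my own):

```java
// Sketch in the style of DataInputStream.readLong(): the first int is
// promoted with a (long) cast before shifting, and the second is masked
// with 0xFFFFFFFFL to undo its sign extension.
class ReadLongSketch {
    private final int[] ints;
    private int pos;

    ReadLongSketch(int[] ints) { this.ints = ints; }

    int readInt() { return ints[pos++]; }  // stand-in for DataInput.readInt()

    long readLong() {
        // the parentheses around (readInt()) mirror the style discussed
        // in the text; they are unnecessary
        return ((long)(readInt()) << 32) + (readInt() & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        ReadLongSketch in = new ReadLongSketch(new int[] {0x01020304, 0x05060708});
        System.out.println(Long.toHexString(in.readLong())); // prints 102030405060708
    }
}
```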
Note that the parentheses around readInt() are unnecessary. This example
falls under the rule of using the bitwise operators exclusively or not at all. As a
matter of style, it should have been written as follows.
The cyclic redundancy check (or CRC) value is stored in an int. What you have to keep
in mind when looking at this code is that the cast operator is a unary operator.
Only the value of crc is cast. At that point the crc is incorrectly sign extended,
the effect of which is negated by the 0xffffffffL mask. You should always
be on the lookout for this mistake of using both a cast operator and bit masking
on the same value. Although I have seen several examples of this being done in
the core API, it is always a mistake. If it is a narrowing primitive conversion that
requires a cast operator (typically storing the value of an int in a byte or
byte[]), then & 0xff masking is redundant.
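A two-line demonstration of that redundancy:

```java
// Narrowing an int to a byte already discards the high-order 24 bits,
// so a 0xff mask applied before the cast adds nothing.
class RedundantMask {
    public static void main(String[] args) {
        int i = 0x12345680;
        byte masked   = (byte)(i & 0xff);  // the mask is redundant
        byte narrowed = (byte)i;           // identical result
        System.out.println(masked == narrowed); // prints true
    }
}
```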
This for loop is from a method that is passed variable-length data; otherwise a
loop would not have been used. The reason is a very common optimization
known as loop unrolling. Loops have significant overhead; every iteration has
to evaluate a boolean type expression to determine if the loop should continue
to execute. For example,
import java.util.*;
class Test {
static int retval=0;
static int[] unsignedBytes = {'t','e','s','t'};
public static void main(String[] args) {
test1(100000, true);
test2(100000, true);
test1(1000000, false);
test2(1000000, false);
test1(10000000, false);
test2(10000000, false);
test1(100000000, false);
test2(100000000, false);
test1(1000000000, false);
test2(1000000000, false);
}
static void test1(int count, boolean JVMwarmup) {
Microbenchmark timer = new Microbenchmark().start();
for (int i=0; i<=count; i++) {
int ch1 = unsignedBytes[0];
int ch2 = unsignedBytes[1];
int ch3 = unsignedBytes[2];
int ch4 = unsignedBytes[3];
retval = ch1 << 24 | ch2 << 16 | ch3 << 8 | ch4 << 0;
}
timer.stop();
System.out.println(count + " iterations: " + timer.getElapsedTime());
}
// test2(), which uses a for loop instead of the unrolled reads, is elided
}
This code is modeled after the core API example, and so uses retval as a
variable name. I have gone out of my way so as not to give loop unrolling an
unfair advantage:
• The numeric literal 4 is used as the loop control variable instead of
unsignedBytes.length
• The & 0xff mask has been removed because it is normal to assume that
an int used as an unsigned byte actually does have a value <= 255
• The | operator is used instead of +
Here is some typical output on my computer:
int address;
You will recall from the discussion above that the parentheses around the shift
operations are unnecessary, and using the unsigned right-shift operator is purely
a matter of style. The maximum string length is 15 so the default initial capacity
of a StringBuffer is adequate.
My hope is that, because of the more complicated bit masks, you immediately
recognized that the programmer convention for converting a sequence of
unsigned bytes stored in a byte or byte[] to a primitive data type is not
being used. As a matter of style, the bitwise operators should be used exclu-
sively, or not at all. I would have written this as follows.
address = (addr[0] & 0xff) << 24 | (addr[1] & 0xff) << 16 |
          (addr[2] & 0xff) << 8  |  addr[3] & 0xff;
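The string-building half of the discussion, a dotted-quad string of at most 15 characters built in a StringBuffer with the default capacity of 16, can be sketched as follows (the method name is hypothetical):

```java
// Converts the four unsigned bytes of an int address to dotted-quad
// form. The longest possible result, "255.255.255.255", is 15 chars,
// so the default StringBuffer capacity of 16 is never exceeded.
class DottedQuad {
    static String toDottedQuad(int address) {
        StringBuffer sb = new StringBuffer();
        sb.append(address >>> 24).append('.')
          .append(address >> 16 & 0xff).append('.')
          .append(address >> 8 & 0xff).append('.')
          .append(address & 0xff);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDottedQuad(0xC0A80001)); // prints 192.168.0.1
    }
}
```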
Do you see how these conversion methods are all the same? Now consider RGB
colors.
Digital images are stored in an int[] in which the elements are referred to
as pixels (or picture elements). The size of a digital image is measured in pixels.
The X and Y coordinates for a particular pixel are not actually stored in the int;
they are a function of how the image is displayed and of the position of the pixel
in the array. The int is used to store only the color information. An int is sufficient
to display images in True Color, which is usually described as “more than 16 mil-
lion colors.” True Color is simply 256³, or 16,777,216, colors using RGB colors
(Red, Green, Blue), which are stored as a sequence of four unsigned bytes. The Java pro-
gramming language uses a standard RGB color space referred to as sRGB, which
is the default color space for the Java 2D package. For more information on
sRGB, see www.w3.org/pub/WWW/Graphics/Color/sRGB.html.
Why four unsigned bytes if there are only three colors? The first byte is
referred to as the Alpha value, and is explained as follows in the API docs for
the java.awt.Color class:
Look familiar? This is exactly the same code used by the readInt() method,
with one important exception. The & 0xff bit masking is not required if the int
type parameters are indeed passing unsigned bytes, so we should expect the
API docs to include some notice that the arguments passed are “reduced” mod-
ulo 256. Surprisingly, the API docs for this constructor say only the following.
Creates an sRGB color with the specified red, green, blue, and alpha
values in the range (0 - 255).40
I would have made the warning more explicit, even though you can expect
graphic programmers to be fully aware of this issue. Here is some code from
the getDataElements(int rgb, Object pixel) method that converts
a pixel into individual RGB colors:
One of those irreverent programmers doing things out of order! Otherwise, per-
fectly written in terms of following the programmer convention and not doing
anything unnecessary. (The parentheses are unnecessary.)
class Test {
public static void main(String[] args) {
for (int i=0; i<= 5; i++) {
if ((i & 1) == 0)
System.out.println(i + " is even");
if ((i & 1) != 0)
System.out.println(i + " is odd");
}
}
}
0 is even
1 is odd
2 is even
3 is odd
4 is even
5 is odd
The bitwise AND operator should always be used to make this determination.
The following if statement can be used to test for powers of two.
For example,
class Test {
public static void main(String[] args) {
// (i & -i) == i is one common form of the power-of-two test;
// the original line of the book is not preserved here
for (int i = 0; i <= 4096; i++)
if ((i & -i) == i)
System.out.println(i);
}
}
0
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
This use of bitwise operators is very common in methods that rehash hash
tables. For example,
If you need to test for a power of two there is nothing more efficient.
The core API uses at least three different programming techniques for align-
ing text to a four-byte boundary. The first uses the remainder operator:
class Test {
public static void main(String[] args) {
String string = "123456";
int length = string.length();
int align = 4 - (length % 4);
if (align == 4)
align = 0;
length = length + align;
System.out.println(length);
}
}
The second technique uses the bitwise AND operator:
class Test {
public static void main(String[] args) {
String string = "123456";
int length = string.length();
int incr = length & 3;
if (incr != 0) {
incr = (4 - incr);
length += incr;
}
System.out.println(length);
}
}
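A third technique, and arguably the tightest, rounds up in a single expression by adding three and then clearing the two low-order bits (a sketch; not necessarily the exact form used in the core API):

```java
// Rounds length up to the next multiple of four in one expression:
// adding 3 carries any partial word over the boundary, and & ~3
// clears the two low-order bits.
class Align4 {
    static int align(int length) {
        return (length + 3) & ~3;
    }

    public static void main(String[] args) {
        System.out.println(align(6)); // prints 8
        System.out.println(align(8)); // prints 8
    }
}
```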
This is an argument check from the core API. It could just as well have been writ-
ten as follows.
if (temp[0] < 0)
throw new IllegalArgumentException("negative BigInteger");
This line of code is from the java.math package, however, and the responsi-
ble programmer is a master of bitwise programming.
All of the other bits except the sign bit can be masked using a hexadecimal
literal that begins 0x7f. For example, the sign bit for an int is masked using &
0x80000000 and all of the other bits are masked using & 0x7fffffff.
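For example, hash table implementations in the core API compute a bucket index along these lines (a sketch; the method name is mine):

```java
// The classic idiom: clear the sign bit so the remainder is guaranteed
// to be nonnegative, then reduce modulo the table length.
class BucketIndex {
    static int indexFor(int hash, int tableLength) {
        return (hash & 0x7FFFFFFF) % tableLength;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(-42, 11)); // a valid index even for a negative hash
    }
}
```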
You should be familiar with this and similar lines of code because many classes
declare their own hash tables and include code such as this. See also 4.11.2
Understanding Hash Tables in Volume 1.
Now we begin to look at some of the more questionable uses of the bitwise
operators in the core API. A variable that starts at zero and is incremented at
most once using the ++ operator will have the low-order bit set if and only if it is
not equal to zero. Thus the low-order bit can be masked as a means of testing if
the variable has been incremented at all. Here is an example from the core API:
This method returns true if hits has been incremented during execution of
the method. Here is a more elaborate example, also from the core API:
int crossings = 0;
…
while (--roots >= 0) {
if (x < res[roots]) {
crossings++;
}
}
return ((crossings & 1) == 1);
In the first case the expression in the return statement could have been coded
hits > 0, which would have been more readable. In the crossings example,
however, the & 1 test checks for an odd number of crossings (the even-odd
rule), so crossings > 0 would not be an equivalent replacement.
4.8 Statements
A Java program is executed one statement at a time. Every statement in Java is
one of the following:
class Test {
public static void main(String[] args) {
LocalClass localClass = new LocalClass();
class LocalClass { }
}
}
If the two lines of code in the main method are reversed, this program compiles
without generating any compiler errors. The same is true of local variable decla-
ration statements.
The empty statement is nothing more than a semicolon, and is similar to
the syntax used to indicate a missing method implementation in either
abstract or native method declarations. The empty statement can be
used anywhere statements are used, including as the contained statement in a
control-flow statement. In that case, I can only make the analogy that this is like
a programming language shooting itself in the foot. For example,
class Test {
public static void main(String[] args) {
if (2+2!=4);
System.out.println("akin to the dangling else problem");
for (int i=0; i>Integer.MAX_VALUE; i++);
System.out.println("better hope this doesn't work");
}
}
The problem is very simply that programmers habitually type the semicolon at
the end of statements, but in the case of control-flow statements such a semico-
lon inadvertently becomes an empty statement. If the continue statement
without label could be used in if-then-else statements, there would be
no need for “empty” statements. I suspect the next programming language
This is the equals(Object obj) method in the String class (the notori-
ously slow string comparison). The second if-then statement in bold immedi-
ately contains a local variable declaration statement and yet another if-then
statement. It does not immediately contain any of the other local variable decla-
ration statements, the for loop, or either of the first two return statements.
The difference between the terms contains and immediately contains is criti-
cally important in explaining the syntax of the Java programming language.
class Test {
public static void main(String[] args) {
int x = 2 + 2;
if (x = 4) //COMPILER ERROR: x = 4 is of type int, not boolean
System.out.println("what's wrong with this picture?");
}
}
Note that the for loop has two versions of an infinite loop. The for (;;) and
while (true) infinite loops are by far the most common. The choice is
purely a matter of style. There is no difference whatsoever in the code that is
generated. For example,
class Test {
void forLoop() {
for (;;)
System.out.println("Hello World");
}
void whileLoop() {
while (true)
System.out.println("Hello World");
}
}
The decompiled code for both the forLoop() and whileLoop() methods is
identical:
0 goto 3
3 getstatic #2 <Field java.io.PrintStream out>
6 ldc #3 <String "Hello World">
8 invokevirtual #4 <Method void println(java.lang.String)>
11 goto 3
The compiler recognizes all four of these as infinite loops. Consequently, any
statements after the unconditional execution of one of these statements are
unreachable.
There are only two exceptions to the rule that control flow expressions eval-
uate parenthesized Boolean type expressions:
if (n == 0)
zero++;
else if (n == 1)
one++;
else if (n == 2)
two++;
else if (n == 3)
three++;
else
four++;
switch(n) {
case 0: zero++;
break;
case 1: one++;
break;
case 2: two++;
break;
case 3: three++;
break;
default: four++;
break;
}
import java.util.Random;
class Test {
public static void main(String[] args) {
test1(100000, true);
test2(100000, true);
test1(1000000, false);
test2(1000000, false);
test1(10000000, false);
test2(10000000, false);
test1(100000000, false);
test2(100000000, false);
test1(1000000000, false);
test2(1000000000, false);
}
Even at one billion iterations the difference between a switch and nested if-
then-else statements is only 14.374 seconds in an average runtime of approx-
imately two minutes and fifteen seconds (a difference of about seven percent).
The relationship between the switch statement and a “multi-way if-else” is
comparable to the relationship between the conditional operator and an if-then-
else statement. If you can use a switch statement or the conditional
operator, they are generally preferable to the alternative. In the case of a
switch statement, this is largely a matter of the type of the expression being
evaluated. These are more a matter of style, however, than performance.
This last point raises the question of whether switch statements should be used
when there are only one or two cases. There are numerous examples of switch
statements with only one or two cases in the core API. Here are some examples
from the Component class:
switch(id) {
case FocusEvent.FOCUS_GAINED:
listener.focusGained(e);
break;
case FocusEvent.FOCUS_LOST:
listener.focusLost(e);
break;
}
switch(id) {
case MouseEvent.MOUSE_WHEEL:
listener.mouseWheelMoved(e);
break;
}
These are examples of using switch statements for method dispatch, which is
now generally recognized as a “code smell” (see page 119). When I see exam-
ples such as these I am forced to conclude one of three things:
• The programmer is habituated to the use of switch statements
• Additional case constants are anticipated
• The programmer thinks switch statements are much faster than a compa-
rable if-then or if-then-else statement
In the case of these examples from the Component class, I am inclined to think
it is a combination of being habituated and anticipating additional cases. Both of
these are understandable. It is also understandably a matter of style when the
case constants are stacked, as with the following argument check from the core
API.
switch (orientation) {
case VERTICAL:
case HORIZONTAL:
break;
default:
throw new IllegalArgumentException(
"orientation must be one of: VERTICAL, HORIZONTAL");
}
In this case a comparable if-then statement would not only have to repeat the
name of the orientation argument but would have to be stated negatively:
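Stated negatively, such an argument check might read as follows (a sketch; the constants and their values are assumed):

```java
// The negative form must repeat the argument name and invert the test.
class OrientationCheck {
    static final int VERTICAL = 0, HORIZONTAL = 1; // assumed values

    static void check(int orientation) {
        if (orientation != VERTICAL && orientation != HORIZONTAL)
            throw new IllegalArgumentException(
                "orientation must be one of: VERTICAL, HORIZONTAL");
    }

    public static void main(String[] args) {
        check(VERTICAL); // accepted silently
        try {
            check(2);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```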
if (condition) {
statements;
}
if (condition) {
statements;
} else {
statements;
}
if (condition) {
statements;
} else if (condition) {
statements;
} else {
statements;
}
Note: if statements always use braces {}. Avoid the following error-
prone form:
Note that none of the recommended styles uses indentation alone (a contained
statement indented on the next line without braces). I cannot agree with
the last point, however. Sun is not practicing what it preaches: if the contained state-
ment fits on a single line, if-then statements in the core API often do not use braces.
This code is legal because the contained if-then is a single statement (though it
spans more than one line). Examples such as these are extremely error-
prone, and the use of braces in the for statement should not be regarded as a
matter of style.
switch (expression) {
case CASE_CONSTANT: statementGroup;
…
default: statementGroup;
}
The type of the expression must be char, byte, short, or int. The body of
a switch statement that follows is called a switch block (and is required
syntax). The difference between a switch block and other blocks is that all of
the statements in a switch block are labeled with one or more switch
labels. The term switch label refers to both case labels and the default
label. The statement or statements that follow a switch label are referred to
as a switch block statement group (or statement group for short). State-
ment “groups” are something of a quirk in the syntax of the Java programming
language. Everywhere else “blocks” are used. Actually, a statement group can
be enclosed in a block, but doing so is never required.
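For example, a statement group can be wrapped in a block to limit the scope of a local variable (a sketch of my own, not core API code):

```java
// The braces around the case 0 statement group are optional; here they
// limit the scope of s to that group.
class StatementGroup {
    static String describe(int n) {
        switch (n) {
            case 0: {               // a block enclosing the statement group
                String s = "zero";
                return s;
            }
            default:
                return "nonzero";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(0)); // prints zero
        System.out.println(describe(5)); // prints nonzero
    }
}
```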
switch(id) {
case KeyEvent.KEY_PRESSED:
case KeyEvent.KEY_RELEASED:
case MouseEvent.MOUSE_PRESSED:
case MouseEvent.MOUSE_RELEASED:
case MouseEvent.MOUSE_MOVED:
case MouseEvent.MOUSE_DRAGGED:
case MouseEvent.MOUSE_ENTERED:
case MouseEvent.MOUSE_EXITED:
case MouseEvent.MOUSE_WHEEL:
case InputMethodEvent.INPUT_METHOD_TEXT_CHANGED:
case InputMethodEvent.CARET_POSITION_CHANGED:
consumed = true;
break;
default:
// event type cannot be consumed
}
This example is formatted exactly as it is in the original source code. Note that
the case labels are indented. The “Code Conventions for the Java Programming
Language” document prefers a different style of coding switch statements:
switch (condition) {
case ABC:
statements;
/* falls through */
case DEF:
statements;
break;
case XYZ:
statements;
break;
default:
statements;
The switch labels are not indented and there is a blank line after each
switch block statement group. Many API programmers respect the former
convention, but few the latter. In fact, examples of switch statements in the
core API that use a blank line after each switch block statement group are
hard to find. Personally, I do not like either convention.
The type of a case constant must be assignment compatible with the type of
the parenthesized expression in a switch statement. For example,
switch (0) {
case 'a':
System.out.println("letter");
break;
}
class Test {
public static void main(String[] args) {
final int i = Byte.MAX_VALUE + 1;
byte b = 0;
switch (b) {
case i: //COMPILER ERROR: 128 is not assignable to byte
System.out.println("number");
break;
}
}
}
Considering that the JLS clearly states that this example should not compile, this
error message is not very intuitive. It really should be fixed.
Case constants must be either compile-time constant expressions (which is
typically either a character or integer literal) or an inlined constant. The term
inlined constant is defined in 1.5.2 Inlined Constants. Variables (or normal
constants for that matter) specifically cannot be used. That is why they are
called case constants. Furthermore, the value of each case constant must be
unique. For example,
class Test {
public static void main(String[] args) {
switch (0) {
case 'a': //the integral value of which is 97
System.out.println("letter");
break;
case 97: //COMPILER ERROR: duplicate case label ('a' is also 97)
System.out.println("number");
break;
}
}
}
switch (status) {
default:
case IMAGEERROR:
flags |= ImageObserver.ERROR | ImageObserver.ABORT;
break;
case IMAGEABORTED:
flags |= ImageObserver.ABORT;
break;
case STATICIMAGEDONE:
flags |= ImageObserver.ALLBITS;
break;
case SINGLEFRAMEDONE:
flags |= ImageObserver.FRAMEBITS;
break;
}
This would not read as well if the default label were moved.
A switch statement is executed by first evaluating the expression in paren-
theses. The statements in a switch block are executed beginning at the first
statement following a case constant equal to the value of the expression (some-
times called a matching case label). If there is no matching case label, the
default statement group is executed. If there is neither a matching case label nor
a default label, the entire switch statement completes normally without
executing any statement group. For example,
class Test {
public static void main(String[] args) {
test(0);
test(1);
switch (1) {
case 0: System.out.println("nothing prints");
break;
}
}
static void test(int i) {
switch (i) {
case 0: System.out.println("matching case label");
break;
default: System.out.println("default case label");
break;
}
}
}
Note that the last statement in each of these statement groups is break. This
is required (except for the last statement group in a switch block) because
once a matching case label is found execution of the statements contained in a
switch block continues until either a control-transfer statement is executed or
the switch block completes normally (by falling through to the closing brace of
the switch block). This somewhat counterintuitive behavior of switch state-
ments is referred to as falling through. For example,
class Test {
public static void main(String[] args) {
int n = 0;
switch (n) {
case 0: System.out.print('H');
case 1: System.out.print('e');
case 2: System.out.print('l');
case 3: System.out.print('l');
case 4: System.out.println('o');
}
}
}
class Test {
public static void main(String[] args) {
int x = 0;
switch (x) {
case 1:
System.out.println("one");
break;
case 2:
System.out.println("two");
case 3:
System.out.println("three");
}
}
}
switch (type) {
case '1':
default:
formattedNum = String.valueOf(itemNum);
break;
case 'A':
uppercase = true;
// fall through
case 'a':
formattedNum = formatAlphaNumerals(itemNum);
break;
case 'I':
uppercase = true;
The switch statement is often criticized for falling through. Exiting the switch
block after executing the statement group of the matching case label (or the
default label, as the case may be) would be more intuitive, but considerably
less flexible from a language design perspective.
The ForInit part of the for statement header must be either a comma-separated
list of top-level expressions or a local variable declaration statement. These
are mutually exclusive options (which I will explain momentarily). The ForUp-
date part must likewise be a comma-separated list of top-level expressions.
class Test {
public static void main(String[] args) {
int j;
for (int i=0, j=0; ; ) //COMPILER ERROR
continue;
}
}
This example shows why a local variable declaration statement cannot be mixed
with top-level expressions. If that were the case, j=0 could be either a declara-
tor or assignment expression. Here are some for statement headers that do
compile:
//top-level expressions
int i, j=0;
for (i=0, j++, System.out.println("starting loop");;)
The top-level expression can be any expression in Table 4.2 Expressions that
Produce Side Effects on page 402.
The local variable declaration statement may have multiple declarators, but
the fact that there can only be one local variable declaration statement means
that the type of all the variables declared will necessarily be the same. In the
example above, three int type variables are declared.
class Test {
public static void main(String[] args) {
ThreadGroup tg = Thread.currentThread().getThreadGroup();
for (ThreadGroup tgn = tg;
tgn != null;
tg = tgn, tgn = tg.getParent());
}
}
Can you tell what this code is doing? Both of these examples execute an empty
statement. This style of coding is also recommended in the “Code Conventions
for the Java Programming Language” document:
An empty for statement (one in which all the work is done in the initial-
ization, condition, and update clauses) should have the following form:
[end of quote]45
I take exception with this, however. In all likelihood, such constructs will generate
a compiler warning in some future release of the Java programming language
45. Unascribed, “Code Conventions for the Java Programming Language” available online at
java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html, (Mountain View: Sun Micro-
systems, 1995-1999), §10.5.1, “Parentheses.”
For example,
class Test {
public static void main(String[] args) {
int[] placeValues = new int[8];
placeValues[0] = 1;
for (int i = 1; i < placeValues.length; i++)
placeValues[i] = placeValues[i-1] << 1;
int total = 0;
for (int i = 0; i < placeValues.length; i++) {
total += placeValues[i];
System.out.println(placeValues[i]);
}
System.out.println();
System.out.println(total == Byte.MAX_VALUE +
Math.abs(Byte.MIN_VALUE));
}
}
true
Again, this idiom uses two loop variables, and the second variable, n, is
used to avoid the cost of performing redundant computation on every
iteration. As a rule, you should use this idiom if the loop test involves a
method invocation and the method invocation is guaranteed to return
the same result on each iteration.46
In this example, n could be considered a stack variable, but Bloch does not actu-
ally suggest the use of a different idiom for iterating over the elements of an array.
In fact, in an unrelated item he specifically says that “the standard idiom for loop-
ing through an array does not necessarily result in redundant checks; some
modern JVM implementations optimize them away.”47 Microbenchmark tests can
always put such issues to rest. For example,
class Test {
static int nLoops;
byte[] array0=new byte[nLoops],array1=new byte[nLoops],
array2=new byte[nLoops],array3=new byte[nLoops],
array4=new byte[nLoops],array5=new byte[nLoops],
array6=new byte[nLoops],array7=new byte[nLoops],
array8=new byte[nLoops],array9=new byte[nLoops];
47. Bloch, “Item 39: Use exceptions only for exceptional conditions.”
void test2() {
long count = 0;
Microbenchmark timer = new Microbenchmark().start();
for (int i0=0, n0=array0.length; i0<n0; i0++)
for (int i1=0, n1=array0.length; i1<n1; i1++)
for (int i2=0, n2=array0.length; i2<n2; i2++)
for (int i3=0, n3=array0.length; i3<n3; i3++)
for (int i4=0, n4=array0.length; i4<n4; i4++)
for (int i5=0, n5=array0.length; i5<n5; i5++)
for (int i6=0, n6=array0.length; i6<n6; i6++)
for (int i7=0, n7=array0.length; i7<n7; i7++)
for (int i8=0, n8=array0.length; i8<n8; i8++)
for (int i9=0, n9=array0.length; i9<n9; i9++)
count++;
timer.stop();
System.out.println(count + " using stack variables: " +
timer.getElapsedTime());
}
}
As you can see, there is no difference. Unless you are working on the kind of
intensive graphics application that gave birth to Duff’s Device, I would not even
contemplate these kinds of optimizations.
Identifier: StatementOrBlock
It’s always interesting to study the changes Gilad Bracha made in the Second
Edition of the JLS. He changed this paragraph to read as follows.
Let l be a label, and let m be the immediately enclosing method, con-
structor, instance initializer or static initializer. It is a compile-time error
if l shadows the declaration of another label immediately enclosed in
m.49
For example, the following loops do not compile because the continue target
is ambiguous.50
48. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification (Reading: Addison-
Wesley, 1996), §14.6, “Labeled Statements.” (Do not update.)
49. Gosling et al., §14.7, “Labeled Statements.”
50. Because of a bug in the javac compiler, this example compiled in Java 1.1 and earlier
releases. See Bug Id 1241001.
class Test {
    public static void main(String[] args) {
        int[] array = {1,2,3};
        for (int i=0; i < array.length; i++) {
            if (2+2==4) {
                synchronized (Test.class) {
                    try {
                        continue;
                    }
                    catch(Throwable e) { }
                }
            }
        }
    }
}
51. Gosling et al., The Java Language Specification, §14.15, “The continue Statement.”
52. Gosling et al., The Java Language Specification, §14.14, “The break Statement.”
53. Philip Elmer-Dewitt, “Ghost in the Machine,” Time Magazine, January 29, 1990.
class Test {
    public static void main(String[] args) {
        block: {
            while (true)
                continue block;
        }
    }
}
class Test {
    public static void main(String[] args) {
        label: return;
    }
}
56. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, §2.16, “Exceptions.”
57. Gosling et al., §14.15, “The continue Statement.”
class Test {
    public static void main(String[] args) {
        block:
        {
            System.out.println("in block");
            if (true)
                break block;
            System.out.println("does not print");
        }
        System.out.println("out of block");
    }
}
in block
out of block
class Test {
    public static void main(String[] args) {
        int i = 0, j = 0;
        boolean b = true;
        label:
        while (true) {
            System.out.println("loops");
            if (b)
                break label;
            System.out.println("does not print");
        }
        label:
        if (i == 0)
            if (j == 0) {
                System.out.println("nested if statements");
                if (b)
                    break label;
                System.out.println("does not print");
            }
        label:
        {
            System.out.println("arbitrary blocks");
            if (b)
                break label;
            System.out.println("does not print");
        }
        label:
        synchronized (Test.class) {
            System.out.println("synchronized statements");
            if (b)
                break label;
            System.out.println("does not print");
        }
        label:
        try {
            System.out.println("try statements");
            if (b)
                break label;
            System.out.println("does not print");
        }
        catch (Exception e) {}
        label:
        switch (i) {
            case 0:
                switch (j) {
                    case 0: System.out.println("switch statements");
                        if (b)
                            break label;
                }
                System.out.println("does not print");
        }
    }
}
loops
nested if statements
arbitrary blocks
synchronized statements
try statements
switch statements
The break statement has uses other than breaking out of nested loops, but it is
helpful to understand the history behind this statement. It is really a highly con-
strained goto statement, as is throwing an exception.
4.9 Blocks
A block is a sequence of zero or more statements within braces. Table 4.13 lists all of the
places where blocks are required syntax. In addition to these blocks, most con-
59. James Gosling and Henry McGilton, The Java Language Environment: A White Paper,
§2.2.5, “No More Goto Statements.”
class Test {
    public static void main(String args[]) {
        {
            String s = "This is a completely arbitrary block";
            System.out.println(s);
        }
    }
}
The term arbitrary block is of my own making, but one that I feel is important
and very useful in discussing the language. One of the primary uses of arbitrary
blocks is as break or continue targets. The following example from 4.8.3.2
The break Statement is repeated here for your convenience:
class Test {
    public static void main(String[] args) {
        block:
        {
            System.out.println("in block");
            if (true)
                break block;
            System.out.println("does not print");
        }
        System.out.println("out of block");
    }
}
60. Surprisingly, the JLS does not explicitly say so. I know because I have diligently searched for
such a statement in both editions of the JLS. That a block can always be substituted for a statement
is implicit in the “grammatical productions” of 14.5 Statements and again in 18.1 The Grammar of
the Java Programming Language.
in block
out of block
class Test {
    static { };
    { };
    int i = 0;;
    void method() { };
};
This program compiles even though only the first semicolon after the field decla-
ration is legal according to the original JLS. The main Bug Id is 4057172 (dated
June 6, 1997), the evaluation for which says:
This is a bug. The relevant productions are in JLS 19.8.3 et al. Proba-
bly there should be a release with warning messages prior to making
these hard errors, since these typos are likely to be somewhat wide-
spread.
xxxxx@xxxxx 1997-06-06
The TRC has determined that due to the widespread use of extra semi-
colons in practice, it is more desirable to modify the spec to allow their
use than to break that much existing code. I'm changing this bug over
to the specification subcategory so that the grammar will be updated in
the next version of the JLS.
xxxxx@xxxxx 1997-07-15 61
This really is a problem for “write once, compile anywhere,” albeit a minor one.
For example, the jikes compiler will issue a warning if semicolons are used
incorrectly. The Second Edition of the JLS does not address this issue so there
exists the possibility that Sun plans to eventually fix this bug in the javac com-
piler.
Chapter Contents
5.1 Introduction 572
5.2 The Type of a Variable or Expression versus the Class of an Object 573
5.2.1 The Phrase “type of an object” is in Prevalent Use 576
5.2.2 The Term class type is Where Everything Goes Afoul 577
5.3 Java is a Strongly Typed Language 583
5.4 Substitution is a Higher Concept than Polymorphism 587
5.5 Forbidden Conversions 595
5.6 Permitted Conversions 597
5.6.1 Identity Conversions 600
5.6.2 Primitive Type Conversions 601
5.6.2.1 Widening Primitive Conversions 604
5.6.2.2 Narrowing Primitive Conversions 605
5.6.3 Reference Type Conversions 612
5.6.3.1 Widening Reference Conversions 614
5.6.3.2 Narrowing Reference Conversions 619
5.7 Conversion Contexts 623
5.7.1 Simple Assignment Conversion Context 627
5.7.1.1 The ArrayStoreException 631
5.7.2 Method Invocation Conversion Context 632
5.7.3 Method Return Conversion Context 636
5.7.4 The Cast Operator 638
5.7.5 The Implicit Cast in a Compound Assignment Operation 643
5.8 Overloaded Method Matching 644
5.8.1 Choosing The Most Specific Applicable Method 645
5.8.2 The Declaring Class of Applicable Methods 650
5.9 Value Set Conversions 656
1. Ken Arnold and James Gosling, The Java Programming Language, (Boston: Addison-Wesley,
1996). If anything is merciless, it is quoting from the first edition of a book that is now in its third
edition. Still, I think it is important to show that even the “Father of Java” can make this mistake (or at
least one of his coauthors).
The term run-time type and the initialism RTTI (run-time type identification) are
holdovers from the C++ programming language.
The First Edition of the JLS was unremitting in its effort to differentiate the use of
the terms type and class, as can be seen in the following quote from a section
named “Variables Have Types, Objects Have Classes.”
Every object belongs to some particular class: the class that was men-
tioned in the creation expression that produced the object, the class
whose class object was used to invoke the newInstance method to
produce the object, or the String class for objects implicitly created
by the string concatenation operator +. This class is called the class of
the object … An object is said to be an instance of its class and of all
superclasses of its class.
I implore the authors to restore the original language. At most this concession to
incorrect usage should have been added as a footnote. I don’t mind saying so
because there are numerous other places in Java Rules in which I praise the
added clarity of the Second Edition.
Type is a compile-time concept that relates to variables and other expres-
sions. Once a program is running, the objects created have classes. According
to Design Patterns:
It’s important to understand the difference between an object’s class
and its type.4
2. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, (Reading: Addi-
son-Wesley, 1996), §4.5.5, “Variables Have Types, Objects Have Classes.” (Do not update.) By the
way, I believe this is the longest quote in all of Java Rules. I am very grateful to Sun Microsystems
and in particular to Lisa Friendly and Doug Kramer for liberal permission to quote from Sun
sources (which includes the specifications as well as the Java Series).
3. Gosling et al., §4.5.6, “Types, Classes, and Interfaces.”
4. Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, Design Patterns: Elements
of Reusable Object-Oriented Software, (Reading: Addison-Wesley, 1995), 16.
It would be better to say “when you see the word ‘type’ think ‘compiler’ and when
you see the word ‘class’ think ‘Java Virtual Machine.’” This is not intended as a
criticism of Thinking in Java. A footnote on the very same page states that
“some people make a distinction, stating that type determines the interface
while class is a particular implementation of that interface,”6 which is exactly the
difference between type and class. That distinction can be seen in the following
quotes from the UML Notation Guide:
Classes implement types. A type provides a specification of external
behavior. A class provides an implementation data structure and a pro-
cedural implementation of methods that together implement the speci-
fied behavior.7
The Thinking in Java statement that “when you see the word ‘type,’ think
‘class’ and vice versa” is to be commended as a very apt characterization of
what most programmers actually do think, either because they have concluded
5. Bruce Eckel, Thinking in Java, 2nd Edition, Release 11 (Upper Saddle River: Prentice Hall,
2000), 33.
6. Ibid.
7. UML Notation Guide, Version 1.0, (Santa Clara: Rational Software, 1997), §4.6, “Type.”
8. Ibid. §4.6.1, “Semantics.”
Our intent here is not to single out The Java Programming Language for crit-
icism; the idea is to show that even the best software engineers and technical
writers can and do make this mistake. The awkwardness of the incorrect usage
in the last quote is evident in the change from “type” to “superclass” in “some
behavior makes sense only for particular types of objects, and not a general
9. Ken Arnold and James Gosling, The Java Programming Language, (Reading: Addison-Wes-
ley, 1996), §1.6, “Classes and Objects.” (Do not update.)
10. Ibid. §1.10.2, “Invoking Methods from the Superclass.”
11. Ibid. §3.7, “Abstract Classes and Methods.”
How is the average programmer supposed to learn the difference between type
and class when phrases such as “type of an object” are in prevalent use?
This quote is from UML Distilled. Design Patterns addresses the class type
problem as follows:
Of course, there’s a close relationship between class and type.
Because a class defines the operations an object can perform, it also
defines the object’s type. When we say that an object is an instance of
a class, we imply the object supports the interface defined by the
class.13
It would have been much better to have said, “it also defines one of the object’s
types” (the class type). To fully understand the answer to this question you must
understand the newer definition of a type in object-oriented programming. This
12. Martin Fowler, UML Distilled: Applying the Standard Object Modeling Language, (Read-
ing: Addison-Wesley, 1997), 55.
13. Erich Gamma et al., 17.
A type consists of an identifier (the name of the type), API docs (the
specification) and an interface. Thus an interface type is the purest
form of a type.
The Unified Modeling Language (UML) makes a science out of terminology. The
following definitions from the UML glossary fully support this definition of type.
type
class
interface
implementation
14. UML Semantics, Appendix M1-UML Glossary, version 1.0 dated 13 January 1997 (Santa
Clara: Rational Software, 1997), Glossary.
Terms             Identifier   API Docs (the spec)   Interface   Implementation
Class                                                            ✔
Interface                                            ✔
Interface type    ✔            ✔                     ✔
Class type        ✔            ✔                     ✔           ✔
15. Actually, the best way to design a system is to write the API docs first. What I am talking
about here is actually writing the documentation comments in a source code (or .java ) file and
then automatically generating the API docs using javadoc. This iterative process should continue
until there is general agreement on all of the types involved (whether they be class or interfaces
types) and their behavior. Using this approach to system design means that all of the methods in a
class type will have either an empty method body or a return statement that does nothing but
quiet the compiler (such as return 0 in a method that has a result type of int). I am utterly
convinced that this is “rapid development” at its best. It yields excellent API docs that have
been thoroughly reviewed, makes the job of the application programmer much simpler, and also
makes it possible to control the kind of design changes that creep into a system after development
and unit testing.
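The stub style described in the footnote above might look something like the following (Account is a hypothetical class of my own, not one from the text):

```java
/**
 * A hypothetical class written in the API-docs-first style: the doc
 * comments are complete, but every method body does nothing except
 * quiet the compiler.
 */
class Account {
    /**
     * Returns the current balance of this account in cents.
     */
    long getBalance() {
        return 0; // placeholder result that quiets the compiler
    }

    /**
     * Deposits the given number of cents into this account.
     */
    void deposit(long cents) {
        // empty method body; implemented after the API docs are reviewed
    }
}

class AccountDemo {
    public static void main(String[] args) {
        Account a = new Account();
        a.deposit(100);
        System.out.println(a.getBalance()); // the stub still returns 0
    }
}
```

Running javadoc against such a source file produces reviewable API docs long before any behavior is implemented.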
There is nothing tricky about this. Separating out the interface implicitly defined
by a class type follows a very specific formula. Typically only the method sig-
natures and throws clauses of the public instance methods are
included. If there are any public instance variables or inner member classes,
however, they too would be included. Here is another example,
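One way such an example might look (IntStack is a hypothetical class of my own, used only to illustrate the formula):

```java
// A hypothetical class type whose public instance methods implicitly
// define an interface.
class IntStack {
    private int[] data = new int[16];
    private int top;
    public void push(int value) { data[top++] = value; }
    public int pop() { return data[--top]; }
    public boolean isEmpty() { return top == 0; }
}

// The interface separated out by the formula: only the method signatures
// (and throws clauses, if any) of the public instance methods are
// included. The private field and the implementation details are not.
interface IntStackInterface {
    void push(int value);
    int pop();
    boolean isEmpty();
}

class ExtractDemo {
    public static void main(String[] args) {
        IntStack s = new IntStack();
        s.push(42);
        System.out.println(s.pop());
    }
}
```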
16. Patrick Naughton, The Java Handbook, (Berkeley: McGraw-Hill, 1996), Chapter 4,
“Types.”
17. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, The Java Language Specification, Sec-
ond Edition, (Reading: Addison-Wesley, 1996-2000), §4.2, “Primitive Types and Values.”
class Test {
    public static void main(String[] args) {
        int i = 0;
        if (i = 0)
            System.out.println("doubtless a typo");
    }
}
Attempting to compile this program generates the following compiler error:
Any compiler error message that includes the phrase incompatible types
or inconvertible types is the result of compile-time type checking. Using
the simple assignment operator = instead of the equality operator == in Boolean
expressions is a common source of program bugs in the C and C++ program-
ming languages. In the Java programming language, however, statements such
as if (x = 0) will not compile because the type of the expression x = 0 is
numeric, and control flow expressions must evaluate to the boolean data
type. This is just one of many type checks performed at compile time. I know
class Test {
    public static void main(String[] args) {
        boolean a = false, b = true;
        if (a = b)
            System.out.println("probably a typo");
    }
}
NOTE 5.1
The following section is the culmination of a relentless effort spanning
several years to clarify the definition of polymorphism in Java Rules.
Doing so has required differentiating overriding from polymorphism.
Overriding is not the same as polymorphism, but there is a very
close relationship between the two. While a given class of objects
may appear to have “many forms,” all of the objects must behave the
same. This is precisely why dynamic method lookup (or what C pro-
grammers call virtual functions) is considered “the normal method
dispatch in the Java programming language”19 (or any other object-ori-
ented programming language for that matter).
19. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Second Edition,
(Boston: Addison-Wesley, 1999), §3.11.8, “Method Invocation and Return Instructions.” The industry
standard terminology for dynamic method lookup is late binding and overriding.
20. Gary Cornell and Cay S. Horstmann, Core Java, (Upper Saddle River: Prentice Hall, 1997),
148. As always, please do not interpret this as a personal criticism of the authors. I make it a rule
to only (constructively) criticize software engineers and technical writers for whom I have
a deep and abiding respect. In this case, I assure you that certainly does apply to the authors of
Core Java and Thinking in Java. These are two of the finest Java books on the market. In fact, I
was thrilled to learn that Bruce Eckel even mentioned me in his Preface. I will be always grateful to
him for helping to get me started in Java.
When you send a message to an object even though you don’t know
what specific type it is, and the right thing happens, that’s called poly-
morphism. The process used by object-oriented programming lan-
guages to implement polymorphism is called dynamic binding.21
21. Bruce Eckel, Thinking in Java, (Upper Saddle River: Prentice Hall, 1998), 30. This quote
involves an incorrect use of type. Objects do not have a type. They have a class. It should say,
“even though you don’t know what specific class of object it is.”
class Test {
    public static void main(String[] args) {
        Baseclass base = new Baseclass();
        DerivedClass derived = new DerivedClass();
type variables are also widening reference conversions. This is no less polymor-
phism than assignments involving class types. A subclass object can always be
substituted for a superclass object because of interface inheritance. Likewise,
any object that implements a particular interface type can be substituted for any
other object that implements the same interface type.
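A minimal sketch of interface-type substitution (my own example, with hypothetical Printable, Invoice, and Photo types):

```java
interface Printable {
    void print();
}

class Invoice implements Printable {
    public void print() { System.out.println("invoice"); }
}

// Unrelated to Invoice except through the interface type they share.
class Photo implements Printable {
    public void print() { System.out.println("photo"); }
}

class SubstitutionDemo {
    public static void main(String[] args) {
        // Either class of object can be substituted as the target of the
        // same method invocation expression; dynamic method lookup then
        // executes the code appropriate for the class of the object.
        Printable p = new Invoice();
        p.print();
        p = new Photo();
        p.print();
    }
}
```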
The following summarizes everything said thus far in this section.
Polymorphism is Greek for “many forms.” The “many forms” of an
object are the “many types” that a class of objects implements. The
qualifying type of a method invocation expression determines the
“form” of the object referenced in the polymorphic (many forms)
sense. Any object that is an instanceof that type can be substi-
tuted at run time as the target object in the same method invocation
expression. The code executed will be appropriate for the given
class of objects because of method overriding (or dynamic
method lookup). This explains why virtual is the normal invocation
mode in an object-oriented programming language such as Java. The
class of the object substituted is often related through a class hierar-
chy (i.e., a subclass), but if interfaces are involved the substituted class
is just as likely to be unrelated.
There is one sentence in this summary that I believe to be at the very heart of
understanding not only polymorphism, but the whole of the Java programming
language:
The qualifying type of a method invocation expression determines the
“form” of the object referenced in the polymorphic (many forms) sense.
catenation when using the + operator. This is what is meant by ad-hoc polymor-
phism, not method overloading. No less than Patrick Naughton got this wrong
in The Java Handbook:
It is possible and often desirable to create more than one method with
the same name, but different parameter lists. This is called method
overloading. Method overloading is used to provide Java’s polymorphic
behavior… 22 [emphasis added]
I would characterize the last statement in this quote as nothing less than bizarre.
No doubt this led to some other Java books making the same mistake. The rea-
son I say Java does not really support ad-hoc polymorphism is that there is no
stated briefly in small type. The double lines mean that there are no permitted
conversions to or from the types in that box (with the notable exception of
Object and null for reference types). The most important type conversion
The reference type conversion newton = (PDA) cray does not compile
because a Supercomputer is not a PDA. More precisely, when a Super-
computer object is created on the heap, none of the instance variables
declared in the PDA class are allocated. Therefore, a Supercomputer object
cannot possibly support the PDA interface. The two classes are unrelated
though part of the same class hierarchy. This explains why the only permitted
conversions between class types are those between related classes. Any of the
reference types can be converted to the Object type. Likewise, the special
null type can be assigned to any reference type. Nothing can be converted to
the special null type, because null is not really a type.
There is only one forbidden conversion between interface types. A method in
one of the interfaces cannot have the same method signature but a different
result type as a method in the other interface because no one class could imple-
ment both interfaces. For example,
class Test {
    public static void main(String[] args) {
        A a = null;
        B b = (B) a; //COMPILER ERROR
    }
}
interface A {
    void doSomething();
}
interface B {
    int doSomething();
}
This example compiled prior to the 1.2 release because of Bug Id 4028359.
There is a significant difference between forbidden conversions and per-
mitted conversions that are unsafe. A cast operator is required in a permitted
conversion that can either result in a loss of information or throw a
ClassCastException at run time (which is the very definition of an
unsafe type conversion). However, a cast operator such as (PDA) in the
newton = (PDA) cray example above is a forbidden conversion and does
not compile. Nothing a programmer can do will ever make forbidden conver-
sions compile.
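The difference can be sketched as follows (Animal, Dog, and Cat are hypothetical classes of my own):

```java
class Animal {}
class Dog extends Animal {}
class Cat extends Animal {}

class UnsafeDemo {
    public static void main(String[] args) {
        Animal a = new Cat();
        // A permitted but unsafe conversion: this compiles because a
        // might reference a Dog, but it throws a ClassCastException
        // here because a actually references a Cat.
        try {
            Dog d = (Dog) a;
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
        // A forbidden conversion such as (String) a would not even
        // compile, because String and Animal are unrelated class types.
    }
}
```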
This may seem trivial, but it has two practical consequences. First,
it is always permitted for an expression to have the desired type to
begin with, thus allowing the simply stated rule that every expression is
subject to conversion, if only a trivial identity conversion. Second, it
implies that it is permitted for a program to include redundant cast
operators for the sake of clarity. 25
No conversion is required because the type of the cast expression is the same
as the type in the cast operator. In the following example, the declared type of
Math.PI is double. The (double) cast operator is little more than docu-
mentation to a compiler.
25. Gosling et al., The Java Language Specification, §5.1.1, “Identity Conversions.”
Such a conversion is said to “identify” the type of the cast expression. The fol-
lowing example is not an identity conversion:
int i = 0;
double d = 0;
d = (double) i;
long to byte (uses l2i and then i2b)      byte to long (uses i2l)
long to short (uses l2i and then i2s)     short to long (uses i2l)
long to char (uses l2i and then i2c)      char to long (uses i2l)
float to short (uses f2i and then i2s)    short to float (uses i2f)
float to char (uses f2i and then i2c)     char to float (uses i2f)
instructions in parentheses pop a primitive numeric type value off the operand
stack, convert it to a different data type, and then push the converted value back
on the operand stack. The three instructions marked “no conversion” do not
require a type conversion because the computational type of a byte, short,
or char is int. The nine narrowing type conversions that use two machine
instructions are referred to as two step conversions. They are discussed in the
bottom half of 5.6.2.2 Narrowing Primitive Conversions below.
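A two step conversion such as long to byte can be observed directly (my own example):

```java
class TwoStepDemo {
    public static void main(String[] args) {
        long l = 0x1234567890L;
        // long to byte compiles to l2i followed by i2b: the long is
        // first truncated to an int, and that int is then truncated to
        // a byte, keeping only the eight low-order bits (0x90 here).
        byte b = (byte) l;
        System.out.println(b);
    }
}
```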
class Test {
    public static void main(String[] args) {
        float f = Byte.MAX_VALUE;
        test(Short.MAX_VALUE);
        f = 0.0f + Character.MAX_VALUE;
        f = (float) Integer.MAX_VALUE;
    }
    static float test(float f) {
        return 0L;
    }
}
The decompiled code for the main and test(float f) methods follows.
Here is the same example using conversions from the smaller integral types to
long.
class Test {
    public static void main(String[] args) {
        long l = Byte.MAX_VALUE;
        test(Short.MAX_VALUE);
        l = 0L + Character.MAX_VALUE;
        l = (long) Integer.MAX_VALUE;
    }
    static long test(long l) {
        return 0;
    }
}
The decompiled code for the main and test(long l) methods follows.
As you can see, the compiler effects the conversions in all cases. This is impor-
tant because programmers sometimes use numeric suffixes in an effort to avoid
the cost of a type conversion. For example, long l = 0L. Although the default
type of numeric literals is int, the numeric suffix in this assignment expression
is completely unnecessary.
Widening primitive conversions from one integral type to another are unremark-
able. This is not the case, however, with widening primitive conversions from
integral to floating-point types. The potential for a loss of precision mentioned in
the last section is something that Java programmers really need to understand.
That is why I moved the discussion out of this chapter to Chapter 4, “Primitive
Data Types and Object” in Volume 1. Note also that conversions between inte-
gers and floating-point types are characterized as “nontrivial”26 in the JLS
because the two’s complement and floating-point formats are significantly differ-
ent. Such conversions should be avoided whenever possible.
The diagnostic message prints 1. To round the fraction when converting from
float to int or from double to long, use one of the following utility meth-
ods in the Math class.
27. IEEE Standard for Binary Floating-Point Arithmetic, (New York: The Institute of Electrical
and Electronics Engineers, Inc., 1985),
class Test {
public static void main(String[] args) {
28. API docs for the rint(double a) method in the java.lang.Math class.
true
1
2
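The difference between the two rounding methods can be sketched as follows (my own example): Math.round rounds half up and returns an integral type, whereas Math.rint rounds half to even and returns a double.

```java
class RoundingDemo {
    public static void main(String[] args) {
        // Math.round(double) returns a long and rounds half up.
        System.out.println(Math.round(2.5));
        // Math.rint(double) returns a double and rounds half to even.
        System.out.println(Math.rint(2.5));
        // A typical use of rint: round, then cast to an integral type.
        System.out.println((long) Math.rint(1.7));
    }
}
```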
float f = +128;
byte b = (byte) f;
System.out.println("b = " + b);
From this example of a narrowing primitive conversion, it can be seen that the
sign bit is always truncated. The question arises: Does the truncation of the
sign bit mean that the sign is lost? Not necessarily. If the value converted is
within the range of the smaller integral type, the sign bit is not lost. The reason
why the sign bit is not lost is the very definition of serendipity in the two’s com-
plement format. The value of a positive sign bit is zero, which is the same as the
rest of the high order bits if the number is within the range of the smaller integral
type. The same is true for negative numbers in the two’s complement format
because all of the bits of a negative number are flipped. To illustrate how this
works, Figure 5.8 compares the value of –50 in the int, short, and byte
formats:
No matter which of the integral types is used to represent the value –50 , the
eight low-order bits are the same.
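The bit patterns compared in Figure 5.8 can be printed directly (my own example):

```java
class TwosComplementDemo {
    public static void main(String[] args) {
        int i = -50;
        // Mask the narrower types to their width so that the bit
        // patterns of the same value can be compared side by side.
        System.out.println(Integer.toBinaryString(i));                 // 32 bits
        System.out.println(Integer.toBinaryString((short) i & 0xFFFF)); // 16 bits
        System.out.println(Integer.toBinaryString((byte) i & 0xFF));    //  8 bits
    }
}
```

All three lines end with the same eight low-order bits, 11001110.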
A char to short conversion in which the char value is +32768 or
greater results in a loss of information. That is the value of the high-order bit in
char c = +32768;
short s = (short) c;
System.out.println(s);
short s = -1;
char c = (char) s;
System.out.println((int)c);
The diagnostic message prints 65535. This explains the oddity of why an 8-bit
byte to 16-bit char (superficially a widening primitive conversion) is categorized
as a narrowing primitive conversion. Straightforward sign extension would be
inappropriate (because there is no sign in a char), so the conversion is instead
defined as a widening of the byte to int followed by a narrowing of that int
to char, as shown in Figure 5.9. This works fine when convert-
29. In Java 1.1 and earlier releases, a character literal could be assigned to a byte or short type
variable without the use of a cast operator, but only for ASCII or Latin 1 characters (i.e., 8-bit val-
ues). For example, byte b = 'a' would compile. That extension of the definition of implicit narrow-
ing conversions (discussed below) was not sanctioned in the JLS and no longer compiles. See Bug
Id 4030496.
Computer comp;
PC pc = new PC();
comp = pc;
System.out.println(comp == pc);
The diagnostic message prints true, proving that the value stored in the
reference type variable is the same after as it was before the reference type
conversion. A reference type conversion is indeed a change in the “form” of an object, but that
change is entirely from the perspective of the programmer. The object is viewed
as having a different “form” without actually changing. In this case, a PC is
viewed as a Computer. That is to say, the object is seen through the interface
implicitly defined by public methods in the Computer class rather than
those of the PC class. The type changes, not the object. The JLS deals with
this conceptual difficulty as follows.
A cast expression converts, at run time, a value of one numeric type to
a similar value of another numeric type; or confirms, at compile time,
that the type of an expression is boolean; or checks, at run time, that a
Here the verb to convert is used for primitive numeric type conversions, to con-
firm is used for identity conversions, and to check is used for reference type
conversions. This terminological problem is closely related to the different mean-
ings of data type as discussed in 4.2 The Definition of Data Types in Volume 1.
Primitive type conversions result in a change of the data format or length, which
is the older definition of data type. Reference type conversions result in a
change of the class or interface type in which a reference is stored, which is the
newer definition of type in object-oriented programming languages.
Note that some Java books refer to widening and narrowing reference type
conversions as “upcasting” and “downcasting,” respectively. The following quote
from Thinking in Java is representative.
Taking an object reference and treating it as a reference to its base
type is called upcasting, because of the way inheritance trees are
drawn with the base class at the top… to move back down the inherit-
ance hierarchy, you use a downcast.31
This terminology is not sanctioned in the JLS, however, nor is it widely used in
other Java books.
Computer comp;
PC pc = new PC();
comp = pc;
Object type          Any reference type   Any class, interface, or array type can be
                                          converted to an Object type variable.
Any reference type   null reference       Any reference type can be assigned a null
                                          reference, resulting in a null reference of
                                          that type. If the null reference is the
                                          null literal, no cast operator is required.
                                          If the null reference is not the null
                                          literal and the types involved are not
                                          assignment compatible, a cast operator is
                                          required, but the cast operation will never
                                          throw a ClassCastException. For example,
                                          Object obj = null;
                                          String s = (String)obj;
pc = (PC) node[i];
Why is that? The answer is simply that interface types can reference any class of
objects that implements the interface. In this example, the array access expres-
sion node[i] can reference any class that implements the Networkable
interface. There is no guarantee whatsoever that such a class will even be
related to the PC class. For example, Laptop implements the Networkable
interface. That means node[i] could reference a Laptop computer. On the
other hand, node[i] could in fact reference a PC. The cast operator in effect
tells the compiler that you are sure the node[i] references a PC. If node[i]
references any other class of object at runtime, a ClassCastException is
thrown. For example,
class Test {
public static void main(String[] args) {
The bold line of code is a narrowing reference conversion because p could refer-
ence any class that implements the Printable interface. In this case, p refer-
ences an anonymous class. The compiler generated name of the anonymous
class is Test$1 , which like all anonymous classes is a direct subclass of
Object. This program compiles successfully. Attempting to execute the pro-
gram, however, throws a ClassCastException.
The remainder of this section discusses the three rules for array type
assignment compatibility. Before discussing these rules, however, you must
be able to distinguish between an array type and the component type of that
array. An array type is the component type followed by one or more pairs of
empty brackets indicating the number of dimensions. For example,
float[] gpa = new float[roster.size()];
This is a one dimensional array of float. The array type is float[]. The
component type is float. The length of the array is not part of the array type.
That much is obvious in this example because we have no idea what the value of
roster.size() is going to be at runtime. Array and component types are dis-
cussed further in 6.2 Array, Component, and Element Types in Volume 1.
The first rule applies to all array type conversions. To be assignment com-
patible, array types must have the same number of dimensions. There is no
This is deceptive, however, because there is no way of telling the type of the
unnamed variable in the first element of a[] by just looking at the array access
expression a[0]. If it is not int[][], this line of code will not compile.
The other two rules in Table 5.6 are for the component types:
• Primitive type to primitive type: Both component types must be the same
primitive type.
• Reference type (a) to reference type: The rvalue component type must be
assignable to the lvalue component type (according to the rules in Table 5.5).
a. This is the component type used in the declaration of the array variable, not the component type of the array
referenced. They could be different. The component type of the array referenced is pertinent to the
ArrayStoreException discussion in 5.7.1.1 The ArrayStoreException.
A cast operator is in effect a promise that the class of the object referenced at
run time will be assignable to the type of the variable or other expression in a nar-
rowing reference conversion. If that is not the case, a ClassCastException
is thrown.
Narrowing reference conversions involving class types are always from a
superclass to a subclass. For example,
PC pc = (PC) comp;
The declared type of the comp variable is Computer. The conversion is catego-
rized as a narrowing reference conversion because Computer is a superclass
of PC. If comp actually references a Computer class object at run time, a
ClassCastException would be thrown because references to Computer
objects are not assignable to PC type variables.
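This scenario can be reproduced with a minimal, self-contained sketch (only the Computer and PC names come from the surrounding discussion; the empty class bodies are assumptions):

```java
class Computer { }

class PC extends Computer { }

class CastDemo {
    public static void main(String[] args) {
        Computer comp = new Computer(); // references a Computer, not a PC
        PC pc = (PC) comp;              // compiles, but throws ClassCastException
    }
}
```

Executing CastDemo throws a ClassCastException because the object referenced by comp is a Computer, which is not assignable to PC.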
So why does a reference type conversion such as pc = (PC) comp com-
pile at all? The answer is widening reference conversions. Because of widening
reference conversions such as comp = pc, the compiler has no way of deter-
mining the class of object referenced at run time. It could be any class in the
Computer class hierarchy or an interface implemented by any of those classes. For example,
class Test {
static java.rmi.Remote remote;
static java.util.RandomAccess random;
static Cloneable cloneable;
public static void main(String[] args) {
remote = (java.rmi.Remote)random;
cloneable = (Cloneable)remote;
random = (java.util.RandomAccess)cloneable;
}
}
class Test {
static Superclass s = new Superclass();
static Printable p;
public static void main(String[] args) {
p = (Printable) Test.s;
}
}
class Superclass { } //there will be a subclass added momentarily
interface Printable {
void print();
}
class Test {
static Superclass s = new Superclass();
static Printable p;
public static void main(String[] args) {
p = (Printable) Test.s;
}
}
final class Superclass { }
interface Printable {
void print();
}
This is exactly the same as the original program except for the final keyword
in the Superclass declaration. Attempting to compile the program now gen-
erates the following compiler errors:
The compiler knows a lot more about the class hierarchy, and can do more pre-
cise compile-time type checking. This is one reason for routinely capping class
hierarchies as discussed in 3.10 Capping a Class Hierarchy.
Simple Assignment: No
Method Return: No
String: No
Compound Assignment: No
Casting: No
It is important to understand that any use of a cast operator is a casting conversion. It does not matter
if the cast operator appears in an assignment or return statement, before an
argument in a method invocation or class instance creation expression, in a
string concatenation operation, or in an operator expression; any use of a cast
operator is the casting conversion context. Three of the conversion contexts
in Table 5.7 are highly specialized and not discussed in this chapter:
The design of the Java programming language is that safe type con-
versions are automatic in all conversion contexts.
Implicit narrowing conversions are allowed in the assignment conversion context but not in method invocation or
return. Table 5.8 summarizes which type conversions are allowed in a given con-
text. There are two important points about type conversions that this table makes
very clear:
Method Invocation: ✔
Method Return: ✔
Exception Handling: ✔
Numeric Promotion: ✔
String: ✔
Simple Assignment: ✔ ✔
Compound Assignment: ✔ ✔ ✔
Casting: n/a (always explicit) ✔ ✔
class Test {
public static void main(String[] args) {
byte b = 0;
b += 0;
}
}
Here is the decompiled code for the main method showing that a cast operator
must be used to implement the compound assignment operator even though the
equivalent operation using a simple assignment operator merely stores the value
of the compile-time constant:
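A javap-style sketch of what such a disassembly of main looks like (offsets approximate; it assumes javac does not fold the addition away):

```
0 iconst_0
1 istore_1
2 iload_1
3 iconst_0
4 iadd
5 i2b
6 istore_1
7 return
```

The i2b instruction implements the implied (byte) cast; the simple assignment b = 0 compiles to just iconst_0 and istore_1.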
Method invocation is only the most obvious of these three; the compiler must
also choose between one of the four computational types when promoting the
operands in an operator expression; and a JVM must either choose an exception
handler parameter type from one of the dynamically enclosing catch clauses or
else invoke the default exception handler.
class Test {
public static void main(String[] args) {
int i = 0;
long l = i;
}
}
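A javap-style sketch of the disassembled main method (offsets approximate):

```
0 iconst_0
1 istore_1
2 iload_1
3 i2l
4 lstore_2
5 return
```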
The i2l machine instruction in this decompiled code implements the implicit
int-to-long conversion. If the operands are reference types, they must be
assignment compatible as defined in 5.6.3.1 Widening Reference Conversions
above.
The simple assignment conversion context is unlike any other (including the
compound assignment conversion context) because of what are referred to as
implicit narrowing conversions. The term implicit narrowing conversions
should have the same puzzling effect as does any oxymoron. Implicit conver-
sions by definition are safe, and narrowing conversions by definition are unsafe.
Therefore, without any further explanation, the term “implicit narrowing conver-
sions” should sound like “safe unsafe conversions.”
Safe conversions include only widening primitive conversions, widening ref-
erence conversions, identity conversions, and string conversions with one impor-
tant exception. The designers of the Java programming language have defined a
subset of narrowing primitive conversions that are always safe. Therefore, they
are implicit, but only in the simple assignment conversion context.
Why define a subset of “implicit” narrowing conversions? Quite simply, to make
it possible to initialize byte, short, or (less frequently) char variables with
an integer literal without having to use an explicit cast. Remember that the
default type of an integer literal is int. Without implicit narrowing conversions,
variable declarations as simple as byte b = 0 would require the use of a cast
operator.
Such so-called “implicit narrowing conversions” are not really conversions at
all. They are merely compiler checks to make sure the value assigned is within
the range of the byte, short, or char type variable. For example,
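(A sketch of the assignment expressions under discussion; the variable names are assumptions, but the values match the bytecode listing that follows.)

```java
class NarrowingDemo {
    public static void main(String[] args) {
        byte b = -128;   // int literal in range for byte: no cast required
        short s = 32767; // in range for short
        char c = 65535;  // in range for char
    }
}
```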
The default type of the numeric literals in these assignment expressions is int.
Normally they would require conversion to the byte, short, and char data
types, but as you can see in the following decompiled code the i2b, i2s, and
i2c machine instructions are not used.
0 bipush -128
2 istore_1
3 sipush 32767
6 istore_2
7 ldc #2 <Integer 65535>
9 istore_3
10 return
34. Gosling et al., The Java Language Specification, §5.2, “Assignment Conversion.”
class Test {
public static void main(String[] args) {
doSomething(0);
}
static void doSomething(byte b) { }
}
It simply does not matter if the argument is a constant expression that “is repre-
sentable in the type of the” method or constructor parameter.
There is a lengthy explanation for this difference in the JLS. It is quoted here
in its entirety:
Method invocation conversions specifically do not include the implicit
narrowing of integer constants which is part of assignment conversion.
The Java designers felt that including these implicit narrowing conver-
sions would add additional complexity to the overloaded method
matching resolution process. Thus, the example:
class Test {
static int m(byte a, int b) { return a+b; }
static int m(short a, short b) { return a-b; }
public static void main(String[] args) {
System.out.println(m(12, 2)); //compile-time error
}
}
class Test {
public static void main(String[] args) {
Object[] array = new String[10];
array[0] = new Object(); //throws ArrayStoreException
}
}
This program compiles because the component type of the array variable is
Object. The component type changes to String at run time, however, when
a String[] type array is assigned to the same array variable. It is precisely
such narrowing reference type conversions between array types that lead to the
ArrayStoreException being thrown at run time. Assignments to primitive
type arrays never throw an ArrayStoreException because the element
type and the component type are necessarily always the same.
The relevant section of the JLS includes the following statement.
class Test {
public static void main(String[] args) {
float f = 0;
test(f);
}
static void test(double d) { }
}
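A javap-style sketch of the disassembled main method (offsets approximate; the constant-pool index is an assumption):

```
0 fconst_0
1 fstore_1
2 fload_1
3 f2d
4 invokestatic #2 <Method void test(double)>
7 return
```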
The f2d machine instruction in this decompiled code implements the implicit
float-to-double conversion. If the operands are reference types, they must be
assignment compatible as defined in 5.6.3.1 Widening Reference Conversions.
The only difference between the simple assignment and method invocation
conversion contexts is that the “implicit narrowing conversions” defined in 5.7.1
Simple Assignment Conversion Context are not allowed. For example,
class Test {
public static void main(String[] args) {
test(0);
}
static void test(byte b) { }
}
The javac compiler is doing you a favor by pointing this out. Other, less
sophisticated Java compilers might act as if the test(byte b) method does
not even exist.
The “cannot be applied” terminology in this error message seems a little odd
at first. Elsewhere in the JLS this process is described as “choosing” a type con-
version (which in this case means choosing a method). The following quote from
5.7 Conversion Contexts is repeated here for your convenience.
The term “conversion” is also used to describe the process of choos-
ing a specific conversion for such a context. For example, we say
that an expression that is an actual argument in a method invocation is
subject to “method invocation conversion,” meaning that a specific con-
version will be implicitly chosen for that expression according to the
rules for the method invocation argument context.37 [emphasis added]
37. Gosling et al., The Java Language Specification, introduction to Chapter 5, “Conversions and
Promotions.”
38. Gosling et al., §15.12.2, “Compile-Time Step 2: Determine Method Signature.”
39. Neal Gafter confirmed this when he said the following in a personal email: “...the word ‘match’ in
... compiler diagnostic[s]... means ‘are accessible and applicable’. We use ‘match’ because it's
shorter.”
class Test {
public static void main(String[] args) {
int big = 1234567890;
float approx = big;
System.out.println(big - (int)approx);
}
}
Executing this program prints -46. The loss of precision occurred because the
value 1234567890 was implicitly converted to the float data type in the
assignment statement float approx = big. Now consider the following
modification to that program.
class Test {
static int big = 1234567890;
public static void main(String[] args) {
System.out.println(big - (int)approx());
}
public static float approx() {
return big;
}
}
This shows conclusively that return statements are a conversion context. The
name for such a conversion context is rather naturally the method return con-
version context. Even the language used to describe the return statement is
similar to that used to describe the assignment conversion context. Compare
the following two quotes:
The type [of the return expression] must be assignable to the
declared result type of the method, or a compile-time error occurs.41
The absence of a method return conversion context in the JLS is clearly an error
of omission. As with other conversion contexts, if the result type in a method
declaration is a reference type, the expression in a return statement must
evaluate to an assignment compatible type.
The exact same run-time “type checks” are used to implement both
the checkcast and instanceof machine instructions.
This means that the instanceof operator returns false under exactly
the same conditions that a cast operator will throw a ClassCastException.
Likewise, the instanceof operator will not compile
under exactly the same conditions as a cast operator. Therefore, you should
have no more qualms about using a cast operator than you do about using the
instanceof operator.
The main difference between checkcast and instanceof is null ref-
erences. As stated in Table 5.5 The Four General Rules for Assignment Compat-
ibility on page 615, null can be assigned to any reference type. If the left-hand
operand of the instanceof type comparison operator is a null reference,
however, false is always returned. For example,
class Test {
public static void main(String[] args) {
Object objectref = null;
if (objectref instanceof String) //instanceof
System.out.println("objectref instanceof String");
String s = (String)objectref; //checkcast
}
}
This program compiles, but prints nothing. The only other difference is that a
cast operator throws a ClassCastException instead of returning false.
For example,
class Test {
public static void main(String[] args) {
Object objectref = new Object();
if (objectref instanceof String) //instanceof returns false
System.out.println("objectref instanceof String");
String s = (String)objectref; //checkcast throws ClassCastException
}
}
43. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Notes for the
instanceof machine instruction in Chapter 6, “The Java Virtual Machine Instruction Set.”
This program still does not print anything, but it now throws a ClassCast-
Exception (because Object is not assignable to String).
The run-time “type checks” for both the instanceof and checkcast
machine instructions are stated as follows.
If S is the class of the object referred to by the objectref and T is the
resolved class, array, or interface type, checkcast determines
whether objectref can be cast to type T [or instanceof deter-
mines whether objectref is an instance of T] as follows:
44. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Second Edition,
§6.4, “Format of Instruction Descriptions” on pages 193-194 (checkcast ) and 278-279
(instanceof). Reading these specifications from the JVMS, you can see why run-time class (or
interface) check is the more appropriate term.
a = b Compile-time type checking checks that the type name in a cast operator is
assignment compatible with the variable or expression that references an
object. For example,
pc = (PC) node[i];
Compile-time type checking checks that the PC type name in the cast
operator is assignment compatible with the declared type of the pc variable.
(In this case, they are the same.)
b = c Opcode 192, checkcast, checks to make sure that the class of the
object referenced at run time is assignment compatible with the type named in
a cast operator. Hence the name checkcast. In the assignment
statement pc = (PC) node[i], the checkcast machine instruction
would check that the class of the object referenced by node[i] is
assignment compatible with the PC type. Note that a ClassCastException
always means that the class of the object referenced at
run time is not assignment compatible with the type name in the cast operator
(as opposed to the type of the variable denoted by the left-hand operand of an
assignment expression). There is an example below in which that fact becomes
relevant.
i = Math.PI;
This simple assignment expression does not compile. The type of the right-hand
operand expression Math.PI is double. A conversion from double to int
obviously can result in a loss of information. Now consider the same expression
written using a compound assignment operator:
int i = 0;
i += Math.PI;
System.out.println(i);
The value of i is 3, the same as if an explicit (int) cast had been used in a
simple assignment. In both cases, there is a loss of information. The implied
cast means that considerably more care must be taken when using compound
assignment operators.
The second reason for describing the use of a compound assignment opera-
tor as an explicit conversion is that, unlike simple assignment conversion, there
is an alternative means for doing the same thing. The operations do not have to
be compounded. They can be written separately. The choice is the program-
mer’s. All I am saying is that the effect of using a compound assignment opera-
tor is the same as using a cast operator if the implicit cast is unsafe. That much
is obvious. If a programmer knows this, then the use of a compound assignment
operator is in fact an explicit conversion.
Overloaded method matching only uses the types of the formal parameters of the
applicable methods to either choose the most specific one or else generate a
compiler error saying that the method invocation or class instance creation
expression is ambiguous.
The rule for choosing the most specific applicable method is very simple.
Each parameter in the applicable method signature must be narrower
than the same parameter in all of the other applicable methods. That is
precisely what makes it the most specific applicable method. For example, con-
sider the overloaded println methods in the PrintStream and
PrintWriter classes. Given a byte argument, all four of the overloaded
println methods in Figure 5.11 are applicable. This is just another way of
saying that a byte type argument can be converted to an int, long, float,
or double (all of which are widening conversions) in the method invocation
conversion context.
import java.io.Serializable;
class Test {
public static void main(String[] args) {
new Test().test(new Integer(0));
}
void test(Number n) {
System.out.println("test(Number n)");
}
void test(Serializable s) {
System.out.println("test(Serializable s)");
}
}
Which of these applicable methods is the most specific? If you are thinking in
terms of the type of the argument expression new Integer(0), it is natural
to expect this program to print test(Number n) because Integer
extends Number. Integer also implements Serializable, which means
that both methods are applicable. The program does in fact print
test(Number n), because Number is narrower than Serializable:
Number itself implements Serializable, so test(Number n) is the
more specific of the two applicable methods.
class Test {
static void test(ColoredPoint p, Point q) {
System.out.println("(ColoredPoint, Point)");
}
static void test(Point p, ColoredPoint q) {
System.out.println("(Point, ColoredPoint)");
}
public static void main(String[] args) {
ColoredPoint cp = new ColoredPoint();
test(cp, cp); // compile-time error
}
}48
If there is no one applicable method in which all of the parameters are the nar-
rowest, then the method invocation or class instance creation expression is
ambiguous. In test(ColoredPoint p, Point q) only the first parameter
is narrower. Likewise, in test(Point p, ColoredPoint q) only the sec-
ond parameter is narrower. The method invocation test(ColoredPoint,
ColoredPoint) in main is therefore ambiguous.
Using null as an argument expression is a special case that 15.12.2.2
Choose the Most Specific Method of the JLS does not directly address. I believe the following quote explains the behavior as well as it can be explained:
I admit that this behavior may be confusing due to the fact that
the literal null belongs to all reference types (i.e., the null type is
convertible to any reference type). In the more common case,
where the argument expression belongs to a named class or interface
type, choosing the most specific method generally results in the intu-
itively expected behavior.49 [emphasis added]
class Test {
public Test(Object o) {
System.err.println("null is an Object");
}
public Test(String s) {
System.err.println("null is a String");
}
public static void main(String[] args) {
new Test(null); //prints "null is a String" because String is narrower than Object
}
}
class Test {
public Test(Object o) {
System.err.println("null is an Object");
}
public Test(String s) {
System.err.println("null is a String");
}
public Test(StringBuffer sb) {
System.err.println("null is a StringBuffer");
}
public static void main(String[] args) {
new Test(null); //compile-time error: neither String nor StringBuffer is narrower than the other
}
}
class Test {
public static void main(String[] args) {
C c = new C() {
Attempting to compile this program generates essentially the same error mes-
sage because interfaces A and B are unrelated; neither extends the other.
Note that autoboxing, unboxing, and varargs in the 1.5 “Tiger” release will
further complicate overloaded method matching.50 I will not address this issue
until both the beta version of the compiler and a draft copy of the Third Edition of
the JLS is released.
NOTE 5.2
The following section introduces a new term; the class in which a
method is declared is referred to as the declaring class by at least
one software engineer at Sun. (See the Evaluation of Bug Ids 4814557
and 4761586.) This is significantly different from the qualifying type.
The reason for pointing this out is that the two are sometimes confused
when discussing overloaded method matching.
50. This sentence is a paraphrase of one in an email that Neal Gafter sent me. The skeleton of the
previous example was also provided by Gafter in the same email.
Attempting to compile this program in a pre 1.4.2 release generates the follow-
ing compiler error:
As of the 1.4.2 FCS release this code should compile. The question that every-
one would like answered is why it did not compile in previous releases. Here is
the same example with the method signatures in the Baseclass and Super-
class reversed:
This example compiles in all releases. Why the difference? This is very confusing
for application programmers because in both cases print(int i) and
print(long l) are members of the Baseclass; so why are they treated
differently based on their declaring class? The remainder of this section
attempts to answer that question and then explains why the language designers
have reversed this decision.
There has never been an official explanation why the class in which an appli-
cable method is declared was ever part of the overloaded method matching
algorithm. In fact, the Evaluation of Bug Id 4814557 includes the following sen-
tence.
There isn't a very good reason why the specification is the way it is with
respect to this issue. 52
Not even Neal Gafter knows the precise reason why, and he is the software engi-
neer responsible for the development of the javac compiler.53
The apparently unforeseen problem with the overloaded method matching
algorithm used in pre-1.4.2 releases is that it requires client programmers to know
the declaring class of overloaded methods. This “exposes the structure of the
type hierarchy to clients.”54 In other words, it breaks encapsulation (in the worst
kind of way). For example,
52. Ibid.
53. Gafter has said only that “I suspect the authors of the first edition JLS were defining the VM and
language at the same time and simply got their concepts confused.” This was in a personal email to
the author. Believe me when I say that few if any other technical writers or software engineers have
tried as diligently as I to get to the bottom of this. THERE IS NO BOTTOM. Apparently no one knows
for sure except perhaps the original authors of the JLS and they are not saying.
54. See the Evaluation of Bug Ids 4814557 and 4761586.
class Test {
public static void main(String[] args) {
Subclass sub = new Subclass();
((Superclass)sub).print('A');
Superclass temp = sub;
temp.print('A');
}
}
The workaround is to redeclare the print(char c) method in Subclass
and then to use message forwarding to invoke the same method in
Superclass. For example,
class Superclass {
void print(char c) {
System.out.println(c);
}
}
class Subclass extends Superclass {
void print(int i) {
System.out.println(i);
}
void print(char c) {
super.print(c);
}
}
class Test {
public static void main(String[] args) {
Subclass sub = new Subclass();
sub.print('A');
}
}
This amounts to an ugly workaround for a problem that should not exist. From
an API design standpoint, it is highly undesirable because there is no apparent
reason for redeclaring the print(char c) method in the subclass. Even more
troubling, this is the only exception to the rule that inherited members are
exactly the same as members declared in a class or interface. There should be
no exceptions to that rule.
That this is not a theoretical problem can be seen in Bug Ids 4109924 and
4114065. Here is the description from the latter:
((Graphics)g2).drawString((String) vector.get(i), 1,
(yy += strheight));
This is a modified example from Bug Id 4109924. The problem was solved by
redeclaring drawString(String, int, int) in the Graphics2D sub-
class in the 1.3 release.
As stated above, the use of the declaring class in overloaded method match-
ing will be dropped as of the 1.4.2 FCS release. This is not a cosmetic change to
the specification; programs that did not compile prior to the introduction of the
change will now compile. In other words, the language itself has changed, not
just the specification. It is a backwards compatible change because it only
affects code that previously would not compile. A complex mathematical proof
attributed to Gilad Bracha that this change is backwards compatible can be found
at groups.yahoo.com/group/java-spec-report/message/849.
Finally, note that the declaring class of applicable methods was never used
when choosing the most specific class method. For example,
This is the same example that generated an ambiguous method error at the
beginning of this section, only the static modifier has been added to the
method declarations. This program, however, compiles. That is a problem only
in that the relevant specification makes no distinction whatsoever between
instance and class methods. The 1.4.2 FCS release will fix this problem because
the declaring class of applicable methods will no longer be part of the over-
loaded method matching algorithm. In other words, this “bug” will become nor-
mal behavior (as it should be for class methods).
56. Tim Lindholm and Frank Yellin, §5.1.8, “Value Set Conversion.”
Chapter Contents
6.1 Introduction 660
6.2 Assertions 661
6.2.1 Preconditions, Postconditions, and Invariants 671
6.2.2 assert false and Logic Traps (or Control-flow Invariants) 679
6.2.3 Catching CloneNotSupportedException 682
6.3 An Execution Stack Primer 685
6.4 The Exception Mechanism 690
6.5 The Throwable Class Hierarchy 698
6.5.1 General Exception Classes 723
6.5.1.1 Umbrella Exceptions 729
6.5.2 Unspecified Runtime Exceptions and Errors 736
6.5.2.1 The Chicken Little “The Sky is Falling” Problem 754
6.5.3 Asynchronous Exceptions 759
6.6 Throwable Objects 762
6.6.1 The Loggable Interface 767
6.7 The throw Statement 768
6.8 The try Statement 785
6.8.1 The catch Clause 791
6.8.1.1 Catchall Exception Handlers 799
6.8.2 The finally Clause 817
6.8.2.1 try-finally as a Control-flow Statement 818
6.8.2.2 Releasing System Resources 823
6.9 Exception Handling 859
6.9.1 Rethrowing the Same Exception 864
6.9.2 Exception Translation and Chaining 867
6.9.3 Squelching an Exception 874
6.9.4 Setting an Error Flag 878
6.9.5 Retry 879
6.9.6 Try an Alternative 881
6.1 Introduction
If the Boolean expression in an assertion evaluates to false, an Assertion-
Error is thrown, which explains why assertions and exceptions are discussed
in the same chapter. Likewise, there is a very close relationship between excep-
tion handling and logging.
There are three sections in this chapter that are inherently controversial
because they fly in the face of conventional wisdom:
• 6.5 The Throwable Class Hierarchy in which I argue that there has been a
paradigm shift in the fundamental definition of errors. This section is the
linchpin of the entire chapter
• 6.5.2.1 The Chicken Little “The Sky is Falling” Problem in which I argue that
errors should not be used to shut down an application program (except dur-
ing system initialization)
• 6.8.1.1 Catchall Exception Handlers in which I argue that catch(Throw-
able e) should be the catchall exception handler of choice
I have no desire to become the pied piper of exception handling, so I encourage
you to read these sections with a very critical eye. If I am wrong, then I am
wrong on a massive scale. This is particularly true of my definition of errors,
which infuses the entire chapter.
This chapter uses SystemConfigurationError so much that an
unsuspecting reader might think it is a standard error class declared in the
java.lang package. It is not. SystemConfigurationError is an
Error subclass that I propose should be added to the core API. It should be
6.2 Assertions
If the boolean type expression that immediately follows the assert keyword
evaluates to false, an AssertionError is thrown. In other words, by using
the assert keyword, you are asserting that something is true. As stated in
*
* @param newCapacity the new capacity, MUST be a power of two.
*/
void resize(int newCapacity) {
assert (newCapacity & -newCapacity) == newCapacity; //power of 2
…
The capacity is initially set to the size of the old hash table. The local variable
n has been set to the size of the new table (not shown here). Because it is
already known that n has exceeded the threshold of the old hash table,
capacity is sure to be less than n the first time the while loop is executed.
2. Unascribed, “Programming With Assertions” in the API docs for the 1.4 release, (Mountain View:
Sun Microsystems, 2002), Introduction.
This level of trust (or distrust) is entirely different from including code from
another package in an assertion. You should never assume that someone
else’s code works in an assertion. Doing so implies that you are not “in com-
plete control.” There is a very closely related discussion of asserting that an
exception is not thrown near the top of 6.9.2 Exception Translation and Chaining
on page 867. In that discussion, I make a few minor exceptions to the rule that
you should not trust code from another package (at least as far as assertions
are concerned).
The general format of an assertion is as follows.
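Both forms can be sketched as follows (the balance variable is a made-up example; remember that assertions are disabled by default and must be enabled with the -ea option):

```java
class AssertForms {
    public static void main(String[] args) {
        int balance = 100;

        // simple form: assert booleanExpression;
        assert balance >= 0;

        // the second form adds a detail message expression for the AssertionError:
        // assert booleanExpression : detailMessageExpression;
        assert balance >= 0 : "negative balance: " + balance;
    }
}
```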
3. Joshua Bloch in an interview with Bill Venners entitled “A Conversation with Josh Bloch” (first published in JavaWorld, January 4, 2002), on the artima.com Web site (Artima Software, Inc.),
www.artima.com/intv/blochP.html.
4. The API docs for the Throwable class refer to detailed error messages as just detail mes-
sages . I suspect their reason for doing so is that calling this an “error” message might confuse
some programmers because of the Error class. The API docs for the Throwable class even
describe the “detail message” as providing “more information about the error.” I do not like the term
detail message but have grown accustomed to it. It is therefore consistently used throughout this
chapter.
assert Thread.holdsLock(obj);6
A simple assertion is any boolean type expression other than a method invo-
cation expression.
There is a brief period during class loading in which assertions are always
enabled. This brief period is while superclasses are being initialized and before
the <clinit> method is executed. For example,
5. Unascribed, “Java Logging Overview” in the API docs for the 1.4 release, (Mountain View: Sun
Microsystems, 2002), §1.13, “Unique Message IDs.”
6. API docs for the holdsLock(Object obj) method in the java.lang.Thread class.
This complex assertion is being used extensively in the 1.4 release in non-public methods that
should only be called after obtaining some lock.
If an ellipsis is used after the package name, assertions are also enabled in all of
the subpackages. For example,
To enable assertions in one or more classes, use the fully qualified class name
after the -ea option. For example,
Command-line options such as these are processed in the order in which they
are written, so this combination of options enables assertions for all of the
classes in the com.javarules.examples package except for the
Widget class.
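The command lines described above can be sketched as follows (the package and class names echo the surrounding discussion; Main is a placeholder for the class being run):

```
# enable assertions in the package and all of its subpackages
java -ea:com.javarules.examples... Main

# enable assertions in a single class
java -ea:com.javarules.examples.Widget Main

# enable assertions for the package, then disable them for one class
java -ea:com.javarules.examples... -da:com.javarules.examples.Widget Main
```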
Assertions are primarily used during development to debug programs. They
are usually disabled in production. The design of all diagnostic tools is to mini-
mize their cost in a fully developed application that has been deployed:
• With simple diagnostic messages printed to the console using standard I/O,
this usually means removing them as soon as the problem is found. If left in
a program, they are usually removed by the compiler by means of condi-
tional compilation
• With logging, this means not logging trace information (Level.FINE,
Level.FINER, and Level.FINEST)
• With assertions, this means disabling them
This blank final is initialized by the <clinit> method when the class is loaded
(based either on the -ea, -esa, and -da command-line options or the default).
The first thing that is done when executing an assert statement is to check
the value of $assertionsDisabled. If it is true, the Boolean
expression in the assertion is not evaluated. For example, the compiler trans-
forms assert a > b into the following.
if (!$assertionsDisabled)
    if (!(a > b))
        throw new AssertionError();
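The transformation can be sketched as a complete class. The field name $assertionsDisabled is taken from the text; in practice javac synthesizes the field in <clinit>, so declaring it by hand here is only an illustration:

```java
public class AssertDesugar {
    // javac synthesizes this field and initializes it in <clinit>;
    // declaring it by hand is only an illustration of the idea.
    static final boolean $assertionsDisabled =
        !AssertDesugar.class.desiredAssertionStatus();

    // Hand-desugared equivalent of: assert a > b;
    static void check(int a, int b) {
        if (!$assertionsDisabled)
            if (!(a > b))
                throw new AssertionError();
    }

    public static void main(String[] args) {
        check(2, 1); // passes whether or not assertions are enabled
    }
}
```

When assertions are disabled, the guard makes the whole statement a no-op, which is why the cost of a disabled assertion is a single boolean test.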
After all, what we are talking about here are Boolean expressions that even the
responsible programmer assumes to be true. This is precisely why Sun wants
“to encourage users to leave asserts in object files so they can be enabled in the
field.”8 For field service engineers, assertions are like a highly specialized tool.
They may not use them very often, but when needed there is no substitute for
enabling assertions.
It is precisely because assertions are typically disabled in production code
that you must be extremely careful not to execute any code that has a side
effect. As stated in the “Assertion Facilities” document:
Because assertions may be disabled, programs must not assume that
the boolean expressions contained in assertions will be evaluated. Thus
these expressions should be free of side effects. That is, evaluating
such an expression should not affect any state that is visible after the
evaluation is complete. Although it is not illegal for a boolean expres-
sion contained in an assertion to have a side effect, it is generally inap-
propriate, as it could cause program behavior to vary depending on
whether assertions are enabled or disabled.9
8. Unascribed, “Programming with Assertions” in the API docs for the 1.4 release, Design FAQ -
Enabling and Disabling Assertions.
9. Ibid., Design FAQ - General Questions.
assert list.remove(null);
Normally side effects are thought of as what a method does besides returning a
value, but in this case merely assigning a value to a field is considered a side
effect.
If assertions must be enabled, the API docs include the following suggestion:
Requiring that Assertions are Enabled
static {
    boolean assertsEnabled = false;
    assert assertsEnabled = true; // Intentional side effect!!!
    if (!assertsEnabled)
        throw new RuntimeException("Asserts must be enabled!!!");
}
I would only add that the fact that assertions are disabled when they are sup-
posed to be enabled is neither a bug nor a recoverable exception. Such a class
should throw an AssertionsDisabledError and document the fact using
an @throws tag. This is a classic example of a recurring end-user error as
discussed in 6.5 The Throwable Class Hierarchy below.
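A minimal sketch of that recommendation follows. Note that AssertionsDisabledError is the author's proposed name, not a core API class:

```java
// AssertionsDisabledError is hypothetical (the author's suggestion),
// not part of the core API.
class AssertionsDisabledError extends Error {
    AssertionsDisabledError(String msg) {
        super(msg);
    }
}

public class RequiresAssertions {
    static {
        boolean assertsEnabled = false;
        assert assertsEnabled = true; // Intentional side effect!!!
        if (!assertsEnabled)
            throw new AssertionsDisabledError(
                "Assertions must be enabled to use this class");
    }
}
```

Because the error is thrown from a static initializer, it propagates the first time the class is used, which is exactly the behavior the static-block idiom is after.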
Sun implementations such as the J2SE are in a “transitional period” during
which the -source 1.4 compiler option must be used in order to compile
using assertions. The reason for this is simply to ease the transition from
assert being a legal identifier to it now being a keyword. All sorts of compiler
errors are generated if you attempt to use assertions without this option. This
raises an interesting question of why there is both a -source and -target
option. They are documented as follows.
-target version
Generate class files that will work on VMs with the specified
version. The default is to generate class files to be compatible
with the 1.2 VM in the Java 2 SDK.…10
My first inclination was that it would make it possible to generate a class file that
used assertions but could be run on a 1.2 VM. That is not the case, however. For
example,
Likewise, compiling a class that uses assertions using only the -target 1.4
option does not work because assert is not treated as a keyword. The only
answer, then, is that there are both -source and -target options so that older
source files (in which assert is not a keyword) can be targeted to a 1.4 VM.
10. Unascribed, “Java 2 SDK Tools and Utilities” in the API docs for the 1.4.1 release, (Mountain
View: Sun Microsystems, 2002), “javac - Java programming language compiler.”
The javadoc tool and document comments in general have grown in sophisti-
cation along with the rest of the Java platform since Naughton wrote this back in
1996. Whoever follows in the footsteps of Dennis Ritchie and Dr. Gosling will
benefit from a great many predecessors, including the javadoc team at
Sun.15
Before discussing invariants and postconditions, there is one thing I would
like to point out about the example of testing a precondition in the “Programming
With Assertions” document. There is a misleading statement that suggests that
a client programmer could be responsible for an assertion failure. Here is the
example and misleading statement:
/**
* Sets the refresh rate.
*
* @param rate refresh rate, in frames per second.
* @throws IllegalArgumentException if rate <= 0 or
13. As of this writing, none of the tags in RFE 4137085 are even included in the list of proposed
tags at java.sun.com/j2se/javadoc/proposed-tags.html. Note, however, that Doug Lea
has expressed some support for the @invariant tag in a Java Forum discussion.
14. Patrick Naughton, The Java Handbook, (Berkeley: McGraw-Hill, 1996), 41.
15. No disrespect for Bjarne Stroustrup intended. I just do not think C++ is enough of a departure
from the C programming language to consider it to be an entirely new programming language. C# is
a copy of Java and no one dares claim to have “invented” it. Instead it has only a “chief architect.”
/**
* Sets the refresh interval (which must correspond to a
* legal frame rate).
*
* @param interval refresh interval in milliseconds.
*/
private void setRefreshInterval(int interval) {
// Confirm adherence to precondition in nonpublic method
assert interval > 0 &&
interval <= 1000/MAX_REFRESH_RATE : interval;
Any suggestion that a client programmer could cause an assertion failure is
disturbing. In this case, the presumption is that MAX_REFRESH_RATE is in fact
set to 1000. Thus if the client “selects a refresh rate greater than 1000” the
public setRefreshRate(int rate) method will throw an
IllegalArgumentException. This is as it should be.
In the previous section on logic traps, there is an if-then-else statement
that embodies the idea that “everything is animal, vegetable, or mineral.” In
mathematical terms, this is referred to as an invariant (something that does not
16. Unascribed, “Programming With Assertions” document in the API docs for the 1.4 release, Pre-
conditions, Postconditions, and Class Invariants.
char c = s.charAt(0);
for (int i = 0; i < n;) {
assert c == s.charAt(i); // Loop invariant
…
The term loop invariant is not widely used, but this is a good example of a
short-lived invariant. It is obviously true the first time the loop is executed. The
responsible programmer is merely asserting that it is true for every iteration.
Here is another example of an internal (or short-lived) invariant:
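The listing in question is the remove(null) idiom; a sketch (the list setup and method name are illustrative) might look like this. Note the caveat from the quoted warning: when assertions are disabled, the remove(null) call is never executed.

```java
import java.util.ArrayList;
import java.util.List;

public class NullElementInvariant {
    // The boolean result of remove(null) is itself the assertion
    // condition: a null element must have been present in the list.
    static void assertHadNullElement(List list) {
        assert list.remove(null);
    }

    public static void main(String[] args) {
        List list = new ArrayList();
        list.add("element");
        list.add(null);
        assertHadNullElement(list);
    }
}
```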
In this case, the responsible programmer is asserting that there was in fact a
null element in the target list. There are a number of boolean methods in
the Collections Framework for which this same idiom would be very useful.
A class invariant is something that is always true. More precisely, class
invariants should not change as the result of invoking a method or constructor.
They should always be documented. For example,
A buffer is a linear, finite sequence of elements of a specific primitive
type. Aside from its content, the essential properties of a buffer are its
capacity, limit, and position:
A buffer's limit is the index of the first element that should not
be read or written. A buffer's limit is never negative and is
never greater than its capacity.
The following invariant holds for the mark, position, limit, and capacity
values:
*
*
* <h4> Invariants </h4>
*
Others may wish to use this same documentation technique until such time as an
@invariant tag is introduced. (I don’t want to sound like a fickle lover, but the
java.nio package, from design to documentation, is unquestionably the
single best core API package ever developed for the Java platform.)
The following assertions are used to test the class invariant that the index of
the next element to be read or written (the position) should always be less
than both the limit and capacity of a buffer.
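The assertions themselves were presumably along these lines; a sketch against java.nio.ByteBuffer (the checkInvariant helper method is hypothetical):

```java
import java.nio.ByteBuffer;

public class BufferInvariants {
    // Asserts the documented Buffer class invariant:
    // 0 <= mark <= position <= limit <= capacity
    static void checkInvariant(ByteBuffer buf) {
        assert buf.position() <= buf.limit() : buf.position();
        assert buf.limit() <= buf.capacity() : buf.limit();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put((byte) 1); // advances the position
        checkInvariant(buf);
    }
}
```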
The assertion mechanism does not enforce any particular style for
checking invariants. It is sometimes convenient, though, to combine
the expressions that check required constraints into a single internal
method that can be called by assertions. Continuing the balanced tree
example, it might be appropriate to implement a private method that
checked that the tree was indeed balanced as per the dictates of the
data structure:
assert balanced();
assert t.equals(this);
return t;
} catch (CloneNotSupportedException e) {
assert false;
}
}
This is the clone() method for a hash table, which should equal the current
object after cloning (though the API docs do not require this). Documenting
“postconditions” is therefore synonymous with writing documentation comments
for a method or constructor (at least to the extent that the Java programming
language supports Design by Contract). Coding a postcondition assertion is
more or less like testing a method every time it is executed. There are few if any
postcondition assertions in the core API because Sun maintains a comprehen-
sive test suite.
Postcondition assertions may require saving data at the beginning of a
method. Suppose that the API docs for a method or constructor guarantee that
passing a mutable object is safe because the reference is only stored in a local
variable and is not modified as a result of invoking the method. The reason for
doing so would be so that client programmers do not have to make defensive
copies. This can be accomplished inexpensively using assertions. For example,
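A sketch of the technique under the stated assumptions (the sum method and its argument are illustrative, not from the text). The assignment inside the first assert is the intentional-side-effect idiom, so the defensive copy is only made when assertions are enabled:

```java
import java.util.ArrayList;
import java.util.List;

public class UnmodifiedArgument {
    // Postcondition: the mutable argument is not modified.
    static int sum(List values) {
        List copy = null;
        assert (copy = new ArrayList(values)) != null; // save state
        int total = 0;
        for (int i = 0; i < values.size(); i++)
            total += ((Integer) values.get(i)).intValue();
        assert values.equals(copy); // postcondition: argument unchanged
        return total;
    }
}
```

With assertions disabled, neither the copy nor the equality test is executed, so the method pays nothing for the postcondition in production.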
Computer programmers really should not make such assumptions. They are
potentially disastrous. Instead you should always use a logic trap to catch
the unknown (read test your assumptions). The expectation is that code gen-
erally should never fall into a logic trap, but logic traps have a way of teach-
ing you things about a system that you did not know before.
The other form of a logic trap in the Java programming language is the
default label in a switch statement. Here is an actual example from the
java.nio.charset.CoderResult class:
switch (type) {
case CR_UNDERFLOW: throw new BufferUnderflowException();
case CR_OVERFLOW: throw new BufferOverflowException();
case CR_MALFORMED: throw new MalformedInputException(length);
case CR_UNMAPPABLE: throw new
UnmappableCharacterException(length);
default:
throw new Error(); // ## assert false;
}
NOTE 6.1
As the name implies, the following discussion is very general. More expe-
rienced programmers already familiar with the terminology of execu-
tion stacks and activation frames may want to skip over this section.
I use the more formal term (and more generic when compared to “Java stack” or
“Java Virtual Machine stack”) to differentiate execution stacks from other com-
mon uses of the stack data structure, which in the case of the operand stack
are sometimes discussed in the same context. Another commonly used term is
call stack.20
The reference to the target object and argument values are stored in the local
variable array. The local variable array and operand stack are the two most
important components of an activation frame. Figure 6.1 illustrates an execution
21. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, The Java Language Specification, Sec-
ond Edition (Boston: Addison-Wesley, 2000), §15.12.4.5, “Create Frame, Synchronize, Transfer
Control.”
I would have used down and bottom in the same context. To some extent, this
discussion is irrelevant because the memory allocated for frames may not even
be contiguous. In that case, stacks are largely conceptual.
However well established, the notion that new frames are added to the bot-
tom of a stack is absurdly counterintuitive, and I refuse to go along with it. The
22. Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Second Edition,
(Boston: Addison-Wesley, 1999), §3.10, “Exceptions.”
class Test {
    public static void main(String[] args) {
        a();
    }
    static void a() {
        b();
    }
    static void b() {
        c();
    }
    static void c() {
        int frameCount = Thread.currentThread().countStackFrames();
        System.out.println("there are " + frameCount +
                           " frames on the stack");
        throw new RuntimeException();
    }
}
Thus it is the propagation of an uncaught exception that “kills” the thread by pop-
ping all of the frames off the stack. If the bottom of the stack is reached and no
matching catch clause has been found in any of the methods, the JVM uses
the thread one last time to invoke the default exception handler (which is dis-
cussed in 6.10 Uncaught Exceptions).
25. Mary Campione and Kathy Walrath, The Java Tutorial: Object-Oriented Programming for
the Internet, Second Edition (Reading: Addison-Wesley, 1998), “The catch Block(s).”
26. Gosling et al., Introduction to Chapter 11, “Exceptions.”
27. Gosling et al., §11.3, “Handling an Exception.”
import java.io.*;
import java.util.logging.Logger;
class Test {
static Logger logger = Logger.global;
public static void main(String[] args) throws IOException {
System.out.println(isJavaFile(args[0]));
}
static boolean isJavaFile(String name) throws IOException {
DataInputStream in = null;
try {
0xCAFEBABE
true
The decompiled code (obtained by typing javap -c Test at the DOS prompt)
for the isJavaFile(String name) method includes the following excep-
tion table.
Exception table:
from to target type
2 65 73 <Class java.io.FileNotFoundException>
2 65 81 <Class java.io.IOException>
2 65 95 any
67 70 95 any
73 79 95 any
81 100 95 any
109 113 116 <Class java.io.IOException>
After each opcode is executed, the interpreter increments the value of the pc
register. The number of bytes incremented depends on the opcode. Thus the pc
The handler_pc entries in an exception table are in the same textual order as
they appear in a method or constructor. Note that if try statements are nested
(defined as a try statement in a try block), the catch and finally
clauses of the innermost try statement are entered first. The try block in this
example that begins at code[109] is not nested because it is coded in the
finally block of the enclosing try statement (not in the try block), and so
is listed in textual order. The fact that catch clauses are entered in textual
order is important because exception tables are searched from top to bottom.
If a matching catch clause is found, the value of handler_pc is used to
reset the pc register, and execution continues at that point.
NOTE 6.2
Unlike checked and unchecked exceptions, the differences between er-
rors, runtime exceptions, and all other (checked) exceptions are nothing
more than a programmer convention. More thought must be given to
the definition of these exception superclasses. Otherwise we are look-
ing at the very real possibility that they will become completely mean-
ingless. This should not be a controversial subject. The differences
between errors, runtime exceptions, and checked exceptions should
be precisely defined, unassailable, and easily understood. Like cracks
in a huge concrete dam, I already see evidence that this weakness in
the exception mechanism is threatening to inundate us—not with wa-
ter, but with mass confusion (such as core API programmers translat-
ing runtime exceptions into errors).
30. Tim Lindholm and Frank Yellin, §3.10, “Exceptions.” I realize a lot of the material in this quote is
redundant. This is sometimes done deliberately to reassure the reader that my presentation of a
subject is in line with the specifications.
class Test {
    public static void main(String[] args) throws Exception {
        try {
            throw new Exception();
        } catch(Exception e) {
            …other code…
            throw e.fillInStackTrace();
        }
    }
}
This sentence forms the philosophical basis upon which I build my definition of
runtime exceptions and errors (the two baseclasses for unchecked exceptions).
Ideally all exceptions would be checked. As obvious as this may seem, it is
profoundly important in understanding the Throwable class hierarchy. There
are only three reasons for exempting an exception from these compiler checks:
• Exceptions that are potentially thrown in every method: Either catching
or declaring that these exceptions are thrown would overburden the excep-
tion mechanism, making it a hindrance rather than a help to application pro-
grammers
• Exceptions that are not thrown at all: If never thrown, then never caught.
Such exceptions would make for meaningless entries in the throws clause
of method and constructor headers, in effect overburdening the exception
mechanism for less obvious reasons
• Catastrophic (or fatal) exceptions thrown during system initialization as
the result of something an end user does: The only point in catching these
exceptions is to explain to the end user what they are doing wrong. This can
What does this mean? To get inside of the head of the person who wrote this,
one must take a look at the API docs for the Error class:
An Error is a subclass of Throwable that indicates serious prob-
lems that a reasonable application should not try to catch. Most such
errors are abnormal conditions. The ThreadDeath error, though a
“normal” condition, is also a subclass of Error because most applica-
tions should not try to catch it.35
Juxtaposing these API docs, I conclude that the writer was only thinking of
unchecked exceptions when writing the API docs for the RuntimeException
Notice the change from runtime to run-time in this JLS quote. As used in this
book, runtime usually means the runtime system (the only exception being the
term runtime exception); run-time is the compound adjective form of run
time; and run time simply means that a Java VM is executing (as opposed to
compile time). With this in mind, I will now explain my use of the term runtime
exception.
An important distinction must be made between extending the Runtime-
Exception class and throwing one of the “pre-existing” runtime exceptions
declared in the core API, particularly those in Table 1.6 Runtime Exceptions
Commonly Thrown by Argument Checks on page 150. No one argues that appli-
cation programmers should not throw runtime exceptions, only that they should
not declare them. In fact, the JLS has always said “Java programs can use the
pre-existing exception classes in throw statements…as appropriate.”37 Simi-
36. Gosling et al., The Java Language Specification, §11.2.2, “Why Runtime Exceptions are Not
Checked.”
37. Gosling et al., §11.5, “The Exception Hierarchy.”
38. API docs for the java.lang.NullPointerException as far back as the 1.0.2 release.
This section in the JLS does not answer the question it poses. It is largely of his-
torical interest. This entire discussion focuses on runtime exceptions thrown by
a Java VM (which in the case of the HotSpot VM is C++ code) versus exceptions
thrown in the core API and support classes in the sun and com.sun packages.
This much is painfully obvious because of references to “the level of analysis the
compiler performs” and “theorem-proving technology.” The core API and support
classes use the same throw statement as application programmers and there-
fore do not require any “level of analysis” or “theorem-proving technology.” Pre-
sumably, runtime exceptions thrown in the core API and support classes (such
as the NoSuchElementException thrown in the nextElement()
method of an Enumeration) are covered by “in the judgment of the designers
of the Java programming language, having to declare such exceptions would not
aid significantly in establishing the correctness of programs.” This statement,
however, does nothing to enhance the understanding of why runtime exceptions
are unchecked.
The essential specification in this section of the JLS is “not sufficient to
establish that such run-time exceptions cannot occur.” In other words, pro-
grammers should neither catch nor declare that a method or construc-
tor throws exceptions that are not actually thrown. This may sound trivial,
but it is actually the raison d'être for all runtime exceptions as well as for most
errors. In the JLS, such exceptions are said to “violate the semantic constraints
of the Java programming language.”40 Programmers commonly refer to these
as bugs or programmer errors. Rather than catching such an exception, the bug
should be fixed. The classic example of this is ArithmeticException.
Rather than catching ArithmeticException, an if (divisor == 0)
39. Gosling et al., §11.2.2, “Why Runtime Exceptions are Not Checked.”
40. Gosling et al., Introduction to Chapter 11, “Exceptions.”
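The contrast the truncated sentence is drawing — guard the assumption instead of catching the bug — can be sketched as follows (the quotient method is illustrative, not from the text):

```java
public class DivideGuard {
    // Rather than catching ArithmeticException (which would mask a
    // programmer error), test the assumption before dividing.
    static int quotient(int dividend, int divisor) {
        if (divisor == 0)
            throw new IllegalArgumentException("divisor must not be zero");
        return dividend / divisor;
    }
}
```

Catching ArithmeticException around the division would hide the bug; the explicit check surfaces it at the point where the invalid value is first usable.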
try {
initialize(mimeType, null, this.getClass().getClassLoader());
It is interesting to contrast the “can occur at many points in the program” in this
specification to the “never occurs” in the following API docs for the Error
class.
An Error is a subclass of Throwable that indicates serious prob-
lems that a reasonable application should not try to catch. Most such
errors are abnormal conditions. The ThreadDeath error, though a
“normal” condition, is also a subclass of Error because most applica-
tions should not try to catch it.
This is the first indication that there is more than one kind of error. In fact, the
JLS is describing one kind of error and the API docs are describing another kind.
Errors that “can occur at many points in the program” are the LinkageError
and VirtualMachineError classes (excluding InternalError). These
45. Unascribed, “Programming with Assertions” in the API docs for the 1.4 release, “Design FAQ -
The AssertionError Class.”
class Foo {
static Choice c = new Choice(); // could throw
HeadlessException
}
I don’t think so. Any time you see “gracefully” think top-level exception han-
dler. HeadlessException is likely going to be caught only by top-level
exception handlers which can then explain the problem to end users in the sim-
plest possible terms. This is what Bug Id 4510992 (an RFE) was aiming for, but
it missed the mark by focusing only on getting rid of the stack trace that was
being printed. One thoughtful JDC member included the following comment at
the bottom of that bug report.
This is a problem for customers. We would like to give them a friendly
error message when this occurs since the stacktrace is only useful to
developers (and only to Sun developers at that since it is in your code).
Because the stacktrace happens in your code and the exception is not
And by extension it also means that the only errors that will ever be added to the
Java programming language are non-occurring errors such as UnknownByte-
OrderError, which is discussed in the following subsection. Understand this
and the definition of runtime exceptions above, and you will never again have to
doubt your understanding of the Throwable class hierarchy.
All of the errors in the core API are either every method errors or non-occur-
ring errors except two. Those are AWTError and the much more recent addi-
tion of CoderMalfunctionError in the java.nio.charset package.
If my analysis of errors is comprehensive and sound, there should be some ratio-
nal explanation as to why these are not every method, non-occurring, or recur-
ring end-user errors. I believe there is.
AWTError is not a general exception class such as LinkageError or
VirtualMachineError. AWTError is the only error class in the
java.awt package. It is used as follows.
• As a SystemConfigurationError if core API or support classes
(such as Toolkit implementations) cannot be “found or instantiated.” Sys-
tem configuration errors are discussed in the following subsection
• As an IllegalArgumentException
• As an IllegalStateException
• As an UnsupportedOperationException
This list looks a lot like the lists in the following subsection on unspecified runtime
exceptions and errors. That is in fact how AWTError is being used. Each
use must be considered a different subsystem-specific unspecified runtime
if (toolkit == null) {
    String nm = System.getProperty("awt.toolkit",
                                   "sun.awt.motif.MToolkit");
    try {
        toolkit = (Toolkit)Class.forName(nm).newInstance();
    } catch (ClassNotFoundException e) {
        throw new AWTError("Toolkit not found: " + nm);
    } catch (InstantiationException e) {
        throw new AWTError("Could not instantiate Toolkit: " + nm);
    } catch (IllegalAccessException e) {
        throw new AWTError("Could not access Toolkit: " + nm);
    }
    …
This try block from the java.awt.Toolkit class loads the default toolkit.
The problem is that all three of these checked exceptions represent the
same system configuration error. Now that exception chaining is available,
such try statements should be coded as follows.
if (toolkit == null) {
    String name = System.getProperty("awt.toolkit",
                                     "sun.awt.motif.MToolkit");
    try {
        toolkit = (Toolkit)Class.forName(name).newInstance();
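The listing above is cut off; the chained version presumably continues by collapsing the three catch clauses into one and preserving the cause. A runnable sketch of that pattern, with illustrative names in place of the AWT-specific ones (initCause is the 1.4 chaining API):

```java
public class ChainedError {
    // Sketch of the recommended rewrite: the several checked
    // exceptions all represent the same system configuration error,
    // so one catch clause suffices, and initCause preserves the
    // low-level cause. Class and method names here are illustrative.
    static Object loadService(String className) {
        try {
            return Class.forName(className).newInstance();
        } catch (Exception e) {
            Error error = new Error("Could not load service: " + className);
            error.initCause(e);
            throw error;
        }
    }
}
```

A top-level handler that catches the error can then report both the high-level message and, via getCause(), the original checked exception.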
try {
for (int i = 0; i < len; i++) {
int c = str.charAt(i) & 0xFFFF;
if (c >= 0x0001 && c <= 0x007F) {
res[utf8Idx++] = (byte) c;
} else if (c == 0x0000 ||
Examples such as these are not very common, but that does not make them any
less disturbing. Here is another one from sun.misc.Cache:
public Cache() {
    try {
        init(101, 0.75f);
    } catch (IllegalArgumentException ex) {
        // This should never happen
        throw new Error("panic");
    }
}
This is really messed up. Allowing for the fact that this code predates assertions,
it should have been coded as follows.
public Cache() {
    init(101, 0.75f);
}
CoderResult cr;
try {
    cr = $code$Loop(in, out);
} catch (BufferUnderflowException x) {
    throw new CoderMalfunctionError(x);
} catch (BufferOverflowException x) {
    throw new CoderMalfunctionError(x);
}
} catch (Exception e) {
However, this specification and the API docs for the Error class are seriously
outdated. Since the demise of ThreadDeath, the catchall exception handler
of choice is catch(Throwable e). This is as it should be. A catchall exception
handler catches all exceptions and errors, as if to say “if anything goes wrong.” This is the subject
of 6.8.1.1 Catchall Exception Handlers. Nowadays expressing “if anything goes
wrong” as catch(Throwable e) has become a necessity precisely because
of the deterioration of the programmer conventions that differentiate errors from
runtime exceptions. Consequently, an error is no less likely to be caught
than a runtime exception. This is why converting runtime exceptions into
errors makes no sense.
NOTE 6.3
The checked exceptions in the throws clause of a method or con-
structor header are either explicitly thrown or propagated. The
former means that the throw statement can be found in the method
or constructor body. The latter implies that the evaluation of a method
invocation or class instance creation expression resulted in an excep-
tion being thrown. This is sometimes described as “implicitly” throwing
an exception. I do not use that terminology, however, because to me it
implies that the exception is always thrown which, of course, is not
the case. The distinction between explicitly throwing versus propagat-
ing an exception is important in the definition of complex exceptions in
the following section.
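The distinction drawn in the note can be sketched as follows (both methods are illustrative):

```java
import java.io.IOException;

public class ThrowVsPropagate {
    // Explicitly thrown: the throw statement is in this method's body.
    static void explicitlyThrows() throws IOException {
        throw new IOException("thrown here");
    }

    // Propagated: no throw statement here; the exception results from
    // evaluating a method invocation expression.
    static void propagates() throws IOException {
        explicitlyThrows();
    }
}
```

Both methods declare IOException in their throws clause, but only the first contains a throw statement; the second merely propagates.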
Compare this to the following diametrically opposed paragraph from The Java
Programming Language.
A method can throw several different classes of checked exceptions—
all of them extensions of a particular exception class—and declare only
the superclass in the throws clause. By doing so, however, you hide
59. Brian Goetz, “Exceptional Practices, Part 1: Use Exceptions Effectively in Your Programs,”
available at developer.java.sun.com/developer/technicalArticles/
Programming/exceptions/ (Reprinted from JavaWorld, August 2001). Note that this article
incorrectly states that the throws clause is part of the method signature.
Who is right? Before answering that question, consider the following paragraph
from the very next page in The Java Programming Language.
You should be explicit in your throws clause, listing all of the exceptions
you know that you throw, even when you could encompass several
exceptions under some superclass they all share. This is good self-doc-
umentation.60
So far so good. This is entirely consistent with what was said on the previous
page. However, the same paragraph continues as follows.
Deciding how explicit you should be requires some thought.60
Huh? Are there different levels of explicitness? Now watch as The Java Pro-
gramming Language uses the same example of the java.io package as
was used in the above quote from the JavaWorld article.
If you are designing a general interface or superclass you have to think
about how restrictive you want to be to the implementing classes. It
may be quite reasonable to define a general exception you use in the
interfaces’s throws clause, and expect that the implementing classes
will be more specific where possible. This tactic is used by the
java.io package, which defines a general IOException type for
its methods to throw. This lets implementing classes throw exceptions
specific to whatever kind of I/O is being done. For example the classes
that do I/O across network channels can throw various network-related
subclasses of IOException, while those dealing with files throw file-
related subclasses.60
60. Ken Arnold, James Gosling, and David Holmes, The Java Programming Language, Third Edi-
tion (Boston, Addison-Wesley Professional, 2000), §8.3, “The throws Clause.”
As indicated in these make-believe comments from the API docs for the java.io
package, FileException is the logical high-level exception for that package,
not IOException. Most of the methods that explicitly throw IOException
should have thrown FileException instead. Note that in a package such as
java.io that uses an umbrella exception, the only documentation of the fact
that a high-level exception such as our imaginary FileException is thrown is
in the @throws comments.
The problems in java.io are even worse because the read and write prim-
itives (native methods that actually do the work of reading from or writing to
files) even throw IOException. The read and write primitives should throw
The argument against the use of unspecified checked exceptions is not substan-
tially different from the argument against the use of unspecified runtime excep-
tions and errors. See 6.5.2 Unspecified Runtime Exceptions and Errors for a
discussion.
Finally, there is the question of documenting the exception specification in a
package that uses an umbrella exception. The important point to realize here is
that this is nothing like selectively documenting the low-level exceptions thrown
in a method or constructor that is declared to throw a high-level exception. All of
the checked exceptions thrown in the body of a method or constructor
that is declared to throw an umbrella exception must be documented
using the @throws tag, without exception (pun unavoidable). This is a critical
point. The umbrella exception has effectively preempted the normal use of the
throws clause, which is to document all of the checked exceptions thrown.
Using the @throws tag to document those checked exceptions is therefore not
optional.
NOTE 6.4
The subject of unspecified checked exceptions, which is defined as
the throwing of Exception, is not discussed in the following section.
These are the entire API docs for the RuntimeException class. As you can
see, no particular runtime exception is described. Thus throwing Runtime-
Exception is defined as using an unspecified runtime exception. The API
62. API docs for the java.lang.RuntimeException class. I am fully aware of the fact that I
repeat these API docs a number of times in this chapter. This is done for the reader’s convenience.
Do you see the “ebb and flow of the tide” as the implementation of Java classes
(either core API classes such as SoftReference or support classes in the
sun and com.sun packages) evolves and Java code becomes part of the VM
and vice versa?
It is also important to appreciate the difference between native methods
and the Java VM. A native method that is effectively called from Java code is
not part of the Java VM. Some of the worst InternalError abuses are occur-
ring in native methods. This is inexcusable. Just because a native method
is written in C or C++ does not mean that it is part of the JVM. I have looked at
all of the JNU_ThrowInternalError(JNIEnv *env, const char *msg) calls. For example:
/*
 * Class:     sun_awt_shell_Win32ShellFolder
 * Method:    initIDs
 * Signature: ()V
 */
JNIEXPORT void JNICALL
Java_sun_awt_shell_Win32ShellFolder_initIDs
    (JNIEnv* env, jclass cls)
{
    if (!initShellProcs()) {
        JNU_ThrowInternalError(env,
            "Could not initialize shell library");
        return;
    }
    …
69. The initialism JNU stands for JNI Utilities, so JNU_ThrowInternalError is a utility func-
tion for throwing an InternalError in native methods.
switch (type) {
case java_awt_geom_PathIterator_SEG_MOVETO:
    …
    break;
case java_awt_geom_PathIterator_SEG_LINETO:
    …
    break;
case java_awt_geom_PathIterator_SEG_QUADTO:
    …
    break;
case java_awt_geom_PathIterator_SEG_CUBICTO:
    …
    break;
case java_awt_geom_PathIterator_SEG_CLOSE:
    …
    break;
default:
    JNU_ThrowInternalError(env, "bad path segment type");
    return;
}
JLS and JVMS, it is clear that only a Java VM (as distinct from Java code and
native methods) should throw InternalError. What makes this discus-
sion really interesting is that InternalError and UnknownError were
designed for use in the Classic VM. UnknownError, for example, is a thing
of the past. The design of the HotSpot VM is such that it never throws an
UnknownError. In 6.8.1.1 Catchall Exception Handlers, I return to this sub-
ject in order to differentiate an InternalError from a VM crash.
(InternalError never signifies a VM crash. If it did, the catchall exception
handler catch(Throwable e) would be precarious at best.)
This use (or rather misuse) of InternalError throughout the core API is
very frustrating to application programmers. One JDC member referred to an
InternalError as “this abhorrent message.”70 The frustration can also be
seen in the following description of Bug Id 4171464, “JComboBox should not
This is almost comical given the number of classes in the core API that throw
InternalError (in excess of 150 including the sun and com.sun support
packages, which is roughly 2.5% of the total number of class files in the current
J2SE). This surge in the use of unspecified runtime exceptions and errors in the
core API since the 1.0 release is shockingly unprofessional.
InternalError suffers from what can only be described as a Chicken
Little “the sky is falling” problem. As stated above, it is presumably more serious
than an Error, but that is not actually true. I conducted an extensive search
for "new RuntimeException", "new Error", and
"new InternalError" in the J2SE. The very first thing I noticed was that there is
no difference whatsoever in how these unspecified runtime exceptions and errors are
being used. Although I have separated the compiled list of uses below into what
are properly considered to be errors versus runtime exceptions, for any given
use in either of these lists the exception class actually thrown is just as likely to
This is a classic single-use mutator, which is the method equivalent of a blank
final. Such methods should throw a SingleUseMutatorException
76. Unascribed, “How to Write Doc Comments for the Javadoc Tool” at java.sun.com/j2se/
javadoc/writingdoccomments/index.html, (Palo Alto, Sun Microsystems, 2000), “Docu-
menting Exceptions with @throws Tag.”
The fallacy of this line of thinking is obvious. An ever-increasing number of
runtime exceptions and errors would be coded as throw new
RuntimeException("detail message") and throw new
Error("detail message"). There must be a “line in the sand” drawn
when it comes to documenting that a method or constructor throws an
unspecified runtime exception or error.
It is easy to explain why checked exceptions must have names because client
programmers must be capable of explicitly catching them. Some runtime
exceptions and errors such as an OutOfMemoryError must also be explicitly
caught. They must be declared as RuntimeException or Error
subclasses for the same reason as checked exceptions. Unspecified runtime
exceptions and errors by definition cannot be explicitly caught. Furthermore,
they are rarely documented. Therefore the only rationale for declaring them as
RuntimeException or Error subclasses is that doing so is a programmer
convention. In the Java programming language exceptions have names. It is
that simple. This is an important programmer convention for the following
reasons.
• Names make it possible for programmers to communicate the details of a
runtime exception or error using only the exception name. This is
particularly important for runtime exceptions because they are routinely thrown
during development and unit testing. Take, for example,
ArithmeticException. Except for BigDecimal and BigInteger, in
which ArithmeticException is arguably misused as an
IllegalArgumentException, no class in the core API documents that it throws
long l = unsafe.allocateMemory(8);
try {
    unsafe.putLong(l, 0x0102030405060708L);
    byte b = unsafe.getByte(l);
    switch (b) {
    case 0x01: byteOrder = ByteOrder.BIG_ENDIAN;
               break;
    case 0x08: byteOrder = ByteOrder.LITTLE_ENDIAN;
               break;
    default:   throw new Error("Unknown byte order");
    }
} finally {
    unsafe.freeMemory(l);
}
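For comparison, the same probe can be written against public API rather than sun.misc.Unsafe; a minimal sketch using the java.nio buffer classes available as of the 1.4 release (the ByteOrderProbe name is invented here):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderProbe {
    static ByteOrder probe() {
        // Store a long in native order and inspect the first byte,
        // just as the Unsafe-based code above does with raw memory
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.nativeOrder());
        buf.putLong(0, 0x0102030405060708L);
        switch (buf.get(0)) {
        case 0x01: return ByteOrder.BIG_ENDIAN;
        case 0x08: return ByteOrder.LITTLE_ENDIAN;
        default:   throw new Error("Unknown byte order");
        }
    }
}
```

As a sanity check, the result always agrees with ByteOrder.nativeOrder() on any JVM where both paths work.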
Error class hierarchy, but how I think it should look. The LinkageError
and VirtualMachineError subclasses have been omitted in order to
simplify the figure. The deprecated ThreadDeath has also been omitted. Both
the FactoryConfigurationError and
TransformerFactoryConfigurationError classes have been removed
and replaced by the more general SystemConfigurationError class. That
class has in turn been extended by the only known recurring end-user errors.
Finally, both AWTError and CoderMalfunctionError have been removed
for reasons discussed at the bottom of 6.5 The Throwable Class Hierarchy.
This may not be the actual Error class hierarchy, but we can nonetheless
think of it as being such. Additions of new non-occurring errors such as
UnknownByteOrderError are so rare that they should not be allowed to
stand in the way of a “no unspecified runtime exceptions or errors” programmer
convention. The good to be gained by such a programmer convention
significantly outweighs the inconvenience of having to name new non-occurring errors.
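Declaring such a named non-occurring error costs only a few lines. A sketch (UnknownByteOrderError is the hypothetical name used in this discussion, not a core API class), following the two-constructor convention from the Throwable API docs:

```java
// Hypothetical: gives the "Unknown byte order" condition a name so that
// it can be communicated and, if ever necessary, explicitly caught
class UnknownByteOrderError extends Error {
    UnknownByteOrderError() {
        super();
    }
    UnknownByteOrderError(String detailMessage) {
        super(detailMessage);
    }
}
```

The default: clause in the byte-order probe above would then read throw new UnknownByteOrderError() instead of throw new Error("Unknown byte order").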
The Chicken Little “the sky is falling” problem is pervasive in the core
API. The only difference is that instead of saying “the sky is falling,”
the responsible programmer throws InternalError.
The word “recovery” is not mentioned once. There are three kinds of errors:
every method errors, non-occurring errors, and recurring end-user errors. Pro-
grammers cannot throw every method errors. Furthermore, new non-occurring
errors such as UnknownByteOrderError are about as rare as recurring
end-user errors. This line of reasoning is very significant for Java programmers
because the only errors that are left are AssertionError and
SystemConfigurationError (or the equivalent). Thus we are faced with two
radically different realities; one in which errors are vaguely defined as
“serious problems” that are “unrecoverable” versus (a much more mun-
79. Actually, I am contradicting myself. If for some reason the javax.print package used a
system property in which to store the name of a class that is a designated part of the J2SE or other
Java product, then it would be perfectly natural to throw a SystemConfigurationError if
for some reason that system property were null or the named class could not be found or instan-
tiated. The difference is that system initialization has not yet completed. Core API programmers rou-
tinely throw errors if there is a problem during initialization. See also the discussion of the “error
chute” in 6.8.1.1 Catchall Exception Handlers.
It helps to understand the “jz” notation. I believe this refers to “JAR or Zip file.”
The getNextEntry(long jzfile, int i) method is a native class
method that is passed an address for the cached JAR or Zip file information as
well as the entry number. It returns zero to indicate a failure. The total num-
ber of entries was returned by another native method that cached the JAR or
Zip file when it was initially read. The error is thrown because the
getNextEntry method returns zero, but that does not explain what causes the error.
This example cannot be appreciated unless you accept the fact that the respon-
sible programmer had no idea why getNextEntry would return zero, but was
merely aware of the fact that it returned zero if something went wrong. I base
java.lang.InternalError: jzentry == 0
at java.util.zip.ZipFile$2.nextElement(Unknown Source)
Beginning with the 1.4 release some additional diagnostic information has been
added to the detail message, which now looks like this:
java.lang.InternalError: jzentry == 0,
jzfile = 1190293256,
total = 38,
name = F:\Temp\db\inserted_2bytes.zip,
i = 1,
message = invalid LOC header (bad signature)
at java.util.zip.ZipFile$2.nextElement(ZipFile.java:303)
In either case, the detail message begins with the rather obtuse jzentry ==
0, which is the value of a local variable. This is the kind of diagnostic message
one would expect to see when using System.out.println to debug a pro-
gram. Because of this, I am utterly convinced that the responsible programmer
had no idea what might cause this exception to be thrown. Thus the only expla-
nation for throwing an error is that this was obviously a very “serious” exception.
Throwing InternalError because of the seriousness of an exception is very
different from any of the other uses of unspecified runtime exceptions and
errors discussed in the previous section.
As it turns out this exception is thrown for two reasons. The more important
of the two was a caching problem (there was no check to see if the JAR or Zip
file was modified after being cached) that has since been fixed (see Bug Id
4353705). The other reason is that the Zip or JAR file is corrupted. This example
is particularly interesting because ZipException has been part of the
java.util.zip package since the 1.0 release; yet if a Zip file is somehow
corrupted (an exceptional condition, but hardly something that “never occurs”),
an InternalError is thrown. Bug Id 4615343 is a request to throw
ZipException (a checked exception) instead. A better solution would be to
throw a CorruptedZipFileException subclass of ZipException
now that the cause of the exception is understood. The problem is that adding a
In the C++ code that implements the HotSpot VM, these are referred to as
safepoints. This discussion of polling for asynchronous exceptions at safepoints is
very far from the practical reality of application programmers. Even 11.8.3
Asynchronous Exceptions in The Java Native Interface Programmer’s Guide
and Specification downplays their significance:
80. Gosling et al., §11.3.2, “Handling Asynchronous Exceptions” and Tim Lindholm and Frank Yellin,
§2.16.2, “Handling an Exception.”
For JNI programmers to “carefully following the rules of checking for asynchro-
nous exceptions” roughly equates to the following statement in the API docs for
the ThreadDeath class:
An application should catch instances of this class only if it must clean
up after being terminated asynchronously. If ThreadDeath is caught
by a method, it is important that it be rethrown so that the thread actu-
ally dies.82
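The cleanup-and-rethrow idiom the API docs describe looks like this in outline. In this self-contained sketch the ThreadDeath is constructed directly as a stand-in for one arriving asynchronously via Thread.stop(); the class and method names are illustrative:

```java
class CleanupDemo {
    static boolean cleanedUp = false;

    static void run() {
        try {
            // stands in for ThreadDeath arriving via Thread.stop()
            throw new ThreadDeath();
        } catch (ThreadDeath td) {
            cleanedUp = true; // clean up after being terminated asynchronously
            throw td;         // rethrow so the thread actually dies
        }
    }
}
```

The essential point is the rethrow: swallowing the ThreadDeath would keep the thread alive after it was asked to die.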
81. Sheng Liang, The Java Native Interface Programmer’s Guide and Specification, (Boston,
Addison Wesley Professional, 1999), 11.8.3, “Asynchronous Exceptions.”
82. API docs for the java.lang.ThreadDeath class.
83. Gosling et al., §11.3.2, “Handling Asynchronous Exceptions” and Tim Lindholm and Frank Yellin,
§2.16.2, “Handling an Exception.” The bracketed “implementation” is only found in the JVMS.
With all due respect, this makes no sense. It implies that all of the exceptions
thrown by a JVM implementation are asynchronous. That clearly contradicts the
following statement about synchronous exceptions in the JLS.
These exceptions are not thrown at an arbitrary point in the program,
but rather at a point where they are specified as a possible result of an
expression evaluation or statement execution.87
84. Actually, “a small but bounded amount of execution [is permitted] to occur before an asynchro-
nous exception is thrown” (from both the JLS and JVMS). In the terminology of the HotSpot VM, this
is referred to as reaching a “safepoint.” I believe that it is because of this specification and the fact
that invoking a stop method does not immediately stop a thread that one HotSpot programmer
refers to invoking stop as a “quasi-asynchronous exception.” I have seen other comments made
by very knowledgeable programmers questioning if ThreadDeath is truly asynchronous.
85. Gosling et al., §11.3.2, “Handling Asynchronous Exceptions.”
86. Ken Arnold, James Gosling, and David Holmes, §8.2.2, “Asynchronous Exceptions.”
87. Gosling et al., §11.1, “The Causes of Exceptions.”
In light of the fact that both the JLS and JVMS describe
InternalError as asynchronous and that this may in fact be the first truly
asynchronous InternalError ever coded (in a Sun implementation of the
JVM), these comments are almost comical.
88. This applies to both the HotSpot VM and native methods using JNI (which are not part of the
JVM but are mentioned here for the sake of completeness). I do not have the source code for the
Classic VM but have reason to believe that it also threw InternalError using the same mecha-
nism as synchronous exceptions.
89. Evaluation of Bug Id 4454115.
90. Gosling et al., introduction to Chapter 11, “Exceptions.”
This is clearly misleading, but a much more serious problem with the API docs
for the Throwable class is the matter-of-fact assumption that subclasses
should have a no-argument constructor and one that takes a String argument
(for the detail message):
By convention, class Throwable and all its subclasses have two con-
structors, one that takes no arguments and one that takes a String
argument that can be used to produce an error message.91
ever, because the interface contract for exception classes does not specify the
format of either of these strings. For example, the API docs for the
92. API docs for the getStackTrace() method in the java.lang.Throwable class.
Indeed it would, and that is why you should always consider storing a reference
to the current object when declaring a checked exception. Doing so allows client
programmers access to the public interface of the object that caused the
exception. In fact, I would go as far as to say that an Object
currentObject parameter should be as common as the String
detailMessage parameter in checked exceptions, if not more so.
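A sketch of such a checked exception (the TransferFailedException name and its use are hypothetical, invented for this illustration):

```java
// Hypothetical checked exception that stores a reference to the object
// that caused it, alongside the usual detail message
class TransferFailedException extends Exception {
    private final Object currentObject;

    TransferFailedException(Object currentObject, String detailMessage) {
        super(detailMessage);
        this.currentObject = currentObject;
    }

    /**
     * Gives client programmers access to the public interface of the
     * object that caused the exception.
     */
    public Object getCurrentObject() {
        return currentObject;
    }
}
```

A catch block can then downcast the result of getCurrentObject() to the known type and query its state, rather than parsing the detail message.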
What about runtime exceptions and errors? They are generally not caught
except in catchall exception handlers, so why bother storing failure information in
instance variables? This is a difficult question to answer because programmers will
always find a reason to catch an exception. Fortunately, the decision has already
been made because most runtime exceptions and errors are part of the core API.
Consistent with the API docs for the Throwable class, most of them (including
93. Unascribed, “Programming with Assertions” in the API docs for the 1.4 release, “Design FAQ -
The AssertionError Class.”
public UndeclaredThrowableException(Throwable undeclaredThrowable)
public UndeclaredThrowableException(Throwable undeclaredThrowable, String s)
In the 1.4 release there are the following examples in the java.nio package:
In all three cases there are accessor methods that correspond to the required
constructor parameters. This shows clear evidence of a shift towards runtime
exceptions (and possibly errors) that store failure information.
MissingResourceException also follows this pattern, but it should have
been declared as a checked exception anyway. The
ArrayIndexOutOfBoundsException and
StringIndexOutOfBoundsException classes both have a
constructor that can be passed an index value (along with a no-arg constructor
and one that takes a String argument), but it is only used in formatting the
detail message.
You should not make too much out of this analysis of runtime exceptions and
errors in the core API. For many of them there simply is no failure information to
store. That is certainly the case with a NullPointerException, for exam-
94. I very often do ignore the org.omg package. The wholesale neglect of Java naming conven-
tions makes even reading their API docs a maddening experience.
class Test {
    public static void main(String[] args) {
        try {
            anotherMethod();
        }
        catch (goto e) { }
    }
    static void anotherMethod() {
        throw new goto(); //non-local jump
    }
}

class goto extends RuntimeException { }
Bloch has an entire section on this subject entitled “Item 39: Use exceptions only
for exceptional conditions” in which he makes the following comments.
Exceptions are, as their name implies, to be used only for exceptional
conditions; they should never be used for ordinary control flow.96
This emphasis on “control flow” can also be found in the FOLDOC definition of
the term exception, which begins as follows.
An error condition that changes the normal flow of control in a pro-
gram…97
96. Joshua Bloch, Effective Java Programming Language Guide, “Item 39: Use exceptions only
for exceptional conditions.”
97. Free Online Dictionary of Computing (FOLDOC), foldoc.doc.ic.ac.uk/foldoc/
foldoc.cgi?exception.
attempt to make a call on a pay phone fails because the caller has not deposited
a sufficient quantity of money.”98 His point is that “the exception should provide
an accessor method to query the amount of the shortfall so the amount can be
relayed to the user of the phone,”98 but there is nothing out of the “ordinary” in
inserting too few coins in a pay phone or vending machine. It happens all the
time. Thus the exception mechanism should not be used.
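Consistent with that argument, the shortfall can be reported with an ordinary return value rather than an exception. A sketch (the PayPhone class and all of its names are hypothetical):

```java
// Hypothetical pay phone: the common "too few coins" case is reported
// through the return value, not the exception mechanism
class PayPhone {
    private final int callCost; // cost of a call, in cents

    PayPhone(int callCost) {
        this.callCost = callCost;
    }

    /**
     * Attempts to pay for a call. Returns zero on success; otherwise
     * returns the shortfall in cents so that it can be relayed to the
     * user of the phone.
     */
    int insertCoins(int cents) {
        return (cents >= callCost) ? 0 : callCost - cents;
    }
}
```

Returning the shortfall keeps the common case exception-free while still giving the caller the exact amount to display.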
There is another example I am always reminded of in this context. It has
been in The Java Tutorial for many years now:
try {
    while (true) {
        price = in.readDouble();
        in.readChar();      //throws out the tab
        unit = in.readInt();
        in.readChar();      //throws out the tab
        char chr;
        desc = new StringBuffer(20);
        char lineSep =
            System.getProperty("line.separator").charAt(0);
        while ((chr = in.readChar()) != lineSep) {
            desc.append(chr);
        }
98. Bloch, Effective Java, “Item 40: Use checked exceptions for recoverable conditions and run-
time exceptions for programming errors.”
This example reads a binary file until EOFException is thrown. The problem
is that reaching EOF while reading an input file is something that is expected to
happen. The exception mechanism therefore should not be used. This is a
common problem that stems in part from the fact that EOFException was
misnamed. The exception is intended to signify that there are not enough
unsigned bytes left in the input stream to convert to a particular primitive data
type. This would be indicative of a corrupted file, most likely one that has been
inadvertently truncated. The exception should have been named
UnexpectedEOFException. As stated in the API docs:
Signals that an end of file or end of stream has been reached unexpect-
edly during input.100
The second paragraph of the API docs for this exception class originally read as
follows for many years starting with the 1.0 release all the way through the last
1.3 release:
This exception is mainly used by data input streams, which generally
expect a binary file in a specific format, and for which an end-of-stream
is an unusual condition. Most other input streams return a special value
on end of stream.101
To this was added the following innocuous paragraph in the 1.2 release:
Note that some input operations react to end-of-file by returning a dis-
tinguished value (such as -1) rather than by throwing an exception.102
99. Mary Campione and Kathy Walrath, “How to Use DataInputStream and DataOutputStream.”
100. API docs for the java.io.EOFException class.
101. API docs for the java.io.EOFException class prior to the 1.4 release.
102. API docs for the java.io.EOFException class in the 1.2 and 1.3 releases only.
try {
    for (;;) {
        int i = in.readInt();
        // …
    }
103. API docs for the java.io.EOFException class starting in the 1.4 release.
try {
    while ((ch = in.read()) != -1) {
        // …
    }
    in.close();
} catch (IOException e) {
    // … handle an error
}
multi-byte data types such as int, double, and char. In that case, the
DataInputStream class does not in fact include a mechanism for signaling
that EOF was reached. Perhaps it should. When reading the first unsigned byte of
105. API docs for the readInt() method in the java.io.DataInputStream class.
106. API docs for the readBoolean() method in the java.io.DataInputStream class.
import java.io.*;

class Test {
    public static void main(String[] args) throws IOException {
        File file = new File("invoice1.txt");
        DataOutputStream out = new DataOutputStream(
                                   new FileOutputStream(file));
        double[] prices = { 19.99, 9.99, 15.99, 3.99, 4.99 };
        int[] units = { 12, 8, 13, 29, 50 };
        String[] descs = { "Java T-shirt",
                           "Java Mug",
                           "Duke Juggling Dolls",
                           "Java Pin",
                           "Java Key Chain" };
        for (int i = 0; i < prices.length; i++) {
            out.writeDouble(prices[i]);
            out.writeInt(units[i]);
            out.writeInt(descs[i].length());
            out.writeChars(descs[i]);
        }
The first thing that must be done when reading binary data is to invoke the
length() method in the File class. One way or another, the file length is
used to determine when the end of a binary file is reached, not
EOFException.
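A sketch of that approach, assuming a fixed record layout of one double (8 bytes) followed by one int (4 bytes); the class and file names are arbitrary:

```java
import java.io.*;

// Sketch: drive the read loop with File.length() instead of EOFException
class FixedRecordReader {
    static final int RECORD_LENGTH = 12; // 8-byte double + 4-byte int

    static long recordCount(File file) {
        return file.length() / RECORD_LENGTH; // end of file computed up front
    }

    public static void main(String[] args) {
        try {
            File file = new File("records.bin"); // arbitrary file name
            DataOutputStream out = new DataOutputStream(
                                       new FileOutputStream(file));
            out.writeDouble(19.99); out.writeInt(12);
            out.writeDouble(9.99);  out.writeInt(8);
            out.close();

            DataInputStream in = new DataInputStream(
                                     new FileInputStream(file));
            for (long i = recordCount(file); i > 0; i--) {
                double price = in.readDouble();
                int units = in.readInt();
                System.out.println(price + " x " + units);
            }
            in.close();      // the loop ends by count, never by exception
            file.delete();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```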
Note that it is even possible to subclass DataInputStream and add an
iterator() method that returns an Iterator for which hasNext() has
the meaning of “has another record.” The core API could have generalized this
behavior by creating a RecordInputStream class. The constructors for
such a class would throw an InvalidFileLengthException (which is an
IOException subclass) if the file length was not evenly divisible by the record
length passed. The next() method would throw EOFException if for some
reason the total number of records computed in the constructor could not be
read. RecordInputStream subclasses could further simplify record
processing by overriding the next() method and reading each of the fields in a
given binary data file format into appropriately named instance variables.
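The idea can be sketched as follows. The RecordInputStream and InvalidFileLengthException names are the hypothetical ones proposed in this section; nothing like this exists in the core API:

```java
import java.io.*;

// Hypothetical IOException subclass thrown by the constructor below
class InvalidFileLengthException extends IOException {
    InvalidFileLengthException(String detailMessage) {
        super(detailMessage);
    }
}

// Sketch of the proposed RecordInputStream class
class RecordInputStream extends DataInputStream {
    private final long totalRecords;
    private final int recordLength;
    private long recordsRead = 0;

    RecordInputStream(File file, int recordLength) throws IOException {
        super(new FileInputStream(check(file, recordLength)));
        this.recordLength = recordLength;
        this.totalRecords = file.length() / recordLength;
    }

    private static File check(File file, int recordLength)
            throws InvalidFileLengthException {
        if (file.length() % recordLength != 0) {
            throw new InvalidFileLengthException(
                file + ": length is not evenly divisible by " + recordLength);
        }
        return file;
    }

    /** Has the meaning of "has another record." */
    public boolean hasNext() {
        return recordsRead < totalRecords;
    }

    /** Reads the next record; EOFException here would mean truncation. */
    public byte[] next() throws IOException {
        byte[] record = new byte[recordLength];
        readFully(record);
        recordsRead++;
        return record;
    }
}
```

A subclass would override next() to decode the raw record into named fields rather than returning raw bytes.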
The more important point in terms of this section is that reaching the end of
a binary file (or any other file for that matter) is something that is expected to
happen. Therefore the exception mechanism should not be used. Besides the
fact that the examples in The Java Tutorial and Bug Id 4269112 include latent
bugs, the reason for not using the exception mechanism for “ordinary” or
“normal” control flow is that doing so is very inefficient. I was first made aware of this
Exception table:
   from    to  target  type
     41    44      44  <Class Zero>
     41    44      57  <Class One>
     41    44      70  <Class Two>
     41    44      83  <Class Three>
     41    44      96  <Class Four>
     41    44     109  <Class java.lang.Throwable>
The assembler code that implements the table lookup is part of the athrow
instruction. It cannot be seen in decompiled code.
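The shape of Java source that compiles to such a table is simply a try statement with several catch clauses. In this self-contained sketch the Zero and One classes are hypothetical stand-ins for the ones in the table above:

```java
class TableDemo {
    static class Zero extends Exception { }
    static class One  extends Exception { }

    static Exception pick(int n) {
        switch (n) {
        case 0:  return new Zero();
        case 1:  return new One();
        default: return new Exception();
        }
    }

    // Each catch clause below becomes one row in the method's exception
    // table; athrow searches the rows in order for a matching type
    static String dispatch(int n) {
        try {
            throw pick(n);
        } catch (Zero e) {
            return "zero";
        } catch (One e) {
            return "one";
        } catch (Exception e) {
            return "other";
        }
    }
}
```

Running javap -c on such a class shows one exception-table row per catch clause, all covering the same from/to bytecode range.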
Microbenchmark tests show that these table lookups are much slower than
comparable control-flow statements. Bloch claims that the comparable control-
flow statements are seventy times faster. I was a little sceptical of that number
and ran the test myself. Here is the source inspired by his example in Effective
Java:
import java.io.*;

class Test {
    public static void main(String[] args) throws IOException {
        test1(1000, true);
        test2(1000, true);
        System.out.println();
107. Bloch, Effective Java, “Item 39: Use exceptions only for exceptional conditions.”
Ignoring the first test, these results suggest that the exception mechanism is
almost 30 times slower than comparable control-flow statements. As stated
above, Bloch reported 70 times slower. In either case, the difference is signifi-
cant. What about control-flow statements other than loops? The following
microbenchmark test suggests that the exception mechanism is about 18 times
slower than a comparable if-then-else or switch statement.
import java.util.Random;

class Test {
    static Random random = new Random();
    static int zero, one, two, three, four;
    static Number[] exceptions = {new Zero(), new One(), new Two(),
                                  new Three(), new Four()};
    static Throwable t;
    public static void main(String[] args) {
        test1(100000, true);
        test2(100000, true);
        test3(100000, true);
        System.out.println();
        test1(1000000, false);
        test2(1000000, false);
        test3(1000000, false);
        System.out.println();
        test1(10000000, false);
        test2(10000000, false);
        test3(10000000, false);
        System.out.println();
        test1(100000000, false);
        test2(100000000, false);
        test3(100000000, false);
    }
try {
    return elementData[index];
} catch (ArrayIndexOutOfBoundsException e) {
If putting code inside of a try block precludes certain JVM optimizations, then it
is not free. It bothers me that Bloch offers no supporting documentation for this
claim. On the other hand, I have no reason to doubt what he is saying, especially
when you consider that Effective Java was technically edited by “the best team
of reviewers imaginable”108 including no less than Dr. Gosling, Gilad Bracha,
Peter Haggar, Doug Lea, Tim Lindholm, and others. The only supporting
documentation that I can find, however, is related to asynchronous exceptions
(which, even if they no longer exist, continue to influence the design of modern
JVMs such as HotSpot). The throw statement is a “control transfer” as
referred to in the following specification that can be found in both the JLS and JVMS.
A simple implementation might poll for asynchronous exceptions at the
point of each control transfer instruction. Since a program has a finite
size, this provides a bound on the total delay in detecting an asynchro-
nous exception. Since no asynchronous exception will occur between
control transfers, the code generator has some flexibility to reorder
computation between control transfers for greater performance.109
class Test {
    public static void main(String[] args) throws Exception {
        anotherMethod(new Exception());
    }
    static void anotherMethod(Exception e) throws Exception {
        throw e;
    }
}
try { }
catch (Throwable e) { }
…
finally { }
import java.io.*;

public class Test {
    String s = "field";
    public static void main(String args[]) throws IOException {
        System.out.println(new Test().testLocalVariable());
        System.out.println(new Test().testField());
    }
    String testLocalVariable() {
        String s = "local variable";
        try {
            return s;
        } finally {
            s = "finally";
        }
    }
    String testField() {
        try {
            return s;
        } finally {
            s = "finally";
        }
    }
}
local variable
field
111. Gosling et al., The Java Language Specification, §14.19.2, “Execution of try-catch-finally.”
112. Evaluation of Bug Id 4088715.
an error that is never propagated. Bruce Eckel refers to this as the lost
exception in his book Thinking in Java.113 The JLS addresses this issue as follows.
It can be avoided only to the extent that exceptions are not explicitly thrown in
a finally block. Executing this program prints
In all four cases, the try statement forgets about the Error thrown in the try
block (hence the method name amnesia), instead executing the
control-transfer statement in the finally block. Note that exception
chaining is not an option, so the checked exception thrown in the try block is
truly lost. Thus the general rule is not to code control-transfer
statements in a finally block.
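A minimal demonstration of the lost exception (the class and method names here are illustrative):

```java
class Amnesia {
    static String amnesia() {
        try {
            throw new Error("lost");   // thrown in the try block...
        } finally {
            return "finally wins";     // ...and discarded by this return
        }
    }
}
```

Invoking amnesia() returns "finally wins" normally; the Error is never propagated to the caller, and no trace of it survives.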
Application programmers have control over the decision to code a break,
continue, or return statement in a finally block. What they cannot
necessarily control are methods and constructors invoked in a finally block that
propagate exceptions. Thus the single most important point to understand about
the order of execution in a try statement is that if an exception is thrown in the
finally block, the execution of any control-transfer statement either in
the try block or in a catch block completes abruptly. The following
inverse example should make this abundantly clear.
class Test {
    public static void main(String[] args) {
        for (int i=0; i<4; i++) {
            try {
                test(i);
                System.out.println("never returns");
            } catch(Throwable e) {
                System.err.print(e.getMessage());
113. Bruce Eckel, Thinking in Java, 3rd Edition, Beta (Upper Saddle River: Prentice Hall, 2002),
455.
114. Gosling et al., §11.3, “Handling of an Exception.”
Executing this program prints 0123. All four control-transfer statements in the
try block completed abruptly because of the error propagated in the finally
block.
catch (Throwable e) {
…
}
import java.io.*;

class Test {
    public static void main(String[] args) {
        try { throw new EOFException(); }
        catch (IOException ioe) {
            System.out.println("this message prints");
        }
    }
}
115. Doug Lea, Concurrent Programming in Java, Second Edition, (Boston, Addison-Wesley,
2002), §3.1.2.2, “IO and resource revocation.” (See footnote at the bottom of page 173.)
import java.io.IOException;

class Test {
    public static void main(String[] args) {
        try { }
        catch (IOException ioe) { }
    }
}
The problem with the specification is that it cannot be applied to runtime excep-
tions and errors because “it is beyond the scope of the Java programming lan-
guage, and perhaps beyond the state of the art, to include sufficient information
in the program to reduce to a manageable number the places where these can
be proven not to occur.”117 Although the specification as written fails to exempt
116. Gosling et al., The Java Language Specification, §14.20, “Unreachable Statements.”
117. Gosling et al., The Java Language Specification, §8.4.4, “Method Throws.”
118. Evaluation of Bug Id 4046575.
119. Gosling et al., §11.5, “The Exception Hierarchy.”
try {
    propagate(); //okay
} catch(SubException e) { }
}
This really should not compile. The explicitly thrown Exception is clearly
not assignable to SubException, and both are checked exceptions for
which the strictest compiler type checking should be implemented.
When the checked exception in the second bulleted item is propagated (versus
being explicitly thrown), the catch clause is reachable because any subclass
of an exception type in the throws clause of a method or constructor can
be thrown in the method or constructor body, which is just another way of say-
ing that superclass types can be named in a throws clause. This is a widely
recognized problem with the following specification for unreachable catch
clauses.
A catch block C is reachable [if and only if] both of the following are
true:
class Test {
    public static void main(String[] args) {
        try { }
        catch (Exception e) { }
        char e = 'e';
    }
}
121. Gosling et al., §14.19, “The try Statement.” This is not the only problem with the wording in
this section of the JLS. The next paragraph in the same section is very difficult to read. The “of the
directly enclosing method or initializer block” should be dropped. If such a local variable were erro-
neously declared, it would be local to the catch block anyway. On the other hand, I noticed long
ago that programmers just love to find even the most trivial of mistakes in the Java specifications.
NOTE 6.1
Unless I add this note I fear the following section would give rise to the
widespread use of catchall exception handlers. That is not my intent. I argue
strongly in favour of their use (when appropriate) because they have
gotten a lot of bad press. For example, The Java Tutorial has said for
many years now that catch(Exception e) is “useless”124 and The
Java Programming Language, Third Edition says this catchall ex-
ception handler is “usually a poor implementation choice.”125 Yet there
are literally hundreds of such exception handlers in the core API as well
as a significant number of catch(Throwable e) exception
handlers. For that reason alone they should be discussed. Perhaps the most
important thing that could be said about catchall exception handlers,
however, is that they should be used with the greatest of care.
122. James Gosling, Bill Joy, and Guy Steele, The Java Language Specification, First Edition,
(Reading: Addison-Wesley, 1996), §14.18, “The try statement.” (Do not update.)
123. Gosling et al., §14.19, “The try statement.”
124. Mary Campione and Kathy Walrath, “The Catch Block(s)”, java.sun.com/docs/books/
tutorial/essential/exceptions/catch.html.
125. Ken Arnold, James Gosling, and David Holmes, §8.4, “try, catch, and finally.”
126. Unascribed, “How to Write Doc Comments for the Javadoc Tool” at java.sun.com/j2se/
javadoc/writingdoccomments/index.html, “Documenting Exceptions with @throws Tag.”
} catch (Exception e) {
That the language designers thought errors should not be caught is unambigu-
ously expressed in the API docs for the Error and Exception superclasses:
An Error is a subclass of Throwable that indicates serious prob-
lems that a reasonable application should not try to catch.129 [from the
Error class]
The class Exception and its subclasses are a form of Throwable
that indicates conditions that a reasonable application might want to
catch.130 [from the Exception class]
127. This undocumented behavior was first reported on April 26, 1999 in Bug Id 4233093, the
evaluation of which states: “The documentation of Class.newInstance needs to be modified to docu-
ment the current behavior. The method will propagate any exceptions which may be thrown when
invoking the default constructor.” As of this writing, the API docs for the newInstance() method
in Class still do not document this behavior. On the other hand, Bug Id 4233093 is still marked “In
progress, bug.” Almost three years after it was submitted, there appears to be some indecision as
to the wisdom of fixing this bug.
128. Gosling et al., §11.5, “The Exception Hierarchy.”
129. API docs for the java.lang.Error class.
130. API docs for the java.lang.Exception class.
of the overloaded stop methods in the Thread class in the 1.2 release. These
are the methods that throw ThreadDeath. That opened the way for the
import java.lang.reflect.*;

public class Test {
    public static void main(String args[]) {
        try {
            Class c = Class.forName("HelperClass");
            Method main = c.getMethod("main",
                new Class[] {args.getClass()});
            main.invoke(null, new Object[] { args });
        } catch (Throwable e) {
            System.out.println("Hello World!"); //default message
        }
    }
}

class HelperClass {
    public static void main(String[] args) {
        System.out.println("Goodbye Cruel C++ World!");
    }
}
Executing this program prints Goodbye Cruel C++ World!. Now look what
happens if I introduce a programming error by replacing the line in bold with the
following.
The main method is now being invoked with an incorrect number of arguments.
Executing the program prints "Hello World!" when it should have thrown a
try {
    Class c = Class.forName("HelperClass");
    Method main = c.getMethod("main",
        new Class[] {args.getClass()});
    main.invoke(null, new Object[] { });
} catch (RuntimeException e) {
    throw e;
} catch (Throwable e) {
    System.out.println("Hello World!"); //default message
}
This version of the code would have thrown the following runtime exception dur-
ing development and unit testing (the stack trace has been omitted).
The problem with this solution is that it overlooks the reality of runtime excep-
tions such as MissingResourceException, SecurityException,
and NumberFormatException that should have been declared as checked
exceptions. As a result, the catch(Throwable e) exception handler no
longer expresses “catch everything” or “if anything goes wrong.” A sepa-
rate catch(RuntimeException e) exception handler to the left of
the catchall exception handler is therefore not an option.
There must be some way of making sure that the programmer is made
aware of runtime exceptions and assertion errors during development and unit
testing so that they can be corrected. One way is simply to display the exception
to standard error. For example,
} catch (Throwable e) {
//this display is for runtime exceptions and assertion errors
//thrown during development and unit testing
e.printStackTrace();
The only difference between this and what the default exception handler does is
that the program continues to execute. Alternatively the stack trace can be writ-
ten to a log file. Under no circumstances whatsoever should a catchall
exception handler be allowed to lose failure information. Doing so
increases the possibility that a runtime exception or assertion error could escape
detection during development and unit testing. This means that catchall exception
There are only two rules for using catchall exception handlers:
display the exception or error caught to standard error (at
least during development and unit testing) and do not lose failure
information.
handlers should always use exception chaining when translating the exception or
error caught into a high-level (checked) exception. So long as this is done, how
and when a runtime exception or assertion error caught in a catchall exception
handler comes to the attention of the responsible programmers does not really
matter.
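Taken together, the two rules look something like the following sketch. The ConfigException and Config names and the doLoad() method are hypothetical, invented for this illustration; the pattern itself (display the failure, then chain it into a high-level checked exception) is the one described above.

```java
// Hypothetical checked exception; passing the Throwable cause to the
// superclass constructor is the exception chaining that preserves
// the failure information.
class ConfigException extends Exception {
    ConfigException(String message, Throwable cause) {
        super(message, cause);
    }
}

class Config {
    static String loadConfig() throws ConfigException {
        try {
            return doLoad();
        } catch (Throwable e) {
            // Rule 1: display the exception or error caught to standard
            // error (at least during development and unit testing)
            e.printStackTrace();
            // Rule 2: do not lose failure information
            throw new ConfigException("could not load configuration", e);
        }
    }

    // Stand-in for code that may throw an unexpected runtime exception
    private static String doLoad() {
        throw new IllegalStateException("simulated failure");
    }
}
```

So long as the original Throwable is chained, how the failure eventually comes to the attention of the responsible programmer is a separate decision.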
What about errors other than assertions? Isn’t catching a “serious” error
such as an OutOfMemoryError dangerous? There are four categories of
errors that need to be considered:
• Linkage errors
• Virtual machine errors
• System configuration errors (as defined in the introduction to this chapter)
• Assertions
Linkage errors always indicate a problem with a single class, so there is no rea-
son not to catch them. System configuration errors generally occur during sys-
tem initialization and therefore cannot be caught by application programmers. Even
if they are thrown after system initialization (because of lazy initialization), a sys-
tem configuration error is like a linkage error in that it indicates a problem with
133. API docs for the java.lang.UnknownError class in the 1.0.2 release.
134. API docs for the java.lang.UnknownError class in the 1.1 release.
135. Description of Bug Id 4065025.
This would be followed by a full thread dump. In the HotSpot VM a crash looks
like this:
#
# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Error ID: 53414645504F494E540E43505002EA
#
I (and several others) have written many replies to such posts because
I have dealt with my share of errors that were manifested as “HotSpot
Virtual Machine Error” written to stderr, and the source of such errors
was actually in the programmer's own native code. That doesn't mean
that actual JVM bugs don't exist, it just suggests that the probability of
a given problem manifesting itself in this way actually turning out to be
a JVM bug is low compared to the other sources mentioned.136
Thanks to “Chris” for these words of wisdom. The 1.3.1 release introduced a
solution to the problem of differentiating native code failures from a JVM
crash. Here is an example of a native code failure from the “Error Han-
dling”137 document:
An unexpected exception has been detected in native code outside the VM.
Unexpected Signal : EXCEPTION_ACCESS_VIOLATION occurred at PC=0xe6a1067
Dynamic libraries:
0x00400000 - 0x00405000 d:\jdk1.3.1\bin\java.exe
0x77F60000 - 0x77FBE000 C:\WINNT\System32\ntdll.dll
0x77DC0000 - 0x77DFF000 C:\WINNT\system32\ADVAPI32.dll
0x77F00000 - 0x77F5E000 C:\WINNT\system32\KERNEL32.dll
0x77E70000 - 0x77EC4000 C:\WINNT\system32\USER32.dll
0x77ED0000 - 0x77EFC000 C:\WINNT\system32\GDI32.dll
0x77E10000 - 0x77E67000 C:\WINNT\system32\RPCRT4.dll
0x78000000 - 0x78040000 C:\WINNT\system32\MSVCRT.dll
0x10000000 - 0x10018000 C:\WINNT\System32\NVDESK32.DLL
0x08000000 - 0x08273000 d:\jdk1.3.1\jre\bin\server\jvm.dll
0x77FD0000 - 0x77FFA000 C:\WINNT\System32\WINMM.dll
0x4FA00000 - 0x4FA77000 C:\WINNT\System32\cwcmmsys.dll
0x50220000 - 0x50227000 d:\jdk1.3.1\jre\bin\hpi.dll
0x503B0000 - 0x503BD000 d:\jdk1.3.1\jre\bin\verify.dll
0x50250000 - 0x50266000 d:\jdk1.3.1\jre\bin\java.dll
0x503C0000 - 0x503CD000 d:\jdk1.3.1\jre\bin\zip.dll
0x0E6A0000 - 0x0E6C2000 D:\testcases\NativeSEGV\NativeSEGV.dll
0x77920000 - 0x77942000 C:\WINNT\System32\imagehlp.dll
0x72A00000 - 0x72A2D000 C:\WINNT\System32\DBGHELP.dll
0x731B0000 - 0x731BA000 C:\WINNT\System32\PSAPI.DLL
…in systems that have to be long lived and reliable, they [the systems]
have to have a comprehensive strategy for dealing with failure,
because failure always happens. There will always be bugs, there will
always be pieces of equipment that get smacked. There will always be
alpha particles that hit busses. Things go weird and the average
answer, which is to just roll over and die, is not a useful one. And partic-
ularly as you get into the systems like flight avionics, you just don't get
to crash…you have to keep on going, you have to do something sensi-
ble. You can't just say, “Oops, no, I'm not going to work anymore.” 138
138. Dr. James Gosling in a Bill Venners interview, the full text of which is available at
www.artima.com/intv/gosling310.html. One must assume that Dr. Gosling was being sar-
castic when referring to a “crash” in a flight avionics system.
139.
String s = System.getProperty("java.security.manager");
SecurityManager sm = null;
if (s != null) {
    if (s.equals("") || s.equals("default")) {
        sm = new java.lang.SecurityManager();
    } else {
        try {
            sm = (SecurityManager)loader.loadClass(s).newInstance();
        } catch (IllegalAccessException e) {
        } catch (InstantiationException e) {
        } catch (ClassNotFoundException e) {
        } catch (ClassCastException e) {
        }
    }
    if (sm != null) {
        System.setSecurityManager(sm);
    } else {
        throw new InternalError(
            "Could not create SecurityManager: " + s);
    }
}
A try block such as this is used by the java launcher to load the
SecurityManager. This is an interesting example not only because the
responsible programmer has taken the time to code an exception handler for all
but one of the known checked and runtime exceptions, but also because he or
she has found a way of coding the exception handling strategy (which is to
throw an InternalError) only once. That becomes an issue if you are going
to code multiple catch clauses for what is essentially the same exception.
For example, here is essentially the same code from the PrinterJob class in
the java.awt.print package.
As is often the case when this problem is solved in the core API, the possibility of
a ClassCastException is either deliberately or inadvertently overlooked.
There is also usually no catch clause for NullPointerException, which
is notable in this example because the responsible programmer has specified
null as the default value to be returned by the System.getProperty
method and still does not check the return value. Handling these runtime excep-
tions would have required two more exception handlers. The immediate point,
however, is that the exception handling strategy (which in this case is to throw
an AWTError) has to be coded over and over again.
Most programmers simply will not code multiple catch clauses for what is
essentially the same exception, nor should they. It would be incorrect to charac-
terize this as lazy programming. Five exception handlers for what is essentially
the same exception is as messy as checking the return value of every method
invoked (as was done in the C programming language). It is a step backwards in
terms of the design goals of the exception mechanism in the Java programming
language. The only solution to this problem is to use a catchall exception handler.
The Preferences class in java.util.prefs does just that, using
the catch(Exception e) exception handler:
String factoryName =
    System.getProperty("java.util.prefs.PreferencesFactory");
if (factoryName == null)
    throw new InternalError(
        "System property java.util.prefs.PreferencesFactory not set");
try {
    factory = (PreferencesFactory)
        Class.forName(factoryName, false,
            ClassLoader.getSystemClassLoader()).newInstance();
} catch(Exception e) {
    throw new InternalError(
Note that the intent here must be characterized not as “catch everything” or “if
anything goes wrong,” but as “catch all known exceptions, but not errors.” This
is an entirely different motivation than catch(Throwable e).
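The propagation behavior mentioned earlier in connection with Bug Id 4233093 is easy to reproduce. In the following sketch the Boom and NewInstanceDemo names are hypothetical; the checked exception thrown in the no-argument constructor propagates out of newInstance() even though it is not named in that method's throws clause.

```java
// Hypothetical class whose no-argument constructor throws a
// checked exception
class Boom {
    public Boom() throws Exception {
        throw new Exception("thrown in the constructor");
    }
}

class NewInstanceDemo {
    public static void main(String[] args) {
        try {
            Boom.class.newInstance();
        } catch (InstantiationException e) {
            // declared by newInstance(), but not thrown here
        } catch (IllegalAccessException e) {
            // declared by newInstance(), but not thrown here
        } catch (Exception e) {
            // the constructor's checked exception arrives undeclared
            System.out.println(e.getMessage());
        }
    }
}
```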
Primarily because the newInstance() method propagates checked
exceptions that are not named in the throws clause, I would take the
Preferences exception handling strategy a step further and make the code
as compact as possible. For example,
class Test {
    public static void main(String[] args) {
        for (int i=0; i < 4; i++)
            test(i);
    }
    static void test(int i) {
140. For someone contemplating the use of catch(Throwable e) as the catchall exception
handler of choice, it is notable that this is precisely what the newInstance(Object[]
initargs) method in Constructor and the invoke(Object obj, Object[] args)
method in Method do. Any exception—including runtime exceptions and errors—thrown in a
method or constructor (other than the newInstance() method in Class) invoked using the
reflection API is caught and translated into an InvocationTargetException. That this was
not done for the newInstance() method in Class was a major API design oversight. Alas, you
can’t please everyone. Bug Id 4386935 complains that InvocationTargetException
should not catch runtime exceptions and errors (using the threadbare argument of “What happens if
an OutOfMemoryError is thrown?”). This one, however, is marked “Closed, will not be fixed” (as
rightly it should be).
141. Gosling et al., §11.3, “Handling of an Exception.”
finally block
finally block
finally block
java.lang.Exception
finally block
There is actually one way out of a try statement without executing the
finally block (short of unplugging your computer). If System.exit(1) in
the catch block is uncommented, the last finally block would not print.
readLock();
try {
    // do something
} finally {
    readUnlock();
}142
The end(boolean completed) method checks to make sure the channel was not
closed during the operation, throwing an AsynchronousCloseException
if it was. (This is roughly equivalent to a fail-fast iterator in the Collections Frame-
work.) There is a similar specification for the end() and begin() methods in
the AbstractSelector class of the same package.
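The pattern those specifications prescribe can be sketched with a minimal (and otherwise useless) subclass. The DemoChannel name and its read() method are hypothetical; only the begin() and end(completed) calls and the try-finally structure come from the AbstractInterruptibleChannel API docs.

```java
import java.io.IOException;
import java.nio.channels.spi.AbstractInterruptibleChannel;

// Hypothetical channel used only to show the prescribed pattern
class DemoChannel extends AbstractInterruptibleChannel {
    protected void implCloseChannel() throws IOException {
        // nothing to release in this sketch
    }

    int read() throws IOException {
        boolean completed = false;
        try {
            begin();            // marks the start of the blocking operation
            int result = 42;    // stand-in for an actual blocking read
            completed = true;
            return result;
        } finally {
            // throws AsynchronousCloseException if the channel was
            // closed while the operation was in progress
            end(completed);
        }
    }
}
```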
It is easy to see how such specifications could proliferate, which is why it is
important to fully understand the raison d’être for try-finally statements in
which no exceptions are thrown. Such try-finally statements are best
thought of as control-flow statements. I did not fully appreciate the fact that
try-finally is actually a control-flow statement until I stumbled across the
following example from the RawCommandLineLauncher class in the
com.sun.tools.jdi package.
try {
    return launch(tokenizeCommand(command, quote.charAt(0)),
        address);
} finally {
    transportService().stopListening(address);
}
This try-finally statement should get your attention. The first time I saw
this code I thought it must be a mistake. Then it occurred to me that this is a
clever try-finally trick that allows invocation of the stopListen-
ing(String address) method to be postponed to the last possible milli-
second. The “trick” is that the return statement is fully evaluated before
the finally block is executed. For example,
class Test {
    public static void main(String[] args) throws Exception {
        test();

try {
    FileDescriptor fd = acquireFD();
    try {
    } finally {
        releaseFD();
    }
} catch (IOException e) {
    close();
    throw e;
}
if (toolkit == null) {
    try {
        // We disable the JIT during toolkit initialization.
        // This tends to touch lots of classes that aren't
        // needed again later and therefore JITing is
        // counter-productive.
        java.lang.Compiler.disable();
    } finally {
        // Make sure to always re-enable the JIT.
        java.lang.Compiler.enable();
    }
}
return toolkit;
I like this example because it is very straightforward. The JIT compiler is dis-
abled while a bunch of classes for which optimization is a waste of time are
loaded. The finally clause is used to make sure the JIT is re-enabled. The
javax.swing.UIManager class does the same thing during UI initializa-
tion. Here is another very straightforward example of a finally clause (this
time in a try-catch-finally statement) that is also completely unrelated
to exception handling.
try {
    stream.writeObject(obj);
} catch (IOException e) {
    return false;
} finally {
    // Fix for 4503661.
    // Reset the stream so that it doesn't keep a reference
    // to the written object.
    try {
        stream.reset();
This code is from a utility method used by the serialization mechanism to make
sure that a Serializable object can be serialized without throwing an
exception. The finally clause was added to reset the ObjectOutput-
Stream so that it does not hold a reference to the object. Otherwise, the object
cannot be garbage collected and becomes what is sometimes referred to as a
zombie. What all of these examples have in common is that the rely on the fact
that finally block is “guaranteed” to execute.
Why is that guarantee so important? It is easy to assume that unexpected
runtime exceptions and errors are not caught and always result in abnormal pro-
gram termination. If that were the case (“daFei” whoever you are, I hope
you are listening144), the finally clause would not be needed because
resource leaks or other problems such as no JIT compiler in a program
that is about to terminate are hardly a problem, but assumptions are never
good in computer programming. What if the unexpected runtime exception or
error is caught and the program does not terminate? This is why you can expect
to see more specifications such as these in which an API designer requires the
use of a try-finally statement to “guarantee” execution of some method.
They allow for the possibility that an unexpected runtime exception or error
thrown during execution of the try block will be caught elsewhere in the pro-
gram.
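Here is a minimal sketch of that possibility. The FinallyDemo name, the released flag, and the work() method are hypothetical; the point is that the runtime exception is caught elsewhere in the program, execution continues, and only the finally block stands between the programmer and a resource leak.

```java
class FinallyDemo {
    static boolean released = false;

    static void work() {
        // imagine a system resource is acquired here
        try {
            throw new RuntimeException("unexpected");
        } finally {
            released = true;  // executes even though the exception
                              // is caught elsewhere in the program
        }
    }

    public static void main(String[] args) {
        try {
            work();
        } catch (Throwable e) {
            // catchall handler; the program does not terminate
        }
        System.out.println(released);  // prints true
    }
}
```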
144. I only read about half of the replies (575 as of this writing) to the now infamous daFei “Finally”
thread at forum.java.sun.com/thread.jsp?forum=31&thread=252665, but I wonder if
anyone else in the hundreds of replies I did not read began to suspect as I did that his real motive
was to achieve fame through having the most replies ever to a Java Forum posting. I got to laughing
so hard thinking that was the case that I just couldn’t read any more. He has been called many
names, but I like to think of daFei as “the idiot who everyone loves to hate.”
It is just not good programming practice to wait until finalizers run to release sys-
tem resources.
Why then are there finalizers in the Graphics, FileInputStream, and
FileOutputStream classes?146 There are two possible answers to this
question. The first is that had they the opportunity to do it all over again, these
classes would not have finalizers. The attitude towards finalizers has changed
considerably over the years. For example, here is the evaluation of Bug Id
1244595, “BufferedOutputStream does not flush even when the pro-
gram exits” (dated April 4, 1996 and reported against the 1.0 FCS release):
If for some reason a finally clause cannot be used to close a buffered out-
put stream that must be flushed, a shutdown hook should be used instead.
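Such a shutdown hook might be written as follows. The FlushOnExit and registerFlushHook names are hypothetical; the only API used is Runtime.addShutdownHook(Thread), which starts the hook thread when the virtual machine begins its normal shutdown sequence (including a System.exit invocation).

```java
import java.io.*;

class FlushOnExit {
    // Hypothetical helper: registers a hook that flushes the stream
    // at VM shutdown and returns the hook thread
    static Thread registerFlushHook(final OutputStream out) {
        Thread hook = new Thread() {
            public void run() {
                try {
                    out.flush();
                } catch (IOException e) {
                    e.printStackTrace();  // do not lose failure information
                }
            }
        };
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws IOException {
        BufferedOutputStream out = new BufferedOutputStream(
            new FileOutputStream("output.txt"));
        registerFlushHook(out);
        out.write("testing, testing, testing".getBytes());
        // no explicit flush() or close(); the hook flushes the buffer
    }
}
```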
The other answer to this question is that finalizers are, as one software engi-
neer at Sun has so eloquently said, “the last bastion of reclaiming system
resources.”149 This is a “better late than never” explanation for the use of finaliz-
ers in the Graphics, FileInputStream, and FileOutputStream
classes.
In the case of FileInputStream and FileOutputStream, however,
the finalizers were a blatant mistake. If two streams are using the same file
descriptor, closing one of them in a finalizer will cause the other to throw an
IOException with a “Bad file descriptor” detail message.150 For example,
import java.io.*;

class Test {
    public static void main(String[] args) throws IOException {
        RandomAccessFile raf = new RandomAccessFile("test.txt",
            "rw");
        BufferedReader in = new BufferedReader(
            new InputStreamReader(
                new FileInputStream(raf.getFD())));
        System.out.println(in.readLine());
        System.out.println(raf.read());
testing,
testing,
testing
testing,
-1
testing,
Exception in thread "main" java.io.IOException: Bad file
descriptor
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:161)
at java.io.InputStreamReader.fill(InputStreamReader.java:156)
at java.io.InputStreamReader.read(InputStreamReader.java:232)
at java.io.BufferedReader.fill(BufferedReader.java:130)
at java.io.BufferedReader.readLine(BufferedReader.java:267)
at java.io.BufferedReader.readLine(BufferedReader.java:322)
at Test.main(Test.java:12)
This class does not define methods for opening existing files or for cre-
ating new ones; such methods may be added in a future release. In this
release a file channel can be obtained from an existing FileInput-
Stream, FileOutputStream, or RandomAccessFile object
by invoking that object's getChannel method, which returns a file
channel that is connected to the same underlying file.152
151. See also Bug Id 4081750, “java.io.RandomAccessFile: Add a finalize method” which is (rightly)
marked “Closed, will not be fixed” (because of Bug Id 4099999).
152. API docs for the java.nio.channels.FileChannel class.
// Open the file and then get a channel from the stream
FileInputStream fis = new FileInputStream(f);
FileChannel fc = fis.getChannel();
…
// Close the channel and the stream
fc.close();
This code should actually invoke fis.close(), which would in fact close both
the channel and the stream. This code example is all the more curious because
an end-of-line comment in the source code for the FileChannelImpl class
clearly states that FileInputStream, FileOutputStream, and
RandomAccessFile are “responsible for closing file descriptor.”153 For
example,
import java.io.*;
import java.nio.channels.FileChannel;

class Test {
    public static void main(String[] args) throws IOException {
        FileOutputStream file = new FileOutputStream("output.txt");
        FileChannel channel = file.getChannel();
        channel.close();
        PrintWriter writer = new PrintWriter(file);
        writer.println("testing, testing, testing");
        writer.close();
    }
}
154. The term internal buffer is generally not used to describe CharBuffer, ByteBuffer,
and other buffer classes in the java.nio package (at least not in the API docs), only the internal
buffer arrays used in the java.io package. Note also that there are no flush() methods in
the java.nio package. Those buffers are passed as arguments to encoders and channels
rather than being flushed.
On the Java platform, it is not possible to determine the numeric value of a file
descriptor (other than those for standard I/O), which I believe is why the API docs
refer to file descriptors as “opaque handles.” I do not agree with this
characterization of “the main practical use of a file descriptor.” Invoking the sync() method
for output streams is just as common if not more so.
File names are only used when opening files. Thereafter the host operating
system returns a file descriptor (which can be thought of as a pointer to an entry
in the open files table) that is used in all subsequent I/O operations, including all
of the read and write primitives and the close() methods in the java.io and
java.net packages (assuming the destination or “sink” is a file or socket).
155. The term handle was briefly used to describe references to objects on the Java platform, but
that terminology is no longer appropriate because object references are no longer implemented
using an indirection similar to that found in file descriptor and open files tables. They are now imple-
mented as direct references (or pointers).
156. It is sometimes suggested that file descriptor limits are necessary because the file descriptor
table uses kernel memory. While this is certainly true, it is also grossly misleading. UNIX-like operat-
ing systems more or less assume a 1-to-1 correspondence between file descriptors and open files
when limiting the former. The real issue here is the total amount of memory needed for an open file,
which includes entries in the open files and v-node tables. By limiting file descriptors, the OS effec-
tively limits the total amount of memory that can be used to access files.
157. API docs for the java.io.FileDescriptor class.
import java.io.*;

class Test {
    public static void main(String[] args) throws IOException {
        FileOutputStream fos = new FileOutputStream("output.txt");
        FileDescriptor fd = fos.getFD();
        fos.write("testing, testing, testing".getBytes());
        fd.sync();
        FileInputStream fis = new FileInputStream(fd);
        fis.read();
    }
}
The same IOException would be thrown if the file descriptor from a File-
InputStream were used to open a FileOutputStream. Given the fact
that RandomAccessFile does not have a constructor that accepts a
FileDescriptor, this means that the file descriptor from a FileInputStream
can only be passed to another FileInputStream and the file descriptor
from a FileOutputStream can only be passed to another FileOutput-
Stream. A much more common scenario than either of these is that the file
descriptor from a RandomAccessFile is passed to either a FileInput-
Stream or a FileOutputStream. The latter requires that the Random-
AccessFile was opened in the "rw", "rws", or "rwd" modes.
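That more common scenario can be sketched as follows; the DupDemo name and the dup.txt file name are arbitrary. Because both objects share the same open files table entry, a write through the FileOutputStream advances the current offset seen by the RandomAccessFile.

```java
import java.io.*;

class DupDemo {
    public static void main(String[] args) throws IOException {
        // "rw" mode is required before the file descriptor can be
        // passed to a FileOutputStream
        RandomAccessFile raf = new RandomAccessFile("dup.txt", "rw");
        FileOutputStream fos = new FileOutputStream(raf.getFD());
        fos.write("testing".getBytes());
        // same open files table entry, hence same current offset
        System.out.println(raf.getFilePointer());  // prints 7
        raf.close();  // releases the shared file descriptor
        new File("dup.txt").delete();
    }
}
```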
An entry in the open files table also includes a pointer to the third table men-
tioned above (which is sometimes referred to as the v-node table). Conceptu-
ally, the v-node table includes information about the file that is the same for all
users (unlike information in the open files table such as the access mode and
current offset that is different for each user). For example, the most important
information in a v-node table is the address(es) of the block(s) on disk. (There
may be more than one block of data if the file is fragmented.) Other information
in this table includes the values returned by the isDirectory(),
isFile(), lastModified(), and length() methods in the File
class. Having now discussed all three of the tables in this conceptual model, I
will now discuss how file sharing works.
File sharing can be defined as opening the same file more than once, which
in the Java programming language means instantiating the FileInput-
Stream , FileOutputStream, or RandomAccessFile classes. Con-
structors in those classes can be passed a string (the file name) or a File
object (a pathname). In addition, constructors in the FileInputStream and
FileOutputStream classes can be passed a FileDescriptor. If a
string or File object is passed, a separate entry is created in the open files
table. Thus, if the same file is opened in two different streams, they both have
their own access mode and file pointer (as shown in Figure 6.5). This is so com-
mon that programmers may not even think of it as “file sharing.” Something very
different happens if a FileDescriptor is passed, however. The same open
files table entry is used for both streams (as shown in Figure 6.6). In C and C++
this is referred to as duplicating a file descriptor. In Java programs, file
descriptors are duplicated by invoking the getFD() method in
FileInputStream, FileOutputStream, or RandomAccessFile and
passing the FileDescriptor returned to a constructor in the
FileInputStream or FileOutputStream class (subject to the limita-
tions outlined above). Note that the FileDescriptor class is not serializ-
able. Therefore a native method is required to duplicate file descriptors
across different processes on the Java platform.
The following program shows that there is a significant difference in how
files are shared.
import java.io.*;

class Test {
    static File shared = new File("output.txt");
    public static void main(String[] args) throws IOException {
        /*
         * Different file descriptors hence separate open files
         * table entries and current offsets. The file is not
         * actually closed until pw2.close() is invoked.
         */
        PrintWriter pw1 = new PrintWriter(
            new FileOutputStream(shared));
        PrintWriter pw2 = new PrintWriter(
            new FileOutputStream(shared));
        pw1.print("Hello");
        pw2.println(" World!");
        pw1.close(); //pw1 data is overwritten
        pw2.close();
        printFile();
        shared.delete();
        /*
         * Same file descriptor hence same open files table
         * entry and current offset. Cannot close the file
1039033138000
1039033142000
World!
Hello World!
The most significant difference is that a file for which there is more than one file
descriptor can only be closed once (because closing a file releases the file
descriptor). For example, if the pw1.flush() method invocation in bold is
changed to pw1.close(), the entry in the open files tables is deleted and the
data written to pw2 (the " World!" string) is left behind in a character encod-
ing buffer.
The lesson here is that the meaning of an “open file” is really an entry in the
open files table. If there are multiple entries for the same file, the file essentially
remains open even after a close() method invocation. However, closing a file
always writes meta data such as the last modified date to the file (as shown in
Exception with a detail message of “Too many open files” as a result of Bug
Id 4027749. IOException and SocketException are also thrown with
the same detail message.
The hard limit is the maximum number of file descriptors that can be allo-
cated by a single process. If a hard limit can be changed, doing so requires
superuser (or root) privileges (known as the system administrator on Windows
operating systems). Unlike soft limits, hard limits are primarily designed to
detect file descriptor leaks (the failure to close files or sockets in a long-running
application such as a server).
158. This is a UNIX function. The Windows 3.x operating system had a similar setHandleCount
function. It is considered obsolete on Win32 platforms, however, and simply returns the number of
handles requested.
159. This was to be expected. For example, see the “Failure to Load Resources When All File Han-
dles Are Used” document at support.microsoft.com/default.aspx?scid=kb;en-
us;50741 in which the same problem is encountered on the Windows 3.x operating system.
file descriptors <= soft limit <= hard limit <= kernel limit
As noted above, some operating systems only have two file descriptor limits. In
that case, one is always the kernel limit. The other may be either a soft or hard
limit (implying that it either can or cannot be changed, respectively).
There is no central place where file descriptor limits (or how to change them)
are documented across all platforms and for different versions of the same
operating system. As far as I can tell not even IBM has produced such a docu-
ment. It would behoove Sun as the industry leader in cross-platform pro-
gramming to invest the time and effort into compiling such a list.
Meanwhile, from what I can see after searching the Internet for days, this is an
every man (or programmer) for himself research problem. Historically, the
soft limit has been as low as 15 or 20 handles (depending on whether or not
stdin, stdout, stdprn, stderr, and stdaux are counted). This explains
why to this very day there is still a general perception among programmers that
closing files is critically important. The soft limit is still a relatively small power of
two such as 64, 128, 256, or 1024 on many UNIX-like operating systems. On
newer operating systems, however, the movement is towards either very
high limits or no limits at all. For example, the only limit to the number of file
handles on Win32 platforms is available memory. What about runaway alloca-
tions and detecting file descriptor (or handle) leaks? Microsoft may have gone
from one extreme to another by completely eliminating file handle limits. Not that
import java.io.*;

class Test {
    public static void main(String[] args) throws IOException {
        File parent = new File("", "fdleak");
        parent.mkdir();
        String path = parent.getAbsolutePath();
        /*
         * Save references so that file descriptors are
         * not garbage collected
         */
        FileOutputStream[] nogc = new FileOutputStream[5000];
        for (int i = 0; i < nogc.length; i++) {
            File fname = new File(path, i + ".txt");
            try {
                nogc[i] = new FileOutputStream(fname);
            } catch (IOException e) {
                System.out.println(i + " files allocated");
                throw e;
            }
        }
    }
}
Executing this program on a Win32 platform using the 1.4.1 release prints
The primary Bug Id 4189011 (which as of this writing is still marked “In
progress, bug”) was submitted on November 11, 1998. It includes the following
insightful comments.
To clarify the issue, on win32 system there are three ways to open a
file:
Other than the third option, i.e. option 1 and 2 have practically no limita-
tion in opening number of files. The third method is restricted (for the
reason not known to me) to open only approx. 2035 files. That is why
MS JVM is able to open unlimited (practically) files, but SUN JVM fails
after 2035 files (my guess is it is using 3rd method to open file).
The case is true only only [sic] on Win32 OS. The reason this is serious
problem ans [sic] we have to revert to MS JVM is because we werr [sic]
writing high scalable server in Java...160
Evaluation Yes, this could be a real problem for big server apps. We've
already migrated some of the win32 I/O code to use the raw win32 API
rather than the MS C runtime library; we should finish the task.
-- xxxxx@xxxxx 1998/11/11
Unfortunately even though windows can deal with more win32 handles
like you get using CreateFile, we need the C library handles for FileDe-
scriptor and those are limited to 2048.
xxxxx@xxxxx 2002-10-25
Bug Id 4779905 is a closely related bug but not so marked on the Bug Parade.
It explains the FileNotFoundException and “The system cannot find the
file specified” detail message which really should be an IOException with a
“Too many open files” detail message. I am flabbergasted that these bugs do
not get more attention. (As of this writing they collectively have only 12 votes.)
This speaks volumes as to the relative insignificance of the Windows operating
system in the unfolding story of the Java platform.
It is doubtful that UNIX-like operating systems will ever completely eliminate
file descriptor limits. They appear to be moving in the direction of very high per-
process limits that are still useful in terms of runaway allocations and detecting
file descriptor leaks. These limits are sufficiently high that only a handful of
This stream uses a 16K character buffer and an 8K character conversion (byte)
buffer, for a total of 24K in buffering. Failure to close such a stream wastes a lot
of memory. The point again is that open files are becoming more of a memory
leak than a file descriptor leak.
Now I am ready to discuss the other reason for explicitly closing files.
Doing so flushes internal buffers. This applies only to instances of the
FileOutputStream class. RandomAccessFile is unbuffered.162 Likewise, a FileInputStream has no buffers to flush. In these cases, either a
161. Even with these very high per-process limits the three limit design still makes sense. Imagine a
server running on the same machine as general business applications. Without the three limit
design, there is no way to detect file descriptor leaks in the server application and still have a rela-
tively low soft limit for the other applications. A runaway allocation in one of those applications that
is allowed to run up to the server hard limit could do a lot of damage. The opposing argument is that
servers that require that many file descriptors or handles are most likely going to be dedicated.
Assuming that a server application that needs 11,000 file descriptors will not be running on the
same machine as general business applications, separate soft and hard limits become a questionable design. UNIX-like operating systems currently support both views.
162. There is an RFE for a buffered random access file. The primary Bug Id is 4056207 which at
this point (after the introduction of the java.nio package) I think it is safe to say will never be
implemented. Note that the API docs for RandomAccessFile say “A random access file
behaves like a large array of bytes stored in the file system.” They do not actually say that all writes
are immediate, however. I used to assume that until the "rws" and "rwd" access modes were added in the 1.4 release.
PrintStream There is a very good reason why the API docs for the
PrintStream class say: “The PrintWriter
class should be used in situations that require writing
characters rather than bytes.”b Whenever characters (a
char, char[], or String) are written to a
PrintStream which includes ALL of the print and
println methods, they are written to a BufferedWriter (with a default buffer size of 16K) and then
immediately flushed to an OutputStreamWriter
for character-to-byte conversion. The fact that they are
immediately flushed has nothing to do with automatic
flushing. They must be flushed every time because the
BufferedWriter and OutputStreamWriter are added upstream. Thus PrintStream
has not one, but two upstream buffers (the second being the
character conversion byte buffer) that are flushed every time
a character or string is written to the stream or one of the
print or println methods is invoked. This makes
PrintStream (and by extension standard output and
error) very inefficient. This class should therefore be used
as little as possible.
Closes this input stream and releases any system resources associ-
ated with the stream.163 [from InputStream]
Closes this output stream and releases any system resources associ-
ated with this stream. The general contract of close is that it closes
the output stream.164 [from OutputStream]
The implication of closing a “stream,” however, is that all of the stream decorators (or filters) are closed as well as the source or destination. This is accomplished by chaining the close() methods together.
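This chaining is easy to observe. In the following sketch (my own, not from the core API) a ByteArrayOutputStream stands in for the file so that the close can be traced:

```java
import java.io.*;

class ChainDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream() {
            public void close() throws IOException {
                System.out.println("sink closed");
                super.close();
            }
        };
        // Closing the outermost decorator chains down through every layer.
        DataOutputStream out = new DataOutputStream(
                                   new BufferedOutputStream(sink));
        out.writeInt(42);   // sits in the BufferedOutputStream buffer
        out.close();        // flushes the buffer, then closes the sink
        System.out.println(sink.size() + " bytes reached the sink");
    }
}
```

Closing only the outermost DataOutputStream is sufficient; the buffered data is flushed and the sink is closed as a side effect.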
In output streams, flush() methods are also chained together. In this
case, one of the API docs does actually use “chain” to describe the implementation:
Flush the stream. If the stream has saved any characters from the vari-
ous write() methods in a buffer, write them immediately to their
163. API docs for the close() method in the java.io.InputStream class.
164. API docs for the close() method in the java.io.OutputStream class.
165. API docs for the close() method in the java.io.Reader class.
166. API docs for the close() method in the java.io.Writer class.
167. API docs for the flush() method in the java.io.Writer class.
This example may surprise some programmers who expect output files to be
flushed and closed upon program exit. For example, here is part of the description of Bug Id 4034972, “PrintWriter does not flush buffer upon normal
program termination.”
General comments: Since no mention was made of this in the API doc's
I simply assumed that a PrintWriter would flush its buffer when the vir-
tual machine terminated (like output buffers in C do when you call
exit(int)). I would like to see the [sic] any PrintWriters I have created be
flushed when my program terminates normally.168
The problem with such behavior is that a programmer may not want the internal
buffer to be flushed. Instead, in the Java programming language internal
buffers are flushed when the close() method is invoked. It is therefore
incumbent upon application programmers to invoke the close() method after
they are done writing to a buffered output stream. (Note that the finalizer for
FileOutputStream is too far downstream to flush internal buffers.)
In the case of PrintWriter (as mentioned in the bug report), automatic
flushing is always an option, but a very expensive one. It more or less defeats
the purpose of using a buffer in the first place. Thus the default behavior for both
PrintStream and PrintWriter is that automatic flushing is turned off.
168. Description of Bug Id 4034972. There is at least one other such bug. See Bug Id 1244595,
“BufferedOutputStream does not flush even when the program exits.”
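The loss described in the bug report is easy to reproduce. The following sketch (mine, using a temporary file) shows that nothing reaches the file until close() flushes the internal buffers:

```java
import java.io.*;

class FlushDemo {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("flushdemo", ".txt");
        PrintWriter out = new PrintWriter(
                              new BufferedWriter(new FileWriter(f)));
        out.println("buffered");   // held in the 16K character buffer
        System.out.println("before close: " + f.length() + " bytes");
        out.close();               // close() flushes, then closes
        System.out.println("after close: " + f.length() + " bytes");
        f.delete();
    }
}
```

Before the close() the file length is zero; only the close (or an explicit flush) moves the characters downstream.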
This code for standard output and standard error is from the System class.
The reason why the System class buffers standard output and standard error is
in case they are written to one character at a time. In that case, buffering an out-
put stream that is automatically flushed makes sense.
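From memory, that initialization looks roughly like the following (a sketch, not the actual source): a small 128-byte buffer in case standard output is written to one character at a time, with automatic flushing enabled by the second PrintStream constructor argument.

```java
import java.io.*;

class StdStreamsSketch {
    public static void main(String[] args) {
        // Roughly how the System class wraps standard output:
        // a 128-byte buffer with automatic flushing turned on.
        FileOutputStream fdOut = new FileOutputStream(FileDescriptor.out);
        PrintStream out = new PrintStream(
                              new BufferedOutputStream(fdOut, 128), true);
        out.println("hello via a hand-built standard output");
    }
}
```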
It has become common practice to ignore the IOException thrown by a close() method. In fact, this is so common that there are numerous examples in the core API. Ignoring the IOException thrown when closing an input file is understandable because the file is no longer in use. Ignoring the IOException thrown by the close() method in an output stream, however, is
clearly not as safe because close() methods are responsible for flushing
internal buffers. In an extreme case in which a buffer never reaches capacity, the
flush() method invocation from within the body of the close() method
invokes the write(byte[] b, int off, int len) method in
FileOutputStream and is therefore responsible for actually writing the buffered data to the file. Under those circumstances, putting all the write method
A single explicit flush of an output stream after the last write is advisable if you are going to ignore the IOException thrown by the close() method.
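That advice reduces to the following pattern (a sketch, with a temporary file standing in for real output):

```java
import java.io.*;

class FlushThenClose {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("flushfirst", ".txt");
        Writer out = new BufferedWriter(new FileWriter(f));
        out.write("important data");
        // Surface any write failure here, where it can still be handled...
        out.flush();
        // ...and only then ignore the IOException from close().
        try {
            out.close();
        } catch (IOException ignored) {
        }
        System.out.println(f.length() + " bytes written");
        f.delete();
    }
}
```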
Besides the fact that the in != null test in the finally block is completely unnecessary, the BufferedInputStream (or bin variable) should be closed (if for no other reason than as a matter of style), not the FileInputStream (or in variable). Finding this code in the core API was something of a
surprise. It simply is not possible for in to be equal to null once the try
block is entered (unless you want to argue that a maintenance programmer may
change the code before the try block, which sounds like a stretch to me).
The following (slightly modified) example from the core API uses the second
approach of closing files in the same try block in which they are opened and
processed.
try {
    File file = new File(fileName);
    if (!file.exists())
        return;
Here again it is safe to assume that the file was opened, making it much easier to invoke the close() method. The load(InputStream inStream) method loads a properties file. If the file does not exist or otherwise throws an IOException, the properties file is simply not loaded. The debug message is a mystery, though. It incorrectly assumes that the properties file was not found.
This code is not very likely to throw a FileNotFoundException because
of the exists() method invocation.
Here are a couple examples of the third approach in which the stream is
closed in the finally block of the same try statement in which it is opened.
First an input file example:
This slightly modified example is from the core API. The IOException is
ignored because this is from a boolean method that returns false if inFile
is null. Next an output file example:
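Such an example generally takes the following shape (my own sketch, with a hypothetical writeGreeting method; not the core API listing):

```java
import java.io.*;

class FinallyCloseOut {
    static void writeGreeting(String fileName) throws IOException {
        Writer out = new BufferedWriter(new FileWriter(fileName));
        try {
            out.write("hello");
        } finally {
            out.close();  // flushes and closes whether or not write() threw
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("finallyclose", ".txt");
        writeGreeting(f.getPath());
        System.out.println(f.length() + " bytes");
        f.delete();
    }
}
```

Because the stream is opened in the same try statement in which it is closed, there is no need for a null test in the finally block.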
If you stop and think about this for a moment, setting instance variables to null
(particularly those that reference large objects) is comparable to closing a
FileInputStream in a finally clause. In both cases the alternative is to
wait for the garbage collector to run in order to release the resource.
Note the following lines of code from the close() method implementations above.
if (in == null)
    return;
if (out == null)
    return;
Closes this output stream and releases any system resources associ-
ated with this stream. The general contract of close is that it closes
the output stream. A closed stream cannot perform output operations
and cannot be reopened.170
These API docs are from the close() method in InputStream and
OutputStream, respectively. They say very little. The full specification for
close() methods in the java.io package can only be found in Reader
and Writer classes. For example,
Close the stream. Once a stream has been closed, further read(),
ready(), mark(), or reset() invocations will throw an
IOException. Closing a previously-closed stream, however,
has no effect. 171 [emphasis added]
Close the stream, flushing it first. Once a stream has been closed, fur-
ther write() or flush() invocations will cause an
IOException to be thrown. Closing a previously-closed stream,
however, has no effect.172 [emphasis added]
Note the generic references to “streams.” These API docs are indeed supposed
to describe the behavior of all of the close() methods in java.io. Although
all streams behave as described here, the only documented behavior for binary data streams is comments in some (but not all) read methods that a closed
binary data input stream cannot be read. For example,
The other problem with the specification that streams can be closed more than
once is that writes to a closed BufferedOutputStream do not throw an
IOException. For example,
import java.io.*;

class Test {
    public static void main(String[] args) throws IOException {
        testText();
        testBinaryData();
    }
    static void testText() throws IOException {
        PrintWriter out = new PrintWriter(
                              new BufferedWriter(
                                  new FileWriter("output.txt")));
        out.println("testing, testing, testing");
        out.close();
        out.close(); //does second close() throw IOException?
        out.println("testing, testing, testing");
        if (out.checkError())
            System.out.println("Writing to a closed Writer " +
                               "throws an IOException");
Binary data can be inadvertently lost without throwing an exception. If the following change is made, writing to the same closed DataOutputStream throws an IOException immediately.
The reason for this behavior is obvious from looking at the write methods in BufferedWriter. They include a private ensureOpen() method that
throws an IOException with a detail message of "Stream closed". This
method is missing in BufferedOutputStream. Adding such a method is
not a backwards compatibility issue, and so I have reported this as a bug.
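The silent loss is easy to demonstrate. On the releases I have tried, the following sketch prints its message rather than throwing an IOException, precisely because BufferedOutputStream has no ensureOpen() check:

```java
import java.io.*;

class ClosedWriteDemo {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("closedwrite", ".bin");
        DataOutputStream out = new DataOutputStream(
                                   new BufferedOutputStream(
                                       new FileOutputStream(f)));
        out.close();
        // No ensureOpen() in BufferedOutputStream: the bytes are
        // silently stored in the (now orphaned) buffer and lost.
        out.writeInt(42);
        System.out.println("write to a closed stream did not throw; " +
                           "file holds " + f.length() + " bytes");
        f.delete();
    }
}
```

The four bytes never reach the file; they disappear into a buffer that will never be flushed.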
The next system resource to be discussed is the graphics context used by a
Graphics object in the java.awt package. These objects are frequently
created, used, and then disposed of. The API docs for the dispose() method read in part:
Graphics g = getGraphics();
try {
    …Graphics object used here…
} finally {
    if (g != null) {
        g.dispose();
    }
}
In this case, the g != null test may be necessary because some getGraphics() methods return null.
Conventional wisdom says that finalizers should be used to free memory allocated in native methods, but why should releasing this particular system resource be any different from the others? There is an example of freeing memory allocated by a native method near the bottom of 6.5.2 Unspecified Runtime Exceptions and Errors on page 751.
NOTE 6.2
The following section is heavily indebted to Doug Lea for a number of
different reasons, the most important of which is his “six general responses to…failed actions.”174 In fact this section is modelled after his
3.1.1 Exceptions in Concurrent Programming in Java. It is also
heavily influenced by an in-depth study of exception handling in the core
API. This section must be considered the proverbial “tip of the iceberg”
for two reasons. The first is that a thorough study of error recovery requires a considerable knowledge of concurrent programming, which is a subject I am not willing to broach in this volume. Thus you will not
174. Doug Lea, Concurrent Programming in Java, Second Edition, 3.1.1 Exceptions.
The system must also be in a consistent state, which can be seen in his discussion of “antimessages”175 and “methods with externally visible effects that irrevocably change the real world by performing IO or actuating physical
175. Doug Lea, Concurrent Programming in Java, §3.1.1.3, “Rollback.” This is just some really
great stuff. Lea is a totally original thinker.
normally associated with databases in which log files are used to restore the
database. In object-oriented programming, backwards recovery (or rollback)
also means restoring the current object to the exact same state that it was in
when an instance method began to execute. Restoring the current object to any
other consistent state in response to failure is referred to as forwards recovery (or rollforward). For a detailed discussion of consistent state restoration
or recovery (as narrowly defined in this section) see “Item 46: Strive for failure
atomicity” in Effective Java as well as 3.1.1.3 Rollback in Concurrent Programming in Java.
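In its simplest form, failure atomicity means checking for failure before mutating any state. The following sketch (in the spirit of Item 46, not Bloch's exact code) shows a pop() method that leaves the stack untouched when it fails:

```java
import java.util.EmptyStackException;

class AtomicStack {
    private Object[] elements = new Object[16];
    private int size = 0;

    public void push(Object e) {
        elements[size++] = e;   // (growth logic omitted for brevity)
    }

    public Object pop() {
        // The validity check comes first, so a failed pop() leaves
        // the stack exactly as it was: failure atomicity for free.
        if (size == 0)
            throw new EmptyStackException();
        Object result = elements[--size];
        elements[size] = null;  // let the garbage collector reclaim the slot
        return result;
    }

    public int size() {
        return size;
    }
}
```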
This technical definition of recovery is consistent with the more common
meaning of “doing something that allows a program to continue executing”
(which is actually a combination of recovery and exception handling). That definition boils down to anything other than a catastrophic (or fatal) error. The reason
why the two uses are consistent is that a program cannot (or rather should not)
continue to execute if either the current object or system is in an inconsistent
state.
There are usually two methods involved in the throwing and catching of an exception. Exception handling refers to the method that catches the exception. When Lea refers to rollback and rollforward as one of the “six general responses to…failed actions”174 he is speaking from the perspective of the
method that fails. This is very confusing, however. For example, in 3.1.1.5 Retry
Lea says “you can contain local failure to the current method, rather than throwing exceptions back to clients”176 and then uses the following example.
The “pre-defined and heavily degraded response to that operation” might be the
use of default values or returning null (an exception handling strategy used
extensively in the core API during system initialization).
Characterizing what an exception handler does is not an exact science. The
following list of exception handling strategies was originally inspired by Doug
Lea’s “general responses to…failed actions”174 but has grown significantly as
the result of an in-depth study of exception handling in the core API.
• Throwing an Exception: (This must not be more generally characterized
as “abrupt completion” because executing a return statement is also
considered abrupt completion)
° Do nothing and allow the exception to propagate (which in
the case of a checked exception requires declaring that the
method or constructor throws the exception). PROPERLY
179. Brian Randell and Jie Xu, 40 Years of Computing at Newcastle, Chapter 6, “The Evolution of
the Recovery Block Concept” (Newcastle, University of Newcastle upon Tyne, 1997),
www.cs.ncl.ac.uk/old/events/anniversaries/40th/webbook/dependability/
recblocks/rec_blocks.html.
180. Joshua Bloch in an interview with Bill Venner entitled “A Conversation with Josh Bloch” (First
Published in JavaWorld, January 4, 2002), on the artima.com Web site (Artima Software, Inc.),
www.artima.com/intv/blochP.html.
try {
    …code that may throw an OutOfMemoryError…
}
catch (OutOfMemoryError e) {throw e;}
catch (Throwable e) {
If the catch block includes other code, however, simply rethrowing the same exception may result in a stack trace that is misleading because the point-of-origin remains the same. For example,
class Test {
    public static void main(String[] args) {
        new Dummy().a();
    }
}

class Dummy {
    void a() {
        b();
    }
    void b() {
        c();
    }
    void c() {
        try {
            d();
        }
        catch (ArithmeticException e) {
            System.out.println("peek-a-boo");
            //e.fillInStackTrace()
            throw e;
        }
    }
    void d() {
        e();
    }
    void e() {
        f();
    }
    void f() {
        int i = 0;
        i /= 0;
    }
}
peek-a-boo
Exception in thread "main" java.lang.ArithmeticException: / by zero
        at Dummy.c(Test.java:19)
        at Dummy.b(Test.java:11)
        at Dummy.a(Test.java:8)
        at Test.main(Test.java:3)
There is nothing to indicate that f() was ever invoked. The best solution to this
problem as of the 1.4 release is to use exception chaining. For example,
void c() {
    try {
        d();
    }
    catch (ArithmeticException e) {
        System.out.println("peek-a-boo");
        throw (ArithmeticException) new ArithmeticException(
            e.getMessage()).initCause(e);
    }
}
This takes a little extra work but no failure information is lost. I would go as far as
to say that the introduction of exception chaining in the 1.4 release more or less
made the fillInStackTrace() method obsolete.
I did not address this question in that section, but using assert false in such
a catch clause is dangerous. If the exception is actually thrown and assertions
are disabled, the effect is that the exception is ignored. This is not the same as
other uses of assertions because another method or constructor is involved.
Consequently, it is not as easy to establish the “in complete control” criterion for
using assertions. Only two examples of safely asserting that an exception is not
thrown come readily to mind. The first is CloneNotSupportedException
which is discussed in 6.2.3 Catching CloneNotSupportedException. The
other is the discussion of NumberFormatException that begins on page
709. For example,
try {
    System.out.println(Integer.parseInt("FFFF", 16));
} catch(NumberFormatException e) {
    assert false;
}
This is safe because a string literal (versus a variable) is passed. Such catch
clauses are comparable to logic traps as defined in 6.2.2 assert false and
Logic Traps (or Control-flow Invariants), only instead of a switch or nested if-
then-else statement, a catch clause is used when the checked exception in
question is provably not thrown. Beyond something as simple and straightforward as CloneNotSupportedException or passing a string literal to a
method that throws a parsing error, however, great care must be taken in
asserting that an exception is not thrown.
Now I am ready to address the rather difficult subject of throwing an unspecified runtime exception or error under these circumstances. How is converting a checked exception into an unspecified runtime exception or error any different
try {
    …
} catch(MalformedURLException e) {
    throw e;
} catch(Exception e) {
    throw new MalformedURLException(e.getMessage());
}
This does not explicitly state that runtime exceptions and errors should also be
caught, but the “throwable thrown by the lower layer” language certainly leaves
the door open.
Exception translation based on interface design considerations is very closely related to the discussion in 1.11.1 The Compiler-Enforced Method Contract. When overriding a superclass instance method, hiding a class method, or implementing an interface method, the choice as to which checked exception can be thrown is limited by the superclass or superinterface method declaration. Thus the subclass programmer has no choice but to translate exceptions. This also explains why umbrella exceptions are unique.
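A hypothetical illustration of forced translation (the Store interface, DbStore class, and writeRow method are all invented for the example):

```java
import java.io.IOException;
import java.sql.SQLException;

interface Store {
    void save(String s) throws IOException;   // IOException is the only option
}

class DbStore implements Store {
    public void save(String s) throws IOException {
        try {
            writeRow(s);
        } catch (SQLException e) {
            // The throws clause forbids SQLException, so it must be
            // translated (chaining the original exception as the cause).
            throw (IOException) new IOException(e.getMessage()).initCause(e);
        }
    }
    // hypothetical database write, failing for the demonstration
    private void writeRow(String s) throws SQLException {
        throw new SQLException("no database connected");
    }
}

class TranslationDemo {
    public static void main(String[] args) {
        try {
            new DbStore().save("row");
        } catch (IOException e) {
            System.out.println(e.getMessage() + " (cause: " +
                               e.getCause().getClass().getName() + ")");
        }
    }
}
```

The implementation compiles only because the SQLException is translated; no failure information is lost thanks to exception chaining.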
Exceptions may also be translated for security reasons. The following example is from the File class.
try {
    …
} catch (AccessControlException x) {
    /* Throwing the original AccessControlException could disclose
       the location of the default temporary directory, so we
       re-throw a more innocuous SecurityException */
    throw new SecurityException(
        "Unable to create temporary file");
}
182. API docs for the java.lang.Throwable class. Although one speaks of the different “layers” of abstraction, the exceptions thrown are usually referred to as being low- and high- “level” exceptions. I believe the author of this API doc is striving for a consistency in usage that is not generally found elsewhere.
The only way to implement this specification in a <clinit> (or class initialization) method is to put all of the code from static initialization blocks (and class variable initializers) in a try block, catch Exception, and then throw
try {
    StringWriter sw = new StringWriter();
    PrintWriter pw = new PrintWriter(sw);
    record.getThrown().printStackTrace(pw);
    pw.close();
    sb.append(sw.toString());
} catch (Exception ex) { }
Examples such as this are when it is most appropriate to say that the exception is being deliberately “squelched.” The primary purpose of the catch clause is to
catch the NullPointerException thrown if record.getThrown()
returns null. However, by including all of the code related to appending the
stack trace in the same try block, the responsible programmer substantially
The intent of the programmer is now much clearer. I chose this example for a
number of reasons, one of which is that it demonstrates how discerning the
intent of a programmer who codes an empty catch block sometimes requires
a lot of thought and is therefore time consuming. That is why a comment is
always appropriate. It is not enough to say “no-op.” You really should explain
why it is safe to ignore the exception. Fearing that I may have aroused the giant,
I am going to move on now.
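Here is what such a comment might look like (a sketch; the closeQuietly method is my own, not from the core API):

```java
import java.io.*;

class DocumentedSquelch {
    static void closeQuietly(InputStream in) {
        try {
            in.close();
        } catch (IOException e) {
            // Safe to ignore: this is an *input* stream, so there are
            // no buffered writes to lose, and the file is no longer
            // in use by the time close() is invoked.
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("squelch", ".txt");
        closeQuietly(new FileInputStream(f));
        System.out.println("closed without complaint");
        f.delete();
    }
}
```

The comment explains why the exception is safe to ignore, which is exactly what a maintenance programmer needs to know.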
Many examples that appear to ignore an exception actually use default values or return null, which is an entirely different exception handling strategy. Here is one from the 1.4 release of the System class:
try {
    java.util.logging.LogManager.getLogManager().
        readConfiguration();
} catch (Exception ex) {
    // System.err.println("Can't read logging configuration:");
    // ex.printStackTrace();
}
This example of ignoring an exception cannot be faulted. The point is that you cannot make a blanket statement such as “Don’t ignore exceptions.”185 Moreover, this exception handling strategy is very common in the core API. Doug Lea
refers to ignoring an exception as “continuation” and says:
If a failed invocation has no bearing on either the state of the caller
object or the overall functionality requirements of the current activity,
then it may be acceptable just to ignore the exception and continue forward.186
185. Joshua Bloch, Effective Java Programming Language Guide, “Item 47: Don’t ignore excep-
tions.”
186. Doug Lea, Concurrent Programming in Java, 3.1.1.2, “Continuation.”
The only substantial difference between this implementation and the trouble flag in the PrintStream and PrintWriter classes is that a message is printed to standard error the first time an error is reported. This writing to standard error is intended to inform client programmers that an error has occurred and is therefore not substantially different from setting an error flag (except that an error flag can be programmatically queried). In fact, I would go as far as to say that setting a flag (and comparable implementations) is no different from squelching an exception except that it either provides programmatic access to
187. API docs for the error(String msg, Exception ex, int code) method in
java.util.logging.ErrorManager.
6.9.5 Retry
The only examples of the exception handling strategy known as retry that I have
ever seen are try statements coded in a loop. This exception handling strategy
is sometimes referred to as resumption (versus termination, which is what the
exception mechanism normally does). Retry comes in two flavors: try forever
and try a limited number of times. The retryUntilConnected()
method in 6.9 Exception Handling (quoted from Concurrent Programming in
Java188) is an example of “try forever.” Here is an example of “try a limited number of times:”
class Test {
    static boolean memoryLow;

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long[] memoryReserve = new long[
            (int)(rt.maxMemory() / 4) / 8];
        System.out.println(rt.maxMemory() +
                           " maximum memory in bytes");
        System.out.println((memoryReserve.length * 8) +
        //forces an OutOfMemoryError
        rt.gc();
        int dim = (int)(rt.freeMemory()/8) * 2;
        for (;;) {
            try {
                if (memoryLow)
                    dim /= 2; //simulates low-memory mode
                System.out.println("attempting to allocate " +
                                   (dim*8) + " bytes");
                double[] array = new double[dim];
                System.out.println("allocation successful");
                break;
            } catch (OutOfMemoryError e) {
                if (memoryLow) {
                    throw e;
                } else {
                    memoryLow = true;
                    System.out.println("memory is low");
                    memoryReserve = null; //release the reserve
                }
            }
        }
    }
}
Executing this program on my machine using the default heap size and the -
verbose:gc option prints:
Note that the HotSpot VM will do a gc() as well as enlarge the heap before throwing an OutOfMemoryError. This explains why a gc() is not necessary in the catch block.
Code very similar to this is used to instantiate the LogManager class. If for
some reason the class named in the "java.util.logging.manager"
property cannot be instantiated, a vanilla log manager is used instead. Here is
another example from the Boolean class:
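From memory, getBoolean(String name) does roughly the following (a paraphrase, not the actual source): any failure to read or parse the system property yields the default value false.

```java
class GetBooleanSketch {
    // Roughly how Boolean.getBoolean(String) uses a default value:
    // failures are squelched and the default (false) is returned.
    static boolean getBooleanProperty(String name) {
        boolean result = false;                 // the default value
        try {
            result = "true".equalsIgnoreCase(System.getProperty(name));
        } catch (IllegalArgumentException e) {  // e.g., an empty name
        } catch (NullPointerException e) {      // e.g., a null name
        }
        return result;
    }

    public static void main(String[] args) {
        System.setProperty("demo.flag", "TRUE");
        System.out.println(getBooleanProperty("demo.flag"));
        System.out.println(getBooleanProperty("not.set"));
    }
}
```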
One of the most important details about using default values in a public
API is to document the fact. For example, the getBoolean(String
name) method includes the following documentation.
Sometimes methods are passed the default value that should be returned. For example,
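The getFont(String nm, Font font) method cited in note 190 is of this kind: the caller supplies the value to return on failure. A simplified sketch of the idiom (the getProperty method below is my own, not from the core API):

```java
class DefaultParamSketch {
    // The caller supplies the value to return on failure, in the
    // style of Font.getFont(String nm, Font font).
    static String getProperty(String name, String fallback) {
        String value = System.getProperty(name);
        return (value != null) ? value : fallback;
    }

    public static void main(String[] args) {
        System.out.println(getProperty("no.such.property", "default"));
    }
}
```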
When running under javaw (as is usually the case with a GUI
application) the default exception handler is useless.
is “called by the Java Virtual Machine when a thread in this thread group stops
because of an uncaught exception.”191 This crude top-level exception handler
189. API docs for the getBoolean(String name) method in the java.lang.Boolean
class.
190. API docs for the getFont(String nm, Font font) method in the java.awt.Font
class.
NOTE 6.3
In the interest of saving space I have consistently shortened “the
uncaughtException(Thread t, Throwable e) method in
the ThreadGroup class” to just “the uncaughtException
method” throughout the following section.
import java.util.logging.*;
import java.io.IOException;

class Test {
    private static Logger logger = configureLogger();

    public static void main(String[] args) {
        try {
            //EXECUTE PROGRAM CODE HERE
        }
        catch (Throwable e) {
            System.err.println();
            System.err.println("An unknown error has occurred and " +
                               "the application must be closed.");
            System.err.println("Diagnostic information has been " +
                               "written to the log file.");
            System.err.println();
            //print stack trace to log file
            new UncaughtException(logger, e);
        }
    }
    private static Logger configureLogger() {
        Logger logger = Logger.getLogger("com.javarules");
        logger.setLevel(Level.ALL);
        logger.setUseParentHandlers(false);
        try {
            Handler handler = new FileHandler("%h/javarules.log");
            handler.setFormatter(new SimpleFormatter());
            logger.addHandler(handler);
        } catch(IOException e) {
            e.printStackTrace();
            System.exit(1);
import java.util.*;
import java.util.logging.*;

public class Test {
    private static Logger logger = configureLogger();

    public static void main(String[] args) {
        logger.entering("Test", "main(String[] args)", args);
        try {
            new DummyThread().start();
        } catch(Throwable e) {
            System.out.println(
                "Only the start() method is dynamically enclosed. " +
                "The run() method executes on a different stack.");
        }
        logger.exiting("Test", "main(String[] args)");
        System.out.println("The main thread is dead");
    }
Between printing the main thread is dead and the stack trace, there is a five-second delay. The main method in a GUI application (and most other multithreaded applications) is only used to start other threads and completes almost
as soon as the application is launched. (Notice the times in the ENTRY and
RETURN log records are the same.) The point is that the catch clause in
this example does not dynamically enclose the code in the run() method.
The fact that it appears to is what makes this example so counterintuitive.
Every method has to be invoked on some stack. The Thread class constructor (or rather the corresponding <init> method) and start() method are executed on the same stack as main (or more generally on the same stack
uncaughtException(Thread t,Throwable e)
RECURSION CHECK
thread is alive
java.lang.Throwable
at AWTExceptionHandler.uncaughtException(Test.java:38)
As you can see from this program, the second time the run() method throws an exception the uncaughtException method is not invoked. The original
JLS (which included the API docs for the java.lang, java.util, and
java.io packages) included the following very significant documentation for
the uncaughtException method.
The call to uncaughtException is performed by the thread that
failed to catch the exception, so t is the current thread. The call to
This paragraph was lost when the decision was made not to include API documentation in the Second Edition of the JLS. It was scant documentation in the
first place because “if the call to uncaughtException itself results in an
(uncaught) exception” is subject to interpretation. The third edition of the JLS should clearly state that the uncaughtException method is never (implicitly) invoked by the JVM more than once for any given thread.
The fact that the uncaughtException method is never (implicitly)
invoked by the JVM more than once for any given thread is profoundly significant.
It means that the uncaughtException method simply cannot be used
for recovery. The reason why is simple. While it is possible to continue using the
same thread with the uncaughtException method at the bottom of the
stack, a second uncaught exception is going to kill the thread that is executing no
matter what. Thus the application programmer has lost considerable control over
the behavior of an application. This is why it is so imperative to attempt recovery in
the run() method of a thread. To some extent the declaration of the run()
method in the Runnable interface corroborates this design choice because the
run() method has no throws clause. Thus it is more or less not possible to
attempt recovery from a checked exception in the uncaughtException
method. (I say more or less because, although the run() method has no
throws clause, it is still possible (though rare) for checked exceptions to reach
the uncaughtException method.)
192. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, §20.21.31, “public void
uncaughtException(Thread t, Throwable e).” (Do not update.)
193. Ken Arnold, James Gosling, and David Holmes, §18.3, “Shutdown.”
194. Unascribed, “Programming with Assertions” in the API docs for the 1.4 release, (Mountain
View: Sun Microsystems, 2002), “Design FAQ - The AssertionError Class.”
195. Comment from Bug Id 4063022.
Top-level exception handlers should not delete temporary files, close sockets
and database connections, and the like. As of the 1.3 release, such code should
be placed in separate threads known as shutdown hooks that run after top-level
exception handlers. See the addShutdownHook(Thread hook) method in
Runtime as well as java.sun.com/j2se/1.3/docs/guide/lang/hook-
design.html for a discussion (the hyphen in hook-design is part of the URL).
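A minimal sketch of registering such a shutdown hook (the class name and the placeholder cleanup actions are assumptions, not code from the book):

```java
public class ShutdownHookSketch {
    // Registers a cleanup hook of the kind described above; returns
    // the hook thread so it can be inspected. The cleanup actions
    // are placeholders.
    static Thread registerCleanupHook() {
        Thread hook = new Thread() {
            public void run() {
                // delete temporary files, close sockets and
                // database connections here
                System.out.println("cleanup hook running");
            }
        };
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
    public static void main(String[] args) {
        registerCleanupHook();
        // The hook does not run now; it runs when the JVM begins its
        // shutdown sequence, i.e. after main completes and the last
        // non-daemon thread dies.
        System.out.println("hook registered");
    }
}
```

Note that the hook thread is created but not started; the JVM starts all registered hooks itself during shutdown.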
All exceptions unlock monitors and thus may result in objects that are in incon-
sistent states if allowed to propagate. Except for the fact that ThreadDeath
if (e instanceof RuntimeException) {
throw (RuntimeException)e;
} else if (e instanceof Error) {
throw (Error)e;
} else {
throw new Error("converting to an unchecked exception", e);
}
import javax.swing.JOptionPane;
import java.util.*;
public class Test {
public static void main(final String args[]) {
new DummyThread("main").start();
}
}
class DummyThread extends Thread {
public DummyThread(String name) {
super(new AWTExceptionHandler(), name);
}
public void run() {
//execute application here (as if this were the main method)
}
}
class AWTExceptionHandler extends ThreadGroup {
public AWTExceptionHandler() {
super("dummy");
}
public void uncaughtException(Thread t, Throwable e) {
//log uncaught exception here (omitted to shorten example)
if (e instanceof ThreadDeath) {
return;
}
String newLine = System.getProperty("line.separator");
String message =
"An unknown error has occurred and" + newLine +
"the application must be closed." + newLine +
"Diagnostic information has been" + newLine +
"written to a log file. Please inform" + newLine +
"your System Administrator.";
JOptionPane.showMessageDialog(null, message,
"Fatal Error",
JOptionPane.ERROR_MESSAGE);
System.exit(0);
}
}
This does not say, however, that ThreadDeath cannot be logged. Knowing
the point of origin for a ThreadDeath object may prove to be very useful
information in a log file. The user is not informed because ThreadDeath is not
really an error. It is a poorly designed (in terms of concurrent programming)
mechanism for stopping threads. The presumption when ThreadDeath is
caught is always that the application continues to execute.
There is a great deal of misunderstanding about extending ThreadGroup
in order to override the uncaughtException method. Most of it is due to
the fact that the EventDispatchThread rather ignorantly used to intercept
all of the exceptions thrown in a GUI application and print them to standard error,
as if the application programmer had nothing to say about uncaught exceptions.
This led to the widely held misconception that classes in the core API were delib-
erately specifying what the API docs for the ThreadGroup class refer to as
the “initial thread group”198 (the system thread group, which is actually named
"system" in Sun implementations and is the thread group used by the applica-
tion launcher when creating the main thread) as the parent thread group when
creating new threads, effectively bypassing thread groups created by applica-
tion programmers. This was never actually the case. As noted in the evalua-
tion of Bug Id 4491897, “Cannot install top level Exception handler:”
There are a few instances within the JDK where we go to great lengths
to launch threads outside of the current thread group. These rare
instances typically involve system-level operations; so it is doubtful that
the user would care or want to handle uncaught exceptions from them,
but it is a hole.199
197. James Gosling, Bill Joy, and Guy Steele, §20.20.15, “public final void stop()
throws SecurityException.”
198. API docs for the java.lang.ThreadGroup class.
This (somewhat desperate) solution is a clear indication that this interface design
nightmare really does need to be fixed.
Given Bloch’s searing criticism of the ThreadGroup class in “Item 53: Avoid
thread groups” and what appears to be a general consensus among software
engineers at Sun that the uncaughtException method is the only practical
use of the ThreadGroup class, I should think that they would look at this as an
opportunity to deprecate the ThreadGroup class. Surely this has to be done
to prepare the way for a new major release (2.0) of the Java programming lan-
guage in which a lot of this conceptual deadweight can be tossed overboard.
This concludes the discussion of subclassing ThreadGroup for non-GUI
apps. The remainder of this section discusses the two system properties that
are used to create top-level exception handlers in GUI applications. Before doing
so, however, it is important to understand that throwing an exception or error in
a multi-threaded application does not shut down the application. If propagated to
the default exception handler, the thread will die, but the JVM is not exited. Multi-
threaded applications continue to execute until one of the following happens.
• The last non-daemon thread exits. In a single threaded application, this is
comparable to normal completion of the main method and is described in
the API docs as exiting the JVM “normally”
• The runtime is explicitly exited, usually by invoking the System.exit
(int status) method
• The JVM is terminated externally (a.k.a. aborting) by the user pressing C
while holding down the control key (Ctrl), sometimes written as Ctrl-C
or ^C, the SIGKILL signal on UNIX or the TerminateProcess call on
Win32 (perhaps in response to a user logging off), or a system crash.
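The point that an uncaught exception kills only the thread in which it was thrown, not the JVM, can be verified with a short sketch (the class name is an assumption):

```java
public class UncaughtDoesNotExit {
    // Returns true if the current thread is still running after
    // another thread dies from an uncaught exception. The default
    // exception handler prints the stack trace to standard error,
    // but the JVM itself keeps running.
    static boolean survivesUncaught() {
        Thread t = new Thread() {
            public void run() {
                throw new RuntimeException("uncaught in child thread");
            }
        };
        t.start();
        try {
            t.join(); // wait for the child thread to die
        } catch (InterruptedException e) {
            return false;
        }
        return true; // still executing, so the JVM was not exited
    }
    public static void main(String[] args) {
        System.out.println("JVM still running: " + survivesUncaught());
    }
}
```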
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
public class ExceptionThrower {
if (!handleException(e)) {
// See Bug Id 4499199.
// If we are in a modal dialog, we cannot throw
// an exception for the ThreadGroup to handle (as added
// in RFE 4063022). If we did, the message pump of
// the modal dialog would be interrupted.
// We instead choose to handle the exception ourselves.
// It may be useful to add either a runtime flag or API
// later if someone would like to instead dispose the
// dialog and allow the thread group to handle it.
if (isModal) {
System.err.println(
"Exception occurred during event dispatching:");
e.printStackTrace();
} else if (e instanceof RuntimeException) {
throw (RuntimeException)e;
} else if (e instanceof Error) {
throw (Error)e;
}
}201
200. API docs for the 1.4 release of the handleException(Throwable thrown) method
in the java.awt.EventDispatchThread class.
201. Code from the processException(Throwable e, boolean isModal) method
in the java.awt.EventDispatchThread class.
if (tgName.length() != 0) {
try {
Constructor ctor = Class.forName(tgName).getConstructor
(new Class[] {String.class});
threadGroup = (ThreadGroup)ctor.newInstance
(new Object[] {"AWT-ThreadGroup"});
} catch (Exception e) {
System.err.println("Failed loading " + tgName +
": " + e);
}
}
203. If for some reason you are inclined to use the “temporary hack” solution to this problem, you
should be aware of the fact that the code that is responsible for loading the class designated by the
sun.awt.exception.handler system property displays no diagnostic messages if for
some reason the class cannot be loaded. This is confusing because it leads you to believe that the
class was in fact loaded.
Otherwise it always fails to load the class and prints a message such as the fol-
lowing to standard error.
I submitted a documentation bug about this and got no response. That tells me
they are aware of the problem. Furthermore, as explained in 6.10.1.1 The
AWTExceptionHandler Class the awt.threadgroup system property
must be set before any GUI components are created. This is not true for the
sun.awt.exception.handler system property, which can be safely set
anywhere in the main method.
These are curve balls that would get by a great many programmers. Thus I
suspect the awt.threadgroup system property has gotten off to a slow
start and is not being used very much because as of this writing I can search the
entire Java Web site (including the prolific Java Discussion Forums) using the
advanced search at search.java.sun.com/search/java/advanced.jsp
and there are only two hits (the two Bug Ids mentioned at the start of this para-
graph) for “awt.threadgroup” instead of the usual hundreds or thousands.
The following modification of the ExceptionThrower class is an exam-
ple of a GUI application that uses the awt.threadgroup system property.
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import java.util.logging.*;
import java.lang.reflect.Constructor;
public class ExceptionThrower {
Other than the lines of code in bold, this is the same ExceptionThrower
program from above, only now it has a much more sophisticated top-level
exception handler. Note that all of the examples in this chapter set the system
properties programmatically. An alternative is to use the -Dawt.threadgroup=
or -Dsun.awt.exception.handler= launcher options. Executing this
program generates a log file such as the following.
AWTExceptionHandler[name=AWT-ThreadGroup,maxpri=10]
Thread[AWT-EventQueue-0,6,AWT-ThreadGroup]
Thread[AWT-EventQueue-0,6,AWT-ThreadGroup]
awt.threadgroup=AWTExceptionHandler
awt.toolkit=sun.awt.windows.WToolkit
file.encoding=Cp1252
file.encoding.pkg=sun.io
file.separator=\
/**
* Requires the use of the -Xbootclasspath/a:path launcher
* option. The next few lines of code are in a very specific
* order. The system property must be set before executing any
* GUI code. Then the frame is created and the AWTExceptionHandler
* initialized. In that order! The JFrame constructor effectively
* creates the AWTExceptionHandler, which must then be
* subsequently initialized.
*/
System.setProperty("awt.threadgroup",
"AWTExceptionHandler");
frame = new JFrame("Exception Thrower");
AWTExceptionHandler.getInstance().init(frame,
logger,
Redirect.BOTH);
This may strike you as an odd design at first. I had to get used to it myself, but
given the fact that the core API is responsible for instantiating the top-level
exception handler, there is not a lot of choice in the matter.
Here then is the code for the AWTExceptionHandler class:
import javax.swing.JOptionPane;
import javax.swing.JFrame;
import java.io.PrintStream;
import java.io.ByteArrayOutputStream;
/**
* This constructor is used when sun.awt.exception.handler
* system property is set.
*/
public AWTExceptionHandler() {
super("dummy");
if (instance != null)
throw new IllegalStateException(
"AWTExceptionHandler already exists");
instance = this;
}
/**
* This constructor is used when awt.threadgroup system
* property is set.
*/
public AWTExceptionHandler(String name) {
super(name);
/**
* It is recommended that both standard output and standard
* error be redirected because (1) they are useless in a GUI
* application that does not have a console window (i.e., one
* that is executed using the javaw launcher, which includes
* Java Web Start) and (2) many of the core API classes
* routinely catch exceptions and write critically important
* diagnostic messages to standard error.
*
* It is doubtful that synchronization is required, but is
* used as a precaution. Note also that this method can be
* invoked more than once to reinitialize the handler.
*
*/
public synchronized void init(JFrame frame,
Logger logger,
Redirect redirect) {
//a null frame is okay
if (logger == null)
throw new IllegalArgumentException("logger is null");
if (redirect == null)
throw new IllegalArgumentException("redirect is null");
instance.frame = frame;
instance.logger = logger;
instance.redirect = redirect;
initialized = true; //must be set before redirecting
/*
* The logger's encoding (preferably from a file handler)
* is used because these streams write to the log file.
*/
if (encoding == null) {
sysOut = new PrintStream(new StandardOutputStream(),
true);
sysErr = new PrintStream(new StandardErrorStream(),
true);
} else try {
sysOut = new PrintStream(new StandardOutputStream(),
true,
encoding);
sysErr = new PrintStream(new StandardErrorStream(),
true,
encoding);
} catch(UnsupportedEncodingException e) {assert false;}
switch(redirect.intValue()) {
case NEITHER:
break;
case STANDARD_OUTPUT:
System.setOut(sysOut);
break;
case STANDARD_ERROR:
System.setErr(sysErr);
break;
case BOTH:
System.setOut(sysOut);
System.setErr(sysErr);
break;
default:
assert false;
}
}
if (level.equals(Level.INFO))
message = "[FROM STANDARD OUTPUT] " + message;
else if (level.equals(Level.WARNING))
message = "[FROM STANDARD ERROR] " + message;
else
assert false;
logger.logp(level,
frame.getClassName(),
frame.getMethodName(),
message);
stream.reset();
}
class StandardOutputStream extends ByteArrayOutputStream {
public void flush() {
redirect(this, Level.INFO);
}
}
class StandardErrorStream extends ByteArrayOutputStream {
public void flush() {
redirect(this, Level.WARNING);
}
}
}
As stated above, it is a good idea to redirect both standard error and standard
output in a GUI application.
import java.io.*;
import java.util.*;
import java.util.logging.*;
public class UncaughtException {
private Logger logger;
private Throwable uncaughtException;
private static CharArrayWriter message =
new CharArrayWriter();
private static PrintWriter log = new PrintWriter(
new BufferedWriter(message));
/*
* The constructor assures that at a bare minimum the stack
* trace is always saved to the log file.
*/
public UncaughtException(Logger logger,
Throwable uncaughtException) {
if (logger == null)
throw new IllegalArgumentException("logger is null");
if (uncaughtException == null)
throw new IllegalArgumentException(
"uncaught exception is null");
this.logger = logger;
this.uncaughtException = uncaughtException;
Thread thread = Thread.currentThread();
log.println("UNCAUGHT EXCEPTION in thread \"" +
thread.getName() + "\"");
log.println();
uncaughtException.printStackTrace(log);
checkError();
StackTraceElement[] stack = uncaughtException.
getStackTrace();
StackTraceElement frame = stack[0];
logger.logp(Level.SEVERE,
frame.getClassName(),
frame.getMethodName(),
message.toString());
message.reset();
}
public void logVerbose() {
logThreadGroup();
logLoggableException();
logPackageInformation(false);
logSystemProperties();
}
public void logThreadGroup() {
log.println("THREADS");
log.println();
/*
* This is basically a workaround for the fact that the
* list() method in ThreadGroup always prints to standard
* output. There is no need to somehow reset standard
Note that the stack trace part of this log report includes exactly the same infor-
mation as would be printed to standard error were the default exception handler
allowed to execute. The most useful information is not shown here. It would
come from exception classes that implement the Loggable interface.
NOTE 6.4
If you have not already done so, I strongly recommend reading 6.3 An
Execution Stack Primer at the beginning of this chapter before pro-
ceeding. It is a very basic introduction to stacks.
class Test {
public static void main(String[] args) {
a();
}
static void a() {
b();
}
static void b() {
c();
}
static void c() {
throw new RuntimeException("detail message");
}
}
This is a very simple example, but nevertheless includes all of the elements in a
typical stack trace. Because the stack trace was printed by the default exception
handler (which is passed a reference to the current thread), the first line begins
with Exception in thread followed by the thread name in quotation marks.
Stack traces can also be printed by invoking one of the overloaded
printStackTrace methods declared in the Throwable class:
This is the same stack trace as before only printed by invoking one of the over-
loaded printStackTrace methods. If there is no detail message the colon
is omitted.
The remaining lines in the stack trace are indented. There is one line for
each frame on the stack. Each line begins with at followed by the fully qualified
method name of the method invoked and some additional information in paren-
theses. It is the information in parentheses that causes the most confusion. Ide-
ally it is the source code file name and line number, but for a number of different
reasons that information may not be available. In the previous example, all of the
code executed came from the same Test.java file. The runtime exception
was thrown on line twelve. There are a number of different reasons why a stack
trace may not include line numbers:
• Unknown Source: The debugging information was not included when the
class was compiled, in which case (Unknown Source) is printed instead
• Compiled Code: The code has been somehow optimized, in which case
(Compiled Code) is printed instead
• native Method: The method is native, in which case (Native
Method) is printed instead
-g:none
Do not generate any debugging information.
-g:{keyword list}
Generate only some kinds of debugging information, specified by a
comma separated list of keywords. Valid keywords are:
source
Source file debugging information
lines
Line number debugging information
vars
Local variable debugging information204
Unless the default is taken or both the source and lines keywords are
used, instead of line numbers what you will see is (Unknown
Source). Amusingly, novice programmers unfamiliar with stack traces
sometimes mistake this for an UnknownSource exception or error, which of
course is not really the name of an exception class. Note that if only lines is
specified (instead of both source and lines), the stack trace will still show
(Unknown Source). Note also that the compiler ignores the source,
204. Unascribed, “javac - Java programming language compiler” under “Tools and Utilities” in the
1.4.1 (and earlier) releases.
205. Unascribed, “javac - The Java compiler” under “Tools Reference Pages - Windows” in the
1.1.6 (and later) releases.
Unless the exception is thrown in a helper class, the source code file
name is more or less part of the fully qualified method name.
Even with inner classes it is pretty easy to determine the source code file name.
That only leaves the line number. Here it is important to understand that with the
exception of the method in which the exception or error was thrown, all of the
other methods on the stack complete abruptly at the precise point at which eval-
uating a method invocation expression or a class instance creation expression
resulted in a new activation frame being added to the stack. The name of the
206. Unascribed, “java - the Java application launcher” under “Tools and Utilities” in the 1.3 (and
later) releases.
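The same per-frame information (class name, method name, line number) that printStackTrace displays is also available programmatically through the getStackTrace() method added in the 1.4 release. A hedged sketch (the class and method names are assumptions):

```java
public class FrameInspector {
    // Returns the frames on the current thread's stack, with the
    // most recently invoked method first.
    static StackTraceElement[] currentFrames() {
        return new Throwable().getStackTrace();
    }
    public static void main(String[] args) {
        StackTraceElement[] stack = currentFrames();
        for (int i = 0; i < stack.length; i++) {
            StackTraceElement frame = stack[i];
            // Mirrors the "at ..." lines of a printed stack trace
            System.out.println("at " + frame.getClassName() + "."
                + frame.getMethodName()
                + " (line " + frame.getLineNumber() + ")");
        }
    }
}
```

Note that the line number may be -1 when the debugging information discussed above was not compiled into the class file.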
Error codes are also discussed in 6.11.3 Logging Methods as a replacement for
the source class and method names. In either case, they solve the same problem.
In addition to being code tags, error codes are also useful when cataloguing error
messages in the documentation for very large systems.
There was a change in the 1.4 release that for the first time makes stack
traces a little difficult to read. The stack traces that print as the result of excep-
tion chaining are counterintuitive. If you think of the stack as a deck of cards, the
effect of exception chaining is to shuffle the deck. The following example is
adapted from the printStackTrace() method in the Throwable class, but
uses runtime exceptions so that there is less clutter:
class Test {
public static void main(String[] args) {
try {
a();
} catch(RuntimeException e) {
e.printStackTrace();
}
}
static void a() {
try {
b();
} catch(Exception e) {
throw e;
207. Unascribed, “Java Logging Overview” in the API docs for the 1.4 release, §1.13, “Unique
Message IDs.”
LowLevelException
at Test.e(Test.java:30)
at Test.d(Test.java:27)
at Test.c(Test.java:21)
at Test.b(Test.java:17)
at Test.a(Test.java:11)
at Test.main(Test.java:4)
class Test {
public static void main(String[] args) {
try {
a();
} catch(RuntimeException e) {
e.printStackTrace();
}
}
static void a() {
try {
b();
} catch(RuntimeException e) {
throw new HighLevelException(e);
}
}
static void b() {
c();
}
static void c() {
try {
d();
} catch(RuntimeException e) {
throw new MidLevelException(e);
}
}
static void d() {
e();
}
static void e() {
throw new LowLevelException();
}
}
I believe this simple little trick makes it a lot easier to see that “… 3 more” is a
reference to the methods b(), a(), and main, and that “… 1 more” is a ref-
erence to main. The “deck gets shuffled” because code is being executed as
the thread is unwound. The stack trace has to reflect the methods in which code
was most recently executed. Now look back at the actual stack trace and it
should be a lot easier to read.
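The chain that produces those “Caused by:” lines can also be walked programmatically with getCause(). A sketch using plain runtime exceptions (the helper method and its output format are assumptions):

```java
public class CauseWalker {
    // Walks a cause chain of the kind built by the example above,
    // joining the detail messages from the highest-level exception
    // down to the root cause.
    static String joinMessages(Throwable t) {
        StringBuffer sb = new StringBuffer();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (sb.length() > 0) {
                sb.append(" <- ");
            }
            sb.append(cur.getMessage());
        }
        return sb.toString();
    }
    public static void main(String[] args) {
        RuntimeException low = new RuntimeException("low level");
        RuntimeException mid = new RuntimeException("mid level", low);
        RuntimeException high = new RuntimeException("high level", mid);
        // prints: high level <- mid level <- low level
        System.out.println(joinMessages(high));
    }
}
```

The Throwable(String, Throwable) constructors used here were added in the 1.4 release along with exception chaining itself.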
The overloaded printStackTrace methods in the Throwable class
(or alternatively the dumpStack() method in the Thread class) are only
the proverbial tip of the iceberg when it comes to the diagnostic information
available without the use of -Xprof, -Xrunhprof, -Xrunxdprof,208 or
209. Calvin Austin, “An Introduction to Java Stack Traces,” (Palo Alto, Sun Microsystems, 1998),
developer.java.sun.com/developer/technicalArticles/Programming/
Stacktrace.
javax.swing.JFrame[frame0,312,284,400x200,invalid,layout=java.awt
.BorderLayout,title=ExceptionThrower,resizable,normal,defaultClos
eOperation=HIDE_ON_CLOSE,rootPane=javax.swing.JRootPane[,4,41,392
x155,invalid,layout=javax.swing.JRootPane$RootLayout,alignmentX=n
ull,alignmentY=null,border=,flags=1409,maximumSize=,minimumSize=,
preferredSize=],rootPaneCheckingEnabled=true]
210. Chris White, “Revelations on Java signal handling and termination,” (www.ibm.com, IBM,
2002), www-106.ibm.com/developerworks/ibm/library/i-signalhandling. z/OS is
an IBM mainframe operating system.
Table 6.3 Operating System Signals that Originate from the Keyboard
Keyboard Operating System Signal Meaning
211. Unascribed, “Enhancements and Changes in J2SE 1.4.1 Platform” document in the 1.4.1 Beta
release.
NOTE 6.5
The following section discusses the logging API. I have carefully divided
this material into the following introductory section in which two exam-
ples are discussed in detail, defining basic terminology along the way,
and three advanced subsections that focus on the logger namespace,
logging configuration, logging methods, and a short section on the
Cleaner thread, which is a shutdown hook used in LogManager.
The first section reads like a tutorial and is intended for programmers
who have not yet used the logging API. Experienced programmers
should consider skipping this section and just reading the subsections
of interest.
6.11 Logging
The logging API is very easy to learn. Here is a very simple example that you
can copy and begin using at once to log messages to the console:
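A sketch of the kind of console example described, following the configureLogger() convention and "unnamed" logger name discussed in this section (the class name and level choices are otherwise assumptions):

```java
import java.util.logging.*;

public class ConsoleLoggingSketch {
    private static Logger logger = configureLogger();

    // Creates and configures the logger in a static method rather
    // than a static initialization block, as the text recommends.
    private static Logger configureLogger() {
        Logger logger = Logger.getLogger("unnamed");
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);        // publish every record
        logger.addHandler(handler);
        logger.setLevel(Level.ALL);
        logger.setUseParentHandlers(false); // bypass the root logger
        return logger;
    }

    public static void main(String[] args) {
        // ENTRY and RETURN records are logged at Level.FINER
        logger.entering("ConsoleLoggingSketch", "main(String[] args)");
        logger.exiting("ConsoleLoggingSketch", "main(String[] args)");
    }
}
```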
212. Calvin Austin and Monica Pawlan, Advanced Programming for the Java 2 Platform, (Bos-
ton, Addison Wesley Professional, 2000).
The example uses a console handler because they are by far the easiest han-
dlers to configure. Basically all you have to do is set the level. I always use a
static method named configureLogger() to both create and configure
the logger (rather than a static initialization block). This is purely a matter of
style, however.
The output from a logger is referred to as a log report. A log report is noth-
ing more than a series of log records, each of which prints on two lines (assum-
ing the use of a SimpleFormatter). If executed as is, this example prints
the following log report.
The first line begins with the date and time the log record was created followed
by the name of the class and method in which the message was logged. The
second line begins with the log level (FINER in this example) followed by a
colon and then the log message, which in this example is simply ENTRY and
213. API docs for the LogRecord(Level level, String msg) constructor in the
java.util.logging.LogRecord class.
You may wonder why Level.OFF is included at the top of this table and
Level.ALL at the bottom (which superficially seems reversed). These are
not log levels. They are special values used only by loggers and handlers.
Using this table, if the log level is below the level of the logger, the log record is
discarded. This implies that all log records are discarded if the logger is set to
Level.OFF. Again using this table, if the log level is below the level of a han-
dler the log record is not published by that particular handler. Hence
Level.OFF is at the top and Level.ALL is at the bottom.
Table 6.4 The Seven Standard Log Levels Plus OFF and ALL
Level Description
214. Unascribed, “Java Logging Overview” in the API docs for the 1.4 release, §1.2, “Log Levels.”
Note that although args was logged, nothing printed. That is because I did not
use any command-line arguments when launching the Test program. Had I
used “testing testing testing” after the program name on the DOS command-line,
the second line would have printed as follows.
A logger without handlers is useless. It creates log records that are discarded
as soon as they are created. For example, try commenting out the following
line of code in the configureLogger() method.
logger.addHandler(handler);
Nothing is printed as a result. There is actually an exception to the rule that there
is no method that when invoked prints a log report. The push() method in
MemoryHandler sometimes publishes an entire log report by exporting all of
the log records in memory to some other kind of handler. There is an example of
using a memory handler below.
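A minimal sketch of the push() behavior just described, assuming a console handler as the target (the buffer size and logger name are assumptions):

```java
import java.util.logging.*;

public class MemoryPushSketch {
    static MemoryHandler memoryHandler;

    // Buffers log records in memory and publishes them only when
    // push() is explicitly invoked.
    static Logger configure() {
        Logger logger = Logger.getLogger("unnamed");
        logger.setUseParentHandlers(false);
        ConsoleHandler target = new ConsoleHandler();
        target.setLevel(Level.ALL);
        // A push level of OFF means records are never published
        // automatically; push() must be invoked explicitly.
        memoryHandler = new MemoryHandler(target, 100, Level.OFF);
        logger.addHandler(memoryHandler);
        return logger;
    }

    public static void main(String[] args) {
        Logger logger = configure();
        logger.info("buffered, not yet published");
        memoryHandler.push(); // now the record reaches the console
    }
}
```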
The logger.setUseParentHandlers(false) method invocation
in this example is particularly important. Try commenting out that line of code
and then adding the following one to the main method.
The output now includes four additional lines similar to the following.
import java.util.logging.*;
public class Test {
public static void main(String[] args)
throws java.io.IOException {
Logger logger = Logger.getLogger("unnamed");
logger.addHandler(new FileHandler("%h/java%g.log"));
logger.severe("testing, testing, testing");
System.out.println("User’s home directory = " +
System.getProperty("user.home"));
}
}
This shows that even a Sun engineer had difficulty figuring out what was going
on. I did too, and so will almost every other programmer learning how to
use the logging API for the first time. That is really a shame because it seri-
ously detracts from an otherwise very easy to learn API. Note that the default
pattern string for the global file handler in the logging configuration file is "%h/
java%u.log" (not %g as used in the bug reports), but the same problem
occurs anyway. In the example above, an extra file is created by the no-argu-
ment FileHandler() constructor because it uses the exact same pattern
string as the global file handler (the one in the logging configuration file).
There are three general purpose handlers in the java.util.logging
package: ConsoleHandler, FileHandler, and SocketHandler. In
addition, MemoryHandler maintains a circular buffer (sometimes referred to
as a cyclic or ring buffer) that begins discarding the oldest log records once it
is full in order to make room for new ones.
The benefit of doing so is twofold:
• First and foremost the log records in a memory buffer are not formatted
(including localization), which means that logging them is a lot less expensive
than it would be if the log records were published directly to a file.
• The other major benefit of using a memory handler is that only the most
recent log records are written to the file. (Those are also the most relevant
log records in the event of a failure.) This saves wasting a lot of disc space
for log records that nobody will ever read.
It is important to understand the design of memory handlers. The log records
in a memory handler may never be published. The whole idea is that the log
import java.io.*;
import java.util.logging.*;
class Test {
private static final String CLASS = "Test";
private static Logger logger = configureLogger();
private static MemoryHandler memoryHandler;
public static void main(String[] args) throws IOException {
new Test().copy(new File("Test.java"),
new File("Copy of Test.java"));
}
public static void copy(File source, File destination)
throws IOException {
final String METHOD =
"copy(File source, File destination)";
logger.entering(CLASS, METHOD,
new Object[]{ source, destination });
byte[] buffer = new byte[512];
int numBytes = 0;
FileInputStream in = new FileInputStream(source);
FileOutputStream out = new FileOutputStream(destination);
try {
while ((numBytes = in.read(buffer)) != -1) {
out.write(buffer, 0, numBytes);
}
} catch(IOException e) {
logger.throwing(CLASS, METHOD, e);
throw e;
}
finally {
in.close();
In the nomenclature of the logging API, the file that a memory handler writes to is
referred to as the target (or target handler). Log files typically use the .log
file extension. This log file is being written to the directory that the host operat-
ing system uses to write temporary files. File handlers default to Level.ALL,
so there’s no reason to set their level (at least not in this example).
FileHandler constructors throw IOException, which requires the use of a
try block because class variable initializers cannot throw checked exceptions.
Inasmuch as the application has just started, exiting is a reasonable response.
Remember, too, that assignments in a try block never count towards definite
assignment, so target must be initialized to null when declared.
The relationship between the memory handler size (the number of log
records written before the memory handler starts discarding old ones) and the
file handler limit (the maximum number of bytes that can be written to any
one file) is significant. The most recent log records are written first, so if the file
is too small to hold all of the records in a memory handler the older log records
"%u" A unique number used “to resolve conflicts.” This escape sequence is very
easy to misunderstand. It is almost always replaced with a zero. The only time
%u is anything other than a zero is if the FileHandler class cannot
lock the requested file name. In other words, the file is already in use. This
rarely happens. If the file name already exists (using 0 as a replacement for
%u), that is an entirely different matter. The problem of pre-existing file
names is always resolved with the %g escape sequence.
"%%" As with all escape characters, two of them in a row is an escape sequence
that translates into the escape character (without any further translations),
so "%%" translates to a single percent sign "%" that is used in the log file
name.
a. API docs for the java.util.logging.FileHandler class
The system temporary directory is not always easy to find, which can be a
vexing problem because the file search utility on some operating systems
does not include the system temporary directory when searching for a file
or folder. If you run into the problem of not being able to find log files created
using %t execute the following two lines of code.
String tmpdir = System.getProperty("java.io.tmpdir");
System.out.println(new File(tmpdir).getCanonicalPath());
The output will tell you precisely where the %t directory is located. Note that the
API docs suggest that the system temporary directory on Microsoft Windows is
C:\TEMP, which can be grossly misleading. On my computer, for example, the
system temporary directory is C:\Documents and Settings\Douglas
K. Dunn\Local Settings\Temp.
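The two lines above can be wrapped in a small, self-contained program so they can be run on demand. This is a sketch of my own; the class and method names are not part of the logging API.

```java
import java.io.File;
import java.io.IOException;

public class TempDirLocator {
    // Returns the canonical path of the %t (system temporary) directory.
    public static String tmpDir() {
        String tmpdir = System.getProperty("java.io.tmpdir");
        try {
            return new File(tmpdir).getCanonicalPath();
        } catch (IOException e) {
            return tmpdir; // fall back to the raw system property value
        }
    }
    public static void main(String[] args) {
        System.out.println(tmpDir());
    }
}
```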
The remainder of this section discusses the global logger, a reference to
which is stored in the public Logger.global field. The idea behind this
logger is to make using loggers as easy as the unnamed package or type-
import-on-demand import declarations. These are all devices used while doing
“casual” development work. As stated in the API docs:
218. API docs for the global field in the java.util.logging.Logger class.
Except for the line of code in bold, the first configureLogger() method is
the same as the first example (that used an arbitrarily named logger) at the start
of this section. The second one is a nightmare. The other alternative is to edit
the default logger configuration file (changing only the .level=INFO entry).
This may sound easy to some readers, but programmers eager to learn the log-
ging API may not have any experience editing configuration files. Indeed, they
may be very reluctant to do so.
I therefore recommend that someone new to the logging API stay away from
the global logger. As stated in the API docs, “The global logger is initialized
by calling Logger.getLogger("global").”218 In other words, it has an
arbitrary name. As with all loggers that have arbitrary names, the root logger is
the parent. Invoking Logger.getLogger("unnamed") during casual
development accomplishes the same thing, is more in line with how loggers are
actually used, and avoids the assumption that the global logger can be used to
log trace and configuration information without being reconfigured. As stated
above, the meaning of “unnamed” is “a logger for the unnamed package.” Do
not pass "" (an empty string) because that is the name of the root logger.
The significance of logger names is discussed further in the following subsec-
tion.
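To make the suggestion concrete, here is a minimal sketch of obtaining the "unnamed" logger during casual development. The helper class name is my own, not part of the logging API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class CasualLogging {
    // Returns a logger named "unnamed" -- an arbitrary name, so the
    // root logger (whose name is the empty string) is its parent.
    public static Logger unnamed() {
        Logger logger = Logger.getLogger("unnamed");
        logger.setLevel(Level.ALL); // log everything during casual development
        return logger;
    }
}
```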
If the new level is null, it means that this node should inherit its level
from its nearest ancestor with a specific (non-null) level value.220
Thus ancestor is a synonym for parent. The root node at the top of a tree is
special because it has no parent; only in this case is it referred to as the root
logger.
If using either of the first two methods, you really have no choice but to name the
logger because passing a null reference throws a NullPointer-
Exception and passing "" (an empty string) returns a reference to the root
logger. Anonymous loggers are “primarily intended for use from applets.”222
Besides not having a name, anonymous loggers are special for three reasons:
• Unlike the getLogger methods, the getAnonymousLogger meth-
ods return a new logger every time they are invoked. Furthermore, the
Logger class does not store a reference to the returned logger in the log-
ger namespace. Thus, only the client programmer who invokes one of the
getAnonymousLogger factory methods has access to the logger.
• Because access to the logger is strictly controlled by the client who invokes
the getAnonymousLogger method, no permission is required to config-
ure an anonymous logger. This makes it possible for untrusted applet code
to invoke the following methods in the Logger class.
setFilter(Filter newFilter)
setLevel(Level newLevel)
addHandler(Handler handler)
removeHandler(Handler handler)
setUseParentHandlers(boolean useParentHandlers)
These are all of the methods in the Logger class used to configure loggers.
• “Anonymous” loggers are not “autonomous.” They have a parent logger,
which is always the root logger. This means their log records may be published
by the so-called “global” handlers unless setUseParentHandlers(false)
is invoked.
Loggers can also have arbitrary names, such as the global logger, which is
named "global", or the "unnamed" logger used in the examples in this
chapter. Using an arbitrarily named logger simply means that the logger is
sure not to have any children and that the root logger will be the parent. They
are not much different from anonymous loggers in this regard.
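The special behavior of anonymous loggers described above can be observed directly. This is a sketch of my own; the class name is not part of the logging API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class AnonymousLoggerDemo {
    public static void main(String[] args) {
        Logger a = Logger.getAnonymousLogger();
        Logger b = Logger.getAnonymousLogger();
        System.out.println(a == b);                  // false: a new logger on every invocation
        System.out.println(a.getName());             // null: anonymous loggers have no name
        System.out.println(a.getParent().getName()); // "": the parent is always the root logger
        a.setLevel(Level.ALL);                       // no permission required to configure
    }
}
```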
223. Unascribed, “Java Logging Overview” document in the 1.4.1 release, §1.3, “Loggers.”
    if (targets != null) {
        for (int i = 0; i < targets.length; i++) {
            targets[i].publish(record);
        }
    }
    if (!logger.getUseParentHandlers()) {
        break;
    }
    logger = logger.getParent();
}
As you can clearly see, there is nothing in this code that would justify
saying that a logger “inherits” handlers from its parent loggers. The simple
fact of the matter is that log records are logged by the target logger and
then by the parent loggers, unless setUseParentHandlers(false) is
invoked. This while loop is interesting in that you can see how
getUseParentHandlers() is used to break this chain of logging to parent
loggers. You can also see this in the default logging configuration file. Reduced
to its essence, the default logger configuration file looks like this:
All nine of these are properties used to configure the root logger. It is important
to understand that initial logging configuration is limited to the above list of
three very specific items. Otherwise, you fail to grasp that the design of the
LogManager class is to load logging control properties, not to configure
loggers (other than the root logger).
The API docs for the LogManager class list three mutually exclusive
options for loading logging control properties. To that list I have added a fourth
for loading logging control properties from a JAR file:
• Loading From a Database or Across a Network: The name of any class
with a no-argument constructor can be passed to the LogManager class
using the java.util.logging.config.class system property.
That class then becomes the logging configuration class. The logging
configuration class uses the same
readConfiguration(InputStream ins) method as is used to
load logging control properties from a JAR file. The idea behind this class is
simply that the logging control properties may not be stored in a directory in
the local filesystem. For example, the logging configuration class can use
JDBC to load logging control properties from a database, or JNDI to load
them from an LDAP directory service. This must be a public class on the
bootstrap classpath (which is accomplished using the non-standard
-Xbootclasspath/a:path launcher option). The no-argument constructor
must also be public and cannot have a throws clause. If for
any reason whatsoever this class fails to load or if the constructor
If you forget to change it back, the default global logging level (used when
the logging configuration file has a bogus value) is INFO, so no harm is
done.
C:\Java\classes>java -Djava.util.logging.config.class=LoggingConfiguration -Xbootclasspath/a:C:\Java\classes Test
228. Ibid.
229. Ibid.
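To make this concrete, here is a sketch of what a logging configuration class might look like. The class name matches the command line above, but the body is mine; a real implementation would read the logging control properties from a database or directory service rather than from a hard-coded string. Note that the constructor must handle the IOException itself because it cannot have a throws clause.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.LogManager;

public class LoggingConfiguration {
    public LoggingConfiguration() {
        // A real logging configuration class would fetch these logging
        // control properties using JDBC or JNDI. A hard-coded string is
        // used here only to keep the sketch self-contained.
        String properties = ".level = WARNING\n";
        try {
            InputStream ins = new ByteArrayInputStream(properties.getBytes());
            LogManager.getLogManager().readConfiguration(ins);
        } catch (IOException e) {
            // cannot have a throws clause, so the exception is handled here
            System.err.println("logging configuration failed: " + e);
        }
    }
}
```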
I find this quote from the API docs for the LogManager class to be very mis-
leading. The so-called “static configuration control” should not be compared with
the constructors and set methods in the logging API. It is nothing more than a
reference to the logging control properties loaded during initial logger configura-
tion. Actually using those properties to configure a logger takes a lot of code,
code that programmers rather naturally assume is implemented for them. Not
so. It took me the better part of two days just to develop and test the following
utility class that configures handlers based on logging control properties.
/**
* This is an experimental class in that I am not using any in-
* dentation for entities declared in the class body (e.g. fields
* and methods). It is always a healthy thing for programmers to
* question the status quo. Why lose all that space when there is
* usually only one class per compilation unit anyway? It doesn’t
* make any sense.
*/
import java.util.logging.*;
public class Loggers {
/**
* It is not at all clear to me if an internationalized trim()
* method is really necessary when editing .properties files.
*/
public static String trim(String s) {
int length = s.length();
int index = 0;
while (index < length &&
Character.isWhitespace(s.charAt(index))) {
index++;
}
while (index < length &&
Character.isWhitespace(s.charAt(length - 1))) {
length--;
}
return (index > 0 || length < s.length()) ?
s.substring(index, length) : s;
}
} //END OF PUBLIC CLASS
The design of this utility class is very straightforward. Handler properties are
divided into two very distinct groups. One group must be passed to constructors
because they have no corresponding set methods (the pushLevel in a
MemoryHandler being the only exception). The utility class requires that
these properties be defined in the logging configuration file, or else a runtime
exception is thrown. The other group of handler properties (of which there are
only four) are optional because they have set methods. Adding handlers to a log-
ger using this utility class is as simple as invoking
logger = Logger.getLogger("com.javarules");
Loggers.configureConsoleHandler(logger);
Loggers.configureFileHandler(logger);
The property keys are a combination of the logger name followed by the simple
name of the handler class. For example,
com.javarules.FileHandler.pattern = logs/java%g.log
com.javarules.FileHandler.limit = 0
com.javarules.FileHandler.count = 30
com.javarules.FileHandler.append = false
com.javarules.FileHandler.level = ALL
com.javarules.FileHandler.formatter =
java.util.logging.SimpleFormatter
Note that this property naming convention is significantly different from what is
described in the API docs for the LogManager class:
The properties for loggers and Handlers will have names starting with
the dot-separated name for the handler or logger.231
java.util.logging.FileHandler.pattern = %h/java%g.log
java.util.logging.FileHandler.count = 5
java.util.logging.FileHandler.formatter = \
java.util.logging.SimpleFormatter
com.javarules.MemoryHandler.target = \
java.util.logging.FileHandler
com.javarules.MemoryHandler.size = 2500
com.javarules.MemoryHandler.push = SEVERE
(Note that \ is the line continuation character in the properties file format. I have
to use it in the book because of the page widths.) This logging configuration file
expropriates the “global” file handler (java.util.logging.FileHandler)
that would normally be used by the root logger and uses it as a target
handler instead. Coding such as this offers the advantage of field service
engineers being able to change the logger configuration, but at a very significant
cost.
The big question is why this functionality was not built into the logging API.
The best (and only) RFE that I am aware of is 4635817, but I suspect there will
be others to follow. Bug Id 4533204 (marked “Closed, not a bug”) perhaps
failed as an RFE because it was too complicated, but is interesting because it
includes the following evaluation.
The configuration file is intended for simple configuration.
231. Ibid.
There is no support for this view in the API docs. As stated in the API docs for
the logging package:
The Logging APIs offer both static and dynamic configuration control.
Static control enables field service staff to set up a particular configura-
tion and then re-launch the application with the new logging settings.
Dynamic control allows for updates to the logging configuration within a
currently running program.233
Something like the Loggers utility class or RFE 4635817 must be, and I am
sure eventually will be, added to the java.util.logging package. Meanwhile, I
suspect most loggers and handlers will be configured dynamically because
doing so requires much less code.
Configuring a logger is actually quite simple. Most of the time this is accom-
plished in four steps (the order is not really significant):
• In the case of an internationalized application or applet, the name of the
resource bundle is passed to the factory method in the Logger class.
• The logger level is set either in the logging configuration file or by invoking
the setLevel(Level newLevel) method in the Logger class. If
additional filtering is required, the setFilter(Filter newFilter)
method is invoked. Filters will be discussed momentarily.
• Handlers are then created and added to the logger by invoking the
addHandler(Handler handler) method.
• The setUseParentHandlers(false) method is invoked so that log
records are not published by the handlers of parent loggers (including the
“global” handlers of the root logger).
Note that handlers are “added” while filters are “set” because there may be more
than one handler whereas there is at most one filter. Here is an example of
configuring a logger that uses a file handler.
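The four steps above can be sketched as follows. The logger name and file name pattern are my own, and the FileHandler constructor forces a try block just as it did in the example at the start of this section. (The resource bundle step is omitted because this sketch is not internationalized.)

```java
import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FileLogging {
    public static Logger configure() {
        Logger logger = Logger.getLogger("unnamed");
        logger.setLevel(Level.ALL);              // step: set the logger level
        Handler fileHandler = null;
        try {
            // %t = system temporary directory, %g = generation number
            fileHandler = new FileHandler("%t/javarules%g.log", 0, 5, false);
        } catch (IOException e) {
            System.err.println(e);
            System.exit(1);                      // the application has just started
        }
        fileHandler.setLevel(Level.ALL);
        logger.addHandler(fileHandler);          // step: handlers are "added"
        logger.setUseParentHandlers(false);      // step: bypass the "global" handlers
        return logger;
    }
}
```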
package java.util.logging;
public interface Filter {
    boolean isLoggable(LogRecord record);
}
logger.setFilter(new Filter() {
    public boolean isLoggable(LogRecord record) {
        if (record.getSourceMethodName().startsWith("get") &&
            record.getLevel().intValue() <= Level.FINE.intValue()) {
            return false;
        }
        return true;
    }
});
This filter in effect says, “Do not log trace information from accessor methods.”
Implementing a filter is that easy.
It is important to understand why levels and filters are used in both loggers
and handlers. This may seem redundant at first, but it gives the application pro-
grammer a much higher degree of control over the configuration. For example,
the default level for a ConsoleHandler is Level.INFO , whereas the
level       The level is used to make the initial determination if a log record
            should be published. The default is usually Level.ALL. The
            StreamHandler and ConsoleHandler classes are an
            exception. They default to Level.INFO.
filter      A handler has at most one filter. I doubt very seriously if this
            property would ever be included in a logging configuration file
            because it would take more work to instantiate the class than it
            would to create the filter (unless the filter were unusually complex).
            The default is always no filter because filters must be
            programmatically created.
formatter   A handler has at most one formatter. Most of the time this is one of
            the standard formatters: SimpleFormatter (a plain text
            formatter) or XMLFormatter. The FileHandler and
            SocketHandler classes default to an XML formatter. The
            StreamHandler and ConsoleHandler classes default
            to a plain text formatter. The MemoryHandler class does no
            formatting and so ignores the formatter property.
encoding    This is the character encoding scheme used when formatting the log
            record. It always defaults to the default platform encoding. The
            MemoryHandler class does no formatting and so ignores the
            encoding property.
MemoryHandler()
MemoryHandler(Handler target, int size, Level pushLevel)
FileHandler()
FileHandler(String pattern)
FileHandler(String pattern, boolean append)
FileHandler(String pattern, int limit, int count)
FileHandler(String pattern, int limit, int count, boolean append)
SocketHandler()
SocketHandler(String host, int port)
The explanation for this design is that at least some (all in the case of a socket
handler) of these properties must be known before the handler object can be
created. Those same properties cannot be changed after the handler is created.
Hence no get and set methods. Overall, it is a very clean design.
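The division can be seen in code. In this sketch (the class name is mine), the target and size of a MemoryHandler can only be supplied to the constructor, while the pushLevel (the one constructor property that also has a set method) and the optional properties can be changed afterwards.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.MemoryHandler;

public class HandlerConstruction {
    public static MemoryHandler makeMemoryHandler() {
        // target and size have no set methods; they must go to the constructor
        MemoryHandler mh = new MemoryHandler(new ConsoleHandler(), 2500, Level.SEVERE);
        mh.setPushLevel(Level.WARNING); // the one constructor property with a set method
        mh.setLevel(Level.ALL);         // optional properties all have set methods
        return mh;
    }
}
```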
When I first became aware of this design, I assumed that the three
FileHandler constructors between the no-argument constructor and the one that
Object param1
Object[] params
Upon exiting void methods or any constructor,
the basic exiting method shown to the left is
used. There is another exiting method that
uses the following as the last parameter:
Object result
Logging the fact that an exception is being thrown
is especially important because exceptions are
sometimes ignored or translated without properly
chaining the old exception to the new one.
Object param1
Object[] params
Throwable thrown
import java.util.logging.*;
import java.io.*;
class Test {
    private static final String CLASS = "Test";
    private static Logger logger = Logger.global;
    static {
        logger.setLevel(Level.ALL);
        logger.setUseParentHandlers(false);
        Handler consoleHandler = new ConsoleHandler();
        consoleHandler.setLevel(Level.ALL);
        logger.addHandler(consoleHandler);
    }
    public static void main(String[] args) {
        new Test().copy(new File("java.log"), "Copy of java.log");
    }
    public File copy(File file, String newName) {
        final String METHOD = "copy(File file, String newName)";
        logger.entering(CLASS, METHOD, new Object[] {file, newName});
        File copy = new File(newName);
        /* code to copy file */
        Exception e = new IOException("something went wrong");
        logger.throwing(CLASS, METHOD, e);
        //just kidding
        logger.exiting(CLASS, METHOD, copy);
        return copy;
    }
}
In mature systems, error codes such as these can be used in system docu-
mentation, making it possible for end-users to look up more detailed informa-
tion about the cause of a particular problem.
If not passed as arguments, the LogRecord class makes a “best effort” to
determine the name of the class and method in which a log method was invoked,
but this effort is likely to fail in commercially available Java VMs such as HotSpot.
As stated in the API docs:
This is, of course, a reference to method inlining, which is the single most impor-
tant optimization performed by state-of-the-art Java VMs. Because all commer-
cially available Java VMs optimize code by inlining methods, the frame that
invoked the logging method may have a different class and method name. (Pre-
sumably a different method name is what is meant by “approximate.” By exten-
sion, “quite wrong” would mean that both the class and method names are
wrong.) This is the same reason why stack traces sometimes say (Compiled
Code) instead of the source code file name and line number.
If a memory handler is used or if the log record has been serialized, the
source class and method names are sure to be wrong. See Bug Id 4515935,
“LogRecord lazy class/method name inference can be incorrect/unreliable”
for a discussion. It will be interesting to see how this is fixed. Moving the stack
trace search to the log(LogRecord record) method in Logger would be
contrary to existing documentation claims as to the cost of logging. This
dilemma makes a strong case for reconsidering Bug Id 4465521 (mentioned
above).
“Automatically inferring” the name of the class and method in which a log
method was invoked is a much simpler process than the API docs make it out to
be. I do the same thing in AWTExceptionHandler in order to determine the
name of the class and method in which a message was printed to standard out-
put or standard error. Here is the source code:
at AWTExceptionHandler$StandardOutputStream.flush(AWTExceptionHandler.java:230)
at java.io.PrintStream.write(PrintStream.java:260)
at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:334)
at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:402)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:113)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:169)
at java.io.PrintStream.write(PrintStream.java:305)
at java.io.PrintStream.print(PrintStream.java:448)
at java.io.PrintStream.println(PrintStream.java:585)
As you can see, besides the print method there are seven other methods on
top of println by the time flush() is invoked. About the only trick to doing
this is to remember that the class name in a StackTraceElement is always
fully qualified.
A LogRecord looks like this in memory:
There are thirteen fields, three of which are transient (and therefore not seri-
alized). There are a total of nine reference types plus 24 bytes of primitive data.
That is 96 bytes of data on a 32-bit architecture.
There are get and set methods for all of these fields (except needTo-
InferCaller) as well as the public LogRecord(Level level,
String msg) constructor, making it possible for application programmers to
Basically what this means is that you should not keep a reference to the
LogRecord after invoking log(LogRecord record).
For all named loggers, the reset operation removes and closes all Han-
dlers and (except for the root logger) sets the level to null. The root log-
ger's level is set to Level.INFO.240
The effect of invoking the reset() method is that the logging API can-
not be used during shutdown. Bug Id 4682478 addresses this question, but
misses the mark because it suggests an ordering (or reordering) of shutdown
hooks. The evaluation correctly notes that it “is dangerous to rely on ordering of
shutdown operations.”241 Shutdown hooks are a bunch of unstarted threads that a
JVM starts and runs concurrently. They are not and never will be ordered. See
java.sun.com/j2se/1.3/docs/guide/lang/hook-design.html for a dis-
cussion. The question, however, is not the fact that shutdown hooks are unordered.
The question is: Should the log manager be invoking the reset() method?
In other words, should LogManager be registering a shutdown hook at all?
If the answer is yes, the only plausible explanation is the push() method in
a memory handler, which could potentially publish thousands of log records
during shutdown.
a. All of these are private instance variables. Class variables are not included in this table. Note that
parameters[] and resourceBundle are transient, but for reasons explained in the table are written to
the serialized form as strings.
I think this warning should suffice, however, and that the logging API should be
available to application programmers during shutdown. At a bare minimum the
LogManager class should document the rationale for making the logging API
unavailable during shutdown.
There are numerous examples of why the logging API should be available
during shutdown. Suppose, for example, that you have a performance critical
application for which the log file must be saved after each run. The push level is
set to Level.OFF so that no records are published while the application is run-
ning, but there is a requirement to explicitly invoke the push() method during
shutdown. It would be natural to do this in a shutdown hook inside of the
configureLogger() method. For example,
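Such a hook might look like this. This is a sketch of my own; the memoryHandler field stands for the memory handler created earlier with a push level of Level.OFF.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.MemoryHandler;

public class PushAtShutdown {
    // Stands in for the memory handler created during logger configuration,
    // with the push level set to Level.OFF so nothing is published while
    // the application is running.
    static final MemoryHandler memoryHandler =
        new MemoryHandler(new ConsoleHandler(), 1000, Level.OFF);

    static void registerPushHook() {
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                // May run AFTER the log manager's own shutdown hook has
                // closed the handler, in which case nothing is published.
                memoryHandler.push();
            }
        });
    }
}
```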
As noted above, however, shutdown hooks are a bunch of unstarted threads that
the JVM runs concurrently. There is no guarantee this shutdown hook will run
before the log manager closes the handler. (In my experience, the log manager
shutdown hook always runs first.) Closing a memory handler does two things.
The target handler is closed and the memory handler level is set to OFF, which
means that when the shutdown hook code above runs (assuming it runs after the
shutdown hook registered by the log manager), nothing is published and the
push() method completes normally. Thus shutdown hooks cannot be used
to invoke the push() method. The workaround is to invoke the push()
method before shutdown. In single-threaded applications, the push() method
can be invoked at the bottom of the main method. GUI applications should
invoke it in AWTExceptionHandler (or whatever you call your top-level
exception handler) as well as in the windowClosing(WindowEvent e)
method. See also Bug Id 4744270, “The Cleaner shutdown hook is application
code” (which is a bug that I submitted).
Numerics
0xF  463
0xf  463
0xFF  464
0xff  464

A
abnormal termination  561
abrupt completion  560
access  286
access mode  831
accessible  291
activation frame  686
actual argument  144
addends  444
ad-hoc polymorphism  587
alpha value  521
ancestor (tree data structures)  953
AND  454
anonymous loggers  954
applicable method  634
application namespace  251

B
backtrace  922
backwards recovery  860
baseclass  340
basic rule of protected access  303
binary branch  532
binary branching order  467
binary numeric promotion  413
binary operators  435
binary values  477
bit  477
bit flags  477
bit flipper (operator)  485
bit masks  463
bit set  479
bitset  477
bitwise operators  454
bitwise programmer  475
blank final  102
block  566
block-private  300
boolean equality operators  445

P
package-private  290
parameter specifier  144
parametric polymorphism  587
parents  953
partially qualified  253
pattern  948
pc register  695

R
read-only access  160
records  774
recovery  860
recurring end-user error  715
reference constructor  48
reference equality operators  445
re-inheritance  367

U
umbrella exception  731
unary numeric promotion  412
unary operators  435
uncaught exception  883
unchecked exceptions  699
unique number  950
universal data type  501
unnamed bit masks  482
unnamed namespace  251
unrecoverable checked exceptions  725
unsafe type conversion  599
unsigned byte  501

W
while (true)  531
widening conversions  597
window of vulnerability  162

X
XOR  455
Symbols
<clinit>  35
  See also class initialization methods
<init>  35
  See also instance initialization methods

A
anonymous classes
  using instance initialization blocks as constructors for  41
arrays
  array component initialization  39
  declaration syntax  35–37
  initialization  38
  initialization anomalies  55
  variable initializers for  68, 70

D
declarators  36–37

E
ExceptionInInitializerError  80

F
factory methods  50–53
field access expressions
  the five general forms of  35
fields
  field modifiers  36
five general forms (of field access and method invocation expressions)  35
forward references  56–61

I
initialization blocks  39–42
  must complete normally  41
  used as constructors in anonymous classes  41
inlined constants
  always appear to have been initialized  65–68
instance initialization blocks
  See also initialization blocks
instance initialization methods  35
instance initializers
  See instance initialization blocks
instance initialization
  See object initialization
instance variables
  memory allocation  38
interfaces
  interface constant modifiers  36

L
link variables  63
local variables
  declaration syntax  35–37

M
method invocation expressions
  the five general forms of  35
methods
  See also factory methods
modifiers  35
  illegal combinations  37
  interface constant modifiers  36
  order of  36
  See also access modifiers, static, final, transient, volatile, synchronized, native, strictfp

N
native  35

O
object initialization
  invoking overridden methods during object initialization  61–65
  StackOverflowError  70–71
  throwing checked exceptions during  ??–81

P
parameters
  See also constructor parameters

R
reference constructors
  See constructors