Java Generics and Collections – 2e!

Yesterday I heard that O’Reilly Media, publishers of Java Generics and Collections, have approved a proposal for a second edition. This news came from Zan McQuade, Programming Languages acquisition editor at O’Reilly, whose strategies for acquiring content include indefatigable patience—she calmly waited out two years of silence from me after I first pitched the idea to her. In fact, I’ve been thinking about it for much longer than that, starting from when I was writing Mastering Lambdas in 2014. Despite the title, that book is mostly about Java streams, which since Java 8 have been complementary to collections for processing bulk data. A half-book on collections seemed to need revision from that point on.

But that’s only half the book, and—it’s always seemed to me—the less important half. After all, the original USP of JG&C was its co-authorship by Phil Wadler, who was one of the originators of the “Generic Java” prototype that eventually led to the introduction of generics in Java 5 in 2004. Nowadays few Java programmers will remember how controversial and sometimes difficult generics seemed at their introduction, and how important it was to have an authoritative explanation of their peculiarities. But those have changed little in nearly two decades, even if Project Valhalla seems likely to alter that considerably at some point in the relatively near future, for some value of “relatively near”. Perhaps in another year we’ll know enough about Valhalla to be able to change the generics half of the book in line with it—or perhaps Zan will have to go back in two years’ time to argue for a third edition!

But meanwhile the Java Collections Framework has continued to evolve—perhaps without huge changes, but with enough to justify a revision. Much of this evolution has been adaptation to Java’s journey in the direction of a more functional style. A prime example is unmodifiable collections; although it’s more than four years since they were introduced in Java 9, many people are only now migrating to Java 11—or, equally likely, to Java 17. If the latter, they will encounter records too, so this seems like a good time for an explanation of how these different functionally-oriented features, as well as streams, can work together.
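
To give a flavour of how they combine, here is a minimal sketch of my own (illustrative class names, not drawn from the book): a record modelling a value, an unmodifiable list built with the Java 9 List.of factory, and a stream pipeline that collects into another unmodifiable list.

```java
import java.util.List;

public class FunctionalFeatures {

    // A record (Java 16): a transparent, shallowly immutable data carrier.
    record Point(int x, int y) {}

    public static void main(String[] args) {
        // List.of (Java 9) creates an unmodifiable list; calling add or set
        // on it throws UnsupportedOperationException.
        List<Point> points = List.of(new Point(1, 2), new Point(3, 4), new Point(5, 6));

        // A stream pipeline over the unmodifiable list. toList() (Java 16)
        // also returns an unmodifiable list, so the pipeline preserves
        // immutability from end to end.
        List<Integer> xs = points.stream()
                .map(Point::x)
                .toList();

        System.out.println(xs);    // prints [1, 3, 5]
    }
}
```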

Another reason for a new edition is the ageing of the Java Collections Framework. (Actually, in personal sympathy with this elderly API, I should probably say “maturing”.) It’s worn very well for an API designed in the last century. (On another personal note, I have to tip my hat here to Joshua Bloch, the designer of the JCF. At a time when I’d given up hope of ever getting the collections material into shape, he very generously provided an extraordinarily detailed, precise—and painful!—technical review, highlighting virtually every one of my many errors, and saving the collections material from disaster.) But JG&C was written at a time when the JCF was still only five years old. Nearly two decades on, we have the opportunity for a much more considered design retrospective and for a comparison with other collections frameworks, like Guava and Eclipse Collections, that have appeared since then.

I’m also looking forward to supplying two other elements absent from the first edition. One addresses a cause of increasing dissatisfaction for me with the collections half of JG&C: its discussion of the relative performance of the different collection implementations. I compared them there solely on the basis of their asymptotic (Big-O) performance, without providing any experimental results. That’s quite embarrassing now, after I’ve given so many conference talks on the difficulty and importance of accurate measurement when discussing performance. And since a (half-)book on collections is one place where such discussions are inescapable, I’m looking forward to providing data to back up the theory—or, more likely, to requiring its modification to fit with modern machine architectures.
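
For a flavour of what such measurement involves, here is a minimal benchmark sketch using JMH, the OpenJDK microbenchmark harness (the class name and parameter values are illustrative only). Asymptotically, ArrayList.get is O(1) and LinkedList.get is O(n), but only measurement reveals the constant factors and cache effects on real hardware.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class ListAccessBenchmark {

    @Param({"1000", "100000"})
    int size;

    List<Integer> arrayList;
    List<Integer> linkedList;

    @Setup
    public void setUp() {
        arrayList = new ArrayList<>();
        linkedList = new LinkedList<>();
        for (int i = 0; i < size; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
    }

    @Benchmark
    public int arrayListGet() {
        return arrayList.get(size / 2);   // O(1) indexed access
    }

    @Benchmark
    public int linkedListGet() {
        return linkedList.get(size / 2);  // O(n): traverses half the nodes
    }
}
```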

I’m feeling quite—perhaps foolishly—confident about this revision, very much in contrast to my feelings approaching the first edition. Much of that is due to already having a technical editor on board, the tireless Doctor Deprecator, Stuart Marks. Stuart is ideally placed for this, being the Oracle lead on the Java collections library. He was TE on Mastering Lambdas, so I’ve had the pleasure of working with him before, and he’s already provided a lot of the ideas in support of the book proposal, including some in this blog piece. If you’ve read this far (congratulations!) you’ll see there’s quite a lot of work to do, but with Stuart on the team I’m confident that we’re really going to produce something valuable for the working Java programmer.

Dijkstra and DevOps: Is Programming Doomed?


I don’t suppose that outside academic circles the name of Edsger Dijkstra is nowadays familiar to many people, but it should be: he was a giant of computing science who could claim, over the four decades of his research and teaching, extraordinary achievements that included the first Algol-60 compiler; the popularisation of structured programming (he coined the term); many important algorithms; and the concepts of mutual exclusion, semaphores, and deadlock, in work which practically invented the field of concurrent programming. In the 1970s and 80s, it seemed he had been everywhere, laying the foundations of almost every area of computing.

Dijkstra was always guided by a fierce determination that computing problems should be both formulated and solved in simple and elegant ways. His famous paper “On the Cruelty of Really Teaching Computing Science” argued that computer programming should be understood as a branch of mathematics. On software engineering, he famously wrote that “it has accepted as its charter ‘How to program if you cannot.’” People generally take this to be a condemnation of software engineering as a discipline, and he may have meant it that way when he wrote it in 1988. But at an earlier time he had used that term for himself, so it seems more likely that he was disparaging the people calling themselves software engineers. (For Dijkstra, disparaging people was actually his normal way of relating to them.)

What does this have to do with anything now? I’ve been thinking about Dijkstra recently in the context of my late adoption (late adoption, my life’s story) of DevOps and cloud technologies. Cloud technology has survived the hype cycle’s Trough of Disillusionment to become fully mainstream, and AWS, currently the leading provider, can be taken as representative. Does Dijkstra’s view of the principles that should underlie software engineering have any relevance to people working with it?

Two aspects of the AWS offering stand out immediately: the sheer number of its services, and the diversity of the abstraction levels they present to the user. For 2019, Wikipedia lists 165 AWS services, covering areas including computing, storage, networking, database, analytics, application services, deployment, management, mobile, and IoT tools. Even developer tools are included: AWS provides Cloud9, a web-based IDE for the most popular languages. The intention is clearly to provide every single tool or environment needed to develop, deploy, and maintain any application, from toy to enterprise scale. Amazon may have the first-mover advantage in this, but Azure and Google Cloud are in pursuit, with other heavyweights like IBM Cloud and Oracle Cloud not far behind.

How do these services relate to software engineering as Dijkstra would have liked to think of it? We can’t know what he would have made of the problem of mastering the complexity of modern enterprise systems, which are enormously larger and more complex than anything in his day, but we can guess that he would certainly have scornfully excluded from his definition of software engineering the many services devoted to cost management, governance, and compliance that bedevil an old-fashioned software engineer aiming for AWS certification. But most of the services that constitute a cloud offering are actually aimed at the classical problem that software engineering addresses: managing the complexity of large systems. They do this by providing a dizzying variety of ways to compose different operational and computational libraries.

And, in fact, these libraries only continue the tradition of ever-increasing abstraction and power provided to the client programmer — or DevOps engineer, as they will now usually be. I started my working life close to the bare metal, at the end of the assembler epoch, then moved on to coding well-known algorithms and data structures in a high-level language, then saw best-of-breed implementations of those algorithms and data structures incorporated into libraries that I could use, and then saw those libraries combined with compilers and runtimes to form a platform on which a client program sits, now many, many layers up from the hardware. So I can’t say that increasing abstraction is anything new.

What is new is the way that cloud services integrate operational concerns like provisioning, scaling, and failover into the same framework as traditional data manipulation. DevOps engineers can script the horizontal scaling policy for their application using the same development tools and in the same language that they use to implement their sorting algorithms. You could imagine that as the developer’s job has ascended in abstraction, so it has also broadened to devour the jobs of the application architect, the data centre designer, the DBA, and in fact almost everyone who ever had any role in the management or use of computers. (Another way to look at this would be that Amazon and its rivals are using automation to deskill these jobs to the point that even developers can do them, or simply consume them as services. Amazon calls this “democratizing advanced technologies”.) However we feel about the casualties of this kind of progress, I think you can reasonably argue that it makes sense to integrate all these functions into a single discipline of, say, systems engineering.
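
As a concrete illustration, here is a minimal sketch assuming the AWS CDK (v2) Java library; the stack class, construct IDs, and capacity figures are my own invention, not any particular deployment. It declares an auto-scaling group and its horizontal scaling policy in ordinary Java, the same language the application itself might be written in.

```java
import software.constructs.Construct;
import software.amazon.awscdk.App;
import software.amazon.awscdk.Stack;
import software.amazon.awscdk.services.autoscaling.AutoScalingGroup;
import software.amazon.awscdk.services.autoscaling.CpuUtilizationScalingProps;
import software.amazon.awscdk.services.ec2.InstanceClass;
import software.amazon.awscdk.services.ec2.InstanceSize;
import software.amazon.awscdk.services.ec2.InstanceType;
import software.amazon.awscdk.services.ec2.MachineImage;
import software.amazon.awscdk.services.ec2.Vpc;

public class ScalingStack extends Stack {

    public ScalingStack(final Construct scope, final String id) {
        super(scope, id);

        // A network for the instances to live in.
        Vpc vpc = Vpc.Builder.create(this, "Vpc").maxAzs(2).build();

        // An auto-scaling group of between 1 and 10 small instances.
        AutoScalingGroup asg = AutoScalingGroup.Builder.create(this, "Asg")
                .vpc(vpc)
                .instanceType(InstanceType.of(InstanceClass.BURSTABLE3, InstanceSize.MICRO))
                .machineImage(MachineImage.latestAmazonLinux())
                .minCapacity(1)
                .maxCapacity(10)
                .build();

        // The horizontal scaling policy: add or remove instances so as to
        // track an average CPU utilization of 50%.
        asg.scaleOnCpuUtilization("CpuScaling",
                CpuUtilizationScalingProps.builder()
                        .targetUtilizationPercent(50)
                        .build());
    }

    public static void main(String[] args) {
        App app = new App();
        new ScalingStack(app, "ScalingStack");
        app.synth();   // emit the CloudFormation template for deployment
    }
}
```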

Where next, then? Is the role of programmer doomed? Are Dijkstra’s innovations destined to become essential but forgotten, embedded deep within black-box systems that developers use without understanding – just like modern hardware, in fact? I’m forced towards that conclusion, though I continue to argue that we should value an understanding of the inner workings of our black-box systems: even when we can just about get by without it, having that understanding will improve our daily practice. And, of course, not everyone works on applications headed for the cloud; smaller-scale systems, including front ends of various kinds, will continue to be important. But more than anything, Dijkstra’s real principles, his relentless focus on abstraction and simplicity, remain as relevant as ever, to enterprise DevOps engineers as much as to embedded system programmers.