People of Language Design and Implementation

An interview project in conjunction with PLDI 2019  


Interview with Steve Blackburn

Steve Blackburn is a professor of computer science at the Australian National University. His primary research focus is on programming language implementation. He has published in venues that specialize in programming languages, memory management, operating systems, virtual machines, computer architecture, compilers, performance analysis and software engineering. He is a Fellow of the ACM.

 

BL: Why don't we start with you telling me about your life before academia.

SB: I went to school here and I didn't do very well in high school, not very well at all. I struggled. I eventually went to university and got myself into mechanical engineering, and then electrical engineering, and then electrical and computer science. That was a progression of difficulty to get into those degrees. That was at the University of New South Wales here. I did OK, but I really enjoyed my undergrad life. I did lots of things. I played lots of football, I was producing theatrical performances. I did lots of different things but not much of it was academic. Then there were two things that happened. I met this girl, who is now my wife, and she was very academic. She put me on the track a bit. The other thing that happened was that there was a professor who was running a great computer architecture course that really set me on fire. I ended up doing an honors project with him. In Australia, that's a year-long research project that you do as an undergrad. I just deeply enjoyed that. We ended up writing two papers as part of that project. It really opened my eyes to research. I spent the whole year in a little lab with a group of postdocs and PhD students. I had never seen that kind of work going on before. It was a real eye-opener. It showed me as I was doing it that this was what I really loved.

BL: What about that was so revealing for you at this early point in your academic career?

SB: Well, I realized that in research there was a creative dimension. I didn't do so well at school, or in the early part of university. There weren't a lot of opportunities for me to be creative. But in this position, I could really just go wild and I thoroughly enjoyed it. That's what got me into research, on the track to where I ended up.

BL: It's interesting that after not doing great academically to begin with, you've been able to narrow down the reason to this issue of creativity in the work that you were doing. It can take some intellectual maturity to see research as an outlet for creativity early on. How did you begin to see research as an opportunity to be creative?

SB: Well, I had this really hands-off advisor. I think I met with him two or three times during the year. I was working mostly on my own with a group of enthusiastic grad students and postdocs. It was a very open-ended problem. Somehow or other, I was able to see an answer early on. It was a hell of a lot of fun to go and solve this problem and implement a solution. It was an image processing problem that required developing an algorithm for real-time image processing. I had to develop the algorithm, and then I had to develop some hardware for it. The whole thing was just thoroughly fun.

BL: That involved hardware, too?

SB: Yes, I was in a hardware lab, so we had to do the whole thing.

BL: And your work going forward has continued to be a bridge between PL and hardware. You've published not just in PLDI, but also in other conferences, ISCA for example. Has this been an explicit guiding principle in your work?

SB: Yes, totally. When I went to do my PhD I was just following my wife. She got a PhD scholarship and I didn't. At the university where I went, ANU [Australian National University], which is actually where I've ended up back at again now, there was no one doing computer architecture. I have a feeling that had there been anyone doing architecture there back then, I probably would have stayed with that, but the opportunity didn't exist. My interest in architecture back then may help to explain the fact that my work bridges that gap. In fact, I had gotten several offers at the time to go and do a PhD that would have been more on the hardware side. I had an offer from Michigan with someone named Anil K. Jain, who was quite famous in image processing. That could have worked out, but it didn't because there wasn't anything there for [my wife] Anneke. So we ended up going to Australia. I very nearly did end up going down the hardware route, but it didn't work out.

BL: So when you eventually did get to grad school, did you dive right in and start working on managed languages research?

SB: No, not at all. I couldn't do architecture and I hadn't even thought about memory management. I went into a pretty niche area that just happened to align with someone who was willing to advise me. There was actually a pivotal moment when I just couldn't get a scholarship to do my PhD because my undergrad grades were pretty terrible. Then, one day, a guy called Dave Thomas [of Object Technology International] showed up; he was a buddy of the guy who became my advisor. I ended up having a chat with him for about an hour and he provided the scholarship. It was really pretty serendipitous. Without Dave showing up, it didn't look like I was going to get myself a PhD scholarship.

BL: So there was a moment where you weren't sure you'd be able to go through with it because of funding?

SB: Yes, I was hungry for it, but I couldn't get funded. The stars weren't aligning, but then that happened and that was a real key moment.

BL: And the project grabbed your imagination, too?

SB: Well, it was really fitting in with what was available. I worked on orthogonal persistence, which is an interesting little corner of the programming languages world. It has that same nice twist of being fairly systemsy, but also having some nice abstractions. You have a nice abstraction of persistence, and you have a back-end system that actually makes things persistent. Back in those days the question was how to have a language that had an abstraction of persistence, but then also a system that could efficiently persist the stuff, which back then meant writing to disk. Orthogonal persistence was my main thing during my PhD. To get synced up with my wife, after my PhD I did a brief postdoc at ANU. While I was there, I met Eliot Moss and Tony Hosking. They were also working on the whole orthogonal persistence thing. After getting to know them a bit, Eliot offered me a postdoc position at UMass [University of Massachusetts]. My wife then secured a postdoc there, too. Once we got to UMass, that was when I started really working on GC.

BL: That was around 2000?

SB: Yes, it was 1999. There was this student in the lab named Darko Stefanovic and he did this very interesting GC work. I guess I arrived in the lab that October and Kathryn McKinley and Darko presented the work in Denver at OOPSLA. I remember that because someone in the audience asked a question after the talk: "Why are you doing all of this GC work in simulation?" Back in those days it was very hard to get any meaningful language environment where you could actually modify the GC and get any real results. So the answer was that we didn't have access to a real JVM where we could do the experiments. The person who asked the question was, I think, Vivek Sarkar or another one of the key people inside of IBM. He went on to say that they were developing this new JVM inside of IBM. Kathryn phoned me right away to talk about that and said that there was an opportunity to work on this JVM, which at the time was called Jalapeño and eventually became JikesRVM. I briefly became a staff member at IBM because at that point it was closed source. I did the port of that JVM to Linux. We open-sourced it and that became the center of everything that I worked on at UMass. The big thing behind JikesRVM, the big motivator, was all about how to deliver Java at scale. Everything was about scaling and it was in large part motivated by the architecture.

BL: That seems like a classic formula for research. Changes in hardware motivating changes in software and advances in software motivating changes in hardware.

SB: Well, it depends on how speculative you want me to go. I am really excited about the idea of doing processing in memory. I think there's a lot of opportunity there. Looking more to the mainstream, I don't see one single thing that has driven the GC community. One thing that does stand out is the need of the GC community to deal with pauses. That doesn't have so much to do with computer architecture; it's really about the domain. It's all about managing tail latency. One of the most important domains where we use GC is in large-scale servers. Companies make a lot of money off of the services that run on these servers. There's been a lot of effort in the research community and in industry to limit the impact of GC pauses on their tail latency. This problem of eliminating milliseconds of GC pauses from the tail latency of these large-scale services remains one of the biggest problems that we have in the GC community.

BL: Your work has been at the intersection of PL and systems, with a clear focus on managed languages and memory management, for a long time now. That world has changed significantly since you started working in that area. What is the biggest change that you've seen during your career?

SB: In terms of GC, I think the biggest change is the adoption of managed languages like Java, Python, and all the other scripting languages out there. Back when I started, the only managed language that anyone used was Java, but now that has completely changed. The widespread adoption of managed languages has completely changed GC research. It's moved the work from being fairly niche to being fairly mainstream.

BL: It has also created a lot of new programmers.

SB: Yes, raising the level of abstraction has really made coding more accessible. It was Ken Kennedy, I think, who said "abstraction without guilt", and I kind of latched onto that as an anchor. How do you deliver some nice, high-level abstraction without all of the overheads? I assume that came from his work on High Performance Fortran. It has always been an aim of my work to deliver that kind of usable, high-level abstraction without any overheads. Can we get the overheads down to zero?

BL: Zero-cost abstraction is still an important principle. I saw a talk by some Rust developers recently and they were describing their aspiration to zero-cost abstractions, too.

SB: Yes, it's the same idea, but of course they're doing it very differently in Rust.

BL: What do you think the cross-stack nature of research has done to work in this area?

SB: Frankly, I think that the cross-stack nature of these problems has actually been stifling because it is confounding. There are only a relatively small number of groups that have been working on this kind of cross-stack work because it is really hard. You have to have such a breadth of understanding and evaluating these kinds of systems is really hard. Recreating an environment that does all of the stuff at all of the layers of the stack for experiments is really hard, too. Another thing about this area is that the people who really care about these results are the big companies. A lot of researchers decide that they just want to steer clear of that and not play in that sandpit.

BL: I find it interesting that you see the need for broad expertise in cross-stack research domains as stifling to PL research. Do you think there is a way to address this issue in the PL community?

SB: There are a lot of dimensions to the problem. One dimension is the education of future students. One of the things that I loved when I was at UMass was what they call a "synthesis project", and I don't know how common that is. The idea is that you do a project across multiple sub-disciplines. I think that is really healthy. I remember there was a student doing register allocation and machine learning. There was another student doing theory and memory allocation. I think encouraging students to have a really broad yet deep understanding is really important. Another thing for me is the research community embracing the idea. I think my favorite conference is ASPLOS because it brings together this great intersection of all of these different research areas. I guess I shouldn't say that in the context of this interview. [Laughter] PLDI is of course great, too. I think it is important that we make sure that PLDI opens itself up to things that fall more broadly into systems, architecture, and related areas. I think that if you look at the recent history of PLDI, you'll see that the conference has actually opened up to more of these topics, and that's really pleasing, and also necessary. I don't think you're going to get to the bottom of these really hard problems in our community unless our community does open itself up like that.

BL: Are there other areas that have the same problem?

SB: My experience with the computer architecture community initially was that they had the same problem in some ways, but in reverse. I know that at least back in the day, a lot of effort was spent trying to optimize for [the] SPEC CPU [benchmark suite], which is a very narrow view --- I'm caricaturing here --- of what the software world was. I think that may have really stifled computer architecture research. I don't think that is as much of a problem now, but it was a problem in computer architecture research for a long time. Again, that's what makes something like ASPLOS so fun, where you see the intersection between these areas. I think that to some degree there is a natural intellectual laziness, regardless of the field. It's easier to do something in a single discipline. Like, if you're an architect, working with managed languages is much harder. Using something like SPEC CPU is great because you're simplifying. I think that complexity may have held people back. So instead of getting all of the infrastructure set up to do work on managed languages, some researchers might be saying, "well, if I can get a paper published without doing all that extra work, why would I bother?"

BL: It makes me think of the artifact evaluation process. It's difficult enough to make it all work once, and it's even more difficult to make everything reusable.

SB: Matthias Hauswirth and I ran one of the first Artifact Evaluations. There have been quite a few people, Jan Vitek, Shriram [Krishnamurthi] and others, who helped to get artifact evaluation rolling in the PL community in the first place. I've always been keen on the idea of releasing software, reproducibility, and openness. I think that is partly for the reason that we're talking about. If you make an artifact open to the whole community then reproducibility and the credibility of the research go up. In fact, the DaCapo benchmark suite is a good example of this. DaCapo is a bit like the anti-SPEC CPU. We had this major NSF ITR grant and we complained at the midterm evaluation that one of the big things holding us back was the lack of good benchmarks. They said "Great! Go build one!" That was what started that whole project. The name of the research project was DaCapo and so the name of the benchmark suite became DaCapo. The idea was that you need realistic workloads and you need stuff to be open so we can all move forward as a field. If you have simplistic workloads and people aren't making their work available, it can really hold the field back. I've been really passionate about this idea for a long time. To build and maintain infrastructure like this takes an incredible amount of work. It always has been hard, and we were very happy to see the community embrace DaCapo. It was very encouraging to see the community use what we built to that extent.

BL: It sounds like your time at UMass had a pretty significant impact on your career.

SB: I think the trip to UMass was really influential. I was locked in a pretty narrow world here [in Australia] and I think that trip really opened my eyes. I began working with Kathryn McKinley, which really shaped a lot of the work that I did for the rest of my career. We're still working together; in fact, we're writing a book right now. That trip had an enormous impact. The postdoc at UMass was transformative and it was a fun department to be in at that time. The opportunity to travel the world and go to a lot of conferences was really great.

BL: Did you feel somewhat isolated, working from ANU before you went to UMass?

SB: Yes, it did feel somewhat isolating. Actually, right before I finished up my postdoc at UMass, the then department chair, who happened to be an Australian, called me into his office and gave me this generous but somewhat paternalistic talk about how I shouldn't go back to Australia because it wouldn't be good for my career. That talk actually stoked the fire in me. My family was there, and my wife and I both wanted to go back. That talk did make me think about how to make my career work in the relative isolation that is Australia. Australia has some strength in computer science, but not in the areas that I was working in. Until Tony Hosking came over, I felt like I was working in my area somewhat on my own for about a decade or so, I suppose. That isolation was a little tough.

BL: Are things changing now?

SB: Yes, things are changing, the environment is changing and I'm actually quite optimistic. Back then, however, the isolation made things a bit difficult. I made up for it by doing a lot of video conferencing. For a while I was video conferencing every morning to stay in touch.

BL: Did you find that the isolation had an impact on your personal or family life? I imagine that you'd also have to travel a lot more. How has it been to manage that throughout your career?

SB: That has always been tough, in particular when you have two academics in the family. Part of it has been having a very supportive wife, and also trying my hardest to support her as well. For travel, I have always had a rule, which was that I didn't go away for more than five days. That's kind of insane because most of my trips are to the US and two of those days have to be travel, so I'm only on the ground for about three days. I would just have to pack a hell of a lot in: go to a PhD defense, visit industry, and go to a PC meeting. That was one thing I did to try to stay connected while having a family. If I hadn't had a family, I could just run off for a week or two at a time and it wouldn't make any difference, but obviously I can't do that with a family. I did some interesting things with scheduling to make it work, like with exercise. For many years, I would get in my exercise by jogging to work with a stroller, and later with a bike trailer that I would use to take the kids in. I would find ways to get exercise that fit in with the daily rhythm of having children and at the same time gave my wife space. The running was especially good because if I took the kid in the stroller running for an hour, the kid would get a snooze and my wife would get a break. All of my other forms of exercise disappeared and I just switched to that, putting two things together.

BL: OK, one last question: thinking back through all of your career and maybe PL history, what is your favorite result?

SB: There was a line of work, and I don't remember exactly which paper. There was a moment when I was visiting St. Andrews as a grad student and I was poking around through some old OOPSLA proceedings. I found one of the original papers on the language Self. I remember digging through a lot of papers on Self and it had a significant impact on me. I think it also had an impact on the field, but in particular, it really did have an impact on me. It really got me interested in managed languages and the whole collection of stuff I spent a lot of time on. I remember reading and thinking that they were really doing something audacious, taking something that would be ridiculously difficult to make work efficiently, and they made it work. At the time it was not clear: the simplicity of it was great, but how can you possibly make it perform? It was really the progenitor of a lot of work that we now see and consider quite normal, things like the insides of languages like Python and Java and so forth.

BL: Did Self have a lot of influence on your work?

SB: It did, and some of those ideas come through in my work on MMTk. Building that software environment was really important because it provided people with a toolkit where they could really rapidly experiment with GC ideas. That's the thing I really feel most proud of in my career. I think it helped me personally, but I think it helped the broader community as well. We talked earlier about PL and architecture as being two distinct but connected fields that I'm interested in. I think that PL and software engineering are two others, and that is really what MMTk is all about: software engineering. We got an ICSE paper on that, about how you build this really modular thing that's really flexible and performs well. I find that really fun to think about. We're rebuilding MMTk right now and at a personal level I find it very exciting to build something like that which is really general, really flexible, and performs well. I think that's what threads back to Self, which shared those goals and inspired me to do that kind of work.