Shoaib Akram is a Ph.D. student in the Department of Electronics and Information Systems at Ghent University in Belgium. He is broadly interested in programming language implementation, runtime systems, and system software for emerging hardware. He is particularly interested in breaking the boundaries between different layers of the computing stack - the user application, the runtime environment including the OS, and the core microarchitecture - to cooperatively improve the performance and efficiency of computer systems. He will soon join the College of Engineering and Computer Science at the Australian National University as an assistant professor.
MZ: Tell us about yourself. How did you get to where you are?
Shoaib: I am originally from Pakistan. My parents live in Pakistan, and my two sisters practice medicine in the US. I had my first experience with a computer in 1996 when I was in 10th grade. A family friend brought us a computer as a gift on his return from the USA. Soon, I started fiddling with markup languages. I was curious how web pages get rendered on the World Wide Web. I found an HTML tutorial to build cool web pages. I was impressed that, using only text as input, I could change how images appear on the screen. I started exploring other programming languages. I knew I wanted to play with computers at that stage in my life.
MZ: Did you end up studying computer science to pursue your interests in college?
Shoaib: Computer science was a less-established discipline in Pakistan around 1998 when I finished high school. Few schools offered world-class education. The University of Engineering and Technology (UET), the most prestigious engineering university in Lahore, had a popular electrical engineering program. To be enrolled in that program, you needed to be at the top of your provincial batch. Infrastructure-wise, the university was well equipped. My dad advised me to go to that university and study electrical engineering.
MZ: That seems like it was a good decision. How was your college experience?
Shoaib: My undergraduate degree focused on mathematics, electronics, and control theory. Just like HTML, FPGAs intrigued me after a course in digital logic design. Once again, it enthralled me that a Verilog program can customize the FPGA fabric. I taught myself about FPGAs and EDA tools. Computer engineering jobs were rare in Lahore in those days, and I wanted to advance my skills. I knew I had to move elsewhere to achieve these goals.
MZ: And right after that, you went to graduate school?
Shoaib: I received a Fulbright Scholarship to attend graduate school at the University of Illinois at Urbana-Champaign. I took several graduate courses in computer architecture at Illinois. I worked on interconnection networks for multicore architectures with Deming Chen and Rakesh Kumar. In a similar spirit to FPGA-based fabrics, the network my thesis proposed uses reconfigurable logic to build scalable networks for multicores.
MZ: Sounds like a pretty fruitful experience! After that, you got involved in doing more research?
Shoaib: After finishing my MS, I went to the Foundation for Research and Technology - Hellas (FORTH) in Greece. FORTH was an excellent place to do systems research with many good researchers. I joined a group that focused on storage technology. We fiddled with the Linux kernel, buffer cache, hypervisors, VFS, and specialized file systems. My specific emphasis was on performance characterization of I/O-intensive workloads. How do I/O overheads scale with thread counts? How do performance bottlenecks shift when the storage technology changes? And other similar characterization puzzles. Overall, I learned a lot about systems software during that time. During a HiPEAC event in Paris, I met Lieven Eeckhout, who at that time was doing similar things, but in the CPU space. We talked about mutual interests, and there was an instant connection.
Lieven asked me to join his group at Ghent University for a Ph.D. At that point, I was building characterization and visualization tools to help people understand scalability bottlenecks. In my mind, I would further explore the scalability of parallel applications, an area Lieven was actively working on. On the other hand, Lieven had students working on scheduling for heterogeneous multicores. I met his post-doc researcher, Jennifer Sartor, who did her Ph.D. with Kathryn McKinley and had a background in garbage collection. The three of us got together. My first Ph.D. project was how best to schedule concurrent garbage collection on heterogeneous multicores.
MZ: That's quite a long and interesting journey! I now see that you have an interesting background to tackle problems with this holistic view of the entire system stack.
Shoaib: Yes, this is how it all happened. In subsequent years, we looked into DVFS for Java applications. Then came my most important work: garbage collection for heterogeneous memories. This journey brought me to the intersection of computer architecture and programming languages, and more recently, language runtimes and garbage collection for emerging memory systems.
MZ: What type of emerging memory systems are you looking into?
Shoaib: I am looking into heterogeneous or hybrid memories. With DRAM technology facing scaling limitations, hybrid memory is the way to advance the capabilities of memory systems. Production systems already use a combination of DRAM and another persistent memory technology.
MZ: What specific problems are you targeting?
Shoaib: We need an emerging memory technology, such as phase change memory (PCM), for capacity expansion. The problem with phase change memory is its limited endurance. After a finite number of writes, PCM cells wear out, meaning they no longer work correctly. We need DRAM for performance and endurance, and PCM for capacity. Once you have a memory system like that, you have to manage it. You have to place highly written data in DRAM. We looked at OS approaches to manage hybrid memory at the page level. OS approaches have inefficiencies, such as page migrations, that cause TLB shootdowns. Our idea is to use the garbage collector in the Java Virtual Machine to put highly written objects, instead of entire pages, in DRAM. Everything else that is not highly written ends up in PCM. You see, programmers prefer using Java or other managed languages because these languages are object-oriented and improve productivity. We found that most writes happen to a few objects. Our PLDI 2018 paper proposed the first write-rationing garbage collectors that use DRAM for highly written objects. One thing that garbage collectors do is move things around in the virtual heap memory. They move things around to reduce fragmentation and to manage the heap. A write-rationing garbage collector piggybacks on these existing mechanisms to move objects to DRAM or PCM.
MZ: This is one of those ideas that, in hindsight, seems like the way it should have worked all along. Garbage collection is already tracing objects and moving objects. Why not take advantage of it and extend it to also solve the endurance problem?
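To make the placement idea above concrete, here is a toy Python sketch. It is not the collector from the PLDI 2018 paper and not JVM code; the `Obj` and `Heap` classes, the `WRITE_THRESHOLD` value, and the choice to reset write counters at every collection are all assumptions made up for illustration. It only shows the general pattern described in the interview: a write barrier counts writes per object, and the collector, which already traces and moves live objects, uses those counts to decide whether each survivor lands in DRAM or PCM.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff: objects written at least this often between
# collections are treated as "highly written". Not a value from the paper.
WRITE_THRESHOLD = 2


@dataclass
class Obj:
    name: str
    writes: int = 0                      # writes seen since the last collection
    refs: list = field(default_factory=list)
    space: str = "nursery"               # "nursery", "dram", or "pcm"


class Heap:
    def __init__(self, threshold=WRITE_THRESHOLD):
        self.threshold = threshold
        self.roots = []
        self.pcm_writes = 0              # writes that landed on PCM-resident objects

    def write(self, obj):
        """Write barrier: record the write and note whether it hit PCM."""
        obj.writes += 1
        if obj.space == "pcm":
            self.pcm_writes += 1

    def collect(self):
        """Trace from the roots, then place every live object by write count."""
        live, seen, stack = [], set(), list(self.roots)
        while stack:
            obj = stack.pop()
            if id(obj) in seen:
                continue
            seen.add(id(obj))
            live.append(obj)
            stack.extend(obj.refs)
        for obj in live:
            # Piggyback on the move the collector performs anyway.
            obj.space = "dram" if obj.writes >= self.threshold else "pcm"
            obj.writes = 0               # assumption: counters restart each cycle
        return live


# Usage: one hot object, one cold object, one unreachable object.
heap = Heap()
hot, cold, garbage = Obj("hot"), Obj("cold"), Obj("garbage")
hot.refs.append(cold)
heap.roots.append(hot)

for _ in range(5):
    heap.write(hot)
heap.write(cold)

survivors = heap.collect()
```

After `collect()` runs, `hot` sits in the simulated DRAM space, `cold` sits in PCM, and `garbage` is never visited because nothing reachable points to it. Placement works on individual objects rather than on whole pages, which is the contrast with the OS-level approach drawn in the answer above.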
MZ: What's your approach as a system and PL researcher?
Shoaib: I like building systems, but I don't just blindly build a new system because it's cool! I want to start with uncovering new insights, for example, sources of inefficiency in existing systems; in fact, really pinpointing them using a combination of tools, such as instrumentation, simulation, or analytical models. Once I discover there is inefficiency, I like to mitigate the problem in the simplest possible way without changing the programming model, without burdening the programmer, and without hardware changes. That is why I meddle a lot with runtime systems that sit between the application and the operating system. The runtime system can act as a bridge between the different abstraction layers.
MZ: So you really get your hands dirty. From your perspective, what are good programming language problems to tackle?
Shoaib: Good question! I think that the evolution of programming languages is very closely tied to the advancements in hardware technology. As we got more transistors in the 1950s and 60s, we started to move from FORTRAN and COBOL towards C and C++. As Moore's law continued delivering more transistors, our languages evolved towards Java and C#. At this point, our hardware is changing fast. Hardware is evolving towards heterogeneity, new types of accelerators, FPGAs, hybrid memory systems, and all that. In the long run, the tools we give programmers to exploit emerging hardware should advance. Simply changing the programming models overnight is not realistic. Programmers cannot be expected to learn a different language or a different programming model overnight. We have to produce automatic tools and systems. In my current research, I look at where the hardware is heading, and that is where I pick new and challenging problems. What are the implications of new memory and storage technologies for our software systems? As I said, I do some initial benchmarking using a combination of tools to get a sense of the potential for improving a metric or productivity. Sometimes you go with intuition, or with the encouragement from experienced researchers that the problem is interesting. Sometimes things are fuzzy at the start. And so, there will be failures, but that's fine (laugh). No matter what you do, or what tools you use to pick the right problem, there will be failures, both in industry and academia. You have to believe that in the process, you will learn new things, and that may open other doors.
MZ: Completely agreed. I wonder whether, in this process, you have found someone whose work is particularly inspiring?
Shoaib: I have two things that come to mind. I think the transactional memory work from Eliot Moss was a seminal work that got people thinking about how to do synchronization efficiently. Also, thinking broadly, Leslie Lamport's seminal work on event ordering in distributed systems. Whether we realize it or not, we are using these ideas, and they are coming back again and again - some of the original work that is right now very applicable.
MZ: Do you have any interesting stories to share from your PLDI experience?
Shoaib: Anyone reading this interview should know the importance of giving great PLDI talks. After my first PLDI talk in 2018, which I practiced an infinite number of times with my three mentors, I remember a senior researcher in our community came to me, shook hands, and told me he enjoyed the talk. I am on the faculty job market this year, and I have an offer to join the faculty of his department as an assistant professor. Many young researchers don't pay as much attention to polishing their talks as they do to polishing their camera-ready drafts. Do not underestimate the power of an excellent PLDI presentation. It can bring in life-changing opportunities.
MZ: You mentioned having three mentors. It seems you are very good at establishing mentorship. What is your method?
Shoaib: Junior Ph.D. students should seek mentors outside of their schools. Your advisor and your department are only one data point in a vast space. Look around, go to conferences, and talk to people. You can sometimes get help in advancing your research from unexpected sources. Please don't feel shy about asking new people how they conduct research and how they go about their business. Some people say, "refrain from talking about your research before it is mature because others might do it before you." On the contrary, communicating my half-mature research ideas with others helped me a lot. I was lucky to always receive critical and useful feedback on my work. We should remember that, first of all, people have their own ideas to work on. Secondly, it's not easy to build a unique system from scratch. It's good to get feedback on what you are doing, and that may trigger an interest in the other person to collaborate with you, which will help you if he or she is an expert in your area. When you attend a conference, you should prepare a story about your ongoing research. This is sometimes called the elevator pitch. Deliver the elevator pitch to get attention. Tell them what you're doing. This approach worked for me.