Mike He's Homepage

Summary of 2021

2021 was a hard year, not only because of the pandemic but also because of the amount of work I had. For research, I mainly worked on 3LA with Zach, Steven, Gus and Vishal this year (continuing the work from last year), and fortunately, I got some achievements out of it. My coursework went as usual, but there are a couple of interesting courses that I would like to talk more about, specifically a philosophy course about logic and a computer science course about datacenter systems. On the academic side in general, a lot more happened this year because I am applying to graduate schools. Among these, I want to note something about teaching, conference service and my takeaways from the Ph.D. applications.

Since research is one of the most important things, I will talk about it first. 3LA is a collaborative project whose team members span three institutions: UW, Harvard and Princeton. The goal of 3LA is to replace traditional HW/SW APIs with a formal interface, which models the semantics of the hardware and bridges it with the compiler that maps programs written in high-level DSLs to the target hardware. We built the abstraction of the formal interface using Instruction-level Abstraction (ILA) from Princeton. My work on this project involved writing a JIT compiler from Relay to ILA and the Flexible Matching algorithm for workload discovery. Our paper is currently under review at PLDI'22. Additionally, I received an honorable mention for the CRA Outstanding Researcher Award for my work on this project (and on DTR from last year). The collaboration was a fun part of 3LA because the pandemic kept us from working in person. In fact, team members from different universities had not met face to face until the SRC ADA Fall Symposium. Speaking of which, I attended the symposium and made my first domestic trip in the U.S., flying from Seattle to Detroit and taking a car to Ann Arbor. The event was pretty dense and interesting, though we mainly worked on the paper push on the first day. The ADA center people are very hardware focused, which made it a bit hard for me to follow the pace and the topics at first, but thankfully they also seemed interested in the SW/HW interface part, which I could say something about. Another thing that surprised me was that I met two Ph.D. candidates from UMich who also attended Beijing National Day School (BNDS, my high school). I hope that I will be able to participate in the next one, too, since it will be the last symposium of the ADA center.

Because I set aside more time for research, I didn't take many courses this year. However, there are two of them I'd like to talk about. The first one is PHIL 471 Advanced Logic, offered by Professor Michael Townsend from the UW law school. I've heard that the course was approved because Bruce (one of my roommates, a math major) petitioned for it with a philosophy professor. The prof was not convinced there would be enough students enrolling in the class, but Bruce said there would be at least 4 (all from our apartment lol), so the professor agreed to proceed with the petition. Luckily it was offered and I took it in the spring quarter. The course was an extension of PHIL 470 (intermediate logic, which covered completeness and consistency of FOL), and the content mostly covered the pessimistic results of mathematical logic from the last century: undecidability, Gödel's incompleteness theorems, Hilbert's 10th Problem, etc. It was a hard course but fun to take, and I would recommend that all computer science undergraduates take PHIL 470 and 471 if they are ever offered again, because they provide many insights into the fundamental theory of computer science (e.g. the effect of self-reference). The second course I want to share is CSE 453 Datacenter Systems. It is a new course offered by Prof. Tom Anderson. I liked it a lot even though I didn't go in person very often. It is probably the first undergraduate course at UW taught using the Rust language, and I would like to see more systems courses use this language because features such as the borrow checker and lifetimes are extremely helpful when writing programs, especially those building up complex systems. I used Rust intensively this year for my research, which leverages the egg library for equality saturation, and I think it will become more popular in the next few years because it provides more compile-time safety guarantees for systems programs.

I worked as a TA for a graduate PL class this spring, which was also the first time I taught students. The teaching experience was more interesting than I expected. I held office hours on Sundays, and a lot of PMP students came to my sessions (because mine was the last one before homework deadlines :P), so chatting with industry people about PL was very fun. I also did my first conference service at MICRO'21, where I was on the AEC. The artifact I evaluated involved training a BERT model, which couldn't fit on the GPU in pipsqueak (a PLSE lab machine), so I borrowed a bitcoin miner from my friend (Pengfei He) and used his RTX 3090 to do the evaluation (though I ran the training script simultaneously on CPU on pipsqueak). Synchronizing with the PCs was a very fun process. The other reviewer gave them a top score for reproducibility, but I tried several times and failed to reproduce their results, so I gave them the lowest score. A week before the evaluation deadline, a PC member came to me on Slack and asked me about this issue, and then I contacted the authors. They managed to identify several bugs in the training script (including misconfigurations of epochs, data directories, etc.). It was a pity that I tried several fixes from the authors but still failed to get the results :/ but generally, their work was very cool: it did sentence-level energy consumption optimization for BERT inference on edge devices.

I put the Ph.D. application last because it is the most important one to talk about. I started preparing around August by drafting my statement of purpose (SoP) and sending out requests for recommendations. For the SoP, I would recommend asking others for some references, because those are good samples for format, structure, potential content, etc. Generally, the SoP is for showing research interests and related background (i.e. research experience), so the main content of my SoP is DTR and 3LA. I got the CRA award nomination in late September, and I built up my SoP and PS through the material preparation for the award selection; later, thanks to the PLSE reading group, I was able to ask my colleagues for help editing the draft. I applied to 10 institutions, which according to my colleagues is twice as many as most of them did, and it was a bit costly. My dream school is still CMU (I applied but was rejected when applying for my B.S. degree) because the research in the Catalyst group matches exactly with my current research interests. At the same time, all the others I applied to are also great matches with my background and future research. The only concern I have is ending up with no offers, because the schools I applied to are highly selective, but we'll see...

I would like to thank all my friends, teachers and colleagues who helped me along the way with graduate school applications. In particular, I want to thank Prof. Zachary Tatlock for advising me on various research projects and providing me enormous support in doing research. I want to thank Prof. Sharad Malik and Prof. Aarti Gupta from Princeton University for their recommendation letter and collaboration on the 3LA project. I want to thank Dr. James Wilcox for his advice on learning PL and research and for his recommendation letter. I also want to thank Steven Lyubomirsky for advising me and helping me with the paperwork for applications and the recommendations to Intel. I couldn't have had those nice experiences and applications without their support. Last and most importantly, I would like to thank my parents for their understanding and support for my studying in a foreign country, especially during the hard times of COVID-19. I miss you so much! Thank you all!

Goals for 2022

2022 will be a very dense and busy year for me, whether or not I end up going to graduate school. For simplicity, I will bullet point my goals here:

Application decisions come in late Jan and early Feb: make a decision, and then plan the visit days

PLDI'22 rebuttal comes in mid-Feb: if the paper is likely to be rejected, we will need to prepare for a resubmission (potentially ASPLOS)

We have ongoing research on formal verification for DSL to DSA compilation, in collaboration with Princeton folks. We are aiming at OOPSLA

I will probably help with another conference paper on Glenside, or with Relax (Relay-next)

I will probably be joining Intel as a research intern in spring 2022 doing formal verification

I will be joining Taichi Graphics as a compiler dev intern in summer 2022 (I have accepted the offer)
