This thesis explores the current state of graph compilation techniques for machine learning applications. I present a subset of these optimizations that, both theoretically and empirically, have the most striking effects on runtime performance under various conditions. I also propose a novel approach to evaluating optimization effectiveness and apply it to a selected set of object detection models. Finally, I discuss how the choice of techniques changes under different compute paradigms.