Object detection and classification are hard problems that have been a focus of both visual neuroscience and computer vision research for decades. Remarkably, humans are able to rapidly and accurately locate object instances of a superordinate category. In this paper, I lay out the theoretical framework for a task-contingent visual attention system operating under the strict time and energy constraints of an autonomous robot. I introduce a two-pass model in which a modified version of the neuromorphic HMAX model (Riesenhuber & Poggio, 1999) conducts a computationally feasible, coarse-grained preliminary search of a large scene before localizing task-relevant image regions to be robustly processed by standard HMAX. I also implement a system called SIEVE that pairs the elements of this model with large natural images from the SUN2012 Database (Xiao et al., 2010), creating a testbed that improves on canonical computer vision datasets while providing straightforward model-testing and visualization functionality.
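To make the two-pass control flow concrete, the sketch below outlines the coarse-to-fine ordering in Python. It is a minimal illustration, not the SIEVE implementation: the names `coarse_model`, `fine_model`, `grid_size`, and `top_k` are hypothetical placeholders, and the region-scoring interface is an assumption made for exposition.

```python
import numpy as np

def two_pass_detection(scene, task, coarse_model, fine_model,
                       grid_size=64, top_k=5):
    """Hypothetical sketch of a two-pass attention pipeline: a cheap
    coarse pass scores candidate windows for task relevance, and only
    the top-scoring windows receive expensive fine-grained processing
    (standard HMAX, in the paper's framing)."""
    h, w = scene.shape[:2]
    candidates = []

    # Pass 1: coarse-grained preliminary search over large windows.
    # coarse_model is assumed to return a task-relevance score per window.
    for y in range(0, h - grid_size + 1, grid_size):
        for x in range(0, w - grid_size + 1, grid_size):
            window = scene[y:y + grid_size, x:x + grid_size]
            score = coarse_model(window, task)  # cheap, low-resolution features
            candidates.append((score, (y, x)))

    # Localize the most task-relevant regions for the second pass.
    candidates.sort(key=lambda c: c[0], reverse=True)
    selected = candidates[:top_k]

    # Pass 2: robust fine-grained processing restricted to selected regions.
    detections = []
    for score, (y, x) in selected:
        window = scene[y:y + grid_size, x:x + grid_size]
        label = fine_model(window)  # full feature hierarchy, expensive
        detections.append({"location": (y, x), "label": label,
                           "coarse_score": score})
    return detections
```

The design point the sketch captures is the resource trade-off: the expensive model never sees the full scene, only the handful of windows the cheap model flags as task-relevant.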