All 50 states use observations to evaluate practicing teachers, but we know little about how administrators actually reason when they use observation protocols. Drawing on think-aloud and stimulated recall data, this study describes the types of strategies and warrants practicing administrators used when rating with their district’s observation protocol. Administrators in a large urban district used an observation protocol aligned to Danielson’s Framework for Teaching to rate a brief lesson clip. Administrators’ thinking was recorded, clarified, and inductively coded. Findings suggest that administrators’ thinking and justifications are complex even for short segments of instruction. Administrators used a range of reasoning strategies, many of which were not sanctioned by their training. Exploratory analyses suggest strategy use was not related to the accuracy of ratings. Implications for the validity of teacher observation scores in high-stakes settings are considered.