Abstract - We validated the symmetry and leniency heuristics used to evaluate procedurally generated content in the Evolutionary Dungeon Designer (EDD), a mixed-initiative tool. We did so by analyzing how the values produced by these heuristics differ from how human players perceive the same properties. To collect the data needed for this comparison, we performed a user study for which we developed a game that let human testers play through different dungeons. Based on the results, we propose potential improvements to the EDD metrics intended to represent difficulty and aesthetics, so that they better match their intended goals. In general, the testers found the maps to be close to the expected difficulty, but there was a large discrepancy between the symmetry metric and how aesthetically pleasing they found the maps. We further discuss how this research could be extended to improve automatic evaluation heuristics, by conducting similar studies either on games of other genres or on games with different mechanics.