Metric Match: A Subset Selection Approach to Evaluating LLM Judge Reliability | AIChainDay