When Models Disagree: Rethinking LLM Evaluation for Public Comment Analysis | AIChainDay