CaVe-VLM-CoT: An Interpretable Vision-Language Model Framework | AIChainDay