Tandem Reinforcement Learning with Verifiable Rewards | AIChainDay