OPTIMIZING LEARNING WITH TLMS: A DEEP DIVE INTO TRANSFORMER-BASED MODELS