I'm fine-tuning the Granite 3.2 model for a domain-specific application. The base model performs well, but the fine-tuned version seems to overfit to our dataset. Has anyone run into similar problems? What strategies did you use to mitigate overfitting during fine-tuning?