Cross-Modal Retrieval via Multimodal Transfer Learning and Domain Adaptation
Abstract:
Multimodal transfer learning offers a powerful approach to cross-modal retrieval by leveraging knowledge shared across modalities. In this thesis, we explore a two-stage pre-training and fine-tuning approach within an existing multimodal transfer learning framework to improve model efficiency and adaptability. While we do not claim superior retrieval accuracy or robustness over traditional methods, our research provides insights into optimizing performance on cross-modal retrieval tasks.
This exploration divides training into pre-training and fine-tuning stages. By investigating various configurations within this framework, we aim to identify strategies that reduce training time and the number of epochs required, while also improving the model's ability to adapt to new data categories. Our experiments analyze the factors that influence performance in this two-stage approach, offering guidance for future research in multimodal transfer learning.
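The two-stage split described above can be illustrated on a deliberately tiny problem. The following sketch is purely hypothetical (the task, function names, and hyperparameters are illustrative, not taken from the thesis): stage 1 learns a shared weight, which stage 2 then freezes while adapting only a task-specific parameter to new data.

```python
# Hypothetical sketch of a two-stage pre-train / fine-tune split on a
# toy 1-D regression task; not the thesis implementation.

def sgd_step(param, grad, lr):
    """Plain gradient-descent update."""
    return param - lr * grad

def pretrain(xs, ys, epochs=200, lr=0.01):
    """Stage 1: learn a shared 'encoder' weight w for y = w * x."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            pred = w * x
            grad = 2 * (pred - y) * x   # d/dw of squared error
            w = sgd_step(w, grad, lr)
    return w

def finetune(w, xs, ys, epochs=200, lr=0.05):
    """Stage 2: freeze w, adapt only a task-specific bias b."""
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            pred = w * x + b
            grad = 2 * (pred - y)       # d/db of squared error
            b = sgd_step(b, grad, lr)
    return b

xs = [0.0, 1.0, 2.0, 3.0]
w = pretrain(xs, [2 * x for x in xs])          # pre-training data: y = 2x
b = finetune(w, xs, [2 * x + 1 for x in xs])   # new category: y = 2x + 1
```

Because only the bias is updated in stage 2, adaptation to the shifted task is cheap; this mirrors, in miniature, how freezing pre-trained components can cut fine-tuning cost when new data categories appear.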
This work contributes to the design and optimization of cross-modal retrieval systems. By exploring stage-splitting strategies within existing models, our findings can inform the development of more efficient and adaptable retrieval systems for real-world applications.