#python #cross_modal_retrieval #image_captioning #pretraining #tden #video_captioning #vision_and_language #visual_question_answering
https://github.com/YehLi/xmodaler
X-modaler is a versatile, high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsens...