e-space
Manchester Metropolitan University's Research Repository

    Robust Multimodal Representation Learning with Evolutionary Adversarial Attention Networks

    Huang, F, Jolfaei, A and Bashir, AK (2021) Robust Multimodal Representation Learning with Evolutionary Adversarial Attention Networks. IEEE Transactions on Evolutionary Computation, 25 (5). pp. 856-868. ISSN 1089-778X

    Accepted Version

    Abstract

    Multimodal representation learning is beneficial for many multimedia-oriented applications such as social image recognition and visual question answering. The different modalities of the same instance (e.g., a social image and its corresponding description) are usually correlated and complementary. Most existing approaches for multimodal representation learning do not effectively model the deep correlation between different modalities. Moreover, these approaches struggle to deal with the noise within social images. In this paper, we propose a deep learning-based approach named Evolutionary Adversarial Attention Networks (EAAN), which combines the attention mechanism with adversarial networks through evolutionary training, for robust multimodal representation learning. Specifically, a two-branch visual-textual attention model is proposed to correlate visual and textual content for joint representation. Then adversarial networks are employed to regularize the representation by matching its posterior distribution to the given priors. Finally, the attention model and adversarial networks are integrated into an evolutionary training framework for robust multimodal representation learning. Extensive experiments have been conducted on four real-world datasets: PASCAL, MIR, CLEF, and NUS-WIDE. Substantial performance improvements on the tasks of image classification and tag recommendation demonstrate the superiority of the proposed approach.
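
    The two-branch attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' EAAN implementation: it assumes plain scaled dot-product attention with mean-pooled queries, and it omits the adversarial regularization and the evolutionary training entirely. All function names (`attend`, `joint_representation`, and the helpers) are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def attend(query, items):
    """Pool `items`, weighting each by its scaled dot-product score with `query`."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, item) / scale for item in items])
    pooled = [sum(w * v for w, v in zip(weights, column)) for column in zip(*items)]
    return pooled, weights


def joint_representation(regions, words):
    """Hypothetical two-branch fusion: each modality attends using the other's summary.

    Assumption: image regions and word embeddings share the same dimensionality.
    """
    text_summary = mean(words)
    image_summary = mean(regions)
    visual, visual_weights = attend(text_summary, regions)   # text-guided visual branch
    textual, textual_weights = attend(image_summary, words)  # image-guided textual branch
    # Concatenate the two attended vectors into one joint representation.
    return visual + textual, visual_weights, textual_weights
```

    For example, calling `joint_representation([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])` puts more visual attention weight on the first region, because it is the one aligned with the single word vector. In the paper's framing, such a joint vector would then be regularized by adversarial networks and trained within an evolutionary framework, neither of which is shown here.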
