Neural Module Network: Compositional ViQA Network Built on the Fly

  • J. Andreas, M. Rohrbach, T. Darrell, and D. Klein, “Neural Module Networks,” Nov. 2015, Accessed: Jan. 16, 2022. [Online]. Available: https://arxiv.org/abs/1511.02799v4

Slides: link

Abstract

Visual question answering is compositional in nature. This paper seeks to exploit the representational capacity of deep networks and the compositional linguistic structure of questions.

This approach decomposes questions into their linguistic substructures, and uses these structures to dynamically instantiate modular networks.

Approach

Figure 1: A Schematic Representation of the Neural Module Network Architecture

Modules

The network is assembled from a small set of composable modules, each specialized for a different sub-task.

Notation: TYPE[INSTANCE](ARG1,...)

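  • TYPE is the high-level module type (attention, classification, etc.)
  • INSTANCE is the particular instance of that type, e.g. attend[dog] is the attention module that locates dogs
  • ARG1,... are the module's inputs, typically the outputs of other modules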
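
To make the notation concrete, here is a minimal sketch of a parser that reads a layout string in the TYPE[INSTANCE](ARG1,...) form into a small tree. The names `Module`, `parse`, and `_parse_module` are hypothetical helpers invented for this note; they are not part of the paper's code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Module:
    kind: str                      # TYPE, e.g. "attend" or "classify"
    instance: str                  # INSTANCE, e.g. "dog" or "color"
    args: List["Module"] = field(default_factory=list)  # ARG1, ...

    def __str__(self) -> str:
        head = f"{self.kind}[{self.instance}]"
        if not self.args:
            return head
        return head + "(" + ", ".join(str(a) for a in self.args) + ")"


def parse(text: str) -> Module:
    """Parse a layout string written as TYPE[INSTANCE](ARG1, ...)."""
    module, rest = _parse_module(text.replace(" ", ""))
    if rest:
        raise ValueError(f"unexpected trailing text: {rest!r}")
    return module


def _parse_module(s: str) -> Tuple[Module, str]:
    # Read TYPE[INSTANCE], then an optional parenthesized argument list.
    open_br = s.index("[")
    close_br = s.index("]", open_br)
    module = Module(s[:open_br], s[open_br + 1:close_br])
    rest = s[close_br + 1:]
    if rest.startswith("("):
        rest = rest[1:]
        while True:
            arg, rest = _parse_module(rest)
            module.args.append(arg)
            if rest.startswith(","):
                rest = rest[1:]
            elif rest.startswith(")"):
                rest = rest[1:]
                break
            else:
                raise ValueError(f"expected ',' or ')' near {rest!r}")
    return module, rest


layout = parse("classify[color](attend[tie])")
print(layout.kind, layout.instance, [str(a) for a in layout.args])
```

Printing a parsed `Module` with `str()` reproduces the same notation, which makes it easy to compare layouts written by hand with layouts produced by code.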

Examples

The paper introduces several types of modules for ViQA; adding new ones should also be straightforward.

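  • Attention (attend: Image → Attention): produces a heatmap over the image for a given concept, e.g. attend[dog]

  • Re-attention (re-attend: Attention → Attention): transforms one attention into another, e.g. re-attend[above]

  • Combination (combine: Attention × Attention → Attention): merges two attentions into one, e.g. combine[and]

  • Classification (classify: Image × Attention → Label): maps the image and an attention to a distribution over labels, e.g. classify[color]

  • Measurement (measure: Attention → Label): maps an attention alone to a distribution over labels, e.g. measure[exists]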
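
Each module type can be read as a typed function: attention takes the image and returns an attention, re-attention maps an attention to an attention, combination maps two attentions to one, and classification and measurement return a label. The sketch below uses those signatures to check that a layout is well-formed. It is only a structural check with no neural computation, and the names `SIGNATURES` and `output_type` are hypothetical. Layouts are written as nested tuples `(type, instance, *args)`, and the image input is treated as implicit.

```python
# Explicit (Attention-typed) inputs and output type for each module type.
# The image is assumed to be available to every module, so it is not listed.
SIGNATURES = {
    "attend":    ((), "Attention"),                          # Image -> Attention
    "re-attend": (("Attention",), "Attention"),              # Attention -> Attention
    "combine":   (("Attention", "Attention"), "Attention"),  # Attention x Attention -> Attention
    "classify":  (("Attention",), "Label"),                  # Image x Attention -> Label
    "measure":   (("Attention",), "Label"),                  # Attention -> Label
}


def output_type(layout):
    """Return the output type of a layout, or raise TypeError if it is ill-formed."""
    kind, _instance, *args = layout
    if kind not in SIGNATURES:
        raise TypeError(f"unknown module type: {kind!r}")
    expected, result = SIGNATURES[kind]
    actual = tuple(output_type(arg) for arg in args)
    if actual != expected:
        raise TypeError(f"{kind} expects inputs {expected}, got {actual}")
    return result


# classify[color](attend[tie])
print(output_type(("classify", "color", ("attend", "tie"))))
```

A complete question-answering layout should have output type `Label`; a layout whose root is an attention-producing module would not yield an answer.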

From Questions to Assembled Networks

Figure 2: Examples of NMN built on the fly

Question string => (Stanford Parser) => symbolic dependency representation of the question => (assembly) => neural module network built on the fly
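
The assembly step of this pipeline can be sketched as follows, starting from an already-parsed symbolic form (the parser itself is not reproduced here). The mapping follows the rule described in the paper: leaves become attend modules, internal nodes become re-attend or combine modules depending on their arity, and the root becomes measure for yes/no questions and classify otherwise. The function names, the `yes_no` flag, and the choice to join two root children with combine[and] are simplifying assumptions made for this note.

```python
def assemble(node, yes_no):
    """Turn a symbolic form (head, children) into a layout string.

    `yes_no` is a hypothetical flag saying whether the question is yes/no.
    """
    head, children = node
    root_type = "measure" if yes_no else "classify"
    parts = [_assemble_inner(child) for child in children]
    if len(parts) == 1:
        body = parts[0]
    elif len(parts) == 2:
        # Assumption: two root arguments are merged with combine[and].
        body = f"combine[and]({parts[0]}, {parts[1]})"
    else:
        raise ValueError("this sketch handles a root with one or two children")
    return f"{root_type}[{head}]({body})"


def _assemble_inner(node):
    head, children = node
    if not children:
        return f"attend[{head}]"            # leaf -> attention module
    args = ", ".join(_assemble_inner(child) for child in children)
    if len(children) == 1:
        return f"re-attend[{head}]({args})"  # unary internal node
    if len(children) == 2:
        return f"combine[{head}]({args})"    # binary internal node
    raise ValueError("this sketch handles internal nodes of arity one or two")


# "What color is the truck?" parsed as color(truck)
print(assemble(("color", [("truck", [])]), yes_no=False))

# "Is there a circle next to a square?" parsed as is(circle, next-to(square))
question = ("is", [("circle", []), ("next-to", [("square", [])])])
print(assemble(question, yes_no=True))
```

Because the layout depends only on the parsed question, two questions with different structure yield networks of different shape while reusing the same module instances.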