Neural Module Network: Compositional ViQA Network Built on the Fly
- J. Andreas, M. Rohrbach, T. Darrell, and D. Klein, “Neural Module Networks,” Nov. 2015, Accessed: Jan. 16, 2022. [Online]. Available: https://arxiv.org/abs/1511.02799v4
Abstract
Visual question answering is compositional in nature. This paper seeks to exploit the representational capacity of deep networks and the compositional linguistic structure of questions.
This approach decomposes questions into their linguistic substructures, and uses these structures to dynamically instantiate modular networks.
Approach

Figure 1: A Schematic Representation of the Neural Module Network Architecture
Modules
Each sub-task is handled by its own composable module.
Notation: TYPE[INSTANCE](ARG1,...)
- TYPE is the high-level module type, such as attention or classification
- INSTANCE is the particular instance that the module considers, e.g. attend[dog] attends to dogs
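The notation maps naturally onto a small recursive data structure. The following is a minimal sketch, not code from the paper: the class name `ModuleCall` and its fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModuleCall:
    """One node of a layout written as TYPE[INSTANCE](ARG1,...)."""
    type: str                      # high-level module type, e.g. "attend"
    instance: str                  # particular instance, e.g. "dog"
    args: List["ModuleCall"] = field(default_factory=list)

    def __str__(self) -> str:
        head = f"{self.type}[{self.instance}]"
        if not self.args:
            # A module with no sub-modules is written without parentheses.
            return head
        return f"{head}({', '.join(str(a) for a in self.args)})"


# A classification module applied to the output of an attention module.
layout = ModuleCall("classify", "color", [ModuleCall("attend", "bird")])
print(layout)  # prints: classify[color](attend[bird])
```

Nesting `ModuleCall` objects mirrors how modules are composed: the output of an inner module becomes an argument of the outer one.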
Examples
The paper introduces several module types for ViQA; adding new ones should also be easy.
- Attention
- Re-attention
- Combination
- Classification
- Measurement
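Modules can only be composed when the output of one matches the input of the next. The sketch below checks this for a nested layout. The signatures in `SIGNATURES` are an assumption of this sketch, recalled from the paper's module descriptions rather than stated in these notes, and the image is passed explicitly here even though the notation leaves it implicit. `output_type` is a hypothetical helper name.

```python
from typing import Dict, Tuple

# Assumed signatures: module type -> (input types, output type).
SIGNATURES: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "attend": (("Image",), "Attention"),
    "re-attend": (("Attention",), "Attention"),
    "combine": (("Attention", "Attention"), "Attention"),
    "classify": (("Image", "Attention"), "Label"),
    "measure": (("Attention",), "Label"),
}


def output_type(layout) -> str:
    """Return the output type of a layout, or raise TypeError if ill-typed.

    A layout is either the string "Image" or a pair
    (module_type, [sub-layouts]).
    """
    if layout == "Image":
        return "Image"
    module_type, args = layout
    expected, result = SIGNATURES[module_type]
    actual = tuple(output_type(a) for a in args)
    if actual != expected:
        raise TypeError(f"{module_type} expects {expected}, got {actual}")
    return result


# classify takes the image plus an attention produced by attend.
well_typed = ("classify", ["Image", ("attend", ["Image"])])
print(output_type(well_typed))  # prints: Label
```

Under these assumed signatures, a layout such as `("measure", ["Image"])` is rejected, because measurement expects an attention rather than the raw image.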
From Questions to Assembled Networks

Figure 2: Examples of NMNs built on the fly
Question String => via Stanford Parser => Symbolic Dependency Representation of Question => via assembling => Neural Module Network built on the fly
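The assembling step can be sketched as a recursive walk over the symbolic representation. This is a simplified illustration, not the authors' implementation. The mapping rules are assumptions based on a reading of the paper: leaves become attend modules, internal nodes become re-attend or combine depending on arity, and the root becomes measure for yes/no questions and classify otherwise. The parser itself is not run; its output is written by hand as nested `(label, children)` tuples, and `build` and `assemble` are hypothetical names.

```python
from typing import Tuple

# Hand-written stand-in for the parser's symbolic output,
# e.g. color(truck) is written as ("color", [("truck", [])]).
Node = Tuple[str, list]


def build(node: Node) -> str:
    """Render a non-root node as a module expression."""
    label, children = node
    if not children:
        # Leaves become attention modules.
        return f"attend[{label}]"
    if len(children) == 1:
        # Unary internal nodes transform an existing attention.
        return f"re-attend[{label}]({build(children[0])})"
    # Higher-arity internal nodes merge their children's attentions.
    args = ", ".join(build(c) for c in children)
    return f"combine[{label}]({args})"


def assemble(root: Node, yes_no: bool) -> str:
    """Render the full layout; the root selects the answer-producing module."""
    label, children = root
    if not children:
        raise ValueError("root node needs at least one argument")
    parts = [build(c) for c in children]
    # Assumption of this sketch: several root arguments are merged with
    # combine[and].
    inner = parts[0] if len(parts) == 1 else f"combine[and]({', '.join(parts)})"
    head = "measure" if yes_no else "classify"
    return f"{head}[{label}]({inner})"


# "What color is the truck?" written as color(truck)
print(assemble(("color", [("truck", [])]), yes_no=False))
# prints: classify[color](attend[truck])

# "Is there a circle next to a square?" written as is(circle, next-to(square))
question = ("is", [("circle", []), ("next-to", [("square", [])])])
print(assemble(question, yes_no=True))
# prints: measure[is](combine[and](attend[circle], re-attend[next-to](attend[square])))
```

The sketch emits layout strings only; in the actual model each string would correspond to a neural module instantiated and wired together for that question.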