Introduction

Automatic synthesis of realistic images from text has become a popular research problem, driven by deep convolutional and recurrent neural network architectures that learn discriminative text feature representations.

Attribute representations offer discriminative power and strong generalization, but engineering them is a complex process that requires domain-specific knowledge. Over the years, these techniques have evolved as generative adversarial networks and the broader space of machine learning algorithms continue to advance.

In comparison, natural language offers an easy, general, and flexible way to identify and describe objects across multiple domains of visual categories. Ideally, one would combine the generality of text descriptions with the discriminative power of attributes.

This blog covers text-to-image synthesis algorithms based on **GANs (Generative Adversarial Networks)** that aim to map words and characters directly to image pixels, combining natural language representation with image synthesis techniques.

The featured algorithms learn a text feature representation that captures the important visual details, and then use these features to synthesize a compelling image that a human might mistake for real.

Generative Adversarial Text to Image Synthesis

This image synthesis mechanism uses deep convolutional and recurrent text encoders to learn a correspondence function between text and images, conditioning the model on *text descriptions* instead of class labels.
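To make the conditioning concrete, here is a minimal NumPy sketch of how a generator input might be formed: a noise vector is concatenated with a text embedding, so the sampled image depends on the description rather than a class label. The `text_embedding` function is a hypothetical stand-in for a learned character-level encoder, and the dimensions are illustrative assumptions, not values from any specific paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def text_embedding(caption, dim=128):
    # Hypothetical stand-in for a learned character-level text encoder:
    # bucket each character into a fixed-size vector and average.
    vec = np.zeros(dim)
    for ch in caption:
        vec[ord(ch) % dim] += 1.0
    return vec / max(len(caption), 1)

def generator_input(caption, noise_dim=100, text_dim=128):
    # Condition the generator by concatenating a noise vector z with
    # the text embedding; a real model would feed this to deconv layers.
    z = rng.standard_normal(noise_dim)
    t = text_embedding(caption, text_dim)
    return np.concatenate([z, t])

x = generator_input("a small bird with a red head")
```

In a trained model the encoder would be learned jointly with the GAN; the point here is only the shape of the conditioning interface.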

It is an effective approach that enables text-based image synthesis using a character-level text encoder and a class-conditional GAN. The GAN **views (text, image) pairs as joint observations and trains the discriminator to judge pairs as real or fake.**
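The pair-judging idea can be sketched as a toy discriminator loss: a real image with its matching description should score "real", while a fake image or a mismatched description should score "fake". The bilinear scoring function and the specific loss weighting below are illustrative assumptions for the sketch, not the architecture from any particular implementation.

```python
import numpy as np

def discriminator_score(image_feat, text_feat, W):
    # Toy bilinear discriminator: sigmoid(image^T W text) in (0, 1).
    s = image_feat @ W @ text_feat
    return 1.0 / (1.0 + np.exp(-s))

def pair_discriminator_loss(real_img, fake_img, match_txt, mismatch_txt, W):
    # Real image + matching text is labeled "real"; a fake image or a
    # mismatched description is labeled "fake".
    s_real = discriminator_score(real_img, match_txt, W)
    s_wrong = discriminator_score(real_img, mismatch_txt, W)
    s_fake = discriminator_score(fake_img, match_txt, W)
    eps = 1e-8  # numerical safety for the logs
    return -(np.log(s_real + eps)
             + 0.5 * (np.log(1 - s_wrong + eps) + np.log(1 - s_fake + eps)))
```

Minimizing this loss pushes the discriminator to use the text, not just image realism, which in turn forces the generator to produce images that actually match their descriptions.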

