In this post, I’ll demonstrate how Generative Adversarial Networks (GANs) behave on 3D images and how they can be used to generate novel 3D images.

To start with, I’ll divide this post into the following sections:

  1. Problem Statement
  2. Introduction
  3. Proposed Solution
  4. Code
  5. Results
  6. Observations
  7. Limitations
  8. Future Scope
  9. Acknowledgement

Problem Statement

Three-dimensional (3D) models have become popular because of their wide range of applications in industrial product design, cultural-relic restoration, medical diagnosis, 3D games, and so on. The traditional way of designing and constructing 3D models is very complicated, which dampens ordinary users’ enthusiasm for creative design and makes it hard to obtain 3D models that meet their requirements. The modern approach relies on popular 3D modeling software such as NX, CATIA, or SolidWorks, or on 3D scanners, to obtain digital 3D models. However, quickly developing a creative or innovative variation of an existing 3D model is generally an exhausting task. Therefore, exploring effective 3D image generation methods is an essential problem in computer graphics and computer vision.

To address this need for generating novel 3D images, I’ve applied a traditional generative adversarial network (GAN) augmented with three different classes of networks, namely convolutional neural networks (CNNs), capsule networks (CapsNets), and an auto-encoding component, on the NORB (NYU Object Recognition Benchmark) dataset.
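Regardless of which network class plays the generator or discriminator, the adversarial training underneath optimizes the standard GAN objective. As a minimal sketch of the two losses (the discriminator scores below are stand-in numbers for illustration, not outputs of a trained network):

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # D tries to maximize log D(x) + log(1 - D(G(z)));
    # as a loss, we minimize the negative of that quantity.
    return -(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def generator_loss(d_fake):
    # Non-saturating generator loss: maximize log D(G(z)).
    return -np.mean(np.log(d_fake))

# Hypothetical discriminator scores (for illustration only):
d_real = np.array([0.8, 0.9])  # scores on real images
d_fake = np.array([0.3, 0.2])  # scores on generated images

print(discriminator_loss(d_real, d_fake))
print(generator_loss(d_fake))
```

In practice these scores come from the discriminator network (CNN or CapsNet), and both losses are minimized alternately with a gradient-based optimizer.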

Introduction

NORB dataset

The smallNORB dataset is a staple benchmark for testing the efficacy of generative models in the 3D domain. It contains gray-level stereo images of 5 classes of toys: airplanes, cars, trucks, humans, and animals. There are 10 physical instances of each class; 5 instances per class are selected for the training data and the other 5 for the test data. Every individual toy is pictured at 18 different azimuths (0–340 degrees), 9 elevations, and 6 lighting conditions, so the training and test sets each contain 24,300 stereo pairs of 96x96 images.
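As a quick sanity check, the per-split count of 24,300 stereo pairs follows directly from multiplying out the factors in the dataset description:

```python
# All numbers come from the smallNORB dataset description.
classes = 5              # airplanes, cars, trucks, humans, animals
instances_per_split = 5  # 5 of the 10 physical instances per class go to each split
azimuths = 18            # 0-340 degrees
elevations = 9
lightings = 6

pairs_per_split = classes * instances_per_split * azimuths * elevations * lightings
print(pairs_per_split)  # stereo pairs in each of the training and test sets
```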

#generative-adversarial #machine-learning #deep-learning #3d-images #experimental

Applying Generative Adversarial Networks to generate novel 3D images