Oodles AI



Implementation of Neural Style Transfer using Deep Learning

Neural style transfer is the process of creating a new image by combining two images: a content image and a style image. If we like the artwork in a painting, we may want to transfer its style onto our own photographs. We want to preserve the photograph's content as much as possible while transforming it according to the style of the artwork. As an experiential AI Development Company, Oodles AI elaborates on the process of neural style transfer using deep learning algorithms.

Step 1: Capture

We need a way to capture the features of the content and style images so that we can combine them into an output that looks pleasing to the eye. Convolutional neural networks such as VGG-16 already capture these features, in a sense, because they can classify a large variety of images (millions) with high accuracy. We just need to look more closely at the network's layers and understand what they are doing.

Step 2: Layer

Suppose that, while training on images, we pick the first layer and inspect some of its units (neurons). Because this is only the first layer, each unit sees just a small patch of the image and captures rather low-level features, as shown below:

It appears that the first neuron responds to diagonal lines, the third and fourth to vertical and diagonal lines, and the eighth clearly likes the color green. Note that all of these are very small parts of images, so this layer is capturing low-level features.

Let's move a little deeper and pick the second layer:

In this layer, neurons begin to detect more features: the second detects thin vertical lines, the sixth and seventh start capturing round shapes, and the fourteenth is fixated on the color yellow.

This layer starts to detect more interesting things: the sixth neuron is most activated by round shapes that look like tires, the tenth is not easy to explain but prefers orange and round shapes, and the eleventh starts detecting people.

So the deeper we go, the more of the image each neuron responds to, and the higher-level the features it captures (the second neuron in the fifth layer is really into dogs), whereas the shallow layers capture only small parts of the image.

This gives great insight into what deep convolutional layers learn. Returning to style transfer, it also suggests how to generate art from two images while keeping the content of one of them.

Step 3: Implement

We simply need to generate a new image that, when fed to the neural network as input, produces roughly the same activation values as the content image (the photograph) and the style image (the painting).
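As a compact summary, the loss that the listing below assembles can be written as follows. This is a sketch read off the code, and the notation is ours, not the original article's: $\hat{y}$ is the generated image, $x$ the content image, $\phi_l$ the VGG activations at layer $l$ (with $c$ the content layer), $A_l$ the precomputed Gram matrix of the style image, $B$ the batch size, and $w_c$, $w_s$, $w_{tv}$ the three weights passed to the function.

```latex
\begin{aligned}
\mathcal{L} &= \mathcal{L}_{\text{content}} + \mathcal{L}_{\text{style}} + \mathcal{L}_{\text{tv}} \\
\mathcal{L}_{\text{content}} &= \frac{w_c}{B\,N_c}\,\bigl\lVert \phi_c(\hat{y}) - \phi_c(x) \bigr\rVert_2^2 \\
G_l(\hat{y}) &= \frac{F_l^{\top} F_l}{H_l W_l C_l}, \qquad F_l \in \mathbb{R}^{H_l W_l \times C_l} \text{ (activations } \phi_l(\hat{y}) \text{ flattened over space)} \\
\mathcal{L}_{\text{style}} &= \frac{w_s}{B} \sum_{l} \frac{1}{C_l^2}\,\bigl\lVert G_l(\hat{y}) - A_l \bigr\rVert_2^2 \\
\mathcal{L}_{\text{tv}} &= \frac{w_{tv}}{B} \left( \frac{\lVert \Delta_x \hat{y} \rVert_2^2}{N_x} + \frac{\lVert \Delta_y \hat{y} \rVert_2^2}{N_y} \right)
\end{aligned}
```

Here $N_c$ is the number of elements in one example's content-layer activation, $\Delta_x$ and $\Delta_y$ are differences between horizontally and vertically adjacent pixels, and $N_x$, $N_y$ are the per-example element counts of those difference tensors. The squared norms in the style and total variation terms sum over the whole batch, which is why both are divided by $B$.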


from __future__ import print_function
import functools
import random
import time
from operator import mul

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

import transform  # project module: the image transformation network
import vgg  # project module: VGG loader and pre/post-processing
from utils import get_img  # project helper: load an image at a given shape

STYLE_LAYERS = ('relu1_1', 'relu2_1', 'relu3_1', 'relu4_1', 'relu5_1')
CONTENT_LAYER = 'relu4_2'
DEVICES = 'CUDA_VISIBLE_DEVICES'


# np arr, np arr
def optimize(content_targets, style_target, content_weight, style_weight,
             tv_weight, vgg_path, epochs=2, print_iterations=1000,
             batch_size=4, save_path='saver/fns.ckpt', slow=False,
             learning_rate=1e-3, debug=False):
    if slow:
        batch_size = 1
    # drop trailing images so the training set divides evenly into batches
    mod = len(content_targets) % batch_size
    if mod > 0:
        print("Train set has been trimmed slightly...")
        content_targets = content_targets[:-mod]

    style_features = {}

    batch_shape = (batch_size, 256, 256, 3)
    style_shape = (1,) + style_target.shape
    print(style_shape)

    # precompute style features (one Gram matrix per style layer)
    with tf.Graph().as_default(), tf.device('/cpu:0'), tf.Session() as sess:
        style_image = tf.placeholder(tf.float32, shape=style_shape, name='style_image')
        style_image_pre = vgg.preprocess(style_image)
        net = vgg.net(vgg_path, style_image_pre)
        style_pre = np.array([style_target])
        for layer in STYLE_LAYERS:
            features = net[layer].eval(feed_dict={style_image: style_pre})
            features = np.reshape(features, (-1, features.shape[3]))
            gram = np.matmul(features.T, features) / features.size
            style_features[layer] = gram

    with tf.Graph().as_default(), tf.Session() as sess:
        X_content = tf.placeholder(tf.float32, shape=batch_shape, name="X_content")
        X_pre = vgg.preprocess(X_content)

        # precompute content features
        content_features = {}
        content_net = vgg.net(vgg_path, X_pre)
        content_features[CONTENT_LAYER] = content_net[CONTENT_LAYER]

        if slow:
            # Optimize the pixels directly. The initial value is cut off in
            # the original listing; small random noise is ASSUMED here.
            preds = tf.Variable(
                tf.random_normal(X_content.get_shape()) * 0.256
            )
            preds_pre = preds
        else:
            # feed-forward transformation network
            preds = transform.net(X_content / 255.0)
            preds_pre = vgg.preprocess(preds)

        net = vgg.net(vgg_path, preds_pre)

        content_size = _tensor_size(content_features[CONTENT_LAYER]) * batch_size
        assert _tensor_size(content_features[CONTENT_LAYER]) == _tensor_size(net[CONTENT_LAYER])
        content_loss = content_weight * (2 * tf.nn.l2_loss(
            net[CONTENT_LAYER] - content_features[CONTENT_LAYER]) / content_size
        )

        style_losses = []
        for style_layer in STYLE_LAYERS:
            layer = net[style_layer]
            bsstyle, height, width, filters = map(lambda i: i.value, layer.get_shape())
            size = height * width * filters
            feats = tf.reshape(layer, (bsstyle, height * width, filters))
            feats_T = tf.transpose(feats, perm=[0, 2, 1])
            grams = tf.matmul(feats_T, feats) / size
            style_gram = style_features[style_layer]
            style_losses.append(2 * tf.nn.l2_loss(grams - style_gram) / style_gram.size)

        style_loss = style_weight * functools.reduce(tf.add, style_losses) / batch_size

        # total variation denoising
        tv_y_size = _tensor_size(preds[:, 1:, :, :])
        tv_x_size = _tensor_size(preds[:, :, 1:, :])
        y_tv = tf.nn.l2_loss(preds[:, 1:, :, :] - preds[:, :batch_shape[1] - 1, :, :])
        x_tv = tf.nn.l2_loss(preds[:, :, 1:, :] - preds[:, :, :batch_shape[2] - 1, :])
        tv_loss = tv_weight * 2 * (x_tv / tv_x_size + y_tv / tv_y_size) / batch_size

        # overall loss
        loss = content_loss + style_loss + tv_loss

        train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
        sess.run(tf.global_variables_initializer())
        uid = random.randint(1, 100)
        print("UID: %s" % uid)
        for epoch in range(epochs):
            num_examples = len(content_targets)
            iterations = 0
            while iterations * batch_size < num_examples:
                start_time = time.time()
                curr = iterations * batch_size
                step = curr + batch_size
                X_batch = np.zeros(batch_shape, dtype=np.float32)
                for j, img_p in enumerate(content_targets[curr:step]):
                    X_batch[j] = get_img(img_p, (256, 256, 3)).astype(np.float32)

                iterations += 1
                assert X_batch.shape[0] == batch_size

                # The dict body and the training call are cut off in the
                # original listing; reconstructed here as an ASSUMPTION
                # (X_content is the only placeholder in this graph).
                feed_dict = {X_content: X_batch}
                sess.run(train_step, feed_dict=feed_dict)

                end_time = time.time()
                delta_time = end_time - start_time
                if debug:
                    print("UID: %s, batch time: %s" % (uid, delta_time))
                is_print_iter = int(iterations) % print_iterations == 0
                if slow:
                    is_print_iter = epoch % print_iterations == 0
                is_last = epoch == epochs - 1 and iterations * batch_size >= num_examples
                should_print = is_print_iter or is_last
                if should_print:
                    to_get = [style_loss, content_loss, tv_loss, loss, preds]
                    # dict body cut off in the original; same ASSUMPTION as above
                    test_feed_dict = {X_content: X_batch}

                    tup = sess.run(to_get, feed_dict=test_feed_dict)
                    _style_loss, _content_loss, _tv_loss, _loss, _preds = tup
                    losses = (_style_loss, _content_loss, _tv_loss, _loss)
                    if slow:
                        _preds = vgg.unprocess(_preds)
                    else:
                        # checkpoint the transformation network
                        saver = tf.train.Saver()
                        res = saver.save(sess, save_path)
                    yield (_preds, losses, iterations, epoch)


def _tensor_size(tensor):
    # number of elements per example (all dimensions except the batch axis)
    return functools.reduce(mul, (d.value for d in tensor.get_shape()[1:]), 1)
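
The style features in the listing above are Gram matrices of layer activations. As a minimal sketch of that one step, written in plain Python rather than TensorFlow, the snippet below computes a Gram matrix with the same normalization (division by the number of activation values) and a squared-difference style loss for one layer. The function names gram_matrix and style_layer_loss and the toy data are illustrative assumptions, not part of the original code.

```python
def gram_matrix(features):
    """Gram matrix of a layer's activations.

    `features` is a list of spatial positions, each a list of channel
    activations (the layer output flattened to positions x channels).
    The result is channels x channels, divided by the number of values.
    """
    n_positions = len(features)
    n_channels = len(features[0])
    size = n_positions * n_channels
    gram = [[0.0] * n_channels for _ in range(n_channels)]
    for row in features:
        for i in range(n_channels):
            for j in range(n_channels):
                gram[i][j] += row[i] * row[j]
    return [[value / size for value in gram_row] for gram_row in gram]


def style_layer_loss(gram_a, gram_b):
    """Mean squared difference between two Gram matrices."""
    count = len(gram_a) * len(gram_a[0])
    total = sum((a - b) ** 2
                for row_a, row_b in zip(gram_a, gram_b)
                for a, b in zip(row_a, row_b))
    return total / count


# Toy activations: 3 spatial positions, 2 channels.
activations = [[1.0, 0.0], [0.0, 2.0], [1.0, 2.0]]
shuffled = list(reversed(activations))  # same values, different positions

gram = gram_matrix(activations)
loss_same_texture = style_layer_loss(gram, gram_matrix(shuffled))
```

Because the Gram matrix sums over spatial positions, rearranging the positions leaves it unchanged, so loss_same_texture is zero. That is why it works as a description of style (which features occur together) rather than of content (where they occur).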

Learn more: Neural Style Transfer using Deep Learning

#neural style transfer using deep learning

Chloe  Butler



Pdf2gerb: Perl Script Converts PDF Files to Gerber format


Perl script converts PDF files to Gerber format

Pdf2Gerb generates Gerber 274X photoplotting and Excellon drill files from PDFs of a PCB. Up to three PDFs are used: the top copper layer, the bottom copper layer (for 2-sided PCBs), and an optional silk screen layer. The PDFs can be created directly from any PDF drawing software, or a PDF print driver can be used to capture the Print output if the drawing software does not directly support output to PDF.

The general workflow is as follows:

  1. Design the PCB using your favorite CAD or drawing software.
  2. Print the top and bottom copper and top silk screen layers to a PDF file.
  3. Run Pdf2Gerb on the PDFs to create Gerber and Excellon files.
  4. Use a Gerber viewer to double-check the output against the original PCB design.
  5. Make adjustments as needed.
  6. Submit the files to a PCB manufacturer.

Please note that Pdf2Gerb does NOT perform DRC (Design Rule Checks), as these will vary according to individual PCB manufacturer conventions and capabilities. Also note that Pdf2Gerb is not perfect, so the output files must always be checked before submitting them. As of version 1.6, Pdf2Gerb supports most PCB elements, such as round and square pads, round holes, traces, SMD pads, ground planes, no-fill areas, and panelization. However, because it interprets the graphical output of a Print function, there are limitations in what it can recognize (or there may be bugs).

See docs/Pdf2Gerb.pdf for install/setup, config, usage, and other info.


#Pdf2Gerb config settings:
#Put this file in same folder/directory as pdf2gerb.pl itself (global settings),
#or copy to another folder/directory with PDFs if you want PCB-specific settings.
#There is only one user of this file, so we don't need a custom package or namespace.
#NOTE: all constants defined in here will be added to main namespace.
#package pdf2gerb_cfg;

use strict; #trap undef vars (easier debug)
use warnings; #other useful info (easier debug)

#configurable settings:
#change values here instead of in main pdf2gerb.pl file

use constant WANT_COLORS => ($^O !~ m/Win/); #ANSI colors no worky on Windows? this must be set < first DebugPrint() call

#just a little warning; set realistic expectations:
#DebugPrint("${\(CYAN)}Pdf2Gerb.pl ${\(VERSION)}, $^O O/S\n${\(YELLOW)}${\(BOLD)}${\(ITALIC)}This is EXPERIMENTAL software.  \nGerber files MAY CONTAIN ERRORS.  Please CHECK them before fabrication!${\(RESET)}", 0); #if WANT_DEBUG

use constant METRIC => FALSE; #set to TRUE for metric units (only affect final numbers in output files, not internal arithmetic)
use constant APERTURE_LIMIT => 0; #34; #max #apertures to use; generate warnings if too many apertures are used (0 to not check)
use constant DRILL_FMT => '2.4'; #'2.3'; #'2.4' is the default for PCB fab; change to '2.3' for CNC

use constant WANT_DEBUG => 0; #10; #level of debug wanted; higher == more, lower == less, 0 == none
use constant GERBER_DEBUG => 0; #level of debug to include in Gerber file; DON'T USE FOR FABRICATION
use constant WANT_STREAMS => FALSE; #TRUE; #save decompressed streams to files (for debug)
use constant WANT_ALLINPUT => FALSE; #TRUE; #save entire input stream (for debug ONLY)

#DebugPrint(sprintf("${\(CYAN)}DEBUG: stdout %d, gerber %d, want streams? %d, all input? %d, O/S: $^O, Perl: $]${\(RESET)}\n", WANT_DEBUG, GERBER_DEBUG, WANT_STREAMS, WANT_ALLINPUT), 1);
#DebugPrint(sprintf("max int = %d, min int = %d\n", MAXINT, MININT), 1); 

#define standard trace and pad sizes to reduce scaling or PDF rendering errors:
#This avoids weird aperture settings and replaces them with more standardized values.
#(I'm not sure how photoplotters handle strange sizes).
#Fewer choices here gives more accurate mapping in the final Gerber files.
#units are in inches
use constant TOOL_SIZES => #add more as desired
(
#round or square pads (> 0) and drills (< 0):
    .010, -.001,  #tiny pads for SMD; dummy drill size (too small for practical use, but needed so StandardTool will use this entry)
    .031, -.014,  #used for vias
    .041, -.020,  #smallest non-filled plated hole
    .051, -.025,
    .056, -.029,  #useful for IC pins
    .070, -.033,
    .075, -.040,  #heavier leads
#    .090, -.043,  #NOTE: 600 dpi is not high enough resolution to reliably distinguish between .043" and .046", so choose 1 of the 2 here
    .100, -.046,
    .115, -.052,
    .130, -.061,
    .140, -.067,
    .150, -.079,
    .175, -.088,
    .190, -.093,
    .200, -.100,
    .220, -.110,
    .160, -.125,  #useful for mounting holes
#some additional pad sizes without holes (repeat a previous hole size if you just want the pad size):
    .090, -.040,  #want a .090 pad option, but use dummy hole size
    .065, -.040, #.065 x .065 rect pad
    .035, -.040, #.035 x .065 rect pad
#traces:
    .001,  #too thin for real traces; use only for board outlines
    .006,  #minimum real trace width; mainly used for text
    .008,  #mainly used for mid-sized text, not traces
    .010,  #minimum recommended trace width for low-current signals
    .015,  #moderate low-voltage current
    .020,  #heavier trace for power, ground (even if a lighter one is adequate)
    .030,  #heavy-current traces; be careful with these ones!
);
#Areas larger than the values below will be filled with parallel lines:
#This cuts down on the number of aperture sizes used.
#Set to 0 to always use an aperture or drill, regardless of size.
use constant { MAX_APERTURE => max((TOOL_SIZES)) + .004, MAX_DRILL => -min((TOOL_SIZES)) + .004 }; #max aperture and drill sizes (plus a little tolerance)
#DebugPrint(sprintf("using %d standard tool sizes: %s, max aper %.3f, max drill %.3f\n", scalar((TOOL_SIZES)), join(", ", (TOOL_SIZES)), MAX_APERTURE, MAX_DRILL), 1);

#NOTE: Compare the PDF to the original CAD file to check the accuracy of the PDF rendering and parsing!
#for example, the CAD software I used generated the following circles for holes:
#CAD hole size:   parsed PDF diameter:      error:
#  .014                .016                +.002
#  .020                .02267              +.00267
#  .025                .026                +.001
#  .029                .03167              +.00267
#  .033                .036                +.003
#  .040                .04267              +.00267
#This was usually ~ .002" - .003" too big compared to the hole as displayed in the CAD software.
#To compensate for PDF rendering errors (either during CAD Print function or PDF parsing logic), adjust the values below as needed.
#units are pixels; for example, a value of 2.4 at 600 dpi = .004 inch, 2 at 600 dpi = .0033"
use constant
{
    HOLE_ADJUST => -0.004 * 600, #-2.6, #holes seemed to be slightly oversized (by .002" - .004"), so shrink them a little
    RNDPAD_ADJUST => -0.003 * 600, #-2, #-2.4, #round pads seemed to be slightly oversized, so shrink them a little
    SQRPAD_ADJUST => +0.001 * 600, #+.5, #square pads are sometimes too small by .00067, so bump them up a little
    RECTPAD_ADJUST => 0, #(pixels) rectangular pads seem to be okay? (not tested much)
    TRACE_ADJUST => 0, #(pixels) traces seemed to be okay?
    REDUCE_TOLERANCE => .001, #(inches) allow this much variation when reducing circles and rects
};

#Also, my CAD's Print function or the PDF print driver I used was a little off for circles, so define some additional adjustment values here:
#Values are added to X/Y coordinates; units are pixels; for example, a value of 1 at 600 dpi would be ~= .002 inch
use constant
{
    CIRCLE_ADJUST_MINY => -0.001 * 600, #-1, #circles were a little too high, so nudge them a little lower
    CIRCLE_ADJUST_MAXX => +0.001 * 600, #+1, #circles were a little too far to the left, so nudge them a little to the right
    SUBST_CIRCLE_CLIPRECT => FALSE, #generate circle and substitute for clip rects (to compensate for the way some CAD software draws circles)
    WANT_CLIPRECT => TRUE, #FALSE, #AI doesn't need clip rect at all? should be on normally?
    RECT_COMPLETION => FALSE, #TRUE, #fill in 4th side of rect when 3 sides found
};

#allow .012 clearance around pads for solder mask:
#This value effectively adjusts pad sizes in the TOOL_SIZES list above (only for solder mask layers).
use constant SOLDER_MARGIN => +.012; #units are inches

#line join/cap styles:
use constant
{
    CAP_NONE => 0, #butt (none); line is exact length
    CAP_ROUND => 1, #round cap/join; line overhangs by a semi-circle at either end
    CAP_SQUARE => 2, #square cap/join; line overhangs by a half square on either end
    CAP_OVERRIDE => FALSE, #cap style overrides drawing logic
};
#number of elements in each shape type:
use constant
{
    RECT_SHAPELEN => 6, #x0, y0, x1, y1, count, "rect" (start, end corners)
    LINE_SHAPELEN => 6, #x0, y0, x1, y1, count, "line" (line seg)
    CURVE_SHAPELEN => 10, #xstart, ystart, x0, y0, x1, y1, xend, yend, count, "curve" (bezier 2 points)
    CIRCLE_SHAPELEN => 5, #x, y, 5, count, "circle" (center + radius)
};
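#const my %SHAPELEN =
#Readonly my %SHAPELEN =>
our %SHAPELEN = #NOTE: the declaration line is missing from this listing; "our" is an assumption
(
    rect => RECT_SHAPELEN,
    line => LINE_SHAPELEN,
    curve => CURVE_SHAPELEN,
    circle => CIRCLE_SHAPELEN,
);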

#This will repeat the entire body the number of times indicated along the X or Y axes (files grow accordingly).
#Display elements that overhang PCB boundary can be squashed or left as-is (typically text or other silk screen markings).
#Set "overhangs" TRUE to allow overhangs, FALSE to truncate them.
#xpad and ypad allow margins to be added around outer edge of panelized PCB.
use constant PANELIZE => {'x' => 1, 'y' => 1, 'xpad' => 0, 'ypad' => 0, 'overhangs' => TRUE}; #number of times to repeat in X and Y directions

# Set this to 1 if you need TurboCAD support.
#$turboCAD = FALSE; #is this still needed as an option?

#CIRCAD pad generation uses an appropriate aperture, then moves it (stroke) "a little" - we use this to find pads and distinguish them from PCB holes. 
use constant PAD_STROKE => 0.3; #0.0005 * 600; #units are pixels
#convert very short traces to pads or holes:
use constant TRACE_MINLEN => .001; #units are inches
#use constant ALWAYS_XY => TRUE; #FALSE; #force XY even if X or Y doesn't change; NOTE: needs to be TRUE for all pads to show in FlatCAM and ViewPlot
use constant REMOVE_POLARITY => FALSE; #TRUE; #set to remove subtractive (negative) polarity; NOTE: must be FALSE for ground planes

#PDF uses "points", each point = 1/72 inch
#combined with a PDF scale factor of .12, this gives 600 dpi resolution (1/72 * .12 = 1/600 inch per unit)
use constant INCHES_PER_POINT => 1/72; #0.0138888889; #multiply point-size by this to get inches

# The precision used when computing a bezier curve. Higher numbers are more precise but slower (and generate larger files).
#$bezierPrecision = 100;
use constant BEZIER_PRECISION => 36; #100; #use const; reduced for faster rendering (mainly used for silk screen and thermal pads)

# Ground planes and silk screen or larger copper rectangles or circles are filled line-by-line using this resolution.
use constant FILL_WIDTH => .01; #fill at most 0.01 inch at a time

# The max number of characters to read into memory
use constant MAX_BYTES => 10 * M; #bumped up to 10 MB, use const

use constant DUP_DRILL1 => TRUE; #FALSE; #kludge: ViewPlot doesn't load drill files that are too small so duplicate first tool

my $runtime = time(); #Time::HiRes::gettimeofday(); #measure my execution time

print STDERR "Loaded config settings from '${\(__FILE__)}'.\n";
1; #last value must be truthful to indicate successful load


#use Package::Constants;
#use Exporter qw(import); #https://perldoc.perl.org/Exporter.html

#my $caller = "pdf2gerb::";

#sub cfg
#    my $proto = shift;
#    my $class = ref($proto) || $proto;
#    my $settings =
#    {
#        $WANT_DEBUG => 990, #10; #level of debug wanted; higher == more, lower == less, 0 == none
#    };
#    bless($settings, $class);
#    return $settings;

#use constant HELLO => "hi there2"; #"main::HELLO" => "hi there";
#use constant GOODBYE => 14; #"main::GOODBYE" => 12;

#print STDERR "read cfg file\n";

#our @EXPORT_OK = Package::Constants->list(__PACKAGE__); #https://www.perlmonks.org/?node_id=1072691; NOTE: "_OK" skips short/common names

#print STDERR scalar(@EXPORT_OK) . " consts exported:\n";
#foreach(@EXPORT_OK) { print STDERR "$_\n"; }
#my $val = main::thing("xyz");
#print STDERR "caller gave me $val\n";
#foreach my $arg (@ARGV) { print STDERR "arg $arg\n"; }
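
The adjustment constants in the config above are expressed in pixels at 600 dpi, while the tool sizes are in inches. The following Python sketch (not part of Pdf2Gerb; the helper names are illustrative assumptions) shows the unit arithmetic the comments rely on, so that a value can be sanity-checked before it is placed in the config.

```python
DPI = 600                  # resolution assumed throughout the config comments
INCHES_PER_POINT = 1 / 72  # PDF "points" per the config
PDF_SCALE = 0.12           # PDF scale factor mentioned in the config comments


def pixels_to_inches(pixels, dpi=DPI):
    """Convert a pixel adjustment (e.g. HOLE_ADJUST) to inches."""
    return pixels / dpi


def inches_to_pixels(inches, dpi=DPI):
    """Convert a measured error in inches to a pixel adjustment."""
    return inches * dpi


# One scaled PDF unit is 1/72 inch * 0.12 = 1/600 inch, i.e. one pixel at 600 dpi.
inches_per_unit = INCHES_PER_POINT * PDF_SCALE

# HOLE_ADJUST is written as -0.004 * 600 in the config, i.e. -2.4 pixels.
hole_adjust_pixels = inches_to_pixels(-0.004)

print("one scaled PDF unit = %.6f inch" % inches_per_unit)
print("HOLE_ADJUST = %.1f pixels = %.4f inch"
      % (hole_adjust_pixels, pixels_to_inches(hole_adjust_pixels)))
```

For example, if holes in the parsed PDF measure about .003" larger than in the CAD file, inches_to_pixels(-0.003) gives the (negative) pixel value to use for the adjustment.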

Download Details:

Author: swannman
Source Code: https://github.com/swannman/pdf2gerb

License: GPL-3.0 license


Marget D



Top Deep Learning Development Services | Hire Deep Learning Developer

View more: https://www.inexture.com/services/deep-learning-development/

At Inexture, we work strategically on every project we are associated with. We offer a robust set of AI, ML, and DL consulting services. Our virtuoso team of data scientists and developers works meticulously on every project and adds a personalized touch to it. We keep our clients informed of everything being done on their project, which maintains a sense of transparency. Leverage our services for end-to-end support on your next AI project.

#deep learning development #deep learning framework #deep learning expert #deep learning ai #deep learning services

Jerad  Bailey



Google Reveals "What is being Transferred" in Transfer Learning

Recently, researchers from Google proposed an answer to a fundamental question in the machine learning community: what is being transferred in transfer learning? They presented various tools and analyses to address this question.

The ability to transfer domain knowledge from a model trained on one task to another task where data is usually scarce is one of the most desired capabilities for machines. Researchers around the globe have been using transfer learning in various deep learning applications, including object detection, image classification, and medical imaging tasks, among others.

#developers corner #learn transfer learning #machine learning #transfer learning #transfer learning methods #transfer learning resources

Learn Transfer Learning for Deep Learning by implementing the project.

Project walkthrough on convolutional neural networks using transfer learning

Over the two years of my master's degree, I found that the best way to learn concepts is by doing projects. Let's start implementing or, in other words, learning.

Problem Statement

Take an image as input and return the corresponding dog breed from 133 dog breed categories. If a dog is detected in the image, the program provides an estimate of the dog's breed. If a human is detected, it gives an estimate of the dog breed that most resembles the human face. If neither a human nor a dog is present in the image, we simply print an error.

Let’s break this problem into steps

  1. Detect Humans
  2. Detect Dogs
  3. Classify Dog breeds

For all these steps, we use pre-trained models.

Pre-trained models are saved models that were trained on a large image-classification dataset such as ImageNet. If the dataset is large and general enough, the saved weights can be reused for many image detection tasks to reach high accuracy quickly.
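
The decision flow from the problem statement can be sketched as follows. This is a minimal illustration, not the project's actual code: the detectors are passed in as plain functions, and predict_breed is a hypothetical stand-in for the breed classifier.

```python
def describe_image(img_path, face_detector, dog_detector, predict_breed):
    """Apply the three steps in order and return a message for the image.

    face_detector and dog_detector take an image path and return True/False;
    predict_breed (hypothetical) takes an image path and returns a breed name.
    """
    if dog_detector(img_path):
        return "Dog detected. Predicted breed: " + predict_breed(img_path)
    if face_detector(img_path):
        return "Human detected. Most resembling breed: " + predict_breed(img_path)
    return "Error: no human or dog detected in " + img_path


# Usage with stub detectors, so the flow can be tried without any model:
message = describe_image(
    "photo.jpg",
    face_detector=lambda path: True,
    dog_detector=lambda path: False,
    predict_breed=lambda path: "Beagle",
)
print(message)
```

Passing the detectors as arguments keeps the flow independent of which pre-trained models are plugged in for each step.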

Detect Humans

For detecting humans, OpenCV provides many pre-trained face detectors. We use OpenCV’s implementation of Haar feature-based cascade classifiers to detect human faces in images.

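import cv2

# Assumption: the cascade file is not shown in the article; this loads the
# frontal-face Haar cascade bundled with the opencv-python package.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

### returns "True" if face is detected in image stored at img_path
def face_detector(img_path):
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # the cascade runs on grayscale
    faces = face_cascade.detectMultiScale(gray)
    return len(faces) > 0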

Image for post

Detect Dogs

To detect dogs in images, we use a pre-trained ResNet-50 model with weights trained on ImageNet, a very large, very popular dataset used for image classification and other vision tasks.

from keras.applications.resnet50 import ResNet50

### define ResNet50 model
ResNet50_model_detector = ResNet50(weights='imagenet')

### returns "True" if a dog is detected
### ResNet50_predict_labels is not shown in this excerpt; it is assumed to
### return the top-1 ImageNet class index predicted for the image.
def dog_detector(img_path):
    prediction = ResNet50_predict_labels(img_path)
    # ImageNet class indices 151-268 (inclusive) are dog breeds
    return ((prediction <= 268) & (prediction >= 151))

Classify Dog Breeds

For classifying dog breeds, we use transfer learning.

Transfer learning involves taking a pre-trained neural network and adapting it to a new, different data set.

To illustrate the power of transfer learning, we will first train a simple CNN with the following architecture:

Image for post

Trained for 20 epochs, it reaches a test accuracy of just 3%. That is better than a random guess among 133 categories (1/133, about 0.75%), but still poor. More epochs would raise the accuracy, at the cost of a lot of training time.

To reduce training time without sacrificing accuracy, we will train the CNN model using transfer learning.

#data-science #transfer-learning #project-based-learning #cnn #deep-learning #deep learning

Art Style Transfer using Neural Networks


Art style transfer consists of transforming an image into a similar one that appears to have been painted by a particular artist.

If we are Vincent van Gogh fans and we love German Shepherds, we may like to get a picture of our favorite dog painted in the style of van Gogh's Starry Night.

german shepherd

Image by author

van gogh starry night

Starry Night by Vincent van Gogh, Public Domain

The resulting picture can be something like this:

german shepherd with a starry night style

Image by author

Instead, if we like Katsushika Hokusai’s Great Wave off Kanagawa, we may obtain a picture like this one:

the great wave

The Great Wave off Kanagawa by Katsushika Hokusai, Public Domain

german shepherd with the great wave style

Image by author

And something like the following picture, if we prefer Wassily Kandinsky’s Composition 7:

wassily kandinsky composition 7

Composition 7 by Wassily Kandinsky, Public Domain

german shepherd with composition 7 style

Image by author

These image transformations are possible thanks to advances in computing power that allow the use of more complex neural networks.

Convolutional Neural Networks (CNNs), composed of a series of layers of convolutional matrix operations, are ideal for image analysis and object identification. They employ a concept similar to the graphic filters and detectors used in applications like Gimp or Photoshop, but in a much more powerful and complex way.

A basic example of a matrix operation is the one performed by an edge detector. It takes a small picture sample of NxN pixels (5x5 in the following example), multiplies its values element by element by a predefined NxN convolution matrix, and sums the products to obtain a value that indicates whether an edge is present in that portion of the image. Repeating this procedure for all the NxN portions of the image, we can generate a new image in which the borders of the objects have been detected.

condor photo plus edge detector equals condor borders

Image by author
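
The edge detector described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: it uses a 3x3 kernel rather than the 5x5 one in the figure, the kernel values are a common hand-picked edge filter (not taken from the article), and the "image" is a tiny grid of brightness values.

```python
def convolve2d(image, kernel):
    """Slide an NxN kernel over the image and, at each position, sum the
    element-wise products (no padding, so the output is smaller)."""
    n = len(kernel)
    out_height = len(image) - n + 1
    out_width = len(image[0]) - n + 1
    output = []
    for r in range(out_height):
        row = []
        for c in range(out_width):
            acc = 0
            for i in range(n):
                for j in range(n):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hand-picked edge kernel: responds where the center differs from its neighbors.
EDGE_KERNEL = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]

# 5x5 image: dark on the left (0), bright on the right (9).
image = [[0, 0, 9, 9, 9] for _ in range(5)]

edges = convolve2d(image, EDGE_KERNEL)
for row in edges:
    print(row)  # each row is [-27, 27, 0]
```

The response is non-zero only in the windows that straddle the dark-to-bright boundary and zero in the uniform region. In a CNN the kernel values are not hand-picked like this; they are learned during training.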

The two main features of CNNs are:

  • The numeric values of the convolutional matrices are not predefined to find specific image features like edges. Those values are automatically generated during the optimization processes, so they will be able to detect more complex features than borders.
  • They have a layered structure, so the first layers detect simple image features (edges, color blocks, etc.) and the later layers use the information from the previous ones to detect complex objects like people, animals, cars, etc.

This is the typical structure of a Convolutional Neural Network:

Image for post

Image by Aphex34 / CC BY-SA 4.0

Thanks to papers like “Visualizing and Understanding Convolutional Networks”[1] by Matthew D. Zeiler and Rob Fergus, and “Feature Visualization”[12] by Chris Olah, Alexander Mordvintsev, and Ludwig Schubert, we can visually understand what features are detected by the different CNN layers:

Image for post

Image by Matthew D. Zeiler et al. “Visualizing and Understanding Convolutional Networks”[1], usage authorized

The first layers detect the most basic features of the image like edges.

Image for post
Image by Matthew D. Zeiler et al. “Visualizing and Understanding Convolutional Networks”[1], usage authorized

The next layers combine the information of the previous layer to detect more complex features like textures.

Image for post

Image by Matthew D. Zeiler et al. “Visualizing and Understanding Convolutional Networks”[1], usage authorized

The following layers continue to use the previous information to detect features like repetitive patterns.

Image for post

Image by Matthew D. Zeiler et al. “Visualizing and Understanding Convolutional Networks”[1], usage authorized

The later network layers are able to detect complex features like object parts.

Image for post

Image by Matthew D. Zeiler et al. “Visualizing and Understanding Convolutional Networks”[1], usage authorized

The final layers are capable of classifying complete objects present in the image.

The ability to detect complex image features is the key enabler: it lets us apply complex transformations to those features while the content of the image remains recognizable.

#style-transfer-online #artificial-intelligence #neural-style-transfer #art-style-transfer #neural networks