Neural style transfer with TensorFlow

Computer vision is a field of artificial intelligence that enables machines to process and extract information from images and videos. It makes many interesting image-processing tasks possible, and neural style transfer is one of them.

Neural style transfer

Neural style transfer is a technique that combines the content of one image with the artistic style of another. The result is a new image that looks as if the first image had been painted in the second image's style, while the original content of the first image is preserved.

In simpler terms, we blend the style of one image into the content of another. Neural style transfer works using:

  1. Content of the first image

  2. The artistic style of the second image

Content and style images being transformed
Astronaut content image styled with northern lights style image

Mechanism of neural style transfer

We can understand how neural style transfer works by going through the concepts below.

Input

We give two images to our model. The first is the content image, whose content we want to preserve, and the second is the style image, whose style we want to apply.

Note: By style, we refer to the visual patterns or texture present in an image.

Content loss

The first thing we need to ensure is that our final image retains the crucial content of our content image. The content loss is calculated as the difference between the feature maps (the outputs of the filters a convolutional network applies to an image) of the original content image and those of the stylized one.
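As a rough illustration of the idea, the content loss can be sketched as a mean squared difference between two feature maps. The function name `content_loss` and the tiny hand-written arrays are assumptions made for this example; in practice the feature maps would come from a layer of a convolutional network.

```python
import numpy as np

def content_loss(content_features, stylized_features):
    """Mean squared difference between two feature maps of the same shape."""
    content_features = np.asarray(content_features, dtype=np.float64)
    stylized_features = np.asarray(stylized_features, dtype=np.float64)
    return float(np.mean((stylized_features - content_features) ** 2))

# Hypothetical 2x2 feature maps standing in for real network activations.
content = np.array([[1.0, 2.0], [3.0, 4.0]])
stylized = np.array([[1.0, 2.0], [3.0, 6.0]])

# One of the four entries differs by 2, so the loss is 2**2 / 4 = 1.0.
print(content_loss(content, stylized))
```

A loss of zero means the two feature maps are identical, so a lower content loss indicates that the stylized image keeps more of the original content.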

Style loss

The style loss is generally calculated by comparing the Gram matrices of the feature maps of the original style image and the stylized image. A Gram matrix contains the correlations between feature maps.
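The sketch below shows one common way to compute a Gram matrix and a style loss from it. The helper names `gram_matrix` and `style_loss`, the normalization by the number of spatial positions, and the random feature maps are assumptions made for illustration, not the exact formulation used by any particular model.

```python
import numpy as np

def gram_matrix(features):
    """Map (height, width, channels) features to a (channels, channels) Gram matrix."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)   # one row per spatial position
    return flat.T @ flat / (h * w)      # channel-to-channel correlations

def style_loss(style_features, stylized_features):
    """Mean squared difference between the two Gram matrices."""
    diff = gram_matrix(stylized_features) - gram_matrix(style_features)
    return float(np.mean(diff ** 2))

# Hypothetical feature maps standing in for real network activations.
rng = np.random.default_rng(0)
features = rng.random((4, 4, 3))
flipped = features[::-1]  # same activations, different spatial arrangement

print(gram_matrix(features).shape)
print(style_loss(features, flipped))
```

Because the Gram matrix sums over all spatial positions, flipping the feature maps leaves it unchanged (up to floating-point rounding). This is why it captures texture and pattern statistics rather than where objects are located.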

Optimization

Our goal here is to minimize the total loss, i.e., the combination of the content loss and the style loss. This is done by adjusting the pixels of the stylized image with an iterative optimization technique, such as gradient descent, until the loss stops decreasing.

The algorithm therefore gradually converges to the final stylized image, which blends the content of one image with the style of the other.
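To make the optimization step concrete, here is a toy sketch that runs plain gradient descent directly on pixel values. It is a simplification under stated assumptions: the "features" are the raw pixels themselves rather than network activations, the weights `alpha` and `beta`, the learning rate, and the random images are all hypothetical, and the gradient is derived by hand for this specific loss.

```python
import numpy as np

def gram(x):
    # x has shape (num_pixels, channels)
    return x.T @ x / x.shape[0]

def total_loss_and_grad(x, content, style_gram, alpha=1.0, beta=1.0):
    """Weighted sum of content and style loss, plus its gradient w.r.t. x."""
    n, c = x.shape
    content_diff = x - content
    gram_diff = gram(x) - style_gram
    loss = alpha * np.mean(content_diff ** 2) + beta * np.mean(gram_diff ** 2)
    # Hand-derived gradients of the two mean-squared terms.
    content_grad = 2.0 * content_diff / (n * c)
    style_grad = 4.0 * (x @ gram_diff) / (n * c * c)
    return loss, alpha * content_grad + beta * style_grad

rng = np.random.default_rng(0)
content = rng.random((64, 3))   # hypothetical 8x8 RGB content image, flattened
style = rng.random((64, 3))     # hypothetical 8x8 RGB style image, flattened
style_gram = gram(style)

x = content.copy()              # start from the content image
learning_rate = 1.0
losses = []
for _ in range(200):
    loss, grad = total_loss_and_grad(x, content, style_gram)
    losses.append(loss)
    x -= learning_rate * grad   # adjust the pixels to reduce the loss

print(losses[0], losses[-1])
```

The total loss after the final step is lower than at the start, which is the behavior the optimization relies on. Real implementations compute the losses on deep network features and let the framework differentiate them automatically.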

TensorFlow

TensorFlow is a widely used open-source deep learning framework for building and training machine learning models.

TensorFlow neural style transfer model

We will be using TensorFlow Hub's pre-trained model "Arbitrary Image Stylization" for our code in this Answer.

The model variant used here is "arbitrary-image-stylization-v1-256/2." This model is designed for stylizing images based on artistic styles.

TensorFlow logo

Code walkthrough

In this Answer, we walk through a Python implementation of neural style transfer using TensorFlow, and then use Plotly to visualize the results.

Imports

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
import PIL.Image
import plotly.graph_objects as go
import plotly.io as pio

First and foremost, we import the required modules for our code snippet. We use tensorflow and tensorflow_hub to load and run the pre-trained model, numpy for numerical calculations, PIL for image processing, and plotly for visualizing results.

The loadImage method

def loadImage(imagePath):
    maxDimension = 650
    img = tf.io.read_file(imagePath)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    shape = tf.cast(tf.shape(img)[:-1], tf.float32)
    longDimension = max(shape)
    scale = maxDimension / longDimension
    newShape = tf.cast(shape * scale, tf.int32)
    img = tf.image.resize(img, newShape)
    img = img[tf.newaxis, :]
    return img

Our first method reads the image and, after processing, returns the tensor image. Let's dive deeper into how it does this.

  1. maxDimension = 650: Sets the maximum dimension of the output to 650 pixels.

  2. img = tf.io.read_file(imagePath): Reads the image's data from the path imagePath.

  3. img = tf.image.decode_image(img, channels = 3): Decodes the binary image data into a tensor with RGB colors i.e. 3 channels.

  4. img = tf.image.convert_image_dtype(img, tf.float32): Converts pixel values to the tf.float32 data type and scales them to the range 0 to 1, which is the input format the model expects.

  5. shape = tf.cast(tf.shape(img)[:-1], tf.float32): Takes the image's height and width (dropping the channel dimension) and converts them to a floating-point tensor.

  6. longDimension = max(shape): Calculates the length of the image's longest side.

  7. scale = maxDimension / longDimension: Computes the scaling factor for resizing.

  8. newShape = tf.cast(shape * scale, tf.int32): Computes new image dimensions after resizing.

  9. img = tf.image.resize(img, newShape): Resizes the image tensor using newShape.

  10. img = img[tf.newaxis, :]: Adds a batch dimension to the image tensor.

  11. Returns our processed image tensor.
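The resizing arithmetic in steps 6 to 8 can be checked without TensorFlow. The helper below is a plain-Python sketch written for this explanation (the name `scaled_shape` is hypothetical); it assumes that truncating with `int()` mirrors the cast to tf.int32, which may differ in rare floating-point edge cases.

```python
def scaled_shape(height, width, max_dimension=650):
    """Return (new_height, new_width) with the longer side equal to max_dimension."""
    scale = max_dimension / max(height, width)
    # int() truncates toward zero, mirroring the cast to tf.int32 in loadImage.
    return int(height * scale), int(width * scale)

print(scaled_shape(1300, 650))  # (650, 325)
print(scaled_shape(480, 640))   # (487, 650)
```

Note that the scale factor is greater than 1 for images whose longest side is below 650 pixels, so small images are enlarged rather than left untouched. The aspect ratio is preserved in both cases.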

The stylizeImages method

def stylizeImages(contentImagePath, styleImagePath):
    contentImage = loadImage(contentImagePath)
    styleImage = loadImage(styleImagePath)
    hubModel = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')
    stylizedImage = hubModel(tf.constant(contentImage), tf.constant(styleImage))[0]
    return stylizedImage

Moving on to our next method, we define stylizeImages for producing our final stylized image. Let's see how we accomplish this.

  1. contentImage = loadImage(contentImagePath): Loads and processes our content image using the loadImage method we just defined.

  2. styleImage = loadImage(styleImagePath): Does the same for the style image.

  3. hubModel = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2'): Loads a neural style transfer model from TensorFlow Hub.

  4. stylizedImage = hubModel(tf.constant(contentImage), tf.constant(styleImage))[0]: Applies the style transfer model to both our processed content and style images. hubModel takes two tensors representing these images and returns a list of outputs. The first element, selected with the [0] index, is the stylized image tensor.

  5. Returns the first stylized image tensor.

The tfToPILImage method

def tfToPILImage(tensor):
    tensor = tensor * 255
    tensor = np.array(tensor, dtype=np.uint8)
    if np.ndim(tensor) > 3:
        assert tensor.shape[0] == 1
        tensor = tensor[0]
    return PIL.Image.fromarray(tensor)

We then define a method to convert our tf tensor to a PIL compatible image.

  1. tensor = tensor * 255: Scales the pixel values of the input tensor by 255 so that we can convert normalized pixel values back to the 0-255 range.

  2. tensor = np.array(tensor, dtype=np.uint8): Converts the input tensor to a NumPy array of 8-bit unsigned integers, since that is the format PIL expects.

  3. if np.ndim(tensor) > 3:: Checks if the tensor has more than three dimensions to handle cases where the tensor has an extra batch dimension.

  4. assert tensor.shape[0] == 1: Raises an error if the first dimension isn't equal to 1, to ensure that the tensor represents a single image.

  5. tensor = tensor[0]: Keeps only the first image in the batch.

  6. return PIL.Image.fromarray(tensor): Converts the NumPy array tensor to an image in PIL.
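The conversion steps above can be tried on a small array without TensorFlow or PIL. The variant below is a sketch written for this explanation (the name `to_uint8_image` is hypothetical). It adds a clipping step as a defensive measure: casting a value above 255 to uint8 does not give a meaningful pixel value. Whether the model ever produces values outside the range 0 to 1 is not something this sketch assumes either way.

```python
import numpy as np

def to_uint8_image(array):
    """Scale values in [0, 1] to [0, 255] and drop a leading batch dimension of 1."""
    array = np.asarray(array, dtype=np.float32)
    if array.ndim > 3:
        if array.shape[0] != 1:
            raise ValueError("expected a batch containing exactly one image")
        array = array[0]
    # Clip before casting so out-of-range values saturate instead of wrapping.
    scaled = np.clip(array * 255.0, 0.0, 255.0)
    return scaled.astype(np.uint8)

# Hypothetical batch holding one 1x1 RGB image, with one value above 1.0.
batch = np.array([[[[0.0, 0.5, 1.2]]]])
pixels = to_uint8_image(batch)
print(pixels.shape)     # (1, 1, 3)
print(pixels.tolist())  # [[[0, 127, 255]]]
```

The resulting array has the height, width, and channel layout that PIL.Image.fromarray accepts for RGB images.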

The plotStylizedImages method

def plotStylizedImages(contentImage, styleImage, finalImage):
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=[0], y=[0], mode='markers', marker_opacity=0))  # Dummy trace for layout
    fig.add_layout_image(
        source=contentImage,
        xref="x",
        yref="y",
        x=-0.1,
        y=0,
        sizex=0.35,
        sizey=0.35,
        xanchor="left",
        yanchor="top"
    )
    fig.add_layout_image(
        source=styleImage,
        xref="x",
        yref="y",
        x=0.25,
        y=0,
        sizex=0.4,
        sizey=0.4,
        xanchor="left",
        yanchor="top"
    )
    fig.add_layout_image(
        source=finalImage,
        xref="x",
        yref="y",
        x=0.7,
        y=0,
        sizex=0.4,
        sizey=0.4,
        xanchor="left",
        yanchor="top"
    )
    fig.update_layout(
        xaxis=dict(showgrid=False, zeroline=False, range=[-0.1, 1]),
        yaxis=dict(showgrid=False, zeroline=False, range=[-0.5, 0.1]),
        width=1720,
        height=1000
    )
    return fig

We define a utility method called plotStylizedImages to aid the data visualization aspect of our code. This function simply depicts the three images on a Plotly graph by customizing the display properties.

The main method

def main():
    contentPath = 'izza.jpg'
    stylePath = 'starry_night.png'
    contentImage = PIL.Image.open(contentPath)
    styleImage = PIL.Image.open(stylePath)
    stylizedImage = stylizeImages(contentPath, stylePath)
    finalImage = tfToPILImage(stylizedImage)
    pio.write_html(plotStylizedImages(contentImage, styleImage, finalImage), 'output.html')
    finalImage.save("final-output.jpg")

if __name__ == "__main__":
    main()

The main method is where the neural style transfer is actually triggered. It loads the content and style images using PIL, applies neural style transfer to get the final stylized image, and saves both the stylized image and the Plotly HTML output.

Complete code

We've reached the end of the code explanation.

Here's the complete code. Feel free to experiment with it. Running it saves the stylized image as final-output.jpg and writes output.html, in which Plotly displays the content, style, and final stylized images.

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
import PIL.Image
import plotly.graph_objects as go
import plotly.io as pio

def loadImage(imagePath):
    
    maxDimension = 650
    img = tf.io.read_file(imagePath)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)

    shape = tf.cast(tf.shape(img)[:-1], tf.float32)
    longDimension = max(shape)
    scale = maxDimension / longDimension

    newShape = tf.cast(shape * scale, tf.int32)

    img = tf.image.resize(img, newShape)
    img = img[tf.newaxis, :]
    return img

def tfToPILImage(tensor):
    tensor = tensor * 255

    tensor = np.array(tensor, dtype=np.uint8)
    if np.ndim(tensor) > 3:
        assert tensor.shape[0] == 1
        tensor = tensor[0]

    return PIL.Image.fromarray(tensor)
    
def stylizeImages(contentImagePath, styleImagePath):
    contentImage = loadImage(contentImagePath)
    styleImage = loadImage(styleImagePath)

    hubModel = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')

    stylizedImage = hubModel(tf.constant(contentImage), tf.constant(styleImage))[0]

    return stylizedImage


def plotStylizedImages(contentImage, styleImage, finalImage):
    fig = go.Figure()

    fig.add_trace(go.Scatter(x=[0], y=[0], mode='markers', marker_opacity=0)) 

    fig.add_layout_image(
        source=contentImage,
        xref="x",
        yref="y",
        x=-0.1,
        y=0,
        sizex=0.35,
        sizey=0.35,
        xanchor="left",
        yanchor="top"
    )
    fig.add_layout_image(
        source=styleImage,
        xref="x",
        yref="y",
        x=0.25,
        y=0,
        sizex=0.4,
        sizey=0.4,
        xanchor="left",
        yanchor="top"
    )
    fig.add_layout_image(
        source=finalImage,
        xref="x",
        yref="y",
        x=0.7,
        y=0,
        sizex=0.4,
        sizey=0.4,
        xanchor="left",
        yanchor="top"
    )

    fig.update_layout(
        xaxis=dict(showgrid=False, zeroline=False, range=[-0.1, 1]),  
        yaxis=dict(showgrid=False, zeroline=False, range=[-0.5, 0.1]),
        width=1720,  
        height=1000
    )
    return fig


def main():
    contentPath = 'izza.jpg'
    stylePath = 'starry_night.png'

    contentImage = PIL.Image.open(contentPath)
    styleImage = PIL.Image.open(stylePath)

    stylizedImage = stylizeImages(contentPath, stylePath)
    finalImage = tfToPILImage(stylizedImage)

    pio.write_html(plotStylizedImages(contentImage, styleImage, finalImage), 'output.html')
    finalImage.save("final-output.jpg")

if __name__ == "__main__":
    main()

Demonstration of neural style transfers

Let's analyze the output of our code.

  1. We first choose a content image. Since this is an image of a person, we naturally want to preserve the person in our image.

Content image

  2. Next, we choose a style image. The artistic style of this image will be applied to our content image.

Style image

  3. The pre-trained model, which was trained to minimize the content and style losses, generates the following image in a single forward pass.

Final stylized image

That's it! We've learned how neural style transfer works using TensorFlow's model.

In conclusion, neural style transfer is a fascinating technique that combines the content of one image with the artistic style of another, resulting in a brand-new artistic composition. The technique still has limitations, but further advances should produce more refined results and give developers and artists more room for creative exploration.


Copyright ©2024 Educative, Inc. All rights reserved