DeepLearning.ai Homework: (4-4) -- Special Applications: Face Recognition and Neural Style Transfer
Published: 2019-05-25



title: 'DeepLearning.ai Homework: (4-4) -- Special Applications: Face Recognition and Neural Style Transfer'
id: dl-ai-4-4h
tags:
  • homework
categories:
  • AI
  • Deep Learning
date: 2018-10-12 18:55:20

Originally published on my personal blog; visitors are welcome.

This week's assignment has two parts:

  • Face recognition
  • Style transfer

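Part 1: Face recognition

Training FaceNet from scratch is not realistic here, so the model comes pre-trained. We only study the loss function and then call the model to do some simple recognition.

First, implement the triplet_loss function, which takes 4 steps: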

```python
import tensorflow as tf

# GRADED FUNCTION: triplet_loss

def triplet_loss(y_true, y_pred, alpha = 0.2):
    """
    Implementation of the triplet loss as defined by formula (3)

    Arguments:
    y_true -- true labels, required when you define a loss in Keras, you don't need it in this function.
    y_pred -- python list containing three objects:
            anchor -- the encodings for the anchor images, of shape (None, 128)
            positive -- the encodings for the positive images, of shape (None, 128)
            negative -- the encodings for the negative images, of shape (None, 128)

    Returns:
    loss -- real number, value of the loss
    """

    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]

    ### START CODE HERE ### (≈ 4 lines)
    # Step 1: Compute the (encoding) distance between the anchor and the positive, you will need to sum over axis=-1
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    # Step 2: Compute the (encoding) distance between the anchor and the negative, you will need to sum over axis=-1
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    # Step 3: subtract the two previous distances and add alpha.
    basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
    # Step 4: Take the maximum of basic_loss and 0.0. Sum over the training examples.
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0.))
    ### END CODE HERE ###

    return loss
```
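To check the four steps by hand, here is a minimal NumPy sketch of the same computation on two tiny triplets. The name `triplet_loss_np` and the 2-D toy encodings are made up for illustration; the assignment itself works on 128-D encodings in TensorFlow.

```python
import numpy as np

def triplet_loss_np(anchor, positive, negative, alpha=0.2):
    """NumPy version of the triplet loss; inputs have shape (m, d). Hypothetical helper."""
    # Steps 1 and 2: squared L2 distance per example, summed over the encoding axis.
    pos_dist = np.sum(np.square(anchor - positive), axis=-1)
    neg_dist = np.sum(np.square(anchor - negative), axis=-1)
    # Step 3: margin term. Step 4: hinge at zero, then sum over examples.
    basic_loss = pos_dist - neg_dist + alpha
    return float(np.sum(np.maximum(basic_loss, 0.0)))

# Toy 2-D encodings (assumption: real encodings are 128-D).
anchor = np.array([[0.0, 0.0], [1.0, 0.0]])
positive = np.array([[0.0, 1.0], [1.0, 0.0]])
negative = np.array([[0.0, 2.0], [1.0, 0.3]])

loss = triplet_loss_np(anchor, positive, negative)
print(loss)
```

The first triplet already satisfies the margin (1 - 4 + 0.2 is negative), so it contributes nothing. The second one has its negative too close to the anchor (0 - 0.09 + 0.2 = 0.11), so only it adds to the loss.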

Verify a single face:

```python
import numpy as np

# GRADED FUNCTION: verify

def verify(image_path, identity, database, model):
    """
    Function that verifies if the person on the "image_path" image is "identity".

    Arguments:
    image_path -- path to an image
    identity -- string, name of the person you'd like to verify the identity. Has to be a resident of the Happy house.
    database -- python dictionary mapping names of allowed people's names (strings) to their encodings (vectors).
    model -- your Inception model instance in Keras

    Returns:
    dist -- distance between the image_path and the image of "identity" in the database.
    door_open -- True, if the door should open. False otherwise.
    """

    ### START CODE HERE ###

    # Step 1: Compute the encoding for the image. Use img_to_encoding() see example above. (≈ 1 line)
    encoding = img_to_encoding(image_path, model)

    # Step 2: Compute distance with identity's image (≈ 1 line)
    dist = np.linalg.norm(encoding - database[identity])

    # Step 3: Open the door if dist < 0.7, else don't open (≈ 3 lines)
    if dist < 0.7:
        print("It's " + str(identity) + ", welcome home!")
        door_open = True
    else:
        print("It's not " + str(identity) + ", please go away")
        door_open = False

    ### END CODE HERE ###

    return dist, door_open
```
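The threshold logic can be tried without the pre-trained model. The sketch below assumes the encodings are already computed; `verify_encoding`, the name "alice", and the 3-D vectors are all hypothetical stand-ins for the course's `img_to_encoding` outputs.

```python
import numpy as np

def verify_encoding(encoding, identity, database, threshold=0.7):
    """Return (dist, door_open) for a precomputed encoding. Hypothetical helper."""
    # Plain L2 norm, as in verify(); note the triplet loss uses the squared distance.
    dist = float(np.linalg.norm(encoding - database[identity]))
    return dist, dist < threshold

# Toy 3-D encodings standing in for the 128-D model outputs (assumption).
database = {"alice": np.array([1.0, 0.0, 0.0])}

dist_ok, open_ok = verify_encoding(np.array([0.8, 0.0, 0.0]), "alice", database)
dist_bad, open_bad = verify_encoding(np.array([0.0, 1.0, 0.0]), "alice", database)
print(dist_ok, open_ok)
print(dist_bad, open_bad)
```

The first query sits about 0.2 away from the stored encoding, so the door opens. The second is about 1.41 away and is rejected.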

Run face recognition:

```python
# GRADED FUNCTION: who_is_it

def who_is_it(image_path, database, model):
    """
    Implements face recognition for the happy house by finding who is the person on the image_path image.

    Arguments:
    image_path -- path to an image
    database -- database containing image encodings along with the name of the person on the image
    model -- your Inception model instance in Keras

    Returns:
    min_dist -- the minimum distance between image_path encoding and the encodings from the database
    identity -- string, the name prediction for the person on image_path
    """

    ### START CODE HERE ###

    ## Step 1: Compute the target "encoding" for the image. Use img_to_encoding() see example above. ## (≈ 1 line)
    encoding = img_to_encoding(image_path, model)

    ## Step 2: Find the closest encoding ##

    # Initialize "min_dist" to a large value, say 100 (≈1 line)
    min_dist = 100
    # Defined up front so the return below cannot hit an unbound name.
    identity = None

    # Loop over the database dictionary's names and encodings.
    for (name, db_enc) in database.items():

        # Compute L2 distance between the target "encoding" and the current "emb" from the database. (≈ 1 line)
        dist = np.linalg.norm(encoding - db_enc)

        # If this distance is less than the min_dist, then set min_dist to dist, and identity to name. (≈ 3 lines)
        if dist < min_dist:
            min_dist = dist
            identity = name

    ### END CODE HERE ###

    if min_dist > 0.7:
        print("Not in the database.")
    else:
        print("it's " + str(identity) + ", the distance is " + str(min_dist))

    return min_dist, identity
```
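The loop above is a nearest-neighbour search over the database. As a rough sketch, the same search can be vectorized with NumPy. Here `closest_identity` and the toy names are hypothetical, and the encodings are assumed to be 1-D vectors rather than the model's actual output shape.

```python
import numpy as np

def closest_identity(encoding, database, threshold=0.7):
    """Return (min_dist, name); name is None when nobody is within the threshold."""
    names = list(database)
    encodings = np.stack([database[n] for n in names])    # shape (num_people, d)
    dists = np.linalg.norm(encodings - encoding, axis=1)  # one L2 distance per person
    best = int(np.argmin(dists))
    min_dist = float(dists[best])
    return min_dist, (names[best] if min_dist <= threshold else None)

# Toy 2-D database (assumption: real encodings come from the pre-trained model).
database = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}

known_dist, known_name = closest_identity(np.array([0.9, 0.1]), database)
unknown_dist, unknown_name = closest_identity(np.array([-1.0, -1.0]), database)
print(known_dist, known_name)
print(unknown_dist, unknown_name)
```

Unlike `who_is_it`, this sketch returns `None` for the name when the closest match is still farther than the threshold, instead of returning the closest name and only printing a message.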

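Part 2: Style transfer

The model here is pre-trained as well; it is a VGG-19 network. The assignment is only about getting a feel for how the cost function is implemented.

Compute J_content(C,G)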

$$J_{content}(C,G) = \frac{1}{4 \times n_H \times n_W \times n_C} \sum_{\text{all entries}} \left(a^{(C)} - a^{(G)}\right)^2$$

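Here the 3-D activation volume is first unrolled into a 2-D matrix before computing the cost. (The content cost can also be computed without unrolling, but the style cost below needs the unrolled form.)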

```python
def compute_content_cost(a_C, a_G):
    """
    Computes the content cost

    Arguments:
    a_C -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image C
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image G

    Returns:
    J_content -- scalar that you compute using equation 1 above.
    """

    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape a_C and a_G (≈2 lines)
    a_C_unrolled = tf.reshape(a_C, [n_H * n_W, n_C])
    a_G_unrolled = tf.reshape(a_G, [n_H * n_W, n_C])

    # compute the cost with tensorflow (≈1 line)
    J_content = tf.reduce_sum(tf.square(a_C_unrolled - a_G_unrolled)) / (n_H * n_W * n_C * 4)
    ### END CODE HERE ###

    return J_content
```
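A quick NumPy check, assuming toy activations of shape (1, 2, 2, 3), shows that unrolling does not change the content cost: the sum runs over all entries either way. The name `content_cost_np` is made up for this sketch.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """NumPy version of the content cost with explicit unrolling. Hypothetical helper."""
    _, n_H, n_W, n_C = a_G.shape
    a_C_unrolled = a_C.reshape(n_H * n_W, n_C)
    a_G_unrolled = a_G.reshape(n_H * n_W, n_C)
    return float(np.sum((a_C_unrolled - a_G_unrolled) ** 2) / (4 * n_H * n_W * n_C))

# Toy activations (assumption): every entry differs by exactly 1.
a_C = np.zeros((1, 2, 2, 3))
a_G = np.ones((1, 2, 2, 3))

J_unrolled = content_cost_np(a_C, a_G)
J_direct = float(np.sum((a_C - a_G) ** 2) / (4 * 2 * 2 * 3))  # same sum, no reshape
print(J_unrolled, J_direct)
```

With 12 entries each contributing 1, the sum is 12 and the normalizer is 4 × 12 = 48, so both versions give 0.25.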

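Compute J_style(S,G)

The 3-D activation volume has to be unrolled and transposed, and then multiplied by its own transpose, to obtain the correlation (Gram) matrix.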

```python
# GRADED FUNCTION: gram_matrix

def gram_matrix(A):
    """
    Argument:
    A -- matrix of shape (n_C, n_H*n_W)

    Returns:
    GA -- Gram matrix of A, of shape (n_C, n_C)
    """

    ### START CODE HERE ### (≈1 line)
    GA = tf.matmul(A, tf.transpose(A))
    ### END CODE HERE ###

    return GA
```
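For intuition, here is a small worked Gram matrix in NumPy. The 2 × 3 matrix is an arbitrary toy input with n_C = 2 channels and n_H*n_W = 3 positions.

```python
import numpy as np

# Each row is one channel's activations, unrolled over the spatial positions.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

# Entry (i, j) is the dot product between channel i and channel j.
GA = A @ A.T
print(GA)
```

The result is [[5, 2], [2, 2]]: the diagonal holds each channel's squared norm (how active it is), and the off-diagonal entries measure how strongly two channels fire together. The matrix is symmetric by construction.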

$$J_{style}^{[l]}(S,G) = \frac{1}{4 \times n_C^2 \times (n_H \times n_W)^2} \sum_{i=1}^{n_C} \sum_{j=1}^{n_C} \left(G^{(S)}_{ij} - G^{(G)}_{ij}\right)^2$$

```python
# GRADED FUNCTION: compute_layer_style_cost

def compute_layer_style_cost(a_S, a_G):
    """
    Arguments:
    a_S -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image S
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image G

    Returns:
    J_style_layer -- tensor representing a scalar value, style cost defined above by equation (2)
    """

    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape the images to have them of shape (n_C, n_H*n_W) (≈2 lines)
    a_S = tf.transpose(tf.reshape(a_S, [n_H * n_W, n_C]))
    a_G = tf.transpose(tf.reshape(a_G, [n_H * n_W, n_C]))

    # Computing gram_matrices for both images S and G (≈2 lines)
    GS = gram_matrix(a_S)
    GG = gram_matrix(a_G)

    # Computing the loss (≈1 line)
    J_style_layer = 1 / (4 * (n_C * n_W * n_H) ** 2) * tf.reduce_sum(tf.square(tf.subtract(GS, GG)))
    ### END CODE HERE ###

    return J_style_layer
```
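One detail worth checking is why the code reshapes to (n_H*n_W, n_C) and then transposes, rather than reshaping straight to (n_C, n_H*n_W). The NumPy sketch below, with hypothetical helper names and a toy (1, 2, 2, 2) input, shows that the direct reshape mixes channels within a row.

```python
import numpy as np

def unroll_channels_first(a):
    """(1, n_H, n_W, n_C) -> (n_C, n_H*n_W), keeping each channel on its own row."""
    _, n_H, n_W, n_C = a.shape
    return a.reshape(n_H * n_W, n_C).T

def layer_style_cost_np(a_S, a_G):
    """NumPy version of the per-layer style cost. Hypothetical helper."""
    _, n_H, n_W, n_C = a_G.shape
    S = unroll_channels_first(a_S)
    G = unroll_channels_first(a_G)
    GS, GG = S @ S.T, G @ G.T
    return float(np.sum((GS - GG) ** 2) / (4 * n_C ** 2 * (n_H * n_W) ** 2))

# Toy input (assumption): channel 0 is all ones, channel 1 is all twos.
a = np.stack([np.ones((1, 2, 2)), 2 * np.ones((1, 2, 2))], axis=-1)  # shape (1, 2, 2, 2)

good = unroll_channels_first(a)  # rows: [1 1 1 1] and [2 2 2 2]
bad = a.reshape(2, 4)            # rows: [1 2 1 2] and [1 2 1 2], channels interleaved

cost_same = layer_style_cost_np(a, a)
cost_vs_zero = layer_style_cost_np(a, np.zeros_like(a))
print(good, bad, cost_same, cost_vs_zero)
```

Because the channel axis is last in memory, a direct reshape to (n_C, n_H*n_W) would build a Gram matrix over scrambled rows. With the correct unrolling, identical inputs give a cost of 0, and the toy input against all zeros gives 400 / 256 = 1.5625.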
```python
# GRADED FUNCTION: total_cost

def total_cost(J_content, J_style, alpha = 10, beta = 40):
    """
    Computes the total cost function

    Arguments:
    J_content -- content cost coded above
    J_style -- style cost coded above
    alpha -- hyperparameter weighting the importance of the content cost
    beta -- hyperparameter weighting the importance of the style cost

    Returns:
    J -- total cost as defined by the formula above.
    """

    ### START CODE HERE ### (≈1 line)
    J = alpha * J_content + beta * J_style
    ### END CODE HERE ###

    return J
```

```python
### START CODE HERE ### (1 line)
J = total_cost(J_content, J_style, alpha = 10, beta = 40)
### END CODE HERE ###
```
```python
# Written against the TensorFlow 1.x session API; model, train_step, J,
# J_content, J_style and save_image come from earlier cells of the assignment notebook.
def model_nn(sess, input_image, num_iterations = 200):

    # Initialize global variables (you need to run the session on the initializer)
    ### START CODE HERE ### (1 line)
    sess.run(tf.global_variables_initializer())
    ### END CODE HERE ###

    # Run the noisy input image (initial generated image) through the model. Use assign().
    ### START CODE HERE ### (1 line)
    generated_image = sess.run(model['input'].assign(input_image))
    ### END CODE HERE ###

    for i in range(num_iterations):

        # Run the session on the train_step to minimize the total cost
        ### START CODE HERE ### (1 line)
        sess.run(train_step)
        ### END CODE HERE ###

        # Compute the generated image by running the session on the current model['input']
        ### START CODE HERE ### (1 line)
        generated_image = sess.run(model['input'])
        ### END CODE HERE ###

        # Print every 20 iteration.
        if i % 20 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))

            # save current generated image in the "/output" directory
            save_image("output/" + str(i) + ".png", generated_image)

    # save last generated image
    save_image('output/generated_image.jpg', generated_image)

    return generated_image
```
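The key idea in `model_nn` is that the generated image itself is the variable being optimized, while the network weights stay fixed. The sketch below illustrates only that idea and is not the assignment's optimizer: it runs plain gradient descent on the content cost alone, treats the activations as the thing being updated (as if the network were the identity), and uses a made-up learning rate and random toy data.

```python
import numpy as np

def content_cost_and_grad(a_C, a_G):
    """Content cost and its gradient with respect to a_G. Hypothetical helper."""
    n = a_G.size                  # n_H * n_W * n_C for a single image
    diff = a_G - a_C
    cost = float(np.sum(diff ** 2) / (4 * n))
    grad = diff / (2 * n)         # derivative of sum(diff^2) / (4n) w.r.t. a_G
    return cost, grad

rng = np.random.default_rng(0)
a_C = rng.normal(size=(1, 4, 4, 3))  # stand-in for the content activations
a_G = rng.normal(size=(1, 4, 4, 3))  # stand-in for the noisy starting point

learning_rate = 10.0                 # arbitrary value chosen for this toy problem
costs = []
for i in range(50):
    cost, grad = content_cost_and_grad(a_C, a_G)
    costs.append(cost)
    a_G = a_G - learning_rate * grad  # update the "image", not any weights

print(costs[0], costs[-1])
```

Each step shrinks the difference between `a_G` and `a_C` by a constant factor, so the cost falls steadily. In the real assignment the gradient flows back through VGG-19 to the pixels and the style term is included as well.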

Reposted from: http://drrii.baihongyu.com/
