

nlp_paper's Issues

self-supervised, semi-supervised

Semi-supervised Learning

[image] Source: http://jalammar.github.io/illustrated-bert/

Self-supervised Learning

[image] Source: https://www.kakaobrain.com/blog/118

LeCun updated his cake recipe at the 2019 International Solid-State Circuits Conference (ISSCC) in San Francisco, replacing “unsupervised learning” with “self-supervised learning,” a variant of unsupervised learning where the data itself provides the supervision.

[image] Source: https://syncedreview.com/2019/02/22/yann-lecun-cake-analogy-2-0/
[image] Source: https://www.slideshare.net/rouyunpan/deep-learning-hardware-past-present-future

[image] Source: https://www.slideshare.net/xavigiro/selfsupervised-learning-from-video-sequences-xavier-giro-upc-barcelona-2019

Labeled data

[image] Source: https://medium.com/@behnamsabeti/various-types-of-supervision-in-machine-learning-c7f32c190fbe

In the semi-supervised learning setting, the goal is to use both a small labeled training set and a much larger unlabeled data set.
[image] Source: http://ai.stanford.edu/blog/weak-supervision/
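As a concrete toy illustration of that setting, the self-training sketch below fits a nearest-centroid classifier on a small labeled set, pseudo-labels a much larger unlabeled set, and refits on both. All data and names here are made up for illustration.

```python
import numpy as np

# Hypothetical toy data: two 1-D clusters, only 4 labeled points.
rng = np.random.default_rng(0)
X_labeled = np.array([-2.0, -1.5, 1.5, 2.0])
y_labeled = np.array([0, 0, 1, 1])
X_unlabeled = np.concatenate([rng.normal(-2, 0.3, 50), rng.normal(2, 0.3, 50)])

def centroids(X, y):
    # One centroid per class.
    return np.array([X[y == c].mean() for c in (0, 1)])

def predict(X, cents):
    # Assign each point to its nearest class centroid.
    return np.argmin(np.abs(X[:, None] - cents[None, :]), axis=1)

# Step 1: fit on the small labeled set.
cents = centroids(X_labeled, y_labeled)
# Step 2: pseudo-label the unlabeled set and refit on everything.
y_pseudo = predict(X_unlabeled, cents)
cents = centroids(np.concatenate([X_labeled, X_unlabeled]),
                  np.concatenate([y_labeled, y_pseudo]))
print(cents)  # centroids refined by the unlabeled data
```

The refit centroids end up near the true cluster means, which the four labeled points alone only roughly locate.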

  • self-supervised learning is a form of unsupervised learning
  • transfer learning: a pre-trained model becomes a fine-tuned model, either by freezing layers or by fine-tuning them
  • a pre-trained model is not always the product of unsupervised learning (e.g., supervised pre-training in image transfer learning)
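The freezing-vs-fine-tuning distinction above can be sketched in NumPy. Here a fixed random projection stands in for a pre-trained feature extractor (a deliberate simplification; every name and number is hypothetical), and only the newly added head `w` is trained on the target task:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" feature extractor: a fixed random projection standing in
# for layers learned on a large source task. It is frozen: nothing below
# ever updates W_frozen.
W_frozen = rng.normal(size=(4, 8))

def features(x):
    return np.tanh(x @ W_frozen)

# Small target task whose labels depend on the frozen features plus noise.
X = rng.normal(size=(64, 4))
w_true = rng.normal(size=8)
y = features(X) @ w_true + 0.1 * rng.normal(size=64)

# Train only the new head w (the "fine-tuned" part).
w = np.zeros(8)
lr = 0.02
for _ in range(1000):
    H = features(X)
    grad = 2 * H.T @ (H @ w - y) / len(X)  # mean-squared-error gradient
    w -= lr * grad                         # head updated; extractor frozen

mse = np.mean((features(X) @ w - y) ** 2)
print(mse)  # far below the variance of y
```

Unfreezing `W_frozen` and updating it with the head would be full fine-tuning; keeping it fixed, as here, is the freezing strategy.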

GPT implementation

  • generative/discriminative
    what is the discriminative strategy? -> check in the code

  • context window? -> check in the code

  • multi-layer? -> check in the code

  • softmax(h_m^l)
    recursive? -> no, handled by the mask
    is only position m used?
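On the mask question above: a minimal NumPy sketch of how a causal (look-ahead) mask keeps softmax attention autoregressive, so position m attends only to positions <= m, with no recursion needed. The attention scores here are dummy values for illustration.

```python
import numpy as np

T = 5
scores = np.zeros((T, T))                         # dummy attention scores
mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # True strictly above diagonal
scores[mask] = -np.inf                            # block attention to the future
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
print(weights)  # row m is uniform over positions 0..m, zero afterwards
```

Because exp(-inf) is 0, the softmax assigns exactly zero weight to future positions, so each row m depends only on tokens up to m.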

RNN & LSTM in TF 1.x

Using static_rnn()

import numpy as np
import tensorflow as tf  # TF 1.x API

def reset_graph(seed=42):
    # Clear the default graph and fix seeds for reproducibility.
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

n_inputs = 3
n_neurons = 5

#############################################################
reset_graph()

X0 = tf.placeholder(tf.float32, [None, n_inputs])
X1 = tf.placeholder(tf.float32, [None, n_inputs])

basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
output_seqs, states = tf.nn.static_rnn(basic_cell, [X0, X1],
                                       dtype=tf.float32)
Y0, Y1 = output_seqs

#############################################################
init = tf.global_variables_initializer()

X0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]])
X1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]])

#############################################################
with tf.Session() as sess:
    init.run()
    Y0_val, Y1_val = sess.run([Y0, Y1], feed_dict={X0: X0_batch, X1: X1_batch})
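For intuition, a `BasicRNNCell` computes `Y_t = tanh(X_t·Wx + Y_{t-1}·Wh + b)` with the same weights reused at every step; the two-step unroll above corresponds to the NumPy sketch below. `Wx`, `Wh`, and `b` stand in for the variables the cell creates internally, so the numbers will not match the TF run.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_neurons = 3, 5
Wx = rng.normal(size=(n_inputs, n_neurons))   # input-to-hidden weights
Wh = rng.normal(size=(n_neurons, n_neurons))  # hidden-to-hidden weights
b = np.zeros(n_neurons)

X0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]], dtype=float)
X1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]], dtype=float)

Y0 = np.tanh(X0_batch @ Wx + b)             # initial state is all zeros
Y1 = np.tanh(X1_batch @ Wx + Y0 @ Wh + b)   # same weights reused at t = 1
print(Y1.shape)  # (4, 5): batch of 4, 5 recurrent units
```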

Packing sequences

n_steps = 2
n_inputs = 3
n_neurons = 5

#############################################################
reset_graph()

X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
X_seqs = tf.unstack(tf.transpose(X, perm=[1, 0, 2]))

basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
output_seqs, states = tf.nn.static_rnn(basic_cell, X_seqs,
                                       dtype=tf.float32)
outputs = tf.transpose(tf.stack(output_seqs), perm=[1, 0, 2])

#############################################################
init = tf.global_variables_initializer()
X_batch = np.array([
        # t = 0      t = 1 
        [[0, 1, 2], [9, 8, 7]], # instance 1
        [[3, 4, 5], [0, 0, 0]], # instance 2
        [[6, 7, 8], [6, 5, 4]], # instance 3
        [[9, 0, 1], [3, 2, 1]], # instance 4
    ])

#############################################################
with tf.Session() as sess:
    init.run()
    outputs_val = outputs.eval(feed_dict={X: X_batch})
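The transpose/unstack pair above only rearranges the batch-major tensor `[batch, n_steps, n_inputs]` into the list of `n_steps` tensors of shape `[batch, n_inputs]` that `static_rnn` expects; `stack`/`transpose` then undoes it on the way out. In NumPy terms:

```python
import numpy as np

X_batch = np.array([
    [[0, 1, 2], [9, 8, 7]],
    [[3, 4, 5], [0, 0, 0]],
    [[6, 7, 8], [6, 5, 4]],
    [[9, 0, 1], [3, 2, 1]],
])
# Move the time axis first, then split it into a Python list:
X_seqs = list(np.transpose(X_batch, (1, 0, 2)))  # n_steps arrays of [batch, n_inputs]
print(len(X_seqs), X_seqs[0].shape)  # 2 (4, 3)
```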

GPT

  • generative/discriminative?
  • effective transfer?
  • unsupervised pre-training & semi-supervised?
  • decoder process
  • unsupervised pre-training objective: $p(u)$
  • context window k?
  • QA input transformation
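For reference, the unsupervised pre-training objective in the list above is, in the GPT paper's notation, the standard language-modeling likelihood over a context window of size $k$:

```latex
L_1(\mathcal{U}) = \sum_i \log P\left(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta\right)
```

The context window $k$ is exactly the number of preceding tokens each prediction may condition on.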
