nlp_paper's Issues
self-supervised, semi-supervised
Semi-supervised Learning
Source: http://jalammar.github.io/illustrated-bert/
Self-supervised Learning
Source: https://www.kakaobrain.com/blog/118
LeCun updated his cake recipe last week at the 2019 International Solid-State Circuits Conference (ISSCC) in San Francisco, replacing “unsupervised learning” with “self-supervised learning,” a variant of unsupervised learning where the data provides the supervision.
Source: https://syncedreview.com/2019/02/22/yann-lecun-cake-analogy-2-0/
Source: https://www.slideshare.net/rouyunpan/deep-learning-hardware-past-present-future
Source: https://www.slideshare.net/xavigiro/selfsupervised-learning-from-video-sequences-xavier-giro-upc-barcelona-2019
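To make "the data provides the supervision" concrete, here is a minimal sketch of one common self-supervised setup: next-token prediction. The function name `next_token_pairs` and the parameter `k` are made up for this note; the point is only that the targets are cut out of the raw text, so no human labels are needed.

```python
def next_token_pairs(tokens, k):
    """Build (context, target) pairs from unlabeled tokens.

    The target is simply the next token, so the "label" comes from the
    data itself. `k` is the maximum context length (hypothetical name).
    """
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - k):i]  # at most k previous tokens
        pairs.append((context, tokens[i]))
    return pairs


tokens = "the data provides the supervision".split()
pairs = next_token_pairs(tokens, k=2)
for context, target in pairs:
    print(context, "->", target)
```

Any model trained on these pairs is trained with a supervised-style loss, even though the corpus itself was never annotated.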
Labeled data
Source: https://medium.com/@behnamsabeti/various-types-of-supervision-in-machine-learning-c7f32c190fbe
In the semi-supervised learning setting, the goal is to use both a small labeled training set and a much larger unlabeled data set.
Source: http://ai.stanford.edu/blog/weak-supervision/
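One way to use a small labeled set together with a larger unlabeled set is pseudo-labeling (self-training). The sketch below is only an illustration of that one technique, not something taken from the linked posts: the data is a made-up 1-D toy set, and the nearest-centroid classifier and all function names are assumptions chosen to keep the example short.

```python
def fit_centroids(points, labels):
    """Return {label: mean of the 1-D points carrying that label}."""
    centroids = {}
    for c in sorted(set(labels)):
        members = [p for p, y in zip(points, labels) if y == c]
        centroids[c] = sum(members) / len(members)
    return centroids


def predict(centroids, point):
    """Assign the label of the nearest centroid."""
    return min(centroids, key=lambda c: abs(point - centroids[c]))


# Small labeled set and a larger unlabeled set (toy values).
labeled_x = [0.0, 1.0, 9.0, 10.0]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [0.5, 1.5, 2.0, 8.0, 8.5, 9.5, 11.0, 12.0]

# Step 1: fit on the labeled data only.
centroids = fit_centroids(labeled_x, labeled_y)

# Step 2: pseudo-label the unlabeled points with the current model.
pseudo_y = [predict(centroids, x) for x in unlabeled_x]

# Step 3: refit on labeled + pseudo-labeled data.
centroids_ssl = fit_centroids(labeled_x + unlabeled_x, labeled_y + pseudo_y)
print(pseudo_y)
print(centroids_ssl)
```

Real pseudo-labeling pipelines usually keep only confident predictions and repeat steps 2 and 3 several times; both are left out here.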
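- Self-supervised learning is a form of unsupervised learning.
- Transfer learning: a pre-trained model becomes a fine-tuned model, either by freezing the pre-trained layers or by fine-tuning them.
- A pre-trained model is not always the product of unsupervised learning: in image transfer learning, the model is typically pre-trained on labeled data.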
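The difference between freezing and fine-tuning can be sketched as a single gradient step in which some parameters are skipped. Everything here is hypothetical: the scalar "weights", the gradient values, and the `sgd_step` helper stand in for a real network and optimizer.

```python
def sgd_step(params, grads, frozen, lr=0.1):
    """Return updated params; names listed in `frozen` are left unchanged."""
    return {
        name: value if name in frozen else value - lr * grads[name]
        for name, value in params.items()
    }


# Scalar stand-ins for a pre-trained body and a newly added task head.
pretrained = {"body": 2.0, "head": 0.0}
grads = {"body": 1.0, "head": -4.0}  # made-up gradients from a downstream task

# Option 1: freeze the pre-trained body and train only the head.
frozen_run = sgd_step(pretrained, grads, frozen={"body"})

# Option 2: fine-tune everything, including the pre-trained body.
finetune_run = sgd_step(pretrained, grads, frozen=set())

print(frozen_run)
print(finetune_run)
```

In the frozen run only `head` moves; in the fine-tuning run `body` is updated as well.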
GPT implementation
- generative/discriminative
  - discriminative strategy? -> check in the code
  - context window? -> check in the code
- multi-layer? -> check in the code
- softmax(h_m^l)
  - recursive? -> mask
  - is only m used?
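On the `recursive? -> mask` note: a Transformer decoder does not step through the sequence recurrently; all positions are computed in parallel, and a causal mask keeps each position from attending to later ones. Below is a minimal pure-Python sketch of that masking step. The scores are made up and `causal_mask` / `masked_softmax` are names invented for this note, not taken from any GPT code base.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked-out positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        visible = [s for s, ok in zip(row, allowed) if ok]
        top = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - top) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Made-up attention scores for a 3-token sequence.
scores = [[1.0, 2.0, 3.0],
          [1.0, 1.0, 5.0],
          [0.0, 0.0, 0.0]]
weights = masked_softmax(scores, causal_mask(3))
for row in weights:
    print(row)
```

The first row puts all its weight on position 0, and the large score 5.0 in the second row is ignored because it belongs to a future position.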
RNN & LSTM in TF 1.x
Using static_rnn()
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

n_inputs = 3
n_neurons = 5
#############################################################
# reset_graph() is not defined in this snippet; tf.reset_default_graph()
# is the built-in TF 1.x call that clears the default graph.
tf.reset_default_graph()
# One placeholder per time step, each of shape [batch, n_inputs]
X0 = tf.placeholder(tf.float32, [None, n_inputs])
X1 = tf.placeholder(tf.float32, [None, n_inputs])
basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
# static_rnn unrolls the cell once per tensor in the input list
output_seqs, states = tf.nn.static_rnn(basic_cell, [X0, X1],
                                       dtype=tf.float32)
Y0, Y1 = output_seqs
#############################################################
init = tf.global_variables_initializer()
X0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]])  # t = 0
X1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]])  # t = 1
#############################################################
with tf.Session() as sess:
    init.run()
    Y0_val, Y1_val = sess.run([Y0, Y1], feed_dict={X0: X0_batch, X1: X1_batch})
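To see what the two unrolled steps compute, here is a NumPy-only sketch of a basic RNN cell: one weight matrix applied to the concatenated input and previous state, followed by tanh. The `kernel` and `bias` values are random stand-ins (TensorFlow creates and initialises its own variables), so the numbers will not match `Y0_val` and `Y1_val`; only the shapes and the data flow are meant to correspond.

```python
import numpy as np

n_inputs, n_neurons = 3, 5
rng = np.random.default_rng(0)

# Hypothetical weights standing in for the cell's trainable variables.
kernel = rng.normal(size=(n_inputs + n_neurons, n_neurons))
bias = np.zeros(n_neurons)


def rnn_step(x, state):
    """One basic RNN step: tanh([x, state] @ kernel + bias)."""
    return np.tanh(np.concatenate([x, state], axis=1) @ kernel + bias)


X0_batch = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 0, 1]], dtype=float)
X1_batch = np.array([[9, 8, 7], [0, 0, 0], [6, 5, 4], [3, 2, 1]], dtype=float)

state = np.zeros((X0_batch.shape[0], n_neurons))  # zero initial state
Y0 = rnn_step(X0_batch, state)  # output at t = 0
Y1 = rnn_step(X1_batch, Y0)     # output at t = 1 reuses Y0 as the state
print(Y0.shape, Y1.shape)
```

For a basic RNN cell the output at each step is also the state passed to the next step, which is why `Y0` is fed back in directly.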
Packing sequences
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

n_steps = 2
n_inputs = 3
n_neurons = 5
#############################################################
# reset_graph() is not defined in this snippet; tf.reset_default_graph()
# is the built-in TF 1.x call that clears the default graph.
tf.reset_default_graph()
# A single placeholder holds every time step: [batch, n_steps, n_inputs]
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
# Move the time axis first, then split into one tensor per time step
X_seqs = tf.unstack(tf.transpose(X, perm=[1, 0, 2]))
basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
output_seqs, states = tf.nn.static_rnn(basic_cell, X_seqs,
                                       dtype=tf.float32)
# Stack the per-step outputs and restore [batch, n_steps, n_neurons]
outputs = tf.transpose(tf.stack(output_seqs), perm=[1, 0, 2])
#############################################################
init = tf.global_variables_initializer()
X_batch = np.array([
    # t = 0      t = 1
    [[0, 1, 2], [9, 8, 7]],  # instance 1
    [[3, 4, 5], [0, 0, 0]],  # instance 2
    [[6, 7, 8], [6, 5, 4]],  # instance 3
    [[9, 0, 1], [3, 2, 1]],  # instance 4
])
#############################################################
with tf.Session() as sess:
    init.run()
    outputs_val = outputs.eval(feed_dict={X: X_batch})
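The transpose/unstack and stack/transpose pairs above are pure reshuffling of axes. The NumPy sketch below mirrors them on the same batch so the shapes can be checked without a TensorFlow session; it assumes `np.transpose` and `np.stack` behave like their `tf` counterparts for this case, and `restored` is a name introduced only here.

```python
import numpy as np

X_batch = np.array([
    [[0, 1, 2], [9, 8, 7]],  # instance 1
    [[3, 4, 5], [0, 0, 0]],  # instance 2
    [[6, 7, 8], [6, 5, 4]],  # instance 3
    [[9, 0, 1], [3, 2, 1]],  # instance 4
])  # shape: (batch=4, n_steps=2, n_inputs=3)

# Like tf.transpose(X, perm=[1, 0, 2]) followed by tf.unstack:
# a list with one (batch, n_inputs) array per time step.
X_seqs = list(np.transpose(X_batch, (1, 0, 2)))

# Like tf.stack(output_seqs) followed by the transpose back to batch-major.
restored = np.transpose(np.stack(X_seqs), (1, 0, 2))

print(len(X_seqs), X_seqs[0].shape, restored.shape)
```

`X_seqs[0]` holds the t = 0 rows of every instance and `X_seqs[1]` the t = 1 rows, which matches the two separate placeholders of the earlier version.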
GPT
- generative/discriminative?
- effective transfer?
- unsupervised pre-training & semi-supervised?
- decoder process
- unsupervised pre-training objective: $p(u)$
  - context window $k$?
- QA input transformation
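
For reference, these are the objectives the questions above point at, written from memory of the GPT paper (Radford et al., 2018), so the notation should be checked against the paper before relying on it. Assumed symbols: $\mathcal{U} = \{u_1, \dots, u_n\}$ is the unlabeled token corpus, $k$ the context window, $\Theta$ the model parameters, $\mathcal{C}$ the labeled dataset with input tokens $x^1, \dots, x^m$ and label $y$, $h_l^m$ the final block's activation at the last position $m$ (written `h_m^l` in the notes above), $W_y$ the added output layer, and $\lambda$ a weighting factor.

$$
L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \dots, u_{i-1}; \Theta)
$$

$$
P(y \mid x^1, \dots, x^m) = \mathrm{softmax}(h_l^m W_y), \qquad
L_2(\mathcal{C}) = \sum_{(x, y)} \log P(y \mid x^1, \dots, x^m)
$$

$$
L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \, L_1(\mathcal{C})
$$

Read this way, pre-training maximizes the generative objective $L_1$ on unlabeled text, fine-tuning maximizes the discriminative objective $L_2$ on labeled data, and $L_3$ keeps the language-modeling loss as an auxiliary term during fine-tuning.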