Dear Jack,
Thank you so much for this wonderful knowledge share. My encounters with you have made me better in this field!
There are a few very minor notation issues in the lecture 1 notes. Everything else is just perfect!
Formula (1) should be conditioned on the center word $w_t$, so the likelihood should be $$\text{Likelihood} = L(\theta) = \prod_{t=1}^{T} \prod_{\substack{-m \leq j \leq m \\ j \neq 0}} P(w_{t+j} \mid w_{t}; \theta)$$
The reason to use $j \neq 0$ rather than $j \neq m$ is that $j = 0$ refers to the center word $w_t$ itself, which should not be predicted as its own context word. The same issue applies to the other formulas.
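To make the $j \neq 0$ condition concrete, here is a minimal Python sketch of how I understand the double product: it enumerates the (center, context) pairs that the likelihood multiplies over. The function name `context_pairs` and the toy sentence are my own illustrations, not something from the notes, and I am assuming that positions outside the sentence are simply skipped.

```python
def context_pairs(tokens, m):
    """Return (center, context) pairs for a window of radius m.

    Offsets run over -m <= j <= m with j != 0, so the center word
    is never paired with itself. Positions that fall outside the
    sentence are skipped (an assumption about boundary handling).
    """
    pairs = []
    for t, center in enumerate(tokens):
        for j in range(-m, m + 1):
            if j == 0:
                continue  # j = 0 is the center word w_t itself
            if 0 <= t + j < len(tokens):
                pairs.append((center, tokens[t + j]))
    return pairs


pairs = context_pairs(["the", "quick", "fox"], m=1)
for center, context in pairs:
    print(f"P({context} | {center})")
```

Each printed term corresponds to one factor $P(w_{t+j} \mid w_t; \theta)$ in the product. Excluding $j = m$ instead would drop the rightmost context word and keep the center word paired with itself.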
For formula (3), the wording for the prediction function should be: the function for predicting the context word $v_{o}$, given the center word $v_{c}$ and the vocabulary $V$. Similarly, formula (5) should be $J(\mathbf{u_{o}} \mid \mathbf{v_{c}})$.
In the section on the partial derivative with respect to $v_c$, step 3 reads $$\frac{\partial}{\partial \mathbf{v_c}}\exp(\mathbf{u_{x}^{T} v_{c}}) = \exp(\mathbf{u_{x}^{T} v_{c}}) \cdot \frac{\partial}{\partial \mathbf{v_c}}\mathbf{u_{x}^{T} v_{c}} = \exp(\mathbf{u_{x}^{T} v_{c}}) \cdot \mathbf{u_{x}}\tag{Step 3}$$ I think $u_x$ here should be written as a vector, $\mathbf{u_{x}}$.
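As a quick sanity check that the result of step 3 is a vector proportional to $\mathbf{u_x}$, here is a small sketch comparing the analytic gradient with a central finite-difference estimate. The vectors `u_x` and `v_c` hold arbitrary values I made up for illustration, and the helper names are hypothetical.

```python
import math


def exp_dot(u, v):
    """Compute exp(u^T v) for two equal-length lists of floats."""
    return math.exp(sum(ui * vi for ui, vi in zip(u, v)))


def analytic_grad(u, v):
    """Gradient of exp(u^T v) with respect to v: exp(u^T v) * u."""
    scale = exp_dot(u, v)
    return [scale * ui for ui in u]


def numeric_grad(u, v, eps=1e-6):
    """Central finite-difference estimate of the same gradient."""
    grad = []
    for i in range(len(v)):
        v_plus = v[:i] + [v[i] + eps] + v[i + 1:]
        v_minus = v[:i] + [v[i] - eps] + v[i + 1:]
        grad.append((exp_dot(u, v_plus) - exp_dot(u, v_minus)) / (2 * eps))
    return grad


# Arbitrary example vectors (illustrative values only).
u_x = [0.5, -1.0, 0.25]
v_c = [0.1, 0.2, -0.3]

grad_analytic = analytic_grad(u_x, v_c)
grad_numeric = numeric_grad(u_x, v_c)
max_diff = max(abs(a - b) for a, b in zip(grad_analytic, grad_numeric))
print("max abs difference:", max_diff)
```

The gradient has one component per entry of $\mathbf{v_c}$, each scaled by the matching entry of $\mathbf{u_x}$, which is why I read the final factor as the vector $\mathbf{u_x}$ rather than a scalar.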
Please let me know whether my understanding is correct. Again, thank you for generously sharing your time with the community!