cgpotts / cs224u Goto Github PK

Code for Stanford CS224u

License: Apache License 2.0

Python 4.02% Jupyter Notebook 95.98%

cs224u's Issues

hw_wordentail -define_graph() error

While working on the hw_wordentail notebook, I found an error in the test_TorchDeepNueralClassifier function.

The test is meant to check that the nn.Module returned by the TorchDeepNeuralClassifier class has been implemented correctly. However, the test function extracts the graph by calling define_graph(), which is not implemented.

This should be changed to build_graph(). I can make the change and create a pull request if needed.

Location of Yelp and Gigaword data files

Where are the Gigaword and Yelp files located for vsm_01_distributional.ipynb notebook?

hw_colors - Decoder forward method requiring lengths

Hi,

I was working on the hw_colors notebook to create the updated EncoderDecoder model following the instructions. I thought I got all my updates correct until I received an error in the last test.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-217-f204130de1b0> in <module>
----> 1 test_full_system(ColorizedInputDescriber)

<ipython-input-216-df38441535ee> in test_full_system(describer_class)
      8     toy_mod = describer_class(toy_vocab)
      9 
---> 10     _ = toy_mod.fit(toy_color_seqs_train, toy_word_seqs_train)
     11 
     12     acc = toy_mod.listener_accuracy(toy_color_seqs_test, toy_word_seqs_test)

~/Downloads/Stanford-CS224U/codebase/torch_model_base.py in fit(self, *args)
    359                 y_batch = batch[-1]
    360 
--> 361                 batch_preds = self.model(*X_batch)
    362 
    363                 err = self.loss(batch_preds, y_batch)

/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

<ipython-input-214-6cb1cf97c969> in forward(self, color_seqs, word_seqs, seq_lengths, hidden, targets)
     18         output, hidden = self.decoder(
     19             word_seqs,
---> 20             target_colors=color_seqs[:,-1,:])
     21 
     22         # Your decoder will return `output, hidden` pairs; the

/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/Downloads/Stanford-CS224U/codebase/torch_color_describer.py in forward(self, word_seqs, seq_lengths, hidden, target_colors)
    232                 batch_first=True,
    233                 lengths=seq_lengths,
--> 234                 enforce_sorted=False)
    235             # RNN forward:
    236             output, hidden = self.rnn(embs, hidden)

/usr/local/lib/python3.7/site-packages/torch/nn/utils/rnn.py in pack_padded_sequence(input, lengths, batch_first, enforce_sorted)
    232                       'the trace incorrect for any other combination of lengths.',
    233                       stacklevel=2)
--> 234     lengths = torch.as_tensor(lengths, dtype=torch.int64)
    235     if enforce_sorted:
    236         sorted_indices = None

TypeError: an integer is required (got type NoneType)

I was having a hard time understanding this error because I thought we didn't need to input hidden and seq_lengths in the forward step.

I am defining my decoder call as:

output, hidden = self.decoder(
            word_seqs,
            target_colors=color_seqs[:,-1,:])

Is it a mistake in my code or do I have to include an additional length variable?
I know this is part of the homework and not necessarily related to the code base, but any help would be appreciated.

Thank you

Typo in hw_wordentail.ipynb

I wanted to create a PR, but I do not have permission to create a branch.

Minor thing, but could improve understanding. In hw_wordentail.ipynb:

"That is, if a word w appears in a training pair, it does not occur in any text pair. "

should read as:

"That is, if a word w appears in a training pair, it does not occur in any test pair. "

Should set return_dict=True for bert_model in finetuning.ipynb

Hi there,

I noticed that this line should be changed to

reps = bert_model(X_example, attention_mask=X_example_mask, return_dict=True)

for the following cells to work.

Thank you,
Wen

Average of the context vector in lecture "Contextual Word Representation"

Thank you for the great course! The course lectures and the other materials are really valuable to learn more about NLU.

I am not an enrolled student, but I've decided to ask a minor question here about the first lecture on "Contextual Word Representation".

In slide 5 (https://web.stanford.edu/class/cs224u/slides/cs224u-contextualreps-part1-handout.pdf), the "context vector" is evaluated as $κ = mean([α_1.h_1, α_2.h_2, α_3.h_3])$.

My question: Is it really necessary to do the "mean" operation instead of a "sum" ?

The attention weights $α_n$ already come from a softmax, so the term $sum([α_1.h_1, α_2.h_2, α_3.h_3])$ would already be a "weighted average" of the hidden states.

What I see often is to scale the dot products (before the softmax) $h^T_C.h_n$ by $1/\sqrt{d_k}$, where $d_k$ is the vector dimension, to normalize the variance (and get better results) as presented in the paper "Attention Is All You Need".

Thanks again!
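To make the question concrete, here is a small numeric sketch in plain Python. The hidden states and scores are made-up values, and `softmax`, `weighted_sum`, and `weighted_mean` are hypothetical helper names, not code from the course. Under those assumptions, it shows that the sum of softmax-weighted states is already a convex combination of the states, and that taking the mean only divides that vector by the number of states.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, states):
    """Return sum_n weights[n] * states[n], computed per dimension."""
    dim = len(states[0])
    return [sum(w * h[d] for w, h in zip(weights, states)) for d in range(dim)]


def weighted_mean(weights, states):
    """Return the mean of the weighted states, i.e. the weighted sum / N."""
    n = len(states)
    return [x / n for x in weighted_sum(weights, states)]


# Toy values (assumptions for illustration only).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 1.0, 0.5]

alphas = softmax(scores)
kappa_sum = weighted_sum(alphas, states)
kappa_mean = weighted_mean(alphas, states)

print("weights sum to:", sum(alphas))
print("sum version:   ", kappa_sum)
print("mean version:  ", kappa_mean)
```

The two versions differ only by the constant factor 1/N, so they point in the same direction; whether that rescaling matters depends on what the downstream layers do with the vector.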

tf_model_base.py: hidden_activation hardcoded

In line 35 of tf_model_base.py, self.hidden_activation is hardcoded to tf.nn.tanh instead of being set from the parameter value.

Trouble with hw_wordsim: Combining PPMI and LSA

In the second homework question for hw_wordsim, we are provided with the following instructions:

Gigaword with LSA at different dimensions [0.5 points]
We might expect PPMI and LSA to form a solid pipeline that combines the strengths of PPMI with those of dimensionality reduction. However, LSA has a hyper-parameter 𝑘 – the dimensionality of the final representations – that will impact performance. For this problem, write a wrapper function run_ppmi_lsa_pipeline that does the following:

1. Takes as input a count pd.DataFrame and an LSA parameter k.
2. Reweights the count matrix with PPMI.
3. Applies LSA with dimensionality k.
4. Evaluates this reweighted matrix using full_word_similarity_evaluation. The return value of run_ppmi_lsa_pipeline should be the return value of this call to full_word_similarity_evaluation.
The goal of this question is to help you get a feel for how much LSA alone can contribute to this problem.

The function test_run_ppmi_lsa_pipeline will test your function on the count matrix in data/vsmdata/giga_window20-flat.csv.gz.

When I construct run_ppmi_lsa_pipeline with the following steps, I get the wrong output:

Reweight input count_df using: ppmi_df = vsm.pmi(count_df, positive=True)
Perform LSA using: vsm.lsa(ppmi_df, k)
Calculate similarities

This leads to a similarity evaluation for men of 0.57, not the expected 0.16.

However, when I construct run_ppmi_lsa_pipeline using the following steps (without applying PPMI), I get the correct output:

Perform LSA using vsm.lsa(count_df, k)
Calculate similarities

This leads to the expected similarity for men of 0.16.

Is there a mistake in the instructions/expected results? Perhaps I did something wrong? Any and all help would be appreciated.
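For reference, the PPMI reweighting described in step 2 of the instructions can be sketched in plain Python as follows. This is an illustrative stand-in and not the course's `vsm.pmi`; the function name `ppmi` and the toy counts are assumptions made for this example. Zero counts are mapped to 0 here, which is one common convention.

```python
import math


def ppmi(counts):
    """Positive PMI for a dense count matrix given as a list of rows.

    pmi(i, j) = log( count_ij * total / (row_sum_i * col_sum_j) ),
    with negative values (and zero counts) clipped to 0.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    result = []
    for i, row in enumerate(counts):
        new_row = []
        for j, c in enumerate(row):
            if c == 0:
                new_row.append(0.0)
            else:
                pmi = math.log(c * total / (row_sums[i] * col_sums[j]))
                new_row.append(max(0.0, pmi))
        result.append(new_row)
    return result


# Toy count matrix (made up): two words that never co-occur with each other.
toy_counts = [[10, 0], [0, 10]]
toy_ppmi = ppmi(toy_counts)
print(toy_ppmi)
```

In a pipeline like the one in the question, a matrix reweighted this way would then be passed to the dimensionality reduction step before evaluation.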

First cell of hw_sentiment.ipynb fails

Executing the first cell of the hw_sentiment.ipynb notebook returns the error partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' on a newly installed nlu environment using miniconda.

This was resolved by running this command in the conda environment:
pip install -U --force-reinstall charset-normalizer

(Based on a post about pip install -U --force-reinstall charset-normalizer.)

Course setup, Pytorch CPU

Hi - I've followed the instructions to set up an environment for the course on my machine. The only difference I am aware of is that miniconda was pre-installed.
When following the instructions, the version of pytorch installed was cpu only. To fix that, I ran the following command:

conda install pytorch=1.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
This seems to have resolved it for me.
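A quick way to check which build ended up installed is sketched below. The helper name `describe_torch_device` is hypothetical; the only PyTorch call used is `torch.cuda.is_available()`, and the import is guarded so the snippet also runs where PyTorch is missing.

```python
import importlib.util


def describe_torch_device():
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    # A CPU-only build (or a machine without a usable GPU/driver) reports False.
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(describe_torch_device())
```

If this prints "cpu" on a machine with an NVIDIA GPU, the installed package is probably a CPU-only build or the driver is not visible to it.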

add expected score for `run_knn_score_model`?

Hi @cgpotts,

I've just done all of the assignments (except the original system) in hw_wordrelatedness.ipynb and related notebooks and have a couple of thoughts that may be useful to share.

First, it is great that you have open-sourced the class material. Thank you!
I noticed that in the Learned distance functions section, the function run_knn_score_model that students are asked to write is not tested at all. It could be helpful to check the output score so that students know they have written it correctly. You could make train_test_split deterministic by setting shuffle=False. Another option would be to add a note about what approximate score one should expect.
In vsm_01_distributional.ipynb your proper_cosine function looks to be returning the angular distance. Calling it proper_cosine may be confusing to some unless that's a standard name for it.
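To illustrate the naming point, here is a plain-Python sketch contrasting cosine distance with angular distance. These definitions are assumptions for illustration and are not copied from the course's `vsm` module; `cosine_distance` and `angular_distance` are names chosen for this example.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their L2 norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """1 - cosine similarity; not a proper metric (triangle inequality can fail)."""
    return 1.0 - cosine_similarity(u, v)


def angular_distance(u, v):
    """Angle between u and v, normalized to [0, 1] by dividing by pi."""
    # Clamp to guard against tiny floating-point overshoots outside [-1, 1].
    sim = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.acos(sim) / math.pi


u, v = [1.0, 0.0], [0.0, 1.0]
print(cosine_distance(u, v), angular_distance(u, v))
```

For orthogonal vectors the two measures already disagree, which is why a name that signals "angular" may be clearer to readers.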

Looking forward to going through the next notebooks!

data.zip, the data used in this course cannot be unpacked

I downloaded data.zip from "the course data" link on this notebook https://github.com/cgpotts/cs224u/blob/spring-2019/vsm_01_distributional.ipynb

System:
macOS High Sierra 10.13.6
java 12.0.2 2019-07-16
Java(TM) SE Runtime Environment (build 12.0.2+10)
Java HotSpot(TM) 64-Bit Server VM (build 12.0.2+10, mixed mode, sharing)

I run jar xvf data.zip and get:

created: data/
inflated: data/.DS_Store
created: __MACOSX/
created: __MACOSX/data/
inflated: __MACOSX/data/._.DS_Store
created: data/glove.6B/
inflated: data/glove.6B/glove.6B.100d.txt
created: __MACOSX/data/glove.6B/
inflated: __MACOSX/data/glove.6B/._glove.6B.100d.txt
inflated: data/glove.6B/glove.6B.200d.txt
inflated: __MACOSX/data/glove.6B/._glove.6B.200d.txt
inflated: data/glove.6B/glove.6B.300d.txt
inflated: __MACOSX/data/glove.6B/._glove.6B.300d.txt
inflated: data/glove.6B/glove.6B.50d.txt
inflated: __MACOSX/data/glove.6B/._glove.6B.50d.txt
created: data/negotiate/
inflated: data/negotiate/data.txt
created: __MACOSX/data/negotiate/
inflated: __MACOSX/data/negotiate/._data.txt
inflated: data/negotiate/selfplay.txt
inflated: __MACOSX/data/negotiate/._selfplay.txt
inflated: data/negotiate/test.txt
inflated: __MACOSX/data/negotiate/._test.txt
inflated: data/negotiate/train.txt
inflated: __MACOSX/data/negotiate/._train.txt
inflated: data/negotiate/val.txt
inflated: __MACOSX/data/negotiate/._val.txt
inflated: __MACOSX/data/._negotiate
created: data/nlidata/
created: data/nlidata/.ipynb_checkpoints/
inflated: data/nlidata/.ipynb_checkpoints/prep_wordentail_data-checkpoint.ipynb
inflated: data/nlidata/.ipynb_checkpoints/prep_wordentail_data-Copy1-checkpoint.ipynb
created: data/nlidata/multinli_1.0/
extracted: data/nlidata/multinli_1.0/Icon
created: __MACOSX/data/nlidata/
created: __MACOSX/data/nlidata/multinli_1.0/
inflated: __MACOSX/data/nlidata/multinli_1.0/._Icon
inflated: data/nlidata/multinli_1.0/manuscript.pdf
inflated: __MACOSX/data/nlidata/multinli_1.0/._manuscript.pdf
inflated: data/nlidata/multinli_1.0/multinli_1.0_dev_matched.jsonl
inflated: __MACOSX/data/nlidata/multinli_1.0/._multinli_1.0_dev_matched.jsonl
inflated: data/nlidata/multinli_1.0/multinli_1.0_dev_matched.txt
inflated: __MACOSX/data/nlidata/multinli_1.0/._multinli_1.0_dev_matched.txt
inflated: data/nlidata/multinli_1.0/multinli_1.0_dev_mismatched.jsonl
inflated: __MACOSX/data/nlidata/multinli_1.0/._multinli_1.0_dev_mismatched.jsonl
inflated: data/nlidata/multinli_1.0/multinli_1.0_dev_mismatched.txt
inflated: __MACOSX/data/nlidata/multinli_1.0/._multinli_1.0_dev_mismatched.txt
inflated: data/nlidata/multinli_1.0/multinli_1.0_train.jsonl
inflated: __MACOSX/data/nlidata/multinli_1.0/._multinli_1.0_train.jsonl
java.io.EOFException: Unexpected end of ZLIB input stream
at java.base/java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:245)
at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:159)
at java.base/java.util.zip.ZipInputStream.read(ZipInputStream.java:195)
at java.base/java.util.zip.ZipInputStream.closeEntry(ZipInputStream.java:141)
at jdk.jartool/sun.tools.jar.Main.extractFile(Main.java:1457)
at jdk.jartool/sun.tools.jar.Main.extract(Main.java:1364)
at jdk.jartool/sun.tools.jar.Main.run(Main.java:409)
at jdk.jartool/sun.tools.jar.Main.main(Main.java:1681)

I run unzip data.zip and get:

Archive: data.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of data.zip or
data.zip.zip, and cannot find data.zip.ZIP, period.

I think the class data source has broken.

The data.zip file might be broken: when I try to open it, my machine reports an error, and as a result I could not find the csv.gz file.
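A truncated download is one plausible cause of the "End-of-central-directory signature not found" message above, since that record sits at the end of a zip file. The sketch below uses Python's standard `zipfile` module to show how such a file can be detected; the in-memory archive and the helper name `looks_like_complete_zip` are made up for illustration and say nothing about the actual course file.

```python
import io
import zipfile


def looks_like_complete_zip(file_or_path):
    """Return True if the end-of-central-directory record can be found."""
    return zipfile.is_zipfile(file_or_path)


# Build a small archive in memory, then simulate an interrupted download
# by keeping only the first half of its bytes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data/example.txt", "a" * 1000)

complete = buffer.getvalue()
truncated = complete[: len(complete) // 2]

complete_ok = looks_like_complete_zip(io.BytesIO(complete))
truncated_ok = looks_like_complete_zip(io.BytesIO(truncated))
print(complete_ok, truncated_ok)
```

Passing this check does not prove every member is intact; `ZipFile.testzip()` can be used for a fuller check. Comparing the downloaded file size with the size reported by the server is another simple test.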

Torch_model_base.py error in collate_fn

Hi,

I have been trying to run the code in the notebook. However, when I run code that uses torch_model_base, I keep getting an error when I fit the model. How can I resolve this issue?

Thank you,

Joey

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-60-77ee602003f0> in <module>
      1 giga_ae = TorchAutoencoder(max_iter=1000,
      2                           hidden_dim=100,
----> 3                           eta=0.03).fit(giga5_svd500)

~/Downloads/Stanford-CS224U/codebase/torch_autoencoder.py in fit(self, X)
    124 
    125         """
--> 126         super().fit(X, X)
    127         # Hidden representations:
    128         with torch.no_grad():

~/Downloads/Stanford-CS224U/codebase/torch_model_base.py in fit(self, *args)
    351             epoch_error = 0.0
    352 
--> 353             for batch_num, batch in enumerate(dataloader, start=1):
    354 
    355                 batch = [x.to(self.device, non_blocking=True) for x in batch]

/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    613         if self.num_workers == 0:  # same-process loading
    614             indices = next(self.sample_iter)  # may raise StopIteration
--> 615             batch = self.collate_fn([self.dataset[i] for i in indices])
    616             if self.pin_memory:
    617                 batch = pin_memory_batch(batch)

TypeError: 'NoneType' object is not callable

super minor : link correction

For the main interface, we can just subclass TorchRNNClassifier and change the build_graph method to use TorchVecAvgModel. (For more details on the code and logic here, see the notebook: tutorial_torch_models.ipynb)

This should be:

tutorial_pytorch_models.ipynb

Consider using os.environ.get() instead of test ENV presence

The code base consistently uses the in idiom to test for the presence of an environment variable as a flag, like

if 'IS_GRADESCOPE_ENV' not in os.environ:
    pass

To make the above condition true, one has to unset IS_GRADESCOPE_ENV. Changing the value has no effect: IS_GRADESCOPE_ENV=0 and IS_GRADESCOPE_ENV=1 behave the same.

However, os.environ.get() is usually a better alternative in production systems. For example,

if not os.environ.get('IS_GRADESCOPE_ENV', False):
    pass

# or
if not os.environ.get('IS_GRADESCOPE_ENV', None):
    pass

# or strictly checking against pre-defined value
if not os.environ.get('IS_GRADESCOPE_ENV', '0') == '1':
    pass

This matters because in shell scripts, we usually test whether an environment variable flag is on by checking that it is present or non-empty (i.e., [[ ! -z "$VAR" ]]). With os.environ.get(), each of the following is evaluated as false, as expected under that convention, in case users forgot to unset the variable in the environment.

IS_GRADESCOPE_ENV=
unset IS_GRADESCOPE_ENV
export IS_GRADESCOPE_ENV=''
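The difference between the checks can be seen in a small sketch. Plain dictionaries stand in for os.environ here so that the example does not modify the real environment, and the helper names are hypothetical. Note that a truthiness check still treats the string "0" as on, because any non-empty string is truthy in Python; only the strict comparison distinguishes "0" from "1".

```python
import os

FLAG = "IS_GRADESCOPE_ENV"


def flag_by_presence(env):
    """The existing idiom: on whenever the variable exists, whatever its value."""
    return FLAG in env


def flag_by_truthiness(env):
    """On when the variable is set to any non-empty string (including '0')."""
    return bool(env.get(FLAG, ""))


def flag_by_value(env):
    """On only when the variable is exactly '1'."""
    return env.get(FLAG, "0") == "1"


# Hypothetical environments; os.environ supports the same `in` and .get().
cases = {
    "unset": {},
    "empty": {FLAG: ""},
    "zero": {FLAG: "0"},
    "one": {FLAG: "1"},
}

for label, env in cases.items():
    print(label, flag_by_presence(env), flag_by_truthiness(env), flag_by_value(env))

# The same helpers accept the real environment:
print("current process:", flag_by_value(os.environ))
```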

Just a minor issue, feel free to ignore. 😄

YouTube videos for later lectures?

Thank you for the wonderful course resources you've made available! Will the videos for the later lectures such as grounded language understanding, semantic parsing, evaluation metrics, and contextual word embeddings ever make their way to YouTube?

Add environment variable check

The autograder used in XCS224U adds a get_ipython() call to the script when it converts the notebook hw_colors.ipynb to a Python script. This won't work unless get_ipython is imported with from IPython import get_ipython, something like this:

So it would be good to change these lines of code from:

to this:

I already have the permission to make changes to the repo. Just wanted to run by you once before making changes and committing it. Thanks @cgpotts

def test_op_unigrams_phi(func):
    tree = Tree.fromstring("""(4 (2 NLU) (4 (2 is) (4 amazing)))""")
    expected = {"enlightening": 1}
    result = func(tree)
    assert result == expected, \
        ("Error for `op_unigrams_phi`: "
         "Got `{}` which differs from `expected` "
         "in `test_op_unigrams_phi`".format(result))

DictVectorizer.get_feature_names( ... ) should be DictVectorizer.get_feature_names_out( ...)

I believe get_feature_names() is deprecated (and now removed). get_feature_names_out() is the replacement.
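In scikit-learn, get_feature_names() was deprecated in version 1.0 and removed in 1.2. For code that has to run on both older and newer installations, a small compatibility helper is one option. The sketch below is an assumption about how that could look, not a change taken from the repository; `get_feature_names_compat` is a hypothetical name, and the two stand-in classes only mimic the relevant methods so that the example runs without scikit-learn installed.

```python
def get_feature_names_compat(vectorizer):
    """Return feature names from either the new or the old vectorizer API."""
    if hasattr(vectorizer, "get_feature_names_out"):
        # Newer API; returns an array-like, so convert to a plain list.
        return list(vectorizer.get_feature_names_out())
    # Fallback for older versions that only have the old method.
    return list(vectorizer.get_feature_names())


class NewStyleVectorizer:
    """Stand-in exposing only the newer method (not a scikit-learn class)."""

    def get_feature_names_out(self):
        return ["amazing", "is"]


class OldStyleVectorizer:
    """Stand-in exposing only the older method (not a scikit-learn class)."""

    def get_feature_names(self):
        return ["amazing", "is"]


print(get_feature_names_compat(NewStyleVectorizer()))
print(get_feature_names_compat(OldStyleVectorizer()))
```

If only recent scikit-learn versions need to be supported, calling get_feature_names_out() directly, as the issue suggests, is simpler.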
