
cs230-code-examples's People

Contributors

dependabot[bot], guillaumegenthial, josephch405, kiank, omoindrot, suragnair


cs230-code-examples's Issues

The NLP NER example only predicts correct Os

Hi, the NER example has an accuracy of 0.82 or 0.78 after running for 10 epochs on the toy dataset of 10 examples. I was using this code to start up an NER task I have to do, and it is just predicting 'O's. I checked one set of predictions at the end of the epochs, and the I- and B- tags it predicted were all wrong.

Use a score like F1 on the B/I tags to get a better idea.

Note -
I am not a student in the course, and I know it is supposed to be starter code, but the way it is presented, one expects that it'll work for the toy example.
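For reference, here is a minimal sketch of a token-level F1 restricted to the non-'O' tags (the tag lists below are hypothetical; an entity-level scorer such as seqeval would be stricter):

def f1_non_o(gold_tags, pred_tags):
    # token-level F1 computed only over non-'O' positions
    tp = sum(1 for g, p in zip(gold_tags, pred_tags) if g != 'O' and g == p)
    fp = sum(1 for g, p in zip(gold_tags, pred_tags) if p != 'O' and g != p)
    fn = sum(1 for g, p in zip(gold_tags, pred_tags) if g != 'O' and g != p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

gold = ['O', 'B-PER', 'I-PER', 'O']
pred = ['O', 'B-PER', 'O', 'O']
print(f1_non_o(gold, pred))  # 0.67: high token accuracy from 'O's alone won't inflate this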

Using saved models for prediction of a single image

Hi,

I am looking for an example of how to use a saved checkpoint of your example for simple inference on a given image, i.e. to perform classification on a single image. Unfortunately, searching the internet has not been successful so far.

What I have been doing so far is the following:

with tf.Session() as sess:
    # restore the graph structure and the trained weights
    saver = tf.train.import_meta_graph(metafile)
    saver.restore(sess, path_to_ckpt)
    graph = tf.get_default_graph()

    # fetch the prediction op by name and run it on a single image
    output = graph.get_tensor_by_name('model/pred:0')
    pred = sess.run([output], feed_dict={x: image})

Unfortunately, I am not sure what x is supposed to be. Could you please provide an example of a simple prediction on a single image? In particular, I would need to know which layers to give which names in the model_fn so that I can reference them in the code above.

Best regards.
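For context, one common TF1 pattern (a sketch with illustrative names, not necessarily how this repo's model_fn is structured) is to give the input and prediction tensors explicit names at graph-construction time, so they can be fetched by name after restoring:

import tensorflow as tf

# hypothetical model_fn excerpt: name the tensors needed at inference time
x = tf.placeholder(tf.float32, [None, 64, 64, 3], name='input_image')
logits = tf.layers.dense(tf.layers.flatten(x), 6)
pred = tf.argmax(logits, axis=1, name='pred')

# later, after restoring the checkpoint as in the snippet above:
# x = graph.get_tensor_by_name('input_image:0')
# output = graph.get_tensor_by_name('pred:0')
# prediction = sess.run(output, feed_dict={x: image[None]})  # add a batch dimension

If the graph is built inside a variable_scope('model'), the tensor names pick up a 'model/' prefix, which would match the 'model/pred:0' in the snippet above.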

Problem in Build the dataset of size 64x64

Hello,

After I do the first step (Build the dataset of size 64x64), a 64x64_SIGNS folder is created, but there are no images in train_signs. How can I solve this?

Tutorials

Content ideas for the tutorials explaining the posts


Code in https://github.com/cs230-stanford/cs230-stanford.github.io

  • structure of the project (files' roles, experiment pipeline)

  • how to run the toy examples

  • explain how to use the logger

  • explain where to define the model or change it

  • explain how to change hyperparameters

  • how to feed data...

  • use GitHub releases to have multiple versions of the code?

  • Explain the general idea of training multiple models, trying different structures...

    • make sure that experiments are reproducible
      • for instance, if model.py has incompatible changes (e.g. adds batch norm), previous params.json files cannot be run again
      • have to update old params.json files to match the new change (e.g. add a params.use_bn argument to all old params.json files); see the sketch after this list
    • give good names to the dirs in experiments
    • visualize on tensorboard
    • don't spend too much time watching training progress: launch the hyperparameter search, let it run, and come back later (make sure there is no bug first)
  • explain how to properly split train / dev / test

    • hardcode the split in three folders
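A minimal sketch of the backward-compatibility point above (the Params helper mirrors the one in this repo's utils.py; the use_bn default is illustrative):

import json

class Params:
    # load hyperparameters from a params.json file as attributes
    def __init__(self, json_path):
        with open(json_path) as f:
            self.__dict__.update(json.load(f))

params = Params('experiments/base_model/params.json')
# experiments created before batch norm was added get a safe default
use_bn = getattr(params, 'use_bn', False)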

Organization ideas

  • add a number to each post? ex: "3. Creating input pipelines..."
    • would be easier to understand the structure
    • at the beginning of each post, put the full list

Performance issue in tensorflow/vision/model/input_fn.py (by P3)

Hello! I've found a performance issue in input_fn.py: batch() should be called before map(), which could make your program more efficient. Here is the TensorFlow documentation that supports this.

Detailed description is listed below:

  • tensorflow/vision/model/input_fn.py: .batch(params.batch_size)(here) should be called before .map(parse_fn, num_parallel_calls=params.num_parallel_calls)(here).
  • tensorflow/vision/model/input_fn.py: .batch(params.batch_size)(here) should be called before .map(parse_fn)(here).

Besides, you need to check whether the function called in map() (e.g., parse_fn in .map(parse_fn)) is affected by the change, to make the modified code work properly. For example, if parse_fn expects data of shape (x, y, z) before the fix, it will receive data of shape (batch_size, x, y, z) after it.
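A minimal sketch of the suggested reordering (the toy parse_fn here is already vectorized, which is exactly the property the changed code relies on):

import tensorflow as tf

def parse_fn(x):
    # toy element-wise transform; vectorized ops work on whole batches too
    return tf.cast(x, tf.float32) / 255.0

ds = tf.data.Dataset.range(1000)

# before: one parse_fn call per element, then batch
slow = ds.map(parse_fn, num_parallel_calls=4).batch(32)

# suggested: batch first, then one parse_fn call per batch
fast = ds.batch(32).map(parse_fn, num_parallel_calls=4)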

Looking forward to your reply. By the way, I would be very glad to create a PR to fix it if you are too busy.

TODOs

TODO

  • change the name eval_metrics to metrics

    • the current name introduces confusion: there should only be one set of metrics, understood as an average over the dataset
    • what we log with tqdm is not metrics but "training monitoring"
  • enable hyperparameter search (see the launcher sketch after this list)

    • hyperparam_search.py calling train.py over multiple params.json files in experiments
    • run("train.py --model_dir experiments/exp024")
  • check that there is no problem with the virtual env and that we don't need to add a preamble

  • saving

    • if we train the model again, training should start where it stopped
    • add an optional argument for this?
  • split the graph into train and eval

    • clean-up
    • reuse the weights with 2 graphs and 2 inputs
    • (or use a placeholder??)
  • split hyperparams_search.py into two files?

    • one for hyperparam search
    • one for "syntethizing results"
  • for synthesize results, get all possible subdirectories in it

  • explore file explore.py ?

    • would run the model on some examples (dataset)
    • give access to some errors
    • interaction? (IPython-like testing?)
  • explicitly have a train / dev / test split

    • for the SIGNS dataset, there is only train / test --> do the split in train.py
    • some images are duplicated??? Ignore or clean up the dataset?
  • add tf.summary.image for training images?

  • make sure there are only images in SIGNS (no .DS_Store)

  • add script download_data.sh

  • add descriptions to parser.add_argument(help="...")

  • perform metrics synthesis after the hyperparameter search?

  • Use mode (mode={'train', 'eval', 'predict'}) instead of is_training

    • allow use of predict mode for functions
    • pros: allows prediction (without labels)
    • cons: adds more code
  • use restore_from (file or dir)

    • when restoring weights, keep track of the number of epochs already run?
  • when restoring, the numbering becomes incorrect as it starts from 0 again

  • make the first argument of input_fn be 'train' or 'eval' instead of True/False

  • check the date for each blog post (make everything dated as 02/01?)
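A minimal sketch of the hyperparameter-search launcher mentioned in the list above (the paths and the learning-rate grid are illustrative):

import os
import sys
from subprocess import check_call

PYTHON = sys.executable

for lr in [1e-4, 1e-3, 1e-2]:
    # one experiment directory per hyperparameter setting
    model_dir = os.path.join('experiments', 'learning_rate', 'lr_{}'.format(lr))
    os.makedirs(model_dir, exist_ok=True)
    # a params.json with this learning rate would be written into model_dir here
    check_call('{} train.py --model_dir {}'.format(PYTHON, model_dir), shell=True)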

Error when run build_dataset.py on windows

In Windows, folder names in a path are joined with a backslash [ \ ] instead of a forward slash [ / ], like this:

C:\Program Files\NVIDIA GPU Computing Toolkit

so build_dataset.py throws an error, because it can't split the filename from the directory.

I solved it by splitting the filename on the backslash instead:

image.save(os.path.join(output_dir, filename.split('\\')[-1]))
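A more portable alternative (an assumption on my part, not the repo's current code) is os.path.basename, which uses the platform's own separator, so the same line works on both Windows and Unix:

import os

# image, filename and output_dir as in build_dataset.py's save loop
image.save(os.path.join(output_dir, os.path.basename(filename)))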

Thanks.

Performance issues in the program

Hello, I found a performance issue in the definition of input_fn in
cs230-stanford/cs230-code-examples/blob/master/tensorflow/vision/model/input_fn.py:
tf.data.Dataset.map is called without num_parallel_calls.
I think it would increase the efficiency of your program if you added this.

Here is the TensorFlow documentation that supports this.
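A minimal sketch of the suggested change (the value 4 is illustrative; newer TF versions also accept tf.data.experimental.AUTOTUNE):

import tensorflow as tf

def parse_fn(x):
    return tf.cast(x, tf.float32)

dataset = tf.data.Dataset.range(1000)

# before: elements are parsed sequentially
# dataset = dataset.map(parse_fn)

# suggested: parse several elements in parallel
dataset = dataset.map(parse_fn, num_parallel_calls=4)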

Looking forward to your reply. By the way, I would be very glad to create a PR to fix it if you are too busy.

Error running data_loader.py

Since utils.py is not present in the same directory as data_loader.py, this will throw an error.

I did a workaround like this:

import sys
sys.path.append('.')  # if running from the root folder, else append '..'
import utils

Feedback from TAs

  • It's not good to call it "starter code", since it makes it seem like we are hand-holding the students too much

  • we should rename the whole project to something like "cs230-code-examples"

    • they can use it as examples, and copy some of the code
  • put a license on the code

  • when we refer to certain files in the code, should we put a link to them?

    • ex: train.py
    • choose it from tensorflow / pytorch / vision / nlp?

hyperparameter search issue

When I run this script using the provided sample dataset:
python search_hyperparams.py --data_dir data/small --parent_dir experiments/learning_rate

It threw me an error:
[error screenshot attached in the original issue]

The Variable API has been deprecated in Pytorch 1.10

The Variable API has been deprecated in Pytorch 1.10.

From Pytorch 1.10 documentation:
"Autograd automatically supports Tensors with requires_grad set to True
Variables are no longer necessary to use autograd with tensors."

The code can be further simplified if the lines with Variable calls are removed. For example, lines 56-57 of the train.py file:

# convert to torch Variables
train_batch, labels_batch = Variable(train_batch), Variable(labels_batch)
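With the wrappers removed, the batch tensors can be used directly (a sketch; model and loss_fn stand in for the objects in the repo's training loop):

# since PyTorch 0.4, tensors support autograd directly, so the
# Variable(...) wrapping above can simply be deleted
output_batch = model(train_batch)
loss = loss_fn(output_batch, labels_batch)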

The calculated metrics are not precise.

The code in pytorch/vision/train.py and pytorch/vision/evaluate.py shows how to calculate metrics on batches of data.

In train.py, since the metrics are only calculated once in a while, they do not represent the metrics of the whole dataset.
In evaluate.py, since the size of the dataset may not be divisible by the batch size, the calculated metrics are not precise either. A better way is to compute a weighted sum of the per-batch means, weighted by the number of examples in each batch, and then divide by the size of the whole dataset.
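A minimal sketch of the proposed weighting (dataloader, model and metric_fn are stand-ins for the repo's evaluation loop):

# exact dataset-level mean from per-batch means, correct even when the
# last batch is smaller than batch_size
total, count = 0.0, 0
for inputs, labels in dataloader:
    batch_mean = metric_fn(model(inputs), labels)  # mean over this batch
    total += batch_mean * inputs.size(0)           # weight by batch size
    count += inputs.size(0)
dataset_metric = total / count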

Organization of the blog posts

General (common between TensorFlow and PyTorch)

  1. Introduction to project starter code
  2. Logging + hyperparams
  3. AWS setup
  4. Train/Dev/Test set

TensorFlow

  1. Getting started
  2. Dataset pipeline: tf.data
  3. Creating the model (tf.layers) + training + evaluation
     • model
     • training ops
     • input_fn and model_fn
     • evaluation and tf.metrics
     • initialization
     • saving
     • tensorboard
     • global_step

OutOfRangeError if test set larger than dev set

There is a small issue in cs230-code-examples/tensorflow/nlp/evaluate.py: you load the dev set for evaluation:

path_eval_sentences = os.path.join(args.data_dir, 'dev/sentences.txt')
path_eval_labels = os.path.join(args.data_dir, 'dev/labels.txt')

But later you iterate over the size of the test set:

params.eval_size = params.test_size

If the test set is larger than the dev set, this leads to an OutOfRangeError. If the test set is smaller than the dev set, the iteration stops too early.
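A one-line fix, assuming the params object carries a dev_size field alongside test_size (as the repo's dataset params suggest):

# match the eval size to the dev files loaded above
params.eval_size = params.dev_size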

Thanks for sharing the code!
