cs230-stanford / cs230-code-examples
Code examples in PyTorch and TensorFlow for CS230
License: Other
Hi, the NER example only reaches an accuracy of 0.82 or 0.78 after running for 10 epochs on the toy 10-value dataset. I was using this code as a starting point for an NER task of my own, and the model is just predicting 'O's. I checked a set of predictions at the end of the epochs, and the I and B tags it did predict were all wrong.
Use a score like F1 on the B/I tags to get a better idea of real performance.
Note -
I am not a student in the course, and I know it is supposed to be starter code, but the way it is presented, one expects that it'll work on the toy example.
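To make the "all 'O's" failure visible, here is a minimal sketch of a token-level F1 restricted to non-'O' tags. The BIO tag names are illustrative, not the repo's label set:

```python
# Sketch: token-level F1 over non-'O' tags, to expose a model that
# only ever predicts 'O'. Tag names here are illustrative BIO labels.

def entity_f1(true_tags, pred_tags):
    """F1 computed only on tokens whose gold or predicted tag is not 'O'."""
    tp = sum(1 for t, p in zip(true_tags, pred_tags) if t == p and t != 'O')
    fp = sum(1 for t, p in zip(true_tags, pred_tags) if p != 'O' and t != p)
    fn = sum(1 for t, p in zip(true_tags, pred_tags) if t != 'O' and t != p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = ['B-PER', 'I-PER', 'O', 'O', 'B-LOC']
all_o = ['O'] * 5                      # the degenerate model
print(entity_f1(gold, all_o))          # 0.0, despite 40% token accuracy
```

A model that predicts only 'O' scores 0.0 here while still showing a deceptively high token accuracy, which is exactly the issue reported above.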
When running the code, I found that this particular layer (and only this one) fails when the batch size is set to 1.
I think we can maybe use os.path.basename(filename) instead of filename.split('/')[-1], since on Windows the paths to images use \ instead of /. With just this change we get consistent separator handling. Just a thought.
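A quick sketch of the difference, using the stdlib's ntpath/posixpath modules only to demonstrate both platforms' behavior from one machine:

```python
import ntpath      # Windows path rules, importable anywhere
import posixpath   # POSIX path rules, importable anywhere

# os.path.basename resolves to ntpath/posixpath on the matching platform,
# so it strips the correct separator without any manual splitting.
print(posixpath.basename('data/SIGNS/img_001.jpg'))   # img_001.jpg
print(ntpath.basename('data\\SIGNS\\img_001.jpg'))    # img_001.jpg

# The brittle split breaks on Windows-style paths:
print('data\\SIGNS\\img_001.jpg'.split('/')[-1])      # data\SIGNS\img_001.jpg
```

In the repo's code, `os.path.basename(filename)` would pick the right behavior on whichever OS the script runs on.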
Hi,
I am looking for an example how to use a saved checkpoint of your example for simple inference of a given image, i.e. perform classification on a single image. Unfortunately, searching the internet has not been successful so far.
What I have been doing so far is the following:
with tf.Session() as sess:
    saver = tf.train.import_meta_graph(metafile)
    saver.restore(sess, path_to_ckpt)
    graph = tf.get_default_graph()
    output = graph.get_tensor_by_name('model/pred:0')
    pred = sess.run([output], feed_dict={x: image})
Unfortunately, I am not sure what x is supposed to be. Could you please provide an example of a simple prediction on a single image? In particular, I would need to know which names to give which layers in model_fn so that I can reference them in the code above.
Best regards.
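A minimal sketch of one way this can work. The tensor names 'inputs:0' and 'model/pred:0' are assumptions: they only exist if the placeholder and prediction op were given those names in model_fn (e.g. `tf.placeholder(..., name='inputs')` and a `name='pred'` op inside the 'model' scope):

```python
# Sketch of single-image inference from a checkpoint with the TF 1.x API.
# ASSUMPTION: model_fn named the input placeholder 'inputs' and the
# prediction op 'pred' under the 'model' variable scope; adjust the
# names below to whatever your model_fn actually uses.

def predict_single_image(metafile, path_to_ckpt, image):
    import tensorflow as tf  # TF 1.x API
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph(metafile)
        saver.restore(sess, path_to_ckpt)
        graph = tf.get_default_graph()
        x = graph.get_tensor_by_name('inputs:0')          # input placeholder
        output = graph.get_tensor_by_name('model/pred:0')  # prediction op
        # feed a batch of one image: shape (1, H, W, C)
        return sess.run(output, feed_dict={x: image[None]})
```

The key point is that x must be the input placeholder tensor, retrieved from the restored graph by the name it was given at build time.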
Hello,
After I do the first step (build the dataset of size 64x64), a 64x64_SIGNS folder is created, but there are no images in train_signs. How can I solve this?
Content ideas for the tutorials explaining the posts
Code in https://github.com/cs230-stanford/cs230-stanford.github.io
structure of the project (files' roles, experiment pipeline)
how to run the toy examples
explain how to use logger
explain where to define the model or change it
explain how to change hyperparameters
how to feed data...
use github releases to have multiple versions of the code?
Explain the general idea of training multiple models, trying different structures...
add a params.use_bn argument, and add it to all old params.json in experiments
explain how to properly split train / dev / test
Hello! I've found a performance issue in input_fn.py: batch() should be called before map(), which could make your program more efficient. Here is the TensorFlow documentation to support it.
Detailed description:
.batch(params.batch_size) should be called before .map(parse_fn, num_parallel_calls=params.num_parallel_calls), and likewise before the plain .map(parse_fn).
Besides, you need to check whether the function called in map() (e.g., parse_fn in .map(parse_fn)) is affected, to make the changed code work properly. For example, if parse_fn needed data with shape (x, y, z) as its input before the fix, it will receive data with shape (batch_size, x, y, z) afterwards.
Looking forward to your reply. By the way, I would be glad to create a PR to fix it if you are too busy.
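A sketch of the reordered pipeline the issue describes, with names taken from the issue itself; whether it helps depends on parse_fn being rewritten to accept batched tensors:

```python
# Sketch: batch before map, so parse_fn runs once per batch instead of
# once per element. ASSUMPTION: parse_fn has been rewritten to accept
# tensors of shape (batch_size, ...), which is not true of the repo's
# current per-image parse_fn.

def build_dataset(filenames, labels, parse_fn, params):
    import tensorflow as tf  # tf.data API, TF 1.x or 2.x
    dataset = (tf.data.Dataset.from_tensor_slices((filenames, labels))
               .batch(params.batch_size)      # batch first...
               .map(parse_fn,                 # ...then map over whole batches
                    num_parallel_calls=params.num_parallel_calls)
               .prefetch(1))                  # overlap preprocessing and training
    return dataset
```

One caveat worth checking before merging such a change: a parse_fn that reads and decodes one image file per element (as in the vision example) does not vectorize trivially, so the reordering may require restructuring the file-reading step as well.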
change name eval_metrics to metrics
enable hyperparameter search: hyperparam_search.py calling train.py over multiple params.json in experiments
run("train.py --model_dir experiments/exp024")
check that there is no problem with the virtual env and that we don't need to add a preamble
saving
split the graph into train and eval
split hyperparam_search.py into two files?
for synthesize results, get all possible subdirectories in it
explore file explore.py?
explicitly have a train / dev / test split
add tf.summary.image for training images?
make sure there are only images in SIGNS (no .DS_Store)
add script download_data.sh
add descriptions to parser.add_argument(help="...")
perform synthesize metrics after hyperparameter search?
Use mode (mode={'train', 'eval', 'predict'}) instead of is_training
use restore_from (file or dir)
when restoring, the numbering becomes incorrect as it starts from 0 again
make the first argument of input_fn be train or eval instead of True/False
check the date for each blog post (make everything dated as 02/01?)
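The mode-based signature proposed in the TODOs above could be sketched like this; the dict return is a stand-in for the real tf.data pipeline and all names are illustrative:

```python
# Sketch of replacing the is_training boolean with a mode string, as the
# TODO proposes (mode in {'train', 'eval', 'predict'}). The returned dict
# is a placeholder for the real pipeline configuration.

def input_fn(mode, filenames, labels):
    assert mode in ('train', 'eval', 'predict'), "unknown mode: %s" % mode
    return {
        'shuffle': mode == 'train',                 # only shuffle while training
        'augment': mode == 'train',                 # augmentation likewise
        'repeat': None if mode == 'train' else 1,   # loop forever only in training
    }

print(input_fn('eval', [], []))  # {'shuffle': False, 'augment': False, 'repeat': 1}
```

Compared with a True/False flag, the string mode makes call sites self-describing and leaves room for a third 'predict' behavior without another boolean.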
In Windows, folder names in a path are joined with a backslash [ \ ] instead of a slash [ / ], like this:
C:\Program Files\NVIDIA GPU Computing Toolkit
so build_dataset.py throws an error, because it can't split the filename from the directory.
I solved it by replacing the slash with an escaped backslash '\\':
image.save(os.path.join(output_dir, filename.split('\\')[-1]))
Thanks.
Hello, I found a performance issue in the definition of input_fn in cs230-stanford/cs230-code-examples/blob/master/tensorflow/vision/model/input_fn.py: tf.data.Dataset.map is called without num_parallel_calls. I think it will increase the efficiency of your program if you add this. Here is the TensorFlow documentation to support it.
Looking forward to your reply. By the way, I would be glad to create a PR to fix it if you are too busy.
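A one-line sketch of the proposed change; AUTOTUNE (available from TF 1.13) lets tf.data pick the parallelism level, and a fixed params.num_parallel_calls works just as well:

```python
# Sketch: the same map call with parallelism enabled. AUTOTUNE is an
# alternative to a hand-tuned params.num_parallel_calls value.

def parallel_map(dataset, parse_fn):
    import tensorflow as tf  # tf.data API, TF >= 1.13
    return dataset.map(parse_fn,
                       num_parallel_calls=tf.data.experimental.AUTOTUNE)
```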
Since utils.py is not present in the same directory as data_loader.py, this will throw an error. I did a workaround like this:
import sys
sys.path.append(".")  # if running from the root folder, else append '..'
import utils
It's not good to call it "starter code", since it makes it seem like we are hand-holding the students too much
we should rename the whole project to something like "cs230-code-examples"
put a license on the code
when we refer to certain files in the code (e.g. train.py), should we put a link to them?
The Variable API has been deprecated in PyTorch 1.10.
From the PyTorch 1.10 documentation:
"Autograd automatically supports Tensors with requires_grad set to True. Variables are no longer necessary to use autograd with tensors."
The code can be further simplified if the lines with Variable calls are removed, for example lines 56-57 of the train.py file:
# convert to torch Variables
train_batch, labels_batch = Variable(train_batch), Variable(labels_batch)
The code in pytorch/vision/train.py and pytorch/vision/evaluate.py shows how to calculate metrics with batches of data.
In train.py, since the metrics are only computed once in a while, they do not represent the metrics of the whole dataset.
In evaluate.py, since the dataset size may not be divisible by the batch size, the calculated metrics are not precise either. A better way is to compute a weighted sum of the per-batch means, weighted by the number of examples in each batch, and divide it by the size of the whole dataset.
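The weighted average described above can be sketched in a few lines; the example numbers are made up to show the bias of a plain mean over batches:

```python
# Sketch: weight each batch's mean metric by its batch size, then divide
# by the total number of examples. Numbers below are illustrative.

def weighted_metric(batch_means, batch_sizes):
    total = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)

# Last batch is smaller (8 examples vs 32): a plain mean over the three
# batch means would give 0.666..., overweighting the small batch.
print(weighted_metric([0.5, 0.5, 1.0], [32, 32, 8]))  # 0.5555...
```

With batches of equal size the two averages coincide, which is why the bug only shows up when the dataset size is not divisible by the batch size.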
tf.data
tf.layers (+ training + evaluation)
tf.metrics
The computation of accuracy, and the accuracy metric below it, lack masking. As a result the accuracy can be wrong because of the predictions on padded tokens.
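A minimal sketch of accuracy with masking; the pad label value of -1 is an assumption for illustration, not necessarily what the repo's NLP code uses:

```python
# Sketch: accuracy that ignores padded positions, so padding cannot
# inflate the score. ASSUMPTION: padded labels are marked with -1.

def masked_accuracy(labels, preds, pad=-1):
    pairs = [(l, p) for l, p in zip(labels, preds) if l != pad]
    return sum(l == p for l, p in pairs) / len(pairs)

labels = [3, 1, 2, -1, -1]           # last two tokens are padding
preds  = [3, 1, 0,  0,  0]
print(masked_accuracy(labels, preds))  # 0.666..., unmasked would give 0.4
```

In the TensorFlow code the same effect is usually achieved by multiplying the per-token correctness by a 0/1 mask before reducing.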
There is a little issue in cs230-code-examples/tensorflow/nlp/evaluate.py
: You use the dev set for evaluation:
path_eval_sentences = os.path.join(args.data_dir, 'dev/sentences.txt')
path_eval_labels = os.path.join(args.data_dir, 'dev/labels.txt')
But later you iterate over the size of the test set:
params.eval_size = params.test_size
If the test set is larger than the dev set, this leads to an OutOfRangeError; if it is smaller, the iteration stops too early. Presumably the line should use the dev set size instead.
Thanks for sharing the code!