josephrocca / openai-clip-js
OpenAI's CLIP model ported to JavaScript using the ONNX web runtime
License: MIT License
First of all, thank you for sharing such a cool project! Your other projects are amazing too.
I have tested the model and it works great, but how do I convert the inference results into English text?
Any examples or explanation would be appreciated!
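For context: CLIP's encoders return embedding vectors, not text, so there is no direct decoding step back to English. A common way to get a label out is to embed a set of candidate captions and pick the one whose text embedding is closest to the image embedding. Below is a minimal sketch of that ranking in plain Python. The three-dimensional vectors and the caption strings are made-up stand-ins for real model outputs (which are 512-dimensional for ViT-B/32), and `rank_captions` is a hypothetical helper, not part of this repo.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_captions(image_embedding, caption_embeddings):
    """Return (caption, score) pairs sorted from best to worst match."""
    scored = [
        (caption, cosine_similarity(image_embedding, embedding))
        for caption, embedding in caption_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up embeddings for illustration only; real ones come from the
# image and text models.
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.3],
    "a diagram": [0.0, 0.2, 0.9],
}

ranking = rank_captions(image_embedding, caption_embeddings)
best_caption, best_score = ranking[0]
print(best_caption)
```

The candidate captions have to be supplied by the caller; the model only tells you which of them fits best.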
I get an error when running:
torch.onnx.export(model, text_tokens, "clip-text-vit-32.onnx", export_params=True, opset_version=12, do_constant_folding=True, input_names = ['input'], output_names = ['output'], dynamic_axes={'input' : {0 : 'batch_size'}, 'output' : {0 : 'batch_size'}})
Output:
============= Diagnostic Run torch.onnx.export version 2.0.1+cu118 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 1 ERROR ========================
ERROR: missing-standard-symbolic-function
=========================================
Exporting the operator 'aten::unflatten' to ONNX opset version 12 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.
None
<Set verbose=True to see more details>
---------------------------------------------------------------------------
UnsupportedOperatorError Traceback (most recent call last)
<ipython-input-12-465ac7da98a7> in <cell line: 1>()
----> 1 torch.onnx.export(model, text_tokens, "clip-text-vit-32.onnx", export_params=True, opset_version=12, do_constant_folding=True, input_names = ['input'], output_names = ['output'], dynamic_axes={'input' : {0 : 'batch_size'}, 'output' : {0 : 'batch_size'}})
4 frames
/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py in _run_symbolic_function(graph, block, node, inputs, env, operator_export_type)
1899 return graph_context.op(op_name, *inputs, **attrs, outputs=node.outputsSize()) # type: ignore[attr-defined]
1900
-> 1901 raise errors.UnsupportedOperatorError(
1902 symbolic_function_name,
1903 opset_version,
UnsupportedOperatorError: Exporting the operator 'aten::unflatten' to ONNX opset version 12 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.
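The error is tied to a specific opset version, so one thing that can be tried is repeating the export with other `opset_version` values. The sketch below does that generically. `first_exportable_opset` is a hypothetical helper, and whether any opset (or a different PyTorch release) actually supports `aten::unflatten` is an assumption that this thread does not confirm.

```python
def first_exportable_opset(export_fn, opsets):
    """Call export_fn(opset) for each candidate and return the first opset
    for which it does not raise, or None if every candidate fails."""
    for opset in opsets:
        try:
            export_fn(opset)
        except Exception as err:  # broad on purpose: this is only a probe
            print(f"opset {opset} failed: {err}")
            continue
        return opset
    return None


# Hypothetical usage with the call from this issue (requires torch, plus the
# `model` and `text_tokens` objects from the original notebook):
#
# def export_with(opset):
#     torch.onnx.export(model, text_tokens, "clip-text-vit-32.onnx",
#                       export_params=True, opset_version=opset,
#                       do_constant_folding=True,
#                       input_names=['input'], output_names=['output'],
#                       dynamic_axes={'input': {0: 'batch_size'},
#                                     'output': {0: 'batch_size'}})
#
# print(first_exportable_opset(export_with, range(12, 18)))
```

If every candidate fails, the helper returns `None`, which would point toward changing the PyTorch version or the model code rather than the opset.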
Hi, I am currently trying to convert the ViT-L/14 model, but I am running into a memory issue when I try to load the model in the ONNX web runtime. Do you have any ideas?
I might have to just wait for it to be quantized to INT8.
Thanks,
Awesome project!
I'm trying to use the tflite model that comes out of the conversion, but its output doesn't look the same as the original model's.
After converting to tflite, I use the data provided in the OpenAI PyTorch example:
import numpy as np
import tensorflow as tf
import clip

text_model_path = 'clip-text-vit-32.tflite'
# Load TFLite model and allocate tensors.
text_interpreter = tf.lite.Interpreter(model_path=text_model_path)
text_interpreter.allocate_tensors()
# Get input and output tensors.
text_input_details = text_interpreter.get_input_details()
text_output_details = text_interpreter.get_output_details()
# Tokenize the prompt and feed it to the interpreter.
text_token = clip.tokenize(['This is a page of text about segmentation'])
text_input = np.array(text_token)
text_interpreter.set_tensor(text_input_details[0]['index'], text_input)
text_interpreter.invoke()
# Read the result back (this step was missing from the snippet as posted).
text_output = text_interpreter.get_tensor(text_output_details[0]['index'])
print(text_output[0, :10])
which gives output of:
[-0.1661 0.0545 -0.1515 0.4507 0.207 -0.2947 0.0406 -0.4087 -0.151
0.3198]
For comparison, the tutorial notebook https://colab.research.google.com/github/openai/clip/blob/master/notebooks/Interacting_with_CLIP.ipynb
runs the same calculation:
text_token = clip.tokenize(['This is a page of text about segmentation']).cuda()
text_feature = model.encode_text(text_token).float()
print(np.array(text_feature.tolist())[0, :10])
which gives output of:
array([-8.46557617e-02, 3.23486328e-01, 9.23461914e-02, -2.18261719e-01,
9.08203125e-02, 1.81152344e-01, -7.84397125e-04, -8.35449219e-01,
6.68945312e-01, -4.18945312e-01])
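To put a number on the mismatch, the two printed slices can be compared directly. This sketch uses only the ten values quoted above, not the full embeddings, so it is a rough indicator at best. `compare_slices` is a hypothetical helper written for this comparison.

```python
import math

# First ten values printed by the TFLite run and by the PyTorch notebook.
tflite_slice = [-0.1661, 0.0545, -0.1515, 0.4507, 0.207,
                -0.2947, 0.0406, -0.4087, -0.151, 0.3198]
pytorch_slice = [-8.46557617e-02, 3.23486328e-01, 9.23461914e-02,
                 -2.18261719e-01, 9.08203125e-02, 1.81152344e-01,
                 -7.84397125e-04, -8.35449219e-01, 6.68945312e-01,
                 -4.18945312e-01]


def compare_slices(a, b):
    """Return (cosine similarity, largest element-wise absolute difference)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    max_abs_diff = max(abs(x - y) for x, y in zip(a, b))
    return dot / (norm_a * norm_b), max_abs_diff


cosine, max_abs_diff = compare_slices(tflite_slice, pytorch_slice)
print(f"cosine similarity: {cosine:.3f}")
print(f"max abs difference: {max_abs_diff:.3f}")
```

A cosine similarity close to 1 would suggest the outputs differ only by scale or rounding. A value far from 1 suggests the two runs are not computing the same thing (for example, different weights or inputs), which is a question the thread leaves open.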