
openai-clip-js's People

Contributors

josephrocca


openai-clip-js's Issues

[question] how to convert embeddings into text?

First of all, thank you for sharing such a cool project! Your other projects are also amazing!
I have tested the model and it works great, but how do I convert the inference results into English text?

Any examples or an explanation would be appreciated!
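
CLIP embeddings can't be decoded directly back into English text; the usual approach is to compare an embedding against the embeddings of candidate texts and pick the closest one. A minimal sketch of that idea using the original PyTorch clip package (the candidate labels below are purely illustrative):

import clip
import torch

device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical candidate descriptions -- purely illustrative.
labels = ["a photo of a dog", "a photo of a cat", "a diagram"]

with torch.no_grad():
    text_features = model.encode_text(clip.tokenize(labels).to(device)).float()
    text_features /= text_features.norm(dim=-1, keepdim=True)

# Given an L2-normalized image (or query) embedding, cosine similarity ranks the labels:
# best_label = labels[int((query_embedding @ text_features.T).argmax())]

The same dot-product ranking works with embeddings produced by the ONNX/JS models, as long as both vectors are L2-normalized first.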

Error when generating ONNX model

The export fails when calling:

torch.onnx.export(model, text_tokens, "clip-text-vit-32.onnx", export_params=True, opset_version=12, do_constant_folding=True, input_names=['input'], output_names=['output'], dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}})

It outputs:

============= Diagnostic Run torch.onnx.export version 2.0.1+cu118 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 1 ERROR ========================
ERROR: missing-standard-symbolic-function
=========================================
Exporting the operator 'aten::unflatten' to ONNX opset version 12 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.
None
<Set verbose=True to see more details>


---------------------------------------------------------------------------
UnsupportedOperatorError                  Traceback (most recent call last)
<ipython-input-12-465ac7da98a7> in <cell line: 1>()
----> 1 torch.onnx.export(model, text_tokens, "clip-text-vit-32.onnx", export_params=True, opset_version=12, do_constant_folding=True, input_names = ['input'], output_names = ['output'], dynamic_axes={'input' : {0 : 'batch_size'}, 'output' : {0 : 'batch_size'}})

4 frames
/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py in _run_symbolic_function(graph, block, node, inputs, env, operator_export_type)
   1899             return graph_context.op(op_name, *inputs, **attrs, outputs=node.outputsSize())  # type: ignore[attr-defined]
   1900 
-> 1901         raise errors.UnsupportedOperatorError(
   1902             symbolic_function_name,
   1903             opset_version,

UnsupportedOperatorError: Exporting the operator 'aten::unflatten' to ONNX opset version 12 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.
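
A common workaround for this class of export error is to raise the opset version, since recent PyTorch builds register a symbolic function for aten::unflatten at opset 13 and above. A minimal sketch, reusing model and text_tokens from the cell above and assuming the installed PyTorch supports opset 14:

import torch

torch.onnx.export(
    model,                      # CLIP text model from the cell above
    text_tokens,                # tokenized example input
    "clip-text-vit-32.onnx",
    export_params=True,
    opset_version=14,           # raised from 12; aten::unflatten has no opset-12 symbolic
    do_constant_folding=True,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}},
)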

Currently converting the ViT-L/14 model

Hi, I am currently trying to convert the ViT-L/14 model, but I'm running into a memory issue when I try to load the model in ONNX Runtime Web. Do you have any ideas?
I might have to just wait for it to be quantized to INT8.

Thanks,
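
For reference, dynamic INT8 quantization of an exported ONNX file can be done with onnxruntime's quantization utilities; a minimal sketch, where the file names are placeholders for the actual ViT-L/14 export:

from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    "clip-image-vit-l-14.onnx",       # placeholder: float32 model exported from PyTorch
    "clip-image-vit-l-14-int8.onnx",  # placeholder: quantized output file
    weight_type=QuantType.QUInt8,     # quantize weights to unsigned 8-bit integers
)

Whether the quantized graph then fits within ONNX Runtime Web's memory limits would still need to be verified separately.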

Errors when running demo

I'm getting an error when running the text model in Chrome.

(Screenshot: error from the text model in Chrome, 2022-04-29.)

The image model runs successfully on WASM, but fails on WebGL:
(Screenshot: error from the image model on the WebGL backend, 2022-04-29.)

TFLite output is wrong

Awesome project!

I'm trying to use the TFLite model that comes out of the conversion, but its output doesn't look the same as the original model's.

After converting to TFLite, I use the input data from the OpenAI PyTorch example:

import numpy as np
import tensorflow as tf
import clip

text_model_path = 'clip-text-vit-32.tflite'

# Load TFLite model and allocate tensors.
text_interpreter = tf.lite.Interpreter(model_path=text_model_path)
text_interpreter.allocate_tensors()

# Get input and output tensor details.
text_input_details = text_interpreter.get_input_details()
text_output_details = text_interpreter.get_output_details()

# Tokenize the prompt and run it through the interpreter.
text_token = clip.tokenize(['This is a page of text about segmentation'])
text_input = np.array(text_token)
text_interpreter.set_tensor(text_input_details[0]['index'], text_input)
text_interpreter.invoke()
text_output = text_interpreter.get_tensor(text_output_details[0]['index'])
print(text_output[0, :10])

which gives output of:

[-0.1661  0.0545 -0.1515  0.4507  0.207  -0.2947  0.0406 -0.4087 -0.151
  0.3198]

For comparison, the tutorial notebook https://colab.research.google.com/github/openai/clip/blob/master/notebooks/Interacting_with_CLIP.ipynb runs the same calculation:

text_token = clip.tokenize(['This is a page of text about segmentation']).cuda()
text_feature = model.encode_text(text_token).float()
print(np.array(text_feature.tolist())[0, :10])

which gives output of:

array([-8.46557617e-02,  3.23486328e-01,  9.23461914e-02, -2.18261719e-01,
        9.08203125e-02,  1.81152344e-01, -7.84397125e-04, -8.35449219e-01,
        6.68945312e-01, -4.18945312e-01])
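
A quick way to quantify the mismatch is to compare the two embeddings directly rather than eyeballing the first ten values; a minimal sketch, assuming tflite_out and torch_out hold the full output vectors from the two runs above:

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two 1-D embedding vectors.
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A value close to 1.0 would indicate the conversion preserved the embedding;
# anything much lower points to a real discrepancy rather than numeric noise.
# print(cosine_similarity(tflite_out, torch_out))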
