Wenzheng Kelly Kang's Projects
Data sampling library
Multilingual Sentence & Image Embeddings with BERT
This project uses the SLICE algorithm to extract information from a text-based PDF page containing financial statements (tabular data). It can also be used to extract regular tables, but the output will contain all text on the page.
Unsupervised Semantic Segmentation by Distilling Feature Correspondences
Notebook for my YouTube video on summarizing and querying multiple PDFs
AI alternative to SurveyMonkey
Talkify is an open source framework that aims to standardize and model conversational AI, enabling the development of personal assistants and chat bots. The mission of this framework is to make developing chat bots and personal assistants as easy as spinning up a simple website in HTML.
Temporal service
Tesseract Open Source OCR Engine (main repository)
Pure JavaScript OCR for more than 100 Languages 📖🎉🖥
A command-line application to convert images, PDFs, and audio files to text using Apple's APIs
Twitter NLP Tools extract knowledge from informal text like Twitter
TypeChat is a library that makes it easy to build natural language interfaces using types.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system; supports recognition of 80+ languages; provides data annotation and synthesis tools; supports training and deployment across server, mobile, embedded and IoT devices)
maximal update parametrization (µP)
LLM-powered Conversational AI experience using Vectara
📦🔐 A lightweight Node.js private proxy registry
AI search & chat for all Wait But Why posts.
WebAssembly compilation of https://github.com/ImageMagick/ImageMagick & samples
Generate PDF files with JavaScript and WASM (WebAssembly)
Add watermarks to React components in a more elegant way
🎑 Watermarking for the browser
Working demo of CSS Modules, using Webpack's css-loader in module mode
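Extract WeChat chat history, export it to HTML, Word, and CSV documents for permanent storage, and analyze the chat history to generate an annual chat report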
Xournal++ is handwriting note-taking software with PDF annotation support. It is written in C++ with GTK3 and supports Linux (e.g. Ubuntu, Debian, Arch, SUSE), macOS and Windows 10. It supports pen input from devices such as Wacom tablets.