The pymoji from lusbenjamin


✨📸🕶✨

License: MIT License

Python 84.77% Shell 2.81% HTML 12.42%


make 🤠 bigger
MIDPOINT_BETWEEN_EYES would be a better centroid for pasting the emoji
(currently hardcoded: emoji faces sit a little too low and eat into necks, the hat emoji is too low; pan/tilt angles could help)
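A minimal sketch of the centering idea, assuming the landmark has already been read out of the face annotation as an (x, y) pixel pair. The function name `paste_origin` and the `y_shift_ratio` knob are hypothetical, not part of pymoji:

```python
def paste_origin(anchor_xy, emoji_size, y_shift_ratio=0.0):
    """Return the top-left corner at which to paste an emoji so that its
    center lands on anchor_xy (e.g. the midpoint between the eyes).

    y_shift_ratio is a hypothetical tuning knob: a negative value moves the
    emoji up by that fraction of its height (useful for hats).
    """
    anchor_x, anchor_y = anchor_xy
    width, height = emoji_size
    left = round(anchor_x - width / 2)
    top = round(anchor_y - height / 2 + y_shift_ratio * height)
    return left, top


print(paste_origin((200, 150), (100, 100)))                      # (150, 100)
print(paste_origin((200, 150), (100, 100), y_shift_ratio=-0.5))  # (150, 50)
```

The returned corner could then be handed to whatever paste call the image library provides, which would replace the hardcoded vertical offset with one tunable ratio per emoji.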
Basic auth of some kind:
- non-anonymous user (HTTP?) OR
- user OAuth logins from Google, Facebook, et al +fullstory
  - and SSL
  - make sure Stackdriver can get through (uptime checks are keeping it awake)
  - a Flask plugin (flask-sslify?) might allow this based on a regex
automatically throttle app to turn off after a certain number of calls
- set up basic postgres / datastore on Google Cloud compute
- establish a basic counter, and gate calls hourly and total
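The hourly and total gating could look roughly like the sketch below. It keeps its counts in memory purely for illustration; the class name `CallGate` and its parameters are assumptions, and a real deployment would store the counts in the Postgres or Datastore instance mentioned above so they survive restarts.

```python
import time


class CallGate:
    """In-memory stand-in for the call counter (illustrative only)."""

    def __init__(self, hourly_limit, total_limit, clock=time.time):
        self.hourly_limit = hourly_limit
        self.total_limit = total_limit
        self.clock = clock  # injectable so tests can fake the time
        self.total_calls = 0
        self.current_hour = None
        self.calls_this_hour = 0

    def allow(self):
        """Record one call and return True, or return False if over a limit."""
        hour = int(self.clock() // 3600)
        if hour != self.current_hour:
            # A new hour has started: reset the hourly counter only.
            self.current_hour = hour
            self.calls_this_hour = 0
        if self.total_calls >= self.total_limit:
            return False
        if self.calls_this_hour >= self.hourly_limit:
            return False
        self.total_calls += 1
        self.calls_this_hour += 1
        return True
```

A request handler would call `gate.allow()` before hitting the Vision API and return an error page when it comes back `False`.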

diagnose PNG masking issue
solve session management (race conditions), use anonymous cookies or OAuth (always on?), +fullstory
handle timeouts (asynchronous, polling, etc)

hella easter egg: super crop and produce an extra image if we find any emoji faces hidden in inanimate background objects in the image
get 1000 training images to start training our own
Google video client

Continue to use Google Vision API to handle face detection
Build a new model to handle classifying heads as emoji
- use Vision API bounding boxes to crop and scale heads
- input is 128x128x3 array of pixel values (unstructured data)
- output is Softmax MECE classification
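As a rough sketch of the two ends of that pipeline: the first helper turns a face bounding box into a square crop region (ready to be scaled to 128x128), and the second is a plain softmax over raw class scores. The `(left, top, right, bottom)` box layout and both function names are assumptions made for illustration; the Vision API response would need to be converted into that layout first.

```python
import math


def square_crop_box(box, image_size):
    """Expand a (left, top, right, bottom) box to a square, clamped to the image."""
    left, top, right, bottom = box
    img_w, img_h = image_size
    # Use the longer edge of the box, but never exceed the image itself.
    side = min(max(right - left, bottom - top), img_w, img_h)
    center_x = (left + right) / 2
    center_y = (top + bottom) / 2
    new_left = int(min(max(center_x - side / 2, 0), img_w - side))
    new_top = int(min(max(center_y - side / 2, 0), img_h - side))
    return new_left, new_top, new_left + side, new_top + side


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(square_crop_box((10, 20, 50, 100), (200, 200)))  # (0, 20, 80, 100)
```

Because the softmax outputs sum to 1, taking the highest-probability class assigns each head exactly one emoji, which matches the MECE framing. The square region could be passed to an image library's crop and resize calls (for example Pillow's `Image.crop` and `Image.resize`) to produce the 128x128x3 input.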

Data augmentation likely to be a highly promising area for this project
consider adding image noise, rotations, distortions (saturation, color-balance, etc)
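A few of these augmentations can be sketched in plain Python on an image stored as rows of `(r, g, b)` tuples. The function names are hypothetical, the rotation is limited to 90-degree steps for simplicity, and the per-channel gain is only a crude stand-in for a real color-balance or saturation shift; an actual pipeline would more likely use an image or array library.

```python
import random


def add_noise(image, sigma, rng):
    """Add Gaussian noise to every channel, clamping to the 0-255 range."""
    return [
        [tuple(min(255, max(0, round(c + rng.gauss(0, sigma)))) for c in pixel)
         for pixel in row]
        for row in image
    ]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def scale_channels(image, gains):
    """Crude color-balance shift: multiply each RGB channel by its own gain."""
    return [
        [tuple(min(255, round(c * g)) for c, g in zip(pixel, gains))
         for pixel in row]
        for row in image
    ]


image = [[(10, 20, 30), (40, 50, 60)],
         [(70, 80, 90), (100, 110, 120)]]
rng = random.Random(0)  # seeded so augmented sets are reproducible
augmented = [add_noise(image, 5, rng), rotate_90(image),
             scale_channels(image, (1.2, 1.0, 0.8))]
```

Each original head yields several training examples with the same label, which is the main appeal when only around 1000 labeled images are available.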

Going to keep adding notes here
Intuition suggests that emojivision is a problem with extremely good human-level performance.
Manual error analysis will be relatively quick and highly valuable.
Getting to "super-human" performance may be really hard (humans ~ Bayes Error)

Unsupervised learning: probably beyond the scope of the project. Tackling it (or reinforcement learning) would require more project time, compute resources, and lots of data.
Transfer learning: highly promising technique that might let us extend our initial model to handle new classes of emoji without "starting over from scratch"
Multi-task learning: probably beyond the scope of the project. Might be appropriate if we were simultaneously finding food emoji in the background. (This is not the same as using a Softmax classifier to generalize beyond logistic regression!)
CNN: usually a good idea for exploiting the spatial structure of an image. Might be a good idea if our first models are really inaccurate, but might not be necessary after our input processing step (cropping and scaling to a nice square image of the head).
RNN: stupidly cool sounding, but probably a terrible choice for this problem. Best at handling sequences where "short-term memory" is important. It might be useful if we ever got to optimizing the "end-to-end joke" across all faces in the image.
End-to-end DL: probably beyond the scope of this project. Usually requires lots of data, and in our case, we'd need to invent a humor score to quantify "good" end-to-end emoji jokes.