Modelling Image Virality

This repository records the details of my summer project, which explores the relationship between the content of images and their virality.

Files:

The Jupyter notebooks are mainly used to test demos before the scripts are built.

  • Load_OriginalDataSet.ipynb: loads the original data set from the previous paper.
  • EDA & Build_dataset.ipynb: performs exploratory data analysis (EDA) and builds the data sets: image pairs for the Siamese network and image sets for the classification task.
  • Net_Construction.ipynb: builds the Siamese network.
  • Preliminary Training.ipynb: runs preliminary training experiments.
  • Classification_task.ipynb: predicts the subreddit of each image.

Scripts:

  • classifier_net.py: the networks used for classification, including AlexNet, VGG, ResNet and DenseNet.
  • data_set.py: defines the image data sets. Two data sets are defined here:
    • Reddit_Img_Pair: data for the Siamese network; returns image pairs.
    • Reddit_Images_for_Classification: data for the classification task, modelled on torchvision.datasets.MNIST.
  • download_images.py: script for downloading the images from the server.
  • feature_extractor.py: the feature layers of the CNN models, used for feature extraction.
  • losses.py: defines the loss used to train the Siamese network.
  • siamese_net.py: defines the Siamese network.
  • train_classifier.py
  • train_siamese.py
  • transforms.py: defines the transforms applied to the images.
  • utils.py: miscellaneous helper functions.
  • visualisation.py: functions for image visualisation.
  • script.sh: commands for training and testing the models.
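
The list above does not say which loss losses.py implements. As an illustration only, the sketch below shows a pairwise logistic (RankNet-style) ranking loss, a common choice for training Siamese rankers. The function names `softplus` and `pairwise_ranking_loss`, their arguments, and the label convention are assumptions made for this example and are not taken from the repository.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ranking_loss(score_a, score_b, label):
    """Cross-entropy ranking loss for one image pair (hypothetical helper).

    score_a, score_b: scalar outputs of the two Siamese branches.
    label: 1.0 if image A is the more viral one, 0.0 if image B is.

    The model's probability that A beats B is sigmoid(score_a - score_b).
    The cross-entropy against `label` simplifies to
    softplus(diff) - label * diff.
    """
    diff = score_a - score_b
    return softplus(diff) - label * diff


if __name__ == "__main__":
    # A correctly ordered pair costs less than a wrongly ordered one.
    print(pairwise_ranking_loss(2.0, 0.0, 1.0))
    print(pairwise_ranking_loss(0.0, 2.0, 1.0))
```

The loss depends only on the difference between the two branch scores, so both branches can share one set of weights.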

Pipeline

Classification:

  1. Download all the images to form the data set.
  2. Select the images from the 5 most popular subreddits and build the classification data set.
  3. Train the classifier to predict each image's subreddit.

The backbone for the Siamese network is chosen according to classification performance. AlexNet was chosen for its small size and low computational cost; its classification performance is also reasonably good.
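
Step 2 of the classification pipeline can be sketched as follows. This is a minimal illustration, assuming "most popular" means the subreddits with the most images in the data set and that each record is an (image path, subreddit name) tuple. The function `keep_top_subreddits` and the record layout are hypothetical and not taken from the repository.

```python
from collections import Counter


def keep_top_subreddits(records, k=5):
    """Keep only images from the k most frequent subreddits.

    records: iterable of (image_path, subreddit_name) tuples (assumed layout).
    Returns (samples, class_names), where samples is a list of
    (image_path, class_index) and class_names[class_index] is the subreddit.
    """
    records = list(records)
    counts = Counter(subreddit for _, subreddit in records)
    class_names = [name for name, _ in counts.most_common(k)]
    index_of = {name: idx for idx, name in enumerate(class_names)}
    samples = [
        (path, index_of[subreddit])
        for path, subreddit in records
        if subreddit in index_of
    ]
    return samples, class_names


if __name__ == "__main__":
    demo = [
        ("a.jpg", "pics"), ("b.jpg", "pics"), ("c.jpg", "funny"),
        ("d.jpg", "pics"), ("e.jpg", "funny"), ("f.jpg", "aww"),
    ]
    samples, class_names = keep_top_subreddits(demo, k=2)
    print(class_names)
    print(samples)
```

The integer class indices follow the popularity order, so index 0 is always the subreddit with the most images.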

Siamese network for predicting virality:

  1. Build image pairs (500 image pairs to begin with).
  2. Build the Siamese network, combining AlexNet with a Spatial Transformer Network.
  3. Train the network and test its performance.
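
Step 1 of the Siamese pipeline can be sketched as follows. The source does not describe how pairs are sampled or how virality is scored, so this sketch assumes each image has a numeric virality score and draws random pairs with different scores. The function `build_pairs`, its parameters, and the label convention are hypothetical.

```python
import random


def build_pairs(scores, n_pairs=500, seed=0):
    """Sample labelled image pairs for a Siamese ranker (hypothetical helper).

    scores: dict mapping image id -> numeric virality score (assumed input).
    Returns a list of (id_a, id_b, label) with label 1 if image A has the
    higher score and 0 otherwise. Tied pairs are skipped; the same pair may
    be drawn more than once.
    """
    if len(set(scores.values())) < 2:
        raise ValueError("need at least two images with different scores")
    rng = random.Random(seed)  # fixed seed keeps the pair list reproducible
    ids = sorted(scores)
    pairs = []
    while len(pairs) < n_pairs:
        id_a, id_b = rng.sample(ids, 2)  # two distinct images
        if scores[id_a] == scores[id_b]:
            continue  # no ordering information in a tie
        pairs.append((id_a, id_b, 1 if scores[id_a] > scores[id_b] else 0))
    return pairs


if __name__ == "__main__":
    demo_scores = {"img1": 120, "img2": 4, "img3": 37, "img4": 37}
    for pair in build_pairs(demo_scores, n_pairs=5, seed=1):
        print(pair)
```

Because the sampler draws pairs in random order, image A is not always the more viral one, which keeps the two label values roughly balanced.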

References

  1. H. Lakkaraju, J. McAuley, and J. Leskovec, “What’s in a name? Understanding the Interplay between Titles, Content, and Communities in Social Media,” p. 10.

  2. A. Deza and D. Parikh, “Understanding Image Virality,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1818–1826.

  3. K. K. Singh and Y. J. Lee, “End-to-End Localization and Ranking for Relative Attributes,” arXiv:1608.02676 [cs], Aug. 2016.
