
sryaco / aitools


AI tools used to test the William Elliot Griffis manuscript collection at Rutgers University Libraries

License: Creative Commons Zero v1.0 Universal

Jupyter Notebook 100.00%
rutgers-university


Readme.md

Aitools repository
AI tools tested on the William Elliot Griffis manuscript collection at Rutgers University Libraries

Sonia Yaco, Rutgers University, 2024

Locations:
Notebooks are in \notebooks
Photographs: A small number of photos that can be used for clustering and mapping are in \data and \data\photos

The full corpus of digitized Griffis Japan images used in testing (427 TIFF files, 10 GB) is available for download from Google Drive: https://drive.google.com/drive/folders/1U-NIDpXC5cUOzNW0fZ0H8mk9Q5PH3xgG?usp=drive_link

Program names with descriptions

Cosine_similarity.ipynb
Compares two UTF-8 encoded texts and calculates their cosine similarity.
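As a rough illustration of the calculation (not the notebook's actual code), the sketch below compares two texts using raw word counts. The tokenizer, the function names, and the choice of plain term frequencies are all assumptions made for this example; the notebook may vectorize the texts differently.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of the two texts' word-count vectors."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Only words that occur in both texts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text cannot be compared
    return dot / (norm_a * norm_b)


def read_utf8(path: str) -> str:
    """Read a UTF-8 text file (hypothetical helper for loading the inputs)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

A score near 1 means the two texts use words in similar proportions; a score of 0 means they share no words at all.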

Image_cluster.ipynb
Sorts photos into 4 groups by content similarity. Prints 5 photos from each group.

Image_clustermatch.ipynb
• Image clustering: Sorts photos into 4 groups by content similarity. Prints 7 photos from each group on screen and to a PNG file.
• Image matching: Matches one image against all others and selects the 5 closest. The corpus does not need to be reprocessed, so the routine can be re-run quickly with a different photo file name each time. The original picture is displayed, followed by the 5 matches.

Image_match.ipynb
Provides the 5 images most similar to one selected image, based on VGG16 pattern similarity.
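The ranking step can be sketched as follows, assuming each image has already been reduced to a feature vector (for example by a VGG16 model) and stored in a dictionary keyed by file name. The function names, the dictionary layout, and the use of cosine similarity as the distance measure are assumptions for this example, not details taken from the notebook.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_matches(query_name, features, k=5):
    """Return the k (name, score) pairs most similar to query_name, best first."""
    query = features[query_name]
    scored = [
        (name, cosine(query, vector))
        for name, vector in features.items()
        if name != query_name  # do not match the query image to itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Because the feature vectors are computed once and reused, only the query name changes between runs, which is consistent with the note that matching can be re-run quickly without reprocessing the corpus.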

NER.ipynb
Three routines:
• NER in alphabetical order by word: spaCy and NLTK write NER lists to an output file, in alphabetical word order.
• NER in category order: spaCy and NLTK write NER lists to an output file, sorted by NER category.
• NER color-coded word visualizations, in two versions:
  o All named entities are shown color coded in context.
  o Only three filtered labels ('PERSON', 'ORG', 'GPE') are shown, in original word order but without context.
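The three orderings described above can be sketched on entities that have already been extracted. The example assumes each entity is a (text, label) pair, such as one built from a spaCy entity's `text` and `label_` attributes; the function names and the pair layout are hypothetical and not taken from the notebook.

```python
from collections import defaultdict

# The three labels kept by the filtered visualization.
KEEP_LABELS = ("PERSON", "ORG", "GPE")


def alpha_order(entities):
    """Sort (text, label) pairs alphabetically by entity text, ignoring case."""
    return sorted(entities, key=lambda ent: (ent[0].lower(), ent[1]))


def category_order(entities):
    """Group entity texts under their label, with labels and texts sorted."""
    grouped = defaultdict(list)
    for text, label in entities:
        grouped[label].append(text)
    return {label: sorted(grouped[label], key=str.lower) for label in sorted(grouped)}


def filter_labels(entities, keep=KEEP_LABELS):
    """Keep only entities with the chosen labels, preserving original order."""
    return [ent for ent in entities if ent[1] in keep]


# Hand-written sample entities, for illustration only.
sample = [
    ("William Elliot Griffis", "PERSON"),
    ("Japan", "GPE"),
    ("2024", "DATE"),
    ("Rutgers University Libraries", "ORG"),
]
print(alpha_order(sample))
print(category_order(sample))
print(filter_labels(sample))
```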

ngrams.ipynb
Builds three lists of n-grams: those from the first input file (diary), those from the second input file (biography), and those common to both. The default n-gram length is 1.
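A minimal sketch of the three lists, using only the standard library. The tokenizer, the function names, and the choice to count a shared n-gram with the smaller of its two counts are assumptions for this example; the notebook may define "common" differently.

```python
import re
from collections import Counter


def ngrams(text, n=1):
    """Count the word n-grams of length n in text (n=1 gives single words)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def three_lists(diary_text, biography_text, n=1):
    """Return n-gram counts for the diary, the biography, and those in both."""
    diary = ngrams(diary_text, n)
    biography = ngrams(biography_text, n)
    # Counter intersection keeps each shared n-gram with its smaller count.
    common = diary & biography
    return diary, biography, common
```

Calling `three_lists(diary_text, biography_text, n=2)` would compare word pairs instead of single words.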

Sentiment_analysis.ipynb
Produces numeric scores and a graph of sentiment by paragraph. Output goes to the screen, a text file, and a PNG file.
