rndmized / jee-application-for-document-similarity

Project for Advanced Object Oriented Software Development Module (4th Year, Bsc (Hons) in Software Development)

License: MIT License

Java 31.32% CSS 68.68%
db4o java tomcat jee-application

JEE-Application-for-Document-Similarity

Project for Advanced Object Oriented Software Development Module (4th Year, Bsc (Hons) in Software Development)

Project Overview

A Java web application that enables two or more text documents to be compared for similarity.

Project Minimum Requirements

The implementation includes the following features:

  1. A document or URL can be specified or selected from a web browser and then dispatched to a servlet instance running under Apache Tomcat.

  2. Each submitted document is parsed into its set of constituent shingles, compared against the existing document(s) in an object-oriented database (db4o), and then stored in the database.

  3. The similarity of the submitted document to the set of documents stored in the database is returned and presented to the session user.
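
The shingling and comparison steps above can be sketched as follows. This is a minimal illustration, not the project's code: the class name `ShingleSketch`, the word-level shingles, and the choice of `k` are all assumptions, and exact Jaccard similarity stands in for whatever measure the application computes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: word-level shingling and exact Jaccard similarity. */
final class ShingleSketch {

    /** Hashes each run of k consecutive words into one integer shingle. */
    static Set<Integer> shingles(String text, int k) {
        Set<Integer> result = new HashSet<>();
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return result; // an empty document has no shingles
        }
        String[] words = trimmed.split("\\s+");
        if (words.length < k) {
            // Shorter than one shingle: treat the whole text as a single shingle.
            result.add(String.join(" ", words).hashCode());
            return result;
        }
        for (int i = 0; i + k <= words.length; i++) {
            String shingle = String.join(" ", Arrays.copyOfRange(words, i, i + k));
            result.add(shingle.hashCode());
        }
        return result;
    }

    /** Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0 when both sets are empty. */
    static double jaccard(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }
}
```

For example, `ShingleSketch.jaccard(ShingleSketch.shingles("a b c d", 2), ShingleSketch.shingles("a b c e", 2))` compares two documents that share two of four distinct shingles.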

Simple UML Design

Features and Design Decisions

  • Access to the database is controlled by a set of classes (package: ie.gmit.db) to prevent concurrency issues; the database runs on a separate thread that takes in requests.
  • Comparisons between documents run on a separate thread.
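
One way to get the behaviour described in these two points is to funnel every database operation through a single worker thread and run comparisons on a separate pool. The sketch below assumes that design; `SerializedStore` and `ComparisonSketch` are hypothetical names, an in-memory list stands in for the db4o container, and string equality stands in for the real similarity measure. It does not reproduce the classes in ie.gmit.db.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical store: every read and write is queued onto one thread. */
final class SerializedStore implements AutoCloseable {
    private final ExecutorService dbThread = Executors.newSingleThreadExecutor();
    private final List<String> documents = new ArrayList<>(); // stand-in for the database

    /** Queues a write and returns the number of stored documents afterwards. */
    Future<Integer> save(String document) {
        Callable<Integer> task = () -> {
            documents.add(document);
            return documents.size();
        };
        return dbThread.submit(task);
    }

    /** Queues a read and returns a copy, so callers never touch the shared list. */
    Future<List<String>> loadAll() {
        Callable<List<String>> task = () -> new ArrayList<>(documents);
        return dbThread.submit(task);
    }

    @Override
    public void close() {
        dbThread.shutdown();
    }
}

/** Hypothetical comparison task that runs on a worker pool, not the request thread. */
final class ComparisonSketch {
    static Future<Long> countExactMatches(ExecutorService workers,
                                          SerializedStore store,
                                          String submitted) {
        Callable<Long> task = () ->
                store.loadAll().get().stream().filter(submitted::equals).count();
        return workers.submit(task);
    }
}
```

Because only the single database thread ever touches `documents`, no explicit locking is needed; callers on any thread just wait on the returned `Future`.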

Known Bugs

  • Empty documents get a 99% similarity result with any other document.
  • Due to the randomness of MinHash, similarity values fluctuate from one request to another for the same files.
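
Both issues can be illustrated with a small MinHash sketch. It is not the project's implementation, and the source does not state what causes the 99% result. The sketch makes two assumptions: the random values are drawn from a fixed seed so that repeated requests give the same estimate, and empty shingle sets are handled explicitly. The class name `MinHashSketch` is hypothetical, and XOR masks are used as a simple stand-in for a family of hash functions.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.Set;

/** Hypothetical MinHash estimator with reproducible masks and an empty-set guard. */
final class MinHashSketch {
    private final int[] masks;

    MinHashSketch(int numHashes, long seed) {
        Random random = new Random(seed); // fixed seed: the same masks on every run
        masks = new int[numHashes];
        for (int i = 0; i < numHashes; i++) {
            masks[i] = random.nextInt();
        }
    }

    /** For each mask, keeps the smallest value of (shingle XOR mask). */
    int[] signature(Set<Integer> shingles) {
        int[] sig = new int[masks.length];
        Arrays.fill(sig, Integer.MAX_VALUE);
        for (int shingle : shingles) {
            for (int i = 0; i < masks.length; i++) {
                sig[i] = Math.min(sig[i], shingle ^ masks[i]);
            }
        }
        return sig;
    }

    /** Fraction of signature positions that agree; 0 if either set is empty. */
    double similarity(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0; // avoid comparing a signature made only of filler values
        }
        int[] sigA = signature(a);
        int[] sigB = signature(b);
        int equal = 0;
        for (int i = 0; i < masks.length; i++) {
            if (sigA[i] == sigB[i]) {
                equal++;
            }
        }
        return (double) equal / masks.length;
    }
}
```

Two estimators built with the same seed, for example `new MinHashSketch(200, 42L)`, return the same value for the same pair of documents. The value is still an estimate of the Jaccard index, so a different seed can give a slightly different number.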

Technologies

  • Java: Java is a set of computer software and specifications developed by Sun Microsystems, which was later acquired by the Oracle Corporation, that provides a system for developing application software and deploying it in a cross-platform computing environment.

  • db4o: An embeddable open source object database for Java and .NET developers. It was developed, commercially licensed and supported by Actian. In October 2014, Actian declined to continue actively pursuing and promoting the commercial db4o product offering for new customers.

  • db4o XTEA encryption library: Encryption support for the db4o open source object database. XTEA is a 64-bit block Feistel cipher with a 128-bit key and a suggested 64 rounds.
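
To make the XTEA parameters concrete, the sketch below implements the published XTEA block routine on a single 64-bit block (two 32-bit words) with a 128-bit key (four 32-bit words). It is a standalone illustration under the hypothetical name `XteaSketch`; it does not show the API of the db4o XTEA library, and it omits padding and any mode of operation for longer data.

```java
/** Hypothetical standalone XTEA block cipher: 64-bit block, 128-bit key. */
final class XteaSketch {
    private static final int DELTA = 0x9E3779B9;
    private static final int CYCLES = 32; // 32 cycles correspond to 64 Feistel rounds

    /** Encrypts one block of two ints with a key of four ints. */
    static int[] encrypt(int[] block, int[] key) {
        int v0 = block[0];
        int v1 = block[1];
        int sum = 0;
        for (int i = 0; i < CYCLES; i++) {
            v0 += (((v1 << 4) ^ (v1 >>> 5)) + v1) ^ (sum + key[sum & 3]);
            sum += DELTA;
            v1 += (((v0 << 4) ^ (v0 >>> 5)) + v0) ^ (sum + key[(sum >>> 11) & 3]);
        }
        return new int[] {v0, v1};
    }

    /** Reverses encrypt by running the same rounds backwards. */
    static int[] decrypt(int[] block, int[] key) {
        int v0 = block[0];
        int v1 = block[1];
        int sum = DELTA * CYCLES; // wraps around like 32-bit unsigned arithmetic
        for (int i = 0; i < CYCLES; i++) {
            v1 -= (((v0 << 4) ^ (v0 >>> 5)) + v0) ^ (sum + key[(sum >>> 11) & 3]);
            sum -= DELTA;
            v0 -= (((v1 << 4) ^ (v1 >>> 5)) + v1) ^ (sum + key[sum & 3]);
        }
        return new int[] {v0, v1};
    }
}
```

Decrypting the output of `encrypt` with the same key returns the original block. Java's `>>>` operator is used so that right shifts behave as they do on unsigned 32-bit words.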

Tools and IDEs

  • Eclipse: Eclipse is an integrated development environment (IDE) used in computer programming, and is one of the most widely used Java IDEs.

  • Tomcat: The Apache Tomcat® software is an open source implementation of the Java Servlet, JavaServer Pages, Java Expression Language and Java WebSocket technologies.

Authors

License

This project is licensed under the MIT License; see the LICENSE.md file for details.
