Giter Site home page Giter Site logo

ftserver's Introduction

Lightweight Full Text Search Server for Java

Setup

Use NetBeans 8.2 to build or download fts.zip(WAR) from WAR Folder
Deploy to tomcat/jetty

Dependencies

iBoxDB

FTS Engine

Semantic-UI

Jodd

The Results Order

the results order based on the ID number in IndexTextNoTran(.. long id, ...), descending order.

every page has two index-IDs, normal-id and rankup-id, the rankup-id is a big number and used to keep the important text on the top. (the front results from SearchDistinct(IBox, String) )

Engine.IndexTextNoTran(..., p.Id, p.Content, ...);
Engine.IndexTextNoTran(..., p.RankUpId(), p.RankUpDescription(), ...);

the RankUpId()

public long RankUpId()
{
    return id | (1L << 60);
}

if you have more more important text , you can add one more index-id

public long AdvertisingId()
{
    return id | (1L << 61);
}
public static long RankDownId(long id)
{
    return id & (~(1L << 60 | 1L << 61)) ;
}
public static boolean IsAdvertisingId(long id)
{
    return id > (1L << 61) ;
}

the Page.GetRandomContent() method is used to keep the Search-Page-Content always changing, doesn't affect the real page order.

if you have many pages(>100,000), use the ID number to control the order instead of loading all pages to memory.

Search Format

[Word1 Word2 Word3] => text has Word1 and Word2 and Word3

["Word1 Word2 Word3"] => text has "Word1 Word2 Word3" as a whole

Search Method

searchDistinct (... String keywords, long startId, long count)

startId => which ID(the id when you called IndexText(,id,text)) to start, use (startId=Long.MaxValue) to read from the top, descending order

count => records to read, important parameter, the search speed depends on this parameter, not how big the data is.

Next Page

set the startId as the last id from the results of searchDistinct() minus one

keywords = function(searchDistinct(box, "keywords", startId, count));
nextpage_startId = keywords[last].ID - 1 
...
//read next page
searchDistinct(box, "keywords", nextpage_startId, count)

mostly, the nextpage_startId is posted from client browser when user reached the end of webpage, and set the default nextpage_startId=Long.MaxValue, in javascript the big number have to write as String ("'" + nextpage_startId + "'")

The Page-Text and the Text-Index -Process flow

When Insert

1.insert page --> 2.insert index

DB.Insert ("Page", page);
Engine.IndexTextNoTran( IsRemove = false );
...IndexTextNoTran...

When Delete

1.delete index --> 2.delete page

Engine.IndexTextNoTran( IsRemove = true );
...IndexTextNoTran...
DB.Delete("Page", page.Id);

Memory

indexTextWithTran(IBox, id, String, boolean) // faster, more memories

indexTextNoTran(AutoBox, commitCount, id, String, boolean) // less memory

Private Server

Open

public Page Html.get(String url);

Set your private WebSite text

Page page = new Page();
page.url = url;
page.title = "..."
page.description = "..."
page.content = "..."
return page;

More

C# ASP.NET Core Version

ftserver's People

Contributors

iboxdb avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.