
Comments (11)

tomatolog commented on July 23, 2024

Fixed at 8d7ead6. With local_df enabled by default, the daemon now skips wildcard terms and does not calculate global statistics for them.

To get the fix, install any package with a daemon built after 8d7ead6. Alternatively, add the option local_df=0 to your query to avoid the performance drop.
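
For the per-query workaround, here is a minimal sketch of what the modified statement looks like. The table name ul and the 77* pattern are taken from the report in this thread. The helper name with_option is hypothetical and only builds the SQL string; sending it to the daemon is left to whatever MySQL-protocol client you already use.

```python
def with_option(query: str, name: str, value: int) -> str:
    """Append an OPTION clause to a SELECT statement (hypothetical helper).

    Assumes the query does not already contain an OPTION clause.
    """
    # Drop trailing whitespace and the terminating semicolon, if any.
    base = query.rstrip().rstrip(";").rstrip()
    return f"{base} OPTION {name}={value};"


slow_query = "SELECT count(*) FROM ul WHERE MATCH('77*');"
workaround = with_option(slow_query, "local_df", 0)
print(workaround)
```

The resulting statement is the original query with OPTION local_df=0 appended, which disables the global df calculation for that query only.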

from manticoresearch.

AlexeyRemenyak commented on July 23, 2024

Index files uploaded to storage


tomatolog commented on July 23, 2024

Could you provide timings for the same query against the same number of disk chunks on the two daemons (6.2.12 and 6.3), to make sure we are checking the same case?


AlexeyRemenyak commented on July 23, 2024

If needed, I can perform an additional comparison with version 6.2.12 under identical conditions. Yesterday, I compared the two and shared the results in the Telegram chat, but the datasets were different.

With version 6.2.12, using 26 disk chunks and ~13 million records, the query select count(*) from ul where match('77*') executed in 0.274 sec.
With version 6.3.0, using 26 disk chunks and ~1.3 million records, the same query took 37.48 sec. After optimizing the index down to 1 disk chunk, the execution time dropped to ~0.2 sec.

All settings were identical, the query was the same, and the only difference was the amount of data, which still didn't help 6.3.


tomatolog commented on July 23, 2024

Unable to reproduce the case: when I try to load the index you uploaded into 6.2.12-release, I get this error:

WARNING: table 'test': prealloc: disk chunk data/ul_full.0: prealloc failed: data/ul_full.0.sph is v.65, binary is v.64 - NOT SERVING

It seems you uploaded an index from the 630-release, and the old daemon cannot read the new index format.

You need to upload an index created with the 6.2.12-release and provide the query timings on the same index data (one from the 6.2.12-release and one from the 630-release) to make sure you are checking the same case.


tomatolog commented on July 23, 2024

It seems I reproduced the case after changing the index version by hand.
6.2.12-release gives 0.095 sec

/* Wed Jun 05 12:14:16.119 2024 conn 8 real 0.095 wall 0.095 found 1 */ SELECT count(*) FROM test WHERE MATCH('77*') ORDER BY id desc;

6.3-release gives 4.2 sec

/* Wed Jun 05 12:14:39.755 2024 conn 5 (127.0.0.1:56282) real 4.182 wall 4.182 found 1 */ SELECT count(*) FROM test WHERE MATCH('77*') ORDER BY id desc; /*terms expansion=(merged 137507, not merged 7) */
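
To compare such runs without reading the numbers by eye, the timing fields can be pulled out of the leading comment of each query log line. This is a small sketch that assumes the "real ... wall ... found ..." layout seen in the two lines above; the names TIMING_RE and parse_timing are made up for illustration.

```python
import re

# Matches the timing fields inside the leading /* ... */ comment of a log line.
TIMING_RE = re.compile(
    r"real (?P<real>\d+\.\d+) wall (?P<wall>\d+\.\d+) found (?P<found>\d+)"
)


def parse_timing(log_line: str) -> dict:
    """Extract real time, wall time and found count from one query log line."""
    match = TIMING_RE.search(log_line)
    if match is None:
        raise ValueError("no timing fields found in log line")
    return {
        "real": float(match["real"]),
        "wall": float(match["wall"]),
        "found": int(match["found"]),
    }


# The two log lines quoted in this comment.
old = parse_timing(
    "/* Wed Jun 05 12:14:16.119 2024 conn 8 real 0.095 wall 0.095 found 1 */ "
    "SELECT count(*) FROM test WHERE MATCH('77*') ORDER BY id desc;"
)
new = parse_timing(
    "/* Wed Jun 05 12:14:39.755 2024 conn 5 (127.0.0.1:56282) real 4.182 "
    "wall 4.182 found 1 */ SELECT count(*) FROM test WHERE MATCH('77*') "
    "ORDER BY id desc; /*terms expansion=(merged 137507, not merged 7) */"
)

slowdown = new["wall"] / old["wall"]
print(f"6.2.12: {old['wall']} s, 6.3: {new['wall']} s, slowdown: {slowdown:.1f}x")
```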


AlexeyRemenyak commented on July 23, 2024

Added index files created in 6.2.12. Timings on that index:
6.2.12 - 0.14 sec
6.3.0 - 3.27 sec


tomatolog commented on July 23, 2024

This seems related to local_df being enabled by default in #1436.

During the search, the daemon now expands terms first to get the correct df of each term across all disk chunks, then uses this value for every disk chunk. I am going to disable the local_df calculation for wildcard terms in the query, as I am not sure local_df is worthwhile for the multiple expanded terms that represent a single wildcard term.
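
To illustrate why the extra pass gets expensive for wildcards, here is a toy model in Python. It is not the daemon's code: the chunk dictionaries, the linear scan used for expansion, and the names global_df and is_wildcard are all assumptions made for the sketch. It only shows the shape of the work, namely that a wildcard term needs df gathered for every expanded term in every chunk, while a plain term needs one lookup per chunk.

```python
from fnmatch import fnmatchcase

# Toy disk chunks: each maps a dictionary term to its document frequency (df).
chunks = [
    {"770": 3, "771": 1, "foo": 5},
    {"770": 2, "772": 4, "foo": 1},
    {"773": 1, "bar": 2},
]


def is_wildcard(term: str) -> bool:
    return "*" in term or "?" in term


def global_df(terms, chunks, skip_wildcards):
    """Sum df per term over all chunks, as a pre-pass before searching.

    Returns (df_by_term, dictionary_entries_touched).
    """
    df = {}
    touched = 0
    for term in terms:
        if is_wildcard(term):
            if skip_wildcards:
                continue  # models the planned change: no global stats here
            # Expand the pattern in every chunk and merge df per expanded term.
            for chunk in chunks:
                for word, word_df in chunk.items():
                    touched += 1
                    if fnmatchcase(word, term):
                        df[word] = df.get(word, 0) + word_df
        else:
            # A plain term costs one dictionary lookup per chunk.
            for chunk in chunks:
                touched += 1
                df[term] = df.get(term, 0) + chunk.get(term, 0)
    return df, touched


before_df, before_cost = global_df(["77*", "foo"], chunks, skip_wildcards=False)
after_df, after_cost = global_df(["77*", "foo"], chunks, skip_wildcards=True)
print(before_df, before_cost)
print(after_df, after_cost)
```

In the real case the query log reports about 137507 merged expansions for 77*, so the pre-pass over 26 disk chunks is far larger than in this three-chunk toy.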

