Comments (4)
Hello Jerry,
Speed
The speed you are reporting is indeed very low. With an index that small, it should be a lot quicker. From what you are saying, I guess you are running on some kind of VM or cloud infrastructure.
I strongly encourage you to run Pastec on a dedicated physical server.
Pastec needs a good physical CPU to run fast (for example, a 3 GHz Core 2 Duo). It also accesses RAM heavily (4 GB of RAM should be sufficient for a 40,000-image index). I am not sure you can get the same CPU and RAM performance from a virtual infrastructure.
Personally, I only host Pastec instances on physical servers.
Sending Datapoints vs Image Data
Computing the image features on the mobile device looks like a good idea at first sight. There are, however, several problems to solve.
First, extracting the features on the mobile device would be slow (several seconds) unless it is optimized with NEON code.
Besides, depending on how it is done, it may also require loading the 30 MB visualWordsORB.dat file on the mobile device.
Finally, if not compressed, the image feature data may also be bigger than a small 30 KB image that is sufficient to be recognized (see the documentation of the "Search request" API call).
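To make the size comparison concrete, here is a rough back-of-the-envelope sketch. It assumes 32-byte ORB descriptors (the OpenCV default) plus a hypothetical 8 bytes per keypoint for position and angle. The feature counts are illustrative only and are not values taken from Pastec.

```python
# Rough size estimate for uncompressed ORB feature data versus a small image.
ORB_DESCRIPTOR_BYTES = 32   # OpenCV's ORB produces 256-bit (32-byte) descriptors
KEYPOINT_BYTES = 8          # assumption: e.g. two 16-bit coordinates + a 32-bit angle
IMAGE_BYTES = 30 * 1024     # the "small 30 KB image" mentioned above


def feature_payload_bytes(num_features: int) -> int:
    """Return the uncompressed payload size for num_features ORB features."""
    return num_features * (ORB_DESCRIPTOR_BYTES + KEYPOINT_BYTES)


if __name__ == "__main__":
    # Hypothetical feature counts, chosen only to show the trend.
    for n in (500, 1000, 2000):
        size = feature_payload_bytes(n)
        verdict = "bigger" if size > IMAGE_BYTES else "smaller"
        print(f"{n} features: {size} bytes ({verdict} than the image)")
```

Under these assumptions, 1000 features already come to 40,000 bytes, which is more than the 30 KB image, so sending features only pays off if few features are kept or if they are compressed.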
Saving/Importing Data set/index
I think you are mixing up two different types of .dat files. This is misleading because they have the same extension.
- The path to visualWordsORB.dat must always be given in the command line when starting Pastec. Without this file, Pastec does not work.
- Your saved index data is also a .dat file. Its default name is backwardIndex.dat. When starting, Pastec tries to load the index from backwardIndex.dat if it is present. Otherwise, it loads an empty index. You cannot currently specify your own index file to load on the command line.
I agree all this needs to be improved.
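The two roles can be sketched as follows, reusing the command and the request quoted elsewhere in this thread. This is only a sketch: it assumes Pastec listens on localhost:4212 and that the LOAD request on /index/io is how an index file with a name other than backwardIndex.dat gets loaded. `build_load_request` is a hypothetical helper, not part of Pastec.

```python
import json
import urllib.request

# Assumption: host and port as used in the curl call from this thread.
PASTEC_URL = "http://localhost:4212"


def build_load_request(index_path: str) -> urllib.request.Request:
    """Build (but do not send) the request asking Pastec to load a saved index."""
    body = json.dumps({"type": "LOAD", "index_path": index_path}).encode("utf-8")
    return urllib.request.Request(f"{PASTEC_URL}/index/io", data=body, method="POST")


if __name__ == "__main__":
    # Pastec must already be running, started with the visual words file:
    #   ./pastec visualWordsORB.dat
    req = build_load_request("savedindex.dat")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it, pass the request to urllib.request.urlopen(req).
```

The visual words file goes on the command line, and the index file name only ever appears in the request body.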
from pastec.
Please try with Ubuntu 14.10. It should be straightforward on this version.
from pastec.
Perfect. That worked. I've been playing around with it last night and this morning and I do have some thoughts/concerns.
Speed
Is there a preferred setup for the server in terms of CPU and RAM? I've noticed that submitting an image, analyzing it and returning the result takes around 7 seconds on average, which is too slow for a production app.
I'm currently only testing a small subset of my 40,000 images: only 241 images are stored in the index so far. The images used in the search are relatively small (the same size as the ones that generated the index). Adding images to the index is blazing fast and takes hardly any time at all.
Sending Datapoints vs Image Data
Also, do you have plans or thoughts about sending just a hash or the data points to the server instead of the image? I'm thinking that the payload of the processed image data points would be smaller than the entire image. OpenCV could process the image "locally" and then hit the remote Pastec server.
Saving/Importing Data set/index
I've tried numerous times to save the current index and reload it when restarting Pastec, but it never works. It only works if I start Pastec, manually rebuild the index, save, clear, then reload it.
Processes I've tried:
- Start Pastec with savedindex.dat -> ./pastec savedindex.dat
or
- Start Pastec with visualWordsORB.dat -> ./pastec visualWordsORB.dat
- After visualWordsORB.dat is loaded send CURL call to load savedindex.dat -> curl -X POST -d '{"type":"LOAD", "index_path":"savedindex.dat"}' http://localhost:4212/index/io
So far I think Pastec is pretty darn good, but I'd like to hear your thoughts on the above notes.
from pastec.
So I've bumped the server up to 2 CPUs and it's sped up the processing from 7s to 4s, but it still takes a while.
from pastec.