Giter Site home page Giter Site logo

mts's People

Contributors

fredblain avatar kpu avatar ugermann avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

mts's Issues

Service should fully own workers and queue.

The workers and the queue are properly owned by Service.

Currently they are shared pointers.

std::vector<Ptr<BatchTranslator>> workers_;

Ptr<PCQueue<PCItem>> pcqueue_;

They should be member variables of Service. Remove the shared pointer. If Service launches threads it's also responsible for cleaning them up in its destruction. And order the member variables such that the threads are stopped before the queue is deallocated.

Prefer to reuse complex objects in loops

If you've got an object that entails a memory allocation or other expensive construction/destruction, prefer to reuse it by pulling the object outside a loop. I see Timer, Histories and PCItem

while (true) {
Timer timer;
PCItem pcitem;
pcqueue_->ConsumeSwap(pcitem);
if (pcitem.isPoison()) {
PLOG(_identifier(), info, "Recieved poison, setting running_ to false");
return;
} else {
PLOG(_identifier(), info, "consumed item in {}; ", timer.elapsed());
timer.reset();
Histories histories;
translate(pcitem.sentences, histories);
PLOG(_identifier(), info, "translated batch {} in {}; ",
pcitem.batchNumber, timer.elapsed());
timer.reset();
for (int i = 0; i < pcitem.sentences.size(); i++) {
pcitem.sentences[i].completeSentence(histories[i]);
}
}
}

Clean up command line options for intgemm.

If I understand correctly, there are currently there are currently 4 interconnected boolean command line options relating to intgemm

--optimize
--optmize8
--intgemm-shift
--intgemm-shift-all

This does not make sense to me as options are either mutually exclusive (things can't be 8-bit and 16-bit at the same time), or imply one another, e.g. intgemm-shift implies intgemm-shift and intgemm-shift implies optimize8.

These four options should be combined into a single string-valued option --optimize with the values

  • none or no (default; regular float computation)
  • intgemm16 (implicit, so that --optimize works as before)
  • intgemm8
  • intgemm8-shift
  • intgemm8-shift-all

which are then interpreted in cpu::Backed::configureDevice() in tensor/cpu/backend.h

This should happen sooner rather than later so that we don't have command line interface legacy issues. Easy to fix now, a nightmare to change later once people have started using these options.

cc: @kpu

Redesign workflow

@jerinphilip Based on our discussion on slack and upon some further thought, I'd propose this.

  1. replace members std::vector<std::string> input and std::string translation with std::vector<SentencePieceText> and SentencePieceText, respectively.
  2. Move tokenization from QueuedInput::next() to the PlainTextTranslation constructor, where sentence splitting takes place.
  3. Replace indeed Queued Input and BatchGenerator by a different mechanism.

(Background comment: I'm hesitant to touch the Batch class, as it's fairly baked into Marian, so we need some mechanism of mapping sentence-level translations to the respective requests. Currently the TranslationService has a map that maps from unique sentence ids to the respective Jobs and promises.)

Wrt 3., here's what I'd propose for starters:

  1. for the time being, we keep the mechanism where the TranslationService stores a shared ptr to each translation job in a map that maps from its id (identical to the sentenceId in the respective batch) to the shared ptr and a promise.
  2. for further processing, we pass around weak_ptr. This lets us determine at any point whether a job is still valid.
  3. we add a member function cancelJob(uint64_t job_id) to TranslationService that removes the entry in the map mentioned above.
  4. PlainTextTranslation keeps track of the ids of the jobs it owns. Cancellation of a PlainTextTranslation the cancels all the jobs it owns.
  5. The TranslatoinService (via custom batcher class (to make code maintenance easier)) keeps an array of deques corresponding to the respective sentence lengths, and a heap where the heads of each deques are prioritized so that oldest comes first. When a translation worker requests a batch, we take the oldest sentence and fill up the batch with sentences of the same or a similar length. Each time we pop a sentence/Job from the respective deque, we update the heap. Jobs that aren't valid any more (weak_ptr::lock() returns a shared_ptr to nullptr), can be skipped at batch creation time.

Make macOS Compatible

In the process of getting mts to run on macOS I had to make modifications to

  1. mts
  2. marian-dev
  3. ssplit-cpp

in particular I had to fork and then create a branches of these repos and then modify the branches such that all the repos work together. My forks + branches are

  1. mts
  2. marian-dev
  3. ssplit-cpp

Should I just submit PR's to the masters of all the above repos to make mts macOS compatible?

Compilation fails with gcc-9/gcc-10 due to lambda captures

Hey,

Trying to compile this with gcc-9/10 results in the following error:

[ 59%] Building CXX object 3rd_party/marian/src/CMakeFiles/marian.dir/common/cli_wrapper.cpp.o
In file included from /mnt/Storage/mts/src/./service/api/rapidjson_utils.h:10,
                 from /mnt/Storage/mts/src/service/api/bergamot/node_translation.cpp:2:
/mnt/Storage/mts/src/./service/common/translation_service.h:92:52: error: non-local lambda expression cannot have a capture-default
   92 |        std::function<void (Ptr<Job> j)> callback =[=](Ptr<Job> j){return;});
      |                                                    ^
In file included from /mnt/Storage/mts/src/service/common/plaintext_translation.cpp:2:
/mnt/Storage/mts/src/service/common/translation_service.h:92:52: error: non-local lambda expression cannot have a capture-default
   92 |        std::function<void (Ptr<Job> j)> callback =[=](Ptr<Job> j){return;});
      |                                                    ^
In file included from /mnt/Storage/mts/src/service/api/rapidjson_utils.h:10,
                 from /mnt/Storage/mts/src/service/api/rapidjson_utils.cpp:5:
/mnt/Storage/mts/src/./service/common/translation_service.h:92:52: error: non-local lambda expression cannot have a capture-default
   92 |        std::function<void (Ptr<Job> j)> callback =[=](Ptr<Job> j){return;});
      |                                                    ^
In file included from /mnt/Storage/mts/src/service/api/rapidjson_utils.h:10,
                 from /mnt/Storage/mts/src/service/api/output_options.cpp:3:
/mnt/Storage/mts/src/./service/common/translation_service.h:92:52: error: non-local lambda expression cannot have a capture-default
   92 |        std::function<void (Ptr<Job> j)> callback =[=](Ptr<Job> j){return;});

This change seems to have happened in gcc around 2018: https://patchwork.ozlabs.org/project/gcc/patch/[email protected]/

Removing the = fixes the issue, but I'm not sure if this is the right thing to do.

Compilation broken with gcc7 due to -Werror

Hey,
Compiling with gcc-7 results in the following:

[ 97%] Building CXX object src/service/CMakeFiles/rest-server.dir/rest/rest_server.cpp.o
In file included from /mts/src/service/rest/rest_server.cpp:7:
In file included from /mts/src/./service/api/json_request_handler.h:34:
In file included from /mts/src/./service/api/bergamot/json_request_handler.h:3:
/mts/src/./service/api/bergamot/node_translation.h:13:1: error: class 'OutputOptions' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class OutputOptions;
^
/mts/src/./service/api/output_options.h:5:8: note: previous use is here
struct OutputOptions {
       ^
/mts/src/./service/api/bergamot/node_translation.h:13:1: note: did you mean struct here?
class OutputOptions;
^~~~~
struct
In file included from /mts/src/service/rest/rest_server.cpp:6:
In file included from /mts/3rd_party/marian/src/marian.h:5:
In file included from /mts/3rd_party/marian/src/common/config.h:3:
In file included from /mts/3rd_party/marian/src/3rd_party/yaml-cpp/yaml.h:10:
In file included from /mts/3rd_party/marian/src/3rd_party/yaml-cpp/parser.h:11:
In file included from /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/memory:81:
In file included from /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr.h:52:
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:588:8: error: delete called on non-final 'marian::data::QueuedInput' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
              delete __p;
              ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:595:4: note: in instantiation of function template specialization 'std::__shared_count<__gnu_cxx::_S_atomic>::__shared_count<marian::data::QueuedInput *>' requested here
        : __shared_count(__p)
          ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:1079:17: note: in instantiation of function template specialization 'std::__shared_count<__gnu_cxx::_S_atomic>::__shared_count<marian::data::QueuedInput *>' requested here
        : _M_ptr(__p), _M_refcount(__p, typename is_array<_Tp>::type())
                       ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:1243:4: note: in instantiation of function template specialization 'std::__shared_ptr<marian::data::QueuedInput, __gnu_cxx::_S_atomic>::__shared_ptr<marian::data::QueuedInput, void>' requested here
          __shared_ptr(__p).swap(*this);
          ^
/mts/src/./service/common/translation_service.h:76:9: note: in instantiation of function template specialization 'std::__shared_ptr<marian::data::QueuedInput, __gnu_cxx::_S_atomic>::reset<marian::data::QueuedInput>' requested here
    jq_.reset(new data::QueuedInput(vocabs_, options_));
        ^
/mts/src/service/rest/rest_server.cpp:180:21: note: in instantiation of function template specialization 'marian::server::TranslationService::start<marian::BeamSearch>' requested here
  service->template start<BeamSearch>();
                    ^
In file included from /mts/src/service/rest/rest_server.cpp:6:
In file included from /mts/3rd_party/marian/src/marian.h:5:
In file included from /mts/3rd_party/marian/src/common/config.h:3:
In file included from /mts/3rd_party/marian/src/3rd_party/yaml-cpp/yaml.h:10:
In file included from /mts/3rd_party/marian/src/3rd_party/yaml-cpp/parser.h:11:
In file included from /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/memory:81:
In file included from /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr.h:52:
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:376:9: error: delete called on non-final 'marian::data::QueuedInput' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
      { delete _M_ptr; }
        ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:371:7: note: in instantiation of member function 'std::_Sp_counted_ptr<marian::data::QueuedInput *, __gnu_cxx::_S_atomic>::_M_dispose' requested here
      _Sp_counted_ptr(_Ptr __p) noexcept
      ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:584:20: note: in instantiation of member function 'std::_Sp_counted_ptr<marian::data::QueuedInput *, __gnu_cxx::_S_atomic>::_Sp_counted_ptr' requested here
              _M_pi = new _Sp_counted_ptr<_Ptr, _Lp>(__p);
                          ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:595:4: note: in instantiation of function template specialization 'std::__shared_count<__gnu_cxx::_S_atomic>::__shared_count<marian::data::QueuedInput *>' requested here
        : __shared_count(__p)
          ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:1079:17: note: in instantiation of function template specialization 'std::__shared_count<__gnu_cxx::_S_atomic>::__shared_count<marian::data::QueuedInput *>' requested here
        : _M_ptr(__p), _M_refcount(__p, typename is_array<_Tp>::type())
                       ^
/usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:1243:4: note: in instantiation of function template specialization 'std::__shared_ptr<marian::data::QueuedInput, __gnu_cxx::_S_atomic>::__shared_ptr<marian::data::QueuedInput, void>' requested here
          __shared_ptr(__p).swap(*this);
          ^
/mts/src/./service/common/translation_service.h:76:9: note: in instantiation of function template specialization 'std::__shared_ptr<marian::data::QueuedInput, __gnu_cxx::_S_atomic>::reset<marian::data::QueuedInput>' requested here
    jq_.reset(new data::QueuedInput(vocabs_, options_));
        ^
/mts/src/service/rest/rest_server.cpp:180:21: note: in instantiation of function template specialization 'marian::server::TranslationService::start<marian::BeamSearch>' requested here
  service->template start<BeamSearch>();
                    ^
3 errors generated.
src/service/CMakeFiles/rest-server.dir/build.make:62: recipe for target 'src/service/CMakeFiles/rest-server.dir/rest/rest_server.cpp.o' failed
make[2]: *** [src/service/CMakeFiles/rest-server.dir/rest/rest_server.cpp.o] Error 1
CMakeFiles/Makefile2:1379: recipe for target 'src/service/CMakeFiles/rest-server.dir/all' failed
make[1]: *** [src/service/CMakeFiles/rest-server.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

Make private methods private

class BatchTranslator {
public:
BatchTranslator(const BatchTranslator &) = default;
BatchTranslator(DeviceId const device, PCQueue<PCItem> &pcqueue,
Ptr<Options> options);
void initGraph();
void translate(RequestSentences &, Histories &);
void mainloop();
std::string _identifier() { return "worker" + std::to_string(device_.no); }
void join();
private:
Ptr<Options> options_;
DeviceId device_;
std::vector<Ptr<Vocab const>> vocabs_;
Ptr<ExpressionGraph> graph_;
std::vector<Ptr<Scorer>> scorers_;
Ptr<data::ShortlistGenerator const> slgen_;
std::thread thread_;
PCQueue<PCItem> *pcqueue_;
};

It looks like translate and mainloop are not meant to be called from outside the class. So they should be private.

gettimeofday isn't monotonic

gettimeofday(&created, tz);

NOTES
       The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system
       administrator manually changes the system time).  If you need a monotonically increasing clock, see clock_get‐
       time(2).

Also do you even need a notion of clock time or is a sequence number sufficient to compare ages?

Make thread_ a direct member variable of BatchTranslator

If it can be a direct member variable it should be.

Also, there's no need for the if statements because it's the constructor.

BatchTranslator::BatchTranslator(DeviceId const device,
PCQueue<PCItem> &pcqueue, Ptr<Options> options)
: device_(device), options_(options), pcqueue_(&pcqueue) {
ABORT_IF(thread_ != NULL, "Don't call start on a running worker!");
thread_.reset(new std::thread([this] { this->mainloop(); }));

boost 1.7+ breaks compilation

Hey,

Compilation is broken with boost version>= 1.7 (released in April 2019) due to change in the interface:

/mnt/Storage/mts/3rd_party/crow/include/crow/socket_adaptors.h: In member function ‘boost::asio::io_service& crow::SocketAdaptor::get_io_service()’:
/mnt/Storage/mts/3rd_party/crow/include/crow/socket_adaptors.h:22:28: error: ‘boost::asio::ip::tcp::socket’ {aka ‘class boost::asio::basic_stream_socket<boost::asio::ip::tcp>’} has no member named ‘get_io_service’
   22 |             return socket_.get_io_service();
      |                            ^~~~~~~~~~~~~~

More info here:
rstudio/rstudio#4636

We should at least have that in the requirements.

Cheers,

Nick

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.