
cardboardcode / epd_core

This project forked from ros-industrial/easy_perception_deployment


A ROS2 package that accelerates the training and deployment of CV models for industries.

License: Apache License 2.0

CMake 1.70% Python 45.04% Shell 14.19% C++ 37.91% Dockerfile 1.16%

epd_core's People

Contributors

cardboardcode, briancbn, carlowiesse, mercedes149, dependabot[bot]

Watchers

James Cloos

epd_core's Issues

Unable to install EPD on Ubuntu 22.04 with ROS2 Humble

Issue Description

Doing a colcon build of easy_perception_deployment on Ubuntu 22.04 with ROS2 Humble Hawksbill results in the following build errors:

fatal error: tf2/LinearMath/Quaternion.h: No such file or directory
   31 | #include "tf2/LinearMath/Quaternion.h"
no matching function for call to EasyPerceptionDeployment::declare_parameter(const char [28])
  231 |   this->declare_parameter("camera_to_plane_distance_mm");
/home/rosi/Desktop/repo_archive/easy_perception_deployment/easy_perception_deployment/include/ort_cpp_lib/p3_ort_base.cpp:1848:16: error: cv::TrackerMedianFlow has not been declared
 1848 |     return cv::TrackerMedianFlow::create();
/home/rosi/Desktop/repo_archive/easy_perception_deployment/easy_perception_deployment/include/ort_cpp_lib/p3_ort_base.cpp: In member function void Ort::P3OrtBase::tracking_evaluate(const std::vector<std::array<float, 4> >&, const cv::Mat&, std::string, std::vector<cv::Ptr<cv::Tracker> >&, std::vector<int>&, std::vector<EPD::LabelledRect2d>&):
/home/rosi/Desktop/repo_archive/easy_perception_deployment/easy_perception_deployment/include/ort_cpp_lib/p3_ort_base.cpp:1655:51: error: cannot bind non-const lvalue reference of type cv::Rect& {aka cv::Rect_<int>&} to an rvalue of type cv::Rect_<int>
 1655 |       trackers[i]->update(img, tracker_results[i].obj_bounding_box);
In file included from /usr/include/opencv4/opencv2/core.hpp:57,
                 from /usr/include/opencv4/opencv2/tracking.hpp:8,
                 from /home/rosi/Desktop/repo_archive/easy_perception_deployment/easy_perception_deployment/include/ort_cpp_lib/p3_ort_base.cpp:16:
/usr/include/opencv4/opencv2/core/types.hpp:1921:1: note:   after user-defined conversion: ‘cv::Rect_<_Tp>::operator cv::Rect_<_Tp2>() const [with _Tp2 = int; _Tp = double]’

Author's Notes

The Tracking use case may need to be deprecated in EPD when porting to ROS2 Humble.

Unable to train past the first custom dataset

Issue Description

EPD is currently exhibiting a bug that prevents it from training a second custom dataset because it fails to copy over the new custom dataset image folder to the relevant.

This seems to be caused by prepare_trainfarm_docker_container.bash running only once: after the EPD Trainer Docker container has been set up successfully, the script does not run again.

The aim is to patch this in the EPD v0.3.3 patch pull request.

EPD unable to communicate beyond docker container

Issue Description

EPD's Deployment Docker images build the ROS2 package successfully, but once deployed, I am unable to feed sensor_msgs::msg::Image messages via /virtual_camera/image_raw.

In other words, active ROS2 topics within the Docker container cannot be interacted with from outside it.

I will use this issue to track and capture as many details as possible in order to resolve this swiftly.

Faulty check for CUDA installation presence via CMakeLists.txt

Issue Description

The following check in easy_perception_deployment's CMakeLists.txt does not correctly verify that a version of CUDA has been installed before setting the GPU flag.

if(EXISTS ${CMAKE_CUDA_COMPILER})
  message(AUTHOR_WARNING "Using [-GPU-].")
  add_definitions(-DUSE_GPU=true)
else()
  message(AUTHOR_WARNING "Your local onnxruntime does not support CUDA. Using [-CPU-] instead.")
  add_definitions(-DUSE_GPU=false)
endif()

Perhaps modifications or a new feature need to be implemented to properly check for CUDA presence before installing.
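
One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that CMAKE_CUDA_COMPILER is normally only populated once CMake's CUDA language support has been enabled, so the EXISTS test may be evaluating an empty path. As a rough sketch of a check that could run before installation, the snippet below looks for the nvcc executable on PATH. The names cuda_toolkit_present and which are hypothetical and not part of EPD, and finding nvcc says nothing about whether the local onnxruntime build supports CUDA.

```python
import shutil
from typing import Callable, Optional


def cuda_toolkit_present(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Return True if an nvcc executable can be located.

    `which` is injectable so the lookup can be replaced in tests;
    by default it searches the directories listed in PATH.
    """
    return which("nvcc") is not None


if __name__ == "__main__":
    # Mirrors the two messages used by the CMake check above.
    if cuda_toolkit_present():
        print("Using [-GPU-].")
    else:
        print("nvcc not found on PATH. Using [-CPU-] instead.")
```

A check like this only covers installations that expose nvcc on PATH; a CUDA toolkit installed elsewhere would be reported as absent.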

Color RGB image format is flipped in ROS2 Humble

Issue Description

The following image is output when running the default virtual_camera test image with EPD on ROS2 Humble:
Screenshot from 2022-08-03 21-21-39

Source of Error

For some reason, the RGB channel order of the incoming sensor_msgs::msg::Image is flipped, as shown by the output image.
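
If the flipped colors come from an RGB/BGR mismatch between the publisher and the consumer (an assumption; the issue does not confirm the cause), the correction amounts to swapping the first and third channel of every pixel. The sketch below shows that swap on a raw interleaved 8-bit, 3-channel buffer. The function name swap_red_blue is hypothetical and the snippet does not use any ROS2 or OpenCV API.

```python
def swap_red_blue(pixels: bytes) -> bytes:
    """Swap channels 0 and 2 of every pixel in an interleaved 3-channel buffer.

    Converts RGB to BGR or BGR to RGB; the operation is its own inverse.
    """
    if len(pixels) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    out = bytearray(pixels)
    out[0::3] = pixels[2::3]  # first channel takes the old third channel
    out[2::3] = pixels[0::3]  # third channel takes the old first channel
    return bytes(out)


if __name__ == "__main__":
    two_pixels = bytes([255, 0, 0, 10, 20, 30])
    print(list(swap_red_blue(two_pixels)))
```

Applying the function twice returns the original buffer, which makes it easy to confirm whether a channel swap is really what separates the expected image from the observed one.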

Minimize & stabilize EPD GUI dependencies

Currently, every instance of the Anaconda environment epd_gui_env takes up at least 1.6GB.

Further investigation will be done shortly to explore alternatives to reduce software bloat.

Unable to install onnxruntime on Ubuntu 22.04

Issue Description

The following error is generated when running bash install_dep_cpu.bash in an attempt to install EPD dependencies:

[ 72%] Built target onnxruntime
Consolidate compiler generated dependencies of target onnxruntime_test_utils
[ 73%] Built target onnxruntime_test_utils
Consolidate compiler generated dependencies of target onnx_test_data_proto
[ 75%] Built target onnx_test_data_proto
Consolidate compiler generated dependencies of target onnx_test_runner_common
[ 76%] Built target onnx_test_runner_common
Consolidate compiler generated dependencies of target gtest
[ 76%] Built target gtest
Consolidate compiler generated dependencies of target gmock
[ 76%] Built target gmock
Consolidate compiler generated dependencies of target onnxruntime_test_all
[ 76%] Building CXX object CMakeFiles/onnxruntime_test_all.dir/home/rosi/onnxruntime/onnxruntime/test/providers/cpu/controlflow/loop_test.cc.o
/home/rosi/onnxruntime/onnxruntime/test/providers/cpu/controlflow/loop_test.cc: In lambda function:
/home/rosi/onnxruntime/onnxruntime/test/providers/cpu/controlflow/loop_test.cc:557:23: error: ‘sleep_for’ is not a member of ‘std::this_thread’
  557 |     std::this_thread::sleep_for(std::chrono::seconds(3));
      |                       ^~~~~~~~~
make[2]: *** [CMakeFiles/onnxruntime_test_all.dir/build.make:1420: CMakeFiles/onnxruntime_test_all.dir/home/rosi/onnxruntime/onnxruntime/test/providers/cpu/controlflow/loop_test.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:680: CMakeFiles/onnxruntime_test_all.dir/all] Error 2
make: *** [Makefile:146: all] Error 2
Uninstall with cat install_manifest.txt | sudo xargs rm

Source of Error

For now, the error appears to stem from a discrepancy in the C++ standard library on this system, where std::this_thread differs from what the onnxruntime sources expect. Using a more recent version of onnxruntime or downgrading CMake might resolve it.

sleep_for is not a member of std::this_thread

Too many P3 Training GUI misprints

Issue Description

When attempting to set up training for a Precision Level 3 model, the following terminal output is printed, which may confuse non-EPD developers.

Setting Model to  maskrcnn
[ WARNING ] - Dataset directory not provided. Please choose Dataset.
Set Model to  maskrcnn
[ WARNING ] - Dataset directory not provided. Please choose Dataset.
[ WARNING ] - Dataset not properly restructured.Please restructure Dataset.

Dockerize Precision Level 3 exporter enviro.

The current exporter dockerized enviro is held up by issues.

Further evaluation suggests it may be possible to avoid dockerizing the entire maskrcnn_benchmark just to export .pth files to the .onnx file format.

However, this requires in-depth investigation into maskrcnn_benchmark.

This issue also serves as a first step toward properly using the Projects Kanban board, EPD Dockerization.

Dockerize Precision Level 3 training enviro.

The P3 Training Enviro has been dockerized. However, its use comes with the following major caveats:

  1. The maskrcnn_benchmark Python dependency needs to be reinstalled for every new instantiation of the Docker image on a new GPU system.

  2. The image is ~8GB in size and requires trimming to improve portability.

  3. It has also been observed to stall for an unknown reason when used on a different GPU system.

Unit Testing Failure

st_results/easy_perception_deployment/epd_test_init.gtest.xml
1: [==========] Running 3 tests from 1 test case.        
1: [----------] Global test environment set-up.
1: [----------] 3 tests from EPD_TestSuite
1: [ RUN      ] EPD_TestSuite.Test_readSessionUseCaseConfigTextFile_EPDContainer
1: onnx_model_path = ./data/model/squeezenet1.1-7.onnx
1: unknown file: Failure
1: C++ exception with description "Load model from ./data/model/squeezenet1.1-7.onnx failed:Load model ./data/model/squeezenet1.1-7.onnx failed. File doesn't exist" thrown in the test body.
1: [  FAILED  ] EPD_TestSuite.Test_readSessionUseCaseConfigTextFile_EPDContainer (11 ms)
1: [ RUN      ] EPD_TestSuite.Test_setFrameDimension_EPDContainer
1: onnx_model_path = ./data/model/squeezenet1.1-7.onnx
1: unknown file: Failure
1: C++ exception with description "Load model from ./data/model/squeezenet1.1-7.onnx failed:Load model ./data/model/squeezenet1.1-7.onnx failed. File doesn't exist" thrown in the test body.
1: [  FAILED  ] EPD_TestSuite.Test_setFrameDimension_EPDContainer (0 ms)
1: [ RUN      ] EPD_TestSuite.Test_setInitBoolean_EPDContainer
1: onnx_model_path = ./data/model/squeezenet1.1-7.onnx
1: unknown file: Failure
1: C++ exception with description "Load model from ./data/model/squeezenet1.1-7.onnx failed:Load model ./data/model/squeezenet1.1-7.onnx failed. File doesn't exist" thrown in the test body.
1: [  FAILED  ] EPD_TestSuite.Test_setInitBoolean_EPDContainer (1 ms)
1: [----------] 3 tests from EPD_TestSuite (12 ms total)
1: 
1: [----------] Global test environment tear-down
1: [==========] 3 tests from 1 test case ran. (12 ms total)
1: [  PASSED  ] 0 tests.
1: [  FAILED  ] 3 tests, listed below:
1: [  FAILED  ] EPD_TestSuite.Test_readSessionUseCaseConfigTextFile_EPDContainer
1: [  FAILED  ] EPD_TestSuite.Test_setFrameDimension_EPDContainer
1: [  FAILED  ] EPD_TestSuite.Test_setInitBoolean_EPDContainer
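
The log reports a relative model path, ./data/model/squeezenet1.1-7.onnx, which is resolved against whatever directory the test binary is launched from. A small pre-flight check like the sketch below could confirm whether the file is reachable from a given working directory before the tests run. The helper name resolve_model_path is hypothetical and not part of EPD.

```python
from pathlib import Path


def resolve_model_path(model_path: str, working_dir: str) -> Path:
    """Resolve model_path against working_dir and fail early if it is missing.

    Absolute paths are used as given; relative paths are joined to
    working_dir, mirroring how a process resolves them at run time.
    """
    candidate = Path(model_path)
    if not candidate.is_absolute():
        candidate = Path(working_dir) / candidate
    if not candidate.is_file():
        raise FileNotFoundError(f"Model not found: {candidate}")
    return candidate


if __name__ == "__main__":
    try:
        print(resolve_model_path("./data/model/squeezenet1.1-7.onnx", "."))
    except FileNotFoundError as err:
        print(err)
```

Running the check from the package root and from the build directory would show which of the two locations, if either, actually contains the model file.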

P3 Training Fail: ValueError: num_samples should be a positive integer value, but got num_samples=0

Issue Description

The following error was encountered when attempting to train a Precision Level 3 MaskRCNN model using EPD. It appeared after the .yaml parser was integrated into P3Trainer.py.

Traceback (most recent call last):
  File "tools/train_net.py", line 201, in <module>
    main()
  File "tools/train_net.py", line 194, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 72, in train
    start_iter=arguments["iteration"],
  File "/home/cardboardvoice/anaconda3/envs/p3_trainer/lib/python3.6/site-packages/maskrcnn_benchmark-0.1-py3.6-linux-x86_64.egg/maskrcnn_benchmark/data/build.py", line 164, in make_data_loader
    sampler = make_data_sampler(dataset, shuffle, is_distributed)
  File "/home/cardboardvoice/anaconda3/envs/p3_trainer/lib/python3.6/site-packages/maskrcnn_benchmark-0.1-py3.6-linux-x86_64.egg/maskrcnn_benchmark/data/build.py", line 64, in make_data_sampler
    sampler = torch.utils.data.sampler.RandomSampler(dataset)
  File "/home/cardboardvoice/anaconda3/envs/p3_trainer/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 94, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0

Expected Behaviour

The training is supposed to proceed without any errors.

Actual Behaviour

The training fails with the aforementioned error in the terminal.

Error Source

Currently, the integration of the .yaml parser in P3Trainer.py seems to be the root cause.

[ Update as of 20220812 ]: The integration of the parser is not the root cause. Since the EPD v0.2.2 P3 training workflow fails as well, the cause can be narrowed to unknown dependency conflicts.
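
Whatever the underlying cause, the traceback shows RandomSampler receiving a dataset of length zero, so it may be worth ruling out that the dataset loads as empty. The sketch below counts the images that have at least one annotation in a COCO-style annotation file before training starts. It assumes the custom dataset uses COCO-style JSON with images and annotations keys, which this issue does not state; count_annotated_images is a hypothetical helper, not part of EPD or maskrcnn_benchmark.

```python
import json
from pathlib import Path


def count_annotated_images(annotation_file: str) -> int:
    """Count images in a COCO-style JSON file that have at least one annotation."""
    data = json.loads(Path(annotation_file).read_text())
    image_ids = {image["id"] for image in data.get("images", [])}
    annotated_ids = {ann["image_id"] for ann in data.get("annotations", [])}
    # Only ids that appear in both sets are usable training samples.
    return len(image_ids & annotated_ids)


def require_non_empty(annotation_file: str) -> int:
    """Raise a descriptive error instead of letting the sampler fail later."""
    count = count_annotated_images(annotation_file)
    if count == 0:
        raise ValueError(f"No annotated images found in {annotation_file}")
    return count
```

A count of zero here would point at the dataset or its paths rather than at the training code, while a positive count would shift suspicion back toward the environment.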
