
te-ns's Introduction

Traffic Emulator for Network Service (TENS)

Traffic Emulator for Network Service (henceforth TENS) is a distributed traffic generator with separate data and control planes. It is designed to stress, validate, and report metrics and errors for load balancer and server functionality by emulating multiple browser sessions, clients, connections, and requests at L4 and L7. Traffic is generated across multiple computes and controlled from a single endpoint.

Why TENS?

  • Existing tools do bits and pieces of what TENS can do, but each operates differently. TENS operates as a single endpoint with uniformly exposed APIs to orchestrate and validate traffic.
  • Existing open source tools each have their own limitations, and none has all the features needed to validate a full-blown load balancer.
  • Commercial tools are expensive.

What can it do?

  • Generate real world workloads by emulating multiple users, multiple sessions per user and multiple connections & requests per session across multiple computes.
  • Provide extensive metrics and error reporting at configurable intervals at L4 (TCP/UDP) and L7 (HTTP/HTTPS).
  • Detailed failure reporting with a configurable sampled 5-tuple and reason for failure.
  • Ability to support traffic generation from multiple source IPs and namespaces.
  • Ability to stress the environment in dimensions of RPS, TPS, TPUT and CPS, apart from configurable large header sizes.
  • Configurable mechanism to ramp up the number of users and sessions, simulating a real world traffic behavior.
  • Ability to run across multiple computes, controlled by a single API endpoint.
  • Ability to control the traffic via APIs to start, stop and update.
  • Get detailed metrics at a single endpoint, at the granularity of any specified interval of time and space, including but not limited to the 5-tuple of generated traffic, down to the level of a URI of an app hit by a client.
  • Generate traffic with embedded cookies, headers and query parameters at very large scale at L7.
  • Evaluate the persistence decision-making capabilities of a load balancer using cookie, session ID and client IP persistence at L7.
  • Ability to send HTTP(S) traffic across versions of HTTP/1, HTTP/1.1 and HTTP/2 at L7.
  • Ability to send traffic with various cipher suites and SSL versions of SSLv2, SSLv3, TLSv1, TLSv1.0, TLSv1.1, TLSv1.2 and TLSv1.3 at L7.
  • Ability to perform mutual client-server authentication by providing and verifying certificates at L7.
  • Ability to emulate uploads and downloads of a large number of UDP datagrams at L4, with multiple concurrent connections.

Libraries and Utilities Used

TENS depends on a specific set of libraries and utilities, and we are thankful to the maintainers and active contributors of those listed below.

Libraries   Version    Utilities
libcurl     7.67.0     Postgresql
libuv       1.27.0     ZeroMQ
openssl     1.1.1a     RQ
libjson     1.7.2-1    Flask
                       Nginx
                       Docker

Compiling the datapath process

  • Move to te_dp folder
    • cd <work-space>/te/tedp_docker/
  • Install the necessary libraries (Debian only)
    • ./setup.sh
  • Clean any existing binary
    • make clean
  • Make the datapath and statistics collector process
    • make all

How to access TENS

  • As of today, the code is available from the GitHub repository.
    • Using the code, one can build and use TENS with 1 Controller and as many datapaths as required.

How to get a fully fledged TENS (With Controller)

  • TENS has two parts: the TENS Controller and the TENS Datapath.
  • The Controller acts as a single point of access that exposes various APIs to start, stop, update and get metrics from the datapath processes.
  • For a sample run, please refer to SAMPLE-RUN.md in the home directory: /SAMPLE-RUN.md

How to run a standalone datapath process

  • bin/te_dp [options]

      -r resource_config         -- path to the resource configuration describing what traffic to send
      -j resource_config's-hash  -- hash/unique-identifier of the resource configuration
      [-s session_config]        -- path to the session configuration describing how to send the traffic
                                 -- To be used only in case of CLIENT
      [-k session_config's-hash] -- hash/unique-identifier of the session configuration
                                 -- To be used only in case of CLIENT
    
      [-p TCP/UDP]               -- profile of process
                                 -- UDP / TCP
                                 -- defaults to `TCP`
    
      [-a CLIENT/SERVER]         -- mode of the process
                                 -- CLIENT / SERVER
                                 -- defaults to `CLIENT`
    
      [-c pinned-cpu]            -- cpu to which the process is pinned
                                 -- compulsory argument only in case of UDP CLIENT profile
    
      [-i mgmt-ip]               -- management ip of the host
                                 -- compulsory argument only in case of UDP profile, both CLIENT and SERVER
    
      [-d stats_dump_interval]   -- interval, in seconds, at which the collected metrics are dumped
                                 -- has to be used in conjunction with options like [-m] and/or [-t]
                                 -- defaults to `NO` metrics dumping
    
      [-m]                       -- to enable collection of metrics
                                 -- enabling this option doesn't collect metrics regarding memory utilization
                                 -- defaults to `NO` metrics collection
    
      [-t]                       -- to enable collection of memory utilization metrics
                                 -- defaults to `NO` memory metrics collection
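
The constraints in the option list above (session options only for CLIENT, -i for UDP, -c for a UDP CLIENT, -d only together with -m and/or -t) can be checked before launching the process. The following is a minimal sketch of such a wrapper, not part of TENS itself; the function name build_te_dp_args and the file names in the usage note are hypothetical, and only the flags come from the list above.

```python
from typing import List, Optional


def build_te_dp_args(
    resource_config: str,
    resource_hash: str,
    session_config: Optional[str] = None,
    session_hash: Optional[str] = None,
    profile: str = "TCP",
    mode: str = "CLIENT",
    pinned_cpu: Optional[int] = None,
    mgmt_ip: Optional[str] = None,
    stats_dump_interval: Optional[int] = None,
    collect_metrics: bool = False,
    collect_memory_metrics: bool = False,
) -> List[str]:
    """Build an argv list for bin/te_dp, checking the documented constraints."""
    if profile not in ("TCP", "UDP"):
        raise ValueError("profile must be TCP or UDP")
    if mode not in ("CLIENT", "SERVER"):
        raise ValueError("mode must be CLIENT or SERVER")
    if mode != "CLIENT" and (session_config or session_hash):
        raise ValueError("-s and -k are to be used only in case of CLIENT")
    if profile == "UDP" and mgmt_ip is None:
        raise ValueError("-i is compulsory for the UDP profile")
    if profile == "UDP" and mode == "CLIENT" and pinned_cpu is None:
        raise ValueError("-c is compulsory for a UDP CLIENT")
    if stats_dump_interval is not None and not (
        collect_metrics or collect_memory_metrics
    ):
        raise ValueError("-d has to be combined with -m and/or -t")

    args = ["bin/te_dp", "-r", resource_config, "-j", resource_hash]
    if session_config is not None:
        args += ["-s", session_config]
    if session_hash is not None:
        args += ["-k", session_hash]
    args += ["-p", profile, "-a", mode]
    if pinned_cpu is not None:
        args += ["-c", str(pinned_cpu)]
    if mgmt_ip is not None:
        args += ["-i", mgmt_ip]
    if stats_dump_interval is not None:
        args += ["-d", str(stats_dump_interval)]
    if collect_metrics:
        args.append("-m")
    if collect_memory_metrics:
        args.append("-t")
    return args
```

For example, build_te_dp_args("res.json", "abc", session_config="ses.json", session_hash="def", stats_dump_interval=15, collect_metrics=True) yields a list that can be passed to subprocess.run, while asking for a UDP CLIENT without a pinned CPU raises ValueError before anything is started.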
    

What are resource and session configurations?

  • Resource Configuration describes WHAT to do. This includes details such as which app to send traffic to, which HTTP version to use, which certificates to use for authentication, how many datagrams to send, etc.
  • Session Configuration describes HOW to stress the app: how many concurrent sessions have to be maintained, how many connections to open per session, how many requests to send per session, whether the session should stay alive forever, whether the sessions have to be ramped up slowly, whether a delay must be induced between sessions, etc.
  • The options available in the TCP and UDP parts of the configurations are described separately in RESOURCE_CONFIGURATION.md and SESSION_CONFIGURATION.md.
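
To make the HOW part concrete, here is a rough sketch that estimates the load a session configuration asks for. The key names (num-sessions, connection-range, requests-range) are taken from a session configuration quoted in an issue report further down this page; reading the two ranges as [min, max] per session is an assumption, and SESSION_CONFIGURATION.md remains the authoritative description.

```python
def session_upper_bounds(session_config: dict) -> dict:
    """Estimate the peak load implied by one session configuration.

    Assumption: 'connection-range' and 'requests-range' are [min, max]
    values that apply per session.
    """
    sessions = session_config["num-sessions"]
    max_connections = max(session_config["connection-range"])
    max_requests = max(session_config["requests-range"])
    return {
        "max_concurrent_connections": sessions * max_connections,
        "max_requests_per_cycle": sessions * max_requests,
    }


# Values copied from the session configuration in the issue report.
sample = {
    "connection-range": [10, 10],
    "num-sessions": 50,
    "requests-range": [10, 10],
    "session-type": "MaxPerf",
}
bounds = session_upper_bounds(sample)
```

Under these assumptions the sample asks for up to 500 concurrent connections and 500 requests per cycle.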

Appendix

  • In order for the datapath to work at maximum efficiency, add the following knobs to /etc/sysctl.conf

      net.ipv4.tcp_tw_recycle = 1
      net.ipv4.tcp_tw_reuse = 1
      net.ipv4.ip_local_port_range = 2048 65000
    
    • Run sysctl -p to apply them.
  • For a fully fledged TENS (with Controller), make sure the following are installed on the bare metal / VM

    • wget
    • python
    • python requests library
    • Docker (> v17.09.0-ce)
    • Base kernel (> 3.15 of Ubuntu, or equivalent)
  • Install Docker on Ubuntu:

    apt-get update && \
    apt-get install -y apt-transport-https ca-certificates curl gnupg-agent software-properties-common && \
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add - && \
    add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" && \
    apt-get update && \
    apt-get install -y --force-yes docker-ce
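
The sysctl knobs and the version requirements listed in this appendix can be verified with a few lines of Python. This is only a sketch under stated assumptions: the helper names are hypothetical, the sysctl.conf text is passed in as a string rather than read from disk, and version strings are compared on their leading numeric part only. Note that net.ipv4.tcp_tw_recycle was removed in Linux 4.12, so on newer kernels sysctl -p is expected to reject that key.

```python
import re
from typing import Dict, Tuple

# Knobs recommended in the appendix above.
RECOMMENDED = {
    "net.ipv4.tcp_tw_recycle": "1",
    "net.ipv4.tcp_tw_reuse": "1",
    "net.ipv4.ip_local_port_range": "2048 65000",
}


def parse_sysctl_conf(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Collapse runs of whitespace so '2048   65000' equals '2048 65000'.
        settings[key.strip()] = " ".join(value.split())
    return settings


def missing_knobs(text: str) -> Dict[str, str]:
    """Return the recommended knobs that are absent or set differently."""
    current = parse_sysctl_conf(text)
    return {k: v for k, v in RECOMMENDED.items() if current.get(k) != v}


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '17.09.0-ce' into (17, 9, 0); any '-suffix' is ignored."""
    return tuple(int(n) for n in re.findall(r"\d+", version.split("-")[0]))


def newer_than(installed: str, minimum: str) -> bool:
    """True if the installed version is strictly greater than the minimum."""
    return version_tuple(installed) > version_tuple(minimum)
```

A caller would feed missing_knobs the contents of /etc/sysctl.conf, and pass the output of docker version or uname -r (obtained separately) to newer_than together with "17.09.0-ce" or "3.15".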
    

te-ns's People

Contributors

aravindhank11, dependabot[bot], nikesh1177, priyadarshitathagat, sanathpholla, srikawnth, sudks, vipinpr5991


te-ns's Issues

Failure while building local TE-NS image

Resolution: add libffi-dev to the Dockerfile.

./build_te.sh console logs:

  copying cffi/pkgconfig.py -> build/lib.linux-x86_64-3.5/cffi
  copying cffi/ffiplatform.py -> build/lib.linux-x86_64-3.5/cffi
  copying cffi/backend_ctypes.py -> build/lib.linux-x86_64-3.5/cffi
  copying cffi/_cffi_include.h -> build/lib.linux-x86_64-3.5/cffi
  copying cffi/parse_c_type.h -> build/lib.linux-x86_64-3.5/cffi
  copying cffi/_embedding.h -> build/lib.linux-x86_64-3.5/cffi
  copying cffi/_cffi_errors.h -> build/lib.linux-x86_64-3.5/cffi
  running build_ext
  building '_cffi_backend' extension
  creating build/temp.linux-x86_64-3.5
  creating build/temp.linux-x86_64-3.5/c
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DUSE__THREAD -DHAVE_SYNC_SYNCHRONIZE -I/usr/include/ffi -I/usr/include/libffi -I/usr/include/python3.5m -c c/_cffi_backend.c -o build/temp.linux-x86_64-3.5/c/_cffi_backend.o
  c/_cffi_backend.c:15:17: fatal error: ffi.h: No such file or directory
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for cffi
  Running setup.py clean for cffi
Successfully built sysv-ipc
Failed to build cffi
Installing collected packages: pycparser, six, cffi, pynacl, cryptography, bcrypt, urllib3, redis, paramiko, idna, click, chardet, certifi, sysv-ipc, scp, rq, requests, greenlet
    Running setup.py install for cffi: started
    Running setup.py install for cffi: finished with status 'error'
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-3cc4dudc/cffi_c6ba5d2571b34dd1b96ebf734e4f589e/setup.py'"'"'; __file__='"'"'/tmp/pip-install-3cc4dudc/cffi_c6ba5d2571b34dd1b96ebf734e4f589e/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-6to5mcsn/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.5/cffi
         cwd: /tmp/pip-install-3cc4dudc/cffi_c6ba5d2571b34dd1b96ebf734e4f589e/
    Complete output (34 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.5
    creating build/lib.linux-x86_64-3.5/cffi
    copying cffi/error.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/model.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/vengine_cpy.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/lock.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/__init__.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/api.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/cffi_opcode.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/vengine_gen.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/verifier.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/recompiler.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/commontypes.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/cparser.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/setuptools_ext.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/pkgconfig.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/ffiplatform.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/backend_ctypes.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/_cffi_include.h -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/parse_c_type.h -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/_embedding.h -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/_cffi_errors.h -> build/lib.linux-x86_64-3.5/cffi
    running build_ext
    building '_cffi_backend' extension
    creating build/temp.linux-x86_64-3.5
    creating build/temp.linux-x86_64-3.5/c
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DUSE__THREAD -DHAVE_SYNC_SYNCHRONIZE -I/usr/include/ffi -I/usr/include/libffi -I/usr/include/python3.5m -c c/_cffi_backend.c -o build/temp.linux-x86_64-3.5/c/_cffi_backend.o
    c/_cffi_backend.c:15:17: fatal error: ffi.h: No such file or directory
    compilation terminated.
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-3cc4dudc/cffi_c6ba5d2571b34dd1b96ebf734e4f589e/setup.py'"'"'; __file__='"'"'/tmp/pip-install-3cc4dudc/cffi_c6ba5d2571b34dd1b96ebf734e4f589e/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-6to5mcsn/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.5/cffi Check the logs for full command output.
The command '/bin/sh -c pip3 install --requirement /tmp/requirements.txt' returned a non-zero code: 1
root@vpr-lsc2:~/te-ns#

clear_config failing stating `'/tmp/TE%*': No such file or directory`

In [18]: te_obj.clear_config()
Out[18]:
{'status': False,
 'statusmessage': {'Unexpected result': {'10.99.17.81': {'Failure': {'Error': "ls: cannot access '/tmp/ramcache/*': No such file or directory\nls: cannot access '/tmp/TE%*': No such file or directory\n",
     'Output': ''},
    'status': False},
   '10.99.17.82': {'Failure': {'Error': "ls: cannot access '/tmp/ramcache/*': No such file or directory\nls: cannot access '/tmp/TE%*': No such file or directory\n",
     'Output': ''},
    'status': False},
   '10.99.17.83': {'Failure': {'Error': "ls: cannot access '/tmp/ramcache/*': No such file or directory\nls: cannot access '/tmp/TE%*': No such file or directory\n",
     'Output': ''},
    'status': False},
   '10.99.17.84': {'Failure': {'Error': "ls: cannot access '/tmp/ramcache/*': No such file or directory\nls: cannot access '/tmp/TE%*': No such file or directory\n",
     'Output': ''},
    'status': False},
   '10.99.17.85': {'Failure': {'Error': "ls: cannot access '/tmp/ramcache/*': No such file or directory\nls: cannot access '/tmp/TE%*': No such file or directory\n",
     'Output': ''},
    'status': False},
   '10.99.17.86': {'Failure': {'Error': "ls: cannot access '/tmp/ramcache/*': No such file or directory\nls: cannot access '/tmp/TE%*': No such file or directory\n",
     'Output': ''},
    'status': False}}}}
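
When triaging a response like the one above, it helps to flatten it into one list of error lines per host. The helper below is a sketch based only on the response shape shown in this report (status, statusmessage, then a per-host Failure/Error string); the function name is hypothetical and other TE responses may be shaped differently.

```python
from typing import Dict, List


def failed_hosts(response: dict) -> Dict[str, List[str]]:
    """Map each failing host IP to its non-empty error lines."""
    failures: Dict[str, List[str]] = {}
    if response.get("status"):
        return failures  # the call succeeded, nothing to report
    message = response.get("statusmessage", {})
    if not isinstance(message, dict):
        return failures
    for per_host in message.values():
        if not isinstance(per_host, dict):
            continue
        for host, result in per_host.items():
            if isinstance(result, dict) and result.get("status") is False:
                error = result.get("Failure", {}).get("Error", "")
                failures[host] = [line for line in error.splitlines() if line]
    return failures
```

Applied to the output above, this returns six hosts, each with the same two "ls: cannot access" lines, which makes it easy to see that every te_dp host failed for the same reason.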

te_stats_collector high memory usage leads to OOM Kill on te_dp

Describe the bug

  • One TEDP VM with 4 cores and 8GB memory
  • Traffic was started to 1500 unique HTTPS VIPs at 2023-10-31 10:08:59
  • Traffic ran until 2023-11-01 16:51:03.
  • At 2023-11-01 16:51:03, an OOM kill happened and te_stats_collector was killed. It was also the highest contributor to total memory consumed.

rq.log

2023-10-31 09:55:55,195 - [  RQ  ] - INFO - RQ pids = [65, 67, 69, 71] stat_collector_pid = [89]
2023-10-31 09:55:55,196 - [  RQ  ] - INFO -  ========= POLLING FOR PIDs=[65, 67, 69, 71, 89] =========
2023-11-01 16:51:03,776 - [  RQ  ] - ERROR - ========= 89 GOT KILLED =========

syslog of host VM

Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203487] [10724]     0 10724  2195690  1885987 17362944   270307             0 te_stats_collec
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203488] [11896]     0 11896    14338     9914   151552      627             0 te_dp
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203490] [11905]     0 11905    16037    11186   163840      365             0 te_dp
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203491] [11907]     0 11907    15170    10776   155648      212             0 te_dp
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203493] [12428]   101 12428    17661      153   184320        7             0 systemd-resolve
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203495] [12457]     0 12457    11964      159   114688      747         -1000 systemd-udevd
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203496] [12458]     0 12458    11964      159   114688      747         -1000 systemd-udevd
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203498] Out of memory: Kill process 10724 (te_stats_collec) score 888 or sacrifice child
Nov  1 16:50:55 shalini-client-1 kernel: [12302777.203642] Killed process 10724 (te_stats_collec) total-vm:8782760kB, anon-rss:7543948kB, file-rss:0kB, shmem-rss:0kB
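
The numbers in the syslog excerpt can be cross-checked: the rss column of the process table is in pages, while the "Killed process" line reports kB. The sketch below parses the kill line and converts pages to kB, assuming the common 4 KiB page size; the helper names are hypothetical.

```python
import re
from typing import Optional

PAGE_SIZE_KB = 4  # assumption: 4 KiB pages, the usual size on x86_64

KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kill(line: str) -> Optional[dict]:
    """Extract pid, process name and memory figures from a kernel OOM line."""
    match = KILLED_RE.search(line)
    if match is None:
        return None
    anon_rss_kb = int(match.group("anon_rss"))
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        "total_vm_kb": int(match.group("total_vm")),
        "anon_rss_kb": anon_rss_kb,
        "anon_rss_gib": round(anon_rss_kb / 1024 ** 2, 2),
    }


def pages_to_kb(pages: int) -> int:
    """Convert a page count from the OOM process table to kB."""
    return pages * PAGE_SIZE_KB
```

With the values above, pages_to_kb(1885987) gives 7543948, which matches the anon-rss of the killed te_stats_collec process (about 7.19 GiB of the VM's 8GB).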

Reproduction steps

  1. Start traffic to 1.5K VS VIPs
  2. Send 1 GET and 1 POST request
  3. Session config
{'max_waf_vss': {'connection-range': [10, 10],
  'cycle-type': 'restart',
  'get-post-ratio': '1:1',
  'num-cycles': 1,
  'num-sessions': 50,
  'requests-range': [10, 10],
  'session-type': 'MaxPerf',
  'target-cycles': 0}}
  4. instanceprofile and te_dp_dict
In [48]: traffic_tool.instanceProfileConfig
Out[48]:
{'https_res_0': {'res-tag': 'https_res_0', 'ses-tag': 'max_waf_vss'},
 'https_res_1': {'res-tag': 'https_res_1', 'ses-tag': 'max_waf_vss'},
 'https_res_2': {'res-tag': 'https_res_2', 'ses-tag': 'max_waf_vss'}}

In [49]: traffic_tool.te_dp_dict
Out[49]:
{'10.206.45.9': {'instance_profile': {'https_res_0': 1,
   'https_res_1': 1,
   'https_res_2': 1},
  'passwd': 'avi123',
  'tag': 'telocal1',
  'user': 'root'}}

...

Expected behavior

te_stats_collector's memory should not have grown linearly, resulting in an OOM kill.
Traffic should flow continuously until stop traffic is invoked.

Additional context

The image was pulled from the Harbor repo.

root@shalini-client-1:~# docker image ls
REPOSITORY                             TAG                 IMAGE ID            CREATED             SIZE
projects.registry.vmware.com/tens/te   v2.0                03f3f9f186f4        4 months ago        3.12GB

Building on Ubuntu 20.04 fails: src/te_agent.c:43:10: fatal error: json/json.h: No such file or directory

/tens/te-ns/te_dp$ sudo make clean
Cleaned the binaries of TE's DP and Stat collector
master@ubnt:/tens/te-ns/te_dp$ sudo make all
Linked src/te_metrics.c Successfully
Linked src/te_utils.c Successfully
src/te_agent.c:43:10: fatal error: json/json.h: No such file or directory
   43 | #include <json/json.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:63: obj/te_agent.o] Error 1

Memory leak observed in te_stats_collector in te_dp

Describe the bug

When TE was started for 8000 VIPs, a steady memory increase was observed on the te_dp machine. This resulted in an OOM kill 8-10 hours after starting traffic.

Feb 17 13:04:12 Client-1 kernel: [50800339.249733] Out of memory: Kill process 10409 (te_stats_collec) score 878 or sacrifice child
Feb 17 13:04:12 Client-1 kernel: [50800339.250768] Killed process 10409 (te_stats_collec) total-vm:8515812kB, anon-rss:7538392kB, file-rss:0kB, shmem-rss:0kB
Feb 17 13:04:12 Client-1 kernel: [50800339.583239] oom_reaper: reaped process 10409 (te_stats_collec), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Feb 17 13:03:56 Client-1 kernel: [50800323.344583] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Feb 17 13:03:56 Client-1 kernel: [50800323.344677] [10409]     0 10409  2128953  1885090 16920576   186781             0 te_stats_collec
Feb 17 13:03:56 Client-1 kernel: [50800323.344679] [18859]     0 18859    21296     8823   217088     3184             0 te_dp
Feb 17 13:03:56 Client-1 kernel: [50800323.344680] [18861]     0 18861    20080     9250   212992     1828             0 te_dp
Feb 17 13:03:56 Client-1 kernel: [50800323.344682] [18863]     0 18863    21101     8297   208896     3704             0 te_dp
Feb 17 13:03:56 Client-1 kernel: [50800323.344683] [20156]     0 20156    23207      453   221184      212             0 sshd
Feb 17 13:03:56 Client-1 kernel: [50800323.344685] [25403]     0 25403  2128953  1885108 16908288   186781             0 te_stats_collec

Reproduction steps

1. Send traffic to 8000 VS VIPs
2. Every 15 minutes, an increase of 15MB of memory was observed.

Expected behavior

te_stats_collector should free the memory it allocates once it is no longer needed.

Additional context

No response

Add Log Rotation for TE components logs

  • Currently, the loggers for the different TE components do not rotate their logs.
  • As a result, the disk can fill up when scaled traffic runs for a long period of time.

In one such occurrence, te-postgres.log was taking up 1.7GB of space.

root@Client-te-1:/tmp# ls -lrth
total 1.7G
-rwxr-xr-x 1 root root  502 Feb 25  2020 setup_postgres.sh
-rw-r--r-- 1 root root  398 Feb 25  2020 requirements.txt
-rw-r--r-- 1 root root   80 Nov 30 02:49 wrk.te.log
-rw-r--r-- 1 root root  234 Nov 30 02:49 te-data.json
-rw-r--r-- 1 root root  137 Nov 30 02:49 te-zmq.log
-rw------- 1 root root  280 Nov 30 02:49 TE-stdout---supervisor-l_N4ZA.log
-rw-r--r-- 1 root root 2.2M Dec  1 06:09 te.log
-rw------- 1 root root  13K Dec  1 06:09 TE-stderr---supervisor-dTkxqn.log
-rw-r--r-- 1 root root 1.7G Dec  1 06:09 te-postgres.log
  • The ask here is to enable log rotation for the TE component logs.

Creating this issue to track this enhancement
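
For components that log through Python's logging module, size-based rotation is available in the standard library. The following is a minimal sketch and not TE's actual logger setup: the function name, the size limit and the backup count are illustrative assumptions, and the format string only imitates the log lines quoted on this page. Files written by other programs, such as te-postgres.log, would likely need that program's own rotation settings or a tool such as logrotate.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_rotating_logger(
    name: str,
    path: str,
    max_bytes: int = 10 * 1024 * 1024,
    backup_count: int = 5,
) -> logging.Logger:
    """Return a logger whose file is capped at max_bytes, keeping
    at most backup_count rotated copies (path.1, path.2, ...)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    # Attach the handler only once, even if this is called repeatedly.
    if not any(isinstance(h, RotatingFileHandler) for h in logger.handlers):
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backup_count
        )
        handler.setFormatter(
            logging.Formatter(
                "%(asctime)s - [ %(name)s ] - %(levelname)s - %(message)s"
            )
        )
        logger.addHandler(handler)
    return logger
```

With the defaults shown, disk usage per log is bounded at roughly (backup_count + 1) × max_bytes, about 60MB, regardless of how long traffic runs.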

No setup.sh in te_dp/

Describe the bug

"Compiling the datapath process" mentions going to te_dp/ and running setup.sh to install the necessary libraries, but there is no such file in the directory.

Reproduction steps

  1. Clone the repo
  2. Go to te_dp/
  3. No setup.sh file present

Expected behavior

setup.sh should be available

Additional context

No response

update_dns on only some te_dp's fails

Describe the bug

Calling update_dns with the te_dp_dict argument fails with the exception below. It works fine with the global_dns option.

traffic_tool.aviTE.update_dns(te_dp_dict={'100.65.8.7': [("nameserver", "182.16.100.150"), ("domain", "tmo.fe.net")]})
[18/Jan/23 06:45:19.887 connectionpool.py _new_conn:206][MainProcess:MainThread][DEBUG] Starting new HTTP connection (1): 100.65.8.6:5000
[18/Jan/23 06:45:19.901 connectionpool.py _make_request:396][MainProcess:MainThread][DEBUG] http://100.65.8.6:5000 "POST /api/v1.0/te/update_dns HTTP/1.1" 200 313
[18/Jan/23 06:45:19.902 TE_WRAP.py update_dns:1322][MainProcess:MainThread][DEBUG] update_dns: Response is {u'status': False, u'statusmessage': {u'Exception Occured': u'Traceback (most recent call last):\n  File "/app//TE.py", line 795, in __run_mgmt_command\n    run_mgmt_command_te_dp, {"cmd":cmd_to_enqueue, "task":task})\nTypeError: run_mgmt_command_helper() missing 1 required positional argument: \'job_timeout\'\n'}}
Out[15]:
{u'status': False,
 u'statusmessage': {u'Exception Occured': u'Traceback (most recent call last):\n  File "/app//TE.py", line 795, in __run_mgmt_command\n    run_mgmt_command_te_dp, {"cmd":cmd_to_enqueue, "task":task})\nTypeError: run_mgmt_command_helper() missing 1 required positional argument: \'job_timeout\'\n'}}

te_logs corresponding to the API request

2023-01-18 06:45:19,897 - [  TE  ] - DEBUG - Making the call to UPDATE_DNS api
2023-01-18 06:45:19,898 - [  TE  ] - DEBUG - update_dns_api Called
2023-01-18 06:45:19,898 - [  TE  ] - DEBUG - Command for 100.65.8.7 is printf 'nameserver    182.16.100.150
domain    tmo.fe.net
' > /etc/resolv.conf
2023-01-18 06:45:19,899 - [  TE  ] - ERROR - Failure. result={'Exception Occured': 'Traceback (most recent call last):\n  File "/app//TE.py", line 795, in __run_mgmt_command\n    run_mgmt_command_te_dp, {"cmd":cmd_to_enqueue, "task":task})\nTypeError: run_mgmt_command_helper() missing 1 required positional argument: \'job_timeout\'\n'}, type(result)=<class 'dict'>
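
The traceback points at a call that omits a required positional argument. The sketch below reproduces that class of error in isolation; the function bodies are hypothetical stand-ins (TE.py is not shown in this report), and only the names run_mgmt_command_helper, run_mgmt_command_te_dp and job_timeout are taken from the traceback. The default of 60 seconds is an arbitrary placeholder.

```python
def run_mgmt_command_helper(run_fn, args, job_timeout):
    """Hypothetical stand-in: job_timeout is a required parameter."""
    return {"fn": run_fn, "args": args, "job_timeout": job_timeout}


def run_mgmt_command_buggy(cmd, task):
    # Mirrors the failing call: job_timeout is never passed, so Python
    # raises TypeError before the helper body runs.
    return run_mgmt_command_helper(
        "run_mgmt_command_te_dp", {"cmd": cmd, "task": task}
    )


def run_mgmt_command_fixed(cmd, task, job_timeout=60):
    # One possible fix: accept a timeout (with a default) and forward it.
    return run_mgmt_command_helper(
        "run_mgmt_command_te_dp",
        {"cmd": cmd, "task": task},
        job_timeout=job_timeout,
    )
```

Calling run_mgmt_command_buggy raises the same TypeError message as in the log, while run_mgmt_command_fixed forwards the timeout. Whether the real fix belongs in the caller or in the helper's signature depends on code not shown here.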

Reproduction steps

  1. Fire aviTE.update_dns API with te_dp_dict argument

Expected behavior

  • DNS entries should be updated on whichever te_dp hosts are mentioned in te_dp_dict

Additional context

No response

get_cpu_count api uses the variable password before declaring it.

In [12]: aviTE.get_cpu_count(tedp_dict)
[11/May/21 07:38:06.675 connectionpool.py _new_conn:205][MainProcess:MainThread][DEBUG] Starting new HTTP connection (1): 10.50.53.2:5000
[11/May/21 07:38:06.682 connectionpool.py _make_request:393][MainProcess:MainThread][DEBUG] http://10.50.53.2:5000 "POST /api/v1.0/te/get_cpu_count HTTP/1.1" 200 372
[11/May/21 07:38:06.683 TE_WRAP.py get_cpu_count:695][MainProcess:MainThread][DEBUG] get_cpu_count: Response is {u'function': u'FlaskApplicationWrapper.get_cpu_count_api', u'status': False, u'exception': u'Traceback (most recent call last):\n  File "/app//TE.py", line 289, in caller_func\n    result_of_api_call = func(self, json_content)\n  File "/app//TE.py", line 756, in get_cpu_count_api\n    te_dp_hosts[host_ip][\'password\'] = password\nNameError: name \'password\' is not defined\n'}
Out[12]:
{u'exception': u'Traceback (most recent call last):\n  File "/app//TE.py", line 289, in caller_func\n    result_of_api_call = func(self, json_content)\n  File "/app//TE.py", line 756, in get_cpu_count_api\n    te_dp_hosts[host_ip][\'password\'] = password\nNameError: name \'password\' is not defined\n',
 u'function': u'FlaskApplicationWrapper.get_cpu_count_api',
 u'status': False}
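
The NameError arises because password is read before anything assigns it. A minimal reproduction and one possible fix are sketched below; the function names and the default_password parameter are hypothetical, and reading the per-host 'passwd' key is an assumption based on the te_dp_dict shown elsewhere on this page.

```python
def get_cpu_count_buggy(te_dp_hosts: dict) -> dict:
    for host_ip in te_dp_hosts:
        # 'password' was never assigned in this scope: NameError at runtime.
        te_dp_hosts[host_ip]["password"] = password
    return te_dp_hosts


def get_cpu_count_fixed(te_dp_hosts: dict, default_password=None) -> dict:
    for host_ip, host in te_dp_hosts.items():
        # Assign the variable before using it.
        password = host.get("passwd", default_password)
        host["password"] = password
    return te_dp_hosts
```

The buggy variant fails on the first host with "name 'password' is not defined", matching the traceback above; the fixed variant derives the value per host before storing it.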

Fix TENS package versions for build process

Describe the bug

We are not bounding the versions of:

  • paramiko
  • scp

As a result, gevent is installed at a version that fails to build, with an issue with def and a request to use cdef instead:
ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3 /usr/local/lib/python3.5/dist-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpfmuzg_1b
       cwd: /tmp/pip-install-khaulkj3/gevent_3a6c45d5120e4d9dbee702af0f41bed0
  Complete output (61 lines):
  Compiling src/gevent/resolver/cares.pyx because it changed.
  [1/1] Cythonizing src/gevent/resolver/cares.pyx
  Compiling src/gevent/libev/corecext.pyx because it changed.
  [1/1] Cythonizing src/gevent/libev/corecext.pyx
  Compiling src/gevent/_greenlet_primitives.py because it changed.
  [1/1] Cythonizing src/gevent/_greenlet_primitives.py
  Compiling src/gevent/_hub_primitives.py because it changed.
  [1/1] Cythonizing src/gevent/_hub_primitives.py
  Compiling src/gevent/_hub_local.py because it changed.
  [1/1] Cythonizing src/gevent/_hub_local.py
  Compiling src/gevent/_waiter.py because it changed.
  [1/1] Cythonizing src/gevent/_waiter.py
  warning: src/gevent/resolver/cares.pyx:38:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/resolver/cares.pyx:40:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/resolver/cares.pyx:41:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/libev/corecext.pyx:326:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/libev/corecext.pyx:791:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/libev/corecext.pyx:793:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/libev/corecext.pyx:795:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/libev/corecext.pyx:799:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
  warning: src/gevent/_gevent_cgreenlet.pxd:112:33: Declarations should not be declared inline.

  Error compiling Cython file:
  ------------------------------------------------------------
  ...
  cdef load_traceback
  cdef Waiter
  cdef wait
  cdef iwait
  cdef reraise
  cpdef GEVENT_CONFIG
        ^
  ------------------------------------------------------------

Reproduction steps

./build_te.sh

Expected behavior

No errors should be seen in pip installs
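
Unbounded entries in a requirements file can be spotted mechanically. The checker below is a rough sketch, not part of the TE build: it treats a requirement as bounded if its version specifier contains ==, ~= or an upper bound (< or <=), and the version numbers in the usage note are made up for illustration.

```python
import re
from typing import List

NAME_RE = re.compile(r"[A-Za-z0-9._-]+")


def unbounded_requirements(text: str) -> List[str]:
    """Return package names that have no exact pin or upper bound."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = NAME_RE.match(line)
        if match is None:
            continue
        name = match.group(0)
        # Ignore environment markers such as '; python_version < "3.6"'.
        spec = line[len(name):].split(";", 1)[0]
        if not any(op in spec for op in ("==", "~=", "<")):
            loose.append(name)
    return loose
```

For a file containing "paramiko", "scp>=0.13", "rq==1.2.0" and "gevent>=1.4,<1.5", this reports paramiko and scp, the two packages named in this issue.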

Additional context

No response

TE build process fails due to PostgreSQL repo xenial-pgdg being deprecated

Describe the bug

The PostgreSQL repo xenial-pgdg has been deprecated, and our TE-NS build depends on it.
The builds will fail...

W: The repository 'http://apt.postgresql.org/pub/repos/apt xenial-pgdg Release' does not have a Release file.
E: Failed to fetch http://apt.postgresql.org/pub/repos/apt/dists/xenial-pgdg/main/binary-amd64/Packages  404  Not Found [IP: 217.196.149.55 80]
E: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package postgresql-11
sed: can't read /etc/postgresql/*/main/pg_hba.conf: No such file or directory
gpasswd: user 'postgres' does not exist
chown: invalid group: 'root:ssl-cert'
chmod: cannot access '/etc/ssl/private/ssl-cert-snakeoil.key': No such file or directory
https://www.postgresql.org/download/linux/ubuntu/

We are currently looking for workarounds; otherwise we might have to move to 18.04.

Reproduction steps

start the build
./build_te.sh

Expected behavior

The build should work fine.

Additional context

cc: @aravindhank11 - let us know if it's the best to move to 18.04.

Docker Pull Command Failed for setting up Controller

From the Swagger interface for setting up the controller, pulling the docker image currently fails with the default container image, projects.registry.vmware.com/tens/te:v2.0. A manual docker pull of the same image succeeds without error.

API Call:

root@ubuntu-test:/opt/ten-ns/te# curl -vvv -X GET "http://10.206.114.251:4000/api/setup_te?te_controller_ip=10.206.114.251&user=ubuntu&passwd=ubuntu&dockerhub_repo=projects.registry.vmware.com%2Ftens%2Fte%3Av2.0" -H "accept: /"
Note: Unnecessary use of -X or --request, GET is already inferred.

*   Trying 10.206.114.251:4000...
* TCP_NODELAY set
* Connected to 10.206.114.251 (10.206.114.251) port 4000 (#0)
> GET /api/setup_te?te_controller_ip=10.206.114.251&user=ubuntu&passwd=ubuntu&dockerhub_repo=projects.registry.vmware.com%2Ftens%2Fte%3Av2.0 HTTP/1.1
> Host: 10.206.114.251:4000
> User-Agent: curl/7.68.0
> accept: /
>
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: application/json
< Content-Length: 62
< Server: Werkzeug/1.0.1 Python/3.8.5
< Date: Sun, 09 May 2021 01:10:38 GMT
<
{"status":false,"statusmessage":"docker pull command failed"}
* Closing connection 0
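
For anyone reproducing this from a script instead of the Swagger page, the request URL can be composed with the standard library so that the image reference is percent-encoded exactly as in the curl call. This is a sketch: the helper name is hypothetical, and the endpoint path and parameter names are copied from the request above rather than from API documentation.

```python
from urllib.parse import urlencode


def setup_te_url(
    base: str,
    te_controller_ip: str,
    user: str,
    passwd: str,
    dockerhub_repo: str,
) -> str:
    """Build the setup_te URL; urlencode escapes '/' and ':' in the image."""
    query = urlencode(
        {
            "te_controller_ip": te_controller_ip,
            "user": user,
            "passwd": passwd,
            "dockerhub_repo": dockerhub_repo,
        }
    )
    return f"{base}/api/setup_te?{query}"


url = setup_te_url(
    "http://10.206.114.251:4000",
    "10.206.114.251",
    "ubuntu",
    "ubuntu",
    "projects.registry.vmware.com/tens/te:v2.0",
)
```

The resulting url is identical to the one in the curl call, so the encoding of the image name can be ruled out as a difference between the API request and the manual docker pull.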

Restart traffic runs with the provided config after a te_dp restart caused by any te process getting killed

Describe the bug

  • When an OOM kill results in one of the te_dp processes being killed, RQ START happens.
  • After all processes are brought back up, no traffic is run. Ideally, the previous traffic configuration should be restarted at this point, since the user has not issued STOP traffic.

Reproduction steps

  1. Start traffic on TEDP.
  2. Kill any TE_DP process in the TEDP VM.
  3. After the new RQ connection with TE is established, the traffic config started in step 1 does not resume.

Expected behavior

The previous traffic configuration should be restarted after the TE_DP RQ start, since the user has not issued STOP traffic.
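
The expected behavior can be sketched as a small piece of state tracking. This is an illustration only, not TENS code: the class `TrafficState` and its method names are hypothetical, and it assumes the controller can observe when a te_dp re-establishes its RQ connection.

```python
class TrafficState:
    """Remembers the last started config so it can be replayed after a restart."""

    def __init__(self):
        self.active_config = None  # None means no traffic has been requested

    def start(self, config):
        self.active_config = config

    def stop(self):
        # An explicit STOP clears the remembered config.
        self.active_config = None

    def on_dp_reconnect(self, send):
        """Replay the last config when a te_dp reconnects, unless STOP was issued."""
        if self.active_config is None:
            return False
        send(self.active_config)
        return True


sent = []
state = TrafficState()
state.start({"profile": "example"})  # placeholder config
resumed = state.on_dp_reconnect(sent.append)
```

The point of the sketch is that a process kill does not clear the remembered configuration; only an explicit stop does.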

Additional context

No response

te build fails because "libffi-dev" is not installed

Building the te image fails with the error below:

    Running setup.py install for cffi: finished with status 'error'
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-b0247htu/cffi_81986d8f9ab5496caa29c9e3356511aa/setup.py'"'"'; __file__='"'"'/tmp/pip-install-b0247htu/cffi_81986d8f9ab5496caa29c9e3356511aa/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-_w89zuto/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.5/cffi
         cwd: /tmp/pip-install-b0247htu/cffi_81986d8f9ab5496caa29c9e3356511aa/
    Complete output (34 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.5
    creating build/lib.linux-x86_64-3.5/cffi
    copying cffi/api.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/setuptools_ext.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/backend_ctypes.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/ffiplatform.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/cparser.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/vengine_cpy.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/error.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/cffi_opcode.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/verifier.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/pkgconfig.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/recompiler.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/__init__.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/lock.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/vengine_gen.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/commontypes.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/model.py -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/_cffi_include.h -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/parse_c_type.h -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/_embedding.h -> build/lib.linux-x86_64-3.5/cffi
    copying cffi/_cffi_errors.h -> build/lib.linux-x86_64-3.5/cffi
    running build_ext
    building '_cffi_backend' extension
    creating build/temp.linux-x86_64-3.5
    creating build/temp.linux-x86_64-3.5/c
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DUSE__THREAD -DHAVE_SYNC_SYNCHRONIZE -I/usr/include/ffi -I/usr/include/libffi -I/usr/include/python3.5m -c c/_cffi_backend.c -o build/temp.linux-x86_64-3.5/c/_cffi_backend.o
    c/_cffi_backend.c:15:17: fatal error: ffi.h: No such file or directory
    compilation terminated.
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-b0247htu/cffi_81986d8f9ab5496caa29c9e3356511aa/setup.py'"'"'; __file__='"'"'/tmp/pip-install-b0247htu/cffi_81986d8f9ab5496caa29c9e3356511aa/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-_w89zuto/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.5/cffi Check the logs for full command output.
The command '/bin/sh -c pip3 install --requirement /tmp/requirements.txt' returned a non-zero code: 1
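
The fatal error in the log is the compiler failing to find `ffi.h`, the header provided by the libffi development package named in the title. A pre-flight check along these lines could surface that before `pip3 install` runs. It is only a sketch: the function name `find_header` and the candidate directory list are assumptions (the first two directories come from the `-I` flags in the log), and header locations vary by distribution.

```python
import os


def find_header(name, include_dirs):
    """Return the first path under include_dirs that contains name, else None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Assumed search locations; adjust for the base image in use.
CANDIDATE_DIRS = [
    "/usr/include/ffi",
    "/usr/include/libffi",
    "/usr/include",
    "/usr/include/x86_64-linux-gnu",
]

if __name__ == "__main__":
    path = find_header("ffi.h", CANDIDATE_DIRS)
    if path is None:
        print("ffi.h not found: install the libffi development package first")
    else:
        print(f"found {path}")
```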

TE Controller disk gets full within days due to huge data getting filled up in postgres DB

Describe the bug

The TE Controller disk fills up within days because of the large volume of data written to the Postgres DB.

root@Client-te-1:/var/lib/postgresql/11/main/base# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          15G   14G     0 100% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
/dev/sda1        15G   14G     0 100% /te_host
root@Client-te-1:/var/lib/postgresql/11/main# du -h --max-depth=1 .
8.4G    ./base

Once the disk is 100% full:

  • Neither get_ses_metrics nor get_vip_metrics returns the expected result
  • The clear_config call also fails

The te_dp processes, however, continue to send traffic to the VIPs without any issues.

The only workaround is to redeploy the TE Controller.

Reproduction steps

1. Start high scale traffic to 5K VS VIPs
2. Wait for around 1-2 days for the postgres DB to be populated with metrics

Expected behavior

Postgres metrics should be rotated to prevent the disk from filling up.
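
One way to picture such rotation is a periodic job that deletes rows older than a retention window. The sketch below only builds the statement and its parameter. The table name `ses_metrics`, the column `ts`, and the 24-hour window are placeholders, since the report does not show the controller's schema.

```python
from datetime import datetime, timedelta, timezone


def build_prune_statement(table, ts_column, retention_hours, now=None):
    """Return (sql, params) deleting rows older than the retention window."""
    # Identifiers cannot be bound as query parameters, so validate them strictly.
    for identifier in (table, ts_column):
        if not identifier.isidentifier():
            raise ValueError(f"unsafe identifier: {identifier!r}")
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=retention_hours)
    sql = f"DELETE FROM {table} WHERE {ts_column} < %s"
    return sql, (cutoff,)


# Hypothetical table and column names, with a fixed clock for reproducibility.
sql, params = build_prune_statement(
    "ses_metrics", "ts", 24,
    now=datetime(2021, 5, 9, tzinfo=timezone.utc))
```

The `%s` placeholder follows the DB-API parameter style used by common PostgreSQL drivers. In PostgreSQL, space held by deleted rows becomes reusable only after vacuuming, so a rotation job would normally rely on regular (auto)vacuum as well.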

Additional context

No response

Unable to start on unconnected tedp machines error

Describe the bug

In [12]: obj.start(resource_config, session_config, instanceProfileConfig, tedp_dict)
Out[12]: {'status': False, 'statusmessage': {'Unable to start on unconnected tedp machines': ['172.20.7.50']}}
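
The reply nests the failing hosts under the error text. A helper like the following could pull those addresses out and probe them. It is a sketch: `unconnected_hosts` and `can_connect` are hypothetical names, and the port to probe is left to the caller because the report does not state which ports are required.

```python
import socket


def unconnected_hosts(result):
    """Collect host IPs from a failed start() reply whose statusmessage is a dict."""
    if result.get("status"):
        return []
    message = result.get("statusmessage")
    if not isinstance(message, dict):
        return []
    hosts = []
    for host_list in message.values():
        hosts.extend(host_list)
    return hosts


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Reply copied from the report above.
reply = {"status": False,
         "statusmessage": {"Unable to start on unconnected tedp machines": ["172.20.7.50"]}}
hosts = unconnected_hosts(reply)
```

Calling `can_connect(host, port)` for each entry in `hosts`, from the controller's side, gives a quick confirmation of whether the machine is reachable on a given port.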

Reproduction steps

Go through the steps of setup in https://github.com/vmware/te-ns/blob/main/how_to_use/TENS_IN_NO_ACCESS.md

Calling obj.start then produces the error above.

Expected behavior

Expect the te_dp traffic to start running against the configured resource.

Additional context

We have tried removing the docker containers on our collector and data path VMs.

We have also confirmed that connectivity between our collector/datapaths/resources are working on the required ports.

I understand this may not be a "bug", but we cannot find a solution to the error. Any help would be really appreciated.
