
tbotnz / netpalm

ReST based network device broker

License: GNU Lesser General Public License v3.0

Python 63.99% CSS 35.39% Shell 0.20% Jinja 0.42%
jinja2-templates webhook napalm ncclient netmiko network-programming network-automation docker juniper cisco

netpalm's Introduction



The Open API Platform for Network Devices


netpalm makes it easy to push and pull state between your apps and your network by providing multiple southbound drivers, abstraction methods and modern northbound interfaces such as OpenAPI 3 and REST webhooks.


Supporting netpalm

Apcela

Because Enterprise Speed Matters
Bandwidth


Delivering the power to communicate
Support
Maybe you?


What is netpalm?

Leveraging best-of-breed open source network components such as napalm, netmiko, ncclient and requests, netpalm makes it easy to abstract any network device's native telnet, SSH, NETCONF or RESTCONF interface into a modern, model-driven OpenAPI 3 interface.

Taking a platform-based approach means netpalm allows you to bring your own Jinja2 config, service and webhook templates, Python scripts and webhooks for quick adoption into your existing DevOps workflows.

Built on a scalable, microservice-based architecture, netpalm provides scalable API access into your network.

Features

  • Speaks REST and JSON-RPC northbound, and CLI over SSH or Telnet, or NETCONF/RESTCONF, southbound to your network devices
  • Turns any Python script into an easy-to-consume, asynchronous and documented API with webhook support
  • Large number of supported network device vendors thanks to napalm, netmiko, ncclient and requests
  • Built-in multi-level abstraction interface for network service lifecycle functions: create, retrieve, delete and validate
  • In-band service inventory
  • Ability to write your own service models and templates using your existing Jinja2 templates
  • Well-documented API with a Postman collection full of examples; every instance gets its own self-documenting OpenAPI 3 UI
  • Supports pre- and post-checks across CLI devices, raising exceptions and withholding config deployment as required
  • Multiple ways to queue jobs to devices: either pinned strict (prevents connection pooling at the device) or pooled first-in-first-out
  • Modern, container-based scale-out architecture supported by every component
  • Highly configurable across all aspects of the platform
  • Leverages an encrypted Redis layer providing caching and queueing of jobs to and from devices

Concepts

Basic Concepts

netpalm acts as a ReST broker and abstraction layer for NAPALM, Netmiko, ncclient or a Python script. netpalm uses TextFSM or Jinja2 to model and transform both ingress and egress data, if required.
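As a minimal sketch of this broker pattern (assuming a local instance on port 9000 and hypothetical device details), a job payload for the netmiko driver can be built and submitted like this:

```python
import json

# Hypothetical host/credentials; the body shape follows the request
# examples shown later in this README.
def build_getconfig_payload(host, username, password, command):
    """Return a request body for netpalm's /getconfig route (netmiko driver)."""
    return {
        "library": "netmiko",
        "connection_args": {
            "device_type": "cisco_ios",
            "host": host,
            "username": username,
            "password": password,
        },
        "command": command,
        "queue_strategy": "fifo",
    }

payload = build_getconfig_payload("10.0.0.1", "admin", "secret", "show ip int brief")
print(json.dumps(payload, indent=2))

# Submitting it requires a running instance, e.g.:
#   import requests
#   requests.post("http://localhost:9000/getconfig",
#                 headers={"x-api-key": "<key>"}, json=payload)
```

The POST returns immediately with a task id rather than the device output, since all work is queued asynchronously.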

Component Concepts

netpalm is underpinned by a container-based scale-out architecture for all components.

Queueing Concepts

netpalm provides a domain-focused queueing strategy for task execution on network equipment.
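The strategy is selected per job via the request's queue_strategy field; a small illustrative helper (not part of netpalm itself) makes the trade-off explicit:

```python
def with_queue_strategy(payload: dict, pinned: bool) -> dict:
    """Tag a netpalm job payload with a queueing strategy.

    "pinned": a dedicated queue and worker per device; tasks for that
              device are processed strictly in order (no connection pooling).
    "fifo":   a shared pool of workers; tasks are processed first in, first out.
    """
    return {**payload, "queue_strategy": "pinned" if pinned else "fifo"}
```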

Scaling Concepts

Every netpalm container can be scaled in and out as required. Kubernetes or Swarm is recommended for any large scale deployments.

To scale out the basic included compose deployment, use the docker-compose command:

docker-compose scale netpalm-controller=1 netpalm-worker-pinned=2 netpalm-worker-fifo=3

Additional Features

  • Jinja2

  • Parsers

    • TextFSM support via netmiko
    • NTC-templates for parsing/structuring device data (included)
    • TTP Template Text Parser - Jinja2-like parsing of semi-structured CLI data
    • Napalm getters
    • Genie support via netmiko
    • Automated download and installation of TextFSM templates from the http://textfsm.nornir.tech online TextFSM development tool
    • Optional dynamic rendering of Netconf XML data into JSON
  • Webhooks

    • Comes with a standard REST webhook that supports data transformation via your own Jinja2 template
    • Lets you bring your own (BYO) webhook scripts
  • Scripts

    • Execute any Python script asynchronously via the ReST API, including passing in parameters
    • Supports pydantic models for data validation and documentation
  • Queueing

    • Supports a "pinned" queueing strategy, where a dedicated process and queue is established for your device and tasks are queued and processed synchronously for that device
    • Supports a "fifo" pooled queueing strategy, where a pool of workers processes tasks first in, first out
    • Supports on-the-fly changes to the async queue strategy for a device
  • Caching

    • Can cache responses from devices so that the same request doesn't have to go back to the device
    • Automated cache poisoning on config changes to devices
  • Scaling

    • Horizontal container based scale out architecture supported by each component

Examples

We could show you examples for days, but we recommend playing with the online postman collection to get a feel for what can be done. We also host a public instance where you can test netpalm via the Swagger UI.

getconfig method

netpalm also supports all arguments for the transport libraries; simply pass them in as below.

(screenshot: getconfig request example)

check response

(screenshot: checking the task response)
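Because jobs are asynchronous, the submission response carries a task_id that is then polled on the /task/{id} route (the same route shown in the API examples later in this document). A minimal polling sketch, assuming a requests-style session and a local instance:

```python
import time

# Illustrative polling loop for an async netpalm task. The /task/{id}
# route and response shape follow the examples in this README; the base
# URL is a placeholder for your own instance.
def wait_for_task(session, base_url, task_id, timeout=30, interval=1.0):
    """Poll /task/{task_id} until the task finishes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = session.get(f"{base_url}/task/{task_id}").json()["data"]
        if data["task_status"] in ("finished", "failed"):
            return data
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not complete within {timeout}s")
```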

ServiceTemplates

netpalm supports model-driven service templates. These self-render an OpenAPI 3 interface and provide abstraction and orchestration of tasks across many devices using the get/setconfig or script methods.

The example below demonstrates basic SNMP state orchestration across multiple devices for create, retrieve and delete.

(screenshot: SNMP service template example)

Template Development and Deployment

netpalm is integrated into http://textfsm.nornir.tech so you can ingest your templates with ease

(screenshot: automatic template ingestion via textfsm.nornir.tech)

API Docs

netpalm comes with a Postman Collection and an OpenAPI based API with a SwaggerUI located at http://localhost:9000/ after starting the container.

(screenshot: Swagger UI)

Caching

  • Supports the following per-request configuration (/getconfig routes only for now):

    • permit the result of this request to be cached (default: false), and permit this request to return cached data
    • hold the cache for 30 seconds (default: 300; should not be set above redis_task_result_ttl, which defaults to 500)
    • do NOT invalidate any existing cache for this request (default: false)
      {
        "cache": {
          "enabled": true,
          "ttl": 30,
          "poison": false
        }
      }
  • Supports the following global configuration:

    • Enable/disable caching: "redis_cache_enabled": true. For caching to apply, it must be enabled BOTH globally and in the request itself
    • Default TTL: "redis_cache_default_timeout": 300
  • Any change to the request payload will result in a new cache key EXCEPT:

    • JSON formatting. { "x": 1, "y": 2 } == {"x":1,"y":2}
    • Dictionary ordering: {"x":1,"y":2} == {"y":2,"x":1}
    • changes to cache configuration (e.g. changing the TTL, etc)
    • fifo vs pinned queueing strategy
  • Any call to any /setconfig route for a given host:port will poison ALL cache entries for that host:port

    • Except /setconfig/dry-run of course
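The normalization rules above can be illustrated with a sketch (this is not netpalm's actual implementation): drop the fields that are excluded from the key, canonicalize the remaining payload, and hash it, so formatting and key order cannot change the key:

```python
import hashlib
import json

# Illustrative only: a cache key that is insensitive to JSON formatting,
# dictionary ordering, and the "cache" / "queue_strategy" fields, matching
# the rules described above.
def cache_key(req: dict) -> str:
    relevant = {k: v for k, v in req.items()
                if k not in ("cache", "queue_strategy")}
    # sort_keys + fixed separators give one canonical serialization
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```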

Configuration

Edit the config/config.json file to change any parameters (see defaults.json for examples).

Installation

  1. Ensure you first have docker installed
sudo apt-get install docker.io
sudo apt-get install docker-compose
  2. Clone this repository
git clone https://github.com/tbotnz/netpalm.git
cd netpalm
  3. Build the container
sudo docker-compose up -d --build
  4. After the container has been built and started, you're good to go! netpalm will be available on port 9000 under your docker host's IP.
http://$(yourdockerhost):9000

Further Reading

Contributing

We are open to contributions; before making a PR, please make sure you've read our CONTRIBUTING.md document.

You can also find us in the channel #netpalm on the networktocode Slack.

netpalm's People

Contributors

brobare, diorgesl, jefvantongerloo, lamiskin, lboue, ndom91, tbotnz, wrgeorge1983


netpalm's Issues

FR: new extensibles endpoints

Would certainly be nice to have a few new extensibles endpoints:

  1. Bulk upload. Reload using the /reload-extensibles endpoint once at the end of processing the payload.
  2. Delete extensibles. Bulk delete, maybe?

Issue Huawei netconf

I'm having this problem with Huawei NETCONF; is this an ncclient problem?
Task error:

  "Invalid tag name '<Element {urn:ietf:params:xml:ns:netconf:base:1.0}rpc at 0x7fd021d7d380>'",
  "failed: Invalid tag name '<Element {urn:ietf:params:xml:ns:netconf:base:1.0}rpc at 0x7fd021d7d380>'"

My body:

{
  "library": "ncclient",
  "connection_args": {
    "host": "1.1.2.3",
    "username": "user",
    "password": "pass",
    "port": 830,
    "hostkey_verify": false,
    "device_params" : { "name": "huawei"}
  },
  "args": {
    "rpc": "<get-config></get-config>",
    "render_json":true
  },
  "queue_strategy": "fifo"
}

Out of process by orphan processes

When I deploy the project on a physical machine with the pinned worker model, a network worker like "rq:worker:11.0.2.2_4a06654f-0a23-4ee3-8ae6-dceb8d2b6f54" becomes an orphan process when the pinned worker process goes down; it should be killed.

I looked through the source code. The RQ module uses os.fork to create a subprocess in fork_work_horse; if that subprocess exits after execution, all is well. However, here the subprocess is pinned_worker_constructor, which itself spawns a subprocess, pinned_worker, that loops and does not exit. When pinned_worker_constructor completes it exits, so pinned_worker becomes an orphan process. The orphan keeps sending heartbeat messages after default_worker_ttl, so the pinned worker state has only one field, "last_heartbeat".

Running out of processes is caused by the pinned worker restarting multiple times. A container environment doesn't have this problem, because the container only monitors the init process, which also reaps orphan processes.

Ability to set TTL on a per-task basis

Per slack conversation, logging a FR to request a "per-task TTL" hook to allow long running jobs (think config backups for lots of devices) to influence the task TTL specifically for that task.

Scripts to support model definition in same file

Opening issue in regards to a slack conversation. Would like the ability to define both the script content and the Model in the same file. Currently, the model and code are in separate files, which could get hairy at scale. Would be sweet to be able to define in the same file, following the "services" pattern.

Failed to Connect

I think I missed something during the install. I can run a getconfig command, but the task tells me it cannot connect to the device. I am connecting to a Cisco 3750 using cisco_ios. I can SSH into that unit just fine from the machine Docker is running on. What did I miss?

Issue with netmiko setconfig multiple

I'm using the following in the body of the netmiko setconfig:
{
  "library": "netmiko",
  "connection_args": {
    "device_type": "cisco_xe",
    "host": "10.0.2.1",
    "username": "",
    "password": ""
  },
  "config": ["interface GigabitEthernet1\n", "description test\n"]
}
I get back the following (on the setconfig):
{
  "data": {
    "created_on": "Tue, 14 Apr 2020 18:20:43 GMT",
    "task_id": "6bd59805-0fdc-4f88-8603-0e1242f6257d",
    "task_queue": "10.0.2.1",
    "task_result": null,
    "task_status": "queued"
  },
  "status": "success"
}

Debugging / Timeline

Hey first of all I really appreciate all the effort you have put into this project.

That being said, I am trying to get some more debugging info on what's happening in the containers for a particular restconf request I'm running. I wrote a Python module that interacts with the NX-OS REST API, and I'm trying to drop a lot of the work I've already done in order to use netpalm.

This is an example request I'm sending to /getconfig

{
	"library": "restconf",
	"connection_args": {
		"host": "{{ device_ip_address }}",
		"port": 8443,
		"username": "admin",
		"password": "admin",
		"verify": false,
		"timeout": 10,
		"transport": "https",
		"headers": {
			"Content-Type": "application/json",
			"Accept": "*/*"
		}
	},
	"args": {
		"uri": "/api/mo/aaaLogin.json",
		"action": "post",
		"payload": {
			"aaaUser": {
				"attributes": {
					"name": "{{ device_username }}",
					"pwd": "{{ device_password }}"
				}
			}
		}
	},
	"queue_strategy": "fifo"
}

However on the task result I'm seeing this.

{
  "status": "success",
  "data": {
    "task_id": "eb0bae87-d4d2-4530-8569-4c2804073773",
    "created_on": "2020-08-21 13:48:44.666103",
    "task_queue": "fifo",
    "task_meta": {
      "enqueued_at": "2020-08-21 13:48:44.666502",
      "started_at": "2020-08-21 13:48:44.740164",
      "ended_at": "2020-08-21 13:48:45.400580",
      "enqueued_elapsed_seconds": "0",
      "total_elapsed_seconds": "0"
    },
    "task_status": "finished",
    "task_result": {
      "https://10.254.0.101:8443/api/mo/aaaLogin.json": {
        "status_code": 400,
        "result": {
          "imdata": [
            {
              "error": {
                "attributes": {
                  "code": "400",
                  "text": "Failed to parse login request"
                }
              }
            }
          ]
        }
      }
    },
    "task_errors": []
  }
}

I've connected to the docker containers but I don't see any logs of requests etc. I'd like to be able to see the request sent to the device, or some logging related to it.

Thanks!

Error installing netpalm

Hi team,

I've been using netpalm for a project recently. Today I was trying a fresh installation and got this when running docker-compose build:

W: GPG error: http://deb.debian.org/debian bookworm InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 648ACFD622F3D138 NO_PUBKEY 0E98404D386FA1D9 NO_PUBKEY F8D2585B8783D481
E: The repository 'http://deb.debian.org/debian bookworm InRelease' is not signed.
W: GPG error: http://deb.debian.org/debian bookworm-updates InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 0E98404D386FA1D9 NO_PUBKEY 6ED0E7B82643E131
E: The repository 'http://deb.debian.org/debian bookworm-updates InRelease' is not signed.
W: GPG error: http://deb.debian.org/debian-security bookworm-security InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 54404762BBB6E853 NO_PUBKEY BDE6D2B9216EC7A8
E: The repository 'http://deb.debian.org/debian-security bookworm-security InRelease' is not signed.
E: Problem executing scripts APT::Update::Post-Invoke 'rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true'
E: Sub-process returned an error code

Docker containers keep restarting

I have done a fresh install of netpalm but the containers do not stay UP.

(py3venv) [developer@devbox ~]$ docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED             STATUS                          PORTS                                       NAMES
802b12dd5126   netpalm_netpalm-worker-fifo     "python3 worker.py f…"   About an hour ago   Restarting (1) 26 seconds ago                                               netpalm_netpalm-worker-fifo_1
ce39b3e672e8   netpalm_netpalm-worker-pinned   "python3 worker.py p…"   About an hour ago   Restarting (1) 6 seconds ago                                                netpalm_netpalm-worker-pinned_1
706c38ed13ec   netpalm_netpalm-controller      "/bin/sh -c 'gunicor…"   About an hour ago   Up 7 seconds                    0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   netpalm_netpalm-controller_1
300a7e0bfa37   netpalm_redis                   "docker-entrypoint.s…"   About an hour ago   Up About an hour                6379/tcp                                    netpalm_redis_1

This is the error I am getting when running docker-compose in the foreground:

netpalm-worker-fifo_1    | TypeError: To define root models, use `pydantic.RootModel` rather than a field called '__root__'

pinned worker bug

  • The current pinned worker is being duplicated due to a bug in the worker_is_alive method on the rediz class

Genie not available as parser

I checked requirements.txt and genie is in the list of modules to be installed. But if I try to use it in a getconfig request, I get an error saying it is not installed.

"args": {
   "use_genie": "true"
  },


"exception_args": [
          "\nGenie and PyATS are not installed. Please PIP install both Genie and PyATS:\npip install genie\npip install pyats\n"
        ]

read_timeout not working in setconfig

After setting the read_timeout key on a setconfig request, it doesn't seem to be passed down to netmiko, for some reason.

I've debugged that it is getting through to the netmiko.drvr config function, but netmiko doesn't seem to be picking up on it.

I can't seem to figure out what's going on, but was looking at netmiko's docs, and everything lines up correctly.

adding kafka support

hello,

I mentioned this in Slack today, but perhaps it is worth opening an issue here for discussion and posterity.

I think netpalm having a native ability to publish task result data to Kafka topics would be very handy for various downstream-consumption use cases, especially when combined with scheduled tasks. At least functionally, I've been able to add it fairly easily by adding the kafka-python library (https://pypi.org/project/kafka-python/) to the worker containers; I believe this brings no additional dependencies.

I've also been able to add a fairly simple webhook/callback script that produces the task results to a topic, and I updated the docker-compose file to include Kafka and ZooKeeper (though, to be honest, I'm not sure whether ZooKeeper is still required). I'm planning to clean up and improve the script this week after a bit more testing.

If there is interest, I'll be happy to open a PR with what I've done after I clean it up and test a little more, to get the process started.

Thanks,

Will
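For reference, the kind of callback script described above might look roughly like this sketch using kafka-python; the broker address and topic name are placeholders, and the serialization is only one plausible choice:

```python
import json

def serialize_task_result(task_data: dict) -> bytes:
    """Serialize a netpalm task result payload for publishing."""
    # default=str tolerates non-JSON types such as datetimes
    return json.dumps(task_data, default=str).encode("utf-8")

def publish_task_result(task_data: dict,
                        brokers: str = "kafka:9092",
                        topic: str = "netpalm-results") -> None:
    """Publish a task result to a Kafka topic (hypothetical broker/topic)."""
    from kafka import KafkaProducer  # pip install kafka-python
    producer = KafkaProducer(bootstrap_servers=brokers,
                             value_serializer=serialize_task_result)
    producer.send(topic, task_data)
    producer.flush()
```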

Can't retrieve response

On a fresh deployment using: ./redis_gen_new_certs.sh && docker-compose up -d --build

  1. Create a getconfig task:
POST /getconfig HTTP/1.1
Host: xxxx.com:9000
x-api-key: xxxxx
Content-Type: application/json
Content-Length: 360

{
    "library": "netmiko",
    "connection_args": {
        "device_type": "cisco_ios",
        "host": "10.x.x.x",
        "username": "florian_lacommare",
        "password": "xxxxx."
    },
    "command": "show ip int brief",
    "args": {
        "use_textfsm": true
    },
    "queue_strategy": "fifo"
}

Response:

{
    "status": "success",
    "data": {
        "task_id": "90ddad9d-f169-43a6-8a70-b6120ce8570a",
        "created_on": "2022-09-30 06:19:57.094087",
        "task_queue": "fifo",
        "task_meta": {
            "enqueued_at": "2022-09-30 06:19:57.094526",
            "started_at": null,
            "ended_at": null,
            "enqueued_elapsed_seconds": "0",
            "total_elapsed_seconds": "0",
            "assigned_worker": null
        },
        "task_status": "queued",
        "task_result": null,
        "task_errors": []
    }
}

Log on the controller:

[2022-09-30 06:19:57,090:netpalm.routers.route_utils:wrapper:DEBUG] cacheable_model: req_data {'library': <LibraryName.netmiko: 'netmiko'>, 'connection_args': {'device_type': 'cisco_ios', 'host': '10.x.x.x', 'username': 'florian_lacommare', 'password': '******'}, 'command': 'show ip int brief', 'args': {'use_textfsm': True}, 'webhook': {}, 'queue_strategy': <QueueStrategy.fifo: 'fifo'>, 'post_checks': [], 'cache': {}, 'ttl': None}
[2022-09-30 06:19:57,091:netpalm.routers.route_utils:cache_key_from_req_data:INFO] hashed key: add992c10c403a5fde3f95b9e9a45c32040445bbcfd572382dbb1c91271093d6
[2022-09-30 06:19:57,091:netpalm.routers.route_utils:cache_key_from_req_data:DEBUG] cache_key_from_req_data: cache key 10.x.x.x:None:show ip int brief:add992c10c403a5fde3f95b9e9a45c32040445bbcfd572382dbb1c91271093d6
[2022-09-30 06:19:57,093:netpalm.backend.core.redis.rediz:__sendtask:DEBUG] __sendtask: {'library': <LibraryName.netmiko: 'netmiko'>, 'connection_args': {'device_type': 'cisco_ios', 'host': '10.x.x.x', 'username': 'florian_lacommare', 'password': '******'}, 'command': 'show ip int brief', 'args': {'use_textfsm': True}, 'webhook': {}, 'queue_strategy': <QueueStrategy.fifo: 'fifo'>, 'post_checks': [], 'cache': {}}
  2. Get task info for 90ddad9d-f169-43a6-8a70-b6120ce8570a:
GET /task/90ddad9d-f169-43a6-8a70-b6120ce8570a HTTP/1.1
Host: xxxx.com:9000
x-api-key: xxxx
Content-Type: application/json

Response:

{
    "detail": "Not Found"
}

Controller log:

[2022-09-30 06:22:06,565:netpalm.backend.core.redis.rediz:fetchtask:INFO] fetching task: 90ddad9d-f169-43a6-8a70-b6120ce8570a
  3. That's it ... no info, no task ...

For info, my task failed, but I should still get task info telling me that it failed, right?

[2022-09-30 06:19:57,100:rq.worker:dequeue_job_and_maintain_ttl:INFO] fifo: fifo (90ddad9d-f169-43a6-8a70-b6120ce8570a)
[2022-09-30 06:19:57,130:netpalm.backend.core.utilities.rediz_meta:write_meta_error:ERROR] `write_meta_error` processing error
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/netmiko/base_connection.py", line 920, in establish_connection
    self.remote_conn_pre.connect(**ssh_connect_params)
  File "/usr/local/lib/python3.8/site-packages/paramiko/client.py", line 368, in connect
    raise NoValidConnectionsError(errors)
paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 10.x.x.x

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/code/netpalm/backend/plugins/drivers/netmiko/netmiko_drvr.py", line 27, in connect
    netmikoses = ConnectHandler(**self.connection_args)
  File "/usr/local/lib/python3.8/site-packages/netmiko/ssh_dispatcher.py", line 312, in ConnectHandler
    return ConnectionClass(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/netmiko/cisco/cisco_ios.py", line 17, in __init__
    return super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/netmiko/base_connection.py", line 346, in __init__
    self._open()
  File "/usr/local/lib/python3.8/site-packages/netmiko/base_connection.py", line 351, in _open
    self.establish_connection()
  File "/usr/local/lib/python3.8/site-packages/netmiko/base_connection.py", line 942, in establish_connection
    raise NetmikoTimeoutException(msg)
netmiko.ssh_exception.NetmikoTimeoutException: TCP connection to device failed.

Common causes of this problem are:
1. Incorrect hostname or IP address.
2. Wrong TCP port.
3. Intermediate firewall blocking access.

Device settings: cisco_ios 10.x.x.x:22


[2022-09-30 06:19:57,134:rq.worker:handle_exception:ERROR] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/rq/worker.py", line 1075, in perform_job
    rv = job.perform()
  File "/usr/local/lib/python3.8/site-packages/rq/job.py", line 854, in perform
    self._result = self._execute()
  File "/usr/local/lib/python3.8/site-packages/rq/job.py", line 877, in _execute
    result = self.func(*self.args, **self.kwargs)
  File "/code/netpalm/backend/plugins/calls/getconfig/exec_command.py", line 111, in exec_command
    write_meta_error(e)
  File "/code/netpalm/backend/core/utilities/rediz_meta.py", line 32, in write_meta_error
    raise exception from None  # Don't process the same exception twice
  File "/code/netpalm/backend/plugins/calls/getconfig/exec_command.py", line 36, in exec_command
    sesh = netmik.connect()
  File "/code/netpalm/backend/plugins/drivers/netmiko/netmiko_drvr.py", line 30, in connect
    write_meta_error(e)
  File "/code/netpalm/backend/core/utilities/rediz_meta.py", line 49, in write_meta_error
    raise NetpalmMetaProcessedException from exception
netpalm.exceptions.NetpalmMetaProcessedException
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/rq/worker.py", line 1075, in perform_job
    rv = job.perform()
  File "/usr/local/lib/python3.8/site-packages/rq/job.py", line 854, in perform
    self._result = self._execute()
  File "/usr/local/lib/python3.8/site-packages/rq/job.py", line 877, in _execute
    result = self.func(*self.args, **self.kwargs)
  File "/code/netpalm/backend/plugins/calls/getconfig/exec_command.py", line 111, in exec_command
    write_meta_error(e)
  File "/code/netpalm/backend/core/utilities/rediz_meta.py", line 32, in write_meta_error
    raise exception from None  # Don't process the same exception twice
  File "/code/netpalm/backend/plugins/calls/getconfig/exec_command.py", line 36, in exec_command
    sesh = netmik.connect()
  File "/code/netpalm/backend/plugins/drivers/netmiko/netmiko_drvr.py", line 30, in connect
    write_meta_error(e)
  File "/code/netpalm/backend/core/utilities/rediz_meta.py", line 49, in write_meta_error
    raise NetpalmMetaProcessedException from exception
netpalm.exceptions.NetpalmMetaProcessedException

Custom scripts fail at failing

It seems like when a script runs into an exception, it gets caught by s_exec in netpalm/backend/plugins/calls/scriptrunner/script.py and returns the exception string to the script_exec function, which then has no way to know it failed. This makes scripts always be reported as successful. Would it be better to either remove the try/except in s_exec or maybe have it re-raise the exception or something? Am I just missing something here?

Modified "hello_world" script to fail and raise an exception:

def run(**kwargs):
        args = kwargs.get("kwargs")
        world = args.get("hello")
        return non_existent_world

Current result:

<...>
   "task_status": "finished",
    "task_result": {},
    "task_errors": []
  }

Modified scriptrunner/script.py function:

    def s_exec(self):
        module = importlib.import_module(self.script_name)
        runscrp = getattr(module, "run")
        res = runscrp(kwargs=self.arg)
        return res

Result with modified s_exec():

<...>
    "task_status": "failed",
    "task_result": null,
    "task_errors": [
      "name 'non_existent_world' is not defined"
    ]

Issue with "command" when using /netmiko/getconfig; however, legacy /getconfig works

{
  "status": "success",
  "data": {
    "task_id": "e58dda33-a0a6-4fee-ada4-734878ca159a",
    "created_on": "2020-10-06 06:43:34.170339",
    "task_queue": "fifo",
    "task_meta": {
      "enqueued_at": "2020-10-06 06:43:34.170598",
      "started_at": "2020-10-06 06:43:34.193613",
      "ended_at": "2020-10-06 06:43:38.531095",
      "enqueued_elapsed_seconds": null,
      "total_elapsed_seconds": "4"
    },
    "task_status": "finished",
    "task_result": null,
    "task_errors": [
      "send_command() got multiple values for argument 'command_string'"
    ]
  }
}

commit not implemented error

Issue with "Network device does not support 'commit()' method": netmiko supports the attribute but just raises.

Published Containers have expired certs

Just performed fresh install using the documented instructions below:

Containers fail to start due to the included certificate having expired in August 2022:
_netpalm-controller_1 | redis.exceptions.ConnectionError: Error 1 connecting to redis:6379. [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (ssl.c:1131).

Please update documentation on the repo to include the additional step below:

  1. Ensure you first have docker installed
    sudo apt-get install docker.io
    sudo apt-get install docker-compose

  2. Clone this repository
    git clone https://github.com/tbotnz/netpalm.git
    cd netpalm

  3. Regenerate certificates <--- New Step
    ./redis_gen_new_certs.sh

  4. After the container has been built and started, you're good to go! netpalm will be available on port 9000 under your docker hosts IP.

Not sure if this the best way but it worked for me.

Additional improvement suggestions:

  1. Would it be possible for the compose script to automatically recreate the certificate upon each new build?

  2. Could steps be documented to explain how an admin can provide their own certificates if auto-regeneration is not possible/preferred?

It looks as though the regen script puts the certs into the below location:
/netpalm/netpalm/backend/core/security/cert/tls

Thanks

Lee

No rollback option for a task

Please add a rollback option for any task. This could be solved in two ways:

  1. Auto-generating the delete config for a service template, which can be triggered when asked to roll back.
  2. Developing a rollback API which simply copies the rollback config (generated during task execution by napalm) to the device's current running configuration.

Cache interactions with Pre-Check and Post-Check are entirely unknown

It's possible they already act as desired, or could be made that way quite easily. It's also possible it'll take some real digging into.

Since Checks are sort of "sub-tasks":

  • Do (or should) they require their own cache config?
  • Do (or should) they inherit the cache config of the parent task?
  • If the Parent is a setconfig call, it should ALWAYS invalidate any cache after a PreCheck and before a PostCheck.
Task Start -> Pre Check -> SetConfig -> Poison Cache -> Post Check

Feature request - always included webhook

This is to request additional behavior for the webhook callbacks. I would love the concept of an "always webhook": a webhook defined in the config file that would always get run.

The idea being, a custom webhook could be built that would always run upon task completion to create an archive of executions/results.

It would be great if the behavior was "hierarchical" as well, so that specifying a webhook in the POST payload would also get run. For example:

  1. If a webhook is specified in the payload:
    a. The "always" webhook gets executed
    b. The webhook in the payload gets executed
  2. If a webhook is not specified in the payload:
    a. The "always" webhook gets executed

Thoughts?

Adding new driver SOAP

I have built a similar program to netpalm for our team. It is less complex and doesn't support task queuing and some of the other more advanced concepts built here. But it does support the same drivers with the addition of a SOAP driver I created to interface with Adtran and Calix systems.

Would you be willing to work together to add a new driver for the system?

Use multiple ttp templates with multiple commands

At the moment it's not possible to pass an array of commands to the getconfig endpoint and match multiple TTP templates to the output, meaning you can't effectively parse data correctly. When you try to do this, only the first command gets parsed correctly.

Would this be too hard to implement?

FR: method to scrub logs

Per a Slack conversation with @tbotnz, opening a ticket to create an implementation of "log scrubbing". The idea being, a method of removing certain patterns, such as username/password, from being presented in logs.

One idea on the implementation would be to allow for customized patterns to be matched:

{
    "scrub": [
         "password",
         "username",
         "some_special_key",
         "x-api-key"
    ]
}
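A minimal sketch of scrubbing driven by a key list like the one above. The `key=value` and `"key": "value"` formats handled here are assumptions about what the logs contain, and quoting of masked values is not preserved:

```python
import re

def scrub_log(line, scrub_keys, mask="********"):
    """Mask the value following any listed key in a log line."""
    for key in scrub_keys:
        # Matches  key=value,  key: value  and  "key": "value"
        pattern = rf'("?{re.escape(key)}"?\s*[=:]\s*)"?[^",\s]+"?'
        line = re.sub(pattern, rf"\g<1>{mask}", line, flags=re.IGNORECASE)
    return line
```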

Auto reload of template directory

Templates are currently loaded at container creation, and containers have to be rebuilt if the templates are edited. It would be ideal if the template directory were reloaded automatically after a certain time period or whenever a template changes.

Allow for customization of Response objects

It would be nice to be able to customize the Response objects for all the routes and their different HTTP response codes. Currently there is just a placeholder for the 200/201 and 422 codes.
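One shape this customization could take is a user-supplied mapping in the form FastAPI's `responses=` route parameter already expects. The descriptions and the extra 500 entry below are illustrative assumptions:

```python
# Hypothetical user-configurable responses mapping; FastAPI merges this
# into the generated OpenAPI schema when passed to a route decorator.
CUSTOM_RESPONSES = {
    200: {"description": "Task accepted and result returned"},
    422: {"description": "Payload failed schema validation"},
    500: {"description": "Unhandled worker error"},
}

# Applied per-route, e.g.:
#   @app.post("/getconfig", responses=CUSTOM_RESPONSES)
```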

Realtime Application

Hello, I'm developing an app to draw realtime graphs. What is the best way to call the API every two seconds?

I have tried raising the number of FIFO workers, because the requests were getting queued up a lot.

Thank you in advance!

FR: static work queues

Following up on a Slack conversation. The idea is to implement a "static queue" within Redis and allow tasks to target that queue. So if you had a situation where you wanted to, say, run a config backup job for 5000 devices, you could spin up a dedicated FIFO worker and "route" those 5000 jobs to it. This would leave the "main" worker queue free to take on other (possibly/likely higher-priority) tasks.
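In pure-Python terms the routing itself is simple; this sketch assumes a hypothetical `queue_name` field in the task payload, with a plain dict of lists standing in for the named Redis/rq queues:

```python
# Sketch: route a task payload to a named static queue, falling back to
# the main "fifo" queue when no queue_name is given.

def route_task(payload, queues, default="fifo"):
    """Append the payload to its target queue and return the queue name."""
    name = payload.get("queue_name", default)  # hypothetical payload field
    queues.setdefault(name, []).append(payload)
    return name
```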

example ncclient call

Hello,

This isn't actually an issue, but I'm not sure how to propose a wiki page or something -- hoping to just provide a simple example of using ncclient with Juniper equipment for others.

Example /getconfig POST body that calls an RPC and will render the result into JSON:

{
    "library": "ncclient",
    "connection_args": {
        "host": "192.168.1.4",
        "username": "foo",
        "password": "bar",
        "port": 830,
        "hostkey_verify": false,
        "device_params": {
            "name": "junos"
        }
    },
    "args": {
        "rpc": "<get-software-information></get-software-information>",
        "render_json": true
    },
    "queue_strategy": "fifo"
}

and the task result with rendered JSON:

{
    "status": "success",
    "data": {
        "task_id": "73af9d88-4424-44fb-9be5-3453514dedf8",
        "created_on": "2020-11-22 17:59:57.427318",
        "task_queue": "fifo",
        "task_meta": {
            "enqueued_at": "2020-11-22 17:59:57.428441",
            "started_at": "2020-11-22 17:59:57.512601",
            "ended_at": "2020-11-22 18:00:03.786630",
            "enqueued_elapsed_seconds": "0",
            "total_elapsed_seconds": "6"
        },
        "task_status": "finished",
        "task_result": {
            "get_config": {
                "rpc-reply": {
                    "@message-id": "urn:uuid:10a9d9d9-09ea-4e70-a035-d9c06aba1a88",
                    "multi-routing-engine-results": {
                        "multi-routing-engine-item": {
                            "re-name": "fpc0",
                            "software-information": {
                                "host-name": "OFFICE-EX2200",
                                "product-model": "ex2200-c-12p-2g",
                                "product-name": "ex2200-c-12p-2g",
                                "package-information": [
                                    {
                                        "name": "junos",
                                        "comment": "JUNOS Base OS boot [12.3R12.4]"
                                    },
                                    {
                                        "name": "jbase",
                                        "comment": "JUNOS Base OS Software Suite [12.3R12.4]"
                                    },
                                    {
                                        "name": "jkernel-ex-2200",
                                        "comment": "JUNOS Kernel Software Suite [12.3R12.4]"
                                    },
                                    {
                                        "name": "jcrypto-ex",
                                        "comment": "JUNOS Crypto Software Suite [12.3R12.4]"
                                    },
                                    {
                                        "name": "jdocs-ex",
                                        "comment": "JUNOS Online Documentation [12.3R12.4]"
                                    },
                                    {
                                        "name": "jswitch-ex",
                                        "comment": "JUNOS Enterprise Software Suite [12.3R12.4]"
                                    },
                                    {
                                        "name": "jpfe-ex22x",
                                        "comment": "JUNOS Packet Forwarding Engine Enterprise Software Suite [12.3R12.4]"
                                    },
                                    {
                                        "name": "jroute-ex",
                                        "comment": "JUNOS Routing Software Suite [12.3R12.4]"
                                    },
                                    {
                                        "name": "jweb-ex",
                                        "comment": "JUNOS Web Management [12.3R12.4]"
                                    },
                                    {
                                        "name": "fips-mode-arm",
                                        "comment": "JUNOS FIPS mode utilities [12.3R12.4]"
                                    }
                                ]
                            }
                        }
                    }
                }
            }
        },
        "task_errors": []
    }
}

Thanks,

Will

Cache TTL can currently exceed request TTL. This should not be permitted.

Since what's actually cached is the task metadata, and not the actual result, a cache entry could outlive the result it points to; this should not be permitted.

Either:

  • raise an error
  • set cache_ttl to min(cache_ttl, response_ttl)
  • set response_ttl to max(cache_ttl, response_ttl)

I don't have any firm opinions on which way to go, but we should do one of those.
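The second option above is a one-liner; the function name is just illustrative:

```python
def effective_cache_ttl(cache_ttl, response_ttl):
    """Clamp the cache TTL so cached task metadata never outlives the result."""
    return min(cache_ttl, response_ttl)
```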

support for Mikrotik vendor via napalm-ros

I'm a beginner at network automation and I tried to use the napalm-ros driver with a RouterBoard without success; nothing happens.

Is this something that needs to be added to the netpalm source code?

TextFSM template sections created by netpalm aren't updated correctly on future runs

Sure. So to clarify a bit more, this only happens in the second template I add for a specific driver (generic, linux, etc.).

{ "key": "<any nornir key>", "driver": "generic", "command": "test1" }

If you go to the endpoint to show the list of templates, it is there, and if you check the template itself, you get the template contents.

If, after you add the first one, you try the above payload but with the command being test2, you'll see it adds the template, but it does not appear on the list. Nevertheless, you can still see the content of the template file.

Even with different key values, the template does not appear on the list, but you can see its contents.

Originally posted by @hanunes in #72 (comment)

Swagger documentation not rendered anymore: "Please indicate a valid Swagger or OpenAPI version field"

When you use the latest version, and after fixing the issue with the pydantic version, you don't get the usual Swagger/OpenAPI screen because the file cannot be rendered. Instead you get this message:

Please indicate a valid Swagger or OpenAPI version field. Supported version fields are swagger: "2.0" and those that match openapi: 3.0.n (for example, openapi: 3.0.0).

I am not familiar with FastAPI, but it seems something changed between version 0.98.0 and 0.99.0 (the current version is 0.103.2) that breaks the rendering. It might have something to do with the custom CSS in netpalm.

In order to fix the issue, the FastAPI version should be constrained in requirements.txt. Pinning an exact version is usually bad practice, as is providing no version at all, but in this case we can expect no further releases between 0.98.0 and 0.99.0 with security patches. To stick with proper syntax, I changed this line in requirements.txt:

fastapi>=0.98.0,<0.99.0

Rebuild the containers and it works again.

Of course this is only a quick fix, as you miss out on all security and functional updates for FastAPI. The proper way would be to solve the root cause of why it is not rendered with FastAPI versions above 0.98.
