meeshkan / hmt

HTTP Mocking Toolkit

License: MIT License

Python 100.00%

api mocking openapi-schemas python http-types openapi3

hmt's Introduction

HMT

The HTTP Mocking Toolkit (HMT) is a tool that mocks HTTP APIs for use in sandboxes as well as for automated and exploratory testing. It uses a combination of API definitions, recorded traffic and code in order to make crafting mocks as enjoyable as possible.

Chat with us on Gitter to let us know about questions, problems or ideas!

What's in this document

Installation
Getting started with HMT
- Tutorial
Collect recordings of API traffic
Build a HMT spec from recordings
- Building modes
Mock server traffic using a HMT spec
Development
Contributing
- Code of Conduct

Installation

Install via pip (requires Python 3.6+):

pip install hmt

macOS users can install HMT with Homebrew:

brew tap meeshkan/tap
brew install hmt

Debian and Ubuntu users can install HMT with apt:

echo "deb [trusted=yes] https://dl.bintray.com/meeshkan/apt all main" | tee -a /etc/apt/sources.list
apt-get -qq update && apt-get install hmt

Getting started with HMT

The basic HMT flow is collect, build and mock.

First, collect data from recorded server traffic and/or OpenAPI specs.
Then, build a schema that unifies these various data sources.
Finally, use this schema to create a mock server of an API.

Tutorial

The quickest way to get an overview of HMT is to complete our interactive tutorial. It walks you through the collect, build, and mock flow - while also covering the concepts necessary for development.

Note: This tutorial has been tested on Python 3.6, 3.7, and 3.8.

After installing HMT, you can begin the tutorial by invoking from the command line:

$ hmt tutorial

Once you've run this, you should see:

    __              __ 
   / /_  ____ ___  / /_
  / __ \/ __ `__ \/ __/
 / / / / / / / / / /_
/_/ /_/_/ /_/ /_/\__/


The tutorial!!
Press ENTER to continue...

If not, it's probably our fault. Please let us know by filing an issue on this repo.

Collect recordings of API traffic

Let's look at how to build a HMT spec. First, you have to collect recordings of server traffic and/or OpenAPI server specs.

To record API traffic, the HMT CLI provides a record mode that captures API traffic using a proxy.

$ hmt record

This command starts HMT as a reverse proxy on the default port of 8000 and creates two directories: logs and specs.

With curl, for example, you can use HMT as a proxy like so:

$ curl http://localhost:8000/http://api.example.com

By default, the recording proxy treats the path as the target URL. It then writes a .jsonl file containing logs of all server traffic to the logs directory. All logs are written in the http-types format, because HMT's build tool expects recordings as a .jsonl file in which every line is in the http-types format.
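
As a minimal sketch of the path-as-target convention described above, the helper below builds the URL a client would request through the recording proxy. The function name `proxied_url` is hypothetical and not part of HMT; the default base `http://localhost:8000` is taken from the curl example.

```python
def proxied_url(target_url: str, proxy_base: str = "http://localhost:8000") -> str:
    """Return the URL to request so that the recording proxy forwards to target_url."""
    # The proxy treats everything after the first slash as the target URL.
    return proxy_base.rstrip("/") + "/" + target_url


print(proxied_url("http://api.example.com"))
```

Any HTTP client can then be pointed at the returned URL instead of the real API while hmt record is running.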

For more information about recording, including direct file writing and Kafka streaming, see the recording documentation.

Build a HMT spec from recordings

Using the HMT CLI, you can build an OpenAPI schema from a single .jsonl file, in addition to any existing OpenAPI specs that describe how your service works.

$ hmt build --input-file path/to/recordings.jsonl

Note: The input file should be in JSON Lines format and every line should be in http-types JSON format. For an example input file, see recordings.jsonl.
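
To get a feel for what a recordings file contains before building, a short script can summarize it. This is only a sketch: the field names `request`, `method` and `pathname` are assumptions about the http-types format, and `summarize_recordings` is a hypothetical helper, not an HMT API.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_recordings(lines: Iterable[str]) -> Counter:
    """Count recorded exchanges per (method, pathname) pair."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        exchange = json.loads(line)  # JSON Lines: one JSON document per line
        request = exchange["request"]  # assumed http-types field name
        counts[(request["method"].lower(), request["pathname"])] += 1
    return counts


# Two made-up exchanges standing in for a real recordings.jsonl file.
sample = [
    '{"request": {"method": "get", "host": "api.example.com", "pathname": "/pets"}, "response": {"statusCode": 200}}',
    '{"request": {"method": "get", "host": "api.example.com", "pathname": "/pets"}, "response": {"statusCode": 200}}',
]
print(summarize_recordings(sample))
```

To run it against a real file, pass an open file object, for example `summarize_recordings(open("recordings.jsonl"))`.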

Optionally, you can also specify an output directory using the --out flag followed by the path to this directory. By default, HMT will build the new OpenAPI specifications in the specs directory.

Use dash (--input-file -) to read from standard input:

$ hmt build --input-file - < recordings.jsonl

Building modes

You can use a mode flag to indicate how the OpenAPI spec should be built, for example:

hmt build --input-file path/to/recordings.jsonl --mode gen

Supported modes are:

gen [default] - infer a schema from the recorded data
replay - replay the recorded data based on exact matching

For more information about building, including mixing together the two modes and editing the created OpenAPI schema, see the building documentation.

Mock server traffic using a HMT spec

You can use an OpenAPI spec, such as the one created with hmt build, to create a mock server using HMT.

$ hmt mock path/to/dir/

Note: You can specify a path to the directory your OpenAPI spec is in or a path to one specific file.

For more information about mocking, including adding custom middleware and modifying the mocking schema JIT via an admin API, see the mocking documentation.

Development

Here are some useful tips for building and running HMT from source.

If you run into any issues, please reach out to our team on Gitter.

Getting started

Clone this repository: git clone https://github.com/meeshkan/hmt
Create a virtual environment: python3 -m venv .venv && source .venv/bin/activate
Install dependencies: pip install --upgrade -e '.[dev]'
Install pre-commit hooks to automatically format code as a git hook: pre-commit install

Tests

Run all checks:

$ python setup.py test

`pytest`

Run tests/ with pytest:

pytest
# or
python setup.py test

Configuration for pytest is found in pytest.ini.

Formatting

Formatting is checked by the above-mentioned python setup.py test command.

To fix formatting:

$ python setup.py format

`flake8`

Run style checks:

$ flake8 .

`pyright`

You can run type-checking by installing pyright globally:

$ npm install -g pyright

And then running:

$ pyright --lib
$ # or
$ python setup.py typecheck

Using the Pyright extension is recommended for development in VS Code.

Automated builds

Configuration for the CircleCI build pipeline can be found in .circleci/config.yml.

Publishing HMT as a PyPI package

To publish HMT as a PyPI package, complete the following steps:

Bump the version in setup.py if the version is the same as in the published package. Commit and push.
Run python setup.py test to check that everything works
To build and upload the package, run python setup.py upload. Insert PyPI credentials to upload the package to PyPI. The command will also run git tag to tag the commit as a release and push the tags to remote.

To see what the different commands do, see Command classes in setup.py.

Contributing

Thanks for your interest in contributing! Please take a look at our development guide for notes on how to develop the package locally. A great way to start contributing is to file an issue or make a pull request.

Code of Conduct

Please note that this project is governed by the Meeshkan Community Code of Conduct. By participating, you agree to abide by its terms.


hmt's Issues

Add support for converting between JSON schema and OpenAPI

Response content JSON types are currently stored in JSON schema format using the genson library. JSON schema is not always a valid OpenAPI specification, so it would be useful to add code for converting between the formats.

Useful resources:

Fix tutorial issues

Add detailed error descriptions in the matcher

Error descriptions in the matcher look like:

'message': 'Could not find a valid OpenAPI schema for path=/pets, method=get'

It also matches a hostname and, probably, query params. We should show them all. And we should show the source of an error if we can. I mean, if there is no spec matching a hostname, we should return something like "No matching host found for ...".
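
A rough sketch of what a more detailed message could look like is shown below. The function `describe_match_failure` and its parameters are hypothetical and only illustrate the idea of naming the host, query and most specific cause; this is not the matcher's actual code.

```python
from typing import Mapping, Optional, Set


def describe_match_failure(
    host: str,
    path: str,
    method: str,
    query: Optional[Mapping[str, str]] = None,
    known_hosts: Optional[Set[str]] = None,
) -> str:
    """Build an error message that names the most specific cause available."""
    request = f"host={host}, path={path}, method={method}, query={dict(query or {})}"
    # If we know which hosts the loaded specs cover, report a host mismatch first.
    if known_hosts is not None and host not in known_hosts:
        return f"No matching host found for {request}"
    return f"Could not find a valid OpenAPI schema for {request}"


print(describe_match_failure("api.example.com", "/pets", "get", known_hosts={"other.example.com"}))
```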

Run meeshkan server as a daemon

Meeshkan ml builder PoC

Should we replace sys.exit() with exception?

We currently have a sys.exit(1) call at https://github.com/meeshkan/meeshkan/blob/master/meeshkan/server/server/callbacks.py#L42.

As @ksaaskil mentioned in a review comment, perhaps we should replace that by raising an exception, and catching it higher up. What do you think? My initial thoughts are that it's ok to call sys.exit directly for command-line programs, but not for any code that might be used as a library function. But I'm easily persuaded here!
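
One common way to get both behaviours is sketched below: library code raises an exception, and only the command-line entry point converts it into an exit code. `ServerStartupError`, `start_server` and `main` are hypothetical names for illustration, not the code in callbacks.py.

```python
import sys
from typing import List


class ServerStartupError(Exception):
    """Hypothetical error raised by library code instead of calling sys.exit()."""


def start_server(port: int) -> str:
    # Library code: signal failure with an exception so callers can decide what to do.
    if not 0 < port < 65536:
        raise ServerStartupError(f"invalid port: {port}")
    return f"listening on {port}"


def main(argv: List[str]) -> int:
    # CLI entry point: the only place that turns errors into exit codes.
    try:
        print(start_server(int(argv[0])))
    except ServerStartupError as error:
        print(f"error: {error}", file=sys.stderr)
        return 1
    return 0
```

A console script would then end with `sys.exit(main(sys.argv[1:]))`, while tests and other callers can catch `ServerStartupError` directly.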

prototype stateful mock server on Python

Add scheme generation mode to the proxy

Add support for inferring query parameters

Not implemented yet in the builder.

Extract data from recordings

Daemon mode does not work on python 3.8?

I initially thought that daemon mode did not work on mac, but it might be that it's broken on python 3.8.

This is the output of trying to use -d, --daemon:

Traceback (most recent call last):
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/bin/meeshkan", line 11, in <module>
    load_entry_point('meeshkan', 'console_scripts', 'meeshkan')()
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/Users/fornwall/src/meeshkan/meeshkan/meeshkan/serve/commands.py", line 125, in mock
    daemon = daemonocle.Daemon(
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/daemonocle/core.py", line 37, in __init__
    self.detach = detach & self._is_detach_necessary()
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/daemonocle/core.py", line 260, in _is_detach_necessary
    if cls._is_socket(sys.stdin):
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/daemonocle/core.py", line 224, in _is_socket
    sock = socket.fromfd(fd, socket.AF_INET, socket.SOCK_RAW)
  File "/Users/fornwall/.pyenv/versions/3.8.1/lib/python3.8/socket.py", line 544, in fromfd
    return socket(family, type, proto, nfd)
  File "/Users/fornwall/.pyenv/versions/3.8.1/lib/python3.8/socket.py", line 231, in __init__
    _socket.socket.__init__(self, family, type, proto, fileno)
OSError: [Errno 38] Socket operation on non-socket

This might be the underlying cause: Python 3.8 regression Socket operation on non-socket

Validate mock input data

We should validate input data to the mock so that, if it is called with data that does not obey the schema, we:

Return a good HTTP response code (perhaps a 5xx one).
Give a descriptive error message in the HTTP response.
Log a descriptive error message.
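
As an illustration of the kind of descriptive messages meant here, the sketch below checks a value against a very small subset of JSON Schema (`type`, `required`, `properties`). The `validate` function is hypothetical; a real implementation would more likely rely on a full schema validation library.

```python
from typing import Any, List, Mapping

# Map JSON Schema type names to Python types (a small illustrative subset).
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate(instance: Any, schema: Mapping[str, Any], where: str = "body") -> List[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors: List[str] = []
    expected = schema.get("type")
    if expected in _TYPES:
        ok = isinstance(instance, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(instance, bool):
            ok = False
        if not ok:
            return [f"{where}: expected {expected}, got {type(instance).__name__}"]
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{where}: missing required property '{name}'")
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(validate(instance[name], subschema, f"{where}.{name}"))
    return errors
```

The returned messages could be placed in the HTTP response body and written to the log unchanged.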

Add better logging configuration

The current logging.conf messes with the root logger, which is bad. Also, the names and formats need to be revised.
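
A minimal sketch of one alternative, using only the standard library, is to configure a named package logger and leave the root logger alone. The logger name `meeshkan` and the format string are assumptions chosen for illustration.

```python
import logging


def configure_logging(name: str = "meeshkan", level: int = logging.INFO) -> logging.Logger:
    """Attach a handler to a named logger and leave the root logger untouched."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    logger.propagate = False  # keep records from also reaching root handlers
    return logger
```

Modules inside the package would then call `logging.getLogger(__name__)` and inherit this configuration through the logger hierarchy.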

OpenAPI spec is pretty printed only for gen mode

At least mixed-mode specs are not pretty printed.

Fix tutorial

Change default build output to specs directory

Description

While creating a mock of the Studio Ghibli API, I ran into something peculiar... When booting up meeshkan record, there were two empty directories created: logs and specs. After recording some API traffic via curl, my recordings were saved to the logs directory.

Then when running a build command without a specified output directory (i.e. meeshkan build -i logs/recordings.jsonl), a new directory called out was created which now held my openapi.json file.

It would make more sense, in this case, to have the default build output be the specs folder since it is already created. Especially since the default value for meeshkan mock is specs.

We can then update the documentation to recommend that people put any existing specs into that specs directory and use the -a flag to modify them, or the -o flag if they prefer a different location for their output.

Additional information

Python version: 3.8.1
Meeshkan version: 0.2.16

Solve logging problem

The proxy uses logging.yaml from meeshkan cli. It is hard to configure if it is installed as a package. And its default configuration doesn't include proxy debug/info logs.

Request body schema is not inferred

Add support for inferring the schema for requestBody, it should share a lot of code with inferring the response body.

Consider adding 'Issues Templates' to Repo

Summary

Having issue templates will allow for consistent issues from a growing open source community in the future. A template can act as a checklist for the person creating the issue so that they consider all the relevant information before submitting. It also formats the issue in a cogent and understandable manner to help the maintainers.

You can copy:
https://github.com/COVID-19-electronic-health-system/Corona-tracker/tree/master/.github

Motivation

Why are we doing this?
Having a template acts as a checklist for the person creating the issue. They will consider all the relevant information before submitting. It will help the contributor think deeper about the problem which will give greater clarity to the issue, bug or feature request.

What use cases does it support?
Allows for expanding the open-source community in a sustainable manner.
Allows for developers to write more detailed issues which helps with development.

What is the expected outcome?
Formatted and consistent issues, with more relevant information to save time and help the maintainers.

Describe alternatives you've considered

A clear and concise description of the alternative solutions you've considered:
Allow for a standard to develop organically.

Additional context

N/A

meeshkan has no --version flag

This seems like a good thing to add.

ValueError: Algorithm conflict for path matching when running build command

Description

While attempting to build a spec from recorded Studio Ghibli API data, I ran into two ValueError: Algorithm conflict for path matching errors when running the meeshkan build -i logs/recordings.jsonl command.

References

Python version: 3.8.1
Meeshkan version: 0.2.16

Gist of my recordings.jsonl file: https://gist.github.com/carolstran/7418f2e6a2139a5a42142db5a076bfe4

Terminal output:

2020-03-04 10:24:27,470 - meeshkan.schemabuilder.builder - ERROR - Error updating spec
Traceback (most recent call last):
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/builder.py", line 325, in build_schema_async
    schema = update_openapi(schema, exchange, mode)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/builder.py", line 244, in update_openapi
    path_match_result = find_matching_path(normalized_pathname, schema_paths, request_method, operation_candidate)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/paths.py", line 186, in find_matching_path
    pathname_with_wildcard, pathname_to_be_replaced_with_wildcard, path_match = fn(path, path_item)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/paths.py", line 183, in <lambda>
    lambda p, pi: _dumb_match_to_path(request_path, paths, request_method, operation_candidate)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/paths.py", line 124, in _dumb_match_to_path
    raise ValueError('Algorithm conflict for path matching - got a match for %s %s, but then returned match === None' % (request_path, new_path))
ValueError: Algorithm conflict for path matching - got a match for /films/2baf70d1-42bb-4437-b551-e5fed5a87abe /films/{wqkditcg}, but then returned match === None
2020-03-04 10:24:27,475 - meeshkan.__main__ - DEBUG - Flushing all sinks
2020-03-04 10:24:27,476 - meeshkan.schemabuilder.writer - INFO - Writing to folder /Users/carolyn/meeshkan/team-day-mocks/studio-ghibli/out.
2020-03-04 10:24:27,479 - meeshkan.schemabuilder.writer - DEBUG - Writing to: /Users/carolyn/meeshkan/team-day-mocks/studio-ghibli/out/openapi.json

2020-03-04 10:24:27,479 - meeshkan.__main__ - INFO - Shutting down source
Traceback (most recent call last):
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/bin/meeshkan", line 10, in <module>
    sys.exit(cli())
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/__main__.py", line 118, in build
    run_from_source(source, UpdateMode[mode.upper()], openapi_spec, sinks=sinks)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/__main__.py", line 70, in run_from_source
    loop.run_until_complete(run(loop))
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 612, in run_until_complete
    return future.result()
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/__main__.py", line 67, in run
    await sink_task
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/__main__.py", line 40, in write_to_sink
    async for result in result_stream:
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/builder.py", line 325, in build_schema_async
    schema = update_openapi(schema, exchange, mode)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/builder.py", line 244, in update_openapi
    path_match_result = find_matching_path(normalized_pathname, schema_paths, request_method, operation_candidate)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/paths.py", line 186, in find_matching_path
    pathname_with_wildcard, pathname_to_be_replaced_with_wildcard, path_match = fn(path, path_item)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/paths.py", line 183, in <lambda>
    lambda p, pi: _dumb_match_to_path(request_path, paths, request_method, operation_candidate)
  File "/Users/carolyn/meeshkan/team-day-mocks/aichienv/lib/python3.8/site-packages/meeshkan/schemabuilder/paths.py", line 124, in _dumb_match_to_path
    raise ValueError('Algorithm conflict for path matching - got a match for %s %s, but then returned match === None' % (request_path, new_path))
ValueError: Algorithm conflict for path matching - got a match for /films/2baf70d1-42bb-4437-b551-e5fed5a87abe /films/{wqkditcg}, but then returned match === None

Better openapi.json format after build

Description

While creating a mock of the Studio Ghibli API, I ran into a potential improvement with how we present the OpenAPI spec post-build.

After running my build command (meeshkan build -i logs/recordings.jsonl), an openapi.json file was generated in the out directory. This file was usable... but it was all in a single line, which isn't the most enjoyable reading experience.

To make this file more readable and accessible for new meeshkan/OpenAPI spec users, we should make it so that this file is always formatted.
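
Until the builder does this itself, a generated file can be reformatted with the standard library. The sketch below assumes the spec is plain JSON; `format_spec` is a hypothetical helper name.

```python
import json


def format_spec(spec_text: str) -> str:
    """Re-serialize a JSON document with indentation and a trailing newline."""
    return json.dumps(json.loads(spec_text), indent=2) + "\n"


print(format_spec('{"openapi": "3.0.0", "paths": {}}'))
```

The same result is available from the command line with `python -m json.tool openapi.json`.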

Additional information

Gist containing my generated openapi.json file: https://gist.github.com/carolstran/ecb555024a1653a045e343aa79242df7

Python version: 3.8.1
Meeshkan version: 0.2.16

ModuleNotFoundError: No module named 'jsonpath_rw'

I just set up the project and I am trying to run the tests, but I am getting this error.

When I run pytest

ImportError while loading conftest '/Users/es.py/Projects/Personal/meeshkan/tests/conftest.py'.
tests/conftest.py:10: in <module>
    from meeshkan.serve.mock.request_processor import RequestProcessor
meeshkan/serve/mock/request_processor.py:10: in <module>
    from meeshkan.serve.mock.faker.stateful_faker import StatefulFaker
meeshkan/serve/mock/faker/stateful_faker.py:8: in <module>
    from meeshkan.serve.mock.storage.entity import Entity
meeshkan/serve/mock/storage/entity.py:12: in <module>
    from jsonpath_rw import Fields, parse
E   ModuleNotFoundError: No module named 'jsonpath_rw'

I am sure it's not related to my module because when I check if it exists I get this....

[STORY] Polish the initial impression of running meeshkan

This is a collection of tasks that can be done for improving the initial impressions of running meeshkan.

Some possible ones:

Make it possible to install meeshkan using homebrew (#120).
Make it possible to install meeshkan using apt on Debian and Ubuntu (#137).
Look at the output of meeshkan --help and meeshkan CMD --help and see if things can be improved (#124).
Shorten the log format (#126).
Add colors to the logging.
Add emojis to the logging (some added in #142, more can be added).
Ensuring that we don't forget about user experience, perhaps by adding PR template with notes about the docs.

For meeshkan record:

Some questions that came up for me: If I already have a jsonl file and I go back to record more, does it update or overwrite? Is there any way to overwrite a specific file without deleting and regenerating? These questions could be solved via documentation or code, depending on how the product acts now.
One of the first logs when running says spec generation mode disabled and I was like uhhhh what is this, is it bad 😆 Could be good to clarify for users what this means.
It currently doesn't tell you that it has created two directories (logs and specs)
Once the initial logs have printed, that's it. There are no helpful logs to reassure you that it's doing anything.
If you're using curl or another command-line tool to make requests, it should be done in a separate window because Meeshkan record is a blocking process. We don't mention this anywhere, so it's hard to know that this is what you're supposed to do. (Documented with #144)
How many API calls are "enough"? For the tutorial, we do 33 requests to the Pokemon API - but in general we haven't stress tested this. Is it just the more calls, the more accurate? Or is there a minimum number of calls? Would be good to know.
How do you kill the recorder without losing data? Turns out you do ctrl + c or a kill command - but that's not written anywhere. It'd be nice to have that documented or, even better, have a stop command (Documented in #144, but still in favor of a stop command)
Update documentation to reflect recordings filename changes in #100 in both RECORD.md and BUILD.md (#144)
Continuously build a schema when recording, and print a line every time the schema is updated due to a recording. That would give an indication when updates are slowing down/stopping, so the further recordings are not needed.

For meeshkan build:

Update documentation to default build output changes in #99 (#144)
Better flag descriptions in BUILD.md (ie the differences between using -o and -a) (#144)

For meeshkan mock:

Explain what the -s flag is doing in mock command example in MOCK.md (In meeshkan/meeshkan@608524e we write out --specs-dir)
When creating my mock server, I needed to search for the servers field in OpenAPI spec to find the URL for the mock server to respond to. But I only knew to do that because Mike told me to. We should clarify in docs that this is located there and that it will usually be the same url as the API itself, unless you hack it. We haven’t mentioned hackability much in the existing documentation. If we write out the endpoints at startup, then this is less of a problem. See also the next note about logging endpoints at startup. Solved by #142.
Log the endpoints at startup ala https://github.com/stoplightio/prism (gif) (#142)
We are currently using http://localhost:9000/http://myapi.com/path/to/api, perhaps use http://localhost:9000/path/to/api without the host (that's what prism and mocklab do at least).
How do we know that it’s not calling the real API? The host URL is the same. Mock mode never calls the real API - should be specified.
Clarify that the default value that the mock command will search in is specs. Or remove the default completely (Removed default in #154)
It should validate the input --> if it’s an int and you put in a string, we should show an error (#138).
Remove --specs-dir as an option, and instead take it as an argument. So run it like meeshkan mock directory instead of meeshkan mock --specs-dir directory (#154) (documented in #144)
Support running on a single schema file as well, like in meeshkan mock file-or-directory (#154)

General documentation improvements:

Should we recommend that people run a virtual environment? I've always had a better time when I've used one and we recommend it in many of our tutorials.
Link to instructions for installing curl, postman, etc (#144)
Explain what a daemon is once the quick start demo is ready (quick start demo isn't there, but the general issue is solved by linking daemon docs in #144)
Links to the next section at the end of each docs page (ie if you're currently on the recording docs, a link to the build docs) (#144)
At the end of each section of the README, instead of a sentence explaining what the documentation has in it --> have a list
Always write out the long format of options (like --admin-port instead of -a) in documentation (#144)
Add a shell completion to the install.
Write a link to the README in the --help output.

When working on this, let's create links to PRs and concrete issues to visualise what has been done and ideas for further improvements.

Create a partial scheme matcher

Write builder output to folder

It could contain more files than simple OpenAPI spec such as Dockerfile.

Add support for combining paths belonging to the same logical endpoint

For example, /pets/1 and /pets/2 are probably part of /pets/{id} path, at least if their schemas coincide well enough.
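
One simple heuristic for this, sketched below, is to replace path segments that look like generated identifiers (integers or UUIDs) with a parameter placeholder. This is an assumption-laden illustration, not the builder's algorithm: it ignores the schema comparison mentioned above, and it would reuse the same parameter name if a path had several identifier segments.

```python
import re
from typing import Iterable, Set

# Segments that look like generated identifiers: integers or UUIDs.
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def templatize(path: str, param: str = "id") -> str:
    """Replace identifier-like segments of a path with a {param} placeholder."""
    segments = path.split("/")
    return "/".join("{%s}" % param if _ID_SEGMENT.match(s) else s for s in segments)


def combine_paths(paths: Iterable[str]) -> Set[str]:
    """Collapse concrete paths into the set of templated paths they map to."""
    return {templatize(p) for p in paths}


print(sorted(combine_paths(["/pets/1", "/pets/2", "/pets"])))
```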

Run examples as integration tests

Exception in schema builder - path matching

Exception in callback functools.partial(<function wrap.<locals>.null_wrapper at 0x7fba040e4c80>)
Traceback (most recent call last):
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/ioloop.py", line 758, in _run_callback
    ret = callback()
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/iostream.py", line 752, in wrapper
    return callback(*args)
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/channel.py", line 304, in on_server_close
    self.remove_channel()
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/channel.py", line 298, in remove_channel
    self.flush(check_length=False)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/channel.py", line 245, in flush
    self._proxy_calback.on_request_complete(self._request, resp)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/proxy.py", line 52, in on_request_complete
    self._data_callback.log(request, response)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/utils/data_callback.py", line 75, in log
    self._specs[host], reqres, self._update_mode
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/builder.py", line 243, in update_openapi
    path_match_result = find_matching_path(normalized_pathname, schema_paths, request_method, operation_candidate)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/paths.py", line 187, in find_matching_path
    pathname_with_wildcard, pathname_to_be_replaced_with_wildcard, path_match = fn(path, path_item)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/paths.py", line 184, in <lambda>
    lambda p, pi: _dumb_match_to_path(request_path, paths, request_method, operation_candidate)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/paths.py", line 125, in _dumb_match_to_path
    raise ValueError('Algorithm conflict for path matching - got a match for %s %s, but then returned match === None' % (request_path, new_path))
ValueError: Algorithm conflict for path matching - got a match for /accounts/v3/accounts/gKLBnhCsw4TIALwCax3n99SFsjS9MmdnUME5aGeCBLM.TBb2WCdFe5kN6N87qvVQ4w.Qdzv56vvTwudmN8B4nvoRg /accounts/v3/accounts/{fbbtxzhd}, but then returned match === None

During OP Bank recording again

Add gzip response decompressing to proxy

The proxy fails on gzipped responses.
For example, http://localhost:8899/https/api.github.com/search/users?q=repos%3A%3E%3D30+location%3Aseattle&page=0
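
A sketch of the decompression step, using only the standard library, might look like the following. `decode_body` is a hypothetical helper; where it would be called inside the proxy is not shown here.

```python
import gzip
from typing import Mapping


def decode_body(headers: Mapping[str, str], body: bytes) -> bytes:
    """Return the body decompressed if Content-Encoding says it is gzipped."""
    encoding = ""
    for name, value in headers.items():
        if name.lower() == "content-encoding":  # header names are case-insensitive
            encoding = value.strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    return body
```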

Document a use case of or drop header based routing in the mock command

Do we have a use case for --header-routing when running mocks?

Name record output '<host>-recordings.jsonl' by default

Description

While creating a mock of the Studio Ghibli API, I ran into a potential enhancement. After running meeshkan record and recording some API traffic via curl, my recordings were saved to the logs directory... but under the name ghibliapi.herokuapp.com.jsonl (ghibliapi.herokuapp.com is the API host).

Perhaps it would make more sense to have this file always be named recordings.jsonl by default. That way, it matches the documentation and makes it more clear to the user what this file is for.

Additional information

Python version: 3.8.1
Meeshkan version: 0.2.16

In case you're curious, here's a gist of the jsonl that was generated post-recordings: https://gist.github.com/carolstran/7418f2e6a2139a5a42142db5a076bfe4

Support windows and fix header mode in proxy

Windows does not seem to support the TCP_NODELAY socket option. We may omit it.
Header mode in the proxy recognizes only the exact spelling 'Host', while the HTTP standard requires header names to be case-insensitive.
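
For the header issue, a case-insensitive lookup can be sketched as follows. `get_header` is a hypothetical helper and assumes the headers are available as a plain mapping.

```python
from typing import Mapping, Optional


def get_header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Look up a header by name, ignoring case, or return None if absent."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


print(get_header({"host": "api.example.com"}, "Host"))
```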

Produced schema is not necessarily valid OpenAPI specification

For example, if genson sees { "value": null }, it infers the type to be "null". In OpenAPI, this is not valid. Instead, one must infer a valid non-null type such as type: string and then set nullable: true.

Another example is { arr: [] }. In this case, the JSON schema would be just { type: array }. This is not valid in OpenAPI, where one must always also define the type of the items.

There are also many other cases where JSON schema produced by genson is not valid according to OpenAPI, see the differences here.
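
The two cases above can be sketched as a small conversion function. This is an illustration only: `to_openapi` is a hypothetical name, it covers just null types and arrays without items, and falling back to `string` for a bare null type follows the suggestion in this issue rather than any rule in the specification.

```python
from typing import Any, Dict


def to_openapi(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a JSON Schema fragment to an OpenAPI 3.0 style schema (partial)."""
    result = dict(schema)
    types = result.get("type")
    if isinstance(types, str):
        types = [types]
    if types is not None:
        non_null = [t for t in types if t != "null"]
        if len(non_null) < len(types):
            result["nullable"] = True  # OpenAPI 3.0 uses nullable instead of type "null"
        if not non_null:
            result["type"] = "string"  # placeholder type, as suggested in this issue
        elif len(non_null) == 1:
            result["type"] = non_null[0]
        else:
            # OpenAPI 3.0 has no type lists; express the alternatives with oneOf.
            del result["type"]
            result["oneOf"] = [{"type": t} for t in non_null]
    if result.get("type") == "array":
        # items is required for arrays; an empty schema accepts any item.
        result["items"] = to_openapi(result.get("items", {}))
    if "properties" in result:
        result["properties"] = {k: to_openapi(v) for k, v in result["properties"].items()}
    return result


print(to_openapi({"type": "object", "properties": {"value": {"type": "null"}, "arr": {"type": "array"}}}))
```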

Add user-defined storages to meeshkan server.

We need custom storages and improved data manager for a new NLP based data structure analysis

Deprecate `--source` parameter

There's no need to do meeshkan build --source kafka, it's simpler and more expressive to use the protocol to define the source. For example, meeshkan build kafka:localhost:9092. One might also include the topic here with something like meeshkan build kafka:localhost:9092@http_recordings to avoid configurations unless really needed.
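
Parsing the proposed source string could look roughly like this. The format `<protocol>:<address>[@<topic>]` is inferred from the two examples above, and `Source` and `parse_source` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Source(NamedTuple):
    protocol: str
    address: str
    topic: Optional[str]


def parse_source(spec: str) -> Source:
    """Split a source string such as 'kafka:localhost:9092@http_recordings'."""
    protocol, sep, rest = spec.partition(":")
    if not sep or not rest:
        raise ValueError(f"expected '<protocol>:<address>[@<topic>]', got {spec!r}")
    # Everything after the first '@' is treated as the optional topic.
    address, _, topic = rest.partition("@")
    return Source(protocol, address, topic or None)


print(parse_source("kafka:localhost:9092@http_recordings"))
```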

Add requirements.txt to proxy

As it is a separate tool in a "work in progress" state right now, I don't think we have to include tornado and other dependencies of the proxy in the setup.py of the meeshkan package. Should we add a requirements.txt in the tools/meeshkan_proxy directory?

Exception in schema builder

Exception in callback functools.partial(<function wrap.<locals>.null_wrapper at 0x7fbabd802c80>)
Traceback (most recent call last):
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/ioloop.py", line 758, in _run_callback
    ret = callback()
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/iostream.py", line 752, in wrapper
    return callback(*args)
  File "/home/nikolay/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/channel.py", line 304, in on_server_close
    self.remove_channel()
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/channel.py", line 298, in remove_channel
    self.flush(check_length=False)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/channel.py", line 245, in flush
    self._proxy_calback.on_request_complete(self._request, resp)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/record/proxy.py", line 52, in on_request_complete
    self._data_callback.log(request, response)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/serve/utils/data_callback.py", line 75, in log
    self._specs[host], reqres, self._update_mode
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/builder.py", line 288, in update_openapi
    operation = update_operation(existing_operation, exchange, mode)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/builder.py", line 189, in update_operation
    response = update_response(existing_response, mode, request)
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/builder.py", line 119, in update_response
    schema=update_text_schema(v, mode, schema=response.headers.get(k, None))) for k, v in useable_headers.items()
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/builder.py", line 119, in <dictcomp>
    schema=update_text_schema(v, mode, schema=response.headers.get(k, None))) for k, v in useable_headers.items()
  File "/home/nikolay/projects/meeshkan/meeshkan/meeshkan/build/media_types.py", line 30, in update_text_schema
    oneOf=[specific, *([] if schema is None else [schema] if schema.oneOf is None else schema.oneOf)]
AttributeError: 'Header' object has no attribute 'oneOf'

Fix Python 3.8.1

Adapt schemabuilder to new http_types naming

Rename req/res to request/response

Ugly error on mock command when browser requests /favicon.ico

@fornwall pointed out that when you run meeshkan mock and access an API in the browser, the browser requests /favicon.ico. Because that is not an API endpoint, the following ugly stack trace is printed. Apart from being ugly, it may also confuse people when they encounter it.

Uncaught exception GET /favicon.ico (::1)
HTTPServerRequest(protocol='http', host='localhost:8000', method='GET', uri='/favicon.ico', version='HTTP/1.1', remote_ip='::1')
Traceback (most recent call last):
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/tornado/web.py", line 1590, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/Users/fornwall/src/meeshkan/meeshkan/meeshkan/serve/mock/views.py", line 34, in get
    self._serve()
  File "/Users/fornwall/src/meeshkan/meeshkan/meeshkan/serve/mock/views.py", line 56, in _serve
    route_info = self.router.route(self.request.path, headers)
  File "/Users/fornwall/src/meeshkan/meeshkan/meeshkan/serve/utils/routing.py", line 36, in route
    url = urllib.parse.urlsplit("{}//{}".format(splits[0], splits[1]))
IndexError: list index out of range
500 GET /favicon.ico (::1) 4.14ms
Uncaught exception GET /favicon.ico (::1)
HTTPServerRequest(protocol='http', host='localhost:8000', method='GET', uri='/favicon.ico', version='HTTP/1.1', remote_ip='::1')
Traceback (most recent call last):
  File "/Users/fornwall/src/meeshkan/meeshkan/.venv/lib/python3.8/site-packages/tornado/web.py", line 1590, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/Users/fornwall/src/meeshkan/meeshkan/meeshkan/serve/mock/views.py", line 34, in get
    self._serve()
  File "/Users/fornwall/src/meeshkan/meeshkan/meeshkan/serve/mock/views.py", line 56, in _serve
    route_info = self.router.route(self.request.path, headers)
  File "/Users/fornwall/src/meeshkan/meeshkan/meeshkan/serve/utils/routing.py", line 36, in route
    url = urllib.parse.urlsplit("{}//{}".format(splits[0], splits[1]))
IndexError: list index out of range
500 GET /favicon.ico (::1) 0.73ms

Suggestion from @fornwall: In general we should probably only log a single line for something not found, like 500 GET /favicon.ico (::1) 0.73ms
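One way to get there, sketched with hypothetical helpers rather than the real handler code: catch lookup failures in the routing step so the handler can answer without a traceback, and format the single access line separately. Which exceptions to catch and which status code to return are assumptions.

```python
from typing import Any, Callable, Optional


def route_or_none(route: Callable[[str], Any], path: str) -> Optional[Any]:
    """Call the router, returning None instead of raising when nothing matches.

    Assumption: an unmatched path shows up as IndexError or KeyError.
    """
    try:
        return route(path)
    except (IndexError, KeyError):
        return None


def access_line(status: int, method: str, path: str,
                remote_ip: str, elapsed_ms: float) -> str:
    """Format the one-line summary suggested above."""
    return "{} {} {} ({}) {:.2f}ms".format(status, method, path, remote_ip, elapsed_ms)
```

A handler could then check for `None`, send an error response, and log only `access_line(...)` for the request.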

Add the ability to consume pcap files

Requires tshark to be available in PATH
Probably requires streaming the CSV produced by tshark to avoid reading everything into memory, as these files can get huge.
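A streaming reader could be sketched as below. The function names and the choice of fields are assumptions for illustration; the tshark options used (`-r`, `-Y`, `-T fields`, `-e`, `-E separator=`, `-E quote=`) are standard ones, but the exact fields HMT would need are not specified here.

```python
import csv
import shutil
import subprocess
from typing import Dict, Iterable, Iterator

# Assumption: these four fields are enough for illustration.
FIELDS = ["http.request.method", "http.host", "http.request.uri", "http.response.code"]


def iter_rows(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Lazily turn CSV lines into dicts, skipping rows with the wrong width."""
    for row in csv.reader(lines):
        if len(row) == len(FIELDS):
            yield dict(zip(FIELDS, row))


def stream_pcap(pcap_path: str) -> Iterator[Dict[str, str]]:
    """Run tshark on a pcap file and yield rows as they are produced."""
    if shutil.which("tshark") is None:
        raise RuntimeError("tshark must be available in PATH")
    cmd = ["tshark", "-r", pcap_path, "-Y", "http",
           "-T", "fields", "-E", "separator=,", "-E", "quote=d"]
    for field in FIELDS:
        cmd += ["-e", field]
    # Reading stdout line by line keeps memory use independent of file size.
    with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          universal_newlines=True) as proc:
        yield from iter_rows(proc.stdout)
```

Because both functions are generators, a caller can process one row at a time, for example `for row in stream_pcap("capture.pcap"): ...`.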

Recording fails with a proxy

When running the following command:

curl --proxy http://localhost:8000 http://time.jsontest.com

I got the following crash in version 0.2.6:

OSError: [WinError 10022] An invalid argument was supplied
2020-02-18 17:12:43,346 - meeshkan.server.proxy.channel - DEBUG - [('::1', 58277, 0, 0)] Trying to flush incomplete data, request "{'method': 'get', 'host': 'time.jsontest.com', 'path': '/', 'pathname': '/', 'protocol': 'http', 'query': {}, 'body': '', 'bodyAsJson': {}, 'headers': {'Host': 'time.jsontest.com', 'User-Agent': 'curl/7.66.0', 'Accept': '*/*', 'Proxy-Connection': 'Keep-Alive'}}", response chunks 0
2020-02-18 17:12:43,356 - tornado.application - ERROR - Uncaught exception, closing connection.
Traceback (most recent call last):
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\tornado\iostream.py", line 752, in wrapper
    return callback(*args)
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\tornado\stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\meeshkan\server\proxy\channel.py", line 234, in on_server_connect
    self._server_stream.on_connect(self.on_server_read)
AttributeError: 'NoneType' object has no attribute 'on_connect'
2020-02-18 17:12:43,360 - tornado.application - ERROR - Exception in callback functools.partial(<function wrap.<locals>.null_wrapper at 0x000001AC9095DD08>)
Traceback (most recent call last):
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\tornado\ioloop.py", line 758, in _run_callback
    ret = callback()
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\tornado\stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\tornado\iostream.py", line 752, in wrapper
    return callback(*args)
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\tornado\stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "c:\users\mikesolomon\devel\meeshkan-test\.venv\lib\site-packages\meeshkan\server\proxy\channel.py", line 234, in on_server_connect
    self._server_stream.on_connect(self.on_server_read)

Change default specs folder in proxy

Improve on our elevator pitch (one sentence summary)

We should replace "Meeshkan CLI" below with a good sentence.

Related: Should we focus more on the mocking part, and downplay the recording and schema parts?

$ meeshkan --help
Usage: meeshkan [OPTIONS] COMMAND [ARGS]...

  Meeshkan CLI.

Meeshkan proxy produces incorrect schemas

meeshkan record produces incorrect schemas:

!!python/object:openapi_typed_2.openapi.OpenAPIObject
_x: null
components: null
externalDocs: null
info: !!python/object:openapi_typed_2.openapi.Info {_license: null, _x: null, contact: null,
  description: API description, termsOfService: null, title: API title, version: '1.0'}
openapi: 3.0.0
paths:
  /accounts/v3/accounts: !!python/object:openapi_typed_2.openapi.PathItem
    _ref: null
    _x: null
    delete: null
    description: Path description
    get: !!python/object:openapi_typed_2.openapi.Operation
      _x: null
      callbacks: null
      deprecated: null
      description: Operation description
      externalDocs: null
      operationId: id
      parameters:
      - !!python/object:openapi_typed_2.openapi.Parameter
        _in: header
        _x: null
        allowEmptyValue: null

Should meeshkan record have --specs-dir and --mode options?

meeshkan record --help
Usage: meeshkan record [OPTIONS] COMMAND [ARGS]...

  Record HTTP traffic from a reverse proxy.

Options:
  -p, --port TEXT                Server port.
  -a, --admin-port TEXT          Admin server port.
  -s, --specs-dir TEXT           Directory with OpenAPI specifications.
  -d, --daemon                   Run meeshkan as a daemon.
  -r, --header-routing           Use headers to specify target hosts.
  -l, --log-dir TEXT             API calls logs direcotry
  -m, --mode [GEN|REPLAY|MIXED]  Spec building mode.
  --help                         Show this message and exit.

Typechecking fails with pyright >= 1.1.14

An internal error occurred while while performing type checking for /Users/kimmo/git/meeshkan/meeshkan/meeshkan/schemabuilder/builder.py: RangeError: Maximum call stack size exceeded
    at Object.isSameGenericClass (webpack:///./src/analyzer/types.ts?:177:32)

Once this is fixed, one can remove the fixed pyright version in .circleci/config.yml.

Fill in server field in schemabuilder

Path from server URL is not included when matching a request to a path

Similar to but slightly different from #13: If the server URL contains a path (such as https://petstore.swagger.io/v1), the path part (/v1) should be stripped away when matching requests to paths (i.e., the request path /v1/pets should match /pets).
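The stripping step might be sketched like this. `strip_server_path` is a hypothetical helper, not the existing matcher; it only removes the prefix when it ends on a path segment boundary.

```python
from urllib.parse import urlsplit


def strip_server_path(server_url: str, request_path: str) -> str:
    """Remove the server URL's path prefix from a request path, if present."""
    base = urlsplit(server_url).path.rstrip("/")
    if base and (request_path == base or request_path.startswith(base + "/")):
        # Keep a leading slash even when the request hits the base path itself.
        return request_path[len(base):] or "/"
    return request_path
```

With the server URL https://petstore.swagger.io/v1, the request path /v1/pets becomes /pets, while /v10/pets is left unchanged because /v1 is not a whole segment of it.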
