aioftp's Introduction

aioftp

ftp client/server for asyncio (http://aioftp.readthedocs.org)

Features

  • Simple.
  • Extensible.
  • Client socks proxy via siosocks (pip install aioftp[socks]).

Goals

  • Minimum usable core.
  • Do not use deprecated or overridden commands and features (if possible).
  • Very high level api.

The client uses these commands: USER, PASS, ACCT, PWD, CWD, CDUP, MKD, RMD, MLSD, MLST, RNFR, RNTO, DELE, STOR, APPE, RETR, TYPE, PASV, ABOR, QUIT, REST, LIST (as fallback)

The server supports these commands: USER, PASS, QUIT, PWD, CWD, CDUP, MKD, RMD, MLSD, LIST (not recommended, because it has no standard format), MLST, RNFR, RNTO, DELE, STOR, RETR, TYPE ("I" and "A"), PASV, ABOR, APPE, REST

These subsets are enough for 99% of tasks, but if you need something else, you can easily extend the current set of commands.

Server benchmark

Compared with pyftpdlib and checked with its ftpbench script.

aioftp 0.8.0

STOR (client -> server)                              284.95 MB/sec
RETR (server -> client)                              408.44 MB/sec
200 concurrent clients (connect, login)                0.18 secs
STOR (1 file with 200 idle clients)                  287.52 MB/sec
RETR (1 file with 200 idle clients)                  382.05 MB/sec
200 concurrent clients (RETR 10.0M file)              13.33 secs
200 concurrent clients (STOR 10.0M file)              12.56 secs
200 concurrent clients (QUIT)                          0.03 secs

aioftp 0.21.4 (python 3.11.2)

STOR (client -> server)                              280.17 MB/sec
RETR (server -> client)                              399.23 MB/sec
200 concurrent clients (connect, login)                0.22 secs
STOR (1 file with 200 idle clients)                  248.46 MB/sec
RETR (1 file with 200 idle clients)                  362.43 MB/sec
200 concurrent clients (RETR 10.0M file)               5.41 secs
200 concurrent clients (STOR 10.0M file)               2.04 secs
200 concurrent clients (QUIT)                          0.04 secs

pyftpdlib 1.5.2

STOR (client -> server)                             1235.56 MB/sec
RETR (server -> client)                             3960.21 MB/sec
200 concurrent clients (connect, login)                0.06 secs
STOR (1 file with 200 idle clients)                 1208.58 MB/sec
RETR (1 file with 200 idle clients)                 3496.03 MB/sec
200 concurrent clients (RETR 10.0M file)               0.55 secs
200 concurrent clients (STOR 10.0M file)               1.46 secs
200 concurrent clients (QUIT)                          0.02 secs

Dependencies

  • Python 3.8+

0.13.0 is the last version which supports python 3.5.3+

0.16.1 is the last version which supports python 3.6+

0.21.4 is the last version which supports python 3.7+

License

aioftp is offered under the Apache 2 license.

Library installation

pip install aioftp

Getting started

Client example

WARNING

For all commands that use some sort of «stats» or «listing», aioftp first tries the MLSx family of commands (since they have a structured, machine-readable format on all platforms). But old/lazy/nasty servers do not implement these commands. In this case aioftp falls back to the LIST command, which has no standard format and cannot be parsed in all cases. Take a look at the FileZilla «directory listing» parser code. So, before creating a new issue, be sure this is not your case (you can check it with logs). Anyway, you can provide your own LIST parser routine (see the client documentation).

import asyncio
import aioftp


async def get_mp3(host, port, login, password):
    async with aioftp.Client.context(host, port, login, password) as client:
        for path, info in (await client.list(recursive=True)):
            if info["type"] == "file" and path.suffix == ".mp3":
                await client.download(path)


async def main():
    tasks = [
        asyncio.create_task(get_mp3("server1.com", 21, "login", "password")),
        asyncio.create_task(get_mp3("server2.com", 21, "login", "password")),
        asyncio.create_task(get_mp3("server3.com", 21, "login", "password")),
    ]
    await asyncio.wait(tasks)

asyncio.run(main())

Server example

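import asyncio
import aioftp


async def main():
    # Placeholder credentials; replace them with your own.
    user = aioftp.User("login", "password")
    # aioftp.PathIO is used here as the path io layer; any other
    # path io factory can be passed instead.
    server = aioftp.Server([user], path_io_factory=aioftp.PathIO)
    await server.run()

asyncio.run(main())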

Or just use simple server

python -m aioftp --help

aioftp's People

Contributors

amdmi3, asvetlov, bachya, ch3pjw, crafterkolyan, decaz, greut, imgbotapp, jacobtomlinson, janneronkko, jkr78, jw4js, michalc, modelmat, ndhansen, olegt0rr, oleksandr-kuzmenko, pohmelie, ponypc, ported-pw, puddly, thirtyseven, webknjaz, yieyu

aioftp's Issues

Move timeouts to pathio

What do you think about moving path timeouts to pathio? This will simplify the server and client code (since pathio will be used in the client too). Which is better:

  • use timeout per method, as keyword argument?
  • use timeout per instance, as initialization argument?

The second looks more elegant.
Also, there is definitely a place for a decorator.
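
A minimal sketch of the per-instance variant combined with a decorator. The names `with_timeout` and `TimedPathIO` are hypothetical and only illustrate the idea; they are not part of aioftp.

```python
import asyncio
import functools


def with_timeout(method):
    """Apply the instance-wide ``self.timeout`` to an async method."""
    @functools.wraps(method)
    async def wrapper(self, *args, **kwargs):
        # asyncio.wait_for treats timeout=None as "wait without a limit"
        return await asyncio.wait_for(method(self, *args, **kwargs), self.timeout)
    return wrapper


class TimedPathIO:
    """Hypothetical path io layer; only the timeout handling is shown."""

    def __init__(self, timeout=None):
        self.timeout = timeout

    @with_timeout
    async def slow_operation(self, delay):
        # Stand-in for a real file system call
        await asyncio.sleep(delay)
        return "done"
```

With this shape the server and client never deal with path timeouts themselves; a per-method keyword argument could still be layered on top if it turns out to be needed.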

Avoid possible races in acquire/release

A semaphore-like object counts connections and rejects new connections after some count. It is acquired in the "greeting" and "user" functions and released in the "finally" part of "dispatcher". But I think there is a possibility that an exception will be raised between setting connection.user/connection.acquired and the real acquire() call. In this case the semaphore-like object will be released without having been acquired, and the counter will be broken. So acquire should be atomic, but how to implement this? 😕
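
One possible shape, sketched with a plain `asyncio.Semaphore` and a hypothetical `ConnectionSlot` helper (neither is taken from the aioftp code): the "held" flag is set only after `acquire()` has returned, with no `await` in between, and `release()` is a no-op unless the flag is set.

```python
import asyncio


class ConnectionSlot:
    """Hypothetical helper: remembers whether this connection holds a slot."""

    def __init__(self, semaphore):
        self._semaphore = semaphore
        self._held = False

    async def acquire(self):
        await self._semaphore.acquire()
        # No await between the acquire above and this assignment, so no
        # other coroutine can run while the state is half updated.
        self._held = True

    def release(self):
        # Safe to call from a ``finally`` block even if acquire() never ran.
        if self._held:
            self._held = False
            self._semaphore.release()
```

Because the flag lives next to the semaphore call, an exception raised before `acquire()` completes leaves `_held` as `False`, and the `finally` block cannot inflate the counter.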

Client fail on non-utf-8 server response

aioftp.__version__
'0.7.0'

Traceback (most recent call last):
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "list_ftp.py", line 31, in get_images
    for path, info in (await client.list(list_path, recursive=True)):
  File "/usr/local/lib/python3.5/dist-packages/aioftp/common.py", line 135, in _to_list
    async for item in self:
  File "/usr/local/lib/python3.5/dist-packages/aioftp/client.py", line 752, in __anext__
    name, info = cls.parse_line(line)
  File "/usr/local/lib/python3.5/dist-packages/aioftp/client.py", line 509, in parse_mlsx_line
    s = bytes.decode(b, encoding="utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 112: invalid start byte
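
A small sketch of one way to avoid the crash, assuming it is acceptable to keep undecodable bytes as escape characters rather than reject the line. `decode_listing_line` is a hypothetical helper, not an aioftp function.

```python
def decode_listing_line(raw, encoding="utf-8"):
    """Decode a server line without raising on invalid bytes.

    ``surrogateescape`` maps each undecodable byte to a lone surrogate, so
    the original bytes can be restored later with
    ``text.encode(encoding, errors="surrogateescape")``.
    """
    return raw.decode(encoding, errors="surrogateescape")
```

The round trip matters for FTP: the name can be sent back to the server byte for byte even when the client does not know the server's real encoding.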

storing traceback to user

What about sending the traceback to the user instead of just failing and disconnecting? Something like:

4xx-python error
4xx-in file...
4xx-line this
4xx-foo bar
4xx done
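
The reply above could be produced along these lines. This is only a sketch: `format_error_reply` is a hypothetical name, and the default code 451 is an assumption standing in for the "4xx" placeholder.

```python
import traceback


def format_error_reply(exc, code="451"):
    """Render an exception as a multi-line FTP reply."""
    lines = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    ).splitlines()
    # Every line but the last starts with "code-"; the last uses "code "
    body = [f"{code}-{line}" for line in lines]
    body.append(f"{code} done")
    return "\r\n".join(body) + "\r\n"
```

Sending full tracebacks to clients can expose file paths and other internals, so a real server would probably make this optional.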

Using semaphore for connection count handling

Problem:

  1. Semaphore requires a loop.
  2. We don't want to pass a loop argument to User, to keep it as static as possible.
  3. The Semaphore class doesn't provide an "unlimited mode".

What to do:

  1. Leave everything as is (but that's not good, since User is mutable (available_connections)).
  2. Subclass Semaphore to deal with a None initial value for the counter.
  3. Something else?
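
As an illustration of the "something else" option, here is a sketch of a loop-free counter where `None` means unlimited. `ConnectionCounter` is a hypothetical class; it rejects instead of waiting, so it needs no event loop at all.

```python
class ConnectionCounter:
    """Hypothetical counter: ``limit=None`` means no upper bound."""

    def __init__(self, limit=None):
        self.limit = limit
        self.value = 0

    def try_acquire(self):
        """Take a slot without blocking; return False when the limit is hit."""
        if self.limit is not None and self.value >= self.limit:
            return False
        self.value += 1
        return True

    def release(self):
        if self.value == 0:
            raise ValueError("release() without a matching acquire")
        self.value -= 1
```

Since the server only needs to reject connections above the limit rather than queue them, a non-blocking counter like this may be enough.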

Throttle memory size

Throttle recently changed behaviour and now stores stats (size and time) for the past N operations. The storage is a bounded deque (default size is 100). Should the user be able to change this at initialization? Is it OK to just use 100? Is there a better value for the deque length?
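
For reference, a sketch of what a configurable history could look like. `TransferStats` and its `maxlen` parameter are hypothetical and do not mirror the actual Throttle class.

```python
import collections


class TransferStats:
    """Hypothetical bounded history of (size, seconds) samples."""

    def __init__(self, maxlen=100):
        # deque drops the oldest sample automatically once maxlen is reached
        self.samples = collections.deque(maxlen=maxlen)

    def add(self, size, seconds):
        self.samples.append((size, seconds))

    def average_speed(self):
        """Bytes per second over the remembered window, or None if empty."""
        total_time = sum(seconds for _, seconds in self.samples)
        if total_time == 0:
            return None
        return sum(size for size, _ in self.samples) / total_time
```

A shorter window reacts faster to speed changes, a longer one smooths bursts; exposing `maxlen` lets the user pick that trade-off.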

Client/ClientSession Download

I first want to thank you for putting this library together. I tried to use the higher level download, but I received this error.

aioftp.errors.StatusCodeError: Waiting for ('2xx',) but got 500 [" 'MLST my_path: command not understood."]

The FTP server I'm trying to connect to doesn't support MLST, and download uses an is_file method, which uses MLST. Would you be open to changing the code so download just assumes the developer knows what they are doing and is passing a correct source file path? This would eliminate the error and the reliance on MLST for downloading a file.

MLSD not implemented

I tried your example with a common FTP server app on my phone, and I kept getting 502 for the MLSD command. I'm a newbie in all this, but after some research I found that MLSD is relatively new and that clients commonly fall back to LIST. I've tested LIST and it does work. Would it be possible to have it as a fallback? I'm happy to fork and make a pull request if you think it makes sense.

throttle tests failing

If for some reason the network or pathio is slow, the tests fail (because "magic numbers" are used as the expected timing):

python tests/test-throttle.py 
INFO:aioftp:aioftp server: serving on 0.0.0.0:8888
INFO:aioftp:aioftp server: new connection from 127.0.0.1:36665
INFO:aioftp:aioftp server: 220 welcome
INFO:aioftp:aioftp client: 220 welcome
INFO:aioftp:aioftp client: USER anonymous
INFO:aioftp:aioftp server: USER anonymous
INFO:aioftp:aioftp server: 230 anonymous login
INFO:aioftp:aioftp client: 230 anonymous login
INFO:aioftp:aioftp client: TYPE I
INFO:aioftp:aioftp server: TYPE I
INFO:aioftp:aioftp server: 200 
INFO:aioftp:aioftp client: 200
INFO:aioftp:aioftp client: PASV
INFO:aioftp:aioftp server: PASV
INFO:aioftp:aioftp server: 227-listen socket created
INFO:aioftp:aioftp server: 227 (0,0,0,0,203,221)
INFO:aioftp:aioftp client: 227-listen socket created
INFO:aioftp:aioftp client: 227 (0,0,0,0,203,221)
INFO:aioftp:aioftp client: STOR tests/foo/foo.txt
INFO:aioftp:aioftp server: STOR tests/foo/foo.txt
INFO:aioftp:aioftp server: 150 data transfer started
INFO:aioftp:aioftp client: 150 data transfer started
INFO:aioftp:aioftp server: 226 data transfer done
INFO:aioftp:aioftp client: 226 data transfer done
INFO:aioftp:aioftp server: closing connection from 127.0.0.1:36665
INFO:aioftp:aioftp server: serving on 0.0.0.0:8888
INFO:aioftp:aioftp server: new connection from 127.0.0.1:36667
INFO:aioftp:aioftp server: 220 welcome
INFO:aioftp:aioftp client: 220 welcome
INFO:aioftp:aioftp client: USER anonymous
INFO:aioftp:aioftp server: USER anonymous
INFO:aioftp:aioftp server: 230 anonymous login
INFO:aioftp:aioftp client: 230 anonymous login
INFO:aioftp:aioftp client: TYPE I
INFO:aioftp:aioftp server: TYPE I
INFO:aioftp:aioftp server: 200 
INFO:aioftp:aioftp client: 200
INFO:aioftp:aioftp client: PASV
INFO:aioftp:aioftp server: PASV
INFO:aioftp:aioftp server: 227-listen socket created
INFO:aioftp:aioftp server: 227 (0,0,0,0,190,67)
INFO:aioftp:aioftp client: 227-listen socket created
INFO:aioftp:aioftp client: 227 (0,0,0,0,190,67)
INFO:aioftp:aioftp client: STOR tests/foo/foo.txt
INFO:aioftp:aioftp server: STOR tests/foo/foo.txt
INFO:aioftp:aioftp server: 150 data transfer started
INFO:aioftp:aioftp client: 150 data transfer started
INFO:aioftp:aioftp server: 226 data transfer done
INFO:aioftp:aioftp client: 226 data transfer done
INFO:aioftp:aioftp server: closing connection from 127.0.0.1:36667
Traceback (most recent call last):
  File "/home/roman/workspace/aioftp/tests/common.py", line 121, in wrapper
    yield from f(*args, tmp_dir=tmp_dir)
  File "tests/test-throttle.py", line 137, in test_client_write_throttle
    nose.tools.ok_(2.5 < (time.perf_counter() - start) < 3.5)
  File "/home/roman/workspace/aioftp/env/lib/python3.5/site-packages/nose/tools/trivial.py", line 22, in ok_
    raise AssertionError(msg)
AssertionError: None

IMO testing throttle this way is not fully correct.

PS. also, why don't you use loop.time() instead of time.perf_counter()?

ftplib api

What do you think about adding ftplib like api? (https://github.com/python/cpython/blob/master/Lib/ftplib.py)

So basically, create a separate class or client with the same API as ftplib, but with await or yield from added. People familiar with the sync version of ftplib will pick up the new async API easily. Such an approach will also simplify porting apps from sync to async.

Porting ftplib should be straightforward. I can help with review and maybe code.

RuntimeError: readuntil() called while another coroutine is already waiting for incoming data

Here's MWE:

import asyncio
import aioftp
client = aioftp.Client()
async def main():
    await client.connect("ftp.bom.gov.au", 21) # doesn't matter here
    await client.login()
    await asyncio.wait([client.download("google98f6c7a1f42b62d2.html") for _ in range(10)])

asyncio.get_event_loop().run_until_complete(main())

Here's the full traceback:

Traceback (most recent call last):
  File "D:\Program Files\Python3\lib\site-packages\aioftp\client.py", line 921, in download
    if await self.is_file(source):
  File "D:\Program Files\Python3\lib\site-packages\aioftp\client.py", line 708, in is_file
    info = await self.stat(path)
  File "D:\Program Files\Python3\lib\site-packages\aioftp\client.py", line 680, in stat
    code, info = await self.command("MLST " + str(path), "2xx")
  File "D:\Program Files\Python3\lib\site-packages\aioftp\client.py", line 244, in command
    code, info = await self.parse_response()
  File "D:\Program Files\Python3\lib\site-packages\aioftp\client.py", line 184, in parse_response
    code, rest = await self.parse_line()
  File "D:\Program Files\Python3\lib\site-packages\aioftp\client.py", line 164, in parse_line
    line = await self.stream.readline()
  File "D:\Program Files\Python3\lib\site-packages\aioftp\common.py", line 537, in readline
    data = await super().readline()
  File "D:\Program Files\Python3\lib\asyncio\tasks.py", line 339, in wait_for
    return (yield from fut)
  File "D:\Program Files\Python3\lib\site-packages\aioftp\common.py", line 275, in readline
    return await self.reader.readline()
  File "D:\Program Files\Python3\lib\asyncio\streams.py", line 488, in readline
    line = yield from self.readuntil(sep)
  File "D:\Program Files\Python3\lib\asyncio\streams.py", line 581, in readuntil
    yield from self._wait_for_data('readuntil')
  File "D:\Program Files\Python3\lib\asyncio\streams.py", line 452, in _wait_for_data
    'already waiting for incoming data' % func_name)
RuntimeError: readuntil() called while another coroutine is already waiting for incoming data

From what I can see, this doesn't wait for the current operation on the connection to complete, and I would rather not create multiple client sessions that are thrown away.
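
A single FTP control connection handles one command/response exchange at a time, so concurrent calls on one client have to be serialized (or spread over separate connections). The sketch below shows the serializing pattern with an `asyncio.Lock`; `SerializedClient` is a hypothetical stand-in for a real client, not aioftp code.

```python
import asyncio


class SerializedClient:
    """Hypothetical wrapper: one command/response exchange at a time."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self.active = 0
        self.max_active = 0

    async def download(self, name):
        async with self._lock:
            # Stand-in for the real request/response round trip
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            await asyncio.sleep(0)
            self.active -= 1
            return name


async def fetch_all(names):
    client = SerializedClient()
    results = await asyncio.gather(*(client.download(n) for n in names))
    return results, client.max_active
```

The lock removes the RuntimeError but also removes any parallelism; downloads that should really run in parallel need one connection each.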

list() method fails with KeyError when MLSD command returns no facts

I am trying to use aioftp as a client for an FTP server with an MLSD command that returns less information than LIST. Specifically, it returns no facts, which is technically compliant with RFC 3659. This causes the list() method to raise a KeyError, as it assumes the response will contain at least the type fact:

File "/Users/tkaplan/.local/share/virtualenvs/ad-pipes-Ecr7n0Zd/lib/python3.7/site-packages/aioftp/client.py", line 672, in __anext__
    if info["type"] == "dir" and recursive:
KeyError: 'type'

The fallback to the LIST method is not used in this case, since technically the server responds to the MLSD request. A nice workaround would be to allow the user to choose whether to use the MLSD or LIST command.
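
To make the failure mode concrete, here is a standalone sketch of parsing an MLSx entry ("facts, then a space, then the name") and checking the type without assuming the fact is present. Both function names are hypothetical; this is not the parser aioftp ships.

```python
def parse_mlsx_line(line):
    """Split an RFC 3659 entry of the form ``fact=value;... name``."""
    facts_part, _, name = line.rstrip("\r\n").partition(" ")
    info = {}
    for fact in facts_part.split(";"):
        if fact:
            key, _, value = fact.partition("=")
            info[key.lower()] = value
    return name, info


def is_dir(info):
    # .get() returns None when the server sent no "type" fact
    return info.get("type") == "dir"
```

With an entry that carries no facts, `info` is simply an empty dict and `is_dir` answers `False` instead of raising.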

KeyError: 'type' on

An unhandled KeyError can occur with a Synology FTP server (I've printed the lines for debugging):

2017-06-06 10:29:33,555 INFO ftp Trying to connect with 192.168.208.252
2017-06-06 10:29:33,631 INFO aioftp.client 220 NASDS415 FTP server ready.
2017-06-06 10:29:33,632 INFO aioftp.client USER ftpuser
2017-06-06 10:29:33,657 INFO aioftp.client 331 Password required for ftpuser.
2017-06-06 10:29:33,657 INFO aioftp.client PASS password
2017-06-06 10:29:33,955 INFO aioftp.client 230 User ftpuser logged in, access restrictions apply.
2017-06-06 10:29:33,956 INFO aioftp.client TYPE I
2017-06-06 10:29:34,180 INFO aioftp.client 200 Type set to I.
2017-06-06 10:29:34,180 INFO aioftp.client PASV
2017-06-06 10:29:34,401 INFO aioftp.client 227 Entering Passive Mode (192,168,208,252,217,7)
2017-06-06 10:29:34,425 INFO aioftp.client MLSD /
2017-06-06 10:29:34,446 INFO aioftp.client 150 Opening BINARY mode data connection for 'file list'.
b'type=file;modify=20170512000615;size=28676;UNIX.mode=0777;UNIX.owner=ftpuser;UNIX.group=users; .DS_Store\r\n'
b'type=dir;modify=20170508091520;size=1608;UNIX.mode=0000;UNIX.owner=root;UNIX.group=root; #recycle\r\n'
...
b'type=file;modify=20170606072933;size=218627167;UNIX.mode=0777;UNIX.owner=ftpuser;UNIX.group=users; results-00:d0:24:44:4d:70.txt\r\n'
b'type=file;modify=20170531094900;size=14947708;UNIX.mode=0777;UNIX.owner=ftpuser;UNIX.group=users; results-00:d0:24:44:d6:9c.txt\r\n'
2017-06-06 10:29:34,750 INFO aioftp.client 226 Transfer complete.
2017-06-06 10:29:34,751 INFO aioftp.client TYPE I
2017-06-06 10:29:34,775 INFO aioftp.client 200 Type set to I.
2017-06-06 10:29:34,775 INFO aioftp.client PASV
2017-06-06 10:29:34,796 INFO aioftp.client 227 Entering Passive Mode (192,168,208,252,217,8)
2017-06-06 10:29:34,816 INFO aioftp.client MLSD /#recycle
2017-06-06 10:29:34,835 INFO aioftp.client 150 Opening BINARY mode data connection for 'file list'.
b'ftpd: /#recycle: No such file or directory.\r\n'
2017-06-06 10:29:34,836 INFO aioftp.client QUIT
2017-06-06 10:29:34,836 INFO aioftp.client 226 Transfer complete.
Traceback (most recent call last):
  File "./bin/tools.py", line 111, in <module>
    cli(obj=dict(inventory=None))
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/user/Projects/tools/app/bin/commands.py", line 33, in do_fetch
    loop.run_until_complete(servers[0])
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 466, in run_until_complete
    return future.result()
  File "/Users/user/Projects/tools/app/bin/commands.py", line 46, in handle_download
    files = await client.list("/", recursive=True)
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/aioftp/common.py", line 135, in _to_list
    async for item in self:
  File "/Users/user/.virtualenvs/venv/lib/python3.6/site-packages/aioftp/client.py", line 754, in __anext__
    if info["type"] == "dir" and recursive:
KeyError: 'type'
(venv) user@macbook app

Lazy change directory

Don't send CWD until it is needed (list, downloading a file, get_current_directory, uploading a file, rmdir, etc.). Aggregate sequential CWDs with no other commands in between into one CWD. Would you consider it? (If yes, I will do it.)
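
A rough sketch of the aggregation idea, using a hypothetical `LazyCwd` tracker that only builds the command string and does no network I/O:

```python
from pathlib import PurePosixPath


class LazyCwd:
    """Hypothetical tracker that defers CWD until a command needs it."""

    def __init__(self, start="/"):
        self.remote = PurePosixPath(start)  # what the server believes
        self.wanted = PurePosixPath(start)  # what the caller asked for

    def change_directory(self, path):
        # Joining with an absolute path replaces the left-hand side
        self.wanted = self.wanted / path

    def flush(self):
        """Return the single CWD command to send, or None if up to date."""
        if self.wanted == self.remote:
            return None
        self.remote = self.wanted
        return f"CWD {self.wanted}"
```

One consequence of this design: an error for a missing directory shows up only at flush time, on the first command that needs the directory, not at the `change_directory` call.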

server should accept data connection after service command and not only before

Currently this scenario, which is actually correct, cannot be handled by the server:

Command:    PASV
Server: start listening to 0,0,0,0,215,187
Response:   227-listen socket created
Response:   227 (0,0,0,0,215,187)
Command:    STOR test.wav
Client: connect to 0,0,0,0,215,187
Response:   150 data transfer started
Response:   226 data transfer done

I propose disabling the ConnectionConditions.passive_connection_made check on service commands and adding the ability to wait for the data connection for some time after a service command is called (in https://github.com/pohmelie/aioftp/blob/master/aioftp/server.py#L1020 for example). If this is OK, I'll provide a patch.
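
The waiting part could look roughly like this. The sketch assumes a hypothetical accept callback that resolves a future once the client connects to the passive socket; it does not use aioftp's ConnectionConditions.

```python
import asyncio


async def wait_for_data_connection(connected, timeout=1.0):
    """Wait until the passive socket reports a peer, or give up.

    ``connected`` is a future that a (hypothetical) accept callback
    resolves with the data stream.
    """
    try:
        return await asyncio.wait_for(connected, timeout)
    except asyncio.TimeoutError:
        # The caller can then answer with an error reply instead of 150
        return None
```

A service command such as STOR would await this helper before sending "150 data transfer started", so a client that connects slightly after the command is still served.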

Human logging

As discussed, we need more elegant logging. But if we add some context via the extra argument for more flexible formatting, and the user decides to use the extended format via basicConfig, then every logging call must pass that extra.

import logging


top = logging.getLogger("top")
top.addHandler(logging.NullHandler())

real = logging.getLogger("top.yoba")


def yoba():

    real.error("FOO", extra={"foo": "bar"})
    logging.error("BAR")


if __name__ == "__main__":

    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(message)s extra['foo'] = %(foo)s",
    )
    yoba()

2015-11-06 07:09:34,871 FOO extra['foo'] = bar
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.4/logging/__init__.py", line 978, in emit
    msg = self.format(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 828, in format
    return fmt.format(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 568, in format
    s = self.formatMessage(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 537, in formatMessage
    return self._style.format(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 381, in format
    return self._fmt % record.__dict__
KeyError: 'foo'
Call stack:
  File "./prog.py", line 22, in <module>
  File "./prog.py", line 13, in yoba
Message: 'BAR'
Arguments: ()

The solution looks like this:

import logging


top = logging.getLogger("top")
top.addHandler(logging.NullHandler())

real = logging.getLogger("top.yoba")


def yoba():

    real.error("FOO", extra={"foo": "bar"})
    logging.error("BAR")


if __name__ == "__main__":

    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(message)s [base]",
    )

    top.propagate = False
    h = logging.StreamHandler()
    f = logging.Formatter(fmt="%(asctime)s %(message)s extra['foo'] = %(foo)s")
    h.setFormatter(f)
    top.addHandler(h)
    yoba()

2015-11-06 10:22:49,753 FOO extra['foo'] = bar
2015-11-06 10:22:49,754 BAR [base]

But is this friendly and simple enough for the end user? Of course, they can use basicConfig without configuring an extra handler and lose some context.
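
Another option, sketched here with a hypothetical `DefaultExtra` filter, is to fill in missing extra fields on the handler so that one format string works for every record:

```python
import logging


class DefaultExtra(logging.Filter):
    """Fill in missing ``extra`` fields so a shared format never fails."""

    def __init__(self, **defaults):
        super().__init__()
        self.defaults = defaults

    def filter(self, record):
        for key, value in self.defaults.items():
            if not hasattr(record, key):
                setattr(record, key, value)
        # Returning True keeps the record
        return True
```

A user would attach it with `handler.addFilter(DefaultExtra(foo="-"))`; records logged without `extra` then render `-` instead of raising KeyError.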

Access user object from AbstractPathIO

I'm trying to implement my own specialised FTP server by extending the AbstractUserManager and AbstractPathIO. However I have a requirement for the AbstractPathIO based class to know which user is accessing it.

I'm basically passing the provided username and password onto a third-party service and need to be able to access them within every AbstractPathIO operation.

I would be tempted to make the connection object an arg on every AbstractPathIO method and pass it through on every call. However this is a large API change, and perhaps someone can think of a better way.

Avoid logging FTP output as bytes

I recently had to open a PR to another project because of aioftp's logging strategy.

Basically, aioftp logs everything sent from the FTP server to the logger. The good thing is that it allowed me to make crossbar more robust. The bad thing is that aioftp probably should not do that: the underlying code has no way to know the encoding of the bytes and can't decode them cleanly.

It should always provide unicode strings to the logger, handling the decoding before passing them to any other part of the system. This means finding out the charset, or mitigating errors if they happen while decoding with a default charset.

On a personal note, aioftp has been very helpful to me in a professional project, and I'd like to thank you for the work you put into it. It's currently helping a heavy transport facility to track all vehicles in France.

Cheers !

Naming convention for upload/download

Right now client throttle bounds are named upload_speed_limit and download_speed_limit. To be consistent server throttle names should be the same. But upload for server is download for client. At the same time this is a «classic» naming for upload — client → server and download — server → client regardless of side. So, which naming should we use? Or something like client_to_server_speed_limit or read_speed_limit?

version 0.10.0 can't interact correctly with pyftpdlib server

using python 3.6
ftp server : pyftpdlib-1.5.3

Filezilla can connect to server and perform a list dir successfully.
aioftp 0.9.0 can connect to server and perform a list dir successfully.

Aioftp 0.10.0 client fails.

server code :

ftputil_server.py

import os

from pyftpdlib.authorizers import DummyAuthorizer
from pyftpdlib.handlers import FTPHandler
from pyftpdlib.servers import FTPServer


def main():
    # Instantiate a dummy authorizer for managing 'virtual' users
    authorizer = DummyAuthorizer()

    # Define a new user having full r/w permissions and a read-only
    # anonymous user
    authorizer.add_user('user', '12345', '.', perm='elradfmwMT')
    authorizer.add_anonymous(os.getcwd())

    # Instantiate FTP handler class
    handler = FTPHandler
    handler.authorizer = authorizer

    # Define a customized banner (string returned when client connects)
    handler.banner = "pyftpdlib based ftpd ready."

    # Specify a masquerade address and the range of ports to use for
    # passive connections.  Decomment in case you're behind a NAT.
    #handler.masquerade_address = '151.25.42.11'
    handler.passive_ports = range(60000, 65535)

    # Instantiate FTP server class and listen on 0.0.0.0:2121
    address = ('', 2121)
    server = FTPServer(address, handler)

    # set a limit for connections
    server.max_cons = 256
    server.max_cons_per_ip = 5

    # start ftp server
    server.serve_forever()


if __name__ == '__main__':
    main()

Client code:

aioftp_client.py

import asyncio
import aioftp


async def list_dir(host, port, login, password):
    async with aioftp.ClientSession(host, port, login, password) as client:
        return await client.list("/")


loop = asyncio.get_event_loop()
tasks = (
    list_dir("localhost", 2121, "user", "12345"),
)

loop.run_until_complete(asyncio.wait(tasks))
loop.close()

Error :


Task exception was never retrieved
future: <Task finished coro=<list_dir() done, defined at C:/Users/geoffroy.destaintot/Documents/Local/Informatique/Prototypes/2018-02-aioftp-bug/aioftp_client.py:5> exception=ValueError("invalid literal for int() with base 10: '61749|'",)>
Traceback (most recent call last):
  File "C:/Users/geoffroy.destaintot/Documents/Local/Informatique/Prototypes/2018-02-aioftp-bug/aioftp_client.py", line 7, in list_dir
    return await client.list("/")
  File "C:\Users\geoffroy.destaintot\Miniconda3\envs\aioftp-bug\lib\site-packages\aioftp\common.py", line 120, in _to_list
    async for item in self:
  File "C:\Users\geoffroy.destaintot\Miniconda3\envs\aioftp-bug\lib\site-packages\aioftp\client.py", line 618, in __aiter__
    cls.stream = await cls._new_stream(path)
  File "C:\Users\geoffroy.destaintot\Miniconda3\envs\aioftp-bug\lib\site-packages\aioftp\client.py", line 609, in _new_stream
    return await self.get_stream(command, "1xx")
  File "C:\Users\geoffroy.destaintot\Miniconda3\envs\aioftp-bug\lib\site-packages\aioftp\client.py", line 978, in get_stream
    reader, writer = await self.get_passive_connection(conn_type)
  File "C:\Users\geoffroy.destaintot\Miniconda3\envs\aioftp-bug\lib\site-packages\aioftp\client.py", line 944, in get_passive_connection
    ip, port = None, int(info[4:-2])
ValueError: invalid literal for int() with base 10: '61749|'

Process finished with exit code 0
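
The stray "|" in the failing value suggests the port was cut out of the reply by fixed offsets. A pattern-based parse is less sensitive to the text around the parentheses. The sketch below follows the `(<d><d><d>port<d>)` layout of an extended passive mode reply; `parse_epsv_port` is a hypothetical helper, not the aioftp implementation.

```python
import re

# Three delimiter characters, the port, then the same delimiter again
EPSV_REPLY = re.compile(
    r"\((?P<d>[!-~])(?P=d)(?P=d)(?P<port>\d{1,5})(?P=d)\)"
)


def parse_epsv_port(info):
    """Extract the port from a reply such as ``(|||61749|)``."""
    match = EPSV_REPLY.search(info)
    if match is None:
        raise ValueError(f"not an extended passive mode reply: {info!r}")
    return int(match.group("port"))
```

Because the pattern searches for the parenthesized group, trailing characters such as a final period after the closing parenthesis do not affect the result.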

client.download without logging in

Hi, thanks for developing this package! I ran into a problem when trying to download files from PDB, as it doesn't seem to require a login, and aioftp fails to download.

My test code looks like this:

import asyncio
import aioftp
import uvloop


async def download_file(session, url):
    url = url.lstrip("ftp://ftp.wwpdb.org")
    id = url[-11:-7]
    ID = id.upper()
    await session.download(url, f"{ID}.ent.gz")
    print(url)


async def unzip(sem, work_queue):
    while not work_queue.empty():
        queue_url = await work_queue.get()
        async with sem:
            async with aioftp.ClientSession("ftp.wwpdb.org") as session:
                await download_file(session, queue_url)
        

def download_queue(urls):
    loop = uvloop.new_event_loop()
    asyncio.set_event_loop(loop)
    q = asyncio.Queue()
    sem = asyncio.Semaphore(10)
    [q.put_nowait(url) for url in urls]
    tasks = [asyncio.ensure_future(unzip(sem, q)) for _ in range(len(urls))]
    loop.run_until_complete(asyncio.gather(*tasks))
    # Zero-sleep to allow underlying connections to close
    loop.run_until_complete(asyncio.sleep(0))
    loop.close()

if __name__ == "__main__":
    urls = [
        'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/nf/pdb4nfn.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/ny/pdb4nyj.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/mn/pdb2mnz.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/ra/pdb4ra4.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/x5/pdb4x5w.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/dm/pdb2dmq.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/n7/pdb2n7r.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/om/pdb2omv.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/oy/pdb3oy8.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/fe/pdb3fej.ent.gz', 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/hw/pdb2hw9.ent.gz']
    download_queue(urls)

The output:

Traceback (most recent call last):
  File "test.py", line 50, in <module>
    download_queue(urls)
  File "test.py", line 42, in download_queue
    loop.run_until_complete(asyncio.gather(*tasks))
  File "uvloop/loop.pyx", line 1364, in uvloop.loop.Loop.run_until_complete
  File "test.py", line 29, in unzip
    await download_file(session, queue_url)
  File "test.py", line 20, in download_file
    await session.download(url, f"{ID}.ent.gz")
  File "/home/yi/miniconda3/lib/python3.6/site-packages/aioftp/client.py", line 903, in download
    if await self.is_file(source):
  File "/home/yi/miniconda3/lib/python3.6/site-packages/aioftp/client.py", line 690, in is_file
    info = await self.stat(path)
  File "/home/yi/miniconda3/lib/python3.6/site-packages/aioftp/client.py", line 662, in stat
    code, info = await self.command("MLST " + str(path), "2xx")
  File "/home/yi/miniconda3/lib/python3.6/site-packages/aioftp/client.py", line 248, in command
    self.check_codes(expected_codes, code, info)
  File "/home/yi/miniconda3/lib/python3.6/site-packages/aioftp/client.py", line 214, in check_codes
    raise errors.StatusCodeError(expected_codes, received_code, info)
aioftp.errors.StatusCodeError: Waiting for ('2xx',) but got 550 [" Can't check for file existence"]

Thanks for your help!
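
One detail in the test code above may be worth checking independently of the login question: `str.lstrip` removes any leading characters from the given set, not a prefix, so it can consume part of the path. The sketch below compares it with a prefix-only helper (`strip_prefix` is a hypothetical name; Python 3.9+ offers `str.removeprefix` for the same purpose).

```python
def strip_prefix(text, prefix):
    """Remove ``prefix`` once from the start of ``text``, if present."""
    return text[len(prefix):] if text.startswith(prefix) else text


URL = "ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/nf/pdb4nfn.ent.gz"
HOST_PREFIX = "ftp://ftp.wwpdb.org"

# lstrip() treats its argument as a set of characters, so it keeps
# stripping "/" and "p" after the host name has been removed
as_character_set = URL.lstrip(HOST_PREFIX)
as_prefix = strip_prefix(URL, HOST_PREFIX)
```

Here `as_character_set` starts with `ub/pdb/...` while `as_prefix` starts with `/pub/pdb/...`, so the two calls ask the server for different paths.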

Server exceptions

This is a direct continuation of the discussion in #28.
Which exceptions should the server handle globally? Globally means in the dispatcher coroutine.
OSError is one of them. What information should the server provide to the user besides the 451 return code? As far as I know, OSError is platform dependent, and dispatching on error codes will be annoying.
Which other exceptions should the server care about?

KeyError: 'type' from ClientSession.is_file

This code:

await client.download(source=remote_path, destination=local_path, write_into=True)

Throws a KeyError at:

File "/usr/lib/python3.6/site-packages/aioftp/client.py", line 597, in is_file
    return info["type"] == "file"

However, the type keyword is visible from the client.list() function:

[(PurePosixPath('a.xml'), {'type': 'file', ...

I'm using the 0.6.1 version (default pip installation).
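
The report does not establish the cause. One thing worth ruling out is fact-name casing: RFC 3659 says MLST/MLSD fact names are case-insensitive, so a server may send "Type=file" rather than "type=file". Below is a minimal sketch of a parser that lowercases fact names; parse_mlsx_line is a hypothetical helper written for illustration, not aioftp's own code.

```python
import pathlib


def parse_mlsx_line(line):
    """Split an MLSx line into (path, facts), lowercasing the fact names."""
    # Format: "fact=value;fact=value; pathname" (facts contain no spaces).
    facts, _, name = line.lstrip().partition(" ")
    info = {}
    for fact in facts.split(";"):
        if not fact:
            continue  # the trailing ";" leaves an empty element
        key, _, value = fact.partition("=")
        info[key.lower()] = value
    return pathlib.PurePosixPath(name), info


path, info = parse_mlsx_line(" Type=file;Size=2914; fixture.csv")
print(path, info["type"])
```

With keys normalized this way, a lookup such as info["type"] no longer depends on how the server capitalizes its facts.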

readthedocs and python 3.5

The build failed because readthedocs does not support Python 3.5. Maybe publish the documentation to GitHub Pages instead?

ClientSession.download throws a PathIOError() exception

While downloading, I hit an issue: if the destination path is something like 'one/two/three/four/' and only the first one or two subdirectories exist (i.e. 'one/two'), it throws this exception.

Traceback (most recent call last):
  File "/Users/abubakrdar/anaconda/lib/python3.6/site-packages/aioftp/pathio.py", line 77, in wrapper
    return await coro(*args, **kwargs)
  File "/Users/abubakrdar/anaconda/lib/python3.6/site-packages/aioftp/pathio.py", line 349, in mkdir
    return path.mkdir(parents=parents)
  File "/Users/abubakrdar/anaconda/lib/python3.6/pathlib.py", line 1225, in mkdir
    self._accessor.mkdir(self, mode)
  File "/Users/abubakrdar/anaconda/lib/python3.6/pathlib.py", line 388, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: 'downloads/pub/CDC/observations_germany/climate/daily/solar'

It works fine when the destination path is only one or two levels deep (i.e. 'one/two'), but for longer paths it threw this exception, so I had to create the path manually before downloading files.
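
The manual workaround mentioned above can be sketched with the standard library alone. The helper name ensure_directory is hypothetical; the only real API used is pathlib.Path.mkdir with parents=True, which creates every missing intermediate directory.

```python
import pathlib
import tempfile


def ensure_directory(destination):
    """Create destination and all missing parents; do nothing if it exists."""
    destination = pathlib.Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    return destination


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    target = ensure_directory(pathlib.Path(root) / "one" / "two" / "three" / "four")
    print(target.is_dir())
```

Calling such a helper on the local destination before client.download avoids the FileNotFoundError regardless of how deep the path is.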

Add ipv6 support

Running the server example from the README unchanged and attempting to connect from FileZilla results in the following error.

Task exception was never retrieved
future: <Task finished coro=<Server.dispatcher() done, defined at /opt/boxen/homebrew/lib/python3.6/site-packages/aioftp/server.py:834> exception=ValueError('too many values to unpack (expected 2)',)>
Traceback (most recent call last):
  File "/opt/boxen/homebrew/lib/python3.6/site-packages/aioftp/server.py", line 835, in dispatcher
    host, port = writer.transport.get_extra_info("peername", ("", ""))
ValueError: too many values to unpack (expected 2)

Python version: 3.6.3
aioftp version: 0.9.0
operating system: macOS 10.13 High Sierra
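
The traceback is consistent with an IPv6 connection: for AF_INET6 sockets Python reports the peer address as a 4-tuple (host, port, flowinfo, scope_id), so unpacking it into two names fails. A minimal sketch of a tolerant unpacking follows; split_peername is a hypothetical helper, not part of aioftp.

```python
def split_peername(peername):
    """Return (host, port) from an IPv4 2-tuple or an IPv6 4-tuple."""
    # IPv6 peernames carry two extra fields (flowinfo, scope_id); ignore them.
    host, port, *_ = peername
    return host, port


print(split_peername(("127.0.0.1", 11449)))
print(split_peername(("::1", 11449, 0, 0)))
```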

"Waiting for ('200',) but got 250" when downloading a file

I start the aioftp server from the command line:

 python -m aioftp --user user --pass password

I can connect fine using FileZilla and get the "/fixture.csv" file.

When I do:

async with aioftp.ClientSession(host, port, login, pwd) as client:
     while True:
         stream = await client.download_stream(path)
         print(await stream.read())
         await asyncio.sleep(1)

I can read the file the first time, but on every subsequent loop iteration I get an 'aioftp.errors.StatusCodeError' exception stating:

 aioftp.errors.StatusCodeError: Waiting for ('200',) but got 250 ['-start', ' Type=file;Size=2914;Modify=1471460964.99208;Create=1488359466.8050766; fixture.csv', ' end']

I get the same error if I fail to read the file. This completely breaks my crawler, since the CSV refreshes every second, and I can't find a workaround.

Interpreting your server benchmarks

I'm having a hard time interpreting your server benchmarks. As I read the numbers, aioftp is a lot slower than pyftpdlib, which it is being compared to. As far as I can tell, pyftpdlib doesn't work asynchronously.

Am I interpreting the numbers correctly? If so, why is aioftp so much worse?

Tests fail on Windows

At least because of the following:

  1. Passive mode doesn't work. v0.1.7 introduced a new default host because binding to 0.0.0.0 was troublesome for PASV. However, the change was made in main.py, and setup.py test sets up the server directly without calling main. So on Windows, when running nosetests, the server binds to 0.0.0.0 and the client can't connect to the returned socket address.
  2. The client (non-POSIX) requests files at paths that the server (POSIX behaviour) doesn't understand. For example, in test-file.test_download_folder the client asks for MLST \server, which is not a POSIX path; the server looks for a POSIX path and concludes that the path doesn't exist.
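
The second point can be reproduced on any platform with pathlib's pure path classes, which do no filesystem access. The sketch below only illustrates the mismatch; mlst_command is a hypothetical helper, not the client's actual code.

```python
import pathlib


def mlst_command(path):
    """Build an MLST command line with forward slashes on any platform."""
    # as_posix() is available on every pure path flavour.
    return "MLST " + path.as_posix()


local = pathlib.PureWindowsPath("/server")
print("MLST " + str(local))  # backslash form, as in the failing test
print(mlst_command(local))   # forward-slash form a POSIX-style server expects
```

Keeping remote paths as PurePosixPath objects, independent of the local platform, is another way to avoid the conversion entirely.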

ClientSession not closed if login fails

If login fails, ClientSession leaves a connection around.

    async def __aenter__(self):

        self.client = Client(**self.kwargs)
        await self.client.connect(self.host, self.port)
        await self.client.login(self.user, self.password, self.account)
        return self.client

should probably be

    async def __aenter__(self):

        self.client = Client(**self.kwargs)
        try:
            await self.client.connect(self.host, self.port)
            await self.client.login(self.user, self.password, self.account)
        except:
            self.client.close()
            raise
        return self.client

since if __aenter__ fails, __aexit__ will not be called.
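
The claim about __aexit__ can be checked without an FTP server. The sketch below uses stand-in classes (FakeClient and Session are hypothetical, written only for this demonstration) to show that a failing __aenter__ skips __aexit__, so the cleanup has to happen inside __aenter__ itself.

```python
import asyncio


class FakeClient:
    """Stand-in for a real client; login always fails."""

    def __init__(self):
        self.closed = False

    async def connect(self):
        pass

    async def login(self):
        raise PermissionError("530 login failed")

    def close(self):
        self.closed = True


class Session:
    def __init__(self):
        self.client = FakeClient()
        self.exit_called = False

    async def __aenter__(self):
        try:
            await self.client.connect()
            await self.client.login()
        except BaseException:
            # __aexit__ will not run, so clean up here before re-raising.
            self.client.close()
            raise
        return self.client

    async def __aexit__(self, *exc_info):
        self.exit_called = True
        self.client.close()


async def demo():
    session = Session()
    try:
        async with session:
            pass
    except PermissionError:
        pass
    # (connection closed?, __aexit__ called?)
    return session.client.closed, session.exit_called


print(asyncio.run(demo()))  # → (True, False)
```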

Secure FTP

After #36 I read about FTPS, SFTP and FTP over SSH.

  • SFTP is an SSH extension and we should ignore it

  • FTPS is an extension of FTP (https://tools.ietf.org/html/rfc2228)

  • FTP over SSH is (as I understand it) just tunneling and has problems(?) with data connections

  • implicit ftps mode (#81)

  • explicit ftps mode

Exception during upload_stream causes hang

This hangs:

from aioftp import ClientSession
import asyncio

async def main():
    async with ClientSession("localhost") as client:
        async with client.upload_stream("/foo/bar") as stream:
            raise RuntimeError()

asyncio.get_event_loop().run_until_complete(main())

This does not:

from aioftp import ClientSession
import asyncio

async def main():
    async with ClientSession("localhost") as client:
        async with client.upload_stream("/foo/bar") as stream:
            try:
                raise RuntimeError()
            finally:
                stream.writer.write_eof()

asyncio.get_event_loop().run_until_complete(main())

Apparently, without the write_eof, the data connection doesn't get closed.

In ClientSession.__aexit__ aioftp sends a QUIT but there's no reaction from the server until the data connection has been closed.

My guess is you should add the write_eof to DataConnectionThrottleStreamIO.__aexit__.

Tested against a local vsftpd.

Period in EPSV response causes exception

I am running into a similar problem to #69 where the server is returning a response ending with a "." that the client does not expect.

Using the latest version checked out of github, the simple test case of

async def run_test(host, port, login, password):
    async with aioftp.ClientSession(host, port, login, password) as client:
        await client.get_passive_connection()

creates the error ValueError: invalid literal for int() with base 10: '57217|'

I can confirm that this is a problem with a "." on the end by adding a print statement on line 961:

            # format: some message (|||port|)
            print(info)
            *_, info = info[-1].split()
            ip, port = None, int(info[4:-2])

[' Entering Extended Passive Mode (|||55901|).']

Adding a simple .rstrip(".") on line 962 solves the problem, but I presume you want to do something more complicated than that. If not, let me know and I will happily submit a pull request.
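
An alternative to stripping the period is to locate the parenthesized part of the reply and ignore whatever surrounds it. RFC 2428 defines the EPSV reply payload as (<d><d><d>port<d>), where <d> is a delimiter character, usually "|". The sketch below is one way to do that; parse_epsv_port and EPSV_PORT are hypothetical names, not aioftp's code.

```python
import re

# "(<d><d><d>port<d>)": the same delimiter three times, the port, then once more.
EPSV_PORT = re.compile(r"\((?P<d>[!-~])(?P=d)(?P=d)(?P<port>\d{1,5})(?P=d)\)")


def parse_epsv_port(info_line):
    """Extract the port from an EPSV reply, ignoring surrounding text."""
    match = EPSV_PORT.search(info_line)
    if match is None:
        raise ValueError(f"no EPSV port found in {info_line!r}")
    return int(match.group("port"))


print(parse_epsv_port(" Entering Extended Passive Mode (|||55901|)."))
```

Because the pattern is searched rather than matched against the whole line, trailing punctuation or differently worded messages do not affect the result.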

server 502 type 'A' not implemented

python3 -m aioftp --user test --pass test -d "/home/" --port 21

[21:20:28]: [aioftp.server] new connection from 192.168.137.1:11449
[21:20:28]: [aioftp.server] 220 welcome
[21:20:28]: [aioftp.server] USER test
[21:20:28]: [aioftp.server] 331 password required
[21:20:29]: [aioftp.server] PASS test
[21:20:29]: [aioftp.server] 230 normal login
[21:20:29]: [aioftp.server] SYST
[21:20:29]: [aioftp.server] 215 UNIX Type: L8
[21:20:29]: [aioftp.server] FEAT
[21:20:29]: [aioftp.server] 502 'feat' not implemented
[21:20:29]: [aioftp.server] PWD
[21:20:29]: [aioftp.server] 257 "/"
[21:20:29]: [aioftp.server] CWD /
[21:20:29]: [aioftp.server] 250
[21:20:29]: [aioftp.server] PWD
[21:20:29]: [aioftp.server] 257 "/"
[21:20:30]: [aioftp.server] TYPE A
[21:20:30]: [aioftp.server] 502 type 'A' not implemented
[21:20:30]: [aioftp.server] TYPE A
[21:20:30]: [aioftp.server] 502 type 'A' not implemented
[21:20:59]: [aioftp.server] TYPE I
[21:20:59]: [aioftp.server] 200

User manager and ftp codes

It seems the user manager should not deal with FTP codes and messages. They should be separated out, with a special enum or something similar.

ABOR(T) races

I think there can be races between "workers" and the ABOR command. Flow:

  1. Worker is running.
    === schedule ===
  2. Worker stores a positive (2xx) result in the response queue.
  3. Worker is done.
    === schedule ===
  4. The abor function is called while the worker is still in extra_workers.
  5. abor stores 4xx and 2xx in the queue.

What the client sees:

2xx - ok done
4xx - aborted
2xx - aborted successful

I think the FTP protocol was designed without this in mind. This is not just a server issue but a client issue as well.
How do we abort a transmission properly?

socket.send() raised exception

Sometimes, when trying to write to a closed socket, no ConnectionResetError is raised; instead the message "socket.send() raised exception" is printed to the console and the program continues as if nothing happened. I don't understand the nature of this issue, but I have seen the same problem in Stack Overflow questions. Here is code to reproduce it; it reproduces on 3.4.2, 3.4.3 and 3.5.0 on my Ubuntu 14.04:

import asyncio
import time


host, port = "127.0.0.1", 8888
data_size = 8192


@asyncio.coroutine
def write_worker():

    reader, writer = yield from asyncio.open_connection(host, port)
    while True:

        writer.write(b"-" * data_size)
        yield from writer.drain()


def read_worker(reader, writer):

    start = time.perf_counter()
    while time.perf_counter() - start < 0.5:

        data = yield from reader.read(data_size)

    writer.close()


if __name__ == "__main__":

    try:

        loop = asyncio.get_event_loop()
        coro = asyncio.start_server(read_worker, host, port)
        server = loop.run_until_complete(coro)
        loop.run_until_complete(write_worker())

    except KeyboardInterrupt:

        server.close()

    print("done")

If we put a bare yield between writer.write(b"-" * data_size) and yield from writer.drain(), the problem goes away, but a bare yield can't be used in 3.5.0 coroutines.
What to do:

  • Leave as is (:-1:)
  • Use yield from asyncio.sleep(0)
  • Something else?

KeyError: 'ot'

Hi there – I'm attempting to connect to a local FTP server on an IP camera and am getting the following stacktrace:

Task exception was never retrieved
future: <Task finished coro=<test() done, defined at test.py:13> exception=KeyError('ot',)>
Traceback (most recent call last):
    File "test.py", line 18, in test
      await client.list(path),
    File "/Users/abach/.local/share/virtualenvs/Downloads-ZJLnFE2W/lib/python3.6/site-packages/aioftp/common.py", line 120, in _to_list
      async for item in self:
    File "/Users/abach/.local/share/virtualenvs/Downloads-ZJLnFE2W/lib/python3.6/site-packages/aioftp/client.py", line 633, in __anext__
      name, info = cls.parse_line(line)
    File "/Users/abach/.local/share/virtualenvs/Downloads-ZJLnFE2W/lib/python3.6/site-packages/aioftp/client.py", line 378, in parse_list_line
      info["unix.mode"] = self.parse_unix_mode(s[1:10])
    File "/Users/abach/.local/share/virtualenvs/Downloads-ZJLnFE2W/lib/python3.6/site-packages/aioftp/client.py", line 308, in parse_unix_mode
      mode |= parse_rw[s[0:2]] << 6
KeyError: 'ot'

Here's the dummy app I built – seems like this stacktrace occurs with client.list():

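import asyncio

import aioftp


async def test(path):
    # MY_HOST, MY_USER and MY_PATH are placeholders for the real values.
    async with aioftp.ClientSession('MY_HOST', 21, 'MY_USER', '') as client:
        await client.list(path)


loop = asyncio.get_event_loop()
tasks = [test('MY_PATH')]
loop.run_until_complete(asyncio.gather(*tasks))
loop.close()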

For what it's worth, I'm able to connect to this server fine using Filezilla:

[screenshot of the Filezilla session]

Not sure what other info I can give you, but please let me know. I read your comment about MLSD; wondering if that's an issue here?

Appreciate your help!

info['modify'] year is always 1900

I'm not sure whether something is wrong with self.parse_ls_date(s[:12]) in aioftp/client.py, but when I do the following:

await client.list(dir_path)

the year of every modify attribute is 1900, whereas FileZilla shows the correct years. I'm connecting to an anonymous FTP server, in fact an open data FTP server (ftp2.census.gov), if you'd like to try.

Is this caused by the server, or are we perhaps missing a condition in the parsing?

Thanks
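
A likely explanation, offered as an assumption rather than a confirmed diagnosis: Unix-style LIST output omits the year for recent files ("Jan 03 12:30") and shows it only for older ones ("Mar 14  2016"), and datetime.strptime fills in 1900 by default when the format contains no year. Below is a minimal sketch of inferring the year; parse_ls_date_guess is a hypothetical helper, and the rule "a date in the future belongs to the previous year" is borrowed from the usual ls convention.

```python
import datetime


def parse_ls_date_guess(text, now=None):
    """Parse 'Mon DD HH:MM' or 'Mon DD  YYYY', inferring a missing year."""
    now = now or datetime.datetime.now()
    fields = text.split()
    if ":" not in fields[-1]:
        # Older entry: the year is given explicitly.
        return datetime.datetime.strptime(" ".join(fields), "%b %d %Y")
    month, day, clock = fields
    # Recent entry: try the current year first, then the previous one.
    for year in (now.year, now.year - 1):
        try:
            parsed = datetime.datetime.strptime(
                f"{year} {month} {day} {clock}", "%Y %b %d %H:%M"
            )
        except ValueError:
            continue  # e.g. Feb 29 in a non-leap year
        if parsed <= now:
            return parsed
    raise ValueError(f"cannot infer a year for {text!r}")


reference = datetime.datetime(2018, 1, 15)
print(parse_ls_date_guess("Dec 20 10:30", now=reference))
```

Note that %b depends on the process locale, so this sketch assumes English month abbreviations. Servers that support MLSD avoid the ambiguity altogether, since the modify fact carries a full timestamp.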

Double USER message

Should we handle USER and PASS commands after the user has logged in? What should we do:

  • error code (which one?)
  • disconnect + error code
  • ignore
