sftpclone's Introduction

sftpclone

A tool for cloning/syncing a local directory tree with an SFTP server.

Features

  • Keep a local directory tree in sync with a specified folder on an SFTP server.
  • Update symbolic links as needed and keep files consistent.
  • Automatic tilde expansion/handling on the SFTP server.
  • Public key authentication.
  • ssh_config entries compatibility.
  • Syncing exclusion patterns.
  • Compatible with both Python 2 and Python 3.

Install

You can install sftpclone by using pip:

$ pip install sftpclone --user

Note: Sometimes building required dependencies in user mode doesn't work. In that case, you'd need to use sudo and to remove the --user flag. Alternatively, you could make use of a virtualenv.

Alternatively, you can clone this repository and then launch:

$ git clone https://github.com/unbit/sftpclone
$ cd sftpclone
$ python setup.py install

In both cases, you'll find the sftpclone script in your path.

Usage

usage: sftpclone [-h] [-k private-key-path]
                 [-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}] [-p PORT]
                 [-f] [-a] [-c ssh config path] [-n known_hosts path] [-d]
                 [-e exclude-from-file-path] [-t] [-o]
                 [-r --create-remote-directory]
                 local-path user[:password]@hostname:remote-path

Where, for each command line argument:

  • local-path: The path of the local folder. This path must exist and can contain ~ (tilde expansion is applied).
  • sftp-url: The remote SFTP URL, in the form [user[:password]@]hostname:remote-path. Both the password and the user field can be omitted. If you omit the password, you should specify a private key identity file. If you omit the user, the current user is used automatically. The hostname can refer to an entry in your ssh_config file. If the remote path contains ~, it is expanded to the default folder in which the user begins their SFTP session.
  • [h]elp: show the help message and exit.
  • private-[k]ey-path: the path to your private identity file. Set it if you are not using password authentication. It automatically defaults to ~/.ssh/id_rsa and can be used more than once.
  • [l]ogging: set the log level (ERROR by default).
  • [p]ort: SSH remote port (defaults to 22).
  • [f]ix-symlinks: if you have absolute symlinks pointing into your synced directory, they will remain consistent on the remote server: i.e., they will have an absolute path that reflects the path of the cloned directory on the server. Useful for cluster configurations.
  • ssh-[a]gent: enable ssh-agent support. Any private-[k]ey-path argument will be ignored.
  • ssh-[c]onfig-path: in the sftp-url's hostname you can specify an entry of your ssh_config file. If you are using a non-standard path, you can set it here.
  • k[n]own_hosts path: path to your known_hosts file. Defaults to ~/.ssh/known_hosts.
  • [d]isable-known-hosts: disable the remote fingerprint check against the local known_hosts file.
  • [e]xclude-from-file-path: the path to a file containing a list of patterns. Each file matched by these patterns will be ignored (not synced).
  • do-not-dele[t]e: do not delete remote files that are missing from the local directory.
  • all[o]w-unknown: do not ask for confirmation before connecting to unknown hosts.
  • create-[r]emote-directory: create the remote base directory if it is missing.
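To make the [f]ix-symlinks description more concrete, here is a minimal sketch of how an absolute link target could be remapped. The function name remap_symlink_target and the example paths are hypothetical, and the sketch assumes that targets outside the synced directory are left untouched; it is not taken from the sftpclone source.

```python
import posixpath


def remap_symlink_target(target, local_root, remote_root):
    """Return the link target to use on the server (illustrative only).

    Absolute targets inside local_root are rewritten so that they point
    inside remote_root. Any other target is returned unchanged (assumption).
    """
    local_root = posixpath.normpath(local_root)
    target = posixpath.normpath(target)
    if target == local_root or target.startswith(local_root + "/"):
        relative = posixpath.relpath(target, local_root)
        return posixpath.normpath(posixpath.join(remote_root, relative))
    return target


# A link into the synced tree follows the tree to its remote location.
print(remap_symlink_target("/home/me/site/css/main.css", "/home/me/site", "/srv/www"))
```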

Warning: be sure to select the proper remote folder. The synchronization process deletes any remote file that doesn't exist in the local folder (unless you turn on the -t option).
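The effect of the warning can be pictured as a set difference. The following is only a sketch under simplifying assumptions (paths are plain relative strings, and the helper name remote_paths_to_delete is made up for illustration); it does not reproduce how sftpclone walks the remote tree.

```python
def remote_paths_to_delete(local_paths, remote_paths, do_not_delete=False):
    """List remote paths that have no local counterpart.

    With do_not_delete=True (the behaviour described for -t) nothing is
    scheduled for deletion.
    """
    if do_not_delete:
        return []
    return sorted(set(remote_paths) - set(local_paths))


local = {"index.html", "css/main.css"}
remote = {"index.html", "css/main.css", "old/report.pdf"}

print(remote_paths_to_delete(local, remote))
print(remote_paths_to_delete(local, remote, do_not_delete=True))
```

If the wrong remote folder is selected, everything in it that is absent locally ends up in the first list.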

ssh_config compatibility

The hostname in the sftp-url parameter can be a valid entry in a ssh_config file. Specifically, your entry should have relevant parameters such as:

  • HostName
  • User
  • Port
  • IdentityFile
  • ProxyCommand

Any value not found will fall back to the CLI arguments. In any case, you should set the IdentityFile field; otherwise authentication will fall back to ~/.ssh/id_rsa and may not work. The first hostname matching the pattern is chosen (in the ssh_config way).

known_hosts checking

By default, sftpclone matches the remote host fingerprint against the one contained in your ~/.ssh/known_hosts file. If this file doesn't exist on your machine, you can specify a different path with the -n option. Furthermore, you can disable the check with the -d flag. Unknown hosts require the user to authorize the connection. Please note that, even after authorization, the known_hosts file won't be modified.

Exclude list

It takes inspiration from the rsync/tar --exclude-from flag.

Among your command line arguments, you can specify a file containing a list of patterns, one per line. Files that match any pattern will not be synced with the SFTP server.

Lines beginning with ; or # are ignored.

Each pattern is considered relative to the syncing directory. As a consequence, leading / are ignored.

Example

; This will exclude any file or directory beginning with foo
foo*
; This will exclude any file foo in a subdir of the directory bar.
bar/*/foo

Programmatic usage

You can find some examples of programmatic usage inside the examples directory.

Testing

This project uses nose for testing. In addition, on Python 2 you'll need the mock module (part of Python standard lib from 3.3). In both cases, you can install test requirements with:

$ pip install -r test_requirements.txt

Then, from the project root directory, you can launch the test suite with:

$ nosetests
$ python setup.py test # alternatively
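Because mock lives in the standard library only from Python 3.3 onwards, test code that has to run on both Python 2 and Python 3 commonly imports it with a fallback. The snippet below is a generic sketch of that pattern, not an excerpt from this project's tests; fake_sftp is a made-up stand-in object.

```python
try:
    from unittest import mock  # Python 3.3 and later
except ImportError:
    import mock  # Python 2: the external "mock" package

# A stand-in object whose listdir() returns a canned answer.
fake_sftp = mock.Mock()
fake_sftp.listdir.return_value = ["a.txt"]
```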

sftpclone's People

Contributors

aldur, dependabot[bot], felixfontein, unbit

sftpclone's Issues

Folder structure problem

When we connect to the server, the files appear in the root directory rather than in the folder structure of the local directory.

Support of proxycommand

ProxyCommand settings are being ignored even if you put ssh_config_path in the SFTPClone instance.

Host foobar
 hostname host.example.com
 proxycommand /usr/local/bin/corkscrew proxy.example.com 3128 %h %p ~/.corkscrewauth
 port 22
 user foo

a problem with unicode in file names

Hello,
I encountered the following bug:

  1. I have an empty local directory test and an empty directory test on an sftp server backup.
  2. Locally I create a file:
touch test/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxà

The name contains 198 (no fewer) letters "x" (plain ASCII x) followed by a UTF-8 character; here à is the byte sequence 0xc3 0xa0.
  3. Run sftpclone test/ backup:/test. Everything is OK, and the file is copied to the server.
  4. Run the same command again. The result:

Traceback (most recent call last):
  File "/usr/bin/sftpclone", line 7, in <module>
    sftpclone()
  File "/usr/lib64/python2.7/site-packages/sftpclone/sftpclone.py", line 746, in main
    sync.run()
  File "/usr/lib64/python2.7/site-packages/sftpclone/sftpclone.py", line 563, in run
    self.check_for_deletion()
  File "/usr/lib64/python2.7/site-packages/sftpclone/sftpclone.py", line 388, in check_for_deletion
    for remote_st in self.sftp.listdir_attr(remote_path):
  File "/usr/lib64/python2.7/site-packages/paramiko/sftp_client.py", line 209, in listdir_attr
    longname = msg.get_text()
  File "/usr/lib64/python2.7/site-packages/paramiko/message.py", line 186, in get_text
    return u(self.get_bytes(self.get_int()))
  File "/usr/lib64/python2.7/site-packages/paramiko/py3compat.py", line 53, in u
    return s.decode(encoding)
  File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 253: unexpected end of data

The same happens for any other utf-8 symbol, always "codec can't decode" the first byte "in position 253".
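For context, "unexpected end of data" is the message Python's UTF-8 decoder gives when a byte string stops partway through a multi-byte character. The sketch below reproduces that message with a file name like the one above, cut after the first byte of à. It only illustrates the error itself and does not show where the truncation happens in this report (sftpclone, paramiko, or the server).

```python
# 198 ASCII letters followed by "à" encoded as UTF-8 (0xc3 0xa0).
name = b"x" * 198 + b"\xc3\xa0"

# Dropping the last byte leaves an incomplete two-byte sequence.
truncated = name[:-1]

error = None
try:
    truncated.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc

print(error)
```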

Crash if the ssh_config entry contains "IdentityFile"

Hello,
when I run

# sftpclone /root/test/ backup:/root_backup/test

with /root/.ssh/config:

Host backup
HostName ...
User ...
IdentityFile ~/.ssh/id_rsa_backup

sftpclone crashes as follows:

Traceback (most recent call last):
  File "/usr/bin/sftpclone", line 7, in <module>
    sftpclone()
  File "/usr/lib64/python2.7/site-packages/sftpclone/sftpclone.py", line 728, in main
    **kwargs
  File "/usr/lib64/python2.7/site-packages/sftpclone/sftpclone.py", line 162, in __init__
    key = os.path.expanduser(key)
  File "/usr/lib64/python2.7/posixpath.py", line 261, in expanduser
    if not path.startswith('~'):
AttributeError: 'list' object has no attribute 'startswith'

Without the "IdentityFile" in /root/.ssh/config sftpclone works normally.
My settings:
python-2.7.5
sftpclone-1.1
paramiko-1.16.0
ecdsa-0.13
pycrypto-2.6.1

There is a (rather old) bug in duplicity
https://bugs.launchpad.net/duplicity/+bug/1156746
which looks similar. In that case the problem was:

The problem is that starting with version 1.10.0, Paramiko's SSH config parser allows multiple IdentityFile options per host, meaning that SSHParamikoBackend.config['identityfile'] is always a list now
(earlier versions simply returned a string).
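If the cause is the same here, one way to cope with both return types is to normalize the value before expanding it. This is a sketch for Python 3 with a hypothetical helper name, not the fix that was applied in sftpclone.

```python
import os


def normalize_identity_files(value):
    """Accept a single path, a list of paths, or None; return a list.

    Each path goes through os.path.expanduser, which only accepts one
    path at a time (hence the AttributeError when it is given a list).
    """
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    return [os.path.expanduser(path) for path in value]


print(normalize_identity_files("~/.ssh/id_rsa_backup"))
print(normalize_identity_files(["/keys/a", "/keys/b"]))
```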

Feature request: Allow for creating a remote folder

We have a small chain for generating and uploading some documentation, and we tried this tool for the upload step. But when uploading into a folder that does not exist on the remote, I get the following error:

"... ERROR - Error while opening remote folder. Are you sure it does exist?"

Would it make sense to allow the creation of a remote target folder if it does not already exist?
If so, what would be a good way to go about it, and should we try to open a pull request if we manage to come up with a working change?
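One possible approach, sketched here only to make the request concrete: walk up from the target folder until an existing directory is found, then create the missing levels top-down. The helper ensure_remote_dir is hypothetical. It expects an object with stat(path) and mkdir(path) methods, as offered by paramiko's SFTPClient, and assumes that stat raises IOError for a missing path.

```python
import posixpath


def ensure_remote_dir(sftp, remote_path):
    """Create remote_path and any missing parents, like `mkdir -p`.

    Returns the list of directories that were created.
    """
    remote_path = posixpath.normpath(remote_path)
    missing = []
    current = remote_path
    while current not in ("", "/", "."):
        try:
            sftp.stat(current)  # exists: stop climbing
            break
        except IOError:
            missing.append(current)
            current = posixpath.dirname(current)
    created = list(reversed(missing))
    for path in created:
        sftp.mkdir(path)
    return created
```

Handling of ~ in the remote path and of permission errors is left out of this sketch.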

Upload create breaks

Hi,
sorry for creating another issue today. I'm testing the develop branch at the moment. The first 8GB of files have been transferred just fine. Then an error occurs reproducibly...

[Screenshots: sftpclone_errdbg, sftpclone_errdbg_1]

This is a normal jpg file like all the other ones in the same directory.

I have now temporarily removed this directory to see if it breaks again (still running).

Update: Yes, it breaks again without this thumb directory. I've run sftpclone with the parameters -f -t.

Best Regards,
Stefan

Exclude files

It'd be useful to be able to exclude files based on file/path patterns, like rsync's --exclude-from option.

Default username

When no username is specified for a remote host sftpclone should try with the current username

Syntax error, sftpclone.py", line 718

Hi
I'm getting a syntax error with python 2.6:

[root@vm0 backup_server]# sftpclone -h
Traceback (most recent call last):
  File "/usr/bin/sftpclone", line 4, in <module>
    from sftpclone.sftpclone import main as sftpclone
  File "/usr/lib/python2.6/site-packages/sftpclone/sftpclone.py", line 718
    args_mapping['k']: v

I believe this is due to syntax difference between new python versions and 2.6.
Already "fixed" one instance related to "dict comprehension" on line 93 but I'm stuck with this one.
I (obviously) have very limited knowledge of python so if someone can help me here ...

Thanks
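Dict comprehensions were introduced in Python 2.7, so on 2.6 they have to be written as a call to dict() with a generator of pairs. The example below shows the two equivalent forms. The variable names and contents are made up for illustration and are not the actual code at line 718.

```python
# Hypothetical data standing in for the real mapping and parsed arguments.
args_mapping = {"k": "key", "p": "port"}
parsed = {"k": "~/.ssh/id_rsa", "p": 22}

# Python 2.7+ and Python 3: dict comprehension.
kwargs_new = {args_mapping[k]: v for k, v in parsed.items()}

# Python 2.6-compatible equivalent: dict() over (key, value) pairs.
kwargs_old = dict((args_mapping[k], v) for k, v in parsed.items())

print(kwargs_new == kwargs_old)
```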

Host details

It would be nice if a host's details could be automatically selected from ssh_config

Agent support doesn't work if there is more than one SSH-key loaded into the agent

When I try to use agent support and more than one SSH key is loaded into the agent, I get this stacktrace:

 Traceback (most recent call last):
  File "/usr/local/bin/sftpclone", line 7, in <module>
    sftpclone()
  File "/usr/local/lib/python2.7/dist-packages/sftpclone/sftpclone.py", line 742, in main
    **kwargs
  File "/usr/local/lib/python2.7/dist-packages/sftpclone/sftpclone.py", line 146, in __init__
    agent_keys.append(*agent.get_keys())
TypeError: append() takes exactly one argument (2 given)

These keys are loaded into the agent:

$ ssh-add -l
1024 68:80xxxxxxx/Users/kork/.ssh/id_dsa (DSA)
4096 57:69:xxxxxx /Users/kork/.ssh/id_rsa_new (RSA)
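The traceback is consistent with list.append receiving several unpacked arguments: append takes exactly one item, while extend takes an iterable. The sketch below uses a plain tuple as a stand-in for the result of agent.get_keys(), so it runs without an agent.

```python
# Stand-in for agent.get_keys() when two keys are loaded (assumption).
keys_from_agent = ("dsa-key", "rsa-key")

agent_keys = []
append_error = None
try:
    # Unpacking two keys passes two arguments to append().
    agent_keys.append(*keys_from_agent)
except TypeError as exc:
    append_error = exc

# extend() adds every key, however many the agent holds.
agent_keys.extend(keys_from_agent)
print(agent_keys)
```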

Switch for never delete?

Hello,
would it be possible to provide a switch to disable the remote side deletion? I would like to keep a small local directory with incremental backups from the last months. But on the backup server I would like to "not delete" older backups. It should just add the new files. (like rsync without the --delete switch)

Best Regards,
Stefan

sftpclone breaks with Python 3.7

The problem is that it requires paramiko 2.1.2, which internally uses async (which is now a keyword) as an argument name in sftp_file.py; the call self._close(async=True) in the same file is now a syntax error.

This has been reported in paramiko in paramiko/paramiko#1108 and fixed in paramiko/paramiko@0e0b2b8
Bumping the required version of paramiko to 2.4.x should fix this. I've upgraded paramiko to 2.4.1 on my machine, and with that, sftpclone seems to work.
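The keyword change can be checked directly from the interpreter. This small sketch assumes Python 3.7 or later, where async is a reserved keyword and can no longer be used as an argument name; the call f(async=True) is only compiled, never executed.

```python
import keyword

# True on Python 3.7 and later.
is_reserved = keyword.iskeyword("async")

try:
    compile("f(async=True)", "<example>", "eval")
    compiles = True
except SyntaxError:
    compiles = False

print(is_reserved, compiles)
```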

Vignette for programmatic usage

I am using sftpclone for a custom deployment for a small static website with some quirky legacy needs. The library is great, so thanks for that! I had to dig into the codebase a little bit to figure out how to use sftpclone programmatically. Is there a bit of documentation or an example for a minimal Python script with usage of this module? If not and if you are interested in having this kind of documentation, please let me know where you would like it, and I could send a pull request.

'EOF in transport thread' error

Hi,

We are trying to use sftpclone to clone a directory tree full of mysql backups (.sql.gz) and txt files to a backup drive. All within the Hetzner hosting park.

Let me start off by saying we really like the simplicity of sftpclone!

Unfortunately, we keep getting an 'EOF in transport thread' error after a number of directories and files have been cloned:

2017-03-13 10:27:31,992 - DEBUG - [chan 0] stat('MYSQL_BACKUPS_126/status/status_monthly_2017-03-01_03h13m_March.txt.gz')
2017-03-13 10:27:31,993 - DEBUG - [chan 0] chmod('MYSQL_BACKUPS_126/status/status_monthly_2017-03-01_03h13m_March.txt.gz', 504)
2017-03-13 10:27:31,994 - DEBUG - [chan 0] utime('MYSQL_BACKUPS_126/status/status_monthly_2017-03-01_03h13m_March.txt.gz', (1489342025.0, 1488334381.0))
2017-03-13 10:27:31,995 - DEBUG - [chan 0] lstat('MYSQL_BACKUPS_126/status/status_weekly_2017-03-03_03h22m_9.txt.gz')
2017-03-13 10:27:31,996 - DEBUG - [chan 0] open('MYSQL_BACKUPS_126/status/status_weekly_2017-03-03_03h22m_9.txt.gz', 'wb')
2017-03-13 10:27:31,997 - DEBUG - [chan 0] open('MYSQL_BACKUPS_126/status/status_weekly_2017-03-03_03h22m_9.txt.gz', 'wb') -> 30663535626334326363393534653232
2017-03-13 10:27:32,000 - DEBUG - [chan 0] close(30663535626334326363393534653232)
2017-03-13 10:27:32,000 - DEBUG - [chan 0] stat('MYSQL_BACKUPS_126/status/status_weekly_2017-03-03_03h22m_9.txt.gz')
2017-03-13 10:27:32,001 - DEBUG - [chan 0] chmod('MYSQL_BACKUPS_126/status/status_weekly_2017-03-03_03h22m_9.txt.gz', 504)
2017-03-13 10:27:32,002 - DEBUG - [chan 0] utime('MYSQL_BACKUPS_126/status/status_weekly_2017-03-03_03h22m_9.txt.gz', (1489342025.0, 1488507722.0))
2017-03-13 10:27:32,103 - DEBUG - EOF in transport thread

The error consistently appears at that status/… file. When trying to exclude the status log files we still get the EOF:

2017-03-13 10:36:33,931 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_daily_2017-03-10_03h44m_Friday.txt.gz.
2017-03-13 10:36:33,931 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_daily_2017-03-08_03h10m_Wednesday.txt.gz.
2017-03-13 10:36:33,931 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_monthly_2016-11-01_03h14m_November.txt.gz.
2017-03-13 10:36:33,931 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_weekly_2017-02-10_03h18m_6.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_daily_2017-03-13_03h43m_Monday.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_weekly_2017-03-10_03h44m_10.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_daily_2017-03-07_04h39m_Tuesday.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_monthly_2017-02-01_03h30m_February.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_weekly_2017-02-03_03h30m_5.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_monthly_2016-12-01_03h10m_December.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_monthly_2017-03-01_03h13m_March.txt.gz.
2017-03-13 10:36:33,932 - INFO - Skipping excluded file /var/MYSQL_BACKUPS_126/status/status_weekly_2017-03-03_03h22m_9.txt.gz.
2017-03-13 10:36:34,031 - DEBUG - EOF in transport thread

When we move the status folder out of the source folder, the error appears at a different file:

2017-03-13 11:07:08,626 - DEBUG - [chan 0] stat('MYSQL_BACKUPS_126/latest')
2017-03-13 11:07:08,627 - DEBUG - [chan 0] chmod('MYSQL_BACKUPS_126/latest', 504)
2017-03-13 11:07:08,628 - DEBUG - [chan 0] utime('MYSQL_BACKUPS_126/latest', (1489398097.0, 1318112641.0))
2017-03-13 11:07:08,729 - DEBUG - EOF in transport thread

We are using Python 2.7.8 on CentOS 6.8, on a Hetzner VPS and a Hetzner backup space.

Any help is appreciated!

Error caused by slash directions on Windows

When I run, on a Windows PC, the same clone method that works perfectly on an Ubuntu Server 16 LTS system, I get the following error:

2017-01-31 16:25:23,626 - DEBUG - [chan 0] lstat(b'/home/.../../static\\index.htm')
2017-01-31 16:25:23,702 - ERROR - Error while opening remote folder. Are you sure it does exist?
2017-01-31 16:25:23,703 - DEBUG - EOF in transport thread

On both systems, Python 3.5.1 is used.

It turns out the reason is that on Windows, paths contain both forward and backward slashes after os.path.join.
I currently work around it by modifying path_join in sftpclone.py manually:

def path_join(*args):
    """
    Wrapper around `os.path.join`.
    Makes sure to join paths of the same type (bytes).
    """
    args = (paramiko.py3compat.u(arg) for arg in args)
    joined = os.path.join(*args)
    joined = joined.replace("\\", "/")  # use the same slash direction throughout the path
    return joined
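An alternative to replacing slashes after the fact is to build remote paths with posixpath, which always joins with "/" regardless of the local operating system. The sketch below is for text (str) paths only, and remote_join is a hypothetical name. It still converts backslashes that arrive inside a component, which would be wrong for a remote file name that legitimately contains a backslash.

```python
import posixpath


def remote_join(*parts):
    """Join path components for the SFTP side using forward slashes."""
    # Components produced by os.path functions on Windows may contain "\\".
    parts = [part.replace("\\", "/") for part in parts]
    return posixpath.join(*parts)


print(remote_join("/home/user/site", "static", "index.htm"))
print(remote_join("/home/user/site", "static\\index.htm"))
```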

"@" not allowed to be in the username, password

In the SFTPClone class, if the username or password contains the "@" character, the username/hostname is parsed incorrectly. Here's the change I made on line 100 to sftpclone.py:

hostname, username = [k[::-1] for k in remote_url[::-1].split('@',1)]

Unfortunately, my amateur skills can't figure it out for the subsequent 'else'.
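The same idea can be written without reversing the string by splitting on the last "@". The function below is a standalone sketch, not the code in sftpclone.py, and parse_sftp_url is a hypothetical name. It assumes the form [user[:password]@]hostname:remote-path, a user name without ":", and a remote path without "@" (a path containing "@" would be split in the wrong place).

```python
def parse_sftp_url(remote_url):
    """Split [user[:password]@]hostname:remote-path into its four parts."""
    # Everything before the last "@" is credentials, so "@" may appear there.
    credentials, at, location = remote_url.rpartition("@")
    username = password = None
    if at:
        # The first ":" separates user and password; later ones stay in the password.
        username, colon, password = credentials.partition(":")
        if not colon:
            password = None
    hostname, _, remote_path = location.partition(":")
    return username, password, hostname, remote_path


print(parse_sftp_url("us@er:p@ss:word@host.example.com:/remote/path"))
print(parse_sftp_url("host.example.com:~/backup"))
```

When the user part is missing, the first two values are None, and the caller can fall back to the current user as described in the usage section.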
