sjtug / mirror-docker

Dockerfile for SJTUG mirror

Home Page: https://mirrors.sjtug.sjtu.edu.cn

License: GNU Affero General Public License v3.0

Shell 45.83% HTML 2.00% Python 35.89% Dockerfile 8.86% Julia 7.41%

mirror-docker's People

Contributors

htfy96, l2dy, liubenyuan, skyzh, specter119

mirror-docker's Issues

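Use mirror mode to optimize Git repository mirroring

The current Git mirror script ( https://github.com/sjtug/mirror-docker/blob/v2/lug/worker-script/git.sh ) clones the way one would for everyday development, so it checks out a working tree. That is unnecessary and wastes disk space.

I suggest using git's mirror mode, which creates only a bare repository, i.e. just the contents of the .git directory. Roughly: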

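#!/bin/sh

set -xe

# First run: create a bare mirror clone (no working tree).
if [ ! -d "${LUG_path}.git" ]; then
	git clone --mirror "$LUG_origin" "${LUG_path}.git"
fi

# Later runs: fetch all refs and drop those deleted upstream.
cd "${LUG_path}.git"
git remote update --prune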

References:

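Use an environment variable to set the path

Currently the generated static pages are placed in /home/mirror-web/_site on the host. This path should not be hard-coded; it should be defined through an environment variable instead.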
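
A minimal sketch of how a script could read that path from the environment. The variable name MIRROR_WEB_SITE_DIR is hypothetical (the issue does not propose one); the fallback is the path that is hard-coded today.

```python
import os
from pathlib import Path

# Hypothetical variable name; the issue does not specify one.
SITE_DIR_ENV = "MIRROR_WEB_SITE_DIR"
# The path currently hard-coded on the host, kept as the fallback.
DEFAULT_SITE_DIR = "/home/mirror-web/_site"


def site_dir(environ=os.environ):
    """Return the static-site directory, preferring the environment variable."""
    value = environ.get(SITE_DIR_ENV, "").strip()
    # An unset or blank variable falls back to the default path.
    return Path(value or DEFAULT_SITE_DIR)
```

Passing the environment as a parameter keeps the function easy to exercise without touching the real process environment.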

System Monitor

We need a system monitor that can analyze logs, watch system status, and present detailed, visualized results.

The basic requirements are:

  • CPU/Mem/Net Usage
  • Top-X most popular URLs

An official Docker image would be a plus.

This will be included in the 0.1 milestone (Due: May 1, 2016).
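
As an illustration of the "Top-X most popular URLs" requirement, here is a rough sketch that counts request paths in access-log lines. It assumes the web server writes the common/combined log format, which the issue does not state; the function name top_urls is made up for this example.

```python
import re
from collections import Counter

# Matches the request part of a common/combined log format line, e.g.
# 1.2.3.4 - - [01/May/2016:00:00:00 +0800] "GET /archlinux/ HTTP/1.1" 200 512
# (assumed format; adjust the pattern to the real server configuration)
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')


def top_urls(lines, x=10):
    """Return the x most requested paths as (path, count) pairs."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:  # lines that do not look like requests are skipped
            counts[match.group(1)] += 1
    return counts.most_common(x)
```

Because it accepts any iterable of lines, the function can be fed an open log file directly, for example top_urls(open(path), x=20).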

anaconda: invalid literal for int() with base 16: b''

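As expected, this kind of hacky sync script with no consistency checks is riddled with bugs...

The cause is roughly that the upstream did not return a proper size...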

<Future at 0x7f4b7406b898 state=finished raised IncompleteRead> | Traceback (most recent call last):
  File "/usr/lib/python3.5/http/client.py", line 541, in _get_chunk_left
    chunk_left = self._read_next_chunk_size()
  File "/usr/lib/python3.5/http/client.py", line 508, in _read_next_chunk_size
    return int(line, 16)
ValueError: invalid literal for int() with base 16: b''

Traceback (most recent call last):
  File "/usr/lib/python3.5/http/client.py", line 573, in _readinto_chunked
    chunk_left = self._get_chunk_left()
  File "/usr/lib/python3.5/http/client.py", line 543, in _get_chunk_left
    raise IncompleteRead(b'')
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/worker-script/anaconda.py", line 182, in <module>
    future.result()
  File "/usr/lib/python3.5/concurrent/futures/_base.py", line 405, in result
    return self.__get_result()
  File "/usr/lib/python3.5/concurrent/futures/_base.py", line 357, in __get_result
    raise self._exception
  File "/usr/lib/python3.5/concurrent/futures/thread.py", line 55, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/worker-script/anaconda.py", line 117, in download_repo
    shutil.copyfileobj(result, f)
  File "/usr/lib/python3.5/shutil.py", line 79, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/http/client.py", line 448, in read
    n = self.readinto(b)
  File "/usr/lib/python3.5/http/client.py", line 478, in readinto
    return self._readinto_chunked(b)
  File "/usr/lib/python3.5/http/client.py", line 589, in _readinto_chunked
    raise IncompleteRead(bytes(b[0:total_bytes]))
http.client.IncompleteRead: IncompleteRead(13441 bytes read)
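
One way to keep a truncated body from ending up in the mirror is to write to a temporary file and move it into place only after the copy finishes. The sketch below is an assumption about how that could look, not the code in anaconda.py; download_atomically and its open_response parameter are hypothetical names.

```python
import http.client
import os
import shutil
import tempfile


def download_atomically(open_response, dest, attempts=3):
    """Copy a response body to dest, retrying if the body is cut short.

    open_response is a zero-argument callable returning a readable object
    usable as a context manager, e.g. lambda: urllib.request.urlopen(url).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    dest_dir = os.path.dirname(os.path.abspath(dest))
    last_error = None
    for _ in range(attempts):
        # The temporary file lives next to dest so os.replace stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
        try:
            with os.fdopen(fd, "wb") as tmp, open_response() as response:
                shutil.copyfileobj(response, tmp)
            os.replace(tmp_path, dest)  # only a fully copied file reaches dest
            return dest
        except http.client.IncompleteRead as exc:
            last_error = exc  # truncated body: discard the partial file and retry
        finally:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
    raise last_error
```

This only guards against bodies that http.client itself reports as incomplete; comparing the result against a size or checksum from upstream metadata would be a separate check.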

Build Failure: corrupted gem & fetchError

Step 17 : RUN bundle install 
---> Running in acc5bcb5c7f8
 Don't run Bundler as root. Bundler can ask for sudo if it is needed, and installing your bundle as root will break this application for all non-root users on this machine.
 Fetching gem metadata from https://ruby.taobao.org/............ 
Fetching version metadata from https://ruby.taobao.org/.. 
Resolving dependencies...... 
Bundler::GemspecError: Could not read gem at /var/lib/gems/2.3.0/cache/RedCloth-4.2.9.gem. It may be corrupted.
 Gem::RemoteFetcher::UnknownHostError: timed out (https://ruby.taobao.org/gems/i18n-0.7.0.gem)
Using json 1.8.3
 Using minitest 5.8.4
 Installing thread_safe 0.3.5
 Gem::RemoteFetcher::FetchError: too many connection resets (https://rubygems-china.oss-cn-hangzhou.aliyuncs.com/gems/addressable-2.4.0.gem)
 Gem::RemoteFetcher::FetchError: too many connection resets (https://rubygems-china.oss-cn-hangzhou.aliyuncs.com/gems/babel-source-5.8.35.gem)
 Using execjs 2.6.0
 Installing coffee-script-source 1.10.0
 Using colorator 0.1
 Using ffi 1.9.10
 Installing multipart-post 2.0.0
 Gem::RemoteFetcher::UnknownHostError: timed out (https://ruby.taobao.org/gems/gemoji-2.1.0.gem)
 Installing net-dns 0.8.0
 Bundler::GemspecError: Could not read gem at /var/lib/gems/2.3.0/cache/public_suffix-1.5.3.gem. It may be corrupted.
 Gem::RemoteFetcher::FetchError: too many connection resets (https://rubygems-china.oss-cn-hangzhou.aliyuncs.com/gems/sass-3.4.22.gem)
 Installing rb-fsevent 0.9.7
 Using kramdown 1.10.0
 Installing liquid 3.0.6
 Using mercenary 0.3.5
 Using rouge 1.10.1
 Using safe_yaml 1.0.4
 Installing jekyll-feed 0.4.0
 Gem::RemoteFetcher::UnknownHostError: timed out (https://ruby.taobao.org/gems/mini_portile2-2.0.0.gem)
 Using jekyll-paginate 1.1.0
 Installing jekyll-sitemap 0.10.0
 Using rdiscount 2.1.8
 Installing redcarpet 3.3.3 with native extensions
 Installing terminal-table 1.5.2
 Using bundler 1.11.2
 An error occurred while installing RedCloth (4.2.9), and Bundler cannot continue. Make sure that `gem install RedCloth -v '4.2.9'` succeeds before bundling.
 Removing intermediate container acc5bcb5c7f8
 The command '/bin/sh -c bundle install' returned a non-zero code: 5

HTTPS support?

Visiting http://mirrors.sjtug.org/archlinux/ results in a connection reset, even from China Telecom IPs.

philhu@philhu2 ~ $ curl ip.cn
当前 IP:180.168.218.163 来自:上海市 电信
philhu@philhu2 ~ $ curl --verbose http://mirrors.sjtug.org/archlinux/
* Hostname was NOT found in DNS cache
*   Trying 202.120.58.45...
* Connected to mirrors.sjtug.org (202.120.58.45) port 80 (#0)
> GET /archlinux/ HTTP/1.1
> User-Agent: curl/7.35.0
> Host: mirrors.sjtug.org
> Accept: */*
> 
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

Enable HTTPS to avoid this?

Several improvements and suggestions

Avoid installing or using git in the container, to make the image smaller.

===== original text =====
According to the official Dockerfile docs,

  • there should be no apt-get upgrade calls
  • apt-get update and apt-get install should be placed in a single RUN command
  • avoid unnecessary installations such as editors
  • there shouldn't be too many layers

Also, I have several suggestions:

  • place the mirror root at /mnt/mirrors to allow for additional mount points in the future
  • avoid using git in the container. Use .tar.gz files from the host at build time instead. A host script can be written to grab the latest versions of tunasync and mirror-web from GitHub and pack them into .tar.gz files.

These changes are under development. I'll open a PR tonight. Discussion and criticism are strongly encouraged.
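
The packing half of such a host script might look like the sketch below. It assumes the sources have already been fetched into a local directory on the host (the download step is left out), and pack_directory is a hypothetical helper name.

```python
import tarfile
from pathlib import Path


def pack_directory(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tarball at archive_path."""
    source = Path(source_dir)
    with tarfile.open(str(archive_path), "w:gz") as archive:
        # arcname stores paths relative to the directory name, not the host layout.
        archive.add(str(source), arcname=source.name)
    return Path(archive_path)
```

A Dockerfile ADD instruction unpacks a local tar archive into the image, so a tarball produced this way could replace a git clone inside the container.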

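[Discussion] Is the anaconda sync script somewhat complicated?

Ours is somewhat complicated, presumably to make syncing more robust?
By comparison, tuna's is very flat: https://github.com/tuna/tunasync-scripts/blob/master/anaconda.py
However, adding or removing a source there requires editing the script, whereas ours only needs a YAML change.
I don't know how to view logs inside docker. I see that the file list is simply printed; can that output be seen? Since *.conda files are currently missing ( sjtug/mirror-requests#56 ), the most direct check would be to look at the list of downloaded files.
It might be better to use Python's logging library, which has five log levels: DEBUG, INFO, WARNING, ERROR, and CRITICAL; see the Python 3 logging documentation. Logs can go to both stdout and a file, which is fairly flexible.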
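
A small sketch of the logging setup suggested above, writing to stdout and optionally to a file. The logger name "anaconda-sync" and the helper make_logger are made up for illustration; nothing here is taken from the existing script.

```python
import logging
import sys


def make_logger(name="anaconda-sync", logfile=None, level=logging.INFO):
    """Return a logger that writes to stdout and, optionally, to logfile."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate output if called more than once
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = [logging.StreamHandler(sys.stdout)]
    if logfile is not None:
        handlers.append(logging.FileHandler(logfile, encoding="utf-8"))
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

With this in place, a call such as log.debug("downloaded %s", name) can replace a bare print of each file name, and the level decides whether the list is shown. Output sent to stdout is what docker logs normally displays for a container.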

anaconda: http.client.RemoteDisconnected

Failed to download https://conda.anaconda.org/menpo/linux-64//menpo-0.7.6-py27_0.tar.bz2: Remote end closed connection without response | Traceback (most recent call last):
  File "/worker-script/anaconda.py", line 138, in download_repo
    future.result()
  File "/usr/lib/python3.5/concurrent/futures/_base.py", line 398, in result
    return self.__get_result()
  File "/usr/lib/python3.5/concurrent/futures/_base.py", line 357, in __get_result
    raise self._exception
  File "/usr/lib/python3.5/concurrent/futures/thread.py", line 55, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/worker-script/anaconda.py", line 42, in f_retry
    return f(*args, **kwargs)
  File "/worker-script/anaconda.py", line 99, in download_file
    result = urlopen_failsafe(urljoin(url_root, name))
  File "/worker-script/anaconda.py", line 52, in f_retry
    return f(*args, **kwargs)
  File "/worker-script/anaconda.py", line 60, in urlopen_failsafe
    return request.urlopen(*args, **kwargs)
  File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "/usr/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 1297, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/usr/lib/python3.5/urllib/request.py", line 1257, in do_open
    r = h.getresponse()
  File "/usr/lib/python3.5/http/client.py", line 1198, in getresponse
    response.begin()
  File "/usr/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
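
The traceback shows that anaconda.py already wraps calls in an f_retry helper, whose implementation is not visible here. As a sketch only, a retry decorator with exponential backoff that treats RemoteDisconnected as transient could be written like this; retry, TRANSIENT, and all parameter names are assumptions, not the script's actual code.

```python
import functools
import http.client
import time
import urllib.error

# Errors treated as transient in this sketch (an assumption, not the script's list).
TRANSIENT = (
    http.client.RemoteDisconnected,
    http.client.IncompleteRead,
    urllib.error.URLError,
)


def retry(exceptions, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Retry the wrapped call on the given exceptions, waiting longer each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator
```

Waiting between attempts gives an upstream that drops connections under load some time to recover, instead of hitting it again immediately.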
