Inconsistent newline characters (pyptt issue, closed)

eight04 commented on May 23, 2024
Inconsistent newline characters

from pyptt.

Comments (2)

Truth1987 commented on May 23, 2024

That is simply how the raw data arrives @@

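eight04 commented on May 23, 2024

The received data contains no CR because CR means "return to the start of the line" while LF means "move down one line". PTT avoids sending an extra CR whenever returning to the start of the line is unnecessary, which includes blank lines and staircase-shaped text. Take this post as an example:
Article code (AID): #1RA8h8KC (Test) [ptt.cc] [測試]
Article URL: https://www.ptt.cc/bbs/Test/M.1529383624.A.50C.html

Ideally, the retrieved post content would be: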

這是第一行
          這是第二行

Running the following program to fetch the post:

from pprint import pprint
from PTTLibrary import PTT

# USER and PASSWORD are placeholders for your own PTT credentials.
bot = PTT.Library(kickOtherLogin=False)
bot.login(USER, PASSWORD)
# Fetch the post by board name and article ID (AID).
err, post = bot.getPost("test", "1RA8h8KC")
pprint((
	post.getBoard(),
	post.getID(),
	post.getAuthor(),
	post.getContent()
))

produces:

[06-19 12:58:10][資訊] 偵測到前景執行使用編碼: utf-8
[06-19 12:58:10][資訊] 使用者帳號:
[06-19 12:58:10][資訊] 密碼:
[06-19 12:58:10][資訊] 產生 SSH 金鑰完成
[06-19 12:58:10][資訊] 連線頻道 0 啟動
[06-19 12:58:11][資訊] 頻道 0 建立互動通道成功
[06-19 12:58:11][資訊] 頻道 0 輸入帳號
[06-19 12:58:11][資訊] 頻道 0 輸入密碼
[06-19 12:58:11][資訊] 頻道 0 讀取 PTT 畫面..
[06-19 12:58:11][資訊] 不刪除重複登入的連線
[06-19 12:58:15][資訊] 任意鍵繼續
[06-19 12:58:15][資訊] 頻道 0 登入成功
('test',
 '1RA8h8KC',
 'eight0 (人類)',
 '這是第一行\n這是第二行\n\n--\nヾ(;ω;) ヾ(;ω;)\n\nhttp://i.imgur.com/oAd97.png\n\n--')

You can see that the leading spaces are lost. Fetching the raw data shows that there is only an LF between the two lines:

('\x1b[34;47m 作者 \x1b[0;44m eight0 '
 '(人類)                                              \x1b[34;47m 看板 \x1b[0;44m '
 'Test \r\n'
 '\x1b[34;47m 標題 \x1b[0;44m '
 '[測試]                                                                 \r\n'
 '\x1b[34;47m 時間 \x1b[0;44m Tue Jun 19 12:47:00 '
 '2018                                               \r\n'
 '\x1b[0;36m───────────────────────────────────────\r\n'
 '\n'
 '\x1b[m這是第一行\n'
 '這是第二行\r\n'
 '\n'
 '--\r\n'
 '(;ω;) (;ω;)\r\n'
 '\n'
 'http://i.imgur.com/oAd97.png\r\n'
 '\n'
 '--\r\n'
 '\x1b[32m發信站: 批踢踢實業坊(ptt.cc), 來自: 118.160.126.59\r')
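
One way to recover the missing indentation is to replay the cursor movement instead of treating every line break as a plain newline. The following is only a minimal sketch of that idea, not code from PTTLibrary: `ANSI_CSI`, `char_width` and `render_lines` are hypothetical names, it assumes that stripping the `\x1b[...` sequences is acceptable (so cursor-positioning escapes are ignored), and it assumes CJK characters occupy two terminal columns.

```python
import re
import unicodedata

# Matches CSI escape sequences such as '\x1b[34;47m' or '\x1b[m'.
# Assumption: they can simply be dropped for plain-text extraction.
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def char_width(ch):
    """Return how many terminal columns a character occupies (1 or 2)."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def render_lines(raw):
    """Replay CR/LF cursor movement and return the resulting text lines."""
    rows = [[]]  # each row is a list of cells, one cell per terminal column
    row = 0
    col = 0
    for ch in ANSI_CSI.sub("", raw):
        if ch == "\r":
            col = 0  # CR: back to the start of the line
        elif ch == "\n":
            row += 1  # LF: down one line, column stays where it was
            if row == len(rows):
                rows.append([])
        else:
            cells = rows[row]
            width = char_width(ch)
            # Pad with spaces so the cells under the cursor exist.
            while len(cells) < col + width:
                cells.append(" ")
            cells[col] = ch
            if width == 2:
                cells[col + 1] = ""  # second column of a wide character
            col += width
    return ["".join(cells).rstrip() for cells in rows]


# Excerpt of the raw data shown above.
sample = "\x1b[m這是第一行\n這是第二行\r\n\n--\r\n"
for line in render_lines(sample):
    print(repr(line))
```

With the excerpt above, the second line comes out with ten leading spaces, because the bare LF leaves the cursor in column 10, right after the five double-width characters of the first line.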
