Giter Site home page Giter Site logo

luogu-discussion-archive's Introduction

luogu-discussion-archive

依赖

使用 pnpm 管理依赖

pnpm install

构建

pnpm -r run build

数据库

使用 PostgreSQL 作为数据库。

export DATABASE_URL="postgresql://postgres@localhost:5432/luogu_discussion_archive?schema=public" # PostgreSQL 数据库地址,参见:https://pris.ly/d/postgres-connector
pnpm exec prisma migrate deploy

archive

# 配置环境变量
export PORT="3001" # 存档 API 服务监听的端口
export DATABASE_URL="postgresql://localhost/luogu_discussion_archive" # PostgreSQL 数据库地址
export VIEWER_HOST="https://lglg.top" # viewer 调用 API 的 Origin 头,作 CORS 校验之用

# 执行
node ./packages/archive/dist/server.js

viewer

配置

packages/viewer/.env.local 中:

PORT="3000" # viewer 监听的端口
DATABASE_URL="postgresql://localhost/luogu_discussion_archive" # PostgreSQL 数据库地址
COOKIE="__client_id=ffffffffffffffffffffffffffffffffffffffff; _uid=0" # 请求洛谷的 Cookie 头
NEXT_PUBLIC_ARCHIVE_HOST="//lda.piterator.com" # 指向存档 API 服务器的相对地址

开发

cd packages/viewer
pnpm run dev
# 或直接
pnpm --filter viewer run dev

生产

node ./packages/viewer/.next/standalone/packages/viewer/server.js

luogu-discussion-archive's People

Contributors

bohanjun avatar immccn123 avatar wxh06 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Forkers

immccn123 318cyz

luogu-discussion-archive's Issues

洛谷国际站问题

最近,洛谷国际化,一些老帖子转移到了 <luogu.com>

可经过测试,洛谷帖子保存站无法保存、更新 <luogu.com> 上的帖子,并会报错Unexpected token < in JSON at position 0

在部分帖子里出现 Markdown 渲染器出错

          > 访问 https://lglg.top/727182 的时候服务器返回 500?

上游库有一堆锅导致的 情况比较复杂 尝试但降级不起作用

上游新版本存在内存泄漏 降到旧版本然后又是这个阴间错误 只能先等上游修了 抱歉

https://stackoverflow.com/questions/77138105/reactmarkdown-remarkgfm-everything-renders-as-expected-except-tables-typee

依赖的 remark-gfm 导致这个错误 升级 react-markdown 可以解决 但是 rehype-highlight 新版本存在内存泄漏 旧版本不与 react-markdown 新版本兼容 所以就卡在这里了

正在准备采取临时缓解措施

Originally posted by @immccn123 in #6 (comment)

洛谷新版讨论区 API

讨论区维护升级公告
讨论 API | luogu-api-docs


目前开发进度已经恢复的功能

  • archive(注:实现不是很优秀,ESLint 表示抗议)
  • viewer
    • 陶片(原站仍在运行)
    • (Basic) 帖子
      • 回复上下文推断
      • (Basic) Markdown Render
      • (Basic) LaTeX Render
      • (New) 查看源代码
      • (New) 历史快照
    • 单条回复
      • (New) 查看源代码
      • (New) 历史快照
    • 用户主页
      • 发帖
      • 回帖
      • 参与
      • (New) 用户历史信息
      • (New) 犇犇
      • 用户陶片
    • 发现(排行榜)
      • (New) 犇犇相关

[BUG] 服务器时间问题

rt,服务器时间是 UTC 而非北京时间,所有网页上显示的时间都比实际时间早 8 个小时。

Site is down

lglg.top connect timed out.

C:\Windows\system32>ping lglg.top

正在 Ping lglg.top [45.32.251.232] 具有 32 字节的数据:
请求超时。
请求超时。
请求超时。
请求超时。

45.32.251.232 的 Ping 统计信息:
    数据包: 已发送 = 4,已接收 = 0,丢失 = 4 (100% 丢失),

[Perf] 移除 `UserSnapshot`,或更改 `selectUser.withLatest` 的行为

Ref: #9 能否优先提升“发现”页面的稳定性?

注意到,当访问 /explore/Discussions 时,会产生以下 SQL Query:

prisma:query SELECT COUNT(*), "public"."ReplySnapshot"."authorId" FROM "public"."ReplySnapshot" LEFT JOIN "public"."Reply" AS "j1" ON ("j1"."id") = ("public"."ReplySnapshot"."replyId") WHERE ("j1"."time" >= $1 AND ("j1"."id" IS NOT NULL)) GROUP BY "public"."ReplySnapshot"."authorId" ORDER BY COUNT("public"."ReplySnapshot"."time") DESC LIMIT $2 OFFSET $3
prisma:query SELECT "public"."User"."id" FROM "public"."User" WHERE "public"."User"."id" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100) OFFSET $101
prisma:query SELECT COUNT(*), "public"."Reply"."postId" FROM "public"."Reply" LEFT JOIN "public"."Post" AS "j1" ON ("j1"."id") = ("public"."Reply"."postId") LEFT JOIN "public"."PostTakedown" AS "j2" ON ("j2"."postId") = ("j1"."id") WHERE ("public"."Reply"."time" >= $1 AND ("j2"."postId" IS NULL AND ("j1"."id" IS NOT NULL))) GROUP BY "public"."Reply"."postId" ORDER BY COUNT("public"."Reply"."id") DESC LIMIT $2 OFFSET $3
prisma:query SELECT "public"."Post"."id", "public"."Post"."time", "public"."Post"."replyCount" FROM "public"."Post" WHERE "public"."Post"."id" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50) OFFSET $51
prisma:query SELECT "public"."PostSnapshot"."postId", "public"."PostSnapshot"."time", "public"."PostSnapshot"."title", "public"."PostSnapshot"."forumSlug", "public"."PostSnapshot"."authorId", "public"."PostSnapshot"."until" FROM "public"."PostSnapshot" WHERE "public"."PostSnapshot"."postId" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50) ORDER BY "public"."PostSnapshot"."time" DESC OFFSET $51
prisma:query SELECT "public"."Forum"."slug", "public"."Forum"."name" FROM "public"."Forum" WHERE "public"."Forum"."slug" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10) OFFSET $11
prisma:query SELECT "public"."User"."id" FROM "public"."User" WHERE "public"."User"."id" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37) OFFSET $38
prisma:query SELECT "public"."UserSnapshot"."userId", "public"."UserSnapshot"."time", "public"."UserSnapshot"."name", "public"."UserSnapshot"."badge", "public"."UserSnapshot"."isAdmin", "public"."UserSnapshot"."isBanned", "public"."UserSnapshot"."isRoot", "public"."UserSnapshot"."color"::text, "public"."UserSnapshot"."ccfLevel", "public"."UserSnapshot"."until" FROM "public"."UserSnapshot" WHERE "public"."UserSnapshot"."userId" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37) OFFSET $38

注意到:

prisma:query SELECT "public"."UserSnapshot"."userId", "public"."UserSnapshot"."time", "public"."UserSnapshot"."name", "public"."UserSnapshot"."badge", "public"."UserSnapshot"."isAdmin", "public"."UserSnapshot"."isBanned", "public"."UserSnapshot"."isRoot", "public"."UserSnapshot"."color"::text, "public"."UserSnapshot"."ccfLevel", "public"."UserSnapshot"."until" FROM "public"."UserSnapshot" WHERE "public"."UserSnapshot"."userId" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37) OFFSET $38

以上 Query 在 /explore/Users 中也有出现。

$1$37 带入(2023/11/25 时)生产环境的值:

EXPLAIN ANALYZE SELECT "public"."UserSnapshot"."userId", "public"."UserSnapshot"."time", "public"."UserSnapshot"."name", "public"."UserSnapshot"."badge", "public"."UserSnapshot"."isAdmin", "public"."UserSnapshot"."isBanned", "public"."UserSnapshot"."isRoot", "public"."UserSnapshot"."color"::text, "public"."UserSnapshot"."ccfLevel", "public"."UserSnapshot"."until" FROM "public"."UserSnapshot" WHERE "public"."UserSnapshot"."userId" IN (588872,75446,3,397,22030,718487,1005127,768530,8457,750067,31440,39863,1,329072,268984,553671,756684,125913,894358,371409,743373,623577,1115391,1024912,482610,1130769,234074,1101328,213388,406941,436107,695154,1044851,10703,769907,1211663,974005,340362,912777,670355,950274,754021,314991,411727) ORDER BY "public"."UserSnapshot"."time" DESC OFFSET 0;

返回以下内容:

                                    QUERY PLAN                                                                                                                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=877855.45..936773.66 rows=504978 width=72) (actual time=2495.353..2625.930 rows=858951 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=876855.42..877486.65 rows=252489 width=72) (actual time=2480.736..2518.731 rows=286317 loops=3)
         Sort Key: "time" DESC
         Sort Method: external merge  Disk: 19424kB
         Worker 0:  Sort Method: external merge  Disk: 18576kB
         Worker 1:  Sort Method: external merge  Disk: 18928kB
         ->  Parallel Seq Scan on "UserSnapshot"  (cost=0.11..843843.26 rows=252489 width=72) (actual time=77.311..2393.526 rows=286317 loops=3)
               Filter: ("userId" = ANY ('{588872,75446,3,397,22030,718487,1005127,768530,8457,750067,31440,39863,1,329072,268984,553671,756684,125913,894358,371409,743373,623577,1115391,1024912,482610,1130769,234074,1101328,213388,406941,436107,695154,1044851,10703,769907,1211663,974005,340362,912777,670355,950274,754021,314991,411727}'::integer[]))
               Rows Removed by Filter: 17018623
 Planning Time: 1.869 ms
 JIT:
   Functions: 12
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.910 ms, Inlining 153.563 ms, Optimization 45.596 ms, Emission 32.046 ms, Total 232.115 ms
 Execution Time: 2671.184 ms
(17 rows)

注意到,该过程并不能使用索引,且需要将 "public"."UserSnapshot" 全表进行排序。而 "public"."UserSnapshot" 数据量极大,导致查询缓慢。

luogu_discussion_archive=# SELECT COUNT(*) FROM "public"."UserSnapshot";
  count   
----------
 51909812
(1 row)

该查询来自以下代码:

src/app/explore/Discussions.tsx

  const discussions = Object.fromEntries(
    (
      await prisma.post.findMany({
        select: getPost.latestNoContent,
        where: { id: { in: discussionReplyCount.map((r) => r.postId) } },
      })
    ).map((d) => [d.id, d]),
  );
  discussionReplyCount.map((r) => ({
    ...discussions[r.postId],
    recentReplyCount: r._count,
  }));

src/lib/post.ts

export const getPost = {
  latestNoContent: {
    ...selectPost.withBasic,
    ...selectPost.withLatestSnapshotMeta,
  },
  latestWithContent: {
    ...selectPost.withBasic,
    ...selectPost.withLatestContent,
  },
};
  withLatestSnapshotMeta: Prisma.validator<Prisma.PostDefaultArgs>()({
    select: {
      snapshots: {
        select: {
          time: true,
          title: true,
          forum: true,
          author: {
            select: selectUser.withLatest,
          },
          until: true,
        },
        orderBy: { time: "desc" },
        take: 1,
      },
    },
  }).select,

src/lib/user.ts

export const selectUser = {
  withIdOnly: Prisma.validator<Prisma.UserDefaultArgs>()({
    select: {
      id: true,
    },
  }).select,
  withLatest: Prisma.validator<Prisma.UserDefaultArgs>()({
    select: {
      id: true,
      userSnapshots: {
        orderBy: { time: "desc" },
        take: 1,
      },
    },
  }).select,
};

此处将 userSnapshots 进行排序,并取出最新的。可以改为直接将 model UserSnapshot 的最新信息保存在 User 中,用于直接查询。另外,UserSnapshot 目前并没有实际用处(我也想不到会有什么用处)。

UserUserSnapshot 的数量:

luogu_discussion_archive=# SELECT COUNT(*) FROM "public"."User";
 count 
-------
 68402
(1 row)

luogu_discussion_archive=# SELECT COUNT(*) FROM "public"."UserSnapshot";
  count   
----------
 51909812
(1 row)

UserSnapshot 去重问题

有去重代码,但是还是会重复。

if (
lastSnapshot !== null &&
user.ccfLevel === lastSnapshot.ccfLevel &&
user.badge === lastSnapshot.badge &&
user.isAdmin === lastSnapshot.isAdmin &&
user.isBanned === lastSnapshot.isBanned &&
(user.isRoot ?? null) === lastSnapshot.isRoot &&
user.color === lastSnapshot.color &&
user.name === lastSnapshot.name
) {

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.