tokitouq / manga-api
A Python based web scraping API built with FastAPI that provides easy access to manga content
Home Page: https://mangareader-api.vercel.app
License: MIT License
New endpoint to get a list of manga, with the genre passed as a parameter. eg: /genre/action, returns a list.
Utilize BaseSearchScraper for this since the HTML structure is the same.
Also add other queries like page, offset, limit and sort.
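A rough sketch of how the page, sort, offset and limit queries could be handled, independent of the web framework. Everything here is an assumption for illustration: the helper names, the provider-side URL shape (/genre/<genre>?sort=...&page=...), and the sort values are hypothetical and not taken from the codebase or confirmed against mangareader.

```python
from typing import List
from urllib.parse import quote, urlencode

BASE_URL = "https://mangareader.to"  # provider mentioned in these issues
ALLOWED_SORTS = {"default", "latest-updated", "name-az"}  # hypothetical values


def build_genre_url(genre: str, page: int = 1, sort: str = "default") -> str:
    """Build the provider URL to scrape for one page of a genre listing."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort: {sort}")
    query = urlencode({"sort": sort, "page": page})
    # quote() keeps odd genre names from breaking the path segment
    return f"{BASE_URL}/genre/{quote(genre)}?{query}"


def apply_window(items: List[dict], offset: int = 0, limit: int = 20) -> List[dict]:
    """Apply offset/limit to the already scraped list of manga."""
    if offset < 0 or limit < 1:
        raise ValueError("offset must be >= 0 and limit must be >= 1")
    return items[offset:offset + limit]
```

Here page and sort are forwarded to the provider, while offset and limit are applied locally to the scraped list, so they work even if the provider only supports paging.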
A new endpoint: completed, which returns a list of manga that are completed.
Scraping link: https://mangareader.to/completed
features:
Our current workflow relies on 2 providers for getting manga content.
Due to this, we can only provide limited information for each endpoint (the info has to stay in sync across both).
So I think it's better if we can provide more information.
For this goal, we need to rely on a single provider.
AniList seems promising and has server-side rendering, which means we can use selectolax to get the info (faster).
Create a pydantic model and follow this:
https://fastapi.tiangolo.com/tutorial/response-model/
The scraper should be in shinobi.
The rest of the logic should be ported to django-rest-framework.
Add a new endpoint which searches manga and returns a list.
Endpoint eg: /v1/search/?keyword=one+piece
Scraping url: https://mangareader.to/search?keyword=one+piece
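The mapping from the API's keyword query to the scraping URL could look roughly like this. The function name build_search_url is hypothetical; only the two URLs above come from the issue.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://mangareader.to/search"  # scraping URL from the issue


def build_search_url(keyword: str) -> str:
    """Turn a user supplied keyword into the provider's search URL."""
    # Collapse repeated whitespace so "one   piece" and "one piece" match
    keyword = " ".join(keyword.split())
    if not keyword:
        raise ValueError("keyword must not be empty")
    # urlencode() encodes spaces as "+", matching the URL in the issue
    return f"{SEARCH_URL}?{urlencode({'keyword': keyword})}"
```

For example, build_search_url("one piece") gives https://mangareader.to/search?keyword=one+piece.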
Currently it returns empty fields if the manga is not found or any error occurs.
Instead, I'd like to return a better 404 response with a message.
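A framework-free sketch of the idea: check the scraped result before returning it, and answer with a 404 plus a message instead of empty fields. The ApiResponse type, the manga_response helper and the "title" key are assumptions for illustration only. In FastAPI itself, the usual way to do this is to raise HTTPException(status_code=404, detail="...") inside the path operation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApiResponse:
    """Hypothetical stand-in for the framework's response object."""
    status_code: int
    body: dict[str, Any]


def manga_response(slug: str, scraped: Optional[dict[str, Any]]) -> ApiResponse:
    """Return the scraped data, or a 404 with a message when nothing was found."""
    # Treat a missing result or a missing title as "not found"
    # (assumes the scraper fills a "title" key for every real manga).
    if not scraped or not scraped.get("title"):
        return ApiResponse(404, {"detail": f"Manga with slug '{slug}' was not found"})
    return ApiResponse(200, scraped)
```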
Unfortunately mangareader seems to be down (the site shows that it's for sale).
Currently manga-api relies only on mangareader for data, but since it went down we have to change provider.
I think it's better to change the whole codebase to support different providers. We need to scale, and the current architecture isn't scalable.
We will probably change the folder structure as well.
The URL can be like: /api/<provider>/<query>/
List of manga providers we can use: https://github.com/anshumanv/awesome-anime-sources#manga
Note: we might need to use selenium instead, but it's better if we can do this with selectolax itself.
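One possible shape for the multi-provider layout proposed above is a small registry that maps the <provider> part of /api/<provider>/<query>/ to a scraper function. This is only a sketch: the names (PROVIDERS, register, dispatch) are hypothetical, and the registered scraper is a placeholder that does no real fetching or parsing.

```python
from typing import Callable, Dict, List

Scraper = Callable[[str], List[dict]]
PROVIDERS: Dict[str, Scraper] = {}


def register(name: str) -> Callable[[Scraper], Scraper]:
    """Decorator that adds a scraper function to the provider registry."""
    def decorator(func: Scraper) -> Scraper:
        PROVIDERS[name] = func
        return func
    return decorator


@register("mangareader")
def mangareader_scraper(query: str) -> List[dict]:
    # Placeholder: a real scraper would fetch and parse the provider's HTML.
    return [{"provider": "mangareader", "query": query}]


def dispatch(path: str) -> List[dict]:
    """Route a path of the form /api/<provider>/<query>/ to its scraper."""
    parts = [part for part in path.split("/") if part]
    if len(parts) != 3 or parts[0] != "api":
        raise ValueError(f"unexpected path: {path}")
    _, provider, query = parts
    if provider not in PROVIDERS:
        raise LookupError(f"unknown provider: {provider}")
    return PROVIDERS[provider](query)
```

Adding a provider then means adding one decorated function, without touching the routing code.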
Currently this project uses FastAPI, but now I'm thinking of rewriting it in Django and django-rest-framework.
Since I use this kind of scraping logic for mangacore (a project for streaming manga), it would be better to try this in Django and DRF.
I'd like to hear your opinions.
Hi there, hope you like this API.
What do you think about making this a single provider? I mean, scraping data from just 1 site?
Currently you can get the same data from 2 different providers.
I think if we move to a single provider, we can provide more info through extra queries such as "latest, trending, sorting, etc...".
duplicate(#175) nvm.
Please share your thoughts about this.
Have a nice day.... ❤️
Helpful links:
https://fastapi.tiangolo.com/tutorial/metadata/#create-metadata-for-tags
https://fastapi.tiangolo.com/tutorial/metadata/#use-your-tags
https://fastapi.tiangolo.com/tutorial/path-operation-configuration/#description-from-docstring
https://fastapi.tiangolo.com/tutorial/path-operation-configuration/#summary-and-description
Use self.parser.* instead of functions from utilities.
eg: instead of this code
https://github.com/tokitou-san/MangaAPI/blob/cccbcb6437861ebaf8aea7ef3d52bd00881cc3be/app/api/scrapers/popular.py#L16
Do something like (might differ for other cases)
slug = self.parser.css_first("div#manga-trending div.swiper-slide a.link-mask").attributes["href"]
Lmk if you want more info or anything :)
Here, the summary and description of genre were used instead of those for type (mistake).
Fix: change the summary and description for the type endpoint.
Add markdown for the index page. Currently it's rendering as raw markdown; we need it in HTML.
Use markdown, pass the result to the template, then render it with the |safe filter.