gridaco / figma-archives

Figma Files Scraper for Research & Studies

License: MIT License

Languages: Python 91.71%, TypeScript 7.64%, JavaScript 0.06%, Shell 0.59%
Topics: crawler, dataset, design-database, figma, machine-learning, scrapy, selenium

Public Figma Community Files Archive

As of April 2023, this archive contains 100GB of Figma files (minified; 550GB raw), 3TB of top-level images corresponding to Figma layers and used image fills (3MB optimized), and 30TB of all images (including layers) in the files.

Demo

figma-scraper demo

Demo of steps 2 to 4 running concurrently

Access - Access the official Archive

With NodeJS Client - @figma-api/community

import { Client } from "@figma-api/community";

const client = Client();

// a file id is the id from figma.com/community/file/:id
// e.g. - https://www.figma.com/community/file/1035203688168086460
const fileid = "1035203688168086460";

// fetch file
const { data: document } = await client.file(fileid);
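If you start from a community URL rather than a bare id, the id can be extracted before calling the client. The following is a minimal Python sketch of that parsing step only. The helper name `community_file_id` is hypothetical (it is not part of `@figma-api/community`), and the sketch assumes the id is the numeric path segment right after `/community/file/`, as in the example URL above.

```python
import re
from urllib.parse import urlparse

# Path of a community file URL: /community/file/<numeric id>[/optional-slug]
# Assumption: ids are purely numeric, as in the example URL above.
_COMMUNITY_FILE_PATH = re.compile(r"^/community/file/(\d+)(?:/|$)")


def community_file_id(url: str) -> str:
    """Return the file id from a figma.com/community/file/:id URL."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("figma.com", "www.figma.com"):
        raise ValueError(f"not a figma.com URL: {url!r}")
    match = _COMMUNITY_FILE_PATH.match(parsed.path)
    if match is None:
        raise ValueError(f"not a community file URL: {url!r}")
    return match.group(1)


if __name__ == "__main__":
    print(community_file_id(
        "https://www.figma.com/community/file/1035203688168086460"
    ))
```

The id is returned as a string because it is longer than the integers JavaScript can represent exactly, and the client example above also passes it as a string.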

Usage - Run this archiver

This scraper is a combination of

  1. Selenium scraper to crawl the Figma community files (Takes about 5 hours) - You can skip this step and use our latest data
  2. Selenium automator to copy (duplicate) the file to your account (Takes about 3 days)
  3. Figma File Archiver to download the File content as JSON (Takes about 5 hours)
  4. And optionally, Figma Image Archiver to download the in-design images and layers exported as PNGs to your local machine (Takes about 6 days for top-frame layers, and about 1 month for all layers)
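To get a feel for the total run time, the estimates in the list above can be added up. This is only a back-of-the-envelope sketch: the step names are labels made up for this example, the image step uses the top-frame figure (6 days), and the "pipelined" number is an optimistic assumption that steps 2 to 4 overlap perfectly after step 1 has finished.

```python
HOURS_PER_DAY = 24

# Approximate durations in hours, copied from the estimates in the list above.
CRAWL_HOURS = 5  # step 1
LATER_STEP_HOURS = {
    "copy_to_account": 3 * HOURS_PER_DAY,           # step 2
    "archive_files": 5,                             # step 3
    "archive_top_frame_images": 6 * HOURS_PER_DAY,  # step 4 (top-frame layers)
}


def sequential_days(crawl_hours: float, later: dict) -> float:
    """Total time if every step runs one after another."""
    return (crawl_hours + sum(later.values())) / HOURS_PER_DAY


def pipelined_days(crawl_hours: float, later: dict) -> float:
    """Optimistic time if steps 2 to 4 overlap perfectly after step 1."""
    return (crawl_hours + max(later.values())) / HOURS_PER_DAY


if __name__ == "__main__":
    print(f"sequential: {sequential_days(CRAWL_HOURS, LATER_STEP_HOURS):.1f} days")
    print(f"pipelined:  {pipelined_days(CRAWL_HOURS, LATER_STEP_HOURS):.1f} days")
```

Under these assumptions the sequential run comes to roughly 9.4 days and the pipelined one to roughly 6.2 days. Skipping step 1 by using the pre-crawled data saves only about 5 hours, so the copy and image steps dominate either way.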

pip3 install -r requirements.txt

# step 1. (Skip and use pre-crawled data if you want as mentioned above)
cd figma_archiver
scrapy crawl figma_spider --nolog -a target=popular
# this will output an output.popular.json file


# step 2. You'll need a new figma account since it copies about 30,00+ files to your drafts
# setup .env following the README at figma_copy
cd figma_copy
python3 main.py --file='../data/latest/index.json' --batch-size=10000
# this will output a community : your-file mapping under progress/[email protected]

# step 3. you can run this script with figma_copy in parallel
cd figma_archiver

# fetch files
python3 files.py -f ../figma_copy/progress/[email protected]

# fetch images (this uses the output directory from the step above)
python3 images.py --src='./downloads/*.json'

This is a brief usage example. For the full setup, please read the README of each automator. Script arguments may differ depending on your configuration, for example if you use an external drive.
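The difference between "top-frame layers" and "all layers" in step 4 can be illustrated on a single downloaded file. The sketch below assumes the archived JSON has the same shape as a Figma REST API file response (a `document` node whose `children` are pages, each with its own `children`). That shape is an assumption about this archive's output, and the helper names and sample data are made up for the example.

```python
from typing import Any, Dict, Iterator, List

Node = Dict[str, Any]


def top_level_layers(file_json: Node) -> List[Node]:
    """Return the direct children of every page (CANVAS) node."""
    pages = file_json.get("document", {}).get("children", [])
    return [layer for page in pages for layer in page.get("children", [])]


def walk(node: Node) -> Iterator[Node]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


# Tiny hand-written sample in the assumed shape (not real archive data).
sample = {
    "name": "Sample",
    "document": {
        "id": "0:0", "type": "DOCUMENT",
        "children": [{
            "id": "0:1", "type": "CANVAS", "name": "Page 1",
            "children": [
                {"id": "1:2", "type": "FRAME", "name": "Home", "children": [
                    {"id": "1:3", "type": "TEXT", "name": "Title"},
                    {"id": "1:4", "type": "RECTANGLE", "name": "Hero"},
                ]},
                {"id": "1:5", "type": "FRAME", "name": "Settings", "children": []},
            ],
        }],
    },
}

if __name__ == "__main__":
    top = top_level_layers(sample)
    all_layers = [n for layer in top for n in walk(layer)]
    print([layer["name"] for layer in top])  # top-level layers only
    print(len(all_layers))                   # every layer below the pages
```

In the sample there are 2 top-level layers but 4 layers in total. On real files the gap is much larger, which fits the difference between the 6-day and 1-month estimates for the image step.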

Requirements

  • About 1TB of free space on your local machine. (Minimal, for full scraping, without images)
  • About 100TB of free space on your external drive. (If you are collecting images as well. Full setup)
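Before starting a long run, it may help to check the free space on the target drive. The sketch below uses `shutil.disk_usage` from the Python standard library. The helper name `has_free_space` and the decimal definition of a terabyte are choices made for this example, and the paths you pass in depend on your own setup.

```python
import shutil

TB = 10**12  # decimal terabyte, chosen for this example


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # "." stands in for your download directory; point it at your external
    # drive's mount path and use 100 * TB if you also collect images.
    if has_free_space(".", 1 * TB):
        print("enough space for a full scrape without images")
    else:
        print("less than 1TB free on this drive")
```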

Todo

  • Docker image for easy deployment and running on the cloud
  • Official CDN server with the latest data

Disclaimer

This repository contains a Figma community crawler that collects and processes data from Figma community files. Some of the files used in this project are licensed under the Creative Commons Attribution 4.0 International License (CC-By 4.0). In accordance with this license, the following attribution is provided:

This work includes material that is derived from or based on [Title of the original work] ([URL_to_original_work]) by [Author's Name], which is licensed under the "Creative Commons Attribution 4.0 International License."

If you use or redistribute the data generated by this crawler, you must also adhere to the terms of the CC-By 4.0 license by providing appropriate credit to the original authors, linking to the license, and indicating if any changes were made.

Please note that this repository is provided "as-is" without warranty of any kind, express or implied. The creators of this repository are not responsible for any errors or omissions, or for the results obtained from the use of the data. Users are solely responsible for complying with the CC-By 4.0 license and any other applicable laws and regulations.

Remember to replace the placeholders ([Title of the original work], [URL_to_original_work], and [Author's Name]) with the relevant information for each work you include in your dataset.
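Filling in the placeholders for many works is easier with a small helper. The sketch below shows one possible way to build the attribution text. The `Work` class, the `attribution` function, and the sample values are all hypothetical, and the wording follows the template in this disclaimer rather than any official CC BY format.

```python
from dataclasses import dataclass

CC_BY_4_NAME = "Creative Commons Attribution 4.0 International License"
CC_BY_4_URL = "https://creativecommons.org/licenses/by/4.0/"


@dataclass(frozen=True)
class Work:
    title: str   # replaces [Title of the original work]
    author: str  # replaces [Author's Name]
    url: str     # replaces [URL_to_original_work]


def attribution(work: Work, modified: bool = False) -> str:
    """Build an attribution line following the template in this disclaimer."""
    text = (
        "This work includes material that is derived from or based on "
        f"{work.title} ({work.url}) by {work.author}, "
        f'which is licensed under the "{CC_BY_4_NAME}" ({CC_BY_4_URL}).'
    )
    if modified:
        # CC BY 4.0 asks you to indicate whether changes were made.
        text += " Changes were made to the original."
    return text


if __name__ == "__main__":
    # Placeholder values for illustration only, not a real community file.
    example = Work(
        title="Example UI Kit",
        author="Jane Doe",
        url="https://www.figma.com/community/file/0000000000000000000",
    )
    print(attribution(example, modified=True))
```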

Learn more about the Figma community license here.
