roboyoshi / datacurator-filetree

a standard filetree for /r/datacurator [ and r/datahoarder ]

Home Page: https://reddit.com/r/datacurator

License: MIT License

Makefile 100.00%
file-organization filetree data datastructures template classification

datacurator-filetree's People

Contributors

bbtvandy, fionera, fjgenieter, jdorel, nomorenicksleft, roboyoshi, rwenz3l, teraskull

datacurator-filetree's Issues

Add scene-centric branch and folder structure

What about downloaded torrents that are in scene format? I personally unrar them and move them to root/torrent/video/movies, since I didn't have a better idea. Maybe you have one :D

Add example for movie libraries split by language

This issue records how one might go about it and what the pros and cons are, so the options can be tracked somewhere.

  1. Have all in the same folder
  2. Split them at the higher level (movies-en, movies-fr, movies-de, ....)
  3. Split them at the movie level ('movies/The Movie (Year) [en]', 'movies/The Movie (Year) [jp]')
    ... and maybe more..

Update:

As commented, there is also the option to move the language one level higher, which is useful if you have a lot of content in different languages (en, de, fr, es, nl, ru, ...).

TODO:

  • Extend the README for videos to include the alternative listed here with Pro/Con arguments.

Add new make targets

I want to be able to build certain aspects of the filetree via the Makefile.

make base -> generates only up to 2 or 3 levels into the structure and excludes all files
make full -> generates the whole tree
make audio -> only generates the audio section
make video -> only generates the video section

  • more
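The proposed targets could be prototyped outside the Makefile first. The following is a minimal Python sketch, not the project's build logic: the `TREE` dictionary is a hypothetical, heavily abbreviated stand-in for the real filetree, and `BASE_DEPTH`, `build_tree` and `make_target` are illustrative names. It treats "base" as "folders only, up to a fixed depth", "full" as the whole tree, and any other target as a single section.

```python
from pathlib import Path

# Hypothetical, abbreviated tree; the real repository layout is much larger.
TREE = {
    "audio": {"books": {"Author": {}}, "music": {}},
    "video": {"movies": {}, "tv-series": {}},
}

# Assumption: "base" stops after two levels (the issue says "2 or 3").
BASE_DEPTH = 2


def build_tree(spec, root, max_depth=None, depth=1):
    """Create the folders described by `spec` under `root`.

    Descends at most `max_depth` levels (None means no limit) and
    returns the list of directories that were visited/created.
    """
    created = []
    if max_depth is not None and depth > max_depth:
        return created
    for name, children in spec.items():
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
        created.extend(build_tree(children, path, max_depth, depth + 1))
    return created


def make_target(target, root):
    """Map the proposed target names onto build_tree calls."""
    if target == "base":
        return build_tree(TREE, root, max_depth=BASE_DEPTH)
    if target == "full":
        return build_tree(TREE, root)
    # Section targets such as "audio" or "video" build one subtree only.
    return build_tree({target: TREE[target]}, root)
```

Calling `make_target("audio", "/tmp/filetree")` would create only the audio section under that (example) directory. A Makefile target could simply invoke such a script with the target name as its argument.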

Proposal: singular vs plural naming scheme & lower case for global structure

Best Roboyoshi,

I really love this project.

I have a few proposals.

  1. Make all folder names singular instead of plural.
    This makes a tree structure easier for humans to read.
    e.g.: /root/video/movie/The Godfather (1972) vs. /root/videos/movies/The Godfather (1972)

  2. Make all the global folder names lowercase.
    This makes it far easier to reach these folders from the command line, especially on case-sensitive operating systems.
    e.g.: /root/software/operating systems/User Level vs. /root/Software/Operating Systems/User Level

With kind regards,
Wouter

Update Artist README

I want to further expand the information given in the music/artists section. I feel there is a lot one can talk about to give people a better idea of which naming to go with.

case consistency & fixes

Currently some folders have the wrong casing, e.g. "Movies" should be "movies" in the video folder, unless it is in the plex branch, for example. It's important to be consistent with this.

TODO:

  • Find capitalization mistakes
  • Fix them
  • Clarify to the users/maintainers that this is important to follow (unless they don't care)
    -- Users are ofc free to do as they wish, but it should be stated in the contributing guidelines.

Create a Release of the FileTree with just the folders

TODO:

  • Create a nifty shell script for releases (e.g.:)
#!/bin/sh
# delete all files, but leave git's own metadata under .git untouched
find . -type f ! -path './.git/*' -delete
# rename stuff
# ...
  • Create a Release on GitHub for users to download
  • Probably merge to master first
    -- this means we need to fix some stuff first
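An alternative to deleting files in place is to copy only the directory structure into a fresh release folder, which leaves the working copy untouched. This is a hedged sketch rather than the project's release process: `copy_folders_only` and its `skip` parameter are hypothetical names, and it assumes the destination lies outside the source tree.

```python
import os
from pathlib import Path


def copy_folders_only(src, dst, skip=(".git",)):
    """Recreate the directory structure of `src` under `dst`, without files.

    Directories named in `skip` (by default git's metadata) are not copied.
    Assumes `dst` is not located inside `src`.
    """
    src, dst = Path(src), Path(dst)
    created = []
    for current, dirnames, _filenames in os.walk(src):
        # Prune in place so os.walk does not descend into skipped folders.
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        target = dst / Path(current).relative_to(src)
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

The resulting folder could then be archived and attached to a GitHub release.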

Add structure for image section

TODO:

  • Specify a reasonable default for personal images
  • Base on Adobe Lightroom Template
  • Check for reddit posts on this (I vaguely remember the Timestamp discussion on the subreddit)

Split Games by OS

When you have games for multiple PC operating systems, they should be separated, right?

Add pornographic content section(s)

From the reddit thread:

I use 'adult' as a top level folder and replicate most of the structure beneath it because there's videos, images, literature, audio books, websites, software, games...and I don't really want porn in every directory just mixed in with my other stuff.

Well.. need to come up with something because a lot of people download a lot of porn.

Update main README to better introduce the project.

The current readme is very minimalistic and lacks some important information to users who are not familiar with the curation of data, filetrees, git, or anything like it.

TODO:

  • Improve the Introduction
  • Add Instructions on how to use this filetree (e.g. downloading a release, switching branches, etc)
  • Add additional information/links/alternatives

The repository is empty when cloned or downloaded

Hello,

The repository is empty when cloned or downloaded.
I was hoping this repo would create an empty directory tree on a brand new external hard disk drive.
Is it just me or what?
Any clues appreciated.

Thank you.

Update Games Section

After using the games section more heavily, it appears to me that the structure can be simplified a bit, like this:

root/
  games/
    boardgames/
    cardgames/
    microsoft-windows/
    microsoft-xbox/
    sony-ps4/
    nintendo-switch/

which makes CLI usage a lot simpler and also removes the currently unnecessary videogames layer.

Alternatives:

root/
  games/
    boardgames/
    cardgames/
    videogames/
      microsoft/
      sega/
      sony/
      nintendo/

Change audio section

TODO:

  • Consider renaming 'artists' to 'music' and merge with 'compilations'
  • Gather Community Feedback
    • Ask r/musichoarder for folder structures used in the wild
    • Check Hydrogenaud.io as well.
    • And Discord!
  • Add more folders
    • soundfx
    • recordings (voice memos)
    • ..?

The most common approach I've seen is just music/$artist/$album/track.
Since we also need to fit other kinds of audio (e.g. effects, podcasts, etc.) into the same section, it makes a lot more sense to flatten music and put everything into one directory.

The artists/books/soundtracks layout is a rather personalized version and can be added as an "alternative" structure.

tv-anime disambiguation

Hi.
What does tv-anime comprise? Japanese media only?
If so, where should Chinese, Korean and Western cartoons be classified?
Would it not be better to classify it under a broad animated folder like
videos/tv-series/animated/japanese
videos/tv-series/animated/western

I admit an issue here is that I have no idea what name to give RL movies. Live action is obviously wrong. Any suggestions for that?
Lastly, thanks for the work on this tree. It really helped declutter my hard drive.
EDIT:
I read the tv-shows readme, and it cleared up why tv-series (and presumably other cartoons) are not being merged with anime.
I would still appreciate a name to give RL movies in my scenario, however.

Complete literature section by providing a complete UDC and LCC filetree

The filetree section for literature only provides subsection 8 of the UDC specification.
It should provide all the sections and give examples of classifications.

As an alternative, there should also be a feature-branch for the LCC specification.

TODO:

  • Implement UDC Spec (mentioned in the README resources section)
  • Implement the LCC Spec (also mentioned in the README)
  • Create feature branch for both specs so users can choose.

Update Branching Strategy

In order to allow for dynamic builds, I want to update the branching strategy.

Some sections are not going to be of the "one way to rule them all" kind.
Literature can be classified with a few different specifications (e.g. LCC vs. UDC).
The same goes for media: some people prefer a slimmer approach (e.g. the Plex folder structure) over a more complex one (e.g. the one currently provided).

Overall I want to give users the option to combine some branches and then create a filetree out of it.

IMPORTANT: The "master" branch should still focus on providing 1 unified filetree that most users can agree on. Only reasonable dynamics shall be allowed in the scope of this ticket.

More to be discussed.

Add image section [pictures / images / photos]

I want to set the main folder for everything image/picture/photo related.

I personally think the broadest term is "images", which pretty much covers everything. But as pointed out elsewhere, it's a term also used for non-picture-related things (like the image of a company in the public view).

TODO:
Settle the base-folder and create sub-folders accordingly..

Current Understanding and use:

  • images = perception of something (e.g. the image of myself in a mirror)
  • pictures = comes from "pictura", meaning "painted", so it's rather attached to everything artificially created, like drawings, paintings, oil on canvas, etc.
  • photo = a fairly new word, coined with the evolution of the camera and closely tied to anything that has been taken with a camera.
/images/
    pictures/ # art drawings and paintings organized by year/artist/genre/style ??
    photos/ # image taken by photographers/people/etc with an analog/digital camera
    ??

References:

TBD: Form vs. Function Filetrees

Sparked by the discussion on reddit, one might argue that the current filetree is too harsh in separating files that should stay together. This is indeed an issue, and the Sections and Sub-Sections should note how 'we' think this should be handled.

This Ticket is mainly for discussion and to remind myself.

Files related to your current/previous job

Where would you put files that are specific to your current job (or maybe a previous job or a side project you're working on) that you access daily and that are of different types, e.g. diagrams, datasheets, etc.?
If there should be a special folder, on what hierarchy level would you place it?

Lack of a 'wip' branch

Not only would it be useful to commit changes there rather than straight to master, the branch is also already mentioned in the documentation (see CONTRIBUTING.md) but was apparently deleted. Regardless, it would definitely be a useful addition for development.

Checkout v0.2 fails ...

FYI,

Git (for Windows!) cannot checkout the repository due to the below file name:

git clone https://github.com/roboyoshi/datacurator-filetree.git
Cloning into 'datacurator-filetree'...
remote: Enumerating objects: 64, done.
remote: Counting objects: 100% (64/64), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 634 (delta 10), reused 46 (delta 3), pack-reused 570
Receiving objects: 100% (634/634), 1.62 MiB | 2.01 MiB/s, done.
Resolving deltas: 100% (223/223), done.
error: invalid path 'root/audio/books/Rowling, J. K./J. K. Rowling - [Harry Potter 01] - Harry Potter and the Philosopher's Stone.m4b'
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

Cheers,
Borg
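One plausible reading of the error above, which the issue itself does not confirm: Windows does not allow file or directory names that end with a period or a space, and the directory `Rowling, J. K.` ends with a period. The sketch below (the function name `windows_unsafe_components` is made up for illustration) lists the components of a repository path that would trip over that rule.

```python
def windows_unsafe_components(path):
    """Return components of a '/'-separated path that end in '.' or ' '.

    Windows rejects such names; the special entries '.' and '..' are ignored.
    """
    return [
        part
        for part in path.split("/")
        if part and part not in (".", "..") and part[-1] in ". "
    ]
```

Running it on the path from the log flags only the `Rowling, J. K.` directory; the file name itself ends in `.m4b` and passes this particular check.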

List of Forbidden Characters

TODO: Compose a list of "forbidden" characters (characters to avoid) for the three major operating systems (Windows, macOS, Linux/UNIX) and include it in one of the basic guidelines for file and folder naming.

Windows

The following reserved characters:

    < (less than)
    > (greater than)
    : (colon)
    " (double quote)
    / (forward slash)
    \ (backslash)
    | (vertical bar or pipe)
    ? (question mark)
    * (asterisk)

- Integer value zero, sometimes referred to as the ASCII NUL character.
- Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams.
- Any other character that the target file system does not allow.

see: https://superuser.com/a/1112140

macOS:

File separators: : (colon) / (forward-slash) \ (backward-slash)
    You should avoid using colons and slashes in the names of files and folders because some operating systems and drive formats use these characters as directory separators. Consider substituting an underline (_) or dash (-) where you would normally use a slash or colon in a filename.

Non-alphabetical and non-numerical symbols: ¢ ™ $ ®
    Non-alphanumeric characters may not be supported by all file systems or operating systems, or may be difficult to work with when exported to certain file formats such as EDL, OMF, or XML.

Punctuation marks, parentheses, quotation marks, brackets and operators: . , [ ] { } ( ) ! ; " ' * ? < > |
    These characters are often reserved for special functions in scripting and programming languages.

White space characters such as spaces, tabs, new lines and embedded returns
    Although OS X and Mac OS formatted disks support spaces in filenames, certain processing scripts and applications may not recognize these characters, or may treat your files differently than expected. Consider substituting an underline (_) or dash (-) where you would normally use spaces.

see: https://support.apple.com/en-us/HT202808

Linux:

/ (forward slash)
NUL

see: https://stackoverflow.com/a/31976060
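The lists above can be folded into a small helper that makes a single file or folder name portable. This is a sketch under stated assumptions: `sanitize_name` and `WINDOWS_RESERVED` are illustrative names, the replacement character is a free choice, and the function only covers the reserved characters and control characters quoted above. It does not address trailing periods or spaces, nor the softer macOS recommendations.

```python
# The nine reserved characters from the Windows list above. This set also
# covers '/' (Linux), while NUL is caught by the control-character check.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def sanitize_name(name, replacement="_"):
    """Replace characters that are unsafe in a single file or folder name.

    Handles the reserved characters and code points 0 through 31.
    Apply it to one path component at a time, not to a whole path,
    because '/' is replaced as well.
    """
    return "".join(
        replacement if ch in WINDOWS_RESERVED or ord(ch) < 32 else ch
        for ch in name
    )
```

For example, `sanitize_name('a:b/c?.txt')` yields `a_b_c_.txt`, while a name such as `The Godfather (1972).mkv` passes through unchanged.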

Update TV README

The TV README should provide guidance on how tv-series and tv-shows could be split and especially why one might do it one way or the other.

Something like this:

US                         | UK                     | DE
Series (Television Series) | Show (Television Show) | Serie (Fernsehserie)
Season                     | Series                 | Staffel
Episode                    | Episode                | Episode

US      | UK        | DE
Show    | Programme | Programm?
Episode | Episode   | Episode?

This should help in choosing the correct folder name, or at least give some perspective.

I'd also like to offer the option of splitting the tv section even further, e.g. tv-animes, tv-cartoons, tv-series, etc.

Possible redundancy

Observation
Location datacurator-filetree/root/images/photos/ contains directories like personal-photos or other-photos.

The word photos in *-photos seems redundant, because the context is already given by the parent directory.

Change proposition
Remove the word photos from every directory name in datacurator-filetree/root/images/photos/
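The proposed change could be previewed as a dry run before any folder is touched. The sketch below is illustrative only: `plan_renames` is a hypothetical helper that lists (old, new) pairs for subdirectories ending in `-photos` and deliberately renames nothing.

```python
from pathlib import Path


def plan_renames(parent, suffix="-photos"):
    """Return (old, new) path pairs for subdirectories ending in `suffix`.

    Nothing is renamed; the caller decides whether to apply the plan.
    """
    plan = []
    for child in sorted(Path(parent).iterdir()):
        name = child.name
        if child.is_dir() and name.endswith(suffix) and name != suffix:
            plan.append((child, child.with_name(name[: -len(suffix)])))
    return plan
```

Applying the plan would then be a loop calling `old.rename(new)` for each pair, after checking that the shortened names do not collide with existing folders.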

Add multiple movie folder and filename patterns

We currently suggest that the given example is the way to go, but it actually isn't. One can freely choose how to name the files, and we should convey that by using multiple files with different names as examples:

The Godfather (1972).mkv (Also in line with the Plex Guidelines)
The.Godfather.1972.BluRay.Scene.Release.Foo.mkv (Scene Name)
godfather-1972.mkv (just a short name)
... + more
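To show that the three example patterns carry the same information, the sketch below extracts a title and year from each of them. The regular expressions and the name `parse_movie_name` are assumptions made for illustration; they are tuned to the three examples above and are not a general parser for Plex or scene naming.

```python
import re

# (pattern, title_uses_dots) pairs, tried in order.
PATTERNS = [
    # "The Godfather (1972).mkv"
    (re.compile(r"^(?P<title>.+?) \((?P<year>\d{4})\)\.\w+$"), False),
    # "The.Godfather.1972.BluRay.Scene.Release.Foo.mkv"
    (re.compile(r"^(?P<title>.+?)\.(?P<year>\d{4})\..+\.\w+$"), True),
    # "godfather-1972.mkv"
    (re.compile(r"^(?P<title>.+?)-(?P<year>\d{4})\.\w+$"), False),
]


def parse_movie_name(filename):
    """Return (title, year) for a recognised file name, else None."""
    for pattern, title_uses_dots in PATTERNS:
        match = pattern.match(filename)
        if match:
            title = match.group("title")
            if title_uses_dots:
                # Scene-style names separate words with dots.
                title = title.replace(".", " ")
            return title, int(match.group("year"))
    return None
```

With these patterns, the first two example names both resolve to the title "The Godfather" and the year 1972, while the short name resolves to "godfather" and 1972.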

Add LICENSE

TODO:

  • Decide which License this project wants to use
  • Set License
