roboyoshi / datacurator-filetree
a standard filetree for /r/datacurator [ and r/datahoarder ]
Home Page: https://reddit.com/r/datacurator
License: MIT License
Observation
The archives readme explains how the news directory should be used, but the news directory does not exist in the repository. Source: https://github.com/roboyoshi/datacurator-filetree/tree/master/root/archives
Change proposition
Create the news directory in the repository.
What about downloaded torrents that are in Scene format? I personally unrar them and move them to root/torrent/video/movies, since I didn't have a better idea. Maybe you have one :D
This is just so that it is mentioned and can be tracked somewhere: how one might want to go about it and what the pros and cons are.
Update:
As commented, there is also the option to move the language one level higher, which is useful if you have a lot of content in different languages (en,de,fr,es,nl,ru,...).
TODO:
Each folder needs to define its
I have downloaded multiple videos that intrigued me and that I wanted to keep around, but that didn't fall into any other category. I have personally created such a category, and I think it would be useful here as well.
I want to be able to build certain aspects of the filetree via the Makefile.
make base -> generates only up to 2 or 3 levels into the structure and excludes all files
make full -> generates the whole tree
make audio -> only generates the audio section
make video -> only generates the video section
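The four targets above boil down to one copy routine with a depth limit, a file switch and a section filter. The following is a minimal Python sketch of that idea, not the project's Makefile: the function name `build_tree`, its parameters and the sample folders are all assumptions made for illustration.

```python
import os
import shutil
import tempfile
from pathlib import Path


def build_tree(src, dst, max_depth=None, include_files=True, section=None):
    """Copy the folder skeleton below src into dst.

    max_depth limits how many levels below src are created (None = no limit),
    include_files toggles copying of files, and section restricts the build
    to a single top-level folder such as "audio".
    """
    src, dst = Path(src), Path(dst)
    start = src / section if section else src
    for current, dirs, files in os.walk(start):
        rel = Path(current).relative_to(src)
        (dst / rel).mkdir(parents=True, exist_ok=True)
        if max_depth is not None and len(rel.parts) >= max_depth:
            dirs[:] = []  # stop descending below the requested depth
        if include_files:
            for name in files:
                shutil.copy2(Path(current) / name, dst / rel / name)


# Tiny stand-in for the repository's root/ folder (hypothetical contents).
work = Path(tempfile.mkdtemp())
sample = work / "root"
(sample / "audio" / "music" / "artist").mkdir(parents=True)
(sample / "video" / "movies").mkdir(parents=True)
(sample / "audio" / "README.md").write_text("audio section\n")

build_tree(sample, work / "base", max_depth=2, include_files=False)  # like "make base"
build_tree(sample, work / "full")                                    # like "make full"
build_tree(sample, work / "audio-only", section="audio")             # like "make audio"
```

A Makefile target could then simply call such a script with different arguments, which keeps the depth and section logic in one place.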
Best Roboyoshi,
I really love this project.
I have a few proposals.
Make all folder names singular rather than plural.
This makes a tree structure easier for humans to read.
so e.g.: /root/video/movie/The Godfather (1972) VS /root/videos/movies/The Godfather (1972)
Make all the global folder names lowercase.
This makes it far easier to get to these folders via the command line, especially on case-sensitive operating systems.
so e.g.: /root/software/operating systems/User Level VS /root/Software/Operating Systems/User Level
With kind regards,
Wouter
I want to further expand the information given in the music/artists section. I feel there is a lot one can mention to give people a better idea of which naming to go with.
Currently some folders have the wrong casing, e.g. "Movies" should be "movies" in the video folder, unless it is in the plex branch, for example. It's important to be consistent with this.
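A casing check like this can be automated. The sketch below assumes the rule is "every folder name is all lowercase" and uses a made-up function name and sample tree; it is not part of the repository.

```python
import os
import tempfile
from pathlib import Path


def find_non_lowercase_dirs(root):
    """Return the relative paths of all folders whose name is not all lowercase."""
    root = Path(root)
    hits = []
    for current, dirs, _files in os.walk(root):
        for name in dirs:
            if name != name.lower():
                hits.append((Path(current) / name).relative_to(root).as_posix())
    return sorted(hits)


# Hypothetical sample tree with two wrongly cased folders.
demo = Path(tempfile.mkdtemp())
(demo / "video" / "Movies").mkdir(parents=True)
(demo / "video" / "tv-shows").mkdir(parents=True)
(demo / "Software" / "operating systems").mkdir(parents=True)

offenders = find_non_lowercase_dirs(demo)
print(offenders)
```

Branches that intentionally use other casing (such as the plex branch mentioned above) would need to be excluded from such a check.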
TODO:
TODO:
#!/bin/sh
# delete all files
find . -type f -delete
# rename stuff
# ...
Everyone will need to keep installers for fonts, plugins and apps. Can we add a folder for this?
System.zip
TODO:
Currently, I have it as /Books, but maybe it should be /Literature/demeter?
When you have PC games for multiple operating systems, they should be separated, right?
I feel like TV-Shows is more suited as folder name, what do you think?
Parentheses are harder to enter on a CLI since they need to be escaped. Isn't there a better way to do that?
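To see which names actually need escaping in a POSIX shell, Python's standard `shlex.quote` can serve as a quick check. This is only a sketch; the helper name `needs_shell_quoting` and the sample names are assumptions for illustration.

```python
import shlex


def needs_shell_quoting(name):
    """True if a POSIX shell would need quotes or escapes to pass this name."""
    return shlex.quote(name) != name


names = ["The Godfather (1972)", "The.Godfather.1972", "godfather-1972"]
report = {name: needs_shell_quoting(name) for name in names}

for name, flagged in report.items():
    print(f"{name!r}: {'needs quoting' if flagged else 'safe as typed'}")
```

Names made only of letters, digits, dots, hyphens and underscores pass through unquoted, which is one argument for dot or hyphen separated names on the command line.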
It will help visualize this whole file tree, without needing to go through all folders to see what is inside.
From the reddit thread:
I use 'adult' as a top level folder and replicate most of the structure beneath it because there's videos, images, literature, audio books, websites, software, games...and I don't really want porn in every directory just mixed in with my other stuff.
Well.. need to come up with something because a lot of people download a lot of porn.
The current readme is very minimalistic and lacks some important information to users who are not familiar with the curation of data, filetrees, git, or anything like it.
TODO:
Hello,
The repository is empty when cloned or downloaded.
I was hoping this repo would create an empty directory tree on a brand new external hard disk drive.
Is it just me or what?
Any clues appreciated.
Thank you.
Sparked by recent discussion I want to add a new folder for storing (digital) fonts.
NOTES:
References:
TODO:
After using the games section more heavily, it appears to me that the structure can be a bit simplified like:
root/
  games/
    boardgames/
    cardgames/
    microsoft-windows/
    microsoft-xbox/
    sony-ps4/
    nintendo-switch/
which makes CLI usage a lot simpler and also removes the currently unnecessary videogames layer.
Alternatives:
root/
  games/
    boardgames/
    cardgames/
    videogames/
      microsoft/
      sega/
      sony/
      nintendo/
Some time ago users on the subreddit shared some mindmaps with their filetrees that should be added to the collection branch:
TODO:
The most common approach I've seen is just music/$artist/$album/track
Since we also need to fit other kinds of audio (e.g. effects, podcasts, etc.) into the same section, it makes a lot more sense to flatten music and put everything into one directory.
The artists/books/soundtracks is a rather personalized version and can be added as an "alternative" structure.
Spec: https://johnnydecimal.com/
Must Note: This is not recommended for teams.
TODO:
root/documents/{00,10,20,30,40,50,60,70,80,90} + Example Name/J.D Folder/Example Documents
Attachments:
I don't know where it should have its root, but the scheme Go uses seems like a good idea, e.g.: src/github.com/Example/Repo
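Deriving such a path from a repository URL is mechanical. The sketch below is one possible way to do it; the function name `repo_path` and the default base folder `src` are assumptions, and only plain https URLs are handled.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def repo_path(url, base="src"):
    """Map a repository URL to a Go-style folder: base/host/owner/repo."""
    parts = urlsplit(url)
    path = parts.path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]  # drop the clone suffix
    return str(PurePosixPath(base) / parts.netloc / path)


example = repo_path("https://github.com/Example/Repo")
cloned = repo_path("https://github.com/roboyoshi/datacurator-filetree.git")
print(example)
print(cloned)
```

Because the host is part of the path, repositories with the same name from different hosts or owners never collide.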
Hi.
What does tv-anime comprise? Japanese media only?
If so, where should Chinese, Korean and Western cartoons be classified?
Would it not be better to classify it under a broad animated folder like
videos/tv-series/animated/japanese
videos/tv-series/animated/western
I admit an issue here is that I have no idea what name to give RL movies. Live action is obviously wrong. Any suggestions for that?
Lastly, thanks for the work on this tree. It really helped declutter my hard drive.
EDIT
I read the tv-shows readme. And it cleared up why tv-series (and presumably other cartoons) are not being merged with anime.
I would appreciate a name to give RL movies in my scenario however.
The filetree section for literature only provides the subsection 8 of the UDC specification.
It should provide all the Sections and give examples of classifications.
As an alternative, there should also be a feature-branch for the LCC specification.
TODO:
In order to allow for dynamic builds, I want to update the branching strategy.
Some sections are not going to be the "one way to rule them all" kind of type.
Literature can be classified with a few different specifications (e.g. LCC vs UDC)
Same for media, some people prefer a more slim approach (e.g. Plex Folder Structure) compared to more complex (e.g. the currently provided).
Overall I want to give users the option to combine some branches and then create a filetree out of it.
IMPORTANT: The "master" branch should still focus on providing 1 unified filetree that most users can agree on. Only reasonable dynamics shall be allowed in the scope of this ticket.
More to be discussed.
I want to set the main folder for everything image/picture/photo related.
I personally think the broadest term is "images", which pretty much covers everything. But as pointed out elsewhere, it's a term also used for non-picture-related things (like the image of a company in the public view).
TODO:
Settle the base-folder and create sub-folders accordingly..
Current Understanding and use:
/images/
  pictures/ # art drawings and paintings organized by year/artist/genre/style ??
  photos/ # image taken by photographers/people/etc with an analog/digital camera
  ??
References:
TODO:
Sparked by the discussion on reddit, one might argue that the current filetree is too harsh in separating files that should stay together. This is indeed an issue, and it should be noted in the Sections and Sub-Sections how 'we' think this should be handled.
This Ticket is mainly for discussion and to remind myself.
Where would you put files that are specific to your current job (or maybe a previous job or a side project you're working on), that you access daily and that are of different types, i.e. diagrams, datasheets, etc.?
If there should be a special folder, on what hierarchy level would you place it?
Not only would it be useful to commit changes to there rather than straight to master, it's also already covered in documentation (see CONTRIBUTING.md) but was apparently deleted? Regardless, it would definitely be a useful addition for development.
FYI,
Git (for Windows!) cannot checkout the repository due to the below file name:
git clone https://github.com/roboyoshi/datacurator-filetree.git
Cloning into 'datacurator-filetree'...
remote: Enumerating objects: 64, done.
remote: Counting objects: 100% (64/64), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 634 (delta 10), reused 46 (delta 3), pack-reused 570
Receiving objects: 100% (634/634), 1.62 MiB | 2.01 MiB/s, done.
Resolving deltas: 100% (223/223), done.
**error: invalid path 'root/audio/books/Rowling, J. K./J. K. Rowling - [Harry Potter 01] - Harry Potter and the Philosopher's Stone.m4b'**
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
Cheers,
Borg
TODO: Compose a list of "forbidden" characters (= characters to avoid) for the three major operating systems (Windows, macOS, Linux/UNIX) and include it in one of the basic guidelines for file and folder naming.
Windows reserved characters:
< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
- Integer value zero, sometimes referred to as the ASCII NUL character.
- Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams.
- Any other character that the target file system does not allow.
see: https://superuser.com/a/1112140
macOS:

Type | Characters | Why to avoid |
---|---|---|
File separators | : (colon) / (forward-slash) \ (backward-slash) | You should avoid using colons and slashes in the names of files and folders because some operating systems and drive formats use these characters as directory separators. Consider substituting an underline (_) or dash (-) where you would normally use a slash or colon in a filename. |
Non-alphabetical and non-numerical symbols | ¢™$® | Non-alphanumeric characters may not be supported by all file systems or operating systems, or may be difficult to work with when exported to certain file formats such as EDL, OMF, or XML. |
Punctuation marks, parentheses, quotation marks, brackets and operators | . , [ ] { } ( ) ! ; " ' * ? < > \| | These characters are often reserved for special functions in scripting and programming languages. |
White space characters such as spaces, tabs, new lines and embedded returns | | Although OS X and Mac OS formatted disks support spaces in filenames, certain processing scripts and applications may not recognize these characters, or may treat your files differently than expected. Consider substituting an underline (_) or dash (-) where you would normally use spaces. |
see: https://support.apple.com/en-us/HT202808
Linux/UNIX:
/ (forward slash)
NUL
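The lists above can be turned into a simple check for a single file or folder name. The sketch below covers the Windows reserved characters and control characters (which include the Linux-forbidden slash and NUL). It also adds one rule that is not in the lists above: Windows does not accept names ending in a space or a period, which is the likely reason the 'Rowling, J. K.' folder from the failed clone is rejected. The function name `portability_problems` is an assumption for illustration.

```python
WINDOWS_RESERVED = set('<>:"/\\|?*')


def portability_problems(name):
    """List the reasons a single file or folder name may fail on some OS."""
    problems = []
    bad = sorted(c for c in set(name) if c in WINDOWS_RESERVED)
    if bad:
        problems.append("reserved characters: " + " ".join(bad))
    if any(ord(c) < 32 for c in name):
        problems.append("control characters (0-31)")
    # Extra rule, not part of the lists above: Windows rejects a trailing
    # space or period in a name.
    if name.endswith((" ", ".")):
        problems.append("ends with a space or period")
    return problems


path = ("root/audio/books/Rowling, J. K./J. K. Rowling - [Harry Potter 01] - "
        "Harry Potter and the Philosopher's Stone.m4b")
flagged = {part: portability_problems(part)
           for part in path.split("/") if portability_problems(part)}
print(flagged)
```

Each path component is checked on its own, because the forward slash is the separator here and would otherwise always be flagged.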
The TV Readme should provide guidance on how tv-series and tv-shows could be split, and especially why one might do it one way or the other.
Something like this:
US | UK | DE |
---|---|---|
Series (Television Series) | Show (Television Show) | Serie (Fernsehserie) |
Season | Series | Staffel |
Episode | Episode | Episode |
US | UK | DE |
---|---|---|
Show | Programme | Programm? |
Episode | Episode | Episode? |
This should help in choosing the correct folder name, or at least give some perspective.
I'd also like to give the option of splitting the tv section even further, like tv-animes, tv-cartoons, tv-series, etc.
Observation
The location datacurator-filetree/root/images/photos/ contains directories like personal-photos or other-photos.
The word photos in *-photos seems to be redundant, because information about the context is already contained in the parent directory.
Change proposition
Remove the word photos from every directory name in datacurator-filetree/root/images/photos/
I think it would be better to have subfolders for movies, games, etc. under soundtracks, since that folder will otherwise fill up.
Current one is a bit outdated.
We currently suggest that the given example is the way to go, but it actually isn't. One can freely choose how to name the files, and we should convey that by using multiple files with different names as examples:
The Godfather (1972).mkv (Also in line with the Plex Guidelines)
The.Godfather.1972.BluRay.Scene.Release.Foo.mkv (Scene Name)
godfather-1972.mkv (just a short name)
... + more
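Whichever style is chosen, tools can usually recover the same information from it. The sketch below pulls title and year out of the three example names. It is a rough heuristic under stated assumptions: the function name `title_and_year` is made up, the year is taken to be the first four-digit number starting with 19 or 20, and titles that themselves contain such a number would be split incorrectly.

```python
import re

YEAR = re.compile(r"(19|20)\d{2}")


def title_and_year(filename):
    """Split a movie file name into (title, year); year is None if not found."""
    stem = filename.rsplit(".", 1)[0]  # drop the file extension
    match = YEAR.search(stem)
    if not match:
        return stem, None
    # Everything before the year is the title; separators become spaces.
    title = re.sub(r"[._\-\s(]+", " ", stem[: match.start()]).strip()
    return title, int(match.group())


examples = [
    "The Godfather (1972).mkv",
    "The.Godfather.1972.BluRay.Scene.Release.Foo.mkv",
    "godfather-1972.mkv",
]
parsed = [title_and_year(name) for name in examples]
print(parsed)
```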
Where should we place our backup files?! This is a serious issue
TODO: