A frequent request in the community is a structured roadmap for learning Data Engineering, and it's about time we addressed it. We currently have a getting started guide, but it's not detailed enough and was always meant to be improved on.
It's a complex question to answer, but we can simplify it by adding a few constraints. Since most of the folks asking are new to DE or trying to transition into it, we should focus on skills for junior/entry-level and mid-level roles. While there aren't many junior roles at the moment, the distinction is still useful for identifying foundational skills. To keep it as general as possible, I believe we should exclude tools and requirements that only apply to FAANG-like companies, since they are more niche and those companies have often built their own internal tooling to solve their unique problems. Finally, the focus should be on core concepts rather than tooling. We can mention specific tools, but we should avoid directly recommending them and instead point learners to pages that list the currently popular tools, which keeps this resource as evergreen as possible (example: workflow orchestration popular tools).
While I don't believe a diagram is a requirement, I do think it could be helpful if we can get it to render nicely in Mermaid, because we could then make it interactive and link to other notes in the wiki like we do with other diagrams. Obsidian Publish doesn't support the canvas feature yet, so we would probably use a Mermaid flowchart for now.
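To make the flowchart idea concrete, here is a rough sketch of what a linked diagram could look like. The node names are placeholders I made up from the constraints above, not proposed roadmap content, and it assumes Obsidian's `internal-link` class for Mermaid nodes, where each node's label has to match the name of an existing note. Whether this renders the same way on the published site would need checking.

```mermaid
flowchart TD
    %% Placeholder nodes only: swap these for real note names in the wiki.
    Foundations["Foundational Skills"] --> Core["Core Concepts"]
    Core --> Tools["Popular Tools"]

    %% Assumption: the internal-link class turns each node into a link
    %% to the note whose name matches the node label.
    class Foundations,Core,Tools internal-link;
```

Keeping the tool lists as a single linked node rather than naming individual tools in the diagram fits the goal of keeping the roadmap evergreen.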
Existing popular roadmap shared in the community:
For V1, please share any thoughts, ideas, or constructive criticism on the structure and core concepts. I'll start a new branch after Christmas with a first draft we can work from.