Comments (5)

jamiros commented on May 17, 2024

That's awesome! Thank you for sharing that!

from data-engineer-roadmap.

joseluistello commented on May 17, 2024

'Documenting APIs: A guide for technical writers and engineers' is excellent material too.


ysgurjar commented on May 17, 2024

Thank you. @alexandraabbas and other folks, I am struggling to find good resources for data structures and algorithms, Linux, and serialisation. Additionally, I am not sure how much time I should be spending on each of these. There aren't any courses on the data stack at this level. Suggestions?


datatalking commented on May 17, 2024

@ysgurjar it really depends on where your skills fall within the full range, as data engineering covers a wide swath of technologies and experience levels. Most of my work involves more scientific processing of data, so I use linear algebra and matrix equations almost weekly. Looking at my bookshelf, I have eight books that I bought but really only use three or four.

  0. I've been using the 'Data Engineering for python' book and found it helps me. What language do you use, @ysgurjar?

  1. 'Data Structures and Algorithms Made Easy' by Narasimha, which is 400+ pages and written in C, so I bribe a friend to translate enough of it to Python that I can grok it.
  2. 'Methods of Multivariate Analysis', also known as the 'Rencher' book, is a deep dive into almost all of the algebra used in everything from NLP to ML and DL. Many seem to love it, but it's an advanced read.
  3. 'Intro to Algorithms': I had good luck working through it together with a friend, who helped translate concepts from its 1,300 pages into solving problems, so it's a deep resource for me.
  4. If you are going to do the algebra that computes the stats, I had luck with 'Pearson Stats' and 'An Introduction to Statistical Methods and Data Analysis', 7th edition, by Ott and Longnecker.
  5. Duke University open-sourced, I think, all of their classes, similar to what MIT did, so there is a wealth of data. As part of my MATH342, the professor recommended 'Introduction to Modern Statistics' by Mine Çetinkaya-Rundel and Johanna Hardin.


sarahgetter commented on May 17, 2024

@ysgurjar Thanks for your list! I heartily endorse these O'Reilly books:

  1. 'Fundamentals of Data Engineering' by Joe Reis and Matt Housley
  2. 'Practical Statistics for Data Scientists' by Peter Bruce, Andrew Bruce and Peter Gedeck
  3. 'Data Science from Scratch' by Joel Grus
  4. 'Creating a Data-Driven Organization' by Carl Anderson
  5. 'Beautiful Visualization' by Julie Steele and Noah Iliinsky

I have thoroughly enjoyed 'Introduction to Design and Analysis of Experiments' by George W. Cobb, but would say this falls more into the realm of data science than data engineering.

'Beautiful Visualization' might feel outside of the data engineering umbrella, too, but it helped me understand the use cases for different levels of time granularity as they relate to how best to represent patterns and trends. This helped me decide when my materialization layers should offer up millisecond-level granularity, and when there is no need for per-event data and the smallest period rollup can be a day. This book was also quite helpful for stepping into an "is this the most usable version for my Tableau-utilizing analysts" perspective and stepping outside of my optimization-obsessed engineering perspective.
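
To make the granularity trade-off concrete, here is a minimal Python sketch of rolling per-event, millisecond-timestamped data up to a daily summary. It is only an illustration: the function name `daily_rollup`, the `(epoch_ms, value)` event shape, and the sample events are all hypothetical, and it assumes days are bucketed in UTC.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_rollup(events):
    """Aggregate (epoch_ms, value) events into a per-day count and total.

    Assumption: days are bucketed in UTC and keyed by ISO date string.
    """
    rollup = defaultdict(lambda: {"count": 0, "total": 0.0})
    for epoch_ms, value in events:
        # Convert the millisecond timestamp to a UTC calendar day.
        day = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).date().isoformat()
        rollup[day]["count"] += 1
        rollup[day]["total"] += value
    return dict(rollup)


# Hypothetical sample: two events milliseconds apart, one a day later.
events = [
    (1_700_000_000_123, 2.0),
    (1_700_000_000_456, 3.0),
    (1_700_086_400_000, 5.0),
]

for day, stats in sorted(daily_rollup(events).items()):
    print(day, stats)
```

The per-event rows keep every millisecond-level detail, while the rollup keeps one row per day; which one a materialization layer should expose depends on what the analysts downstream actually need to see.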

