zuzoovn / machine-learning-for-software-engineers

A complete daily plan for studying to become a machine learning engineer.

Home Page: https://www.codementor.io/zuzoovn/how-i-plan-to-become-a-machine-learning-engineer-a4metbcuk

License: Creative Commons Attribution Share Alike 4.0 International

artificial-intelligence deep-learning machine-learning machine-learning-algorithms software-engineer

Top-down learning path: Machine Learning for Software Engineers

Inspired by Coding Interview University.

Translations: Brazilian Portuguese | 中文版本 | Français | 臺灣華語版本

How I (Nam Vu) plan to become a machine learning engineer

What is it?

This is my multi-month study plan for going from mobile developer (self-taught, no CS degree) to machine learning engineer.

My main goal was to find an approach to studying machine learning that is mainly hands-on and abstracts away most of the math for the beginner. This approach is unconventional because it is a top-down, results-first approach designed for software engineers.

Please feel free to contribute anything you think will make it better.




Why use it?

I'm following this plan to prepare for my near-future job: machine learning engineer. I've been building native mobile applications (Android/iOS/BlackBerry) since 2011. I have a Software Engineering degree, not a Computer Science degree. I have only a little basic knowledge of calculus, linear algebra, discrete mathematics, and probability and statistics from university. Think about my interest in machine learning:

I find myself in times of trouble.

AFAIK, there are two sides to machine learning:

  • Practical Machine Learning: This is about querying databases, cleaning data, writing scripts to transform data, gluing algorithms and libraries together, and writing custom code to squeeze reliable answers from data to satisfy difficult and ill-defined questions. It’s the mess of reality.
  • Theoretical Machine Learning: This is about math and abstraction and idealized scenarios and limits and beauty and informing what is possible. It is a whole lot neater and cleaner and removed from the mess of reality.

I think the best approach for a practice-focused methodology is something like 'practice, learning, practice'. Students first work through existing projects with known problems and solutions (practice) to get familiar with the traditional methods in the area and perhaps also with their methodology. After gaining some elementary experience, they can go to the books and study the underlying theory, which guides their later, more advanced practice and expands their toolbox for solving practical problems. Studying theory also deepens their understanding of that elementary experience and helps them acquire advanced experience more quickly.

It's a long plan. It's going to take me years. If you are familiar with a lot of this already it will take you a lot less time.

How to use it

Everything below is an outline, and you should tackle the items in order from top to bottom.

I'm using GitHub's special Markdown flavor, including task lists to track progress.

  • Create a new branch so you can check off items like this: just put an x in the brackets, [x]

More about GitHub-flavored Markdown
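If you keep your own copy of the plan as a Markdown file with task lists, a small script can tally how far along you are. The sketch below is only an illustration: the function name `task_progress` and the sample items are made up for this example, and it assumes the usual `- [ ]` / `- [x]` task list syntax.

```python
import re

# Matches task list items such as "- [ ] item" or "* [x] item".
TASK_RE = re.compile(r"^\s*[-*+]\s+\[( |x|X)\]\s+(.*)$")


def task_progress(markdown_text):
    """Return (done, total) for the task list items found in the text."""
    done = 0
    total = 0
    for line in markdown_text.splitlines():
        match = TASK_RE.match(line)
        if match is None:
            continue  # not a task list item
        total += 1
        if match.group(1) in ("x", "X"):
            done += 1
    return done, total


# Hypothetical excerpt of a study plan; the checked state is invented.
sample = """\
- [x] Machine learning overview
- [x] Machine learning mastery
- [ ] Machine learning is fun
"""

done, total = task_progress(sample)
print(f"{done}/{total} items checked")  # prints "2/3 items checked"
```

Running it against a real file would only need `task_progress(open(path).read())`, where `path` points at your branch's copy of the plan.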

Follow me

I'm a Vietnamese Software Engineer who is really passionate and wants to work in the USA.

How much did I work during this plan? Roughly 4 hours/night after a long, hard day at work.

I'm on the journey.


Don't feel you aren't smart enough

I get discouraged by books and courses that tell me as soon as I open them that multivariate calculus, inferential statistics and linear algebra are prerequisites. I still don’t know how to get started…

About Video Resources

Some videos are available only by enrolling in a Coursera or edX class. Enrolling is free, but sometimes a class is no longer in session, so you have no access and have to wait a couple of months. I'm going to be adding more videos from public sources and replacing the online course videos over time. I like using university lectures.

Prerequisite Knowledge

This short section consists of prerequisites/interesting info I wanted to learn before getting started on the daily plan.

The Daily Plan

No subject requires a whole day to understand fully, and you can do several of these in a day.

Each day I take one subject from the list below, read it cover to cover, take notes, do the exercises and write an implementation in Python or R.

Motivation

Machine learning overview

Machine learning mastery

Machine learning is fun

Machine Learning: An In-Depth Guide

Stories and experiences

Machine Learning Algorithms

Beginner Books

Practical Books

Kaggle knowledge competitions

Video Series

MOOC

Resources

Games

Becoming an Open Source Contributor

Podcasts

Communities

Conferences

  • Neural Information Processing Systems (NIPS)
  • International Conference on Learning Representations (ICLR)
  • Association for the Advancement of Artificial Intelligence (AAAI)
  • IEEE Conference on Computational Intelligence and Games (CIG)
  • IEEE International Conference on Machine Learning and Applications (ICMLA)
  • International Conference on Machine Learning (ICML)
  • International Joint Conferences on Artificial Intelligence (IJCAI)
  • Association for Computational Linguistics (ACL)

Interview Questions

My admired companies


machine-learning-for-software-engineers's Issues

Need advice about how to divide this roadmap into some groups.

From a soon-to-be machine learning developer's point of view, this roadmap can be divided into some groups: Newbie, Explorer, Adventurer, Trailblazer.

At the moment, I think all of the resources here belong to "Newbie".
Currently, I have no idea how to lay out a slightly more advanced path with math and theory.

Pull requests are always welcome.

Vietnamese language support?

Hi, I'm also a Vietnamese developer.

I think you should add a version of your repository in your mother tongue. All developers in Vietnam will be proud of you.

I know your repository consists almost entirely of links to other resources.

But I think you can add Vietnamese, because your repository already has other languages such as Chinese and Brazilian Portuguese.

Your repository is very useful. Machine learning is new to most people in Vietnam.

I hope it becomes popular.

English is not my main language, and I need to learn more.
So I will try to contribute to your repository in my free time.

Keep going and you will grow.
Thanks, @nam Vu.

Some links are dead

I'm opening this issue not only for myself but for anyone following this track, to report that some links are dead and to ask the author whether he can recommend similar ones:

  • Machine learning overview:
  • Deep Learning - A Non-Technical Introduction

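I tried translating an article; could you give me some feedback if possible?

What If I Am Not Good At Mathematics (What do I do if I'm not good at math?)

Original article (https://machinelearningmastery.com/what-if-im-not-good-at-mathematics/)

Subjective criticism is welcome, but please point to the specific problem, down to the exact sentence.
Draw whatever conclusions you like, but please offer an improvement, that is, a suggested translation.
Comments without constructive suggestions will be ignored.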

Practitioners of practical subjects can suffer from math envy.

People who work on practical projects may often envy those who are good at math.

This is where they think that mathematicians are smarter than they are and that they cannot excel in a subject until they “know the math”.

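That is because they feel mathematicians are smarter than they are, and that the reason they cannot do better in their own field is that they do not know the math.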

I have seen this first hand, and I have seen it stop people from getting started.

I have seen this with my own eyes: people are finished before they have even started.

In this post, I want to convince you that you can get started and make great progress in machine learning without being strong in mathematics.

In this article, I want to tell you that even without a strong math background, you can make great progress in machine learning.

Get Started and Learn by Doing

Learn While Doing: Use It in Order to Learn It

I didn’t learn boolean logic before I started programming.

Before I learned to program, I had not studied boolean logic.

I just started programming and you probably did too.

Then I just started programming, and I guess you did the same.

I followed an empirical path that involved trial and error. It is slow and I wrote a lot of bad code, but I was passionately interested and I could see progress.

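I relied on practice and constant trial and error. It really was slow and I wrote a lot of bad code, but because I could see the process and the progress, I found it especially fun.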

As I built larger and more complicated software programs I devoured textbooks because they let me build my programs better. I hunted for conceptual and practical tools I could use to overcome the limitations I was actually experiencing.

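When I wrote somewhat larger programs, I started chewing through reference books, because that let me write better programs. I looked for practical tools and theory that could get me past the bottleneck of my lack of experience.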

This was a powerful learning tool. If I had started out programming by being forced to learn boolean logic or concepts like polymorphism, my passion would never have been ignited.

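This is a powerful learning tool. If I had had to learn boolean logic or concepts like polymorphism before starting to program, I doubt I would have had the enthusiasm to begin.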

The Danger Zone

Danger Zone

I like it when my programs don’t work. It means I have to roll up my sleeves and really understand what is going on.

When my program does not work, I have to roll up my sleeves and really understand what is happening.

You can get a long way by copy and pasting code without really understanding it. You only need to understand blocks of code as functional units that do a thing you need done. Glue enough of them together and you have a program that solves the problem you need solved.

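Of course, you can keep copying code like this without really understanding it; you only need to understand blocks of code as functional units that do what you need done. Glue them together and you have a program that solves the problem you need solved. [Wrong, do not do this]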

This empirical hackery is a great way to learn fast, but a terrifying way to build production systems. This is an important distinction to make. The often spoken of “danger zone” is when systems built from empirical learning are made operational and the author does not really know how it works or what the results actually mean.

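This approach lets you learn fast, but it is terrifying when building a production system. This point is very important! The so-called "danger zone" refers to systems built through this kind of learning by practice that somehow still run, while the author is not clear how they work behind the scenes or what results they will actually produce.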

This is a very real problem. For example, take a look at some I.T. systems and webpages of small businesses that put up with this level of work.

This is a very real problem. For example, look at the small systems or small web pages that solve some business problem, and you will see.

In my mind, a prototype is a ball of copy-pasted mud held together with sticky tape that might sketch out what a solution could look like.

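In my view, a prototype is like a ball of mud: a pile of code stacked together with Ctrl+C and Ctrl+V that roughly sketches out the idea of a solution.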

An operational system or a system that produces results or decisions used operationally has no surprises. You feel comfortable having an all day code review with the team picking over every line of code.

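A production-grade (usable) system must have well-defined results (you know what output a given input will produce), with no surprises. When your team reviews the code, they can pull out any line and understand what it is doing, and even after a whole day of this it still feels smooth. That is a production-grade system.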

The Technician

The Engineer

You can get started in machine learning today, empirically. Three options available to you are:

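You can start learning artificial intelligence through direct hands-on practice; here are three options to choose from:

  • Learn to drive a tool like scikit-learn, R or WEKA. (First learn how to use scikit-learn, R or WEKA.)
  • Use libraries that provide algorithms and write little programs. (Use other people's algorithm libraries, so you only need to write a little code.)
  • Implement algorithms yourself from tutorials and books. (Implement the algorithms yourself by following tutorials.)

More than options, this can be the path of the technician from beginner to intermediate that is learning the mathematics required for a technique, just-in-time.

This is also the path from beginner to intermediate machine learning technician: learn each piece of math when you need it, just-in-time, learning as you use it.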
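To illustrate the third option, here is a minimal sketch of implementing an algorithm yourself: k-nearest neighbours in plain Python. The article does not name this algorithm; it is chosen here only because it is short. The function names and the toy data are assumptions made up for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest points."""
    # Sort training examples by distance to the query and keep the k nearest.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters (invented for illustration).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))  # prints "a"
print(knn_predict(points, labels, (5.1, 5.0)))  # prints "b"
```

Writing it by hand like this is slower than calling a library, but it exposes every step (distance, sorting, voting) that a tool would otherwise hide.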

Define small problems, solve them methodically and present the results of what you have learned on your blog. You will start to build up some momentum following this process.

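First define a small problem, solve it methodically, and put the results and what you learned on your blog. This will give you some momentum to keep going.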

There will be interesting algorithms that you will want to know more about, such as what a particular parameter actually does when you change it or how to get better results from a particular algorithm.

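Perhaps you will want to learn more about some of the more interesting algorithms, for example by changing certain parameters, or by getting better results from a particular algorithm.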
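One way to see what a parameter actually does is to change it and watch the output. The sketch below is an assumed example, not something from the article: it varies k in a tiny one-dimensional nearest-neighbour classifier on invented data that contains a single noisy label.

```python
from collections import Counter


def knn_1d(xs, ys, query, k):
    """Majority label among the k training values closest to `query`."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Invented data: small values are "low", large values are "high",
# except 3.4, which is deliberately mislabelled to act as noise.
xs = [1.0, 2.0, 3.0, 3.4, 4.0, 8.0, 9.0, 10.0]
ys = ["low", "low", "low", "high", "low", "high", "high", "high"]

results = {k: knn_1d(xs, ys, 3.5, k) for k in (1, 3, 5)}
for k, label in results.items():
    print(f"k={k}: {label}")
# k=1: high   (the single nearest point is the noisy one)
# k=3: low    (the vote outweighs the noisy point)
# k=5: low
```

With k=1 the prediction follows the noisy point; with larger k the vote smooths it out. Seeing that change is often what prompts the question of why it happens.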

This will drive you to want (need) to understand how that technique really works and what it is doing. You may draw pictures of data flow and transformations, but eventually, you will need to internalize the vector or matrix representations and transformations that are occurring, only because it is the best tool we have available to clearly and unambiguously describe what is going on.

This will drive you to understand how the technique works behind the scenes and what it does. You can draw some data flow diagrams, but in the end it still has to be internalized in matrix form, because matrices are the best clear, unambiguous tool for describing what is happening.
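A small sketch of what moving from a data-flow picture to a matrix representation can look like, assuming a linear model as the example (the article does not pick one). The first function follows the picture one input at a time; the second expresses the same computation as a single matrix-vector product. All names and numbers are illustrative.

```python
def predict_one(features, weights, bias):
    """Data-flow view: multiply each feature by its weight and add them up."""
    total = bias
    for x, w in zip(features, weights):
        total += x * w
    return total


def matvec(matrix, vector):
    """Matrix view: one dot product per row of the matrix."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three examples, two features each
w = [0.5, -1.0]                            # one weight per feature
b = 2.0                                    # bias term

loop_predictions = [predict_one(row, w, b) for row in X]

# Append a constant 1 to every row so the bias becomes one more weight;
# the whole model is then the single product X_aug * w_aug.
X_aug = [row + [1.0] for row in X]
w_aug = w + [b]
matrix_predictions = matvec(X_aug, w_aug)

print(loop_predictions)    # [0.5, -0.5, -1.5]
print(matrix_predictions)  # [0.5, -0.5, -1.5]
```

Both views give the same numbers; the matrix form simply states the whole computation in one expression.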

You can remain the empiricist. I call this the path of the technician.

You can continue with this empirical approach; I call it the path of the technician.

You can build up an empirical intuition of which methods to use and how to use them. You can also learn just enough algebra to be able to read algorithm descriptions and turn them into code.

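You can rely on empirical intuition to know which methods to use and how to use them. You can also learn algebra so that you can read algorithms and turn them into code.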
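As an example of reading an algorithm description and turning it into code, take the textbook formulas for fitting a straight line y = slope * x + intercept by least squares: slope = sum((x_i - mean_x) * (y_i - mean_y)) / sum((x_i - mean_x)^2), and intercept = mean_y - slope * mean_x. The article does not mention this method; it is used here only as a small, well-known case, and the function name and data are made up.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Numerator and denominator of the slope formula, term by term.
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Invented data that lies exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Each line of the function corresponds to one symbol in the formula, which is roughly the amount of algebra needed to make the translation.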

There is a path here for the skilled technician to create tools, plug-ins and even operational systems that use machine learning.

This is a path on which technicians can create tools, plug-ins, and even operating systems that use machine learning.

The technician is contrasted to the theoretician at the other end of the scale. The theoretician can:

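The technician differs from the theoretician. The theoretician can:

  • Internalize existing methods. (Absorb the existing methods.)
  • Propose extensions to existing methods. (Extend existing methods.)
  • Devise entirely new methods. (Invent an entirely new method.)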

The theoretician may be able to demonstrate the capability of a method in the abstract, but is likely insufficiently skilled to turn the methods into code beyond prototype demonstration systems at best.

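The theoretician has the ability to prove methods in the abstract, but is not very efficient at turning theory into prototypes and then into code.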

You can learn as little or as much mathematics as you like, just in time. Focus on your strengths and be honest about your limitations.

You can also learn some math, if you like, as you need it. Focus on your own strengths, and face your weaknesses honestly.

Mathematics is Critical, Later

Later On, Mathematics Is the Key

If you have to learn linear algebra just-in-time, why not learn it fully more completely up front and understand the machine learning methods at this deep level from the beginning?

If you need to learn linear algebra as you go, why not study it completely, thoroughly, and in depth from the very beginning?

This is certainly an option, perhaps the most efficient option which is why it is the path used to teach in university. It’s just not the only option available to you.

That is certainly one way, perhaps the most effective way, which is why universities all teach it like this. But it is not your only option.

Just like learning to program by starting with logic and abstract concepts, internalizing machine learning theory may not be the most efficient way for you to get started.

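Just like learning to program by starting from logic and abstract concepts, internalizing (fully mastering) machine learning theory may not be the best way to get started.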

In this post, you learned that there is a path available for the technician separate from that of the theoretician.

In the text above, you learned that engineers can learn by different means than theoreticians.

You learned that the technician can learn the mathematical representations and descriptions of machine learning algorithms just-in-time. You also learned that the danger zone for the technician is overconfidence and the risk of putting systems into production that are poorly understood.

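You learned that technicians can learn the mathematics behind machine learning algorithms "as needed". You also learned that the danger zone for technicians is overconfidence and releasing code they do not really understand into production.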

This might be a controversial post, leave a comment and let me know what you think.

This may be a controversial post; leave a comment and let me know what you think.

Korean language support

Awesome work putting this together. I'm a trilingual dev, and I think it'd be a good idea to get more translations going like the coding interview university repo :) I can definitely help out with translating this into Korean.

Spanish language support

Hi @ZuzooVn !
First of all, thanks for this wonderful repo. I really appreciate the effort you put into all this.
I am available for Spanish and/or Catalan translation if you want!

This is not really an issue, but a possible enhancement.

Cheers
