My graduation project for the "Data Science" course taught by instructor Engin Deniz Alpman. (Patika.dev)
veribilimi's Introduction
📊📋Data Science🧮🗂
⚙ What is data and Data Science? 🤔
Data is everything we can perceive and describe.
For example, the population of Turkey is data. So are the population of Germany, the population of the world, and everyday things like dogs, cats, houses, and schools.
Subcategories of data:
Numeric Data
Categorical Data
When we look at Numeric Data closely we'll see:
Continuous (Interval)
Discrete (Ratio)
When we look at Categorical Data closely we'll see:
Binary
Multiclass
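As a quick sketch of these four subtypes, here are some illustrative Python values (my own made-up examples, not from the course):

```python
# Illustrative examples of the data subtypes above (hypothetical values).

numeric_continuous = 36.6        # body temperature in °C: any value in a range
numeric_discrete = 3             # number of pets: whole counts only
categorical_binary = "yes"       # smoker? exactly two possible classes
categorical_multiclass = "red"   # eye color: more than two classes

for name, value in [
    ("continuous", numeric_continuous),
    ("discrete", numeric_discrete),
    ("binary", categorical_binary),
    ("multiclass", categorical_multiclass),
]:
    print(f"{name:12} -> {value!r} ({type(value).__name__})")
```

Note how the numeric subtypes map naturally to `float` and `int`, while both categorical subtypes are just labels.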
Although I have simplified the meaning of the term "data", "Data Science" is actually a broad concept that encompasses mathematics and statistics, custom programming, advanced analytics, artificial intelligence (AI), and machine learning. Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract information and insights from structured and unstructured data.
Data Science is collected under 3 main headings.
A programming language is the communication tool we use to tell the computer what we want. Deep Learning is a sub-branch of Machine Learning, and Machine Learning is a sub-branch of Data Science.
Machine Learning has 2 areas:
There is no absolute zero reference on an "interval" scale, but there is on a "ratio" scale.
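A small sketch of why this matters, using temperature as my own illustrative example (not from the course): Celsius is an interval scale, Kelvin a ratio scale.

```python
# Interval vs. ratio scales, using temperature as a hypothetical example.
# Celsius is an interval scale: 0 °C does not mean "no temperature",
# so ratios of Celsius values are meaningless.
# Kelvin is a ratio scale: 0 K is an absolute zero, so ratios make sense.

def celsius_to_kelvin(c: float) -> float:
    return c + 273.15

t1_c, t2_c = 10.0, 20.0
naive_ratio = t2_c / t1_c                                 # 2.0, but physically meaningless
true_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print(f"misleading Celsius ratio: {naive_ratio:.2f}")
print(f"actual Kelvin ratio:      {true_ratio:.2f}")      # ≈ 1.04, not 2
```

In other words, 20 °C is not "twice as hot" as 10 °C, because 0 °C is an arbitrary reference point rather than an absolute zero.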
A categorical variable is not continuous.
Prediction: We have a lot of data, and we try to correctly guess the answer to a question from it. For example, given data such as a flower's height, leaves, and color, we can predict whether the flower is poisonous.
Mapping: f(x1, x2, x3) = ŷ, and we compare ŷ with y. The function takes the features x1, x2, x3 as input and produces "ŷ" as output. "ŷ" is the model's prediction, and "y" is the truth. Our main goal in mapping is to minimize the error that occurs: error = e(ŷ, y).
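The mapping and error function above can be sketched in a few lines of Python. The linear form and the weights below are my own illustrative choices, not a trained model from the course:

```python
# A minimal sketch of the mapping f(x1, x2, x3) = ŷ and the error e(ŷ, y).
# The weights are made-up illustrative numbers, not a trained model.

def f(x1: float, x2: float, x3: float) -> float:
    """A toy linear model producing a prediction ŷ."""
    w1, w2, w3, b = 0.5, -0.2, 1.0, 0.1   # hypothetical parameters
    return w1 * x1 + w2 * x2 + w3 * x3 + b

def e(y_hat: float, y: float) -> float:
    """Squared error between prediction ŷ and truth y."""
    return (y_hat - y) ** 2

y_hat = f(1.0, 2.0, 3.0)          # ŷ = 0.5 - 0.4 + 3.0 + 0.1 = 3.2
print("prediction:", y_hat)
print("error:", e(y_hat, 3.0))    # truth y = 3.0 -> error ≈ 0.04
```

Training a model means adjusting the parameters w1, w2, w3, b so that this error gets as small as possible.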
Even if e = 0 on the known data, there is no guarantee that e = 0 on unknown data. The main purpose is to minimize the error that will arise on unseen data while training the machine on the "train" set. So, how can we do this?
The answer is: I train the model on the "train" set and tune its hyperparameters on the validation set. But since I updated the hyperparameters based on good validation performance, my model starts to overfit the validation set, albeit indirectly. So I also need data it has never seen, to test it fairly.
In short 🤐, I split the data into 3 parts:
Train
Validation
Test
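A minimal sketch of this three-way split in plain Python (the 70/15/15 ratios are my own illustrative choice, not from the course):

```python
import random

# A sketch of splitting a dataset into train / validation / test.
# The 70/15/15 ratios are illustrative, not prescribed by the course.

random.seed(42)
data = list(range(100))          # stand-in for 100 samples
random.shuffle(data)             # shuffle so the split is random

n = len(data)
n_train = int(n * 0.70)
n_val = int(n * 0.15)

train = data[:n_train]
val = data[n_train:n_train + n_val]
test = data[n_train + n_val:]

print(len(train), len(val), len(test))   # 70 15 15
```

The model is fit on `train`, hyperparameters are tuned against `val`, and `test` is touched only once, at the very end, to get an honest estimate of performance on unseen data.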
⚙ What is Bias in Statistics? 🤔
Bias is when a model systematically discriminates. Models carry the ideas of the people who created them; that is why every model is only as objective as its designer. (See "overestimate" and "underestimate".) 👉https://www.statisticshowto.com/what-is-bias/
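The statistical side of bias (systematic over- or underestimation) can be sketched with a classic example, chosen by me as an illustration: the "divide by n" variance estimator consistently underestimates the true variance, while "divide by n−1" (Bessel's correction) does not.

```python
import random

# A sketch of statistical bias: the "divide by n" variance estimator
# systematically underestimates the true variance, while "divide by n-1"
# (Bessel's correction) does not. Data here are simulated, not real.

random.seed(0)
TRUE_MEAN, TRUE_SD = 0.0, 1.0          # population variance = 1.0
n, trials = 5, 20000

biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    biased_sum += ss / n               # biased estimator
    unbiased_sum += ss / (n - 1)       # unbiased estimator

print(f"biased estimate:   {biased_sum / trials:.3f}")   # ≈ 0.8, underestimates
print(f"unbiased estimate: {unbiased_sum / trials:.3f}") # ≈ 1.0
```

The biased estimator lands around (n−1)/n = 0.8 of the true variance no matter how many trials we average, which is exactly what "systematic" means: the error does not wash out with more data.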