csharp-notebooks's Introduction

.NET Interactive Notebooks for C#

Welcome to the home of .NET interactive notebooks for C#!

How to Install

VS Code

  1. Download the .NET Coding Pack for VS Code for Windows or macOS.
  2. Install the Polyglot Notebooks extension.

C# 101

Download or clone this repo and open the csharp-101 folder in VS Code to get started with the C# 101 notebooks. Or, if you prefer, click one of the notebook links below to open it automatically in VS Code.

# | Topic | Notebook Link | Video Link | Documentation
1 | Hello World | 01 Notebook | 01 Video | Intro to C#
2 | The Basics of Strings | 02 Notebook | 02 Video | Intro to C#
3 | Searching Strings | 03 Notebook | 03 Video | Intro to C#
4 | Numbers and Integers Math | 04 Notebook | 04 Video | Numbers in C#
5 | Numbers and Integer Precision | 05 Notebook | 05 Video | Numbers in C#
6 | Numbers and Decimals | 06 Notebook | 06 Video | Numbers in C#
7 | Branches (if) | 07 Notebook | 07 Video | Branches and Loops in C#
8 | What Are Loops? | 08 Notebook | 08 Video | Branches and Loops in C#
9 | Combining Branches and Loops | 09 Notebook | 09 Video | Branches and Loops in C#
10 | Arrays, Lists, and Collections | 10 Notebook | 10 Video | Arrays, Lists, and Collections in C#
11 | Search, Sort, and Index Lists | 11 Notebook | 11 Video | Arrays, Lists, and Collections in C#
12 | Lists of Other Types | 12 Notebook | 12 Video | Arrays, Lists, and Collections in C#
13 | Objects and Classes | 13 Notebook | 13 Video | Object Oriented Coding in C#
14 | Methods and Members | 14 Notebook | 14 Video | Object Oriented Coding in C#
15 | Methods and Exceptions | 15 Notebook | 15 Video | Object Oriented Coding in C#

Machine Learning

Download or clone this repo and open the machine-learning folder to get started with the machine-learning notebooks.

Getting Started Series

# | Topic | Notebook Link
1 | Intro to Machine Learning | 01 Notebook
2 | Data Prep and Feature Engineering | 02 Notebook
3 | Training and AutoML | 03 Notebook
4 | Model Evaluation | 04 Notebook
5 | AutoML Sweepable API | 05 Notebook
6 | AutoML Tuners | 06 Notebook

End to End (E2E) Notebooks - examples of the entire ML process.

# | Topic | GitHub Link
E2E | Classification using AutoML (Iris Dataset) | Iris E2E AutoML
E2E | Forecasting using Regression (Luna Dataset) | Luna E2E Regression
E2E | Forecasting using SSA (Luna Dataset) | Luna E2E SSA
E2E | Regression using AutoML (Taxi Dataset) | Taxi E2E AutoML
E2E | Text Classification API (Yelp Dataset) | Text Classification API

Reference Notebooks

# | Topic | GitHub Link
REF | Data Processing with DataFrame | Data Frame
REF | Graphs and Visualizations | Visualizations
REF | Kaggle Competitions (Titanic Dataset) | Kaggle
REF | AutoML Search Space | AutoML Search Space

.NET Foundation

.NET Interactive Notebooks for C# is a .NET Foundation project.

There are many .NET related projects on GitHub.

  • .NET home repo - links to 100s of .NET projects, from Microsoft and the community.
  • ASP.NET Core home - the best place to start learning about ASP.NET Core.

This project has adopted the code of conduct defined by the Contributor Covenant to clarify expected behavior in our community. For more information, see the .NET Foundation Code of Conduct.

License

.NET (including the csharp-notebooks repo) is licensed under the MIT license.


csharp-notebooks's Issues

Are ML.NET notebooks using non-public #r nuget references?

Are the ML.NET notebooks using non-public preview nugets?

I tried to run the first ML.NET notebook 01- Intro to ... I get the following error:

#r "nuget: Microsoft.ML, 2.0.0-preview.22356.1"
C:\Users\nicho\AppData\Local\Temp\nuget\20332--ab38aca3-7c58-4abd-98f6-0079cd1c2c87\Project.fsproj : error NU1102: Unable to find package Microsoft.ML with version (>= 2.0.0-preview.22356.1)
C:\Users\nicho\AppData\Local\Temp\nuget\20332--ab38aca3-7c58-4abd-98f6-0079cd1c2c87\Project.fsproj : error NU1102:   - Found 33 version(s) in nuget.org [ Nearest version: 2.0.0-preview.22313.1 ]
C:\Users\nicho\AppData\Local\Temp\nuget\20332--ab38aca3-7c58-4abd-98f6-0079cd1c2c87\Project.fsproj : error NU1102:   - Found 0 version(s) in Microsoft Visual Studio Offline Packages

This is the latest on nuget.org: #r "nuget: Microsoft.ML, 2.0.0-preview.22313.1"

Translate to Local Language

I have an idea to translate this into Sinhalese (Sri Lanka).

If I create a new branch on my local repo, will it be merged as a separate branch at the base?

Thanks

Alien translator: modify a collection in a foreach loop

The last code block of the Alien translator notebook lacks clarity.

Since you use the same collection variable before and after the foreach loop, it seems like you want the "student" to modify the collection in-place, which is not possible using a foreach loop.

Since the notebook is aimed at new C# programmers, it might be better to hint at a clearer pattern.

Maybe I'm missing something?
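
To make the concern concrete, here is a minimal sketch of why a foreach loop is a poor fit for changing a collection, together with two clearer patterns. It does not reproduce the notebook's code: the class name `ForeachDemo`, its methods, and the sample words are hypothetical and chosen only for illustration. The iteration variable of a foreach loop cannot be assigned to (the compiler rejects it), and adding to a `List<T>` while it is being enumerated throws `InvalidOperationException` at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForeachDemo
{
    // Returns true if adding to the list while enumerating it throws.
    public static bool AddInsideForeachThrows()
    {
        var words = new List<string> { "hello", "world" };
        try
        {
            foreach (var word in words)
            {
                // Assigning to `word` would not compile; adding to the list
                // compiles but invalidates the enumerator.
                words.Add(word + "!");
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Pattern 1: an index-based for loop can replace elements in place.
    public static void UppercaseInPlace(List<string> words)
    {
        for (int i = 0; i < words.Count; i++)
        {
            words[i] = words[i].ToUpperInvariant();
        }
    }

    // Pattern 2: build a new list and leave the original untouched.
    public static List<string> Uppercased(IEnumerable<string> words) =>
        words.Select(w => w.ToUpperInvariant()).ToList();
}
```

Either pattern makes the intent explicit to a beginner: the for loop says "change this collection", while the second version says "produce a new collection from the old one".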

ML Notebook improvements

Thanks @andrasfuchs for trying out the notebooks and providing feedback:

This feedback is specific to the Training and AutoML notebook, but it's also good feedback in general to follow across notebooks.

  • The capitalization of "ML.NET" is not consistent in the document

  • The "This notebook is under active development" is barely visible with Visual Studio's dark theme
    image

  • The tutorial is easy to follow, I just missed the links to the different classes and methods (like .Fit, IDataView, Regression.Trainers, TrainTestSplit, etc.), because although they are familiar it would be nice to have a way to jump to that particular class or method to investigate it further. F1 doesn't seem to work as it normally does when I browse my own code.

  • It would have been a little better if the sample datasets had more than one feature column, just to see what Transforms.Concatenate should look like in that case

  • In the "Use AutoML..." section, when I ran the code it only showed a grey rectangle while it was running. When it finished, the rectangle remained grey and all the previous outputs were also shown as grey rectangles.

    image

    When I closed the file, reopened it and clicked "Run All", I got the following exception:

    image

ML Notebook cleanup

  • Ensure Model Builder references are removed.
  • Link to appropriate reference sections
  • Spell check
  • Content
  • Remove "Unknown => " from the pipeline. Rename "Pipeline" to "Trainer". Add parameters column.

image

Local and Authenticated NuGet source break Package Restore

The machine learning notebooks reference Microsoft.ML version 2.0.0-preview.22356.1 from a private Azure DevOps server. This preview version of the Microsoft.ML package is not available from nuget.org, which makes it hard to run the notebooks.

PackageManagement Error 3217 Invalid URI: The format of the URI could not be determined.

What is the reason to support two versions of notebooks?

I had forked this repository and started to translate it to Ukrainian (the first notebook is done).
But there are two versions of each notebook (dib and ipynb). I don't want to double my work.
Which one is better to choose?
What is the reason to support two versions of notebooks?

ML Notebook Enhancements

  • Add reference documentation links to classes like transforms and trainers (e.g., LightGBM).
  • Include parameter names in method calls (e.g., mlContext.Data.TrainTestSplit(data, testFraction: 0.2)).
  • Use real data for examples. It makes it easier to understand the problem that's being solved, as opposed to randomly generated data.
  • Watch for code comments. Instead of embedding them in the code, promote them to text in a Markdown cell.

  • Put related code together. Break up cells containing large chunks of code and add Markdown cells explaining what each of the cells is doing.

Example

Original

var context = new MLContext(seed: 1);
var pipeline = context.Transforms.Concatenate("Features", "X")
  .Append(context.Auto().Regression("y", useLbfgs: false, useSdca: false, useFastForest: false));

var monitor = new NotebookMonitor();
var experiment = context.Auto().CreateExperiment();
experiment.SetPipeline(pipeline)
  .SetEvaluateMetric(RegressionMetric.RootMeanSquaredError, "y")
  .SetTrainingTimeInSeconds(30)
  .SetDataset(trainTestSplit.TrainSet, trainTestSplit.TestSet)
  .SetMonitor(monitor);

// Configure Visualizer
monitor.SetUpdate(monitor.Display());

var res = await experiment.RunAsync();

Update

Initialize MLContext

MLContext is the starting point for all ML.NET applications.

var context = new MLContext(seed: 1);

Define training pipeline

  • Concatenate: Takes the input column X and creates a feature vector in the Features column.
  • Regression: Defines the task AutoML needs to find the best algorithm and hyperparameters for. In this case, Lbfgs, Sdca, and FastForest algorithms won't be explored since their respective parameters are set to false.

var pipeline = context.Transforms.Concatenate("Features", "X")
  .Append(context.Auto().Regression("y", useLbfgs: false, useSdca: false, useFastForest: false));

Initialize Monitor

The notebook monitor provides visualizations of the training progress as AutoML tries to find the best model for your data.

var monitor = new NotebookMonitor();

Initialize AutoML Experiment

An AutoML experiment is a collection of trials in which algorithms are explored.

var experiment = context.Auto().CreateExperiment();

Configure AutoML Experiment

The AutoML experiment searches for the best algorithm according to an evaluation metric, which in this case is Root Mean Squared Error (RMSE). The goal is to find the model with the best (lowest) RMSE within the allotted training time, which is set to 30 seconds. The longer you train, the more algorithms and hyperparameters AutoML is able to explore. The training set is the dataset that AutoML uses to train the model, and the test set is used to calculate the evaluation metric to see how well a particular model selected by AutoML performs.

experiment.SetPipeline(pipeline)
  .SetEvaluateMetric(RegressionMetric.RootMeanSquaredError, "y")
  .SetTrainingTimeInSeconds(30)
  .SetDataset(trainTestSplit.TrainSet, trainTestSplit.TestSet)
  .SetMonitor(monitor);

Set monitor to display

monitor.SetUpdate(monitor.Display());

Run AutoML experiment

var res = await experiment.RunAsync();

  • NotebookMonitor: Display evaluation metric for best trial, active trial, and y-axis on graph.
  • When adding feeds, add link to document on how to reference them in VS / dotnet CLI
  • When installing NuGet packages that are not part of the BCL, list them in a Markdown cell where the packages are installed, and add a link to NuGet (e.g., Microsoft.ML).

User Input

Within these interactive notebooks, I am unable to find any examples of how to get a user's input. When I try Console.ReadLine(), it proceeds to execute the next line of code.

Example:
string[] answer = new string[10];

for (int i = 0; i < answer.Length; i++)
{
    answer[i] = Console.ReadLine();
}

With this code, the cell keeps running for me with no prompt (like with Python) to enter an input.
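
The behaviour described above suggests that the notebook kernel has no interactive standard input for Console.ReadLine() to read from. One possible workaround, sketched below, is to take the prompt as a function so the same loop works with whatever input mechanism the host offers. The `AnswerCollector` class and its `CollectAsync` method are hypothetical names introduced here. The commented notebook usage assumes that the `Kernel.GetInputAsync` prompt API from the Microsoft.DotNet.Interactive namespace is available in your version of the extension, so treat that part as an assumption to verify.

```csharp
using System;
using System.Threading.Tasks;

public static class AnswerCollector
{
    // Asks `count` questions through the supplied prompt function
    // and returns the replies in order.
    public static async Task<string[]> CollectAsync(int count, Func<string, Task<string>> ask)
    {
        var answers = new string[count];
        for (int i = 0; i < answers.Length; i++)
        {
            answers[i] = await ask($"Answer {i + 1} of {count}:");
        }
        return answers;
    }
}

// Possible usage in a notebook cell (assumes Microsoft.DotNet.Interactive
// exposes Kernel.GetInputAsync in your environment):
//
//   using Microsoft.DotNet.Interactive;
//   var answers = await AnswerCollector.CollectAsync(10, prompt => Kernel.GetInputAsync(prompt));
```

Passing the prompt as a delegate also keeps the loop usable in a console application, where the delegate can wrap Console.ReadLine() instead.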

Getting a lot of Errors: "Kernel Failed To Start."

I have VS2022 installed along with the Notebook Editor extension. I just opened the getting started folder and got the error below.
I am able to build C# projects with VS2022, and the C# Interactive window also works in VS2022.

Kernel Failed To Start.
Cannot find a tool in the manifest file that has a command named 'dotnet-interactive'.
error: Cannot find a tool in the manifest file that has a command named 'dotnet-interactive'. 
    at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__143`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Notebook.Utils.DetectKernelStatusService.<ExecuteTaskAsync>d__3.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Notebook.Utils.RepeatedTimeTaskService.<>c__DisplayClass7_0.<<ExecuteAsync>b__1>d.MoveNext()
error: Cannot find a tool in the manifest file that has a command named 'dotnet-interactive'. 
    at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__143`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Notebook.Utils.DetectKernelStatusService.<ExecuteTaskAsync>d__3.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
[...]

Machine Learning

Add H1 heading to machine learning notebooks with title of the notebook

ML Notebook consumes all the available memory, forcing Windows to close processes

The Training and AutoML notebook can consume a lot of memory, causing other processes to hang or crash.

Strangely enough, it usually works fine if you run the notebook only once. So to reproduce the problem, you should:

  1. Open Windows Task Manager, and check your memory usage
  2. Open Training and AutoML notebook
    image
  3. Run its snippets one by one, but stop at "Use AutoML to simplify trainer selection and hyper-parameter optimization."
    image
  4. Run the "Use AutoML to simplify trainer selection and hyper-parameter optimization" code.
    image
  5. Sometimes it works fine, but last time at this point my system hung, terminated some VS processes, and closed my browser unexpectedly. Memory consumption dropped back to ~950 MB, and the notebook got into a seemingly endless loop of "Starting Kernel".
    image
  6. When I tried to re-run the "Use AutoML to simplify trainer selection and hyper-parameter optimization" code snippet, I got the following exception, repeating over and over:
    image
error: The JSON-RPC connection with the remote party was lost before the request could complete. 
    at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__154.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__143`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Notebook.Utils.DetectKernelStatusService.<ExecuteTaskAsync>d__3.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.Notebook.Utils.RepeatedTimeTaskService.<>c__DisplayClass7_0.<<ExecuteAsync>b__1>d.MoveNext()
  7. If you can run the notebook without issues, try re-running the "Use AutoML to simplify trainer selection and hyper-parameter optimization" code many times. The problem is inconsistent on my machine as well.

E2E-Classification with Iris Dataset.ipynb -> 'NotebookMonitor' could not be found

I tried running the "machine-learning/E2E-Classification with Iris Dataset.ipynb" file, and received the following error:

The type or namespace name 'NotebookMonitor' could not be found (are you missing a using directive or an assembly reference?)

Versions of things:

  • .NET 7 SDK: 7.0.401
  • .NET Version: 7.0.11
  • VS Code: 1.82.2
  • Polyglot Notebooks: v1.0.4461041 (pre-release)

Tried searching for that object on the internet and it's used around this repo, but for me, it doesn't work in any of the. I also tried changing the Polyglot extension to "Release" and that didn't do anything. I also tried upgrading Microsoft.ML and Microsoft.ML.AutoML to the most recent versions (2.0.1, 0.20.1, respectively) and that didn't work.

I was able to get around the error by just removing the monitor, but that's not fun!

E2E-Classification with Iris Dataset.ipynb
image

E2E-Forecasting using Autoregressive with Luna Dataset.ipynb
image

Add link to the github location of the notebook

The readme.md links that automatically open Visual Studio are a great experience if you have the notebook extension installed. However, if you don't have the notebook extension installed when you first come to the site (👋), then nothing happens when you click on a link and it seems like something is broken. I eventually saw the "Links below require Visual Studio 2022 and Notebook Editor Extension" note, but it was not immediately obvious.

Would it make sense to also add a link to the github URL of the notebook?

Something along these lines below might be more obvious. I don't know what to call things, the main point is two links:

# | Topic | Notebook link (requires VS and notebook extension) | Github link
1 | Intro to Machine Learning | 01 Notebook | 01 Notebook

Or..

# | Topic | Notebook link
1 | Intro to Machine Learning | 01 Notebook / Github version
