blenderskool / vyaakaran Goto Github PK

View Code? Open in Web Editor NEW

42.0 1.0 9.0 1.43 MB

📜 Visualize formal languages and automata

Home Page: https://vyaakaran.now.sh

License: MIT License

TypeScript 48.56% HTML 1.07% Vue 35.01% Shell 0.08% JavaScript 1.02% Svelte 14.25%

automata context-free-grammar editor ide regular-grammar turing-machine visualizer compiler

vyaakaran's People

Contributors

Stargazers

Watchers

Forkers

anirudh24seven tarun-mm bivas-biswas vr-varad riya461 nishaaannnt

vyaakaran's Issues

Vyaakaran CLI

A CLI tool over the existing Vyaakaran compiler which exposes similar functionality to the Vyaakaran editor.
The CLI tool would take source grammar file as input and generate an output based on the type of grammar:

Regular Grammar: Regular Expression, ε-NFA and NFA graphs.
Context Free Grammar: LL(1), LR(0), SLR(1), LR(1), LALR(1) parse tables, parsing automaton, first & follow sets.
Turing Machine: State Transition Diagram.
Console: test and strings command from the console that would work for all the above grammar types.

For someone interested to work on this:

Go through the compiler and editor code to familiarize yourself with various functions exposed by compiler to process grammars.
Create a new cli directory at the root of the project and add it to the pnpm-workspace.yaml file.
Initialise a new Node.js project with TypeScript. You can add @vyaakaran/compiler as a dependency in the cli project to use all the functions exposed by the compiler.
You are free to develop the compiler however you want with whatever libraries you want. Take inspiration from other popular CLI tools as a reference.
Usability and the structure of the commands exposed by the CLI is important.
CLI should be extensible in the future for different grammars, different outputs and any new options.
Help descriptions for every command in the CLI must be added.
Well formatted outputs with appropriate colors for error messages would be appreciated.

TDD approach to writing grammars

Rough idea that occurred to me while using Vyaakaran

The way I write grammars usually has following iterations:

Write a rough grammar.
Check sample strings it generates / Check some strings I expect it to accept.
Go back to step 1 and extend the grammar with more details.

This approach makes writing complex grammars simpler as you don't have to think about fine details at the start. But it also has a downside of causing unexpected changes in the intended language the grammar represents as you extend it in every iteration.

With a TDD approach, I would be able to define some sample tests of strings that I'd always want my language to accept / reject.
These tests would always run when I compile the grammar and let me know if any of my tests are failing. In case a test fails, it indicates my grammar has deviated somewhere and is no longer representing the language I intend it to.

For someone interested to work on this:

Take references from:
- regexr.com: There is a "Tests" tab where test strings can be defined.
- regex101.com: There is a section called "Unit Tests" in the left sidebar.
  Both of these tools allow the user to define a list of test strings (that gets matched or does not get matched) and as the user modifies the regular expression, these strings are tested against that expression.

Vyaakaran already has the pieces in the compiler to test whether a string gets accepted, rejected, or accepted with ambiguity by the grammar. Thus, no new functionality needs to be added in the compiler itself. Check the existing console code to see how the test command works:

vyaakaran/editor/src/config/console.ts

Lines 125 to 138 in c541405

    
           const parseTreeCount = new EarleyParser( 
        
             playground.compiled?.parseTree as ParseTree 
        
           ).isParsable(str); 
        
           if (parseTreeCount > 1) { 
        
             pushToStream( 
        
               playground, 
        
               'Warning', 
        
               `"${str}" was accepted with ambiguity` 
        
             ); 
        
           } else if (parseTreeCount === 1) { 
        
             pushToStream(playground, 'Success', `"${str}" was accepted`); 
        
           } else { 
        
             pushToStream(playground, 'Warning', `"${str}" was rejected`); 
        
           }

A new section in the UI must be added for Regular Grammars and Context Free Grammars where the user can define a list of test strings (acceptable / unacceptable) and whenever the user compiles their grammar, all the test strings are tested against the grammar and the result is visually displayed for each test with appropriate colors.

Grammar generation from example strings

Defining example strings that are a part of your language is easier than writing the grammar directly.

There can be a feature in Vyaakaran that allows the user to enter strings that are a part of the language and Vyaakaran generates the grammar code directly from that.
Not sure about feasibility and complexity of this, but it should possibly be explored in following order:

Regular grammar (maybe something to do with prefix grouping might work?)
CFG
TM

Inconsistencies with validator and generator in console

The string validator's input and example string generator's output are inconsistent in some edge cases. Ideally, an output of the generator should also be a valid input to the validator.

Empty strings
The validator does not accept "", whereas the generator outputs "" for empty strings.

This issue can be updated with further edge cases that have inconsistencies.

Enhancing Regular Language Parsing: NFA to DFA, State Minimization, and Optimization Algorithms

Description:

We propose the following enhancements to improve the efficiency of regular language processing:

NFA to DFA Conversion: Implement the subset construction algorithm to convert Non-deterministic Finite Automata (NFA) to Deterministic Finite Automata (DFA).
DFA State Minimization: Add an algorithm to minimize DFA states by merging equivalent states, producing a more efficient automaton.
Optimization Algorithms:
- Forward Reachability: Determine the set of states reachable from the initial state and remove unreachable states.
- Co-reachability: Identify states that can reach a final state and eliminate non-contributing states.

Add parse table explorers

Following Parse table generators can be added in the CFG playground:

Add syntax cheat sheet UI

Syntax cheat sheet is currently available as comments in the editor. A better UI should be created for it to keep the editor clean and the cheat sheet always accessible.

Adding GitHub Workflow for running tests

Set up GitHub Actions for CI (tests on PRs) and CD (deploy on pushes).

Improve tokenization

Tokenization currently doesn't use look-forward to create tokens. It groups tokens based on their individual classification.
This causes - symbol to be treated as a Separator even though it could be Literal (as there is no special meaning with just - symbol in the syntax apart from its usage in -> symbol).

Generate parse tree

The parser should be updated to return a parse tree rather than array of productions.

Explore opportunities for non-UI-blocking processing

All the core Vyaakaran processing happens on the main thread which may lead to blocked UI in some heavy and long-running algorithms. Explore ways of moving these algorithm calls to maybe Web Workers which can be physically terminated by the user if they don't prefer waiting.

Unit testing setup

Unit testing setup with possibly Vitest for the compiler and editor module that will allow us to write unit tests for different parts of the codebase.

Why Vitest

Works out-of-the-box with TypeScript.
Has similar syntax to Jest.
Editor already uses Vite, so using Vitest makes sense.
Has UI and extensions with IDEs to interactively run tests.

playgrounds created, their type, name.
code written in the playgrounds

Incorrect unacceptable strings generated

Unacceptable strings generated for the following grammar are incorrect.

Test grammar

S -> # | a S.

Expected behavior

No unacceptable strings are generated as the above grammar matches all strings with symbol a.

	const parseTreeCount = new EarleyParser(
	playground.compiled?.parseTree as ParseTree
	).isParsable(str);
	if (parseTreeCount > 1) {
	pushToStream(
	playground,
	'Warning',
	`"${str}" was accepted with ambiguity`
	);
	} else if (parseTreeCount === 1) {
	pushToStream(playground, 'Success', `"${str}" was accepted`);
	} else {
	pushToStream(playground, 'Warning', `"${str}" was rejected`);
	}