samm82 / drasil

This project forked from jacquescarette/drasil

Generate all the things (focusing on research software)

Home Page: https://jacquescarette.github.io/Drasil

License: BSD 2-Clause "Simplified" License

Shell 0.60% Python 0.15% Haskell 98.22% CSS 0.03% Nix 0.04% Makefile 0.96%

drasil's Introduction

About Me

Hello! My name is Samuel Crawford; I have a B.Eng. in Software Engineering from McMaster University, and I am back at McMaster for my M.A.Sc., working on Drasil. I am the recipient of a Schulich Leader Scholarship for community leadership and academic excellence, and I strive to put my skills to good use wherever I find myself.

Skills

My areas of expertise are object-oriented design, functional programming, and software testing
My main languages are Python, Haskell, LaTeX, Dart, and Java
I am comfortable with common tools and frameworks like Git, Flutter, pytest, and Make
I have experience with MATLAB, Bash, C, C++, and SQL

drasil's Issues

Am I on the right track with unit testing? + related questions

The manual tests I've developed so far can be found here, and I just wanted to show you what I have and make sure I'm on the right track. Some questions I have about my implementation:

I can't think of a way offhand to test other modules without also using the InputParameters module to populate inParams, since the InputParameters class only has one constructor, which reads from a file. Is this OK, since we test this module, or should I somehow make another function to populate inParams with specific values?
Should we test if inputs are stored even if a constraint is violated? (I'm pretty sure yes, but just checking.)
Should we be testing assumption violations? For example, if g = 0, we will have a ZeroDivisionError, but we also have an assumption that g = 9.8.

There are some other questions I have, but these are the main blockers that make me hesitant to make any more manual tests until I hear your two cents.
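
To make the assumption-violation question concrete, here is a minimal sketch of what such a test could look like. The function `flight_duration`, its signature, and its formula are hypothetical stand-ins for illustration, not the generated Projectile code; the tests are written as plain functions so that pytest would collect them, but they also run without it.

```python
import math


def flight_duration(v_launch: float, theta: float, g: float = 9.8) -> float:
    """Hypothetical stand-in: flight time of a projectile launched from the ground."""
    return 2 * v_launch * math.sin(theta) / g


def test_assumed_gravity_gives_finite_duration():
    # With the assumed g = 9.8 and a vertical launch, the duration is 2 * v / g.
    assert math.isclose(flight_duration(20.0, math.pi / 2), 40.0 / 9.8)


def test_zero_gravity_violates_assumption():
    # g = 0 violates the assumption; this documents that the failure is a
    # ZeroDivisionError rather than a silently wrong number.
    try:
        flight_duration(20.0, math.pi / 4, g=0.0)
    except ZeroDivisionError:
        return
    raise AssertionError("expected ZeroDivisionError when g = 0")
```

Whether a test like the second one belongs in the suite is exactly the open question: it pins down behaviour outside the stated assumptions, which may or may not be something we want to guarantee.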

Translate manual test cases into other languages, examples, and/or variabilities

So far, I am developing test cases for Projectile's Python implementation (specifically the one with a combined input module) since it is what I am familiar with. We will eventually want to have manual versions of other languages, examples, and variabilities to ensure we have a target for all test cases we will generate.

I would most appreciate an investigation into testing frameworks for C++, C#, and Swift. I am also familiar with JUnit (which I will likely be using), and my probable immediate next steps are to implement testing for the other Python variabilities and the Java version of Projectile, so work that wouldn't overlap with that (e.g., other examples) would also be really helpful. If anyone wants a more gradual introduction, let me know what you plan on working on to make sure we don't do the same work twice!

Workflow

Since this is a fork of the main Drasil repo and my work is being done on a branch, I think the best approach would be to branch off of the testGen branch (where I'm doing this work), then do your work and make a PR, making sure that it will merge your branch into testGen, NOT master.

Your tests should be runnable by make test in the directory of the stable code, not from drasil/code/ like you are used to doing.

Potential sources of confusion from similar yet unrelated symbols

In my IM for matrix representation (shown below), I use $\mathbf{E}$ to represent the matrix itself and $e_{ij}$ to represent elements of the matrix, but I use $e_i$ to represent arbitrary elements when iterating. Are these variables distinct enough to be valid in general, and within the context of Drasil?

From the work on #26, there is also a potential confusion between $r$ and $r_i$, which are of different types. Is this a problem? Should locally used variables even be present in this table?

4.3: Design “Dictated”?

This is a nitpick.

The design of the software artifacts isn't necessarily “dictated” by Drasil. Drasil does allow variation of the design of the generated software artifacts, and we aren't necessarily forced to use Drasil's off-the-shelf code generator either.

Test Case Coverage

I guess this is an issue with my own as well...

When writing out the test cases for each section, we should add a bit of information about the coverage. For example, 5.1.2 does not have test cases for each “feasible chemical reaction,” only a small pool – is there any reasoning to picking this particular pool you've chosen? Is it a sample that sufficiently represents the whole set of feasible chemical reactions?

Missing periods in notes in TM:intLinProg

Very minor nitpick!

Am I on the right track for adding an equation to TM2?

In e38d509, I added an equation for the Law of Conservation of Mass, as well as the relevant data types, and just wanted to check to see if I was on the right track. Is it OK to define functions as methods like this (e.g., $r.\text{count}(e)$)? It made sense to me initially (since we're looking for the $\text{count}$ of $e$ in $r$), but this looks like it could bring up unnecessary difficulty in specifying it and could be a design decision. I was going to specify its definition as a DD, but now I'm not sure how to do that, since it depends on what it's called on. Should I just rework this function to take $e$ and $r$/$p$ as traditional arguments?

3.2: Is there a “secondary” objective?

If there is nothing but “primary” objectives, then you can remove the “primary” bit.

Add a reaction type

Now that #24 is answered, a reaction datatype should be added to associate a set of reactants with a set of products for use in TM2 (and other places)

Best way for someone to see the best version of chemcode?

@samm82 I have a student in CAS 741 who is interested in solving the stoichiometry problem. They won't be using Drasil. They'll do it the "traditional" (manual) way. The Drasil version didn't completely solve the problem, and there is room for variability, so I'm going to approve the project.

I'd like to point them to your work, but I'm not 100% sure of where the best version is. They won't be interested in the Drasil code, but they will be interested in looking at the generated html. Is the version in stable in the chemcode branch the best version?

If you are interested, I'll keep you informed on how the student is doing. 😄

FR: Convert-to-Matrix – should the conversion of the equation to the matrix be defined in the SRS?

The current state of ChemCode

So I know the deadline was the 17th, but I wasn't quite able to get everything done. (~~I thought I was mostly there, but then realized that I had never actually finished TM2, which is a pretty non-trivial exclusion among some other more trivial omissions/shortcuts.~~ EDIT: Now there's really just minor tweaks to be made.) That being said, here are (hopefully all of) the relevant files:

I'm definitely going to finish this up, but from a pragmatic viewpoint, would you rather I focus on ChemCode or on marking capstone documentation?

Accuracy Testing: GCD of set of coefficients = 1?

It's a bit redundant, but what do you think about adding the above “test” to your accuracy testing in 5.2.1?
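
A minimal sketch of how such a check could be expressed, assuming the balanced coefficients are available as a list of positive integers (the function name is hypothetical):

```python
from functools import reduce
from math import gcd


def coefficients_are_reduced(coefficients):
    """True if the coefficients share no common factor greater than 1."""
    # Starting the fold at 0 works because gcd(0, n) == n.
    return reduce(gcd, coefficients, 0) == 1


# 2 H2 + O2 -> 2 H2O is in lowest terms; 4 H2 + 2 O2 -> 4 H2O is not.
assert coefficients_are_reduced([2, 1, 2])
assert not coefficients_are_reduced([4, 2, 4])
```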

Data type definitions are difficult to read – what are we trying to define with them?

Obviously, Drasil doesn't support defining data types for the runtime of the generated artifacts, but we can probably take this opportunity to learn a bit about what we want.

For example in:

You have 2 data type definitions: C and R. Notably, you refer to tuples but then add labels, making them records rather than tuples. Does the order matter if the names are there when defining an element?

Do you have any prototype of what the type signatures would be if you were to write the same thing in plain simple type theory?

Create Manual Test Plan for Projectile

Reading the code to judge test case coverage/test case quality isn't easy. In addition to the code, we should generate a verification plan, or maybe just a testing plan for now. Each test should have the following simple information:

test case name
inputs
initial state
expected outputs
source of expected outputs (this could simply be manual calculation, or it could be a pseudo oracle)
rationale

The symbols used for the inputs and outputs should be the SRS symbols, not the code symbols.
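
One possible way to record these fields, sketched as a small Python data class. The class name, field names, and the example entry (including its symbols and values) are assumptions made for illustration; they are not taken from the Projectile SRS or from any generated artifact.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ManualTestCase:
    """One entry of a manual test plan; the fields mirror the list above."""
    name: str
    inputs: Dict[str, float]            # keyed by SRS symbol, not code identifier
    initial_state: str
    expected_outputs: Dict[str, float]  # keyed by SRS symbol
    source_of_expected: str             # manual calculation, pseudo oracle, ...
    rationale: str

    def summary(self) -> str:
        """Short one-line description listing the symbols involved."""
        return (f"{self.name}: inputs {sorted(self.inputs)} "
                f"-> outputs {sorted(self.expected_outputs)}")


# Hypothetical entry; the numbers are illustrative only.
example = ManualTestCase(
    name="vertical launch",
    inputs={"v_launch": 20.0, "theta": 1.5707963267948966},
    initial_state="no inputs stored",
    expected_outputs={"t_flight": 40.0 / 9.8},
    source_of_expected="manual calculation",
    rationale="boundary value for the launch angle",
)
```

Keeping the entries as data rather than prose would also make it easier to generate the plan later, since the same record could be rendered as a table row or as a test function.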

Originally posted by @smiths in #42 (comment)

Drasil SRS "Ready" for Review (Dr. Smith)

My project is at a point where it is "ready" to be reviewed; there is still quite a lot to do, but this is a good stopping point for this part of the course. The code for generating this document using Drasil as-is can be found on the chemCode branch (relevant generated artifacts here), and the code for hacking Drasil to do what I needed is on the srsHacks branch (relevant artifacts here). Please let me know if you have any questions!

Are GS1 and R3 really necessary to have explicitly?

My original approach to this project was that the program would first check if the reaction was feasible, and if it was, then it would balance it. It was brought up in my presentation that this was unnecessary. As a potential side effect of this, does this mean that R3 (shown below) is also unnecessary? Knowledge of a reaction's feasibility is implicitly needed by R4-R6, but is it necessary to specify this? ChemCode itself won't explicitly determine whether or not a reaction is feasible, but it will implicitly know this from the output of IM2.

GS1 may be redundant for this same reason.

Missing Acronym definition: NIST

Inconsistency between SRS and VnV plan Documents

The SRS document states that all requirements, both functional and nonfunctional, will be validated, but there is no test designed to validate NFR4.
Table 2 shows the connection between tests and requirements, so if you do not have any test for NFR4, then maybe you should drop the column rather than leave it blank.

T1's "Derivation" is more of a description & related

I think you can move the description (derivation) to above the definition of T1, and then make “T1” into a set of tests (maybe with a different prefix), one for each element.

This is really just for granularity of the tests.

Also, why are you going to use static analysis for T1? It seems like it would be quite complex to prove with static code analysis.

Ambiguous or omitted information on VnV plan

Section 4.6: You mention "pytest"; I think its first letter should be capitalized: Pytest. It would also be good to provide a brief introduction to Pytest, for example, "Pytest is a Python library."
Table 1: You mention the Drasil team in the table and in the text, but I couldn't find any further information about the Drasil team members in your VnV plan. For example, how many people are on the Drasil team? How many of them have responsibilities and roles related to ChemCode?
Section 5.1.1: I think the input/output of the T1 test should be explicitly mentioned, as you did for T2.
All functional tests include valid equations. What will happen if someone inputs an invalid equation? What output will be generated? It would be good to think about new test cases where the input data has the wrong format and the software should respond properly with an error message.

Inlined citations should either be part of the sentence or parenthesized

In your introduction, you have 2 references that are inlined but a bit awkward to read. I feel like this isn't exactly your issue, but Drasil's. However, for now, I'll create the ticket here since it looks like you want inlined citations to be treated as full words in a sentence (i.e., treated as nouns), as you have other cases of that.

Test Case Derivations

The "Test Case Derivation" components of the test cases are a bit dubious. Some are more like descriptions rather than derivations (e.g., T11), and some are actual derivations, but the derivations are all textual. Is it possible to make some of the derivations "mathematical", explaining what makes them "valid"? Also, why does the "size" of the tests matter?

Finally, if the derivations aren't too important (like in mine, I just linked to WolframAlpha [lol]), you might be able to drop it altogether, or add a few examples of the justification to the appendix.

Minor issues on VnV plan

Section 2: The external link to Abbreviations and Acronyms did not work for me. Please check the hyperlink.
Section 4 should be a roadmap to the following subsections. You have written external data for Section 4.7, which is a software validation plan. Maybe I'm wrong, but I think it is a little ambiguous.
The caption of Table 1 is "Table of Teammates and Their Roles". I think only the first letter should be capitalized: "Table of teammates and their roles".
Section 4.4: I'm not sure someone outside the university would know about the rubric. Considering that you couldn't find a link for it, maybe it is better not to mention it?

Disambiguating the things labelled "count"

In a similar vein to #25, @smiths pointed out in his review of my generated SRS that it is confusing to have a function named "count" and a record field named "count". The distinction is that the function is defined for every element (even those that don't exist in the compound) while the record field will only have an associated entry for elements that exist in the compound. I am thinking of renaming the function to "numInComp" for "number in compound" to try and make this more clear. Does this make more sense? Do you guys have any better suggestions?
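
The distinction can be sketched in Python, assuming (for illustration only) that a compound is stored as a mapping from element symbols to their counts. The function name follows the proposed "numInComp"; everything else here is hypothetical.

```python
from typing import Mapping


def num_in_comp(element: str, compound: Mapping[str, int]) -> int:
    """Total function: defined for every element, 0 if it is absent."""
    return compound.get(element, 0)


water = {"H": 2, "O": 1}

# The function is defined even for elements that do not occur in the compound...
assert num_in_comp("H", water) == 2
assert num_in_comp("C", water) == 0

# ...whereas the record field only has entries for elements that do occur.
assert "C" not in water
```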

Drasil SRS "Ready" for Review (Jason)

Review ChemCode VnV Plan (Dr. Smith)

It can be found here.

“a subset of the Drasil team” might be ambiguous to non-Drasil readers

If you haven't already, you should mention that you intend for the ChemCode project to be one of Drasil's “official” case studies.

Since it should be obvious that any candidate project to be made “official” needs to go through a review process by the appropriate reviewers, you can safely drop the “subset of the” in “subset of the Drasil team” whenever you mention it.

Review ChemCode VnV Plan (Maryam)

It can be found here.

All files should end with a trailing newline character

A nitpick.

It's common convention to always have files end with a trailing newline character. The UNIX standard and those of various programming languages (such as Python) have adopted this convention too. I find it good simply for appending to files quickly from the terminal, or for quickly jumping to the bottom of the file to write in text editors.
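
If it helps, a check like this could be automated. The following is only a sketch with hypothetical helper names; treating empty files as acceptable is an assumption.

```python
from pathlib import Path


def ends_with_newline(data: bytes) -> bool:
    """True if the content is empty or its last byte is a line feed."""
    return data == b"" or data.endswith(b"\n")


def files_missing_newline(paths):
    """Return the paths whose content does not end with a newline."""
    return [p for p in paths if not ends_with_newline(Path(p).read_bytes())]
```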

Definition of "compound" should likely be updated

Technically, a compound includes more than one type of element, so H₂ would not be a compound (but it would be a molecule). This issue permeates all documentation.

DD for function?

How would you specify a function as a DD? Should this be another type?

SUS calculation: why is it calculated the way it is?

You provided a means of calculating the SUS score, but I'm not quite sure why it's expected to be done the way that it is.

Find replacement for test case with modified chemical reaction

One of the chemical formulas used in a test case does not obey IUPAC conventions, which is now assumed to be true of all inputs in 5dd85fb. This test case should be replaced by an equivalent one that follows these conventions.

Missing introductory blurb between Sec. 5 and 5.1

It doesn't need to be anything complicated though, I think.

Drasil SRS "Ready" for Review (Karen)

Thank you for your patience!

5.2: While “Maintainability” isn't necessarily “testable,” it is something that is encouraged heavily by usage of Drasil

I wrote this (below) because “maintainability” is a quality of the SRS abstraction. Worst case, as long as you can write a transformer from the SRS abstraction to another language or codebase that is more “maintainable”, it obtains “maintainability” through transitivity.

How to deal with tuples in Drasil? Do I even need to?

In my TM for the Law of Conservation of Mass, I make an assumption that $R$ and $P$ belong to the same reaction. I want to make this explicit by introducing a reaction type that is a tuple of sequences (one for each side of the equation), but am not sure what a good way to "access" each element of the tuple would be. Use $\text{fst}$ and $\text{snd}$? Or since the TMs don't deal with code generation, is this sufficient to get the point across?

Review ChemCode VnV Plan (Jason)

It can be found here.

How to specify variable string output?

So far my output IM looks like the image provided. Although I'm sure there are other issues with it, the main one seems to be the specification of the first two outputs.

The first output isn't a string, just a description of the string that should be outputted. My thought is that the output of the balanced equation should match the format of the input, but since the input format is a design decision that shouldn't be specified, this means that the output format is also a design decision. Am I misunderstanding something? Is this sufficient?
I have a general template of what I want the output to look like (as @smiths pointed out in his review). For any elements that are present on one side but not the other, this information should be communicated to the user; for example, "This reaction is infeasible because O and Cl appear on the reactant side but not the product side and Mg appears on the product side but not the reactant side, violating the Law of Conservation of Mass." This would require another data definition to get the elements on each side that violate this law, which wouldn't be hard to do, but before I do, is it something that I should even bother with? If so, how would I specify it formally?
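
As a rough sketch of the extra data definition mentioned above, the elements that violate the law on each side are two set differences. This assumes each side of the reaction is a list of compounds and each compound maps element symbols to counts; all names are hypothetical, and the message wording simply follows the example sentence.

```python
def unbalanced_elements(reactants, products):
    """Return (reactant-only, product-only) element symbols, each sorted."""
    reactant_elems = set().union(*reactants)  # iterating a dict yields its keys
    product_elems = set().union(*products)
    return (sorted(reactant_elems - product_elems),
            sorted(product_elems - reactant_elems))


def _clause(elements, present, absent):
    verb = "appears" if len(elements) == 1 else "appear"
    return (f"{' and '.join(elements)} {verb} on the {present} side "
            f"but not the {absent} side")


def infeasibility_message(reactants, products):
    """Build the explanatory message, or return None if no element is one-sided."""
    only_r, only_p = unbalanced_elements(reactants, products)
    clauses = []
    if only_r:
        clauses.append(_clause(only_r, "reactant", "product"))
    if only_p:
        clauses.append(_clause(only_p, "product", "reactant"))
    if not clauses:
        return None
    return ("This reaction is infeasible because " + " and ".join(clauses)
            + ", violating the Law of Conservation of Mass.")
```

Note that this only detects elements that are missing from one side entirely; it says nothing about reactions that are infeasible for other reasons.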

"Governing Equations" for Chemical Reactions

My brother (the Chemistry professor) got back to me on the topic of a fundamental way to express balanced chemical reactions. He pointed us toward Chapter 3 of Geochemical and Biogeochemical Reaction Modeling. I looked at it quickly. I don't think it helps us at the moment. It brings in some concepts that we don't seem to need, and it seems to be aimed at aqueous reactions.

I don't think we should change what we are doing, but in the future we might want to work with a domain expert to find a more "fundamental" way of expressing our chemistry knowledge.

@samm82 once you've had a look at it, you can close this issue.

Issues with 4.6 Automated Testing and Verification Tools

TODO

HLint tests the Haskell code, but readers might be confused as you talk about the Python code right before mentioning HLint
pytest and HLint should be cited or have a URL added
the listed features of Drasil's CI should use commas (right now you have "A and B and C")
you should be more explicit in what capacity the Makefiles will be used

Change the type definition of compound to a vector

For now, it seems like we are focusing on stoichiometric compounds (e.g., those with natural-number subscripts), but we agreed on representing nonstoichiometric compounds using rational numbers. The element type should be represented as a linearly ordered enumeration, so that compounds can be represented as $\mathbb{N}^{|E|}$ (including in the SRS!). Therefore, the type definition of a compound should be updated as part of this change.
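
A small sketch of the vector representation, assuming a linearly ordered element enumeration. Only three elements are listed here for illustration; the real enumeration would cover every element, and the names are hypothetical.

```python
from enum import IntEnum


class Element(IntEnum):
    """Linearly ordered enumeration of elements (truncated for illustration)."""
    H = 0
    C = 1
    O = 2


def compound_vector(counts):
    """Represent a compound as a vector with one natural-number entry per element."""
    vec = [0] * len(Element)
    for element, n in counts.items():
        vec[element] = n  # IntEnum members can be used directly as indices
    return tuple(vec)


water = compound_vector({Element.H: 2, Element.O: 1})
methane = compound_vector({Element.C: 1, Element.H: 4})
```

With this representation, the count of an element in a compound is just an index into the vector, and absent elements are naturally 0.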

Originally posted by @samm82 in #40 (comment)

How to improve the old A3 and A4?

In your review, you mention that it seemed like A4 was covered by A3 (note that they are now numbered A5 and A4, respectively, but I refer to them by their old numbers to match the screenshot). My intention for A3 was that the inputted formulas would follow some standardized set of conventions, so that (NH₄)₂CO₃ would be allowed. A4 restricts this even more to aid in converting the user's input to be processed, so that I didn't get stuck writing regular expressions to process every possible format of chemical formula; this means that (NH₄)₂CO₃ would not be allowed, but N₂H₈CO₃ would be.
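
To illustrate why the stricter assumption keeps parsing simple, here is a sketch of a parser for such "flat" formulas. It assumes subscripts are typed as plain ASCII digits (e.g., `N2H8CO3`), which is an input-format choice made only for this example; the function name is hypothetical.

```python
import re

# A flat formula is one or more element symbols, each with an optional count.
_FLAT_FORMULA = re.compile(r"(?:[A-Z][a-z]?[0-9]*)+")
_TOKEN = re.compile(r"([A-Z][a-z]?)([0-9]*)")


def parse_flat_formula(formula):
    """Map each element symbol to its total count; reject anything not flat."""
    if not _FLAT_FORMULA.fullmatch(formula):
        raise ValueError(f"not a flat chemical formula: {formula!r}")
    counts = {}
    for symbol, digits in _TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts
```

Under these assumptions, `parse_flat_formula("N2H8CO3")` yields counts for N, H, C, and O, while `"(NH4)2CO3"` is rejected because of the parentheses. Supporting the looser assumption would require handling nested groups, which a single regular expression cannot do in general.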

My Questions

Should I make an explicit decision as to what set of conventions should be followed?
How should I better articulate the difference between these two assumptions?
You also mention that I should make an assumption excluding hydrates, polymers, isotopes, etc., but A4 as written already excludes these. Should I describe the actual symbols that would be needed to represent these (e.g., $\cdot$, parentheses, hyphens, and charge superscripts)?

T6: Does the expected output need to be precisely that text?

Or can it be something “along the lines of X, specifically mentioning Y”?
