hackworthltd / primer Goto Github PK

View Code? Open in Web Editor NEW

11.0 3.0 1.0 6.17 MB

A pedagogical functional programming language.

License: GNU Affero General Public License v3.0

Makefile 0.13% Nix 2.91% Haskell 95.93% Dhall 0.04% PLpgSQL 0.35% Shell 0.64%

primer functional-programming education programming programming-language

primer's People

Contributors

Stargazers

Watchers

Forkers

jwhessjr

primer's Issues

Revisit integration around type signature actions

I am talking here about updating the kind display when you do an action in a type signature for a definition; but potentially the same issue occurs at other times.

The issue is that we have three (!) places that we redundantly transmit the same data to the frontend, and these
a. easily get out of sync
b. are each used to update the display at different points

The 3 places the kind info is stored: all coming out of applyActionsToTypeSig:
- the checkedDefs:Map Id Def, which get put into (App.hs) progDefs
- the def':Def, going into progSelection's def
- the zt:TypeZ, going into progSelection's DefSelection(SigNode) - basically the metadata

The different times they are displayed in the sidebar (I think):
1. immediately after a type action, we read it out of the DefSelection
metadata (i.e. zt)
2. when selecting something new in the same Def in the frontend (no backend
involvement), we grab it out of the Def's type's meta (def')
3. after a round trip to the backend, we rebuild it from the program
(checkedDefs) -- I suspect this rebuilding is probably done in the
backend.

Do we really need this duplication? (They have got out of sync before -- see [@brprice: todo; the following commit, but it may get rebased a bit... [brprice/poly-explicit d2405218f] Improve update of the kind field of the sidebar])

Module system

We should at least be able to import a prelude.

For Vonnegut, something that does just the bare minimum, as we did for Alpha, will probably suffice. However, long-term, I'd like to give modules some serious thought, and see if we can learn anything from ML and Haskell (especially backpack).

Related: hackworthltd/vonnegut#115

Surprising behavior regarding function application and annotations

(Surprising to me, anyway).

Here's a screenshot of a partial program I built in Vonnegut:

I was surprised that I was able to build this. We had a brief chat about this in Keybase, but I'm too lazy/busy to transcribe it, so I'll post a screenshot of that as well:

The key points from the chat are:

Yes, this is valid in Hazelnut's type system.
? in the annotation represents the Hole type, not a term-level hole as we typically encounter ? in our syntax.
We need to think about whether we should disallow this, and if so, how.

Ideas from chat on how to disallow it:

(@brprice) Potentially we should forbid annotations in checkable position and also forbid annotating a synth term?
(@hmac) I don't know if we can change it without breaking other bits of the type system. Possibly we should treat annotations with type ? specially, and just sort of ignore them? Then Zero : ? would synthesise the type Nat and you'd get an error like you expect.

Also in chat, @brprice linked to some related notes in Notion: https://www.notion.so/hackworthltd/Automatic-hole-annotation-insertion-removal-b6e28cf5aea64bc6b6adb845d0a134ca

Names should be non-empty strings

If we can nicely get serialisation to work, it would be nice to represent names as non-empty strings. Mostly for semantic reasons, partly because we do in DefineType.purs: arguments of constructors can be references to extant types by name, but these are represented as non-empty strings. It is a bit awkward to hook this up to the TypeDefs in hackworthltd/vonnegut#357 as TypeDef's names are possibly empty. (This may go away if/when we refer to types by id)

When automatically adding an annotation for a lambda, it should have the form ? -> ?

Currently (as of hackworthltd/vonnegut#175) if you insert a lambda in a synthesisable context we will annotate it with a hole:

λx → ? : ?

To refine this annotation you have to first insert an arrow, producing:

λx → ? : ? → ?

If you try to insert Bool or some other type instead, you'll get a (cryptic) error. We should just insert the arrow automatically, since it will always be required and it helps guide the user towards the correct type.

For reference, Hazelnut does this too.

Remove more holes automatically

We don't (even after hackworthltd/vonnegut#175) remove holes such as
{? Succ ?} Zero
because we don't have enough information to see if the hole is "finishable" at the point we handle the hole. In general, if a hole is in a synthesis position, we don't know where else downstream it is being referred to, so we dare not change its type.

I don't know how to do better here!

I also don't know how common this situation is in practice, so have no idea if this should be high priority or not.

Should we filter the list of in-scope variables by type?

Once hackworthltd/vonnegut#205 is in, we'll have the ability to show the user a list of in-scope variables. The current implementation shows a list of all in-scope variables, regardless of whether their type fits the selected hole.

We should probably filter the list so that only the in-scope variables that can fill the hole are displayed.

Eventually, we might want to add some help that explains why other in-scope variables cannot be used at the site; e.g., something like, "Variable x is in scope, but it has type Nat, and this hole can only be filled by a variable of type Bool. However, let's deal with that later.

Hydra and coverage tests for hedgehog

See https://github.com/hackworthltd/vonnegut/pull/129#discussion_r533510866

We have commented out coverage tests in some of our property tests (where we check that f : a -> b and g : b -> Maybe a are more-or-less inverses: we check that we generate some bs that hit Nothing and some that hit Just) since we don't want flaky hydra tests. This is particularly problematic since hydra doesn't easily let us know that it was only a coverage failure, so if it happens often we may get used to failing tests, and miss something more important.

Since hydra builds the tests 4 (I think?) times, note that even a 90% pass rate per test comes out as a 66% rate over all 4.

Automatically generated name indicies should start from the smallest available

I noticed that if I create a let binding, then delete it, then create a new one, the new one (i.e., the second one I created) is named a1. This continues ad infinitum. It would be nice if no-longer-used indices could be recycled, though I suspect this has something to do with the undo system.

Usable multi-student sessions

Once we have hackworthltd/vonnegut#307, I believe we’ll be able to have multiple agents connect to the same session hosted on the same server.

In the long term, this is definitely a feature we’ll want to support, as it enables scenarios such as peer learning, peer programming, and instructor observation/assistance. However, there are many deep UX, privacy, and technical issues to consider in order to make these scenarios robust, and this is, after all, still just a prototype at this stage.

That being said, I think there are a few things we could do on top of the work in hackworthltd/vonnegut#307 to better illustrate the scenarios we have in mind without needing to fully address all of the implementation issues mentioned above.

What are the key things we would need to add on top of hackworthltd/vonnegut#307 to make an effective demonstration of multi-student sessions? The first thing that comes to mind for me is some sort of client polling or Websockets functionality that would display a remote student's edits in your browser without requiring a page reload.

Should we allow actions on variable use sites?

This is a follow-up to hackworthltd/vonnegut#106, specifically regarding this comment:

https://github.com/hackworthltd/vonnegut/issues/106#issuecomment-737124680

Add a configuration system

This would permit default settings, user overrides, etc.

Incremental typechecking

See https://github.com/hackworthltd/vonnegut/pull/82#issue-514664673

Improved inference

(Note: I'm not sure the title is quite right. Feel free to change it.)

Consider the following partial program:

That all looks good. Now let's change it a bit:

Here I've changed the input type of the lambda from Bool to ?, and also changed the argument in the application from ? to False. In the second program, Vonnegut knows that the hole type ? can only be filled with Bool -- it displays an error if I try to fill it with Nat -- but it doesn't tell me that it knows this when I select that hole. It shows "Kind: Type", but what I want it to show is something along the lines of "Expected type: Bool, Actual type: ?", just as it did in the first program when the holes were swapped.

More better names

Now that hackworthltd/vonnegut#405 is merged, I think we should push on. It's somehow more tragic when you see a function named m or n than when it's named a3 or a237.

I'd like to break "better names" into 2 different but related aspects.

Conventional names based on type

It would be nice to choose from conventional names based on their type, where convention is determined by functional programming or math(s). For example:

Natural numbers are n, m, ...
Boolean values are a, b, ...
Functions are f, g, ...
Names of unknown type might be x, y, ... (or perhaps x?, y?, etc.)
Lists are ns, as, fs, xs, ...

I'm open to other conventions. Some of these may actually be anti-patterns (lists, for example?).

Scoping and variable capture

Among the groups I'd like to demo to are universities (e.g., MSP Strathclyde) and Microsoft Research Cambridge. If we're "cheating" by avoiding scoped variable names and capture issues, I think they'll notice, so we should probably try to do better. This will also help ease the pressure on generating good names (more as, fewer a12s, for example) and it will force students to reason about variable capture and shadowing, which are very real problems.[*]

We need to deal with this at some point, anyway, because at the moment, you can rename a variable to something that matches a local binder and then the UX is really poor. For example, in this program, there are two ns in scope and no indication of which n is which:

On the other hand, good UX for dealing with shadowing and variable capture is potentially a deep rabbit hole, and we need to build something that's realizable in the Demo project timeframe. One relatively simple mechanism to deal with this particular example would be to highlight shadowed variables in red and annotate them with a label that reads "Shadowed". For free variable capture scenarios, see our previous thinking in https://github.com/hackworthltd/vonnegut/issues/136.

[*] It might actually be a good idea to keep our current implementation with globally unique names as an "easy mode," so that we can avoid these particular naming issues until students are ready to deal with them.

Add integration test for Vonnegut.Types

Need a way to ensure changes to types in the API are reflected in Vonnegut.Types.

Enabling and disabling type system features ("levels of power")

We should think about ways to limit what features of the type system are exposed to the user. For example, it would be pedagogically useful in early lessons to hide actions that construct polymorphic types.

Assuming we think this is a good idea, we should come up with some sensible "presets."

One approach would be to stick to the dimensions of the lambda cube, but that's certainly too rigid. For example, given our principle of types-first programming, we might not want to go all the way down to pure STLC. (On the other hand, that might be an interesting exploration once students have a really strong grasp of types, to let them experience what they'd be missing!) @brprice mentioned another interesting example in chat when he suggested that we could downgrade from full System F but still allow type applications for things like List Bool.

Racket has a more powerful but related concept in its #lang feature. I don't fully understand all the technical details of the feature, nor its limitations, but some examples of Racket's #lang languages are here: https://en.wikipedia.org/wiki/Racket_features#Language_extensions. Note especially that Typed Racket is implemented (or at least exposed to the user) via the #lang extension.

To be clear, I'm not suggesting that we do something that changes syntax, only that we think about useful pre-defined collections of type system features that can be enabled or disabled according to the instructor's goals and students' experience.

How do users edit existing type definitions?

The tricky issues here are around the ux: changing a typedef can have long-range impacts on the validity of the program, causing a bunch of holes to appear, and even some code to be deleted (case branches not matching/existing). We need to work out what we want to happen!

There is some discussion of this in https://github.com/hackworthltd/vonnegut/discussions/350

Datalog for type checking/actions?

@brprice pointed to this blog post:

https://petevilter.me/post/datalog-typechecking/

Brief notes from our Keybase chat on this:

I asked whether the approach would be to use Datalog to model/reimplement the Haskell implementation of our current typechecker?
@brprice mentioned that good error messages might be harder in Datalog, but that we should look at https://github.com/frankmcsherry/explanation
@hmac suggested this would make queries like, "What variables are in scope?" easier.

Weird selection after undo

It seems that after an undo we are left with the "previous" selection, which is potentially nowhere near where the undo was.

Example A:
New session, select main, insert lambda, undo. You are now looking at "select a definition to get started"

Example B:
New session, select main, insert app (this also selects child2), move to other child(1), insert lambda, undo. You are now in child2.

Automatically create a fresh session if you have an invalid session ID

It's quite common during development to refresh the browser page after restarting the server and get this error:

Error: server returned bad initial state
Error

(NonEmptyList (NonEmpty "JSON.parse: unexpected character at line 1 column 1 of the JSON data" Nil)):
  Not a valid session

It would be nice if the frontend handled this automatically, recognising that its session ID isn't recognised and creating a fresh session.

This is mostly straightforward but there's a couple of bits of groundwork we need:

The backend needs to consistently return JSON-formatted errors so the frontend can handle them. We should have an APIError type with (at least) constructors for InvalidSession and ProgError (indicating a logical error like a type error).
On startup, the frontend should make an API request with its session and explicitly check for an InvalidSession error. If it gets one, it should go through the process of creating a fresh session.

"Add an argument" action for function application

Whatever we do for hackworthltd/vonnegut#308, we should consider mirror for function application. The problem isn't as dire for application, as we already have a working implementation (modulo cursor movement) where the user can highlight the root App node and press the $ action button as many times as they need arguments, but the UX isn't particularly clear. By adding a + button inline and an explicit "Add an argument" action button, as is currently planned in hackworthltd/vonnegut#308, we could probably improve things.

Let's unify our terminology

I think it would be helpful to unify our terminology and agree on some jargon, especially for UI elements and/or actions. For example, I frequently catch myself using "view" for different concepts, when in our system, "view" has a very specific meaning.

I've also been throwing around terms like "canvas" etc. without properly defining what they mean.

To reduce confusion, especially as we start to think about demo'ing Vonnegut to the outside world, we should come up with a lexicon that we can all agree on and stick to.

Model untrusted user input better?

The ConstructVar action current takes a Text value containing the name of the variable that the user wants to construct. This could be any text value, which may or may not be what we want. We may want to restrict it to alphanumerics or whatever. Additionally, a Text value carries no semantic information.

We might be better having ConstructVar take an UntrustedName or UnscopedName, which is a text newtype. We'd then explicitly convert this to a Name via scope and type checking, when applying the action.

What is the kind of a hole type?

Our current set of kinds are Type and Type → Type. In types and expressions we have a special value ? which represents an unknown, but we don't have an equivalent in kinds.

So what is the kind of the type ?? If it is Type then we can't use holes in places where higher kinded types are expected, for example in

let f = λxs → map inc xs : ? Nat
 in ...

where ? might be written in place of List when the user hasn't worked out what it should be yet.

Conversely, if its kind were Type → Type or Type → Type → Type then we'd rule out using it in place of basic types.]

We discussed this briefly in chat:

Harry: What kind does type `?` have? Do we need a kind `?` ?

Ben: Interesting! I think we can choose what to support here.
  In general a unknown kind sounds sane, but I don't know if it would be useful.
  We should think about the knock on effects of this choice - it's not clear to me.
  One option I think would be to disallow explicitly writing unknown types -
  i.e. "types first" means "fully known ...", then I hope we can know the kind of any type hole.
  Not sure on any of this though!

Harry: That option seems to imply that we'd take a bidirectional approach to kind checking.
  So e.g. a type constructor can synthesise a kind but a type hole can only be checked against a kind.

Ben: Perhaps. I was hoping that when we generate a type hole, we could also generate its kind.
  i.e. have one sort of type hole per kind.
  Further hoping that all type holes would have kind TYPE. But really haven't thought that bit through!

Harry: I doubt the latter holds if we let the user put holes in arbitrary places in types.
  It may hold if we ban that, though.
  I wonder what would be the UX implications of only allowing type holes of kind TYPE...

Do we need a letrec with mutual recursion?

Scheme's letrec supports this, with some evaluation limitations:

https://legacy.cs.indiana.edu/~dyb/pubs/fixing-letrec.pdf

Here are some examples of programs that are easier to write if you have it:

https://lists.racket-lang.org/users/archive/2004-May/005497.html
https://www.cs.toronto.edu/~ajuma/326f08/13Scheme6.pdf (see slide 28)

Also, I believe this could be used as the basis of a module system.

What can we do to help the user avoid free variable capture (and should we)?

With Vonnegut, we have the opportunity to help the user avoid free variable capture. As an example of one possible implementation, we could have Vonnegut choose initial default variable names which are guaranteed to be unique (either globally, or in scope) when the user creates new binders. We would then obviously enforce this invariant if the user tries to rename them and, if they choose a name that would capture a free variable, disallow it, with an explanation why (and perhaps give a suggestion on how to fix the issue).

A mechanism like this would also make it possible to support safe, automatic α-conversion, for refactoring operations.

At least one drawback to this approach is that it might make variable names less meaningful. This could possibly be addressed by letting the user provide a "base name" for a newly-introduced variable, which is then indexed or otherwise modified to avoid capture.

I would also want to make sure that names are stable in the presence of renaming actions which don't directly affect them. For example, if you have variables x1 and x2 and you rename x1 to y1, the implementation should not automatically α-convert x2 to x1. This can be expressed as the rule that once a name is chosen, only an explicit action by the user should be able to change it.

Another dimension to consider is whether this functionality, if implemented, could be disabled. This would allow us to run controlled tests where we explore whether experiencing unintentional free variable capture (and subsequently debugging it and fixing it manually) is pedagogically useful. On the other hand, given how notoriously difficult it is to get this stuff right in PL implementations, we probably don't want to support too many variants, for our own sanity.

Note that this issue is related to, but not the same as, hackworthltd/vonnegut#49 and hackworthltd/vonnegut#106.

Wrap up database STM bits in a nice abstraction

See @brprice's comments:

https://github.com/hackworthltd/vonnegut/pull/432#discussion_r615730038
https://github.com/hackworthltd/vonnegut/pull/432#discussion_r615720630

It might be worth trying to wrap up the STM data structures for the in-memory database and the database op queue in a nice interface. (On the other hand, if we only ever end up using this code in one place, maybe it's not worth it.)

Student-defined types

Need a new language for this.

Tracking issue: spurious cabal "extraneous version range" warning

See haskell/cabal#5119

Tracking this here as I keep seeing this and being annoyed by it. At least I now know why!

When configuring the backend testsuite (but not the executable), I see

Warning: The package has an extraneous version range for a dependency on an
internal library: vonnegut -any && ==1.0.0.0, vonnegut -any && ==1.0.0.0. This
version range includes the current package but isn't needed as the current
package's library will always be used.

but this is a lie: we have not got any version range on the vonnegut dep of vonnegut-test.

[Primer] Program synthesis

Edwin Brady has some sort of Idris program synthesizer. I'm not sure it's public yet, but here's the output from a session where he asked it to synthesize append on lists:

https://gist.github.com/edwinb/9b5dfda2ddf402133ab5d2ce47e85f3e

There's also Barliman:

https://www.youtube.com/watch?v=er_lLvkklsk

Ed Kmett was working on guanxi and coda, but those seem to be stalled:

https://github.com/ekmett/guanxi
https://github.com/ekmett/coda

Add a replay system

I'll fill in details later when we get closer to implementing this. There are some notes here:

https://www.notion.so/hackworthltd/2021-02-25-Dev-meeting-e58aceb035b742f49424fe43f487749f#a854b3aa523d4ef884741c6f5f1ea7b6

For now, this is low priority.

Versioning

Until/unless we start versioning our log format, the emitted log should include the application's git rev so that we know which version generated it.

DRY up action handlers in `Vonnegut.Action`

There may be some common functionality we can abstract out, such as erroring when the action is being applied to the wrong sort of term. See hackworthltd/vonnegut#50 for a bit more context.

Tuple syntax for function arguments

OK, per hackworthltd/vonnegut#265 hackworthltd/vonnegut#219 and hackworthltd/vonnegut#264, it seems like it's time to explore tuple syntax for function arguments, as touched on here and, more generally, in hackworthltd/vonnegut#219.

I suggest that this complement, not replace, -> syntax. That will allow us to do some user testing down the line.

Should we prevent annotations of type 'hole'?

Annotations of the form e : ? are basically equivalent in behaviour to a non-empty hole {? e ?}. They often seem to appear as tricky special cases and in some examples of bugs in our system. Would it simplify things if we banned annotations of this form?

Right now, when you add an annotation to an expression, the type is defaulted to ?. This change would mean that we instead default the type to be whatever the type of e actually is. More precisely, if e synthesises T then we would produce e : T, otherwise if e is successfully checked against S during typechecking, we would produce e : S. Since a successful typechecking pass must synthesise or check every expression, this approach will always find a type.

This change doesn't completely prevent holes appearing in annotations. For example, if we add an annotation to a hole we will get ? : ?, since holes synthesise the hole type. The same applies to non-empty holes: adding an annotation to {? Zero ?} will yield {? Zero ?} : ?.

To deal with these two cases, we could disallow annotations on holes. The type of a hole is always ?, so we can argue that annotations are never necessary. But they can be useful if you know what the type of the expression should be, but haven't finished constructing it yet.

So I'm not sure if there's a clear path here, but I think it's worth thinking about.

Pattern/binding/parameter issues

We have a bunch of issues all somewhat related:

How should we render patterns/bindings? https://github.com/hackworthltd/vonnegut/issues/290
Better rendering of case expresisons https://github.com/hackworthltd/vonnegut/issues/271
How could we render multiple equations with pattern matching? https://github.com/hackworthltd/vonnegut/issues/272
Non-selectable nodes (e.g., case branch labels) should be indicated somehow https://github.com/hackworthltd/vonnegut/issues/263
We currently can't rename variables bound in case branches https://github.com/hackworthltd/vonnegut/issues/233

Also, these or related issues have recently come up in PRs https://github.com/hackworthltd/vonnegut/pull/303 and https://github.com/hackworthltd/vonnegut/pull/285

What these issues have in common is that we don't handle patterns/bindings/parameters very well. I, for one, am having trouble keeping track of all these different discussions, searching the tracker for related issues, and determining what are the root causes or missing functionalities.

Let's use this issue to come up with some prioritized, actionable changes that will result in patterns/bindings/parameters being first class citizens of our programs.

When no variables are in scope, hide the "insert variable" button

Once hackworthltd/vonnegut#205 is in, we can present the user with a list of in-scope variables. In its current implementation, when no variables are in scope, the UI displays the "insert a variable" button, but it then tells you that no variables are in scope, and presents only the Cancel button.

IMO, when no variables are in scope, the button shouldn't be displayed at all. @hmac points out that we currently don't know that no variables are in scope until the button is pressed and we ask the backend for the list, so this will require some rethinking.

Allow detaching bits of the tree / comment out code / copy-n-paste

What is the analogue of "copy this code & comment it out because I want to refer to it / may want it later"?

Write up some documentation on smart hole removal

This could be done via examples; e.g., see the new property tests in hackworthltd/vonnegut#175.

Support polymorphism

It's unclear whether we should do this before user-defined types (#88). If we do it before those, we'll need at least something like a built-in Maybe or List type.

Should (type/term) application and (type/term) variable construction be the same thing?

See https://github.com/hackworthltd/vonnegut/pull/291/files/5d14f54ad0e2529b6653b0e2e014fe7cca082639#r576992001 and https://github.com/hackworthltd/vonnegut/pull/291/files/5d14f54ad0e2529b6653b0e2e014fe7cca082639#r576992097

Add doctest support

haskell.nix supposedly supports doctest now:

input-output-hk/haskell.nix#427

I'll add this to haskell-hacknix to test it, and if it works, enable it here, as well.

Golden program tests

We should all have a think about some "golden program tests." I propose that we:

Come up with a set of these tests.
Document them somewhere outside the code base.
Give them names/IDs so that we can refer to them in the code base.

Point 3 is important because, if/when a program test breaks, it would be nice to know what program the test was trying to construct, and this can be difficult by just staring at the expected AST.

The only question I have is, is it worth it to take this principled approach for Vonnegut when we're probably nearing the end of its lifetime? On the other hand, the initial set of golden programs for Vonnegut might be useful for Primer, as well, so it could be time well spent.

Add some kind of hash to the API, as a sanity check

A few weeks ago, I thought I'd discovered a bug, but when @hmac suggested that I rebuild the backend, it turned out he was right, and it wasn't a bug -- just a mismatched front/backend. I suspect this is a risk whenever we check out a new branch and rebuild, due to our use of make and git's obeying timestamps on files.

I'm now in the habit of running make clean whenever I switch branches, but it would be nice to have some kind of sanity check in the API so that, when the frontend starts up, the first thing it does is ask the backend whether they're running the same version. This could be possibly be as simple as embedding the current git HEAD SHA256 in the code (automatically, of course). That wouldn't be enough to detect dirty working trees, but that's not the scenario I'm worried about.

Store selection in backend

From hackworthltd/vonnegut#375:

As we've discussed, it would be really nice if the selection was remembered as part of the session. Then we could make changes to any part of the UI and see the results without having to interact with Vonnegut at all.
Without this feature, we might sometimes find that auto-refresh-on-save is actually an annoyance. Especially seeing as Purs IDE requires saving to disk in order to update anything.

And from @brprice in that same thread:

This may be related to hackworthltd/vonnegut#374 , and I have a similar concern in hackworthltd/vonnegut#373. It seems we are inconsistent/weird with how we set the selection. In theory we could store it in the backend always, and then on reload we will grab the program+selection off the server and be right back where we were.
I think essentially the question is how much state is frontend-only, and how much is essentially frontend-only but we would like to persist via the backend

Explain why holes appeared

It would be nice to have the equivalent of error messages. Let's explain why holes were (a) created in the first place (maybe pointing out the particular change that did so) and (b) why they are still around.

It is not clear exactly what this should mean -- there is a bunch of design work to do before any implementation.

A good first step would be to write up a few examples. It would be nice to see how different (a) and (b) can be in practice -- perhaps we only want one of them?

Type zippers

In both my pr hackworthltd/vonnegut#291 and in Action.hs, we do a nasty hack where we want to operate on a type T, but can only operate on a expression, so we actually look at ? : T.

We should do something better than this eventually!

UX improvements to cursor location

I think there's some small changes to the cursor location that will improve the user experience a bit:

When inserting an arrow type (→), leave the cursor on the LHS of the arrow. This makes it quick to flesh out the arrow type further.
When applying a function (i.e. inserting an application), leave the cursor on the RHS of the application. In my experience it's common to go from f to f ?, and the next thing you want to do is put something in the ?.

A pathological annotation case?

Here's something I just ran across while playing around. Start with:

Now try to give this definition type Bool:

Note that an annotation has been added automatically to the lambda. Now I delete the type Bool:

I'm back to step 1, except that the annotation has stuck around.

I suspect this is because we don't delete annotations anymore, but the user will probably find this behavior odd. Can we do anything clever here? Should we?

(p.s. Using Undo after step 2, rather than Delete, removes the annotation, as you would expect. The issue is that the user will probably not understand why Undo and Delete lead to different end states.)

Define Primer-the-language

We used this issue to discuss and track what remained to be included in the Primer 1.0 definition.

Terminology

Status: tracked in #333

I would like to be precise about terminology for concepts in Primer. This will be necessary to align our code documentation, help system, curricula, and marketing material.

Examples of terminology we need to nail down:

Functions: do they take "inputs"? "Parameters"? "Arguments"? Should we use different terms depending on whether we're talking about a function definition or the application of a function in an expression?
Variables: this term is overloaded in programming, and variables are also notoriously confusing to beginners. We are potentially even more vulnerable to this confusion as most students coming to Primer with previous programming experience will likely have learned that variables are things that you can modify. The mini-Haskell course we developed in 2020 did not use the term "variable" at all, in favor of "bindings," "names," and "values." Should we adapt this terminology for Primer?

Should shadowing be permitted?

Status: tracked in #332

Our implementation guarantees that node IDs are unique, and therefore names need not be. However, this is probably bad UX: in the UI, how does the student tell the difference between foo with ID 5 and a "shadowing" foo with ID 100?

This could potentially be solved via the UI itself: for example, when selecting a node containing a name, we could literally draw an arrow back to the definition. (But what about names imported from other modules? Presumably these would be shown with their namespaced name and therefore not strictly shadowing?)

At any rate, we need to decide whether this is permitted in the language definition, or whether we should not allow it for UX reasons.

Audit evaluation rules

Status: tracked in #325

More flexible pattern matching

Status: definitional issues are tracked in #334, #335, #336. UI issues will be tracked in primer-app when the time comes.

At the moment, we only support the most basic pattern matches. This may have to suffice for a 1.0, but if we stick with that plan, it's something we should address soon after shipping 1.0.

Time-depending, we may, at a later stage but pre-1.0, try to implement the following, which are not currently supported:

Wildcard patterns.
Pattern-matching on primitive constructors.

However, these would require writing an overlap checkers (and, for the special case of primitive constructors, an exhaustivity checker; however, this could probably be done as a special case where we simply ensure that the wildcard pattern appears as one of the patterns). This would also require some UX/UI features that we don't currently support, since at the moment, patterns are provided automatically by the backend based on the scrutinee's type.

Are constructors synthesizable or checkable?

Status: done

Should constructors be treated as synthesizable, or checkable? See @brprice's comment below for more details: #132 (comment)

We decided that it's helpful to the student if these are synthesizable.

`let type`

Status: tracked in #337

We have support for this in the core language because we use it for eval mode. Should we make it available to the student for use in their programs?

This should be reasonably easy to support in core, but it has implications on error messages in the type-checker and it's probably not worth spending on time pre-1.0. This feature is not particularly useful to beginners anyway.

Editable typedefs

Status: in progress

Note: at this point, we believe that no Primer language changes will be needed to support this, and it can all be done via a series of edits with the action system & the type-checker's smart holes implementation. Therefore, I may remove this section from the document at a later date.

We would be remiss if we didn't provide some way for students to edit (or at least delete) existing user-defined typedefs. This has lots of knock-on UX effects that we need to consider, like what happens to existing code when a type changes: there is probably no way to ensure the student's program still type-checks after a change is made, so what do we do with code that used to work but now doesn't?

A "cheap" way to do this would be to create a new type and keep the old one around until the student has time to make the changes & the old definition is no longer in use, at which time Primer would allow the old definition to be deleted (no more uses). Perhaps, given our goal of wrapping up Primer definition by the end of Q1 2022, this is the safe option for an initial implementation. However, students will probably be changing their types frequently, so it should really have better UX than "sorry, it's your problem."

Holes in type definitions

Status: tracked in #270

Should we support holes in type definitions? It may be useful pedagogically and it may be necessary to support editable typedefs. See #270 for more details.

I think we have arrived at a consensus that we should support these.

Naming rules

Status: tracked in #339

Related to the semantic typography bit I posted about above, do we want to adopt any of Haskell's naming rules? Here I mean things like, "(most) constructor names must be capitalized," "variables must begin with a lower-case latter," etc.

Also, per @brprice: can type constructors or value constructors be named the same thing as term definitions?

There is quite a bit to consider here and it mostly comes down to how easy we want to make it to write a plaintext version of Primer. Personally, my thinking on this is shifting more towards not having any particular naming rules in the language definition. Then conformant implementations would be free to implement stricter naming rules based on their own needs; e.g., a plaintext implementation might want to disallow spaces in names. Such implementations would still need to accept programs from implementations with no naming restrictions to be conformant! However, so long as they can mangle these more liberal names in such a way as the program semantics are preserved, there should be no issue here.

Internationalization support

Status: tracked in #339

[Related: hackworthltd/hackworthltd#25, but here, I'm not speaking about UI support for internationalization (i18n), but the core langauge itself.]

We have decided that for resource constraint reasons, we're not going to focus on i18n support in 1.0. However, I do want to make sure the core language is ready for i18n when the time comes.

What exactly does this mean? ... I'm not exactly sure yet! But we should think through some of the implications.

One example: keywords in our language (e.g., match ... with) should be generalized. When we say match with in the language spec, we shouldn't mean the literal words "match" and "with". Maybe in a Hanzi-based writing system, the best way to express the same concept isn't even two words (maybe it's one, maybe it's more). So while the core language spec might use match with, this should refer to the semantics of pattern match, and this should be divorced from its lexical syntax.

Another example, again involving keywords: suppose match with is best expressed in French as assortir avec. Now imagine the following (silly) program, "projected" in French:

-- Assume we don't enforce capitalization here, for the sake of argument
data preposition = to | from | with

f : preposition -> Bool
f x = assortir x avec
  to -> True
  from -> False
  with -> True

Now the student shares this program with another student who speaks English, who sees it as:

data preposition = to | from | with

f : preposition -> Bool
f x = match x with
  to -> True
  from -> False
  with -> True

Capitalization bikeshedding aside, we should definitely accept this program. But this raises some potential pedagogical issues about mixing up which with is meant. So there are some consequences to being i18n-friendly about language keywords.

Text syntax

Status: tracked in #339

Do we need to define a canonical text syntax for everything, including typedefs? Previously, I might have said, "No," to this question. However, recent feedback we've received during demos makes me think that we need to treat text view in Primer as a first class citizen, and therefore we should probably also specify a text syntax in the language definition that matches what the student sees in text view.

If Primer catches on, people may take our open source bits and want to write a traditional parser/interpreter/compiler for it, in which case, it might be helpful to have an agreed text representation, with no "cheating."

Personally, I am increasingly leaning towards just specifying a canonical AST representation (e.g., as JSON) and not worrying about a canonical text syntax. This would open the door to Primer implementations in languages other than English, for example. (See Internationalization support above.)

Bounded recursion

Status: tracked in #338

Should we create a language subset ("Proto-Primer"?) with bounded recursion?

At the very least, I like the idea from Etherium of giving a computation some "gas." I'm also curious about Conor McBride's "potato-powered proofs," but don't know much about how it works, or how accessible it would be to beginners.

On a related note, I'd like to think about how to implement a decent termination checker in the standard language.

Model top-level definitions as `letrec`s

Status: dropped. We went a different way.

Not only would this mean that top-level definitions aren't special, but also it should also make it easy to represent a block of letrecs as a canvas-within-a-canvas.

But what about types?

Module system

Status: in progress, though the module system for 1.0 will probably be fairly basic

Could we implement modules via letrec? As with the top-level definition question above, we might not be able to do this for exported types.

Do we need fancy ML-style modules? Something Backpack-ish? Probably not, but before reaching for the most obvious and/or simplest implementation, I'd like to review what the PL community has learned about module systems over the past 30-ish years.

Should the Prelude be part of the language definition?

Status: yes

This is pretty much just a yes/no question. Personally, I think the Prelude should be part of the language definition, as the language landscape is littered with languages that suffered for lack of a standard library (e.g., the Lisp+Scheme family of languages).

Edit: Having given this some more thought since our meeting this morning, I think it makes sense just to add the Prelude to the language. So long as we keep it relatively minimal and learn from Haskell's mistakes (e.g., no partial functions!), this seems uncontroversial to me.

Haskell-style (or -ish) type classes

Tracked in #343

Do we need them, or will our module system be powerful enough that we won't? I believe our current thinking is that we don't really know how to solve the coherence problem with a parameterized module system, except maybe if we go the fully dependently-typed route, which we're apprehensive about for a pedagogical language (or at least one designed for beginners, anyway).

We are aware of some attempts to improve upon ML's module system in order to obviate the need for type classes, but for various reasons, none of them have caught on, and meanwhile, Haskell developers seem more or less pleased with Haskell type classes.

Therefore, our current tentative plan is to do a relatively straightforward non-dependently-typed parameterized module system, and then consider whether we need to also add a type class system.

Built-ins/primitive types

Tracked in #272 and #344

We now support Int and Char primitives, which is probably good enough for 1.0. String will be implemented as List @Char.

We still need to decide the following for 1.0:

What primitive functions will be provided?
Where will they live?
How do we handle partial application of primitive functions? See #272.

Presumably the answer to the 2nd question is "In the Prelude."

Scoped type definitions

Status: dropped, no current use case.

Is there any reason why we shouldn't support scoped type definitions? We already have let type for type variables in eval. Shouldn't we expose this to the user language? Can we do something similar for type definitions?

Saving the student's work when a type changes

Status: irrelevant, as we don't think there are any definitional issues here.

There are a few scenarios in which, in response to the student changing the type of a node in the tree, the student may lose work. I'll try to enumerate them here. (Note: at the moment, this list is not exhaustive.)

Scrutinee type changes in `match _ with` expressions

In match x with ...: If the type of x changes, or if the student deletes x from the scrutinee location, causing the scrutinee to have type ?, then any expressions the student has written in her pattern match branches will disappear. This may be jarring. Also, if the student has written one or more non-trivial expressions in these branches, it is probably bad UX to simply delete them.

Could we/should we allow non-empty holes in pattern matches? Offhand, this seems like it would be pretty tricky. If the scruntinee is just a hole ?, then fine — the idea of keeping around some old patterns for type A under the match with expression seems simple enough. But then when the student fills the scrutinee hole again with a value of type B, what would we do with the old patterns for type A?

(Side note: I believe that Hazel (the editor, anyway, if not the Hazelnut calculus) allows something like this?)

Semantic typography

Status: never seriously considered for inclusion. This sort of thing has now been incorporated into the discussions above on text syntax, names, and internationalization support.

It seems obvious we should use typography (different typefaces, weights, styles, colors, etc.) to distinguish types, terms, etc. This is pretty standard fare in syntax highlighters for conventional languages.

However, during our 2021-05-11 developer meeting, we discussed whether it might be a good idea to make typography semantic. For example, if we want to distinguish between a local variable a and a global variable a, perhaps we should call the global variable a instead, where the boldface is part of global variable a's name.

One concern I have about this is that it might be too subtle, especially for students with vision impairments. Some of these students might use screen readers, which could solve the problem, but not all will.

Another concern I have about this idea is that it won't transfer. No other language I'm aware of makes such a distinction. From the perspective of transfer, we'd be better off sticking with using namespaces to distinguish global variables from local ones, assuming we want to do that at all (as obviously most languages don't support this distinction, either).

hackworthltd / primer Goto Github PK

primer's People

Contributors

Stargazers

Watchers

Forkers

primer's Issues

Conventional names based on type

Scoping and variable capture

Versioning

Terminology

Should shadowing be permitted?

Audit evaluation rules

More flexible pattern matching

Are constructors synthesizable or checkable?

let type

Editable typedefs

Holes in type definitions

Naming rules

Internationalization support

Text syntax

Bounded recursion

Model top-level definitions as letrecs

Module system

Should the Prelude be part of the language definition?

Haskell-style (or -ish) type classes

Built-ins/primitive types

Scoped type definitions

Saving the student's work when a type changes

Scrutinee type changes in match _ with expressions

Semantic typography

Recommend Projects

Recommend Topics

Recommend Org

`let type`

Model top-level definitions as `letrec`s

Scrutinee type changes in `match _ with` expressions