entomy / ada-improvements

Repository of Ada language improvement ideas

License: GNU General Public License v3.0

ada ada-lang language language-design

ada-improvements's Introduction

Hi there 👋

ada-improvements's Issues

Interfaces as type shapes

Ada does:

Interfaces are essentially abstract tagged records with multiple inheritance. A type can only satisfy an interface by explicitly inheriting from it.

It should be:

The interface should be a type shape, or feature contract. That is, it describes what the type has present, but is not actually inherited from.

Rationale:

This would require the compiler verifying the contract is upheld on the type side, rather than just looking for null pointers after vtable synthesis. However it would also enable interfaces as type parameters working as type shapes, so as long as the type fulfills the feature contract, it can be used in its place; regardless of whether it inherited the interface. In fact in this approach, there is no inheritance of interfaces, just verification of contract.

Potential Alternatives:

Generics already implement type shaping, and maybe that can be reused instead.
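As a rough illustration of that alternative, the sketch below uses a generic formal type plus a formal function as the "shape". The names Shape_Demo, Shaped, Describe, Print_Description, and Point are all hypothetical; the point is only that Point is a plain untagged record that inherits from nothing, yet it can be used because it supplies the required operation.

```ada
with Ada.Text_IO;

procedure Shape_Demo is

   --  The "shape": any type that comes with a Describe function.
   generic
      type Shaped is private;
      with function Describe (Item : Shaped) return String;
   procedure Print_Description (Item : Shaped);

   procedure Print_Description (Item : Shaped) is
   begin
      Ada.Text_IO.Put_Line (Describe (Item));
   end Print_Description;

   --  A plain, untagged record; it does not inherit from any interface.
   type Point is record
      X, Y : Integer;
   end record;

   function Describe (Item : Point) return String is
     ("Point at" & Integer'Image (Item.X) & "," & Integer'Image (Item.Y));

   --  The contract is checked here, at instantiation time.
   procedure Print_Point is new Print_Description (Point, Describe);

begin
   Print_Point ((X => 1, Y => 2));
end Shape_Demo;
```

The cost of this approach is the explicit instantiation for every type, which is part of what the proposal would remove.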

Named/Anonymous parity

Ada does:

For reasons not well understood, and probably related to changing design ideals while maintaining backwards compatibility, named and anonymous types do not share the same support or usability. This is especially the case with access types, which have wildly different rules for each form.

It should be:

Named types should just be an alias of the anonymous type, allowing for greatly simplified rules, which helps both the programmer and the implementer.

Name aliases should still be provided because they are useful for nominative typing enforcement, as well as giving contracts something to be defined on.

Rationale:

RM 3.10.2 is notoriously complex, and is also only part of the problem. Instead of compounding complexity with different rules for named and anonymous types, just pick the rules for one, and have the two forms be fundamentally equivalent.

Anonymous Arrays

Should Support:

function Arithmetic_Mean(values : Integer[]) return Integer;

(using the syntax of #8)

Rationale:

Oftentimes we don't need a nominatively discriminated array type, and just want to define a function or operator for any array of a specific type. Adding in anonymous arrays wouldn't take away the cases where nominative discrimination is desired, and would merely make things easier on library maintainers and consumers, since they wouldn't need to provide and instantiate generics for what is literally not varying in any regards other than name. This makes Ada far more suitable to rapid development.
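For comparison, here is a minimal sketch of what the same function needs in current Ada. The names Mean_Demo and Integer_Array are made up for the example; the relevant part is that the array type has to be declared and named before it can appear in the profile.

```ada
with Ada.Text_IO;

procedure Mean_Demo is

   --  Today the array type must be given a name before it can be used
   --  as a parameter type.
   type Integer_Array is array (Positive range <>) of Integer;

   --  Assumes Values is not empty; an empty array would divide by zero.
   function Arithmetic_Mean (Values : Integer_Array) return Integer is
      Sum : Integer := 0;
   begin
      for V of Values loop
         Sum := Sum + V;
      end loop;
      return Sum / Values'Length;
   end Arithmetic_Mean;

begin
   --  Mean of 2, 4 and 6.
   Ada.Text_IO.Put_Line (Integer'Image (Arithmetic_Mean ((2, 4, 6))));
end Mean_Demo;
```

Any caller that already has its own array-of-Integer type must convert to Integer_Array first, which is the friction the proposal targets.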

True Pure_Function

Ada does:

pragma Pure_Function();

with Pure_Function

Nothing checks this. No, seriously: the compiler takes the programmer's word that the function is pure.

It should be:

Actually statically verified

Rationale:

This isn't just a notice to programmers that a subprogram may be "conceptually pure", whatever that actually means. It should be formally verified because there are numerous optimizations and various features that can be implemented in a function-level environment.

One of the most obvious and easy to explain is compile-time expression evaluation. This is only possible in a function-level environment, and formally and statically verifying this directive would allow for this possibility.

Another is proving thread safety. A function-level subprogram is always safe to use in a concurrent environment without locking.

Return by reference simplification

Ada does:

-- This is an example of the RM definition of a "reference_type"
type Reference_Type(Element : not null access Element_Type) is limited null record
   with Implicit_Dereference => Element;
type Constant_Reference_Type(Element : not null access constant Element_Type) is limited null record
   with Implicit_Dereference => Element;

function Reference (Self : aliased in out Container_Type; Index : Index_Type) return Reference_Type;
function Constant_Reference(Self : aliased Container_Type; Index : Index_Type) return Constant_Reference_Type;

It should be:

function Reference (Self : aliased in out Container_Type; Index : Index_Type) return aliased Element_Type;
function Constant_Reference(Self : aliased Container_Type; Index : Index_Type) return aliased constant Element_Type;

Rationale:
The current status quo has a number of issues:

It exposes access types to the client when it shouldn't need to
It's difficult to read and maintain. It is doubtful a new user will see it and understand what is going on.
They can be difficult to actually implement for some types without leaving a dangling reference
It makes creating iterators, indexers, and other similar things much less intuitive than it should.

The proposed syntax relies on the compiler to provide all the underpinnings. It could use the same mechanism as the status quo, or simply use access types under the hood, but not actually expose them to the user. In addition, by hiding the implementation details by default, it makes the types safer to use than they are currently.

For simplicity of implementation, the semantics of the proposed change could mirror the existing semantics of the status quo as a starting point.

If for some reason, the method expressed by the status quo version is needed for a special case, it could still be used by providing an additional aspect:

type Reference_Type(Element : not null access Element_Type) is limited private
   with Implicit_Dereference => Element;
type Constant_Reference_Type(Element : not null access constant Element_Type) is limited private
   with Implicit_Dereference => Element;

function Reference (Self : aliased in out Container_Type; Index : Index_Type) return aliased Element_Type
   with Reference_Return_Type => Reference_Type;
function Constant_Reference(Self : aliased Container_Type; Index : Index_Type) return aliased constant Element_Type
   with Reference_Return_Type => Constant_Reference_Type;

The name of the aspect is just a placeholder and can be something better. This part is only there for the limited cases where a special type is needed, for example if GNAT wanted to keep its internal tamper check logic or similar.

Semicolon termination

Ada does: All statements are terminated in semicolons.

This is something contentious and should be discussed, as including them just because of legacy use isn't a very valid argument. However it should be noted the use of semicolons isn't a legacy artifact either.

Attribute syntax

Ada does:

Type'Attribute(Object)

It should be:

Object'Attribute

Rationale:
A good type inference system can tell what the type is based on the object, and then call the correct function as appropriate

Aggregate Initialization for Tasks and Protected Types with Discriminants

Should Support:

Aggregates for task types and protected types:

V1 : Some_Task_Type := Some_Task_Type'(Some_Discriminant  => An_Option);
V2 : Some_Protected_Type  := Some_Protected_Type'(Some_Discriminant => An_Option);

Rationale:

This allows for creating arrays of discriminated tasks (assuming they have default initialization of course) without having to make wrapper types or constructing functions. Consider a record with a default discriminant:

type Option_Type is (A,B,C,D);
type Record_Type(Option : Option_Type := A) is null record;

You can then do:

A1 : array(1..10) of Record_Type :=
    (1 => (Option => B),
     2 => (Option => D),
     others => (Option => C));

This cannot be done with tasks or protected types directly:

task type Task_Type(Option : Option_Type := A);

-- This is illegal for task and protected types.
A1 : array(1..10) of Task_Type :=
    (1 => (Option => B),
     2 => (Option => D),
     others => (Option => C));

Ada is a strongly typed language, so it should be able to do this. For ambiguous situations, qualified expressions could be used if this was supported. This is just to ensure all types can be aggregate initialized in the same way.

Implicit/Explicit conversions

Should Support:

implicit conversion definition:

conversion Source_Type to Destination_Type is implicit
begin
     return --conversion
end conversion;

explicit conversion definition:

conversion Source_Type to Destination_Type is explicit
begin
    return --conversion
end conversion;

use:

Source_Type as Destination_Type

Rationale:
Strong type safety is a major selling point of Ada as it prevents many kinds of errors. However it also makes some totally sensible things much more difficult than they should be. By adopting the default position of extreme nominative type safety but allowing conversions to be explicitly defined, and then to occur explicitly or implicitly, sensible conversions can be allowed without having to resort to dangerous things like view overlays.

For example consider the case of upsizing numeric types. In standard Ada it is required to explicitly convert a Short_Integer into a Long_Integer even though there is a guarantee of no data loss. In languages that adopt the implicit/explicit conversion model, this is a one-way implicit conversion. That is, conversion from the smaller to the larger size occurs implicitly as there is no potential issue. The other direction however is either left undefined or explicit, as it potentially causes data loss.

Explicit conversions are of interest specifically to take advantage of a special conversion syntax, instead of the To_Type() function calls that are scattered through so much of Ada sources.
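As a point of reference for the widening case described above, here is a minimal sketch of what standard Ada requires today. The names Widening_Demo, Small, and Large are illustrative only, and Short_Integer and Long_Integer are assumed to be provided by the compiler, as they are in GNAT.

```ada
with Ada.Text_IO;

procedure Widening_Demo is
   Small : constant Short_Integer := 1_000;
   Large : Long_Integer;
begin
   --  Large := Small;              --  rejected: the two types differ by name
   Large := Long_Integer (Small);   --  accepted: explicit conversion
   Ada.Text_IO.Put_Line (Long_Integer'Image (Large));
end Widening_Demo;
```

Under the proposal, the commented-out assignment would be legal once an implicit conversion from Short_Integer to Long_Integer had been defined.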

Numeric promotion

Ada does:

I : Integer := 10;
LI : Long_Integer := 10;
LLI : Long_Long_Integer := 10;

It should be:

The syntax would actually be largely identical, although there is a difference. So I think it would be best to discuss the change in semantics and underlying behavior first.

Numeric promotion is a concept by which the size of a numeric variable, parameter, etc. is kept as the smallest that can represent the current value, and is automatically expanded as necessary. There's some internal trickery to make this work. But the general idea is that a snippet such as:

Factor : Integer := 2 ** 30;
Result := Factor + Factor;

Would only throw an exception if the largest numeric value the processor supports is 32 bits. On a 64-bit platform this would be recognized as "will overflow" or "could overflow", and will instead be computed with the next size up, in this case 64-bit arithmetic.
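To make the contrast concrete, the sketch below shows roughly how this plays out in current Ada, under the assumption that Integer is 32 bits wide and overflow checks are enabled (both typical of GNAT, but not guaranteed by the language). The names Overflow_Demo and Wide are hypothetical.

```ada
with Ada.Text_IO;

procedure Overflow_Demo is
   Factor : Integer := 2 ** 30;
   Wide   : Long_Long_Integer;
begin
   --  Workaround today: widen the operands by hand before adding.
   Wide := Long_Long_Integer (Factor) + Long_Long_Integer (Factor);
   Ada.Text_IO.Put_Line (Long_Long_Integer'Image (Wide));

   --  The direct form overflows when Integer is 32 bits wide.
   Factor := Factor + Factor;
   Ada.Text_IO.Put_Line (Integer'Image (Factor));
exception
   when Constraint_Error =>
      Ada.Text_IO.Put_Line ("Integer addition overflowed");
end Overflow_Demo;
```

With promotion, the manual widening in the first half would be done by the compiler whenever it cannot prove the narrower arithmetic is safe.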

There are some restrictions that promotion has to put in place to keep things sane and lossless.

Promotion from an integer to a float is unacceptable. Floating points are a useful tool but aren't simply an "integer with an exponent".

Similarly, demotion generally isn't possible. There are a very small set of conditions where demotion can be proven safe, but generally this can't happen. In cases of "could demote", no demotion can ever happen.

What this means in practice is that programmers are less concerned about what size to use, and the resulting algorithm is generally as compact/efficient as hand written ones. But there are admittedly cases where hand written ones will be more compact.

Furthermore embedded programmers also have certain constraints where a given value should never be beyond a certain size (in terms of storage size, not upper bound of the value). For this reason, I propose an addition to the typical promotion system. Where the 'Size of a value, parameter, field, etc has been explicitly set, no promotion shall occur. Rather, the value will be exactly of that size and always treated as that size, as in, classical semantics.

Subroutine simplification

Ada does:

function Example_Function(… parameters …) return Value;

procedure Example_Procedure(… parameters …);

It should be:

function Example(… parameters …) return Value;

function Example(… parameters …);

Rationale:
A return value is essentially just another parameter, so let's treat it more like that. Most languages don't really make the distinction, and depending on the syntax may use a void to say it doesn't really return anything.

I should also explain that in Ada the two are not simply distinguished by whether there is a return type, but are semantically distinct. Functions can be designated pure while procedures cannot. If keeping compatibility with older Ada standards, procedures can have out parameters while functions cannot. The two should be compatible, differing only in return type, and making them the same construct accomplishes that in the most sensible way.

This also conveniently simplifies parsing since subroutine definitions can be determined with one definition type, with a simple check inside for whether a return clause is found.

Access type name change

Access types should instead be renamed reference types, since this is more clear about what they actually are in a way most programmers think about them.

Technically access is correct, in that it's an access to a heap allocated object.

Except actually this doesn't have to be the case. Across programming languages, there are often ways to get a reference to a value on the stack, or to arbitrary memory locations (managed, but not necessarily by a heap).
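Ada itself already allows this. The following minimal sketch (with the hypothetical names Stack_Access_Demo, Integer_Access, Local, and Ref) takes an access value to an ordinary declared variable, with no allocator involved.

```ada
with Ada.Text_IO;

procedure Stack_Access_Demo is
   type Integer_Access is access all Integer;

   --  An ordinary declared object, not something created with "new".
   Local : aliased Integer := 1;
   Ref   : constant Integer_Access := Local'Access;
begin
   Ref.all := 2;   --  writes through to Local
   Ada.Text_IO.Put_Line (Integer'Image (Local));
end Stack_Access_Demo;
```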

The way we really think of these types is as references to the actual value, and is why in almost every other language after the 90's they have been referred to as reference types. Simply redefining them to use reference Type instead of access Type, or possibly just ref Type would bring the language more in line with conventional thought.

Based literal e-notation removal

Remove:

16#DEAD.BEEF#e8

Rationale:

This is a bizarrely convoluted syntax with barely any use. The specified base only applies to the digits between the # marks (DEAD.BEEF here); the exponent is still written in base 10, even though it scales the value by a power of the specified base. Given that the primary uses for specifying numbers in non-base-10 are hardware interfacing, protocol implementation, and bitmasks, e-notation just isn't very necessary.
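A small sketch of how the notation behaves in current Ada may help show why it is confusing. The exponent is written as a decimal numeral but multiplies the value by a power of the literal's own base. The procedure and constant names are made up for the example.

```ada
with Ada.Text_IO;

procedure Based_Literal_Demo is
   A : constant Integer := 16#E#E1;   --  14 * 16**1 = 224
   B : constant Integer := 16#E0#;    --  the same value without an exponent
   C : constant Integer := 2#1#E8;    --  1 * 2**8 = 256
begin
   Ada.Text_IO.Put_Line (Integer'Image (A));
   Ada.Text_IO.Put_Line (Integer'Image (B));
   Ada.Text_IO.Put_Line (Integer'Image (C));
end Based_Literal_Demo;
```

In each case the exponent form can be replaced by writing out the digits directly, which is the argument for dropping it.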

Goal-Direction

Ada does:

Normal single return model with exception handling

It should be:

Implicit multi return model with the implicit return of a success state. For example, Boolean operators like < and > would still explicitly return Boolean but would also implicitly return their success state. If the operation failed, this would cascade through all remaining operations in the expression. This enables code like:

Comparisons:

Boolean Comparisons

if 1 < X and then X < 10 then
    Put_Line("X is between 1 and 10");
end if;

vs.

if 1 < X < 10 then
    Put_Line("X is between 1 and 10");
end if;

Copy File [1]

try {
   while ((a = read()) != EOF) {
     write(a);
   }
 } catch (Exception e) {
   // do nothing, exit the loop
 }

vs.

while write(read())

Rationale:
This greatly reduces the number of times exceptions are required, and in many cases allows a program to be compiled without any exception handler, which greatly improves runtime performance. Furthermore, this approach often simplifies developer code by allowing expression of the goal rather than of things like exception handlers.

Implementation Method:
The code necessary to support this is as simple as a check against the success state, which could probably be permanently tracked in a register. If the expression is still successful, continue execution like normal, and if not, jump to either the end of the function, or back to the call stack and into the next function like normal.

Properties

Should Support:

property Example : Type is
get
    return Value;
set
    value := Value;
end Example;

Rationale:
Properties are useful for three reasons:

Introducing a "value" that is calculated at runtime instead of being stored in a data structure
Providing validation logic for a value, such as name validation or shape validation
Binding to languages which implemented separate getters and setters for what is conceptually a variable
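For reference, the first two uses listed above are usually written in current Ada as a private type with separate getter and setter subprograms. The sketch below is one hypothetical way to do that; Shapes, Circle, Radius, Set_Radius, Area, and Stored_Radius are all invented names, and the value used for pi is an approximation chosen for brevity.

```ada
with Ada.Text_IO;

procedure Property_Demo is

   package Shapes is
      type Circle is private;
      function Radius (Self : Circle) return Float;                 --  "get"
      procedure Set_Radius (Self : in out Circle; Value : Float);   --  "set"
      function Area (Self : Circle) return Float;   --  computed, never stored
   private
      type Circle is record
         Stored_Radius : Float := 1.0;
      end record;
   end Shapes;

   package body Shapes is
      function Radius (Self : Circle) return Float is (Self.Stored_Radius);

      procedure Set_Radius (Self : in out Circle; Value : Float) is
      begin
         --  Validation logic lives in the setter.
         if Value <= 0.0 then
            raise Constraint_Error with "radius must be positive";
         end if;
         Self.Stored_Radius := Value;
      end Set_Radius;

      function Area (Self : Circle) return Float is
        (3.14159 * Self.Stored_Radius * Self.Stored_Radius);
   end Shapes;

   C : Shapes.Circle;
begin
   Shapes.Set_Radius (C, 2.0);
   Ada.Text_IO.Put_Line (Float'Image (Shapes.Area (C)));
end Property_Demo;
```

A property would fold the Radius and Set_Radius pair into one declaration that callers read and assign like a variable.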

Guarded Recursion

Ada does:

Functions/Procedures can just recurse freely

It should be:

function Fibonacci(P : Positive) return Positive is Recursive
    if P <= 2 then
        return 1;
    else
        return Fibonacci(P - 1) + Fibonacci(P - 2);
    end if;
end Fibonacci;

This syntax assumes #30

Rationale:

The thing is, recursion is very powerful and expressive, but it's not always obvious and can happen by accident, such as through a typo. The result is an infinite loop that runs until the call stack eventually overflows or, where tail recursion optimization was possible, never terminates.

The idea is that unless recursion is specifically requested, the analyzer should notice that the function is calling itself, and thereby raise an error.

Orthogonal block terminators

Ada does:

In most cases:

if True then
end if;

Terminating with the same keyword that introduced the block

but also:

package Example is
end Example;

package Example is
end;

Terminating with the name of the block, or just end

It should be:

end <introducing keyword> (<name>);

Rationale:
Lack of orthogonality confuses programmers, as they have to remember when a different syntax is used, and what that syntax is. Furthermore, it can (and in this case does) complicate parsing.

The <introducing keyword> should be mandatory, as it makes matching substantially easier, even if using advanced Regex engines. The <name> should stay optional, as it's not required for parsing, and in small code snippets is easily matched by a human, but allows for explicit naming in more complicated sources.

Constant declaration

Ada does:

I : constant Integer := 1;

It should be:

constant I : Integer := 1;

Rationale:
This proposal is twofold:

This enables the more orthogonal design of introducing everything with the keyword of what's being introduced.
It's better for the parser as we have a clear introduction instead of repeatedly guessing why an identifier is at the start of a statement.

While constant is used, const would be fine. I'm optimizing for syntax because intellisense greatly reduces keystrokes regardless.

Use [] for indexing instead of ()

Ada does:

Container(1)

It should be:

Container[1]

Rationale:
This is a minor adjustment just to be more clear about the operation being performed.

Type Conversion

Ada does:

F : Float := 1.34;
I : Integer := Integer(F);

It should be:

F : Float := 1.34;
I : Integer := F as Integer;

Rationale:
Introducing an as operator fits with the general approach of Ada of using keywords in most places. Furthermore, it distinguishes this from a function call, which it definitely should not be viewed as. Introducing a well defined operator also helps #11.

as should bind more tightly than the current highest_precedence_operator

`not null` by default

Ada does:

Param : not null access Type; -- non null

Param : access Type; -- null

It should be:

Param : access Type; -- non null

Param : nullable access Type; -- null

Rationale:
Nulls, while very useful, are also very dangerous. The default should be the safer option. C# gets this right.
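The sketch below shows roughly what the opt-in check looks like in current Ada. For simplicity it uses a named access type with a null exclusion rather than the anonymous form shown above; Null_Exclusion_Demo, Integer_Access, Show, Value, and Nothing are hypothetical names.

```ada
with Ada.Text_IO;

procedure Null_Exclusion_Demo is
   type Integer_Access is access all Integer;

   --  The null exclusion has to be requested explicitly today.
   procedure Show (Param : not null Integer_Access) is
   begin
      Ada.Text_IO.Put_Line (Integer'Image (Param.all));
   end Show;

   Value   : aliased Integer := 42;
   Nothing : Integer_Access;   --  access objects are null by default
begin
   Show (Value'Access);
   Show (Nothing);             --  fails the null-exclusion check at the call
exception
   when Constraint_Error =>
      Ada.Text_IO.Put_Line ("null passed to a not null parameter");
end Null_Exclusion_Demo;
```

Without the not null in the profile, the failure would move inside Show, to the point where Param.all is evaluated.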

Accessibility Modifiers

Ada does:

Public, "private", or whatever you want to call only being in a package body. Public is public. Private does not mean what it says on the box. And putting something in a package body without an entry in the spec actually means private.

Furthermore, these are done as a block. The problem is however, certain Ada processors use static elaboration despite the fact that this is supposed to be dynamic, and the elaboration rules even when dynamic elaboration occurs are not easy to wrap your head around. What this often means is that a public package member can not easily depend on a private package member. Incomplete declarations work around this issue, but there are single solutions which solve that workaround, and more.

It should be:

Firstly, accessibility modifiers should directly be associated with the member, and not introduce a block. Not only is this easier to parse, it's easier to structure code around it (since it's not introducing a block that has nothing to do with code structure).

Secondly, there should be more accessibility modifiers for more granular control beyond the minimal options Ada provides.

public - this is still exactly what it is
protected - this only applies to members of inheritable types and ensures the member can be accessed within the context of the inheriting type, but not outside of it
internal - this is what Ada currently calls private, whereby the member is viewable within the entirety of that package, as well as within child packages, but not outside of that.
private - this needs to be actually private, no visibility outside of that package, end of story.

Obviously the exact keywords don't need to be exactly that (except public and private).

Defaults can remain as they are. Anything unlabeled within a specification is public by default, and anything unlabeled within a body is private by default.

Deferred Partial Instantiation of Generic Specifications

Should Support:

    generic
       type Type1 is private;
       with function Image(Value : Type1) return String;
    package My_Client_Package is new My_Package
       (Type1     => Type1,
        Type2     => Integer,
        Type3     => String,
        Something => Something_For_Integer_And_String,
        Image     => Image);

Rationale:

See AdaCore/ada-spark-rfcs#41 for full details.

The intent of this is to allow partial specializations of generics so that one can more easily lay out complex designs using "core" generics that may have a lot of formals in order to cover various use cases. This feature would allow one to take such a "core" generic and make a more "client friendly" formal parameter list. Consider a core generic:

    generic
       type Type1 is private;
       type Type2 is limited private;
       type Type3(<>);
       with procedure Something(Param1 : Type2; Param2 : Type3);
       with function Image(Value : Type1) return String;
    package My_Package is
       procedure Yay(Value : Type1);  -- calls Image internally
       -- Other Stuff
    end My_Package;

The general 99% use case for that might really only need to supply Type1 and Image, but in order to cover some internal implementations or special niche cases, Type2, Type3, and Something are necessary. However, it may not be intuitive for a general user. This suggests the new syntax so that the user of a library can instantiate packages this way:

    package P is new My_Client_Package (My_Type, Image_For_My_Type);

and not have to worry about the more complex formal parameters.

Current Ada would require either manually recreating the API of My_Package inside of My_Client_Package and doing all the scaffolding, a maintenance and error hazard. Alternately one might instantiate an instance of My_Package inside of My_Client_Package, but then one would have to call P1.P2.Yay, with the P2 being superfluous and only there because there isn't a much better way, not because the design itself requires it.
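The second workaround can be sketched as follows. This is an assumption-laden illustration rather than code from the RFC: Type3 is declared as an indefinite limited private formal so that String can be passed, the body of Yay simply prints the image, and Wrapper_Demo is an invented name.

```ada
with Ada.Text_IO;

procedure Wrapper_Demo is

   generic
      type Type1 is private;
      type Type2 is limited private;
      type Type3 (<>) is limited private;
      with procedure Something (Param1 : Type2; Param2 : Type3);
      with function Image (Value : Type1) return String;
   package My_Package is
      procedure Yay (Value : Type1);
   end My_Package;

   package body My_Package is
      procedure Yay (Value : Type1) is
      begin
         Ada.Text_IO.Put_Line (Image (Value));
      end Yay;
   end My_Package;

   procedure Something_For_Integer_And_String
     (Param1 : Integer; Param2 : String) is null;

   --  Workaround: wrap an instance of the core generic in a smaller generic.
   generic
      type Type1 is private;
      with function Image (Value : Type1) return String;
   package My_Client_Package is
      package P2 is new My_Package
        (Type1     => Type1,
         Type2     => Integer,
         Type3     => String,
         Something => Something_For_Integer_And_String,
         Image     => Image);
   end My_Client_Package;

   package P1 is new My_Client_Package (Integer, Integer'Image);

begin
   P1.P2.Yay (42);   --  the extra "P2" is the cost of the workaround
end Wrapper_Demo;
```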

Record syntax

Ada does:

type Example is record
end record;

It should be:

record Example is
end Example;

Rationale:
Records are obviously a kind of type so this is an excessive keyword. It also can't be argued that syntax should be kept that way for orthogonality as protected type and task type both violate the older syntax.

Furthermore, ending with the same name the type was introduced with helps parsers deal with nesting much better. Not that records should support nesting, but as block-statements are widely used in Ada, it's nice to keep things orthogonal. Consider other declarations like for package or function.

Enumeration syntax

Ada does:

type Example is (First, Second, Third);

for Example use (
    First => 1,
    Second => 2,
    Third => 4);

It should be:

default ordered:

enumeration Example is First, Second, Third;

representation specified:

enumeration Example is
    First := 1,
    Second := 2,
    Third := 4;

Rationale:
Enumerations are a very useful concept that was made overly cumbersome in Ada. These changes would also greatly simplify other possible changes to enumerations.

Furthermore enum would be fine but enumeration is listed here because of intellisense greatly decreasing necessary keystrokes anyways. So optimizing source for readability was chosen.

Additional generic formal parameters

Should Support:

type T is real (<>) --For any real type (float, fixed, and decimal)

type T is scalar (<>) --For any scalar type (any discrete or real)

Rationale:
Many algorithms can be made generic for any real or any scalar type, yet there is no way to express this, even though there is a generic formal type for any discrete type.
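As an illustration of the gap, the sketch below writes the same midpoint algorithm twice, once for floating-point types and once for ordinary fixed-point types, because no single formal type covers both. All names (Formal_Type_Demo, Float_Midpoint, Fixed_Midpoint, Volts, Mid_Float, Mid_Volts) are hypothetical.

```ada
with Ada.Text_IO;

procedure Formal_Type_Demo is

   --  "digits <>" matches floating-point types only.
   generic
      type Real is digits <>;
   function Float_Midpoint (Low, High : Real) return Real;

   function Float_Midpoint (Low, High : Real) return Real is
   begin
      return Low + (High - Low) / 2.0;
   end Float_Midpoint;

   --  "delta <>" matches ordinary fixed-point types only, so the same
   --  algorithm has to be written a second time.
   generic
      type Real is delta <>;
   function Fixed_Midpoint (Low, High : Real) return Real;

   function Fixed_Midpoint (Low, High : Real) return Real is
   begin
      return Low + (High - Low) / 2;
   end Fixed_Midpoint;

   type Volts is delta 0.125 range 0.0 .. 10.0;

   function Mid_Float is new Float_Midpoint (Float);
   function Mid_Volts is new Fixed_Midpoint (Volts);

begin
   Ada.Text_IO.Put_Line (Float'Image (Mid_Float (1.0, 2.0)));
   Ada.Text_IO.Put_Line (Volts'Image (Mid_Volts (1.0, 2.0)));
end Formal_Type_Demo;
```

A formal such as the proposed real (<>) would let one generic serve both instantiations.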

Anonymous constraints

Should Support:

function Discount(Cost : Currency; Rate : Integer range 0 .. 100) return Currency;

And similar

Rationale:
Sometimes, especially in mathematics, a function is valid only for a specific domain. These cases, especially 0.0 .. 1.0, really don't justify their own type, especially since Ada is nominatively typed. Not having this requires either adding this named type and having downstream code convert to it, or using more common types and adding validation code inside your function as is done in other languages, deferring errors to execution time instead.

Neither of these options is great. Interestingly enough these anonymous constraints are allowed in plenty of other cases, such as record fields and, most bafflingly, the return type of a function.

This is clearly an oversight and should be resolved.
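For comparison, one way to approximate this in current Ada is to give the constraint a name with a subtype declaration. The sketch below assumes a decimal fixed-point Currency type and a subtype called Percent; both, along with Discount_Demo, are illustrative choices rather than anything defined by the proposal.

```ada
with Ada.Text_IO;

procedure Discount_Demo is
   type Currency is delta 0.01 digits 12;

   --  Today the constraint needs a name before it can appear in a profile.
   subtype Percent is Integer range 0 .. 100;

   function Discount (Cost : Currency; Rate : Percent) return Currency is
     (Cost - Cost * Rate / 100);

begin
   --  A 25 percent discount on 80.00.
   Ada.Text_IO.Put_Line (Currency'Image (Discount (80.00, 25)));
end Discount_Demo;
```

The range is still checked at the call, but the extra declaration exists only to carry the constraint, which is what an anonymous constraint in the parameter would avoid.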

Variable declaration

Ada does:

I : Integer := 1;

It should be:

variable I : Integer := 1;

Rationale:
This proposal is twofold:

This enables the more orthogonal design of introducing everything with the keyword of what's being introduced.
It's better for the parser as we have a clear introduction instead of repeatedly guessing why an identifier is at the start of a statement.

While variable is used, var would be fine. I'm optimizing for syntax because intellisense greatly reduces keystrokes regardless.

Inheritable/Subtypable extensions

Should Support:

exception Integer_Exception is abstract Exception;

exception Integer_Overflow is new Integer_Exception;

or alternatively

Integer_Exception : abstract Exception;

Integer_Overflow_Exception : Integer_Exception;

Rationale:
In the real world, whole classes of exceptions can't be handled and need to simply terminate execution, or at least terminate a task and ask again for human input. This allows entire branches of exceptions to be handled the same way, which also encourages such branching. This is preferable as it makes it easier for developers to provide much more detailed and specific exceptions, making the resulting error so much more clear.

I prefer the first syntax as it makes it more clear Exceptions aren't a type, as they aren't considered a type in Ada. I actually agree with this distinction, but then they are declared using the same declaration syntax types use.

Exponentiation as operator

Should Support:

Value : Integer := 10;
New_Value : Integer := Value e10; -- Or literally 10e10

Rationale:
For one, number syntax can be greatly reduced by making this an operator. For two, this exposes the behavior as something that can be used after the fact. Not an incredibly common operator of course, but there's been uses for this in various algorithms such as converting a string to a number in one single pass.

Attribute definition

Should Support:

attribute Object_Type'Attribute_Name is ...Definition

Rationale:
Attributes are actually function calls to intrinsic functions in Ada. They can be passed as functions when necessary, such as a Generic parameter. Because they are names for functions, they should default to an intrinsic when possible, be undefined when not possible, and be definable when not intrinsic.

params syntax

Ada does:

function Example(Params : Integer_Array) return …

Example((1,2,3,4));

It should be:

function Example(params Params : Integer_Array) return …
--or
function Example(…Params : Integer_Array) return …
--or
function Example(Params : params Integer_Array) return ...

Example(1,2,3,4);

Rationale:
params, as seen in some languages, are syntactic sugar around a very specific but common design pattern: the last parameter of a function is an unbounded array to be operated upon. See C#'s params keyword for details and use cases.

There are a number of restrictions which make this work, that should be enforced. The common pitfall of params in some languages is weak typing or in some cases no typing. C's fprintf has issues because of this. However if the type is strongly enforced and the params array can only be of that singular type, no weakness occurs. With this restriction, it becomes a simple syntactic sugar around passing an array as a parameter. It is a hard requirement that params be the last parameter of the parameter list.

Remove declaration blocks, allow declarations anywhere

Rationale:

Ada initializes on declaration. This is extremely beneficial for certain reasons, but also expensive. Many objects can be expensive to initialize.

While not very often the case when Ada was first developed, programming has largely shifted to very small functions by default, with breaking out as quickly as possible being the ideal. This means if you know you already have the result and don't need certain objects, don't construct them, don't allocate, nothing; just return the result. Because declaration blocks come before execution blocks, and initialization occurs at declaration, the entire cost of every object you might need throughout the branch is hoisted to immediately after the jump to subprogram.

There are allowable declare blocks inside of execution blocks. However, at this point it should just become the norm. Instead of a clunky three line syntax addition, just allow the declaration at the required site, allowing explicit declaration of when initialization happens, potentially avoiding it when unnecessary.
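The three-line form referred to above looks roughly like this in current Ada. In this sketch (Declare_Demo, Buffer, Checksum, Scratch, and Total are invented names) the early return happens before the declare block is reached, so the buffer is never initialized on that path.

```ada
with Ada.Text_IO;

procedure Declare_Demo is

   type Buffer is array (1 .. 10_000) of Integer;

   function Checksum (Count : Natural) return Integer is
   begin
      if Count = 0 then
         return 0;   --  early exit: the buffer below is never created
      end if;

      declare
         --  Initialized only when execution reaches this block.
         Scratch : constant Buffer := (others => 1);
         Total   : Integer := 0;
      begin
         for I in 1 .. Integer'Min (Count, Scratch'Last) loop
            Total := Total + Scratch (I);
         end loop;
         return Total;
      end;
   end Checksum;

begin
   Ada.Text_IO.Put_Line (Integer'Image (Checksum (0)));
   Ada.Text_IO.Put_Line (Integer'Image (Checksum (5)));
end Declare_Demo;
```

Had Scratch been declared in the function's own declarative part, it would be initialized on every call, including the ones that return immediately.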
