Work in progress!

This is the working draft for the new sys package interfaces (not concrete implementations). Eventually a PR will be submitted into haxe-evolution.

TODO

Haxe compatibility: Dns (Host), Socket
Https - mostly a copy of the Http APIs, some extra SSL-specific options

New `sys` APIs

Proposal: HXP-NNNN
Author: Aurel Bílý

Introduction

Improved API for both synchronous and asynchronous filesystem operations based on Node.js; improved networking API; asynchrony primitives; I/O streams.

Motivation

Asynchrony

There is currently no good way to asynchronously perform many sys-related tasks (without manually creating Threads). Two basic primitives are added to the library:

events (and listeners)
unified callback style

Streams

The current Haxe API contains haxe.io.Input and haxe.io.Output for input and output streams. These lack:

a way to express a single read-write stream (sys.io.File exposes two separate streams rather than one RW stream)
pipelining without manual chunking
proper asynchronous operations
automatically pacing streams with different data emission / consumption rates

Filesystem

The current filesystem APIs in Haxe lack a number of important features:

asynchronous tasks
changing permissions, owners of files
symlink operations
watching for changes

Networking

Non-blocking socket operations are inconvenient to use in the current API, even though they are the only (non-Thread) solution to some real-time network communication problems. IPC is not possible, and UDP sockets are not fully featured.

The networking APIs lack proper unit tests. Certain platforms also lack full implementations of various parts of the networking API. (See HaxeFoundation/haxe#6933, HaxeFoundation/haxe#6816)

Detailed design

Modified modules (new API + backward compatibility):

Added modules:

haxe.Error - for reporting errors, see errors
haxe.ErrorType
haxe.NoData - type to represent an absence of data in generics (e.g. Callback<NoData>)
haxe.async.Callback - generic type to represent an error-first callback, see callbacks
haxe.async.Event - see events
haxe.async.Listener - event listener
haxe.io.Duplex - see streams
haxe.io.IReadable
haxe.io.IStream
haxe.io.IWritable
haxe.io.Readable
haxe.io.Stream
haxe.io.Writable
sys.DirectoryEntry
sys.FileAccessMode
sys.FileCopyFlags
sys.FileMode
sys.FileOpenFlags
sys.FileWatcher
sys.async.FileSystem
sys.async.Http
sys.async.net.Socket
sys.io.AsyncFile
sys.io.FileReadStream
sys.io.FileWriteStream
sys.net.Dns
sys.net.Net
sys.net.Server
sys.net.UdpSocket
sys.net.Url

Relevant Node.js APIs:

Errors

A haxe.Error class is added to unify error reporting in the system APIs. It has a message field, which contains the human-readable description of the error, and a type field, which can be matched in a switch.

try {
  sys.FileSystem.someOperation();
} catch (err:haxe.Error) {
  trace("error!", err);
}
// or
try {
  sys.FileSystem.someOperation();
} catch (err:haxe.Error) {
  switch (err.type) {
    case FileNotFound: // it's fine
    case _: throw err;
  }
}
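The class shape behind the two snippets above might look roughly like the following. This is a minimal sketch under stated assumptions, not the draft's actual definition: SketchError, SketchErrorType, the PermissionDenied case and the ignoreMissing helper are hypothetical names introduced for illustration. Only the message and type fields and the FileNotFound case come from the text.

```haxe
// Hypothetical stand-ins for haxe.ErrorType and haxe.Error.
enum SketchErrorType {
  FileNotFound;
  PermissionDenied; // assumed case, for illustration only
}

class SketchError {
  public final message:String;
  public final type:SketchErrorType;

  public function new(type:SketchErrorType, message:String) {
    this.type = type;
    this.message = message;
  }

  public function toString():String {
    return 'SketchError($type): $message';
  }
}

class Main {
  // Runs op and returns true if it failed only because a file was missing.
  // Any other SketchError is rethrown to the caller.
  public static function ignoreMissing(op:() -> Void):Bool {
    try {
      op();
    } catch (err:SketchError) {
      switch (err.type) {
        case FileNotFound: return true;
        case _: throw err;
      }
    }
    return false;
  }

  static function main():Void {
    var missing = ignoreMissing(() -> {
      throw new SketchError(FileNotFound, "no such file");
    });
    trace(missing);
  }
}
```

The point of the type field is visible in ignoreMissing: the decision is made on an enum case, never on the contents of message.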

Unresolved question:

There are multiple ways of expressing proper type-safe errors for the filesystem API:

errors represented by a single enum (sys.FileSystemError), with the individual cases containing all the information of that particular error
- awkward to catch individual errors (any catch would need a switch)
- fewer classes to maintain, less work to throw errors (the case names the error, so no message is needed)
errors represented by sub-classes of a single base class
- possible to catch individual subclasses in separate catch blocks
- many classes in the package (could be moved into a sub-package for errors?)
base class Error + enum for types, as implemented in the draft now

The primary aim for any solution is to be able to catch specific types of errors without having to rely on string comparison.

Events

A type-safe system for emitting events, similar to tink_core Signals, is added. An Event<T> is simply an abstract over an array of listeners (Listener<T>). An event-emitting object exposes its events as final fields.

import haxe.NoData;
import haxe.async.Event;

class Example {
  public final eventFoo = new Event<NoData>();
  public final eventBar = new Event<String>();
  // Example has no superclass, so the constructor body is empty.
  public function new() {}
  public function emitEvents() {
    eventFoo.emit(new NoData());
    eventBar.emit("hello");
  }
}

class Main {
  static function main():Void {
    var example = new Example();
    example.eventFoo.on(() -> trace("event foo"));
    example.eventBar.on(str -> trace("event bar", str));
    example.emitEvents();
  }
}

So far, no effort has been made to "hide" the emit method (unlike the Signal and SignalTrigger distinction made in tink_core).
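To make the "abstract over an array of listeners" idea concrete, here is a minimal sketch. It is not the draft's haxe.async.Event: SketchEvent is a hypothetical name, and the draft's Listener<T> is replaced by a plain function type to keep the example short.

```haxe
// Hypothetical stand-in for haxe.async.Event<T>.
abstract SketchEvent<T>(Array<T->Void>) {
  public inline function new() this = [];

  // Registers a listener; listeners are kept in registration order.
  public function on(listener:T->Void):Void {
    this.push(listener);
  }

  // Calls every registered listener with the given payload.
  public function emit(data:T):Void {
    // Iterate over a copy so a listener may register others while running.
    for (listener in this.copy()) listener(data);
  }
}

class Main {
  static function main():Void {
    var bar = new SketchEvent<String>();
    var received:Array<String> = [];
    bar.on(str -> { received.push(str); });
    bar.on(str -> trace("event bar", str));
    bar.emit("hello");
    trace(received);
  }
}
```

Because the abstract wraps a plain array, an event costs no more than the array itself at run time, while the type parameter keeps listeners and payloads in agreement at compile time.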

Callbacks

Asynchronous methods are identical to their synchronous counterparts, except:

their return type is Void
they have an additional, required callback argument of type Callback<DataType> or Callback<NoData>
- first argument passed to the callback is a haxe.Error, or null if no error occurred
- any additional arguments represent the data returned by the call, analogous to the return type of the synchronous method; if the synchronous method has a Void return type, the callback takes no additional arguments
- Callback<T> is an abstract which has some from methods, allowing a callback to be created from functions with a simpler signature (e.g. a Callback<NoData> from (err:Error)->Void)
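The from conversions described above could be sketched as follows. This is an illustration under assumptions rather than the draft's haxe.async.Callback: SketchCallback, RawCallback and divide are hypothetical names, a String stands in for haxe.Error, and divide invokes its callback immediately instead of asynchronously.

```haxe
// Raw error-first signature; String stands in for haxe.Error in this sketch.
typedef RawCallback<T> = (err:Null<String>, data:Null<T>) -> Void;

// Hypothetical stand-in for haxe.async.Callback<T>.
abstract SketchCallback<T>(RawCallback<T>) {
  inline function new(f:RawCallback<T>) this = f;

  // Accepts the full error-first signature as is.
  @:from static inline function fromFull<T>(f:RawCallback<T>):SketchCallback<T> {
    return new SketchCallback(f);
  }

  // Accepts a function that only looks at the error and ignores the data.
  @:from static inline function fromErrorOnly<T>(f:(err:Null<String>) -> Void):SketchCallback<T> {
    return new SketchCallback((err, _) -> f(err));
  }

  public inline function call(err:Null<String>, data:Null<T>):Void {
    var f = this;
    f(err, data);
  }
}

class Main {
  // Reports either a result or an error through the callback.
  public static function divide(a:Int, b:Int, cb:SketchCallback<Int>):Void {
    if (b == 0) cb.call("division by zero", null);
    else cb.call(null, Std.int(a / b));
  }

  static function main():Void {
    divide(6, 3, function(err:Null<String>, result:Null<Int>) {
      trace(err, result);
    });
    // Simpler signature: only the error is inspected.
    divide(1, 0, function(err:Null<String>) {
      trace(err);
    });
  }
}
```

Both call sites pass ordinary functions; the abstract selects the matching conversion, so API methods only ever declare a single callback parameter type.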

Flags, modes, constants

Several methods in the API accept constants or a combination of flags. Constants (where the argument is exactly one of a set of options) have been converted to an enum or enum abstract. Flags (where the argument is zero or more of a set of options) have been converted to an abstract over Int, with an overloaded | operator.
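A flag type of this kind might be sketched as below. The names and bit values are assumptions chosen for illustration and do not reproduce the draft's sys.FileOpenFlags or sys.FileCopyFlags.

```haxe
// Hypothetical flag set: an abstract over Int with an overloaded | operator.
enum abstract SketchOpenFlags(Int) {
  var None = 0;
  var Append = 1;
  var Create = 2;
  var Truncate = 4;

  public inline function toInt():Int {
    return this;
  }

  // Combines two flag sets into one.
  @:op(A | B) static inline function or(a:SketchOpenFlags, b:SketchOpenFlags):SketchOpenFlags {
    return cast (a.toInt() | b.toInt());
  }

  // True if every bit of `flag` is set in this value.
  public inline function has(flag:SketchOpenFlags):Bool {
    return (this & flag.toInt()) == flag.toInt();
  }
}

class Main {
  static function main():Void {
    var flags = SketchOpenFlags.Append | SketchOpenFlags.Create;
    trace(flags.has(SketchOpenFlags.Create));
    trace(flags.has(SketchOpenFlags.Truncate));
  }
}
```

Since the underlying type is Int, a combined value can be handed to a native call unchanged, while user code cannot accidentally pass an unrelated integer.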

Streams

At the core of a lot of Node.js APIs lie streams, which are abstractions for data consumers (Writable), data producers (Readable), or a mix of both (Duplex or Transform). Streams enable better composition of data operations with methods such as pipeline. There is also a mechanism to minimise buffering of data in memory (highWaterMark, drain) when combining streams.
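The highWaterMark / drain mechanism can be illustrated with a deliberately small model. This is a sketch under assumptions, not the draft's haxe.io.Writable: SketchWritable, onDrain and flush are hypothetical names, the limit counts chunks rather than bytes, and flush stands in for the consumer catching up.

```haxe
// Hypothetical writable sink with a buffering limit.
class SketchWritable {
  final highWaterMark:Int;
  final buffer:Array<String> = [];

  // Called after the buffer has been emptied.
  public var onDrain:() -> Void = function() {};

  public function new(highWaterMark:Int) {
    this.highWaterMark = highWaterMark;
  }

  // Queues a chunk. Returns false once the buffer has reached the limit,
  // which tells the producer to pause until onDrain is called.
  public function write(chunk:String):Bool {
    buffer.push(chunk);
    return buffer.length < highWaterMark;
  }

  // Simulates the consumer catching up: empties the buffer, then signals drain.
  public function flush():Array<String> {
    var out = buffer.copy();
    buffer.resize(0);
    onDrain();
    return out;
  }
}

class Main {
  static function main():Void {
    var sink = new SketchWritable(2);
    var chunks = ["a", "b", "c"];
    var i = 0;
    // Writes until the sink asks for a pause or the data runs out.
    function pump() {
      while (i < chunks.length) {
        if (!sink.write(chunks[i++])) return;
      }
    }
    sink.onDrain = pump;
    pump();              // queues "a" and "b", then pauses
    trace(sink.flush()); // hands out the first two chunks; pump resumes with "c"
    trace(sink.flush()); // hands out the remaining chunk
  }
}
```

The producer never holds more than highWaterMark chunks in the sink at once, which is the memory bound the paragraph above refers to.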

File descriptors

The Node.js API has a concept of file descriptors, represented by a single integer. To avoid issues with platforms without explicit file descriptor numbers, sys.io.File is an abstract, similar to the new threading API.

The various fs.f* functions from Node.js, which take fd as their first argument, become instance methods of the File abstract.

Synchronous / asynchronous versions

To avoid the someMethod + someMethodSync naming scheme present in Node.js, the two versions are more clearly split:

sys.FileSystem and sys.async.FileSystem (static methods)
sys.io.File has an async field for asynchronous instance methods

// synchronously:
var file = sys.FileSystem.open("file.txt", Read);
var data = file.readFile();

// asynchronously:
sys.async.FileSystem.open("file.txt", Read, (err, file) -> {
  file.readFile((err, data) -> {
    // ...
  });
});

Non-Unicode filepaths

In Node.js, wherever a path is expected as an argument, a Buffer can be provided, equivalent to haxe.io.Bytes. Similarly, whenever paths are to be returned, either a String or a Buffer is returned, depending on the encoding option ("utf8" or "buffer").

It would be awkward to mirror this behaviour in Haxe, so instead, the assumption is made that filepaths will be Unicode most of the time, and String is used consistently in the API. In the rare cases that non-Unicode paths are returned, they are escaped into a Unicode string. The original Bytes can be obtained with sys.FileSystem.bytesOfPath(path). There is also the inverse sys.FileSystem.pathOfBytes(bytes).

See HaxeFoundation/haxe#8134

Backward compatibility

The methods in the current sys.FileSystem and sys.io.File APIs will be kept for the time being, as inline wrappers around the new methods. The Node.js method names are arguably less intuitive (e.g. mkdir instead of createDirectory), but the new API keeps them to retain familiarity.

Target specifics

Where possible, the asynchronous methods should use native calls. For some targets this might not be possible, so in the worst-case scenario these methods will run the synchronous call in a Thread, then trigger the callback once done.
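The worst-case fallback could look roughly like this. It is a sketch, assuming a sys target with thread support and the Haxe 4 haxe.MainLoop helpers addThread and runInMainThread. The runAsync helper is a hypothetical name, and the existing sys.FileSystem.exists serves as the blocking call.

```haxe
import haxe.MainLoop;

class Main {
  // Hypothetical helper: runs a blocking job on a worker thread, then hands
  // either the result or the caught error back to the main thread.
  public static function runAsync<T>(job:() -> T, callback:(err:Null<Dynamic>, result:Null<T>) -> Void):Void {
    MainLoop.addThread(() -> {
      try {
        var result = job();
        MainLoop.runInMainThread(() -> callback(null, result));
      } catch (e:Dynamic) {
        MainLoop.runInMainThread(() -> callback(e, null));
      }
    });
  }

  static function main():Void {
    runAsync(() -> sys.FileSystem.exists("file.txt"), (err, exists) -> {
      if (err != null) trace("failed", err);
      else trace("exists", exists);
    });
    trace("main returned; the callback runs later on the main thread");
  }
}
```

Dispatching the callback through the main loop keeps user code single-threaded, which matches what a native event-loop implementation would provide.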

For many targets, wrapping libuv (the library that powers Node.js APIs) will be the most straightforward implementation option.

(TODO: research individual APIs on remaining targets)

cpp
cs
eval - libuv bindings for OCaml?
hl - libuv bindings already started
java, jvm
js (with hxnodejs) - mostly trivial mapping since it is the Node.js API
lua - luvit
neko
php
python

Testing

The majority of tests for the current sys classes should be reused. It may be worthwhile to adapt the existing tests to test both implementations (with a forced synchronous operation on sys.async) so tests are not duplicated. Additional tests should be written to test async-specific features, such as writing multiple files in parallel.

For methods that were not present in the original APIs, some tests may be based on the extensive Node.js test suite.

Impact on existing code

Existing code should not be affected:

completely new APIs will be in new packages
new APIs which are largely compatible with the old APIs still keep the methods of the old APIs for backward compatibility

Drawbacks

Alternatives

Opening possibilities

better haxelib

Unresolved questions

To be determined before implementation (in PR discussions):

error reporting style
specifics of packages, class names generally
currently all filesize and file position arguments are Int, but this only allows sizes of up to 2 GiB
- should we use haxe.Int64?
- is the support of haxe.Int64 good enough on sys targets?
- Node.js uses the Number type, which represents integers exactly only up to 2^53 - 1 (53 bits of integer precision)
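For reference, this is roughly what handling a size beyond the Int range looks like with the existing haxe.Int64 type. It is a sketch to inform the question above, not a proposed API. The gibToBytes and fitsInInt helpers are hypothetical names.

```haxe
import haxe.Int64;

class Main {
  // Hypothetical helper: n GiB expressed in bytes as a 64-bit value.
  public static function gibToBytes(n:Int):Int64 {
    return Int64.ofInt(n) * Int64.ofInt(1024 * 1024 * 1024);
  }

  // True if the value can be stored in a 32-bit signed Int without loss,
  // i.e. the high half is just the sign extension of the low half.
  public static function fitsInInt(v:Int64):Bool {
    return (v.high == 0 && v.low >= 0) || (v.high == -1 && v.low < 0);
  }

  static function main():Void {
    var size = gibToBytes(3);
    trace(Int64.toStr(size));        // the string "3221225472"
    trace(fitsInInt(size));          // 3 GiB does not fit in an Int
    trace(fitsInInt(gibToBytes(1))); // 1 GiB does
  }
}
```

The arithmetic operators are overloaded on Int64, so call sites stay readable; the open point is how well this performs and behaves across sys targets.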
