Serde

Serde (pronounced "seer-dee") is a fast, flexible, powerful, and easy to use serialization and deserialization library for PHP that supports a number of standard formats. It draws inspiration from both Rust's Serde crate and Symfony Serializer, although it is not directly based on either.

At this time, Serde supports serializing PHP objects to and from PHP arrays, JSON, YAML, and CSV files. It also supports serializing to JSON or CSV via a stream. Further support is planned, but by design can also be extended by anyone.

Install

Via Composer

$ composer require crell/serde

Usage

Serde is designed to be both quick to start using and robust in more advanced cases. In its most basic form, you can do the following:

use Crell\Serde\SerdeCommon;

$serde = new SerdeCommon();

$object = new SomeClass();
// Populate $object somehow;

$jsonString = $serde->serialize($object, format: 'json');

$deserializedObject = $serde->deserialize($jsonString, from: 'json', to: SomeClass::class);

(The named arguments are optional, but recommended.)

Serde is highly configurable, but common cases are supported by just using the SerdeCommon class as provided. For most basic cases, that is all you need.

Key features

Supported formats

Serde can serialize to:

  • PHP arrays (array)
  • JSON (json)
  • Streaming JSON (json-stream)
  • YAML (yaml)
  • CSV (csv)
  • Streaming CSV (csv-stream)

Serde can deserialize from:

  • PHP arrays (array)
  • JSON (json)
  • YAML (yaml)
  • CSV (csv)

(YAML support requires the Symfony/Yaml library.) XML support is in progress.

Robust object support

Serde automatically supports nested objects in properties of other objects, which will be handled recursively as long as there are no circular references.

Serde handles public, private, protected, and readonly properties, both reading and writing, with optional default values.

If you try to serialize or deserialize an object that implements PHP's __serialize() or __unserialize() hooks, those will be respected. (If you want to read/write from PHP's internal serialization format, just call serialize()/unserialize() directly.)

Serde also supports post-load callbacks that allow you to re-initialize derived information if necessary without storing it in the serialized format.

PHP objects can be mutated to and from a serialized format. Nested objects can be flattened or collected, classes with common interfaces can be mapped to the appropriate object, and array values can be imploded into a string for serialization and exploded back into an array when reading.

Configuration

Serde's behavior is driven almost entirely through attributes. Any class may be serialized from or deserialized to as-is with no additional configuration, but there is a great deal of configuration that may be opted-in to.

Attribute handling is provided by Crell/AttributeUtils. It is worth looking into as well.

The main attribute is the Crell\Serde\Attributes\Field attribute, which may be placed on any object property. (Static properties are ignored.) All of its arguments are optional, as is the Field itself. (That is, adding #[Field] with no arguments is the same as not specifying it at all.) The meaning of the available arguments is listed below.

Although not required, it is strongly recommended that you always use named arguments with attributes. The precise order of arguments is not guaranteed.

In the examples below, the Field is generally referenced directly. However, you may also import the namespace and then use namespaced versions of the attributes, like so:

use Crell\Serde\Attributes as Serde;

#[Serde\ClassSettings(includeFieldsByDefault: false)]
class Person
{
    #[Serde\Field(serializedName: 'callme')]
    protected string $name = 'Larry';
}

Which you do is mostly a matter of preference, although if you are mixing Serde attributes with attributes from other libraries then the namespaced approach is advisable.

There is also a ClassSettings attribute that may be placed on classes to be serialized. At this time it has four arguments:

  • includeFieldsByDefault, which defaults to true. If set to false, a property with no #[Field] attribute will be ignored. It is equivalent to setting exclude: true on all properties implicitly.
  • requireValues, which defaults to false. If set to true, then when deserializing, any field that is not provided in the incoming data will result in an exception. This may also be turned on or off on a per-field level. (See requireValue below.) The class-level setting applies to any field that does not specify its behavior.
  • renameWith. If set, the specified renaming strategy will be used for all properties of the class, unless a property specifies its own. (See renameWith below.) The class-level setting applies to any field that does not specify its behavior.
  • scopes, which sets the scope of a given class definition attribute. See the section on Scopes below.

exclude (bool, default false)

If set to true, Serde will ignore the property entirely on both serializing and deserializing.

serializedName (string, default null)

If provided, this string will be used as the name of a property when serialized out to a format and when reading it back in. For example:

use Crell\Serde\Attributes\Field;

class Person
{
    #[Field(serializedName: 'callme')]
    protected string $name = 'Larry';
}

Round trips to/from:

{
    "callme": "Larry"
}

renameWith (RenamingStrategy, default null)

The renameWith key specifies a way to mangle the name of the property to produce a serializedName. The most common examples here would be case folding, say if serializing to a format that uses a different convention than PHP does.

The value of renameWith can be any object that implements the RenamingStrategy interface. The most common versions are already provided via the Cases enum and Prefix class, but you are free to provide your own.

The Cases enum implements RenamingStrategy and provides a series of instances (cases) for common renaming. For example:

use Crell\Serde\Attributes\Field;
use Crell\Serde\Renaming\Cases;

class Person
{
    #[Field(renameWith: Cases::snake_case)]
    public string $firstName = 'Larry';

    #[Field(renameWith: Cases::CamelCase)]
    public string $lastName = 'Garfield';
}

Serializes to/from:

{
    "first_name": "Larry",
    "LastName": "Garfield"
}

Available cases are:

  • Cases::UPPERCASE
  • Cases::lowercase
  • Cases::snake_case
  • Cases::kebab_case (renders with dashes, not underscores)
  • Cases::CamelCase
  • Cases::lowerCamelCase

The Prefix class prepends a prefix to the property name when serialized, but otherwise leaves the property name intact.

use Crell\Serde\Attributes\Field;
use Crell\Serde\Renaming\Prefix;

class MailConfig
{
    #[Field(renameWith: new Prefix('mail_'))]
    protected string $host = 'smtp.example.com';

    #[Field(renameWith: new Prefix('mail_'))]
    protected int $port = 25;

    #[Field(renameWith: new Prefix('mail_'))]
    protected string $user = 'me';

    #[Field(renameWith: new Prefix('mail_'))]
    protected string $password = 'sssh';
}

Serializes to/from:

{
    "mail_host": "smtp.example.com",
    "mail_port": 25,
    "mail_user": "me",
    "mail_password": "sssh"
}

If both serializedName and renameWith are specified, serializedName will be used and renameWith ignored.
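
As a rough illustration of what a renaming strategy computes, the following plain-PHP sketch reproduces two of the transformations shown above. The helper functions toSnakeCase() and withPrefix() are hypothetical and are not part of Serde; the library's own implementation may differ.

```php
<?php
// Hypothetical helpers, not part of Serde: they only illustrate the kind of
// name transformation a renaming strategy performs.
function toSnakeCase(string $name): string
{
    // Insert an underscore before every uppercase letter except a leading one,
    // then lowercase the whole string.
    return strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $name));
}

function withPrefix(string $prefix, string $name): string
{
    return $prefix . $name;
}

echo toSnakeCase('firstName'), "\n";     // first_name
echo withPrefix('mail_', 'host'), "\n";  // mail_host
```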

alias (array, default [])

When deserializing (only), if the expected serialized name is not found in the incoming data, these additional property names will be examined to see if the value can be found. If so, the value will be read from that key in the incoming data. If not, it will behave the same as if the value was simply not found in the first place.

use Crell\Serde\Attributes\Field;

class Person
{
    #[Field(alias: ['layout', 'design'])]
    protected string $format = '';
}

All three of the following JSON strings would be read into an identical object:

{
    "format": "3-column-layout"
}
{
    "layout": "3-column-layout"
}
{
    "design": "3-column-layout"
}

This is mainly useful when an API key has changed, and legacy incoming data may still have an old key name.

useDefault (bool, default true)

This key only applies on deserialization. If a property of a class is not found in the incoming data, and this property is true, then a default value will be assigned instead. If false, the value will be skipped entirely. Whether the deserialized object is now in an invalid state depends on the object.

The default value to use is derived from a number of different locations. The priority order of defaults is:

  1. The value provided by the default argument to the Field attribute.
  2. The default value provided by the code, as reported by Reflection.
  3. The default value of an identically named constructor argument, if any.

So for example, the following class:

use Crell\Serde\Attributes\Field;

class Person
{
    #[Field(default: 'Hidden')]
    public string $location;

    #[Field(useDefault: false)]
    public int $age;

    public function __construct(
        public string $name = 'Anonymous',
    ) {}
}

if deserialized from an empty source (such as {} in JSON), will result in an object with location set to Hidden, name set to Anonymous, and age still uninitialized.
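
The second and third sources in the priority list differ because PHP stores the default of a promoted constructor property on the parameter, not on the property. The sketch below uses only PHP's built-in Reflection API to show where each default can be found; it does not show how Serde itself reads them, and the 'Unknown' default is an assumption made for illustration.

```php
<?php
class Person
{
    public string $location = 'Unknown';
    public int $age;

    public function __construct(
        public string $name = 'Anonymous',
    ) {}
}

// A default declared on the property itself is visible on the property.
$location = new ReflectionProperty(Person::class, 'location');
var_dump($location->hasDefaultValue());  // bool(true)
var_dump($location->getDefaultValue());  // string(7) "Unknown"

// A typed property with no default has nothing to report.
$age = new ReflectionProperty(Person::class, 'age');
var_dump($age->hasDefaultValue());       // bool(false)

// A promoted property's default lives on the constructor parameter instead.
$name = new ReflectionProperty(Person::class, 'name');
var_dump($name->hasDefaultValue());      // bool(false)
$param = (new ReflectionMethod(Person::class, '__construct'))->getParameters()[0];
var_dump($param->getDefaultValue());     // string(9) "Anonymous"
```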

default (mixed, default null)

This key only applies on deserialization. If specified, then if a value is missing in the incoming data being deserialized this value will be used instead, regardless of what the default in the source code itself is.

strict (bool, default true)

This key only applies on deserialization. If set to true, a type mismatch in the incoming data will be rejected and an exception thrown. If false, a deformatter will attempt to cast an incoming value according to PHP's normal casting rules. That means, for example, "1" is a valid value for an integer property if strict is false, but will throw an exception if set to true.

For sequence fields, strict set to true will reject a non-sequence value. (It must pass an array_is_list() check.) If strict is false, any array-ish value will be accepted but passed through array_values() to discard any keys and reindex it.
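
The two PHP functions named above behave as follows. This is plain PHP (8.1 or later for array_is_list()), shown only to make the distinction concrete:

```php
<?php
$list  = ['a', 'b', 'c'];
$keyed = [3 => 'a', 7 => 'b', 'x' => 'c'];

var_dump(array_is_list($list));   // bool(true): keys are 0, 1, 2 in order
var_dump(array_is_list($keyed));  // bool(false): would be rejected when strict

// In non-strict mode the text describes discarding the keys and reindexing.
var_dump(array_values($keyed) === ['a', 'b', 'c']);  // bool(true)
```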

The exact handling of this setting may vary slightly depending on the incoming format, as some formats handle their own types differently. (For instance, everything is a string in XML.)

requireValue (bool, default false)

This key only applies on deserialization. If set to true, if the incoming data does not include a value for this field and there is no default specified, a MissingRequiredValueWhenDeserializing exception will be thrown. If not set, and there is no default value, then the property will be left uninitialized.

If a field has a default value, then the default value will always be used for missing data and this setting has no effect.

flatten (bool, default false)

The flatten keyword can only be applied on an array or object property. A property that is "flattened" will have all of its properties injected into the parent directly on serialization, and will have values from the parent "collected" into it on deserialization.

Multiple objects and arrays may be flattened (serialized), but on deserialization only the lexically last array property marked flatten will collect remaining keys. Any number of objects may "collect" their properties, however.

As an example, consider pagination. It may be very helpful to represent pagination information in PHP as an object property of a result set, but in the serialized JSON or XML you may want the extra object removed.

Given this set of classes:

use Crell\Serde\Attributes as Serde;

class Results
{
    public function __construct(
        #[Serde\Field(flatten: true)]
        public Pagination $pagination,
        #[Serde\SequenceField(arrayType: Product::class)]
        public array $products,
    ) {}
}

class Pagination
{
    public function __construct(
        public int $total,
        public int $offset,
        public int $limit,
    ) {}
}

class Product
{
    public function __construct(
        public string $name,
        public float $price,
    ) {}
}

When serialized, the $pagination object will get "flattened," meaning its three properties will be included directly in the properties of Results. Therefore, a JSON-serialized copy of this object may look like:

{
    "total": 100,
    "offset": 20,
    "limit": 10,
    "products": [
        {
            "name": "Widget",
            "price": 9.99
        },
        {
            "name": "Gadget",
            "price": 4.99
        }
    ]
}

The extra "layer" of the Pagination object has been removed. When deserializing, those extra properties will be "collected" back into a Pagination object.
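
Conceptually, flattening and collecting are inverse array operations. The sketch below mimics them with plain PHP arrays; it is an illustration of the idea only, under the assumption that the pagination keys are known in advance, and is not how Serde is implemented.

```php
<?php
// Nested shape, mirroring how the PHP objects are structured.
$nested = [
    'pagination' => ['total' => 100, 'offset' => 20, 'limit' => 10],
    'products'   => [['name' => 'Widget', 'price' => 9.99]],
];

// "Flatten": lift the pagination keys into the parent.
$flat = $nested['pagination'] + ['products' => $nested['products']];

// "Collect": pull the known pagination keys back out of the flat data.
$paginationKeys = array_flip(['total', 'offset', 'limit']);
$collected = [
    'pagination' => array_intersect_key($flat, $paginationKeys),
    'products'   => $flat['products'],
];

var_dump(array_keys($flat));       // total, offset, limit, products
var_dump($collected === $nested);  // bool(true)
```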

Now consider this more complex example:

use Crell\Serde\Attributes as Serde;

class DetailedResults
{
    public function __construct(
        #[Serde\Field(flatten: true)]
        public NestedPagination $pagination,
        #[Serde\Field(flatten: true)]
        public ProductType $type,
        #[Serde\SequenceField(arrayType: Product::class)]
        public array $products,
        #[Serde\Field(flatten: true)]
        public array $other = [],
    ) {}
}

class NestedPagination
{
    public function __construct(
        public int $total,
        public int $limit,
        #[Serde\Field(flatten: true)]
        public PaginationState $state,
    ) {}
}

class PaginationState
{
    public function __construct(
        public int $offset,
    ) {
    }
}

class ProductType
{
    public function __construct(
        public string $name = '',
        public string $category = '',
    ) {}
}

In this example, both NestedPagination and PaginationState will be flattened when serializing. NestedPagination itself also has a field that should be flattened. Both will flatten and collect cleanly, as long as none of them share a property name.

Additionally, there is an extra array property, $other. $other may contain whatever associative array is desired, and its values will also get flattened into the output.

When collecting, only the lexically last flattened array will get any data, and will get all properties not already accounted for by some other property. For example, an instance of DetailedResults may serialize to JSON as:

{
    "total": 100,
    "offset": 20,
    "limit": 10,
    "products": [
        {
            "name": "Widget",
            "price": 9.99
        },
        {
            "name": "Gadget",
            "price": 4.99
        }
    ],
    "foo": "beep",
    "bar": "boop"
}

In this case, the $other property has two keys, foo and bar, with values beep and boop, respectively. The same JSON will deserialize back to the same object as before.

Value objects

Flattening can also be used in conjunction with renaming to silently translate value objects. Consider:

use Crell\Serde\Attributes\Field;
use Crell\Serde\Attributes\PostLoad;

class Person
{
    public function __construct(
        public string $name,
        #[Field(flatten: true)]
        public Age $age,
        #[Field(flatten: true)]
        public Email $email,
    ) {}
}

readonly class Email
{
    public function __construct(
        #[Field(serializedName: 'email')] public string $value,
    ) {}
}

readonly class Age
{
    public function __construct(
        #[Field(serializedName: 'age')] public int $value
    ) {
        $this->validate();
    }

    #[PostLoad]
    private function validate(): void
    {
        if ($this->value < 0) {
            throw new \InvalidArgumentException('Age cannot be negative.');
        }
    }
}

In this example, Email and Age are value objects, the latter with extra validation. Both are marked flatten: true, so their properties will be moved up a level to Person when serializing. Because they both use the same property name (value), each has a custom serialized name specified. The above object will serialize to (and deserialize from) something like this:

{
    "name": "Larry",
    "age": 21,
    "email": "larry@example.com"
}

Note that because deserialization bypasses the constructor, the extra validation in Age must be placed in a separate method that is called from the constructor and flagged to run automatically after deserialization.

It is also possible to specify a prefix for a flattened value, which will also be applied recursively. For example, assuming the same Age class above:

readonly class JobDescription
{
    public function __construct(
        #[Field(flatten: true, flattenPrefix: 'min_')]
        public Age $minAge,
        #[Field(flatten: true, flattenPrefix: 'max_')]
        public Age $maxAge,
    ) {}
}

class JobEntry
{
    public function __construct(
        #[Field(flatten: true, flattenPrefix: 'desc_')]
        public JobDescription $description,
    ) {}
}

In this case, serializing JobEntry will first flatten the $description property, with desc_ as a prefix. Then, JobDescription will flatten both of its age fields, giving each a separate prefix. That will result in a serialized output something like this:

{
    "desc_min_age": 18,
    "desc_max_age": 65
}

And it will deserialize back to the same original 3-layer-object structure.

flattenPrefix (string, default '')

When an object or array property is flattened, by default its properties will be flattened using their existing name (or serializedName, if specified). That may cause issues if the same class is included in a parent class twice, or if there is some other name collision. Instead, flattened fields may be given a flattenPrefix value. That string will be prepended to the name of the property when serializing.

If set on a non-flattened field, this value is meaningless and has no effect.

Sequences and Dictionaries

In most languages, and many serialization formats, there is a difference between a sequential list of values (called variously an array, sequence, or list) and a map of arbitrary size of arbitrary values to other arbitrary values (called a dictionary or map). PHP does not make a distinction, and shoves both data types into a single associative array variable type.

Sometimes that works out, but other times the distinction between the two greatly matters. To support those cases, Serde allows you to flag an array property as either a #[SequenceField] or #[DictionaryField] (and it is recommended that you always do so). Doing so ensures that the correct serialization pathway is used for the property, and also opens up a number of additional features.

arrayType

On both a #[SequenceField] and #[DictionaryField], the arrayType argument lets you specify the type that all values in that structure are. For example, a sequence of integers can easily be serialized to and deserialized from most formats without any additional help. However, an ordered list of Product objects could be serialized, but there's no way to tell then how to deserialize that data back to Product objects rather than just a nested associative array (which would also be legal). The arrayType argument solves that issue.

If arrayType is specified, then all values of that array are assumed to be of that type. It may either be a class-string to specify all values are a class, or a value of the ValueType enum to indicate one of the four supported scalars.

On deserialization, then, Serde will either validate that all incoming values are of the right scalar type, or look for nested object-like structures (depending on the specific format), and convert those into the specified object type.

For example:

use Crell\Serde\Attributes\SequenceField;

class Order
{
    public string $orderId;

    public int $userId;

    #[SequenceField(arrayType: Product::class)]
    public array $products;
}

In this case, the attribute tells Serde that $products is an indexed, sequential list of Product objects. When serializing, that may be represented as an array of dictionaries (in JSON or YAML) or perhaps with some additional metadata in other formats.

When deserializing, the otherwise object-ignorant data will be upcast back to Product objects.

arrayType works the exact same way on a DictionaryField.

keyType

On DictionaryField only, it's possible to restrict the array to only allowing integer or string keys. It has two legal values, KeyType::Int and KeyType::String (an enum). If set to KeyType::Int, then deserialization will reject any arrays that have string keys, but will accept numeric strings. If set to KeyType::String, then deserialization will reject any arrays that have integer keys, including numeric strings.

(PHP auto-casts integer string array keys to actual integers, so there is no way to allow them in string-based dictionaries.)

If no value is set, then either key type will be accepted.
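
The key casting mentioned above is standard PHP behavior and can be observed without Serde:

```php
<?php
// PHP converts canonical integer strings to int keys; other strings are kept.
$data = ['5' => 'five', '05' => 'zero-five', 'a' => 'letter'];

var_dump(array_keys($data) === [5, '05', 'a']);  // bool(true)
```

The key '5' has silently become the integer 5, which is why a string-keyed dictionary cannot preserve it.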

implodeOn

The implodeOn argument to SequenceField, if present, indicates that the value should be joined into a string serialization, using the provided value as glue. For example:

use Crell\Serde\Attributes\SequenceField;

class Order
{
    #[SequenceField(implodeOn: ',')]
    protected array $productIds = [5, 6, 7];
}

Will serialize in JSON to:

{
    "productIds": "5,6,7"
}

On deserialization, that string will automatically get exploded back into an array when placed into the object.

By default, on deserialization the individual values will be trim()ed to remove excess whitespace. That can be disabled by setting the trim attribute argument to false.
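
The round trip can be approximated with PHP's own implode() and explode(). This is a sketch of the idea, not of Serde's internals; in particular it leaves the exploded values as strings.

```php
<?php
$productIds = [5, 6, 7];

// Serializing: join the values using the glue string.
$serialized = implode(',', $productIds);
var_dump($serialized);  // string(5) "5,6,7"

// Deserializing: split on the glue, trimming so that "5, 6, 7" also works.
$parts = array_map('trim', explode(',', '5, 6, 7'));
var_dump($parts === ['5', '6', '7']);  // bool(true)
```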

joinOn

DictionaryFields also support imploding/exploding on serialization, but require two keys. implodeOn specifies the string to use between distinct values. joinOn specifies the string to use between the key and value.

For example:

use Crell\Serde\Attributes\DictionaryField;

class Settings
{
    #[DictionaryField(implodeOn: ',', joinOn: '=')]
    protected array $dimensions = [
        'height' => 40,
        'width' => 20,
    ];
}

Will serialize/deserialize to this JSON:

{
    "dimensions": "height=40,width=20"
}

As with SequenceField, values will automatically be trim()ed unless trim: false is specified in the attribute's argument list.
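
A plain-PHP sketch of the same two-level join and split, for illustration only (Serde's own implementation may differ, and this version leaves the values as strings):

```php
<?php
$dimensions = ['height' => 40, 'width' => 20];

// Serialize: join each key and value with "=", then join the pairs with ",".
$pairs = [];
foreach ($dimensions as $key => $value) {
    $pairs[] = $key . '=' . $value;
}
$serialized = implode(',', $pairs);
var_dump($serialized);  // string(18) "height=40,width=20"

// Deserialize: split into pairs, then split each pair once on "=".
$restored = [];
foreach (explode(',', $serialized) as $pair) {
    [$key, $value] = explode('=', $pair, 2);
    $restored[trim($key)] = trim($value);
}
var_dump($restored === ['height' => '40', 'width' => '20']);  // bool(true)
```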

Date and Time fields

DateTime and DateTimeImmutable fields can also be serialized, and you can control how they are serialized using the DateField attribute. It has two arguments, which may be used individually or together. Specifying neither is the same as not specifying the DateField attribute at all.

use Crell\Serde\Attributes\DateField;

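class Settings
{
    // PHP does not allow `new` in a property default, so the default
    // is declared on a promoted constructor parameter instead.
    public function __construct(
        #[DateField(format: 'Y-m-d')]
        protected DateTimeImmutable $date = new DateTimeImmutable('2022-07-04 14:22'),
    ) {}
}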

Will serialize to this JSON:

{
    "date": "2022-07-04"
}

timezone

The timezone argument may be any timezone string legal in PHP, such as America/Chicago or UTC. If specified, the value will be cast to this timezone first before it is serialized. If not specified, the value will be left in whatever timezone it is in before being serialized. Whether that makes a difference to the output depends on the format.

On deserializing, the timezone has no effect. If the incoming value has a timezone specified, the resulting DateTime[Immutable] object will use that timezone. If not, the system default timezone will be used.

format

This argument lets you specify the format that will be used when serializing. It may be any string accepted by PHP's date_format syntax, including one of the various constants defined on DateTimeInterface. If not specified, the default format is RFC3339_EXTENDED, or Y-m-d\TH:i:s.vP. While not the most human-friendly, it is the default format used by JavaScript/JSON, so it makes for reasonable compatibility.

On deserializing, the format has no effect. Serde will pass the string value to a DateTime or DateTimeImmutable constructor, so any format recognized by PHP will be parsed according to PHP's standard date-parsing rules.
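
To see what the format and timezone arguments amount to, the following uses only PHP's built-in date classes. It shows how the same instant renders under the two formats mentioned above, and how converting the timezone first changes the output.

```php
<?php
$date = new DateTimeImmutable('2022-07-04 14:22:00', new DateTimeZone('UTC'));

echo $date->format('Y-m-d'), "\n";
// 2022-07-04

echo $date->format(DateTimeInterface::RFC3339_EXTENDED), "\n";
// 2022-07-04T14:22:00.000+00:00

// Converting to another timezone before formatting shifts the wall-clock time.
$chicago = $date->setTimezone(new DateTimeZone('America/Chicago'));
echo $chicago->format(DateTimeInterface::RFC3339_EXTENDED), "\n";
// 2022-07-04T09:22:00.000-05:00
```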

Generators, Iterables, and Traversables

PHP has a number of "lazy list" options. Generally, they are all objects that implement the \Traversable interface. However, there are several syntax options available with their own subtleties. Serde supports them in different ways.

If a property is defined to be an iterable, then regardless of whether it's a Traversable object or a Generator the iterable will be "run out" and converted to an array by the serialization process. Note that if the iterable is an infinite iterator, the process will continue forever and your program will freeze. Don't do that.

Also, when using an iterable property the property MUST be marked with either #[SequenceField] or #[DictionaryField] as appropriate. Serde cannot deduce which it is on its own the way it (usually) can with arrays.

On deserializing, the incoming values will always be assigned to an array. As an array is an iterable, that is still type safe. While in theory it would be possible to build a dynamic generator on the fly to materialize the values lazily, that would not actually save any memory.

Note this does mean that serializing and deserializing an object will not be fully symmetric. The initial object may have properties that are generators, but the deserialized object will have arrays instead.
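
The effect of "running out" a generator can be seen with PHP's iterator_to_array(). The productNames() function here is a hypothetical stand-in for any lazy source:

```php
<?php
// Hypothetical lazy source; any generator or Traversable behaves the same way.
function productNames(): iterable
{
    yield 'Widget';
    yield 'Gadget';
}

// Consuming the generator materializes every value into a plain array.
$asArray = iterator_to_array(productNames(), preserve_keys: false);
var_dump($asArray === ['Widget', 'Gadget']);  // bool(true)
```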

If a property is typed to be some other Traversable object (usually because it implements either \Iterator or \IteratorAggregate), then it will be serialized and deserialized as a normal object. Its iterable-ness is ignored. In this case, the #[SequenceField] and #[DictionaryField] attributes are forbidden.

CSV Formatter

Serde includes support for serializing/deserializing CSV files. However, because CSV is a more limited type of format only certain object structures are supported.

Specifically, the object in question must have a single property that is marked #[SequenceField], and it must have an explicit arrayType that is a class. That class, in turn, may contain only int, float, or string properties. Anything else will throw an error.

For example:

namespace Crell\Serde\Records;

use Crell\Serde\Attributes\SequenceField;

class CsvTable
{
    public function __construct(
        #[SequenceField(arrayType: CsvRow::class)]
        public array $people,
    ) {}
}

class CsvRow
{
    public function __construct(
        public string $name,
        public int $age,
        public float $balance,
    ) {}
}

This combination will result in a three-column CSV file, and also deserialize from a three-column CSV file.

The CSV formatter uses PHP's native CSV parsing and writing tools. If you want to control the delimiters used, pass those as constructor arguments to a CsvFormatter instance and inject that into the Serde class instead of the default.

Note that the lone property may be a generator. That allows a CSV to be generated on the fly off of arbitrary data. When deserialized, it will still deserialize to an array.

Streams

Serde includes two stream-based formatters (but not deformatters, yet), one for JSON and one for CSV. They work nearly the same way as any other formatter, but when calling $serde->serialize() you may (and should) pass an extra init argument. $init should be an instance of Serde\Formatter\FormatterStream, which wraps a writeable PHP stream handle.

The value returned will then be that same stream handle, after the object to be serialized has been written to it.

For example:

// The JsonStreamFormatter and CsvStreamFormatter are not included by default.
$serde = new SerdeCommon(formatters: [new JsonStreamFormatter()]);

// You may use any PHP supported stream here, including files, network sockets,
// stdout, an in-memory temp stream, etc.
$init = FormatterStream::new(fopen('/tmp/output.json', 'wb'));

$result = $serde->serialize($data, format: 'json-stream', init: $init);

// $result is a FormatterStream object that wraps the same handle as before.
// What you can now do with the stream depends on what kind of stream it is.

In this example, the $data object (whatever it is) gets serialized to JSON piecemeal and streamed out to the specified file handle.

The CsvStreamFormatter works in the exact same way, but outputs CSV data and has the same restrictions as the CsvFormatter in terms of the objects it accepts.

In many cases that won't actually offer much benefit, as the whole object must be in memory anyway. However, it may be combined with the support for lazy iterators to have a property that produces objects lazily, say from a database query or read from some other source.

Consider this example:

use Crell\Serde\Attributes\SequenceField;

class ProductList
{
    public function __construct(
        #[SequenceField(arrayType: Product::class)]
        private iterable $products,
    ) {}
}

class Product
{
    public function __construct(
        public readonly string $name,
        public readonly string $color,
        public readonly float $price,
    ) {}
}

$databaseConn = ...;

$callback = function() use ($databaseConn) {
    $result = $databaseConn->query("SELECT name, color, price FROM products ORDER BY name");

    // Assuming $record is an associative array.
    foreach ($result as $record) {
        yield new Product(...$record);
    }
};

// This is a lazy list of products, which will be pulled from the database.
$products = new ProductList($callback());

// Use the CSV formatter this time, but JsonStream works just as well.
$serde = new SerdeCommon(formatters: [new CsvStreamFormatter()]);

// Write to stdout, aka, back to the browser.
$init = FormatterStream::new(fopen('php://output', 'wb'));

$result = $serde->serialize($products, format: 'csv-stream', init: $init);

This setup will lazily pull records out of the database and instantiate an object from them, then lazily stream that data out to stdout. No matter how many product records are in the database, the memory usage remains roughly constant. (Note the database driver may do its own buffering of the entire result set, which could cause memory issues. That's a separate matter, however.)

While likely overkill for CSV, it can work very well for more involved objects being serialized to JSON.
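
The constant-memory behavior comes from pairing a generator with a stream. The following sketch shows that pairing with PHP's native fputcsv() and an in-memory stream, without Serde; the rows() generator is a hypothetical stand-in for a database cursor.

```php
<?php
// Stand-in for a lazy data source such as a database cursor.
function rows(): Generator
{
    yield ['Widget', 'red', 9.99];
    yield ['Gadget', 'blue', 4.99];
}

$stream = fopen('php://memory', 'w+b');

foreach (rows() as $row) {
    // Each row is written as soon as it is produced; nothing is accumulated.
    // The escape character is passed explicitly as an empty string.
    fputcsv($stream, $row, ',', '"', '');
}

rewind($stream);
echo stream_get_contents($stream);
// Widget,red,9.99
// Gadget,blue,4.99
```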

TypeMaps

Type maps are a powerful feature of Serde that allows precise control over how objects with inheritance are serialized and deserialized. Type Maps translate between the class of an object and some unique identifier that is included in the serialized data.

In the abstract, a Type Map is any object that implements the TypeMap interface. TypeMaps may be provided as an attribute on a property, or on a class or interface, or provided to Serde when it is set up to allow for arbitrary maps.

Consider the following example, which will be used for the remaining explanations of Type Maps:

use Crell\Serde\Attributes\SequenceField;

interface Product {}

interface Book extends Product {}

class PaperBook implements Book
{
    protected string $title;
    protected int $pages;
}

class DigitalBook implements Book
{
    protected string $title;
    protected int $bytes;
}

class Sale
{
    protected Book $book;

    protected float $discountRate;
}

class Order
{
    protected string $orderId;

    #[SequenceField(arrayType: Book::class)]
    protected array $products;
}

Both Sale and Order reference Book, but that value could be a PaperBook, DigitalBook, or any other class that implements Book. Type Maps provide a way for Serde to tell which concrete type it is.

Class name maps

The simplest case of a type map is to include a #[ClassNameTypeMap] attribute on an object property. For example,

use Crell\Serde\Attributes\ClassNameTypeMap;

class Sale
{
    #[ClassNameTypeMap(key: 'type')]
    protected Book $book;

    protected float $discountRate;
}

Now when a Sale is serialized, an extra property will be included named type that contains the class name. So a sale on a digital book would serialize like so:

{
    "book": {
        "type": "Your\\App\\DigitalBook",
        "title": "Thinking Functionally in PHP",
        "bytes": 45000
    },
    "discountRate": 0.2
}

On deserialization, the "type" property will be read and used to determine that the remaining values should be used to construct a DigitalBook instance, specifically.

Class name maps have the advantage that they are very simple, and will work with any class that implements that interface, even those you haven't thought of yet. The downside is that they put a PHP implementation detail (the class name) into the output, which may not be desirable.

Static Maps

Static maps allow you to provide a fixed map from classes to meaningful keys.

use Crell\Serde\Attributes\StaticTypeMap;

class Sale
{
    #[StaticTypeMap(key: 'type', map: [
        'paper' => PaperBook::class,
        'ebook' => DigitalBook::class,
    ])]
    protected Book $book;

    protected float $discountRate;
}

Now, if a Sale object is serialized it will look like this:

{
    "book": {
        "type": "ebook",
        "title": "Thinking Functionally in PHP",
        "bytes": 45000
    },
    "discountRate": 0.2
}

Static maps have the advantage of simplicity and not polluting the output with PHP-specific implementation details. The downside is that they are static: They can only handle the classes you know about at code time, and will throw an exception if they encounter any other class.
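To make the two-way lookup concrete, here is a minimal sketch in plain PHP. It does not use Serde: SketchStaticMap and AudioBook are hypothetical names invented for illustration, and the real StaticTypeMap may work differently internally. The three method names are borrowed from the TypeMap example elsewhere in this document.

```php
<?php

declare(strict_types=1);

interface Book {}
final class PaperBook implements Book {}
final class DigitalBook implements Book {}
final class AudioBook implements Book {} // deliberately left out of the map

// Hypothetical stand-in, not Serde's StaticTypeMap.
final class SketchStaticMap
{
    /** @param array<string, class-string> $map */
    public function __construct(
        private readonly string $key,
        private readonly array $map,
    ) {}

    public function keyField(): string
    {
        return $this->key;
    }

    // Key found in the serialized data -> class to instantiate.
    public function findClass(string $id): ?string
    {
        return $this->map[$id] ?? null;
    }

    // Class of the object being serialized -> key to write out.
    public function findIdentifier(string $class): ?string
    {
        $id = array_search($class, $this->map, true);
        return $id === false ? null : (string) $id;
    }
}

$map = new SketchStaticMap('type', [
    'paper' => PaperBook::class,
    'ebook' => DigitalBook::class,
]);

$ebookKey = $map->findIdentifier(DigitalBook::class); // 'ebook'
$paperClass = $map->findClass('paper');               // 'PaperBook'
$unknown = $map->findIdentifier(AudioBook::class);    // null: not in the map
```

In this sketch an unmapped class simply yields null. That is the situation in which, per the paragraph above, Serde reports an error rather than guessing.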

Type maps on collections

Type Maps may also be applied to array properties, either sequence or dictionary. In that case, they will apply to all values in that collection. For example:

use Crell\Serde\Attributes as Serde;

class Order
{
    protected string $orderId;

    #[Serde\SequenceField(arrayType: Book::class)]
    #[Serde\StaticTypeMap(key: 'type', map: [
        'paper' => PaperBook::class,
        'ebook' => DigitalBook::class,
    ])]
    protected array $products;
}

$products is an array of objects that implement Book, but could be either PaperBook or DigitalBook. A serialized copy of this object may look like:

{
    "orderId": "abc123",
    "products": [
        {
            "type": "ebook",
            "title": "Thinking Functionally in PHP",
            "bytes": 45000
        },
        {
            "type": "paper",
            "title": "Category Theory for Programmers",
            "pages": 335
        }
    ]
}

On deserialization, the type property will again be used to determine the class that the rest of the properties should be hydrated into.

Type mapped classes

In addition to putting a type map on a property, you may also place it on the class or interface that the property references.

use Crell\Serde\Attributes\StaticTypeMap;

#[StaticTypeMap(key: 'type', map: [
    'paper' => PaperBook::class,
    'ebook' => DigitalBook::class,
])]
interface Book {}

Now, that Type Map will apply to both Sale::$book and to Order::$products with no further work on our part.

Type Maps also inherit. That means we can put a type map on Product instead if we wanted:

use Crell\Serde\Attributes\StaticTypeMap;

#[StaticTypeMap(key: 'type', map: [
    'paper' => PaperBook::class,
    'ebook' => DigitalBook::class,
    'toy' => Gadget::class,
])]
interface Product {}

And both Sale and Order will still serialize with the appropriate key.

Dynamic type maps

Type Maps may also be provided directly to the Serde object when it is created. Any object that implements TypeMap may be used. This is most useful when the list of possible classes is dynamic based on user configuration, database values, what plugins are installed in your application, etc.

use Crell\Serde\TypeMap;

class ProductTypeMap implements TypeMap
{
    public function __construct(protected readonly Connection $db) {}

    public function keyField(): string
    {
        return 'type';
    }

    public function findClass(string $id): ?string
    {
        return $this->db->someLookup($id);
    }

    public function findIdentifier(string $class): ?string
    {
        return $this->db->someMappingLogic($class);
    }
}

$typeMap = new ProductTypeMap($dbConnection);

$serde = new SerdeCommon(typeMaps: [
    Your\App\Product::class => $typeMap,
]);

$json = $serde->serialize($aBook, format: 'json');

In practice, you would likely set that up via your Dependency Injection system.

Note that ClassNameTypeMap and StaticTypeMap may be injected as well, as can any other class that implements TypeMap.

Custom type maps

You may also write your own Type Maps as attributes. The only requirements are:

  1. The class implements the TypeMap interface.
  2. The class is marked as an #[\Attribute].
  3. The class is legal on both classes and properties. That is, #[\Attribute(\Attribute::TARGET_CLASS | \Attribute::TARGET_PROPERTY)]
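As a sketch of those three requirements, the following defines a hypothetical SuffixTypeMap attribute that derives the key from the class name by convention. To keep it runnable without the library, it declares a local stand-in TypeMap interface with the three methods shown in the ProductTypeMap example; in a real project you would implement Crell\Serde\TypeMap instead. It also assumes the mapped classes live in the global namespace.

```php
<?php

declare(strict_types=1);

// Stand-in for Crell\Serde\TypeMap so this sketch runs on its own.
interface TypeMap
{
    public function keyField(): string;
    public function findClass(string $id): ?string;
    public function findIdentifier(string $class): ?string;
}

// Hypothetical custom map: 'paper' <-> PaperBook, 'digital' <-> DigitalBook.
#[Attribute(Attribute::TARGET_CLASS | Attribute::TARGET_PROPERTY)]
final class SuffixTypeMap implements TypeMap
{
    public function __construct(
        private readonly string $key,
        private readonly string $suffix,
    ) {
        if ($suffix === '') {
            throw new InvalidArgumentException('A non-empty suffix is required.');
        }
    }

    public function keyField(): string
    {
        return $this->key;
    }

    public function findClass(string $id): ?string
    {
        $class = ucfirst($id) . $this->suffix;
        return class_exists($class) ? $class : null;
    }

    public function findIdentifier(string $class): ?string
    {
        if ($class === $this->suffix || !str_ends_with($class, $this->suffix)) {
            return null;
        }
        return lcfirst(substr($class, 0, -strlen($this->suffix)));
    }
}

#[SuffixTypeMap(key: 'type', suffix: 'Book')]
interface Book {}

final class PaperBook implements Book {}
final class DigitalBook implements Book {}

// Reading the attribute back through reflection confirms it is legal on an interface.
$map = (new ReflectionClass(Book::class))
    ->getAttributes(SuffixTypeMap::class)[0]
    ->newInstance();

$class = $map->findClass('digital');          // 'DigitalBook'
$id = $map->findIdentifier(PaperBook::class); // 'paper'
```

A convention-based map like this sits between the two built-in options: it keeps PHP class names out of the output, yet new classes that follow the naming rule are picked up without editing a list.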

Scopes

Serde supports "scopes" for having different versions of an attribute recognized in different contexts.

Any attribute (Field, TypeMap, SequenceField, DictionaryField, PostLoad, etc.) may take a scopes argument, which accepts an array of strings. If specified, that attribute is only valid if serializing or deserializing in that scope. If no scoped attribute is specified, then the behavior will fall back to an unscoped attribute or an omitted attribute.

For example, given this class:

class User
{
    private string $username;

    #[Field(exclude: true)]
    private string $password;

    #[Field(exclude: true)]
    #[Field(scopes: ['admin'])]
    private string $role;
}

If you serialize it like so:

$json = $serde->serialize($user, 'json');

It will result in this JSON response:

{
    "username": "Larry"
}

That's because, in an unscoped request, the first Field on $role is used, which excludes it from the output. However, if you specify a scope:

$json = $serde->serialize($user, 'json', scopes: ['admin']);

Then the admin version of $role's Field will be used, which is not excluded, and you get this result:

{
    "username": "Larry",
    "role": "Developer"
}

When using scopes, it may be helpful to disable automatic property inclusion and require that each be specified explicitly. For example:

#[ClassSettings(includeFieldsByDefault: false)]
class Product
{
    #[Field]
    private int $id = 5;

    #[Field]
    #[Field(scopes: ['legacy'], serializedName: 'label')]
    private string $name = 'Fancy widget';

    #[Field(scopes: ['newsystem'])]
    private string $price = '9.99';

    #[Field(scopes: ['legacy'], serializedName: 'cost')]
    private float $legacyPrice = 9.99;

    #[Field(serializedName: 'desc')]
    private string $description = 'A fancy widget';

    private int $stock = 50;
}

If serialized with no scope specified, it will result in this:

{
    "id": 5,
    "name": "Fancy widget",
    "desc": "A fancy widget"
}

As those are the only fields that are "in scope" when no scope is specified.

If serialized with the legacy scope:

{
    "id": 5,
    "label": "Fancy widget",
    "cost": 9.99,
    "desc": "A fancy widget"
}

The scope-specific Field on $name gets used instead, which changes the serialized name. The $legacyPrice property is also included now, but renamed to "cost".

If serialized with the newsystem scope:

{
    "id": 5,
    "name": "Fancy widget",
    "price": "9.99",
    "desc": "A fancy widget"
}

In this case, the $name property uses the unscoped version of Field, and so is not renamed. The string-based $price is now in-scope, but the float-based $legacyPrice is not. Note that in none of these cases is the current $stock included, as it has no attribute at all.

Finally, it's also possible to serialize multiple scopes simultaneously. This is an OR operation, so any field marked for any specified scope will be included.

$json = $serde->serialize($product, 'json', scopes: ['legacy', 'newsystem']);

{
    "id": 5,
    "label": "Fancy widget",
    "price": "9.99",
    "cost": 9.99,
    "desc": "A fancy widget"
}

Note that since there is both an unscoped and a scoped version of the Field on $name, the scoped one wins and the property gets renamed.

If multiple attribute variants could apply for the specified scope, the lexically first in a scope will take precedence over later ones, and a scoped attribute will take precedence over an unscoped one.
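The precedence rule can be modelled in a few lines of plain PHP. This is only one reading of the rule as stated here, not the code Serde or AttributeUtils actually runs; pickFieldVariant and the array shape are hypothetical. The data mirrors the $name and $price fields of the Product example.

```php
<?php

declare(strict_types=1);

/**
 * Pick the attribute variant that applies for the requested scopes.
 *
 * @param list<array{scopes: list<string>, name: string}> $variants In declaration order.
 * @param list<string> $requested Scopes passed to serialize()/deserialize().
 * @return array{scopes: list<string>, name: string}|null Null means no variant applies.
 */
function pickFieldVariant(array $variants, array $requested): ?array
{
    // A scoped variant beats an unscoped one; the first declared match wins.
    foreach ($variants as $variant) {
        if (array_intersect($variant['scopes'], $requested) !== []) {
            return $variant;
        }
    }

    // Otherwise fall back to the first unscoped variant, if there is one.
    foreach ($variants as $variant) {
        if ($variant['scopes'] === []) {
            return $variant;
        }
    }

    return null;
}

$nameVariants = [
    ['scopes' => [], 'name' => 'name'],
    ['scopes' => ['legacy'], 'name' => 'label'],
];
$priceVariants = [
    ['scopes' => ['newsystem'], 'name' => 'price'],
];

$unscoped = pickFieldVariant($nameVariants, [])['name'];             // 'name'
$legacy = pickFieldVariant($nameVariants, ['legacy'])['name'];       // 'label'
$newsystem = pickFieldVariant($nameVariants, ['newsystem'])['name']; // 'name'
$hidden = pickFieldVariant($priceVariants, []);                      // null
```

With includeFieldsByDefault set to false, the null case corresponds to a property that is left out of the output entirely.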

Note that when deserializing, specifying a scope will exclude not only out-of-scope properties but their defaults as well. That is, they will not be set, even to a default value, and so may be "uninitialized." That is rarely desirable, so it may be preferable to deserialize without a scope, even if a value was serialized with a scope. That will depend on your use case.
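For readers unfamiliar with the "uninitialized" state, this plain-PHP sketch shows what it looks like and how to detect it. It does not call Serde; it imitates an object whose constructor was skipped and whose out-of-scope property was never populated. The Account class is a hypothetical example.

```php
<?php

declare(strict_types=1);

final class Account
{
    public function __construct(
        public string $username,
        public string $role = 'guest',
    ) {}
}

// Create the object without running the constructor, so the
// constructor default for $role is never applied.
$account = (new ReflectionClass(Account::class))->newInstanceWithoutConstructor();

// Pretend only the in-scope property was populated.
$account->username = 'Larry';

// A safe way to check before touching the property.
$roleInitialized = (new ReflectionProperty(Account::class, 'role'))->isInitialized($account);

// Reading an uninitialized typed property throws an Error.
$message = null;
try {
    $unused = $account->role;
} catch (Error $e) {
    $message = $e->getMessage();
}
```

If code downstream reads such properties unconditionally, deserializing without a scope, as suggested above, avoids the problem.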

For more on scopes, see the AttributeUtils documentation.

Validation with #[PostLoad]

It is important to note that when deserializing, __construct() is not called at all. That means any validation present in the constructor will not be run on deserialization.

Instead, Serde will look for any method or methods that have a #[\Crell\Serde\Attributes\PostLoad] attribute on them. This attribute takes no arguments other than scopes. After an object is populated, any PostLoad methods will be invoked with no arguments in lexical order. The main use case for this feature is validation, in which case the method should throw an exception if the populated data is invalid in some way. (For instance, some integer must be positive.)

The visibility of the method is irrelevant. Serde will call public, private, or protected methods the same. Note, however, that a private method in a parent class of the class being deserialized to will not get called, as it is not accessible to PHP from that scope.
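A minimal sketch of a PostLoad validator follows. The Rating class is a hypothetical example, and hydrateForDemo() is a hypothetical helper that only imitates the sequence described above (populate the properties, skip the constructor, then call the PostLoad methods) so the example runs without the library. With Serde installed, you would call $serde->deserialize() instead.

```php
<?php

declare(strict_types=1);

final class Rating
{
    private int $stars;

    // Runs after the properties are populated; the constructor is skipped.
    #[\Crell\Serde\Attributes\PostLoad]
    private function validate(): void
    {
        if ($this->stars < 1 || $this->stars > 5) {
            throw new InvalidArgumentException('stars must be between 1 and 5');
        }
    }
}

// Hypothetical helper, not Serde's implementation.
function hydrateForDemo(string $class, array $values): object
{
    $ref = new ReflectionClass($class);
    $object = $ref->newInstanceWithoutConstructor();

    foreach ($values as $name => $value) {
        $ref->getProperty($name)->setValue($object, $value);
    }

    foreach ($ref->getMethods() as $method) {
        // Match on the attribute name only, so the attribute class need not be loaded.
        if ($method->getAttributes('Crell\Serde\Attributes\PostLoad') !== []) {
            $method->invoke($object);
        }
    }

    return $object;
}

$valid = hydrateForDemo(Rating::class, ['stars' => 4]);

try {
    hydrateForDemo(Rating::class, ['stars' => 9]);
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Keeping the check in a private method means the same rule can also be called from the constructor, so objects built by hand and objects built by deserialization are validated identically.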

Extending Serde

Internally, Serde has five types of extensions that work in concert to produce a serialized or deserialized product.

  • Type Maps, as discussed above, are optional and translate a class name to a lookup identifier and back.
  • An Exporter is responsible for pulling values off of an object, processing them if necessary, and then passing them on to a Formatter. This is part of the Serialization pipeline.
  • An Importer is responsible for using a Deformatter to extract data from incoming data and then translate it as necessary to be written to an object. This is part of the Deserialization pipeline.
  • A Formatter is responsible for writing to a specific output format, like JSON or YAML. This is part of the Serialization pipeline.
  • A Deformatter is responsible for reading data off of an incoming format and passing it back to an Importer. This is part of the Deserialization pipeline.

Collectively, Importer and Exporter instances are called "handlers."

In general, Importers and Exporters are PHP-type specific, while Formatters and Deformatters are serialized-format specific. Custom Importers and Exporters can also declare themselves to be format-specific if they contain format-sensitive optimizations.

Importer and Exporter may be implemented on the same object, or not. Similarly, Formatter and Deformatter may be implemented together or not. That is up to whatever seems easiest for the particular implementation, and the provided extensions do a little of each depending on the use case.

The interfaces linked above provide more precise explanations of how to use them. In most cases, you would only need to implement a Formatter or Deformatter to support a new format. You would only need to implement an Importer or Exporter when dealing with a specific class that needs extra special handling for whatever reason, such as its serialized representation having little or no relationship with its object representation.

As an example, a few custom handlers are included to deal with common cases.

  • DateTimeExporter: This object will translate DateTime and DateTimeImmutable objects to and from a serialized form as a string. Specifically, it will use the \DateTimeInterface::RFC3339_EXTENDED format for the string when serializing. The timestamp will then appear in the serialized output as a normal string. When deserializing, it will accept any datetime format supported by DateTime's constructor.
  • DateTimeZoneExporter: This object will translate DateTimeZone objects to and from a serialized form as a timezone string. That is, DateTimeZone('America/Chicago') will be represented in the format as the string America/Chicago.
  • NativeSerializeExporter: This object will apply to any class that has a __serialize() method (when serializing) or __unserialize() method (when deserializing). These PHP magic methods provide alternate representations of an object intended for use with PHP's native serialize() and unserialize() methods, but can also be used for any other format. If __serialize() is defined, it will be invoked and whatever associative array it returns will be written to the selected format as a dictionary. If __unserialize() is defined, this object will read a dictionary from the incoming data and then pass it to that method on a newly created object, which will then be responsible for populating the object as appropriate. No further processing will be done in either direction.
  • EnumOnArrayImporter: Serde natively supports PHP Enums and can serialize them as ints or strings as appropriate. However, in the special case of reading from a PHP array format this object will take over and support reading an Enum literal in the incoming data. That allows, for example, a configuration array to include hand-inserted Enum values and still be cleanly imported into a typed, defined object.
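The string forms mentioned for the date and timezone handlers can be reproduced with PHP's own date classes. The sketch below uses only built-in PHP and does not involve Serde; the sample timestamp is arbitrary.

```php
<?php

declare(strict_types=1);

$moment = new DateTimeImmutable(
    '2024-01-26 22:51:52.123',
    new DateTimeZone('America/Chicago'),
);

// The RFC3339_EXTENDED string form: date, time, milliseconds, UTC offset.
$serialized = $moment->format(DateTimeInterface::RFC3339_EXTENDED);
// '2024-01-26T22:51:52.123-06:00'

// Any format the DateTime constructor understands can be read back.
$restored = new DateTimeImmutable($serialized);

// A timezone is represented by its name.
$zoneName = (new DateTimeZone('America/Chicago'))->getName();
// 'America/Chicago'
```

Note that the round trip preserves the instant and the UTC offset, but the offset string alone does not carry the America/Chicago zone name.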

Architecture diagrams

Serialization works approximately like this:

sequenceDiagram
participant Serde
participant Serializer
participant Exporter
participant Formatter
Serde->>Formatter: initialize()
Formatter-->>Serde: prepared value
Serde->>Serializer: Set up
Serde->>Serializer: serialize()
activate Serializer
loop For each property
  Serializer->>Exporter: call depending on type
  Exporter->>Formatter: type-specific write method
  Formatter->>Serializer: serialize() sub-value
end
Serializer->>Formatter: finalize()
Serializer-->>Serde: final value
deactivate Serializer

And deserialization looks very similar:

sequenceDiagram
participant Serde
participant Deserializer
participant Importer
participant Deformatter
Serde->>Deformatter: initialize()
Deformatter-->>Serde: prepared source
Serde->>Deserializer: Set up
Serde->>Deserializer: deserialize()
activate Deserializer
loop For each property
Deserializer->>Importer: call depending on type
Importer->>Deformatter: type-specific read method
Deformatter->>Deserializer: deserialize() sub-value
end
Deserializer->>Deformatter: finalize()
Deserializer-->>Serde: final value
deactivate Deserializer

In both cases, note that nearly all behavior is controlled by a one-off serializer/deserializer object, not by Serde itself. Serde itself is just a wrapper that configures the context for the runner object.

Dependency Injection configuration

Serde is designed to be usable "out of the box" without any additional setup. However, when included in a larger system it is best to configure it properly via Dependency Injection.

There are three ways you can set up Serde.

  1. The SerdeCommon class includes most available handlers and formatters out of the box, ready to go, although you can add additional ones via the constructor.
  2. The SerdeBasic class has no pre-built configuration whatsoever; you will need to provide all Handlers, Formatters, or Type Maps you want yourself, in the order you want them applied.
  3. You may also extend the Serde base class itself and create your own custom pre-made configuration, with just the Handlers or Formatters (provided or custom) that you want.

Both SerdeCommon and SerdeBasic take four arguments: The ClassAnalyzer to use, an array of Handlers, an array of Formatters, and an array of Type Maps. If no analyzer is provided, Serde creates a memory-cached Analyzer by default so that it will always work. However, in a DI configuration it is strongly recommended that you configure the Analyzer yourself, with appropriate caching, and inject that into Serde as a dependency to avoid duplicate Analyzers (and duplicate caches). If you have multiple different Serde configurations in different services, it may also be beneficial to make all handlers and formatters services as well and explicitly inject them into SerdeBasic rather than relying on SerdeCommon.

Change log

Please see CHANGELOG for more information on what has changed recently.

Testing

$ composer test

Contributing

Please see CONTRIBUTING and CODE_OF_CONDUCT for details.

Security

If you discover any security related issues, please use the GitHub security reporting form rather than the issue queue.

Credits

Initial development of this library was sponsored by TYPO3 GmbH.

License

The Lesser GPL version 3 or later. Please see License File for more information.

serde's People

Contributors

crell, iceridder, lucas-gerard, ordago, themasch

serde's Issues

TypeError when serializing a flattened nullable property

Detailed description

When using #[Field(flatten: true)] on a property whose type has a nullable property, serialization fails with a TypeError.

Here's a tiny reproduction:

#[ClassSettings(requireValues: true)]
final class BaseSubject
{
	public function __construct(
		public readonly ?string $firstName = null,
	) {}
}

#[ClassSettings(requireValues: true)]
final class Subject
{
	public function __construct(
		#[Field(flatten: true)]
		public readonly ?BaseSubject $baseSubject = null,
	) {}
}

// TypeError: Crell\Serde\Formatter\JsonFormatter::serializeString(): Argument #3 ($next) must be of type string, null given, called in /opt/project/vendor/crell/serde/src/PropertyHandler/ScalarExporter.php on line 20
(new SerdeCommon())->serialize(new Subject, 'json');

Context

It does not happen when serializing the nested class directly:

// no errors here
(new SerdeCommon())->serialize(new BaseSubject, 'json');

Possible implementation

Seems like it's caused by a missing check in ObjectExporter. Regular ObjectExporter::flattenValue checks whether given value is null, and if so - doesn't add a new CollectionItem so that field is never serialized. In the case of a flatten value, however, ObjectExporter::reduceObjectProperty() is used instead for some reason, which doesn't have a null check.

If I remove ObjectExporter::reduceObjectProperty() and replace it with direct flattenValue() calls, it works as expected, but I'm not sure why it was implemented that way in the first place.

Your environment

  • Version used (e.g. PHP 5.6, HHVM 3): PHP 8.1.20, Serde 0.6.0

Deserializing NULL values is ignored

Detailed description

Currently there is a bug that prevents nullable properties from being explicitly set to NULL.
If you use #[ClassSettings(requireValues: true)], you will receive a Crell\Serde\MissingRequiredValueWhenDeserializing error.

Context

Why is this change important to you? How would you use it?

Serde is used to map the raw data of a JSON API into an object.
It's necessary to be able to pass certain values as NULL.

How can it benefit other users?

This behavior is surely expected for other users.

Possible implementation

Passing an existing value as NULL should result in setting the property to NULL.

Your environment

  • Version used (e.g. PHP 5.6, HHVM 3): PHP 8.2.15 (cli) (built: Jan 20 2024 14:17:05) (NTS)
  • Operating system and version (e.g. Ubuntu 16.04, Windows 7): Ubuntu 22.04.4 LTS
  • Link to your project: https://github.com/waahhhh/serde-test

Ability to flatten single property objects

Detailed description

Ability to flatten objects with single properties automatically

Context

When working on a bounded context, it may be desirable to create an identity representation so it can have behavior specific to the domain. This can be useful, for instance, when deserializing JSON payloads from an event-sourced system.

Consider the following payload:

{
  "id": "identity",
  "name": "Testing Purposes"
}

and considering the following object representation:

final class Identity
{
    public function __construct(
        public readonly string $value,
    ) {
    }
}

final class FlattenSingleProperty
{
    public function __construct(
        public readonly Identity $id,
        public readonly string $name,
    ) {
    }
}

It would be ideal to successfully deserialize the object representation without having to specify the value property in the JSON payload.

How can it benefit other users?

It may be beneficial for other users for the same reason as my current use case: to avoid specifying a single property for domain objects.

Possible implementation

I went ahead and added a test case which hopefully conveys the idea and the expectation better. The test case is currently failing as I'm not sure yet in which part of the project to do the check for an object having a single property.

Branch with the test

Your environment

PHP 8.2

Using string-backed enums in type maps

Detailed description

I would like to deserialize a string value into a string-backed Enum instance while also using the same string as part of a type map. Like this:

enum Version: string
{
    case V1 = 'v1';
}

#[StaticTypeMap(key: 'version', map: [Version::V1->value => V1::class])]
interface Payload
{
    //
}

class V1 implements Payload
{
    public function __construct(
        public readonly Version $version,
        public readonly string  $foo,
        public readonly int     $bar,
    ) {
        //
    }
}

$json = <<<JSON
{
    "version": "v1",
    "foo": "asd",
    "bar": 123
}
JSON;


$serde   = new SerdeCommon();
$payload = $serde->deserialize(serialized: $json, from: 'json', to: Payload::class);
var_dump($payload);

This currently leads to an error:

TypeError: Crell\Serde\Attributes\StaticTypeMap::findClass(): Argument #1 ($id) must be of type string, Version given, called in vendor/crell/serde/src/TypeMapper.php on line 61 and defined in vendor/crell/serde/src/Attributes/StaticTypeMap.php:38

Context

Removing the $version property or changing its type to string fixes this issue. However, I think using enums to represent a set of known discriminator values is generally a good idea.
The current behavior is pretty unintuitive.

Possible implementation

We could either patch TypeMapper like this:

diff --git a/src/TypeMapper.php b/src/TypeMapper.php
index 6559322..b974198 100644
--- a/src/TypeMapper.php
+++ b/src/TypeMapper.php
@@ -58,6 +58,10 @@ class TypeMapper
             return null;
         }

+        if ($key instanceof \BackedEnum) {
+            $key = $key->value;
+        }
+
         if (!$class = $map->findClass($key)) {
             throw NoTypeMapDefinedForKey::create($key, $field->phpName ?? $field->phpType);
         }

Or add a separate TypeMap implementation - something like this:

#[Attribute(Attribute::TARGET_CLASS | Attribute::TARGET_PROPERTY)]
class BackedEnumTypeMap extends StaticTypeMap
{
    public function findClass(string|BackedEnum $id): ?string
    {
        if ($id instanceof BackedEnum) {
            $id = $id->value;
        }

        return parent::findClass($id);
    }
}

int-backed enums would remain unsupported. I don't think this is an issue, as ints are already unsupported in static type maps anyway.

Your environment

$ php -v
PHP 8.2.9 (cli) (built: Aug 16 2023 19:49:37) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.9, Copyright (c) Zend Technologies
    with Zend OPcache v8.2.9, Copyright (c), by Zend Technologies
    with Xdebug v3.2.1, Copyright (c) 2002-2023, by Derick Rethans
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy
$ composer show -t crell/serde
crell/serde dev-master A general purpose serialization and deserialization library
├──crell/attributeutils ~0.8.2
│  ├──crell/fp ~0.4.0
│  │  └──php ~8.1
│  └──php ~8.1
├──crell/fp >= 0.3.3
│  └──php ~8.1
└──php ~8.1

Better support for composite value objects

Detailed description

Given this scenario

readonly class Name
{
    public function __construct(public string $value) {
        // validation logic
    }
}

readonly class Age
{
    public function __construct(public int $value) {
        // validation logic
    }
}

readonly class Person
{
      public function __construct(public Name $name, public Age $age) {}
}

If I try to serialize the class Person then I will have this JSON

{"name":{"value":"John Smith"},"age":{"value":42}}

Trying with the flatten: true attribute in both properties, this is the result

{"value":"John Smith"}

But I would prefer an output like this

{"name":"John Smith", "age": 42}

Context

When working extensively with value objects and wrapping primitives, it's common to wrap fields like Email, Url, Age, etc. These are values that need specific validation and can be reused to compose more complex value objects or entities. I believe those adopting a DDD approach in PHP could share similar needs.

Possible implementation

I'm still studying how the library internally works, but I'm not yet ready to give an implementation suggestion.
But I can give some thought about a possible behaviour.

The first approach that came to my mind was this one: add the Field serializedName argument to specify a new name.

readonly class Person
{
    public function __construct(
        #[Field(flatten: true, serializedName: 'name')]
        public Name $name,
        #[Field(flatten: true, serializedName: 'age')]
        public Age $age
    ) {}
}

Alternatively, I suggest a new Attribute:

#[Attribute(Attribute::TARGET_CLASS)]
readonly class ValueObject
{
    public function __construct(public string $field = 'value') {}
}

So we can define the metadata directly inside the Value Object, instead of repeating it every time it's used.

#[ValueObject]
readonly class Name
{
    public function __construct(public string $value) {
        // validation logic
    }
}

What do you think? I would love to hear your thoughts on this proposal.

Consider parsing ints for deserializing Dates

Oftentimes, we programmers need to deserialize unix timestamps. However, Serde doesn't know what to do with an int when passing it to create a DateTimeImmutable. These are a little funky, requiring an "@" prepended before the number:

$time = 1706309512;

echo (new DateTimeImmutable("@$time"))->format('U');
// outputs 1706309512

Detailed description

When a property is annotated with DateField and the type is an int, cast to a string with "@" prepended on deserialization. Additionally, we can check that format is "U".
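The proposed behaviour could look roughly like the sketch below. dateFromSerialized() is a hypothetical helper written for this illustration; it is not part of Serde and says nothing about how DateField is implemented.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: treat integers as Unix timestamps, pass strings through.
function dateFromSerialized(int|string $value): DateTimeImmutable
{
    if (is_int($value)) {
        // The "@" prefix tells the constructor the number is a Unix timestamp.
        return new DateTimeImmutable('@' . $value);
    }

    return new DateTimeImmutable($value);
}

$fromInt = dateFromSerialized(1706309512);
$fromString = dateFromSerialized('2024-01-26T22:51:52+00:00');

$roundTrip = $fromInt->format('U'); // '1706309512'
```

Both calls above describe the same instant. DateTimeImmutable::createFromFormat('U', (string) $value) is an alternative to the "@" prefix when the format is known to be "U".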

Context

Parsing timestamps from external services without having a post-load callback and duplicating properties.

Possible implementation

This should be pretty straightforward to implement.

Deserializing top-level sequence

Detailed description

I wonder how to deserialize top-level sequences into an object. I found that something similar already exists in this library, but it is limited to CSV.

For example, I have such a PHP array

$data = [
  ['id' => 1],
  ['id' => 2],
];

And the DTOs

readonly class MyList {
  public array $items;
}

readonly class MyItem {
  public int $id;
}

It seems that I cannot deserialize $data into an instance of MyList. I tried adding these attributes to MyList#items

#[Field(flatten: true)]
#[SequenceField(arrayType: MyItem::class)]

It deserializes the array without errors, but the type of items becomes associative array rather than MyItem. Seems like the SequenceField attribute is ignored if flatten: true.

Context

I am refactoring the API in an old project. Some of the routes accept JSON arrays as their body, and I found that Serde does not provide a simple way to deserialize the arrays into strictly-typed objects.

Possible implementation

Generalize the feature existing in the CSV handler. The analogy in Rust's serde (transparent) could be a good API design for this feature.

Your environment

  • Version used: PHP 8.3
  • Operating system and version: Arch Linux
  • Link to your project: N/A

Deserialize arrays of objects when used as values in dictionaries

Hi, loving the library so far!

I'm struggling to properly deserialize dictionaries whose values are arrays of objects. This is where I'm at:

use Crell\Serde\Attributes as Serde;

class Note {
    public string $text;
}

class Test {
    /**
     * @var array<string, Note[]>
     */
    #[Serde\DictionaryField(arrayType: 'array')]
    public array $notes = [];
}

As the PHPDoc specifies, the $notes member of the Test class is supposed to be a dictionary with string keys and Note arrays as values.

I managed to get it to work by introducing a Notes intermediary class, which I can then specify as the arrayType like so:

class Notes {
    #[Serde\SequenceField(arrayType: Note::class)]
    public array $notes = [];
}

class Test {
    /**
     * @var array<string, Notes>
     */
    #[Serde\DictionaryField(arrayType: Notes::class)]
    public array $notes = [];
}

but this is not ideal, since it complicates the serialized output unnecessarily, and makes it harder to address the individual elements as I have to go through the Notes class. I also tried applying type maps, but I didn't get far with that either.

So my question is: Can I do this in a better way? Maybe I'm missing an obvious solution here, so I would be grateful if someone could point me in the right direction.

Consider limited validation on deserialization

From Reddit, maybe we should do some validation on the object after it's loaded?

https://www.reddit.com/r/PHP/comments/wduj55/comment/iin84r2/?utm_source=reddit&utm_medium=web2x&context=3

public function __construct(private string $name) { … }

No one would expect this class to exist with an uninitialized property $name. Yes, Reflection makes it possible to create it, but then you should make sure it's not left in this state. I believe that when you specify strict: true on a non-nullable property, it should throw an error when this attribute is missing, because it's violating its type. Feel free to disagree, but even Rust's serde works that way.

So at least basic type validation around uninitialized values.

The problem is knowing which properties should get validated like that, in which case we may need an explicit flag of some kind.

mixed fails to serialize objects

Detailed description

When an object contains a mixed property containing an object, the property fails to serialize in MixedExporter due to not being an array, int, float, bool, or string.

Context

It would be nice to serialize generic objects.

Workaround

The workaround is to call $serde->serialize(...) on the mixed property and then serialize the parent object.

Can I add TOML support?

Just watched your talk, love it! I work with PHP and Rust professionally so I'd love to convince my team to use this. I would like to add TOML support if you are open to it.

Enums lose type information when serialised as part of a static map

Detailed description

When serializing with a TypeMap and one of the possible elements is an Enum it will lose its type

Ex:

#[StaticTypeMap(key: 'type', map: [
    'enum' => MyEnum::class,
    'object' =>  MyType::class,
])]
interface MyInterface {}

enum MyEnum: int implements MyInterface
{
    case A = 1;
    case B = 2;
}

class MyType implements MyInterface {
    public function __construct(
        public int $id = 1,
    )
    {
    }
}

class Element
{
    public function __construct(
        #[SequenceField(arrayType: MyInterface::class)]
        public array $elements = [],
    )
    {
    }
}

$element = new Element([MyEnum::A]);
$serde->serialize($element, 'json');

This results in {"elements":[1]}, and thus deserialization fails.

Context

I am trying to create a generic collection to serialize and deserialize a list of items.

Possible implementation

Add map check in EnumExporter similar to ObjectExporter and if one is present serialize as {keyName: mixed, value: mixed}. I would not consider this a BC break as the feature was not working anyway.

Your environment

  • Version used: PHP 8.1
  • Operating system and version: Ubuntu 22.10

Problem with abstract nested object

I have some abstract classes that represent my common "base" value objects: for example ULID, String, Email, ...
I typically extend these abstract classes with one more appropriate for the context of my value object: for example CustomerId, CustomerName, CustomerEmail, ...
Finally I use these Value Objects inside my entity, my other Value Object and so on.

abstract class UlidValueObject
{
    final public function __construct(public readonly string $value)
    {
        if (!Ulid::isValid($value)) {
            throw new InvalidArgumentException('bla bla bla');
        }
    }

    // public static function generate(): static
    // public function equalTo(self $other): bool
    // public function __toString(): string
}

class CustomerId extends UlidValueObject
{
}

class Customer
{
    public function __construct(
        public CustomerId $id,
        public CustomerEmail $email,
        // ...
    ) {
    }
}

When I serialize the Customer object I get:

{"id": {"value": "01HBQXG72X9KF1PF6KSJXPW26W"}, "email": {"value": "[email protected]"}}

but when I try to deserialize it I get this error

Cannot initialize readonly property ...\UlidValueObject::$value from scope ...\CustomerId

How can I fix it?
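The error message points at a scope problem: the readonly property is declared on the abstract parent, so a write performed from the child's scope is rejected on the PHP version reported here. The following is a minimal plain-PHP sketch of the general technique of writing from the declaring class's scope. It is not Serde's internal code; hydrateReadonly() is a hypothetical helper, and the ULID validation from the original constructor is left out to keep the example self-contained.

```php
<?php
declare(strict_types=1);

abstract class UlidValueObject
{
    // Validation omitted in this sketch.
    final public function __construct(public readonly string $value) {}
}

class CustomerId extends UlidValueObject
{
}

/**
 * Hypothetical helper: creates an instance without calling the constructor
 * and initializes one property from the scope of the class that declares it.
 */
function hydrateReadonly(string $class, string $property, mixed $value): object
{
    $reflection = new ReflectionClass($class);
    $object = $reflection->newInstanceWithoutConstructor();

    // For CustomerId::$value this resolves to UlidValueObject.
    $declaringClass = $reflection->getProperty($property)->getDeclaringClass()->getName();

    $writer = function (string $name, mixed $newValue): void {
        $this->$name = $newValue;
    };

    // Bind $this to the new object, but use the declaring class as the scope.
    Closure::bind($writer, $object, $declaringClass)($property, $value);

    return $object;
}

$id = hydrateReadonly(CustomerId::class, 'value', '01HBQXG72X9KF1PF6KSJXPW26W');
```

Whether a fix along these lines belongs in the library or in user code is a separate question; the sketch only shows why the scope matters.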

My environment

PHP 8.2

Field path on deserialization error

When an error occurs during deserialization, the field path should be available so it can be used in an API error response.

Detailed description and context

For a programmer-centric experience when deserialization is used as part of API endpoint validation, the resulting error should contain a field path, a basic error description, and an error type that can be passed back to the caller, so that they can react and resolve the issue.

(Or it could be unified with something like Symfony Validator and shown next to the form fields at the UI level.)

Possible implementation

N/A

Your environment

  • PHP 8.1
  • Using together with symfony validator for logical (business) validation.

FR: infer types from docblock

Detailed description

The docs mention a snippet similar to this:

use Crell\Serde\Attributes as Serde;

class Results
{
    public function __construct(
        #[Serde\SequenceField(arrayType: Product::class)]
        public array $products,
    ) {}
}

Could it support docblocks so we don't have to add additional attributes?

use Crell\Serde\Attributes as Serde;

class Results
{
    public function __construct(
        /** @param array<Product> */
        public array $products,
    ) {}
}

Context

I'd prefer to keep my classes free of any attributes. My properties are already fully typed, e.g. collections are typed as array<Product>, so there's really no need for an additional annotation since the type can already be inferred.

So I'd like

        /** @param array<Product> */
        public array $products,

to work the same as public int $i.
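To make the request more concrete, here is a minimal sketch of how an element type could be read from a docblock using Reflection. It is only an illustration in plain PHP, not an existing Serde feature; inferArrayElementType() is a hypothetical function, it assumes doc comments are retained at runtime, it handles only the simple array<Type> form, and it does not resolve short class names against use statements.

```php
<?php
declare(strict_types=1);

final class Product
{
}

final class Results
{
    /** @var array<Product> */
    public array $products = [];
}

/**
 * Hypothetical helper: extracts "Product" from "array<Product>" in a
 * property docblock, or returns null when no such annotation is found.
 */
function inferArrayElementType(ReflectionProperty $property): ?string
{
    $doc = $property->getDocComment();
    if ($doc === false) {
        return null;
    }

    // Matches "@var array<Type>" or "@param array<Type>".
    if (preg_match('/@(?:var|param)\s+array<\s*([\w\\\\]+)\s*>/', $doc, $matches) === 1) {
        return $matches[1];
    }

    return null;
}

$type = inferArrayElementType(new ReflectionProperty(Results::class, 'products'));
```

A real implementation would likely need a proper docblock parser to cover forms such as Product[], list<Product>, or array<string, Product>.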

Property default values not working

Detailed description

Not sure if I'm missing something obvious or if this is a regression in Serde, but I can't seem to get property default values for deserialization working at all:

use Crell\Serde\Attributes\Field;
use Crell\Serde\SerdeCommon;

class Foo
{
    public function __construct(
        #[Field(default: null)]
        public ?int  $id,
        public string $name
    ) {
        //
    }        
}

$serde = new SerdeCommon();
$newFoo = $serde->deserialize(['name' => 'foobar'], 'array', Foo::class);
dump($newFoo);
= Foo {#6214
    +name: "foobar",
  }

I have tried a couple of different ways; even PHP default values in the constructor are not working.

Your environment

$ composer show -t crell/serde
crell/serde 0.6.0 A general purpose serialization and deserialization library
├──crell/attributeutils ~0.8.2
│  ├──crell/fp ~0.4.0
│  │  └──php ~8.1
│  └──php ~8.1
├──crell/fp >= 0.3.3
│  └──php ~8.1
└──php ~8.1

Serde should enforce SequenceField declaration

When marking a property as a SequenceField Serde currently just validates via array_is_list on deserialization, possibly throwing an exception if the validation fails. I would like to propose that Serde enforces a list via array_values if the validation fails instead.

Detailed description

When deserializing a Laravel request array I ran into a Serde\InvalidArrayKeyType exception, which proved fairly tricky to debug. In the end the problem turned out to be rather silly as Laravel's request validation may reorder request array elements, effectively turning the array into an associative one:

final class Bar
{
    public function __construct(
        public string $baz,
        #[SequenceField(arrayType: Foo::class)]
        public array $fooArray = [],
    ) {
        //
    }    
}

final class Foo
{
    public function __construct(
        public string $name,
    ) {
        //
    }        
}

$serde = new SerdeCommon();

$ok = ['baz' => 'bla', 'fooArray' => [0 => ['name' => 'meh'], 1 => ['name' => '123']]];

$error = ['baz' => 'bla', 'fooArray' => [1 => ['name' => '123'], 0 => ['name' => 'meh']]];

$serde->deserialize($ok, 'array', Bar::class);
$serde->deserialize($error, 'array', Bar::class);

Context

Enforcing a list when validation fails would save other users the pain of debugging this when major frameworks like Laravel might reorder elements unexpectedly. Also having to manually enforce lists via array_values is rather cumbersome for bigger DTOs.
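For reference, the manual workaround mentioned above can be sketched in plain PHP (8.1+, because of array_is_list()). The variable names are illustrative. One detail worth noting is that array_values() keeps the current iteration order, so a ksort() is needed first if the original index order matters.

```php
<?php
declare(strict_types=1);

// Same shape as the failing 'fooArray' value from the example above.
$reordered = [1 => ['name' => '123'], 0 => ['name' => 'meh']];

// The keys are 1, 0 rather than 0, 1, so this is not a list.
$isList = array_is_list($reordered);

// Option 1: re-index in the current iteration order ('123' comes first).
$byPosition = array_values($reordered);

// Option 2: restore the original index order first, then re-index.
$byIndex = $reordered;
ksort($byIndex);
$byIndex = array_values($byIndex);
```

Which of the two orders Serde should pick, if it normalized automatically, is exactly the kind of decision the proposal would need to settle.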

Your environment

$ composer show -t crell/serde
crell/serde 0.6.0 A general purpose serialization and deserialization library
├──crell/attributeutils ~0.8.2
│  ├──crell/fp ~0.4.0
│  │  └──php ~8.1
│  └──php ~8.1
├──crell/fp >= 0.3.3
│  └──php ~8.1
└──php ~8.1

Allow required property validation

Throw an error on deserialization when a required (non-nullable) property is not present in the payload.

Detailed description

For strict validation at an API endpoint, throwing an error when a field is missing is a necessity.

Context

This can help everyone build stricter, safer, and more secure interfaces.

Possible implementation

N/A

Your environment

  • PHP 8.1 with strict types

Upgrade docker setup

Detailed description

There are some points in the docker setup that can be upgraded and/or improved:

  • Currently docker is using php:8.1.0RC3-cli. Since the official php:8.1.0 is out it should be upgraded.
  • Add composer and dependencies to the docker image in order to be able to install composer dependencies from docker
  • Remove the "vendor/bin/phpunit" command from the Dockerfile to allow composer install from fresh
  • Update docker-compose example commands

Context

Why is this change important to you? How would you use it?

Having a working docker environment from scratch allows for testing and contributing easier.

How can it benefit other users?

They do not need to have the specific php version installed

Possible implementation

I will submit a PR: #2

Your environment

  • Docker version 20.10.12
  • docker-compose version 1.25.0
  • Ubuntu 21.04 x86_64

Null value not set on deserialization

When null is present in the data being deserialized, it is not assigned and the property remains uninitialized.

Detailed description

Consider this DTO:

class FooDTO {
    public function __construct(public ?array $testExamples) {}
}

And this payload

['testExamples' => null]

The expected outcome is that the FooDTO instance has null assigned to its $testExamples property, but currently the following error occurs on access:

$testExamples must not be accessed before initialization

Context

This behaviour is important to fix, because changing the declaration to add = null is undesirable in most cases.

Possible implementation

Fix implementation of assign.
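The actual cause inside Serde is not confirmed here. One common way this kind of bug arises in PHP is an isset()-style check that cannot tell "key present with null" apart from "key missing". The sketch below, in plain PHP with illustrative names, shows the difference and how array_key_exists() lets an explicit null be assigned.

```php
<?php
declare(strict_types=1);

class FooDTO
{
    public function __construct(public ?array $testExamples) {}
}

$payload = ['testExamples' => null];

// isset() is false for a key whose value is null ...
$seenByIsset = isset($payload['testExamples']);
// ... while array_key_exists() reports that the key is present.
$seenByKeyCheck = array_key_exists('testExamples', $payload);

// Simulate a deserializer that bypasses the constructor.
$dto = (new ReflectionClass(FooDTO::class))->newInstanceWithoutConstructor();
$property = new ReflectionProperty(FooDTO::class, 'testExamples');
$initializedBefore = $property->isInitialized($dto);

// Assign whenever the key exists, even if its value is null.
if (array_key_exists('testExamples', $payload)) {
    $property->setValue($dto, $payload['testExamples']);
}
$initializedAfter = $property->isInitialized($dto);
```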

Your environment

  • PHP 8.1

Improve developer's tooling

Detailed description

Provide configured PHPStan and PHPCS (or Easy Coding Standard) so the desired checks are easy to run and code reviews can spend less time on style.

Context

  • PHPStan is installed but not configured in composer.json, and it is also not configured by default.
  • PHPCS is configured in composer.json, but the coding standard has not been adjusted to the code base (it reports too many errors).

Possible implementation

TBA

Your environment

  • Default repository checkout

Support for integer-keyed dictionaries

Hello,

I just tried out your new library, and I'm very pleased about the ease of use and functionality. Love it!

However, there was one thing I couldn't achieve:
Serializing a simple array of objects into JSON.

First try: Passing the array into the serializer, but I can't. Serde is only accepting objects.
Would be great to be able to serialize arrays as well, but this might come with drawbacks I'm not aware of.

Okay, so my next try was creating a wrapper class to solve this. I'm not the biggest fan of having an extra class just for serialization, but here we go:

class FooCollection
{
    /**
     * @param Foo[] $foos
     */
    public function __construct(#[DictionaryField(arrayType: Foo::class)] public array $foos)
    {
    }
}

Now here's the catch: My array is keyed with Unix timestamps (stored as integer):

$foos = [
    1658311200 => new Foo(),
    1658314800 => new Foo(),
    1658318400 => new Foo(),
    1658322000 => new Foo(),
    ...
];

$serde->serialize(new FooCollection($foos), 'json');

I can't use SequenceExporter, because it's expecting the array to be a list. DictionaryExporter on the other hand expects the array to be keyed with strings and is failing with an exception:

Crell\Serde\Attributes\Field::create(): Argument #1 ($serializedName) must be of type string, int given, called in vendor\crell\serde\src\PropertyHandler\DictionaryExporter.php on line 30

Is there something I'm missing, or an easy solution for this? Of course, I can walk through the whole array, convert the integer keys to strings, and do the whole thing again in the opposite direction when deserializing. But that doesn't feel great, so I want to avoid it.
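A note on the manual workaround: PHP itself casts array keys that look like decimal integers back to int, so simply casting the keys to strings does not produce string keys. A minimal plain-PHP sketch of a prefix-based round trip follows. The helper names prefixKeys() and stripKeyPrefix() and the 't' prefix are hypothetical, and this is not a Serde feature.

```php
<?php
declare(strict_types=1);

// A numeric string key is silently converted to an int key by PHP.
$keys = array_keys(['1658311200' => 'first']);

/** Hypothetical helper: turns int keys into string keys by adding a prefix. */
function prefixKeys(array $items, string $prefix): array
{
    $result = [];
    foreach ($items as $key => $value) {
        $result[$prefix . $key] = $value;
    }
    return $result;
}

/** Hypothetical helper: removes the prefix and restores int keys. */
function stripKeyPrefix(array $items, string $prefix): array
{
    $result = [];
    foreach ($items as $key => $value) {
        $result[(int) substr((string) $key, strlen($prefix))] = $value;
    }
    return $result;
}

$foos = [1658311200 => 'a', 1658314800 => 'b'];
$stringKeyed = prefixKeys($foos, 't');
$restored = stripKeyPrefix($stringKeyed, 't');
```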

Type check scalar arrays

I would like to properly deserialize arrays of scalar values (ints, strings, ...).

Detailed description

In the same way that #[SequenceField(arrayType: ExampleDTO::class)] can be provided to ensure all items are of a particular type, scalar types should work too, for example #[SequenceField(arrayType: 'string')].

Context

This ensures that the deserialized object complies with its type declaration/documentation and can be relied on to some extent, matching the current PHP state of the art as type-checked by, for example, PHPStan.

This helps everyone that aims to do strict input validation of incoming API payloads.

Possible implementation

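While experimenting, I found that the following change

// ArrayBasedDeformatter.php line 104 changed to
// ($allowedScalarTypes stands for a list of scalar type names such as 'int' and 'string')
if (class_exists($class) || interface_exists($class) || in_array($class, $allowedScalarTypes, true)) {

makes it work, so that it throws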

Crell\Serde\TypeMismatch : Expected value of type string when writing to property 0, but found type int.

on invalid type
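Independently of how Serde would implement it, the element check being asked for can be sketched in plain PHP (8.0+, because of get_debug_type()). The function assertListOf() is a hypothetical name, and the type names follow get_debug_type(), for example 'int', 'string', 'float', and 'bool'.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: throws if any element is not of the expected type.
 * $expectedType uses get_debug_type() names such as 'int' or 'string'.
 */
function assertListOf(array $values, string $expectedType): void
{
    foreach ($values as $index => $value) {
        $actualType = get_debug_type($value);
        if ($actualType !== $expectedType) {
            throw new UnexpectedValueException(sprintf(
                'Expected value of type %s at index %s, but found type %s.',
                $expectedType,
                (string) $index,
                $actualType,
            ));
        }
    }
}

// Passes silently: every element is a string.
assertListOf(['a', 'b'], 'string');

// Fails: the second element is an int.
try {
    assertListOf(['a', 1], 'string');
    $message = null;
} catch (UnexpectedValueException $e) {
    $message = $e->getMessage();
}
```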

Your environment

  • PHP 8.1 with strict types

Why does `requireValue` default to false?

Detailed description

Mostly just a question, but is there a specific reason why the requireValue option on the Field attribute defaults to false? It seems a bit weird IMO, because this allows code like the following to work:

use Crell\Serde\SerdeCommon;

class User {
    public function __construct(
        public readonly string $name,
        public readonly string $address
    ) {}
}

$serde = new SerdeCommon();

$json = '{}';
$user = $serde->deserialize($json, from: 'json', to: User::class);

Context

Defaulting to requiring the value on deserialization when the property has no default value would be less surprising behavior, and it would also reduce the number of attributes needed to achieve it.

Possible implementation

Changing the default value, or making it configurable at the Serde level somehow

Your environment

PHP 8.1

Alias for Inner Objects

Hey, what a cool and useful package! I'd like to ask if Serde would be capable of handling a very specific deserialization case. I work on a legacy application with a somewhat eccentric database, let's put it that way. In PDO queries, I usually bring all results as flattened arrays, and hydrating these data is a bit tedious because it's almost all done manually. My use case would be:

<?php

$data = [
    [
        "USER_ID" => 1,
        "USER_NAME" => "John",
        "ADDRESS_LINE" => "...",
        "ADDRESS_POST_CODE" => "...",
    ]
];

class User
{
    #[Serde\Field(serializedName: 'USER_ID')]
    public int $id;

    #[Serde\Field(serializedName: 'USER_NAME')]
    public string $name;

    #[Serde\Field(alias: 'ADDRESS_*')]
    public Address $address;
}

class Address {
    #[Serde\Field(serializedName: 'ADDRESS_LINE')]
    public string $line;

    #[Serde\Field(serializedName: 'ADDRESS_POST_CODE')]
    public string $postcode;
}

Would the library be able to handle a case like this?
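Whatever the library supports, one option is to reshape each row before handing it to a deserializer. The sketch below is plain PHP and makes no claim about Serde's own features; nestByPrefix() is a hypothetical helper that moves all keys starting with a given prefix into a nested array under a chosen key.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: groups all keys that start with $prefix into a
 * nested array stored under $target, leaving other keys untouched.
 */
function nestByPrefix(array $row, string $prefix, string $target): array
{
    $result = [];
    foreach ($row as $key => $value) {
        if (str_starts_with((string) $key, $prefix)) {
            $result[$target][$key] = $value;
        } else {
            $result[$key] = $value;
        }
    }
    return $result;
}

$row = [
    'USER_ID' => 1,
    'USER_NAME' => 'John',
    'ADDRESS_LINE' => '...',
    'ADDRESS_POST_CODE' => '...',
];

$nested = nestByPrefix($row, 'ADDRESS_', 'address');
```

The nested array then has an 'address' key holding the two ADDRESS_ entries, which matches the shape of the User and Address classes above.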

Serializing objects with iterable properties

Detailed description

I'd like to see serialization support for object properties of type "iterable".
They should be treated the same way as arrays (or more precise sequences).

Context

I am working on an export function to aggregate and dump data of a medical study website. Collecting the necessary data is highly complex and takes several minutes even on the live servers. In the past, I've been using the CSV file format, dumping every row directly to "php://output" with no output buffer. This allowed the browser to update the download's file size every few seconds, giving the user some indication of progress.

However, the new data is too complex for a CSV dump, so I have to switch to a more dynamic format (JSON that is, using Serde's json-stream formatter).

Now, instead of collecting all the export data first (which takes a few minutes, like I said), and then passing the result array to Serde, I'd like to lazy-load the data while serializing. This would allow me to start the file download instantly, with the file size growing over time like it used to be.

So I tried changing my property's type from array to iterable and assigning a generator function, but Serde treats iterable as an unsupported type. Changing the type to \Traversable or \Iterable doesn't work either (PHP Warning: Cannot bind closure to scope of internal class Generator in vendor\crell\serde\src\PropertyHandler\ObjectExporter.php line 30).

Possible implementation

Well, I just needed to add iterable to Attributes\Field.php -> deriveTypeCategory() -> TypeCategory::Array and PropertyHandler\SequenceExporter.php -> canExport(), and now Serde is happily serializing my generator function to a JSON array.

But of course, implementing it correctly takes some more considerations.

  • Serializing an iterable works well, but you can't deserialize to an iterable directly. So if this feature gets added to Serde eventually, it would be one-way only. This might be a bad thing or not.
  • Is it sufficient to just add support for iterable? What if my property's type annotation is \Traversable, \Iterable or even \Generator? Could possibly be solved with some kind of check, whether the whole type inherits from iterable?
  • What about classes inheriting \Traversable, but also having additional properties? Should they be serialized as usual, or be treated as an array? I'd say, they should be treated as usual, unless they are attributed with #[SequenceField]

This may well be an edge case, but what are your thoughts on it? I hope my intentions are clear.
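To illustrate the lazy-loading idea independently of Serde, here is a minimal plain-PHP sketch that writes any iterable, including a generator, to a stream as a JSON array one element at a time. It is not Serde's json-stream formatter; streamJsonArray() and loadRows() are hypothetical names.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: writes each item as soon as it is produced, so a
 * generator is never collected into memory as a whole.
 *
 * @param resource $stream An open, writable stream.
 */
function streamJsonArray(iterable $items, $stream): void
{
    fwrite($stream, '[');
    $first = true;
    foreach ($items as $item) {
        if (!$first) {
            fwrite($stream, ',');
        }
        fwrite($stream, json_encode($item, JSON_THROW_ON_ERROR));
        $first = false;
    }
    fwrite($stream, ']');
}

// Stand-in for an expensive, lazily evaluated data source.
function loadRows(): Generator
{
    yield ['id' => 1];
    yield ['id' => 2];
}

// php://memory is used here so the result can be inspected;
// an export would typically target php://output instead.
$stream = fopen('php://memory', 'w+');
streamJsonArray(loadRows(), $stream);
rewind($stream);
$json = stream_get_contents($stream);
fclose($stream);
```

As noted in the list above, this direction is inherently one-way: a JSON array can be read back into an array, but not into the original generator.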

Error deserializing whole-number float values

Imagine the following class:

class Foo {
    public float $floatNumber;
}

This works just fine:

$foo = new Foo();
$foo->floatNumber = 1234.5;
$json = $this->serde->serialize($foo, 'json'); // Value reads {"floatNumber":1234.5}
$bar = $this->serde->deserialize($json, 'json', Foo::class);

However, using a (valid) whole-number float causes the Deserializer to throw an exception:

$foo = new Foo();
$foo->floatNumber = 1234.0;
$json = $this->serde->serialize($foo, 'json'); // Value reads {"floatNumber":1234}
$bar = $this->serde->deserialize($json, 'json', Foo::class);

Crell\Serde\TypeMismatch
Expected value of type float when writing to property floatNumber, but found type int.
vendor\crell\serde\src\TypeMismatch.php line 16
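The behaviour can be reproduced with PHP's JSON functions alone, which suggests where the int comes from. The sketch below is plain PHP and does not show Serde's internals: by default json_encode() drops the zero fraction, json_decode() then yields an int, and the JSON_PRESERVE_ZERO_FRACTION flag keeps the value a float. The coerceToFloat() function is a hypothetical example of a lenient int-to-float conversion on the reading side.

```php
<?php
declare(strict_types=1);

// Default encoding drops the ".0", so the value comes back as an int.
$encoded = json_encode(['floatNumber' => 1234.0]);
$decoded = json_decode($encoded, true);
$decodedIsInt = is_int($decoded['floatNumber']);

// With the flag, the fraction is kept and the value round-trips as a float.
$preserved = json_encode(['floatNumber' => 1234.0], JSON_PRESERVE_ZERO_FRACTION);
$roundTripped = json_decode($preserved, true);
$roundTrippedIsFloat = is_float($roundTripped['floatNumber']);

/** Hypothetical helper: accepts an int where a float is expected. */
function coerceToFloat(int|float $value): float
{
    return (float) $value;
}

$coerced = coerceToFloat($decoded['floatNumber']);
```

Widening int to float on read matches what PHP itself does for float parameters, even under strict_types, so it is arguably the less surprising behaviour for a float property.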
