Comments (8)
Hi @azjezz
I've started working on this and have a first draft. Can you verify it?
Index: src/Psl/Internal/Loader.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- src/Psl/Internal/Loader.php (revision 2336e5624e3c6a8f89ea899a35b15a66385e6bac)
+++ src/Psl/Internal/Loader.php (date 1602170341000)
@@ -190,6 +190,7 @@
'Psl\SecureRandom\string',
'Psl\PseudoRandom\float',
'Psl\PseudoRandom\int',
+ 'Psl\RegExp\filter_array',
'Psl\Str\Byte\capitalize',
'Psl\Str\Byte\capitalize_words',
'Psl\Str\Byte\chr',
Index: src/Psl/RegExp/filter_array.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- src/Psl/RegExp/filter_array.php (date 1602170593000)
+++ src/Psl/RegExp/filter_array.php (date 1602170593000)
@@ -0,0 +1,18 @@
+<?php
+
+declare(strict_types=1);
+
+namespace Psl\RegExp;
+
+/**
+ * Perform a regular expression search and replace, returning only the subjects that matched.
+ */
+function filter_array(
+    array $pattern,
+    array $replacement,
+    array $subject,
+    int $limit = -1,
+    ?int &$count = null
+): array {
+    return preg_filter($pattern, $replacement, $subject, $limit, $count);
+}
Index: tests/Psl/FilterArrayTest.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- tests/Psl/FilterArrayTest.php (date 1602170385000)
+++ tests/Psl/FilterArrayTest.php (date 1602170385000)
@@ -0,0 +1,31 @@
+<?php
+
+declare(strict_types=1);
+
+namespace Psl\Tests;
+
+use PHPUnit\Framework\TestCase;
+
+use function Psl\RegExp\filter_array;
+
+class FilterArrayTest extends TestCase
+{
+    public function testFilterArray()
+    {
+        $subject = ['1', 'a', '2', 'b', '3', 'A', 'B', '4'];
+        $pattern = ['/\d/', '/[a-z]/', '/[1a]/'];
+        $replace = ['A:$0', 'B:$0', 'C:$0'];
+
+        self::assertSame(
+            [
+                0 => 'A:C:1',
+                1 => 'B:C:a',
+                2 => 'A:2',
+                3 => 'B:b',
+                4 => 'A:3',
+                7 => 'A:4',
+            ],
+            filter_array($pattern, $replace, $subject)
+        );
+    }
+}
from psl.
I think a psalm-plugin is a good idea. We can have it as a separate package and add it to the suggestions (we don't want to install a pile of packages that might not be used if someone is not even using Psl\Regex).
The plugin can take care of other things besides regex, such as enforcing Type\string()->coerce($var) instead of (string) $var, and forbidding the use of PHP builtin functions that PSL replaces. 🤔
As you said, people would have to add a type hint themselves, either for the pattern or for the result, so Pattern doesn't really add any value :/
from psl.
Hey @ddziaduch, thank you for picking this up! I suggest you open a pull request so I can do a line-by-line review. One thing to note about the Regex API is that we want it to be type-safe; while it's "impossible" to do so 100%, we should try. Also, as per PSL rules, references (int &$count) must not be used.
You can take a look at the HSL implementation ( https://docs.hhvm.com/search?term=HH%5CLib%5CRegex ), but we can't actually reimplement the same API as HSL, as Hack has a special generic type for regex patterns (Pattern<T>), whereas we will be using strings.
As a starter, I would suggest you begin with replace, split, and matches, as these are the most commonly used regex functions and won't give us any trouble with types.
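To illustrate the constraints above, here is a minimal sketch of what a reference-free replace could look like. The namespace, parameter order, and exception type are assumptions made for illustration only and are not the actual PSL API. The point is just that the function returns a plain string and throws on failure, instead of exposing a by-reference $count or a nullable result.

```php
<?php

declare(strict_types=1);

namespace Example\Regex; // hypothetical namespace, not the real PSL one

use RuntimeException;

/**
 * Hypothetical sketch: replace matches of $pattern in $subject.
 *
 * There is no by-reference $count parameter, and failures surface as
 * exceptions instead of a null return value.
 *
 * @throws RuntimeException if the pattern is invalid or PCRE fails at runtime.
 */
function replace(string $subject, string $pattern, string $replacement, int $limit = -1): string
{
    // "@" silences the warning PHP emits for an invalid pattern;
    // the null return value is checked right below instead.
    $result = @preg_replace($pattern, $replacement, $subject, $limit);
    if (null === $result) {
        throw new RuntimeException('Regex replace failed, PCRE error code: ' . preg_last_error());
    }

    return $result;
}
```

With a string subject, preg_replace only ever returns a string or null, so the declared string return type holds once the null case has been turned into an exception.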
from psl.
Thanks for the feedback @azjezz :)
from psl.
Besides enhancing the types, I also find it important that the regex functions deal with PCRE errors (invalid regexes).
There is some good stuff in here: https://github.com/spatie/regex - even though I don't always like the API of the package.
More info about dealing with PCRE errors:
https://github.com/spatie/regex/blob/master/src/RegexResult.php#L9-L16
@azjezz : I like the Pattern<T>. Can't we provide a class for that, with a simple wrapper function for converting strings into patterns?
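As a rough sketch of the PCRE error handling mentioned above (not taken from spatie/regex or PSL; pcre_error_name and matches are hypothetical names), a wrapper can check for a false return value and turn preg_last_error() into an exception:

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: translate a PCRE error code into its constant name.
 */
function pcre_error_name(int $code): string
{
    $names = [
        PREG_NO_ERROR => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
        PREG_JIT_STACKLIMIT_ERROR => 'PREG_JIT_STACKLIMIT_ERROR',
    ];

    return $names[$code] ?? 'unknown PCRE error (' . $code . ')';
}

/**
 * Hypothetical wrapper: true on match, false on no match,
 * and an exception when PCRE itself fails.
 *
 * @throws RuntimeException if the pattern is invalid or PCRE fails at runtime.
 */
function matches(string $subject, string $pattern): bool
{
    // "@" hides the warning for invalid patterns; the failure is reported
    // through the exception below instead.
    $result = @preg_match($pattern, $subject);
    if (false === $result) {
        throw new RuntimeException(pcre_error_name(preg_last_error()));
    }

    return 1 === $result;
}
```

For example, matching the invalid UTF-8 byte "\xff" against '/./u' makes preg_match return false, so the sketch raises an exception naming PREG_BAD_UTF8_ERROR rather than silently reporting "no match".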
from psl.
Mm, no. Pattern<T> in Hack is not a class; it's a subtype of string that can be constructed by using re"", e.g. $pattern = re"#^(HTTP/)?(?P<version>[1-9]\d*(?:\.\d)?)$#"; is of type Pattern<shape('version' => string, ... )>. At runtime, it is just a string.
from psl.
Would it make sense to make a class for it?
For static analysis, it is an issue to validate what is inside a pattern in order to tell what e.g. the matches array will look like:
If you add an additional pattern class, you can make psalm type-safe:
Downside: you still need to type the pattern in a very manual way in your applications:
/** @var Pattern<array{ 0: string, 1: string, word: string, 2: string}> $pattern */
$pattern = new Pattern('/([H])(?P<word>ello)/');
Meaning you could drop the pattern class and might as well type the resulting array-shape instead.
So maybe it is better to provide a psalm plugin that parses regexes in order to automatically detect the shape of the matching items instead of adding a Pattern class in here ... dunno :)
BTW: match is reserved in PHP 8 🤦
from psl.
I found these snippets:
https://psalm.dev/r/ab3913bfdf
<?php
$matches = [];
$a = preg_match('/([H])(?P<word>ello)/', "Hello", $matches);
var_dump($matches[0], $matches['word']);
Psalm output (using commit 7195275):
INFO: PossiblyUndefinedIntArrayOffset - 7:10 - Possibly undefined array offset 'int(0)' is risky given expected type 'array-key'. Consider using isset beforehand.
INFO: PossiblyUndefinedStringArrayOffset - 7:23 - Possibly undefined array offset 'string(word)' is risky given expected type 'array-key'. Consider using isset beforehand.
ERROR: ForbiddenCode - 7:1 - Unsafe var_dump
INFO: UnusedVariable - 5:1 - Variable $a is never referenced
https://psalm.dev/r/b91bfe654b
<?php

/**
 * @template MatchingShape
 */
class Pattern
{
    private string $pattern;

    public function __construct(string $pattern)
    {
        $this->pattern = $pattern;
    }

    public function toString(): string
    {
        return $this->pattern;
    }
}

/**
 * @template Shape of array
 *
 * @param Pattern<Shape> $pattern
 * @param string $subject
 *
 * @return Shape
 */
function regex_match(Pattern $pattern, string $subject): array
{
    $matches = [];
    if (false === preg_match($pattern->toString(), $subject, $matches)) {
        throw new \RuntimeException('invalid pattern ... include pcre error info here');
    }

    /** @var Shape $matches */
    return $matches;
}

/** @var Pattern<array{ 0: string, 1: string, word: string, 2: string}> $pattern */
$pattern = new Pattern('/([H])(?P<word>ello)/');
$result = regex_match($pattern, 'Hello');

var_dump(
    $result[1],
    $result['word']
);
Psalm output (using commit 7195275):
ERROR: ForbiddenCode - 47:1 - Unsafe var_dump
from psl.