Hello!
I was using the regex() generator and ran into a consistent crash. It happens whenever the regex pattern includes an unbounded quantifier like * or +. It took a bit of digging to figure out what was going on.
1. The problem is in a dependency.
The issue comes from the icomefromthenet/reverse-regex library. It has a SimpleRandom class that's used to generate numbers. This class has a hardcoded limit and throws an exception if it's asked to generate a number larger than about 2.7 million.
When reverse-regex sees a * in a pattern, it sets the "max repeats" to a huge number. This huge number gets passed to SimpleRandom, which then immediately crashes.
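To make the failure mode concrete, here is a minimal, self-contained sketch of the mechanism as I understand it. It does not use the library's code: BoundedRandomSketch, the LIMIT value, and the use of PHP_INT_MAX to stand for "unbounded" are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for a generator with a hardcoded ceiling.
// This is NOT the library's actual SimpleRandom implementation.
final class BoundedRandomSketch
{
    // Assumed ceiling, chosen to match the "about 2.7 million" figure above.
    private const LIMIT = 2700000;

    public function generate(int $min, int $max): int
    {
        if ($max > self::LIMIT) {
            // Any request above the ceiling is rejected outright.
            throw new RangeException("max $max exceeds limit " . self::LIMIT);
        }
        return mt_rand($min, $max);
    }
}

// Assumption: an unbounded quantifier (* or +) is represented by a very
// large "max repeats" value. PHP_INT_MAX is used here only as an example.
$unboundedRepeats = PHP_INT_MAX;

$gen = new BoundedRandomSketch();
$crashed = false;
try {
    $gen->generate(0, $unboundedRepeats);
} catch (RangeException $e) {
    $crashed = true;
}

// A bounded quantifier such as {0,5} stays under the ceiling and works.
$bounded = $gen->generate(0, 5);
```

In this sketch the unbounded case always throws, while the bounded case returns a value between 0 and 5, which matches the behaviour I see with regex().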
2. The fix seems simple: use another random number generator.
The interesting part is that reverse-regex already has another generator, MersenneRandom, that doesn't have this low limit.
The bug in Eris is that RegexGenerator.php is hardcoded to only use the broken one:
// in Eris\Generator\RegexGenerator.php
$gen = new SimpleRandom($rand->rand());
Just changing this to use the other generator should fix the crash and make the regex() helper work with * and + patterns.
// Suggested change
$gen = new MersenneRandom($rand->rand());
3. A better fix might be using the factory.
For a longer-term solution, reverse-regex also has a GeneratorFactory. If Eris used that, it wouldn't be tied to one specific implementation. You could even add an option to let us choose the generator, which would be pretty cool.
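As a rough illustration of what a selectable generator could look like, here is a small self-contained sketch. Every name in it (RandomSourceSketch, MtRandSourceSketch, RandomSourceFactorySketch, the 'mt' key) is hypothetical; it does not reproduce the real GeneratorFactory API from reverse-regex, which may look quite different.

```php
<?php
// Hypothetical interface; the library's real generator interface may differ.
interface RandomSourceSketch
{
    public function generate(int $min, int $max): int;
}

// Example implementation backed by PHP's native mt_rand().
final class MtRandSourceSketch implements RandomSourceSketch
{
    public function __construct(int $seed)
    {
        // Note: mt_srand() seeds PHP's global Mersenne Twister state.
        mt_srand($seed);
    }

    public function generate(int $min, int $max): int
    {
        return mt_rand($min, $max);
    }
}

// Hypothetical factory: maps a name to a builder that takes a seed.
final class RandomSourceFactorySketch
{
    /** @var array<string, callable> */
    private array $builders = [];

    public function register(string $name, callable $builder): void
    {
        $this->builders[$name] = $builder;
    }

    public function create(string $name, int $seed): RandomSourceSketch
    {
        if (!isset($this->builders[$name])) {
            throw new InvalidArgumentException("Unknown generator: $name");
        }
        return ($this->builders[$name])($seed);
    }
}

$factory = new RandomSourceFactorySketch();
$factory->register('mt', fn (int $seed) => new MtRandSourceSketch($seed));

// The caller picks the implementation by name instead of hardcoding a class.
$source = $factory->create('mt', 42);
$value = $source->generate(0, 10);
```

With something along these lines, RegexGenerator would only depend on the interface, and the choice of generator could be exposed as an option.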
4. That dependency is also a bit old.
By the way, icomefromthenet/reverse-regex seems to be unmaintained and doesn't support modern PHP versions.
Some people have forked it to get it working on newer PHP releases. For example, these pull requests look promising:
https://github.com/icomefromthenet/ReverseRegex/pull/18
https://github.com/icomefromthenet/ReverseRegex/pull/15
A nice side effect of these modernized forks is that they use PHP's native mt_rand() instead of the library's own pure-PHP implementation, which would probably give a decent speed-up.
The regex() generator is a really cool and useful feature. It would be awesome to keep it, but right now things are a bit brittle. Making it more solid would be a big help.