Originally posted by @lkorczowski in https://github.com/_render_node/MDI0OlB1bGxSZXF1ZXN0UmV2aWV3Q29tbWVudDM3NTc2NTA3Mw==/comments/review_comment
This issue was created to preserve an important discussion about how intervals should be sampled.
About using np.linspace instead of np.arange:
While I considered and tested that solution, it is unfortunately not exactly equivalent:
- np.linspace splits the interval into equal lengths in float, with the number of points computed as e.g. int(max_dropout_fraction_n / step_sizes_n[0]), which often results in integer intervals of different lengths. np.arange guarantees that every interval has strictly the same length (at the cost of sometimes not sampling the whole interval, which is often OK).
Example:
If I want to sample the interval [0,10] into chunks of 4 samples, expected: [0-3], [4-7]
np.round(np.arange(0,10.5,4)).astype(int)
array([0, 4, 8])
correct answer
np.linspace(0,10,round(10/4)).astype(int)
array([ 0, 10])
wrong due to the "round half to even" strategy: round(10/4) -> 2
Even when doing
np.linspace(0,10,3)
array([ 0., 5., 10.])
the result is wrong and the intervals can be uneven. There is simply no way to directly sample the [0,10] interval every 4 samples with np.linspace without first doing the tedious calculation of a subinterval (here [0,8[ ).
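To make the comparison concrete, here is a minimal sketch of the three cases on the same [0,10] interval with a step of 4. The variable names (`idx_arange`, `idx_linspace`, `idx_sub`) and the choice of 4 points for the linspace case are only illustrative assumptions, not code from the pull request.

```python
import numpy as np

start, stop, step = 0, 10, 4

# np.arange: consecutive indices are always exactly `step` apart.
idx_arange = np.arange(start, stop, step)
gaps_arange = np.diff(idx_arange)

# np.linspace over the full interval, cast to int: the gaps differ.
idx_linspace = np.linspace(start, stop, 4).astype(int)
gaps_linspace = np.diff(idx_linspace)

# Workaround: shrink the interval to the last multiple of `step` first.
n_steps = (stop - start) // step   # number of full steps that fit
last = start + n_steps * step      # end of the subinterval, here 8
idx_sub = np.linspace(start, last, n_steps + 1).astype(int)

print(idx_arange, gaps_arange)      # [0 4 8] [4 4]
print(idx_linspace, gaps_linspace)  # [ 0  3  6 10] [3 3 4]
print(idx_sub)                      # [0 4 8]
```

The subinterval workaround does reproduce the np.arange result, but only after computing by hand what np.arange already does from the step size.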
Conclusion:
- We should NOT use np.linspace to sample intervals; we should use np.arange instead if we want an even index distribution.
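As an illustration of that conclusion, the expected chunks from the example ([0-3], [4-7]) could be produced with np.arange roughly as follows. `chunk_bounds` is a hypothetical helper written for this issue, not an existing function of the project, and it assumes an inclusive interval and that an incomplete tail is simply dropped.

```python
import numpy as np

def chunk_bounds(start, stop, size):
    """Return (first, last) index pairs for full chunks of `size` samples
    inside the inclusive interval [start, stop]; a short tail is dropped."""
    # A chunk starting at s covers s .. s + size - 1,
    # so the last valid start is stop - size + 1.
    firsts = np.arange(start, stop - size + 2, size)
    return [(int(s), int(s) + size - 1) for s in firsts]

print(chunk_bounds(0, 10, 4))  # [(0, 3), (4, 7)]
```

Every chunk has exactly `size` samples; indices 8 to 10 are left unsampled, which is the trade-off mentioned above.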