Sampling intervals: np.linspace VS np.arange #22

@lkorczowski

Originally posted by @lkorczowski in https://github.com/_render_node/MDI0OlB1bGxSZXF1ZXN0UmV2aWV3Q29tbWVudDM3NTc2NTA3Mw==/comments/review_comment

This issue was created to keep track of an important discussion about how intervals should be sampled.

About using np.linspace instead of np.arange:
I considered and tested that solution, but unfortunately the two are not equivalent:

  • np.linspace splits the interval into equal lengths in float, i.e. int(max_dropout_fraction_n / step_sizes_n[0]), which often yields integer intervals of different lengths once the values are cast to int. np.arange guarantees that every interval has strictly the same length (at the cost of sometimes not sampling the whole interval, which is often acceptable).
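The difference described above can be checked with a minimal sketch. The values `start`, `stop` and `n_points` are hypothetical and chosen only for illustration; they are not taken from the code under review.

```python
import numpy as np

# Hypothetical values, for illustration only.
start, stop, n_points = 0, 10, 4

# linspace: equal spacing in float, then truncated to integer indices.
lin_idx = np.linspace(start, stop, n_points).astype(int)
lin_steps = np.diff(lin_idx)  # spacing between consecutive integer indices

# arange: fixed integer step; the end of the interval may stay unsampled.
ara_idx = np.arange(start, stop, 3)
ara_steps = np.diff(ara_idx)

print(lin_idx, lin_steps)
print(ara_idx, ara_steps)
```

With these values the linspace indices end up unevenly spaced after the cast, while the arange indices keep a constant step but stop short of 10.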

Example:
If I want to sample the interval [0,10] into chunks of 4 samples, expected: [0-3], [4-7]
np.round(np.arange(0,10.5,4)).astype(int)

array([0, 4, 8])

correct answer

np.linspace(0,10,round(10/4)).astype(int)

array([ 0, 10])

wrong, because Python's round uses the "round half to even" strategy: round(10/4) -> 2

Even when doing
np.linspace(0,10,3)

array([ 0., 5., 10.])

which is wrong (the step is 5 instead of 4) and can produce uneven intervals. There is simply no way to directly sample the [0,10] interval every 4 samples with np.linspace without first doing a tedious computation of a subinterval (here [0,8[ ).
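For reference, here is a sketch of the extra subinterval computation that np.linspace would need in order to match np.arange. It assumes integer bounds and an integer step; the variable names are hypothetical.

```python
import numpy as np

# Assumed integer bounds and step (hypothetical names).
start, stop, step = 0, 10, 4

# Number of full steps that fit, and the end of the matching subinterval.
n_steps = (stop - start) // step
sub_stop = start + n_steps * step

# linspace only gives a step of exactly `step` on the reduced interval.
lin_idx = np.rint(np.linspace(start, sub_stop, n_steps + 1)).astype(int)

# arange needs no such preparation.
ara_idx = np.arange(start, stop + 1, step)

print(sub_stop, lin_idx, ara_idx)
```

Both calls return the same indices here, but only np.arange does so without the intermediate bookkeeping.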

Conclusion:

  • We should NOT use np.linspace to sample intervals; use np.arange instead if we want evenly spaced indices.
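As a possible way to apply this conclusion, the sketch below builds the expected chunks ([0-3], [4-7]) from np.arange. `fixed_length_chunks` is a hypothetical helper written for this comment, not an existing function of the project, and it assumes integer, inclusive bounds.

```python
import numpy as np

def fixed_length_chunks(start, stop, step):
    """Hypothetical helper: inclusive (first, last) index pairs of length `step`.

    Trailing samples that do not fill a whole chunk are left out.
    """
    # A chunk starting at s is complete only if s + step - 1 <= stop.
    starts = np.arange(start, stop - step + 2, step)
    return [(int(s), int(s + step - 1)) for s in starts]

print(fixed_length_chunks(0, 10, 4))
```

For the [0,10] interval with chunks of 4 samples, indices 8 to 10 are dropped because they do not form a complete chunk.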

Labels: wontfix (This will not be worked on)
