Bug in construct_W.py #58

@mary-design-testing

Description

Hi there,

I'm trying to construct the W weight matrix to use with lap_score on the following simple dataset: employes-region.txt. I tried the following code, which is provided as an example in the file test_lap_score.py:

    kwargs_W = {"metric": "euclidean", "neighbor_mode": "knn", "weight_mode": "heat_kernel", "k": 5, "t": 1}
    W = construct_W.construct_W(X, **kwargs_W)

Unfortunately, it fails with the following exception at line 152 of file construct_W.py:

could not broadcast input array from shape (25) into shape (30) 

I've gone through the code, and I think the problem is that the dimensions of G are wrong. This is the piece of code involved in the exception:

            t = kwargs['t']
            # compute pairwise euclidean distances
            D = pairwise_distances(X)
            D **= 2
            # sort the distance matrix D in ascending order
            dump = np.sort(D, axis=1)
            idx = np.argsort(D, axis=1)  #  *** 1
            idx_new = idx[:, 0:k+1]  #  *** 2
            dump_new = dump[:, 0:k+1] #  *** 2
            # compute the pairwise heat kernel distances
            dump_heat_kernel = np.exp(-dump_new/(2*t*t))
            G = np.zeros((n_samples*(k+1), 3)) #  *** 2
            G[:, 0] = np.tile(np.arange(n_samples), (k+1, 1)).reshape(-1) #  *** 2
            G[:, 1] = np.ravel(idx_new, order='F') # *** EXCEPTION HERE!!
            G[:, 2] = np.ravel(dump_heat_kernel, order='F')
            # build the sparse affinity matrix W
            W = csc_matrix((G[:, 2], (G[:, 0], G[:, 1])), shape=(n_samples, n_samples))
            bigger = np.transpose(W) > W
            W = W - W.multiply(bigger) + np.transpose(W).multiply(bigger)
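
For what it's worth, here is the shape bookkeeping of the excerpt above as self-contained NumPy (the random X and the sizes n_samples = 8, k = 5 are my assumptions, just to make the sketch runnable without the dataset):

```python
import numpy as np

n_samples, k, t = 8, 5, 1
rng = np.random.default_rng(0)
X = rng.standard_normal((n_samples, 4))

# squared pairwise euclidean distances, as in construct_W.py
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

dump = np.sort(D, axis=1)
idx = np.argsort(D, axis=1)

# each sample plus its k nearest neighbours -> k + 1 columns
idx_new = idx[:, 0:k + 1]          # shape (n_samples, k + 1)
dump_new = dump[:, 0:k + 1]
dump_heat_kernel = np.exp(-dump_new / (2 * t * t))

G = np.zeros((n_samples * (k + 1), 3))
G[:, 0] = np.tile(np.arange(n_samples), (k + 1, 1)).reshape(-1)
G[:, 1] = np.ravel(idx_new, order='F')
G[:, 2] = np.ravel(dump_heat_kernel, order='F')
print(G.shape)  # (48, 3)
```

With n_samples > k the slice idx[:, 0:k+1] really has k + 1 columns and every assignment into G fits. If n_samples were smaller than k + 1, the slice would be silently truncated to n_samples columns, and the assignment into G[:, 1] would fail with exactly this kind of broadcast error.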

I think that there's a problem at the line marked *** 1. Should it compute idx using dump? I mean:

            idx = np.argsort(dump, axis=1)  #  *** 1

And the other problem is at the lines marked *** 2. Shouldn't they use k as the multiplier instead of k+1? That is:

            idx_new = idx[:, 0:k]  #  *** 2
            dump_new = dump[:, 0:k] #  *** 2
            # compute the pairwise heat kernel distances
            dump_heat_kernel = np.exp(-dump_new/(2*t*t))
            G = np.zeros((n_samples*(k), 3)) #  *** 2
            G[:, 0] = np.tile(np.arange(n_samples), (k, 1)).reshape(-1) #  *** 2
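
As a quick size check of the *** 2 part of this proposal (this sketch only verifies that the array sizes line up with k columns, using assumed sizes n_samples = 8, k = 5; it doesn't exercise the *** 1 change):

```python
import numpy as np

n_samples, k = 8, 5

# with k columns, both sides of each assignment into G have n_samples * k entries
rows = np.tile(np.arange(n_samples), (k, 1)).reshape(-1)
idx_new = np.zeros((n_samples, k), dtype=int)   # stands in for idx[:, 0:k]
cols = np.ravel(idx_new, order='F')

print(rows.size, cols.size)  # 40 40
```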

I've fixed my local installation with this patch and run it on a large collection of 200+ datasets. It works correctly now.

I've seen that there are many other lines where a similar patch might apply, but I haven't tried other configuration options.

Thanks! Regards
