Description
Hi there, I'm currently benchmarking my BNN with LCE. Following the instructions for Larq and LCE, I first implemented my BNN with Keras and converted it to `.tflite` with LCE. However, the actual size of my `.tflite` is much bigger than the theoretical size reported by `lq.models.summary()`. I found that the extra size (about 60K) is introduced by a parameter-free custom Keras layer in my model, shown below. (This custom layer is used as a substitute for sparse-dense matrix multiplication.)
```python
class MyLayer(Layer):
    def __init__(self, **kwargs):
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.built = True

    def call(self, inputs, mask=None):
        data = inputs[0]    # A dense tensor of shape [N, D]
        idx = inputs[1]     # A dense tensor of shape [N, M]
        weight = inputs[2]  # A dense tensor of shape [N, M]
        idx_sparse = tf.sparse.from_dense(tf.cast(idx, dtype=tf.int32))
        weight_sparse = tf.sparse.from_dense(weight)
        output = tf.nn.embedding_lookup_sparse(data, idx_sparse, weight_sparse)
        return output

    def get_config(self):
        config = {}
        base_config = super(MyLayer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
```

This parameter-free custom layer adds ~60K to the `.tflite` size, which is unaffordable in my case, because the rest of my BNN adds only 16K (slightly bigger than the theoretical model size). So I want to reduce its size.
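To make the intended computation concrete, here is a NumPy stand-in for what the layer computes, using my own toy shapes and data and assuming `combiner="sum"` (note `tf.nn.embedding_lookup_sparse` defaults to `"mean"`, and `tf.sparse.from_dense` drops zero entries, which this sketch ignores). The dense gather-and-sum form at the end avoids `SparseTensor` ops entirely:

```python
import numpy as np

# Hypothetical shapes: N rows, M lookups per row, D features.
N, M, D = 4, 3, 5
rng = np.random.default_rng(0)
data = rng.standard_normal((N, D)).astype(np.float32)    # dense [N, D]
idx = rng.integers(0, N, size=(N, M))                    # dense [N, M] row indices
weight = rng.standard_normal((N, M)).astype(np.float32)  # dense [N, M]

# Dense equivalent of embedding_lookup_sparse(..., combiner="sum"):
# gather the looked-up rows, scale them by the weights, sum over M.
# In TF this would be tf.gather + multiply + tf.reduce_sum, which
# needs no sparse-tensor machinery in the converted graph.
gathered = data[idx]                                     # [N, M, D]
output = (gathered * weight[..., None]).sum(axis=1)      # [N, D]

# Reference: explicit per-row weighted sum.
ref = np.stack([
    sum(weight[n, m] * data[idx[n, m]] for m in range(M))
    for n in range(N)
])
assert np.allclose(output, ref, atol=1e-5)
```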
It seems that the extra size is introduced by the sparse tensor. I made a simple test to confirm this. Using the following `call()` function in place of the original one in `MyLayer`, the overall size of the `.tflite` is 17K, meaning `MyLayer` adds only ~1K. (This version of `MyLayer` contains just two matmul operators.)
```python
def call(self, inputs, mask=None):
    data = inputs[0]  # A dense tensor of shape [N, D]
    idx = inputs[1]   # A dense tensor of shape [N, M]
    output = K.dot(K.dot(idx, tf.transpose(idx)), data)
    return output
```

Then, when a sparse tensor is involved, as in the following case, I just convert a dense tensor to a sparse tensor and then convert it back to a dense one. The overall size of the `.tflite` becomes 82K!
```python
def call(self, inputs, mask=None):
    data = inputs[0]  # A dense tensor of shape [N, D]
    idx = inputs[1]   # A dense tensor of shape [N, M]
    idx_sparse = tf.sparse.from_dense(idx)
    idx = tf.sparse.to_dense(idx_sparse)
    output = K.dot(K.dot(idx, tf.transpose(idx)), data)
    return output
```

So I'm wondering why the sparse tensor introduces so much extra `.tflite` size, and how I can reduce it and implement the sparse-dense matrix multiplication operator. Can you give me some hints?
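For reference, here is a small NumPy sketch (my own toy matrix, emulating the sparse round trip by hand) of why the `from_dense` → `to_dense` pair is a value-preserving no-op: the ~65K difference must come from the extra ops the conversion adds to the graph, not from any change in the computed result.

```python
import numpy as np

idx = np.array([[1., 0., 2.],
                [0., 3., 0.]], dtype=np.float32)

# Emulate tf.sparse.from_dense / tf.sparse.to_dense: record the
# (position, value) pairs of the nonzero entries, then scatter them
# back into a zero tensor of the same shape.
positions = np.argwhere(idx != 0)
values = idx[idx != 0]
dense_again = np.zeros_like(idx)
dense_again[tuple(positions.T)] = values

assert np.array_equal(idx, dense_again)  # round trip reproduces the input
```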
Thank you.