Reproduce the quality result #2
Description
Hi,
I trained this model following the settings in your paper (batch size 32, BSDS dataset, 500 epochs, the learning-rate decay, etc.), but I could not reproduce the MS-SSIM results reported in the paper. I then switched to a subset of the UCF101 dataset as the training set, which improved performance, but the MS-SSIM is still unsatisfying: for example, I get MS-SSIM 0.951 at about 0.44 bpp. The paper says that models at different bit rates are obtained by fine-tuning the final layer of the encoder, whereas I trained every model from scratch after modifying the number of channels in that final layer. Could this be the cause of the performance gap?
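For clarity, this is how I understood the fine-tuning procedure from the paper, as a minimal PyTorch sketch; the toy encoder, the layer index, the checkpoint name, and the learning rate are all my own placeholders, not your actual code:

```python
import torch
import torch.nn as nn

# Toy encoder standing in for the repo's actual model; only the structure matters here.
encoder = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 32, 3, stride=2, padding=1),  # final layer; its channel count sets the bit rate
)
# encoder.load_state_dict(torch.load('base_rate.pth'))  # hypothetical pretrained base-rate weights

# Swap in a final layer with a different channel count for the new bit rate,
# then train only that layer; all other weights keep their pretrained values.
encoder[2] = nn.Conv2d(64, 16, 3, stride=2, padding=1)
for name, param in encoder.named_parameters():
    param.requires_grad = name.startswith('2.')

optimizer = torch.optim.Adam(
    [p for p in encoder.parameters() if p.requires_grad], lr=1e-4)
```

In contrast, I reinitialized and retrained all layers from scratch for every bit rate.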
My second question concerns the compute_bpp function: you use the theoretical lower bound of the entropy to represent the code length, which is a reasonable estimate. However, for a fair comparison against traditional compression algorithms such as JPEG, which uses Huffman coding, I think we would need the real code length after Huffman coding to compute the bpp.
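To illustrate the difference I mean, here is a small self-contained Python sketch (not the repo's code; entropy_bits and huffman_bits are names I made up) comparing the entropy lower bound against the actual Huffman code length on a toy stream of quantized codes:

```python
import heapq
import math
from collections import Counter

def entropy_bits(symbols):
    """Theoretical lower bound: total Shannon entropy of the empirical distribution."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())

def huffman_bits(symbols):
    """Actual total code length after building a Huffman code for the symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return len(symbols)  # degenerate case: a single symbol still costs 1 bit each
    # heap items: (count, tiebreak, {symbol: code_length_so_far})
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        # merging two subtrees adds one bit to every symbol inside them
        merged = {s: length + 1 for s, length in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, tiebreak, merged))
        tiebreak += 1
    code_lengths = heap[0][2]
    return sum(counts[s] * code_lengths[s] for s in counts)

codes = [0, 0, 0, 1, 1, 2, 3, 3, 3, 3]  # toy quantized-code stream
h, w = 2, 5                              # pretend spatial size
print(entropy_bits(codes) / (h * w))     # entropy-based bpp (lower bound)
print(huffman_bits(codes) / (h * w))     # Huffman bpp (always >= the bound)
```

Since the Huffman code length is always at least the entropy, the reported bpp is slightly optimistic compared with what a real JPEG-style bitstream would cost.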
A further question is about PSNR, which is not reported in your paper. In "Lossy Image Compression with Compressive Autoencoders", the trained model reaches about 35 dB PSNR at 1 bpp, while my trained model only reaches 30.6 dB at a similar bit rate, which is a large gap. PSNR certainly has its limitations as an evaluation metric, but it is still an important aspect of evaluating a compression algorithm. Could you share the PSNR results of your trained model? Having built and trained several image compression models, I have found it really hard to improve PSNR, and I would like to understand why.
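For concreteness, this is the PSNR definition I am using for the numbers above (a minimal NumPy sketch; the psnr helper is my own, not from the repo):

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    """PSNR in dB: 10 * log10(MAX^2 / MSE), for 8-bit images MAX = 255."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

By this definition, the 4.4 dB gap between 30.6 dB and 35 dB corresponds to roughly 2.75x higher mean squared error (10^(4.4/10) ≈ 2.75), which is why I consider it significant.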
Looking forward to your reply!
Gong