Applied and Computational Engineering

- The Open Access Proceedings Series for Conferences


Proceedings of the 4th International Conference on Computing and Data Science (CONF-CDS 2022)

Series Vol. 2 , 22 March 2023


Open Access | Article

Precise Human Removal and Inpainting Using Mask RCNN and LaMa

Xiangzhi Wang 1
1 The Hong Kong Polytechnic University, Department of Computing, HKSAR

* Author to whom correspondence should be addressed.

Applied and Computational Engineering, Vol. 2, 180-199
Published 22 March 2023. © 2023 The Author(s). Published by EWA Publishing
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Citation Xiangzhi Wang. Precise Human Removal and Inpainting Using Mask RCNN and LaMa. ACE (2023) Vol. 2: 180-199. DOI: 10.54254/2755-2721/2/20220668.

Abstract

Photographs often contain people who should not appear in them, and capturing an unobstructed scene is frequently impossible. In post-processing, unwanted people can be removed from the picture, provided the coherence and naturalness of the remaining objects and background are preserved. We propose a human removal method that combines image instance segmentation with image inpainting. First, an image is passed to an instance segmentation algorithm, which produces a mask covering the unwanted regions. Next, the mask is dilated to expand its coverage slightly beyond the segmented boundary. Finally, an inpainting algorithm takes the image and the processed mask and produces a natural-looking result with the person removed.
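The mask-dilation step described above can be sketched in isolation. The following is a minimal pure-NumPy illustration, not the paper's implementation: it assumes a binary mask (True over the detected person) coming from the segmentation stage, and applies repeated binary dilation with a 3×3 square structuring element, as a library such as OpenCV (`cv2.dilate`) would do in practice.

```python
import numpy as np

def dilate(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Binary dilation with a 3x3 square structuring element.

    Each iteration grows the True region by one pixel in every
    direction, so the inpainting mask extends slightly past the
    segmented person's boundary.
    """
    out = mask.astype(bool)
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        # OR together the eight neighbours plus the pixel itself.
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                grown |= padded[1 + dy : 1 + dy + out.shape[0],
                                1 + dx : 1 + dx + out.shape[1]]
        out = grown
    return out

# A single "person" pixel grows into a 3x3 blob after one iteration.
mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True
print(dilate(mask, iterations=1).sum())  # → 9
```

In the full pipeline, the dilated mask and the original image would then be handed to the inpainting model, which fills the masked region from the surrounding context.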

Keywords

image inpainting, image instance segmentation, photo enhancement, privacy protection


Data Availability

The datasets used and/or analyzed during the current study are available from the authors upon reasonable request.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Authors who publish this series agree to the following terms:

1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.

2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.

3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open Access Instruction).

Volume Title
Proceedings of the 4th International Conference on Computing and Data Science (CONF-CDS 2022)
ISBN (Print)
978-1-915371-19-5
ISBN (Online)
978-1-915371-20-1
Published Date
22 March 2023
Series
Applied and Computational Engineering
ISSN (Print)
2755-2721
ISSN (Online)
2755-273X
DOI
10.54254/2755-2721/2/20220668
Copyright
22 March 2023

Copyright © 2023 EWA Publishing, unless otherwise stated.