Jiasong Wu (伍家松)

Posted by: Jiasong Wu. Published: 2025-04-03.

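1. Multimodal Deep Learning
     As shown in Figure 1, this direction focuses on the "virtual anchor" (a virtual news presenter), which falls under Artificial Intelligence Generated Content (AIGC): AI technology with human-like generative and creative abilities, i.e. generative AI. Based on training data and generative models, it can autonomously create new content and data in many forms, including text, images, music, video, and 3D interactive content, and can also help open up new scientific discoveries and create new value and meaning.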

 

Figure 1. A multi-skill AI agent that receives, understands, and broadcasts Chinese news videos

 

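Figure 2 presents four specific research topics: multimodal speech separation, video generation jointly driven by audio and video, multimodal video captioning, and 3D speech synthesis.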


a) Speech -> Video


b) Video -> Speech

Figure 2. Research topics for the virtual anchor

Representative publications:

[1]       Zidong Liu, Jiasong Wu, Zeyu Shen, Xin Chen, Qianyu Wu, Zhiguo Gui, Lotfi Senhadji, Huazhong Shu. Improving End-to-end Sign Language Translation with Adaptive Video Representation Enhanced Transformer. IEEE Transactions on Circuits and Systems for Video Technology, 2024, doi: 10.1109/TCSVT.2024.3376404.

[2]       Jiasong Wu, Qingchun Li, Guanyu Yang, Lei Li, Lotfi Senhadji, Huazhong Shu. Self-supervised speech denoising using only noisy audio signals. Speech Communication, 2023, 149: 63-73.

[3]       Xize Wu, Jiasong Wu, Lei Zhu, Lotfi Senhadji, Huazhong Shu. Collaborative aware bidirectional semantic reasoning for video question answering. IEEE Transactions on Circuits and Systems for Video Technology, 2024. Major Revision.

[4]       Jiasong Wu, Xuan Li, Taotao Li, Fanman Meng, Youyong Kong, Guanyu Yang, Lotfi Senhadji, Huazhong Shu. CSLNSpeech: solving extended speech separation problem with the help of Chinese sign language. Speech Communication, 2024, 165: 103131.

[5]       Fanman Meng, Jiasong Wu, et al. SSWMNet: Solving The Problem Of Speech Separation While Wearing a Mask. https://github.com/fanmanqian/SSWMNetwork

 

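2. Combining Artificial Intelligence and Signal Processing

 This research direction seeks to bridge the fields of deep learning and signal processing. Specific topics include interpreting deep learning networks with signal processing methods; using time-frequency analysis methods from signal processing as building blocks for deep learning networks (Figure 3); and extending deep learning networks to other number domains.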


Figure 3. Fusion of the wavelet transform and Vision Transformer

Representative publications:

[1]       Fuzhi Wu, Jiasong Wu, Youyong Kong, Chunfeng Yang, Guanyu Yang, Huazhong Shu, Guy Carrault, Lotfi Senhadji. Wavelet-Based Dual-Task Network. IEEE Transactions on Neural Networks and Learning Systems. 2024, minor revision.

[2]       Fuzhi Wu, Jiasong Wu, Huazhong Shu, Guy Carrault, Lotfi Senhadji. Spatial-enhanced Multi-level Wavelet Patching in Vision Transformers. IEEE Signal Processing Letters, 2024, 31: 446-450.

[3]       Fuzhi Wu, Jiasong Wu, Youyong Kong, Chunfeng Yang, Guanyu Yang, Huazhong Shu, Guy Carrault, Lotfi Senhadji. Multiscale low-frequency memory network for improved feature extraction in convolutional neural networks. The 38th AAAI Conference on Artificial Intelligence (AAAI), Vancouver, Canada, 2024, 5967-5975.

[4]       F.Z. Wu#, J.S. Wu#, Y.Y. Kong, C.F. Yang, G.Y. Yang, H.Z. Shu*, G. Carrault, L. Senhadji. Convolutional modulation theory: A bridge between convolutional neural networks and signal modulation theory. Neurocomputing, 2022, 514: 195-215.

[5]       Jiasong Wu*, Xiang Qiu, Jing Zhang, Fuzhi Wu, Youyong Kong, Guanyu Yang, Lotfi Senhadji, Huazhong Shu. Fractional wavelet based generative scattering networks. Frontiers in Neurorobotics. 2021.

[6]       J.S. Wu*, L. Xu, F.Z. Wu, Y.Y. Kong, L. Senhadji, H.Z. Shu. Deep octonion networks. Neurocomputing, vol. 397, pp. 179-191, 2020.

[7]       L. Liu, J. S. Wu, D. W. Li, L. Senhadji, H. Z. Shu*. Fractional wavelet scattering network and applications. IEEE Transactions on Biomedical Engineering, vol. 66, no. 2, pp. 553-563, 2019.

[8]       J. S. Wu*, S. J. Qiu, Y. Y. Kong, L. Y. Jiang, Y. Chen, W. K. Yang, L. Senhadji, H. Z. Shu. PCANet: An energy perspective. Neurocomputing, vol. 313, pp. 271-287, 2018.

[9]       Zeng R, Wu J S*, Shao Z H, Chen Y, Senhadji L, Shu H Z. Color image classification via quaternion principal component analysis network. Neurocomputing, vol. 216, pp. 416-428, 2016.

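3. Development of Computer-Aided Disease Diagnosis and Treatment Systems
This direction focuses on the "virtual doctor" and the "virtual patient", and likewise belongs to the field of AI-generated content (AIGC).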

3.1 Development of a Computer-Aided Diagnosis and Treatment System for Liver Disease

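As shown in Figures 4 and 5, this work develops a multi-source intelligent decision system for liver disease and a system that generates and explains diagnosis and treatment plans for liver disease.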

Figure 4. Multi-source intelligent decision system for liver disease


Figure 5. System for generating and explaining liver disease diagnosis and treatment plans

 

Representative publications:

[1]       Yingyao Ma, Jiasong Wu, et al. Multimodal Entity Linking with Dynamic Modality Selection and Interactive Prompt Learning. IEEE TKDE, 2024. (to be submitted)

[2]       Yifan Xue, Yingyao Ma, Jiasong Wu, Lotfi Senhadji, Huazhong Shu, Jian Yang. OneForKG: A Unified and Effective Framework for Various Knowledge Graph Completion. IEEE TKDE, 2024. (to be submitted)

 

3.2 Development of a Computer-Aided Diagnosis and Treatment System for Dentomaxillofacial Deformities

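As shown in Figure 6, this work develops a computer-aided diagnosis and treatment system for dentomaxillofacial deformities.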

Figure 6. Computer-aided diagnosis and treatment system for dentomaxillofacial deformities

 

Representative publications:

[1]       Han Bao, Zhidong He, Jiasong Wu, John Baxter, Lotfi Senhadji, Hengjia Zhang, Shirin Shahrbaf, Sherif Elbarbary, Huazhong Shu, Luwei Liu, Bin Yan. Development and validation of the deep learning enhanced facial soft tissue network (FST-Net) for 3D landmarking. Progress in Orthodontics, 2024 (Submitted)

[2]       Han Bao, Zhidong He, Jiasong Wu, John Baxter, Lotfi Senhadji, Hengjia Zhang, Shirin Shahrbaf, Sherif Elbarbary, Huazhong Shu, Luwei Liu, Bin Yan. A New Automated 3D Facial Soft Tissue Landmarking Method via Deep Learning. Journal of Dental Research, 2024 (Submitted)

[3]     Zhidong He, Han Bao, Mingzhang Chen, Jiasong Wu, Luwei Liu, Lotfi Senhadji, Huazhong Shu, Bin Yan. FST-Net: Facial Soft Tissue Landmark Localization on 3dMD Scans Using Feature Fusion and Local Coordinate Regression. IEEE International Symposium on Biomedical Imaging (ISBI), 2024.

 

 


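  • Contact
  • Mailing address: School of Computer Science and Engineering, Jiulonghu Campus, Southeast University, 2 Southeast University Road, Jiangning District, Nanjing
  • Postal code: 211189
  • Office: Computer Science Building, Jiulonghu Campus, Southeast University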