Maintained by Difan Deng and Marius Lindauer.
The following list covers papers related to neural architecture search. It is by no means complete. If a paper is missing from the list, please let us know.
Please note that although NAS methods steadily improve, the quality of empirical evaluations in this field is still lagging behind that of other areas in machine learning, AI, and optimization. We would therefore like to share some best practices for empirical evaluations of NAS methods, which we believe will facilitate sustained and measurable progress in the field. If you are interested in a teaser, please read our blog post or jump directly to our checklist.
Transformers have gained increasing popularity across different domains. For a comprehensive list of papers focusing on Neural Architecture Search for Transformer-based search spaces, the awesome-transformer-search repo is all you need.
2022
Huynh, Lam; Rahtu, Esa; Matas, Jiri; Heikkilä, Janne
Fast Neural Architecture Search for Lightweight Dense Prediction Networks Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-01994,
title = {Fast Neural Architecture Search for Lightweight Dense Prediction Networks},
author = {Lam Huynh and Esa Rahtu and Jiri Matas and Janne Heikkilä},
url = {https://doi.org/10.48550/arXiv.2203.01994},
doi = {10.48550/arXiv.2203.01994},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.01994},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Lin, Ke; A, Yong; Gan, Zhuoxin; Jiang, Yingying
WPNAS: Neural Architecture Search by jointly using Weight Sharing and Predictor Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-02086,
title = {WPNAS: Neural Architecture Search by jointly using Weight Sharing and Predictor},
author = {Ke Lin and Yong A and Zhuoxin Gan and Yingying Jiang},
url = {https://doi.org/10.48550/arXiv.2203.02086},
doi = {10.48550/arXiv.2203.02086},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.02086},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Chen, Xuehui; Niu, Xin; Jiang, Jingfei; Pan, Hengyue; Dong, Peijie; Wei, Zimian
Influence of Initialization and Modularization on the Performance of Network Morphism-Based Neural Architecture Search Proceedings Article
In: Yao, Jian; Xiao, Yang; You, Peng; Sun, Guang (Ed.): The International Conference on Image, Vision and Intelligent Systems (ICIVIS 2021), pp. 875–887, Springer Singapore, Singapore, 2022, ISBN: 978-981-16-6963-7.
@inproceedings{10.1007/978-981-16-6963-7_77,
title = {Influence of Initialization and Modularization on the Performance of Network Morphism-Based Neural Architecture Search},
author = {Xuehui Chen and Xin Niu and Jingfei Jiang and Hengyue Pan and Peijie Dong and Zimian Wei},
editor = {Jian Yao and Yang Xiao and Peng You and Guang Sun},
url = {https://link.springer.com/chapter/10.1007/978-981-16-6963-7_77},
isbn = {978-981-16-6963-7},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {The International Conference on Image, Vision and Intelligent Systems (ICIVIS 2021)},
pages = {875--887},
publisher = {Springer Singapore},
address = {Singapore},
abstract = {Neural Architecture Search (NAS), the process of automatic network architecture design, has enabled remarkable progress over the last years on Computer Vision tasks. In this paper, we propose a novel and efficient NAS framework based on network morphism to further improve the performance of NAS algorithms. Firstly, we design four modular structures termed RBNC block, CBNR block, BNRC block and RCBN block which correspond to four initial neural network architectures and four modular network morphism methods. Each block is composed of a ReLU layer, a Batch-Norm layer and a convolutional layer. Then we introduce network morphism to correlate different modular structures for constructing network architectures. Moreover, we study the influence of different initial neural network architectures and modular network morphism methods on the performance of network morphism-based NAS algorithms through comparative experiments and ablation experiments. Finally, we find that the network morphism-based NAS algorithm that uses CBNR block for initialization and modularization is the best method to improve performance. Our proposed method achieves a test accuracy of 95.84% on CIFAR-10 with least parameters (only 2.72 M) and fewer search costs (2 GPU-days) for network architecture search.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
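The abstract above enumerates four modular layer orderings, each a permutation of ReLU, BatchNorm, and convolution (RBNC, CBNR, BNRC, RCBN). A minimal sketch of how such blocks can be generated from an ordering, using simple 1-D NumPy stand-ins for the layers — the smoothing kernel and parameter-free normalization are illustrative, not the paper's actual operators:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def batch_norm(x, eps=1e-5):
    # parameter-free normalization, standing in for a learned BatchNorm layer
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def conv(x):
    # fixed 3-tap smoothing kernel, standing in for a learned convolution
    return np.convolve(x, np.array([0.25, 0.5, 0.25]), mode="same")

LAYERS = {"R": relu, "BN": batch_norm, "C": conv}

# the four layer orderings named in the abstract (R=ReLU, BN=BatchNorm, C=Conv)
ORDERS = {
    "RBNC": ["R", "BN", "C"],
    "CBNR": ["C", "BN", "R"],
    "BNRC": ["BN", "R", "C"],
    "RCBN": ["R", "C", "BN"],
}

def make_block(name):
    """Compose the layers of one modular block in the given order."""
    def block(x):
        for token in ORDERS[name]:
            x = LAYERS[token](x)
        return x
    return block
```

Under this framing, the paper's search amounts to picking which of the four block factories to use for initialization and for each morphism step.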
Sun, Jialiang; Jiang, Tingsong; Li, Chao; Zhou, Weien; Zhang, Xiaoya; Yao, Wen; Chen, Xiaoqian
Searching for Robust Neural Architectures via Comprehensive and Reliable Evaluation Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-03128,
title = {Searching for Robust Neural Architectures via Comprehensive and Reliable Evaluation},
author = {Jialiang Sun and Tingsong Jiang and Chao Li and Weien Zhou and Xiaoya Zhang and Wen Yao and Xiaoqian Chen},
url = {https://doi.org/10.48550/arXiv.2203.03128},
doi = {10.48550/arXiv.2203.03128},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.03128},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Chebykin, Alexander; Alderliesten, Tanja; Bosman, Peter A. N.
Evolutionary Neural Cascade Search across Supernetworks Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-04011,
title = {Evolutionary Neural Cascade Search across Supernetworks},
author = {Alexander Chebykin and Tanja Alderliesten and Peter A. N. Bosman},
url = {https://doi.org/10.48550/arXiv.2203.04011},
doi = {10.48550/arXiv.2203.04011},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.04011},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Xiao, Yuhan; Sun, Shang; Liao, TaoLin
Parameter search-based scaling network for self-supervised depth Proceedings Article
In: Mohiddin, Md Khaja; Chen, Siting; EL-Zoghdy, Said Fathy (Ed.): Third International Conference on Electronics and Communication; Network and Computer Technology (ECNCT 2021), pp. 463–467, International Society for Optics and Photonics SPIE, 2022.
@inproceedings{10.1117/12.2629190,
title = {Parameter search-based scaling network for self-supervised depth},
author = {Yuhan Xiao and Shang Sun and TaoLin Liao},
editor = {Md Khaja Mohiddin and Siting Chen and Said Fathy EL-Zoghdy},
url = {https://doi.org/10.1117/12.2629190},
doi = {10.1117/12.2629190},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Third International Conference on Electronics and Communication; Network and Computer Technology (ECNCT 2021)},
volume = {12167},
pages = {463--467},
publisher = {SPIE},
organization = {International Society for Optics and Photonics},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Javaheripi, Mojan; Shah, Shital; Mukherjee, Subhabrata; Religa, Tomasz L.; Mendes, Caio C. T.; Rosa, Gustavo H.; Bubeck, Sébastien; Koushanfar, Farinaz; Dey, Debadeepta
LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-02094,
title = {LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models},
author = {Mojan Javaheripi and Shital Shah and Subhabrata Mukherjee and Tomasz L. Religa and Caio C. T. Mendes and Gustavo H. Rosa and Sébastien Bubeck and Farinaz Koushanfar and Debadeepta Dey},
url = {https://doi.org/10.48550/arXiv.2203.02094},
doi = {10.48550/arXiv.2203.02094},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.02094},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Lopes, Vasco; Alexandre, Luís A.
Towards Less Constrained Macro-Neural Architecture Search Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-05508,
title = {Towards Less Constrained Macro-Neural Architecture Search},
author = {Vasco Lopes and Luís A. Alexandre},
url = {https://doi.org/10.48550/arXiv.2203.05508},
doi = {10.48550/arXiv.2203.05508},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.05508},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Wu, Xixin; Hu, Shoukang; Wu, Zhiyong; Liu, Xunying; Meng, Helen
Neural Architecture Search for Speech Emotion Recognition Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-16928,
title = {Neural Architecture Search for Speech Emotion Recognition},
author = {Xixin Wu and Shoukang Hu and Zhiyong Wu and Xunying Liu and Helen Meng},
url = {https://doi.org/10.48550/arXiv.2203.16928},
doi = {10.48550/arXiv.2203.16928},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.16928},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Wei, Zimian; Pan, Hengyue; Niu, Xin; Dong, Peijie; Li, Dongsheng
UENAS: A Unified Evolution-based NAS Framework Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-04300,
title = {UENAS: A Unified Evolution-based NAS Framework},
author = {Zimian Wei and Hengyue Pan and Xin Niu and Peijie Dong and Dongsheng Li},
url = {https://doi.org/10.48550/arXiv.2203.04300},
doi = {10.48550/arXiv.2203.04300},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.04300},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Xiang, Tiange; Zhang, Chaoyi; Wang, Xinyi; Song, Yang; Liu, Dongnan; Huang, Heng; Cai, Weidong
Towards bi-directional skip connections in encoder-decoder architectures and beyond Journal Article
In: Medical Image Analysis, vol. 78, pp. 102420, 2022, ISSN: 1361-8415.
@article{XIANG2022102420,
title = {Towards bi-directional skip connections in encoder-decoder architectures and beyond},
author = {Tiange Xiang and Chaoyi Zhang and Xinyi Wang and Yang Song and Dongnan Liu and Heng Huang and Weidong Cai},
url = {https://www.sciencedirect.com/science/article/pii/S1361841522000718},
doi = {10.1016/j.media.2022.102420},
issn = {1361-8415},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Medical Image Analysis},
volume = {78},
pages = {102420},
abstract = {U-Net, as an encoder-decoder architecture with forward skip connections, has achieved promising results in various medical image analysis tasks. Many recent approaches have also extended U-Net with more complex building blocks, which typically increase the number of network parameters considerably. Such complexity makes the inference stage highly inefficient for clinical applications. Towards an effective yet economic segmentation network design, in this work, we propose backward skip connections that bring decoded features back to the encoder. Our design can be jointly adopted with forward skip connections in any encoder-decoder architecture forming a recurrence structure without introducing extra parameters. With the backward skip connections, we propose a U-Net based network family, namely Bi-directional O-shape networks, which set new benchmarks on multiple public medical imaging segmentation datasets. On the other hand, with the most plain architecture (BiO-Net), network computations inevitably increase along with the pre-set recurrence time. We have thus studied the deficiency bottleneck of such recurrent design and propose a novel two-phase Neural Architecture Search (NAS) algorithm, namely BiX-NAS, to search for the best multi-scale bi-directional skip connections. The ineffective skip connections are then discarded to reduce computational costs and speed up network inference. The finally searched BiX-Net yields the least network complexity and outperforms other state-of-the-art counterparts by large margins. We evaluate our methods on both 2D and 3D segmentation tasks in a total of six datasets. Extensive ablation studies have also been conducted to provide a comprehensive analysis for our proposed methods.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
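The BiO-Net abstract above pairs forward skips (encoder to decoder) with backward skips (decoder back to the encoder) in a weight-shared recurrence, so extra iterations add no parameters. A toy numerical sketch of that recurrence, with dense `tanh` layers standing in for the paper's convolutional encoder and decoder — all names and dimensions here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 6

# shared weights: iterating the O-shape reuses them, adding no parameters
W_ENC = rng.standard_normal((DIM, DIM)) * 0.3
W_DEC = rng.standard_normal((DIM, DIM)) * 0.3

def bio_net(x, recurrences=2):
    backward = np.zeros_like(x)                  # backward skip: decoder -> encoder
    dec = np.zeros_like(x)
    for _ in range(recurrences):
        enc = np.tanh(W_ENC @ (x + backward))    # encoder sees input plus backward skip
        dec = np.tanh(W_DEC @ enc) + enc         # forward skip: encoder -> decoder
        backward = dec                           # feed decoded features back
    return dec
```

The follow-up BiX-NAS step described in the abstract would then search over which of these skip connections to keep at each scale and iteration.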
Sengar, Neha; Singh, Akriti; Yadav, Saumya; Dutta, Malay Kishore
Automated System for Face-Mask Detection Using Convolutional Neural Network Proceedings Article
In: Giri, Debasis; Choo, Kim-Kwang Raymond; Ponnusamy, Saminathan; Meng, Weizhi; Akleylek, Sedat; Maity, Santi Prasad (Ed.): Proceedings of the Seventh International Conference on Mathematics and Computing, pp. 373–380, Springer Singapore, Singapore, 2022, ISBN: 978-981-16-6890-6.
@inproceedings{10.1007/978-981-16-6890-6_28,
title = {Automated System for Face-Mask Detection Using Convolutional Neural Network},
author = {Neha Sengar and Akriti Singh and Saumya Yadav and Malay Kishore Dutta},
editor = {Debasis Giri and Kim-Kwang Raymond Choo and Saminathan Ponnusamy and Weizhi Meng and Sedat Akleylek and Santi Prasad Maity},
url = {https://link.springer.com/chapter/10.1007/978-981-16-6890-6_28},
isbn = {978-981-16-6890-6},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Proceedings of the Seventh International Conference on Mathematics and Computing},
pages = {373--380},
publisher = {Springer Singapore},
address = {Singapore},
abstract = {Coronavirus Disease 2019 (COVID-19) pandemic is affecting the health of the global population severely. It is one of the deadliest diseases in history and has severely affected all the countries. The only way to prevent the spread of corona is to cover faces and follow social distancing norms until a vaccine is developed. The face mask is effective in blocking the droplets that contain the COVID-19 virus. Hence, it is necessary to wear a face mask as a precautionary measure against it. In the proposed work, the face mask detection model is generated using an optimized neural network architecture for performing the classification task (mask or no mask). For training and model assessment, a dataset of 8695 images has been taken from four different sources. The model achieves a validation accuracy of 99.52%.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Girish, Sharath; Dey, Debadeepta; Joshi, Neel; Vineet, Vibhav; Shah, Shital; Mendes, Caio Cesar Teodoro; Shrivastava, Abhinav; Song, Yale
One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning Technical Report
2022.
@techreport{girish2022one,
title = {One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning},
author = {Sharath Girish and Debadeepta Dey and Neel Joshi and Vibhav Vineet and Shital Shah and Caio Cesar Teodoro Mendes and Abhinav Shrivastava and Yale Song},
url = {https://arxiv.org/abs/2203.08130},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {arXiv preprint arXiv:2203.08130},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Li, Zi; Li, Ziyang; Liu, Risheng; Luo, Zhongxuan; Fan, Xin
Automated Learning for Deformable Medical Image Registration by Jointly Optimizing Network Architectures and Objective Functions Technical Report
2022.
@techreport{li2022automated,
title = {Automated Learning for Deformable Medical Image Registration by Jointly Optimizing Network Architectures and Objective Functions},
author = {Zi Li and Ziyang Li and Risheng Liu and Zhongxuan Luo and Xin Fan},
url = {https://arxiv.org/abs/2203.06810},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {arXiv preprint arXiv:2203.06810},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Wang, Haoxiang; Wang, Yite; Sun, Ruoyu; Li, Bo
Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-09137,
title = {Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning},
author = {Haoxiang Wang and Yite Wang and Ruoyu Sun and Bo Li},
url = {https://doi.org/10.48550/arXiv.2203.09137},
doi = {10.48550/arXiv.2203.09137},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.09137},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Shi, Jiachen; Zhou, Guoqiang; Bao, Shudi; Shen, Jun
Multi-SelfGAN: A Self-Guiding Neural Architecture Search Method for Generative Adversarial Networks with Multi-Controllers Journal Article
In: IEEE Transactions on Cognitive and Developmental Systems, pp. 1-1, 2022.
@article{9737565,
title = {Multi-SelfGAN: A Self-Guiding Neural Architecture Search Method for Generative Adversarial Networks with Multi-Controllers},
author = {Jiachen Shi and Guoqiang Zhou and Shudi Bao and Jun Shen},
url = {https://ieeexplore.ieee.org/abstract/document/9737565},
doi = {10.1109/TCDS.2022.3160475},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {IEEE Transactions on Cognitive and Developmental Systems},
pages = {1-1},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Dong, Junwei; Hou, Boyu; Feng, Liang; Tang, Huajin; Tan, Kay Chen; Ong, Yew-Soon
A Cell-Based Fast Memetic Algorithm for Automated Convolutional Neural Architecture Design Journal Article
In: IEEE Transactions on Neural Networks and Learning Systems, pp. 1-14, 2022.
@article{9737315,
title = {A Cell-Based Fast Memetic Algorithm for Automated Convolutional Neural Architecture Design},
author = {Junwei Dong and Boyu Hou and Liang Feng and Huajin Tang and Kay Chen Tan and Yew-Soon Ong},
url = {https://ieeexplore.ieee.org/abstract/document/9737315},
doi = {10.1109/TNNLS.2022.3155230},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
pages = {1-14},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Mi, Jian-Xun; Feng, Jie; Huang, Ke-Yang
Designing efficient convolutional neural network structure: A survey Journal Article
In: Neurocomputing, vol. 489, pp. 139-156, 2022, ISSN: 0925-2312.
@article{MI2022139,
title = {Designing efficient convolutional neural network structure: A survey},
author = {Jian-Xun Mi and Jie Feng and Ke-Yang Huang},
url = {https://www.sciencedirect.com/science/article/pii/S0925231222003162},
doi = {10.1016/j.neucom.2021.08.158},
issn = {0925-2312},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Neurocomputing},
volume = {489},
pages = {139-156},
abstract = {As a powerful machine learning method, deep learning has attracted the attention of numerous researchers. While exploring a high-performance neural network model, the floating-point operations of a neural network model are also increasing. In recent years, many researchers have noticed that efficiency is also one of important indicators to measure the property of neural network models. Obviously, the efficient neural network model is more helpful to deploy on mobile and embedded devices. Therefore, the efficient neural network model becomes a hot research spot. In this paper, we review the methods related to the structural design of efficient convolution neural networks in recent years. According to the characteristics of these methods, we divide them into three kinds of methods: model pruning, efficient architecture, and neural architecture search. Detailed analyses of each method are presented to demonstrate their advantages and disadvantages. Then, we comprehensively compare them in detail and propose many suggestions about the design of the efficient convolution neural network model structure. Inspired by these suggestions, we built a new efficient neural network model, SharedNet. And the SharedNet obtains the best accuracy of manually-designed efficient CNN models on the ImageNet dataset.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Blumberg, Stefano B.; Lin, Hongxiang; Grussu, Francesco; Zhou, Yukun; Figini, Matteo; Alexander, Daniel C.
Progressive Subsampling for Oversampled Data - Application to Quantitative MRI Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-09268,
title = {Progressive Subsampling for Oversampled Data - Application to Quantitative MRI},
author = {Stefano B. Blumberg and Hongxiang Lin and Francesco Grussu and Yukun Zhou and Matteo Figini and Daniel C. Alexander},
url = {https://doi.org/10.48550/arXiv.2203.09268},
doi = {10.48550/arXiv.2203.09268},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.09268},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Vo-Ho, Viet-Khoa; Yamazaki, Kashu; Hoang, Hieu; Tran, Minh-Triet; Le, Ngan
Meta-Learning of NAS for Few-shot Learning in Medical Image Applications Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-08951,
title = {Meta-Learning of NAS for Few-shot Learning in Medical Image Applications},
author = {Viet-Khoa Vo-Ho and Kashu Yamazaki and Hieu Hoang and Minh-Triet Tran and Ngan Le},
url = {https://doi.org/10.48550/arXiv.2203.08951},
doi = {10.48550/arXiv.2203.08951},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.08951},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Chang, Qing; Peng, Junran; Xie, Lingxi; Sun, Jiajun; Yin, Haoran; Tian, Qi; Zhang, Zhaoxiang
DATA: Domain-Aware and Task-Aware Self-supervised Learning Proceedings Article
In: CVPR2022, 2022.
@inproceedings{DBLP:journals/corr/abs-2203-09041,
title = {DATA: Domain-Aware and Task-Aware Self-supervised Learning},
author = {Qing Chang and Junran Peng and Lingxi Xie and Jiajun Sun and Haoran Yin and Qi Tian and Zhaoxiang Zhang},
url = {https://openaccess.thecvf.com/content/CVPR2022/papers/Chang_DATA_Domain-Aware_and_Task-Aware_Self-Supervised_Learning_CVPR_2022_paper.pdf},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {CVPR2022},
journal = {CoRR},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Lu, Zhenyu; Liang, Shaoyang; Yang, Qiang; Du, Bo
Evolving Block-Based Convolutional Neural Network for Hyperspectral Image Classification Journal Article
In: IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-21, 2022.
@article{9737511,
title = {Evolving Block-Based Convolutional Neural Network for Hyperspectral Image Classification},
author = {Zhenyu Lu and Shaoyang Liang and Qiang Yang and Bo Du},
url = {https://ieeexplore.ieee.org/abstract/document/9737511},
doi = {10.1109/TGRS.2022.3160513},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {IEEE Transactions on Geoscience and Remote Sensing},
volume = {60},
pages = {1-21},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Lukasik, Jovita; Jung, Steffen; Keuper, Margret
Learning Where To Look - Generative NAS is Surprisingly Efficient Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-08734,
title = {Learning Where To Look - Generative NAS is Surprisingly Efficient},
author = {Jovita Lukasik and Steffen Jung and Margret Keuper},
url = {https://doi.org/10.48550/arXiv.2203.08734},
doi = {10.48550/arXiv.2203.08734},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.08734},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Yan, Chenqian; Zhang, Yuge; Zhang, Quanlu; Yang, Yaming; Jiang, Xinyang; Yang, Yuqing; Wang, Baoyuan
Privacy-preserving Online AutoML for Domain-Specific Face Detection Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-08399,
title = {Privacy-preserving Online AutoML for Domain-Specific Face Detection},
author = {Chenqian Yan and Yuge Zhang and Quanlu Zhang and Yaming Yang and Xinyang Jiang and Yuqing Yang and Baoyuan Wang},
url = {https://doi.org/10.48550/arXiv.2203.08399},
doi = {10.48550/arXiv.2203.08399},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.08399},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Yang, Sen; Yang, Wankou; Cui, Zhen
Searching part-specific neural fabrics for human pose estimation Journal Article
In: Pattern Recognition, vol. 128, pp. 108652, 2022, ISSN: 0031-3203.
@article{YANG2022108652,
title = {Searching part-specific neural fabrics for human pose estimation},
author = {Sen Yang and Wankou Yang and Zhen Cui},
url = {https://www.sciencedirect.com/science/article/pii/S0031320322001339},
doi = {10.1016/j.patcog.2022.108652},
issn = {0031-3203},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Pattern Recognition},
volume = {128},
pages = {108652},
abstract = {Neural architecture search (NAS) has emerged in many domains to jointly learn the architectures and weights of neural networks. The core spirit behind NAS is to automatically search neural architectures for target tasks with better performance-efficiency trade-offs. However, existing approaches emphasize on only searching a single architecture with less human intervention to replace a human-designed neural network, yet making the search process almost independent of the domain knowledge. In this paper, we aim to apply NAS for human pose estimation and we ask: when NAS meets this localization task, can the articulated human body structure help to search better task-specific architectures? To this end, we first design a new neural architecture search space, Cell-based Neural Fabric (CNF), to learn micro as well as macro neural architecture using a differentiable search strategy. Then, by viewing locating human parts as multiple disentangled prediction sub-tasks, we exploit the compositionality of human body structure as guidance to search multiple part-specific CNFs specialized for different human parts. After the search, all these part-specific neural fabrics have been tailored with distinct micro and macro architecture parameters. The results show that such knowledge-guided NAS-based model outperforms a hand-crafted part-based baseline model, and the resulting multiple part-specific architectures gain significant performance improvement against a single NAS-based architecture for the whole body. The experiments on MPII and COCO datasets show that our models (code is available at https://github.com/yangsenius/PoseNFS) achieve comparable performance against the state-of-the-art methods while being relatively lightweight.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Zhang, Haichao; Hao, Kuangrong; Pedrycz, Witold; Gao, Lei; Tang, Xue-Song; Wei, Bing
Vision Transformer with Convolutions Architecture Search Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-10435,
title = {Vision Transformer with Convolutions Architecture Search},
author = {Haichao Zhang and Kuangrong Hao and Witold Pedrycz and Lei Gao and Xue-Song Tang and Bing Wei},
url = {https://doi.org/10.48550/arXiv.2203.10435},
doi = {10.48550/arXiv.2203.10435},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.10435},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Hu, Yiming; Wang, Xingang; Gu, Qingyi
PWSNAS: Powering Weight Sharing NAS With General Search Space Shrinking Framework Journal Article
In: IEEE Transactions on Neural Networks and Learning Systems, pp. 1-14, 2022.
@article{9739130,
title = {PWSNAS: Powering Weight Sharing NAS With General Search Space Shrinking Framework},
author = {Yiming Hu and Xingang Wang and Qingyi Gu},
url = {https://ieeexplore.ieee.org/abstract/document/9739130},
doi = {10.1109/TNNLS.2022.3156373},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
pages = {1-14},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wang, Xiaoxing; Lin, Jiale; Yan, Junchi; Zhao, Juanping; Yang, Xiaokang
EAutoDet: Efficient Architecture Search for Object Detection Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-10747,
title = {EAutoDet: Efficient Architecture Search for Object Detection},
author = {Xiaoxing Wang and Jiale Lin and Junchi Yan and Juanping Zhao and Xiaokang Yang},
url = {https://doi.org/10.48550/arXiv.2203.10747},
doi = {10.48550/arXiv.2203.10747},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.10747},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Habibian, Amirhossein; Yahia, Haitam Ben; Abati, Davide; Gavves, Efstratios; Porikli, Fatih
Delta Distillation for Efficient Video Processing Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-09594,
title = {Delta Distillation for Efficient Video Processing},
author = {Amirhossein Habibian and Haitam Ben Yahia and Davide Abati and Efstratios Gavves and Fatih Porikli},
url = {https://doi.org/10.48550/arXiv.2203.09594},
doi = {10.48550/arXiv.2203.09594},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.09594},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Arora, Parul; Jalali, Seyed Mohammad Jafar; Ahmadian, Sajad; Panigrahi, Bijaya Ketan; Suganthan, Pn; Khosravi, Abbas
Probabilistic Wind Power Forecasting Using Optimised Deep Auto-Regressive Recurrent Neural Networks Journal Article
In: IEEE Transactions on Industrial Informatics, pp. 1-1, 2022.
@article{9739990,
title = {Probabilistic Wind Power Forecasting Using Optimised Deep Auto-Regressive Recurrent Neural Networks},
author = {Parul Arora and Seyed Mohammad Jafar Jalali and Sajad Ahmadian and Bijaya Ketan Panigrahi and Pn Suganthan and Abbas Khosravi},
url = {https://ieeexplore.ieee.org/abstract/document/9739990},
doi = {10.1109/TII.2022.3160696},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {IEEE Transactions on Industrial Informatics},
pages = {1-1},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Yüzügüler, Ahmet Caner; Dimitriadis, Nikolaos; Frossard, Pascal
U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search Technical Report
2022.
@techreport{yuzuguler2022u,
title = {U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search},
author = {Ahmet Caner Yüzügüler and Nikolaos Dimitriadis and Pascal Frossard},
url = {https://arxiv.org/abs/2203.12412},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {arXiv preprint arXiv:2203.12412},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Xie, Yirong; Chen, Hong; Ma, Yongjie; Xu, Yang
Automated design of CNN architecture based on efficient evolutionary search Journal Article
In: Neurocomputing, vol. 491, pp. 160-171, 2022, ISSN: 0925-2312.
@article{XIE2022160,
title = {Automated design of CNN architecture based on efficient evolutionary search},
author = {Yirong Xie and Hong Chen and Yongjie Ma and Yang Xu},
url = {https://www.sciencedirect.com/science/article/pii/S092523122200340X},
doi = {10.1016/j.neucom.2022.03.046},
issn = {0925-2312},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Neurocomputing},
volume = {491},
pages = {160-171},
abstract = {Evolutionary Neural Architecture Search (ENAS) is a promising method for the automated design of deep network architecture, which has attracted extensive attention in the field of automated machine learning. However, the existing ENAS methods often need a lot of computing resources to design CNN architecture automatically. In order to achieve efficient and automated design of CNNs, this paper focuses on two aspects to improve efficiency. On the one hand, efficient CNN-based building blocks are introduced to ensure the effectiveness of the generated architectures and a triplet attention mechanism is incorporated into the architectures to further improve the classification performance. On the other hand, a random forest-based performance predictor is used in the fitness evaluation to reduce the amount of computation required to train each individual from scratch. Experimental results show that the proposed algorithm can significantly reduce the computational resources required and achieve competitive classification performance on the CIFAR dataset. Also, the architecture designed for the traffic sign recognition task exceeds the accuracy of manual expert design.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Benmeziane, Hadjer; Ouarnoughi, Hamza; Maghraoui, Kaoutar El; Niar, Smail
Real-Time Style Transfer with Efficient Vision Transformers Proceedings Article
In: Proceedings of the 5th International Workshop on Edge Systems, Analytics and Networking, pp. 31–36, Association for Computing Machinery, Rennes, France, 2022, ISBN: 9781450392532.
@inproceedings{10.1145/3517206.3526271,
title = {Real-Time Style Transfer with Efficient Vision Transformers},
author = {Hadjer Benmeziane and Hamza Ouarnoughi and Kaoutar El Maghraoui and Smail Niar},
url = {https://doi.org/10.1145/3517206.3526271},
doi = {10.1145/3517206.3526271},
isbn = {9781450392532},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Proceedings of the 5th International Workshop on Edge Systems, Analytics and Networking},
pages = {31–36},
publisher = {Association for Computing Machinery},
address = {Rennes, France},
series = {EdgeSys '22},
abstract = {Style Transfer aims at transferring the artistic style from a reference image to a content image. While Deep Learning (DL) has achieved state-of-the-art Style Transfer performance using Convolutional Neural Networks (CNNs), its real-time application still requires powerful hardware such as GPU-accelerated systems. This paper leverages transformer-based models to accelerate real-time Style Transfer on mobile and embedded hardware platforms. We designed a Neural Architecture Search (NAS) algorithm dedicated to vision transformers to find the best set of architecture hyperparameters that maximizes the Style Transfer performance, expressed in frames per second (FPS). Our approach has been evaluated and validated on the Xiaomi Redmi 7 mobile phone and the Raspberry Pi 3 platform. Experimental evaluation shows that our approach achieves 3.5× and 2.1× speedups compared to CNN-based Style Transfer models and Transformer-based models, respectively.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Rajesh, Chilukamari; Kumar, Sushil
An evolutionary block based network for medical image denoising using Differential Evolution Journal Article
In: Applied Soft Computing, vol. 121, pp. 108776, 2022, ISSN: 1568-4946.
@article{RAJESH2022108776,
title = {An evolutionary block based network for medical image denoising using Differential Evolution},
author = {Chilukamari Rajesh and Sushil Kumar},
url = {https://www.sciencedirect.com/science/article/pii/S1568494622002022},
doi = {10.1016/j.asoc.2022.108776},
issn = {1568-4946},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Applied Soft Computing},
volume = {121},
pages = {108776},
abstract = {Image denoising is a key component in several computer vision and image processing operations due to unavoidable noise in the image generation process. For medical image processing, deep convolutional neural networks (CNNs) give state-of-the-art performance. However, network structures are manually constructed for specific tasks and require several trials to tune a large number of hyperparameters, so constructing a network can take a long time. Additionally, the fittest hyperparameters, which may suit source data properties such as noise characteristics, cannot easily be transferred to target data. Realistic noise in medical images is generally mixed, complex, and unpredictable, which makes it difficult to design an efficient denoising network. In this paper, we develop a Differential Evolution (DE) based automatic network evolution model to optimize network architectures and hyperparameters by exploring the fittest parameters. Furthermore, we adopt a transfer learning technique to accelerate the training process. The proposed evolutionary algorithm is flexible and finds promising network architectures using well-known building blocks, including residual and dense blocks. Finally, the proposed model was evaluated on four different medical image datasets. The results obtained at different noise levels show the potential of the proposed model, named DEvoNet, for identifying the optimal parameters to develop a high-performance denoising network structure.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Zhou, Qinqin; Sheng, Kekai; Zheng, Xiawu; Li, Ke; Sun, Xing; Tian, Yonghong; Chen, Jie; Ji, Rongrong
Training-free Transformer Architecture Search Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-12217,
title = {Training-free Transformer Architecture Search},
author = {Qinqin Zhou and Kekai Sheng and Xiawu Zheng and Ke Li and Xing Sun and Yonghong Tian and Jie Chen and Rongrong Ji},
url = {https://doi.org/10.48550/arXiv.2203.12217},
doi = {10.48550/arXiv.2203.12217},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.12217},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Mok, Jisoo; Na, Byunggook; Kim, Ji-Hoon; Han, Dongyoon; Yoon, Sungroh
Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training? Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-14577,
title = {Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?},
author = {Jisoo Mok and Byunggook Na and Ji-Hoon Kim and Dongyoon Han and Sungroh Yoon},
url = {https://doi.org/10.48550/arXiv.2203.14577},
doi = {10.48550/arXiv.2203.14577},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.14577},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Das, Mayukh; Singh, Brijraj; Chheda, Harsh Kanti; Sharma, Pawan; NS, Pradeep
AutoCoMet: Smart Neural Architecture Search via Co-Regulated Shaping Reinforcement Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-15408,
title = {AutoCoMet: Smart Neural Architecture Search via Co-Regulated Shaping Reinforcement},
author = {Mayukh Das and Brijraj Singh and Harsh Kanti Chheda and Pawan Sharma and Pradeep NS},
url = {https://doi.org/10.48550/arXiv.2203.15408},
doi = {10.48550/arXiv.2203.15408},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.15408},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Yang, Jin; Huang, Yingying; Jiang, Guangxin; Chen, Ying
An Intelligent End-to-End Neural Architecture Search Framework for Electricity Forecasting Model Development Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-13563,
title = {An Intelligent End-to-End Neural Architecture Search Framework for Electricity Forecasting Model Development},
author = {Jin Yang and Yingying Huang and Guangxin Jiang and Ying Chen},
url = {https://doi.org/10.48550/arXiv.2203.13563},
doi = {10.48550/arXiv.2203.13563},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.13563},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Sun, Haiyang; Lian, Zheng; Liu, Bin; Li, Ying; Sun, Licai; Cai, Cong; Tao, Jianhua; Wang, Meng; Cheng, Yuan
EmotionNAS: Two-stream Architecture Search for Speech Emotion Recognition Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-13617,
title = {EmotionNAS: Two-stream Architecture Search for Speech Emotion Recognition},
author = {Haiyang Sun and Zheng Lian and Bin Liu and Ying Li and Licai Sun and Cong Cai and Jianhua Tao and Meng Wang and Yuan Cheng},
url = {https://doi.org/10.48550/arXiv.2203.13617},
doi = {10.48550/arXiv.2203.13617},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.13617},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Lu, Bingqian; Yan, Zheyu; Shi, Yiyu; Ren, Shaolei
A Semi-Decoupled Approach to Fast and Optimal Hardware-Software Co-Design of Neural Accelerators Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-13921,
title = {A Semi-Decoupled Approach to Fast and Optimal Hardware-Software Co-Design of Neural Accelerators},
author = {Bingqian Lu and Zheyu Yan and Yiyu Shi and Shaolei Ren},
url = {https://doi.org/10.48550/arXiv.2203.13921},
doi = {10.48550/arXiv.2203.13921},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.13921},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
M., Abishai Ebenezer; Arya, Arti
An Atypical Metaheuristic Approach to Recognize an Optimal Architecture of a Neural Network Proceedings Article
In: Proceedings of the 14th International Conference on Agents and Artificial Intelligence, ICAART 2022, Volume 3, pp. 917–925, SCITEPRESS, 2022.
@inproceedings{DBLP:conf/icaart/MA22,
title = {An Atypical Metaheuristic Approach to Recognize an Optimal Architecture of a Neural Network},
author = {Abishai Ebenezer M. and Arti Arya},
editor = {Ana Paula Rocha and Luc Steels and H. Jaap Herik},
url = {https://doi.org/10.5220/0010951600003116},
doi = {10.5220/0010951600003116},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Proceedings of the 14th International Conference on Agents and Artificial Intelligence, ICAART 2022, Volume 3, Online Streaming, February 3-5, 2022},
pages = {917--925},
publisher = {SCITEPRESS},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Wang, Chunnan; Chen, Xingyu; Wu, Chengyue; Wang, Hongzhi
AutoTS: Automatic Time Series Forecasting Model Design Based on Two-Stage Pruning Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-14169,
title = {AutoTS: Automatic Time Series Forecasting Model Design Based on Two-Stage Pruning},
author = {Chunnan Wang and Xingyu Chen and Chengyue Wu and Hongzhi Wang},
url = {https://doi.org/10.48550/arXiv.2203.14169},
doi = {10.48550/arXiv.2203.14169},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.14169},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Zaman, Khalid; Sun, Zhaoyun; Shah, Sayyed Mudassar; Shoaib, Muhammad; Pei, Lili; Hussain, Altaf
Driver Emotions Recognition Based on Improved Faster R-CNN and Neural Architectural Search Network Journal Article
In: Symmetry, vol. 14, no. 4, pp. 687, 2022.
@article{DBLP:journals/symmetry/ZamanSSSPH22,
title = {Driver Emotions Recognition Based on Improved Faster R-CNN and Neural Architectural Search Network},
author = {Khalid Zaman and Zhaoyun Sun and Sayyed Mudassar Shah and Muhammad Shoaib and Lili Pei and Altaf Hussain},
url = {https://doi.org/10.3390/sym14040687},
doi = {10.3390/sym14040687},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Symmetry},
volume = {14},
number = {4},
pages = {687},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Zheng, Ruiqi; Qu, Liang; Cui, Bin; Shi, Yuhui; Yin, Hongzhi
AutoML for Deep Recommender Systems: A Survey Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-13922,
title = {AutoML for Deep Recommender Systems: A Survey},
author = {Ruiqi Zheng and Liang Qu and Bin Cui and Yuhui Shi and Hongzhi Yin},
url = {https://doi.org/10.48550/arXiv.2203.13922},
doi = {10.48550/arXiv.2203.13922},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.13922},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Raychaudhuri, Dripta S.; Suh, Yumin; Schulter, Samuel; Yu, Xiang; Faraki, Masoud; Roy-Chowdhury, Amit K.; Chandraker, Manmohan
Controllable Dynamic Multi-Task Architectures Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-14949,
title = {Controllable Dynamic Multi-Task Architectures},
author = {Dripta S. Raychaudhuri and Yumin Suh and Samuel Schulter and Xiang Yu and Masoud Faraki and Amit K. Roy-Chowdhury and Manmohan Chandraker},
url = {https://doi.org/10.48550/arXiv.2203.14949},
doi = {10.48550/arXiv.2203.14949},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.14949},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Wang, Rui; Bai, Qibing; Ao, Junyi; Zhou, Long; Xiong, Zhixiang; Wei, Zhihua; Zhang, Yu; Ko, Tom; Li, Haizhou
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-15610,
title = {LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT},
author = {Rui Wang and Qibing Bai and Junyi Ao and Long Zhou and Zhixiang Xiong and Zhihua Wei and Yu Zhang and Tom Ko and Haizhou Li},
url = {https://doi.org/10.48550/arXiv.2203.15610},
doi = {10.48550/arXiv.2203.15610},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.15610},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Chang, Yangyang; Sobelman, Gerald E.
Lightweight CNN Frameworks and their Optimization using Evolutionary Algorithms Proceedings Article
In: 2022 International Electrical Engineering Congress (iEECON), pp. 1-4, 2022.
@inproceedings{9741692,
title = {Lightweight CNN Frameworks and their Optimization using Evolutionary Algorithms},
author = {Yangyang Chang and Gerald E. Sobelman},
url = {https://ieeexplore.ieee.org/abstract/document/9741692},
doi = {10.1109/iEECON53204.2022.9741692},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {2022 International Electrical Engineering Congress (iEECON)},
pages = {1-4},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Park, Gunju; Yi, Youngmin
CondNAS: Neural Architecture Search for Conditional CNNs Journal Article
In: Electronics, vol. 11, no. 7, 2022, ISSN: 2079-9292.
@article{electronics11071101,
title = {CondNAS: Neural Architecture Search for Conditional CNNs},
author = {Gunju Park and Youngmin Yi},
url = {https://www.mdpi.com/2079-9292/11/7/1101},
doi = {10.3390/electronics11071101},
issn = {2079-9292},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Electronics},
volume = {11},
number = {7},
abstract = {As deep learning has become prevalent and adopted in various application domains, the need for efficient convolutional neural network (CNN) inference on diverse target platforms has increased. To address the need, a neural architecture search (NAS) technique called once-for-all, or OFA, which aims to efficiently find the optimal CNN architecture for the given target platform using a genetic algorithm (GA), has recently been proposed. Meanwhile, a conditional CNN architecture, which allows early exits with auxiliary classifiers in the middle of a network to achieve efficient inference without accuracy loss or with negligible loss, has been proposed. In this paper, we propose a NAS technique for the conditional CNN architecture, CondNAS, which efficiently finds a near-optimal conditional CNN architecture for the target platform using GA. By attaching auxiliary classifiers through adaptive pooling, OFA’s SuperNet is successfully extended, such that it incorporates the various conditional CNN sub-networks. In addition, we devise machine learning-based prediction models for the accuracy and latency of an arbitrary conditional CNN, which are used in the GA of CondNAS to efficiently explore the large search space. The experimental results show that the conditional CNNs from CondNAS are 2.52× and 1.75× faster than the CNNs from OFA for the Galaxy Note10+ GPU and CPU, respectively.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Zhou, Qinghua; Gorban, Alexander N.; Mirkes, Evgeny M.; Bac, Jonathan; Zinovyev, Andrei Yu.; Tyukin, Ivan Yu.
Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation Technical Report
2022.
@techreport{DBLP:journals/corr/abs-2203-16687,
title = {Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation},
author = {Qinghua Zhou and Alexander N. Gorban and Evgeny M. Mirkes and Jonathan Bac and Andrei Yu. Zinovyev and Ivan Yu. Tyukin},
url = {https://doi.org/10.48550/arXiv.2203.16687},
doi = {10.48550/arXiv.2203.16687},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {CoRR},
volume = {abs/2203.16687},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Li, Yawei
Towards Efficient Deep Neural Networks PhD Thesis
ETH Zurich, 2022.
@phdthesis{20.500.11850/540498,
title = {Towards Efficient Deep Neural Networks},
author = {Yawei Li},
doi = {10.3929/ethz-b-000540498},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
publisher = {ETH Zurich},
address = {Zurich},
school = {ETH Zurich},
abstract = {Computational efficiency is an essential factor that influences the applicability of computer vision algorithms. Although deep neural networks have reached state-of-the-art performance in a variety of computer vision tasks, deep learning based solutions suffer from several efficiency-related problems. First, the overparameterization of deep neural networks results in models with millions of parameters, which lowers the parameter efficiency of the designed networks. To store the parameters and intermediate feature maps during the computation, a large device memory footprint is required. Secondly, the massive computation in deep neural networks slows down their training and inference. This limits the application of deep neural networks to latency-demanding scenarios and low-end devices. Thirdly, the massive computation consumes a significant amount of energy, which leaves a large carbon footprint of deep learning models. The aim of this thesis is to improve the computational efficiency of current deep neural networks. This problem is tackled from three perspectives: neural network compression, neural architecture optimization, and computational procedure optimization. In the first part of the thesis, we reduce the model complexity of neural networks by network compression techniques including filter decomposition and filter pruning. The basic assumption for filter decomposition is that the ensemble of filters in deep neural networks constitutes an overcomplete set. Instead of using the original filters directly during the computation, they can be approximated by a linear combination of a set of basis filters. The contribution of this thesis is to provide a unified analysis of previous filter decomposition methods. On the other hand, a differentiable filter pruning method is proposed. To achieve differentiability, the layers of neural networks are reparameterized by a meta network. Sparsity regularization is applied to the input of the meta network, i.e. latent vectors. Optimizing with the introduced regularization leads to an automatic network pruning method. Additionally, a joint analysis of filter decomposition and filter pruning is presented from the perspective of compact tensor approximation. The hinge of the two techniques is the introduced sparsity-inducing matrix. By simply changing the way the group sparsity regularization is enforced on the matrix, the two techniques can be derived accordingly. Secondly, we try to improve the performance of a baseline network by a fine-grained neural architecture optimization method. Unlike network compression methods, the aim of this method is to improve the prediction accuracy of neural networks while reducing their model complexity at the same time. Achieving the two targets simultaneously makes the problem more challenging. In addition, a nearly cost-free constraint is enforced during the architecture optimization, which differs from current neural architecture search methods with bulky computation. This can be regarded as another efficiency-improving technique. Thirdly, we optimize the computational procedure of graph neural networks. By mathematically analyzing the operations in graph neural networks, two methods are proposed to improve the computational efficiency. The first method simplifies neighbor querying in graph neural networks, while the second involves shuffling the order of the graph feature gathering and feature extraction operations. To summarize, this thesis contributes to multiple aspects of improving the computational efficiency of neural networks during the optimization, training, and test phases.},
keywords = {},
pubstate = {published},
tppubtype = {phdthesis}
}